2015 IEEE International Conference on Big Data (Big Data)

Data Driven Predictive Analytics for A Spindle's Health

Divya Sardana, Raj Bhatnagar
University of Cincinnati
email: [email protected], [email protected]

Radu Pavel, Jon Iverson
Techsolve, Inc., Cincinnati
email: [email protected], [email protected]

Abstract—Prediction of a spindle's health is of critical significance in a manufacturing environment. Unexpected breakdowns in a spindle's functioning can lead to high costs and production delays. Therefore, developing methods which can predict the time-to-failure of a spindle and its bearings can be of significant importance. One of the main challenges for successful prediction by purely data-driven techniques is the management and analysis of a huge volume of the spindle's monitored operational data. In this paper, we build a regression and clustering based prediction methodology, suitable for exploiting very high volumes of monitored data, for a spindle's time-to-failure prediction. We conquer the problem of dealing with huge volumes of monitored data by aggregating features related to spindle vibration acceleration in the frequency domain for 24-72 hour windows of time. A Fast Fourier Transform analysis of the spindle vibration and angular acceleration data is used to extract twelve aggregated frequency domain features to train regression models. Further, a graph clustering algorithm is used to improve the feature selection and thus the accuracy of the predictions. Once the model is trained, it can be used to make predictions based upon energy bursts observed over 24-72 hour windows. This can prove to be very useful as a cost effective prognostic tool that can be easily deployed in the manufacturing industry. The spindle data used in the paper has been collected at Techsolve Inc. over three run-to-failure experiments performed on a spindle test-bed. In this paper, we present our model-building methodology along with an experimental setup and empirical evaluation of the prediction models and results.

I. MOTIVATION

A. Data Driven Failure Models

In manufacturing environments it is necessary to improve machine performance and decrease downtime. Early detection of emerging faults and degradation trends can help prevent downtime, plan maintenance efforts, increase productivity and save costs. Condition-based maintenance (CBM) systems in manufacturing plants continuously monitor machines and collect data related to the machines' status and performance. The challenge for field engineers and management staff is in making effective use of the humongous amount of data to accurately detect equipment degradation and also predict their remaining useful life.

CBM can reduce the cost of total maintenance while keeping the reliability high. This is achieved by monitoring critical parts inside the machines and performing the maintenance only when indicated by the monitored data. Prognostics-based health maintenance (PHM) involves models of failure mechanisms to help predict systems' life-cycle management. The goal is to predict the future condition of a component, generally to predict the remaining useful life of the component. Using prognostics methodologies, smart products and systems with embedded intelligence can be enabled to predict and forecast their performance, as well as to synchronize their service needs with business support systems [1], [2].

A methodology followed by many researchers is to build mathematical models grounded in Physics and Engineering Sciences for various types of failures and faults. Then a combination of a model and some features extracted from the monitored data is used to infer the remaining life of a machine. These failure models are generally mathematically sophisticated but are constrained in the number of influencing factors that they can be designed to account for.

In this paper we present a methodology for using the large quantities of monitored data, building purely data driven models of failures, and then deploying these models to predict the remaining life of a spindle. A failure model derived from monitored data is likely to be representative of multiple factors, major and minor, affecting the operation of the spindle. The model building task is done by us in an off-line mode using large amounts of monitored data. In the case presented here we worked with the data without any prior knowledge of the failure mode involved. Datasets were collected over a few months by monitoring spindles till their failures and were used for the model building exercise.

The results presented in this paper demonstrate that it is indeed possible to derive very good prognostics models from large amounts of monitored data. There are some aspects that still need to be developed to make the models more deployable operationally, and these are discussed in the following sections.

B. A Spindle System

The spindle is a key component of machine tools. Its failure renders the machine unusable until a replacement is installed. Compared to the other subsystems of a machine such as feed axes, fluid tanks, and tool magazines, the spindle has one of the highest rates of failure, and the ability to predict its end of life is highly desirable.

Many studies have been conducted with respect to characterization of the condition of bearings and the ability to predict the remaining time till their end of life. Various algorithms have been developed in an effort to identify robust methods for predicting the remaining life of bearings, in particular, or other machinery components, in general.
However, the area of prognostics is still considered to be in its early development, as the robustness needed for an industrial implementation is yet to be achieved.

C. Related Work

Many diagnostics and prognostics models for spindles have been investigated in various operating environments. Medjaher et al. [3] apply a mixture of Gaussian hidden Markov models in a dynamic Bayesian network for bearing wear prognostics and remaining useful life estimation. Mohanty et al. [4] propose a multivariate Gaussian process approach using online training for fatigue crack length estimation in aluminum specimens. Liu et al. [5], [6] propose an advanced autoregressive time series model with a nonlinear accelerated degradation factor for remaining useful life estimation of lithium-ion batteries. Sikorska et al. [7] described RUL prediction models, including knowledge-based, life expectancy, artificial neural network and physical models, and presented a classification table to assist model selection for various industries.

The existing machine health prognostic approaches can broadly be divided into two main categories: model-based and data-driven approaches [8]. The model-based approaches assume that a mathematical or a physical model is available that can describe the failure progression of the machine component; e.g., a crack propagation model [9], [10]. The drawback of these approaches is that they can be applied only to specific machine components and can be very complex to build. The data-driven methods, also known as artificial intelligence approaches, are based upon the analysis of condition based monitoring (CBM) data [11] available in the form of monitored entities like temperature, vibration, current, etc. The extracted entities, most commonly the vibration data, are used to learn prediction models. The raw vibration data is first used to extract features using one of the following three types of methods: time domain analysis, frequency domain analysis, and time-frequency domain analysis. For example, in [12] Yan et al. used the short-time Fourier transform to extract the features, whereas in [13] Ocak et al. chose the wavelet packet transform to extract the relevant features for remaining life prediction.

Recently, data-driven techniques have increasingly been used for the remaining life prediction of spindle bearings. In 2008, Gebraeel and Lawley [14] developed neural network based models to predict the residual life of bearings. They used the characteristic increase in frequency amplitudes that accompanies the degradation process to build bearing-specific degradation signals. Huang et al. [15] used self-organizing maps and back propagation neural network-based models for bearing failure prediction. However, neural networks suffer from the problem of slow convergence, especially when dealing with large quantities of monitored data. More recently, in 2014, Dong et al. [16] used a Support Vector Machine (SVM) to predict the bearing degradation process using time-frequency domain analysis of the vibration signal.

These publications give a good perspective on the interest in health characterization and prognostics for safety and maintenance purposes, and on the amount of research in this field. However, there are still many practical issues that have been insufficiently addressed. For example, a combination of signal features may capture well the trend of the fault progression but may not correlate well with the fault progression in real time. This may be due to atypical failure modes or a few minor aspects that the models failed to build in [17].

D. Goals of Investigation

The primary goal of our investigation is to determine if the monitored data has sufficient information embedded in it to help us build a prediction model for the remaining health of a spindle. Typically this would require taking a signature of the monitored data, extracting some features from it, and using these features to make a prediction. Identification of such useful features is one of the main challenges of the work reported here. It is highly desirable that the identified feature set be such that it can be easily computed during real-time operations and used for predicting the remaining health of the spindle. Our work in the following sections demonstrates that there are excellent features embedded in the dataset that enabled us to predict the remaining life of the spindle. However, these features are not easily computable in real time in operational manufacturing environments. The focus of our continuing research is to make the prediction system more amenable to production environments.

The remaining paper is organized as follows. In section II we describe the testbed from which our data was collected. In section III we describe the results of our exploration of the large dataset to discover primitive and higher level aggregated features with good prediction capabilities. Section IV presents a clustering based approach to identify more effective features. The following two sections describe our regression models and their results for predicting the time to failure of the spindles.

II. TESTBED AND MONITORED DATA

A. Testbed Design

A spindle test-bed was built by TechSolve using a frequency drive, a motor, a poly-V belt transmission, and a simplified spindle using two bearings identical to the ones used in the horizontal machining center. Figure 1 shows the spindle test-bed including the motor, the belt transmission, and the actual spindle. A loading mechanism pulling on the nose of the spindle was added to accelerate the degradation. The force pushing down on the spindle nose was kept approximately constant throughout all tests, at 35 lbf. Figure 2 presents a view of the section of the loading mechanism located under the supporting stand. With the exception of one case (Case 4 below), the bearings were manufactured by NSK and can be identified by their code: 7014CSN24TRDULP4Y.

The spindle test-bed was instrumented with a number of sensors. A uniaxial accelerometer type 607A11 (manufactured by IMI Sensors) was placed on the spindle housing, on top of the back bearing. A thermocouple was inserted in a hole drilled into the spindle housing, close to the back bearing's outer surface.
The amperage drawn at the motor was monitored using a current sensor type SCT 050CX5 (manufactured by Ohio Semitronics, Inc.), and the temperature of the motor was captured using a thermocouple. The ambient temperature was also acquired using a thermocouple placed in the vicinity of the spindle test-bed. All thermocouples were J-type, Teflon covered for insulation (manufactured by Watlow). The data channels are schematically represented in Figure 3.

Fig. 1. Spindle Testbed at Techsolve, Inc.

Fig. 2. Bottom View of the Loading Mechanism

B. Datasets Collected

A total of six experiments were run on the testbed and the data was collected. For all the tests, the spindle was run at the same speed (approx. 9100 rpm), while a load of 35 lbs. was applied perpendicular on the nose of the spindle. For the first three tests the degradation process was artificially accelerated by using salt water and alcohol during the test period. For the last three tests, named Dataset 4, Dataset 5, and Dataset 6, the spindle was run until its failure without intervention. A preliminary analysis providing some insights into the datasets was presented in [18].

TABLE I
DESCRIPTION OF SPINDLE DATASETS USED

Dataset No. | Start Date    | Failure Date  | Total life | Size on disk
4           | 7 Sept 2011   | 4 March 2012  | 180 days   | 170 GB
5           | 15 May 2012   | 21 Sept 2012  | 129 days   | 155 GB
6           | 29 April 2013 | 2 August 2013 | 96 days    | 99 GB

To leave out the effects of mid-life interventions in the degradation process and build data driven models only for the natural progression of a spindle's health, we have used in this paper only the last three datasets. For the sake of consistency with dataset names in our collection and in our previous paper [18], the three dataset names used in this paper are Dataset 4, Dataset 5 and Dataset 6, respectively. The details of these datasets have been summarized in Table I. The bearings used for datasets 4 and 6 were manufactured by NSK and can be identified by their code: 7014CSN24TRDULP4Y. Dataset 5 was generated using BARDEN C114HCUL bearings, extracted from the original spindle of a machine tool. The bearings were re-greased with approximately 0.25 ml of grease each and were re-installed in the spindle test-bed.

For each of the three experiments, the spindle was run non-stop until its failure, apart from brief interruptions due to power failures or the Christmas break. Further, the spindle speed was kept the same (approx. 9100 rpm) for all the experiments. Data was collected at regular intervals of a few minutes throughout the experiment's duration. The data was saved in the TDMS format, a National Instruments (NI) file format for saving well-documented measurement data.
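As a rough illustration of how such archived TDMS measurement files can be pulled into an analysis environment, the following Python sketch uses the open-source npTDMS package. The file path, group and channel names are placeholders, since the actual channel naming used on the test-bed is not specified here; this is not the authors' processing code.

# Minimal sketch: loading a monitored-data file saved in NI's TDMS format.
# Assumes the npTDMS package (pip install npTDMS); names below are hypothetical.
from nptdms import TdmsFile

tdms_file = TdmsFile.read("spindle_run.tdms")      # illustrative file path
for group in tdms_file.groups():                   # e.g., one group per acquisition
    for channel in group.channels():               # e.g., acceleration, temperature, current
        data = channel[:]                          # channel samples as a NumPy array
        print(group.name, channel.name, len(data))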

III. DATA EXPLORATION FOR FEATURE DISCOVERY

We examined various monitored quantities, including the input current in amperes, temperatures at various locations around the testbed, and the angular acceleration of the spindle assembly. After preliminary evaluations it was very clear that the angular acceleration contained significant information about the health of the spindle, and the other quantities had very little, if any, correlation with the health of the spindle. This conclusion was arrived at after analyzing all the data streams in the time and frequency domains, extracting a preliminary set of features, and determining their correlations with the time-to-failure of the spindle. For the remaining discussion in this paper we focus solely on the angular acceleration data.

Fig. 3. Data Channels Diagram for the Spindle Testbed
Fig. 4. DFT for Spindle Acceleration During Early Life

Fig. 5. DFT for Spindle Acceleration Just Prior to Failure

A. Primitive Frequency Domain Features

The acceleration sensor records 25,000 samples per second and therefore does account for some very high frequency components in the spindle assembly. There is a 1-minute long sample of data recorded at least once every five minutes during the monitoring period. We take a 0.4 second worth of signature, which contains 10,000 values, and perform its Discrete Fourier Transform.

Figure 4 shows the Discrete Fourier Transform (DFT) of an acceleration signature from the spindle Dataset-4 during its beginning life (9/7/2011). Figure 5 shows the frequency distributions just before its failure (3/3/2012). The x-axis in these plots is the normalized frequency ranging from −π/2 to +π/2. The absolute frequency can be obtained by multiplying the normalized frequency by the sampling rate (25,000 samples / second).

The y-axis represents the amplitude of each frequency component as it exists in the underlying signatures. It should be noted that the range of amplitudes in Figure 5 is almost three times that of the range for the spectrum in Figure 4 (the y-axis scales are different). That is, on average, the energy in the signatures near the end of the bearing's life is about nine times the energy in the same spectrum when the bearing's life started.

The spectrum of positive frequencies in Figure 4 is shown to be divided into three equal-sized windows, marked level 1, level 2, and level 3, and also indicated by black, green, and red blocks. The intuitive idea we are pursuing is that as the bearings face faults and degradations, the energy in some parts of the spectrum is likely to change more than in some other parts. An investigation of a large number of signatures' Fourier transforms was performed to determine the ideal number of distinct regions in the frequency spectrum. We divided the frequency range into two, three, four, five, and six equal sized intervals and then measured the average entropy of the energy distributions of each signature partitioned into distinct intervals. That is, if the energy was equally divided across the number of intervals then the entropy is high and the signature has less discriminatory information. But if the energy is concentrated primarily in only one of the intervals then the entropy is low and the signature has high discriminatory information. The average entropy for a large number of sampled signatures was the smallest when the spectrum is divided into three partitions, thus indicating that the maximum discriminatory information about the energy distribution can be obtained by dividing the spectrum into three equal parts. Comparing the DFT plots in Figures 4 and 5, we can see that near the end of the spindle's life the energy concentrated in the frequency regions corresponding to level-1 increases considerably compared to its early life. It is this type of discriminatory information that we are interested in. The very primitive level features we have decided to use are shown below.

Primitive Features:
Level-1: Energy in the lowest third of the Spectrum
Level-2: Energy in the middle third of the Spectrum
Level-3: Energy in the highest third of the Spectrum
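The primitive features above can be computed with a few lines of NumPy. The sketch below is a minimal illustration under stated assumptions, not the authors' exact code: it takes a 0.4 s acceleration signature (10,000 samples at 25 kHz), computes its DFT, splits the positive-frequency magnitudes into k equal bands, and also returns the entropy of the band-energy distribution that was used above to settle on k = 3. The synthetic test signal is purely illustrative.

import numpy as np

def band_energies(signature, n_bands=3):
    """Energy of the positive-frequency spectrum split into n_bands equal intervals."""
    spectrum = np.fft.rfft(signature)          # one-sided DFT of the 0.4 s signature
    power = np.abs(spectrum) ** 2              # energy per frequency bin
    bands = np.array_split(power, n_bands)     # level-1 (lowest) ... level-n (highest)
    return np.array([b.sum() for b in bands])

def energy_entropy(energies):
    """Entropy of the band-energy distribution; low entropy = energy concentrated in one band."""
    p = energies / energies.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Example with a synthetic signature: 10,000 samples ~ 0.4 s at 25 kHz.
fs = 25_000
t = np.arange(10_000) / fs
signature = np.sin(2 * np.pi * 1200 * t) + 0.1 * np.random.randn(t.size)
e = band_energies(signature, n_bands=3)        # primitive features Level-1, Level-2, Level-3
print(e, energy_entropy(e))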
B. Higher Level Aggregate Features

We computed the primitive energy value features for signatures scattered throughout the life of a spindle and obtained the plot shown in Figure 6. This plot shows two types of behavior during the life of a spindle. There are time periods during which the energy in each of the three frequency zones is low. We call these periods those of Calm operation. There are also time periods during which the energy in one or all of the three spectrum partitions becomes high. We call these periods those of energy bursts. The plot in Figure 6 is somewhat compacted by reducing the actual time-width of some of the calm periods. This compaction is done to show the larger whole-life perspective on the energy values in the spindle's signatures.

Fig. 6. A Compressed Lifetime Historical Record of Energy Features in Dataset-4

A Burst of Energy: For our analysis we use a crisp definition of a burst of energy. When the average of the energy values in any one partition (low, medium, or high frequencies) of the spectrum remains consistently above a selected burst-threshold-energy for longer than a selected minimum burst-threshold-duration, then we consider that a burst of energy has been observed in the signatures. Typically, the minimum energy threshold for a burst is at least a hundred times the energy level of the calm period, and the minimum duration of a burst is at least half an hour. So, the bursts are almost always detectable from the samples that have been taken every few (4-5) minutes. We look for bursts of energy in the low frequency window, in the middle frequency window, and in the high frequency window of the spectrum. Figure 6 shows the observed bursts by rectangular envelopes of appropriate maximum energy (height) and duration (width). Different colors of the rectangles show the frequency window in which the burst was observed.
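The crisp burst definition above translates into a simple scan over the per-sample band energies. The following Python sketch is one possible reading of that definition, not the authors' implementation; the thresholds and the sampling interval are the illustrative values quoted in the text. A burst in a zone is any run of consecutive samples whose energy stays above 100 times the calm-period level for at least 30 minutes.

import numpy as np

def detect_bursts(energy, calm_level, sample_minutes=5.0,
                  factor=100.0, min_duration_min=30.0):
    """Return (start_idx, end_idx, duration_min, peak_energy) for each burst in one zone."""
    above = energy > factor * calm_level            # burst-threshold-energy test
    bursts = []
    i = 0
    while i < len(above):
        if above[i]:
            j = i
            while j < len(above) and above[j]:
                j += 1
            duration = (j - i) * sample_minutes     # run length in minutes
            if duration >= min_duration_min:        # burst-threshold-duration test
                bursts.append((i, j, duration, float(energy[i:j].max())))
            i = j
        else:
            i += 1
    return bursts

# energy: one value per 4-5 minute sample for a single frequency zone (level-1, -2 or -3)
# calm_level: typical energy of the calm periods, e.g. a low percentile of the series
# example: bursts = detect_bursts(level1_energy, np.percentile(level1_energy, 20))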
This plot of energy levels reveals a lot of information about the failure mode of the bearings and the features being manifested by this failure mode. The same phenomenon is observed for each of the three datasets under consideration, and therefore we can conclude that each of the three experiments is reflecting the same failure mode. The main inferences that we can draw from the data plot in Figure 6 are the following:

1) The spindle experiences periodic bursts of energy, interspersed by calm periods of low energy in the signatures. It is hypothesized that, possibly, pieces of bearings break loose and get ground down in the bearing assembly, resulting in the observed bursts of energy.
2) There is a trend towards bursts of higher and higher energy as we move closer to the end of life of the bearings.
3) There is a trend towards increasing frequency of bursts as we move closer to the end of life of the bearings.
4) In the early part of a bearing's life there are bursts of energy seen in the lower third of the frequency spectrum, and as we get closer to the bearing's end of life, we observe an increase in the bursts in the highest third of the frequency spectrum.

These insights from the energy burst analysis of the historical acceleration data help us in identifying periods of disturbance experienced by the spindle. In [18], the counts, intensities and durations of such energy-bursts in different frequency zones were studied for the entire life-time of spindles. Each energy-burst was scored based upon these characteristics. Further, a global score was accumulated to keep building an estimate of the degradation of the spindle at any given time. Prognostic failure warnings were then issued based upon these accumulated score values, and they provided very good, but not so precise, warnings about the remaining time-to-failures.

This analysis of energy-bursts in a spindle's history clearly demonstrates that the number, duration and the energy concentrated in different frequency zones for the energy-bursts are good indicators of the spindle's health and remaining life. Inspired by this insight, the whole historical acceleration signature data was divided into small windows of time. More specifically, features were evaluated for three window sizes, 24 hours, 48 hours and 72 hours, and used to predict the remaining life of a spindle. These windows specify the continuous durations within which the bursts and their properties are computed. For example, we may identify a hundred possibly overlapping windows, each of 72 hours duration, from the entire lifetime data of a spindle. For each of these windows the twelve features listed in Table II were computed. This gives us a feature matrix of a hundred rows (samples) and twelve columns (attributes). Also, for each window, a target value called percentage time-to-failure was calculated as the remaining percentage of life time, starting at the end of each time window. This is the target value that we want to predict based on the twelve energy burst related features. These feature vectors and target values were computed for each of the three datasets (datasets 4, 5, and 6) to get a total of nine feature matrices (FM), three for each window size, for the three datasets. These feature matrices for each window size will be referred to in the format 4-FM (window size), 5-FM (window size) and 6-FM (window size) in the remaining discussion.

TABLE II
HIGH LEVEL FEATURES FROM ACCELERATION DATA

No. | Feature Name
1   | Average energy in low frequency zone
2   | Average energy in med. frequency zone
3   | Average energy in high frequency zone
4   | Max. energy in low frequency zone
5   | Max. energy in med. frequency zone
6   | Max. energy in high frequency zone
7   | No. of bursts in low frequency zone
8   | No. of bursts in med. frequency zone
9   | No. of bursts in high frequency zone
10  | Avg. duration of bursts in low frequency zone
11  | Avg. duration of bursts in med. frequency zone
12  | Avg. duration of bursts in high frequency zone
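To make the construction of Table II concrete, the sketch below shows one plausible way to assemble the twelve aggregate features and the percentage time-to-failure target for a single window, reusing the detect_bursts helper from the earlier sketch. The window size, the sampling interval and the way the calm level is estimated are illustrative assumptions, not the authors' exact choices.

import numpy as np

def window_features(zone_energy, calm_levels, sample_minutes=5.0):
    """zone_energy: dict {'low','med','high'} -> per-sample energies inside one 24-72 h window."""
    feats = []
    for zone in ('low', 'med', 'high'):
        e = np.asarray(zone_energy[zone])
        bursts = detect_bursts(e, calm_levels[zone], sample_minutes)  # from the earlier sketch
        durations = [b[2] for b in bursts]
        feats.append(e.mean())                                        # avg energy in zone
        feats.append(e.max())                                         # max energy in zone
        feats.append(len(bursts))                                     # no. of bursts in zone
        feats.append(np.mean(durations) if durations else 0.0)        # avg burst duration in zone
    # reorder to follow Table II: all averages, then maxima, then counts, then durations
    return np.array(feats).reshape(3, 4).T.flatten()

def pct_time_to_failure(window_end, failure_time, total_life):
    """Target: remaining life at the end of the window, as a percentage of total life."""
    return 100.0 * (failure_time - window_end) / total_life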

C. Regression Models With Higher Level Features

a) Training: The nine feature matrices, 4-FM 24, 5-FM 24, 6-FM 24, 4-FM 48, 5-FM 48, 6-FM 48, 4-FM 72, 5-FM 72 and 6-FM 72 (along with their corresponding actual percentage remaining life vectors), were fed as inputs to a regression training model. A multiple-polynomial regression analysis was performed to model the relationship between the percentage time-to-failure and the twelve aggregated features, individually, for each of the nine feature matrices. Then, significance tests including goodness of fit and the F-test were conducted for each of the nine trained models. For each dataset, the models obtained for different window sizes were compared so as to see the effect of changing the window size on the regression results. These results are described in more detail in section V. These nine regression models were trained on the feature matrices of the individual datasets. However, in order to build a generic regression model, which would transcend different spindle datasets, a combined feature matrix from all the datasets was constructed. This analysis was performed for only one window size, that of 72 hours. More specifically, the individual feature matrices 4-FM 72, 5-FM 72 and 6-FM 72 were first normalized individually so that merging the feature vectors is done in a meaningful way. The normalization was linear and limited each feature value to within the bounds of 0 and 1 within its own dataset. Then, the normalized individual matrices were combined to form a combined feature matrix called Comb-FM 72. A regression model was trained on this combined matrix Comb-FM 72. The significance test results, including goodness of fit and F-test results, for all these cases have been presented in Table III in section V on empirical evaluation of results.
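The paper fits these models with the NUMXL Excel plugin; purely as an illustration of the same workflow in Python, the sketch below fits a polynomial regression of percentage time-to-failure on the twelve window features and applies the per-dataset linear scaling to [0, 1] used before merging matrices into Comb-FM 72. The library choice and the polynomial degree are assumptions, since the paper does not state them, and the variable names in the usage comments are hypothetical.

import numpy as np
from sklearn.preprocessing import PolynomialFeatures, MinMaxScaler
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

def fit_ttf_model(feature_matrix, pct_ttf, degree=2):
    """Polynomial regression of percentage time-to-failure on the 12 aggregate features."""
    model = make_pipeline(PolynomialFeatures(degree, include_bias=False),
                          LinearRegression())
    model.fit(feature_matrix, pct_ttf)
    return model

def combine_matrices(matrices):
    """Scale each dataset's feature matrix to [0, 1] column-wise, then stack (cf. Comb-FM 72)."""
    scaled = [MinMaxScaler().fit_transform(m) for m in matrices]
    return np.vstack(scaled)

# usage sketch:
# model_4_72 = fit_ttf_model(fm_4_72, ttf_4_72)          # individual model, 72 h windows
# comb_fm_72 = combine_matrices([fm_4_72, fm_5_72, fm_6_72])
# comb_model = fit_ttf_model(comb_fm_72, np.concatenate([ttf_4_72, ttf_5_72, ttf_6_72]))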
b) Insights from Regression Results: It can be seen from Table III that the best performance is obtained when we use the window size of 72 hours. The performance successively decreases for window sizes of 48 and 24 hours. This suggests that we need the prior 72 hours of data to make very good predictions about the remaining life of a spindle. This establishes that the monitored data has sufficient information to predict the remaining life of a spindle, even though it is disruptive in real-life manufacturing situations to obtain this data for a full 72 hours from a spindle. All the results in this table are also for the cases when a model trained on a spindle is used to make a prediction for the same spindle.

To design more general prediction models we trained a regression model with the combined Comb-FM 72 data set, as mentioned in the previous paragraph. Results in Table V, third column, show the performance of the combined regression model when tested on the three datasets using the 72 hour windows. This combined model has somewhat acceptable performance for dataset 4 but has very poor performance for datasets 5 and 6 (R2 values of 0.39 and 0.35).

Plots in Figures 9, 10, and 11 show the performance of the different regression models for predicting the time-to-failure (TTF) for feature vectors from different time windows of the monitored data. The points on the blue curve in each plot are the actual TTF values for the feature vectors on the x-axis. The points on the red curve show the predicted TTF values from the model built from the vectors for the same dataset. The points on the green curve show the predicted TTF values from the combined model built by merging data from all the three datasets. It can be seen from these plots that the overall R2 is poor: for some of the feature vectors the predicted TTF values, for all datasets, are very good, while for some other feature vectors the predicted TTF values are far off from the actual values.

Our next attempt is to identify and isolate the good feature vectors and use only them for making the predictions. The other feature vectors may be discarded and not used to make predictions. Our intuition here is that there are a few different types of feature vectors: some work well for predictions and some don't. We perform a clustering operation to see this pattern among all the feature vectors, and hopefully isolate a cluster of feature vectors that performs well. We could do this successfully, and the procedure and results are described in the following section. With these results we discover a cluster of well-performing feature vectors. For real-time predictions we then first collect the raw data and build its feature vector. Only if this feature vector is close to the known cluster of good-quality features do we use it to make predictions. Our results show that we significantly improve the prediction quality, but for about half of the feature vectors we have to decline to make a prediction about the TTF.
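The paper does not spell out the exact closeness test used to accept or decline a new feature vector, so the following sketch is only one plausible realization of the idea: compare the incoming (normalized) feature vector against the centroid of the well-performing cluster using the same correlation measure that later builds the graph, and predict only when the correlation clears a threshold. All names and the threshold are illustrative.

import numpy as np

def predict_or_decline(model, good_cluster_vectors, new_vector, corr_threshold=0.5):
    """Predict TTF only if the new window's feature vector resembles the good cluster."""
    centroid = np.mean(good_cluster_vectors, axis=0)     # prototype of the well-performing cluster
    corr = np.corrcoef(new_vector, centroid)[0, 1]       # Pearson correlation with the prototype
    if corr > corr_threshold:
        return float(model.predict(new_vector.reshape(1, -1))[0])
    return None                                          # decline to predict for this window

# usage: ttf = predict_or_decline(clus_model, clus_fm_72, current_window_features)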
IV. IMPROVING FEATURE VECTOR SELECTION BY CLUSTERING

A study of the calm and disturbed regions in the historical acceleration data, as done by the authors in [18], showed that these regions are spread all over the spindle lifetime. Using this as intuition, it was hypothesized that there should be similarity among the feature vectors corresponding to calm regions from all the datasets. Similarly, the windows corresponding to burst regions across the datasets should also be similar to each other. We used a graph clustering based approach to test our hypothesis. A graph G was constructed as follows. Each row of the feature matrix Comb-FM 72, i.e., each feature vector corresponding to a 72 hour period window (from all three datasets), was considered as a node in the graph. A correlation coefficient value was calculated for each pair of feature vectors. Edges were drawn between the pairs of feature vectors where the correlation coefficient value between them was greater than 0.50 (denoting moderate to high correlation). Further, these edges were labeled using their correlation coefficient values. This edge-weighted graph was next fed as input to a weighted graph clustering algorithm called Markov chain clustering (MCL) [19]. MCL is a widely used density-based graph clustering algorithm. It is based upon finding dense clusters in a graph by the simulation of a flow diffusion process or a random walk in a graph. We define a measure called Cluster Burst Duration (CBD) in order to characterize each cluster as belonging to a bursty region or a calm region. The clustering results are described in detail in section V.
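The experiments in this paper run MCL through the ClusterMaker plugin in Cytoscape. As a rough, self-contained illustration of the same pipeline, the sketch below builds the thresholded correlation graph with NumPy and runs a bare-bones MCL loop (expansion followed by column-wise inflation) on its weighted adjacency matrix. It is a simplified stand-in for the actual tooling, with only the inflation value of 3 taken from the text.

import numpy as np

def correlation_graph(feature_matrix, threshold=0.5):
    """Weighted adjacency: edge weight = correlation between feature vectors, if > threshold."""
    corr = np.corrcoef(feature_matrix)           # rows are feature vectors (graph nodes)
    adj = np.where(corr > threshold, corr, 0.0)
    np.fill_diagonal(adj, 1.0)                   # self-loops, as is usual for MCL
    return adj

def mcl(adj, inflation=3.0, iterations=100):
    """Bare-bones Markov clustering: alternate expansion (matrix squaring) and inflation."""
    m = adj / adj.sum(axis=0, keepdims=True)     # column-stochastic transition matrix
    for _ in range(iterations):
        m = m @ m                                # expansion: flow along longer walks
        m = m ** inflation                       # inflation: strengthen intra-cluster flow
        m = m / m.sum(axis=0, keepdims=True)     # re-normalize columns
    clusters = {}
    for node, attractor in enumerate(m.argmax(axis=0)):   # nodes sharing an attractor row
        clusters.setdefault(int(attractor), []).append(node)
    return list(clusters.values())

# usage: clusters = mcl(correlation_graph(comb_fm_72_normalized), inflation=3.0)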

The intuition behind doing the clustering analysis is to see if we could separate the feature vectors corresponding to the primarily bursty region windows from those corresponding to the primarily calm region windows for all the datasets. The clustering analysis using the MCL algorithm resulted in six clusters. Based upon the CBD values, these clusters were categorized as belonging to either primarily bursty or primarily calm regions. A total of three clusters were labeled as bursty clusters. Out of these three bursty clusters, only two clusters contained feature vectors from all the three datasets. This conveys to us that these types of signatures are common among all three datasets. The cluster that did not contain feature vectors from all the datasets signifies characteristics not shared by all the datasets. Based upon this intuition, the feature vectors corresponding to the two bursty clusters were combined in a matrix called Clus-FM 72. The training of the regression model was repeated using this short-listed feature matrix Clus-FM 72 as input. The significance results show that this trained model has a much better goodness of fit than the previous set of models. So, the conclusion from this is that the feature vectors that are close to the prototypes of the two selected clusters are much better at predicting the remaining life of a spindle.

V. EMPIRICAL EVALUATION OF THREE REGRESSION MODELS

A. Significance Study of Regression Models.

All the regression analysis experiments were performed using an Excel plugin called NUMXL. We conducted significance tests for the constructed models, including goodness of fit and the F-test. Goodness of fit describes how well a model fits the set of observations. Such measures are mostly used to summarize the discrepancy between the observed values and the values expected under the model in question. The F-test is a statistical test that is most commonly used to test the applicability of the regression model for the given problem. A p-value cutoff of 5% was used to reject the null hypothesis.

We have used three standard measures for testing the goodness of fit of the model, namely the Coefficient of Determination R2, the Adjusted R2, and the Sum of Squared Error (SSE) [20]. We briefly introduce these quantities here. Let the predicted or fitted values from the model be ŷj and the actual observations be yj. Let n be the total number of observations. The coefficient of determination R2 measures the proportion of the total variation in the dependent variable y that can be explained by the regression model, and it is defined as

    R^2 = 1 - \frac{SSE}{SST}.

Here, the total sum of squares (SST) measures the total amount of variation in the observed y values and is defined as

    SST = \sum_{j=1}^{n} (y_j - \bar{y})^2.

The sum of squared residuals (SSE) measures the amount of variability that the regression model cannot explain and is defined as

    SSE = \sum_{j=1}^{n} (y_j - \hat{y}_j)^2.

The Adjusted R2 is a modification of R2 that has been adjusted based upon the residual degrees of freedom. It adjusts for the number of explanatory terms in a model relative to the number of data points, since R2 automatically and spuriously increases when extra explanatory variables are added to the model. The smaller the difference between R2 and the adjusted R2, the better is the fit of the model. Mathematically,

    \mathrm{Adjusted}\ R^2 = 1 - \frac{SSE\,(n-1)}{SST\,v},

where v is the residual degrees of freedom in the model.

B. Individual Dataset Models

In order to study the effect of changing the window size for aggregating the features, a regression model was trained for three different window sizes: 24 hours, 48 hours and 72 hours. Such models were trained for each of the three datasets 4, 5 and 6. The regression statistics for these nine trained models have been summarized in Table III. In this table, the models have been named as the dataset followed by the window size in hours (e.g., 4 72). For each of the datasets, similar trends could be seen as the window size was increased from 24 hours to 72 hours. It can be seen that the R2 value increases on increasing the window size from 24 hours to 72 hours and the SSE value reduces. This follows the intuition that the larger the window size, the better it captures aggregate statistics like avg. burst duration and number of bursts for disturbance periods which last for a long time. For all the nine test cases in Table III, it can be seen that the trained models are significant as per the F-test and the null hypothesis can be rejected. This provides good evidence that the polynomial relationship between the aggregated features and the predicted time-to-failure values does exist for the chosen window sizes. The difference between the R2 and the adjusted R2 values for all the cases is quite small, further indicating that the models fit well.

TABLE III
REGRESSION RESULTS USING INDIVIDUAL MODELS FOR DIFFERENT WINDOW SIZES

Model | R square % | Adj. R square % | Std. Error | Obs. | F     | p value
4 72  | 87.8       | 75.0            | 0.15       | 48   | 6.87  | 0.0
4 48  | 77.4       | 65.5            | 0.18       | 71   | 6.55  | 0.0
4 24  | 57.5       | 48.6            | 0.22       | 139  | 6.44  | 0.0
5 72  | 87.7       | 69.2            | 0.16       | 41   | 4.75  | 0.1
5 48  | 71.6       | 53.1            | 0.2        | 62   | 3.88  | 0.0
5 24  | 55.0       | 43.8            | 0.22       | 121  | 4.89  | 0.0
6 72  | 99.0       | 92.9            | 0.08       | 29   | 16.35 | 0.7
6 48  | 92.1       | 82.2            | 0.13       | 44   | 9.28  | 0.0
6 24  | 83.4       | 77.0            | 0.14       | 88   | 13.16 | 0.0
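The goodness-of-fit quantities reported in Table III follow directly from the definitions given above. The snippet below is a small, generic helper (not tied to the NUMXL output) that computes SSE, SST, R2 and the adjusted R2 from a vector of observations and a vector of fitted values, with the residual degrees of freedom taken as n minus the number of explanatory terms minus one.

import numpy as np

def goodness_of_fit(y, y_hat, n_params):
    """SSE, SST, R^2 and adjusted R^2 as defined in Section V-A."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    n = y.size
    sse = np.sum((y - y_hat) ** 2)          # unexplained variation
    sst = np.sum((y - y.mean()) ** 2)       # total variation
    r2 = 1.0 - sse / sst
    v = n - n_params - 1                    # residual degrees of freedom
    adj_r2 = 1.0 - (sse * (n - 1)) / (sst * v)
    return {"SSE": sse, "SST": sst, "R2": r2, "adjusted_R2": adj_r2}

# usage: stats = goodness_of_fit(ttf_actual, model.predict(features), n_params=features.shape[1])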
C. Feature Vector Clustering Results.

Feature vectors in the combined feature matrix Comb-FM 72 (for window size 72 hours and all datasets) were clustered together to see if we could separate the feature vectors corresponding to the bursty region windows from those corresponding to the calm region windows for all the datasets. As described above, a graph clustering algorithm called MCL was used for this purpose. The results for MCL clustering were obtained by using its implementation in the ClusterMaker plugin [21], available in a graph analysis tool called Cytoscape [22]. MCL has a single parameter called inflation, which can be used to tune the granularity of the clusters. It was set to 3 in our analysis. Out of a total of 119 feature vectors, MCL assigned 110 feature vectors into clusters of size >= 3, resulting in a total of six clusters. In order to characterize each cluster as belonging to the bursty region or the calm region, we define a measure called Cluster Burst Duration (CBD). Before describing CBD, we define what we mean by the Total Burst Duration, or TBD, of a feature vector.

Total Burst Duration (TBD): For a feature vector, let the total burst duration in each frequency zone be b1, b2 and b3. These quantities are computed for each feature vector by multiplying its avg. burst duration by the number of bursts in a frequency zone. Depending upon the dataset a feature vector belonged to, these burst durations were normalized according to the maximum burst durations that occurred in the different zones in that complete dataset. Let the normalized burst durations in the three frequency zones be nb1, nb2 and nb3. Based upon these values, the measure TBD for a feature vector was defined as follows:

    TBD = nb1 + nb2 + nb3.

The above defined value of TBD was calculated for each feature vector belonging to a cluster. A high value of TBD for a feature vector implies a large amount of burst activity taking place within the window corresponding to that feature vector. Based upon these TBD values, the quantity CBD is defined as follows.

Cluster Burst Duration (CBD): The median of the TBD values for all the feature vectors belonging to a cluster is computed to be the Cluster Burst Duration, or CBD, for that cluster. A non-zero CBD value for a cluster implies that more than half of the time windows belonging to that cluster contain a large amount of energy burst activity within them. A CBD value of zero implies that more than half of the time windows in that cluster had no energy burst activity. Therefore, this value of CBD can be used as a measure to characterize a cluster as containing primarily burst windows or primarily calm windows.
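Read literally, TBD and CBD reduce to a few NumPy operations on the burst-count and burst-duration columns of a feature matrix. The sketch below is an illustrative implementation of the two definitions above; the column indices follow the Table II ordering, and the per-dataset normalization uses the per-zone maxima of the matrix it is given (so it should be applied to one dataset's matrix at a time), as described in the text.

import numpy as np

def total_burst_duration(feature_matrix):
    """TBD per feature vector: normalized total burst duration summed over the three zones."""
    counts = feature_matrix[:, 6:9]        # Table II features 7-9: no. of bursts per zone
    avg_dur = feature_matrix[:, 9:12]      # Table II features 10-12: avg. burst duration per zone
    total = counts * avg_dur               # b1, b2, b3 for every window
    zone_max = total.max(axis=0)           # per-dataset maxima used for normalization
    zone_max[zone_max == 0] = 1.0          # avoid division by zero for burst-free zones
    nb = total / zone_max                  # nb1, nb2, nb3
    return nb.sum(axis=1)                  # TBD = nb1 + nb2 + nb3

def cluster_burst_duration(tbd_values, clusters):
    """CBD per cluster: median TBD of the cluster's members (non-zero => primarily bursty)."""
    return {c: float(np.median(tbd_values[list(members)]))
            for c, members in enumerate(clusters)}

# usage: tbd = total_burst_duration(fm_4_72); cbd = cluster_burst_duration(tbd, clusters)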
A summary of the CBD values obtained for all the six clusters returned by the MCL algorithm is shown in Table IV. Based upon the CBD values for each cluster, it can be said that the top three clusters can be characterized as belonging to the bursty time windows, whereas the last three clusters can be characterized as belonging to the calm time windows. This table also contains, for each cluster, the number of feature vectors coming from each of the three datasets. Focusing attention only on the bursty clusters (1, 2 and 3), it can be seen that clusters 1 and 3 contain feature vectors belonging to each of the three datasets, whereas cluster 2 has feature vectors only belonging to datasets 4 and 6. Based upon this observation, the attention was further narrowed down to study clusters 1 and 3 in more detail. This decision was made because it would help in the identification of the characteristics of the bursts that occur in all the three datasets. This, in turn, helps in designing a more generic prognostic framework.

TABLE IV
CLUSTERS OBTAINED USING MCL CLUSTERING ON ALL THE FEATURE VECTORS IN C-FM 72

Cluster # | # Nodes | #4 | #5 | #6 | CBD
1         | 3       | 1  | 1  | 1  | 300
2         | 10      | 8  | 0  | 3  | 53.4
3         | 47      | 6  | 23 | 8  | 11.83
4         | 31      | 20 | 1  | 10 | 0
5         | 11      | 6  | 0  | 5  | 0
6         | 8       | 0  | 6  | 2  | 0

The feature vectors belonging to clusters 1 and 3 have been presented in Figures 7 and 8. In these figures, the feature vectors belonging to different datasets have been labeled and colored differently. The width of each node has been drawn proportionately to the TBD value for that feature vector. Further, each node label also includes the actual percentage time-to-failure value for the window to which the node belongs. For example, Figure 7 shows the composition of cluster 1. It contains a feature vector from dataset 5 with 0.0% time-to-failure, a feature vector from dataset 4 with a time-to-failure of 26.8%, and a feature vector from dataset 6 with a time-to-failure of 23.5%. These feature vectors in cluster 1 are strongly correlated with each other.

On further evaluation of cluster 1, it is found that the three feature vectors (5 0.0, 4 26.8 and 6 23.5) in this cluster corresponded to the peak values of TBD for their respective datasets. From this, it can be concluded that the peaks of energy bursts belonging to different datasets do share some similar features. Further, it also strengthens our hypothesis that the choice of aggregated features used in the regression model does help in grouping similar feature vectors together. This also reveals the fact that, in the lifetime of a spindle, the peak burst does not necessarily occur towards the very end of the spindle lifetime. Even if a peak burst is followed by relatively calmer regions, as was the case for datasets 4 and 6, the spindle may not have much remaining life left after such a peak burst occurs.

D. Regression Model Using Combined Feature Vectors

In the preceding discussion it was observed that, out of all the window sizes, the regression models trained for the window size of 72 hours were the most significant for all the three datasets. We denote this class of regression models trained on individual datasets as indv Model. However, in order to make the regression model more generic, it was decided to retrain the regression model using the combined feature matrix Comb-FM 72. Let this regression model be called comb Model. Further, the feature vectors from clusters 1 and 3 of the clustering output were combined together in a feature matrix called Clus-FM 72. The regression model was retrained using this feature matrix Clus-FM 72 as input. Let this regression model be called clus Model.

Fig. 7. Cluster #1 obtained after clustering of C-FM 72.

Fig. 8. Cluster #3 obtained after clustering of C-FM 72.

Fig. 9. Actual and Predicted TTF values for dataset 4 for window size of 72 hours.

Fig. 10. Actual and Predicted TTF values for dataset 5 for window size of 72 hours.

R2 values were computed for each dataset to evaluate the predictions made using comb Model and clus Model. As discussed earlier, Table III lists the values of R2 obtained for all datasets with their respective indv Model. In Table V, we compare the R2 values obtained for each dataset with their indv Model against the values obtained using comb Model and clus Model. Further, for each of the three datasets, plots comparing the actual value of time-to-failure (TTF) and the corresponding predicted values using all the three trained regression models are shown in Figures 9, 10 and 11.

As is evident from these plots, the regression models trained on the individual feature matrices 4-FM 72, 5-FM 72 and 6-FM 72 obtain the highest values of R2. When the regression is performed on the combined matrix Comb-FM 72, the R2 value goes down for each of the datasets. This was expected, as the matrix Comb-FM 72 contains feature vectors from all the three datasets. The MCL clustering of the feature vectors belonging to Comb-FM 72 helped in identifying the feature vectors corresponding to bursts from all the datasets which had high similarity. When the regression model is retrained using only the feature matrix corresponding to the burst clusters (Clus-FM 72), the R2 value for the predictions improves considerably. This improvement is significant for datasets 4 and 6, whereas not that much for dataset 5. Also, for each of the datasets, it can be seen that the clus Model predicts the peak bursts with very good accuracy.

TABLE V
R2 VALUES FOR THE REGULAR, COMBINED, AND CLUSTERED REGRESSION MODELS FOR WINDOW SIZE 72

Dataset # | indv TTF R2 % | comb TTF R2 % | clus TTF R2 %
4         | 87.8          | 75.0          | 96.8
5         | 87.7          | 39.0          | 52.0
6         | 99.0          | 35.0          | 90.2

VI. SIGNIFICANCE AND IMPACT

In the manufacturing industry, the spindle is a key component of the machine tool. Any breakdown occurring in the spindle leads to a failure of the machine's function until a replacement is installed. Therefore, any attempt to predict its time-to-failure even a few weeks before its actual failure can be of great help.

Several model-based and data-driven solutions have been proposed in the literature for spindle failure prediction. However, challenges still exist related to the complexity of the models and the availability of data. Further, in order for such a unit to be deployed in the actual industry, minimizing the total amount of data that needs to be stored for predictive analytics can lead to huge cost and effort savings.

Fig. 11. Actual and Predicted TTF values for dataset 6 for window size of 72 hours.

VII. CONCLUSION

In this paper, we have presented a purely data-driven prognostic methodology to predict a spindle's time-to-failure. Three very large volume spindle vibration acceleration datasets were collected at a test bed set up at Techsolve. A detailed frequency domain analysis of the monitored data revealed that numerous bursts of energy occurred over the entire life of the spindle. The number, duration and the energy concentration in these bursts were found to be strong indicators of the health of the spindle. This motivated us to train regression models on aggregated data for short windows of time. More specifically, polynomial regression models were trained on a set of 12 features (as shown in Table II) aggregated for small 24-72 hour windows of time. A graph based clustering of the feature vectors was further used to improve the accuracy of the predictions. Once the model is trained, it can be used to make prognostic decisions using monitored data for only small 24-72 hour windows of time. This can have a huge impact on the machine maintenance and data storage costs incurred in manufacturing plants. Further, the regression and clustering based prediction methodology presented in this paper can be seen as a proof of concept. By using data from more spindle test bed experiments, our proposed methodology could be used to build a generic prediction framework for predicting the time-to-failure of any given spindle.

REFERENCES

[1] J. Lee, J. Ni, D. Djurdjanovic, H. Qiu, and H. Liao, "Intelligent prognostics tools and e-maintenance," Computers in Industry, vol. 57, no. 6, pp. 476-489, 2006.
[2] D. Djurdjanovic, J. Lee, and J. Ni, "Watchdog agent - an infotronics-based prognostics approach for product performance degradation assessment and prediction," Advanced Engineering Informatics, vol. 17, no. 3, pp. 109-125, 2003.
[3] K. Medjaher, D. A. Tobon-Mejia, and N. Zerhouni, "Remaining useful life estimation of critical components with application to bearings," IEEE Transactions on Reliability, vol. 61, no. 2, pp. 292-302, 2012.
[4] S. Mohanty, A. Chattopadhyay, P. Peralta, S. Das, and C. Willhauck, Fatigue Life Prediction Using Multivariate Gaussian Process. Defense Technical Information Center, 2008.
[5] D. Liu, Y. Luo, Y. Peng, X. Peng, and M. Pecht, "Lithium-ion battery remaining useful life estimation based on nonlinear AR model combined with degradation feature," in Annual Conference of the Prognostics and Health Management Society, vol. 3, 2012, pp. 1803-1836.
[6] D. Liu, Y. Luo, J. Liu, Y. Peng, L. Guo, and M. Pecht, "Lithium-ion battery remaining useful life estimation based on fusion nonlinear degradation AR model and RPF algorithm," Neural Computing and Applications, vol. 25, no. 3-4, pp. 557-572, 2014.
[7] J. Sikorska, M. Hodkiewicz, and L. Ma, "Prognostic modelling options for remaining useful life estimation by industry," Mechanical Systems and Signal Processing, vol. 25, no. 5, pp. 1803-1836, 2011.
[8] A. Moosavian, H. Ahmadi, A. Tabatabaeefar, and M. Khazaee, "Comparison of two classifiers; k-nearest neighbor and artificial neural network, for fault diagnosis on a main engine journal-bearing," Shock and Vibration, vol. 20, no. 2, pp. 263-272, 2013.
[9] J. A. Harter, "Comparison of contemporary FCG life prediction tools," International Journal of Fatigue, vol. 21, pp. S181-S185, 1999.
[10] S. Marble and B. P. Morton, "Predicting the remaining life of propulsion system bearings," in Aerospace Conference, 2006 IEEE. IEEE, 2006, pp. 8-pp.
[11] J. H. Williams, A. Davies, and P. R. Drake, Condition-based Maintenance and Machine Diagnostics. Springer Science & Business Media, 1994.
[12] J. Yan, C. Guo, and X. Wang, "A dynamic multi-scale Markov model based methodology for remaining life prediction," Mechanical Systems and Signal Processing, vol. 25, no. 4, pp. 1364-1376, 2011.
[13] H. Ocak, K. A. Loparo, and F. M. Discenzo, "Online tracking of bearing wear using wavelet packet decomposition and probabilistic modeling: A method for bearing prognostics," Journal of Sound and Vibration, vol. 302, no. 4, pp. 951-961, 2007.
[14] N. Z. Gebraeel, M. Lawley et al., "A neural network degradation model for computing and updating residual life distributions," IEEE Transactions on Automation Science and Engineering, vol. 5, no. 1, pp. 154-163, 2008.
[15] R. Huang, L. Xi, X. Li, C. R. Liu, H. Qiu, and J. Lee, "Residual life predictions for ball bearings based on self-organizing map and back propagation neural network methods," Mechanical Systems and Signal Processing, vol. 21, no. 1, pp. 193-207, 2007.
[16] S. Dong, S. Yin, B. Tang, L. Chen, and T. Luo, "Bearing degradation process prediction based on the support vector machine and Markov model," Shock and Vibration, vol. 2014, p. 15, 2014.
[17] L. Liao and R. Pavel, "Machinery time to failure prediction - case study and lesson learned for a spindle bearing application," in Prognostics and Health Management (PHM), 2013 IEEE Conference on. IEEE, 2013, pp. 1-11.
[18] D. Sardana, R. Bhatnagar, R. Pavel, and J. Iverson, "Investigations on spindle bearings health prognostics using a data mining approach," in Proceedings of the Society for Machinery Failure Prevention Technology (MFPT), 2014.
[19] S. Van Dongen, "Graph clustering via a discrete uncoupling process," SIAM Journal on Matrix Analysis and Applications, vol. 30, no. 1, pp. 121-141, 2008.
[20] J. H. Zar et al., Biostatistical Analysis. Pearson Education India, 1999.
[21] J. H. Morris, L. Apeltsin, A. M. Newman, J. Baumbach, T. Wittkop, G. Su, G. D. Bader, and T. E. Ferrin, "clusterMaker: a multi-algorithm clustering plugin for Cytoscape," BMC Bioinformatics, vol. 12, no. 1, p. 436, 2011.
[22] P. Shannon, A. Markiel, O. Ozier, N. S. Baliga, J. T. Wang, D. Ramage, N. Amin, B. Schwikowski, and T. Ideker, "Cytoscape: a software environment for integrated models of biomolecular interaction networks," Genome Research, vol. 13, no. 11, pp. 2498-2504, 2003.

