15th International Congress on Sound and Vibration (ICSV15)
6-10 July 2008, Daejeon, Korea

LEAST SQUARES SUPPORT VECTOR MACHINE BASED CONDITION PREDICTION FOR BEARING HEALTH

Fagang Zhao¹, Jin Chen¹ and Lei Guo¹

¹State Key Laboratory of Mechanical System and Vibration, Shanghai Jiao Tong University, Shanghai 200240, China

fagang@sjtu.edu.cn

Abstract

Because condition-based maintenance depends on anticipating failures, predicting the future condition of a machine is essential to avoid unexpected breakdowns. This paper therefore presents a new scheme for predicting the health condition of ball bearings based on the least squares support vector machine (LS-SVM). Both a simulation and a practical application have been carried out to validate the method. In the practical application, vibration data collected from operating equipment are used to predict the future condition.

1. INTRODUCTION

The manufacturing and industrial sectors are increasingly required to produce more products of higher quality while avoiding accidents as far as possible. As manufacturing equipment becomes more complex and sophisticated, machine breakdowns are common, yet failure conditions are difficult to identify and localize in a timely manner. Scheduled maintenance practices tend to reduce machine lifetime and increase down-time, resulting in a loss of productivity. To prevent unexpected failures and shutdowns, and to reduce the associated economic loss, abnormal conditions should be detected as early as possible. Condition monitoring and trend prediction are therefore important for condition-based maintenance [1-3]: features extracted from the raw data are used to determine the machine condition and to predict its trend. Trend prediction and residual life prediction are meaningful inputs to maintenance decisions.

Prognostics involves three steps. First, a defect or abnormality should be detected at an early stage, ideally with an indication of which part causes the fault. Second, the part or machine should be monitored continuously, so that trend data are available for predicting the future machine state. Third, a prediction must be generated estimating the trend or the remaining useful life (RUL). Of these three steps, the third is the most difficult.

Many indicators can detect equipment faults, and these indicators can also be used to track the trend and predict the future condition. Selecting useful indicators as prediction parameters, however, is difficult. Candidates include time-domain statistical indicators such as the peak-to-peak (P-P) value, root mean square (RMS), crest factor, skewness and kurtosis, as well as wavelet indices, energy factors and others; with so many indicators available, not every one is suitable for predicting condition or residual life. Following [4, 5], we choose RMS as the indicator in this research. Selecting a proper model for residual life prediction is likewise difficult. Some researchers have constructed prediction models based on crack propagation models, namely the Paris law [6, 7]. Ref. [8] uses a neural network to predict bearing life and compares the prediction with the actual life. Wang et al. [9] compared the results of applying recurrent neural networks and neuro-fuzzy inference systems to predict the fault damage propagation trend. Yan et al. [10] employed a logistic regression model to calculate the probability of failure


for given condition variables and an ARMA time series model to trend the condition variables

for failure prediction. Wang and Vachtsevanos [11] applied dynamic wavelet neural networks

to predict the fault propagation process and estimate the RUL as the time left before the fault

reaches a given value. Yam et al. [12] applied a recurrent neural network for predicting the

machine condition trend. Wang and Lee proposed a wavelet-neural-network prediction algorithm for performance evaluation, and used it to evaluate and predict the wear condition of machine spindles and cutting tools [13]. In recent years, because industry urgently needs condition prediction and residual life prediction, research in the field of fault diagnosis has shifted towards condition monitoring and prediction. Researchers now focus on predicting the future condition of machines intelligently and accurately, so as to reduce the frequency of sudden accidents.

In this paper, we propose a new scheme for predicting the health condition of a ball bearing based on the least squares support vector machine (LS-SVM). This scheme can effectively cover a piece of equipment's whole life cycle, from the time it first comes into use to its final failure. To validate the model, we carry out an experiment to test the new method. Fig. 1 shows the overall flow chart of this research.

Figure 1. The overall flow diagram of this research

2. THEORETICAL BACKGROUND OF LS-SVM [14]

Vapnik proposed the support vector machine (SVM) method based on statistical learning theory [15]. The traditional SVM obtains its solution by optimizing a quadratic function. In this optimization, the dimension of the matrix is directly related to the number of training samples; using inner products is feasible for medium-scale problems, but for large-scale problems the matrix must be decomposed or trimmed to reduce the complexity. Much research has addressed the large-scale case, but these approaches still rely on quadratic programming with inequality constraints, which is time-consuming and cannot process real-time data; the SVM is therefore usually restricted to off-line data, which limits its application. Suykens [14] introduced a squared-error term into the SVM objective function and changed the constraints from inequalities to equalities, obtaining an SVM based on equality constraints, called the Least Squares Support Vector Machine (LS-SVM). With the squared-error term and the equality constraints, the solution changes from a quadratic program to a set of linear equations, which greatly simplifies the computation.


The LS-SVM algorithm is as follows. Suppose the training set

$D = \{(x_k, y_k) \mid k = 1, 2, \ldots, N\}, \quad x_k \in \mathbb{R}^n, \; y_k \in \mathbb{R}^m$

where $x_k$ is the input data and $y_k$ is the output data. In the primal space ($w$ space), the optimization problem can be described as:

$\min_{w, B, \varepsilon} \; L_{LS}(w, B, \varepsilon) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{i=1}^{M} \varepsilon_i^2$   (1)

Subject to the equality constraints:

$y_i = w^T \varphi(x_i) + B + \varepsilon_i, \quad i = 1, \ldots, M$   (2)

where the nonlinear mapping $\varphi: \mathbb{R}^n \to \mathbb{R}^m$ maps the input data into a high-dimensional feature space, which can be infinite-dimensional. In this feature space the separating hyperplane is defined by $w \in \mathbb{R}^n$ and $B \in \mathbb{R}$. Here $w$ is the weight vector in the high-dimensional feature space, $B$ is the bias term, $\varepsilon_i$ is the error variable, and $\gamma$ is the regularization factor. Eqn. (1) is the formulation of the least squares support vector machine, which has been investigated by Saunders et al. [16] and Suykens & Vandewalle [17].

According to the optimization problem of Eqn. (1), we can define the Lagrange function:

$L_{LS}(w, B, \varepsilon; \alpha) = J_{LS}(w, B, \varepsilon) - \sum_{i=1}^{M} \alpha_i \left\{ w^T \varphi(x_i) + B + \varepsilon_i - y_i \right\}$   (3)

where $\alpha_i$ denotes the Lagrange multipliers, and the KKT optimality conditions are

$\dfrac{\partial L_{LS}}{\partial w} = 0 \;\rightarrow\; w = \sum_{i=1}^{M} \alpha_i \varphi(x_i)$

$\dfrac{\partial L_{LS}}{\partial B} = 0 \;\rightarrow\; \sum_{i=1}^{M} \alpha_i = 0$

$\dfrac{\partial L_{LS}}{\partial \varepsilon_i} = 0 \;\rightarrow\; \alpha_i = \gamma \varepsilon_i, \quad i = 1, \ldots, M$

$\dfrac{\partial L_{LS}}{\partial \alpha_i} = 0 \;\rightarrow\; w^T \varphi(x_i) + B + \varepsilon_i - y_i = 0, \quad i = 1, \ldots, M$   (4)

After eliminating $w$ and $\varepsilon_i$, we obtain the following set of linear equations:

$\begin{bmatrix} 0 & 1^T \\ 1 & K + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} B \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}$   (5)

where $x = [x_1, \ldots, x_M]$, $y = [y_1; \ldots; y_M]$, $1 = [1; \ldots; 1]$, $\alpha = [\alpha_1; \ldots; \alpha_M]$, and $i, j = 1, \ldots, M$. As in SVM theory, according to Mercer's condition, the matrix $K$ can be written as

$K_{ij} = K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j)$   (6)

Then the function estimate of the LS-SVM is

$y(x) = \sum_{i=1}^{M} \alpha_i K(x, x_i) + b$   (7)

where $\alpha_i$ and $b$ can be computed from Eqn. (5). For RBF kernels one can take [15]

$K_{ij} = \exp\left( -\eta \, \| x_i - x_j \|^2 \right)$   (8)
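Equations (5)-(8) reduce LS-SVM training to a single linear solve. The following is a minimal NumPy sketch of that procedure (the function names and test data are illustrative, not from the paper):

```python
import numpy as np

def rbf_kernel(X1, X2, eta):
    # K_ij = exp(-eta * ||x_i - x_j||^2), as in Eqn. (8)
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-eta * d2)

def lssvm_fit(X, y, gamma, eta):
    # Solve the linear system of Eqn. (5):
    # [[0, 1^T], [1, K + I/gamma]] [B; alpha] = [0; y]
    M = len(y)
    A = np.zeros((M + 1, M + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, eta) + np.eye(M) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]          # bias B, multipliers alpha

def lssvm_predict(Xtest, Xtrain, B, alpha, eta):
    # y(x) = sum_i alpha_i K(x, x_i) + B, as in Eqn. (7)
    return rbf_kernel(Xtest, Xtrain, eta) @ alpha + B
```

Because the system is linear, training cost is dominated by one dense solve, which is the computational simplification the text attributes to the LS-SVM.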

We can see from the above that all the constraints have become equalities, so the solution is obtained from a set of linear equations. Linear equations can be solved by least squares, which makes the computation easy and reduces computation time; the LS-SVM therefore has strong adaptability.

Furthermore, we choose the normalized root mean square error (NRMSE) as the index for deciding whether a prediction result is good. The expression is:

$\mathrm{NRMSE} = \frac{1}{S_{obs}} \sqrt{ \frac{1}{N-1} \sum_{i=1}^{N} \left( O_{i,pre} - O_{i,obs} \right)^2 }$   (9)

where $N$ is the number of prediction data points; $S_{obs}$ is the standard deviation of the samples; $O_{i,pre}$ is the predicted value; and $O_{i,obs}$ is the true value at time $i$.
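Eqn. (9) translates directly into code. A minimal sketch (NumPy assumed; the sample-standard-deviation divisor $N-1$ follows the equation above):

```python
import numpy as np

def nrmse(pred, obs):
    # Eqn. (9): RMS prediction error (1/(N-1) divisor) normalised by
    # the sample standard deviation of the observed values.
    pred = np.asarray(pred, float)
    obs = np.asarray(obs, float)
    N = len(obs)
    rms = np.sqrt(((pred - obs) ** 2).sum() / (N - 1))
    return rms / obs.std(ddof=1)
```

A perfect prediction gives NRMSE = 0, and a prediction no better than the mean of the observations gives a value near 1, which is what makes the index scale-free.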

3. METHOD

In this research, we propose an LS-SVM method for predicting machine condition. The flow chart of the proposed method is given in Fig. 2.

Figure 2. Flowchart of the prediction method with LS-SVM

Several kernel functions are available for the LS-SVM, such as the radial basis function (RBF) kernel, the linear kernel, the polynomial kernel and wavelet kernels. The RBF-based LS-SVM adapts well to vibration signals: its robustness is better than that of LS-SVMs based on the other kernels, its prediction accuracy is better than that of the traditional SVM and of neural networks, and its computation time is small. In this paper, therefore, the RBF kernel is used, which is defined as:

$K(x, y) = \exp\left( -\frac{\| x - y \|^2}{2 \sigma^2} \right)$   (10)

To obtain a more precise result, we utilize the leave-one-out cross-validation approach. The kernel width and the regularization parameter must be chosen when the RBF kernel is used. In this paper, we determine these parameters based on the cross-validation idea: a training set and a validation set are defined from the observed time series. The prediction error is estimated via cross-validation, and $\sigma$ is chosen as the value for which the model gives the lowest estimated error. It can be shown that for large data sets, cross-validation is asymptotically equivalent to analytical model selection; in that case, however, the computational cost of cross-validation, in terms of computation and training time, is high.
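The parameter-selection step above can be sketched as a grid search over the kernel width $\sigma$ and regularization factor $\gamma$ on a held-out validation set (a sketch under assumptions: a single hold-out split stands in for full leave-one-out, and all names are hypothetical):

```python
import numpy as np

def fit_predict(Xtr, ytr, Xva, sigma, gamma):
    # LS-SVM with the RBF kernel of Eqn. (10): K = exp(-||x-y||^2 / (2 sigma^2))
    def K(P, Q):
        d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    M = len(ytr)
    A = np.block([[np.zeros((1, 1)), np.ones((1, M))],
                  [np.ones((M, 1)), K(Xtr, Xtr) + np.eye(M) / gamma]])
    sol = np.linalg.solve(A, np.concatenate(([0.0], ytr)))
    bias, alpha = sol[0], sol[1:]
    return K(Xva, Xtr) @ alpha + bias

def select_params(Xtr, ytr, Xva, yva, sigmas, gammas):
    # Keep the (sigma, gamma) pair with the lowest validation error.
    best = None
    for s in sigmas:
        for g in gammas:
            err = np.mean((fit_predict(Xtr, ytr, Xva, s, g) - yva) ** 2)
            if best is None or err < best[0]:
                best = (err, s, g)
    return best  # (validation MSE, sigma, gamma)
```

Each grid point costs one dense linear solve, which illustrates why the text notes that cross-validation is expensive for large data sets.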


4. SIMULATION

In this section we present the simulation results and compare them with the traditional LS-SVM method. In time series prediction research, the sunspot series and the Mackey-Glass series are often used to test algorithms; here we use the sunspot series, extracted from the Matlab toolbox, a sample of size m = 280. The first 200 values of the sunspot data are used to train the model and the remaining values are used for prediction. The NRMSE yielded by the LS-SVM is 1.758. For the (normalized) sunspot dataset, the LS-SVM model provides a good result in comparison with the traditional LS-SVM. To illustrate the performance of the LS-SVM, the predicted time series is shown in Fig. 3.

Figure 3. The predicted result of the sunspot series (predicted serial vs. real serial)
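The paper does not state how the scalar sunspot series is turned into input-output pairs for the LS-SVM; a common choice, assumed here, is a lagged embedding in which the previous p values predict the next one:

```python
import numpy as np

def make_lagged(series, p):
    # Each row is [s[t-p], ..., s[t-1]], and the target is s[t].
    X = np.array([series[t - p:t] for t in range(p, len(series))])
    y = np.asarray(series[p:], float)
    return X, y
```

With such an embedding, the first 200 sunspot values yield the training pairs, and one-step-ahead predictions over the remaining values produce the curve of Fig. 3.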

5. EXPERIMENT

An experiment on condition monitoring was set up to validate the model. Fig. 4 shows the position where the sensor is installed.

Figure 4. Photo of the equipment with sensors installed

Through data acquisition, preprocessing, feature extraction and feature reduction, training samples and test samples are obtained. The LS-SVM is then trained with the training samples, which form a time series. Finally, the model is employed to predict the future condition, and the result is compared with that of the traditional LS-SVM. Fig. 5 is a sketch of the data acquisition system, which includes sensors with a signal conditioner, an anti-aliasing filter, a data acquisition computer, an oscilloscope and a dynamic analyzer. Signals are probed by the sensors; after signal conditioning and anti-aliasing filtering, the information is collected by the computer. The oscilloscope and the on-line monitoring system are employed to check the validity of the signals. Fig. 6 shows the predicted experimental series (normalized) obtained with the LS-SVM.

Figure 5. Sketch of the data acquisition system

Figure 6. The predicted result of the experiment (RMS value; predicted serial vs. real serial)

6. CONCLUSIONS AND FUTURE WORK

Given the nonlinearity of bearing vibration, this paper introduces the LS-SVM model into time series prediction of vibration to predict bearing condition.

To provide a more reliable, real-time prognostic tool for bearing condition, we developed an LS-SVM prediction approach for the behaviour of dynamic systems. The examples given above show that it is useful for both bearing residual life prediction and condition prediction. The test results of this study showed that the LS-SVM model is a reliable forecasting tool: it can capture the system's dynamic behaviour quickly and track the system's features accurately. It is also robust, in that it can accommodate different operating conditions and variations in the system's dynamic characteristics.

Two aspects remain for further research: one is to implement the predictor in other complex industrial facilities and to develop new strategies for multi-step prediction; the other is to determine whether a better method for predicting the time series exists.

7. ACKNOWLEDGEMENTS

The research was supported by the National Natural Science Foundation of China (Grant No. 50675140) and the National High Technology Research and Development Program of China (863 Program, No. 2006AA04Z175).

REFERENCES

[1]

Vichare, N. and Pecht, M., "Prognostics and Health Management of Electronics", IEEE Transactions on Components and Packaging Technologies 29, 222-229 (2006).

[2]

Wang, W., “A two-stage prognosis model in condition based maintenance”. European Journal of

Operational Research 182(3), 1177-1187 (2007).

[3]

W. Wang, "A model to predict the residual life of rolling element bearings given monitored condition information to date", IMA Journal of Management Mathematics 13, 3-16 (2002).


[4]

T. Williams, X. Ribadeneira, S. Billington, T. Kurfess, "Rolling element bearing diagnostics in run-to-failure lifetime testing", Mechanical Systems and Signal Processing 15, 979-993 (2001).

[5]

Runqing Huang, Lifeng Xi. “Residual life predictions for ball bearings based on self-organizing

map and back propagation neural network methods”, Mechanical Systems and Signal Processing

21, 193–207 (2007).

[6]

Yawei Li, "Dynamic prognostics of rolling element bearing condition", Ph.D. dissertation, Georgia Institute of Technology, 1999.

[7]

Tara Reeves Lindsay, "Applying adaptive prognostics to rolling element bearings", Master's dissertation, Georgia Institute of Technology, 2005.

[8]

N. Gebraeel, M. Lawley, R. Liu, V. Parmeshwaran, “Residual life predictions from

vibration-based degradation signals: A neural network approach”, IEEE Transactions on

Industrial Electronics 51, 694–700 (2004).

[9]

W.Q. Wang, M.F. Golnaraghi, F. Ismail, “Prognosis of machine health condition using

neuro-fuzzy systems”, Mechanical Systems and Signal Processing 18, 813–831 (2004).

[10]

J. Yan, M. Koc, J. Lee, "A prognostic algorithm for machine performance assessment and its application", Production Planning and Control 15, 796-801 (2004).

[11]

P. Wang, G. Vachtsevanos, “Fault prognostics using dynamic wavelet neural networks”, AI

EDAM-Artificial Intelligence for Engineering Design Analysis and Manufacturing 15, 349–365

(2001).

[12]

R.C.M. Yam, P.W. Tse, L. Li, P. Tu, “Intelligent predictive decision support system for

condition-based maintenance”, International Journal of Advanced Manufacturing Technology 17,

383–391(2001).

[13]

Wang X, Yu G, Koc M, Lee J. “Wavelet neural network for machining performance assessment

and its implication to machinery prognostic”. Proceedings of MIM 2002: 5th International

Conference on Managing Innovations in Manufacturing (MIM), Milwaukee, Wisconsin, USA,

150-156 (2002).

[14]

Suykens J.A.K., Vandewalle J., and De Moor B., "Optimal Control by Least Squares Support Vector Machines", Neural Networks 14, 23-35 (2001).

[15]

V. N. Vapnik, “Statistical Learning Theory”. John Wiley and Sons Inc., New York, 1998.

[16]

Saunders C., Gammerman A., Vovk V., "Ridge Regression Learning Algorithm in Dual Variables", Proceedings of the 15th International Conference on Machine Learning, Madison, Wisconsin, 515-521 (1998).

[17]

Suykens J.A.K., Vandewalle J., "Least squares support vector machine classifiers", Neural Processing Letters 9, 293-300 (1999).
