Mechanical Systems and Signal Processing 19 (2005) 357–370
Best basis-based intelligent machine fault diagnosis
S. Zhang, J. Mathew, L. Ma, Y. Sun
CRC for Integrated Engineering Asset Management, School of Mechanical, Manufacturing and Medical Engineering,
Queensland University of Technology, Brisbane, QLD 4001, Australia
Received 5 April 2004; received in revised form 21 May 2004; accepted 16 June 2004
Abstract
The wavelet packet transform decomposes a signal into a set of bases for time-frequency analysis. This
decomposition creates an opportunity for implementing distributed data mining, where features
extracted from different wavelet packet bases serve as feature vectors for applications. This paper
presents a novel approach for integrated machine fault diagnosis based on localised wavelet packet bases of
vibration signals. The best basis is firstly determined according to its classification capability. Data mining
is then applied to extract features and local decisions are drawn using Bayesian inference. A final conclusion
is reached using a weighted average method in data fusion. A case study on rolling element bearing
diagnosis shows that this approach can greatly improve the accuracy of diagnosis.
© 2004 Elsevier Ltd. All rights reserved.
Keywords: Wavelet packet transform; Best basis; Fault diagnosis; Bayesian inference; Data mining/fusion
1. Introduction
Condition monitoring is an important part of the process of modern equipment maintenance.
Its implementation typically consists of data acquisition, feature extraction, condition identification and fault diagnosis [1]. Researchers in the field have tended to focus on two areas. One is the extraction of features that represent the faults in some way; the other is the design and implementation of an automatic fault diagnosis procedure.
ARTICLE IN PRESS
www.elsevier.com/locate/jnlabr/ymssp
0022-460X/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
doi:10.1016/j.ymssp.2004.06.001
Corresponding author.
E-mail address: [email protected] (S. Zhang).
Various methods are available for feature extraction. For example, statistical methods are used
to derive time-domain features, such as signal energy and kurtosis. The fast Fourier transform
(FFT) is a traditional tool to extract frequency-domain features. Joint time-frequency features, which can be generated by short-time Fourier transforms, are increasingly used since the majority of real-world signals are essentially time varying. In the past two decades, the wavelet transform (WT) and wavelet packet transform (WPT) [2,3] have been researched and applied in a variety of ways [4]. More particularly, in machine fault diagnostics, WT and WPT have become preferred techniques to the traditional FFT method in the analysis of transient signals [5–7].
The features extracted from signals build a foundation for subsequent condition identification
and fault diagnosis. On the other hand, different approaches have been developed to design
condition classifiers, aimed at enhancing the accuracy of diagnosis and automating the diagnosis
procedure. Linear discriminant analysis (LDA), quadratic discriminant analysis (QDA) and
Bayesian inference are known statistical methods. Modern methods, such as neural networks,
fuzzy logic and expert systems, are preferred due to their intelligent properties. Some integrated approaches consider different signal features in a combined fashion to enhance the accuracy of
diagnosis [8].
In this work, the authors propose a novel approach to conduct integrated fault diagnosis based
on the best bases of the WPT of vibration signals, using data mining and fusion. The best bases
of the WPT are firstly selected according to their classification capability. Features are then
extracted from each individual best basis and local decisions are made by classifiers such as Bayesian
inference. A final conclusion is reached using the decision-fusion technique, where the
classification capabilities of the best bases serve as the decision weights. This proposed
approach is similar to the distributed data mining (DDM) approach, which generally starts from
local data analysis and subsequently generates a global model [9]. However, the proposed
approach has not been previously reported in work related to wavelet packet-based fault diagnosis in the literature.
This paper is arranged as follows. Section 2 presents the techniques used in this work, such as
WPT, best basis selection, Bayesian inference, data mining and fusion. Section 3 describes the
integrated procedure for fault classification by fusing local information from each best basis of
wavelet packets. The proposed method is validated using signals from faulty rolling element
bearings in Section 4. In addition, a back propagation (BP) neural network is compared as a
classifier. The conclusions are presented in Section 5.
2. Brief introduction of the techniques
2.1. WPT
Both WT and WPT have continuous and discrete formats. The discrete format of WPT was
adopted in this work because it is more widely used in engineering applications. To illustrate
the underlying mathematical theory of WPT briefly, we denote $\{h_k\}_{k \in Z}$ and $\{g_k\}_{k \in Z}$ as the quadrature mirror filter banks. A signal can be decomposed at different scales onto basis functions of the form $2^{j/2} u_n(2^j t - k)$, $j, k \in Z$, $n \in Z^+$, where $Z$ denotes the integers and $Z^+$
denotes the non-negative integers. These functions are iterated as

$u_{2n}(t) = \sqrt{2} \sum_{k \in Z} h_k u_n(2t - k)$,  (1)

$u_{2n+1}(t) = \sqrt{2} \sum_{k \in Z} g_k u_n(2t - k)$,  (2)

where $j$ is a scale parameter, $k$ is a time localisation parameter and $n$ is an oscillation parameter.
Thus, $u_0(t)$ is a scale function which corresponds to a low-pass filter. The filtered signal is an approximation of the analysed signal. The function $u_1(t)$ is a wavelet function which corresponds to a high-pass filter. The filtered signal is a detail of the analysed signal.
The approximation and detail can be further sliced by dyadic decomposition using the dilated
and translated scale functions and wavelet functions. Consequently, WPT generates a binary tree,
with $2^j$ bases at decomposition level $j$. Each basis is indexed by a pair of integers $(j, k)$.
The binary structure of the tree enables WPT to be used in various applications. For signal representation, for example, a signal can be reconstructed from wavelet packet coefficients
confined in some specific frequency bands. For pattern recognition, features can be extracted from
different wavelet packet bases. In addition, the distributed best bases create opportunities for
feature extraction and combination, where data mining, a convergence of knowledge discovering
techniques [10], can play an important role. Based on the features of each best basis, local
decisions can be made by a classifier.
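As a rough illustration of the dyadic splitting described above, the following sketch builds a wavelet packet binary tree with plain NumPy. Haar filters are substituted here for the Db20 wavelets used later in the paper purely to keep the example short; the filter values, function names and test signal are illustrative, not the authors' code.

```python
import numpy as np

# Haar quadrature mirror filters (illustrative stand-ins for Db20):
# h is the low-pass (scale) filter, g the high-pass (wavelet) filter.
h = np.array([1.0, 1.0]) / np.sqrt(2)
g = np.array([1.0, -1.0]) / np.sqrt(2)

def wp_split(x):
    """One dyadic split: filter with h and g, then downsample by 2."""
    a = np.convolve(x, h)[1::2]  # approximation (low-pass branch)
    d = np.convolve(x, g)[1::2]  # detail (high-pass branch)
    return a, d

def wpt(x, levels):
    """Binary tree of coefficients: level j holds 2**j bases."""
    tree = [[np.asarray(x, dtype=float)]]
    for _ in range(levels):
        nxt = []
        for node in tree[-1]:
            a, d = wp_split(node)
            nxt.extend([a, d])
        tree.append(nxt)
    return tree

x = np.sin(2 * np.pi * np.arange(64) / 8.0)
tree = wpt(x, 3)
print(len(tree[3]))  # 2**3 = 8 bases at decomposition level 3
```

With orthonormal filters the total signal energy is preserved across each split, which is what makes per-basis energy a meaningful feature later on.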
2.2. Best basis selection
The binary tree of bases can also be considered as a 2D time-frequency plane. The information
in the bases is redundant along two axes, i.e. information in the child bases overlaps with that
in the parent basis. The best basis is preferably selected from the binary tree, so as to reduce the
data analysis effort without losing information. For signal representation, best bases are defined
such that they cover the complete horizontal axis while not overlapping on the vertical axis [11]. This
definition results in a complete tree and ensures no redundant information. The Shannon entropy-based criterion [12] is well suited to the selection of the complete tree. When signals come from
different classes and a common best basis is required, a WPT-structured tree [13,14] is used for the
best basis selection. For pattern recognition, common best bases are selected such that they have the
best classification capability. They do not necessarily constitute a complete tree
[15,16]. In this work, the best bases were searched to guarantee class separation, since fault diagnosis is essentially about pattern recognition.
Suppose there are $c$ classes $\omega_i$, $i = 1, \ldots, c$, in a classification problem, and let $s_i$ denote the cluster centre for the $i$th class $\omega_i$. Then the normalised distance between two classes $i$ and $j$ is

$d_{i,j} = \dfrac{\|s_i - s_j\|}{\sum_{i=1}^{c-1} \sum_{j=i+1}^{c} \|s_i - s_j\|}$.  (3)

The minimal distance $\inf\{d_{i,j}\}$ is selected as the discriminant distance for best basis selection:

$d = \min\{d_{i,j}\}$, $i = 1, \ldots, c-1$, $j = i+1, \ldots, c$.  (4)
Apparently, a larger $d$ indicates a better capability of classification. It is noted that the minimal
distance, rather than other measures such as the mean distance, is adopted. This choice assists in the
determination of a best basis in which the classes are relatively well separated.
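A minimal sketch of the discriminant distance of Eqs. (3) and (4), assuming the sum in the denominator runs over all class pairs; the cluster centres are illustrative placeholders, not data from the paper.

```python
import numpy as np

def discriminant_distance(centres):
    """centres: (c, dim) array of class cluster centres s_i."""
    c = len(centres)
    pair = [np.linalg.norm(centres[i] - centres[j])
            for i in range(c - 1) for j in range(i + 1, c)]
    total = sum(pair)
    d_ij = [p / total for p in pair]  # Eq. (3): normalised pairwise distances
    return min(d_ij)                  # Eq. (4): worst-case class separation

# Three illustrative 2D class centres with pairwise distances 3, 4 and 5.
centres = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
print(discriminant_distance(centres))  # 3 / (3 + 4 + 5) = 0.25
```

Taking the minimum rather than the mean mirrors the paper's choice: a basis scores well only when even its closest pair of classes stays separated.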
2.3. Bayesian inference for classification
Bayesian inference is an application of Bayes' theorem and has been used as a fundamental
classifier for pattern recognition [17,18]. Bayesian inference works by assigning an unknown
pattern $x$ to the class which has the highest posterior probability. According to Bayes' theorem,
the posterior probability is given by
$P(\omega_i \mid x) = \dfrac{P(\omega_i) P(x \mid \omega_i)}{P(x)}$,  (5)

where $P(\omega_i)$ is the prior probability of class $\omega_i$ and $P(x \mid \omega_i)$ is the class-conditional probability, which represents the probability distribution of $x$ in class $\omega_i$. The total probability $P(x)$ is given by

$P(x) = \sum_{i=1}^{c} P(\omega_i) P(x \mid \omega_i)$.  (6)
To obtain the posterior probability, the prior probability and class-conditional probability
must be known. The prior probability can be inferred from prior knowledge of the application,
estimated from the data or assumed to be equal. The class-conditional probability can be
estimated from the data using either parametric or non-parametric methods. For simplicity, the
parametric multivariate normal distribution is often used as an approximation in probability
density estimation in the case of multivariate features. If the underlying distribution does not follow
the normal probability distribution, non-parametric density estimation provides an alternative
approach. In this work, the signal energy and kurtosis were extracted from each best basis as
scalar features separately. Their distributions were estimated using both parametric and non-parametric methods.
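The parametric case of Eqs. (5) and (6) can be sketched in one dimension: each class-conditional density is a normal distribution with a given mean and standard deviation, and posteriors follow by Bayes' theorem. The class parameters and test point below are illustrative.

```python
import math

def gauss_pdf(x, mu, sigma):
    """Normal density, the assumed parametric class-conditional P(x | w_i)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def posterior(x, params, priors):
    """params: list of (mu, sigma) per class; returns P(w_i | x) for each class."""
    likes = [p * gauss_pdf(x, mu, s) for p, (mu, s) in zip(priors, params)]
    total = sum(likes)                 # Eq. (6): total probability P(x)
    return [l / total for l in likes]  # Eq. (5): Bayes' theorem

params = [(0.0, 1.0), (4.0, 1.0)]  # two illustrative classes
priors = [0.5, 0.5]                # equal priors, as assumed in the case study
post = posterior(3.5, params, priors)
print(post.index(max(post)))  # x = 3.5 lies nearer the second class -> class 1
```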
2.4. Data fusion at the decision level
Distributed data resources, such as distributed sensors, require the integration of local
information to make a final decision. The data-fusion technique provides such a solution and has been successfully used in military and civilian applications [17]. Data fusion helps improve the
identification accuracy in pattern classification and is typically performed at three levels, i.e. (1)
sensor-level fusion, (2) feature-level fusion, and (3) decision-level fusion. More recently,
decision-level fusion has been termed classifier fusion [19–21]. In the work reported in this paper,
local decisions were drawn from each best basis of wavelet packets. Decision-level data fusion
is therefore used for integration. Different methods are used for decision-level fusion, such as the
weighted average method, the majority voting technique, Bayesian inference and the Dempster–Shafer
method. The discriminant distance in Eq. (4) supplies a reasonable decision weight for each best
basis. As a result, the weighted average method was adopted for decision fusion in this work.
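The weighted average fusion rule can be sketched directly: each best basis contributes a local posterior vector, weighted by its normalised discriminant distance. The posterior values and weights below are illustrative numbers, not results from the case study.

```python
# One row per best basis, one column per class: local Bayesian posteriors.
local_posteriors = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
]
weights = [0.5, 0.3, 0.2]  # normalised decision weights (sum to 1)

# Weighted average of the local posteriors, class by class.
fused = [sum(w * p[i] for w, p in zip(weights, local_posteriors))
         for i in range(3)]
decision = fused.index(max(fused))  # final class: maximal fused posterior
print(decision)  # class 0 wins with fused posterior 0.52
```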
3. A procedure to implement integrated fault diagnosis
A procedure using the above techniques to implement integrated fault diagnosis based on
wavelet packets is illustrated in Fig. 1. It has the following steps.
(1) Wavelet packet transformation of signals: The m signals from c signal classes collected for
training and testing classifiers are decomposed by WPT. This step results in m wavelet packet
trees.
(2) Common best basis selection: The discriminant distance (Eq. (4)) is applied for the selection of
a set of n best bases from the m binary trees.
(3) Feature extraction from best basis: The data in a best basis is essentially a time-domain signal
confined in a specific frequency band. Features, such as signal energy or signal kurtosis, are
extracted to construct a feature vector $x$. A local feature set for a best basis from the m signals
is $X = \{x\}$.
(4) Decision making on local feature set: The class-conditional probability is firstly estimated from
the feature set $X$. Given the prior probability, the posterior probability $P(\omega_i \mid x)$ is computed using
Bayesian inference for the unknown signal with a feature vector $x$ in a best basis. The posterior
probability indicates which class $x$ belongs to and is carried forward for final decision making.
(5) Data fusion for final decision making: The weighted average method is adopted for decision
fusion, where Bayesian inference produces probabilities or confidence values corresponding to
each class in an individual best basis. For a specific class $\omega_i$, the fused probability is given by

$P(\omega_i \mid x) = \sum_{j=1}^{n} w_{i,j} P(\omega_{i,j} \mid x)$,  (7)

where $w_{i,j}$ is the normalised weight given by Eq. (4). A final decision that $x$ is assigned to class
$I$ is made by selecting the maximal averaged posterior probability:

$P(\omega_I \mid x) = \max P(\omega_i \mid x)$, $i = 1, \ldots, c$.  (8)
Fig. 1. A procedure for wavelet packets-based fault diagnosis by data mining/fusion (the WPT feeds n best bases; each best basis yields a feature set and a local Bayesian inference, whose outputs meet in a fusion centre).
Alternatively, a binary classification method can be used by voting for the highest posterior
probability (Eq. (9)). This majority voting technique takes the winner-take-all principle:

$P(\omega_I \mid x) = \begin{cases} 1, & \max P(\omega_i \mid x), \ i = 1, \ldots, c, \\ 0, & \text{others}. \end{cases}$  (9)
This procedure facilitates an automatic integrated fault diagnosis approach, since WPT, best
basis selection and Bayesian inference can all be carried out computationally.
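The winner-take-all alternative of Eq. (9) amounts to hard voting: each best basis votes for its maximal-posterior class and the majority class wins. The local posterior values below are illustrative.

```python
from collections import Counter

# Local posteriors from three hypothetical best bases (one row each).
local_posteriors = [
    [0.7, 0.2, 0.1],
    [0.2, 0.5, 0.3],
    [0.6, 0.3, 0.1],
]

# Eq. (9): each basis casts a hard vote for its own maximal-posterior class.
votes = [p.index(max(p)) for p in local_posteriors]
decision = Counter(votes).most_common(1)[0][0]
print(decision)  # votes are [0, 1, 0], so the majority class is 0
```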
4. A case study
Rolling element bearings are key components in mechanical systems. Their failures account for
a large percentage of breakdowns in rotating machinery. Some of these breakdowns can be
catastrophic. Conducting diagnosis and prognosis on bearings is therefore fundamental to maintaining the integrity of mechanical systems.
In this case study, experimental data of faulty ball bearings were used to test our methodology.
For these bearings, a single defect was introduced by laser processing on the outer-race, inner-race
and ball, respectively. The data were collected under different operation conditions, i.e. different
speeds and loads, to ensure that broad conditions are covered for the benefit of the generalisation
of classifiers.
Seven hundred samples were acquired for each fault class. Among them, 600 samples were used
for classifier training, while 100 samples were used for classifier testing. Since three types of faults
were involved, there were a total of 2100 samples.
Following the procedure in Section 3, the signals were decomposed by WPT up to level 3 using
Db20 wavelets. Fig. 2 illustrates a signal from the faulty outer-race and its WPT. The signal
energy and kurtosis were adopted as features separately and formed the training and testing
Fig. 2. WPT for an outer race signal.
datasets [22,23]. Tables 1 and 2 list the normalised discriminant distances for all nodes. The six
selected common best bases are illustrated in Figs. 3 and 4 for the energy and kurtosis features,
respectively. The corresponding decision weights measured by the discriminant distances are
shown in Tables 3 and 4.
For each best basis, it was assumed that the prior probabilities for the three faults were equal,
i.e. $P(\omega_i) = 1/3$, $i = 1, 2, 3$, and the class-conditional probabilities were estimated from the training
datasets. According to Bayesian inference (Eqs. (5) and (6)), the testing signals were classified to
reach the local decisions, which were further fused (Eq. (7)) to produce a final decision using
Eq. (8) or (9).
Table 1
Discriminant distance (energy)
Level 0: 0.1865
Level 1: 0.3949 0.1024
Level 2: 0.2558 0.2758 0.1148 0.1021
Level 3: 0.4918 0.2043 0.2306 0.4812 0.0618 0.1430 0.0897 0.2886
Table 2
Discriminant distance (kurtosis)
Level 0: 0.4898
Level 1: 0.2026 0.4791
Level 2: 0.3257 0.3362 0.1705 0.3982
Level 3: 0.0355 0.4888 0.4676 0.0236 0.1730 0.4503 0.3836 0.3728
Fig. 3. Common best basis (energy): nodes (1,0), (2,0), (2,1), (3,0), (3,3), (3,7).
Table 3
Decision weights (energy)
Node (1,0) (2,0) (2,1) (3,0) (3,3) (3,7)
Weight 0.1805 0.1169 0.1261 0.2248 0.2199 0.1319
Table 4
Decision weights (kurtosis)
Node (0,0) (1,1) (2,3) (3,1) (3,2) (3,5)
Weight 0.1766 0.1727 0.1436 0.1762 0.1686 0.1623
Fig. 4. Common best basis (kurtosis): nodes (0,0), (1,1), (2,3), (3,1), (3,2), (3,5).
Fig. 5. ASH estimated distribution (energy).
To obtain the class-conditional probabilities, the features are assumed to follow either a normal
distribution or an unknown distribution. For the normal distribution, the mean and variance are
estimated relatively easily. The averaged shifted histogram (ASH), a non-parametric estimation
technique, was used to estimate the unknown probability distribution [24]. The estimated
distributions are shown in Figs. 5–7.
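An averaged shifted histogram can be sketched as the mean of m ordinary histograms whose bin origins are shifted by h/m, which smooths the estimate without a kernel. This is a generic ASH sketch, not the authors' implementation; the bin counts, shift count and test data are illustrative.

```python
import numpy as np

def ash(data, lo, hi, nbins=20, m=5):
    """Averaged shifted histogram density estimate on [lo, hi]."""
    h = (hi - lo) / nbins
    grid = np.linspace(lo, hi, nbins + 1)[:-1] + h / 2  # bin centres
    est = np.zeros(nbins)
    for s in range(m):
        # Shift the bin edges by s*h/m and accumulate a density-normalised histogram.
        edges = np.linspace(lo, hi, nbins + 1) + s * h / m
        counts, _ = np.histogram(data, bins=edges)
        est += counts / (len(data) * h)
    return grid, est / m  # average of the m shifted histograms

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, 2000)
grid, dens = ash(data, -4.0, 4.0, nbins=32, m=8)
print(dens.shape)
```

The averaging over shifts reduces the sensitivity to bin placement that makes a single histogram a poor density estimate.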
Fig. 6. Estimated normal distribution (energy).
Fig. 7. ASH estimated distribution (kurtosis).
The misclassification rate was calculated using the winner-take-all principle:

$r_i = 1 - \dfrac{1}{J} \sum_{j=1}^{J} b_{i,j}$,  (11)

where $b_{i,j}$ is 1 if the related result of Bayesian inference for class $i$ is the maximum, and 0 otherwise.
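Eq. (11) can be sketched directly: over J test samples of a given class, count how often that class attains the maximal posterior, and subtract the hit rate from one. The posterior vectors below are illustrative.

```python
def misclassification_rate(posteriors, true_class):
    """posteriors: per-sample posterior vectors for samples of `true_class`."""
    J = len(posteriors)
    # b_ij = 1 when the class-i posterior is the maximum for sample j.
    hits = sum(1 for p in posteriors if p.index(max(p)) == true_class)
    return 1.0 - hits / J  # Eq. (11)

samples = [[0.8, 0.1, 0.1], [0.3, 0.5, 0.2], [0.6, 0.2, 0.2], [0.7, 0.2, 0.1]]
print(misclassification_rate(samples, 0))  # 1 of 4 samples misclassified -> 0.25
```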
Table 5 shows that when signal energy and signal kurtosis were employed as the features, all 100
test signals in each class were correctly classified. However, employing energy as the feature
resulted in significantly higher decision confidences. It is concluded that signal energy is a better
feature choice. Another finding is that the classification results by ASH estimation are comparable with
those obtained by normal distribution estimation for each feature case. This finding suggests that
both probability estimation methods work well for the case study.
For classification problems, a feature vector may alternatively be built such that its elements come
from different best bases. Instead of using the DDM approach, a final decision can be made
directly based on this feature vector. A matched classifier is required. However, if Bayesian
inference is used, the assumption of a multivariate normal distribution for the feature vectors is
always violated, resulting in unacceptable misclassifications. Non-parametric multivariate
probability density estimation is also difficult to implement for this case. As a comparison, a
6–nh–3 BP neural network [25,26] was designed. In the three-layer neural network, six input
nodes corresponded to the features extracted from the six best bases, and three output nodes
corresponded to the three types of faults. The target outputs were [1, 0, 0], [0, 1, 0] and [0, 0, 1],
respectively. The number of hidden nodes, nh, was varied from 5 to 20 to reach an optimal design.
The signal features in each common best basis were concatenated into a normalised feature vector
which constructed the training and testing datasets. During network training, the cross-validation
technique [26] was used to prevent overfitting. Four-fifths of the training samples were used for
Fig. 9. Learning curve (energy).
training and one-fifth was used for validation. The maximum iteration was 1000 and the target
error was 0.00001. The training of the BP neural networks ceased when either the maximum
iteration or the target error was reached. Another criterion to stop training was the cross-validation. The initial values of the weights and biases of the networks were randomly set. It was found
that nh = 15 generated the best results for energy feature vectors, and nh = 13 generated the best results for
kurtosis feature vectors. Figs. 9 and 10 present the learning curves. In our case study, the cross-validation stopped the training of the BP neural networks with an acceptable total error lower than
0.01. In addition, 230 and 108 epochs were required for the energy and kurtosis features,
respectively. The classification results are also listed in Table 5. Similarly, the averaged confidence
and misclassification rate were computed by Eqs. (10) and (11).
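The stopping logic described above can be sketched as a small loop: training halts at the error goal, at the maximum number of epochs, or when the validation error stops improving (cross-validation early stopping). The validation-error sequence and the patience parameter are illustrative stand-ins for a real BP training run.

```python
def train_with_early_stopping(val_errors, max_epochs=1000, goal=1e-5, patience=5):
    """Return (stopping epoch, best validation error) under the three criteria."""
    best, since_best = float("inf"), 0
    for epoch, err in enumerate(val_errors[:max_epochs], start=1):
        if err < best:
            best, since_best = err, 0      # validation error still improving
        else:
            since_best += 1                # no improvement this epoch
        if err <= goal or since_best >= patience:
            return epoch, best             # goal reached or early stop triggered
    return len(val_errors[:max_epochs]), best  # max_epochs exhausted

# Validation error falls, then rises: early stopping fires after `patience`
# epochs without improvement beyond the best value 0.05 at epoch 4.
val = [0.5, 0.2, 0.1, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11]
print(train_with_early_stopping(val))  # (9, 0.05)
```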
From Table 5, it is clear that BP networks relying on the concatenated feature vectors resulted
in deteriorated classification. Misclassifications occurred in each fault class for both individual
features. The signal kurtosis led to poorer results. The comparison using BP networks suggests the
proposed integrated method significantly outperforms the adopted BP neural networks for
classification.
5. Conclusion
This work has presented an automatic fault classification technique based on the WPT and best
basis selection. The novel approach performs integrated fault diagnosis based on vibration
signals. The following conclusions are drawn:
(1) Signals can be classified (diagnosed) based on the best bases of the WPT. For each best basis, its
discriminant capability served as the decision weight for final decision fusion.
Fig. 10. Learning curve (kurtosis).
(2) Both signal energy and kurtosis can be used to classify the signals 100% correctly by the
integrated method. Signal energy, however, resulted in higher decision confidences and is
preferred.
(3) The probability estimation methods by ASH and normal estimation led to comparable results in the case study.
(4) BP neural networks employing concatenated feature vectors, with elements coming from
individual best bases, deteriorated the classification results in terms of both misclassification rate
and decision confidence.
References
[1] A. Davies, Handbook of Condition Monitoring: Techniques and Methodology, Chapman & Hall, UK, 1998.
[2] I. Daubechies, Ten Lectures on Wavelets, CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 61, SIAM, Philadelphia, PA, 1992.
[3] S. Mallat, A theory for multiresolution signal decomposition: the wavelet representation, IEEE Transactions on Pattern Analysis and Machine Intelligence 11 (1989) 674–692.
[4] B.K. Alsberg, A.M. Woodward, D.B. Kell, An introduction to wavelet transforms for chemometricians: a time-frequency approach, Chemometrics and Intelligent Laboratory Systems 37 (1997) 215–239.
[5] S.K. Goumas, M.E. Zervakis, G.S. Stavrakakis, Classification of washing machines vibration signals using discrete wavelet analysis for feature extraction, IEEE Transactions on Instrumentation and Measurement 51 (3) (2002) 497–508.
[6] G. Meltzer, Y.H. Ivanov, Fault detection in gear drives with non-stationary rotational speed, part II: the time-frequency approach, Mechanical Systems and Signal Processing 17 (2) (2003) 273–283.
[7] N.G. Nikolaou, I.A. Antoniadis, Rolling element bearing fault diagnosis using wavelet packets, NDT&E International 35 (2002) 197–205.
[8] S.L. Chen, Y.W. Jen, Data fusion neural network for tool condition monitoring in CNC milling machining, International Journal of Machine Tools and Manufacture 40 (2000) 381–400.
[9] D.E. Hershberger, H. Kargupta, Distributed multivariate regression using wavelet-based collective data mining, Journal of Parallel and Distributed Computing 61 (2001) 372–400.
[10] K. Mehmed, Data Mining: Concepts, Models, Methods and Algorithms, IEEE Press, Wiley, New York, 2002.
[11] M. Cocchi, R. Seeber, A. Ulrici, WPTER: wavelet packet transform for efficient pattern recognition of signals, Chemometrics and Intelligent Laboratory Systems 57 (2001) 97–119.
[12] R.R. Coifman, M.V. Wickerhauser, Entropy-based algorithms for best basis selection, IEEE Transactions on Information Theory 38 (2) (1992) 713–718.
[13] B. Walczak, D.L. Massart, Wavelet packet transform applied to a set of signals: a new approach to the best-basis selection, Chemometrics and Intelligent Laboratory Systems 38 (1997) 39–50.
[14] N. Saito, R.R. Coifman, F.B. Geshwind, F. Warner, Discriminant feature extraction using empirical probability density estimation and a local basis library, Pattern Recognition 35 (2002) 2841–2852.
[15] Y. Wu, R. Du, Feature extraction and assessment using wavelet packets for monitoring of machining processes, Mechanical Systems and Signal Processing 10 (1) (1996) 29–53.
[16] S. Zhang, J. Mathew, L. Ma, Common best basis selection of wavelet packets for machine fault diagnosis, Proceedings of the 10th Asia-Pacific Vibration Conference, 2003, pp. 835–840.
[17] D.L. Hall, J. Llinas, Handbook of Multisensor Data Fusion, CRC Press, Boca Raton, FL, 2001.
[18] B. Chen, P.K. Varshney, A Bayesian sampling approach to decision fusion using hierarchical model, IEEE Transactions on Signal Processing 50 (8) (2002) 1809–1818.
[19] J. Kittler, M. Hatef, R.P.W. Duin, J. Matas, On combining classifiers, IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (3) (1998) 226–239.
[20] D.M.J. Tax, M.V. Breukelen, R.P.W. Duin, J. Kittler, Combining multiple classifiers by averaging or by multiplying?, Pattern Recognition 33 (2000) 1475–1485.
[21] S. Prabhakar, A.K. Jain, Decision-level fusion in fingerprint verification, Pattern Recognition 35 (2002) 861–874.
[22] B. Samanta, K.R. Al-Balushi, Artificial neural network based fault diagnostics of rolling element bearings using time-domain features, Mechanical Systems and Signal Processing 17 (2) (2003) 317–328.
[23] J. Shiroishi, Y. Li, S. Liang, T. Kurfess, S. Danyluk, Bearing condition diagnostics via vibration and acoustic emission measurements, Mechanical Systems and Signal Processing 11 (5) (1997) 693–705.
[24] W.L. Martinez, A.R. Martinez, Computational Statistics Handbook with MATLAB, Chapman & Hall/CRC, New York, 2002.
[25] D. Rumelhart, G. Hinton, R. Williams, Learning representations by back-propagating errors, Nature 323 (1986) 533–536.
[26] C.M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, Oxford, 1995.