Fault detection and diagnosis for missing data systems with a three time-slice dynamic Bayesian...



Chemometrics and Intelligent Laboratory Systems 138 (2014) 30–40

Contents lists available at ScienceDirect

Chemometrics and Intelligent Laboratory Systems

journal homepage: www.elsevier.com/locate/chemolab

Fault detection and diagnosis for missing data systems with a three time-slice dynamic Bayesian network approach

Zhengdao Zhang ⁎, Feilong Dong
Key Laboratory of Advanced Process Control for Light Industry, Ministry of Education, Jiangnan University, Wuxi 214122, China

⁎ Corresponding author. Tel.: +86 136 1618 7667. E-mail address: [email protected] (Z. Zhang).

http://dx.doi.org/10.1016/j.chemolab.2014.07.009
0169-7439/© 2014 Elsevier B.V. All rights reserved.

Article info

Article history:
Received 9 April 2014
Received in revised form 1 July 2014
Accepted 9 July 2014
Available online 17 July 2014

Keywords:
Dynamic Bayesian network
Mixture Gaussian output
Fault detection and identification
Missing data
EM algorithm
Non-imputation

Abstract

A data-driven fault identification method based on a multi-time-slice dynamic Bayesian network with mixture of Gaussian output (MT-DBNMG) is proposed to handle missing data samples and non-Gaussian process data. First, by introducing more time slices, a new multi-time-slice dynamic Bayesian network structure is constructed, which can describe the dependence between the current state and historic states. Second, a parameter learning strategy based on the expectation maximization algorithm is derived to train the parameters of MT-DBNMG from complete, non-Gaussian historical data. Subsequently, for missing measurements, an online non-imputation inference method for MT-DBNMG is proposed to conduct fault detection and identification. The effectiveness of the proposed approach is demonstrated on the continuous stirred tank reactor system and the Tennessee Eastman chemical process. The results show that the presented approach can accurately detect abnormal events and identify the fault, and is also robust to unknown noise.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

With the development of industrial manufacturing along with advanced automation and control systems, the complexity of systems has increased. Thus, process monitoring and fault detection are very important in modern industry. Traditional fault detection and diagnosis methods proposed in the literature can be classified as quantitative model-based approaches [1,2], qualitative knowledge-based approaches [3] and process data-driven approaches. Compared with the other methods, data-driven methods, especially multivariate statistical process monitoring methods, have attracted growing attention in the field [4,5]. Although data-driven methods have difficulty in diagnosing on-line faulty data with a much different magnitude or new faulty data, they are worth implementing because of their well-known advantages: they require no in-depth process knowledge or first-principles model of the controlled system, mass data are easy to collect, and they are easy to apply to real processes of a rather large scale compared with methods based on systems theory or rigorous process models.

Among the data-driven methods, principal component analysis (PCA) and partial least squares (PLS) are the two best-known techniques, and many extensions have been developed based on them (see [4,6,7] and references therein). However, the PCA/PLS methods depend on the assumption that the process data follow an approximately multivariate Gaussian distribution, which may not be satisfied in real industry, so that the traditional PCA/PLS monitoring approaches become inappropriate [8]. The Gaussian mixture model (GMM) based monitoring approach can handle non-Gaussian data, which are approximated by a mixture of Gaussian distributions [8,9]. Compared with neural networks and other methods for handling non-Gaussianity, the GMM uses only the historical process data and avoids the performance degradation caused by the initial parameter selection. Moreover, the GMM can deal with partly missing measurements. Therefore, the GMM algorithm has been introduced to automatically detect, isolate, and even forecast faults [10,11].

From the perspective of statistical inference, fault detection and identification can be treated as an uncertain evidence inference problem, and Bayesian methods are the best tools to infer and formulate the uncertainty of evidence [12,13]. Among Bayesian methods, the static Bayesian network (BN) is suitable for modeling and inference under conditionally dependent uncertainty [14], and has been applied to different areas including fault detection and diagnosis (FDD) [6,15–19]. Moreover, BN can be combined with GMM to deal with the non-Gaussianity problem [20–22]. Nevertheless, since static BN does not

Nomenclature

C_t: state node
M_t: mixture Gaussian node
Y_t: observed node
t: time instant
P(C_0): initial state probability distribution
P(Y_t|C_t): observation variable probability distribution
P(C_t|C_{t−1}): state transition probability distribution
y_t: measurement vector at time t
y_{a,t}: ath element of y_t
y_{t,o}: observable part of y_t
y_{t,u}: missing part of y_t
C_A: effluent concentration
C_Af: feed concentration
T: reactor temperature
T_c: coolant inlet temperature
T_f: feed temperature
q: reactor feed flow rate
V: reactor volume
ρ: density
k_0: pre-exponential factor
−ΔH: heat of reaction
C_p, C_pc: heat capacity
ϕ_c(t): deactivation coefficient
ϕ_h(t): fouling coefficient

Fig. 1. The structure of the 2T-DBNMG models.


consider the temporal relationship among the states of a dynamic system [23], it is not suitable for expressing and dealing with the sequential relationship between samples in the output measurement time series of an industrial process. As an extension of static BN, the dynamic Bayesian network (DBN) combines a static network with temporal information, and forms a probability model which can deal with time-series data [24]. Although the study of the dynamic Bayesian network is not yet mature, it has been applied in FDD [25–30]. Yu et al. developed a novel DBN based networked process monitoring approach which can accurately detect abnormal events, identify the fault propagation pathways, and diagnose the root cause variables [31]. At present, the major limitation of DBN based FDD methods lies in the fact that the network structure is designed depending on prior process knowledge and the process flow diagram.

Although data-driven approaches have been widely applied for FDD, missing output data remain a major challenge for these methods. Caused by sudden mechanical breakdown, hardware sensor failure, data acquisition system malfunction, etc., missing data or irregularly sampled data are a common phenomenon in industrial practice [32,33]. Data losses and packet dropouts in communication networks are increasingly common sources of this missing data problem. However, process monitoring and fault detection techniques in the presence of missing observations have not been well studied. Most of the existing data-driven methods, including neural networks, k-nearest neighbors and decision trees, are designed for well-conditioned data sets and cannot treat incomplete data. Therefore, these methods will result in detection delays or failures in FDD with incomplete data.

In order to meet the requirements of real-time fault detection and diagnosis, the problem of missing data should be considered; in other words, we must use the partly observed data to detect and diagnose faults if data are missing at a certain moment. To this end, common data imputation approaches [13], such as mean substitution, regression imputation, multiple imputation, nearest neighborhood shift, support vector machine (SVM) and expectation maximization, are used to complete the missing samples [34,35], and the completed, estimated data are then used to detect and diagnose faults. However, the variances of the data may be considerably changed by imputation, as pointed out by Khatibisepehr et al. [13]. Moreover, there are three kinds of missing data mechanisms, i.e. missing at random, missing completely at random and not missing at random [35]. Unfortunately, no single imputation approach is suitable for all of the missing data mechanism assumptions. In addition, the imputed value is an approximation of the real value and the imputation error increases with the missing rate. Therefore, the imputed value cannot take the place of the original one for fault detection and identification, because a biased repaired value may lead to a false alarm or a missed alarm. Also, only limited literature has reported the use of Bayesian networks for process fault detection and diagnosis with missing data. Unlike those heuristic schemes which deal with special problems, the concept of correntropy has been applied to develop more general methods based on existing models without resorting to unnecessary efforts for outlier detection [37]. A similar idea can also be found in [38]. Therefore, we also focus on direct fault detection and diagnosis without imputation of missing data.

This paper proposes a multi-time-slice dynamic Bayesian network with mixture of Gaussian output (MT-DBNMG), and then achieves fault detection and identification with partially observed data for systems that exhibit both missing output data and non-Gaussianity. This paper extends our previous research [7,36], in which the two-time-slice dynamic Bayesian network with a mixture of Gaussian output (2T-DBNMG) was proposed to solve the problem of incomplete data and non-Gaussianity in processes. However, that model is not effective at detecting incipient faults and has a large delay alarm rate for them. Inspired by the high-order Markov model, we introduce more time slices into the DBN, which relates the current state to more historic data. The proposed algorithm can be divided into two steps. First, using the complete historical data, the parameter learning algorithm of MT-DBNMG is derived based on the expectation maximization method. Second, based on the trained MT-DBNMG, the inference algorithm is developed with partly missing data to accomplish fault detection and identification. Finally, the proposed approach is applied to monitor the continuous stirred-tank reactor (CSTR) and the Tennessee Eastman (TE) chemical process, and is demonstrated to be effective in monitoring and diagnosing these two benchmark processes.

The remainder of this paper is organized as follows. In Section 2, after a brief introduction of the DBNMG model and missing data processing, the parameter learning algorithm of MT-DBNMG and the inference algorithm are derived in detail. Then, Section 3 introduces the process of fault detection and identification based on the proposed MT-DBNMG. The presented method is applied to the CSTR and the TE process in Section 4. Finally, the conclusions are summarized in Section 5.


2. Multi-time-slice DBN model and missing data processing

2.1. DBNMG and its parameter configuration

DBNMG is a special DBN structure, which uses a mixture of Gaussians to approximate the likelihood distribution of the output. The structure of the 2T-DBNMG model is shown in Fig. 1. Each time slice in 2T-DBNMG is a GMM with three nodes: the state node $C_t$ is a random variable that can take on $N$ possible values, $C_t \in \{1, 2, \ldots, N\}$; node $Y_t$ denotes the observed variable; and node $M_t$ denotes the mixture Gaussian component of the observed variable, $M_t = 1, 2, \ldots, M$. Here $t$ denotes the $t$th time instant. Thus, the main parameters of 2T-DBNMG are the initial state probability distribution $P(C_0)$, the state transition probability distribution $P(C_t \mid C_{t-1})$ (transition distribution) and the observation variable probability distribution $P(Y_t \mid C_t)$ (likelihood distribution).

Similar to a first-order Markov process, 2T-DBNMG uses only the current output and the previous output to estimate the current state, which leads to the loss of valuable information in the historic data. In particular, 2T-DBNMG is incapable of detecting and identifying incipient faults: the variation between two sequential measurements is small in the case of an incipient fault and does not cause an obvious variation in the distribution features, so the probabilistic model is insensitive to the incipient fault. In order to use the information in historic data effectively, just as the first-order Markov process is extended to a higher-order Markov process, the multi-time-slice unfolding of the dynamic Bayesian network with mixture of Gaussian output is proposed. For simplicity of presentation, the three-time-slice dynamic Bayesian network with a mixture of Gaussian output (3T-DBNMG) shown in Fig. 2 is used to illustrate the network construction, parameter learning and inference for the proposed MT-DBNMG.

Considering the dependence between the states at time instants $t$ and $t-2$, the structure of 3T-DBNMG is constructed as shown in Fig. 2, and each node in this model is similar to that in 2T-DBNMG.

Commonly, each node of a DBN obeys the first-order Markov assumption [14], namely, $P(X_t \mid X_1, X_2, \ldots, X_{t-1}) = P(X_t \mid X_{t-1})$, where $X_t$ is any of the nodes in slice $t$ (which may be hidden or observed). In 3T-DBNMG, however, the first-order Markov assumption is not satisfied, since $P(X_t \mid X_1, X_2, \ldots, X_{t-1}) = P(X_t \mid X_{t-2}, X_{t-1})$. In order to use the first-order Markov relationship to simplify the subsequent derivation, an auxiliary process is introduced into the network structure of 3T-DBNMG. Consequently, the structure shown in Fig. 2 is converted to the structure shown in Fig. 3.

Therefore, in 3T-DBNMG, the main parameters become the initial state probability distribution $P(C_0)$, the state transition model $P(C_t \mid C_{t-1}, C_{t-2})$, and the observation model $P(Y_t \mid C_t)$. These parameters are collected into the model parameter vector $\theta = \{P(C_t \mid C_{t-1}, C_{t-2}),\ P(Y_t \mid C_t, M_t),\ P(M_t \mid C_t)\}$.
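The auxiliary process of Fig. 3 is not reproduced in this text, so the following is only a sketch of one standard way to recover a first-order Markov structure from a second-order one: treat the pair $(C_{t-1}, C_t)$ as a single augmented state. The function name `augment_second_order`, the dictionary layout and the toy probabilities are illustrative assumptions, not the construction used by the authors.

```python
from itertools import product

def augment_second_order(trans2, n_states):
    """Build a first-order transition table over pair states.

    trans2[(n, j)][i] = P(C_t = i | C_{t-1} = j, C_{t-2} = n).
    The augmented state at time t-1 is the pair (C_{t-2}, C_{t-1}) = (n, j);
    it can only move to a pair (j2, i) whose first entry equals j.
    """
    pairs = list(product(range(n_states), repeat=2))
    trans1 = {}
    for (n, j) in pairs:
        row = {}
        for (j2, i) in pairs:
            # Inconsistent pairs get probability zero.
            row[(j2, i)] = trans2[(n, j)][i] if j2 == j else 0.0
        trans1[(n, j)] = row
    return trans1

# Toy two-state example with made-up probabilities (assumption).
trans2 = {
    (0, 0): [0.9, 0.1],
    (0, 1): [0.3, 0.7],
    (1, 0): [0.6, 0.4],
    (1, 1): [0.2, 0.8],
}
trans1 = augment_second_order(trans2, 2)
```

Each row of `trans1` is again a probability distribution, so first-order machinery can be applied to the pair states at the cost of squaring the number of states.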

2.2. Parameter learning of 3T-DBNMG

After the network structure is determined, the model parameters $\theta$, which correspond to the conditional probability density functions of all nodes, need to be identified. The goal of network learning is to estimate, for every node, the conditional probability density function that maximizes the likelihood of the training data. Although on-line data-driven fault detection and identification must work with process records that contain missing data, the historical data are considered abundant. Therefore, it is reasonable to assume that the data are complete in the off-line training phase.

According to the properties of conditional probability, the likelihood distribution P(Yt|Ct = i) can be written as:

$$
P(Y_t \mid C_t = i) = \sum_{m=1}^{M} P(Y_t \mid M_t = m, C_t = i)\, P(M_t = m \mid C_t = i). \tag{1}
$$

Based on the assumption that the output measurements obey approximately multi-Gaussian distributions, we have $P(Y_t \mid M_t = m, C_t = i) = \mathcal{N}(y_t; \mu_{i,m}, \Sigma_{i,m})$, where the covariance matrix $\Sigma_{i,m}$ is chosen as a diagonal matrix. According to Bayes' rule, the posterior probability of the state node is

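$$
\begin{aligned}
P(C_t = i \mid Y_t, C_{t-1} = j, C_{t-2} = n)
&= \frac{P(Y_t \mid C_t = i)\, P(C_t = i \mid C_{t-1} = j, C_{t-2} = n)\, P(C_{t-2} = n)\, P(C_{t-1} = j \mid C_{t-2} = n)}{\sum_{k=1}^{N} P(C_t = k, Y_t, C_{t-1} = j, C_{t-2} = n)} \\
&= \frac{P(Y_t \mid C_t = i)\, P(C_t = i \mid C_{t-1} = j, C_{t-2} = n)}{\sum_{k=1}^{N} P(Y_t \mid C_t = k)\, P(C_t = k \mid C_{t-1} = j, C_{t-2} = n)}.
\end{aligned}
\tag{2}
$$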
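As a minimal sketch of how (1) and (2) could be evaluated numerically for complete measurements, the snippet below computes the mixture likelihood of each state and normalizes the product with the transition probabilities. The function names, the `model` layout and all numbers are hypothetical and chosen only for illustration.

```python
import math

def diag_gauss_pdf(y, mean, var):
    """Density of a Gaussian with diagonal covariance at the vector y."""
    p = 1.0
    for yk, mk, vk in zip(y, mean, var):
        p *= math.exp(-0.5 * (yk - mk) ** 2 / vk) / math.sqrt(2.0 * math.pi * vk)
    return p

def likelihood(y, weights, means, variances):
    """Eq. (1): P(Y_t | C_t = i) as a weighted sum over mixture components."""
    return sum(w * diag_gauss_pdf(y, mu, var)
               for w, mu, var in zip(weights, means, variances))

def posterior(y, trans_row, model):
    """Eq. (2): posterior over C_t for fixed (C_{t-1}, C_{t-2}).

    trans_row[i] = P(C_t = i | C_{t-1} = j, C_{t-2} = n)
    model[i]     = (weights, means, variances) of state i
    """
    unnorm = [likelihood(y, *model[i]) * trans_row[i] for i in range(len(model))]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Toy model: two states, two components each, two measured variables.
model = [
    ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    ([0.7, 0.3], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
]
post = posterior([0.2, 0.1], [0.5, 0.5], model)
```

The denominator of (2) is the same sum that appears in `posterior`, so the returned list always sums to one.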

Each node is conditionally independent of its non-descendants given its parent nodes, and the joint probability distribution of all the network nodes is used to develop the log-likelihood function for parameter estimation. The log-likelihood objective function of the model parameters $\theta$ is expressed as

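$$
\begin{aligned}
L &= \sum_{a=1}^{S} \sum_{t=1}^{T} \log P(y_{a,t}) \\
&= \sum_{a=1}^{S} \sum_{t=1}^{T} \log \sum_{n=1}^{N} \sum_{j=1}^{N} \sum_{i=1}^{N} \sum_{m=1}^{M} P(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m, y_{a,t}) \\
&= \sum_{a=1}^{S} \sum_{t=1}^{T} \log \sum_{n=1}^{N} \sum_{j=1}^{N} \sum_{i=1}^{N} \sum_{m=1}^{M} P_{\theta_t}(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m \mid y_{a,t}) \\
&\qquad \cdot \frac{P_{\theta_t}(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m, y_{a,t})}{P_{\theta_t}(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m \mid y_{a,t})}
\end{aligned}
\tag{3}
$$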

where $y_{a,t}$ is the $a$th element of the measurement vector at time $t$, $a \in \{1, 2, \ldots, S\}$, $S$ is the dimension of the measurement vector and $\theta_t$ is the parameter vector of the current model at time $t$.


Based on Jensen's inequality, (3) can be further expanded as

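$$
\begin{aligned}
L &\ge \sum_{a=1}^{S} \sum_{t=1}^{T} \sum_{n=1}^{N} \sum_{j=1}^{N} \sum_{i=1}^{N} \sum_{m=1}^{M} P_{\theta_t}(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m \mid y_{a,t}) \\
&\qquad \cdot \log \frac{P_{\theta_t}(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m, y_{a,t})}{P_{\theta_t}(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m \mid y_{a,t})} \\
&= \sum_{a=1}^{S} \sum_{t=1}^{T} \sum_{n=1}^{N} \sum_{j=1}^{N} \sum_{i=1}^{N} \sum_{m=1}^{M} P_{\theta_t}(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m \mid y_{a,t}) \\
&\qquad \cdot \log P_{\theta_t}(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m, y_{a,t}) \\
&\quad - \sum_{a=1}^{S} \sum_{t=1}^{T} \sum_{n=1}^{N} \sum_{j=1}^{N} \sum_{i=1}^{N} \sum_{m=1}^{M} P_{\theta_t}(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m \mid y_{a,t}) \\
&\qquad \cdot \log P_{\theta_t}(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m \mid y_{a,t}).
\end{aligned}
\tag{4}
$$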

In (4), θt satisfies

$$
\sum_{n=1}^{N} \sum_{j=1}^{N} \sum_{i=1}^{N} \sum_{m=1}^{M} P_{\theta_t}(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m \mid y_{a,t}) = 1 \tag{5}
$$

and

$$
0 \le P_{\theta_t}(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m \mid y_{a,t}) \le 1. \tag{6}
$$

Maximizing the lower bound in (4) for the complete data is equivalent to maximizing the expected log-likelihood function [39]. Therefore, the EM algorithm is employed here to solve the optimization problem in (4) iteratively. It consists of the expectation step (E-step) and the maximization step (M-step), which are summarized as follows.

Let $\theta_t$ be the current best approximation to the mode of the observed posterior, i.e. the best parameter estimate using all available data. With the currently available parameters and the observed data, the expectation of the complete-data log-likelihood as a function of the next parameter estimate can be derived, which is known as the Q function. The E-step calculates the Q function, which is defined by

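$$
\begin{aligned}
Q(\theta \mid \theta_t)
&= \sum_{a=1}^{S} \sum_{t=1}^{T} \sum_{n=1}^{N} \sum_{j=1}^{N} \sum_{i=1}^{N} \sum_{m=1}^{M} P(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m \mid y_{a,t}, \theta_t) \\
&\qquad \times \log P(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m, y_{a,t} \mid \theta) \\
&= \sum_{a=1}^{S} \sum_{t=1}^{T} \sum_{n=1}^{N} \sum_{j=1}^{N} \sum_{i=1}^{N} \sum_{m=1}^{M} P(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m \mid y_{a,t}, \theta_t) \\
&\qquad \times \log P(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m \mid \theta) \\
&\quad + \sum_{a=1}^{S} \sum_{t=1}^{T} \sum_{n=1}^{N} \sum_{j=1}^{N} \sum_{i=1}^{N} \sum_{m=1}^{M} P(C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m \mid y_{a,t}, \theta_t) \\
&\qquad \times \log P(y_{a,t} \mid C_t = i, C_{t-1} = j, C_{t-2} = n, M_t = m, \theta).
\end{aligned}
\tag{7}
$$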

The M-step maximizes the Q function in (7) with respect to $\theta$ to obtain $\theta_{t+1}$:

$$
\theta_{t+1} = \arg\max_{\theta} Q(\theta \mid \theta_t). \tag{8}
$$

The E-step and M-step are repeated iteratively until convergence and then the process of parameter learning is finished.
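To make the alternation of the E-step and the M-step concrete, the following sketch runs EM on a plain one-dimensional Gaussian mixture. This is a deliberately simplified stand-in, not the 3T-DBNMG learning algorithm: there are no state or transition nodes, and the data, initial values and function names are assumptions made for illustration.

```python
import math
import random

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def log_likelihood(data, weights, means, variances):
    return sum(math.log(sum(w * normal_pdf(x, m, v)
                            for w, m, v in zip(weights, means, variances)))
               for x in data)

def em_step(data, weights, means, variances):
    """One EM iteration for a 1-D Gaussian mixture."""
    k = len(weights)
    # E-step: responsibilities P(M = m | x, theta_t) for every sample.
    resp = []
    for x in data:
        joint = [weights[m] * normal_pdf(x, means[m], variances[m]) for m in range(k)]
        total = sum(joint)
        resp.append([p / total for p in joint])
    # M-step: closed-form maximizers of the Q function for this model.
    new_w, new_mu, new_var = [], [], []
    for m in range(k):
        nm = sum(r[m] for r in resp)
        mu = sum(r[m] * x for r, x in zip(resp, data)) / nm
        var = sum(r[m] * (x - mu) ** 2 for r, x in zip(resp, data)) / nm
        new_w.append(nm / len(data))
        new_mu.append(mu)
        new_var.append(max(var, 1e-6))  # small floor against a collapsed component
    return new_w, new_mu, new_var

# Synthetic data from two Gaussians (assumption, for illustration only).
random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(200)]
        + [random.gauss(5.0, 1.0) for _ in range(200)])
params = ([0.5, 0.5], [1.0, 4.0], [1.0, 1.0])
history = [log_likelihood(data, *params)]
for _ in range(20):
    params = em_step(data, *params)
    history.append(log_likelihood(data, *params))
```

The list `history` records the log-likelihood after each iteration; in a practical implementation, its change between iterations would serve as the convergence test.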

2.3. The inference of 3T-DBNMG with missing output data

Given the measurement at time instant $t$ and the historical states of the hidden node at times $t-1$ and $t-2$, namely $C_{t-1}$ and $C_{t-2}$, the main inference task performed in 3T-DBNMG is to compute the maximum a posteriori estimate of $C_t$.

An incomplete measurement $y_t$ can be divided into the observable part $y_{t,o}$ and the missing part $y_{t,u}$. Then, given the joint probability function $P(y_t) = P(y_{t,o}, y_{t,u})$, the marginal probability function of $y_{t,o}$ can be obtained by integrating out $y_{t,u}$, such that $P(y_{t,o}) = \int P(y_{t,o}, y_{t,u})\, dy_{t,u}$. Thus, the likelihood function and the posterior probability of the observable part $y_{t,o}$ can be obtained, respectively, as

$$
P(y_{t,o} \mid C_t = i) = \int P(y_{t,o}, y_{t,u} \mid C_t = i)\, dy_{t,u} \tag{9}
$$

and

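$$
\begin{aligned}
P(C_t = i \mid y_{t,o}, C_{t-1} = j, C_{t-2} = n)
&= \frac{P(C_t = i, y_{t,o}, C_{t-1} = j, C_{t-2} = n)}{P(y_{t,o}, C_{t-1} = j, C_{t-2} = n)} \\
&= \frac{P(C_{t-2} = n)\, P(C_{t-1} = j \mid C_{t-2} = n)\, P(C_t = i \mid C_{t-1} = j, C_{t-2} = n)\, P(y_{t,o} \mid C_t = i)}{P(y_{t,o})\, P(C_{t-2} = n)\, P(C_{t-1} = j \mid C_{t-2} = n)} \\
&= \beta_t^i\, P(y_{t,o} \mid C_t = i)\, P(C_t = i \mid C_{t-1} = j, C_{t-2} = n)
\end{aligned}
\tag{10}
$$

where $\beta_t^i = 1 / P(y_{t,o})$ is the normalizing factor.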

Considering the missing part of data, (1) can be expressed as

$$
P(y_t \mid C_t = i) = \sum_{m=1}^{M} P(M_t = m \mid C_t = i)\, P(y_{t,o}, y_{t,u} \mid M_t = m, C_t = i) \tag{11}
$$

where $P(y_{t,o}, y_{t,u} \mid M_t = m, C_t = i) = \mathcal{N}(y_t; \mu_{i,m}, \Sigma_{i,m})$ is the conditional probability distribution of the observable node. Because the covariance matrix is diagonal, the attributes within the same Gaussian mixture component are uncorrelated. Then, since uncorrelated jointly Gaussian variables are independent, the attributes within the same mixture component are mutually independent. Thus, (11) can be transformed into

$$
P(y_t \mid C_t = i) = \sum_{m=1}^{M} \omega_{m,i}\, P(y_{t,o} \mid M_t = m, C_t = i)\, P(y_{t,u} \mid M_t = m, C_t = i) \tag{12}
$$

where $\omega_{m,i} = P(M_t = m \mid C_t = i)$. Combining (9), (10), and (11), the posterior probability of the state variable can be written as

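$$
\begin{aligned}
P(C_t = i \mid y_{t,o}, C_{t-1} = j, C_{t-2} = n)
&= \beta_t^i \int P(y_{t,o}, y_{t,u} \mid C_t = i)\, dy_{t,u}\; P(C_t = i \mid C_{t-1} = j, C_{t-2} = n) \\
&= \beta_t^i \gamma_t^i \int P(y_{t,o} \mid C_t = i)\, P(y_{t,u} \mid C_t = i)\, dy_{t,u} \\
&= \beta_t^i \gamma_t^i\, P(y_{t,o} \mid C_t = i) \int P(y_{t,u} \mid C_t = i)\, dy_{t,u} \\
&= \beta_t^i \gamma_t^i\, P(y_{t,o} \mid C_t = i)
\end{aligned}
\tag{13}
$$

where $\gamma_t^i = P(C_t = i \mid C_{t-1} = j, C_{t-2} = n)$.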

According to (13), the missing part $y_{t,u}$ has been marginalized out and plays no role in the subsequent derivation. Therefore, the posterior probability depends only on the observable part. Note that a full-rank diagonal matrix is chosen as the covariance matrix of each GMM component, and the sub-matrix corresponding to the observable part is also a full-rank diagonal matrix that is readily computable; thus, the likelihood distribution of the measurement, $P(y_{t,o} \mid C_t = i, M_t = m)$, is also a diagonal Gaussian and is easily determined.
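The marginalization above has a simple computational reading: with diagonal covariances, the factor of every missing dimension integrates to one and can be skipped. The sketch below illustrates this under the assumption that missing entries are encoded as `None`; the function name and the encoding are illustrative choices, not taken from the paper.

```python
import math

def observed_likelihood(y, weights, means, variances):
    """P(y_{t,o} | C_t = i) for a diagonal-covariance Gaussian mixture.

    Entries of y equal to None are treated as missing. Skipping them is
    the result of integrating them out when the covariance is diagonal.
    """
    total = 0.0
    for w, mu, var in zip(weights, means, variances):
        p = w
        for yk, mk, vk in zip(y, mu, var):
            if yk is None:
                continue  # the missing dimension integrates to 1
            p *= math.exp(-0.5 * (yk - mk) ** 2 / vk) / math.sqrt(2.0 * math.pi * vk)
        total += p
    return total

# Toy mixture for one state: two components over two variables (assumption).
weights = [0.4, 0.6]
means = [[0.0, 1.0], [2.0, 3.0]]
variances = [[1.0, 2.0], [0.5, 1.5]]
partial = observed_likelihood([0.3, None], weights, means, variances)
```

No imputed value is needed at any point; when every entry is missing, the function returns the sum of the mixture weights, so the likelihood carries no information and the posterior reduces to the transition prior.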

Remark 1. Compared with imputation approaches, the proposed approach does not require pre-processing of missing values, so it can guarantee the real-time performance in FDD.

Remark 2. If all of the sampled data are lost at some time point, the nearest neighborhood shift is applied, i.e. the current data are replaced by the previous measurement.

3. Fault identification of missing data system based on DBNMG

Consider the multi-input–multi-output non-linear system given by

$$
y(t) = F(x_0, u(t), \xi, f, e) \tag{14}
$$

where $x_0 \in \mathbb{R}^n$ denotes the initial state of the system, $y(t)$ and $u(t)$ denote the output and the control input at time $t$, respectively, and $e$ is the system noise. $F$ is the mathematical model function of the system and is assumed to be unknown. The fault input $f = [f_0, f_1, \ldots, f_N]$ is also unknown and represents the normal operation and the possible failures of the system, whose failure functions are also unknown; class $f_0$ denotes the normal state and $f_i$ denotes the different faulty conditions. Because the mathematical model of system (14) is unknown, we apply the data-driven approach for fault identification. To indicate whether the system is at fault or not, an indication vector $\xi \in \mathbb{R}^{N+1}$ is defined, where $\xi_i$ is binary, with 0 meaning that the $i$th fault does not occur and 1 indicating that the $i$th fault occurs. In particular, when the system works in the normal state, $\xi = [1, 0, \ldots, 0]^T$. In order to simulate the randomly missing data phenomenon in the actual process, an indication vector $\lambda_t = (\lambda_t^1, \lambda_t^2, \ldots, \lambda_t^n) \in \mathbb{R}^n$ is introduced, whose elements are Bernoulli distributed stochastic variables taking values 0 and 1. $\lambda_t^i = 0$ means that the $i$th element of the corresponding output at time instant $t$ is missing.
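A minimal sketch of how such an indication vector could be drawn in a simulation is given below. The function name, the `None` encoding of missing entries and the independence of the draws across elements are assumptions made for illustration.

```python
import random

def apply_random_missing(sample, missing_rate, rng):
    """Return (mask, masked_sample); mask[k] == 0 marks element k as missing."""
    mask = [0 if rng.random() < missing_rate else 1 for _ in sample]
    masked = [value if keep else None for value, keep in zip(sample, mask)]
    return mask, masked

rng = random.Random(42)
# One two-dimensional measurement with a 5% chance of losing each element.
mask, masked = apply_random_missing([0.0836, 440.2], 0.05, rng)
```

Passing an explicit `random.Random` instance keeps the missing pattern reproducible between runs.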

The purpose of 3T-DBNMG is to determine the current fault state of the system according to the observable parts of the output measurement. For the sake of simplicity, it is assumed in this work that only one fault occurs at a time. Thus, the fault detection and diagnosis problem of system (14) is transformed into determining the location of the non-zero element of $\xi$. Because singular data will cause the training of the network to fail, the sampled output data are normalized as

$$
y = \frac{y - \mathrm{MinValue}}{\mathrm{MaxValue} - \mathrm{MinValue}} \tag{15}
$$

where MinValue and MaxValue denote the minimum and maximum of each measurement, respectively. Using this procedure, the data values all fall between 0 and 1.
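The normalization in (15) can be sketched as follows. The choice to estimate the minimum and maximum from the complete historical data, the pass-through of missing entries and the treatment of a constant column are assumptions, as are the function names and the sample values.

```python
def fit_min_max(rows):
    """Column-wise minimum and maximum of complete training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def normalize(row, lows, highs):
    """Apply Eq. (15) element-wise; missing entries (None) stay missing."""
    out = []
    for value, lo, hi in zip(row, lows, highs):
        if value is None:
            out.append(None)
        elif hi == lo:
            out.append(0.0)  # constant column: assumed convention, not from the paper
        else:
            out.append((value - lo) / (hi - lo))
    return out

# Illustrative history of (concentration, temperature) pairs.
history = [[0.080, 438.0], [0.090, 442.0], [0.085, 440.0]]
lows, highs = fit_min_max(history)
scaled = [normalize(r, lows, highs) for r in history]
```

On-line samples scaled with the stored `lows` and `highs` may fall slightly outside the interval from 0 to 1 when a fault drives a variable beyond its historical range.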

To use MT-DBNMG, $y(t)$ in (14) is expressed by the output node $Y_t$, $f$ is expressed by the hidden node $C_t$, and the non-Gaussian property of the measurements is expressed by the Gaussian components in node $M_t$. The initial conditional probability distribution table of MT-DBNMG is selected randomly and the coefficient values of the mixture components are determined experimentally. The step-by-step procedure of the proposed approach is given below and the corresponding diagram is shown in Fig. 4.

Step (a) Construct the MT-DBNMG structure as shown in Fig. 3, and use the network nodes to denote the monitored process variables.

Step (b) Determine the number of Gaussian components of the output measurements based on the complete historical data.

Step (c) According to (5), (6), and (7), learn the conditional probability density functions of the variable nodes in the network model from the normalized process historical data.

Step (d) Use the parameters obtained in the training process and the data to infer according to (12), and obtain a series of posterior distributions.

Step (e) Find the state with the maximum posterior probability and take it as the current state of the system, namely,

$$
C_t = \arg\max_{C_t = i} P(C_t = i \mid \lambda_t, y_t). \tag{16}
$$
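Steps (d) and (e) can be sketched as an on-line loop that scores each state by the product of its observed-part likelihood and the transition probability, and then takes the maximizer as in (16). For brevity the sketch assumes a single diagonal Gaussian per state ($M = 1$), hard decisions for the two previous states, and `None` for missing entries; the fallback for a fully missing sample follows Remark 2. All names and numbers are hypothetical.

```python
import math

def state_likelihood(y, mean, var):
    """Single diagonal Gaussian per state, evaluated on observed entries only."""
    p = 1.0
    for yk, mk, vk in zip(y, mean, var):
        if yk is not None:
            p *= math.exp(-0.5 * (yk - mk) ** 2 / vk) / math.sqrt(2.0 * math.pi * vk)
    return p

def online_map(stream, trans2, means, variances, c_prev2=0, c_prev1=0):
    """Yield the maximum a posteriori state for each sample of the stream.

    trans2[(n, j)][i] = P(C_t = i | C_{t-1} = j, C_{t-2} = n).
    """
    last = None
    for y in stream:
        if all(v is None for v in y) and last is not None:
            y = last  # Remark 2: reuse the previous measurement
        else:
            last = y
        prior = trans2[(c_prev2, c_prev1)]
        scores = [state_likelihood(y, means[i], variances[i]) * prior[i]
                  for i in range(len(means))]
        c_now = max(range(len(scores)), key=scores.__getitem__)
        yield c_now
        c_prev2, c_prev1 = c_prev1, c_now

# Toy setting: state 0 is normal, state 1 is a fault (assumed values).
trans2 = {(0, 0): [0.9, 0.1], (1, 0): [0.9, 0.1],
          (0, 1): [0.1, 0.9], (1, 1): [0.1, 0.9]}
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
stream = [[0.1, -0.2], [None, 0.3], [None, None], [3.1, None], [2.9, 3.2]]
states = list(online_map(stream, trans2, means, variances))
```

The normalizing factor of the posterior is the same for every candidate state, so it can be left out when only the maximizer is needed.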

4. Simulations

4.1. Continuous stirred-tank reactor example

A simulated non-isothermal CSTR process with time-varying parameters, including activation energy and heat transfer coefficient, is used to illustrate the usage of the proposed networked process monitoring and fault propagation diagnosis approach. The mathematical models of the CSTR and the parameter settings are as follows [40,41]:

$$
\frac{dC_A}{dt} = \frac{q}{V}\left(C_{Af} - C_A\right) - k_0 C_A \exp\left(-\frac{E}{RT}\right) \phi_c(t) \tag{17}
$$

Fig. 2. The structure of the 3T-DBNMG models.

Fig. 4. Schematic diagram of the MT-DBNMG based process monitoring and fault identification approach.


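$$
\frac{dT}{dt} = \frac{q}{V}\left(T_f - T\right) + \frac{-\Delta H}{\rho C_p}\, k_0 C_A \exp\left(-\frac{E}{RT}\right) \phi_c(t) + \frac{\rho_c C_{pc} q_c}{V \rho C_p} \left(1 - \exp\left(-\frac{hA}{\rho C_{pc} q_c}\, \phi_h(t)\right)\right) \left(T_c - T\right) \tag{18}
$$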

where $C_A$ is the effluent concentration, $C_{Af}$ is the feed concentration, $T$ is the reactor temperature, $T_c$ is the coolant inlet temperature, $T_f$ is the feed temperature, $q$ is the reactor feed flow rate, $V$ is the reactor volume, $\rho$ is the liquid density, $k_0$ is the pre-exponential factor, $-\Delta H$ is the heat of reaction, $C_p$ and $C_{pc}$ are the heat capacities, $\phi_c(t)$ is the deactivation coefficient and $\phi_h(t)$ is the fouling coefficient. Under the nominal condition, the parameters of the reactor are shown in Table 1 [41].

The measurement noises of the reactor temperature and the outlet reagent concentration are assumed to follow the Gaussian distributions N(0, 0.82) and N(0, 0.042), respectively, and the reactor feed flow rate has an external disturbance that follows the Gaussian distribution N(0, 1).

Three faults are considered: the feed flow rate fault, the feed concentration fault, and the linear fouling fault. The test scenarios are designed such that the feed flow rate and the feed concentration decrease along an exponential curve starting from $t_0$, and the fouling resistance increases linearly starting from $t_0$, which can be expressed as follows:

$$q(t) = \begin{cases} 100, & t < t_0 \\ q(t_0) + \Gamma_q\left(1 - \exp\left(\frac{t - t_0}{80}\right)\right), & t \ge t_0 \end{cases} \qquad (19)$$

$$C_{Af}(t) = \begin{cases} 1, & t < t_0 \\ C_{Af}(t_0) + \Gamma_C\left(1 - \exp\left(\frac{t - t_0}{80}\right)\right), & t \ge t_0 \end{cases} \qquad (20)$$

$$\phi_h(t) = \begin{cases} 1, & t < t_0 \\ 1 - \alpha_h\left(t - t_0\right), & t \ge t_0 \end{cases} \qquad (21)$$

where Γq, ΓC and αh denote the magnitudes of the faults. In the simulation, we set Γq = 1, ΓC = 0.02 and αh = 0.01 in the training phase, and Γq = 1.5, ΓC = 0.02 and αh = 0.008 in the test phase.

In the simulation, the sampling period is set to 0.1 min. The process state variables to be monitored are the reaction concentration CA and the reaction temperature T, and their normal ranges are set as [0.88∗CA, 1.12∗CA] and [0.98∗T, 1.02∗T], respectively. Therefore, the system is considered normal when the reaction concentration and the reaction temperature fluctuate within the normal ranges; otherwise the system is faulty.

Fig. 3. The transformed structure of the 3T-DBNMG models.

Table 1. The normal working point of CSTR.

| Parameter | Value |
|---|---|
| q | 100 L/min |
| −ΔH | 837360 J/mol |
| T | 440.2 K |
| CAf | 1 mol/L |
| ρ | 1000 g/L |
| ρc | 1000 g/L |
| V | 100 L |
| hA | 2,930,760 J/(min ∗ K) |
| E/R | 9950 K |
| CA | 8.36 × 10^−2 mol/L |
| Tf | 350 K |
| k0 | 7.2 × 10^10 min^−1 |
| Cp | 4.1868 J/(g ∗ K) |
| Cpc | 4.1868 J/(g ∗ K) |
| Tc | 350 K |
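The fault profiles of Eqs. (19) to (21) and the normal-range test described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, q(t0) and CAf(t0) are taken to be the nominal values 100 and 1, the normal ranges are taken around the nominal CA and T of Table 1, and the text does not say whether t counts samples or minutes, so the time unit is left to the caller.

```python
import math


def feed_flow(t, t0, gamma_q, q_nominal=100.0):
    """Feed flow rate fault, Eq. (19); q(t0) is assumed equal to the nominal value."""
    if t < t0:
        return q_nominal
    return q_nominal + gamma_q * (1.0 - math.exp((t - t0) / 80.0))


def feed_concentration(t, t0, gamma_c, caf_nominal=1.0):
    """Feed concentration fault, Eq. (20); CAf(t0) is assumed equal to the nominal value."""
    if t < t0:
        return caf_nominal
    return caf_nominal + gamma_c * (1.0 - math.exp((t - t0) / 80.0))


def fouling_coefficient(t, t0, alpha_h):
    """Linear fouling fault, Eq. (21)."""
    if t < t0:
        return 1.0
    return 1.0 - alpha_h * (t - t0)


def is_normal(ca, temp, ca_nominal=8.36e-2, temp_nominal=440.2):
    """Normal-range test; ASSUMPTION: ranges are centred on the Table 1 values."""
    ca_ok = 0.88 * ca_nominal <= ca <= 1.12 * ca_nominal
    temp_ok = 0.98 * temp_nominal <= temp <= 1.02 * temp_nominal
    return ca_ok and temp_ok
```

Both exponential profiles are continuous at t0 and decrease afterwards, because 1 − exp((t − t0)/80) is zero at t = t0 and negative for t > t0.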

We structure the DBNMG model according to Fig. 3. In this model, the output node Yt has two variables, the reaction concentration CA and the reaction temperature T, i.e., Yt = (CA, T)^T. The root node Ct denotes the state of the system and takes one of four values: state zero denotes the normal state, state one the feed flow rate fault, state two the feed concentration fault, and state three the linear fouling fault. Mt denotes the number of Gaussian mixture components. A 10-fold cross-validation is used to determine this number, and for this network we obtain Mt = 3. In the training phase, we assume that the fault occurs at sample t0 = 100 and that ϕc(t) = 1.

In the test phase, the purpose of detection and identification is to report the occurrence of a fault and the fault type before the monitored value goes beyond the threshold. Assuming that the test sets contain a fault starting from sample t0 = 200, the complete data, 5% missing data and 10% missing data are used to test the performance of the proposed method, respectively. The simulation results are evaluated in terms of three performance indexes: the false alarm rate, the delay alarm rate, and the accuracy rate. A delay alarm means that the fault is reported after the measurement exceeds the threshold or that the fault detection has failed. A false alarm means that a fault is reported when the system is normal or that the fault identification does not conform to the real fault. The detection and identification are accurate only if the fault is reported before the index exceeds the threshold and the identification result is correct. Therefore, if the number of simulation runs is FN, the number of false alarms is FW, and the number of delay alarms is FL, then the false alarm rate is FW/FN, the delay alarm rate is FL/FN, and the accuracy rate is (FN − FW − FL)/FN.

Fig. 5. Simulation result for the feed flow rate fault of the CSTR. Panels in this part: (a) CA (mol/L) versus sample; (c) q (L/min) versus sample.
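The three performance indexes follow directly from the counts FN, FW and FL. A minimal sketch, with a hypothetical function name, is given below; the example counts are chosen to match one column of Table 2 (feed concentration fault, 10% missing rate, 100 Monte Carlo runs).

```python
def performance_indexes(fn, fw, fl):
    """Return (false alarm rate, delay alarm rate, accuracy rate) as fractions.

    fn: number of simulation runs, fw: false alarms, fl: delay alarms.
    """
    false_alarm_rate = fw / fn
    delay_alarm_rate = fl / fn
    accuracy_rate = (fn - fw - fl) / fn
    return false_alarm_rate, delay_alarm_rate, accuracy_rate


# 100 runs with 4 false alarms and 2 delay alarms.
rates = performance_indexes(fn=100, fw=4, fl=2)
```

For these counts the rates are 4%, 2% and 94%, which agrees with the corresponding entries of Table 2.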

Figs. 5–7 show the output measurements of the CSTR under the different faults and the identification results of 3T-DBNMG with 10% randomly missing data. Subfigures (a) and (b) show the trajectories of the output variables CA and T, respectively, where the blue solid line shows the measured value, the red dashed line represents the theoretical normal value and the green dash-dot line shows the fault threshold value. Subfigure (c) shows the trajectory of the fault variable. Subfigure (d) shows the fault identification results of 3T-DBNMG. From these figures, it can be seen that the proposed approach can efficiently detect and identify the faults with missing data.
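A test sequence with a given missing rate can be produced by hiding a random subset of the measurements. The sketch below is one possible way to do this and is not taken from the paper: the function name is hypothetical, whole samples are dropped (the text here does not say whether whole samples or single variables go missing), and an exact count of missing samples is drawn rather than an independent coin flip per sample.

```python
import random


def drop_randomly(samples, missing_rate, seed=0):
    """Replace a random fraction of the samples by None (marker for 'missing').

    ASSUMPTION: a whole measurement vector is lost at once.
    """
    rng = random.Random(seed)
    n = len(samples)
    n_missing = round(missing_rate * n)
    # Draw the positions of the missing samples without replacement.
    missing_idx = set(rng.sample(range(n), n_missing))
    return [None if i in missing_idx else s for i, s in enumerate(samples)]


# 500 hypothetical (CA, T) measurements with 10% of them hidden.
measurements = [(8.36e-2, 440.2)] * 500
incomplete = drop_randomly(measurements, missing_rate=0.10)
```

A non-imputation method then works on `incomplete` directly, skipping or marginalizing the `None` entries instead of filling them in.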

To evaluate the accuracy of the proposed method, Monte Carlo simulations are repeated 100 times with different missing data rates. The results are shown in Table 2. To compare with the two-time-slice DBNMG on incipient fault identification, the simulation results of 2T-DBNMG with different missing rates [36] are listed in Table 3. From Table 2, it can be seen that the proposed method can detect and identify the three kinds of faults accurately when the data are complete. As the missing rate increases, the detection effectiveness decreases for all three faults. The accuracy rate of the feed flow fault detection drops about 1% when the missing rate increases to 10% because the false alarm rate increases. For the feed concentration fault, the accuracy rate drops 3% when the missing rate increases to 5% and another 2% when it increases to 10%, but both values are better than those of 2T-DBNMG. The accuracy rate of the fouling fault drops about 1% and the delay alarm rate increases as the missing rate increases. The advance value of the diagnosis time remains at the same level for the three fault scenarios when the missing rate increases from 0% to 10%. Comparing Table 2 with Table 3, it can be seen that the proposed 3T-DBNMG outperforms 2T-DBNMG at the different missing rates, most clearly for the feed concentration fault (94% versus 89% accuracy at a 10% missing rate). The reason is that 2T-DBNMG assumes that the current state is related only to the previous state; for faults whose characteristics do not change obviously, it cannot fully exploit the relationships between data. In contrast, 3T-DBNMG assumes that the current state is related to the states at the two preceding time instants. Compared with 2T-DBNMG, 3T-DBNMG can mine more information from the data, so it can improve the accuracy of identification.

Fig. 5 (continued). Simulation result for the feed flow rate fault of the CSTR. Panels in this part: (b) T (K) versus sample; (d) Ct versus sample.

Fig. 6. Simulation result for the feed concentration fault of the CSTR. Panels: (a) CA (mol/L), (b) T (K), (c) CAf (mol/L), and (d) Ct, each versus sample.

Fig. 7. Simulation result for the fouling fault of the CSTR. Panels: (a) CA (mol/L), (b) T (K), (c) ϕh(t), and (d) Ct, each versus sample.
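The difference between a two-slice and a three-slice dependence discussed above can be illustrated with a generic second-order forward (filtering) step, in which the transition probability of the current state depends on the two preceding states. This is a textbook-style sketch and not the authors' inference algorithm: it ignores the mixture node Mt, the names are hypothetical, and the likelihood values are assumed to be supplied by the output model.

```python
def forward_step_3slice(belief, trans, lik):
    """One filtering step for a state chain with two-step memory.

    belief[i][j] : P(C_{t-2} = i, C_{t-1} = j | data up to t-1)
    trans[i][j][k]: P(C_t = k | C_{t-2} = i, C_{t-1} = j)
    lik[k]       : p(y_t | C_t = k)
    Returns new_belief[j][k] = P(C_{t-1} = j, C_t = k | data up to t).
    """
    n = len(lik)
    new_belief = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            # Sum out the oldest state C_{t-2}.
            predicted = sum(belief[i][j] * trans[i][j][k] for i in range(n))
            new_belief[j][k] = predicted * lik[k]
    total = sum(sum(row) for row in new_belief)
    return [[v / total for v in row] for row in new_belief]


def marginal_current(belief):
    """Marginal distribution of the most recent state."""
    n = len(belief)
    return [sum(belief[j][k] for j in range(n)) for k in range(n)]
```

A two-slice model corresponds to the special case in which `trans[i][j][k]` does not depend on `i`; the three-slice model can therefore represent every two-slice model and, in addition, patterns that only show up over two consecutive transitions.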

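Table 2. The identification results of the three kinds of faults by 3T-DBNMG with different missing rates.

| Fault | Missing rate (%) | False alarm rate (%) | Delay alarm rate (%) | Accuracy rate (%) | Advance value (min) |
|---|---|---|---|---|---|
| Feed flow fault | 0 | 0 | 0 | 100 | 6.49 |
| Feed flow fault | 5 | 1 | 0 | 99 | 6.37 |
| Feed flow fault | 10 | 1 | 0 | 99 | 6.04 |
| Feed concentration fault | 0 | 1 | 0 | 99 | 1.52 |
| Feed concentration fault | 5 | 2 | 2 | 96 | 1.48 |
| Feed concentration fault | 10 | 4 | 2 | 94 | 1.49 |
| Fouling fault | 0 | 0 | 0 | 100 | 85.46 |
| Fouling fault | 5 | 0 | 1 | 99 | 85.04 |
| Fouling fault | 10 | 0 | 1 | 99 | 84.46 |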

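Table 3. The identification results of the three kinds of faults by 2T-DBNMG with different missing rates.

| Fault | Missing rate (%) | False alarm rate (%) | Delay alarm rate (%) | Accuracy rate (%) | Advance value (min) |
|---|---|---|---|---|---|
| Feed flow fault | 0 | 0 | 0 | 100 | 7.05 |
| Feed flow fault | 5 | 1 | 0 | 99 | 7.04 |
| Feed flow fault | 10 | 1 | 0 | 99 | 6.93 |
| Feed concentration fault | 0 | 1 | 0 | 99 | 1.82 |
| Feed concentration fault | 5 | 7 | 1 | 92 | 1.76 |
| Feed concentration fault | 10 | 10 | 1 | 89 | 1.74 |
| Fouling fault | 0 | 0 | 0 | 100 | 85.50 |
| Fouling fault | 5 | 0 | 1 | 99 | 85.21 |
| Fouling fault | 10 | 0 | 2 | 98 | 84.57 |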


4.2. Tennessee Eastman chemical process example

In this subsection, the Tennessee Eastman (TE) chemical process is further used to evaluate the effectiveness of the proposed fault detection and identification approach. The TE process is an open-loop unstable plant-wide process control problem that is widely used as a benchmark simulation for process monitoring techniques. The process consists of five major units: an exothermic two-phase reactor, a flash separator, a recycle compressor, a reboiled stripper, and a product condenser. The process flow diagram is shown in Fig. 8. The TE process has a total of 11 input variables (without agitator speed) and 41 measurement variables, and the process measurements are sampled with an interval of 3 min.

There are 20 types of identified faults in the TE process. Among them, researchers especially focus on faults 4, 9 and 11 [42], because these three faults are good representations of overlapping data and are difficult to classify. Thus, these three types of faults are selected to examine the performance of the proposed monitoring and diagnosis method. For each fault, a training data set of 480 samples is used to develop the 3T-DBNMG model, and a testing set of 800 samples is used for evaluation; each sample is a 52-dimensional vector. The test case begins with normal operation for the initial 8 h and is followed by the process fault for the remaining 32 h.

Fig. 8. Process flow diagram of the Tennessee Eastman chemical process.

Table 4. Simulation results for the case of TE process by 3T-DBNMG.

| Fault | Missing rate (%) | False alarm rate (%) | Delay alarm rate (%) | Accuracy rate (%) |
|---|---|---|---|---|
| Fault 4 | 0 | 0 | 0.1 | 99.9 |
| Fault 4 | 5 | 0 | 0.2 | 99.8 |
| Fault 4 | 10 | 0 | 0.2 | 99.8 |
| Fault 9 | 0 | 1.8 | 66.7 | 31.5 |
| Fault 9 | 5 | 0.8 | 68.1 | 31.1 |
| Fault 9 | 10 | 0.5 | 68.7 | 30.8 |
| Fault 11 | 0 | 0.3 | 0.1 | 99.6 |
| Fault 11 | 5 | 1.7 | 0.2 | 98.2 |
| Fault 11 | 10 | 2.3 | 0.3 | 97.4 |

Fig. 10. Trend plots of the state of root node for fault 9 with 10% missing data. Axes: detection state versus sample.

In the simulation, the output is a 52-dimensional vector. To compare with the results of the Bayesian network classifier based on the GMM (GMMBN) in [7], five feature variables, variables 38, 39, 40, 41, and 51, are selected, so the node Yt = {38, 39, 40, 41, 51}. After a 10-fold cross-validation, the number of Gaussian components is set to Mt = 7. The root node Ct denotes the state of the system and again has four states: state zero denotes the normal state, state one denotes fault 4, state two denotes fault 9, and state three denotes fault 11. A training data set consisting of normal data and the data of fault 4, fault 9, and fault 11 is used to develop the 3T-DBNMG model. In addition, a test case is designed to examine the performance of the proposed monitoring and diagnosis method. The test case begins with normal operation for the initial 8 h and then introduces the different faults for the remaining time. The results of fault identification are shown in Table 4, and Figs. 9–11 show the results of the proposed method with 10% randomly missing data.
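The data layout described above can be made concrete with a short sketch. It assumes, without confirmation from the text, that the variable numbers 38, 39, 40, 41 and 51 are 1-based positions in the 52-dimensional sample; the constant and function names are hypothetical.

```python
SAMPLE_INTERVAL_MIN = 3      # sampling interval of the TE measurements
NORMAL_HOURS = 8             # normal operation at the start of the test case
FEATURE_IDS = (38, 39, 40, 41, 51)  # ASSUMPTION: 1-based variable numbers


def select_features(sample):
    """Pick the five feature variables out of one 52-dimensional sample."""
    return [sample[i - 1] for i in FEATURE_IDS]


def first_faulty_sample():
    """1-based index of the first sample taken after the normal period."""
    normal_samples = NORMAL_HOURS * 60 // SAMPLE_INTERVAL_MIN
    return normal_samples + 1


# A dummy sample whose entries equal their own 1-based position.
dummy_sample = list(range(1, 53))
features = select_features(dummy_sample)
```

With a 3 min sampling interval, 8 h of normal operation correspond to 160 samples, so the fault is present from sample 161 onward, and the 800 test samples cover 40 h in total.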

The fault diagnosis result of the proposed method for fault 4 with missing data is shown in Fig. 9. The system stays in the normal state (state zero) for the first 163 samples and then jumps to state one once the process fault, a step change in the reactor cooling water inlet temperature, takes place. The proposed method gives an accurate alarm on the process fault with very little delay and few false alarms. The diagnosis result for fault 9 is shown in Fig. 10. The system stays in the normal state for the first 710 samples and then jumps to state two, whereas the process fault, a random variation in the D feed temperature, actually occurs from sample 161, so the delay is very large. For fault 9, the features of the abnormal samples are so similar to those of the normal samples that it is hard to distinguish them, so the delay alarm rate is large. The diagnosis result for fault 11 is shown in Fig. 11. The system stays in the normal state for the first 164 samples, and when the process fault, a random variation in the reactor cooling water inlet temperature, takes place, the proposed method again gives an accurate alarm with very little delay and only brief false alarms.

Fig. 9. Trend plots of the state of root node for fault 4 with 10% missing data. Axes: detection state versus sample.
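The delays quoted above can be read off a detected state sequence by locating the first sample whose state is not zero. The sketch below uses idealized sequences built from the sample counts in the text (163 and 710 normal samples, fault onset at sample 161); the function name is hypothetical and real detector outputs would also contain occasional false alarms.

```python
def detection_delay(states, onset_sample):
    """Delay, in samples, between fault onset and the first non-normal state.

    `states` holds the detected state of sample 1, 2, ... (0 = normal).
    Returns None if no fault is ever reported.
    """
    for index, state in enumerate(states):
        if state != 0:
            first_alarm_sample = index + 1  # convert to 1-based numbering
            return first_alarm_sample - onset_sample
    return None


# Idealized sequences for fault 4 and fault 9 over 800 test samples.
states_fault4 = [0] * 163 + [1] * 637
states_fault9 = [0] * 710 + [2] * 90

delay_fault4 = detection_delay(states_fault4, onset_sample=161)
delay_fault9 = detection_delay(states_fault9, onset_sample=161)
```

At one sample every 3 min, a delay of 3 samples for fault 4 is 9 min, whereas 550 samples for fault 9 amount to 27.5 h.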

From Table 4, it can be seen that 3T-DBNMG detects fault 4 almost completely and is not sensitive to the missing data. The accuracy for fault 11 drops by 2.2% as the missing rate increases to 10%. For fault 9, the accuracy is about 31%. In short, the proposed approach accurately detects fault 4 and fault 11 of the TE process, while the accuracy for fault 9 is not satisfactory. Nevertheless, compared with the diagnosis results for fault 9 obtained by 2T-DBNMG, shown in Table 5, the proposed approach diagnoses this fault more effectively. Because 2T-DBNMG relates the current state only to the previous time slice, it cannot use all the temporal information in the data to detect and diagnose the fault. 3T-DBNMG relates the current state to more historic data, so it can utilize this information. Therefore, compared with 2T-DBNMG, 3T-DBNMG diagnoses the fault more accurately and effectively.

We compare our method with the work in [42]. Since the methods in that reference do not consider partly missing measurements, only the results with complete data are discussed. In that reference, the authors focused on fault diagnosis only; in other words, only the three faults are classified. However, fault 9 has a serious overlap with the normal state. In their results, the detection rate of fault 9 is about 24%, and the overall fault detection rate is 68.37%. Therefore, the overall misdiagnosis rate is about 33.34%. In our simulation, fault detection and fault diagnosis are carried out simultaneously, and the overall misdiagnosis rate is about 23%. Our method therefore has a better diagnosis performance.
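The figure of about 23% can be reproduced from Table 4, assuming it is the complement of the mean accuracy rate over the three faults with complete data; the paper does not spell out this calculation, so the averaging rule below is an assumption.

```python
# Accuracy rates (%) of 3T-DBNMG at 0% missing rate, copied from Table 4.
accuracy_complete = {"fault 4": 99.9, "fault 9": 31.5, "fault 11": 99.6}

# ASSUMPTION: the overall rate is the unweighted mean over the three faults.
mean_accuracy = sum(accuracy_complete.values()) / len(accuracy_complete)
overall_misdiagnosis = 100.0 - mean_accuracy
```

The mean accuracy is 77.0%, which leaves an overall misdiagnosis rate of 23.0%.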

Fig. 11. Trend plots of the state of root node for fault 11 with 10% missing data. Axes: detection state versus sample.

Table 5. Simulation results for the case of TE process by 2T-DBNMG.

| Fault | Missing rate (%) | False alarm rate (%) | Delay alarm rate (%) | Accuracy rate (%) |
|---|---|---|---|---|
| Fault 4 | 0 | 0 | 0.1 | 99.9 |
| Fault 4 | 5 | 0 | 0.2 | 99.8 |
| Fault 4 | 10 | 0 | 0.2 | 99.8 |
| Fault 9 | 0 | 2.3 | 75.2 | 22.5 |
| Fault 9 | 5 | 1.1 | 76.6 | 22 |
| Fault 9 | 10 | 0.8 | 77.9 | 21.3 |
| Fault 11 | 0 | 0.2 | 0.4 | 99.4 |
| Fault 11 | 5 | 2.1 | 0.3 | 97.6 |
| Fault 11 | 10 | 3.7 | 0.7 | 95.6 |

5. Conclusions

In this study, an MT-DBNMG based process monitoring and fault identification approach is developed for non-Gaussian processes with randomly missing data. Different from the conventional data-driven process monitoring techniques, the proposed method uses the complete historical data to design and estimate the MT-DBNMG, and then uses the incomplete measurement data to detect and identify faults on-line without imputation. In addition, we focus on the detection and identification of incipient faults, which is a difficult task in the probability framework, since a small variation of a measurement does not cause an obvious variation in its distribution features. By introducing more time slices, MT-DBNMG can exploit the correlation information among the historical data and enhance the identification of incipient faults. The proposed approach is applied to an illustrative CSTR and the TE process, with the results compared against 2T-DBNMG and the conventional Bayesian network method. The results demonstrate that the presented approach can effectively monitor process operation and identify faults, and that it is robust to unknown noise, as small disturbances do not trigger false alarms.

Conflicts of interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to acknowledge the support of the National Natural Science Foundation of China (No. 61374047, No. 61202473) and the Fundamental Research Funds for Central Universities (JUSRP51322B, JUSRP111A49).

References

[1] X. He, Z. Wang, D.H. Zhou, Robust fault detection for networked systems with communication delay and data missing, Automatica 45 (2009) 2634–2639.

[2] H. Dong, Z. Wang, H. Gao, On design of quantized fault detection filters with randomly occurring nonlinearities and mixed time-delays, Signal Process. 92 (2012) 1117–1125.

[3] H. Dong, Z. Wang, J. Lam, H. Gao, Fuzzy-model-based robust fault detection with stochastic mixed time delays and successive packet dropouts, IEEE Trans. Syst. Man Cybern. B Cybern. 42 (2012) 365–376.

[4] J. MacGregor, A. Cinar, Monitoring, fault diagnosis, fault-tolerant control and optimization: data driven methods, Comput. Chem. Eng. 47 (2012) 111–120.

[5] S.J. Qin, Survey on data-driven industrial process monitoring and diagnosis, Annu. Rev. Control. 36 (2012) 220–234.

[6] M. Azhdari, N. Mehranbod, Application of Bayesian belief networks to fault detection and diagnosis of industrial processes, Proceedings of the International Conference on Chemistry and Chemical Engineering, Beijing, China, 2010, pp. 92–96.

[7] Z.D. Zhang, J.L. Zhu, F. Pan, Fault detection and diagnosis for data incomplete industrial systems with new Bayesian network approach, J. Syst. Eng. Electron. 24 (2013) 500–511.

[8] Yu, A particle filter driven dynamic Gaussian mixture model approach for complex process monitoring and fault diagnosis, J. Process Control 22 (2012) 778–788.

[9] T. Chen, J. Zhang, On-line multivariate statistical monitoring of batch processes using Gaussian mixture model, Comput. Chem. Eng. 34 (2010) 500–507.

[10] Z.F. Wang, J.L. Zarader, S. Argentieri, A novel aircraft fault diagnosis and prognosis system based on Gaussian mixture models, Proc. IEEE 12th International Conference on Control Automation Robotics & Vision, Guangzhou, China, 2012, pp. 1794–1799.

[11] X. Xie, H. Shi, Dynamic multimode process modeling and monitoring using adaptive Gaussian mixture models, Ind. Eng. Chem. Res. 51 (2012) 5497–5505.

[12] F. Qi, B. Huang, Bayesian methods for control loop diagnosis in the presence of temporal dependent evidences, Automatica 47 (2011) 1349–1356.

[13] S. Khatibisepehr, B. Huang, S. Khare, Design of inferential sensors in the process industry: a review of Bayesian methods, J. Process Control 23 (2013) 575–1596.

[14] Q. Ji, P. Lan, C. Looney, A probabilistic framework for modeling and real-time monitoring human fatigue, IEEE Trans. Syst. Man Cybern. Syst. Hum. 36 (2006) 862–875.

[15] Y. Zhu, L. Huo, J. Lu, Bayesian networks-based approach for power systems fault diagnosis, IEEE Trans. Power Deliv. 21 (2006) 634–639.

[16] S. Verron, J. Li, T. Tiplica, Fault detection and isolation of faults in a multivariate process with Bayesian network, J. Process Control 20 (2010) 902–911.

[17] Q. Li, Z. Li, Q. Zhang, L. Zeng, Research of Bayesian networks application to transformer fault diagnosis, Artificial Intelligence and Computational Intelligence, Springer Berlin, Heidelberg, 2011, pp. 385–391.

[18] L. Yang, J. Lee, Bayesian belief network-based approach for diagnostics and prognostics of semiconductor manufacturing systems, Robot. Comput. Integr. Manuf. 28 (2012) 66–74.

[19] T. Yamaguchi, S. Inagaki, T. Suzuki, Data based construction of Bayesian network for fault diagnosis of event-driven systems, Proc. IEEE International Conference on Automation Science and Engineering, Guangzhou, China, 2012, pp. 508–514.

[20] J. Tao, Q. Li, C. Zhu, J. Li, A hierarchical naive Bayesian network classifier embedded GMM for textural image, Int. J. Appl. Earth Obs. Geoinf. 14 (2012) 139–148.

[21] X. Wei, Evolutionary continuous optimization by Bayesian networks and Gaussian mixture model, IEEE 10th International Conference on Signal Processing, Beijing, China, 2010, pp. 1437–1440.

[22] J. Yu, A new fault diagnosis method of multimode processes using Bayesian inference based Gaussian mixture contribution decomposition, Eng. Appl. Artif. Intell. 26 (2013) 456–466.

[23] J.W. Robinson, A.J. Hartemink, Learning non-stationary dynamic Bayesian networks, J. Mach. Learn. Res. 11 (2010) 3647–3680.

[24] K.P. Murphy, Dynamic Bayesian Networks: Representation, Inference and Learning, (Ph.D. Thesis) University of California, 2002.

[25] F. Camci, R.B. Chinnam, Dynamic Bayesian networks for machine diagnostics: hierarchical hidden Markov models vs. competitive learning, Proc. IEEE International Joint Conference on Neural Networks, Montreal, Canada, 2005, pp. 1752–1757.

[26] M. Fan, Z. Liu, X. Huang, W. Shi, Research on SOM-DBN based fault early warning system for dispatching automation, Proc. International Conference on Power System Technology, Hangzhou, China, 2006, pp. 1–4.

[27] Z. Li, L. Cheng, X.S. Qiu, L. Wu, Fault diagnosis for high-level applications based on dynamic Bayesian network, Management Enabling the Future Internet for Changing Business and New Computing Services, Springer Berlin Heidelberg, 2009, pp. 61–70.

[28] S. Jha, W. Li, S.A. Seshia, Localizing transient faults using dynamic Bayesian networks, Proc. IEEE International High Level Design Validation and Test Workshop, California, USA, 2009, pp. 82–87.

[29] Y. Cheng, T. Xu, Yang, Bayesian network based fault diagnosis and maintenance for high-speed train control systems, Proc. IEEE International Conference on Reliability, Risk, Maintenance, and Safety Engineering, Chengdu, China, 2013, pp. 1753–1757.

[30] K. Al-jonid, J. Wang, M. Nurudeen, A new fault classification model for prognosis and diagnosis in CNC machine, Proc. IEEE 25th Chinese Control and Decision Conference, Guiyang, China, 2013, pp. 3538–3543.

[31] J. Yu, M. Rashid, A novel dynamic Bayesian network-based networked process monitoring approach for fault detection, propagation identification, and root cause diagnosis, AIChE J. 59 (2013) 2348–2365.

[32] X. Jin, S.Y. Wang, B. Huang, F. Forbes, Multiple model based LPV soft sensor development with irregular/missing process output measurement, Control. Eng. Pract. 20 (2012) 165–172.

[33] J. Deng, B. Huang, Identification of nonlinear parameter varying systems with missing output data, AIChE J. 58 (2012) 3454–3467.

[34] J.L. Schafer, J.W. Graham, Missing data: our view of the state of the art, Psychol. Methods 7 (2002) 147–177.

[35] S. Imtiaz, S. Shah, Treatment of missing values in process data analysis, Can. J. Chem. Eng. 86 (2008) 838–858.

[36] J.L. Zhu, Z.D. Zhang, F. Pan, Fault identification for data incomplete systems with a dynamic Bayesian network approach, Inf. Control. (China) 42 (2013) 499–505.

[37] Y. Liu, J. Chen, Correntropy kernel learning for nonlinear system identification with outliers, Ind. Eng. Chem. Res. 53 (2014) 5248–5260.

[38] S. Khatibisepehr, B. Huang, A Bayesian approach to robust process identification with ARX models, AIChE J. 59 (2013) 845–859.

[39] X.G. Shao, B. Huang, J.M. Lee, Constrained Bayesian state estimation: a comparative study and a new particle filter based approach, J. Process Control 20 (2010) 143–157.

[40] Y. Liu, H.Q. Wang, J. Yu, P. Li, Selective recursive kernel learning for online identification of nonlinear systems with NARX form, J. Process Control 20 (2010) 181–194.

[41] M. Nikravesh, A.E. Farell, T.G. Stanford, Control of nonisothermal CSTR with time varying parameters via dynamic neural network control (DNNC), Chem. Eng. J. 76 (2000) 1–16.

[42] L.H. Chiang, M. Kotanchek, A. Kordon, Fault diagnosis based on Fisher discriminant analysis and support vector machines, Comput. Chem. Eng. 28 (2004) 1389–1401.