Sparse Coding-Based Failure Prediction for Prudent Operation of LED Manufacturing Equipment

Jia-Min Ren1, Chuang-Hua Chueh1, and H. T. Kung2

1Computational Intelligence Technology Center, Industrial Technology Research Institute, Hsinchu, Taiwan [email protected]

[email protected]

2Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge MA, USA [email protected]

ABSTRACT

A sudden failure of a critical component in light-emitting diode (LED) manufacturing equipment would result in unscheduled downtime, leading to a possibly significant loss in productivity for the manufacturer. It is therefore important to be able to predict upcoming failures. A major obstacle to failure prediction is the limited amount of equipment lifecycle data available for training, as equipment failure is not expected to be frequent. This calls for machine learning techniques capable of making accurate failure predictions with limited training data. This paper describes such a method based on sparse coding. We demonstrate the prediction performance of the method on a real-world dataset from LED manufacturing equipment. We show that sparse coding can draw out salient features associated with failure cases, and can thus produce accurate failure predictions. We also analyze how sparse coding-based failure prediction can lead to significant efficiency improvements in equipment operation.

1. INTRODUCTION

Reducing costs and increasing productivity are crucial concerns in the competitive manufacturing industry. Many manufacturers are seeking to implement intelligent manufacturing processes, including the use of automated data analysis techniques (Scoville, 2011), which can allow for cost- and time-saving predictive maintenance (Rothe, 2008). This paper addresses approaches to optimizing predictive maintenance methods for light-emitting diode (LED) manufacturing equipment.

Modern LEDs are multilayered structures of chemical materials in which the thickness and composition of the various layers determine the color and brightness of the emitted light and the device's energy efficiency. The layers are deposited sequentially through the metal organic chemical vapor deposition (MOCVD) process, a critical determinant of LED performance. The crystalline structure of each new layer is epitaxially aligned with that of the underlying layer. This complex process is affected by the conditions of a multitude of components, such as pumps, heaters, mass flow controllers, and particle filters¹. Unexpected component failure can reduce LED production yields, and finding and repairing the source of the failure can take engineers up to 5 days. For example, the failure of a particle filter will cause the pump to shut down, resulting in all the raw materials consumed in that run being wasted. Here, we focus on developing a failure prediction algorithm for the particle filter to ensure continuous high-performance operation in the MOCVD process.

Learning features associated with failure cases plays a critical role in a data-driven prediction approach. The goal is to come up with a compact yet discriminative feature representation, in which samples related to failure cases can be accurately expressed and easily differentiated from others. Many feature learning methods have been discussed in the literature; see, e.g., Huang & Aviyente, 2006. These include principal component analysis (PCA), linear discriminant analysis (LDA) and sparse coding (SC). The SC approach used in this paper has emerged as one of the most popular feature learning methods in recent years, in areas such as computer vision (Wright et al., 2010). It computes a sparse representation of input data in terms of a linear combination of atoms in an overcomplete dictionary (more details are given in Section 4.1). Compared to methods based on orthonormal transformations, SC has been shown to offer superior performance in a variety of applications including face recognition (Wright et al., 2009), emotion recognition (Chen et al., 2015), and wireless link prediction (Tarsa et al., 2015). Therefore, the proposed failure predictor is based on SC.
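As a toy illustration of this idea (the dictionary, input signal, and regularization value below are invented for the example), scikit-learn's sparse_encode recovers a sparse coefficient vector whose linear combination of atoms reconstructs the input:

```python
import numpy as np
from sklearn.decomposition import sparse_encode

# an overcomplete dictionary: 6 atoms for 4-dimensional inputs (toy values)
D = np.array([[1., 0., 0., 0.],
              [0., 1., 0., 0.],
              [0., 0., 1., 0.],
              [0., 0., 0., 1.],
              [.5, .5, .5, .5],
              [1., 1., 0., 0.]])
x = np.array([[2., 2., 0., 0.]])

# LASSO-based encoding yields a sparse coefficient vector alpha with x ~= alpha @ D;
# here most of the weight falls on the single atom [1, 1, 0, 0]
alpha = sparse_encode(x, D, algorithm='lasso_lars', alpha=0.01)
```

With a small penalty, the reconstruction alpha @ D is close to x while most coefficients stay at zero, which is the "sparse linear combination of atoms" described above.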

1 Note that by a particle filter we mean in this paper a physical filter in MOCVD equipment. This is not to be confused with particle filtering methods in the diagnostics and prognostics prediction literature (see, e.g., Orchard & Vachtsevanos, 2007).

J.-M. Ren et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Annual Conference of the Prognostics and Health Management Society 2015

In evaluating our SC-based approach, we use real-world sensor data from LED MOCVD equipment. We found that the SC-based failure prediction method can improve F-measure about 39% over a conventional PCA-based approach. In terms of annual uptime of MOCVD equipment operation, the SC-based failure prediction method can increase it by about 600 hours over a traditional preventative maintenance policy. To the best of our knowledge, this work is the first application of SC to the prediction of component failure in MOCVD equipment.

The remainder of this paper is organized as follows: Section 2 provides an overview of the failure prediction problem and prior work. Section 3 describes the dataset used in our experiments. Section 4 explains the basic concept of sparse coding and the proposed failure prediction pipeline for MOCVD equipment. In Section 5, we show the experimental results of PCA-based and SC-based failure prediction methods. Cost-benefit analysis for different maintenance and prediction policies is discussed in Section 6. Conclusions are given in Section 7.

2. FAILURE PREDICTION PROBLEM AND PRIOR WORK

The goal of failure prediction is to predict an upcoming component failure in MOCVD equipment, and raise an alarm or a maintenance advice to the manufacturer who can then intervene to prevent unscheduled downtime. In this paper, we focus on next-run failure prediction. In other words, following each run, the system provides a prediction of whether the particle filter will fail in the next run. Here a run denotes an execution of a fabrication task on MOCVD equipment. The next-run failure prediction can be simply considered as a decision problem of predicting a yes or no outcome. We thus address it as a binary classification task.

Failure prediction methods can be roughly categorized into model-based and data-driven approaches (Lee et al., 2014). Model-based approaches have been traditionally used to understand failure mode progression associated with equipment components. In training physics-model parameters, such as those in Kalman or particle filter methods (Orchard & Vachtsevanos, 2007), model-based methods usually require relatively smaller amounts of data. However, to achieve acceptable performance in prediction, building an appropriate physical model would require detailed mechanistic knowledge and could be time-consuming. In contrast, data-driven approaches build a machine learned model based on observed sensor data from equipment without relying strongly on domain knowledge.

Figure 1. Data illustration of a particle filter replacement cycle. It is an example of the 22 cycles considered: (a) dp.filter raw data over runs in a replacement cycle and (b) dp.filter maximum values over runs in the same cycle. Here a run denotes an execution of a fabrication task on MOCVD equipment. (Runs may vary somewhat in their execution time.) The maximum value of dp.filter raw data is used to represent the feature of each run. Thus a replacement cycle can be represented as a sequence of these maximum values. More details about dp.filter are given in Section 3.

3. DATA DESCRIPTION

MOCVD processes produce powders and particulates which can cause unexpected and significant damage to costly pump equipment. A particle filter is therefore needed to avoid contamination of the pump. Among the sensors in MOCVD equipment, dp.filter is the most critical sensor in monitoring the particle filter status in the level of dust being accumulated there. This sensor measures the difference of pressure between the reactor and the pump. As fabrication tasks are carried out on MOCVD equipment, more powders and particulates are stacked on the filter, so dp.filter values usually increase gradually. Figure 1(a) shows an example of the degradation of the dp.filter signal in a replacement cycle. This cycle consists of multiple runs, with vertical lines denoting the boundaries between runs. Practically, the particle filter will be replaced when the maximum value of dp.filter over a run exceeds a configured threshold (e.g., 30). Accordingly, we extracted the maximum value of dp.filter for each run in our experiments. This means that each cycle was represented as a sequence of dp.filter maximum values, as shown in Figure 1(b).

Note that the dips in Figure 1(b) were caused by executing "clean runs" on MOCVD equipment. Such clean runs are sometimes needed to clean up residual gases in the reactor. The amount of gases used in a clean run is much less than those associated with a regular fabrication task execution, leading to dips in a sequence of subsequent dp.filter maximum values.
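The per-run feature extraction and cycle-end rule described above can be sketched as follows (a minimal sketch with hypothetical dp.filter values; the threshold of 30 comes from the text, and the faulty-run labeling follows the next-run prediction setup):

```python
# toy raw dp.filter trace (hypothetical values); each inner list is one run
runs = [[8.1, 9.0, 8.7], [9.2, 10.4], [11.0, 12.5, 12.1], [20.3, 31.2]]

# represent each run by the maximum dp.filter value within the run
features = [max(r) for r in runs]  # -> [9.0, 10.4, 12.5, 31.2]

# the filter is replaced once a run's maximum exceeds the threshold (e.g., 30)
THRESHOLD = 30
cycle_end = next(i for i, v in enumerate(features) if v > THRESHOLD)  # -> 3

# for next-run prediction, the run just before the exceeding run is labeled faulty
faulty_run = cycle_end - 1  # -> 2
```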

In total, 22 replacement cycles were collected. In our experiments, the first 15 cycles and the remaining 7 cycles were respectively used to build the training and the test data. To predict whether the particle filter will fail in the next run, we labeled the previous run before the one whose dp.filter maximum value exceeded 30 as a faulty run. To incorporate historical information for failure prediction, we used a sliding window of size 10 runs (with adjacent windows overlapped by one run) to create the feature vector for each run. In other words, we used 10 dp.filter maximum values from the previous nine runs and the present run to represent the feature vector for the present run. Thus each created sample has a dimensionality of 10. Since each replacement cycle consists of a different number of runs (varying from 23 to 104 runs), different numbers of normal samples are thus collected for the training and the test data. Specifically, the training data consists of 387 normal and 15 faulty samples, and the test data is composed of 319 normal and 7 faulty samples.

4. SPARSE CODING BASED FAILURE PREDICTION

4.1 Basic Concept of Sparse Coding

Given N data samples x_i ∈ R^M, we first learn a dictionary D using the following mathematical optimization:

    min_{D ∈ C, α_i ∈ R^K}  Σ_{i=1}^{N} ‖x_i − D α_i‖₂² + λ‖α_i‖₁,
    C ≜ { D ∈ R^{M×K} : ‖d_j‖₂ ≤ 1, ∀ j = 1, …, K }          (1)

for certain λ ≥ 0, where α_i is the sparse code of x_i, D is the dictionary composed of K columns, and d_j is the jth column (i.e., atom) in D. We split samples into smaller patches of length four; therefore, we have M = 4 in our experiments.

Given the learned dictionary D, we consider the following LASSO (Least Absolute Shrinkage and Selection Operator) formulated optimization problem:

    min_{α ∈ R^K}  ‖x − D α‖₂² + λ‖α‖₁          (2)

For a given data point x, by solving the LASSO optimization using methods such as least angle regression (LARS) (Efron et al., 2004) and interior-point (Koh et al., 2007), we find the sparse code α for x.

4.2 Learning and Training Pipelines

Figure 2 shows the learning and training pipelines for particle filter failure prediction via sparse coding. We use the following steps for dictionary learning and SVM classifier training.

1. Patch generating. To capture local variation within each sample, we split each sample into overlapping patches {x_1, …, x_j, …, x_{10−p+1}}, where the patch size is p and the overlapping is one. In our experiments, p is typically set to four.

2. Patch selecting and dictionary learning. Note that for next-run failure prediction, only the second to the last run in a replacement cycle is labeled as faulty. All other runs are labeled as normal. Thus there are far more normal runs than faulty runs. To deal with this imbalance, two dictionaries DN and DF are respectively learned on normal and faulty samples using (1). (Note that if all samples were used to learn a single dictionary, the dictionary would be dominated by normal samples.) Specifically, to discriminatively learn these two dictionaries, only the patches whose values are all smaller than a threshold ThN are used to learn DN. On the other hand, if one value within the patches exceeds ThF, these patches will be used to learn DF. Other patches that do not satisfy either of these two conditions are discarded.

3. Dictionary concatenating. A simple concatenation forms the final dictionary D = [DN|DF].

Figure 2. Dictionary learning and SVM classifier training pipelines.

The following steps are used in SVM classifier training for failure prediction based on sparse code.

1. Patch generating. We split each sample into overlapping patches {x_1, …, x_j, …, x_{10−p+1}}.

2. Sparse coding. Given the learned dictionary D, we use (2) to compute the sparse code of each patch x_j. In total, 10−p+1 sparse codes are thereby obtained for each sample.

3. Max pooling. To incorporate local variation of patches to reflect global features of each sample, we perform max pooling over these 10−p+1 sparse codes to obtain a pooled sparse code z such that the kth element in z, z_k = max(α_{1,k}, α_{2,k}, …, α_{10−p+1,k}), where α_{j,k} is the kth element of α_j. Therefore, each sample is encoded as a pooled sparse code. The effectiveness of using the pooled sparse code in classification, as opposed to the individual α_j, is well known (see, e.g., Chen et al., 2015, and Tarsa et al., 2015).
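As an illustrative end-to-end sketch of the pipeline above — patch generation, per-class dictionary learning, LASSO encoding as in (2), max pooling, and a linear SVM — here is a toy version using scikit-learn. The synthetic data, dictionary sizes, and regularization values are stand-ins rather than the paper's setup, and the threshold-based patch selection of step 2 is omitted for brevity:

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning, sparse_encode
from sklearn.svm import LinearSVC

P = 4  # patch size p

def patches(sample, p=P):
    # overlapping patches {x_1, ..., x_{10-p+1}} of a length-10 sample
    return np.array([sample[j:j + p] for j in range(len(sample) - p + 1)])

rng = np.random.default_rng(0)
normal = rng.normal(10.0, 1.0, size=(40, 10))  # toy stand-ins for dp.filter windows
faulty = rng.normal(25.0, 2.0, size=(6, 10))

def learn_dictionary(samples, n_atoms):
    # learn one dictionary per class from that class's patches, cf. (1)
    X = np.vstack([patches(s) for s in samples])
    dl = DictionaryLearning(n_components=n_atoms, alpha=1.0,
                            max_iter=100, random_state=0)
    return dl.fit(X).components_

# D = [DN | DF]: concatenate the normal and faulty dictionaries
D = np.vstack([learn_dictionary(normal, 10), learn_dictionary(faulty, 3)])

def pooled_code(sample, D, alpha=0.1):
    # LASSO-encode each patch, then max-pool over the 10-p+1 patch codes
    codes = sparse_encode(patches(sample), D, algorithm='lasso_lars', alpha=alpha)
    return codes.max(axis=0)

X = np.array([pooled_code(s, D) for s in np.vstack([normal, faulty])])
y = np.array([0] * len(normal) + [1] * len(faulty))
clf = LinearSVC(C=1.0).fit(X, y)  # pooled sparse codes as SVM features
```

Each sample thus becomes a 13-dimensional pooled code (10 DN atoms plus 3 DF atoms), mirroring the 10/3 dictionary configuration evaluated in Section 5.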

4. Classifier training. We use pooled sparse codes as features to train a linear SVM classifier for failure prediction.

4.3 Failure Prediction on Test Samples

For a test sample, we first divide it into overlapping patches. Then, we compute sparse codes for each patch and perform max pooling on these sparse codes to obtain a pooled feature vector. Finally, the pooled feature vector is used as an input vector for the pre-trained SVM to obtain the prediction result.

5. EXPERIMENTAL SETUP AND RESULTS

5.1 Baseline: PCA+SVM

The proposed method was compared against a PCA-based baseline method. In this baseline method, PCA was used to project data onto a lower dimensional space spanned by a relatively small number of dominant eigenvectors of the covariance matrix of the training data. Then, a linear SVM was used as the predictor.

5.2 Experimental Setup and Evaluation Metrics

To provide a fair comparison, we used the same setting of the penalty parameter in a linear SVM (Chang & Lin, 2011) for both PCA- and SC-based approaches. The maximal number of principal components (PCs) used in the experiments was six, which can explain over 95% of the training data variance. Different numbers of atoms in dictionary DN and dictionary DF were set for comparison. Since a particle filter will be replaced when the maximum value of dp.filter over a run exceeds 30, we set ThN and ThF to 10 and 20 respectively. As mentioned earlier, the patch size p was empirically set to four. Sparse coding is set to use about three non-zero coefficients in our experiments.

Four standard metrics (Salfner et al., 2010)—true positive rate (TPR), false positive rate (FPR), F-measure and the area under the receiver operating characteristic (ROC) curve (AUC)—were used to compare the performance of different methods. Note that a positive sample means a faulty sample in our experiments.

5.3 Patch Selection for Dictionary Learning

To learn the two dictionaries DN and DF that can respectively represent normal and faulty regularities, we generated two patch sets. For this, two patch selection schemes were compared. As shown in Figure 3(a), two patch sets were separately created by considering whether the patch belongs to the last sliding window in each cycle. (Note that we can create seven patches from a window consisting of 10 runs.) Clearly, some patches overlapped, making it difficult to differentiate between atoms in the dictionaries DN and DF. In contrast, we defined two thresholds to select non-overlapped patches. As shown in Figure 3(b), only the patches satisfying the constraints described in Section 4.2 were used for dictionary learning. The other patches were discarded.

Figure 3. Two patch selection schemes. (For simplicity of plot, black points denote other overlapping patches used to train DN.)

Figure 4 shows the sparse codes of a patch at a faulty run using dictionaries learned by the different patch selection schemes. The dictionary DN and the dictionary DF are respectively indexed as 1~15 and 16~18. Obviously, when using patch selection scheme 1, the overlapping of patches makes it difficult to train these two dictionaries discriminatively. Thus the patch at a faulty run can be incorrectly coded by atoms in DN (e.g., indices of 6, 10 and 12, as shown in Figure 4(a)). In contrast, when we separated patches into two sets without overlap, different atoms can be learned in DN and DF. Accordingly, the same patch can be almost coded by only the atom in DF (e.g., the index of 18, as shown in Figure 4(b)). In summary, patch selection scheme 2, as used in our pipeline, creates patches that can be used to train DN and DF more discriminatively.

Figure 4. Sparse codes of a patch at a faulty run using dictionaries learned by patch selection (a) scheme 1 and (b) scheme 2.
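The four metrics of Section 5.2 can be computed as follows (a sketch with made-up labels and scores; faulty is the positive class, per the text):

```python
import numpy as np
from sklearn.metrics import confusion_matrix, f1_score, roc_auc_score

y_true  = np.array([0, 0, 0, 0, 1, 1, 0, 1])            # 1 = faulty (positive)
y_score = np.array([.1, .2, .8, .3, .9, .7, .4, .6])    # toy predictor scores
y_pred  = (y_score >= 0.5).astype(int)                  # threshold at 0.5

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
tpr = tp / (tp + fn)                 # true positive rate
fpr = fp / (fp + tn)                 # false positive rate
f   = f1_score(y_true, y_pred)       # F-measure
auc = roc_auc_score(y_true, y_score) # area under the ROC curve
```

Sweeping the decision threshold instead of fixing it at 0.5 traces out the ROC curves compared in Figure 5.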

Table 1. Performance comparisons. For PCA+SVM, PCs=2 (in parenthesis) denotes that data is projected onto the first two PCs (a similar definition applies to PCs=3 and PCs=6). For SC+SVM, 10/3 (in parenthesis) means we have 10 atoms in DN and 3 atoms in DF (a similar definition applies to 15/3).

Column Index |    1    |    2    |    3    |    4    |    5
Method       | PCA+SVM | PCA+SVM | PCA+SVM | SC+SVM  | SC+SVM
             | (PCs=2) | (PCs=3) | (PCs=6) | (10/3)  | (15/3)
TPR (%)      |  85.71  |  71.43  |  71.43  |  85.71  |  100
FPR (%)      |  16.93  |  12.23  |   5.64  |   9.4   |   3.17
F-measure    |  0.179  |  0.196  |  0.333  |  0.279  |   0.25
AUC          |  0.916  |  0.911  |  0.847  |  0.942  |  0.983

5.4 Prediction Results

Table 1 compares the performance of the baseline method (PCA+SVM) and the proposed method (SC+SVM). For the PCA+SVM method (columns 1, 2 and 3), when more PCs were used, we obtained higher F-measure values. This means that preserving more PCs is useful for failure prediction. Under TPR equal to 85.71% (columns 1 and 4), the proposed SC+SVM method achieves a lower FPR (9.4%) than that of the PCA+SVM method (16.93%). In other words, the proposed method raised fewer false alarms than the PCA+SVM method. The proposed method (column 5) also achieves the best prediction performance in terms of AUC.

Figure 5 shows the receiver operating characteristic (ROC) curves of these two methods under the best AUC values (columns 1 and 5 in Table 1). From this figure, we observed that the PCA+SVM method raises more false alarms.

Figure 5. ROC curves of the baseline method (PCA+SVM) and the proposed method (SC+SVM).

In addition, when using a non-linear, radius basis function (RBF) SVM as a predictor rather than a linear SVM, the highest AUC value of the PCA-based method achieves 0.925 (under PCs=2) and the highest AUC value of the SC-based method achieves 0.965 (under 15 atoms in DN and 3 atoms in DF). This experiment shows again that the use of SC-based features outperforms that of traditional PCA features when a non-linear SVM is used as a predictor.

Furthermore, to assess robustness of performance against various partitions of the data set into training and test cycles, we also evaluated the performance of two other random partitions. Similar results as mentioned above were found.

6. COST-BENEFIT ANALYSIS OF DIFFERENT REPLACEMENT POLICIES FOR MOCVD EQUIPMENT

In this section, we provide a cost-benefit analysis of the proposed SC-based prediction method when compared with some other methods for MOCVD equipment. Given the difficulty of quantizing detailed factors of costs such as the hardware and software design, engineering qualification and certification (Saxena et al., 2010), we only consider the annual uptime of MOCVD equipment operation. We use the following notations to facilitate the discussion.

UT: average uptime per cycle,
DT: average downtime for a replacement, and
H: the probability that the maintenance time provided by a certain replacement policy is before an actual failure.

Practically, UT is calculated based on the maintenance time under a given replacement policy on the test cycles. DT is calculated as

    DT = H × 1.5 + (1 − H) × 108          (3)

where 1.5 and 108 are average downtimes (in hours) for a scheduled and an unscheduled replacement, respectively. Note that in contrast, the execution time of a run is about 8 hours. Here these average downtimes and the run's execution time were obtained from engineers who maintain MOCVD equipment.

The expected uptime in a year under a replacement policy is calculated by multiplying the number of operation units in a year, each of which is the time duration for a pair of UT and DT, and the average uptime per cycle:

    Expected uptime in a year = (8760 / (UT + DT)) × UT          (4)

where 8760 is the number of hours in a year.

In addition to the SC- or PCA-based predictive maintenance policy, we consider two conventional replacement policies: (1) the run-to-failure replacement policy, under which the particle filter will be used until it fails (i.e., H = 0), and (2) the preventive maintenance policy, under which the particle filter will be replaced once the number of executed runs exceeds the average number of runs in training cycles.

Figure 6 compares uptime against the FPR under different replacement policies for the test data. The X-axis is the FPR of the PCA+SVM and SC+SVM methods. The Y-axis is the annual uptime of the MOCVD equipment. To facilitate the discussion, the total number of particle filters for which no alarm was raised under a replacement policy before they failed is denoted as #misses.
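The downtime and expected-uptime formulas above can be turned into a small calculator (a sketch; the 8760 hours-per-year constant and the UT value in the usage comment are assumptions for illustration, while the 1.5 h and 108 h downtimes come from the text):

```python
def downtime(h):
    """Expected downtime DT for one replacement: DT = H*1.5 + (1-H)*108,
    where 1.5 h is a scheduled replacement, 108 h an unscheduled one, and
    H is the probability the policy replaces the filter before it fails."""
    return h * 1.5 + (1 - h) * 108

def expected_annual_uptime(ut, h, hours_per_year=8760):
    """Number of (UT, DT) operation units fitting in a year, times the
    average uptime UT per cycle (hours_per_year is an assumed constant)."""
    return hours_per_year / (ut + downtime(h)) * ut

# e.g., with an assumed UT of 500 h per cycle, a perfect predictor (H = 1)
# only ever pays the scheduled 1.5 h downtime:
# expected_annual_uptime(500, 1.0) ~= 8734 h
```

Comparing the same UT under H = 1 and H = 0 makes the uptime gap between predictive and run-to-failure policies in Figure 6 easy to reproduce.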

Page 6: Sparse Coding-Based Failure Prediction for Prudent Operation of …htk/publication/2015-phm-ren-chueh-kung.pdf · Sparse Coding-Based Failure Prediction for Prudent Operation of LED

ffap#PFv

I

ttti

Iowcp3F

Iost

ItPthzgmtbl

7

Ipebbpmbtcp

TIal

failed is denotfailure replaceare used untilpolicy, only #misses is 1. TPCA+SVM aFPR subintervvarious metho

Interval1 (#miSC+SVM). Wthat an alarm the baseline mthat of prevenin the lower-ri

Interval2 (#misor 3 for PCAwith preventivcan see that thpreventive ma300+ hours (aFigure 6).

Interval3 (#mior 2 for PCsuccessfully ratherefore achi

In summary, the other repParticularly, tthe preventivehours under Fzoomed-in pagiven data semethod is usethat if FPR isbased failure leading to a sh

7 CONCLUS

In this paperprediction mequipment. Ubased methodbaseline methpreventative mmethod increaby 600+ hourthe first applicomponent performance d

The paper focIt is generallyaccurate than latter are accu

Figure 6. Annual uptime of MOCVD equipment operation under different replacement policies.

ACKNOWLEDGEMENT

This work is partially supported by the project of Big Data Technologies and Applications, Computational Intelligence Technology Center, Industrial Technology Research Institute, Hsinchu, Taiwan. H. T. Kung's research at Harvard for this paper was supported in part by gifts from the Intel Corporation and in part by the Naval Postgraduate School Agreement No. N00244-15-0050 awarded by the Naval Supply Systems Command. The views expressed in this paper do not necessarily reflect the official policies of the Naval Postgraduate School, nor does mention of trade names, commercial practices, or organizations imply endorsement by the U.S. Government.

REFERENCES

Chang C. C., & Lin C. J. (2011). LIBSVM: a library for support vector machines, ACM Transactions on Intelligent Systems and Technology, Vol. 2, No. 3, pp. 1-27.

Chen H. C., Comiter M., Kung H. T., & McDanel B. (2015). Sparse Coding Trees with Application to Emotion Recognition, In Proceedings of IEEE Workshop on Analysis and Modeling of Faces and Gestures, pp. 77-86.

Dietterich T. G. (2000). Ensemble Methods in Machine Learning, Multiple Classifier Systems, Lecture Notes in Computer Science, Vol. 1857, pp. 1-15.

Efron B., Hastie T., Johnstone I., & Tibshirani R. (2004). Least angle regression, The Annals of Statistics, Vol. 32, No. 2, pp. 407-499.

Huang K., & Aviyente S. (2006). Sparse Representation for Signal Classification, Advances in Neural Information Processing Systems, Vol. 19, pp. 131-138.



Koh K., Kim S. J., & Boyd S. (2007). An Interior-Point Method for Large-Scale l1-Regularized Logistic Regression, The Journal of Machine Learning Research, Vol. 8, pp. 1519-1555.

Lee J., Wu F., Zhao W., Ghaffari M., Liao L., & Siegel D. (2014). Prognostics and Health Management Design for Rotary Machinery Systems—Reviews, Methodology and Applications, Mechanical Systems and Signal Processing, Vol. 42, pp. 314-334.

Mairal J., Bach F., Ponce J., & Sapiro G. (2009). Online dictionary learning for sparse coding, In Proceedings of the 26th Annual International Conference on Machine Learning, pp. 689-696.

Mairal J., Bach F., Ponce J., & Sapiro G. (2010). Online learning for matrix factorization and sparse coding, The Journal of Machine Learning Research, Vol. 11, pp. 19-60.

Orchard M. E., & Vachtsevanos G. J. (2007). A Particle Filtering-based Framework for Real-time Fault Diagnosis and Failure Prognosis in a Turbine Engine, Mediterranean Conference on Control and Automation, pp. 1-6.

Salfner F., Lenk M., & Malek M. (2010). A Survey of Online Failure Prediction Methods, ACM Computing Surveys, Vol. 42, No. 3, pp. 1-42.

Saxena A., Roychoudhury I., Celaya J., Saha S., Saha B., & Goebel K. (2010). Requirements Specifications for Prognostics: An Overview, AIAA Infotech@Aerospace Conference.

Scoville J. (2011). Predictability as a Key Component of Productivity, ISMI Manufacturing Week 2011, Austin, TX.

Tarsa S. J., Comiter M., Crouse M., McDanel B., & Kung H. T. (2015). Taming Wireless Fluctuations by Predictive Queuing Using a Sparse-Coding Link-State Model, ACM MobiHoc, pp. 287-296.

Wright J., Yang A. Y., Ganesh A., Sastry S. S., & Ma Y. (2009). Robust Face Recognition via Sparse Representation, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 31, No. 2, pp. 210-227.

Wright J., Ma Y., Mairal J., Sapiro G., Huang T. S., & Yan S. (2010). Sparse Representation for Computer Vision and Pattern Recognition, Proceedings of the IEEE, Vol. 98, No. 6, pp. 1031-1044.

Zhu J., Nostrand T., Spiegel C., & Morton B. (2014). Mechanical Diagnostics System Engineering in IMS HUMS, In Proceedings of Annual Conference of the Prognostics and Health Management Society, pp. 635-647.

BIOGRAPHIES

Jia-Min Ren received his Ph.D. from the Computer Science Department, National Tsing Hua University, Hsinchu, Taiwan. He currently works at the Computational Intelligence Technology Center, Industrial Technology Research Institute, Hsinchu, Taiwan. His research interests include machine learning, semantic analysis of musical signals, music information retrieval, and analysis of manufacturing data.

Chuang-Hua Chueh received his Ph.D. from the Computer Science Department, National Cheng Kung University, Tainan, Taiwan. He currently works at the Computational Intelligence Technology Center, Industrial Technology Research Institute, Hsinchu, Taiwan. His research interests include machine learning, pattern recognition and speech recognition.

H. T. Kung is William H. Gates Professor of Computer Science and Electrical Engineering at Harvard University. He is interested in computing, communications and sensing, with a current focus on machine learning, compressive sampling and the Internet of Things. He has been a consultant to major companies in these areas. Prior to joining Harvard in 1992, he taught at Carnegie Mellon University for 19 years after receiving his Ph.D. there. Professor Kung is well-known for his pioneering work on systolic arrays in parallel processing and optimistic concurrency control in database systems. His academic honors include membership in the National Academy of Engineering in the US and the Academia Sinica in Taiwan.