Application of Maximum Entropy Principle to software failure prediction Wu Ji Software Engineering...

Post on 18-Jan-2018



Application of Maximum Entropy Principle to software failure prediction

Wu Ji
Software Engineering Institute
BeiHang University

Agenda

• Introduction
• Problem and focus
• Method and models
• Results
• Conclusions

Introduction

• Failure prediction is one of the key problems for software quality (reliability) estimation.

• Generally, failure prediction can be defined as y = f(x).
– y is the failure-related variable
– x is the foundation on which the prediction works

• As far as we know, x has been set as:
– Software execution time → reliability growth prediction
– Software execution trace → anomaly detection

Introduction (cont.)

• Reliability has been a big concern for high reliability requirement (HRR) software.

• Reliability engineering is very costly, so reliability testing is seldom done for software without HRR.

• Anomaly detection is usually implemented as a built-in module of software.

Introduction (cont.)

• Generally, all managers strive for high quality.

• What does a manager really care about in failure prediction?
– Given a usage scenario, will the software survive?

• How to predict software failure from its input is still a new problem.

Problem and focus

How to predict failure from software input?

Problem and focus (cont.)

[Figure: execution time line. Labels: execution start s, left context, t, failure observation = ? (0/1)]

Problem and focus (cont.)

• If we can model the left context, we get the distribution {(lc, fo)}.

[Figure: block diagram. Labels: Software input, Failure observation, {(lc, fo)}, Failure Learning, Failure law, Failure Prediction]
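To make the {(lc, fo)} pairs concrete, here is a minimal sketch of how they might be collected from one recorded execution. The function name `left_context_pairs`, the `window` parameter, and the choice to let the context include the input at the current step are assumptions made for illustration; the slides do not specify how the left context is encoded.

```python
from typing import List, Sequence, Tuple


def left_context_pairs(
    inputs: Sequence[str],
    observations: Sequence[int],
    window: int = 2,
) -> List[Tuple[Tuple[str, ...], int]]:
    """Pair each failure observation (0/1) with a partial left context.

    Assumption: the partial left context at step t is the last `window`
    inputs up to and including step t.
    """
    if len(inputs) != len(observations):
        raise ValueError("inputs and observations must have the same length")
    pairs = []
    for t, fo in enumerate(observations):
        start = max(0, t - window + 1)
        lc = tuple(inputs[start : t + 1])
        pairs.append((lc, fo))
    return pairs


# Hypothetical execution: three input steps, a failure observed at the last one.
pairs = left_context_pairs(["open", "edit", "save"], [0, 0, 1], window=2)
```

Pairs gathered this way over many executions would form the training data for failure learning.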

Method and models

• The whole left context is hard to model.
– A probability model: p_o(y|x)
– x: partial left context, y: failure observation.

• Maximum Entropy Principle (MEP) is applied to model p_o(y|x).

$$p_o(y \mid x) = \frac{1}{Z_o(x)} \exp\Big\{ \sum_{1 \le r \le N} \lambda_r f_r(x, y) \Big\}$$

$$Z_o(x) = \sum_{y \in \{0,1\}} \exp\Big\{ \sum_{1 \le r \le N} \lambda_r f_r(x, y) \Big\}$$
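The exponential form of p_o(y|x) can be evaluated numerically as in the following sketch. It assumes the standard maximum entropy parameterization, with one weight per feature function; the names `maxent_probability`, `features`, and `weights` are illustrative and do not come from the slides.

```python
import math
from typing import Callable, Sequence

# A feature function maps (partial left context x, failure observation y) to a number.
Feature = Callable[[object, int], float]


def maxent_probability(
    x: object,
    y: int,
    features: Sequence[Feature],
    weights: Sequence[float],
) -> float:
    """Return p(y | x) for y in {0, 1} under an exponential (maximum entropy) model."""
    scores = {
        label: sum(w * f(x, label) for w, f in zip(weights, features))
        for label in (0, 1)
    }
    # Subtract the largest score before exponentiating for numerical stability.
    top = max(scores.values())
    z = sum(math.exp(s - top) for s in scores.values())
    return math.exp(scores[y] - top) / z


# Hypothetical feature: fires when "save" appears in the context and a failure is observed.
def save_then_fail(x, y):
    return 1.0 if "save" in x and y == 1 else 0.0


p_fail = maxent_probability(("edit", "save"), 1, [save_then_fail], [math.log(3.0)])
```

With a single feature and weight log 3, the failure outcome is three times as likely as the non-failure outcome whenever the feature fires, so `p_fail` is about 0.75.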

Method and models (cont.)

• MEP is a well-known and widely used learning principle:
– Great generalization ability
– Dynamic and open
– Adapts well to sparse data

Method and models (cont.)

• Structure viewer: failure cannot be well modeled without modeling the fault. → Structure Model

• Surface viewer: failure can be well modeled from the input alone, together with its relations to failures. → Surface Model

Method and models (cont.)

• Surface Model: learns the statistical co-occurrence of the surface information.

• Structure Model: learns the statistical cause-effect (fault-failure) relationship.

Method and models (cont.)

[Figure: The features applied in the surface model. Labels: SIU-Seg-Ftrs, SIU-Num-Ftrs, Failure-Ftrs, Flr]

Method and models (cont.)

[Figure: The features applied in the structure model. Labels: Fault-Ftrs, (Flt -> Flr) Ftrs, Failure-Ftrs, Flr]

Method and models (cont.)

• Supervised training
• Training data
• Objective: maximize the likelihood function.
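As an illustration of likelihood maximization for a two-outcome exponential model, the sketch below uses plain gradient ascent on the conditional log-likelihood. The optimizer, step size, and all names (`predict`, `log_likelihood`, `train`) are assumptions; the slides do not say which training algorithm was used.

```python
import math


def predict(x, features, weights):
    """Return [p(0 | x), p(1 | x)] under the exponential model."""
    scores = [sum(w * f(x, y) for w, f in zip(weights, features)) for y in (0, 1)]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def log_likelihood(data, features, weights):
    """Conditional log-likelihood of the (x, y) training pairs."""
    return sum(math.log(predict(x, features, weights)[y]) for x, y in data)


def train(data, features, steps=200, learning_rate=0.5):
    """Gradient ascent: each gradient term is observed minus expected feature value."""
    weights = [0.0] * len(features)
    for _ in range(steps):
        grad = [0.0] * len(features)
        for x, y in data:
            probs = predict(x, features, weights)
            for r, f in enumerate(features):
                expected = sum(probs[label] * f(x, label) for label in (0, 1))
                grad[r] += f(x, y) - expected
        weights = [w + learning_rate * g / len(data) for w, g in zip(weights, grad)]
    return weights


# Hypothetical training data: contexts as tuples of inputs, labels as 0/1 observations.
def save_then_fail(x, y):
    return 1.0 if "save" in x and y == 1 else 0.0


data = [(("save",), 1)] * 3 + [(("save",), 0), (("open",), 0)]
weights = train(data, [save_then_fail])
```

On this toy data, three of the four "save" contexts end in failure, so the trained model's failure probability for a "save" context approaches 0.75.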

Method and models (cont.)

• Model evaluation:
– For a given test case:
  • The test engineer runs it and gets the test_fo_sequence;
  • The prediction model returns the predicted pred_fo_sequence.
– Evaluate by the match degree (precision) between test_fo_sequence and pred_fo_sequence.
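One plausible reading of the match degree is the fraction of positions at which the two sequences agree. The sketch below implements that reading; the exact definition used in the experiments is not given in the slides, so treat `match_precision` as an assumption.

```python
from typing import Sequence


def match_precision(
    test_fo_sequence: Sequence[int],
    pred_fo_sequence: Sequence[int],
) -> float:
    """Fraction of positions where the observed and predicted failure flags agree."""
    if len(test_fo_sequence) != len(pred_fo_sequence):
        raise ValueError("sequences must have the same length")
    if not test_fo_sequence:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for t, p in zip(test_fo_sequence, pred_fo_sequence) if t == p)
    return matches / len(test_fo_sequence)


# Hypothetical sequences: the prediction disagrees at one of four positions.
score = match_precision([0, 0, 1, 0], [0, 1, 1, 0])
```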

Results

• Two groups of experiments, involving 5 software systems in total and 17 tests.

• Open test method
– The testing data is kept separate from the training data and is unseen during training.

• Surface Model: average precision: 0.876
• Structure Model: average precision: 0.858

Results (cont.)

[Figure: Evaluation Score Distribution. Bar chart of Surf_prec and Struc_prec counts (vertical axis 0 to 14) over the precision intervals (0.5,0.65], (0.65,0.75], (0.75,0.85], (0.85,1.0]]

Results (cont.)

[Figure: Model Performance with TDS. Precision (axis 0.400 to 1.000) against Training Data Size (61, 64, 67, 70, 72, 73, 77, 83, 111, 257, 300, 320, 324, 344, 403, 462, 462) for the Surface Model and the Structure Model]

Results (cont.)

[Figure: Model Performance with average TDS. Precision (axis 0.400 to 1.000) against Average TDS for SIU, for the Surface Model and the Structure Model]

Results (cont.)

• Potential applications of the prediction model
– Test case prioritization
– Reliability Estimation
– Reliability Growth Modeling
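For the test case prioritization application, one simple approach would be to run the most failure-prone test cases first. The sketch below assumes a callable that returns a predicted failure probability per test case; `prioritize` and the example probabilities are hypothetical and not taken from the experiments.

```python
from typing import Callable, List, Sequence


def prioritize(
    test_cases: Sequence[str],
    failure_probability: Callable[[str], float],
) -> List[str]:
    """Order test cases from highest to lowest predicted failure probability."""
    return sorted(test_cases, key=failure_probability, reverse=True)


# Hypothetical predicted probabilities, standing in for the prediction model's output.
predicted = {"tc_login": 0.10, "tc_export": 0.80, "tc_search": 0.35}
ordered = prioritize(list(predicted), predicted.get)
```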

Conclusions

• A new failure prediction problem
• Apply a statistical learning method to learn the failure law and then predict failure
• Two models: the surface model and the structure model
• Promising evaluation results:
– Surface Model: 0.876
– Structure Model: 0.858

Conclusions (cont.)

• Lessons learnt:
– Design and start experiments as early as possible to verify the model.
– A complex model does not always perform well; consider model simplification.
– Do not make strong assumptions about how the data is generated.

Thank you for your attention

Ready for questions!