May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov...

30
May 9, 2008 IPA Lentedagen, Rhenen 1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1 , Pepijn Crouzen 2 , and Mariëlle Stoelinga 1 . 1 Formal Methods and Tools group CS, University of Twente, NL. 2 Dependable Systems and Software group, CS, Saarland University, Germany

Transcript of May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov...

Page 1: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 1

Dynamic Fault Treeanalysis using

Input/Output Interactive Markov Chains

Hichem Boudali1, Pepijn Crouzen2, and Mariëlle Stoelinga1.

1Formal Methods and Tools groupCS, University of Twente, NL.

2Dependable Systems and Software group,CS, Saarland University, Germany

Page 2: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 2

Introduction:Dependability

Dependability:The trustworthiness of a computer system such that reliance can justifiably be placed upon the service it delivers.

Reliability:

The probability that a computer system does not fail within a given time bound.

Page 3: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 3

Introduction:Formal dependability

Continuous-time Markov chains (CTMC)

States and Markovian transitions Probability of traversing a λ-

transition within t time-units is:

1-e-λt

Tools: Reachability analysis (among others)

λ μ

μ λ

Pepijn Crouzen
Page 4: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 4

Introduction:CTMC characteristics

CTMCs describe probability distributions (phase-type distributions)

Phase-type distributions can approximate any arbitrary distribution arbitrarily closely

Goal: Find a CTMC which describes the probability of system failure within t time-units (i.e. the unreliability of the system)

Problem: Difficult to find the CTMC that models a large system

λ μ

μ λ

Pepijn Crouzen
Page 5: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 5

Introduction:Engineering dependability

Fault Trees (1960’s) Graphical Easy to use Syntax:

Basic events Gates

Semantics: logical formula Problem: Not expressive

enoughMem1fails

CPUfails

Workstation fails

OR

AND

Mem1fails

Page 6: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 6

Introduction:Engineering dependability

Dynamic Fault Trees (1992) Extension of classic fault

trees Additions:

Use of spares Dependencies Order-based failure

Tools: Convert to CTMC P2

System failure

P1

SPARE

OR

S

Page 7: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 7

But…DFT Drawbacks

Scalability Ambiguous syntax and semantics Lack of modularity:

Dynamic modules can not be reused Restrictions on spares and dependencies

Existing analysis technique is hard to extend or modify

Page 8: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 8

Outline

Case study: FTPP system DFT approach Formalizing DFTs

DFT semantics in I/O-IMCs Deep compositionality Extending the DFT formalism

Conclusion Future work

Page 9: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 9

Case study: FTPP

A

D

C

B

A

D

C

B

A DCB

A DCB

NE1

NE3

N

E

2

N

E

4

16 processors divided into 4 groups

4 network elements connect the processors

Per group 2 processors must be operational

Different configurations are possible

Page 10: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 10

D

D

D

D

S

S

S

S

C

B

AA

C

B

A

B

CC

A

B

NE1

NE3

N

E

2

N

E

4

Case study: FTPP

16 processors divided into 4 groups

4 network elements connect the processors

Per group 2 processors must be operational

Different configurations are possible

Dynamic redundancy management is possible

How reliable is each configuration?

Page 11: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 11

FTPP DFT

Group 1 Failure

2/3

S

CBA

Group 2 Failure

2/3

S

CBA

Group 3 Failure

2/3

S

CBA

Group 4 Failure

2/3

S

CBA

System Failure

FDEP

NE1

A A A A

FDEP

NE2

B B B B

FDEP

NE3

C C C C

FDEP

NE4

S S S S

OR

S

S

S

S

C

B

AA

C

B

A

B

CC

A

B

NE1

NE3

N

E

2

N

E

4

Page 12: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 12

C

A B

0.2

0.20.4

0.4

Failure rate:0.2 f/h

Failure rate:0.4 f/h

AND-gate Starting state:A is operationalB is operational

A has failedB is operational

Pr(A fails in T hours) = 1 – e-0.2•T

A’s Mean time to failure = 1/0.2 = 5 hours

A is operationalB has failed

A has failedB has failed

For static fault trees binary decision diagrams can be used! Otherwise: Convert the DFT into a CTMC. Analyze CTMC using standard solution techniques.

Existing DFT analysis[Dugan et al. 1992]

Unreliability =

Prob[Reaching in time T]

But…State space explosion:

CTMC grows exponentiallyFTPP difficult to analyze

Page 13: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 13

FTPP Results

Group 1 Failure

2/3

S

CBA

Group 2 Failure

2/3

S

CBA

Group 3 Failure

2/3

S

CBA

Group 4 Failure

2/3

S

CBA

System Failure

FDEP

NE1

A A A A

FDEP

NE2

B B B B

FDEP

NE3

C C C C

FDEP

NE4

S S S S

Analysis

methodMax number of

statesMax number of

transitionsUnreliability

(T=10)

Standard 32757 426826 2.55479 · 10-8

Compositional 1325 14153 2.55479 · 10-8

S

S

S

S

C

B

AA

C

B

A

B

CC

A

B

NE1

NE3

N

E

2

N

E

4

Page 14: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 14

What’s behind it?

Model local behavior We need compositional Markov

chains Combination of LTS and CTMC,

with I/O automata features Markovian transitions (CTMC) Interactive transitions (LTS) Action signature (IOA)

? - Input actions ! - Output actions ; - Internal actions

λ

failed!

I/O-IMC for

Basic event

Input/Output Interactive Markov Chains (I/O-IMC)

Page 15: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 15

Properties of IMCs: Combines stochastic behavior and interactive

behavior orthogonally CSP-style synchronization + interleaving semantics Maximal progress for internal transitions

Properties of IOIMCs: Unique outputs Input enabledness Outputs cannot be blocked! Maximal progress for output transitions

Input/Output Interactive Markov Chains

λ τ

Page 16: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 16

f(C)!f(A)?

f(B)?

f(B)?

f(A)?

f(C)!f(A)?

f(B)?

f(B)?

DFT semanticsDFT gate to I/O-IMC

Page 17: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 17

What is deep compositionality?

Group 1 Failure

2/3

S

CBA

Semantics of a DFT arises naturally ascomposition of the semantics of its building blocks

But: This may lead to huge models.

f(G1)

f(NE1) f(NE4)…f(NE1) f(NE4)

f(G1)

f(NE2) f(NE3)

Page 18: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 18

Why use deep compositionality?

Formally define semantics Many useful techniques

Combining models: Composition Refining models: Hiding Minimizing models: Bisimulation Reusing models: Renaming

Well supported by CADP toolset (VASY/INRIA)

Combat

State-space

explosion

Page 19: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 19

Compositional Aggregation

Translation Composition +

Abstraction

Aggregation

(minimization)

Repeat

Aggregated system CTMC (CTMDP)

Result: System failure probability

Analysis

Page 20: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 20

Compositional AggregationExample

f(C)!f(A)?

f(B)?

f(B)?

f(A)?Failure rate:

0.2 f/h

Failure rate:

0.4 f/h

f(A)!0.2 f(B)!0.4

Page 21: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 21

Compositional AggregationParallel Composition

1 2 3

1

2

3

4 5

1||1

0.2 f(A)!

f(A)?

f(A)?f(B)?

f(B)?

f(C)!

0.2

f(B)?

f(B)?

f(A)!

f(C)!1||2

2||3

3||1

f(B)?

0.2

f(A)!

3||2

4||3 5||3Inputs: f(A)? and f(B)?Outputs: f(C)!

Inputs: noneOutputs: f(A)!

C

A

C||A

Synchronize on f(A)

Page 22: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 22

f(A);

f(A);f(A)!

f(A)!

Compositional AggregationAbstraction (hiding)

1||10.2

f(B)?

f(B)?

f(B)?

0.2

f(C)!1||2

2||3

3||1

3||2

4||3 5||3

C

A B

Abstraction (hiding):

Makes signal internal

Page 23: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 23

f(A);

f(A);

Compositional AggregationAggregation (weak bisimulation)

1||10.2

f(B)?

f(B)?

f(B)?

0.2

f(C)!1||2

2||3

3||1

3||2

4||3 5||3

Weak bisimulation:

Disregard internal steps

Aggregation:

Finding a smaller model

equivalent (behaviorally)

to the original

Page 24: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 24

Compositional AggregationExample (continued)

1 2 3

1

2

3

4 5

1||1

0.2 f(B)!

0.2

0.2f(B)?

f(B)?

f(C)!

0.2

0.4

2||1

1||2

0.4

0.2

2||2

C||A

B

C||A||B

f(B)!

f(B)!4||3

3||3

0.2

f(C)!

5||3

Page 25: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 25

Compositional AggregationExample (continued)

0.2

0.4

0.4

0.2

C||A||Bf(C)!

Page 26: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 26

DFT extensions

Extensions: Inhibition Repair-policies Complex spares Complex dependencies …

Adding extensions in the compositional framework is easy: Modify translation of DFT building blocks Compositional aggregation algorithm is

unaltered

Free!

DSN07

Page 27: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 27

Extension: Repair

f(C)!f(A)?

f(B)?

f(B)?

f(A)?

r(C)!

r(A)?

r(A)?

r(B)?

r(B)?

r(C)!

r(C)!r(A)? r(B)?

r(B)? r(A)?

λ

f(A)!

µr(A)!

AND-gate C

Basic event A

Page 28: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 28

Conclusion:How we tackled drawbacks

State-space explosion. Ambiguous syntax and

semantics. Lack of modularity:

Dynamic modules can not be reused.

Restrictions on spares and dependencies.

Existing analysis technique is hard to extend and/or modify.

Compositional Aggregation

DAG

Extensions at thelowest level

I/O-IMC

Formal translation

Renaming!

Lifted!

Page 29: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 29

Future work

Fully automated tool (CORAL) More aggressive state reduction

Recent work: specialized acyclic algorithm

Apply deep compositionality to more advanced engineering formalisms! (see Boudali et al., DSN08)

Extend DFT formalism Repair Failure modes Non-exponential failure distributions Sophisticated dependencies

Page 30: May 9, 2008IPA Lentedagen, Rhenen1 Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali 1, Pepijn Crouzen 2, and Mariëlle.

May 9, 2008 IPA Lentedagen, Rhenen 30

The end!

Questions?