Dimensions of Formal Verification and Validation

Doron Drusinsky, Bret Michael, Mantak Shing

Naval Postgraduate School


Contents

1. Tradeoffs MC vs. TP vs. EMC

[Figure: diagram relating Model/Program, Specification, and Verification]

The Role of Specification: “Have we built the right product?”

[Figure: diagram relating Model/Program, Specification, and Verification]

Customer cognitive requirements, e.g.:

“If pump pressure is turned Low, then High, and then Low again, all within 10 seconds, then the pump should not be High for at least 20 additional seconds.”

Spec. = formal representation of the requirement.

[Figure: timeline marking the 10sec and 20sec intervals]

Model/Program, e.g.:

    class PumpCtl {
        int x;
        void pumpOn() {
            …
        }
    }

The Role of Verification: “Have we built the product right?”

[Figure: diagram relating Model/Program, Specification, and Verification]

Verification = the bridge between specification and implementation.

Specification, e.g.:

“If pump pressure is turned Low, then High, and then Low again, all within 10 seconds, then the pump should not be High for at least 20 additional seconds.”

[Figure: timeline marking the 10sec and 20sec intervals]

Implementation, e.g.:

    class PumpCtl {
        int x;
        void pumpOn() {
            …
        }
    }

Verification vs. Validation Emphasis

[Figure: diagram relating Model/Program, Specification, and Verification]

Most academic work is verification-centered.

We care about modeling, programming, and validation just as much.

Background: Primary Verification Techniques

“True” Model-Checking: automatic verification

[Figure labels:
· system
· Test suite = many input sequences
· FM promise: no need to execute the system
· (Finite State) Model of system, obtained by manual modeling (e.g., in Promela) or via abstraction tool
· Formal spec, compared (==?) with the model
· Cognitive/NL requirement, turned into the formal spec by formalization and validation]


“True” Model-Checking:

[Figure: the system is translated into a model, manually (e.g., in Promela) or via an abstraction tool, and checked against a formal spec]

Limitations:
(1) limited validation (weak, hard-to-use specification languages)
(2) state explosion

Slide annotations attached to these limitations: “Focus of academic interest” and “A significant limitation in our opinion”.


Theorem Proving:

[Figure labels:
· system
· Test suite = many input sequences
· FM promise: no need to execute the system
· (Infinite State) system, compared (==?) with the formal spec]

Limitations:
(1) limited validation (even weaker…, hard-to-use specification languages)
(2) requires a Ph.D.-level driver


Manual Testing:

[Figure: the cognitive requirement is expressed as NL]

Limitations:
(1) Slow, expensive, hard to repeat, with poor verification coverage
(2) Requires many human testers (slow and expensive)
(3) Validation is missing (effectively, the tester does both V&V)

Execution-Based Model-Checking (EMC) = Run-time Execution Monitoring (REM/RV) + Automatic Test Generation (ATG)

[Figure: system with Monitor (REM) and ATG components]

Limitations:
(1) No absolute coverage (but more can be specified…)

Pro: better validation, with easy-to-use, expressive languages and simulation.
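As a rough illustration of the EMC idea, the following Java sketch pairs a small run-time monitor with a bounded, exhaustive test generator. Everything in it is hypothetical and not part of any tool mentioned in these slides: `Gate` stands in for a system under test, `CarLimitMonitor` plays the REM role, and `countViolations` plays the ATG role by enumerating every truck/car sequence of a given length. The property is the “no more than N newCar events can follow a newTruck event” requirement used later in the deck, with the added assumption that the count starts at zero before the first truck.

```java
import java.util.function.Supplier;

/** Hypothetical system under test: a gate that should admit at most `limit` cars after each truck. */
class Gate {
    private final int limit;
    private int carsSinceTruck = 0;

    Gate(int limit) { this.limit = limit; }

    void newTruck() { carsSinceTruck = 0; }

    /** Returns true if the car is admitted. */
    boolean newCar() {
        if (carsSinceTruck >= limit) return false;
        carsSinceTruck++;
        return true;
    }
}

/** Run-time monitor (REM role): flags more than N admitted cars since the last truck. */
class CarLimitMonitor {
    static final int N = 2;
    private int admitted = 0;
    private boolean success = true;

    void truck() { admitted = 0; }
    void carAdmitted() { if (++admitted > N) success = false; }
    boolean isSuccess() { return success; }
}

class EmcSketch {
    /** ATG role: run every truck/car sequence of the given length and count monitored failures. */
    static int countViolations(Supplier<Gate> factory, int length) {
        int violations = 0;
        for (int bits = 0; bits < (1 << length); bits++) {
            Gate gate = factory.get();
            CarLimitMonitor monitor = new CarLimitMonitor();
            for (int i = 0; i < length; i++) {
                boolean isTruck = ((bits >> i) & 1) == 1;
                if (isTruck) {
                    gate.newTruck();
                    monitor.truck();
                } else if (gate.newCar()) {
                    monitor.carAdmitted();
                }
            }
            if (!monitor.isSuccess()) violations++;
        }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println("gate with limit 2: " + countViolations(() -> new Gate(2), 5));
        System.out.println("gate with limit 3: " + countViolations(() -> new Gate(3), 5));
    }
}
```

A gate configured with limit 2 should produce no violations, while one configured with limit 3 should be flagged on sequences containing three consecutive cars. Coverage here is bounded by the sequence length, which mirrors the “no absolute coverage” limitation listed for EMC.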

The Coverage Cube (more is better)

[Figure: cube with three axes: Program Coverage, Specification Coverage, Verification Coverage]

Specification coverage (validation related): How well are requirements covered?

Program coverage: How well does the model under verification match the actual program?

Verification coverage: Are all possible spec violations detectable?

The Coverage Cube (more is better)

MC/TP: 100% verification coverage (only for those requirements that can be specified…), much weaker in the other dimensions.

EMC: restricted verification coverage, good in the other two dimensions.

[Figure: EMC and MC/TP positioned in the cube with axes Program Coverage, Specification Coverage, Verification Coverage]

The Cost Cube (more is worse)

Cost of writing specifications: how easy is it to write them and to get them right?

Cost of modeling: is a special modeling language required? Is guided abstraction required?

Cost of verification.

[Figure: cube with three axes: Modeling Cost, Specification Cost, Verification Cost]

The Cost Cube (more is worse)

MC:
· requires a special modeling language or abstraction
· uses academic, relatively weak spec languages
· automatic verification

EMC:
· program == model → zero cost of modeling
· UML-based specification with simulation
· automatic test generation and monitoring

TP:
· requires a special modeling language
· uses academic, relatively weak spec languages; supports limited patterns of requirements
· requires a Ph.D.-level verification driver

[Figure: EMC, TP, and MC positioned in the cube with axes Modeling Cost, Specification Cost, Verification Cost]

Example #1 of a Validation Issue: Weak Specification Coverage

Customer cognitive requirement:

“If pump pressure is repeatedly turned Low then High N or more times (N>1) within 10 seconds, then the pump should not be Low for at least 20 additional seconds.”

Example #1 (cont.)

Customer cognitive requirement, expressed as a statechart-assertion for RV and EMC.

[Figure: statechart-assertion with states Init, Low, High, Verify_Low_for_20_seconds, and Error]

Local variables:

    static final int N = 3;
    TRTimeout timer10 = new TRTimeout(10);
    TRTimeout timer20 = new TRTimeout(20);
    int nCnt = 0;

Labels shown in the figure:

    pumpLow[] / timer10.restart();
    pumpHigh[] / nCnt++;
    condition nCnt >= N, with [true] and [false] branches
    pumpLow[]
    timeoutFire[]
    PumpLow[]
    Verify_Low_for_20_seconds, on entry: timer20.restart();
    Error, on entry: System.err.println("Assertion 22 failed!"); bSuccess = false;

Example #1 (cont.)

[Figure: diagram relating Model/Program, Specification, and Verification, shown next to the statechart-assertion of Example #1, here with the final transition labeled pumpHigh[]]

This requirement is outside the scope of MC/TP. They do not support:

• Real-time constraints (10 sec, 20 sec…)
• Counting (N times…)
• In general, they support at most ω-regular properties.
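To make the counting and real-time aspects of Example #1 concrete, here is a minimal Java sketch of a run-time monitor for the natural-language requirement. It is not the statechart-assertion from the slide and does not use the `TRTimeout` class: timestamps (in seconds) are passed in explicitly, and the class name, method signatures, and the choice to measure the 10-second window from the first Low event are all assumptions made for illustration.

```java
/** Hypothetical monitor: N Low-then-High toggles within 10 s forbid Low for the next 20 s. */
class PumpToggleMonitor {
    static final int N = 3;              // toggle threshold, as in the slide's local variables
    static final double WINDOW = 10.0;   // seconds in which the N toggles must occur
    static final double QUIET = 20.0;    // seconds during which Low is forbidden afterwards

    private boolean success = true;
    private boolean lastWasLow = false;
    private double windowStart = Double.NaN;   // time of the Low that opened the current window
    private int toggles = 0;                   // Low-then-High pairs counted in the window
    private double quietUntil = Double.NEGATIVE_INFINITY;

    void pumpLow(double t) {
        if (t < quietUntil) {
            success = false;             // Low observed inside the quiet period
        }
        if (Double.isNaN(windowStart) || t - windowStart > WINDOW) {
            windowStart = t;             // open a fresh 10-second window
            toggles = 0;
        }
        lastWasLow = true;
    }

    void pumpHigh(double t) {
        if (lastWasLow && !Double.isNaN(windowStart) && t - windowStart <= WINDOW) {
            toggles++;
            if (toggles >= N) {
                quietUntil = t + QUIET;  // arm the 20-second quiet period
                windowStart = Double.NaN;
                toggles = 0;
            }
        }
        lastWasLow = false;
    }

    boolean isSuccess() { return success; }

    public static void main(String[] args) {
        PumpToggleMonitor m = new PumpToggleMonitor();
        m.pumpLow(0); m.pumpHigh(1);
        m.pumpLow(2); m.pumpHigh(3);
        m.pumpLow(4); m.pumpHigh(5);     // third toggle within 10 s
        m.pumpLow(10);                   // Low only 5 s later
        System.out.println(m.isSuccess()); // prints false
    }
}
```

The counter and the two time bounds are ordinary program variables here, which is the kind of expressiveness the slide argues is missing from propositional specification languages.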

Example #2 of Poor Specification Coverage

Customer cognitive requirement, NL (time-series):

“Whenever the track count (cnt) Average Arrival Rate (ART) exceeds 80% of MAX_COUNT_PER_MIN, cnt ART must be reduced back to 50% of MAX_COUNT_PER_MIN within 2 minutes, and cnt ART must remain below 60% of MAX_COUNT_PER_MIN for at least 10 minutes.”

[Figure: timeline; labels as extracted: “If ART>80%”, “Then ART>50%” (2min), “And ART>60%” (10min)]

The requirement is specified as a statechart-assertion for RV and EMC.
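The time-series requirement of Example #2 can be sketched as a check over sampled data. The Java code below is only one possible reading, under assumptions that are not in the slides: ART is sampled once per minute and given as a fraction of MAX_COUNT_PER_MIN, “reduced back to 50%” means a sample at or below 0.50 within the next two samples, and the 10-minute hold starts at that recovery sample. The class and method names are hypothetical.

```java
/** Hypothetical offline check of the ART requirement over per-minute samples. */
class ArtRequirementCheck {
    /** ratio[i] is ART divided by MAX_COUNT_PER_MIN at minute i. */
    static boolean holds(double[] ratio) {
        for (int i = 0; i < ratio.length; i++) {
            if (ratio[i] <= 0.80) continue;              // not a trigger

            // Look for recovery to 50% or less within the next 2 minutes.
            int recovered = -1;
            for (int j = i + 1; j <= i + 2 && j < ratio.length; j++) {
                if (ratio[j] <= 0.50) { recovered = j; break; }
            }
            if (recovered < 0) {
                if (i + 2 < ratio.length) return false;  // deadline passed without recovery
                continue;                                // trace ended before the deadline
            }

            // ART must stay below 60% for the following 10 minutes.
            for (int k = recovered + 1; k <= recovered + 10 && k < ratio.length; k++) {
                if (ratio[k] >= 0.60) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holds(new double[] {0.3, 0.9, 0.7, 0.4, 0.5, 0.5})); // prints true
        System.out.println(holds(new double[] {0.3, 0.9, 0.7, 0.7, 0.4}));      // prints false
    }
}
```

Different readings of the requirement (for example, starting the 10-minute hold at the trigger rather than at the recovery) would change the loop bounds, which is exactly the kind of ambiguity that validation scenarios are meant to expose.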

More about Specification Languages: LTL or Büchi Automata vs. Statechart-Assertions

LTL and Büchi automata have lower specification coverage and are more expensive to use. A partial list of reasons:

1. Theoretical: weak descriptive power (ω-regular at best).

2. Hard to use – the National Team can attest w.r.t. LTL.

3. Lack of support for the most basic constraints (real-time).

4. Infinite sequence semantics.

5. They are propositional (e.g., Always P Eventually Q), while real systems are both conditional (propositional) and event-driven (see the UML standard).

Example of Poor Program Coverage

[Figure: diagram relating Model/Program, Specification, and Verification]

Program: InfusionPump.java

Can we verify the property in the context of the REAL code?

Validation using JUnit or MSC

Validation. The StateRover uses JUnit-based simulation for validation.

Requirement: “No more than N (e.g., 2) Q events can follow a P event”

[Figure: statechart-assertion with elements labeled Init, T, A, and Error]

Local variables:

    static final int N = 2;
    int nCnt;

Labels shown in the figure:

    Init, on entry: nCnt = 0;
    P[]
    Q / nCnt++;
    condition nCnt > N, with [true] and [false] branches
    P[] / nCnt = 0;
    Error, on entry: bSuccess = false; System.err.println("Assertion failed");

JUnit-based scenario:

    assertion.P();
    assertion.Q();
    assertion.Q();
    assertion.P();
    assertion.Q();
    assertTrue(assertion.isSuccess());

3 Q’s after the 1st P. Is that OK? Depends on cognitive expectation.
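The validation question can be made concrete with a small Java sketch that replays the scenario P, Q, Q, P, Q under two readings of the requirement. This is plain Java rather than generated StateRover code or a real JUnit test; the `resetOnP` switch is a hypothetical device used only to contrast the reading in which each P restarts the count (as the `P[]/nCnt=0;` transition suggests) with the reading in which Q events are counted from the first P onward.

```java
/** Sketch of the assertion "no more than N Q events can follow a P event". */
class PqAssertion {
    static final int N = 2;

    private final boolean resetOnP;   // hypothetical switch between the two readings
    private boolean armed = false;    // true once the first P has been seen
    private int nCnt = 0;
    private boolean bSuccess = true;

    PqAssertion(boolean resetOnP) { this.resetOnP = resetOnP; }

    void P() {
        armed = true;
        if (resetOnP) nCnt = 0;       // mirrors the transition P[]/nCnt=0;
    }

    void Q() {
        if (!armed) return;           // Q events before any P are ignored
        nCnt++;
        if (nCnt > N) bSuccess = false;
    }

    boolean isSuccess() { return bSuccess; }

    /** Replays the scenario from the slide: P, Q, Q, P, Q. */
    static boolean runScenario(boolean resetOnP) {
        PqAssertion assertion = new PqAssertion(resetOnP);
        assertion.P();
        assertion.Q();
        assertion.Q();
        assertion.P();
        assertion.Q();
        return assertion.isSuccess();
    }

    public static void main(String[] args) {
        System.out.println("count restarts at each P: " + runScenario(true));  // true
        System.out.println("count runs from first P:  " + runScenario(false)); // false
    }
}
```

The same scenario passes under one reading and fails under the other, so whether the assertion is "correct" depends on which behavior the customer actually expects.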

Validation: What Can Go Wrong?

[Figure: the P/Q statechart-assertion (P[], Q/nCnt++;, condition nCnt>N), with the question “Is that OK?”]

1. Assertion is incorrect (usually where blame is assigned).

2. Natural language is ambiguous.

3. NL was written for the main scenario and doesn’t work as well for other scenarios.

4. Validation scenario is not what we think it is…

Thank you


Blunt User Questions

Q1. A property says “light must be on for at least 5 seconds after door opens”. My program already implements that, so why write a spec. property for it?

A.
• Indeed, if everything we implemented were always correct, the world would be a nice place…
• When the implementation changes, who is the “lobbyist” for this requirement? We need a separate representative for each requirement.


Q2. Why not write a specification in Java (or in the language of the model)?

A. We write specs as statechart-assertions. The motivation for not writing them in Java is the same motivation that applies to using a code generator in general.

Example requirement: “No more than N newCar events can follow a newTruck event”

[Figure: statechart-assertion with transitions newTruck[], newCar/nCnt++;, and newTruck[]/nCnt=0;, and a condition nCnt>N with [true] and [false] branches]


Q3. What’s the difference between a model and a program?

A. Abstraction.

Once the model has sufficient detail to be used as source code, it’s a program. That’s how StateRover statechart models/programs are used.


Q4. Who says the spec. is correct?

A. Validation. The StateRover uses JUnit-based simulation for validation.

[Figure: the P/Q statechart-assertion (P[], Q/nCnt++;, condition nCnt>N), with the question “Is that OK?”]

Comments of IV&V Director Dr. Caffal

1. Natural language requirements are typically vague, inconsistent, and incomplete.

2. Natural language requirements frequently have counter-examples to the expressed logic. The counter-examples are not easily observed by reading the requirement.

3. Unlike other disciplines, software developers oftentimes do not employ tools to describe behavior and elicit requirements.

4. Behavior specification comes in three flavors: what we want the system to do, what we do not want the system to do, and what we want the system to do under adverse conditions.

5. It is nearly impossible to detect missing requirements.

6. Natural language requirements typically express constraints and limitations and rarely express behavior. The Team was hard pressed to come up with behavioral requirements.

7. Without specifying behavior, developers implicitly allow programmers to define behaviors. As such, system behaviors emerge without design and structure. Thus, emergent behaviors of systems are frequently an unhappy surprise to developers.

Behavioral specifications about: what we want the system to do

Requirement: Whenever a stop command is received, the vehicle should reach a complete stop within 30 seconds.

[Figure: statechart-assertion with states Init, Stop, and Error]

Local variables:

    TRTimeoutFireSimulatedTime timer = new TRTimeoutFireSimulatedTime(30, this);

Labels shown in the figure:

    stopCommand[]
    timeoutFire[]
    condition Primary.getSpeed() < 1, with [true] and [false] branches
    Error, on entry: bSuccess = false; System.err.println("Assertion for Req.213 failed");
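The stop-command assertion samples the vehicle speed when its 30-second timer fires. A minimal Java sketch of that pattern follows, with simulated time passed in explicitly instead of the `TRTimeoutFireSimulatedTime` class, whose API is not shown in the slides. The `DoubleSupplier` stands in for `Primary.getSpeed()`, and the `tick` method and class name are assumptions made for illustration.

```java
import java.util.function.DoubleSupplier;

/** Hypothetical assertion: after a stop command, speed must be below 1 once 30 s have elapsed. */
class StopAssertion {
    static final double DEADLINE = 30.0;

    private final DoubleSupplier speed;     // stands in for Primary.getSpeed()
    private double deadline = Double.NaN;   // NaN means no stop command is pending
    private boolean bSuccess = true;

    StopAssertion(DoubleSupplier speed) { this.speed = speed; }

    /** Starts the 30-second countdown unless one is already running. */
    void stopCommand(double now) {
        if (Double.isNaN(deadline)) deadline = now + DEADLINE;
    }

    /** Advances simulated time; checks the speed once when the deadline is reached. */
    void tick(double now) {
        if (!Double.isNaN(deadline) && now >= deadline) {
            if (!(speed.getAsDouble() < 1)) bSuccess = false;
            deadline = Double.NaN;
        }
    }

    boolean isSuccess() { return bSuccess; }

    public static void main(String[] args) {
        double[] currentSpeed = {20.0};
        StopAssertion assertion = new StopAssertion(() -> currentSpeed[0]);
        assertion.stopCommand(0.0);
        currentSpeed[0] = 0.5;              // vehicle has slowed down by the deadline
        assertion.tick(30.0);
        System.out.println(assertion.isSuccess()); // prints true
    }
}
```

Because the check happens only at the deadline, a vehicle that stops and then moves again before 30 seconds would be judged by its speed at that instant; whether that matches the intended requirement is a validation question.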

Behavioral specifications about: what we do not want the system to do (“negative behavior”)

Requirement: Pump should never operate until at least two seconds after valve-shut.

[Figure: statechart-assertion with states Init, Count_2_sec, and Error]

Local variables:

    TRTimeoutFireSimulatedTime timer = new TRTimeoutFireSimulatedTime(2, this);

Labels shown in the figure:

    valveShut[]
    Count_2_sec, on entry: timer.restart();
    timeoutFire[]
    pumpStarted[]
    Error, on entry: bSuccess = false; System.err.println("Assertion for Req.213 failed");

This is where the end user says: “I’ve already implemented this behavior the positive way, why do I need a negative behavior assertion?”
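One way to see the value of a separate negative-behavior assertion is to run it against an implementation before and after a change. In the Java sketch below, `PumpController` is a hypothetical implementation that starts the pump a fixed delay after the valve shuts, and `ValvePumpAssertion` is an independent monitor for the requirement; neither class comes from the slides, and timestamps are in seconds.

```java
/** Hypothetical monitor: the pump must not start within 2 s of valve-shut. */
class ValvePumpAssertion {
    static final double GUARD = 2.0;

    private double guardUntil = Double.NEGATIVE_INFINITY;
    private boolean bSuccess = true;

    void valveShut(double now) { guardUntil = now + GUARD; }

    void pumpStarted(double now) {
        if (now < guardUntil) bSuccess = false;   // pump started inside the guard interval
    }

    boolean isSuccess() { return bSuccess; }
}

/** Hypothetical implementation: starts the pump a fixed delay after the valve shuts. */
class PumpController {
    private final double startDelay;

    PumpController(double startDelay) { this.startDelay = startDelay; }

    double pumpStartTime(double valveShutTime) { return valveShutTime + startDelay; }
}

class NegativeAssertionDemo {
    /** Drives one valve-shut/pump-start cycle and reports the assertion's verdict. */
    static boolean check(PumpController controller) {
        ValvePumpAssertion assertion = new ValvePumpAssertion();
        double shut = 10.0;
        assertion.valveShut(shut);
        assertion.pumpStarted(controller.pumpStartTime(shut));
        return assertion.isSuccess();
    }

    public static void main(String[] args) {
        System.out.println(check(new PumpController(2.0)));  // prints true
        System.out.println(check(new PumpController(0.5)));  // prints false
    }
}
```

The first controller implements the behavior "the positive way" and passes; the second represents a later change to the delay, and the assertion is what catches it.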

Behavioral specifications about: what the system will do under adverse conditions (recovery)

[Figure: primary statechart with elements labeled Red, Camera, Count Cars, C_0, Count (BREAK; on entry: nCnt = 1;), On, Off, Shoot, increment, and CriticalRegion; a condition bTest() with [true] and [false] branches; and transitions start[], newCar[], newTruck[], and newCar(Car obj)[isRolls(obj)]/nCnt = 4;]

[Figure: assertion statechart with elements labeled Init, T, and Error (BREAK; on entry: bSuccess = false;), and transitions newCar[]/timeout.restart(); (restart timer when newCar fires), primaryEntered("Off")[], newTruck[], and timeoutFire()[]]

Doing More for Validation

[Figure: the P/Q statechart-assertion (P[], Q/nCnt++;, condition nCnt>N)]

Under development: a tool that points out missing assertion simulation/validation scenarios.

Thank You


backup



Behavioral specifications about: what we do not want the system to do (“negative behavior”)

Requirement: As of cruiseSet, speed should not change by more than 2% unless incline is more than 5% for more than 10 seconds.

[Figure: timeline of speed V with a stable interval and a “Speed instability” interval, each marked 5sec]

Ambiguous:
a. Incline after speed instability
b. Incline during speed instability

[Figure: statechart-assertion with states Init, DoorClosed, checkIncline, and Error]

Local variables:

    TRTimeoutFireSimulatedTime timer_1sec = new TRTimeoutFireSimulatedTime(1, this);
    int speedAtCruiseSetTime = 0;
    int speedNow;
    int inclineDuration = 0;

Labels shown in the figure:

    cruiseSet[] / speedAtCruiseSetTime = primary.getSpeed();
    cruiseOff[]
    timeoutFire[]
    DoorClosed, on entry: timer_1sec.restart();
    checkIncline, on entry: timer_1sec.restart();
    on entry: speedNow = primary.getSpeed(); if (primary.getIncline() > 5) inclineDuration++; else inclineDuration = 0;
    condition abs(speedNow-speedAtCruiseSetTime)<0.02, with [true] and [false] branches
    condition primary.getIncline() > 5, with [true]/inclineDuration++; and [false] branches
    condition inclineDuration == 0, with [true] and [false] branches
    condition inclineDuration > 10, with a [false] branch
    Error, on entry: bSuccess = false; System.err.println("Assertion for Req.213 failed");


Behavioral specifications about: what we do not want the system to do (“negative behavior”)

Negative statement:

As of cruiseSet, speed should not change by more than 2% unless incline is more than 5% for more than 10 seconds.

Positive statement:

As of cruiseSet, speed should be 98% stable unless incline is more than 5% for more than 10 seconds.

The key about negative behavior is not the way it’s phrased. It’s the fact that a system is built to do the positive, so it is assumed the negative is
