1
Error Explanation with Distance Metrics
Authors: Alex Groce, Sagar Chaki, Daniel Kroening, and Ofer Strichman
International Journal on Software Tools for Technology Transfer, 2005. (Extended from two previous works: TACAS 2004 and CAV 2004.)
Presented by: Yean-Ru Chen, 2011/04/14 @ CCU
2
Outline
◦ Introduction
◦ Distance Metrics for Program Executions
◦ Producing an Explanation
◦ ∆-slicing
◦ Conclusions
3
Introduction
This paper describes a (semi-)automated approach for assisting users in understanding and isolating errors in ANSI C programs.
Model checking can provide a counterexample trace when the verified system does not satisfy a specification (property). ◦This work relies on model checking (specifically, bounded model checking), but further provides error-explanation information.
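The bounded-model-checking idea that produces the counterexample can be illustrated with a brute-force sketch. This toy searcher, its `program`, and its assertion are assumptions for illustration only; CBMC instead unrolls loops, converts the program to SSA constraints, and hands the resulting formula to a SAT solver.

```python
# Toy illustration of bounded model checking: exhaustively search
# bounded inputs of a small program for an assertion violation.
# (Illustrative only -- CBMC encodes the unrolled program as a
# SAT formula rather than enumerating inputs.)

def program(x):
    """A tiny 'program': returns the value checked by the assertion."""
    y = x
    if y < 0:
        y = -y        # intended absolute value...
    return y + 1      # ...but a stray +1 breaks 'result == abs(x)'

def bmc_search(bound):
    """Return a counterexample input within the bound, or None."""
    for x in range(-bound, bound + 1):
        if program(x) != abs(x):        # the 'assertion'
            return x                    # counterexample found
    return None

print("counterexample input:", bmc_search(8))   # prints -8
```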
4
Motivated by: ◦Lewis’ counterfactual approach
E.g. (a): “If I had eaten more at breakfast, I would not have been hungry at 11am.”
On Lewis’ account, the truth of this statement consists in the fact that, among possible worlds where I ate more for breakfast, there is at least one world where I am not hungry at 11am that is closer to our world than any world where I ate more for breakfast but am still hungry at 11am.
5
Lewis holds that a cause is something that makes a difference: if there had not been the cause c, there would not have been the effect e. ◦An effect e is causally dependent on a cause c at a world w if and only if, at all worlds most similar to w in which ¬c holds, ¬e holds as well. This is causal dependence; its formal definition is given in Definition 1.
6
Thus, we need to know what happens when we alter w as little as possible, other than removing the possible cause c.
This seems reasonable. Consider the question: “Was Larry slipping on the banana peel causally dependent on Curly dropping it?” We do not take into account worlds where another alteration (such as Moe dropping a banana peel) is introduced.
We want to see what happens to Larry in the world where Curly did not drop the banana peel.
7
8
However, why do we need the successful execution b that is “most similar” to a? ◦The reason comes from another intuition.
◦ Case 1: e is causally dependent on c in execution a.
◦ Case 2: e is NOT causally dependent on c in execution a.
9
◦Another common intuition: successful executions that closely resemble (i.e., are as similar as possible to) a faulty run can shed considerable light on the sources of the error. In 1973, Lewis proposed a theory of causality that provides a justification for this intuition, if we assume that explanation is the analysis of causal relationships.
◦Lewis equates causality to an evaluation based on the distance between possible worlds. This bridges a philosophical link between causality and distance metrics for program executions.
10
Error explanation ◦Approaches that aid users in moving from a trace of a failure to an understanding of the essence of the failure and, perhaps, to a correction for the problem.
Fault localization ◦The more specific task of identifying the faulty core of a system; it is suitable for quantitative evaluation.
11
Work Flow
12
The methodology (error explanation and ∆-slicing) presented in this paper is applied in the MAGIC model checker. ◦MAGIC is a checker for modular verification of software components in C.
Steps:
◦1 and 2: CBMC inputs: program P and its specification (the property, expressed as an assertion). CBMC produces a constraint form of program P (via loop unrolling and static single assignment (SSA)) and a specification constraint. We call these the original system constraints and the property constraint, respectively.
CBMC uses the zChaff SAT solver to find a counterexample. (For verification, the property constraint is negated.)
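The SSA renaming step can be sketched as follows. This is a simplified renamer over straight-line assignments; the tuple encoding and the constraint strings are assumptions for illustration, not CBMC's actual output format (real CBMC also handles branches with guards and unrolls loops up to a bound).

```python
# Simplified sketch of CBMC-style constraint generation: rename each
# assignment target to a fresh SSA version and emit one equality
# constraint per assignment. Inputs start at version #0.
# (Illustrative only -- real CBMC also handles branch guards,
# phi-merges, and bounded loop unrolling.)

def to_ssa(assignments):
    """assignments: list of (lhs, rhs_tokens); tokens are variable
    names, literals, or operators."""
    version = {}                    # highest SSA version per variable
    constraints = []
    for lhs, rhs_tokens in assignments:
        rhs = " ".join(f"{t}#{version.get(t, 0)}" if t.isidentifier() else t
                       for t in rhs_tokens)
        version[lhs] = version.get(lhs, 0) + 1
        constraints.append(f"{lhs}#{version[lhs]} == {rhs}")
    return constraints

demo = [("x", ["x", "+", "1"]),    # x = x + 1
        ("y", ["x", "*", "2"]),    # y = x * 2
        ("x", ["y"])]              # x = y
for c in to_ssa(demo):
    print(c)
# x#1 == x#0 + 1
# y#1 == x#1 * 2
# x#2 == y#1
```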
13
◦3 and 4: “explain” is the tool name.
This tool produces a successful execution that is as similar as possible to the counterexample obtained from the previous step.
The three inputs to explain (with the help of PBS, another SAT solver): (a) the original system constraints, (b) the property constraint, and (c) the distance constraints.
Output of explain (with the help of PBS): a closest successful execution together with the computed distance. (Denoted by the ∆s; this is the distance between this successful execution and the counterexample execution.)
14
◦5 and 6: Use slicing to reduce the number of ∆s.
Finally, the tool provides useful candidate causes of the counterexample to the user. This is the error explanation.
In this final step, users must check which candidate is the real cause. Moreover, if users cannot find the real cause, they can add new assumptions as constraints and return to step 1 to try again.
Thus, this work is not fully automatic.
Question: ◦If we have already found the “closest successful execution,” why do we need to reduce the number of ∆s? Discussed later.
15
Distance Metrics for Program Executions
The distance metric builds on work by D. Sankoff and J. Kruskal in 1983, which focuses on sequence comparison.
Starting from that previous work, the authors propose a function d(a,b) as a distance metric for program executions, and prove that this function satisfies the following properties. ◦a and b are executions of the same program.
16
d(a,b) satisfies the following four properties:
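A toy version of d, counting differing SSA assignments between two executions with the same control flow, makes the four metric properties easy to check. The executions and SSA variable names below are hypothetical examples, not taken from the paper.

```python
# Toy distance metric over SSA executions: an execution is a dict
# mapping SSA variable names to values; d counts differing values.
# (A sketch under the paper's assumption that both executions have
# the same control flow, so their SSA variable sets coincide.)

def d(a, b):
    assert a.keys() == b.keys(), "executions must share SSA variables"
    return sum(1 for v in a if a[v] != b[v])

a = {"x#0": 1, "x#1": 2, "y#1": 4}
b = {"x#0": 1, "x#1": 3, "y#1": 6}
c = {"x#0": 0, "x#1": 3, "y#1": 4}

# The four metric properties, checked on these samples:
assert d(a, b) >= 0                      # non-negativity
assert d(a, a) == 0                      # zero iff identical
assert d(a, b) == d(b, a)                # symmetry
assert d(a, c) <= d(a, b) + d(b, c)      # triangle inequality
print(d(a, b))   # prints 2: x#1 and y#1 differ
```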
17
Presenting program executions
18
Input program P (shown in Fig. 3) and its specification (shown as an assertion in P) are given to CBMC (with the help of zChaff).
19
CBMC inputs: the program and its specification (a C program).
A counterexample is the output of CBMC with the help of zChaff. ◦This counterexample execution is shown in the form produced by loop unrolling (not needed in this case) and SSA.
20
CBMC also produces constraints for minmax.c: the original system constraints + the property constraint.
Original system constraints
Specification (property) constraint
21
Counterexample Execution for minmax.c
22
Counterexample values of each variable (SSA form) for minmax.c. All variable assignments are shown. (Typo on the slide: the value should be most#0 = 1.)
23
The distance metric d
For the explain tool to generate the closest successful execution, we need to know what “closest” means. So we first define the distance metric d.
24
Note ◦The two executions a and b must have the same control flow.
◦There is a matching assignment in b for each assignment in a (guaranteed by the SSA form).
◦Because they use the same variables, the distance computation does not incur serious overhead.
25
Producing an Explanation
26
Producing an explanation
First, we need the “most similar” successful execution. ◦How?
Add a distance constraint for each variable (in SSA form)! The distance constraints are generated by the explain tool.
We need to compute the values of the ∆ functions so that we can add the distance constraints. The ∆ functions are used to compute the distance.
27
Distance constraints for each variable
The distance constraints are generated from the counterexample execution.
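Generating one ∆ indicator per SSA variable from the fixed counterexample can be sketched like this. The constraint-string format is an assumption for readability; the explain tool emits them in its solver's input format.

```python
# Sketch of distance-constraint generation: for a fixed counterexample
# a, each SSA variable v gets an indicator delta_v such that
# delta_v == 1 exactly when v differs from its counterexample value.
# (The emitted string format is an illustrative assumption.)

def delta_constraints(counterexample):
    return [f"delta_{v} == (({v} != {val}) ? 1 : 0)"
            for v, val in sorted(counterexample.items())]

cex = {"x#1": 2, "y#1": 4}
for c in delta_constraints(cex):
    print(c)
# delta_x#1 == ((x#1 != 2) ? 1 : 0)
# delta_y#1 == ((y#1 != 4) ? 1 : 0)
```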
28
Note: ◦The distance constraints do not affect satisfiability.
◦The values of the ∆ functions are used to encode the optimization problem.
29
For a fixed execution a, d(a,b) = n can be directly encoded as a constraint by requiring that exactly n of the ∆s (∆ function values) be set to 1. ◦A pseudo-Boolean function models this: f : Bⁿ → R, where B = {0,1}, R is the set of real numbers, and n is a non-negative integer.
30
So the distance metric can be modeled in this general pseudo-Boolean form:
c₁·b₁ + c₂·b₂ + … + cₙ·bₙ ⊲⊳ k
Here each cᵢ = 1, k is a rational constant, each bᵢ is one of the ∆s (recall Definition 2, next page), and ⊲⊳ is one of {<, ≦, ≧, >, =}. Thus, the distance computation problem is transformed into an optimization problem. ◦The PBS solver can solve this optimization problem!
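What PBS computes can be imitated by brute force on a tiny example: among all executions satisfying the system and property constraints, pick one minimizing the sum of the ∆s. The toy program, its constraints, and the small value domain below are assumptions for illustration, not the paper's minmax.c.

```python
from itertools import product

# Brute-force stand-in for the pseudo-Boolean optimization PBS solves.
# (Toy SSA program and tiny domain are illustrative assumptions.)

def system(e):                    # toy SSA constraint: y#1 == x#0 + 1
    return e["y#1"] == e["x#0"] + 1

def property_ok(e):               # the (un-negated) specification
    return e["y#1"] <= 2

cex = {"x#0": 3, "y#1": 4}        # satisfies system, violates property

def closest_successful(domain):
    """Among executions over 'domain' satisfying system AND property,
    return one minimizing the number of deltas w.r.t. cex."""
    best, best_dist = None, None
    for x0, y1 in product(domain, repeat=2):
        e = {"x#0": x0, "y#1": y1}
        if system(e) and property_ok(e):
            dist = sum(1 for v in e if e[v] != cex[v])   # sum of deltas
            if best_dist is None or dist < best_dist:
                best, best_dist = e, dist
    return best, best_dist

best, dist = closest_successful(range(5))
print(best, dist)   # a closest successful execution at distance 2
```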
31
Recall Def. 2
32
So, the input problem for PBS is: ◦(original system constraints) ∧ (distance constraints) ∧ (property constraint) = True
33
Output of PBS
34
Closest Successful Execution
35
Change set (comparing the counterexample and the successful execution)
Thus, we can slice more!
Not executed in either run!!
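The change set is just the set of SSA variables whose values differ between the two runs. A toy computation (the values are hypothetical; the real ∆-slicer additionally discards ∆s on code that neither execution reaches):

```python
# Toy change-set computation between a counterexample and the closest
# successful execution: the SSA variables whose values differ are the
# candidate causes reported to the user.
# (Hypothetical values; the real Delta-slicer also removes deltas for
# code not executed in either run.)

def change_set(cex, succ):
    return {v for v in cex if cex[v] != succ[v]}

cex  = {"x#0": 3, "y#1": 4, "z#1": 7}
succ = {"x#0": 0, "y#1": 1, "z#1": 7}
print(sorted(change_set(cex, succ)))   # ['x#0', 'y#1']
```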
36
∆-slicing method I: two-phase
37
New problem for PBS: (new system constraints) ∧ (distance constraints) ∧ (original property) = True
After applying method I, a reduced system model is generated.
But the real candidate causes of the error are still completely generated.
38
∆-slicing method II: one-step
Use primed variables for the successful execution. Add the distance constraints:
New problem for PBS: (original system constraints) ∧ (new distance constraints) ∧ (original property of execution a) ∧ (property of the successful execution) = True
39
Comparing method I and method II
The one-step method provides less useful results. The two-phase method executes faster.
Relevance is not a deterministic artifact of a program and a statement! It is a function of an explanation.
40
Conclusions
They use one real case to show how to obtain an error explanation, and also apply the two slicing methods.
They also use a scoring function proposed by other work to evaluate the proposed method (and to evaluate the two slicing methods).
41
Thank you very much!