1
Error Explanation with Distance Metrics
Authors: Alex Groce, Sagar Chaki, Daniel Kroening, and Ofer Strichman
International Journal on Software Tools for Technology Transfer, 2005. (Extended from two previous works: TACAS 2004 and CAV 2004.)
Presented by: Yean-Ru Chen, 2011/04/14 @ CCU
2
Outline
◦ Introduction
◦ Distance Metrics for Program Executions
◦ Producing an Explanation
◦ ∆-slicing
◦ Conclusions
3
Introduction
This paper describes a (semi-)automated approach for assisting users in understanding and isolating errors in ANSI C programs.
Model checking can provide a counterexample trace when the verified system does not satisfy a specification (property). ◦This work relies on model checking (specifically, bounded model checking), but further provides error-explanation information.
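The bounded-model-checking idea that produces the counterexample can be illustrated with a brute-force sketch. This toy searcher, its `program`, and its assertion are assumptions for illustration only; CBMC instead unrolls loops, converts the program to SSA constraints, and hands the resulting formula to a SAT solver.

```python
# Toy illustration of bounded model checking: exhaustively search
# bounded inputs of a small program for an assertion violation.
# (Illustrative only -- CBMC encodes the unrolled program as a
# SAT formula rather than enumerating inputs.)

def program(x):
    """A tiny 'program': returns the value checked by the assertion."""
    y = x
    if y < 0:
        y = -y        # intended absolute value...
    return y + 1      # ...but a stray +1 breaks 'result == abs(x)'

def bmc_search(bound):
    """Return a counterexample input within the bound, or None."""
    for x in range(-bound, bound + 1):
        if program(x) != abs(x):        # the 'assertion'
            return x                    # counterexample found
    return None

print("counterexample input:", bmc_search(8))   # prints -8
```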
4
Motivated by: ◦Lewis’ counterfactual approach
E.g. (a): “If I had eaten more at breakfast, I would not have been hungry at 11am.”
On Lewis’ account, the truth of this statement consists in the fact that, among possible worlds where I ate more for breakfast, there is at least one world where I am not hungry at 11am that is closer to our world than any world where I ate more for breakfast but am still hungry at 11am.
5
Lewis holds that a cause is something that makes a difference: if there had not been the cause c, there would not have been the effect e. ◦An effect e is causally dependent on a cause c at a world w if and only if, at all worlds most similar to w in which ¬c holds, ¬e holds as well. This is causal dependence; its formal definition is given in Definition 1.
6
Thus, we need to know what happens when we alter w as little as possible, other than removing the possible cause c.
This seems reasonable. Consider the question: “Was Larry slipping on the banana peel causally dependent on Curly dropping it?” We do not take into account worlds where another alteration (such as Moe dropping a banana peel) is introduced.
We want to see what happens to Larry in the world where Curly did not drop the banana peel.
7
8
However, why do we need the successful execution b that is “most similar” to a? ◦The reason comes from another intuition.
◦ Case 1: e is causally dependent on c in execution a.
◦ Case 2: e is NOT causally dependent on c in execution a.
9
◦Another common intuition: successful executions that closely resemble (i.e., are as similar as possible to) a faulty run can shed considerable light on the sources of the error. In 1973, Lewis proposed a theory of causality that provides a justification for this intuition, if we assume that explanation is the analysis of causal relationships.
◦Lewis equates causality to an evaluation based on the distance between possible worlds. This bridges a philosophical link between causality and distance metrics for program executions.
10
Error explanation ◦Approaches that aid users in moving from a trace of a failure to an understanding of the essence of the failure and, perhaps, to a correction for the problem.
Fault localization ◦The more specific task of identifying the faulty core of a system; it is suitable for quantitative evaluation.
11
Work Flow
12
The methodology (error explanation and ∆-slicing) presented in this paper is applied in the MAGIC model checker. ◦MAGIC is a checker for modular verification of software components in C.
Steps:
◦1 and 2: CBMC inputs: program P and its specification (the property, expressed as an assertion). CBMC produces a constraint form of program P (via loop unrolling and static single assignment (SSA)) and a specification constraint. We call these the original system constraints and the property constraint, respectively.
CBMC uses the zChaff SAT solver to find a counterexample. (For verification, the property constraint is negated.)
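The SSA renaming step can be sketched as follows. This is a simplified renamer over straight-line assignments; the tuple encoding and the constraint strings are assumptions for illustration, not CBMC's actual output format (real CBMC also handles branches with guards and unrolls loops up to a bound).

```python
# Simplified sketch of CBMC-style constraint generation: rename each
# assignment target to a fresh SSA version and emit one equality
# constraint per assignment. Inputs start at version #0.
# (Illustrative only -- real CBMC also handles branch guards,
# phi-merges, and bounded loop unrolling.)

def to_ssa(assignments):
    """assignments: list of (lhs, rhs_tokens); tokens are variable
    names, literals, or operators."""
    version = {}                    # highest SSA version per variable
    constraints = []
    for lhs, rhs_tokens in assignments:
        rhs = " ".join(f"{t}#{version.get(t, 0)}" if t.isidentifier() else t
                       for t in rhs_tokens)
        version[lhs] = version.get(lhs, 0) + 1
        constraints.append(f"{lhs}#{version[lhs]} == {rhs}")
    return constraints

demo = [("x", ["x", "+", "1"]),    # x = x + 1
        ("y", ["x", "*", "2"]),    # y = x * 2
        ("x", ["y"])]              # x = y
for c in to_ssa(demo):
    print(c)
# x#1 == x#0 + 1
# y#1 == x#1 * 2
# x#2 == y#1
```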
13
◦3 and 4: “explain” is the tool name.
This tool produces a successful execution that is as similar as possible to the counterexample obtained from the previous step.
The three inputs to explain (with the help of PBS, another SAT solver): (a) the original system constraints, (b) the property constraint, and (c) the distance constraints.
Output of explain (with the help of PBS): a closest successful execution together with the computed distance. (Denoted by the ∆s; this is the distance between this successful execution and the counterexample execution.)
14
◦5 and 6: Use slicing to reduce the number of ∆s.
Finally, the tool provides useful candidate causes of the counterexample to the user. This is the error explanation.
In this final step, users must check which candidate is the real cause. Moreover, if users cannot find the real cause, they can add new assumptions as constraints and return to step 1 to try again.
Thus, this work is not fully automatic.
Question: ◦If we have already found the “closest successful execution,” why do we need to reduce the number of ∆s? Discussed later.
15
Distance Metrics for Program Executions
The distance metric builds on work by D. Sankoff and J. Kruskal in 1983, which focuses on sequence comparison.
Starting from that previous work, the authors propose a function d(a,b) as a distance metric for program executions, and prove that this function satisfies the following properties. ◦a and b are executions of the same program.
16
d(a,b) satisfies the following four properties:
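A toy version of d, counting differing SSA assignments between two executions with the same control flow, makes the four metric properties easy to check. The executions and SSA variable names below are hypothetical examples, not taken from the paper.

```python
# Toy distance metric over SSA executions: an execution is a dict
# mapping SSA variable names to values; d counts differing values.
# (A sketch under the paper's assumption that both executions have
# the same control flow, so their SSA variable sets coincide.)

def d(a, b):
    assert a.keys() == b.keys(), "executions must share SSA variables"
    return sum(1 for v in a if a[v] != b[v])

a = {"x#0": 1, "x#1": 2, "y#1": 4}
b = {"x#0": 1, "x#1": 3, "y#1": 6}
c = {"x#0": 0, "x#1": 3, "y#1": 4}

# The four metric properties, checked on these samples:
assert d(a, b) >= 0                      # non-negativity
assert d(a, a) == 0                      # zero iff identical
assert d(a, b) == d(b, a)                # symmetry
assert d(a, c) <= d(a, b) + d(b, c)      # triangle inequality
print(d(a, b))   # prints 2: x#1 and y#1 differ
```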
17
Presenting program executions
18
Input program P (shown in Fig. 3) and its specification (shown as an assertion in P) are given to CBMC (with the help of zChaff).
19
CBMC inputs: the program and its specification (a C program).
A counterexample is the output of CBMC with the help of zChaff. ◦This counterexample execution is shown in the form produced by loop unrolling (not needed in this case) and SSA.
20
CBMC also produces constraints for minmax.c: the original system constraints + the property constraint.
Original system constraints
Specification (property) constraint
21
Counterexample Execution for minmax.c
22
Counterexample values of each variable (SSA form) for minmax.c. All variable assignments are shown. (Typo on the slide: the value should be most#0 = 1.)
23
The distance metric d
For the explain tool to generate the closest successful execution, we need to know what “closest” means. So we first define the distance metric d.
24
Note ◦The two executions a and b must have the same control flow.
◦There is a matching assignment in b for each assignment in a (guaranteed by the SSA form).
◦Because they use the same variables, the distance computation does not incur serious overhead.
25
Producing an Explanation
26
Producing an explanation
First, we need the “most similar” successful execution. ◦How?
Add a distance constraint for each variable (in SSA form)! The distance constraints are generated by the explain tool.
We need to compute the values of the ∆ functions so that we can add the distance constraints. The ∆ functions are used to compute the distance.
27
Distance constraints for each variable
The distance constraints are generated from the counterexample execution.
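Generating one ∆ indicator per SSA variable from the fixed counterexample can be sketched like this. The constraint-string format is an assumption for readability; the explain tool emits them in its solver's input format.

```python
# Sketch of distance-constraint generation: for a fixed counterexample
# a, each SSA variable v gets an indicator delta_v such that
# delta_v == 1 exactly when v differs from its counterexample value.
# (The emitted string format is an illustrative assumption.)

def delta_constraints(counterexample):
    return [f"delta_{v} == (({v} != {val}) ? 1 : 0)"
            for v, val in sorted(counterexample.items())]

cex = {"x#1": 2, "y#1": 4}
for c in delta_constraints(cex):
    print(c)
# delta_x#1 == ((x#1 != 2) ? 1 : 0)
# delta_y#1 == ((y#1 != 4) ? 1 : 0)
```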
28
Note: ◦The distance constraints do not affect satisfiability.
◦The values of the ∆ functions are used to encode the optimization problem.
29
For a fixed execution a, d(a,b) = n can be directly encoded as a constraint by requiring that exactly n of the ∆s (∆ function values) be set to 1. ◦A pseudo-Boolean function models this: f : Bⁿ → R, where B = {0,1}, R is the set of real numbers, and n is a non-negative integer.
30
So the distance metric can be modeled in this general pseudo-Boolean form:
c₁·b₁ + c₂·b₂ + … + cₙ·bₙ ⊲⊳ k
Here each cᵢ = 1, k is a rational constant, each bᵢ is one of the ∆s (recall Definition 2, next page), and ⊲⊳ is one of {<, ≦, ≧, >, =}. Thus, the distance computation problem is transformed into an optimization problem. ◦The PBS solver can solve this optimization problem!
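What PBS computes can be imitated by brute force on a tiny example: among all executions satisfying the system and property constraints, pick one minimizing the sum of the ∆s. The toy program, its constraints, and the small value domain below are assumptions for illustration, not the paper's minmax.c.

```python
from itertools import product

# Brute-force stand-in for the pseudo-Boolean optimization PBS solves.
# (Toy SSA program and tiny domain are illustrative assumptions.)

def system(e):                    # toy SSA constraint: y#1 == x#0 + 1
    return e["y#1"] == e["x#0"] + 1

def property_ok(e):               # the (un-negated) specification
    return e["y#1"] <= 2

cex = {"x#0": 3, "y#1": 4}        # satisfies system, violates property

def closest_successful(domain):
    """Among executions over 'domain' satisfying system AND property,
    return one minimizing the number of deltas w.r.t. cex."""
    best, best_dist = None, None
    for x0, y1 in product(domain, repeat=2):
        e = {"x#0": x0, "y#1": y1}
        if system(e) and property_ok(e):
            dist = sum(1 for v in e if e[v] != cex[v])   # sum of deltas
            if best_dist is None or dist < best_dist:
                best, best_dist = e, dist
    return best, best_dist

best, dist = closest_successful(range(5))
print(best, dist)   # a closest successful execution at distance 2
```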
31
Recall Def. 2
32
So, the input problem for PBS is: ◦(original system constraints) ∧ (distance constraints) ∧ (property constraint) = True
33
Output of PBS
34
Closest Successful Execution
35
Change set (comparing the counterexample and the successful execution)
Thus, we can slice more!
Not executed in either run!!
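The change set is just the set of SSA variables whose values differ between the two runs. A toy computation (the values are hypothetical; the real ∆-slicer additionally discards ∆s on code that neither execution reaches):

```python
# Toy change-set computation between a counterexample and the closest
# successful execution: the SSA variables whose values differ are the
# candidate causes reported to the user.
# (Hypothetical values; the real Delta-slicer also removes deltas for
# code not executed in either run.)

def change_set(cex, succ):
    return {v for v in cex if cex[v] != succ[v]}

cex  = {"x#0": 3, "y#1": 4, "z#1": 7}
succ = {"x#0": 0, "y#1": 1, "z#1": 7}
print(sorted(change_set(cex, succ)))   # ['x#0', 'y#1']
```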
36
∆-slicing method I: two-phase
37
New problem for PBS: (new system constraints) ∧ (distance constraints) ∧ (original property) = True
After applying method I, a reduced system model is generated.
But the real candidate causes of the error are still completely generated.
38
∆-slicing method II: one-step
Use primed variables for the successful execution. Add the distance constraints:
New problem for PBS: (original system constraints) ∧ (new distance constraints) ∧ (original property of execution a) ∧ (property of the successful execution) = True
39
Comparing method I and method II
The one-step method provides less useful results. The two-phase method executes faster.
Relevance is not a deterministic artifact of a program and a statement! It is a function of an explanation.
40
Conclusions
They use one real case to show how to obtain an error explanation, and also apply the two slicing methods.
They also use a scoring function proposed by other work to evaluate the proposed method (and to evaluate the two slicing methods).
41
Thank you very much!