DART and CUTE: Concolic Testing › ~aldrich › courses › 17-355-17... · 3 Software Reliability...
Transcript of DART and CUTE: Concolic Testing › ~aldrich › courses › 17-355-17... · 3 Software Reliability...
3/28/2017 1
DART and CUTE: Concolic Testing
Koushik SenUniversity of California, Berkeley
Joint work with Gul Agha, Patrice Godefroid, Nils Klarlund,
Rupak Majumdar, Darko Marinov
Used and adapted by Jonathan Aldrich, with permission,for 17-355/17-655 Program Analysis
2
Today, QA is mostly testing“50% of my company employees are testers, and the rest spends 50% of their time testing!”
Bill Gates 1995
3
Software Reliability Importance Software is pervading every aspects of life
Software everywhere Software failures were estimated to cost the US economy
about $60 billion annually [NIST 2002] Improvements in software testing infrastructure may save one-
third of this cost Testing accounts for an estimated 50%-80% of the cost of
software development [Beizer 1990]
4
Big PictureSafe Programming
Languages and Type systems
Static Program Analysis
Dynamic Program Analysis
Model Based Software Development and Analysis
Model Checking
and Theorem Proving
Runtime Monitoring
Testing
5
Big PictureSafe Programming
Languages and Type systems
Model Checking
and Theorem Proving
Runtime Monitoring
Model Based Software Development and Analysis
Static Program Analysis
Dynamic Program Analysis
Testing
6
A Familiar Program: QuickSortvoid quicksort (int[] a, int lo, int hi) {
int i=lo, j=hi, h; int x=a[(lo+hi)/2];
// partition do {
while (a[i]<x) i++; while (a[j]>x) j--; if (i<=j) {
h=a[i]; a[i]=a[j]; a[j]=h; i++; j--;
} } while (i<=j);
// recursion if (lo<j) quicksort(a, lo, j); if (i<hi) quicksort(a, i, hi);
}
7
A Familiar Program: QuickSortvoid quicksort (int[] a, int lo, int hi) {
int i=lo, j=hi, h; int x=a[(lo+hi)/2];
// partition do {
while (a[i]<x) i++; while (a[j]>x) j--; if (i<=j) {
h=a[i]; a[i]=a[j]; a[j]=h; i++; j--;
} } while (i<=j);
// recursion if (lo<j) quicksort(a, lo, j); if (i<hi) quicksort(a, i, hi);
}
Test QuickSort Create an array Initialize the elements of
the array Execute the program on
this array How much confidence
do I have in this testing method?
Is my test suite *Complete*?
Can someone generate a small and *Complete* test suite for me?
8
Automated Test Generation Studied since 70’s King 76, Myers 79
30 years have passed, and yet no effective solution
What Happened???
9
Automated Test Generation Studied since 70’s King 76, Myers 79
30 years have passed, and yet no effective solution
What Happened??? Program-analysis techniques were expensive Automated theorem proving and constraint solving
techniques were not efficient
10
Automated Test Generation Studied since 70’s King 76, Myers 79
30 years have passed, and yet no effective solution
What Happened??? Program-analysis techniques were expensive Automated theorem proving and constraint solving
techniques were not efficient In the recent years we have seen remarkable
progress in static program-analysis and constraint solving SLAM, BLAST, ESP, Bandera, Saturn, MAGIC
11
Automated Test Generation Studied since 70’s King 76, Myers 79
30 years have passed, and yet no effective solution
What Happened??? Program-analysis techniques were expensive Automated theorem proving and constraint solving
techniques were not efficient In the recent years we have seen remarkable
progress in static program-analysis and constraint solving SLAM, BLAST, ESP, Bandera, Saturn, MAGIC
Question: Can we use similar techniques in Automated Testing?
12
Systematic Automated Testing
Testing
Static Analysis
Automated Theorem Proving
Model-Checking and
Formal Methods
Constraint Solving
13
CUTE and DART Combine random testing (concrete execution) and
symbolic testing (symbolic execution) [PLDI’05, FSE’05, FASE’06, CAV’06,ISSTA’07,
ICSE’07]
Concrete + Symbolic = Concolic
14
Goal Automated Unit Testing of real-world C and
Java Programs Generate test inputs Execute unit under test on generated test inputs
so that all reachable statements are executed Any assertion violation gets caught
Sub-problem: Input Generation Given a statement s in program P, compute input
i, such that P(i) executes s This formulation is due to Wolfram Schulte
15
Execution Paths of a Program Can be seen as a binary
tree with possibly infinite depth Computation tree
Each node represents the execution of a “if then else”statement
Each edge represents the execution of a sequence of non-conditional statements
Each path in the tree represents an equivalence class of inputs
0 1
0 0
0
0
1
1
1
1
1
1
16
Example of Computation Tree
void test_me(int x, int y) {if(2*x==y){
if(x != y+10){printf(“I am fine here”);
} else {printf(“I should not reach here”);ERROR;
}}
}
2*x==y
x!=y+10
N Y
N Y
ERROR
17
Concolic Testing: Finding Security and Safety Bugs
Divide by 0 Error
x = 3 / i;
Buffer Overflow
a[i] = 4;
18
Concolic Testing: Finding Security and Safety Bugs
Divide by 0 Error
if (i !=0)x = 3 / i;
elseERROR;
Buffer Overflow
if (0<=i && i < a.length)a[i] = 4;
elseERROR;
Key: Add Checks Automatically and
Perform Concolic Testing
19
Existing Approach I Random testing
generate random inputs execute the program on
generated inputs Probability of reaching
an error can be astronomically less
test_me(int x){if(x==94389){
ERROR;}
}
Probability of hitting ERROR = 1/232
20
Existing Approach II Symbolic Execution
use symbolic values for input variables
execute the program symbolically on symbolic input values
collect symbolic path constraints
use theorem prover to check if a branch can be taken
Does not scale for large programs
Cannot solve all constraints
test_me(int x, int y){if((x%10)*4==y){
ERROR;} else {
// OK}
}
Symbolic execution cannot solve for x, yMissed error(or false positive)
21
Existing Approach II Symbolic Execution
use symbolic values for input variables
execute the program symbolically on symbolic input values
collect symbolic path constraints
use theorem prover to check if a branch can be taken
Does not scale for large programs
Cannot solve all constraints
test_me(int x, int y){if(bbox(x)==y){
ERROR;} else {
// OK}
}
Symbolic execution cannot look inside the bbox functionMissed error(or false positive)
22
Concolic Execution Concolic Execution
Randomly choose x and y the first time
Use concrete value of x and result of running bbox(x) to solve for y
Finds the error!
test_me(int x, int y){if(bbox(x)==y){
ERROR;} else {
// OK}
}
23
Concolic Testing Approach
Random Test Driver: random value for x
and y Probability of reaching
ERROR is extremely low
int double (int v) {
return 2*v; }
void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
24
Concolic Testing Approachint double (int v) {
return 2*v; }
void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 22, y = 7 x = x0, y = y0
25
Concolic Testing Approachint double (int v) {
return 2*v; }
void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 22, y = 7, z = 14
x = x0, y = y0, z = 2*y0
26
Concolic Testing Approachint double (int v) {
return 2*v; }
void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 22, y = 7, z = 14
x = x0, y = y0, z = 2*y0
2*y0 != x0
27
Concolic Testing Approachint double (int v) {
return 2*v; }
void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
2*y0 != x0
Solve: 2*y0 == x0
Solution: x0 = 2, y0 = 1
x = 22, y = 7, z = 14
x = x0, y = y0, z = 2*y0
28
Concolic Testing Approachint double (int v) {
return 2*v; }
void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 2, y = 1 x = x0, y = y0
29
Concolic Testing Approachint double (int v) {
return 2*v; }
void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 2, y = 1, z = 2
x = x0, y = y0, z = 2*y0
30
Concolic Testing Approachint double (int v) {
return 2*v; }
void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 2, y = 1, z = 2
x = x0, y = y0, z = 2*y0
2*y0 == x0
31
Concolic Testing Approachint double (int v) {
return 2*v; }
void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 2, y = 1, z = 2
x = x0, y = y0, z = 2*y0
2*y0 == x0
x0 � y0+10
32
Concolic Testing Approachint double (int v) {
return 2*v; }
void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 2, y = 1, z = 2
x = x0, y = y0, z = 2*y0
Solve: (2*y0 == x0) � (x0 > y0 + 10)
Solution: x0 = 30, y0 = 15
2*y0 == x0
x0 � y0+10
33
Concolic Testing Approachint double (int v) {
return 2*v; }
void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 30, y = 15 x = x0, y = y0
34
Concolic Testing Approachint double (int v) {
return 2*v; }
void testme (int x, int y) {
z = double (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 30, y = 15 x = x0, y = y0
2*y0 == x0
x0 > y0+10
Program Error
35
Explicit Path (not State) Model Checking Traverse all execution
paths one by one to detect errors assertion violations program crash uncaught exceptions
combine with valgrindto discover memory errors
F T
F F
F
F
T
T
T
T
T
T
36
Explicit Path (not State) Model Checking Traverse all execution
paths one by one to detect errors assertion violations program crash uncaught exceptions
combine with valgrindto discover memory errors
F T
F F
F
F
T
T
T
T
T
T
37
Explicit Path (not State) Model Checking Traverse all execution
paths one by one to detect errors assertion violations program crash uncaught exceptions
combine with valgrindto discover memory errors
F T
F F
F
F
T
T
T
T
T
T
38
Explicit Path (not State) Model Checking Traverse all execution
paths one by one to detect errors assertion violations program crash uncaught exceptions
combine with valgrindto discover memory errors
F T
F F
F
F
T
T
T
T
T
T
39
Explicit Path (not State) Model Checking Traverse all execution
paths one by one to detect errors assertion violations program crash uncaught exceptions
combine with valgrindto discover memory errors
F T
F F
F
F
T
T
T
T
T
T
40
Explicit Path (not State) Model Checking Traverse all execution
paths one by one to detect errors assertion violations program crash uncaught exceptions
combine with valgrindto discover memory errors
F T
F F
F
F
T
T
T
T
T
T
41
Novelty : Simultaneous Concrete and Symbolic Execution
int foo (int v) {
return (v*v) % 50; }
void testme (int x, int y) {
z = foo (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 22, y = 7 x = x0, y = y0
42
int foo (int v) {
return (v*v) % 50; }
void testme (int x, int y) {
z = foo (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 22, y = 7, z = 49
x = x0, y = y0, z = (y0 *y0)%50
(y0*y0)%50 !=x0
Solve: (y0*y0 )%50 == x0
Don’t know how to solve!
Stuck?
Novelty : Simultaneous Concrete and Symbolic Execution
43
void testme (int x, int y) {
z = foo (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 22, y = 7, z = 49
x = x0, y = y0, z = foo (y0)
foo (y0) !=x0
Novelty : Simultaneous Concrete and Symbolic Execution
Solve: foo (y0) == x0
Don’t know how to solve!
Stuck?
44
int foo (int v) {
return (v*v) % 50; }
void testme (int x, int y) {
z = foo (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 22, y = 7, z = 49
x = x0, y = y0, z = (y0 *y0)%50
(y0*y0)%50 !=x0
Solve: (y0*y0 )%50 == x0
Don’t know how to solve!
Not Stuck!
Use concrete state
Replace y0 by 7 (sound)
Novelty : Simultaneous Concrete and Symbolic Execution
45
int foo (int v) {
return (v*v) % 50; }
void testme (int x, int y) {
z = foo (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 22, y = 7, z = 48
x = x0, y = y0, z = 49
49 !=x0
Solve: 49 == x0
Solution : x0 = 49, y0 = 7
Novelty : Simultaneous Concrete and Symbolic Execution
46
int foo (int v) {
return (v*v) % 50; }
void testme (int x, int y) {
z = foo (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 49, y = 7 x = x0, y = y0
Novelty : Simultaneous Concrete and Symbolic Execution
47
int foo (int v) {
return (v*v) % 50; }
void testme (int x, int y) {
z = foo (y);
if (z == x) {
if (x > y+10) {
ERROR;}
}
}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
path condition
x = 49, y = 7, z = 49
x = x0, y = y0 ,z = 49
2*y0 == x0
x0 > y0+10
Program Error
Novelty : Simultaneous Concrete and Symbolic Execution
48
Concolic Testing: A Middle Approach
+ Complex programs+ Efficient- Less coverage+ No false positive
- Simple programs- Not efficient+ High coverage- False positive
Random Testing
Symbolic Testing
Concolic Testing
+ Complex programs+/- Somewhat efficient+ High coverage+ No false positive
49
Implementations DART and CUTE for C programs jCUTE for Java programs Goto http://srl.cs.berkeley.edu/~ksen/ for CUTE
and jCUTE binaries MSR has four implementations SAGE, PEX, YOGI, Vigilante
Similar tool: EXE at Stanford Easiest way to use and to develop on top of
CUTE Implement concolic testing yourself
50
Testing Data Structures(joint work with Darko Marinov and
Gul Agha
51
Exampletypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
• Random Test Driver:
• random memory graphreachable from p
• random value for x
• Probability of reaching abort( ) isextremely low
52
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0
p, x=236
NULL
53
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0
p, x=236
NULL
54
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
x0>0
p=p0, x=x0
p, x=236
NULL
55
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
x0>0
p=p0, x=x0
p, x=236
NULL
!(p0!=NULL)
56
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
x0>0
p=p0, x=x0
p, x=236
NULL
p0=NULL
solve: x0>0 and p0NULL
57
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
x0>0
p=p0, x=x0
p, x=236
NULL
p0=NULL
solve: x0>0 and p0NULL
x0=236, p0
634
NULL
58
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=236
NULL
634
59
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=236
NULL
634
x0>0
60
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=236
NULL
634
x0>0p0NULL
61
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=236
NULL
634
x0>0p0NULL
2x0+1v0
62
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=236
NULL
634
x0>0p0NULL
2x0+1v0
63
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=236
NULL
634
x0>0p0NULL
2x0+1v0
solve: x0>0 and p0NULL and 2x0+1=v0
64
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=236
NULL
634
x0>0p0NULL
2x0+1v0
solve: x0>0 and p0NULL and 2x0+1=v0
x0=1, p0
3
NULL
65
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=1
NULL
3
66
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=1
NULL
3
x0>0
67
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=1
NULL
3
x0>0p0NULL
68
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=1
NULL
3
x0>0p0NULL
2x0+1=v0
69
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=1
NULL
3
x0>0p0NULL
2x0+1=v0
n0p0
70
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=1
NULL
3
x0>0p0NULL
2x0+1=v0
n0p0
71
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
solve: x0>0 and p0NULL and 2x0+1=v0 and n0=p0
.
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=1
NULL
3
x0>0p0NULL
2x0+1=v0
n0p0
72
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
solve: x0>0 and p0NULL and 2x0+1=v0 and n0=p0
x0=1, p0
3
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=1
NULL
3
x0>0p0NULL
2x0+1=v0
n0p0
73
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=1
3
74
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
x0>0p=p0, x=x0,p->v =v0,
p->next=n0
p, x=1
3
75
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
x0>0p0NULLp=p0, x=x0,
p->v =v0,p->next=n0
p, x=1
3
76
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
x0>0p0NULL
2x0+1=v0
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=1
3
77
CUTE Approachtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
Concrete Execution
Symbolic Execution
concrete state
symbolic state
constraints
x0>0p0NULL
2x0+1=v0
n0=p0
p=p0, x=x0,p->v =v0,
p->next=n0
p, x=1
3
Program Error
78
Pointer Inputs: Input Graphtypedef struct cell { int v; struct cell *next;
} cell;
int f(int v) { return 2*v + 1;
}
int testme(cell *p, int x) {if (x > 0)if (p != NULL)if (f(x) == p->v)if (p->next == p)abort();
return 0;}
79
CUTE in a Nutshell Generate concrete inputs one by one
each input leads program along a different path
80
CUTE in a Nutshell Generate concrete inputs one by one
each input leads program along a different path On each input execute program both concretely and
symbolically
81
CUTE in a Nutshell Generate concrete inputs one by one
each input leads program along a different path On each input execute program both concretely and
symbolically Both cooperate with each other
concrete execution guides the symbolic execution
82
CUTE in a Nutshell Generate concrete inputs one by one
each input leads program along a different path On each input execute program both concretely and
symbolically Both cooperate with each other
concrete execution guides the symbolic execution concrete execution enables symbolic execution to
overcome incompleteness of theorem prover replace symbolic expressions by concrete values if
symbolic expressions become complex resolve aliases for pointer using concrete values handle arrays naturally
83
CUTE in a Nutshell Generate concrete inputs one by one
each input leads program along a different path On each input execute program both concretely and
symbolically Both cooperate with each other
concrete execution guides the symbolic execution concrete execution enables symbolic execution to
overcome incompleteness of theorem prover replace symbolic expressions by concrete values if
symbolic expressions become complex resolve aliases for pointer using concrete values handle arrays naturally
symbolic execution helps to generate concrete input for next execution increases coverage
84
Handling Pointer Casting and Arithmeticstruct foo {
int i;char c;
};void * memset(void *s,char c,size_t n){
for(int i=0;i<n;i++){((char *)s)[i] = c; }
return s;}bar(struct foo *a){
if(a && a->c==1){memset(a,0,sizeof(struct foo));if(a->c!= 1) {
ERROR;}
}}
Reasoning about dynamic data is easy
Due to limitation of alias analysis “static analyzers”cannot determine that “a->c” has been rewritten BLAST would infer that
the program is safe practical
CUTE finds the error
85
Library Functions With Side-effectsstruct foo {
int i;char c;
};
bar(struct foo *a){if(a && a->c==1){
memset(a,0,sizeof(struct foo));if(a->c== 1) {
ERROR;}
}}
Reasoning about function with side-effects
Sound static tool will give false alarm
CUTE will not report the error because it cannot generate
a concrete input that will make the program hit ERROR.
86
Underlying Random Testing Helps1 foobar(int x, int y){2 if (x*x*x > 0){3 if (x>0 && y==10){4 ERROR;5 }6 } else {7 if (x>0 && y==20){8 ERROR;9 }10 }11 }
static analysis based model-checkers would consider both branches both ERROR statements
are reachable false alarm
Symbolic execution gets stuck at line number
2 or warn that both
ERRORs are reachable CUTE finds the only
error
87
Data-structure TestingSolving Data-structure Invariantsint isSortedSlist(slist * head) {slist * cur, *tmp;int i,j;if (head == 0) return 1;i=j=0;for (cur = head; cur!=0; cur = cur->next){i++;j=1;for (tmp = head; j<i; tmp = tmp->next){j++;if(cur==tmp) return 0;
}}for (cur = head; cur->next!=0; cur = cur-
>next){if(cur->i > cur->_next->i) return 0;
}return 1;
}
testme(slist *L,slist *e){CUTE_assume(isSortedSlist(L));sglib_slist_add(&L,e);CUTE_assert(isSortedSlist(L));
}
88
Data-structure TestingGenerating Call Sequencefor (i=1; i<10; i++) {
CU_input(toss);CU_input(e);switch(toss){case 2: sglib_hashed_ilist_add_if_not_member(htab,e,&m); break;case 3: sglib_hashed_ilist_delete_if_member(htab,e,&m); break;case 4: sglib_hashed_ilist_delete(htab,e); break;case 5: sglib_hashed_ilist_is_member(htab,e); break;case 6: sglib_hashed_ilist_find_member(htab,e); break;}
}
89
Instrumentation
90
Instrumented Program Maintains a symbolic state
maps each memory location used by the program to symbolic expressions over symbolic input variables
Maintains a path constraint a sequence of constraints
C1, C2,…, Ck such that Ci s are symbolic
constraints over symbolic input variables
C1 � C2 � … � Ck holds over the path currently executed by the program
91
Instrumented Program Maintains a symbolic state
maps each memory location used by the program to symbolic expressions over symbolic input variables
Maintains a path constraint a sequence of constraints
C1, C2,…, Ck such that Ci s are symbolic
constraints over symbolic input variables
C1 � C2 � … � Ck holds over the path currently executed by the program
Arithmetic Expressionsa1x1+…+anxn+c
Pointer expressionsx
92
Instrumented Program Maintains a symbolic state
maps each memory location used by the program to symbolic expressions over symbolic input variables
Maintains a path constraint a sequence of constraints
C1, C2,…, Ck such that Ci s are symbolic
constraints over symbolic input variables
C1 � C2 � … � Ck holds over the path currently executed by the program
Arithmetic Expressionsa1x1+…+anxn+c
Pointer expressionsx
Arithmetic Constraintsa1x1+…+anxn � 0
Pointer Constraintsx=yx yx=NULLx NULL
93
Constraint SolvingPath constraintC1� C2�… � Ck �… Cn
94
Constraint SolvingPath constraintC1� C2�… � Ck �… Cn
Solve C1� C2�… � � Ck
to generate next input
95
Constraint SolvingPath constraintC1� C2�… � Ck �… Cn
Solve C1� C2�… � � Ck
to generate next input
If Ck is pointer constraint then invoke pointer solver C
i1� C
i2� … � � Ck where ij � [1..k-1] and C
ijis
pointer constraintIf C
kis arithmetic constraint then invoke linear solver
Ci1� C
i2� … � � Ck where ij � [1..k-1] and C
ijis
arithmetic constraint
96
Optimizations Fast un-satisfiability check
C1� C2�… � � Ck
97
Optimizations Fast un-satisfiability check
C1� C2�… � � Ck
Eliminate same constraints(y>2) � (y>2)� �(y==4)
�
(y>2)� �(y==4)that is
C1� C2�…Ci … Cj … � � Ck = C1� C2�… Ci … � � Ck if Ci = Cj
98
Optimizations Fast un-satisfiability check
C1� C2�… � � Ck
Eliminate same constraints(y>2) � (y>2)� �(y==4)
�
(y>2)� �(y==4)that is
C1� C2�…Ci … Cj … � � Ck = C1� C2�… Ci … � � Ck if Ci = Cj
Eliminate non-dependent constraints(x==1) � (y>2)� �(y==4)
�
(y>2)� �(y==4)
99
SGLIB: popular library for C data-structures Used in Xrefactory a commercial tool for refactoring
C/C++ programs Found two bugs in sglib 1.0.1
reported them to authors fixed in sglib 1.0.2
Bug 1: doubly-linked list library
segmentation fault occurs when a non-zero length list is concatenated with a zero-length list
discovered in 140 iterations ( < 1second) Bug 2:
hash-table an infinite loop in hash table is member function 193 iterations (1 second)
100
SGLIB: popular library for C data-structures
NameRun time
in seconds
# of Iterations
# of BranchesExplored
% Branch Coverage
# of Functions
Tested
# of BugsFound
Array Quick Sort 2 732 43 97.73 2 0
Array Heap Sort 4 1764 36 100.00 2 0
Linked List 2 570 100 96.15 12 0
Sorted List 2 1020 110 96.49 11 0
Doubly Linked List 3 1317 224 99.12 17 1
Hash Table 1 193 46 85.19 8 1
Red Black Tree 2629 1,000,000 242 71.18 17 0
101
Experiments: Needham-Schroeder Protocol
Tested a C implementation of a security protocol (Needham-Schroeder) with a known attack 600 lines of code Took less than 13 seconds on a machine with
1.8GHz Xeon processor and 2 GB RAM to discover middle-man attack
In contrast, a software model-checker (VeriSoft) took 8 hours
102
Testing Data-structures of CUTE itself Unit tested several non-standard data-
structures implemented for the CUTE tool cu_depend (used to determine dependency
during constraint solving using graph algorithm) cu_linear (linear symbolic expressions) cu_pointer (pointer symbolic expresiions)
Discovered a few memory leaks and a copule of segmentation faults these errors did not show up in other uses of
CUTE for memory leaks we used CUTE in conjunction
with Valgrind
103
Limitations Path Space of a Large Program is Huge
Path Explosion Problem
Entire Computation Tree
104
Limitations Path Space of a Large Program is Huge
Path Explosion Problem
Explored by Concolic Testing
Entire Computation Tree
105
Limitations Path Space of a Large Program is Huge
Path Explosion Problem
Explored by Concolic Testing
Entire Computation Tree
106
Hybrid Concolic Testing(joing work with Rupak Majumdar)
107
Limitations: A Comparative View
Concolic: Broad, shallow Random: Narrow, deep
108
Hybrid Concolic Testing Interleave Random Testing and Concolic Testing to
increase coverage
Motivated by similar idea used in VLSI design validation:
Ganai et al. 1999, Ho et al. 2000
109
Hybrid Concolic Testing Interleave Random Testing and Concolic Testing to
increase coverage
while (not required coverage) {
while (not saturation)
perform random testing;
Checkpoint;
while (not increase in coverage)
perform concolic testing;
Restore;
}
110
Hybrid Concolic Testing Interleave Random Testing and Concolic Testing to
increase coverage
while (not required coverage) {
while (not saturation)
perform random testing;
Checkpoint;
while (not increase in coverage)
perform concolic testing;
Restore;
}
Deep, broad search
Hybrid Search
111
Hybrid Concolic Testing 4x more coverage than random testing 2x more coverage than concolic testing
112
Results
113
Results
114
Summary
Program Verification
Static Analysis
Automated Theorem Proving
Model-Checking and
Formal Methods
Constraint Solving
115
Summary
Program Verification
Static Analysis
Automated Theorem Proving
Model-Checking and
Formal Methods
Constraint Solving
Testing
116
References DART: Directed Automated Random Testing. Patrice
Godefroid, Nils Klarlund, and Koushik Sen (PLDI 2005) CUTE: A Concolic Unit Testing Engine for C. Koushik Sen,
Darko Marinov, and Gul Agha (ESEC/FSE 2005)1. "Dynamic Test Input Generation for Database Applications” Michael Emmi, Rupak Majumdar,
and Koushik Sen, International Symposium on Software Testing and Analysis (ISSTA'07), ACM, 2007.
2. "Hybrid Concolic Testing” Rupak Majumdar and Koushik Sen, 29th International Conference on Software Engineering (ICSE'07), IEEE, 2007.
3. "A Race-Detection and Flipping Algorithm for Automated Testing of Multi-Threaded Programs”Koushik Sen and Gul Agha, Haifa verification conference 2006 (HVC'06), Lecture Notes in Computer Science, Springer, 2006.
4. "CUTE and jCUTE : Concolic Unit Testing and Explicit Path Model-Checking Tools” Koushik Sen and Gul Agha, 18th International Conference on Computer Aided Verification (CAV'06), Lecture Notes in Computer Science, Springer, 2006. (Tool Paper).
5. "Automated Systematic Testing of Open Distributed Programs” Koushik Sen and Gul Agha, International Conference on Fundamental Approaches to Software Engineering (FASE'06) (ETAPS'06 conference), Lecture Notes in Computer Science, Springer, 2006.
6. “EXE: Automatically Generating Inputs of Death,” Cristian Cadar, Vijay Ganesh, Peter M. Pawlowski, David L. Dill, Dawson R. Engler, 13th ACM Conference on Computer and Communications Security, 2006.
7. “Automatically generating malicious disks using symbolic execution,” Junfeng Yang, Can Sar, Paul Twohey, Cristian Cadar, and Dawson Engler. IEEE Security and Privacy, 2006.