Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE)...

200
Efficient Symbolic Execution for Software Testing Johannes Kinder Royal Holloway, University of London Joint work with: Stefan Bucur, George Candea, Volodymyr Kuznetsov @ EPFL

Transcript of Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE)...

Page 1: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Efficient Symbolic Execution

for Software Testing

Johannes Kinder

Royal Holloway, University of London

Joint work with:

Stefan Bucur, George Candea, Volodymyr Kuznetsov @ EPFL

Page 2: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Symbolic Execution

• Automatically explore program paths

• Execute program on “symbolic” input values

• “Fork” execution at each branch

• Record branching conditions

• Constraint solver

• Decides path feasibility

• Generates test cases for paths and bugs

2

Page 3: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Symbolic Execution

• (Very brief) history

• Test generation by SE in 70s [King ‘75] [Boyer et al. ’75]

• SAT / SMT solvers lead to boom in 2000s

[Godefroid et al. ‘05][Cadar et al. ’06]

• Many successful tools

• KLEE, SAGE, PEX, SPF, CREST, Cloud9, S2E, …

• Specific advantages

• No false positives, useful partial results

• Reduces need for modeling

Page 4: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Outline

• Symbolic Execution for Testing

• State Merging – Fighting Path Explosion

• Interpreted High-Level Code

Page 5: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Outline

• Symbolic Execution for Testing

• State Merging – Fighting Path Explosion

• Interpreted High-Level Code

Page 6: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

6

Page 7: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

7

Symbolic

program state

Page 8: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

8

Symbolic

program state

Input symbol

Page 9: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

9

Symbolic

program state

Path condition

Input symbol

Page 10: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

10

Page 11: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

11

✓ ✓

Page 12: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

12

Page 13: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

13

Page 14: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

14

Page 15: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

15

Page 16: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

16

Page 17: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

17

Page 18: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

18

Page 19: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

19

Page 20: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

20

Satisfying assignments:

Test cases:

Page 21: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Finding Bugs

• Symbolic execution enumerates paths

• Runs into bugs that trigger whenever path executes

• Assertions, buffer overflows, division by zero, etc.,

require specific conditions

• Error conditions

• Treat assertions as conditions

• Creates explicit error paths

21

Page 22: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Finding Bugs

• Instrument program with properties

• Translate any safety property to reachability

• Division by zero

22

Page 23: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Finding Bugs

• Instrument program with properties

• Translate any safety property to reachability

• Division by zero

• Buffer overflows

23

Page 24: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Finding Bugs

• Instrument program with properties

• Translate any safety property to reachability

• Division by zero

• Buffer overflows

• Implementation is usually implicit

24

Page 25: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Symbolic Execution Algorithms

• Static symbolic execution

• Simulate execution on program source code

• Computes strongest postconditions from entry point

• Dynamic symbolic execution (DSE)

• Run / interpret the program with concrete state

• Symbolic state computed in parallel (“concolic”)

• Solver generates new concrete state

• DSE-Flavors

• EXE-style [Cadar et al. ‘06] vs. DART [Godefroid et al. ‘05]

25

Page 26: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

EXE

26

Page 27: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

EXE

27

Symbolic

program state

Page 28: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

EXE

28

Symbolic

program state

Concrete state

Page 29: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

EXE

29

Page 30: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

EXE

30

Page 31: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

EXE

31

Page 32: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

EXE

32

Page 33: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

EXE

33

Page 34: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

EXE

34

Page 35: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

EXE

35

Page 36: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

EXE

36

Page 37: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

EXE

37

Page 38: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

EXE

38

Page 39: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

EXE

39

Satisfying assignments:

Test cases:

Page 40: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

40

Path condition:

Test cases:

Page 41: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

41

Path condition:

Test cases:

Page 42: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

42

Path condition:

Test cases:

Page 43: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

43

Path condition:

Test cases:

Page 44: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

44

Path condition:

Test cases:

Page 45: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

45

Path condition:

Test cases:

Page 46: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

46

Path condition:

Test cases:

Page 47: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

47

Path condition:

Test cases:

Page 48: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

48

Path condition:

Test cases:

Page 49: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

49

Path condition:

Test cases:

Page 50: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

50

Path condition:

Test cases:

Page 51: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

51

Path condition:

Test cases:

Page 52: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

52

Path condition:

Test cases:

Page 53: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

53

Path condition:

Test cases:

Page 54: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

54

Path condition:

Test cases:

Page 55: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART

55

Path condition:

Test cases:

Page 56: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

DART vs. EXE

• Complete execution

from 1st step

• Deep exploration

• One query per run

• Offline SE possible

• Follow recorded trace

56

• Fine-grained control of

execution

• Shallow exploration

• Many queries early on

• Online SE

• SE and interpretation

in lockstep

Page 57: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Concretization

• Parts of input space can be kept concrete

• Reduces complexity

• Focuses search

• Expressions can be concretized at runtime

• Avoid expressions outside of SMT solver theories

(non-linear etc.)

• Sound but incomplete

57

Page 58: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Concretization (Example)

58

Page 59: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Concretization (Example)

59

Page 60: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Concretization (Example)

60

Page 61: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Concretization (Example)

61

Page 62: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Concretization (Example)

62

Page 63: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Concretization (Example)

63

Page 64: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Concretization (Example)

64

Page 65: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Concretization (Example)

65

Solution diverges from

expected path! (e.g., X = 2)

Page 66: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Concretization (Example)

66

Concretization constraint

Page 67: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Soundness & Completeness

• Conceptually, each path is exact

• Strongest postcondition in predicate transformer

semantics

• No over-approximation, no under-approximation

• Globally, SE under-approximates

• Explores only subset of paths in finite time

• “Eventual” completeness

67

Page 68: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Soundness & Completeness

• Conceptually, each path is exact

• Strongest postcondition in predicate transformer

semantics

• No over-approximation, no under-approximation

• Globally, SE under-approximates

• Explores only subset of paths in finite time

• “Eventual” completeness

68

Explored

symbolically State Space

Page 69: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Soundness & Completeness

• Symbolic Execution = Underapproximates

• Soundness = does not include infeasible behavior

• Completeness = explores all behavior

• Concretization restricts state covered by path

• Remains sound

• Loses (eventual) completeness

69

Explored

symbolically State Space

Page 70: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Soundness & Completeness

• Symbolic Execution = Underapproximates

• Soundness = does not include infeasible behavior

• Completeness = explores all behavior

• Concretization restricts state covered by path

• Remains sound

• Loses (eventual) completeness

70

Explored

symbolically State Space

Page 71: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Concretization

• Key strength of dynamic symbolic execution

• Enables external calls

• Concretize call arguments

• Callee executes concretely

• Concretization constraints can be omitted

• Sacrifices soundness (original DART)

• Deal with divergences by random restarts

71

Page 72: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Outline

• Symbolic Execution for Testing

• State Merging – Fighting Path Explosion

• Interpreted High-Level Code

Page 73: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

“echo” Coreutil

Page 74: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

“echo” Coreutil

Page 75: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

“echo” Coreutil

Page 76: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

“echo” Coreutil

Page 77: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Symbolic Execution

77

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Page 78: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Symbolic Execution

78

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Page 79: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Symbolic Execution

79

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Page 80: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Symbolic Execution

80

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Page 81: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Symbolic Execution

81

✓ ✓

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Page 82: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Problem: Path Explosion

82

Page 83: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Problem: Path Explosion

83

Page 84: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

if (argv[i][0] == 'n') { r = 0; ++i; }

Solution (?): State Merging

else then

84

Page 85: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

if (argv[i][0] == 'n') { r = 0; ++i; }

Solution (?): State Merging

• Use disjunctions to represent

state at join points

• ite(x, y, z) : if x then y else z

else then

85

Page 86: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

if (argv[i][0] == 'n') { r = 0; ++i; }

Solution (?): State Merging

• Use disjunctions to represent

state at join points

• ite(x, y, z) : if x then y else z

else then

86

Page 87: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

if (argv[i][0] == 'n') { r = 0; ++i; }

Solution (?): State Merging

• Use disjunctions to represent

state at join points

• ite(x, y, z) : if x then y else z

• SE tree becomes a DAG

• Whole program can be turned into

one verification condition (BMC)

else then

87

Page 88: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Symbolic Execution vs. BMC

• Complexity does not disappear

• Work moved from the SE engine to the solver

• SE: set of conjunctive queries, BMC: 1 query with

nested disjunctions

• Complete merging sacrifices advantages of SE

• No dynamic mode

• No continuous progress

• No quick reaching of coverage goals

• Try to get the best of both worlds

Page 89: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

1 formula / path 1 formula / CFG

DART (SAGE) [Godefroid, PLDI’05]

EXE (KLEE) [Cadar et al., CCS’06]

Symbolic Execution Verification Condition

Generation

F-Soft [Ivancic et al., CAV’05]

CBMC [Clarke et al., TACAS’04]

Page 90: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

1 formula / path 1 formula / CFG

DART (SAGE) [Godefroid, PLDI’05]

EXE (KLEE) [Cadar et al., CCS’06]

Symbolic Execution Verification Condition

Generation

F-Soft [Ivancic et al., CAV’05]

CBMC [Clarke et al., TACAS’04]

Boogie [Barnett et al., FMCO’05]

Page 91: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

1 formula / path 1 formula / CFG

Compositional SE /

Summaries [Godefroid, POPL’07]

DART (SAGE) [Godefroid, PLDI’05]

EXE (KLEE) [Cadar et al., CCS’06]

Symbolic Execution Verification Condition

Generation

F-Soft [Ivancic et al., CAV’05]

CBMC [Clarke et al., TACAS’04]

Boogie [Barnett et al., FMCO’05]

Page 92: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

1 formula / path 1 formula / CFG

Compositional SE /

Summaries [Godefroid, POPL’07]

BMC slicing [Ganai&Gupta, DAC’08]

DART (SAGE) [Godefroid, PLDI’05]

EXE (KLEE) [Cadar et al., CCS’06]

Symbolic Execution Verification Condition

Generation

F-Soft [Ivancic et al., CAV’05]

CBMC [Clarke et al., TACAS’04]

Boogie [Barnett et al., FMCO’05]

Page 93: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

1 formula / path 1 formula / CFG

Compositional SE /

Summaries [Godefroid, POPL’07]

BMC slicing [Ganai&Gupta, DAC’08]

State joining [Hansen et al., RV’09]

DART (SAGE) [Godefroid, PLDI’05]

EXE (KLEE) [Cadar et al., CCS’06]

Symbolic Execution Verification Condition

Generation

F-Soft [Ivancic et al., CAV’05]

CBMC [Clarke et al., TACAS’04]

Boogie [Barnett et al., FMCO’05]

Page 94: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

1 formula / path 1 formula / CFG

Compositional SE /

Summaries [Godefroid, POPL’07]

BMC slicing [Ganai&Gupta, DAC’08]

State joining [Hansen et al., RV’09]

DART (SAGE) [Godefroid, PLDI’05]

EXE (KLEE) [Cadar et al., CCS’06]

Symbolic Execution Verification Condition

Generation

F-Soft [Ivancic et al., CAV’05]

CBMC [Clarke et al., TACAS’04]

Boogie [Barnett et al., FMCO’05]

Dynamic State Merging

Page 95: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

1 formula / path 1 formula / CFG

Compositional SE /

Summaries [Godefroid, POPL’07]

BMC slicing [Ganai&Gupta, DAC’08]

State joining [Hansen et al., RV’09]

DART (SAGE) [Godefroid, PLDI’05]

EXE (KLEE) [Cadar et al., CCS’06]

Symbolic Execution Verification Condition

Generation

F-Soft [Ivancic et al., CAV’05]

CBMC [Clarke et al., TACAS’04]

Boogie [Barnett et al., FMCO’05]

Query Count Estimation Dynamic State Merging

[KKBC PLDI’12]

Page 96: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging Increases Solving Cost

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

96

Page 97: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging Increases Solving Cost

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

97

Page 98: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging Increases Solving Cost

Condition becomes symbolic, extra check required.

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

98

Page 99: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging Increases Solving Cost

Condition becomes symbolic, extra check required.

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

99

Page 100: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging Increases Solving Cost

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

100

Page 101: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging Increases Solving Cost

Query becomes more complex.

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

101

Page 102: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging Increases Solving Cost

Query becomes more complex.

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Should not merge after

checking 1st argument!

102

Page 103: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Query Count Estimation (QCE)

Intuition

• Estimate the extra burden on the solver

• Merge only when merging amortizes extra cost

Cost ≈ number of solver queries

103

Page 104: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Applying QCE

1. Estimate query counts from each program location

• Total number of queries on all paths

• For each variable, number of dependent queries

Page 105: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Applying QCE

1. Estimate query counts from each program location

• Total number of queries on all paths

• For each variable, number of dependent queries

Sta

tic

Page 106: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Applying QCE

1. Estimate query counts from each program location

• Total number of queries on all paths

• For each variable, number of dependent queries

2. Determine “hot” variables

Sta

tic

Page 107: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Applying QCE

1. Estimate query counts from each program location

• Total number of queries on all paths

• For each variable, number of dependent queries

2. Determine “hot” variables

3. Symbolic Execution

• Do not merge two candidate states if they differ in hot

variables

• Avoids creating ite expressions

Sta

tic

Dyn

am

ic

Page 108: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging not Beneficial

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

i is “hot”, leads to

many extra queries

108

Page 109: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging not Beneficial

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

i is “hot”, leads to

many extra queries

109

Page 110: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging not Beneficial

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

i is “hot”, leads to

many extra queries

110

Page 111: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

111

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Merging Beneficial

Page 112: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

112

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Merging Beneficial

Page 113: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

113

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Merging Beneficial

Page 114: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

114

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Merging Beneficial

Page 115: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

115

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Merging Beneficial

Page 116: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

116

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Merging Beneficial

Page 117: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

117

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Merging Beneficial

Page 118: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

118

void main(int argc, char **argv) { int r = 1, i = 1; if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } } for (; i < argc; ++i) { for (int j = 0; argv[i][j] != 0; ++j) { putchar(argv[i][j]); } } if (r) { putchar('\n'); } }

Merging Beneficial

Largc states reduced to 1 (L: maximum argument length)

Page 119: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Search Strategies

… …

• SE usually incomplete

• 100% path coverage is

impractical

• Uses search strategy to

reach a coverage goal

• Search strategy chooses

states from a worklist

119

Page 120: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Search Strategies

… …

• SE usually incomplete

• 100% path coverage is

impractical

• Uses search strategy to

reach a coverage goal

• Search strategy chooses

states from a worklist

120

Page 121: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging Breaks Search Strategies

… …

• Naïve state merging

blocks states at join points

• Allows earlier states to

“catch up”

• Thwarts strategy in

reaching its goal!

121

Page 122: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging Breaks Search Strategies

… …

• Naïve state merging

blocks states at join points

• Allows earlier states to

“catch up”

• Thwarts strategy in

reaching its goal!

122

Page 123: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging Breaks Search Strategies

… …

• Naïve state merging

blocks states at join points

• Allows earlier states to

“catch up”

• Thwarts strategy in

reaching its goal!

123

Page 124: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging Breaks Search Strategies

… …

• Naïve state merging

blocks states at join points

• Allows earlier states to

“catch up”

• Thwarts strategy in

reaching its goal! Ask QCE

124

Page 125: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Merging Breaks Search Strategies

… …

• Naïve state merging

blocks states at join points

• Allows earlier states to

“catch up”

• Thwarts strategy in

reaching its goal!

125

Page 126: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Dynamic State Merging

… …

• Maintain bounded history of predecessor states

126

Page 127: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Dynamic State Merging

… …

• Maintain bounded history of predecessor states

127

Page 128: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Dynamic State Merging

… …

• Maintain bounded history of predecessor states

128

Page 129: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Dynamic State Merging

… …

• Maintain bounded history of predecessor states

• QCE compares a state to predecessors of others

Ask QCE

129

Page 130: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Dynamic State Merging

… …

• Maintain bounded history of predecessor states

• QCE compares a state to predecessors of others

• Matching states are prioritized

• Original search strategy controls remaining worklist

130

Page 131: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Dynamic State Merging

… …

• Maintain bounded history of predecessor states

• QCE compares a state to predecessors of others

• Matching states are prioritized

• Original search strategy controls remaining worklist

Ask QCE

131

Page 132: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Dynamic State Merging

… …

• Maintain bounded history of predecessor states

• QCE compares a state to predecessors of others

• Matching states are prioritized

• Original search strategy controls remaining worklist

132

Page 133: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Evaluation

• Our prototype

• Based on state-of-the-art engine KLEE

• QCE implemented in LLVM – static analysis

completes in seconds

• Analysis Targets

• 96 GNU Coreutils (echo, ls, dd, who, …)

• 2’000 – 10’000 executable lines of code per tool

• 72 KLOC in total

133

Page 134: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Time for Exhaustive Exploration

Triangle: KLEE time-out

(> 2h)

QCE + SSM vs. KLEE.

Static state merging (SSM)

follows topological order.

Consistent speedups

10

100

1000

7200(Timeout)

10 100 1000 7200(Timeout)

SS

M C

om

ple

tio

n T

ime

(T

SS

M)

in s

eco

nd

s

KLEE Completion Time (T Klee) in seconds

No Speedup (TKlee = TSSM)

Page 135: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Speedup vs. Input-Size (Exhaustive)

QCE + SSM vs. KLEE.

0.1

1

10

100

1000

10000

0 5 10 15 20 25 30 35 40 45

Sp

ee

du

p (

TK

lee /

TS

SM

)

Input Size (# of symbolic bytes)

echolink

nicebasename

Exponential speedups in symbolic input size

135

Page 136: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

# Paths in Incomplete Exploration

0.0001

0.01

1

100

10000

1e+06

1e+08

1e+10

1e+12

ddpwdunlink

uptime

idwhoami

users

sync

readlink

comm

sleep

hostid

tsort

logname

link

ttyrmdir

mkfifo

mktemp

mknod

rmpinky

suuname

dircolors

printf

base64

lnmvtacruncon

date

mkdir

whosetuidgid

seqchcon

cpstty

teeuniq

dirshuf

fold

catprintenv

split

nice

prshred

dutail

ptxpaste

chown

trecho

join

cutbasename

dirname

nohup

chroot

expand

chgrp

cksum

sumfalse

true

yeshead

csplit

odfmtchmod

factor

stat

sort

unexpand

touch

wcenvnlsha1sum

md5sum

Pa

th R

atio

(P

DS

M / P

Kle

e)

Tool name

QCE + DSM vs. KLEE (1h time bound).

Up to 11 orders of magnitude more paths explored

136

Target Coreutil

Page 137: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Combating Path Explosion

• Dynamic merging (DSM + QCE)

• Static merging of small structures [Avgerinos et al.,

ICSE’14]

• Procedure summaries [Godefroid, POPL’07]

• Parallelization [Bucur et al, EuroSys’11]

137

Page 138: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Outline

• Symbolic Execution for Testing

• State Merging – Fighting Path Explosion

• Interpreted High-Level Code

Page 139: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Programming Languages

Symbolic Execution Engines

Page 140: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Programming Languages

Symbolic Execution Engines

Scala

Java

C C++

Page 141: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Programming Languages

Symbolic Execution Engines

Java Bytecode

LLVM

x86 ARM

Scala

Java

C C++

Page 142: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Programming Languages

Symbolic Execution Engines

Java Bytecode

LLVM

x86 ARM

Scala

Java

C C++

BitBlaze

KLEE

JPF

SAGE

S2E

Page 143: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Programming Languages

Symbolic Execution Engines

Java Bytecode

LLVM

x86 ARM

Scala

Java

C C++

Python Ruby

Lua JavaScript

Bash Perl

BitBlaze

KLEE

JPF

SAGE

S2E

Page 144: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Programming Languages

Symbolic Execution Engines

Java Bytecode

LLVM

x86 ARM

Scala

Java

C C++

Python Ruby

Lua JavaScript

Bash Perl

BitBlaze

KLEE

JPF

SAGE

S2E

?

Page 145: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

def parse_file(file_name):

with open(file_name, "r") as f:

data = f.read()

return json.loads(data, encoding="utf-8")

Interpreted Languages

Page 146: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

def parse_file(file_name):

with open(file_name, "r") as f:

data = f.read()

return json.loads(data, encoding="utf-8")

Interpreted Languages

Complex semantics +

Ambiguity in specifications +

Evolving language +

Large standard library +

Widespread native methods

Since Python 2.5

Complete File Read

Incomplete Specification

Page 147: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

def parse_file(file_name):

with open(file_name, "r") as f:

data = f.read()

return json.loads(data, encoding="utf-8")

Interpreted Languages

Complex semantics +

Ambiguity in specifications +

Evolving language +

Large standard library +

Widespread native methods

Since Python 2.5

Complete File Read

Incomplete Specification

Too m

uch

work

Page 148: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

“Consequently, if you were coming from

Mars and tried to re-implement Python

from this document alone, you might have

to guess things and in fact you would

probably end up implementing quite a

different language.”

- The Python Language Reference

Page 149: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

How can we efficiently obtain

a correct symbolic execution engine?

Page 150: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Key idea:

Use the language interpreter

as executable specification

[Bucur, K, Candea, ASPLOS’14]

Page 151: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Symbolic Execution

Engine for

Language X Chef

Key idea:

Use the language interpreter

as executable specification

Language X

Interpreter

[Bucur, K, Candea, ASPLOS’14]

Page 152: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Symbolic Execution

Engine for

Language X Chef

Program + Symbolic Tests

Test Cases

Key idea:

Use the language interpreter

as executable specification

Language X

Interpreter

[Bucur, K, Candea, ASPLOS’14]

Page 153: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Chef Overview

• Built on top of the S2E symbolic execution

engine for x86

• Relies on lightweight interpreter

instrumentation + optimizations

• Prototyped engines for Python and Lua in 5 +

3 person-days

Page 154: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Testing Interpreted Programs

Naive approach: Run interpreter in stock SE engine

Page 155: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Testing Interpreted Programs

def validateEmail(email):

pos = email.find("@")

if pos < 1:

raise InvalidEmailError()

if email.rfind(".") < pos:

raise InvalidEmailError()

Naive approach: Run interpreter in stock SE engine

Page 156: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Testing Interpreted Programs

def validateEmail(email):

pos = email.find("@")

if pos < 1:

raise InvalidEmailError()

if email.rfind(".") < pos:

raise InvalidEmailError()

Naive approach: Run interpreter in stock SE engine

Python Interpreter

./python program.py

Page 157: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Testing Interpreted Programs

def validateEmail(email):

pos = email.find("@")

if pos < 1:

raise InvalidEmailError()

if email.rfind(".") < pos:

raise InvalidEmailError()

x86 Symbolic Execution Engine (S2E)

Naive approach: Run interpreter in stock SE engine

Python Interpreter

./python program.py

Page 158: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Py_LOCAL_INLINE(Py_ssize_t)

fastsearch(const STRINGLIB_CHAR* s, Py_ssize_t n,

const STRINGLIB_CHAR* p, Py_ssize_t m,

Py_ssize_t maxcount, int mode)

{

unsigned long mask;

Py_ssize_t skip, count = 0;

Py_ssize_t i, j, mlast, w;

w = n - m;

if (w < 0 || (mode == FAST_COUNT && maxcount == 0))

return -1;

/* look for special cases */

if (m <= 1) {

if (m <= 0)

return -1;

/* use special case for 1-character strings */

if (mode == FAST_COUNT) {

for (i = 0; i < n; i++)

if (s[i] == p[0]) {

count++;

if (count == maxcount)

return maxcount;

}

return count;

} else if (mode == FAST_SEARCH) {

for (i = 0; i < n; i++)

if (s[i] == p[0])

return i;

} else { /* FAST_RSEARCH */

for (i = n - 1; i > -1; i--)

if (s[i] == p[0])

return i;

}

return -1;

}

mlast = m - 1;

skip = mlast - 1;

mask = 0;

if (mode != FAST_RSEARCH) {

/* create compressed boyer-moore delta 1 table */

/* process pattern[:-1] */

for (i = 0; i < mlast; i++) {

STRINGLIB_BLOOM_ADD(mask, p[i]);

if (p[i] == p[mlast])

skip = mlast - i - 1;

}

/* process pattern[-1] outside the loop */

STRINGLIB_BLOOM_ADD(mask, p[mlast]);

for (i = 0; i <= w; i++) {

/* note: using mlast in the skip path slows things down on x86

*/

if (s[i+m-1] == p[m-1]) {

/* candidate match */

for (j = 0; j < mlast; j++)

if (s[i+j] != p[j])

break;

if (j == mlast) {

/* got a match! */

if (mode != FAST_COUNT)

return i;

count++;

if (count == maxcount)

return maxcount;

i = i + mlast;

continue;

}

/* miss: check if next character is part of pattern */

if (!STRINGLIB_BLOOM(mask, s[i+m]))

i = i + m;

else

i = i + skip;

} else {

/* skip: check if next character is part of pattern */

if (!STRINGLIB_BLOOM(mask, s[i+m]))

i = i + m;

}

}

} else { /* FAST_RSEARCH */

/* create compressed boyer-moore delta 1 table */

/* process pattern[0] outside the loop */

STRINGLIB_BLOOM_ADD(mask, p[0]);

/* process pattern[:0:-1] */

for (i = mlast; i > 0; i--) {

STRINGLIB_BLOOM_ADD(mask, p[i]);

if (p[i] == p[0])

skip = i - 1;

}

for (i = w; i >= 0; i--) {

if (s[i] == p[0]) {

/* candidate match */

for (j = mlast; j > 0; j--)

if (s[i+j] != p[j])

break;

if (j == 0)

/* got a match! */

return i;

/* miss: check if previous character is part of pattern */

if (i > 0 && !STRINGLIB_BLOOM(mask, s[i-1]))

i = i - m;

else

i = i - skip;

} else {

/* skip: check if previous character is part of pattern */

if (i > 0 && !STRINGLIB_BLOOM(mask, s[i-1]))

i = i - m;

}

}

}

if (mode != FAST_COUNT)

return -1;

return count;

}

pos = email.find("@")

Naive approach:

Run interpreter in stock symbolic execution engine

Page 159: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

pos = email.find("@")

Naive approach:

Run interpreter in stock symbolic execution engine

Page 160: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

pos = email.find("@")

Naive approach:

Run interpreter in stock symbolic execution engine

Path Explosion

Page 161: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

pos = email.find("@")

Gets lost in the details of the implementation

Naive approach:

Run interpreter in stock symbolic execution engine

Path Explosion

Page 162: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

High-level Execution Paths def validateEmail(email):

pos = email.find("@")

if pos < 1:

raise

InvalidEmailError()

if email.rfind(".") < pos:

raise

InvalidEmailError()

High-level execution tree

Page 163: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

High-level Execution Paths def validateEmail(email):

pos = email.find("@")

if pos < 1:

raise

InvalidEmailError()

if email.rfind(".") < pos:

raise

InvalidEmailError()

High-level execution tree

Low-level (x86) execution tree

Page 164: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

High-level Execution Paths def validateEmail(email):

pos = email.find("@")

if pos < 1:

raise

InvalidEmailError()

if email.rfind(".") < pos:

raise

InvalidEmailError()

High-level execution tree

Low-level (x86) execution tree

Page 165: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

High-level Execution Paths def validateEmail(email):

pos = email.find("@")

if pos < 1:

raise

InvalidEmailError()

if email.rfind(".") < pos:

raise

InvalidEmailError()

High-level execution tree

Low-level (x86) execution tree

Page 166: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

High-level Execution Paths def validateEmail(email):

pos = email.find("@")

if pos < 1:

raise

InvalidEmailError()

if email.rfind(".") < pos:

raise

InvalidEmailError()

HL/LL path ratio is low

due to path explosion

3 HL paths

10 LL paths

High-level execution tree

Low-level (x86) execution tree

Page 167: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Goal:

Prioritize the low-level paths

that maximize the HL/LL path ratio.

Page 168: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

High-level Execution Paths

Alternative approach:

Select states at high-level

branches

Page 169: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

High-level Execution Paths Fork (low-level)

Divergence (high-level)

Alternative approach:

Select states at high-level

branches

Page 170: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

High-level Execution Paths Fork (low-level)

Divergence (high-level)

High-level fork points are

unpredictable

Alternative approach:

Select states at high-level

branches

Page 171: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Reducing Path Explosion

Program

Page 172: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Reducing Path Explosion

Fork points clustered in hot spots

Program Low High Selection probability

Page 173: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Reducing Path Explosion

Fork points clustered in hot spots

Clusters grow bigger ⇒ Slower overall progress

Program

“Fork bomb” (e.g., input dependent loop)

Low High Selection probability

Global DFS / BFS / randomized strategy

Page 174: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Reducing Path Explosion

Fork points clustered in hot spots

Clusters grow bigger ⇒ Slower overall progress

Program

“Fork bomb” (e.g., input dependent loop)

Low High Selection probability

Reduced state diversity

Global DFS / BFS / randomized strategy

Page 175: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Class-Uniform Path Analysis

Program

Idea: Partition the state space into groups

Page 176: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Class-Uniform Path Analysis

Program

Idea: Partition the state space into groups

Page 177: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Class-Uniform Path Analysis

Program

Idea: Partition the state space into groups

Select group Select state

from group

Page 178: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Class-Uniform Path Analysis

Program

Idea: Partition the state space into groups

Select group Select state

from group

Faster progress across all groups

Page 179: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Class-Uniform Path Analysis

Program

Idea: Partition the state space into groups

Select group Select state

from group

Faster progress across all groups

Increased state diversity

Page 180: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Class-Uniform Path Analysis

Page 181: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Class-Uniform Path Analysis

Page 182: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Class-Uniform Path Analysis

States arranged in a class hierarchy

Class 1

Class 2

Page 183: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Class-Uniform Path Analysis

States arranged in a class hierarchy

Class 1

Class 2

Page 184: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Partitioning High-level Paths

Page 185: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Partitioning High-level Paths

High-level Instruction (Bytecode Instruction)

Page 186: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Partitioning High-level Paths

High-level Instruction (Bytecode Instruction)

Page 187: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Partitioning High-level Paths

High-level Instruction (Bytecode Instruction)

High-level

Program Counter

Page 188: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Partitioning High-level Paths

High-level Instruction (Bytecode Instruction)

Low-level

x86 PC

High-level

Program Counter

Page 189: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Partitioning High-level Paths

High-level Instruction (Bytecode Instruction)

Low-level

x86 PC

High-level

Program Counter

1st CUPA Class

Page 190: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Partitioning High-level Paths

High-level Instruction (Bytecode Instruction)

Low-level

x86 PC

High-level

Program Counter

1st CUPA Class

2nd CUPA Class

Reconstruct high-level execution tree

Page 191: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

CUPA Classes

1. High-level PC

• Uniform HL instruction exploration

• Obtained via instrumentation

2. x86 PC

• Uniform native method exploration

• Approximated as the PC of fork point

Page 192: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

switch (opcode) {

case LOAD:

...

case STORE:

...

case CALL_FUNCTION:

...

...

}

hlpc++;

}

Interpreter Loop Instrumentation while (true) {

fetch_instr(hlpc, &opcode, &params);

Page 193: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

switch (opcode) {

case LOAD:

...

case STORE:

...

case CALL_FUNCTION:

...

...

}

hlpc++;

}

Interpreter Loop Instrumentation while (true) {

fetch_instr(hlpc, &opcode, &params);

Reconstruct high-level execution tree and CFG

chef_log_hlpc(hlpc, opcode);

Page 194: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Interpreter Optimizations

• Simple changes to

interpreter source

• “Anti-optimizations” in

linear performance...

• ... but exponential gains

in symbolic mode

static long

string_hash(PyStringObject *a)

{

#ifdef SYMBEX_HASHES

return 0;

#else

register Py_ssize_t len;

register unsigned char *p;

register long x;

len = Py_SIZE(a);

p = (char *) a->ob_sval;

x = _Py_HashSecret.prefix;

x ^= *p << 7;

while (--len >= 0)

x = (1000003*x) ^ *p++;

x ^= Py_SIZE(a);

x ^= _Py_HashSecret.suffix;

if (x == -1)

x = -2;

return x;

#endif

} Hash neutralization

Page 195: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Program + Symbolic Tests

Symbolic Execution

Engine for

Language X

Chef Summary

Language X

Interpreter (+instrumentation)

CHEF API

HL Tree

Reconstr.

CUPA

State

Selection

S2E x86 Symbolic Execution

Chef

Page 196: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Chef-Prototyped Engines

Python

5 person-days

321 LoC

Lua

3 person-days

277 LoC

Page 197: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Testing Python Packages

xlrd

simplejson

argparse

HTMLParser

ConfigParser

unicodecsv

6 Popular Packages

10.9K lines of Python code

30 min. / package

> 7,000 tests generated

4 undocumented exceptions found

Page 198: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Efficiency

Package

0.1

1

10

100

1000

10000

Pa

th R

atio

(P

/ P

Baselin

e)

CUPA + Optimizations

Baseline

Page 199: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Efficiency

Package

0.1

1

10

100

1000

10000

Pa

th R

atio

(P

/ P

Baselin

e)

CUPA + Optimizations Optimizations Only

CUPA Only Baseline

Page 200: Efficient Symbolic Execution for Software Testing · DART (SAGE) [Godefroid, PLDI’05] EXE (KLEE) [Cadar et al., CCS’06] Symbolic Execution Verification Condition Generation F-Soft

Symbolic Execution

• Path-wise under-approximate program analysis

• Mixes concrete and symbolic reasoning

• Automatic test case-generation

• Major-challenge: path explosion

• Solutions:

• State merging

• Domain-specific optimizations

• …