Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice...

38
Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International Static Analysis Symposium – September 2011 Microsoft Research University of Wisconsin – Madison

Transcript of Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice...

Page 1: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Statically Validating Must Summaries for Incremental Compositional

Dynamic Test Generation

Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González

International Static Analysis Symposium – September 2011

Microsoft Research University of Wisconsin – Madison

Page 2: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

2

Valid inputConstraintsRecorded

trace

Background

• Systematic Dynamic Test Generation (= DART)

• Used in many toolso EXE, CUTE, SAGE, PEX, KLEE, BitScope, Apollo, etc.

Run program

Symbolically execute program

Negate and solve constraints

New inputs

And the process repeats (possibly forever!)

Page 3: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

3

o 200+ machines (since 2008)

• #1 application for SMT solvers today (CPU usage)o 1st whitebox fuzzer for security testing

SAGE @ Microsoft

o 1 billion+ constraints

o 100s of apps, 100s of security bugs Example: Win7 file fuzzing Found ~1/3 of all fuzzing bugs

o Millions of dollars saved for Microsoft + time/energy for the world

Page 4: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Compositional Test Generation

Compositional Dynamic Test Generation• Compute summaries that can be reused later• Avoid retesting• Can provide the same path coverage

exponentially faster!

4

Systematically executing all feasible paths does not scale

Page 5: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Example of Function Summary

5

1 int is_positive(int x) {2 if (x > 0) return 1;3 return 0;4 }

Where ret denotes the value returned by the function is_positive

Page 6: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Function Summaries

6

• Function summary for a function fo Logic formula over constraintso Derived by successive iterations and defined as a

disjunction of formulas

𝜑𝑤𝑓=𝑝𝑟𝑒𝑤𝑓 ⋀𝑝𝑜𝑠𝑡𝑤𝑓

Conjunction of constraints on the

inputs of f

Conjunction of constraints on the

outputs of f

o Can be computed automatically from the path constraint for the intraprocedural path

Page 7: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Must Summaries

7

• Symbolic execution of large programs impreciseo Complex program statementso Calls to operating-system and library functions

• Concrete values simplified constraintso Under-approximate path constraintso Summaries become must summaries

1 int g(int x, int y) {2 if ((x > 0) && (hash(y) > 10))

3 return 1;4 return 0;5 }

𝜑𝑔=(𝑥>0∧ 𝑦=45∧𝑟𝑒𝑡=1 )Under-approximate with smaller

precondition

Assume hash is a complex or unknown function

Assume if g is invoked with y = 45, then hash(45) = 987

Page 8: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Must Summaries

8

• Defined as quadruple ⟨lp, P, lq, Q ⟩ where:

Prog

Ip

lq

P summary precondition holding at lp

Q summary postcondition holding at lq

Page 9: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Some Facts About Summaries

• Time to be produced: weeks/months

9

• Number of summaries: millions

• Number of instructions executed between lp and lq: can be hundreds of thousands

Page 10: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Incremental Compositional Test Generation

10

Have to start from scratch if there is a small code change

Incremental compositional test generation • As in smart/selective regression testing• Reuse summaries still valid in new program• Recompute invalid summaries

Page 11: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Must Summary Checking

11

• Given a valid must summary for a program and a new version of the program, is the summary still valid for the new version?

• Intraprocedural summarieso locations lp and lq are in a same function fo function f does not return between lp to lq when the

summary is generated

Page 12: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Some proposals

• Naïveo For each summary, record executed instructions

Too expensive, ~100K of instructions executed Runtime overhead

12

• Our proposalo Verify statically what summaries are valid in

order to reuse them Less precise than recomputing summaries from

scratch, but cheaper

Page 13: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Algorithms

1. Static Change Impact Analysis

13

2. Predicate-Sensitive Change Impact Analysis

3. Must Summary Validity Checking Analysis

Page 14: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Phase 1: Static Change Impact Analysis

• Impact analysis of code changes in the control-flow and call graphs of the program

14Old program New program

Ip

lq

Ip

lq

Page 15: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Modified Instructions and Functions

• Instruction i of a program Prog is modified if:o i is changed or deleted in Prog’ oro Its ordered set of immediate successors has changed

15

• Function f in a program Prog is modified if f:o contains a modified instructiono calls a modified functiono calls an unknown function

Page 16: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Phase 1: Static Change Impact Analysis

16

...... ......

...... ............

......

......

...... ......

............

Construct call graph for the program1

Page 17: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

17

...... ......

...... ............

......

......

...... ......

............

U

MMU

M

IM

IM IMIU

IU

IU

IU

IU

IUIU

IM

S

S

S

S

S S

Find modified and unknown functions2 Find indirectly modified and unknown functions3

Phase 1: Static Change Impact Analysis

4 Map summaries, construct control-flow graphs

Page 18: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

18

...... ......

...... ............

......

......

...... ......

............

U

MMU

M

IM

IM IMIU

IU

IU

IU

IU

IUIU

IM

S

S

S

S

S S

Find summaries as valid or invalid5

Phase 1: Static Change Impact Analysis

Page 19: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Phase 2: Predicate-Sensitive Change Impact Analysis

19

• Exploit the predicates P and Q in a summary

if(x > 0)

if (y==0)

w = w + 1

w = 0 w = 1

...

Ip

lq

P: x>0 y<10

Q: w = 0

Old program

Invalidated by Phase 1

Page 20: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Phase 2: Predicate-Sensitive Change Impact Analysis

20

...if (x > 0) { if (y == 10) w++; // MODIFIED else w = 0;}else { w = 1; // MODIFIED}...

Old program

void foo() {

return;}

Ip

lq

P: x>0 y<10

Q: w = 0

Page 21: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

goto lp;...assume P; modified = false;if (x > 0) { if (y == 10) { modified = true; w++; } else w = 0;}else { modified = true; w = 1; }assert(Q ¬modified);...

Phase 2: Predicate-Sensitive Change Impact Analysis

21Instrumented old program

1

2

4

3

3

void foo() {

return;}

Ip

lq

P: x>0 y<10

Q: w = 0

Page 22: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Phase 2: Predicate-Sensitive Change Impact Analysis

• Check assertion in instrumented code does not fail for all possible inputs

22

• Verification-condition based program verifiero Create logic formula from program with assertionso Check formula validity using theorem provero If valid, the assertion does not fail in any execution

Page 23: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Phase 3: Must Summary Validity Checking

23

• Check must summary validity against some code, independently of code changes

if(x < 0)

if (y < 0)

r = 1 r = 0 w = 1

...

Ip

lq

P: x < 0

Q: r 0

Old program

r = 4

New program

Invalidated by Phase 1 and Phase 2

Page 24: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Phase 3: Must Summary Validity Checking

24

...if (x < 0) { if (y < 0) r = 1; else { r = 4; // r = 0 in old code }}...

New program

void bar() {

return;}

Ip

lq

P: x < 0

Q: r 0

Page 25: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Phase 3: Must Summary Validity Checking

25

reach_lq = false; goto lp;...assume P;if (x < 0) { if (y < 0) r = 1; else { r = 4; // r = 0 in old code }}assert(Q); reach_lq = true;...assert(reach_lq);

Instrumented new program

1

2

3

4

void bar() {

return;}

Ip

lq

P: x < 0

Q: r 0

Page 26: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Phase 3: Must Summary Validity Checking

• Check that assertions hold in the instrumented program for all possible inputs

26

Page 27: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Result

27

Validated summaries can be reusedo Because of soundness

Invalidated summaries are discarded and need to be recomputed

o New tests are generated to cover their preconditions

Algorithms can be used in isolation or in a pipeline

Page 28: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Experimental Results

28

Page 29: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Implementation Details

29

Map summaries, find modified insts

and funcs (C++)

Old DLL SummariesOld

DLLOld DLL

NewDLLNew

DLLNewDLL

Vulcan

Produced by SAGE

Phase 1Change Impact

Phase 2Predicate Sensitive

Phase 3Validity Checking

Valid/Invalid Summaries

Library to statically analyze Windows binaries

Used in pipeline or isolation

Page 30: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Implementation Details

Translator from X86 to BoogiePL

Procedure (x86)

Vulcan

Summary ⟨lp,P,lq,Q⟩

Sound translation

Instrumented BPL file (Phase 2 or Phase 3)

Boogie/Z3

Page 31: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Benchmarks

31

• Image parsers embedded in Windows o ANI, GIF and JPEG

• Ran SAGE to generate summaries (small sample)o 286 for ANI, 288 for GIF and 517 for JPEG

• Identified the DLLs involvedo 3 for ANI, 4 for GIF and 8 for JPEG

• Compared old version against a randomly picked newer versiono Delta ~1 to 3 years

Page 32: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Difference Between Program Versions

32

ANI GIF JPEG0

5000

10000

15000

20000

25000

6978

13897

20357

Number of Functions per Benchmark

Modified functions: 3% - 10% Indirectly modified functions: 30% - 45%

Unknown functions: 27% - 37% Indirectly unknown functions: 60% - 74%

Page 33: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Applying Phases in Isolation

33

Phase 1 Phase 2 Phase 30

50

100

150

200

250

300

167

244

86

ANI (286 summaries)

Phase 1 Phase 2 Phase 30

50

100

150

200

250

300

198

264

90

GIF (288 summaries)

Phase 1 Phase 2 Phase 30

100

200

300

400

500

600

317

487

173

JPEG (517 summaries)

# Va

lidat

ed S

umm

arie

s

# Va

lidat

ed S

umm

arie

s

# Va

lidat

ed S

umm

arie

s

58% 85% 30% 69% 92% 31%

61% 94% 33%

Total Validated: 256/286 (90%)

Total Validated: 274/288 (95%)

Total Validated: 501/517 (97%)

Phase 1: Change ImpactPhase 2: Predicate SensitivePhase 3: Validity Checking

Page 34: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Applying Phases in Pipeline FashionPhase 1 → Phase 2 → Phase 3

34

Phase 1 Phase 2 Phase 30

20406080

100120140160180

167

77

12

ANI (286 summaries)

Phase 1 Phase 2 Phase 30

50

100

150

200

250

198

73

3

GIF (288 summaries)

Phase 1 Phase 2 Phase 30

50

100

150

200

250

300

350

317

179

5

JPEG (517 summaries)

# Va

lidat

ed S

umm

arie

s

# Va

lidat

ed S

umm

arie

s

# Va

lidat

ed S

umm

arie

s

58% 27% 4%

Total Validated: 256/286 (90%)

69% 25% 1%

61% 35% 1%

Total Validated: 274/288 (95%)

Total Validated: 501/517 (97%)

Phase 1: Change ImpactPhase 2: Predicate SensitivePhase 3: Validity Checking

Page 35: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Phase 1 Phase 2 Phase 30

5

10

15

20

25

30

35

40

12

31

37

JPEG

Phase 1 Phase 2 Phase 30

5

10

15

20

25

30

35

40

8

23

35

GIF

Phase 1 Phase 2 Phase 305

1015202530354045

5

3742

ANI

Running Time (Isolation)

35

# M

inut

es

# M

inut

es

# M

inut

es

Phase 1: Change ImpactPhase 2: Predicate SensitivePhase 3: Validity Checking

Page 36: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Running Time Phase 1 → Phase 2 → Phase 3

36

# M

inut

es

43 min 28min 41min

Preliminary results show that statically validating must summaries is up to 20 times faster than recomputing them!

ANI GIF JPEG0

5

10

15

20

25

30

35

40

45

50

Running Time (Pipeline)

Phase 3Phase 2Phase 1Mapping, etc.

Phase 1: Change ImpactPhase 2: Predicate SensitivePhase 3: Validity Checking

Page 37: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Summary• Formulated the problem of statically validating must

summaries

37

• Demonstrated the effectiveness of static must summary checkingo Validated hundreds of must summaries in minutes

• Described three approaches for validating must summaries

• Presented a preliminary evaluation on three large Windows image parsers

Page 38: Statically Validating Must Summaries for Incremental Compositional Dynamic Test Generation Patrice Godefroid Shuvendu K. Lahiri Cindy Rubio-González International.

Questions?

38

Map summaries, find modified insts

and funcs (C++)

Old DLL SummariesOld

DLLOld DLL

NewDLLNew

DLLNewDLL

Vulcan

Phase 1Change Impact

Phase 2Predicate Sensitive

Phase 3Validity Checking

Valid/Invalid Summaries