Automatic program repair using genetic programming

161
AUTOMATIC PROGRAM REPAIR USING GENETIC PROGRAMMING 1 http:// www.clairelegoues.com CLAIRE LE GOUES FEBRUARY 12, 2013

description

Automatic program repair using genetic programming. Claire Le Goues February 12, 2013. How do humans fix bugs?. Mike. Mike (recent undergrad). (Mike’s project). regression test cases. (modified code). ??!. Now what?. How do humans fix bugs?. printf transformer. printf transformer. - PowerPoint PPT Presentation

Transcript of Automatic program repair using genetic programming

Page 1: Automatic program repair using genetic programming

AUTOMATIC PROGRAM REPAIR USING GENETIC PROGRAMMING

1http://www.clairelegoues.com

CLAIRE LE GOUESFEBRUARY 12, 2013

Page 2: Automatic program repair using genetic programming

Claire Le Goues 2

HOW DO HUMANS FIX BUGS?

http://www.clairelegoues.com

Page 3: Automatic program repair using genetic programming

Claire Le Goues 3http://www.clairelegoues.com

Page 4: Automatic program repair using genetic programming

Claire Le Goues 4

Mike

http://www.clairelegoues.com

Page 5: Automatic program repair using genetic programming

Claire Le Goues 5http://www.clairelegoues.com

Mike(recent undergrad)

Page 6: Automatic program repair using genetic programming

Claire Le Goues 6http://www.clairelegoues.com

(Mike’s project)

Page 7: Automatic program repair using genetic programming

Claire Le Goues 7http://www.clairelegoues.com

Page 8: Automatic program repair using genetic programming

Claire Le Goues 8http://www.clairelegoues.com

regression test cases

Page 9: Automatic program repair using genetic programming

Claire Le Goues 9http://www.clairelegoues.com

Page 10: Automatic program repair using genetic programming

Claire Le Goues 10http://www.clairelegoues.com

Page 11: Automatic program repair using genetic programming

Claire Le Goues 11http://www.clairelegoues.com

(modified code)

Page 12: Automatic program repair using genetic programming

Claire Le Goues 12http://www.clairelegoues.com

Page 13: Automatic program repair using genetic programming

Claire Le Goues 13http://www.clairelegoues.com

Page 14: Automatic program repair using genetic programming

Claire Le Goues 14http://www.clairelegoues.com

??!

Page 15: Automatic program repair using genetic programming

Claire Le Goues 15http://www.clairelegoues.com

NOW WHAT?

Page 16: Automatic program repair using genetic programming

Claire Le Goues 16http://www.clairelegoues.com

HOW DO HUMANS FIX BUGS?

Page 17: Automatic program repair using genetic programming

Claire Le Goues 17http://www.clairelegoues.com

Page 18: Automatic program repair using genetic programming

Claire Le Goues 18http://www.clairelegoues.com

printf transformer

Page 19: Automatic program repair using genetic programming

Claire Le Goues 19http://www.clairelegoues.com

printf transformer

Page 20: Automatic program repair using genetic programming

Claire Le Goues 20http://www.clairelegoues.com

Input:

2

5 6

1

3 4

8

7

9

11

10

12

Page 21: Automatic program repair using genetic programming

Claire Le Goues 21http://www.clairelegoues.com

Input:

2

5 6

1

3 4

8

7

9

11

10

12

Legend:Likely faulty. probabilityMaybe faulty. probabilityNot faulty.

Page 22: Automatic program repair using genetic programming

Claire Le Goues 22http://www.clairelegoues.com

PROBLEM: BUGGY SOFTWARE

“Everyday, almost 300 bugs appear […] far too many for only the Mozilla programmers to handle.”

– Mozilla Developer, 2005

Annual cost of software errors in the

US: $59.5 billion (0.6% of GDP).

90%: Maintenance

10%: Everything Else

Average time to fix a security-critical error:

28 days.

Page 23: Automatic program repair using genetic programming

Claire Le Goues 23http://www.clairelegoues.com

HOW BAD IS IT?

Page 24: Automatic program repair using genetic programming

Claire Le Goues 24http://www.clairelegoues.com

Page 25: Automatic program repair using genetic programming

Claire Le Goues 25http://www.clairelegoues.com

Page 26: Automatic program repair using genetic programming

Claire Le Goues 26http://www.clairelegoues.com

Tarsnap: 125 spelling/style 63 harmless 11 minor+ 1 major

75/200 = 38% TP rate$17 + 40 hours per TP

…REALLY?

Page 27: Automatic program repair using genetic programming

Claire Le Goues 27http://www.clairelegoues.com

Tarsnap: 125 spelling/style 63 harmless 11 minor+ 1 major

75/200 = 38% TP rate$17 + 40 hours per TP

…REALLY?

Page 28: Automatic program repair using genetic programming

Claire Le Goues 28http://www.clairelegoues.com

Tarsnap: 125 spelling/style 63 harmless 11 minor+ 1 major

75/200 = 38% TP rate$17 + 40 hours per TP

…REALLY?

Page 29: Automatic program repair using genetic programming

Claire Le Goues 29http://www.clairelegoues.com

SOLUTION: PAY STRANGERS

Page 30: Automatic program repair using genetic programming

Claire Le Goues 30http://www.clairelegoues.com

SOLUTION: PAY STRANGERS

Page 31: Automatic program repair using genetic programming

Claire Le Goues 31http://www.clairelegoues.com

SOLUTION: AUTOMATE

Page 32: Automatic program repair using genetic programming

Claire Le Goues 32http://www.clairelegoues.com

GENPROG: AUTOMATIC, SCALABLE, COMPETITIVE BUG REPAIR.

Page 33: Automatic program repair using genetic programming

Claire Le Goues 33http://www.clairelegoues.com

GENPROG: AUTOMATIC, SCALABLE, COMPETITIVE BUG REPAIR.

Page 34: Automatic program repair using genetic programming

Claire Le Goues 34http://www.clairelegoues.com

Input: program, test cases encoding required functionality, at least one (failing!) test case encoding the bug.Approach: use genetic programming to conduct a biased, random search for a set of edits to a program that fixes a given bug.

THE PLAN

Page 35: Automatic program repair using genetic programming

Claire Le Goues http://www.clairelegoues.com 35

GENETIC PROGRAMMING: the application of evolutionary or

genetic algorithms to program source code.

Page 36: Automatic program repair using genetic programming

Claire Le Goues 36

Population of variants.Fitness function evaluates desirability.Desirable individuals are more likely to be selected for iteration and reproduction.New variants created via:

GENETIC PROGRAMMING

http://www.clairelegoues.com

• Mutation • Crossover

Page 37: Automatic program repair using genetic programming

Claire Le Goues 37

Population of variants.Fitness function evaluates desirability.Desirable individuals are more likely to be selected for iteration and reproduction.New variants created via:

GENETIC PROGRAMMING

http://www.clairelegoues.com

• Mutation • Crossover ABCDEF ABADEF

Page 38: Automatic program repair using genetic programming

Claire Le Goues 38

Population of variants.Fitness function evaluates desirability.Desirable individuals are more likely to be selected for iteration and reproduction.New variants created via:

GENETIC PROGRAMMING

http://www.clairelegoues.com

• Mutation • Crossover ABCDEF ABCWVU

ZYXWVU ZYXDEF

Page 39: Automatic program repair using genetic programming

Claire Le Goues 39

INPUT

OUTPUT

EVALUATE FITNESS

DISCARD

ACCEPT

MUTATE

Page 40: Automatic program repair using genetic programming

Claire Le Goues 40http://www.clairelegoues.com

There are many ways to fix any bug• Upshot: coarse-grained edits.

Not every line of code is equally likely to contribute to the bug.

• Upshot: use test cases to refine mutation probabilities.

Programs contain the seeds of bug repair; programmers know what they’re doing.

• Upshot: do not invent new code.

SECRET SAUCES

Page 41: Automatic program repair using genetic programming

Claire Le Goues 41

EVALUATE FITNESS

MUTATE

INPUT

OUTPUT

ACCEPT

DISCARD

Page 42: Automatic program repair using genetic programming

Claire Le Goues 42http://www.clairelegoues.com

Compile and run each candidate:• Fitness = weighted count of passed test cases.

• Heuristic: passing the bug test is worth more.• If it fails to compile, fitness = 0.

Selection and Crossover: higher fitness variants are (randomly) retained and combined into the next generation.

FITNESS

Page 43: Automatic program repair using genetic programming

Claire Le Goues 43MUTATE

DISCARD

INPUT EVALUATE FITNESS

ACCEPT

OUTPUT

Page 44: Automatic program repair using genetic programming

Claire Le Goues 44http://www.cs.virginia.edu/~csl9q

Search: random (GP) search through population of patches.Approach: compose small random edits.

•Where to change?•How to change it?

MUTATION: BIRD’S EYE VIEW

Page 45: Automatic program repair using genetic programming

Claire Le Goues 45http://www.clairelegoues.com

Use the test cases to associate bug with code:1. Instrument program.2. Record which statements are executed on

failing vs. passing test cases. 3. Weight statements accordingly.

Initial weighting heuristic:• High (1.0) if S is only visited on a failed test • Low (0.1) if S is visited on both failed and passed tests

• Zero (0.0) if S is not visited on any failed test.

MUTATION: WHERE

Page 46: Automatic program repair using genetic programming

Claire Le Goues 46http://www.clairelegoues.com

Three statement-level edits: • delete X• replace X with Y• insert Y after X.

To mutate a variant:1. Choose a random program

statement S.2. Choose a random edit to apply

to S.3. Append edit to the variant.

Replace/insert: pick Y from somewhere else in the program.

MUTATION: HOW

Page 47: Automatic program repair using genetic programming

Claire Le Goues 47http://www.clairelegoues.com

Three statement-level edits: • delete X• replace X with Y• insert Y after X.

To mutate a variant:1. Choose a random program

statement S.2. Choose a random edit to apply

to S.3. Append edit to the variant.

Replace/insert: pick Y from somewhere else in the program.

MUTATION: HOW Coarse! Reduces search space by at

least 2—10x.

Page 48: Automatic program repair using genetic programming

Claire Le Goues 48http://www.clairelegoues.com

Three statement-level edits: • delete X• replace X with Y• insert Y after X.

To mutate a variant:1. Choose a random program

statement S.2. Choose a random edit to apply

to S.3. Append edit to the variant.

Replace/insert: pick Y from somewhere else in the program.

MUTATION: HOW Coarse! Reduces search space by at

least 2—10x.

Page 49: Automatic program repair using genetic programming

Claire Le Goues 49http://www.clairelegoues.com

Three statement-level edits: • delete X• replace X with Y• insert Y after X.

To mutate a variant:1. Choose a random program

statement S.2. Choose a random edit to apply

to S.3. Append edit to the variant.

Replace/insert: pick Y from somewhere else in the program.

MUTATION: HOW Coarse! Reduces search space by at

least 2—10x.

Biased by fault localization.

Page 50: Automatic program repair using genetic programming

Claire Le Goues 50http://www.clairelegoues.com

Three statement-level edits: • delete X• replace X with Y• insert Y after X.

To mutate a variant:1. Choose a random program

statement S.2. Choose a random edit to apply

to S.3. Append edit to the variant.

Replace/insert: pick Y from somewhere else in the program.

MUTATION: HOW Coarse! Reduces search space by at

least 2—10x.

Biased by fault localization.

Selection probabilities set

heuristically.

Page 51: Automatic program repair using genetic programming

Claire Le Goues 51http://www.clairelegoues.com

Three statement-level edits: • delete X• replace X with Y• insert Y after X.

To mutate a variant:1. Choose a random program

statement S.2. Choose a random edit to apply

to S.3. Append edit to the variant.

Replace/insert: pick Y from somewhere else in the program.

MUTATION: HOW Coarse! Reduces search space by at

least 2—10x.

Biased by fault localization.

Selection probabilities set

heuristically.

Page 52: Automatic program repair using genetic programming

Claire Le Goues 52http://www.clairelegoues.com

Three statement-level edits: • delete X• replace X with Y• insert Y after X.

To mutate a variant:1. Choose a random program

statement S.2. Choose a random edit to apply

to S.3. Append edit to the variant.

Replace/insert: pick Y from somewhere else in the program.

MUTATION: HOW Coarse! Reduces search space by at

least 2—10x.

Biased by fault localization.

Selection probabilities set

heuristically.

Code/expertise reuse.

Page 53: Automatic program repair using genetic programming

Claire Le Goues 53http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

>

Page 54: Automatic program repair using genetic programming

Claire Le Goues 54http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

> gcd(4,2)

Page 55: Automatic program repair using genetic programming

Claire Le Goues 55http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

> gcd(4,2)> 2>

Page 56: Automatic program repair using genetic programming

Claire Le Goues 56http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

> gcd(4,2)> 2> > gcd(1071,1029)

Page 57: Automatic program repair using genetic programming

Claire Le Goues 57http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

> gcd(4,2)> 2> > gcd(1071,1029)> 21>

Page 58: Automatic program repair using genetic programming

Claire Le Goues 58http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

> gcd(4,2)> 2> > gcd(1071,1029)> 21> > gcd(0,55)

Page 59: Automatic program repair using genetic programming

Claire Le Goues 59http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

> gcd(4,2)> 2> > gcd(1071,1029)> 21> > gcd(0,55)> 55

(looping forever)

Page 60: Automatic program repair using genetic programming

Claire Le Goues 60http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

Page 61: Automatic program repair using genetic programming

Claire Le Goues 61http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

(a=0; b=55)

Page 62: Automatic program repair using genetic programming

Claire Le Goues 62http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

(a=0; b=55)true

Page 63: Automatic program repair using genetic programming

Claire Le Goues 63http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

(a=0; b=55)true> 55

Page 64: Automatic program repair using genetic programming

Claire Le Goues 64http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

(a=0; b=55)true> 55

(a=0; b=55) true

Page 65: Automatic program repair using genetic programming

Claire Le Goues 65http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

(a=0; b=55)true> 55

(a=0; b=55) truefalse

Page 66: Automatic program repair using genetic programming

Claire Le Goues 66http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

(a=0; b=55)true> 55

(a=0; b=55) truefalse

b = 55 - 0

Page 67: Automatic program repair using genetic programming

Claire Le Goues 67http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

(a=0; b=55)true> 55

(a=0; b=55) truefalse

b = 55 - 0

Page 68: Automatic program repair using genetic programming

Claire Le Goues 68http://www.clairelegoues.com

1void gcd(int a, int b) {2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

(a=0; b=55)true> 55

(a=0; b=55) truefalse

b = 55 - 0

!

Page 69: Automatic program repair using genetic programming

Claire Le Goues 69http://www.clairelegoues.com

1void gcd(int a, int b){2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

Page 70: Automatic program repair using genetic programming

Claire Le Goues 70http://www.clairelegoues.com

1void gcd(int a, int b){2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

Page 71: Automatic program repair using genetic programming

Claire Le Goues 71http://www.clairelegoues.com

1void gcd(int a, int b){2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

Page 72: Automatic program repair using genetic programming

Claire Le Goues 72http://www.clairelegoues.com

1void gcd(int a, int b){2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

1void gcd(int a, int b){2 if (a == 0) {3 printf(“%d”, b);4 return;5 }6 while (b > 0) {7 if (a > b) 8 a = a – b;9 else10 b = b – a;11 }12 printf(“%d”, a);13 return;14}

Page 73: Automatic program repair using genetic programming

Claire Le Goues 73http://www.clairelegoues.com

1void gcd(int a, int b){2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

1void gcd(int a, int b){2 if (a == 0) {3 printf(“%d”, b);4 return;5 }6 while (b > 0) {7 if (a > b) 8 a = a – b;9 else10 b = b – a;11 }12 printf(“%d”, a);13 return;14}

Page 74: Automatic program repair using genetic programming

Claire Le Goues 74http://www.clairelegoues.com

1void gcd(int a, int b){2 if (a == 0) {3 printf(“%d”, b);4 }5 while (b > 0) {6 if (a > b) 7 a = a – b;8 else9 b = b – a;10 }11 printf(“%d”, a);12 return;13}

Page 75: Automatic program repair using genetic programming

Claire Le Goues 75http://www.clairelegoues.com

printf(b)

{block}

while (b>0)

{block}{block} {block}

if(a==0)

if(a>b)

a = a – b

{block}{block}

printf(a) return

b = b – a

Input:

Page 76: Automatic program repair using genetic programming

Claire Le Goues 76http://www.clairelegoues.com

printf(b)

{block}

while (b>0)

{block}{block} {block}

if(a==0)

if(a>b)

a = a – b

{block}{block}

printf(a) return

b = b – a

Input:

Legend:High change probability.Low change probability.Not changed.

Page 77: Automatic program repair using genetic programming

Claire Le Goues 77http://www.clairelegoues.com

printf(b)

{block}

while (b>0)

{block}{block} {block}

if(a==0)

if(a>b)

a = a – b

{block}{block}

printf(a) return

b = b – a

Input:

Legend:High change probability.Low change probability.Not changed.

Page 78: Automatic program repair using genetic programming

Claire Le Goues 78http://www.clairelegoues.com

printf(b)

{block}

while (b>0)

{block}{block} {block}

if(a==0)

if(a>b)

a = a – b

{block}{block}

printf(a) return

b = b – a

Input:

An edit is: • Replace statement

X with statement Y • Insert statement X

after statement Y • Delete statement X

Page 79: Automatic program repair using genetic programming

Claire Le Goues 79http://www.clairelegoues.com

printf(b)

{block}

while (b>0)

{block}{block} {block}

if(a==0)

if(a>b)

a = a – b

{block}{block}

printf(a) return

b = b – a

Input:

An edit is: • Replace statement

X with statement Y • Insert statement X

after statement Y • Delete statement X

Page 80: Automatic program repair using genetic programming

Claire Le Goues 80http://www.clairelegoues.com

printf(b)

{block}

while (b>0)

{block}{block} {block}

if(a==0)

if(a>b)

a = a – b

{block}{block}

printf(a) return

b = b – a

Input:

An edit is: • Replace statement

X with statement Y • Insert statement X

after statement Y • Delete statement X

Page 81: Automatic program repair using genetic programming

Claire Le Goues 81http://www.clairelegoues.com

printf(b)

{block}

while (b>0)

{block}{block} {block}

if(a==0)

if(a>b)

a = a – b

{block}{block}

printf(a) return

b = b – a

Input:

An edit is: • Replace statement

X with statement Y • Insert statement X

after statement Y • Delete statement X

Page 82: Automatic program repair using genetic programming

Claire Le Goues 82http://www.clairelegoues.com

printf(b)

{block}

while (b>0)

{block}{block} {block}

if(a==0)

if(a>b)

a = a – b

{block}{block}

printf(a) return

b = b – a

Input:

An edit is: • Replace statement

X with statement Y • Insert statement X

after statement Y • Delete statement X

Page 83: Automatic program repair using genetic programming

Claire Le Goues 83http://www.clairelegoues.com

return

{block}

while (b>0)

{block}{block} {block}

if(a==0)

if(a>b)

a = a – b

{block}{block}

printf(a) return

b = b – a

Input:

An edit is: • Replace statement

X with statement Y • Insert statement X

after statement Y • Delete statement X

Page 84: Automatic program repair using genetic programming

Claire Le Goues 84http://www.clairelegoues.com

return

{block}

while (b>0)

{block}{block} {block}

if(a==0)

if(a>b)

a = a – b

{block}{block}

printf(a) return

b = b – a

Input:

An edit is: • Replace statement

X with statement Y • Insert statement X

after statement Y • Delete statement X

Page 85: Automatic program repair using genetic programming

Claire Le Goues 85

INPUT

OUTPUT

EVALUATE FITNESS

DISCARD

ACCEPT

MUTATE

Page 86: Automatic program repair using genetic programming

Claire Le Goues 86http://www.clairelegoues.com

What is it?• GenProg: automatic program repair.• Approach: search for a patch to a program with a bug.

Does it work?

WHERE ARE WE?

Page 87: Automatic program repair using genetic programming

Claire Le Goues

Program Description Size Bug Type Time

Page 88: Automatic program repair using genetic programming

Claire Le Goues

Program Description Size Bug Type Timegcd example 22 infinite loop 153

Page 89: Automatic program repair using genetic programming

Claire Le Goues

Program Description Size Bug Type Timegcd example 22 infinite loop 153zune example 28 infinite loop 42

Page 90: Automatic program repair using genetic programming

Claire Le Goues

Program Description Size Bug Type Timegcd example 22 infinite loop 153zune example 28 infinite loop 42uniq text processing 1146 segmentation fault 34look-u dictionary lookup 1169 segmentation fault 45look-s dictionary lookup 1363 infinite loop 55units metric conversion 1504 segmentation fault 109deroff document processing 2236 segmentation fault 131indent code processing 9906 infinite loop 546flex lexical analyzer generator 18774 segmentation fault 230

Page 91: Automatic program repair using genetic programming

Claire Le Goues

Program Description Size Bug Type Timegcd example 22 infinite loop 153zune example 28 infinite loop 42uniq text processing 1146 segmentation fault 34look-u dictionary lookup 1169 segmentation fault 45look-s dictionary lookup 1363 infinite loop 55units metric conversion 1504 segmentation fault 109deroff document processing 2236 segmentation fault 131indent code processing 9906 infinite loop 546flex lexical analyzer generator 18774 segmentation fault 230nullhttpd webserver 5575 heap buffer overflow (code) 578openldap directory protocol 292598 non-overflow denial of service 665ccrypt encryption utility 7515 segmentation fault 330lighttpd webserver 51895 heap buffer overflow (vars) 394atris graphical game 21553 local stack buffer exploit 80php scripting language 764489 integer overflow 56wu-ftpd FTP server 67029 format string vulnerability 2256leukocyte computational biology 6718 segmentation fault 360

tiff image processing 84067 segmentation fault 108imagemagick image processing 450516 wrong output 2160

Page 92: Automatic program repair using genetic programming

Claire Le Goues

Program Description Size Bug Type Timegcd example 22 infinite loop 153zune example 28 infinite loop 42uniq text processing 1146 segmentation fault 34look-u dictionary lookup 1169 segmentation fault 45look-s dictionary lookup 1363 infinite loop 55units metric conversion 1504 segmentation fault 109deroff document processing 2236 segmentation fault 131indent code processing 9906 infinite loop 546flex lexical analyzer generator 18774 segmentation fault 230nullhttpd webserver 5575 heap buffer overflow (code) 578openldap directory protocol 292598 non-overflow denial of service 665ccrypt encryption utility 7515 segmentation fault 330lighttpd webserver 51895 heap buffer overflow (vars) 394atris graphical game 21553 local stack buffer exploit 80php scripting language 764489 integer overflow 56wu-ftpd FTP server 67029 format string vulnerability 2256leukocyte computational biology 6718 segmentation fault 360

tiff image processing 84067 segmentation fault 108imagemagick image processing 450516 wrong output 2160

Page 93: Automatic program repair using genetic programming

Claire Le Goues

Program Description Size Bug Type Timegcd example 22 infinite loop 153zune example 28 infinite loop 42uniq text processing 1146 segmentation fault 34look-u dictionary lookup 1169 segmentation fault 45look-s dictionary lookup 1363 infinite loop 55units metric conversion 1504 segmentation fault 109deroff document processing 2236 segmentation fault 131indent code processing 9906 infinite loop 546flex lexical analyzer generator 18774 segmentation fault 230nullhttpd webserver 5575 heap buffer overflow (code) 578openldap directory protocol 292598 non-overflow denial of service 665ccrypt encryption utility 7515 segmentation fault 330lighttpd webserver 51895 heap buffer overflow (vars) 394atris graphical game 21553 local stack buffer exploit 80php scripting language 764489 integer overflow 56wu-ftpd FTP server 67029 format string vulnerability 2256leukocyte computational biology 6718 segmentation fault 360

tiff image processing 84067 segmentation fault 108imagemagick image processing 450516 wrong output 2160

Page 94: Automatic program repair using genetic programming

Claire Le Goues

Program Description Size Bug Type Timegcd example 22 infinite loop 153zune example 28 infinite loop 42uniq text processing 1146 segmentation fault 34look-u dictionary lookup 1169 segmentation fault 45look-s dictionary lookup 1363 infinite loop 55units metric conversion 1504 segmentation fault 109deroff document processing 2236 segmentation fault 131indent code processing 9906 infinite loop 546flex lexical analyzer generator 18774 segmentation fault 230nullhttpd webserver 5575 heap buffer overflow (code) 578openldap directory protocol 292598 non-overflow denial of service 665ccrypt encryption utility 7515 segmentation fault 330lighttpd webserver 51895 heap buffer overflow (vars) 394atris graphical game 21553 local stack buffer exploit 80php scripting language 764489 integer overflow 56wu-ftpd FTP server 67029 format string vulnerability 2256leukocyte computational biology 6718 segmentation fault 360

tiff image processing 84067 segmentation fault 108imagemagick image processing 450516 wrong output 2160

Page 95: Automatic program repair using genetic programming

Claire Le Goues 95http://www.clairelegoues.com

What is it?• GenProg: automatic program repair.• Approach: search for a patch to a program with a

bug.Does it work?

• Yes: repairs a variety of bugs in a variety of programs.

Cool! But: is it really usable?• Are the repairs trustworthy?• Is it human-competitive?• Can we make it better?

WHERE ARE WE?

Page 96: Automatic program repair using genetic programming

Claire Le Goues 96http://www.clairelegoues.com

What is it?• GenProg: automatic program repair.• Approach: search for a patch to a program with a

bug.Does it work?

• Yes: repairs a variety of bugs in a variety of programs.

Cool! But: is it really usable?• Are the repairs trustworthy?• Is it human-competitive?• Can we make it better?

WHERE ARE WE?

Page 97: Automatic program repair using genetic programming

Claire Le Goues 97http://www.clairelegoues.com

What is it?• GenProg: automatic program repair.• Approach: search for a patch to a program with a

bug.Does it work?

• Yes: repairs a variety of bugs in a variety of programs.

Cool! But: is it really usable?• Are the repairs trustworthy?• Is it human-competitive?• Can we make it better?

WHERE ARE WE?

Page 98: Automatic program repair using genetic programming

Claire Le Goues 98http://www.clairelegoues.com

What is it?• GenProg: automatic program repair.• Approach: search for a patch to a program with a

bug.Does it work?

• Yes: repairs a variety of bugs in a variety of programs.

Cool! But: is it really usable?• Are the repairs trustworthy?• Is it human-competitive?• Can we make it better?

WHERE ARE WE?

Page 99: Automatic program repair using genetic programming

Claire Le Goues 99http://www.clairelegoues.com

What is it?• GenProg: automatic program repair.• Approach: search for a patch to a program with a

bug.Does it work?

• Yes: repairs a variety of bugs in a variety of programs.

Cool! But: is it really usable?• Are the repairs trustworthy?• Is it human-competitive?• Can we make it better?

WHERE ARE WE?

Page 100: Automatic program repair using genetic programming

Claire Le Goues 100http://www.clairelegoues.com

Recall: any proposed repair must pass all regression test cases, ensuring it maintains required functionality.

REPAIR QUALITY

However, repairs are not always what a human would have done.

• Example: Adds a bounds check to a read, rather than refactoring to use a safe abstract string class.

Are they still acceptable?

Page 101: Automatic program repair using genetic programming

Claire Le Goues 101http://www.clairelegoues.com

What makes a high-quality repair?• Retains required functionality.• Does not introduce new bugs.• Addresses the cause, not just the symptom.

Experimental scenario: a long-running server with an intrusion detection system that will generate/deploy repairs for detected anomalies.

• Worst-case: no humans around to vet the repairs!

REPAIR QUALITY EXPERIMENT

Page 102: Automatic program repair using genetic programming

Claire Le Goues 102http://www.clairelegoues.com

One web application language interpreter with an integer overflow vulnerability:

• php Workload: 15kloc secure reservation system web app, 12,375 requests.

REPAIR QUALITY BENCHMARKS Two webservers with buffer overflows:

• nullhttpd (simple, multithreaded)

• lighttpd (used by Wikimedia, etc.)

Workload: 138,226 requests from 12,743 distinct client IP addresses.Workload held out in both cases; one day of data.

Page 103: Automatic program repair using genetic programming

Claire Le Goues 103http://www.clairelegoues.com

Apply indicative workloads to vanilla servers.• Record results and times.

Send attack input. • Caught by intrusion detection system.

Generate, deploy repair using attack input and regression test cases.Apply indicative workload to patched server. Compare requests processed pre- and post- repair.

• Each request must yield exactly the same output (bit-per-bit) in the same time or less!

EXPERIMENTAL SETUP

Page 104: Automatic program repair using genetic programming

Claire Le Goues 104http://www.clairelegoues.com

Program Post-patch requests lostnullhttpd 0.00 % ± 0.25%lighttpd 0.03% ± 1.53%php 0.02% ± 0.02%

REPAIR QUALITY RESULTS

Page 105: Automatic program repair using genetic programming

Claire Le Goues 105http://www.clairelegoues.com

Program Post-patch requests lostFuzz Tests FailedGeneral

nullhttpd 0.00 % ± 0.25% 0 0 lighttpd 0.03% ± 1.53% 1410 1410 php 0.02% ± 0.02% 3 3

REPAIR QUALITY RESULTS

Page 106: Automatic program repair using genetic programming

Claire Le Goues 106http://www.clairelegoues.com

Program Post-patch requests lostFuzz Tests FailedGeneral Exploit

nullhttpd 0.00 % ± 0.25% 0 0 lighttpd 0.03% ± 1.53% 1410 1410 php 0.02% ± 0.02% 3 3

REPAIR QUALITY RESULTS

Page 107: Automatic program repair using genetic programming

Claire Le Goues 107http://www.clairelegoues.com

Program Post-patch requests lostFuzz Tests FailedGeneral Exploit

nullhttpd 0.00 % ± 0.25% 0 0 10 0lighttpd 0.03% ± 1.53% 1410 1410 9 0php 0.02% ± 0.02% 3 3 5 0

REPAIR QUALITY RESULTS

Page 108: Automatic program repair using genetic programming

Claire Le Goues 108http://www.clairelegoues.com

What is it?• GenProg: automatic program repair.• Approach: search for a patch to a program with a

bug.Does it work?

• Yes: repairs a variety of bugs in a variety of programs.

Cool! But: is it really usable?• Are the repairs trustworthy?• Is it human-competitive?• Can we make it better?

WHERE ARE WE?

Page 109: Automatic program repair using genetic programming

Claire Le Goues http://www.clairelegoues.com

COMPETITIVE

109

How many bugs can GenProg fix?

How much does it cost?

Page 110: Automatic program repair using genetic programming

Claire Le Goues 110http://www.clairelegoues.com

Goal: systematically evaluate GenProg on a general, indicative bug set.General approach:

• Avoid overfitting: fix the algorithm. • Systematically create a generalizable benchmark set.

• Try to repair every bug in the benchmark set, establish grounded cost measurements.

SETUP

Page 111: Automatic program repair using genetic programming

Claire Le Goues 111http://www.clairelegoues.com

Goal: systematically evaluate GenProg on a general, indicative bug set.General approach:

• Avoid overfitting: fix the algorithm. • Systematically create a generalizable benchmark set.

• Try to repair every bug in the benchmark set, establish grounded cost measurements.

SETUP

Page 112: Automatic program repair using genetic programming

Claire Le Goues 112http://www.clairelegoues.com

CHALLENGE: INDICATIVE BUG SET

Page 113: Automatic program repair using genetic programming

Claire Le Goues 113http://www.clairelegoues.com

Goal: a large set of important, reproducible bugs in non-trivial programs. Approach: use historical data to approximate discovery and repair of bugs in the wild.

SYSTEMATIC BENCHMARK SELECTION

Page 114: Automatic program repair using genetic programming

Claire Le Goues 114http://www.clairelegoues.com

Consider top programs from SourceForge, Google Code, Fedora SRPM, etc:

• Find pairs of viable versions where test case behavior changes.

• Take all tests from most recent version.• Go back in time through the source control.

Corresponds to a human-written repair for the bug tested by the failing test case(s).

SYSTEMATIC BENCHMARK SELECTION

Page 115: Automatic program repair using genetic programming

Claire Le Goues 115http://www.clairelegoues.com

BENCHMARKSProgram LOC Tests Bugs Descriptionfbc 97,000 773 3 Language (legacy)gmp 145,000 146 2 Multiple precision mathgzip 491,000 12 5 Data compressionlibtiff 77,000 78 24 Image manipulationlighttpd 62,000 295 9 Web serverphp 1,046,000 8,471 44 Language (web)python 407,000 355 11 Language (general)wireshark 2,814,000 63 7 Network packet analyzerTotal 5,139,000 10,193 105

Page 116: Automatic program repair using genetic programming

Claire Le Goues 116http://www.clairelegoues.com

BENCHMARKSProgram LOC Tests Bugs Descriptionfbc 97,000 773 3 Language (legacy)gmp 145,000 146 2 Multiple precision mathgzip 491,000 12 5 Data compressionlibtiff 77,000 78 24 Image manipulationlighttpd 62,000 295 9 Web serverphp 1,046,000 8,471 44 Language (web)python 407,000 355 11 Language (general)wireshark 2,814,000 63 7 Network packet analyzerTotal 5,139,000 10,193 105

Page 117: Automatic program repair using genetic programming

Claire Le Goues 117http://www.clairelegoues.com

BENCHMARKSProgram LOC Tests Bugs Descriptionfbc 97,000 773 3 Language (legacy)gmp 145,000 146 2 Multiple precision mathgzip 491,000 12 5 Data compressionlibtiff 77,000 78 24 Image manipulationlighttpd 62,000 295 9 Web serverphp 1,046,000 8,471 44 Language (web)python 407,000 355 11 Language (general)wireshark 2,814,000 63 7 Network packet analyzerTotal 5,139,000 10,193 105

Page 118: Automatic program repair using genetic programming

Claire Le Goues 118http://www.clairelegoues.com

BENCHMARKSProgram LOC Tests Bugs Descriptionfbc 97,000 773 3 Language (legacy)gmp 145,000 146 2 Multiple precision mathgzip 491,000 12 5 Data compressionlibtiff 77,000 78 24 Image manipulationlighttpd 62,000 295 9 Web serverphp 1,046,000 8,471 44 Language (web)python 407,000 355 11 Language (general)wireshark 2,814,000 63 7 Network packet analyzerTotal 5,139,000 10,193 105

Page 119: Automatic program repair using genetic programming

Claire Le Goues 119http://www.clairelegoues.com

CHALLENGE: GROUNDED COST MEASUREMENTS

Page 120: Automatic program repair using genetic programming

Claire Le Goues http://www.clairelegoues.com 120

Page 121: Automatic program repair using genetic programming

Claire Le Goues http://www.clairelegoues.com 121

Page 122: Automatic program repair using genetic programming

Claire Le Goues 122http://www.clairelegoues.com

READY

Page 123: Automatic program repair using genetic programming

Claire Le Goues 123http://www.clairelegoues.com

GO

Page 124: Automatic program repair using genetic programming

Claire Le Goues 124http://www.clairelegoues.com

13 HOURS LATER

Page 125: Automatic program repair using genetic programming

Claire Le Goues 125http://www.clairelegoues.com

SUCCESS/COST

Program

Defects Repaired

Cost per non-repair

Cost per repair

Hours US$ Hours US$

fbc 1/3 8.52 5.56 6.52 4.08gmp 1/2 9.93 6.61 1.60 0.44gzip 1/5 5.11 3.04 1.41 0.30libtiff 17/24 7.81 5.04 1.05 0.04lighttpd 5/9 10.79 7.25 1.34 0.25php 28/44 13.00 8.80 1.84 0.62python 1/11 13.00 8.80 1.22 0.16wireshark 1/7 13.00 8.80 1.23 0.17Total 55/105 11.22h 1.60h$403 for all 105 trials, leading to 55 repairs; $7.32 per bug repaired.

Page 126: Automatic program repair using genetic programming

Claire Le Goues 126http://www.clairelegoues.com

SUCCESS/COST

Program

Defects Repaired

Cost per non-repair

Cost per repair

Hours US$ Hours US$

fbc 1/3 8.52 5.56 6.52 4.08gmp 1/2 9.93 6.61 1.60 0.44gzip 1/5 5.11 3.04 1.41 0.30libtiff 17/24 7.81 5.04 1.05 0.04lighttpd 5/9 10.79 7.25 1.34 0.25php 28/44 13.00 8.80 1.84 0.62python 1/11 13.00 8.80 1.22 0.16wireshark 1/7 13.00 8.80 1.23 0.17Total 55/105 11.22h 1.60h$403 for all 105 trials, leading to 55 repairs; $7.32 per bug repaired.

Page 127: Automatic program repair using genetic programming

Claire Le Goues 127http://www.clairelegoues.com

SUCCESS/COST

Program

Defects Repaired

Cost per non-repair

Cost per repair

Hours US$ Hours US$

fbc 1/3 8.52 5.56 6.52 4.08gmp 1/2 9.93 6.61 1.60 0.44gzip 1/5 5.11 3.04 1.41 0.30libtiff 17/24 7.81 5.04 1.05 0.04lighttpd 5/9 10.79 7.25 1.34 0.25php 28/44 13.00 8.80 1.84 0.62python 1/11 13.00 8.80 1.22 0.16wireshark 1/7 13.00 8.80 1.23 0.17Total 55/105 11.22h 1.60h$403 for all 105 trials, leading to 55 repairs; $7.32 per bug repaired.

Page 128: Automatic program repair using genetic programming

Claire Le Goues 128http://www.clairelegoues.com

SUCCESS/COST

Program

Defects Repaired

Cost per non-repair

Cost per repair

Hours US$ Hours US$

fbc 1/3 8.52 5.56 6.52 4.08gmp 1/2 9.93 6.61 1.60 0.44gzip 1/5 5.11 3.04 1.41 0.30libtiff 17/24 7.81 5.04 1.05 0.04lighttpd 5/9 10.79 7.25 1.34 0.25php 28/44 13.00 8.80 1.84 0.62python 1/11 13.00 8.80 1.22 0.16wireshark 1/7 13.00 8.80 1.23 0.17Total 55/105 11.22h 1.60h$403 for all 105 trials, leading to 55 repairs; $7.32 per bug repaired.

Page 129: Automatic program repair using genetic programming

Claire Le Goues 129http://www.clairelegoues.com

JBoss issue tracking: median 5.0, mean 15.3 hours.1

IBM: $25 per defect during coding, rising at build, Q&A, post-release, etc.2

Tarsnap.com: $17, 40 hours per non-trivial repair.3

Bug bounty programs in general:• At least $500 for security-critical bugs.• One of the php bugs that GenProg fixed has an

associated NIST security certification.1C. Weiß, R. Premraj, T. Zimmermann, and A. Zeller, “How long will it take to fix this bug?” in Workshop on Mining Software Repositories, May 2007. 2 L. Williamson, “IBM Rational software analyzer: Beyond source code,” in Rational Software Developer Conference, Jun. 2008. 3http://www.tarsnap.com/bugbounty.html

PUBLIC COMPARISON

Page 130: Automatic program repair using genetic programming

Claire Le Goues 130http://www.clairelegoues.com

What is it?• GenProg: automatic program repair.• Approach: search for a patch to a program with a

bug.Does it work?

• Yes: repairs a variety of bugs in a variety of programs.

Cool! But: is it really usable?• Are the repairs trustworthy?• Is it human-competitive?• Can we make it better?

WHERE ARE WE?

Page 131: Automatic program repair using genetic programming

Claire Le Goues 131http://www.clairelegoues.com

ATYPICAL SPACE

Page 132: Automatic program repair using genetic programming

Claire Le Goues http://www.clairelegoues.com 132

OTHER GP PROBLEMSGECCO GP-TRACK

BEST PAPERSLearning: expression trees or listsPopulation: 64 – 2500Iterations: 50 – 10000Max variant size: 48 operations, 17 levels

PROGRAM REPAIR

Page 133: Automatic program repair using genetic programming

Claire Le Goues http://www.clairelegoues.com 133

OTHER GP PROBLEMSGECCO GP-TRACK

BEST PAPERSLearning: expression trees or listsPopulation: 64 – 2500Iterations: 50 – 10000Max variant size: 48 operations, 17 levels

PROGRAM REPAIRLearning: patches or repaired programsPopulation: 40Iterations: 10 Max variant size: unboundedLargest benchmark program: 2.8 million lines of C code.

Page 134: Automatic program repair using genetic programming

Claire Le Goues http://www.clairelegoues.com 134

OTHER GP PROBLEMSGECCO GP-TRACK

BEST PAPERSLearning: expression trees or listsPopulation: 64 – 2500Iterations: 50 – 10000Max variant size: 48 operations, 17 levels

PROGRAM REPAIRLearning: patches or repaired programsPopulation: 40Iterations: 10 Max variant size: unboundedLargest benchmark program: 2.8 million lines of C code

Page 135: Automatic program repair using genetic programming

Claire Le Goues 135http://www.clairelegoues.com

Representation: • Which representation choice gives better results? • Which representation features contribute most to

success? Crossover: Which crossover operator is best? Operators:

• Which operators contribute the most to success? • How should they be selected?

Search space: How should the representation weight program statements to best define the search space?

UNDER THE HOOD

Page 136: Automatic program repair using genetic programming

Claire Le Goues 136http://www.clairelegoues.com

Representation: • Which representation choice gives better results? • Which representation features contribute most to

success? Crossover: Which crossover operator is best? Operators:

• Which operators contribute the most to success? • How should they be selected?

Search space: How should the representation weight program statements to best define the search space?

UNDER THE HOOD

Page 137: Automatic program repair using genetic programming

Claire Le Goues 137http://www.clairelegoues.com

Hypothesis: statements executed only by the failing test case(s) should be weighted more heavily than those also executed by the passing test cases.Procedure: examine that ratio in actual repairs.Result:

SEARCH SPACE: SETUP

Page 138: Automatic program repair using genetic programming

Claire Le Goues 138http://www.clairelegoues.com

Hypothesis: statements executed only by the failing test case(s) should be weighted more heavily than those also executed by the passing test cases.Procedure: examine that ratio in actual repairs.Result:

Expected: 10 : 1 vs.

Actual: 1 : 1.85

SEARCH SPACE: SETUP

Page 139: Automatic program repair using genetic programming

Claire Le Goues 139http://www.clairelegoues.com

HUH.

Page 140: Automatic program repair using genetic programming

Claire Le Goues 140http://www.clairelegoues.com

Dataset: the 105 bugs from the earlier dataset.Rerun that experiment, varying the statement weighting scheme:

• Default: the original experiment• Realistic: 1 : 1.85• Uniform: 1 : 1

Metrics: time to repair, success rate.

SEARCH SPACE EXPERIMENT

Page 141: Automatic program repair using genetic programming

Claire Le Goues 141http://www.clairelegoues.com

Easy Medium Hard All0

102030405060708090

100110

34.3

93.6103.1

66.0

49.1

75.7

27.1

59.5

36.3

67.057.7 58.6

DefaultRealisticEqual

Search difficulty

# fit

ness

eva

luat

ions

to re

pair

SEARCH SPACE: REPAIR TIME10 : 1

1 : 1.85

1 : 1

Page 142: Automatic program repair using genetic programming

Claire Le Goues 142http://www.clairelegoues.com

Easy Medium Hard All0

102030405060708090

100110

34.3

93.6103.1

66.0

49.1

75.7

27.1

59.5

36.3

67.057.7 58.6

DefaultRealisticEqual

Search difficulty

# fit

ness

eva

luat

ions

to re

pair

SEARCH SPACE: REPAIR TIME10 : 1

1 : 1.85

1 : 1

Page 143: Automatic program repair using genetic programming

Claire Le Goues 143http://www.clairelegoues.com

Easy Medium Hard 0% All0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%1.00

0.78

0.24

0.00

0.62

0.930.85

0.23 0.23

0.69

0.92 0.90

0.33

0.19

0.70

DefaultRealisticEqual

Search difficulty

GP

Succ

ess

Rat

eSEARCH SPACE: SUCCESS RATE

10 : 1

1 : 1.85

1 : 1

Page 144: Automatic program repair using genetic programming

Claire Le Goues 144http://www.clairelegoues.com

Easy Medium Hard 0% All0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%1.00

0.78

0.24

0.00

0.62

0.930.85

0.23 0.23

0.69

0.92 0.90

0.33

0.19

0.70

DefaultRealisticEqual

Search difficulty

GP

Succ

ess

Rat

eSEARCH SPACE: SUCCESS RATE

10 : 1

1 : 1.85

1 : 1

Page 145: Automatic program repair using genetic programming

Claire Le Goues 145http://www.clairelegoues.com

Investigated 3 other sets of choices related to representation and operators.Combination of modifications:

• Repairs an additional 5 bugs; 60 out of 105. • Reduced time to repair 17 – 43% for difficult bug scenarios.

Have similarly studied fitness function improvements.Investigation continues!

MORE UNDER THE HOOD

Page 146: Automatic program repair using genetic programming

Claire Le Goues 146http://www.clairelegoues.com

Repair templates for more expressive mutation operators.

• Mine common fixes from existing source code and source code repositories.

Provably secure patching.• Apply cryptographic techniques to make patches difficult to reverse-engineer.

Medical device system safety.• Conduct a whole-environment safety analysis for medical device users.

WHAT’S NEXT?

Page 147: Automatic program repair using genetic programming

Claire Le Goues 147http://www.clairelegoues.com

Page 148: Automatic program repair using genetic programming

Claire Le Goues 148http://www.clairelegoues.com

JournalAutomatic program repairCode quality metrics for specification miningTool support for formal program verification

TSE ’12

(featured article)ConferenceICSE ’091

(best paper)ICSE ’122 GECCO ’122

1gold, Humies2bronze, Humies

Page 149: Automatic program repair using genetic programming

Claire Le Goues 149http://www.clairelegoues.com

JournalAutomatic program repairCode quality metrics for specification miningTool support for formal program verification

Workshop/Tutorial

FOSER ’10 SBST ’09

GECCO ’12

TSE ’12

(featured article)

CACM ’10

(invited article)ConferenceGECCO ’10ICSE ’091

(best paper)

GECCO ’091

(best paper)ICSE ’122 GECCO ’122

1gold, Humies2bronze, Humies

Page 150: Automatic program repair using genetic programming

Claire Le Goues 150http://www.clairelegoues.com

JournalAutomatic program repairCode quality metrics for specification miningTool support for formal program verification

Workshop/Tutorial

FOSER ’10 SBST ’09

GECCO ’12

TSE ’12

(featured article)

CACM ’10

(invited article)ConferenceGECCO ’10TACAS ’09 ICSE ’091

(best paper)

GECCO ’091

(best paper)ICSE ’122 GECCO ’122

1gold, Humies2bronze, Humies

TSE ’12

Page 151: Automatic program repair using genetic programming

Claire Le Goues 151http://www.clairelegoues.com

JournalAutomatic program repairCode quality metrics for specification miningTool support for formal program verification

Workshop/Tutorial

FOSER ’10 SBST ’09

GECCO ’12

TSE ’12 TSE ’12

(featured article)

CACM ’10

(invited article)ConferenceGECCO ’10TACAS ’09 ICSE ’091

(best paper)

GECCO ’091

(best paper)SEFM ’11 ICSE ’122 GECCO ’122

1gold, Humies2bronze, Humies

Page 152: Automatic program repair using genetic programming

Claire Le Goues 152http://www.clairelegoues.com

GenProg: scalable, automatic bug repair.• Genetic programming search for a patch that addresses

a given bug.• Render the search tractable by restricting the search

space intelligently.It works!

• Fixes a variety of bugs in a variety of programs.• Repaired 60 of 105 bugs for < $8 each, on average.

Benchmarks/results/source code/VM images available:• http://genprog.cs.virginia.edu

CONCLUSIONS

Page 153: Automatic program repair using genetic programming

Claire Le Goues 153http://www.clairelegoues.com

I LOVE QUESTIONS.

Page 154: Automatic program repair using genetic programming

Claire Le Goues 154http://www.clairelegoues.com

WHICH BUGS…?Slightly more likely to fix bugs where the human:

• restricts the repair to statements.• touched fewer files.

As fault space decreases, success increases, repair time decreases.As fix space increases, repair time decreases.

Page 155: Automatic program repair using genetic programming

Claire Le Goues 155http://www.clairelegoues.com

SEARCH SPACE

Y = 0.8x + 0.02R2 = 0.63

Page 156: Automatic program repair using genetic programming

Claire Le Goues 156http://www.clairelegoues.com

Minimization step: try removing each line in the patch, check if the result still passes all tests Delta Debugging finds a 1-minimal subset of the diff in O(n2) time We use a tree-structured diff algorithm (diffX)

• Avoids problems with balanced curly braces, etc.

Takes significantly less time than finding the initial repair repair.

CODE BLOAT

Page 157: Automatic program repair using genetic programming

Claire Le Goues 157http://www.clairelegoues.com

1. class test_class { 2. public function __get($n) 3. { return $this; %$ }4. public function b()5. { return; }6. }7. global $test3; 8. $test3 = new test_class(); 9. $test3->a->b();

EXAMPLE: PHP BUG #54372 Relevant code: function zend_std_read_property in zend_object_handlers.cNote: memory management uses reference counting.Problem: this line:449.zval_ptr_dtor(object)If object points to $this and $this is global, its memory is completely freed, even though we could access $this later.

Expected output: nothingBuggy output: crash on line 9.

Page 158: Automatic program repair using genetic programming

Claire Le Goues 158http://www.clairelegoues.com

GenProg :% 448c448,451> Z_ADDROF_P(object);> if (PZVAL_IS_REF(object)) > {> SEPARATE_ZVAL(&object);> } zval_ptr_dtor(&object)

EXAMPLE: PHP BUG #54372Human : % 449c449,453 < zval_ptr_dtor(&object);> if (*retval != object)> { // expected> zval_ptr_dtor(&object);> } else {> Z_DELREF_P(object);> }

Page 159: Automatic program repair using genetic programming

Claire Le Goues 159http://www.clairelegoues.com

SCALABLE: REPRESENTATION

1

2

54

Naïve:

1

2

4 5

5’

1

32

54

Input:

New:

Delete(3)

Replace(3,5)

Page 160: Automatic program repair using genetic programming

Claire Le Goues 160http://www.clairelegoues.com

SCALABLE: FITNESSFitness:

• Subsample test cases.

• Evaluate in parallel.Random runs:

• Multiple simultaneous runs on different seeds.

Page 161: Automatic program repair using genetic programming

Claire Le Goues 161http://www.clairelegoues.com