Nighthawk: A Two-Level Genetic-Random Unit Test Data Generator

Description: A talk at ASE 2007 by Jamie Andrews and Felix C. H. Li, Department of Computer Science, University of Western Ontario; and Tim Menzies, Lane Department of Computer Science, West Virginia University.

Transcript of Nighthawk: A Two-Level Genetic-Random Unit Test Data Generator

Nighthawk: A Two-Level Genetic-Random Unit Test Data Generator

Jamie Andrews and Felix C. H. Li
Department of Computer Science, University of Western Ontario

Tim Menzies
Lane Department of Computer Science, West Virginia University

Andrews -- ASE 2007 -- Atlanta Nov. 8 2

Plan of Talk

• Randomized Unit Testing
• Genetic Algorithms (GA)
• Nighthawk: randomized testing level
• Nighthawk: GA level
• Empirical studies


Randomized Testing

• Generate inputs using randomization

• Challenges:
– Thoroughness?
– Test oracle?


Effectiveness of Randomized Testing

• Analysis:
– Duran & Ntafos, Hamlet & Taylor, Jeng & Weyuker, ...
• Empirical:
– Miller (Fuzz, 1990s), Claessen & Hughes (QuickCheck, 2000), Andrews (ASE 2004), Pacheco et al., Groce et al. (ICSE 2007)
• Why?
– Generate many different test inputs cheaply
– Effective and thorough if set up right


Unit Testing

• Test case = seq of method calls

• Each call possibly:
– Preceded by argument setup
– Followed by result evaluation

    TreeMap t = new TreeMap();
    Employee e = new Employee("W");
    t.put(e, 43);
    t.remove(e);
    assert t.size() == 0;

(Annotations on slide: How many TreeMaps to store? When to reuse TreeMaps?)


Randomized Unit Testing

• Randomization of:
– Methods called
– Arguments selected
• Challenges:
– Test oracle: JML, Java assertions, ...
– Thoroughness?


Example: TreeMap

• Create pool of n Employees
• Randomly put(), remove() Employees
• n=10000: remove() usually fails at first
– Doesn't cover emptying out the tree
• n=2: tree doesn't get big
– Doesn't cover many branches
• n=30: "just right"
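The pool-size effect can be illustrated with a small simulation. This is a sketch of the idea rather than Nighthawk itself, using plain Integer keys in place of Employee objects; the `removeHitRate` helper and its parameters are illustrative:

```java
import java.util.Random;
import java.util.TreeMap;

// Sketch of the pool-size effect: the pool size n controls how often
// random put()/remove() calls collide on the same key, and hence how
// often remove() actually exercises the tree-deletion code.
public class PoolSizeDemo {

    // Run `calls` random put/remove operations over a key pool of size n;
    // return the fraction of remove() calls that actually removed a mapping.
    public static double removeHitRate(int n, int calls, long seed) {
        Random rnd = new Random(seed);
        TreeMap<Integer, Integer> map = new TreeMap<>();
        int removes = 0, hits = 0;
        for (int i = 0; i < calls; i++) {
            int key = rnd.nextInt(n);        // pick a key from the pool
            if (rnd.nextBoolean()) {
                map.put(key, i);
            } else {
                removes++;
                if (map.remove(key) != null) hits++;   // remove() succeeded
            }
        }
        return removes == 0 ? 0.0 : (double) hits / removes;
    }
}
```

With n=10000 nearly every remove() misses (the tree never empties through deletions); with a small pool the hit rate climbs toward one half, matching the slide's observation that an intermediate n is "just right".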


Genetic Algorithms (GA)

• Chromosomes encode solutions
• Mutation
• Recombination
• Fitness function chooses "survivors"


GAs and Testing

• Evolving individual test cases:
– Guo et al., FATES 2003
– Tonella, ISSTA 2004


Nighthawk: Randomized Testing Level

• Input: set M of target methods; chromosome c
• Output: fitness of chromosome
• Algorithm overview: select and run one randomized unit test case, measure coverage
• Random choices partly controlled by chromosome c


Randomized Testing Level: Details

• Populate "value pools" for every relevant class
• Repeat L times:
– Choose target method "randomly"
– Choose receiver, params "randomly" from value pools
– Call method
– Place return value in value pool
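A minimal sketch of this loop, specialized to TreeMap. The real algorithm chooses methods, receivers, and pools generically, with the choices steered by the chromosome's genes; the class and method names here are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.TreeMap;

// Sketch of the randomized-testing level, hard-coded to one target class.
public class RandomizedLevel {

    // Run one randomized test case of L calls; return the number of calls made.
    public static int runTestCase(int L, long seed) {
        Random rnd = new Random(seed);
        TreeMap<Integer, Integer> target = new TreeMap<>();

        // Value pools for the relevant classes (here, just int keys and values).
        List<Integer> keyPool = new ArrayList<>(List.of(1, 2, 3));
        List<Integer> valPool = new ArrayList<>(List.of(10, 20));

        int calls = 0;
        for (int i = 0; i < L; i++) {
            int key = keyPool.get(rnd.nextInt(keyPool.size())); // params chosen "randomly"
            int choice = rnd.nextInt(3);                        // target method chosen "randomly"
            if (choice == 0) {
                Integer old = target.put(key, valPool.get(rnd.nextInt(valPool.size())));
                if (old != null) valPool.add(old);              // return value -> value pool
            } else if (choice == 1) {
                Integer removed = target.remove(key);
                if (removed != null) valPool.add(removed);      // return value -> value pool
            } else {
                keyPool.add(target.size());                     // size() result -> key pool
            }
            calls++;
        }
        return calls;
    }
}
```

Feeding return values back into the pools is what lets later calls reuse earlier results, the behavior the "value reuse policy" genes control.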


Value Pools and Methods

[Figure: value pools for the classes TreeMap, Employee, and int, with arrows feeding the call t.put(e, i); the arrows from pools to call sites illustrate the "value reuse policy".]


GA Level: Chromosomes

• Chromosome = set of genes
• Each gene controls an aspect of the randomized testing algorithm


Genes

• Genes answer questions like:
– How long is the test case?
– How often do we choose method m?
– How many value pools?
– How do we construct int value pools?
– What is the value reuse policy?
– Where do I get this parameter from?
– Where do I put the result value?
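One way such genes could be encoded, as an illustrative sketch only; the field names and value ranges below are assumptions, not Nighthawk's actual gene set:

```java
import java.util.Random;

// Illustrative chromosome: each field is a "gene" answering one of the
// questions above. Mutation copies the chromosome and nudges one gene.
public class Chromosome {
    int testCaseLength;      // "How long is the test case?"
    int[] methodWeights;     // "How often do we choose method m?"
    int numIntPools;         // "How many value pools?"

    Chromosome(int length, int[] weights, int pools) {
        testCaseLength = length;
        methodWeights = weights.clone();
        numIntPools = pools;
    }

    // Point mutation: return a copy with one randomly chosen gene changed.
    Chromosome mutate(Random rnd) {
        Chromosome copy = new Chromosome(testCaseLength, methodWeights, numIntPools);
        int gene = rnd.nextInt(3);
        if (gene == 0) {
            copy.testCaseLength = Math.max(1, copy.testCaseLength + rnd.nextInt(21) - 10);
        } else if (gene == 1) {
            copy.methodWeights[rnd.nextInt(copy.methodWeights.length)] = rnd.nextInt(10);
        } else {
            copy.numIntPools = Math.max(1, copy.numIntPools + rnd.nextInt(3) - 1);
        }
        return copy;
    }
}
```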


Initialization


Cloning


Mutation, Recombination


Fitness Evaluation

4324 4300 4288 3696 3559 3331 3278 3277 3000


Sorting

4324 4300 4288 3696 3559 3331 3278 3277 3000


Retention

4324 4300 4288


Fitness Function

(number of lines covered) * 1000 - (number of method calls)

• First term: reward for high coverage
• Second term: brake on test case length
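The formula, together with the sort-and-retain step from the preceding slides, can be written directly; the `retainTop` helper is illustrative:

```java
import java.util.Arrays;

// Fitness as on the slide: reward coverage strongly, with a small
// brake on test case length; then sort scores and keep the top k.
public class Fitness {

    public static int fitness(int linesCovered, int methodCalls) {
        return linesCovered * 1000 - methodCalls;
    }

    // Return the k highest scores in descending order ("survivors").
    public static int[] retainTop(int[] scores, int k) {
        int[] sorted = scores.clone();
        Arrays.sort(sorted);                              // ascending
        int[] top = new int[k];
        for (int i = 0; i < k; i++) {
            top[i] = sorted[sorted.length - 1 - i];       // take from the high end
        }
        return top;
    }
}
```

Applied to the example population on the earlier slides, retaining the top three keeps 4324, 4300, and 4288.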


Empirical Evaluation

• Comparison to previous studies
• Case study: Collection and Map classes
• Comparison of option settings


Comparison to Previous Studies

• Compared to:
– Michael et al., TSE 2001: straight GA
– Visser et al., ISSTA 2006: JPF with state matching
– Pacheco et al., ICSE 2007: extending test sequences randomly
• Achieved same coverage when run with same constraints
• Achieved more coverage when run with no constraints


Case Study: java.util

• Applied Nighthawk to all 16 Collection and Map classes from java.util 1.5.0
• Measured line coverage, clock time
• Compared option settings


Results – Coverage (Lines)

Source      SLOC    PN    EN    PD    ED
ArrayList    150   111   140   109   140  (.93)
EnumMap      239     7     9    10     7  (.03)
HashMap      360   238   265   305   347  (.96)
HashSet       46    24    40    26    44  (.96)
Hashtable    355   205   253   252   325  (.92)
...

(Column key: P = plain vs. E = enriched test wrappers; N = normal vs. D = deep target analysis; ED = both enriched wrappers and deep analysis. Parenthesized figures are the fraction of lines covered under ED.)


Results – Time (Clock Sec.)

Source      PN    EN    PD    ED
ArrayList    75    91    29    48
EnumMap       3     9     6     5
HashMap      63    37   136   176
HashSet      25    29    27    39
Hashtable     8   110   110   157
...


Results - Analysis

• Enriched wrappers, deep analysis better
– Overall coverage: 82% of lines
• Deep analysis also took longer
– Still less than 100 sec/class average
• EnumMap (3%): constructor expects an enumerated type
– Customized wrapper:
  • 85% coverage of EnumMap
  • Raises overall coverage to 88%


Conclusions

• Metaheuristic search can find effective parameters for randomized testing

• Only needs info about methods and parameter types

• Efficiency acceptable


Future Work

• Which metaheuristic search approach?
– Genetic algorithms
– Simulated annealing
– ...
• Which genes are really necessary?
• Which coverage criterion?


Thank you!


Oracle: Test Wrapper Class

• Test wrapper class: methods that call methods of target class

    public class TreeMapTestWrapper {
        private TreeMap target;
        ...
        void put(Object key, Object value) {
            // insert preconditions
            target.put(key, value);
            // insert oracle assertions
        }
    }


Enriched (E) Test Wrappers

• Add methods to test:
– Serialization
– Typed equals()
• Typically cover more code
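A sketch of one enriched check: a serialization round trip that the randomized level can call like any other wrapper method. The exact oracle assertions in Nighthawk's wrappers are not shown on the slide, so this is an assumed form:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative enriched-wrapper check: serialize the target, read it
// back, and compare with equals(). Exercises code (writeObject,
// readObject) that ordinary method calls never reach.
public class EnrichedWrapperSketch {

    public static boolean roundTripEquals(Serializable target) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(target);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return target.equals(in.readObject());
            }
        } catch (Exception e) {
            return false;   // any I/O or class problem counts as a failure
        }
    }
}
```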


Normal (N) Target Analysis

• Red = call all methods
• Green = call only constructors

[Figure: nested sets inside "All classes". A: classes named by user. B: classes of params of A methods.]


Deep (D) Target Analysis

• Red = call all methods
• Green = call only constructors

[Figure: nested sets inside "All classes". A: classes named by user. B: classes of params of A methods. C: classes of params of B methods.]