Nighthawk: A Two-Level Genetic-Random Unit Test Data Generator

Description: A talk at ASE 2007 by Jamie Andrews and Felix C. H. Li, Department of Computer Science, University of Western Ontario; and Tim Menzies, Lane Department of Computer Science, West Virginia University.

Transcript of Nighthawk: A Two-Level Genetic-Random Unit Test Data Generator

Nighthawk: A Two-Level Genetic-Random Unit Test Data Generator

Jamie Andrews and Felix C. H. Li
Department of Computer Science, University of Western Ontario

Tim Menzies
Lane Department of Computer Science, West Virginia University

Andrews -- ASE 2007 -- Atlanta Nov. 8 2

Plan of Talk

• Randomized Unit Testing
• Genetic Algorithms (GA)
• Nighthawk: randomized testing level
• Nighthawk: GA level
• Empirical studies


Randomized Testing

• Generate inputs using randomization

• Challenges:
– Thoroughness?
– Test oracle?


Effectiveness of Randomized Testing

• Analysis:
– Duran & Ntafos, Hamlet & Taylor, Jeng & Weyuker, ...
• Empirical:
– Miller (Fuzz, 1990s), Claessen & Hughes (QuickCheck, 2000), Andrews (ASE 2004), Pacheco et al., Groce et al. (ICSE 2007)
• Why?
– Generate many different test inputs cheaply
– Effective and thorough if set up right


Unit Testing

• Test case = seq of method calls

• Each call possibly:
– Preceded by argument setup
– Followed by result evaluation

    TreeMap t = new TreeMap();
    Employee e = new Employee("W");
    t.put(e, 43);
    t.remove(e);
    assert t.size() == 0;

(Annotations on slide: How many TreeMaps to store? When to reuse TreeMaps?)


Randomized Unit Testing

• Randomization of:
– Methods called
– Arguments selected
• Challenges:
– Test oracle: JML, Java assertions, ...
– Thoroughness?


Example: TreeMap

• Create pool of n Employees
• Randomly put(), remove() Employees
• n=10000: remove() usually fails at first
– Doesn't cover emptying out the tree
• n=2: tree doesn't get big
– Doesn't cover many branches
• n=30: "just right"
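The pool-size effect can be illustrated with a small simulation. This is a sketch of the idea rather than Nighthawk itself, using plain Integer keys in place of Employee objects; the `removeHitRate` helper and its parameters are illustrative:

```java
import java.util.Random;
import java.util.TreeMap;

// Sketch of the pool-size effect: the pool size n controls how often
// random put()/remove() calls collide on the same key, and hence how
// often remove() actually exercises the tree-deletion code.
public class PoolSizeDemo {

    // Run `calls` random put/remove operations over a key pool of size n;
    // return the fraction of remove() calls that actually removed a mapping.
    public static double removeHitRate(int n, int calls, long seed) {
        Random rnd = new Random(seed);
        TreeMap<Integer, Integer> map = new TreeMap<>();
        int removes = 0, hits = 0;
        for (int i = 0; i < calls; i++) {
            int key = rnd.nextInt(n);        // pick a key from the pool
            if (rnd.nextBoolean()) {
                map.put(key, i);
            } else {
                removes++;
                if (map.remove(key) != null) hits++;   // remove() succeeded
            }
        }
        return removes == 0 ? 0.0 : (double) hits / removes;
    }
}
```

With n=10000 nearly every remove() misses (the tree never empties through deletions); with a small pool the hit rate climbs toward one half, matching the slide's observation that an intermediate n is "just right".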


Genetic Algorithms (GA)

• Chromosomes encode solutions
• Mutation
• Recombination
• Fitness function chooses "survivors"


GAs and Testing

• Evolving individual test cases:
– Guo et al., FATES 2003
– Tonella, ISSTA 2004


Nighthawk: Randomized Testing Level

• Input: set M of target methods; chromosome c
• Output: fitness of chromosome
• Algorithm overview: select and run one randomized unit test case, measure coverage
• Random choices partly controlled by chromosome c


Randomized Testing Level: Details

• Populate "value pools" for every relevant class
• Repeat L times:
– Choose target method "randomly"
– Choose receiver, params "randomly" from value pools
– Call method
– Place return value in value pool
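A minimal sketch of this loop, specialized to TreeMap. The real algorithm chooses methods, receivers, and pools generically, with the choices steered by the chromosome's genes; the class and method names here are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.TreeMap;

// Sketch of the randomized-testing level, hard-coded to one target class.
public class RandomizedLevel {

    // Run one randomized test case of L calls; return the number of calls made.
    public static int runTestCase(int L, long seed) {
        Random rnd = new Random(seed);
        TreeMap<Integer, Integer> target = new TreeMap<>();

        // Value pools for the relevant classes (here, just int keys and values).
        List<Integer> keyPool = new ArrayList<>(List.of(1, 2, 3));
        List<Integer> valPool = new ArrayList<>(List.of(10, 20));

        int calls = 0;
        for (int i = 0; i < L; i++) {
            int key = keyPool.get(rnd.nextInt(keyPool.size())); // params chosen "randomly"
            int choice = rnd.nextInt(3);                        // target method chosen "randomly"
            if (choice == 0) {
                Integer old = target.put(key, valPool.get(rnd.nextInt(valPool.size())));
                if (old != null) valPool.add(old);              // return value -> value pool
            } else if (choice == 1) {
                Integer removed = target.remove(key);
                if (removed != null) valPool.add(removed);      // return value -> value pool
            } else {
                keyPool.add(target.size());                     // size() result -> key pool
            }
            calls++;
        }
        return calls;
    }
}
```

Feeding return values back into the pools is what lets later calls reuse earlier results, the behavior the "value reuse policy" genes control.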


Value Pools and Methods

[Figure: value pools for the classes TreeMap, Employee, and int, with arrows feeding the call t.put(e, i); the arrows from pools to call sites illustrate the "value reuse policy".]


GA Level: Chromosomes

• Chromosome = set of genes
• Each gene controls an aspect of the randomized testing algorithm


Genes

• Genes answer questions like:
– How long is the test case?
– How often do we choose method m?
– How many value pools?
– How do we construct int value pools?
– What is the value reuse policy?
– Where do I get this parameter from?
– Where do I put the result value?
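One way such genes could be encoded, as an illustrative sketch only; the field names and value ranges below are assumptions, not Nighthawk's actual gene set:

```java
import java.util.Random;

// Illustrative chromosome: each field is a "gene" answering one of the
// questions above. Mutation copies the chromosome and nudges one gene.
public class Chromosome {
    int testCaseLength;      // "How long is the test case?"
    int[] methodWeights;     // "How often do we choose method m?"
    int numIntPools;         // "How many value pools?"

    Chromosome(int length, int[] weights, int pools) {
        testCaseLength = length;
        methodWeights = weights.clone();
        numIntPools = pools;
    }

    // Point mutation: return a copy with one randomly chosen gene changed.
    Chromosome mutate(Random rnd) {
        Chromosome copy = new Chromosome(testCaseLength, methodWeights, numIntPools);
        int gene = rnd.nextInt(3);
        if (gene == 0) {
            copy.testCaseLength = Math.max(1, copy.testCaseLength + rnd.nextInt(21) - 10);
        } else if (gene == 1) {
            copy.methodWeights[rnd.nextInt(copy.methodWeights.length)] = rnd.nextInt(10);
        } else {
            copy.numIntPools = Math.max(1, copy.numIntPools + rnd.nextInt(3) - 1);
        }
        return copy;
    }
}
```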


Initialization


Cloning


Mutation, Recombination


Fitness Evaluation

4324 4300 4288 3696 3559 3331 3278 3277 3000


Sorting

4324 4300 4288 3696 3559 3331 3278 3277 3000


Retention

4324 4300 4288


Fitness Function

(number of lines covered) * 1000 - (number of method calls)

• First term: reward for high coverage
• Second term: brake on test case length
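The formula, together with the sort-and-retain step from the preceding slides, can be written directly; the `retainTop` helper is illustrative:

```java
import java.util.Arrays;

// Fitness as on the slide: reward coverage strongly, with a small
// brake on test case length; then sort scores and keep the top k.
public class Fitness {

    public static int fitness(int linesCovered, int methodCalls) {
        return linesCovered * 1000 - methodCalls;
    }

    // Return the k highest scores in descending order ("survivors").
    public static int[] retainTop(int[] scores, int k) {
        int[] sorted = scores.clone();
        Arrays.sort(sorted);                              // ascending
        int[] top = new int[k];
        for (int i = 0; i < k; i++) {
            top[i] = sorted[sorted.length - 1 - i];       // take from the high end
        }
        return top;
    }
}
```

Applied to the example population on the earlier slides, retaining the top three keeps 4324, 4300, and 4288.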


Empirical Evaluation

• Comparison to previous studies
• Case study: Collection and Map classes
• Comparison of option settings


Comparison to Previous Studies

• Compared to:
– Michael et al., TSE 2001: straight GA
– Visser et al., ISSTA 2006: JPF with state matching
– Pacheco et al., ICSE 2007: extending test sequences randomly
• Achieved same coverage when run with same constraints
• Achieved more coverage when run with no constraints


Case Study: java.util

• Applied Nighthawk to all 16 Collection and Map classes from java.util 1.5.0
• Measured line coverage, clock time
• Compared option settings


Results – Coverage (Lines)

Source      SLOC    PN    EN    PD    ED
ArrayList    150   111   140   109   140  (.93)
EnumMap      239     7     9    10     7  (.03)
HashMap      360   238   265   305   347  (.96)
HashSet       46    24    40    26    44  (.96)
Hashtable    355   205   253   252   325  (.92)
...

(Column key: P = plain vs. E = enriched test wrappers; N = normal vs. D = deep target analysis; ED = both enriched wrappers and deep analysis. Parenthesized figures are the fraction of lines covered under ED.)


Results – Time (Clock Sec.)

Source      PN    EN    PD    ED
ArrayList    75    91    29    48
EnumMap       3     9     6     5
HashMap      63    37   136   176
HashSet      25    29    27    39
Hashtable     8   110   110   157
...


Results - Analysis

• Enriched wrappers, deep analysis better
– Overall coverage: 82% of lines
• Deep analysis also took longer
– Still less than 100 sec/class average
• EnumMap (3%): constructor expects an enumerated type
– Customized wrapper:
  • 85% coverage of EnumMap
  • Raises overall coverage to 88%


Conclusions

• Metaheuristic search can find effective parameters for randomized testing

• Only needs info about methods and parameter types

• Efficiency acceptable


Future Work

• Which metaheuristic search approach?
– Genetic algorithms
– Simulated annealing
– ...
• Which genes are really necessary?
• Which coverage criterion?


Thank you!


Oracle: Test Wrapper Class

• Test wrapper class: methods that call methods of target class

    public class TreeMapTestWrapper {
        private TreeMap target;
        ...
        void put(Object key, Object value) {
            // insert preconditions
            target.put(key, value);
            // insert oracle assertions
        }
    }


Enriched (E) Test Wrappers

• Add methods to test:
– Serialization
– Typed equals()
• Typically cover more code
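A sketch of one enriched check: a serialization round trip that the randomized level can call like any other wrapper method. The exact oracle assertions in Nighthawk's wrappers are not shown on the slide, so this is an assumed form:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative enriched-wrapper check: serialize the target, read it
// back, and compare with equals(). Exercises code (writeObject,
// readObject) that ordinary method calls never reach.
public class EnrichedWrapperSketch {

    public static boolean roundTripEquals(Serializable target) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(target);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return target.equals(in.readObject());
            }
        } catch (Exception e) {
            return false;   // any I/O or class problem counts as a failure
        }
    }
}
```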


Normal (N) Target Analysis

• Red = call all methods
• Green = call only constructors

[Figure: nested sets inside "All classes". A: classes named by user. B: classes of params of A methods.]


Deep (D) Target Analysis

• Red = call all methods
• Green = call only constructors

[Figure: nested sets inside "All classes". A: classes named by user. B: classes of params of A methods. C: classes of params of B methods.]