1 Today Another approach to “coverage” Cover “everything” – within a well-defined,...

Post on 20-Dec-2015


1

Today

Another approach to “coverage”
• Cover “everything” – within a well-defined, feasible limit
• Bounded Exhaustive Testing

2

Coverage Revisited

With some kinds of coverage, we expect to be able to reach 100% coverage

if (x < y) {
    y = 0;
    x = x + 1;
} else {
    x = y;
}

[Figure: control-flow graph of the code above, nodes 1-4; edges labeled x < y (y = 0; x = x + 1) and x >= y (x = y)]

Statement coverage: cover every reachable statement

3

Coverage Revisited

With some kinds of coverage, we expect to be able to reach 100% coverage

if (x < y) {
    y = 0;
    x = x + 1;
}

[Figure: control-flow graph of the code above, nodes 1-3; edges labeled x < y (y = 0; x = x + 1) and x >= y]

Branch coverage

4

Coverage Revisited

With some kinds of coverage, we expect to be able to reach 100% coverage

[Figure: input domain partitioned into blocks b1, b2, b3]

Input space partitioning

5

Coverage Revisited

With other kinds of coverage, we know that reaching 100% is very difficult, perhaps completely infeasible

if (x < y) {
    y = 0;
    x = x + 1;
} else {
    x = y;
}

if (x < y) {
    y = 0;
    x = x + 1;
}

[Figure: control-flow graph of the two conditionals in sequence, nodes 1-6; each decision has edges labeled x < y and x >= y]

Path coverage: exponential in the number of conditional branches!
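To make the blow-up concrete, here is a minimal sketch that enumerates the branch choices through a chain of two-way conditionals. The helper name `enumerate_paths` is made up for illustration, and the count is of syntactic paths only; some of them may be infeasible in a real program.

```python
from itertools import product

def enumerate_paths(num_conditionals):
    """Return every syntactic path through a chain of two-way conditionals.

    Each path is a tuple of booleans: True means the condition held,
    False means it did not.
    """
    return list(product((True, False), repeat=num_conditionals))

# Two conditionals in sequence, as on the slide: 2 * 2 = 4 paths.
print(len(enumerate_paths(2)))

# Each additional conditional doubles the number of paths.
for n in (10, 20, 30):
    print(n, 2 ** n)
```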

6

Coverage Revisited

With other kinds of coverage, we know that reaching 100% is very difficult, perhaps completely infeasible

State space coverage

7

Coverage Revisited

With other kinds of coverage, we know that reaching 100% is very difficult, perhaps completely infeasible

Complete input coverage

4 operations × (0, 1, 2, 3, 4 … 2^32) × (0, 1, 2, 3, 4 … 2^32) × (0, 1, 2, 3, 4 … 2^32) = TOO MUCH
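A rough back-of-the-envelope version of "too much", assuming each of the three parameters takes 2^32 distinct values. The throughput figure is hypothetical and only there to give a sense of scale.

```python
# Assumption: 4 operations, each taking three parameters with 2**32 values.
OPERATIONS = 4
VALUES_PER_PARAMETER = 2 ** 32
PARAMETERS = 3

total_inputs = OPERATIONS * VALUES_PER_PARAMETER ** PARAMETERS

# Hypothetical throughput, chosen only to make the scale concrete.
TESTS_PER_SECOND = 10 ** 9
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

years = total_inputs / (TESTS_PER_SECOND * SECONDS_PER_YEAR)
print(f"{total_inputs:.3e} inputs, about {years:.3e} years of testing")
```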

8

Coverage Revisited

What if we aim for exhaustive coverage, but arbitrarily limit the size somehow?
• Daniel Jackson: “small scope hypothesis”

“Our approach is simply to truncate the state space artificially, checking only within some finite bounds.” - Jackson and Damon, “Elements of Style: Analyzing a Software Design Feature with a Counterexample Detector”

9

Small Scope Hypothesis

Remember “downward scalability”?

Idea related (in a hand-waving fashion) to small model property in logics:
• For some logical formulas, though in principle the variables may have infinite domains, it can be shown that you only have to consider a finite set of individuals when checking for satisfiability
  • Bounded by length of the formula

10

Small Scope Hypothesis

The idea in a nutshell:
• Most faults can be exposed by some “short” failing trace
• Short may mean in # of operations
• Short may mean in complexity of input structures
  • How many nodes in the red-black tree?
• Short may mean something else entirely
  • Number of voluntary thread context switches
  • Small flash device
  • Small # of different pathname components
  • Bounded pathname length
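As an illustration of "short in # of operations", the sketch below exhaustively runs every operation sequence up to a given length against a deliberately faulty set. `BuggySet`, its seeded fault, and the two-value domain are all invented for this example; Python's built-in `set` serves as the oracle.

```python
from itertools import product

class BuggySet:
    """A set backed by a list, with a seeded fault for illustration."""

    def __init__(self):
        self.items = []

    def add(self, value):
        self.items.append(value)  # fault: does not check for duplicates

    def remove(self, value):
        if value in self.items:
            self.items.remove(value)  # removes only the first occurrence

    def contains(self, value):
        return value in self.items

VALUES = (0, 1)
OPERATIONS = [(name, v) for name in ("add", "remove") for v in VALUES]

def fails(trace):
    """Replay a trace on BuggySet and on the oracle, comparing after each step."""
    impl, model = BuggySet(), set()
    for name, value in trace:
        if name == "add":
            impl.add(value)
            model.add(value)
        else:
            impl.remove(value)
            model.discard(value)
        if any(impl.contains(v) != (v in model) for v in VALUES):
            return True
    return False

def shortest_failing_trace(max_length):
    """Try every trace of length 1, 2, ... up to max_length."""
    for length in range(1, max_length + 1):
        for trace in product(OPERATIONS, repeat=length):
            if fails(trace):
                return trace
    return None

print(shortest_failing_trace(4))
```

In this toy case no trace of length two exposes the fault, but a three-operation trace does, which is the kind of short failing trace the hypothesis is counting on.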

11

Small Scope Hypothesis

Obvious dangerous exception: resource bound violations
• Can be handled by “shrinking” the resource bound
• May require “shrinking” types in a program
  • E.g., converting some ints to chars
• May be difficult, depending on program and language

12

“All” Binary Trees of Size 3

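[Figure: the five binary tree shapes with three nodes N0, N1, N2: a left-left chain, a right-right chain, the two zigzag shapes (left-right and right-left), and a root with both a left and a right child]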

13

“All” Binary Trees of Size 3

[Figure: the five binary tree shapes with three nodes N0, N1, N2]

Do we care about the actual elements?

14

“All” Binary Trees of Size 3

[Figure: the five binary tree shapes with three nodes N0, N1, N2]

What if these were red-black trees?

In general: exploit isomorphisms
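A small sketch of what exploiting isomorphisms buys for the size-3 trees. It enumerates tree shapes, attaches the node identities N0, N1, N2 in every possible order, and then forgets the identities again. The tuple representation and the helper names are assumptions made for this example.

```python
from itertools import permutations

def tree_shapes(n):
    """Return all binary tree shapes with n nodes.

    A shape is None (empty tree) or a pair (left, right) of shapes.
    """
    if n == 0:
        return [None]
    shapes = []
    for left_size in range(n):
        for left in tree_shapes(left_size):
            for right in tree_shapes(n - 1 - left_size):
                shapes.append((left, right))
    return shapes

def label(shape, names):
    """Attach node names to a shape in preorder; names is an iterator."""
    if shape is None:
        return None
    left, right = shape
    name = next(names)
    return (name, label(left, names), label(right, names))

def erase(tree):
    """Forget node identities, keeping only the shape."""
    if tree is None:
        return None
    _, left, right = tree
    return (erase(left), erase(right))

labeled = [
    label(shape, iter(order))
    for shape in tree_shapes(3)
    for order in permutations(("N0", "N1", "N2"))
]
canonical = {erase(tree) for tree in labeled}

# Labeled candidates versus shapes that are distinct up to isomorphism.
print(len(set(labeled)), len(canonical))
```

If the code under test never inspects node identity, testing one representative per shape covers the same behaviors as testing every labeling.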

15

Enumerating “All” Inputs

For more complex data structures, enumerating all valid inputs may be a complex problem
• E.g., all “programs” (random ASCII sequences) of length 100 vs.
• All parseable C programs of length 100
  • May require enumeration by a constraint solver

16

Enumerating “All” Inputs

Enumeration may require staging
• Constraint solver generates a set of abstract inputs, satisfying the constraints
• A postprocessor then concretizes each abstract input into a (large) set of concrete inputs
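The two stages can be sketched on a toy stack-testing problem. Everything here is an assumption for illustration: the abstract inputs are sequences of operation kinds, the constraint is "never pop an empty stack", and a plain generate-and-filter loop stands in for a real constraint solver.

```python
from itertools import product

def abstract_inputs(length):
    """Stage 1: operation-kind sequences that never pop an empty stack.

    Assumption: generate-and-filter stands in for the constraint solver.
    """
    result = []
    for kinds in product(("push", "pop"), repeat=length):
        depth = 0
        valid = True
        for kind in kinds:
            depth += 1 if kind == "push" else -1
            if depth < 0:
                valid = False
                break
        if valid:
            result.append(kinds)
    return result

def concretize(kinds, values):
    """Stage 2: fill in every push argument from a small value domain."""
    slots = [values if kind == "push" else (None,) for kind in kinds]
    return [tuple(zip(kinds, args)) for args in product(*slots)]

abstract = abstract_inputs(3)
concrete = [test for kinds in abstract for test in concretize(kinds, (0, 1))]
print(len(abstract), len(concrete))
```

Even with a two-value domain, the concrete set is several times larger than the abstract one, which is why the expensive constraint solving is kept to the abstract stage.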

17

Bounded (Depth) Model Checking

All states reachable with a path of no more than k transitions

Or in which no loop executes more than k times…
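The first definition can be sketched as a level-by-level search. The transition system below (a counter modulo 16 that can add one or double) is a made-up toy, and `reachable_within` is a name chosen for this example.

```python
def reachable_within(initial, successors, k):
    """Return all states reachable from initial in at most k transitions."""
    seen = {initial}
    frontier = {initial}
    for _ in range(k):
        # Expand one level and keep only states not seen at a shallower depth.
        frontier = {t for s in frontier for t in successors(s)} - seen
        if not frontier:
            break
        seen |= frontier
    return seen

# Hypothetical toy system: a counter modulo 16 that can add 1 or double.
def successors(state):
    return {(state + 1) % 16, (state * 2) % 16}

for k in range(5):
    print(k, sorted(reachable_within(0, successors, k)))
```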

18

Bounded (Depth) Model Checking

Can’t quite do the naïve thing:

pan -m1000

Won’t guarantee reaching every state reachable in 1,000 steps

Why not?
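One possible answer, offered as an illustration rather than as the slide's own: a depth-limited depth-first search that also remembers visited states can first reach a state by a long path, hit the depth limit there, and later refuse to re-explore it when a shorter path turns up. The toy graph and function names below are assumptions, and the code models the general search strategy, not Spin's actual implementation.

```python
def depth_limited_dfs(successors, start, bound):
    """DFS that stops at depth `bound` and never revisits a stored state."""
    visited = set()

    def visit(state, depth):
        visited.add(state)
        if depth == bound:
            return
        for nxt in successors.get(state, ()):
            if nxt not in visited:
                visit(nxt, depth + 1)

    visit(start, 0)
    return visited

def truly_reachable(successors, start, bound):
    """Level-by-level search: exactly the states within `bound` steps."""
    seen = {start}
    frontier = {start}
    for _ in range(bound):
        frontier = {t for s in frontier for t in successors.get(s, ())} - seen
        seen = seen | frontier
    return seen

# State 2 is first reached at depth 2 (via 0 -> 1 -> 2), so the search
# stops there; the shorter route 0 -> 2 is then pruned as already visited,
# and state 3 is never explored.
GRAPH = {0: [1, 2], 1: [2], 2: [3]}

print(sorted(depth_limited_dfs(GRAPH, 0, 2)))
print(sorted(truly_reachable(GRAPH, 0, 2)))
```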

19

Iterative Bounding

Typical approach is iterative
• Start with a small bound
• If you find a failing trace, fix the software or the specification and repeat
• If you can show no faults with bound k
  • Increase the bound and repeat until you can’t exhaustively test for the given bound
• Traditional search technique, iterative deepening
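The loop above can be sketched as follows. The function under test (`one_pass_sort`, with a seeded fault), the input domain, and the `max_tests` budget are all hypothetical; the bound k is the length of the input list.

```python
from itertools import product

def one_pass_sort(values):
    """Seeded fault: a single bubble-sort pass instead of a full sort."""
    items = list(values)
    for i in range(len(items) - 1):
        if items[i] > items[i + 1]:
            items[i], items[i + 1] = items[i + 1], items[i]
    return items

def iterative_bounding(function, oracle, domain, max_tests):
    """Raise the bound k until a test fails or the budget runs out.

    Returns (k, failing_test), or (k, None) if every bound below k passed
    and bound k would need more than max_tests tests.
    """
    k = 0
    while len(domain) ** k <= max_tests:
        for test in product(domain, repeat=k):
            if function(test) != oracle(test):
                return k, test
        k += 1
    return k, None

print(iterative_bounding(one_pass_sort, sorted, (0, 1), max_tests=1000))
```

With a budget too small to reach the failing bound, the same call reports no fault, which is the honest outcome of a bounded search: nothing is known beyond the last bound that was exhausted.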

20

Limitations to Scope/Bound

Stop increasing scope for one of two reasons:
• Difficulty of generating all inputs
  • Clever approaches can often deal with this
• Difficulty of executing all the test cases!
  • This one is more fundamental
• With large enough k, exhaustive coverage becomes “exhausting”

21

Evaluation

How does bounded exhaustive testing stack up against random testing?
• Hard to say in general
• Marinov, Andoni, Daniliuc, Khurshid, and Rinard at MIT tried to look at this question for some Java programs
  • “An Evaluation of Exhaustive Testing for Data Structures”

22

Evaluation

Marinov et al.: let’s use mutation testing and kill rates to compare
• Generated all tests for limit k
• Generated all tests for limit k-1
• Compared the mutation kill rate for the complete k-1 tests to the rate for a random subset of the k tests
  • Subset same size as complete k-1 tests
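The shape of that comparison can be sketched on a toy problem. Nothing below comes from the paper: the sort mutants are hand-written, "limit k" is assumed to mean lists of length up to k, and the numbers it prints say nothing about the published results.

```python
import random
from itertools import product

DOMAIN = (0, 1, 2)

def all_tests(k):
    """Every tuple over DOMAIN of length 0..k (assumed meaning of 'limit k')."""
    return [t for n in range(k + 1) for t in product(DOMAIN, repeat=n)]

def one_pass(xs):
    """Mutant: a single bubble-sort pass."""
    items = list(xs)
    for i in range(len(items) - 1):
        if items[i] > items[i + 1]:
            items[i], items[i + 1] = items[i + 1], items[i]
    return items

# Hypothetical hand-written mutants of a sort routine.
MUTANTS = [
    lambda xs: list(xs),                          # never sorts
    lambda xs: sorted(set(xs)),                   # drops duplicates
    one_pass,                                     # incomplete sort
    lambda xs: sorted(xs, reverse=len(xs) >= 3),  # wrong order from length 3
]

def kill_rate(tests):
    """Fraction of mutants whose output differs from sorted() on some test."""
    killed = sum(any(m(t) != sorted(t) for t in tests) for m in MUTANTS)
    return killed / len(MUTANTS)

k = 3
smaller = all_tests(k - 1)
# Random subset of the k tests, same size as the complete k-1 suite.
subset = random.Random(0).sample(all_tests(k), len(smaller))

print(len(smaller), kill_rate(smaller), kill_rate(subset))
```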

23

Evaluation

[Figure: diagram relating the K tests, the K-1 tests, and the random subset]

24

Evaluation

Benchmark      Scope  Kill-rate  Scope-1
SearchTree     7      99.26%     =
DisjSet        5      95.06%     =
HeapArray      7      95.99%     <
BinomialHeap   7      95.10%     <
FibonacciHeap  5      86.87%     >
LinkedList     7      99.59%     =
SortedList     7      96.40%     <
TreeMap        7      89.08%     <
HashSet        7      91.39%     <
AVTree         5      93.17%     >

Cases where bounded exhaustive testing beat random testing

25

Evaluation

More interesting facts:
• Scope correlated highly with statement coverage
• Even after statement coverage was complete, increasing scope increased the mutant kill rate

26

CHESS

Now we’ll look at one of the most interesting bounded exhaustive testing approaches
• Context bounding
• Idea: limit number of pre-emptive context switches by threads
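A minimal sketch of the context-bounding idea, not of CHESS itself: enumerate every schedule of two threads in which at most a given number of switches are pre-emptive, meaning the thread being switched away from could still run. The two-step non-atomic increment and all names here are assumptions for illustration.

```python
def schedules(steps_left, max_preemptions, current=None):
    """Yield thread-id sequences with at most max_preemptions preemptions.

    steps_left is a tuple with the number of remaining steps per thread.
    """
    if all(n == 0 for n in steps_left):
        yield ()
        return
    for tid, n in enumerate(steps_left):
        if n == 0:
            continue
        # Switching away from a thread that can still run is a preemption.
        preempt = (current is not None and tid != current
                   and steps_left[current] > 0)
        budget = max_preemptions - (1 if preempt else 0)
        if budget < 0:
            continue
        rest = steps_left[:tid] + (n - 1,) + steps_left[tid + 1:]
        for tail in schedules(rest, budget, tid):
            yield (tid,) + tail

def run(schedule):
    """Two threads each perform a non-atomic increment of shared x."""
    x = 0
    pc = [0, 0]
    tmp = [0, 0]
    for tid in schedule:
        if pc[tid] == 0:
            tmp[tid] = x        # step 0: read x into a local
        else:
            x = tmp[tid] + 1    # step 1: write the incremented value back
        pc[tid] += 1
    return x

for bound in range(3):
    found = list(schedules((2, 2), bound))
    print(bound, len(found), sorted({run(s) for s in found}))
```

In this toy case the lost update already shows up with a bound of one pre-emption, while the number of schedules to run grows with each increase of the bound.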

27

Review

Before I hand over to Klaus for good…

28

Topics in Testing We’ve Covered

Black box (Finite State Machine) testing

Design for testability

Coverage measures

Random testing

Constraint-based testing

Debugging and test case minimization

Using model checkers for testing

Coverage revisited (“small model property”)

29

Topics in Testing We’ve Covered

Black box (Finite State Machine) testing
• There “are no Turing machines”
• Vasilevskii and Chow algorithm for conformance testing based on spanning trees and distinguishing sets
• Exhaustive testing that cannot miss bugs is often computationally intractable

[Figure: finite state machine diagram]

30

Topics in Testing We’ve Covered

Design for testability
• Controllability and observability
• Simulation and stubbing, assertions, downward scalability, etc.

31

Topics in Testing We’ve Covered

Coverage measures
• Not necessarily correlated with fault detection!
• Still useful!
• Graph coverage: node and edge (statement and branch coverage)
• Logic coverage
• Input space partitioning
• Syntax-based coverage

[Figures: control-flow graph of the x < y example (nodes 1-4); input domain partitioned into blocks b1, b2, b3; logic coverage predicate ((a <= b) && !G) || (x >= y)]

32

Topics in Testing We’ve Covered

Random testing
• Generate inputs at random
• Explore very large numbers of executions
• Relies on a good automatic test oracle
• Feedback to bias choices away from redundant and irrelevant inputs is useful
• Good baseline for evaluating other methods, and often very effective

33

Topics in Testing We’ve Covered

Constraint-based testing
• Addresses weaknesses of random testing
  • E.g., finding needles in haystacks, such as where hash(x) = y
• Combines concrete and symbolic execution to generate inputs
• Concrete execution helps where symbolic solvers choke

34

Topics in Testing We’ve Covered

Debugging and test case minimization
• Automatic minimization of test cases is very valuable for debugging and reducing regression suite size
• Debugging can be considered as an application of the scientific method
• Various techniques exist for using test cases to localize faults
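As a reminder of what automatic minimization looks like, here is a greedy one-element-at-a-time reducer. It is deliberately simpler than full delta debugging, and both the failure predicate and the test case are hypothetical.

```python
def minimize(test_case, still_fails):
    """Greedy one-at-a-time reduction (simpler than full delta debugging).

    Repeatedly drop any single element whose removal keeps the failure.
    Assumes the original test_case fails.
    """
    current = list(test_case)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                current = candidate
                changed = True
                break
    return current

# Hypothetical failure: the program misbehaves whenever a 3 is followed,
# anywhere later in the input, by a 7.
def still_fails(case):
    return 3 in case and 7 in case[case.index(3):]

print(minimize([5, 3, 9, 1, 7, 2, 8], still_fails))
```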

35

Topics in Testing We’ve Covered

Using model checkers for testing
• Testing based on states, rather than on executions or paths
• Use abstractions to reduce state space
• Use automatic instrumentation to handle the engineering difficulties

36

Any Questions???