Today: Another approach to “coverage”: cover “everything” within a well-defined, feasible limit (Bounded Exhaustive Testing)
Date posted: 20-Dec-2015

Page 1: Today

Another approach to “coverage”
• Cover “everything” within a well-defined, feasible limit
• Bounded Exhaustive Testing

Page 2: Coverage Revisited

With some kinds of coverage, we expect to be able to reach 100% coverage

    if (x < y) {
        y = 0;
        x = x + 1;
    } else {
        x = y;
    }

[Figure: control-flow graph with nodes 1 to 4. The edge labeled x < y leads to the block y = 0; x = x + 1, and the edge labeled x >= y leads to x = y.]

Statement coverage: cover every reachable statement

Page 3: Coverage Revisited

With some kinds of coverage, we expect to be able to reach 100% coverage

    if (x < y) {
        y = 0;
        x = x + 1;
    }

[Figure: control-flow graph with nodes 1 to 3. The edge labeled x < y leads to the block y = 0; x = x + 1, and the edge labeled x >= y bypasses it.]

Branch coverage
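A small Python sketch (not part of the original slides) of why branch coverage asks for more than statement coverage on this fragment. The function and the recording set `taken` are hypothetical helpers added for illustration. A single test with x < y executes every statement, yet it exercises only one of the two branch edges.

```python
def step(x, y, taken):
    """Python transcription of the slide's if-without-else, recording branch outcomes."""
    if x < y:
        taken.add("x < y")
        y = 0
        x = x + 1
    else:
        # The slide's code has no else; this records the implicit fall-through edge.
        taken.add("x >= y")
    return x, y


def branch_outcomes(tests):
    """Return the set of branch outcomes exercised by a list of (x, y) tests."""
    taken = set()
    for x, y in tests:
        step(x, y, taken)
    return taken


# One test reaches every statement but only one branch edge.
one_test = branch_outcomes([(0, 1)])
# Adding a test with x >= y covers the second edge.
two_tests = branch_outcomes([(0, 1), (1, 0)])
print(sorted(one_test), sorted(two_tests))
```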

Page 4: Coverage Revisited

With some kinds of coverage, we expect to be able to reach 100% coverage

[Figure: input domain partitioned into blocks b1, b2, and b3.]

Input space partitioning

Page 5: Coverage Revisited

With other kinds of coverage, we know that reaching 100% is very difficult, perhaps completely infeasible

    if (x < y) {
        y = 0;
        x = x + 1;
    } else {
        x = y;
    }
    if (x < y) {
        y = 0;
        x = x + 1;
    }

[Figure: control-flow graph with nodes 1 to 6, formed by the if/else graph (nodes 1 to 4) followed by the if-only graph (nodes 4 to 6), with the same edge labels x < y and x >= y.]

Path coverage: exponential in the number of conditional branches!
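A Python sketch (added for illustration, not from the slides) of the path explosion. `syntactic_paths` is a hypothetical helper that lists every combination of branch outcomes for n consecutive conditionals, so its length is 2^n. `run` transcribes the two conditionals shown on this page and reports which path an input takes. Enumerating a small input range also suggests that not every syntactic path is feasible.

```python
from itertools import product


def syntactic_paths(num_conditionals):
    """Every combination of true/false outcomes for consecutive conditionals."""
    return list(product((True, False), repeat=num_conditionals))


def run(x, y):
    """Transcription of the two conditionals on the slide; returns the path taken."""
    path = []
    if x < y:
        path.append(True)
        y = 0
        x = x + 1
    else:
        path.append(False)
        x = y
    if x < y:
        path.append(True)
        y = 0
        x = x + 1
    else:
        path.append(False)
    return tuple(path)


# Paths actually observed for all inputs in a small range (an arbitrary bound).
observed = {run(x, y) for x, y in product(range(-3, 4), repeat=2)}
print(len(syntactic_paths(2)), len(syntactic_paths(10)), sorted(observed))
```

In this fragment the else branch sets x = y, so the second test x < y cannot succeed after it. That path is never observed, whatever input range is chosen.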

Page 6: Coverage Revisited

With other kinds of coverage, we know that reaching 100% is very difficult, perhaps completely infeasible

State space coverage

Page 7: Coverage Revisited

With other kinds of coverage, we know that reaching 100% is very difficult, perhaps completely infeasible

Complete input coverage

    4 operations
    × {0, 1, 2, 3, 4, …, 2^32}
    × {0, 1, 2, 3, 4, …, 2^32}
    × {0, 1, 2, 3, 4, …, 2^32}
    = TOO MUCH
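A back-of-the-envelope Python calculation (added for illustration) of how much “too much” is. It assumes each of the three arguments ranges over 2^32 values and that a test harness could run a billion tests per second. Both figures are assumptions, and the second is a deliberately optimistic, hypothetical rate.

```python
VALUES_PER_ARGUMENT = 2 ** 32      # assumption: each argument is a 32-bit value
OPERATIONS = 4                     # from the slide
TESTS_PER_SECOND = 10 ** 9         # hypothetical, very optimistic throughput

# 4 operations times three independent 32-bit arguments.
total_inputs = OPERATIONS * VALUES_PER_ARGUMENT ** 3

seconds = total_inputs / TESTS_PER_SECOND
years = seconds / (365 * 24 * 3600)
print(f"{total_inputs:.3e} inputs, roughly {years:.1e} years at 1e9 tests/s")
```

Under these assumptions the total is 2^98 inputs, which works out to on the order of 10^13 years of testing.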

Page 8: Coverage Revisited

What if we aim for exhaustive coverage, but arbitrarily limit the size somehow?
• Daniel Jackson: “small scope hypothesis”

“Our approach is simply to truncate the state space artificially, checking only within some finite bounds.” - Jackson and Damon, “Elements of Style: Analyzing a Software Design Feature with a Counterexample Detector”

Page 9: Small Scope Hypothesis

Remember “downward scalability”?

Idea related (in a hand-waving fashion) to the small model property in logics:
• For some logical formulas, though in principle the variables may have infinite domains, it can be shown that you only have to consider a finite set of individuals when checking for satisfiability
  • Bounded by length of the formula

Page 10: Small Scope Hypothesis

The idea in a nutshell:
• Most faults can be exposed by some “short” failing trace
• Short may mean in # of operations
• Short may mean in complexity of input structures
  • How many nodes in the red-black tree?
• Short may mean something else entirely
  • Number of voluntary thread context switches
  • Small flash device
  • Small # of different pathname components
  • Bounded pathname length

Page 11: Small Scope Hypothesis

Obvious dangerous exception: resource bound violations
• Can be handled by “shrinking” the resource bound
• May require “shrinking” types in a program
  • E.g., converting some ints to chars
• May be difficult, depending on program and language

Page 12: “All” Binary Trees of Size 3

[Figure: the five binary tree shapes on three nodes N0, N1, N2: a left-left chain, a right-right chain, a left-then-right zigzag, a right-then-left zigzag, and a root N0 with left child N1 and right child N2.]
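A minimal Python sketch (not from the slides) of enumerating “all” binary tree shapes of a given size. The representation is an assumption made for illustration: the empty tree is `None` and a node is a `(left, right)` tuple. Element values are deliberately left out.

```python
def shapes(n):
    """Yield every binary tree shape with exactly n nodes.

    The empty tree is None; a node is a (left, right) tuple of subtrees.
    """
    if n == 0:
        yield None
        return
    # Split the remaining n - 1 nodes between the left and right subtrees.
    for left_size in range(n):
        for left in shapes(left_size):
            for right in shapes(n - 1 - left_size):
                yield (left, right)


counts = [sum(1 for _ in shapes(n)) for n in range(5)]
print(counts)  # number of shapes for sizes 0 through 4
```

For size 3 this yields the five shapes pictured; the counts grow as the Catalan numbers, so the bound on size matters quickly.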

Page 13: “All” Binary Trees of Size 3

[Figure: the five binary tree shapes on three nodes N0, N1, N2: a left-left chain, a right-right chain, a left-then-right zigzag, a right-then-left zigzag, and a root N0 with left child N1 and right child N2.]

Do we care about the actual elements?

Page 14: “All” Binary Trees of Size 3

[Figure: the five binary tree shapes on three nodes N0, N1, N2: a left-left chain, a right-right chain, a left-then-right zigzag, a right-then-left zigzag, and a root N0 with left child N1 and right child N2.]

What if these were red-black trees?

In general: exploit isomorphisms
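One way to explore the red-black question, sketched in Python as an illustration rather than as the lecture's method. It enumerates shapes, tries every red/black coloring, and keeps only trees that satisfy the usual structural invariants (black root, no red node with a red child, equal black height on every path). Keys are ignored entirely, which is itself a use of isomorphism: trees that differ only in their stored elements are treated as one case. The tuple representations are assumptions made for the sketch.

```python
def shapes(n):
    """Yield every binary tree shape with n nodes; None is the empty tree."""
    if n == 0:
        yield None
        return
    for left_size in range(n):
        for left in shapes(left_size):
            for right in shapes(n - 1 - left_size):
                yield (left, right)


def colorings(shape):
    """Yield every red/black coloring of a shape as (color, left, right) tuples."""
    if shape is None:
        yield None
        return
    left, right = shape
    for color in ("R", "B"):
        for colored_left in colorings(left):
            for colored_right in colorings(right):
                yield (color, colored_left, colored_right)


def black_height(tree):
    """Return the black height if the invariants hold below this node, else None."""
    if tree is None:
        return 0
    color, left, right = tree
    for child in (left, right):
        if color == "R" and child is not None and child[0] == "R":
            return None  # red node with a red child
    left_height, right_height = black_height(left), black_height(right)
    if left_height is None or left_height != right_height:
        return None  # unequal black heights (or a violation further down)
    return left_height + (1 if color == "B" else 0)


def red_black_trees(n):
    """Yield colored trees of size n that satisfy the structural invariants."""
    for shape in shapes(n):
        for tree in colorings(shape):
            root_is_black = tree is None or tree[0] == "B"
            if root_is_black and black_height(tree) is not None:
                yield tree


valid = list(red_black_trees(3))
print(len(list(shapes(3))), len(valid))
```

Of the five shapes, only the balanced one survives at size 3, with its two children either both red or both black.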

Page 15: Enumerating “All” Inputs

For more complex data structures, enumerating all valid inputs may be a complex problem
• E.g., all “programs” (random ASCII sequences) of length 100 vs.
• All parseable C programs of length 100
  • May require enumeration by a constraint solver
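A toy Python illustration (not from the slides) of the gap between “all strings” and “all valid inputs”. Balanced parentheses stand in for parseable programs, and a simple filter stands in for the constraint solver; both substitutions are assumptions made to keep the sketch small.

```python
from itertools import product


def all_strings(alphabet, length):
    """Every string of the given length over the alphabet."""
    return ["".join(chars) for chars in product(alphabet, repeat=length)]


def is_valid(text):
    """Validity check for the toy language: parentheses must balance."""
    depth = 0
    for ch in text:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0


candidates = all_strings("()", 6)
valid_inputs = [s for s in candidates if is_valid(s)]
print(len(candidates), len(valid_inputs))
```

Generate-and-filter already wastes most of its effort at length 6 with a two-character alphabet. With a full ASCII alphabet and length 100 the waste is far worse, which is why generating only the valid inputs directly becomes attractive.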

Page 16: Enumerating “All” Inputs

Enumeration may require staging
• Constraint solver generates a set of abstract inputs, satisfying the constraints
• A postprocessor then concretizes each abstract input into a (large) set of concrete inputs
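A hedged Python sketch of the two stages, using a stack as a hypothetical system under test. Stage 1 is a plain filter standing in for a constraint solver: it keeps operation sequences that never pop an empty stack. Stage 2 fills each push with every value from a small domain. All names here are illustrative assumptions.

```python
from itertools import product


def abstract_sequences(length):
    """Stage 1: operation sequences satisfying the constraint 'never pop when empty'."""
    for ops in product(("push", "pop"), repeat=length):
        depth = 0
        ok = True
        for op in ops:
            depth += 1 if op == "push" else -1
            if depth < 0:
                ok = False
                break
        if ok:
            yield ops


def concretize(ops, values):
    """Stage 2: expand one abstract sequence into all concrete argument choices."""
    push_positions = [i for i, op in enumerate(ops) if op == "push"]
    for choice in product(values, repeat=len(push_positions)):
        arguments = dict(zip(push_positions, choice))
        # pop carries no argument, so its slot is None.
        yield [(op, arguments.get(i)) for i, op in enumerate(ops)]


abstract = list(abstract_sequences(3))
concrete = [test for ops in abstract for test in concretize(ops, (0, 1))]
print(len(abstract), len(concrete))
```

Three abstract sequences expand into sixteen concrete tests here, which shows how a small abstract set can fan out into a much larger concrete one.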

Page 17: Bounded (Depth) Model Checking

All states reachable with a path of no more than k transitions

Or in which no loop executes more than k times…

Page 18: Bounded (Depth) Model Checking

Can’t quite do the naïve thing:

    pan -m1000

Won’t guarantee reaching every state reachable in 1,000 steps

Why not?
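One likely answer, sketched in Python on a hypothetical four-state graph. The sketch assumes the search is depth-first and remembers visited states, as explicit-state checkers typically do; it is not a model of SPIN's actual implementation. If a state is first reached at the depth limit along a long path, it is marked visited and not expanded. When a shorter path reaches it later, the search skips it, so its successors are never seen even though they lie within the bound. A breadth-first search with the same bound does not have this problem.

```python
def bounded_dfs(start, successors, bound):
    """Depth-limited DFS with a visited set; may miss states within the bound."""
    visited = set()

    def visit(state, depth):
        if state in visited:
            return  # already seen, possibly at a greater depth
        visited.add(state)
        if depth == bound:
            return  # depth limit reached, do not expand
        for nxt in successors(state):
            visit(nxt, depth + 1)

    visit(start, 0)
    return visited


def bounded_bfs(start, successors, bound):
    """Breadth-first search: reaches every state within `bound` transitions."""
    seen = {start}
    frontier = [start]
    for _ in range(bound):
        next_frontier = []
        for state in frontier:
            for nxt in successors(state):
                if nxt not in seen:
                    seen.add(nxt)
                    next_frontier.append(nxt)
        frontier = next_frontier
    return seen


# Hypothetical graph: c is reachable in 1 step (a -> c) or 2 steps (a -> b -> c).
GRAPH = {"a": ["b", "c"], "b": ["c"], "c": ["d"], "d": []}
dfs_states = bounded_dfs("a", GRAPH.__getitem__, bound=2)
bfs_states = bounded_bfs("a", GRAPH.__getitem__, bound=2)
print(sorted(dfs_states), sorted(bfs_states))
```

State d is two transitions from a, yet the depth-first search never reaches it because c was first visited at depth 2.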

Page 19: Iterative Bounding

Typical approach is iterative
• Start with a small bound
• If you find a failing trace, fix the software or the specification and repeat
• If you can show no faults with bound k
  • Increase the bound and repeat until you can’t exhaustively test for the given bound
• Traditional search technique, iterative deepening
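The loop above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: inputs are tuples over a small domain, the bound is the tuple length, “can’t exhaustively test” is modeled as a per-bound test budget, and `buggy_max` is a hypothetical function under test with a planted fault.

```python
from itertools import product


def buggy_max(xs):
    """Hypothetical function under test: skips index 1 by mistake."""
    best = xs[0]
    for x in xs[2:]:
        if x > best:
            best = x
    return best


def iterative_bounding(check, domain, max_tests_per_bound):
    """Exhaustively test bounds 1, 2, 3, ... until a failure or the budget is hit."""
    k = 1
    while len(domain) ** k <= max_tests_per_bound:
        for candidate in product(domain, repeat=k):
            if not check(candidate):
                return ("failure", k, candidate)
        k += 1
    return ("no failure up to bound", k - 1, None)


result = iterative_bounding(
    lambda xs: buggy_max(xs) == max(xs),
    domain=(0, 1, 2),
    max_tests_per_bound=10_000,
)
print(result)
```

The planted fault needs an input of length 2, so bound 1 passes and bound 2 produces a short failing input, in the spirit of the small scope hypothesis.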

Page 20: Limitations to Scope/Bound

Stop increasing scope for one of two reasons:
• Difficulty of generating all inputs
  • Clever approaches can often deal with this
• Difficulty of executing all the test cases!
  • This one is more fundamental
  • With large enough k, exhaustive coverage becomes “exhausting”

Page 21: Evaluation

How does bounded exhaustive testing stack up against random testing?
• Hard to say in general
• Marinov, Andoni, Daniliuc, Khurshid, and Rinard at MIT tried to look at this question for some Java programs
  • “An Evaluation of Exhaustive Testing for Data Structures”

Page 22: Evaluation

Marinov et al.: let’s use mutation testing and kill rates to compare
• Generated all tests for limit k
• Generated all tests for limit k-1
• Compared the mutation kill rate for the complete k-1 tests to the rate for a random subset of the k tests
  • Subset same size as complete k-1 tests
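A toy Python sketch of the shape of this experiment; it is not the authors' setup. The function under test (sorting), the four hand-written mutants, the input domain, and the reading of “limit k” as “all inputs of length at most k” are all assumptions made for illustration.

```python
import random
from itertools import product


def original(xs):
    return sorted(xs)


# Hand-written mutants of `original` (hypothetical examples).
def mutant_no_sort(xs):
    return list(xs)


def mutant_reversed(xs):
    return sorted(xs, reverse=True)


def mutant_drop_duplicates(xs):
    return sorted(set(xs))


def mutant_sort_prefix(xs):
    return sorted(xs[:3]) + list(xs[3:])  # only wrong for length >= 4


MUTANTS = [mutant_no_sort, mutant_reversed, mutant_drop_duplicates, mutant_sort_prefix]


def all_tests(k, domain=(0, 1, 2)):
    """All tuples over `domain` of length at most k."""
    return [t for n in range(k + 1) for t in product(domain, repeat=n)]


def kill_rate(tests, mutants=MUTANTS):
    """Fraction of mutants distinguished from the original by at least one test."""
    killed = sum(any(m(t) != original(t) for t in tests) for m in mutants)
    return killed / len(mutants)


k = 4
tests_k = all_tests(k)
tests_k_minus_1 = all_tests(k - 1)
# Random subset of the k tests, same size as the complete k-1 suite.
random_subset = random.Random(0).sample(tests_k, len(tests_k_minus_1))

print(kill_rate(tests_k), kill_rate(tests_k_minus_1), kill_rate(random_subset))
```

In this toy setup the complete k-1 suite cannot kill the last mutant at all, while a random subset of the k tests may or may not, depending on the draw.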

Page 23: Evaluation

[Figure: the set of all K tests, the set of all K-1 tests, and a random subset of the K tests.]

Page 24: Evaluation

Benchmark       Scope   Kill-rate   Scope-1
SearchTree      7       99.26%      =
DisjSet         5       95.06%      =
HeapArray       7       95.99%      <
BinomialHeap    7       95.10%      <
FibonacciHeap   5       86.87%      >
LinkedList      7       99.59%      =
SortedList      7       96.40%      <
TreeMap         7       89.08%      <
HashSet         7       91.39%      <
AVTree          5       93.17%      >

Cases where bounded exhaustive testing beat random testing

Page 25: Evaluation

More interesting facts:
• Scope correlated highly with statement coverage
• Even after statement coverage was complete, increasing scope increased the mutant kill rate

Page 26: CHESS

Now we’ll look at one of the most interesting bounded exhaustive testing approaches
• Context bounding
• Idea: limit number of pre-emptive context switches by threads
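A small Python sketch of the idea of context bounding; it is an illustration only and makes no claim about how CHESS is implemented. Threads are modeled as counts of remaining atomic steps, and a preemption is counted whenever the scheduler switches away from a thread that still has steps left. Switching after a thread finishes is free.

```python
def schedules(steps, max_preemptions):
    """Yield every interleaving (as a tuple of thread ids) within the preemption bound.

    `steps` gives the number of atomic steps each thread must execute.
    """

    def go(remaining, current, preemptions, prefix):
        if not any(remaining):
            yield tuple(prefix)
            return
        for thread, left in enumerate(remaining):
            if left == 0:
                continue
            # Switching away from a thread that could still run is a preemption.
            preempting = (
                current is not None
                and thread != current
                and remaining[current] > 0
            )
            cost = 1 if preempting else 0
            if preemptions + cost > max_preemptions:
                continue
            after = list(remaining)
            after[thread] -= 1
            yield from go(tuple(after), thread, preemptions + cost, prefix + [thread])

    yield from go(tuple(steps), None, 0, [])


counts = [len(list(schedules((2, 2), bound))) for bound in range(3)]
print(counts)  # schedules explored with at most 0, 1, and 2 preemptions
```

For two threads of two steps each, bounds 0, 1, and 2 admit 2, 4, and all 6 interleavings respectively, so a small bound already prunes the schedule space while keeping the schedules with few preemptions.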

Page 27: Review

Before I hand over to Klaus for good…

Page 28: Topics in Testing We’ve Covered

• Black box (Finite State Machine) testing
• Design for testability
• Coverage measures
• Random testing
• Constraint-based testing
• Debugging and test case minimization
• Using model checkers for testing
• Coverage revisited (“small model property”)

Page 29: Topics in Testing We’ve Covered

Black box (Finite State Machine) testing
• There “are no Turing machines”
• Vasilevskii and Chow algorithm for conformance testing based on spanning trees and distinguishing sets
• Exhaustive testing that cannot miss bugs is often computationally intractable

[Figure: finite state machine diagram with labels a, a, b, d.]

Page 30: Topics in Testing We’ve Covered

Design for testability
• Controllability and observability
• Simulation and stubbing, assertions, downward scalability, etc.

Page 31: Topics in Testing We’ve Covered

Coverage measures
• Not necessarily correlated with fault detection!
• Still useful!
• Graph coverage: node and edge (statement and branch coverage)
• Logic coverage
• Input space partitioning
• Syntax-based coverage

[Figures: the if/else control-flow graph with nodes 1 to 4 and edge labels x < y and x >= y; an input domain partitioned into blocks b1, b2, b3; and the example predicate ((a <= b) && !G) || (x >= y).]

Page 32: Topics in Testing We’ve Covered

Random testing
• Generate inputs at random
• Explore very large numbers of executions
• Relies on a good automatic test oracle
• Feedback to bias choices away from redundant and irrelevant inputs is useful
• Good baseline for evaluating other methods, and often very effective

Page 33: Topics in Testing We’ve Covered

Constraint-based testing
• Addresses weaknesses of random testing
  • E.g., finding needles in haystacks, such as where hash(x) = y
• Combines concrete and symbolic execution to generate inputs
• Concrete execution helps where symbolic solvers choke

Page 34: Topics in Testing We’ve Covered

Debugging and test case minimization
• Automatic minimization of test cases is very valuable for debugging and reducing regression suite size
• Debugging can be considered as an application of the scientific method
• Various techniques exist for using test cases to localize faults
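As a reminder of what automatic minimization looks like, here is a deliberately simple Python sketch. It is a greedy one-element-at-a-time reducer, not the delta debugging algorithm, and the failure predicate is a hypothetical stand-in for running the real test.

```python
def minimize(test, fails):
    """Greedily drop elements while the shortened test still fails.

    Returns a test from which no single element can be removed without
    making the failure disappear.
    """
    assert fails(test), "minimization needs a failing test to start from"
    current = list(test)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if fails(candidate):
                current = candidate
                changed = True
                break  # restart the scan on the shorter test
    return current


def fails(test):
    """Hypothetical failure: triggered only when both 3 and 7 are present."""
    return 3 in test and 7 in test


minimal = minimize([1, 3, 5, 7, 9], fails)
print(minimal)
```

The reduced test keeps only the elements needed to reproduce the failure, which is what makes it easier to debug and cheaper to keep in a regression suite.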

Page 35: Topics in Testing We’ve Covered

Using model checkers for testing
• Testing based on states, rather than on executions or paths
• Use abstractions to reduce state space
• Use automatic instrumentation to handle the engineering difficulties

Page 36: Any Questions???