Testing - Queen's University
Software Testing
CISC 323, Winter 2006
Prof. Lamb Prof. [email protected] [email protected]
2
Included in Courseware
• Reference material for the lectures
– 19 pages from “Software Testing, A Craftsman’s Approach”, by Paul C. Jorgensen
3
All Testing is Sampling
• How can we sample in such a way that it is
  – Efficient
  – Effective
• Another problem specifically for scientific and engineering software
  – Lack of accurate and detailed oracles
    • Often use the TLAR method (That Looks About Right!)
Illustration of difficulties: next slides
4
5
6
7
Testing Techniques
• Purpose
  – To provide reasoned ways of generating test cases
  – To produce test cases systematically
    • Might allow some sort of test case automation
  – To help in reproducibility of the tests
  – To address the “when are we done?” problem
    • Gives some level of confidence that enough bugs have been found
  – To allow better planning
  – To allow the use of different techniques to address different issues
  – To provide data for debugging
8
Testing Techniques
• Black box testing
  – Also known as functional testing
  – Testing without opening the software box
  – We’ll look at examples of testing by considering the input data
• White box testing
  – Also known as structural testing
  – Testing by considering what the code looks like
  – We’ll look at examples of code coverage
9
Boundary Value Testing (Black Box Testing)
• Focuses on the boundaries of the input space to identify test cases
  – Based on the finding that errors seem to cluster at boundaries
    • E.g., code checks for <= instead of <
• Assumes
  – Software failures are rarely the result of the simultaneous occurrence of two or more faults
• Test cases obtained by
  – Holding the values of all but one input variable at their nominal values and letting that one variable assume its extreme values
    • Minimum, just above minimum, nominal, just below maximum, maximum
10
Boundary Value Testing Example
[Figure: boundary value test points for two input variables; axes X1 and X2, with range labels a, b, c, d]
For a function of n variables, there are 4n+1 test cases
11
Robustness Testing
• Extension of boundary value testing
  – See what happens when the extrema are exceeded
    • Value slightly greater than maximum
    • Value slightly less than minimum
  – Forces attention on exception handling
    • What happens or should happen when values fall outside their ranges
12
Weaknesses of Boundary Value and Robustness Testing
• Assumption of independent failures
• No consideration of interactions between input values
• Meaning of the variables not taken into account
• Example
  – {month, day}
    • (02,31) doesn’t make semantic sense, yet this may not be tested by this technique
13
Worst Case Testing
• Reject the single fault assumption
  – Investigate what happens when more than one variable at a time is assigned an extreme value
• For a function of n variables
  – Generates 5^n test cases
• Limitations
  – No consideration of interactions between input values
  – Meaning of the variables not taken into account
14
Special Value Testing
• Tester uses his/her domain knowledge to identify risky parts of the software system
  – Technique that is most dependent on the skills of the tester
  – No guidelines – use best engineering judgment
  – Can be used to “fill in” extra test cases not generated by any of the systematic techniques
15
Error in Courseware
• Software Testing, A Craftsman’s Approach
• Page 62, Table 1
  – What’s the goof?
16
Equivalence Class Testing (Black Box Testing)
• Technique
  – Divide the input domain into classes
    • So that the requirements and/or design specifications require exactly the same behavior from each value in the class
  – Assumption
    • The program is constructed so that it either succeeds or fails for each of the values in that class
  – For robustness
    • For each equivalence class of valid inputs, we devise equivalence classes of invalid inputs
17
Equivalence Class Testing
• Technique (cont’d)
  – Assume the equivalence classes are disjoint
  – Construct tests choosing one value from each appropriate equivalence class
    • Values are usually drawn at random each time the test is run
  – An equivalence class is considered covered if at least one value has been selected from it
    • There is research that says this is sufficient for effective testing
  – The goal is to cover all equivalence classes
18
Equivalence Class Testing Example – Registry of Your Pet
• Fields:
  – Classification:
    • Dog
      – age, breed, colour, M/F, neutered
    • Cat
      – age, colour, M/F, neutered
    • Exotic:
      – Mammal: species, age, restricted, endangered, origin
      – Other: Phylum, Class, Order, Family, Genus, Species, restricted, endangered, origin
19
Equivalence Class Testing Example – Registry of Your Pet
• Possible Valid Equivalence Classes
  – Classification:
    • Dog, Cat, Exotic
  – Age:
    • 0-200
  – Breed
    • Dalmatian, boxer, poodle, …
  – Colour
    • Black, white, brown, gray, calico, tiger, …
  – M/F
  – Neutered – Y/N
  – Exotic – Mammal, other
  – Mammal Species
    • Monkey, ape, tiger, cheetah, lemur, wolf, raccoon, harbour seal, …
  – Restricted – Y/N
  – Endangered – Y/N
  – Origin
    • Antarctica, Algeria, Australia, Brazil, …
  – Phylum – alpha (0-40)
  – Class – alpha (0-40)
  – Order – alpha (0-40)
  – Family – alpha (0-40)
  – Genus – alpha (0-40)
  – Species other – alpha (0-40)
• Corresponding Invalid Equivalence Classes …?
20
Equivalence Class Testing
• Limitations
  – Assumes all values within an equivalence class are equally likely to cause failure
  – Assumes the implementation of the class conforms to its specification
    • If the implementation does not match the requirement/design as specified (which it will not if a defect exists), then some values in the equivalence class will produce a different actual behavior than expected
      – Those values may not be chosen during equivalence class testing
21
Equivalence Class Testing
• There is controversy over whether random values drawn from the whole input space are as effective at finding defects as random values chosen based on equivalence classes
22
Testing Techniques based on Code Coverage (White Box)
• Statement coverage
  – Touch each statement once
• Branch coverage
  – Execute each branch after a condition statement
    • Cause each condition to be set to true, then set to false
• Path coverage
  – Execute all paths through the code
    • Generally infeasible in all but simplest cases
• Basis path coverage
  – A compromise
23
Example of Statement, Branch, and Path Coverage
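int pos = 0, even = 0;   /* counters; assumed to be globals, not shown on the slide */

void tst(int x)
{
    if (x > 0)
        pos = pos + 1;
    if (x % 2 == 0)
        even = even + 1;
    return;
}
• Tests required for:
  – Statement Coverage: tst(2)
  – Branch Coverage: tst(-1), tst(2)
  – Path Coverage: tst(-2), tst(-1), tst(1), tst(2)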
24
Use of Statement Coverage for “When are we done?” Problem
• Weakest of the three coverage types (statement, branch, path) for generating effective test cases
• Easiest to support with a tool
  – Many tools available to track statement coverage
    • Tools add “instrumentation” to your code to track which statements have been executed
      – Could slow your code down unacceptably
• After all your tests are run, tools report on total code coverage
  – Often used to decide adequate testing
    • Even for this, 100% coverage may not be feasible
      – May have special conditions, dead code, error routines
• Best use is to find what parts of the code have not been tested at all
  – It may be more cost effective to inspect uncovered parts than it is to cover them by testing
25
100% Coverage?
• Even 100% coverage does not guarantee that software is fault free or reliable
• The Law of Diminishing Returns
  – The more a program has been tested, the less a given test set can further contribute to the test adequacy.
26
Example Code with two Branches (part of Myers Triangle)
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(int argc, char *argv[])
{
    int sideA;
    int sideB;
    int sideC;
    double s;
    double Area;

    sideA = atoi(argv[1]);
    sideB = atoi(argv[2]);
    sideC = atoi(argv[2]);

    if ((sideA == sideB) && (sideA == sideC)) {
        s = 0.5 * (sideA + sideB + sideC);
        Area = sqrt(s / (s - sideA) * (s - sideB) * (s - sideC));
        printf("area = %g\n", Area);
    } else {
        puts("not an equilateral triangle");
    }
    return 0;
}
27
Branch Coverage: 2 test cases needed
• To exercise the TRUE branch
  – use inputs
    • Side A = 2
    • Side B = 2
    • Side C = 2
  – output
    • area = 1.73205
• To exercise the FALSE branch
  – use inputs
    • Side A = 3
    • Side B = 4
    • Side C = 5
  – output
    • not an equilateral triangle
28
Two Errors not Caught by Test Cases
• Wrong operand to read in sideC
  – should be argv[3]
• Heron’s formula for area of a triangle
  – $\text{Area} = \sqrt{s\,(s - \text{sideA})\,(s - \text{sideB})\,(s - \text{sideC})}$
  – code has division operator instead of multiplication
  – 2 is the only side length for which the wrong equation gives the right answer
29
Flowgraphs and Path testing
• Flowgraph
  – graphical representation of program’s control structure
• Path
  – follows the edges of a flowgraph
• Path Selection Criteria - examples:
  – every path from entry to exit (path coverage)
    • impractical for most programs
  – selected paths
    • e.g. each loop 0, once, maxcount times
  – all paths up to some length
30
McCabe Basis Path Coverage
• Select test cases that follow the control flow paths defined for the McCabe complexity measure
• McCabe’s conjecture was that such test cases would provide ‘adequate’ coverage
  – It doesn’t
• Tools are available to support the McCabe approach
  – Claims that if more than 10 tests needed for a module, then the module is error prone
    • No empirical evidence to support this claim
31
McCabe Basis Path Coverage
• Follow sample code on page 18 of Courseware, extract from Jorgensen
• We will draw the McCabe flow graph on the board
32
McCabe Basis Path Coverage
• Number of test cases required is
$V(G) = e - n + 2p$
where
  e = number of edges on the graph
  n = number of nodes on the graph
  p = number of modules
OR (number of IF statements) + 1
33
Issues in Computational Software
• From “Numerical Recipes in … - The Art of Scientific Computing”, by William H. Press et al, Cambridge University Press, 1996
• Chapter 1, section on Error, Accuracy, and Stability
– Extremely difficult to test for especially when you don’t know exactly what the answer should be …
34
Machine Accuracy
• Numbers are stored in some approximation that can be packed into a fixed number of bits
• A number in integer representation is exact
• Arithmetic in integer representation is exact as long as
  – Answer is not outside the range that can be represented
  – Division is understood to throw away the integer remainder
35
Machine Accuracy
• Floating point numbers are represented as
$s \times M \times B^{e - E}$
where
  s is a sign bit
  e is an exact integer exponent
  M is an exact positive integer mantissa
  B is the base of the representation (2, 8 or 16)
  E is a fixed integer bias for any given machine
36
Machine Accuracy
• Arithmetic for floating point numbers is not exact
• E.g., two floating point numbers are added by right-shifting the mantissa of the smaller one and increasing its exponent until the two operands have the same exponent
  – Low order bits of the smaller number are lost by this shifting
  – If the operands differ too greatly in magnitude
    • Smaller number is right-shifted into oblivion
37
Machine Accuracy
• The smallest floating point number which, when added to 1.0, produces a floating point result different from 1.0 is called the machine accuracy, $\epsilon_m$
• A typical computer with B=2 and a 32 bit word length has $\epsilon_m$ around $3 \times 10^{-8}$
• $\epsilon_m$ depends on how many bits there are in the mantissa
38
Round Off Error
• Any arithmetic operation involving floating point numbers introduces a fractional error of at least $\epsilon_m$
• Round off errors accumulate with increasing amounts of calculation
• The error can accumulate preferentially in one direction
  – With N operations, the total error can then be of order $N \epsilon_m$
39
Round Off Error
• Subtraction of two nearly equal numbers
  – Correct result dependent on a few low order bits
• Example: solution of a quadratic equation
$x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
• Problem when $ac \ll b^2$
40
Correct Solution
Define $q = -\frac{1}{2}\left[\,b + \operatorname{sgn}(b)\sqrt{b^2 - 4ac}\,\right]$
The roots are
$x_1 = q/a$
$x_2 = c/q$
41
Instability
• Algorithms that compute a discrete approximation to a continuous quantity
• An algorithm can be unstable if the round off error gets mixed into the calculation at an early stage
  – Round off is successively magnified until it swamps the true answer
42
Instability Example
Golden mean given by
$\phi = (\sqrt{5} - 1)/2 \approx 0.61803398$
Powers $\phi^n$ can be computed by the recurrence algorithm
$\phi^{n+1} = \phi^{n-1} - \phi^n$
But this has another undesired solution of
$-\frac{1}{2}(\sqrt{5} + 1)$
Algorithm starts giving wrong answers around n=16 on a machine with 32-bit word length
43
How to deal with this?
• Identify tricky areas and design test cases specifically to exercise those areas
  – Use engineering judgment to devise special value testing
• Code with the knowledge of machine floating point arithmetic
• Use algorithms that are tried and tested
  – Find standard algorithms in textbooks
  – Use well-established math libraries
• Have someone with numerical techniques knowledge inspect your code