Post on 31-Dec-2015
Regression Testing
So far:
  Unit testing
  System testing
  Test coverage
All of these are about the first round of testing
Testing is performed from time to time during the software life cycle
Test cases / oracles can be reused in all rounds
Testing during the evolution phase is regression testing
Regression Testing
When we try to enhance the software, we may also introduce bugs
When the software worked yesterday but does not work today, it is called a “regression”
Numbers: empirical study on Eclipse (2005)
  11% of commits are bug-inducing
  24% of fixing commits are bug-inducing
Regression Example
Original version:
public int[] reverse(int[] origin) {
    int[] target = new int[origin.length];
    int index = 0;
    while (index < origin.length - 1) {
        index++;
        target[origin.length - 1 - index] = origin[index];
    }
    return target;
}
// Bug: origin[0] is never copied

Fixed version:
public int[] reverse(int[] origin) {
    int[] target = new int[origin.length];
    int index = 0;
    while (index < origin.length - 1) {
        index++;
        target[origin.length - 1 - index] = origin[index];
    }
    target[origin.length - 1] = origin[0];
    return target;
}
// Regression: now crashes when the length of origin is 0
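A minimal sketch of how re-running old inputs on the new version exposes this regression. The class name ReverseRegression and the method names reverseV1, reverseV2 and crashesOnEmpty are illustrative assumptions. Both versions write to target[origin.length - 1 - index], which is assumed here so that the rest of the array is reversed correctly.

```java
import java.util.Arrays;

class ReverseRegression {
    // Version 1: never copies origin[0], so the last slot of target stays 0.
    static int[] reverseV1(int[] origin) {
        int[] target = new int[origin.length];
        int index = 0;
        while (index < origin.length - 1) {
            index++;
            target[origin.length - 1 - index] = origin[index];
        }
        return target;
    }

    // Version 2: copies origin[0], but does so unconditionally.
    static int[] reverseV2(int[] origin) {
        int[] target = new int[origin.length];
        int index = 0;
        while (index < origin.length - 1) {
            index++;
            target[origin.length - 1 - index] = origin[index];
        }
        target[origin.length - 1] = origin[0]; // fails when origin is empty
        return target;
    }

    // Returns true if version 2 throws on an empty array.
    static boolean crashesOnEmpty() {
        try {
            reverseV2(new int[0]);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(reverseV1(new int[] {1, 2, 3})));
        System.out.println(Arrays.toString(reverseV2(new int[] {1, 2, 3})));
        System.out.println(Arrays.toString(reverseV1(new int[0])));
        System.out.println("V2 crashes on empty input: " + crashesOnEmpty());
    }
}
```

An old test case that passed an empty array to version 1 would have succeeded, so keeping it in the suite is what reveals the regression in version 2.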
Regression Testing
Run old test cases on the new version of the software
Running the whole suite each time is costly
Try to save time and cost in new rounds of testing:
  Test prioritization
  Test relevant code
  Record and replay
Test prioritization
Rank all the test cases
Run test cases according to the ranked sequence
Stop when resources are used up
How to rank test cases:
  To discover bugs sooner
  Or, as an approximation, to achieve higher coverage sooner
APFD: Measurement of Test Prioritization
Average Percentage of Faults Detected (APFD)
Compares two test case sequences
After each test case, a cumulative number of faults (bugs) has been detected
Which of the following two sequences is better?
  S1: T1 (2), T2 (3), T3 (5)
  S2: T2 (1), T1 (3), T3 (5)
APFD is the average of these numbers (normalized by the total number of faults), with 0 for the initial state
APFD(S1) = (0/5 + 2/5 + 3/5 + 5/5) / 4 = 0.5
APFD(S2) = (0/5 + 1/5 + 3/5 + 5/5) / 4 = 0.45
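The calculation can be sketched in a few lines, following the simplified definition used here (average of the normalized cumulative fault counts, including a 0 for the initial state). The class and method names are illustrative assumptions.

```java
class ApfdSketch {
    // cumulativeFaults[i] = number of faults detected after the first i + 1 test cases.
    static double apfd(int[] cumulativeFaults, int totalFaults) {
        double sum = 0.0; // the initial state contributes 0 / totalFaults
        for (int detected : cumulativeFaults) {
            sum += (double) detected / totalFaults;
        }
        // One extra term in the average for the initial state.
        return sum / (cumulativeFaults.length + 1);
    }

    public static void main(String[] args) {
        System.out.println("APFD(S1) = " + apfd(new int[] {2, 3, 5}, 5));
        System.out.println("APFD(S2) = " + apfd(new int[] {1, 3, 5}, 5));
    }
}
```

S1 scores higher than S2 because it detects more faults after the first test case, so S1 is the better ordering.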
APFD: Illustration
APFD can be viewed as the area under the test-case / fault curve
Consider T1 (f1, f2), T2 (f3), T3 (f3), T4 (f1, f2, f3, f4)
Coverage-based test case prioritization
Code-coverage based: requires code-coverage information recorded in previous testing
Combination-coverage based: requires an input model
Mutation-coverage based: requires recorded mutation-killing statistics
Total Strategy
The simplest strategy
Always select the unselected test case that has the best coverage
Example
Consider code coverage on five test cases:
T1: s1, s3, s5
T2: s2, s3, s4, s5
T3: s3, s4, s5
T4: s6, s7
T5: s3, s5, s8, s9, s10
Ranking: T5, T2, T1 / T3 (tie), T4
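A minimal sketch of the total strategy on this example. The class name TotalStrategy and the helpers set and example are illustrative assumptions; ties keep the original order of the test cases.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TotalStrategy {
    static Set<String> set(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    // Coverage data from the example above.
    static Map<String, Set<String>> example() {
        Map<String, Set<String>> c = new LinkedHashMap<>();
        c.put("T1", set("s1", "s3", "s5"));
        c.put("T2", set("s2", "s3", "s4", "s5"));
        c.put("T3", set("s3", "s4", "s5"));
        c.put("T4", set("s6", "s7"));
        c.put("T5", set("s3", "s5", "s8", "s9", "s10"));
        return c;
    }

    // Sorts test cases by the number of covered statements, largest first.
    // List.sort is stable, so tied test cases keep their input order.
    static List<String> rank(Map<String, Set<String>> coverage) {
        List<String> order = new ArrayList<>(coverage.keySet());
        order.sort((a, b) -> Integer.compare(coverage.get(b).size(), coverage.get(a).size()));
        return order;
    }

    public static void main(String[] args) {
        System.out.println(rank(example()));
    }
}
```

On the example data this yields T5, T2, T1, T3, T4, where T1 and T3 are interchangeable because both cover three statements.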
Additional Strategy
An adaptation of the total strategy
Instead of always choosing the test case with the highest coverage, choose the test case that results in the most extra coverage
Start from the test case with the highest coverage
Example
Consider code coverage on five test cases:
T1: s1, s3, s5
T2: s2, s3, s4, s5
T3: s3, s4, s5
T4: s6, s7
T5: s3, s5, s8, s9, s10
Ranking: T5 (+5), T2 (+2: s2, s4) / T4 (+2: s6, s7) (tie), T1 (+1: s1), T3 (+0)
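A minimal sketch of the additional strategy on the same coverage data. The class name AdditionalStrategy and the helpers are illustrative assumptions; when two test cases add the same number of new statements, the one listed first is chosen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AdditionalStrategy {
    static Set<String> set(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    static Map<String, Set<String>> example() {
        Map<String, Set<String>> c = new LinkedHashMap<>();
        c.put("T1", set("s1", "s3", "s5"));
        c.put("T2", set("s2", "s3", "s4", "s5"));
        c.put("T3", set("s3", "s4", "s5"));
        c.put("T4", set("s6", "s7"));
        c.put("T5", set("s3", "s5", "s8", "s9", "s10"));
        return c;
    }

    // Greedy loop: pick the test case that adds the most uncovered statements.
    static List<String> rank(Map<String, Set<String>> coverage) {
        List<String> order = new ArrayList<>();
        Set<String> covered = new HashSet<>();
        Set<String> remaining = new LinkedHashSet<>(coverage.keySet());
        while (!remaining.isEmpty()) {
            String best = null;
            int bestGain = -1;
            for (String t : remaining) {
                Set<String> gain = new HashSet<>(coverage.get(t));
                gain.removeAll(covered);
                if (gain.size() > bestGain) { // strict '>' keeps the earlier test on ties
                    best = t;
                    bestGain = gain.size();
                }
            }
            order.add(best);
            covered.addAll(coverage.get(best));
            remaining.remove(best);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(rank(example()));
    }
}
```

With this tie-breaking rule the order is T5, T2, T4, T1, T3. Picking T4 before T2 would be equally valid, since both add two new statements after T5.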
Combination-coverage based prioritization
Use combination coverage instead of code coverage
Total strategy does not work for combination coverage, why?
Use the additional strategy (for n-wise combinations)
Example input model: (coke, sprite), (icy, normal), (receipt, not)
Test cases: {coke, icy, not}, {coke, normal, not}, {sprite, icy, receipt}, {sprite, normal, receipt}
Ranking for 2-wise prioritization: {coke, icy, not}, {sprite, icy, receipt} (+3), {coke, normal, not} (+2), {sprite, normal, receipt} (+2)
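A minimal sketch of 2-wise additional prioritization on this input model. The class name PairwisePrioritization, the EXAMPLE array and the string encoding of a pair are illustrative assumptions. It also hints at why the total strategy is not useful here: every complete test case covers the same number of pairs, so a total ranking cannot tell them apart.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PairwisePrioritization {
    static final String[][] EXAMPLE = {
        {"coke", "icy", "not"},
        {"coke", "normal", "not"},
        {"sprite", "icy", "receipt"},
        {"sprite", "normal", "receipt"},
    };

    // All 2-wise combinations (parameter index plus value) covered by one test case.
    static Set<String> pairs(String[] test) {
        Set<String> result = new HashSet<>();
        for (int i = 0; i < test.length; i++) {
            for (int j = i + 1; j < test.length; j++) {
                result.add(i + "=" + test[i] + "," + j + "=" + test[j]);
            }
        }
        return result;
    }

    // Additional strategy: repeatedly pick the test case adding the most uncovered pairs.
    // Returns indices into the tests array; ties keep the earlier test case.
    static List<Integer> rank(String[][] tests) {
        List<Integer> order = new ArrayList<>();
        Set<String> covered = new HashSet<>();
        boolean[] used = new boolean[tests.length];
        for (int round = 0; round < tests.length; round++) {
            int best = -1;
            int bestGain = -1;
            for (int t = 0; t < tests.length; t++) {
                if (used[t]) continue;
                Set<String> gain = pairs(tests[t]);
                gain.removeAll(covered);
                if (gain.size() > bestGain) {
                    best = t;
                    bestGain = gain.size();
                }
            }
            used[best] = true;
            covered.addAll(pairs(tests[best]));
            order.add(best);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(rank(EXAMPLE));
    }
}
```

The result is the index order 0, 2, 1, 3, which corresponds to the ranking given above.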
Combination-coverage based prioritization
Multi-wise coverage based prioritization
Problem: it may not be reasonable to consider only combinations of one fixed N; after {coke, icy, not}, (sprite, normal, receipt) > (sprite, icy, receipt)
Multi-wise prioritization:
  Select the test case with the best additional 1-wise coverage
  If there is a tie, go to 2-wise, and then 3-wise, …
Results: {coke, icy, not}, {sprite, normal, receipt} (1-wise +3), {coke, normal, not} (2-wise +2, 3-wise +1), {sprite, icy, receipt} (2-wise +2, 3-wise +1)
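A minimal sketch of multi-wise prioritization, assuming all test cases have the same number of parameters. The gain of a candidate is a vector (new 1-wise, new 2-wise, new 3-wise, …) compared from left to right. All names are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MultiWisePrioritization {
    static final String[][] EXAMPLE = {
        {"coke", "icy", "not"},
        {"coke", "normal", "not"},
        {"sprite", "icy", "receipt"},
        {"sprite", "normal", "receipt"},
    };

    // Adds all n-wise combinations (parameter index plus value) of a test case to out.
    static void combos(String[] test, int n, int start, String prefix, Set<String> out) {
        if (n == 0) {
            out.add(prefix);
            return;
        }
        for (int i = start; i <= test.length - n; i++) {
            combos(test, n - 1, i + 1, prefix + i + "=" + test[i] + ";", out);
        }
    }

    // gain[n - 1] = number of n-wise combinations of the test case not yet covered.
    static int[] gains(String[] test, Set<String> covered) {
        int[] gain = new int[test.length];
        for (int n = 1; n <= test.length; n++) {
            Set<String> c = new HashSet<>();
            combos(test, n, 0, "", c);
            c.removeAll(covered);
            gain[n - 1] = c.size();
        }
        return gain;
    }

    // Lexicographic comparison: 1-wise gain first, then 2-wise, and so on.
    static int lexCompare(int[] a, int[] b) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) return Integer.compare(a[i], b[i]);
        }
        return 0;
    }

    static List<Integer> rank(String[][] tests) {
        List<Integer> order = new ArrayList<>();
        Set<String> covered = new HashSet<>();
        boolean[] used = new boolean[tests.length];
        for (int round = 0; round < tests.length; round++) {
            int best = -1;
            int[] bestGain = null;
            for (int t = 0; t < tests.length; t++) {
                if (used[t]) continue;
                int[] g = gains(tests[t], covered);
                if (bestGain == null || lexCompare(g, bestGain) > 0) {
                    best = t;
                    bestGain = g;
                }
            }
            used[best] = true;
            for (int n = 1; n <= tests[best].length; n++) {
                combos(tests[best], n, 0, "", covered);
            }
            order.add(best);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(rank(EXAMPLE));
    }
}
```

On the example this gives the index order 0, 3, 1, 2: {sprite, normal, receipt} moves to second place because it adds three new 1-wise values, and the last two test cases remain tied at every level.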
Mutation-coverage based prioritization
Similar to code-coverage based prioritization
Run mutation testing for the test suite
Use the mutants killed by each test case as the criterion
Works with both the total and the additional strategy
Setting the threshold
Prioritization helps us find bugs earlier
Due to resource limits, we do not want to execute all test cases
Testing should stop at some point in the prioritized list:
  Resource limit: money, time
  Coverage based:
    Cover all / a certain percentage of statements
    Cover all / a certain percentage of n-wise combinations
    Cover all / a certain percentage of mutants
Test Relevant Code
Basic Idea: Only use test cases that cover the changed code
Can be combined with test prioritization
Give more priority to the test cases that cover more code affected by the change
Determine the affected code with program slicing
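A minimal sketch of selecting and ranking test cases by how much affected code they cover. The test names tA, tB, tC, the statement ids and the class name RelevantTests are hypothetical. The set of affected statements is assumed to be given, for instance by a slicing tool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RelevantTests {
    static Set<String> set(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    // Keeps only test cases that cover at least one affected statement,
    // ordered by the number of affected statements covered (most first).
    static List<String> select(Map<String, Set<String>> coverage, Set<String> affected) {
        Map<String, Integer> overlap = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : coverage.entrySet()) {
            Set<String> hit = new HashSet<>(e.getValue());
            hit.retainAll(affected);
            if (!hit.isEmpty()) {
                overlap.put(e.getKey(), hit.size());
            }
        }
        List<String> order = new ArrayList<>(overlap.keySet());
        order.sort((a, b) -> Integer.compare(overlap.get(b), overlap.get(a)));
        return order;
    }

    // Hypothetical coverage data for illustration only.
    static Map<String, Set<String>> example() {
        Map<String, Set<String>> c = new LinkedHashMap<>();
        c.put("tA", set("s1", "s2", "s3"));
        c.put("tB", set("s4"));
        c.put("tC", set("s2", "s5"));
        return c;
    }

    public static void main(String[] args) {
        System.out.println(select(example(), set("s2", "s3")));
    }
}
```

Here tB is dropped because it covers no affected statement, and tA is ranked before tC because it covers two affected statements instead of one.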
Which test case is better?
Consider the following change and test cases
void main() {
    int sum, i;
    sum = 0;    // changed to: sum = 1;
    i = read;
    if (i >= 12) {
        String rep = report(invalid, i);
        sendReport(rep);
    } else {
        while (i < 11) {
            sum = add(sum, i);
            i = add(i, 1);
        }
    }
}
Test case: 0
Test case: 13
Test case 0 is better because it covers more code in the forward slice
Program slicing
Observation: the more code affected by a change a test case covers, the more likely its results are to change
Only test the parts that are related to the revision
Program slicing: locating all parts of the code base that will be affected by the value of a variable
Program slicing
Forward slice of variable v at statement s: all the code that depends, through control or data dependencies, on v at statement s
Backward slice of variable v at statement s: all the code that v at statement s depends on (through control or data dependencies)
Data Dependencies
A data dependency runs from a use of a variable to the definition of that variable
Example:
s1: x = 3;
s2: if (y > 5) {
s3:     y = y + x;  // data dependent on x in s1
s4: }
Control Dependencies
A control dependency runs from the basic blocks of a branch to the predicate that controls them
Example:
s1: x = 3;
s2: if (y > 5) {
s3:     y = y + x;  // control dependent on the predicate y > 5 in s2
s4: }
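Once data and control dependencies are stored as edges of a graph, forward and backward slices reduce to reachability. The sketch below uses the s1..s3 example with two edges, s1 -> s3 (data) and s2 -> s3 (control). The class name SliceSketch and the edge representation are illustrative assumptions, and the slice is taken to include the starting statement itself.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SliceSketch {
    // edges.get(a) lists the statements that directly depend on a (data or control).
    static Map<String, List<String>> example() {
        Map<String, List<String>> edges = new HashMap<>();
        edges.put("s1", Arrays.asList("s3")); // s3 uses x defined in s1
        edges.put("s2", Arrays.asList("s3")); // s3 executes only if the predicate in s2 holds
        return edges;
    }

    // Forward slice: everything reachable from start along dependence edges.
    static Set<String> forwardSlice(Map<String, List<String>> edges, String start) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            String s = work.pop();
            if (seen.add(s)) {
                for (String next : edges.getOrDefault(s, Collections.<String>emptyList())) {
                    work.push(next);
                }
            }
        }
        return seen;
    }

    // Backward slice: forward reachability on the reversed graph.
    static Set<String> backwardSlice(Map<String, List<String>> edges, String start) {
        Map<String, List<String>> reversed = new HashMap<>();
        for (Map.Entry<String, List<String>> e : edges.entrySet()) {
            for (String to : e.getValue()) {
                reversed.computeIfAbsent(to, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        return forwardSlice(reversed, start);
    }

    public static void main(String[] args) {
        System.out.println("forward(s1)  = " + forwardSlice(example(), "s1"));
        System.out.println("backward(s3) = " + backwardSlice(example(), "s3"));
    }
}
```

The forward slice of s1 contains s1 and s3, while the backward slice of s3 contains s3, s1 and s2.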
Example: call-site -> actual arguments
void main() {
    int sum, i;
    sum = 0;
    i = 1;
    while (i < 11) {
        sum = add(sum, i);
        i = add(i, 1);
    }
}
[Figure: dependence graph of main]
entry: main
  expression: sum=0
  expression: i=1
  control-point: while i<11
    call-site: add$0
      actual-in: sum$0
      actual-in: i$0
      actual-out: add$0
    expression: sum=add$0
    call-site: add$1
      actual-in: i$1
      actual-in: 1
      actual-out: add$1
    expression: i=add$1
Example: program slicing
static int add(int a, int b) {
    return a + b;
}
[Figure: dependence graph of add]
entry: add
  formal-in: a
  formal-in: b
  expression: add$result=a+b
  formal-out: add$result
Example: Inter-Procedure
sum = add(sum, i);
i = add(i, 1);
static int add(int a, int b) {
    return a + b;
}
[Figure: nodes linking the two call sites in main to the entry of add]
call-site: add$0
  actual-in: sum$0
  actual-in: i$0
  actual-out: add$0
call-site: add$1
  actual-in: i$1
  actual-in: 1
  actual-out: add$1
entry: add
  formal-in: a
  formal-in: b
  formal-out: add$result
Example: Full dependence graph
[Figure: full dependence graph combining main and add]
entry: main
  expression: sum=0
  expression: i=1
  control-point: while i<11
    call-site: add
      actual-in: sum$0
      actual-in: i$0
      actual-out: add$0
    expression: sum=add$0
    call-site: add
      actual-in: i$1
      actual-in: 1
      actual-out: add$1
    expression: i=add$1
entry: add
  formal-in: a
  formal-in: b
  expression: add$result=a+b
  formal-out: add$result
Program slicing for sum = 0 -> sum = 1
[Figure: the full dependence graph of main and add (entry nodes, call sites, actual-in / actual-out and formal-in / formal-out nodes), used to trace the slice of the change; the figure carries a "???" annotation]
Context Sensitivity
A property that measures whether an analysis is sensitive to the method-invocation context:
  The actual-in / actual-out of the same method invocation should match each other
  The actual-in / actual-out of different invocations should not
How to do this: bracket matching
  Consider actual-in / actual-out of $0 to be ‘(’ and ‘)’
  Consider actual-in / actual-out of $1 to be ‘{’ and ‘}’
  A real path should have all brackets matched
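The bracket check can be sketched with a stack. The label string lists, in order, the brackets met along a path through the dependence graph. The class name BracketMatching and the string encoding of a path are illustrative assumptions. The check is strict, as stated above: every bracket must be matched.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BracketMatching {
    // Returns true if every opening bracket on the path is closed by the
    // bracket of the same invocation, in properly nested order.
    static boolean isRealPath(String labels) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : labels.toCharArray()) {
            if (c == '(' || c == '{') {
                stack.push(c);
            } else if (c == ')' || c == '}') {
                char expected = (c == ')') ? '(' : '{';
                if (stack.isEmpty() || stack.pop() != expected) {
                    return false; // returns to a different call site than it entered from
                }
            }
        }
        return stack.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isRealPath("()")); // enters add via $0, returns via $0
        System.out.println(isRealPath("(}")); // enters add via $0, returns via $1
    }
}
```

A path that enters add through the actual-in of $0 and leaves through the actual-out of $1 is rejected, which keeps the change to sum from being propagated to i = add(i, 1) through the shared body of add.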
Program slicing for sum = 0 -> sum = 1
[Figure: the full dependence graph of main and add, with bracket labels on the parameter edges: ‘(’ and ‘{’ on the actual-in edges of add$0 and add$1, ‘}’ and ‘)’ on the actual-out edges of add$1 and add$0]
Program Slicing based Test Selection
Retrieve the forward slice of the changed code
Select test cases that will cover more statements in the forward slice
void main() {
    int sum, i;
    sum = 0;    // changed to: sum = 1;
    i = read;
    if (i >= 12) {
        String rep = report(invalid, i);
        sendReport(rep);
    }
    while (i < 11) {
        sum = add(sum, i);
        i = add(i, 1);
    }
}
Test case: 0
Test case: 13
Test case 0 is better because it covers more code in the forward slice
Record and Replay
A resource waste in regression testing:
  We change the code only a little
  Yet we still run all the unchanged code during test execution
Record and replay, for all / some of the unchanged modules:
  Do not run the modules
  Use the results of the previous test run instead
Record and Replay
Example: testing an expert system for finance
  It has two components: a UI and an interest calculator (based on the inputs from the UI)
  In the first round of testing, store the results of the interest calculator as a map: (a, b) -> 5%, (a, c) -> 10%, (d, e) -> 7.7%
  In regression testing, if the change is made to the UI, you can rerun the software with the data map
Recording more objects saves more time in regression testing; should we record every object?
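A minimal sketch of the data map idea, assuming the interest calculator can be modeled as a function of two string inputs. The class name RecordReplay, its methods and the realCalls counter are hypothetical and only serve the illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

class RecordReplay {
    private final Map<String, Double> recorded = new HashMap<>();
    private final BiFunction<String, String, Double> realModule;
    int realCalls = 0; // how often the real (expensive) module was executed

    RecordReplay(BiFunction<String, String, Double> realModule) {
        this.realModule = realModule;
    }

    // First round of testing: run the real module and store its result.
    double record(String x, String y) {
        realCalls++;
        double result = realModule.apply(x, y);
        recorded.put(x + "," + y, result);
        return result;
    }

    // Regression testing: return the stored result without running the module.
    double replay(String x, String y) {
        Double result = recorded.get(x + "," + y);
        if (result == null) {
            throw new IllegalStateException("no recording for (" + x + ", " + y + ")");
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the interest calculator; always answers 5.0 (percent).
        RecordReplay calculator = new RecordReplay((x, y) -> 5.0);
        calculator.record("a", "b");
        System.out.println(calculator.replay("a", "b"));
        System.out.println("real calls: " + calculator.realCalls);
    }
}
```

Replaying an input that was never recorded fails loudly in this sketch, which is one reason the recorded module should be stable and its interface small.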
Pros & Cons
Pros:
  Saves time in regression testing
Cons:
  Be careful when recording non-deterministic components
    E.g., a recorded getSystemTime() may conflict with another call
  Recording data maps takes a lot of time
  The stored data map can become huge
  When the stored object changes, the data map requires updates
Selection of recorded modules
Rules:
  Record time-consuming modules, so that you save more time
  The recorded module should be stable, e.g., libraries
  The interface should carry a small data flow, e.g., numeric inputs and return values
Selection of recording modules
Recording UI Components
Recording Internet Components
Recording components that affect the real world:
  Sending an email
  Transferring money from credit cards
Review of Regression Testing
Test Prioritization: try only the most important test cases
Test Relevant Code: try the most relevant test cases
Record and Replay: reuse the execution results of previous test cases