Comparison of Delivered Reliability of Branch, Data Flow, and Operational Testing: A Case Study

Phyllis G. Frankl, Yuetang Deng
Polytechnic University, Brooklyn, NY

ISSTA-2000, 8/23/00

Outline

• Measures of test effectiveness
• Delivered reliability
• Experiment design
• Subject program
• Results
• Threats to validity
• Conclusions

Measures of Test Effectiveness

• Probability of detecting at least one fault [DN84,HT90,FWey93,FWei93,…]

• Expected number of failures during test [FWey93,CY96]

• Number of faults detected [HFGO94]

• Delivered reliability [FHLS98]

[Figure: flowchart of the testing process. Steps: select test cases, execute test cases, check results, debug program, check test data adequacy, release program; two "OK?" decision nodes with yes/no branches.]

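[Figure: flowchart of the testing process with reliability estimation. Steps: select test cases, execute test cases, check results, debug program, estimate reliability, release program; one "OK?" decision node with yes/no branches.]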

Delivered Reliability

• Captures intuition that discovery and removal of “important” faults is more crucial

• Evaluates testing technique according to the extent to which testing will increase reliability

• Introduced and studied analytically, FHLS (FSE-97, TSE-98)

Failures, Faults, and Failure Regions

int foo();
int x,y;
{
    s1;
    s2;
    if c1 {
        s3;
        s4;
    };
    s5;
    s6;
}

q_i = probability that an input selected according to the operational distribution will hit failure region i
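A minimal sketch of how such a hit probability could be estimated empirically: draw inputs from the operational distribution and count the fraction that land in each failure region. The sampler `draw_uniform`, the region predicates, and the toy numbers are all hypothetical stand-ins chosen for illustration; none of them come from the study.

```python
import random

def estimate_hit_probabilities(draw_input, regions, n_samples, seed=0):
    """Estimate the hit probability of each failure region by sampling.

    draw_input: function taking a random.Random and returning one input
                (hypothetical stand-in for the operational distribution).
    regions:    dict mapping a region name to a predicate on inputs.
    """
    rng = random.Random(seed)
    hits = {name: 0 for name in regions}
    for _ in range(n_samples):
        x = draw_input(rng)
        for name, in_region in regions.items():
            if in_region(x):
                hits[name] += 1
    return {name: count / n_samples for name, count in hits.items()}

# Toy operational distribution: integers 0..999, all equally likely.
def draw_uniform(rng):
    return rng.randrange(1000)

# Toy failure regions: region_1 is a single input, region_2 has ten inputs,
# so their true hit probabilities are 0.001 and 0.010.
toy_regions = {
    "region_1": lambda x: x == 7,
    "region_2": lambda x: 100 <= x < 110,
}

q = estimate_hit_probabilities(draw_uniform, toy_regions, n_samples=200_000)
```

The estimates only approximate the true values, and small regions need many samples before the estimate stabilizes.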

Failure Rate After Testing/Debugging

• Reliability after testing and debugging is determined by which failure regions are hit by test cases

• Random variable $\Theta$ represents the failure rate after testing and debugging

• Compare testing techniques by comparing statistics of their $\Theta$'s
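One way to make this concrete, under two simplifying assumptions stated here only for illustration: failure regions are disjoint, and debugging removes every failure region that at least one test case hits. The set $H$ and the symbol $\Theta$ for the post-testing failure rate are notation chosen for this sketch.

```latex
\Theta \;=\; \sum_{i \,\notin\, H} q_i,
\qquad
H = \{\, i : \text{some test case hits failure region } i \,\}
```

Because the test cases are selected randomly, $H$ is random and so is the sum, which is why the failure rate after testing is treated as a random variable.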

Example

Fault set   Probability of detection   Failure rate
Empty       0.94                       0.0
F1          0.02                       0.001
F2          0.03                       0.010
F1,F2       0.01                       0.011

$\Pr(\Theta = 0.011) = 0.94$
$\Pr(\Theta = 0.010) = 0.02$
$\Pr(\Theta = 0.001) = 0.03$
$\Pr(\Theta = 0.000) = 0.01$
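As a worked illustration, read the example as assigning probabilities 0.94, 0.02, 0.03, and 0.01 to post-testing failure rates of 0.011, 0.010, 0.001, and 0.000 respectively. The sketch below computes the expected failure rate and the probability of reaching a target from that distribution; the function names and the target of 0.001 are assumptions made for the example.

```python
# Distribution of the post-testing failure rate in the example:
# each key is a failure-rate value, each value is its probability.
theta_pmf = {0.011: 0.94, 0.010: 0.02, 0.001: 0.03, 0.000: 0.01}

def expected_value(pmf):
    """Expected post-testing failure rate."""
    return sum(value * prob for value, prob in pmf.items())

def tail_probability(pmf, target):
    """Probability that the post-testing failure rate is at most target."""
    return sum(prob for value, prob in pmf.items() if value <= target)

initial_rate = 0.011                         # failure rate with both faults present
e_theta = expected_value(theta_pmf)          # about 0.01057
expected_decrease = initial_rate - e_theta   # about 0.00043
p_reach = tail_probability(theta_pmf, 0.001) # about 0.04
```

The expected decrease is small here because the test set most often (probability 0.94) detects neither fault.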

Testing Criteria Considered

• Various levels of coverage of
  – decision coverage (branch testing)
  – def-use coverage (all-uses data flow testing)
  – grouped into quartiles and deciles

• Random testing with no coverage criterion

Questions Investigated

• How do test sets that achieve high coverage levels (of branch testing or data flow testing) compare to those achieving lower coverage, according to
  – Expected improvement in reliability: $E(\Theta)$
  – Probability of reaching a given reliability target: $\Pr(\Theta \le x)$

Subject Program

• “Space” program

• 10,000+ LOC C antenna design program, written by professional programmers, containing naturally occurring faults

• Test generator generates tests according to operational distribution [Pasquini et al]

• Considered 10 relatively hard-to-detect faults

• Failure rate: 0.05564

Experiment Design

• Adapted from the design used to compare probability of detecting at least one fault [Frankl, Weiss, et al.]

• Simulate execution of a very large number of fixed-size test sets

• For each, note coverage achieved (branch, data flow) and faults detected

• Compute density function of $\Theta$ for various coverage-level groups
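The simulation steps above can be sketched roughly as follows. This is a toy version under stated assumptions, not the study's tooling: test sets are drawn without replacement from a small made-up universe, each fault is assumed to contribute additively to the failure rate, and every name (`simulate_test_sets`, `toy_coverage`, and so on) is hypothetical.

```python
import random

def simulate_test_sets(coverage, detects, fault_rates, set_size, n_sets, seed=0):
    """Sample fixed-size test sets and record coverage and failure rate.

    coverage:    list of sets; coverage[t] = features covered by test case t.
    detects:     list of sets; detects[t] = faults exposed by test case t.
    fault_rates: dict fault -> failure rate attributed to that fault
                 (assumes additive contributions, a simplification).
    Returns a list of (fraction_of_features_covered, failure_rate_after).
    """
    rng = random.Random(seed)
    all_features = set().union(*coverage)
    initial_rate = sum(fault_rates.values())
    outcomes = []
    for _ in range(n_sets):
        # Draw one test set: set_size distinct test cases from the universe.
        chosen = rng.sample(range(len(coverage)), set_size)
        covered = set().union(*(coverage[t] for t in chosen))
        found = set().union(*(detects[t] for t in chosen))
        # Detected faults are assumed to be removed completely.
        removed = sum(fault_rates[f] for f in found)
        outcomes.append((len(covered) / len(all_features),
                         initial_rate - removed))
    return outcomes

# Toy universe of 6 test cases, 4 features, 2 faults (all values made up).
toy_coverage = [{0}, {0, 1}, {1, 2}, {2, 3}, {3}, {0, 3}]
toy_detects = [set(), {"F1"}, set(), {"F2"}, set(), set()]
toy_rates = {"F1": 0.001, "F2": 0.010}

outcomes = simulate_test_sets(toy_coverage, toy_detects, toy_rates,
                              set_size=3, n_sets=1000)
```

Each outcome pairs a coverage level with a post-testing failure rate, so the outcomes can later be binned by coverage to compare groups.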

[Figure: data used in the experiment. Labels: coverage matrix (test cases × features), results matrix (test cases × faults), fault-detection matrix (fault sets), failure rate vector (fault sets), coverage levels.]

Coverage Levels

• Considered the following groups of test sets of size 50:
  – highest decile of decision coverage
  – highest decile of def-use coverage
  – four quartiles of decision coverage
  – four quartiles of def-use coverage
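A small sketch of how test sets might be binned into such groups and compared. It assumes each test set has already been summarized as a (coverage, post-testing failure rate) pair and uses equal-sized rank bands, which is only one possible reading of "quartiles" and "deciles"; the data and function names are made up.

```python
def group_by_quantile(outcomes, n_groups):
    """Split (coverage, failure_rate) pairs into n_groups by coverage rank.

    Groups are equal-sized rank bands, lowest coverage first; ties are
    broken by the sort order.
    """
    ranked = sorted(outcomes, key=lambda pair: pair[0])
    n = len(ranked)
    return [ranked[(g * n) // n_groups:((g + 1) * n) // n_groups]
            for g in range(n_groups)]

def mean_failure_rate(group):
    """Average post-testing failure rate within one group."""
    return sum(rate for _, rate in group) / len(group)

# Made-up (coverage, post-testing failure rate) pairs for eight test sets.
toy_outcomes = [(0.20, 0.050), (0.35, 0.040), (0.40, 0.045), (0.55, 0.030),
                (0.60, 0.035), (0.75, 0.030), (0.80, 0.020), (0.95, 0.025)]

quartiles = group_by_quantile(toy_outcomes, 4)
quartile_means = [mean_failure_rate(g) for g in quartiles]
```

Calling `group_by_quantile(outcomes, 10)[-1]` on a larger sample would give the highest decile in the same way.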

Expected Values

coverage range         expected decrease in failure rate   percentage decrease
all                    0.021                               38%

decision coverage
  0 to 26%             0.018                               32%
  26 to 51%            0.021                               38%
  51 to 77%            0.022                               40%
  77 to 100%           0.023                               42%
  88 to 100%           0.024                               43%

def-use coverage
  0 to 32%             0.017                               13%
  32 to 53%            0.021                               38%
  53 to 77%            0.023                               40%
  77 to 100%           0.025                               44%
  88 to 100%           0.025                               46%
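The percentage column appears to be the expected decrease divided by the subject program's initial failure rate of 0.05564. The sketch below recomputes it as a consistency check. Two assumptions are made for illustration: the decrease for the 51 to 77% decision-coverage row is taken as 0.022, in line with its neighbors, and a tolerance of 2 percentage points is allowed because the decreases are rounded to three decimals.

```python
INITIAL_FAILURE_RATE = 0.05564  # failure rate reported for the subject program

# (criterion, coverage range, expected decrease, reported percentage decrease)
rows = [
    ("all",      "all",        0.021, 38),
    ("decision", "0 to 26%",   0.018, 32),
    ("decision", "26 to 51%",  0.021, 38),
    ("decision", "51 to 77%",  0.022, 40),  # assumed value, see lead-in
    ("decision", "77 to 100%", 0.023, 42),
    ("decision", "88 to 100%", 0.024, 43),
    ("def-use",  "0 to 32%",   0.017, 13),
    ("def-use",  "32 to 53%",  0.021, 38),
    ("def-use",  "53 to 77%",  0.023, 40),
    ("def-use",  "77 to 100%", 0.025, 44),
    ("def-use",  "88 to 100%", 0.025, 46),
]

def percentage_decrease(decrease, initial=INITIAL_FAILURE_RATE):
    """Decrease in failure rate as a percentage of the initial rate."""
    return 100.0 * decrease / initial

def inconsistent_rows(rows, tolerance=2.0):
    """Rows whose reported percentage differs from the recomputed one
    by more than `tolerance` percentage points."""
    return [row for row in rows
            if abs(percentage_decrease(row[2]) - row[3]) > tolerance]

flagged = inconsistent_rows(rows)
```

With this tolerance only the lowest def-use group is flagged: 0.017 / 0.05564 is roughly 31%, against a reported 13%. That may be a transposition in the slide, though this is a guess rather than something the source confirms.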

Tail Probabilities

Idealized Test Generation Strategy

• Select one test case from each subdomain (independently, randomly)

• Widely studied analytically

• Results in very large test sets for this subject
  – decision coverage: 995
  – def-use coverage: 4296

• Compared to large random test sets
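A minimal sketch of the one-test-per-subdomain idea, assuming each subdomain is available as an explicit list of inputs and that selection within a subdomain is uniform (the slide says only "randomly"). The subdomain names and inputs are invented for illustration.

```python
import random

def one_per_subdomain(subdomains, seed=0):
    """Pick one test case from each subdomain, independently and uniformly.

    subdomains: dict mapping a coverage requirement (for example a branch
                or a def-use association) to the list of inputs satisfying it.
    Returns one chosen input per subdomain, in dict order; the same input
    may appear more than once because the draws are independent.
    """
    rng = random.Random(seed)
    return [rng.choice(inputs) for inputs in subdomains.values()]

# Toy subdomains (hypothetical): overlapping sets of integer inputs.
toy_subdomains = {
    "branch_a_true":  [1, 2, 3],
    "branch_a_false": [4, 5],
    "branch_b_true":  [2, 5, 6],
}

test_set = one_per_subdomain(toy_subdomains)
```

Under this scheme the test set size equals the number of subdomains, which would explain why the resulting sets are large for a program with many branches or def-use associations.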

Expected Values

                         size   expected decrease in failure rate   percentage decrease
100% decision coverage   995    0.055                               100%
random                   995    0.054                               96%

100% def-use coverage    4296   0.056                               100%
random                   4296   0.056                               100%

Tail Probabilities

Threats to Validity

• Single program
• Dependence on programmers’ characterization of the faults
• Dependence on universe
• Universe based on operational distribution
• Single test set size (50)
• Accurate estimates of expected value, but less accuracy in estimates of density function

Conclusions

• Positive:
  – higher decision coverage yields lower expected failure rate
  – higher def-use coverage yields lower expected failure rate
  – higher coverage increases likelihood of reaching high reliability target (low failure rate target)

Conclusions (continued)

• Negative:
  – reliability gains with increased coverage are modest
    • cost-effectiveness questionable
    • economic significance of increases depends on context
  – no silver bullet for ultra-reliability