Transcript of: Experiences and Results from Initiating Field Defect Prediction and Product Test Prioritization Efforts at ABB Inc. (ABB Corporate Research, January 2004)

Page 1:


Experiences and Results from Initiating Field Defect Prediction and Product Test Prioritization Efforts at ABB Inc.

Paul Luo Li (Carnegie Mellon University)

James Herbsleb (Carnegie Mellon University)

Mary Shaw (Carnegie Mellon University)

Brian Robinson (ABB Research)

Page 2:

Field defects matter

Citizen Thermal Energy, Indianapolis, Indiana, USA

Molson, Montreal, Canada

Heineken Spain, Madrid, Spain

Shajiao Power Station, Guangdong, China

Page 3:

ABB’s risk mitigation activities include…

Remove potential field defects by focusing systems/integration testing

Higher quality product

Earlier defect detection (when it's cheaper to fix)

Page 4:

ABB’s risk mitigation activities include…

Remove potential field defects by focusing systems/integration testing

Higher quality product

Earlier defect detection (when it's cheaper to fix)

Plan for maintenance by predicting the number of field defects within the first year

Faster response for customers

Less stress (e.g. for developers)

More accurate budgets

Page 5:

ABB’s risk mitigation activities include…

Remove potential field defects by focusing systems/integration testing

Higher quality product

Earlier defect detection (when it's cheaper to fix)

Plan for maintenance by predicting the number of field defects within the first year

Faster response for customers

Less stress (e.g. for developers)

More accurate budgets

Plan for future process improvement efforts by identifying important characteristics (i.e. categories of metrics)

More effective improvement efforts

Page 6:

Practical issues

How to conduct analysis with incomplete information?

How to select an appropriate modeling method for systems/integration testing prioritization and process improvement planning?

How to evaluate the accuracy of predictions across multiple releases in time?

Experience on how we arrived at the results

Page 7:

Talk outline

ABB systems overview
Field defect modeling overview (Outputs, Inputs)
Insight 1: Use available information
Modeling methods
Insight 2: Select a modeling method based on explicability and quantifiability
Insight 3: Evaluate accuracy using forward prediction evaluation
Empirical results
Conclusions

Page 8:

We examined two systems at ABB

Product A

Real-time monitoring system

Growing code base of ~300 KLOC

13 major/minor/fix-pack releases

~127 thousand changes committed by ~40 different people

Dates back to 2000 (~5 years)

Product B

Tool suite for managing real-time modules

Stable code base of ~780 KLOC

15 major/minor/fix-pack releases

~50 people have worked on the project

Dates back to 1996 (~9 years)

Page 9:

We collected data from…

Request tracking system (Serena Tracker)

Version control system (Microsoft Visual Source Safe)

Experts (e.g. team leads and area leads)

Page 10:

Talk outline

ABB systems overview
Field defect modeling overview (Outputs, Inputs)
Insight 1: Use available information
Modeling methods
Insight 2: Select a modeling method based on explicability and quantifiability
Insight 3: Evaluate accuracy through time using forward prediction evaluation
Empirical results
Conclusions

Page 11:

Breakdown of field defect modeling

Predictors (metrics available before release)

Product (Khoshgoftaar and Munson, 1989)

Development (Ostrand et al., 2004)

Deployment and usage (Mockus et al., 2005)

Software and hardware configurations (Li et al., 2006)

Inputs

Page 12:

Inputs

Metrics-based methods

Modeling method

Breakdown of field defect modeling

Page 13:

Inputs

Field defects

Modeling method

Outputs

Breakdown of field defect modeling

Page 14:

Modeling process

Inputs

Take historical Inputs and Outputs to construct model

Modeling method

Outputs

Page 15:

Talk outline

ABB systems overview
Field defect modeling overview (Outputs, Inputs)
Insight 1: Use available information
Modeling methods
Insight 2: Select a modeling method based on explicability and quantifiability
Insight 3: Evaluate accuracy through time using forward prediction evaluation
Empirical results
Conclusions

Page 16:

Outputs

Field defects: valid customer-reported problems attributable to a release in Serena Tracker

Relationships

What predictors are related to field defects?

Quantities

What is the number of field defects?

Page 17:

Outputs

Field defects: valid customer-reported problems attributable to a release in Serena Tracker

Relationships

Plans for improvement

Targeted systems testing

Quantities

Maintenance resource planning

Remember these objectives

Page 18:

Talk outline

ABB systems overview
Field defect modeling overview (Outputs, Inputs)
Insight 1: Use available information
Modeling methods
Insight 2: Select a modeling method based on explicability and quantifiability
Insight 3: Evaluate accuracy through time using forward prediction evaluation
Empirical results
Conclusions

Page 19:

Inputs (Predictors)

Product metrics

Lines of code

Fanin

Halstead’s difficulty …

Development metrics

Open issues

Deltas

Authors …

Deployment and usage (DU) metrics

… we’ll talk more about this

Software and hardware configuration (SH) metrics

Sub-system

Windows configuration …

Page 20:

Insight 1: use available information

ABB did not officially collect DU information (e.g. the number of installations). Do analysis without the information?

We collected data from available data sources that provided information on possible deployment and usage:
Type of release
Elapsed time between releases

Improved validity
More accurate models
Justification for better data collection
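A minimal sketch of this insight (our own illustration, not ABB data): deriving proxy deployment-and-usage predictors, the type of a release and the elapsed time between releases, from release metadata. The release names, types, and dates below are hypothetical.

```python
# Hypothetical release metadata used to derive proxy DU predictors.
from datetime import date

releases = [
    ("R1.0",   "major",    date(2002, 3, 1)),
    ("R1.1",   "minor",    date(2002, 9, 15)),
    ("R1.1.1", "fix-pack", date(2002, 11, 1)),
    ("R2.0",   "major",    date(2003, 6, 1)),
]

for (name, kind, shipped), (_, _, next_shipped) in zip(releases, releases[1:]):
    # Elapsed time between releases, in months (average month length).
    months_before_next = (next_shipped - shipped).days / 30.44
    print(f"{name}: release type = {kind}, "
          f"months before next release = {months_before_next:.1f}")
```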

Page 21:

Talk outline

ABB systems overview
Field defect modeling overview (Outputs, Inputs)
Insight 1: Use available information
Modeling methods
Insight 2: Select a modeling method based on explicability and quantifiability
Insight 3: Evaluate accuracy through time using forward prediction evaluation
Empirical results
Conclusions

Page 22:

Methods to establish relationships

Rank Correlation (for improvement planning)

Single predictor

Defect modeling (for improvement planning and for systems/integration test prioritization)

Multiple predictors that complement each other
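A minimal sketch of the single-predictor rank-correlation method, assuming SciPy is available; the per-release counts below are hypothetical, not values from the study.

```python
# Rank correlation between one predictor and field-defect counts across releases.
from scipy.stats import spearmanr

open_issues_per_release   = [12, 30, 25, 40, 18, 35, 22]
field_defects_per_release = [14,  9, 17, 11, 15, 10, 13]

rho, p_value = spearmanr(open_issues_per_release, field_defects_per_release)
print(f"Spearman rank correlation = {rho:.2f} (p = {p_value:.3f})")
```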

Page 23:

Insight 2: select a modeling method based on explicability and quantifiability

Previous work uses accuracy, however…

To prioritize product testing

Identify faulty configurations

Quantify relative fault-proneness of configurations

For process improvement

Identify characteristics related to field defects

Quantify relative importance of characteristics

Page 24:

Insight 2: select a modeling method based on explicability and quantifiability

Previous work uses accuracy, however…

To prioritize product testing

Identify faulty configurations

Quantify relative fault-proneness of configurations

For process improvement

Identify characteristics related to field defects

Quantify relative importance of characteristics

Explicability

Page 25:

Insight 2: select a modeling method based on explicability and quantifiability

Previous work uses accuracy, however…

To prioritize product testing

Identify faulty configurations

Quantify relative fault-proneness of configurations

For process improvement

Identify characteristics related to field defects

Quantify relative importance of characteristics

Quantifiability

Page 26:

Insight 2: select a modeling method based on explicability and quantifiability

Previous work uses accuracy, however…

To prioritize product testing

Identify faulty configurations

Quantify relative fault-proneness of configurations

For process improvement

Identify characteristics related to field defects

Quantify relative importance of characteristics

Not all models have these qualities, e.g. neural networks and models built with Principal Component Analysis

Page 27:

The modeling method we used

Linear modeling with model selection

39% less accurate than Neural Networks (Khoshgoftaar et al.)

example only: not a real model

Page 28:

Linear modeling with model selection

example only: not a real model

Explicability: distinguish the effects of each predictor

The modeling method we used

Function (Field defects) = B1*Input1 + B2*Input2 + B3*Input4

Page 29:

Linear modeling with model selection

example only: not a real model

Quantifiability: compare the effects of predictors

The modeling method we used

Function (Field defects) = B1*Input1 + B2*Input2 + B3*Input4
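A minimal sketch of linear modeling with model selection, in the spirit of the example formula above. It assumes Python with pandas and statsmodels; the release data and column names are hypothetical, and backward elimination by AIC stands in for whichever selection procedure was actually used in the study.

```python
# Fit a linear model on historical release data and drop predictors that do
# not improve AIC (backward elimination). All data here are made up.
import pandas as pd
import statsmodels.api as sm

releases = pd.DataFrame({
    "open_issues":    [12, 30, 25, 40, 18, 35],
    "num_authors":    [5, 9, 7, 11, 6, 10],
    "months_to_next": [3, 6, 4, 8, 3, 7],
    "service_pack":   [0, 1, 0, 1, 0, 1],
})
field_defects = pd.Series([14, 9, 17, 11, 15, 10])

def backward_select(y, X):
    """Drop one predictor at a time while the AIC keeps improving."""
    kept = list(X.columns)
    best = sm.OLS(y, sm.add_constant(X[kept])).fit()
    improved = True
    while improved and len(kept) > 1:
        improved = False
        for col in list(kept):
            trial_cols = [c for c in kept if c != col]
            trial = sm.OLS(y, sm.add_constant(X[trial_cols])).fit()
            if trial.aic < best.aic:        # smaller AIC = better fit/complexity trade-off
                best, kept, improved = trial, trial_cols, True
                break
    return best, kept

model, selected = backward_select(field_defects, releases)
print("selected predictors:", selected)
print(model.params)  # the B1, B2, ... coefficients of the example formula
```

Explicability and quantifiability follow directly: each kept predictor has one coefficient whose effect can be read off and compared against the others.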

Page 30:

Talk outline

ABB systems overview
Field defect modeling overview (Outputs, Inputs)
Insight 1: Use available information
Modeling methods
Insight 2: Select a modeling method based on explicability and quantifiability
Insight 3: Evaluate accuracy through time using forward prediction evaluation
Empirical results
Conclusions

Skipping ahead…
Read the paper

Page 31:

Systems/Integration test prioritization

Product A Predictors and Estimated Effects:

Sub-system:
  Sub-system 1: 9.85x more
  Sub-system 2: 8.39x more
  Sub-system 3: 8.13x more
  Sub-system 4: 7.22x more

Software platforms:
  Not Windows Server Versions: 1.91x more

Other predictors:
  Service Pack: 5.55x less
  Open Issues: 1.01x less
  Num Authors: 1.08x less
  Months Before Next Release: 1.16x more
  Months Since 1st Release: 1.03x less

Log (Field defects) = B1*Input1 + B2*Input2 + …
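One plausible way to read the "x more / x less" factors off a log-linear model like the formula above is to exponentiate each coefficient. This is an illustration only; the coefficient values below are made up (chosen so a few of them roughly reproduce entries in the table), not the fitted model from the study.

```python
# Turn log-linear coefficients into multiplicative effects on expected defects.
import math

hypothetical_coefficients = {
    "Sub-system 1": 2.287,   # indicator predictor (release touches sub-system 1)
    "Service Pack": -1.714,  # indicator predictor (release is a service pack)
    "Open Issues":  -0.010,  # per additional open issue
}

for predictor, b in hypothetical_coefficients.items():
    factor = math.exp(b)                             # multiplicative change in expected defects
    shown = factor if factor >= 1.0 else 1.0 / factor
    direction = "more" if factor >= 1.0 else "less"
    print(f"{predictor}: {shown:.2f}x {direction}")
```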

Page 32:

Systems/Integration test prioritization

Select a modeling method based on explicability and quantifiability

Product A Predictors and Estimated Effects:

Sub-system:
  Sub-system 1: 9.85x more
  Sub-system 2: 8.39x more
  Sub-system 3: 8.13x more
  Sub-system 4: 7.22x more

Software platforms:
  Not Windows Server Versions: 1.91x more

Other predictors:
  Service Pack: 5.55x less
  Open Issues: 1.01x less
  Num Authors: 1.08x less
  Months Before Next Release: 1.16x more
  Months Since 1st Release: 1.03x less

Pages 33-34: same content as Page 32.

Page 35:

Systems/Integration test prioritization

Select a modeling method based on explicability and quantifiability

Use available information

Improved validity

More accurate model

Justification for better data collection

Product A Predictors and Estimated Effects:

Sub-system:
  Sub-system 1: 9.85x more
  Sub-system 2: 8.39x more
  Sub-system 3: 8.13x more
  Sub-system 4: 7.22x more

Software platforms:
  Not Windows Server Versions: 1.91x more

Other predictors:
  Service Pack: 5.55x less
  Open Issues: 1.01x less
  Num Authors: 1.08x less
  Months Before Next Release: 1.16x more
  Months Since 1st Release: 1.03x less

Page 36:

Systems/Integration test prioritization

Experts validated results

Quantitative justification for action

ABB found additional defects

Product A Predictors and Estimated Effects:

Sub-system:
  Sub-system 1: 9.85x more
  Sub-system 2: 8.39x more
  Sub-system 3: 8.13x more
  Sub-system 4: 7.22x more

Software platforms:
  Not Windows Server Versions: 1.91x more

Other predictors:
  Service Pack: 5.55x less
  Open Issues: 1.01x less
  Num Authors: 1.08x less
  Months Before Next Release: 1.16x more
  Months Since 1st Release: 1.03x less

Page 37:

Talk outline

ABB systems overview
Field defect modeling overview (Outputs, Inputs)
Insight 1: Use available information
Modeling methods
Insight 2: Select a modeling method based on explicability and quantifiability
Insight 3: Evaluate accuracy through time using forward prediction evaluation
Empirical results
Conclusions

Page 38:

Risk mitigation activities enabled

Focusing systems/integration testing

Found additional defects

Plan for maintenance by predicting the number of field defects within the first year

Do not yet know if results are accurate enough for planning purposes

Plan for future process improvement efforts

May combine with prediction method to enable process adjustments

Page 39:

Experiences recapped

Use available information when direct/preferred information is unavailable

Consider the explicability and quantifiability of a modeling method when the objectives are improvement planning and test prioritization

Use a forward prediction evaluation procedure to assess the accuracy of predictions for multiple releases in time

Details on insights and results in our paper

Page 40:

Experiences and Results from Initiating Field Defect Prediction and Product Test Prioritization Efforts at ABB Inc.

Paul Luo Li (Carnegie Mellon University)

James Herbsleb (Carnegie Mellon University)

Mary Shaw (Carnegie Mellon University)

Brian Robinson (ABB Research)

Thanks to:

Ann Poorman

Janet Kaufman

Rob Davenport

Pat Weckerly

Page 41:

Insight 3: use forward prediction evaluation

Accuracy is the correct criterion when predicting the number of field defects for maintenance resource planning

Current accuracy evaluation methods are not well-suited for multi-release systems
Cross-validation
Random data withholding

Page 42:

Insight 3: use forward prediction evaluation

Accuracy is the correct criterion when predicting the number of field defects for maintenance resource planning

Current accuracy evaluation methods are not well-suited for multi-release systems
Cross-validation
Random data withholding

[Diagram: Release 1, Release 2, Release 3, Release 4]

Pages 43-49: same content as Page 42.

Page 50:

Insight 3: use forward prediction evaluation

Accuracy is the correct criterion when predicting the number of field defects for maintenance resource planning

Current accuracy evaluation methods (cross-validation, random data withholding) are not well-suited for multi-release systems:
Only a non-random sub-set is available
Predicting for a past release is not the same as predicting for a future release
Not realistic!

Page 51:

Insight 3: use forward prediction evaluation

Accuracy is the correct criterion when predicting the number of field defects for maintenance resource planning

Current accuracy evaluation methods are not well-suited for multi-release systems
Cross-validation
Random data withholding

We use a forward prediction evaluation method

[Diagram: Release 1, Release 2, Release 3, Release 4]
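A minimal sketch of forward prediction evaluation as we read it from the slides: fit only on the releases that precede the release being predicted, then step forward one release at a time, mimicking what would have been known at prediction time. The data, the simple least-squares fit, and the error measure below are placeholders, not ABB's actual data or model.

```python
# Forward prediction evaluation: train on releases 1..k, predict release k+1.
import numpy as np

def fit(X, y):
    """Least-squares fit standing in for 'linear modeling with model selection'."""
    Xc = np.column_stack([np.ones(len(X)), X])        # add intercept column
    coef, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    return coef

def predict(coef, x):
    return coef[0] + np.dot(coef[1:], x)

# Hypothetical per-release predictors and observed field-defect counts.
X = np.array([[12, 3], [30, 6], [25, 4], [40, 8], [18, 3]], dtype=float)
y = np.array([14, 9, 17, 11, 15], dtype=float)

relative_errors = []
for k in range(2, len(y)):                            # need a few training releases first
    coef = fit(X[:k], y[:k])                          # only data available before release k+1
    pred = predict(coef, X[k])                        # predict the next release
    relative_errors.append(abs(pred - y[k]) / y[k])   # relative error for that release

print("average relative error:", np.mean(relative_errors))
```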

Pages 52-53: same content as Page 51.

Page 54:

Field defect prediction

ARE, Product A: R1.0 / R1.1 / R1.2 / Avg
Moving Average 1 Release: 3.0% / 51.7% / 19.2% / 24.6%
Linear Regression with Model Selection: 17.6% / 17.6%
Tree Split with 2 Releases: 3.0% / 51.7% / 19.2% / 24.6%
Tree Split with 3 Releases: 3.0% / 54.0% / 19.2% / 25.4%

Combined with the cost of field defects, these predictions can be used to allocate initial maintenance resources

Need to evaluate if ~24% average error is adequate
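A minimal arithmetic sketch of feeding such a prediction into maintenance planning, as the slide suggests; the predicted count and the average cost per field defect below are hypothetical.

```python
# Predicted defect count times an assumed cost per defect gives a first-cut budget.
predicted_field_defects = 20        # e.g. model output for the next release's first year
avg_cost_per_field_defect = 4_000   # assumed fully loaded cost per field defect
average_relative_error = 0.246      # ~24.6%, from the ARE table above

budget = predicted_field_defects * avg_cost_per_field_defect
margin = budget * average_relative_error
print(f"initial maintenance budget: {budget:,} (about +/- {margin:,.0f} at ~24.6% ARE)")
```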

Page 55:

Improvement planning

Product A

Open Issues

Service Pack

Product B

Open Issues

Months Before Next Release

Predictors selected by model selection

Page 56:

Improvement planning

Select a modeling method based on explicability and quantifiability

Product A

Open Issues

Service Pack

Product B

Open Issues

Months Before Next Release

Page 57:

Improvement planning

Select a modeling method based on explicability and quantifiability

Use available information

Improved validity

More accurate model

Justification for better data collection

Product A

Open Issues

Service Pack

Product B

Open Issues

Months Before Next Release

Page 58:

Improvement planning

Product A

Open Issues

Service Pack

Product B

Open Issues

Months Before Next Release

Can delay deployment to conduct more testing

Can reduce scope of next release to resolve field defects

Page 59:

Image from smig.usgs.gov

Explicability example: neural networks

Page 60:

Z1 = 1 / (1 + e^-(Input1*weight1 + Input2*weight2 + Input3*weight3 + Input4*weight4))

Explicability example: neural networks

Page 61:

Output = 1 / (1 + e^-(Z1*weight1 + Z2*weight2 + Z3*weight3 + Z4*weight4))

[Diagram: hidden units Z1-Z5]

Explicability example: neural networks

Page 62:

[Diagram: hidden units Z1-Z5; the path from inputs to output is marked "?"]

How does input relate to the output?

Explicability example: neural networks

Page 63:

[Diagram: hidden units Z1-Z5]

Improvement planning and test prioritization both need to attribute effects to predictors


Explicability example: neural networks
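A minimal sketch (hypothetical weights, not a model from the study) of why attributing and comparing effects is hard in a neural network: each input reaches the output only through the hidden units Z1..Z5, so there is no single coefficient per predictor analogous to B1, B2, … in the linear model, and the local effect of one predictor changes with the values of the others. The same issue underlies the quantifiability example on the following pages.

```python
# A tiny feedforward net with random weights: the "effect" of Input1 on the
# output is not a constant that can be read off anywhere in the model.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(5, 4))   # 4 inputs -> 5 hidden units (Z1..Z5)
w_output = rng.normal(size=5)        # hidden units -> single output

def net(inputs):
    z = sigmoid(W_hidden @ inputs)   # hidden-unit activations
    return sigmoid(w_output @ z)     # output

x = np.array([1.0, 0.5, 0.2, 0.0])
base = net(x)
# Perturbing Input1 by different amounts changes the output non-proportionally,
# so no single number summarizes Input1's contribution.
for delta in (0.1, 1.0):
    bumped = x.copy()
    bumped[0] += delta
    print(f"Input1 + {delta}: output changes by {net(bumped) - base:+.4f}")
```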

Page 64:

Z1 = 1 / (1 + e^-(Input1*weight1 + Input2*weight2 + Input3*weight3 + Input4*weight4))

Quantifiability example: neural networks

Page 65:

[Diagram: hidden units Z1-Z5]

How do the predictors compare?


Quantifiability example: neural networks

Page 66:

[Diagram: hidden units Z1-Z5]

Improvement planning and test prioritization both need to compare importance of predictors


Quantifiability example: neural networks