Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"


Transcript of Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Page 1: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"


Are Change Metrics Good Predictors for an Evolving Software Product Line?

Sandeep Krishnan, ISU

Chris Strasburg, ISU & Ames Laboratory

Robyn R. Lutz, ISU & JPL, California Institute of Technology

Katerina Goseva-Popstojanova, WVU

This research is supported by NSF grants 0916275 and 0916284

Dept. of Computer Science, Iowa State University, PROMISE, September 20, 2011

Page 2: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Background

Product line – "A family of products designed to take advantage of their common aspects and predicted variabilities" [Weiss and Lai 1999], e.g., Nokia cellphones, HP printers, etc.

Products -

Commonalities – Shared by all products, e.g., Platform.

Variabilities – Differentiate the products.

High-reuse variation: JDT, PDE, Mylyn, Webtools, etc. Reused in more than three products and for more than six years.

Low-reuse variation: CDT, Datatools, Java EE tools. Reused in three or fewer products and for more than four years.

Page 3: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Related Work

Eclipse as a product line: [Chastek, McGregor, and Northrop, 2007], [Linden, 2009], [Krishnan et al., 2011].

Summary of previous work

Authors: Zimmermann, Premraj, Zeller
Eclipse releases: 2.0 (2002), 2.1 (2003), 3.0 (2004)
Metrics: Code complexity metrics
Results: Classification at package level is more accurate than at file level

Authors: Moser, Pedrycz, Succi
Eclipse releases: 2.0 (2002), 2.1 (2003), 3.0 (2004)
Metrics: Code change metrics
Results: J48 learner with change metrics gives better classification at file level than code metrics

Failure-prone file – A file with one or more non-trivial post-release bugs recorded in the Eclipse Bugzilla database.

Important/Good predictor – Predictor providing high information gain for classification of failure-prone files.
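The notion of a predictor with high information gain can be illustrated with a small sketch. It computes plain information gain for a binary feature against the failure-prone label. The toy data, the variable names, and the idea of thresholding a metric into a yes/no feature are assumptions made for this example; this is not the Weka procedure used in the study.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: one entry per file.
failure_prone = [True, True, True, False, False, False, False, False]
# Revisions above a made-up threshold.
many_revisions = [True, True, True, False, False, False, False, True]
# Age above a made-up threshold.
old_file = [True, False, True, False, True, False, True, False]

gain_revisions = information_gain(many_revisions, failure_prone)
gain_age = information_gain(old_file, failure_prone)
```

In this toy data the revisions feature separates the two classes far better than the age feature, so its gain is higher and it would be ranked as the better predictor.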

Page 4: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Product Line Evolution

Product line evolution in two dimensions

[Figure: grid of products P1, P2, ..., Pn against releases R1, R2, R3, ..., Rn; axes: New Releases, New Products]

Page 5: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Motivation

Can we leverage the decreasing amount of change in product lines to better predict failure-prone files?

[Figure: plot of Change against Time]

Page 6: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Eclipse case study

Eclipse products: Eclipse Classic, Eclipse Java, Eclipse JavaEE, Eclipse C/C++

Change data collected for 6 months before and after each release.

Failure categories: Blocker, Critical, Major, Normal, Minor
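As a rough illustration of how the failure categories and the six-month window could be combined into a failure-prone label, here is a minimal sketch. The record shape `(file_path, severity, reported_on)`, the function name, the 183-day approximation of six months, and all sample data are assumptions for this example and are not taken from the study.

```python
from datetime import date, timedelta

# Severities listed on the slide as failure categories.
FAILURE_SEVERITIES = {"blocker", "critical", "major", "normal", "minor"}

# Assumption: "6 months" is approximated as 183 days.
WINDOW = timedelta(days=183)


def failure_prone_files(bug_reports, release_date):
    """Return the set of files with at least one qualifying post-release bug.

    bug_reports is an iterable of (file_path, severity, reported_on) tuples,
    a hypothetical record shape chosen for this example.
    """
    flagged = set()
    for file_path, severity, reported_on in bug_reports:
        in_window = release_date < reported_on <= release_date + WINDOW
        if in_window and severity.lower() in FAILURE_SEVERITIES:
            flagged.add(file_path)
    return flagged


# Made-up release date and bug reports, for illustration only.
release = date(2010, 7, 1)
reports = [
    ("ui/Editor.java", "major", date(2010, 8, 1)),       # counted
    ("core/Parser.java", "trivial", date(2010, 7, 15)),  # severity not listed
    ("core/Index.java", "critical", date(2010, 5, 30)),  # before the release
    ("core/Cache.java", "normal", date(2011, 3, 1)),     # after the window
]
flagged = failure_prone_files(reports, release)
```

Only the first report falls inside the post-release window with a listed severity, so only that file is flagged.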

Page 7: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"


Research Questions

As a product evolves, do any change metrics serve as good predictors of failure-prone files?

Is there a subset of change metrics which are good predictors across all product line members?

Does our ability to predict failure-prone files improve as the product line evolves?


Page 8: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Findings

The change metrics provide good classification of the failure-prone files in the Eclipse product line.

As each product evolves, there is a stable set of change metrics that are prominent predictors of failure-prone files across its releases.

There is a subset of change metrics that is among the prominent predictors of all the products across most of the releases.

As the product line matures, prediction performance improves for each of the four Eclipse products.


Page 9: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Data Source

Source of failure reports – Eclipse Bugzilla database.

Source of change reports – CVS repository of Eclipse.

Data Timeline

[Figure: data timeline, divided into Replication and Extension periods]

Page 10: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Approach

Weka J48 decision tree learner

Get prediction results + best predictors

Page 11: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Replication Results

Learner performance compared to previous results

Classification performance comparison for Eclipse Classic 2.0, 2.1, and 3.0

Release   Moser et al. study   This study
          PC   TPR  FPR        PC   TPR  FPR
2.0       82   69   11         88   55   5
2.1       83   60   10         85   63   9
3.0       80   65   13         84   62   9

PC – Percentage of correctly classified instances

TPR – True positive rate

FPR – False positive rate
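The three measures reported here follow from a confusion matrix in the usual way. The sketch below shows the arithmetic, treating failure-prone files as the positive class. The function name and the counts are made up for illustration and do not come from the study's data.

```python
def classification_rates(tp, fp, tn, fn):
    """Return (PC, TPR, FPR) as percentages from confusion-matrix counts.

    tp: failure-prone files classified as failure-prone
    fp: other files classified as failure-prone
    tn: other files classified as not failure-prone
    fn: failure-prone files classified as not failure-prone
    """
    total = tp + fp + tn + fn
    pc = 100.0 * (tp + tn) / total   # correctly classified instances
    tpr = 100.0 * tp / (tp + fn)     # share of failure-prone files found
    fpr = 100.0 * fp / (fp + tn)     # share of other files wrongly flagged
    return pc, tpr, fpr


# Hypothetical counts for 500 files, 100 of them failure-prone.
pc, tpr, fpr = classification_rates(tp=55, fp=20, tn=380, fn=45)
```

With these counts PC is 87, TPR is 55 and FPR is 5, which shows how a learner can reach a high PC while still missing many failure-prone files when that class is the minority.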

Page 12: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Replication Results

Top five predictors for earlier releases of Eclipse Classic

Release 2.0
Moser study: Max_Changeset, Revisions, Age, Bugfixes, Refactorings
This study: Revisions, Weighted_Age, Ave_Changeset, Bugfixes, Max_Loc_Added

Release 2.1
Moser study: Bugfixes, Max_Changeset, Revisions, Max_Added, Max_Loc_Deleted
This study: Revisions, CodeChurn, Age, Weighted_Age, Loc_Deleted

Release 3.0
Moser study: Revisions, Max_Changeset, Bugfixes, Age, Ave_Loc_Added
This study: Revisions, Authors, Weighted_Age, CodeChurn, Age

Top predictors from this study: Revisions, Weighted_Age

Top predictors from previous study: Max_Changeset, Bugfixes, Revisions

Page 13: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Extension Results

Learner performance improves as a single product evolves

Release          PC   TPR  FPR
2.0              88   55   5
2.1              85   63   9
3.0              84   62   9
3.3 (Europa)     93   79   4
3.4 (Ganymede)   94   81   3
3.5 (Galileo)    97   86   2
3.6 (Helios)     97   85   2

Page 14: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Extension Results

Top five predictors for later releases of Eclipse Classic

Release          Top five predictors
3.3 (Europa)     Ave_Loc_Added, Revisions, Authors, Max_Changeset, Weighted_Age
3.4 (Ganymede)   Revisions, Age, Ave_Changeset, Max_Loc_Added, Max_Changeset
3.5 (Galileo)    Loc_Added, Authors, Max_Changeset, Revisions, Weighted_Age
3.6 (Helios)     Authors, Revisions, Max_Changeset, Bugfixes, CodeChurn

Revisions remains a good predictor for the later releases. Max_Changeset is also a good predictor.

Page 15: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Extension Results

Learner performance improves as the product line evolves

Percentage of correctly classified instances increases across releases for each product

Page 16: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Extension Results

Learner performance improves as the product line evolves

Percentage of true positives shows improvement across releases for each product

Page 17: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Extension Results

Learner performance improves as the product line evolves

Percentage of false positives decreases across releases for each product

Page 18: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Extension Results

Top five predictors for four products of Eclipse Product Line

Product   Predictors
Java      Max_Changeset: 4, Revisions: 3, Authors: 3, CodeChurn: 3
JavaEE    Authors: 4, Max_Changeset: 4, Revisions: 3, Bugfixes: 3
C/C++     Revisions: 4, Authors: 3, Max_Changeset: 3, Age: 3
Classic   Revisions: 4, Max_Changeset: 4, Authors: 3

No set of predictors is common to every product and every release.

Max_Changeset, Revisions and Authors are prominent predictors for all products.

Some predictors are prominent for only one product.
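The per-product counts appear to be the number of releases in which a metric is among the top five predictors. The sketch below tallies the Eclipse Classic top-five lists for releases 3.3 to 3.6 as given earlier in this deck and arrives at the same Classic figures. The cut-off of three releases for calling a metric prominent is an assumption made here for illustration, as are the variable names.

```python
from collections import Counter

# Top-five lists for the later Eclipse Classic releases, copied from the deck.
top_five_classic = {
    "3.3 (Europa)": ["Ave_Loc_Added", "Revisions", "Authors", "Max_Changeset", "Weighted_Age"],
    "3.4 (Ganymede)": ["Revisions", "Age", "Ave_Changeset", "Max_Loc_Added", "Max_Changeset"],
    "3.5 (Galileo)": ["Loc_Added", "Authors", "Max_Changeset", "Revisions", "Weighted_Age"],
    "3.6 (Helios)": ["Authors", "Revisions", "Max_Changeset", "Bugfixes", "CodeChurn"],
}

# Count in how many releases each metric appears among the top five.
counts = Counter(
    metric for metrics in top_five_classic.values() for metric in metrics
)

# Assumption: "prominent" means appearing in at least three of the four releases.
PROMINENT_MIN_RELEASES = 3
prominent = sorted(m for m, n in counts.items() if n >= PROMINENT_MIN_RELEASES)
```

Under that assumed cut-off, `prominent` contains Authors, Max_Changeset and Revisions, matching the Classic entry.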

Page 19: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"

Research Questions Revisited

RQ1: As a product evolves, do any change metrics serve as good predictors of failure-prone files?

• There is a small set of change metrics which serve as good predictors.

RQ2: Is there a subset of change metrics which are good predictors across all product line members?

• There is no subset of metrics common to all products and all releases.
• There are some differences in the set of good predictors between products.

RQ3: Does our ability to predict failure-prone files improve as the product line evolves?

• As the Eclipse product line evolves, the predictions of failure-prone files improve.

Page 20: Promise 2011: "Are Change Metrics Good Predictors for an Evolving Software Product Line?"


Thank You!

Our data is available at http://www.cs.iastate.edu/~lss/PROMISE11Data.tar.gz
