Benevol 2010

description

I used these slides during my presentation at BeNeVol 2010 in Lille, France. Paper: Vasilescu B, Serebrenik A and van den Brand MGJ (2010), "Comparative study of software metrics' aggregation techniques", In Proceedings of the 9th Belgian-Netherlands Software Evolution Seminar, pp. 80-84.

Transcript of Benevol 2010

Page 1: Benevol 2010

Software metrics are usually right-skewed

[Figure: Histogram of SLOC(org.argouml.ui). x-axis: SLOC for classes in org.argouml.ui (0 to 500); y-axis: Frequency (0 to 25).]
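A right-skewed distribution has a long tail of large values, which pulls the mean above the median. The following is a minimal sketch of that effect, assuming the moment-based skewness coefficient; the SLOC values are made up for illustration (they are not the ArgoUML data) and the name `sample_skewness` is hypothetical.

```python
from statistics import mean, median


def sample_skewness(values):
    """Moment-based skewness: third central moment over variance^(3/2).

    Positive results indicate a right-skewed sample. Raises
    ZeroDivisionError when all values are equal (zero variance).
    """
    n = len(values)
    m = mean(values)
    second = sum((v - m) ** 2 for v in values) / n
    third = sum((v - m) ** 3 for v in values) / n
    return third / second ** 1.5


# Hypothetical class sizes: many small classes and one very large one.
sloc = [5, 8, 10, 12, 15, 20, 25, 40, 90, 450]

print("mean:    ", mean(sloc))
print("median:  ", median(sloc))
print("skewness:", sample_skewness(sloc))
```

With a sample like this, the mean describes almost none of the classes, which is the motivation for looking at other aggregation techniques.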

Page 2: Benevol 2010

Aggregation of software metrics using the “softnometric” index

Bogdan Vasilescu

Eindhoven University of Technology, The Netherlands

March 9, 2011

Page 3: Benevol 2010

Aggregation techniques

Classical:
• Mean
• Sum
• Cardinality

Distribution fitting:
• Log-normal
• Exponential
• Negative binomial

Inequality indices:
• Theil
• Gini
• Kolm
• Atkinson

Page 5: Benevol 2010

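Gini index

The Gini index is based on the Lorenz curve:
• the proportion of the total income of the population (y-axis) cumulatively earned by the bottom x% of the people (x-axis).
• 0, perfect equality: every person receives the same income.
• 1, perfect inequality: one person receives all the income.

$I_{\mathrm{Gini}}(X) = \frac{A}{A+B}$, where A is the area between the line of perfect equality and the Lorenz curve, and B is the area under the Lorenz curve.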
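The area-based definition can be sketched directly in code. The example below assumes A is the area between the diagonal (perfect equality) and the Lorenz curve and B is the area under the Lorenz curve, with the curve drawn piecewise-linearly through the sample points; the function names are illustrative and not taken from the paper. Under this construction a finite sample of n values reaches at most 1 − 1/n, approaching 1 as n grows.

```python
def lorenz_curve(values):
    """Cumulative share of the total held by the poorest 0, 1, ..., n items."""
    xs = sorted(values)
    total = sum(xs)
    if not xs or xs[0] < 0 or total <= 0:
        raise ValueError("need non-negative values with a positive total")
    points = [0.0]
    running = 0.0
    for x in xs:
        running += x
        points.append(running / total)
    return points


def gini(values):
    """Gini index as A / (A + B), using trapezoids under the Lorenz curve."""
    points = lorenz_curve(values)
    n = len(points) - 1
    # B: area under the Lorenz curve; each trapezoid has width 1/n.
    area_b = sum(points[i] + points[i + 1] for i in range(n)) / (2 * n)
    # A: area between the diagonal (total area 1/2 below it) and the curve.
    area_a = 0.5 - area_b
    return area_a / (area_a + area_b)


print(gini([1, 1, 1, 1]))   # every item has the same value
print(gini([0, 0, 0, 10]))  # one item holds everything
```

Applied to software metrics, "income" becomes the metric value (for example SLOC) and "people" become the classes in a package.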

Page 7: Benevol 2010

Theoretical comparison

Criteria:
• Domain → determines applicability
• Range → determines interpretation
• Invariance
    • w.r.t. addition → LOC, ignore headers
    • w.r.t. multiplication → LOC, percentages vs. absolute values
• Decomposability → explain inequality by partitioning the population into groups
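The two invariance criteria can be checked numerically. The sketch below uses the textbook forms of the Atkinson index (invariant under multiplication) and the Kolm index (invariant under addition); the parameter defaults `epsilon=0.5` and `beta=1.0` are assumptions chosen for illustration, since the slides do not state which values were used, and the SLOC values are made up.

```python
from math import exp, log


def atkinson(values, epsilon=0.5):
    """Atkinson index for 0 < epsilon < 1 (epsilon is an assumed parameter)."""
    n = len(values)
    mu = sum(values) / n
    ede = (sum(v ** (1 - epsilon) for v in values) / n) ** (1 / (1 - epsilon))
    return 1 - ede / mu


def kolm(values, beta=1.0):
    """Kolm index (1/beta) * ln(mean(exp(beta * (mu - x)))), beta > 0 assumed."""
    n = len(values)
    mu = sum(values) / n
    exponents = [beta * (mu - v) for v in values]
    peak = max(exponents)  # subtract the maximum to avoid overflow in exp
    log_mean_exp = peak + log(sum(exp(e - peak) for e in exponents) / n)
    return log_mean_exp / beta


sloc = [12, 40, 45, 60, 300]      # hypothetical class sizes
scaled = [3 * v for v in sloc]    # multiplication, e.g. a change of unit
shifted = [v + 10 for v in sloc]  # addition, e.g. a fixed header per file

print("Atkinson:", atkinson(sloc), atkinson(scaled), atkinson(shifted))
print("Kolm:    ", kolm(sloc), kolm(scaled), kolm(shifted))
```

Atkinson should agree on the original and scaled samples but not on the shifted one; Kolm behaves the other way round. This is why the choice of index matters when, say, every file carries a license header of fixed length.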

Page 8: Benevol 2010

Theoretical comparison

Agg. technique   Domain   Range            Invariance   Decomposability
Mean             R        R                -            N/A
Sum              R        R                -            N/A
Cardinality      R        N                -            N/A
Gini Index       R+       [0, 1]           mult.        -
                 R        R                mult.        -
Theil Index      R+       [0, log n]       mult.        yes
Kolm Index       R        R+               add.         yes
Atkinson Index   R+       [0, 1 − 1/n]     mult.        -
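Two entries in the Theil row can be illustrated with a short sketch: the range [0, log n] and the "yes" under decomposability. The code assumes the textbook Theil index and its standard within-group/between-group decomposition; the grouping and values are hypothetical, and every group is assumed to have a positive mean.

```python
from math import log


def theil(values):
    """Theil index: mean of (x/mu) * ln(x/mu), with 0 * ln(0) taken as 0."""
    n = len(values)
    mu = sum(values) / n
    return sum((v / mu) * log(v / mu) for v in values if v > 0) / n


def theil_decomposition(groups):
    """Split the overall Theil index into (within, between) components."""
    everything = [v for vals in groups.values() for v in vals]
    n = len(everything)
    mu = sum(everything) / n
    within = between = 0.0
    for vals in groups.values():
        group_mu = sum(vals) / len(vals)
        share = sum(vals) / (n * mu)  # the group's share of the total
        within += share * theil(vals)
        between += share * log(group_mu / mu)
    return within, between


# Range: 0 for perfect equality, log n when one item holds everything.
print(theil([5, 5, 5, 5]), theil([0, 0, 0, 10]), log(4))

# Decomposability: hypothetical SLOC values grouped by subpackage.
groups = {"ui": [12, 40, 300], "model": [45, 60]}
within, between = theil_decomposition(groups)
print(within + between, theil([12, 40, 300, 45, 60]))
```

The two components should add up to the overall index, which is what allows inequality to be attributed to differences inside groups versus differences between them.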

Page 9: Benevol 2010

Empirical comparison

Research questions:

• Does LOC relate to bugs?

• Do the aggregation techniques influence the presence/strength of this relation?

• Is there any difference between the aggregation techniques? Do they express the same thing?

Page 10: Benevol 2010

Empirical comparison

Case study: ArgoUML
• Open-source, ∼1200 Java classes, ∼100 packages.

Methodology:
• Tool chain to automatically process issue tracker and version control system data.
• Mapped defects to Java classes and then packages.
• Measured SLOC of each class, aggregated to package level.
• For each aggregation technique, statistically studied correlation with bugs.
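The last two methodology steps can be sketched as follows. This is not the authors' tool chain: the class names and numbers are made up, the mean stands in for any aggregation technique, and Kendall's tau-b is an assumed choice of correlation coefficient, since the slides do not name the one used.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def package_of(class_name):
    """Package part of a fully qualified class name."""
    return class_name.rsplit(".", 1)[0]


def aggregate_by_package(class_values, aggregate):
    """Group per-class values by package and apply an aggregation function."""
    grouped = defaultdict(list)
    for class_name, value in class_values.items():
        grouped[package_of(class_name)].append(value)
    return {pkg: aggregate(vals) for pkg, vals in grouped.items()}


def kendall_tau_b(xs, ys):
    """Kendall's tau-b rank correlation (handles ties), O(n^2) pair scan."""
    concordant = discordant = ties_x = ties_y = 0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            dx, dy = xs[i] - xs[j], ys[i] - ys[j]
            if dx == 0 and dy == 0:
                continue  # tied in both: excluded from every term
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = sqrt((concordant + discordant + ties_x)
                 * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical inputs: SLOC and defect counts per class.
class_sloc = {"app.ui.Window": 300, "app.ui.Button": 40,
              "app.model.Node": 120, "app.model.Edge": 80,
              "app.util.Text": 20}
class_defects = {"app.ui.Window": 5, "app.ui.Button": 1,
                 "app.model.Node": 2, "app.model.Edge": 1,
                 "app.util.Text": 0}

mean_sloc = aggregate_by_package(class_sloc, mean)
defects = aggregate_by_package(class_defects, sum)
packages = sorted(mean_sloc)
tau = kendall_tau_b([mean_sloc[p] for p in packages],
                    [defects[p] for p in packages])
print(tau)
```

Swapping `mean` for another function of a list of values (an inequality index, for instance) repeats the study for a different aggregation technique.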

Page 12: Benevol 2010

Results

             mean   I_Gini   I_Theil   I_Kolm    I_Atkinson   defects
mean                0.170    0.192     0.676¹    0.203        0.0096
I_Gini                       0.908     0.467     0.903        0.27
I_Theil                                0.488     0.918        0.273
I_Kolm                                           0.501        0.119
I_Atkinson                                                    0.229

• I_Gini, I_Theil and I_Atkinson indicate the strongest and also statistically significant correlation with the number of defects. However, there is a high and statistically significant correlation between them.

• Mean indicates the lowest correlation with the number of defects.

¹ Statistically significant correlations, with two-sided p-values not exceeding 0.01, are typeset in boldface.

Page 13: Benevol 2010

Threats to validity

No control over the issue tracker → mapping of defects to classes.
• bugs missing from the issue tracker.
• bug fixes not showing up in the commit log.

How representative is the case? How about the version?
• replicate on more systems and more versions.

Is LOC the most suitable metric?
• replicate with more metrics.

Page 14: Benevol 2010

Conclusions

Software metrics are not distributed normally.

[Figure: Histogram of SLOC(org.argouml.ui). x-axis: SLOC for classes in org.argouml.ui (0 to 500); y-axis: Frequency (0 to 25).]

Theoretical comparison.

Agg. technique   Domain   Range            Invariance   Decomposability
Mean             R        R                -            N/A
Sum              R        R                -            N/A
Cardinality      R        N                -            N/A
Gini Index       R+       [0, 1]           mult.        -
                 R        R                mult.        -
Theil Index      R+       [0, log n]       mult.        yes
Kolm Index       R        R+               add.         yes
Atkinson Index   R+       [0, 1 − 1/n]     mult.        -

Empirical comparison.

           mean   Gini    Theil   Kolm    Atkinson   defects
mean              0.170   0.192   0.676   0.203      0.0096
Gini                      0.908   0.467   0.903      0.27
Theil                             0.488   0.918      0.273
Kolm                                      0.501      0.119
Atkinson                                             0.229

Classical aggregation techniques have problems when distributions are skewed. Inequality indices look more promising.