Confidence in Software Cost Estimation Results based on MMRE and PRED

Presentation for PROMISE 2008

13 May 2008

Marcel Korte, [email protected]

Dan Port, University of Hawai'i at Manoa

Phone: +1-(808)[email protected]

Table of Contents

Introduction
Approach
The Standard Error
Bootstrapping
The Confidence Intervals
Datasets and models used
Ex.: Bootstrapped MMREs
Accounting for Standard Error
How much confidence needed?
The Desharnais Problem
Conclusion
Invitation for collaboration

Introduction

Large number of cost estimation research efforts over last 20+ years

Still lack of confidence in such research results

Average overrun of software projects is 30% - 40% (Moløkken, Jørgensen)

Various studies show inconclusive and/or contradictory results


Approach

Software cost estimation research is based on one or more datasets

Yet datasets are samples, perhaps significantly biased, often outdated, and of questionable relevancy

Empirical results based on small datasets are generalized to an entire population without considering the error inherent in the sample

Question: How accurate is my accuracy?
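The "accuracy" in question is usually reported as MMRE or PRED. As a reminder of what those measures compute, here is a minimal sketch using their standard definitions. The function names and the four project values are hypothetical and do not come from the study.

```python
def mre(actual, estimated):
    """Magnitude of relative error for a single project."""
    return abs(actual - estimated) / actual


def mmre(actuals, estimates):
    """Mean magnitude of relative error over all projects (lower is better)."""
    mres = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(mres) / len(mres)


def pred(actuals, estimates, level=0.25):
    """PRED(level): fraction of projects whose MRE is at most `level` (higher is better)."""
    mres = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(1 for m in mres if m <= level) / len(mres)


# Hypothetical actual and estimated efforts, for illustration only.
actuals = [100.0, 200.0, 400.0, 50.0]
estimates = [120.0, 150.0, 400.0, 80.0]

print("MMRE:", mmre(actuals, estimates))
print("PRED(.25):", pred(actuals, estimates, 0.25))
```

Both values are computed from one sample of projects, which is why the question of their own accuracy arises.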


The Standard Error

Widely used in many fields of research and well understood

Measure of the error in calculations based on sample population datasets

Has not been used in the field of software cost estimation yet

Many confusing, inconclusive, or contradictory results can be illuminated by showing that, once the standard error is considered, we cannot “have confidence” in them.
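For reference, the textbook definition of the standard error of an estimator, and its familiar special case for a sample mean, can be written as below. The symbols are assumptions introduced here, not notation from the slides: \hat{\theta} is any statistic computed from a sample (for example MMRE or PRED), and x_1, ..., x_n are the sample values (for MMRE, the individual MREs).

```latex
\mathrm{SE}(\hat{\theta}) = \sqrt{\operatorname{Var}(\hat{\theta})},
\qquad
\mathrm{SE}(\bar{x}) \approx \frac{s}{\sqrt{n}},
\quad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
```

The closed form on the right applies only to the sample mean. For statistics without a convenient formula, the standard error has to be approximated by other means.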


Bootstrapping

General problem: Distribution not known

“Computer intensive” technique similar to Monte-Carlo method

Resampling with replacement to “reconstruct” the general population distribution

Well-accepted, straightforward approach to approximating the standard error of an estimator

We used 15,000 iterations in this study
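The resampling idea can be sketched in a few lines. This is a minimal illustration of a bootstrap standard error under stated assumptions, not the code used in the study: the MRE values are hypothetical, the function names are invented for this example, and only the iteration count (15,000) is taken from the slide.

```python
import random
import statistics


def bootstrap_replicates(values, statistic, iterations=15000, seed=0):
    """Resample `values` with replacement and recompute `statistic` each time."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(values)
    return [statistic(rng.choices(values, k=n)) for _ in range(iterations)]


def bootstrap_standard_error(values, statistic, iterations=15000, seed=0):
    """Approximate the standard error as the spread of the bootstrap replicates."""
    return statistics.stdev(bootstrap_replicates(values, statistic, iterations, seed))


# Hypothetical per-project MREs; MMRE is simply their mean.
mres = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.50, 0.80, 1.40]

se_mmre = bootstrap_standard_error(mres, statistics.fmean)
print("MMRE:", statistics.fmean(mres))
print("bootstrap standard error of MMRE:", se_mmre)
```

The same two functions work for PRED by passing a statistic that returns the fraction of MREs at or below the chosen level.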


The Confidence Intervals

MREs are not normally distributed

Underlying distribution is not known

BC-percentile, or “bias corrected”, method has been shown effective in approximating confidence intervals for the available distributions
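A minimal sketch of a bias-corrected percentile interval follows, using the common textbook formulation (a bias-correction constant z0 without an acceleration term). Whether the study used exactly this variant is an assumption; the function name and the MRE values are hypothetical.

```python
import random
import statistics
from statistics import NormalDist


def bc_percentile_interval(values, statistic, confidence=0.95,
                           iterations=15000, seed=0):
    """Bias-corrected (BC) percentile bootstrap interval for `statistic`."""
    rng = random.Random(seed)
    n = len(values)
    observed = statistic(values)
    replicates = sorted(
        statistic(rng.choices(values, k=n)) for _ in range(iterations)
    )

    # z0 measures how far the observed statistic sits from the bootstrap median.
    below = sum(1 for r in replicates if r < observed)
    # Keep the proportion strictly inside (0, 1) so inv_cdf is defined.
    proportion = min(max(below / iterations, 1 / (iterations + 1)),
                     iterations / (iterations + 1))
    normal = NormalDist()
    z0 = normal.inv_cdf(proportion)

    # Shift the usual percentile points by 2 * z0.
    alpha = 1.0 - confidence
    lower_p = normal.cdf(2 * z0 + normal.inv_cdf(alpha / 2))
    upper_p = normal.cdf(2 * z0 + normal.inv_cdf(1 - alpha / 2))

    def value_at(p):
        index = min(max(int(p * iterations), 0), iterations - 1)
        return replicates[index]

    return value_at(lower_p), value_at(upper_p)


# Hypothetical per-project MREs, for illustration only.
mres = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.50, 0.80, 1.40]
lower, upper = bc_percentile_interval(mres, statistics.fmean)
print("95% BC interval for MMRE:", lower, upper)
```

When the bootstrap distribution is symmetric around the observed value, z0 is close to zero and the result reduces to the plain percentile interval.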


Figure: Histograms of bootstrapped MMRE and log-transformed MMRE for model (A), NASA93 dataset

Statistic  MMRE  log-transformed MMRE
Average    0.20  -1.63
Median     0.20  -1.63
Mode       #N/A  #N/A
Skewness   0.70  0.19
Kurtosis   0.46  -0.33

Datasets and models used

PROMISE Datasets: COCOMO81*, COCOMONASA, NASA93, and Desharnais*

Models:
A: ln_LSR_CAT**
B: aSb
C: given_EM
D: ln_LSR_aSb
E: ln_LSR_EM
F: LSR_a+Sb

* Some errors found and corrected in these datasets
** Purely statistical model


Bootstrapped MMRE intervals 1/2

COCOMO81 dataset

COCOMONASA dataset

Bootstrapped MMRE intervals 2/2

NASA93 dataset

Desharnais dataset (note: only models D and F were used, with FP raw and FP adj)

Accounting for Standard Error

Rank  COCOMO81  COCOMONASA  NASA93
1.    A         A           A
2.    E         E           E
3.    C         C           C
4.    B         D           B
5.    D         B           D

Model ranking based on MMRE, not accounting for Standard Error.

Rank  COCOMO81  COCOMONASA  NASA93
1.    A         A           A, B, C, D, E
2.    C, E      E           -
3.    B, D      B, C, D     -
4.    -         -           -
5.    -         -           -

Model ranking based on MMRE, accounting for Standard Error at 95% confidence level.
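One way such tied ranks could be produced is to group models whose confidence intervals overlap. The sketch below assumes that rule, which the slides do not spell out, and uses made-up MMRE values and intervals chosen only to reproduce the shape of a ranking with ties. Because interval overlap is not transitive, each model is compared against the best model of the current tier, which is one of several possible conventions.

```python
def overlaps(a, b):
    """True if the closed intervals a=(lo, hi) and b=(lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


def rank_with_intervals(results):
    """Group models into rank tiers; lower MMRE is better.

    `results` maps a model name to (point_estimate, (lower, upper)).
    A model joins the current tier if its interval overlaps the interval
    of that tier's best model; otherwise it starts a new tier.
    """
    ordered = sorted(results.items(), key=lambda item: item[1][0])
    tiers = []
    leader_interval = None
    for name, (_, interval) in ordered:
        if tiers and overlaps(leader_interval, interval):
            tiers[-1].append(name)
        else:
            tiers.append([name])
            leader_interval = interval
    return tiers


# Hypothetical MMREs and 95% intervals, not the values from the study.
results = {
    "A": (0.20, (0.15, 0.26)),
    "B": (0.60, (0.48, 0.75)),
    "C": (0.35, (0.28, 0.45)),
    "D": (0.62, (0.50, 0.80)),
    "E": (0.33, (0.27, 0.42)),
}

for rank, tier in enumerate(rank_with_intervals(results), start=1):
    print(f"{rank}. {', '.join(tier)}")
```

With a point-estimate ranking alone, these five hypothetical models would occupy five distinct ranks; the overlap rule collapses them into three.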

How much confidence needed?

Bootstrapped PRED(.30) intervals with significant differences (32% confidence level, COCOMONASA dataset)*

* This is a very crude example. There are more refined approaches that account for simultaneous (ANOVA-like) comparisons.

Bootstrapped PRED(.30) intervals (COCOMONASA dataset)
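The crude comparison described on this slide can be sketched as a search for the highest confidence level at which two models' bootstrap intervals no longer overlap. Everything here is an illustrative assumption: plain percentile intervals are used for brevity, the function names are invented, and the replicate values are synthetic rather than bootstrapped from COCOMONASA.

```python
def percentile_interval(replicates, confidence):
    """Plain percentile interval taken from a list of bootstrap replicates."""
    ordered = sorted(replicates)
    last = len(ordered) - 1
    alpha = 1.0 - confidence
    lower = ordered[int(alpha / 2 * last)]
    upper = ordered[int((1 - alpha / 2) * last)]
    return lower, upper


def highest_separating_level(replicates_a, replicates_b, levels):
    """Return the highest level in `levels` at which the intervals are disjoint."""
    for level in sorted(levels, reverse=True):
        a = percentile_interval(replicates_a, level)
        b = percentile_interval(replicates_b, level)
        if a[1] < b[0] or b[1] < a[0]:
            return level
    return None  # the intervals overlap at every level tried


levels = [0.99, 0.95, 0.90, 0.68, 0.32]

# Synthetic PRED(.30) replicates for two hypothetical models.
pred_a = [0.40 + 0.001 * i for i in range(101)]  # spans 0.40 to 0.50
pred_c = [0.45 + 0.001 * i for i in range(101)]  # spans 0.45 to 0.55

print(highest_separating_level(pred_a, pred_c, levels))
```

A low separating level means a claimed difference between the two models holds only if one accepts a correspondingly high chance of being wrong.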

The Desharnais Problem

Rank  MMRE  PRED(.25)
1.    F     D
2.    D     F


Model ranking not accounting for Standard Error (Desharnais, FP adj) implies contradictory results.

Rank  MMRE  PRED(.25)
1.    F, D  F, D
2.    -     -

Model ranking accounting for Standard Error (Desharnais, FP adj).

No confident interpretation is possible based on the Desharnais dataset and models D, F

Conclusions 1/2

We applied standard, easily analyzed and replicated statistical methods: Standard Error, Bootstrapping

Approach has potential for increasing confidence in research results and cost estimation practice

Use of Standard Error can help address:
◦ How can we meaningfully interpret intuitively appealing accuracy measure research results?
◦ How can we make valid (i.e. statistically significant) inferences from results based on comparing PRED or MMRE values?
◦ How many data points are needed for confident results?

Conclusions 2/2

◦ The different behaviors of MMRE and PRED (expanded on in the ESEM 2008 paper)
◦ Determination of an adequate sample size for model calibration.
◦ Understanding how sample size affects model accuracy.
◦ Can “bad” calibration data be identified?
◦ If doing model validation studies using random methods (such as Jackknife, holdouts, or bootstrap), how many iterations are needed for stable results?
◦ Why are some cost estimation study results contradictory and how can these be resolved?

Invitation for collaboration

ESEM08 paper: “Comparative Studies of the Model Evaluation Criterions MMRE and PRED in Software Cost Estimation Research” (Port, Korte)

There is much interesting work still to be done in this area, such as:
• Standard error studies of non-COCOMO models
• Refinement of “how much data is enough?” methods
• Standard error studies of the “deviation” problem (i.e. variance in model parameters) (Menzies et al.)
• Validation of model selection when reducing parameters (Menzies et al.)
• Applying standard statistical methods for model accuracy (e.g. MSE, least-likelihood estimators)

As suggested by Tim Menzies, we are keen to “crowd source” this research, so if this presentation has inspired you in some way, contact Dan Port ([email protected]) and let's discuss possible collaborations!

Thank you!

13 May 2008

Marcel Korte, [email protected]

Dan Port, University of Hawai'i at Manoa

Phone: +1-(808)[email protected]
