A Hybrid Approach to Expert and Model Based Effort Estimation

Thesis Defense Daniel Baker West Virginia University “A Hybrid Approach to Expert and Model Based Effort Estimation” December 3rd, 2007


Masters thesis, CSEE, WVU

Transcript of A Hybrid Approach to Expert and Model Based Effort Estimation

Page 1: A Hybrid Approach to Expert and Model Based Effort Estimation

Thesis Defense

Daniel Baker

West Virginia University

“A Hybrid Approach to Expert and Model Based Effort Estimation”

December 3rd, 2007

Page 2: Introduction

● Research since May 2006
● WVU and NASA JPL
● Key Thesis Points
  Human / Machine Integration
  “White Box”, not “Black Box”
  Embrace Uncertainty
  What not to do
  Use proper method evaluation
  Research => Industrial Application / Evaluation


Page 3: What is Software Effort Estimation?

● Approximation of the amount of work required to develop a software project

● Made with a unit of effort such as work-months

● Effort can be converted to cost
● Effort is useful across time periods, geographic locations, and industries
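The effort-to-cost conversion above is simple arithmetic; a minimal sketch, where the monthly labor rate is a hypothetical figure and not a number from the thesis:

```python
# Hedged sketch: converting an effort estimate (in work-months) to cost.
# The rate below is a hypothetical fully-loaded labor rate, for illustration only.

def effort_to_cost(work_months, monthly_rate):
    """Cost = effort (work-months) x fully-loaded monthly labor rate."""
    return work_months * monthly_rate

# e.g. 120 work-months at a hypothetical $15,000/month rate
cost = effort_to_cost(120, 15_000)  # -> 1800000
```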


Page 4: Why is effort estimation important?

● An accurate estimate is needed to budget new projects

● Bad estimates can lead to projects being canceled

● “To gain control over its finances, NASA last week scuttled a new checkout and launch control system (CLCS) for the space shuttle. A recent assessment of the CLCS, which the space agency originally estimated would cost $206 million to field, estimated that costs would swell to between $488 million and $533 million by the time the project was completed.” – June 11, 2003, Computer News

Page 5: How is software effort estimation done?

● Expert Judgment
  Estimate is largely based on unrecoverable human intuition
  Used by most of industry
● Model Estimation
  Estimate is made as some function of historical data
  The topic of most research

● Jorgensen reviewed 15 studies using both methods and found that, “of the fifteen studies, we categorize five to be in favor of expert estimation, five to find no difference, and five to be in favor of model-based estimation.”
● M. Jorgensen. A review of studies on expert estimation of software development effort. Journal of Systems and Software, 70(1-2):37–60, 2004.


Page 6: A Hybrid Approach

● Recent studies have suggested new research in combining expert judgment and prediction models (Shepperd, Meli).

● “a conceptual framework of integration and a set of operational rules to follow (Meli)”

● Two ways to do this: full combination “from scratch”, or add aspects of one to the other based on existing methods

● Decided to add expert-judgment qualities to existing model prediction (Don't reinvent the wheel)

● M. Shepperd. Software project economics: A roadmap. In International Conference on Software Engineering 2007: Future of Software Engineering, 2007.
● Roberto Meli. Human factors and analytical models in software estimation: An integration perspective. In Proceedings of ESCOM-SCOPE 2000, pages 33–40, Munich, Germany, 2000. Shaker Publishing.

Page 7: How to Add Expert Qualities?

● Increase the involvement of the cost analyst with the model: “white box”, not “black box”
● Automate Jorgensen's Expert Judgment Best Practices:
  1. evaluate estimation accuracy, but avoid high evaluation pressure
  2. avoid conflicting estimation goals
  3. ask the estimators to justify and criticize their estimates
  4. avoid irrelevant and unreliable estimation information
  5. use documented data from previous development tasks
  6. find estimation experts with relevant domain background
  7. estimate top-down and bottom-up, independently of each other
  8. use estimation checklists
  9. combine estimates from different experts and estimation strategies
  10. assess the uncertainty of the estimate
  11. provide feedback on estimation accuracy
  12. provide estimation training opportunities

● M. Jorgensen. A review of studies on expert estimation of software development effort. Journal of Systems and Software, 70(1-2):37–60, 2004.

Page 8: Which Model-based Methods?

● COCOMO (Boehm 2000)
  Clearly defined algorithm
● Uses historical data to calibrate a linear model relating code size to effort (also uses “cost drivers”)
  Publicly available data
  Used by the client: NASA JPL
  Solid performer, heavily researched

● Feature and Record Selection techniques may improve performance, create simpler, more stable models, and “avoid irrelevant and unreliable estimation information.”

● Boosting and Bagging
● Barry Boehm, Ellis Horowitz, Ray Madachy, Donald Reifer, Bradford K. Clark, Bert Steece, A. Winsor Brown, Sunita Chulani, and Chris Abts. Software Cost Estimation with Cocomo II. Prentice Hall, 2000.
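The COCOMO relation the slide describes can be sketched in a few lines. This is a simplified sketch, not the thesis's 2CEE implementation; the default a and b are COCOMO II's nominal constants, and any effort multipliers passed in are illustrative placeholders rather than official cost-driver values:

```python
from math import prod

# Simplified COCOMO-style effort equation:
#   effort (person-months) = a * KSLOC^b * product(effort multipliers)
# a and b are calibrated from historical data; defaults are the COCOMO II
# nominal constants. Cost-driver multipliers here are illustrative only.

def cocomo_effort(ksloc, a=2.94, b=0.91, effort_multipliers=(1.0,)):
    """Estimate effort in person-months from size in thousands of lines of code."""
    return a * ksloc ** b * prod(effort_multipliers)

nominal = cocomo_effort(100)                       # 100 KSLOC, nominal drivers
penalized = cocomo_effort(100, effort_multipliers=(1.17, 1.10))  # hypothetical drivers
```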


Page 9: Method Evaluation

● The model-based methods should be validated with historical data

● Conclusions drawn from software effort predictors are notoriously unstable

● Source: a small number of outliers

● Solution: evaluate using nonparametric techniques

● Median MRE is a more consistent measure
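The point about the median can be seen in a small sketch. MRE is the magnitude of relative error, |actual - predicted| / actual; the data below are made-up values with one outlier prediction of the kind that destabilizes the mean:

```python
import statistics

# MRE = |actual - predicted| / actual. The median is robust to the few
# large outliers that drag the mean upward.

def mre(actual, predicted):
    return abs(actual - predicted) / actual

actuals   = [10, 12, 8, 11, 9]   # hypothetical actual efforts
predicted = [11, 10, 9, 30, 8]   # one wildly wrong prediction (30 vs 11)

mres = [mre(a, p) for a, p in zip(actuals, predicted)]
mean_mre   = statistics.mean(mres)    # inflated by the single outlier
median_mre = statistics.median(mres)  # stays representative of typical error
```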


Page 10: Laboratory Studies

● A large portion of this thesis was experimenting with new uses of COCOMO:
  Feature Subset Selection (FSS)
    ● Exhaustive Search (COCOMOST)
    ● Backward Elimination Search (dBFS)
    ● Near Linear Time Search (COCOMIN)
  Bagging and Boosting
    ● by Subsampling
    ● by Oversampling
    ● by Adaboost (Distributive Sampling)
● End result:
  FSS is sometimes useful; simple methods are as good as complicated ones
  Bagging and Boosting add complexity without significant improvement (subsampling might be useful)
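A generic one-pass selector conveys the flavor of the near-linear search; this is a hedged sketch, not the thesis's exact COCOMIN algorithm, and the toy error surface is invented for illustration:

```python
# One-pass feature subset selection sketch: visit each feature once,
# keeping it only if it lowers the evaluation error. O(n) model
# evaluations, versus O(2^n) for an exhaustive search like COCOMOST.

def one_pass_fss(features, error_with):
    """error_with(subset) -> e.g. median MRE of a model built on that subset."""
    selected = []
    best = error_with(selected)
    for f in features:
        trial = selected + [f]
        err = error_with(trial)
        if err < best:
            selected, best = trial, err
    return selected, best

# Toy error surface (illustrative only): 'size' and 'rely' help, 'cplx' hurts.
def toy_error(subset):
    err = 1.0
    if 'size' in subset: err -= 0.3
    if 'rely' in subset: err -= 0.1
    if 'cplx' in subset: err += 0.05
    return err

selected, best = one_pass_fss(['size', 'cplx', 'rely'], toy_error)
```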

Page 11: Industrial Evaluation

● Methods researched were implemented in a tool, 2CEE, and evaluated at JPL

● Calibrated with most recent NASA data

● Test records represented those estimated at JPL

● Newest methods => good improvements in median error


Page 12: Uncertainty Representation

● Kitchenham's* sources of uncertainty: Measurement, Model, Assumption, and Scope Error

● Cross-validation and representing the inputs' uncertainty cover each source except scope error
● Benefits:
  Risk management
  Budgeting
  Estimates can be made early with high uncertainty, and again later with more confidence

*Barbara Kitchenham and Stephen Linkman. Estimates, uncertainty, and risk. IEEE Softw., 14(3):69–74, 1997.
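One common way to represent input uncertainty is Monte Carlo sampling; a minimal sketch, not the 2CEE implementation, with hypothetical triangular bounds on the size input and the simplified COCOMO relation standing in for the calibrated model:

```python
import random
import statistics

# Sample the uncertain input, push each sample through the effort model,
# and report a range (5th percentile, median, 95th percentile) instead of
# a single point estimate. Bounds and model below are illustrative.

def estimate_distribution(model, low, mode, high, runs=10_000, seed=1):
    rng = random.Random(seed)
    samples = sorted(model(rng.triangular(low, high, mode)) for _ in range(runs))
    k = len(samples) // 20
    return samples[k], statistics.median(samples), samples[-k]

# Hypothetical project: size believed to be 80-150 KSLOC, most likely 100.
lo, mid, hi = estimate_distribution(lambda k: 2.94 * k ** 0.91, 80, 100, 150)
```

An early estimate would use wide bounds like these; a later one, with more known about the project, would narrow them and tighten the reported range.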

Page 13: Unexpected Results

● 2CEE’s data visualization provided insights into the calibration of COCOMO

● The linear COCOMO coefficient a and the exponential coefficient b are supposed to vary within:
  2.5 <= a <= 2.94
  0.91 <= b < 1.01
● However, calibration with NASA data shows much different numbers with higher variance
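The calibration behind these coefficients can be sketched as a log-space regression: taking logs turns effort = a * size^b into ln(effort) = ln(a) + b * ln(size), so a and b fall out of ordinary least squares on historical projects. The data points below are synthetic, for illustration only:

```python
import math

# Calibrate a and b from historical (size, effort) pairs via ordinary
# least squares in log space: ln(effort) = ln(a) + b * ln(size).

def calibrate(sizes_ksloc, efforts_pm):
    xs = [math.log(s) for s in sizes_ksloc]
    ys = [math.log(e) for e in efforts_pm]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = math.exp(my - b * mx)
    return a, b

# Synthetic check: data generated with a=3.0, b=0.95 should be recovered.
sizes = [10, 20, 50, 100]
efforts = [3.0 * s ** 0.95 for s in sizes]
a, b = calibrate(sizes, efforts)
```

On noisy real data the fitted a and b shift with the sample, which is one way the higher-variance calibrations reported above can arise.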


Page 14: Summary of Achievements

● Automated 7 of Jorgensen's 12 Expert Judgment Best Practices in the 2CEE tool.

● Experimented with many methods for SW effort estimation and evaluated them non-parametrically:
  Bagging/boosting were mostly wasted CPU cycles when applied to COCOMO
  Feature selection was sometimes useful, and simpler selectors are as good as complex ones
● Represented more sources of estimation uncertainty in a tool than previously seen
● Methods were applied and evaluated in an industrial application at JPL
  Median MRE was greatly reduced
● Reported more unstable / different COCOMO calibrations than previously reported

Page 15: Related Work as a Coauthor

● O. Jalali, T. Menzies, D. Baker, and J. Hihn. Column pruning beats stratification in effort estimation. In Proceedings, PROMISE workshop, Workshop on Predictor Models in Software Engineering, 2007.

● T. Menzies, O. Elrawas, D. Baker, J. Hihn, and K. Lum. On the value of stochastic abduction (if you fix everything, you lose fixes for everything else). In International Workshop on Living with Uncertainty (an ASE’07 co-located event), 2007. Available from http://menzies.us/pdf/07fix.pdf.

● Tim Menzies, Omid Jalali, Jairus Hihn, Dan Baker, and Karen Lum. Software effort estimation and conclusion stability, 2007.

Page 16: Future Work

● Implement More of Jorgensen's Best Practices
● Include More Models in 2CEE
● Interactive Feature Exploration
● Tradeoff Analysis
● Approach a hybrid combination of expert and model either from scratch or from the expert's point of view (by adding model aspects to standard expert judgment)

● Research into the COCOMO calibration coefficient variance


Page 17: Concluding Key Points

● Human / Machine Integration
● “White Box”, not “Black Box”
● Embrace Uncertainty
● What not to do
● Use proper method evaluation
● Research => Industrial Application / Evaluation


Page 18: Questions?
