Domain-Driven Software Cost Estimation
University of Southern California
Center for Systems and Software Engineering
Wilson Rosa (Air Force Cost Analysis Agency)
Barry Boehm (USC)
Brad Clark (USC)
Thomas Tan (USC)
Ray Madachy (Naval Postgraduate School)
27th International Forum on COCOMO® and Systems/Software Cost Modeling
October 16, 2012
This material is based upon work supported, in whole or in part, by the U.S. Department of Defense through the Systems Engineering Research Center (SERC) under Contract H98230-08-D-0171. The SERC is a federally funded University Affiliated Research Center (UARC) managed by Stevens Institute of Technology and consisting of a collaborative network of over 20 universities. More information is available at www.SERCuarc.org
Data Preparation and Analysis

Research Objectives
• Make collected data useful to oversight and management entities
– Provide guidance on how to condition data to address challenges
– Segment data into different Application Domains and Operating Environments
– Analyze data for simple Cost Estimating Relationships (CER) and Schedule-Cost Estimating Relationships (SCER) within each domain
– Develop rules-of-thumb for missing data

Data records for one domain feed the domain CER/SER:
Cost (Effort) = a * Size^b
Schedule = a * Size^b * Staff^c
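These two model forms can be expressed directly in code. A minimal sketch (ours, not the authors'; the coefficient values are illustrative placeholders, not results fitted from the SRDR dataset):

# Minimal sketch of the two parametric model forms above.
# Coefficients a, b, c are placeholders, not fitted values.
def effort_pm(size_kesloc: float, a: float = 3.0, b: float = 1.1) -> float:
    """Cost (Effort) = a * Size^b, effort in person-months."""
    return a * size_kesloc ** b

def schedule_months(size_kesloc: float, staff_fte: float,
                    a: float = 4.0, b: float = 0.8, c: float = -0.5) -> float:
    """Schedule = a * Size^b * Staff^c, duration in months."""
    return a * size_kesloc ** b * staff_fte ** c

print(effort_pm(100))              # effort for a 100-KESLOC project
print(schedule_months(100, 20))    # schedule with 20 full-time staff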
Stakeholder Community
• Research is collaborative across heterogeneous stakeholder communities, which have helped us refine our data definition framework and taxonomy, and have provided data and funding
• Project has evolved into a Joint Government Software Study
[Slide graphic: logos of funding sources and data sources]
Topics
• Data Preparation Workflow
– Data Segmentation
• Analysis Workflow
• Software Productivity Benchmarks
• Cost Estimating Relationships
• Schedule Estimating Relationships
• Conclusion
• Future Work
Data Preparation
Current Dataset
• Multiple Data Formats (SRDR, SEER, COCOMO)
• SRDR (377 records) + Other (143 records) = 522 total records
[Sample SRDR form: DD Form 2630-3, "Software Resources Data Report: Final Developer Report", Page 1 of 2 – Report Context (system/element name, reporting event, development organization, certified CMM level, lead evaluator, precedents); Product and Development Description (primary application type, primary language, COTS/GOTS applications used, peak staff in FTE, percent of personnel by experience level); and Product Size Reporting with actuals at final delivery (number of software requirements, number of external interface requirements, requirements volatility rated 1=Very Low to 5=Very High, and amounts of new, modified, and unmodified reused code, with the SLOC counting convention – physical, noncomment, or logical – identified in the associated Data Dictionary).]
Multiple Sources
The Need for Data Preparation
• Issues found in dataset:
– Inadequate information on modified code (size provided)
– Inadequate information on size change or growth
– Size measured inconsistently
– Inadequate information on average staffing or peak staffing
– Inadequate information on personnel experience
– Inaccurate effort data in multi-build components
– Missing effort data
– Replicated duration (start and end dates) across components
– Inadequate information on schedule compression
– Missing schedule data
– No quality data
Data Preparation Workflow
From the slide's flowchart:
1. Start with SRDR submissions
2. Inspect each data point
3. Correct missing or questionable data (exclude from analysis if there is no resolution)
4. Determine data quality levels
5. Normalize data
6. Segment data
Segment Data by Operating Environments (OE)
Segment Data by Productivity Type (PT)
• Different productivities have been observed for different software application types.
• The SRDR dataset was segmented into 14 productivity types to increase the accuracy of estimating cost and schedule:
1. Sensor Control and Signal Processing (SCP)
2. Vehicle Control (VC)
3. Real Time Embedded (RTE)
4. Vehicle Payload (VP)
5. Mission Processing (MP)
6. System Software (SS)
7. Telecommunications (TEL)
8. Process Control (PC)
9. Scientific Systems (SCI)
10. Planning Systems (PLN)
11. Training (TRN)
12. Test Software (TST)
13. Software Tools (TUL)
14. Intelligence & Information Systems (IIS)
Example: Finding Productivity Type
Finding the Productivity Type (PT) using the Aircraft MIL-STD-881 WBS: the highest-level element represents the environment. In the MAV environment there is the Avionics subsystem, the Fire Control sub-subsystem, and the sensor, navigation, air data, display, bombing computer, and safety domains. Each domain has an associated productivity type.
Env    Subsys     Sub-subsystem   Domains                             PT
(L1)   (L2)       (L3)            (L4)
MAV    Avionics   Fire Control    Search, target, tracking sensors    SCP
                                  Self-contained navigation           RTE
                                  Self-contained air data systems     RTE
                                  Displays, scopes, or sights         RTE
                                  Bombing computer                    MP
                                  Safety devices                      RTE
                  Data Display    Multi-function display              RTE
                  and Controls    Control display units               RTE
                                  Display processors                  MP
                                  On-board mission planning           TRN
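One way to operationalize this WBS-to-PT mapping is a simple lookup table. A minimal sketch (the entries come from the table above; the lookup function itself is hypothetical, not the authors' tooling):

# Illustrative encoding of the MIL-STD-881 WBS domain -> PT mapping.
WBS_DOMAIN_TO_PT = {
    "Search, target, tracking sensors": "SCP",
    "Self-contained navigation": "RTE",
    "Self-contained air data systems": "RTE",
    "Displays, scopes, or sights": "RTE",
    "Bombing computer": "MP",
    "Safety devices": "RTE",
    "Multi-function display": "RTE",
    "Control display units": "RTE",
    "Display processors": "MP",
    "On-board mission planning": "TRN",
}

def productivity_type(domain: str) -> str:
    """Return the Productivity Type for a Level 4 WBS domain."""
    return WBS_DOMAIN_TO_PT[domain]

print(productivity_type("Bombing computer"))  # -> MP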
Operating Environment & Productivity Type
[Matrix: Productivity Types (SCP, VC, RTE, VP, MP, SS, TEL, PC, SCI, PLN, TRN, TST, TUL, IIS) as rows vs. Operating Environments (GSF, GSM, GVM, GVU, MVM, MVU, AVM, AVU, OVU, SVM, SVU) as columns, with X marking the combinations present in the dataset.]

When the dataset is segmented by Productivity Type and Operating Environment, the impacts accounted for by many COCOMO II model drivers are implicitly considered.
Data Analysis
Analysis Workflow
From the slide's flowchart: starting with the prepared, normalized, and segmented data, derive the CER model form; derive the final CER and its reference data subset; publish the CER results and the productivity benchmarks by Productivity Type and size group; derive the SCER and publish it.

CER: Cost Estimating Relationship
PR: Productivity Ratio
SER: Schedule Estimating Relationship
SCER: Schedule Compression / Expansion Relationship
Software Productivity Benchmarks
• Productivity-based CER
• Software productivity refers to the ability of an organization to generate outputs using the resources it currently has as inputs. Inputs typically include facilities, people, experience, processes, equipment, and tools. Outputs include software applications and the documentation used to describe them.
• The metric used to express software productivity is equivalent source lines of code (ESLOC) per person-month (PM) of effort. While many other measures exist, ESLOC/PM is used because most of the data collected by the Department of Defense (DoD) on past projects is captured with these two measures. Although controversy exists over whether ESLOC/PM is a good measure, consistent use of this metric (see Metric Definitions) provides for meaningful comparisons of productivity.
Software Productivity Benchmarks

Benchmarks by PT, across all operating environments**

PT    MIN (ESLOC/PM)  MEAN (ESLOC/PM)  MAX (ESLOC/PM)  Obs.  Std. Dev.  CV    KESLOC MIN  KESLOC MAX
SCP   10              50               80              38    19         39%   1           162
VP    28              82               202             16    43         52%   5           120
RTE   33              136              443             52    73         54%   1           167
MP    34              189              717             47    110        58%   1           207
SCI   9               221              431             39    119        54%   1           171
SYS   61              225              421             60    78         35%   2           215
IIS   169             442              1039            36    192        43%   1           180

** The following operating environments were included in the analysis:
• Ground Surface Vehicles
• Sea Systems
• Aircraft
• Missile / Ordnance (M/O)
• Spacecraft

Preliminary Results – More Records to be added
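The benchmark columns above can be reproduced from raw per-project size and effort records. A minimal sketch with invented data (standard sample statistics; CV taken as standard deviation divided by mean):

import numpy as np

# Hypothetical (ESLOC, person-month) records for one productivity type.
esloc = np.array([12_000, 45_000, 80_000, 150_000])
pm    = np.array([   240,    900,  1_600,   2_800])

prod = esloc / pm                  # productivity in ESLOC/PM
std  = prod.std(ddof=1)            # sample standard deviation
print(f"MIN = {prod.min():.0f}  MEAN = {prod.mean():.0f}  MAX = {prod.max():.0f}")
print(f"Obs. = {prod.size}  Std. Dev. = {std:.0f}  CV = {std / prod.mean():.0%}")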
Software Productivity Benchmarks

Benchmarks by PT, Ground System Manned Only

PT    OE    MIN (ESLOC/PM)  MEAN (ESLOC/PM)  MAX (ESLOC/PM)  Obs.  Std. Dev.  CV    KESLOC MIN  KESLOC MAX
SCP   GSM   27              56               80              13    17         30%   1           76
RTE   GSM   51              129              239             22    46         36%   9           89
MP    GSM   87              162              243             6     52         32%   15          91
SYS   GSM   115             240              421             28    64         26%   5           215
SCI   GSM   9               243              410             24    108        44%   5           171
IIS   GSM   236             376              581             23    85         23%   15          180

CV: Coefficient of Variation (Std. Dev. / Mean)
ESLOC: Equivalent SLOC
KESLOC: Equivalent SLOC in Thousands
MAD: Mean Absolute Deviation
MAX: Maximum
MIN: Minimum
PM: Effort in Person-Months
PT: Productivity Type
OE: Operating Environment

Preliminary Results – More Records to be added
Cost Estimating Relationships
Preliminary Results – More Records to be added
CER Model Forms
• Effort = a * Size, where a is the production cost (cost per unit)
• Effort = a * Size + b
• Effort = a * Size^b + c, where b is a scaling factor
• Effort = a * ln(Size) + b
• Effort = a * Size^b * Duration^c
• Effort = a * Size^b * c1 * … * cn, where c1 … cn are percentage adjustment factors

Log-log transform:
ln(Effort) = b0 + b1 * ln(Size) + b2 * ln(c1) + b3 * ln(c2) + …

Anti-log transform:
Effort = e^b0 * Size^b1 * c1^b2 * c2^b3 * …
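The log-log/anti-log pair above is what makes the multiplicative forms fittable by ordinary least squares. A minimal sketch with invented data (single size predictor; numpy's polyfit performs the regression):

import numpy as np

# Hypothetical (KESLOC, person-month) observations for one domain.
size_kesloc = np.array([ 12.0,  45.0,  80.0,  150.0])
effort_pm   = np.array([ 90.0, 420.0, 810.0, 1700.0])

# Fit ln(Effort) = b0 + b1 * ln(Size) by least squares.
b1, b0 = np.polyfit(np.log(size_kesloc), np.log(effort_pm), 1)

# Anti-log transform back to the multiplicative CER: Effort = e^b0 * Size^b1
a = np.exp(b0)
print(f"PM = {a:.3f} * KESLOC^{b1:.3f}")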
Software CERs by Productivity Type (PT)

CERs by PT, across all operating environments**

PT    Equation Form                  Obs.  R² (adj)  MAD   PRED(30)  KESLOC MIN  KESLOC MAX
IIS   PM = 1.266 * KESLOC^1.179      37    90%       35%   65        1           180
MP    PM = 3.477 * KESLOC^1.172      48    88%       49%   58        1           207
RTE   PM = 34.32 + KESLOC^1.515      52    68%       61%   46        1           167
SCI   PM = 21.09 + KESLOC^1.356      39    61%       65%   18        1           171
SCP   PM = 74.37 + KESLOC^1.714      36    67%       69%   31        1           162
SYS   PM = 16.01 + KESLOC^1.369      60    85%       37%   53        2           215
VP    PM = 3.153 * KESLOC^1.382      16    86%       27%   50        5           120

** The following operating environments were included in the analysis:
• Ground Surface Vehicles
• Sea Systems
• Aircraft
• Missile / Ordnance (M/O)
• Spacecraft
Preliminary Results – More Records to be added
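The MAD and PRED(30) columns summarize estimation accuracy. A minimal sketch using one common definition (mean absolute relative deviation against actuals, and the share of estimates within 30% of actuals; the data values are invented):

import numpy as np

def mad_and_pred30(actual_pm, predicted_pm):
    """MAD as mean absolute relative deviation; PRED(30) as the share
    of estimates falling within 30% of actuals."""
    rel_err = np.abs(predicted_pm - actual_pm) / actual_pm
    return rel_err.mean(), (rel_err <= 0.30).mean()

actual    = np.array([100.0, 250.0, 400.0, 900.0])   # hypothetical actuals
predicted = np.array([118.0, 210.0, 520.0, 860.0])   # hypothetical CER output
mad, pred30 = mad_and_pred30(actual, predicted)
print(f"MAD = {mad:.0%}, PRED(30) = {pred30:.0%}")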
Software CERs for Aerial Vehicle Manned (AVM)
CERs by Productivity Type, AVM Only

PT    OE    Equation Form                  Obs.  R² (adj)  MAD   PRED(30)  KESLOC MIN  KESLOC MAX
MP    MAV   PM = 3.098 * KESLOC^1.236      31    88%       50%   59        1           207
RTE   MAV   PM = 5.611 * KESLOC^1.126      9     89%       50%   33        1           167
SCP   MAV   PM = 115.8 + KESLOC^1.614      8     88%       27%   62        6           162

CERs: Cost Estimating Relationships
ESLOC: Equivalent SLOC
KESLOC: Equivalent SLOC in Thousands
MAD: Mean Absolute Deviation
MAX: Maximum
MIN: Minimum
PM: Effort in Person-Months
PRED: Prediction (Level)
PT: Productivity Type
OE: Operating Environment

Preliminary Results – More Records to be added
Software CERs for Ground Systems Manned (GSM)

CERs by Productivity Type, GSM Only

PT    OE    Equation Form                          Obs.  R² (adj)  MAD   PRED(30)  KESLOC MIN  KESLOC MAX
IIS   GSM   PM = 30.83 + 1.381 * KESLOC^1.103      23    –         16%   91        15          180
MP    GSM   PM = 3.201 * KESLOC^1.188              6     86%       24%   83        15          91
RTE   GSM   PM = 84.42 + KESLOC^1.451              22    –         24%   73        9           89
SCI   GSM   PM = 34.26 + KESLOC^1.286              24    –         37%   56        5           171
SCP   GSM   PM = 135.5 + KESLOC^1.597              13    –         39%   31        1           76
SYS   GSM   PM = 20.86 + 2.347 * KESLOC^1.115      28    –         19%   82        5           215

–: not reported
CERs: Cost Estimating Relationships
ESLOC: Equivalent SLOC
KESLOC: Equivalent SLOC in Thousands
MAD: Mean Absolute Deviation
MAX: Maximum
MIN: Minimum
PM: Effort in Person-Months
PT: Productivity Type
OE: Operating Environment
Preliminary Results – More Records to be added
Software CERs for Space Vehicle Unmanned (SVU)

CERs by Productivity Type (PT), SVU Only

PT    OE    Equation Form                  Obs.  R² (adj)  MAD   PRED(30)  KESLOC MIN  KESLOC MAX
VP    SVU   PM = 3.153 * KESLOC^1.382      16    86%       27%   50        5           120

CERs: Cost Estimating Relationships
ESLOC: Equivalent SLOC
KESLOC: Equivalent SLOC in Thousands
MAD: Mean Absolute Deviation
MAX: Maximum
MIN: Minimum
PM: Effort in Person-Months
PRED: Prediction (Level)
PT: Productivity Type
OE: Operating Environment
Preliminary Results – More Records to be added
Schedule Estimating Relationships
Preliminary Results – More Records to be added
Schedule Estimation Relationships (SERs)
• SERs by Productivity Type (PT), across operating environments**

PT    Equation Form                                    Obs.  R² (adj)  MAD  PRED(30)  KESLOC MIN  KESLOC MAX
IIS   TDEV = 3.176 * KESLOC^0.7209 / FTE^0.4476        35    65        25   68        1           180
MP    TDEV = 3.945 * KESLOC^0.968 / FTE^0.7505         43    77        39   52        1           207
RTE   TDEV = 11.69 * KESLOC^0.7982 / FTE^0.8256        49    70        36   55        1           167
SYS   TDEV = 5.781 * KESLOC^0.8272 / FTE^0.7682        56    71        27   62        2           215
SCP   TDEV = 34.76 * KESLOC^0.5309 / FTE^0.5799        35    62        26   64        1           165

** The following operating environments were included in the analysis:
• Ground Surface Vehicles
• Sea Systems
• Aircraft
• Missile / Ordnance (M/O)
• Spacecraft
Preliminary Results – More Records to be added
Size – People – Schedule Tradeoff
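The tradeoff chart itself is not reproduced here, but its shape follows from the SER form: for a fixed size, schedule shrinks as staffing grows. A minimal sketch using the published IIS coefficients from the table above (the staffing sweep is our illustration, not from the slides):

# IIS SER from the SER table: TDEV = 3.176 * KESLOC^0.7209 / FTE^0.4476
def tdev_months(kesloc: float, fte: float) -> float:
    """Schedule in months for a given size (KESLOC) and staffing (FTE)."""
    return 3.176 * kesloc ** 0.7209 / fte ** 0.4476

for fte in (10, 20, 40):
    print(f"100 KESLOC, {fte:>2} FTE -> TDEV = {tdev_months(100.0, fte):.1f} months")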
COCOMO 81 vs. New Schedule Equations
• Model Comparisons**

PT    Obs.  New Schedule Equation                            COCOMO 81 Equation
IIS   35    TDEV = 3.176 * KESLOC^0.7209 * FTE^-0.4476       TDEV = 2.5 * PM^0.38
MP    43    TDEV = 3.945 * KESLOC^0.968 * FTE^-0.7505        TDEV = 2.5 * PM^0.35
RTE   49    TDEV = 11.69 * KESLOC^0.7982 * FTE^-0.8256       TDEV = 2.5 * PM^0.32
SYS   56    TDEV = 5.781 * KESLOC^0.8272 * FTE^-0.7682       TDEV = 2.5 * PM^0.35
SCP   35    TDEV = 34.76 * KESLOC^0.5309 * FTE^-0.5799       TDEV = 2.5 * PM^0.32

** The following operating environments were included in the analysis:
• Ground Surface Vehicles
• Sea Systems
• Aircraft
• Missile / Ordnance (M/O)
• Spacecraft
Preliminary Results – More Records to be added
COCOMO 81 vs. New Schedule Equations
• Model Comparisons using PRED(30)**

PT    Obs.  New Schedule Equations PRED(30)  COCOMO 81 Equations PRED(30)
IIS   35    68                               28
MP    43    52                               23
RTE   49    55                               16
SYS   56    62                               5
SCP   35    64                               8
Preliminary Results – More Records to be added
** The following operating environments were included in the analysis:
• Ground Surface Vehicles
• Sea Systems
• Aircraft
• Missile / Ordnance (M/O)
• Spacecraft
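The two model families can also be compared directly on a single project. A minimal sketch with invented project values (100 KESLOC, 20 FTE, and an assumed 500 PM of estimated effort) and the published IIS coefficients:

# New IIS SER vs. the COCOMO 81 schedule equation for one project.
new_tdev    = 3.176 * 100 ** 0.7209 / 20 ** 0.4476   # months
cocomo_tdev = 2.5 * 500 ** 0.38                      # months
print(f"New SER:   {new_tdev:.1f} months")
print(f"COCOMO 81: {cocomo_tdev:.1f} months")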
Conclusions
Conclusion
• Developing CERs and benchmarks by grouping appears to account for some of the variability in estimating relationships.
• Grouping software applications by Operating Environment and Productivity Type appears promising, but needs refinement.
• Analyses shown in this presentation are preliminary; more data is available for analysis, but it requires preparation first.
Future Work
• Productivity benchmarks need to be segregated by size groups
• More data is available to fill in missing cells in the OE-PT table
• Workshop recommendations will be implemented
– New data grouping strategy
• Data repository that provides drill-down to source data
– Presents the data to the analyst
– If there is a question, it is possible to navigate to the source document, e.g., data collection form, project notes, EVM data, Gantt charts, etc.
• Final results will be published online at http://csse.usc.edu/afcaawiki