Post on 03-Jan-2016
Lecture topics
Software process
Software project metrics
Software project management
Does software have a life?
Software lifecycle is the sequence of stages the software goes through during its “lifetime”
Software is born
• Requirements, design, coding, testing
Software lives
• Maintenance
Software dies
• Software retirement
Software process governs software lifecycle
What is software process
A framework for a set of key areas necessary for successful production of software
General
• Applicable to most software projects
Outlines major tasks
• Requirements, specifications, …
Defines activities for each task
• Quality assurance
• Measurement of progress
• Document preparation
Why do we need software process?
A look back at mechanical engineering
• In the 1890s, the mechanical engineer Frederick W. Taylor invented “scientific management”
The idea was that the way in which things are done is the key to better results
• Improvements like using harder steel for the cutting tools
The labor component is important
• Only good operators can take advantage of the better cutting tools
Extensive opposition movement
• Many engineers thought that Taylor’s method wasn’t really engineering, but rather some non-technical hybrid
Why do we need software process (cont.)
What about development of software?
• In many cases, it’s a pretty chaotic process, similar to mechanical engineering in the 1800s
• Opinion of many managers: software engineering is a bag of tricks to keep programmers in line
So, the process of software development should be studied, formalized, and controlled by engineering techniques
“Software processes are software too” -- Leon J. Osterweil
There is a split between technical and management software people on the process issue
“Process vs. product” controversy: what’s more important, organizing people or organizing products
Clash of issues: technical vs. managerial (or nerds vs. suits)
• Very different concerns
The problem of running a large multi-person project is different from doing the work itself
This course didn’t really touch the managerial side
Engineers need managers!
• And vice versa, of course
Very few people are good at both technical and managerial jobs
Capability maturity model (CMM)
How do we measure the quality of a software process?
Need to do it to compare between organizations or to know how to improve software practices in a given organization
The Software Engineering Institute introduced the CMM model
Assigns a software development organization a maturity level
• 1 to 5, low to high maturity
Ain’t no simple formula
Careful evaluation of the organization is needed
• Mostly about how its software projects are conducted (established practices)
Introducing predictability into software development is a primary goal of the higher CMM levels
A high quality software process is not a guarantee of a high quality software product
But the likelihood of improving software quality is high
CMM levels
1. Initial
• Ad hoc software development
2. Repeatable
• Cost, schedule, functionality tracking
3. Defined
• The process is standardized
4. Managed
• Measurements of progress and quality are used
5. Optimizing
• The process is being constantly improved
Initial level
Might be better to call it “level 0”
An organization may use many of the ideas from CMM, but not in the order or manner described in the formal levels
Thus, it will be placed on this initial level
Repeatable level
Refers more to the ability to track cost, schedule, and functionality than to the routine exercise of this ability
The only technical reference in the formal definition of this level is configuration management
The requirements might seem modest, but this level is quite hard to achieve
Defined level
The management practices of level 2 are formally defined and recorded
Followed throughout the organization even when things go wrong
There must be a Software Engineering Process group within the organization that codifies practices
Managed level
The central concept is measurement of the development process and the software product
The product here includes requirements, design, code, documentation, test plans etc.
Optimizing level
Introduces feedback into the process from the measurements of level 4
E.g., if a project is behind schedule in its design phase,
A manager at level 4 will have measurements to show this and then will try to correct matters (e.g. by adjusting schedule)
A manager at level 5 will use data from the delinquent project to try to discover the root cause of the problem and change the development process itself
• So that the problem does not occur in future projects
Critique of CMM levels
Most descriptions of CMM levels are full of hype
Descriptions of different levels are not specific
The basis of CMM is mostly managerial (not technical)
The step from level 1 to level 2 is based on management alone
In general, effort should not be spent on process at the expense of effort on product
Unless there’s a clear indication that the product will benefit from that
Using CMM to evaluate a potential employer
Knowing the CMM level of a potential employer is valuable data for an engineer
E.g. level 4 means that there is considerably more regimentation than at (say) level 2
Many employees at a level 4 organization will have rigid job descriptions
• Likely little scope for advancement
• Exciting technical risks are not taken
• But managerial personnel have more opportunities for advancement
Process management is not for every organization
First off, there are two extremes
• For a project involving a handful of people, process is often a waste of time
• A project involving hundreds of people will not succeed without process
What about the non-extreme cases?
• E.g., suppose that development time for a project is about 2 years, involving about 200,000 LOC
• Technical model - hire about eight senior engineers who work essentially without any management hierarchy
  • Productivity about 1200 LOC/person-month
• Managed model - hire 2 line managers and 16 junior engineers
  • Productivity about 500 LOC/person-month
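The arithmetic behind the two staffing models can be checked with a short sketch. It assumes a 24-month project ("about 2 years") and that the line managers write no code. The `StaffingModel` class and its field names are made up for this illustration; only the headcounts and productivity figures come from the slide.

```python
from dataclasses import dataclass


@dataclass
class StaffingModel:
    name: str
    coders: int                # people who actually write code
    managers: int              # assumption: managers produce no code
    loc_per_person_month: int  # productivity figure from the slide

    def total_loc(self, months: int) -> int:
        """Lines of code the coders produce over the whole project."""
        return self.coders * self.loc_per_person_month * months

    def headcount(self) -> int:
        return self.coders + self.managers


MONTHS = 24  # assumption: "about 2 years"

technical = StaffingModel("technical", coders=8, managers=0, loc_per_person_month=1200)
managed = StaffingModel("managed", coders=16, managers=2, loc_per_person_month=500)

for model in (technical, managed):
    print(model.name, model.headcount(), "people,", model.total_loc(MONTHS), "LOC")
```

Under these assumptions both models land near the 200,000 LOC target (230,400 and 192,000 LOC), but the managed model needs 18 people instead of 8.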
Are all software processes born equal?
There are many different ways to organize software production
Different process models
The choice of a process model is based on
• The nature of the project
• The methods and tools that the organization wants to use
• The controls over software production
• The product
Waterfall process model
Does not represent the practice well
Too rigid
Parallel production is limited
All requirements must be specified fully in the beginning
[Figure: requirements, high-level (HL) design, low-level (LL) design, coding, testing]
Prototyping process model (evolutionary development model in Sommerville)
This model is often practical!
Customers may get wrong impression about the final product from the prototype
Customers may ask for deployment before the product is ready
Often prototype flaws are not fixed in the final product
[Figure: cycle of requirements, prototype development, and prototype test-drive]
Rapid application development (RAD) process model
A number of software teams, each
• Developing a well-defined part of the product
• Using the waterfall model
Benefits:
• Very rapid development
• Component-based (reusable) products
Drawbacks:
• Requirements have to be well-understood
• Product decomposition is not always possible
• Sensitive to lack of commitment
Incremental process model
Useful when deadline cannot be achieved directly
May require significant human resources
• If there is a large number of teams
[Figure: successive increments staggered along a time axis, each passing through requirements, HL design, LL design, coding, and testing]
Spiral process model
Natural for large software systems
Customers are “stuck” with the development organization
[Figure: spiral beginning at "start" and passing through the requirements, design, code, and test sectors]
Concurrent development process model
May reduce development time by exploiting concurrency
[Figure: states of an activity: none, awaiting changes, under development, under revision, under review, baselined, done]
Formal development process model
Similar to the waterfall model in its structure
Formal processes are used at each stage
Formal specifications at the requirements stage, including formal verification
Formal process of transforming requirements into design and implementation
Standard testing of the code
So, which process model is the best?
Depends on many parameters (nature of the product, availability of resources, organization, etc.)
The spiral model should probably be the choice in most cases
Driven by risk - in the first turn of the spiral, the developers decide if building of the system is feasible
Software metrics
What are they?
• Formulas for computing quantitative characteristics of software development, deployment, and maintenance
Why do we need them?
• Consider the following scenario (Hamlet, Maybee):
Someone in the organization makes what is called a “business case” for a new product by estimating the revenue that will be lost day by day if it is not available. Then they guess how long the business can stand the loss and come up with a schedule for developing the product - a schedule that bears no relation to what is actually required to develop it. Engineers are then told: meet this schedule.
Software developers may think that the schedule is unrealistic, but how can they prove it?
E.g. through statistical measurements available for projects of comparable complexity
Primary way to measure software
The size of the project
• Lines of code (LOC)
• Function points
Historical data provides a link between LOC for a project and the resources needed:
People
• Number of personnel and length of the period they are needed
Time
• The whole process of development
• Individual phases of development
Capital goods
• Computers, desks, work rooms, pizzas, cups of coffee, …
But how would we know the size of a system before it is built?
Historical data
• We did something like that in the past…
Estimation models
• Not many people have personal experience with software projects of different sizes
• Models summarize experience in equations that relate project size, schedule, and effort
• Sophisticated: a 200,000 LOC project takes more than twice the resources of a 100,000 LOC project
Can we do better than using LOC?
Function points (FP) metric proposed
Based on counting:
• External input and output points
• User interaction points
• External interfaces
• Files used by the system
Each characteristic is evaluated based on its complexity (importance for the system) and assigned a weight
A word of caution: developed a long time ago
• Before OO programming
• Before database penetration
• Biased toward data processing systems
Function points metric
Unadjusted function-point count (UFC) formula:
UFC = Σ (number of elements of given type) × (weight), summed over all element types
E.g., let
• The number of inputs and outputs be 3, with assigned weight 10
• The number of user interactions be 2, with assigned weight 5
• The number of external interfaces be 5, with assigned weight 3
• The number of files used by the system be 2, with assigned weight 2
Then UFC for this system is 3*10 + 2*5 + 5*3 + 2*2 = 59
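The worked example above can be reproduced with a minimal sketch. The function name and the dictionary keys are made up for this illustration; the counts and weights are the ones from the slide.

```python
def unadjusted_function_count(elements):
    """Sum count * weight over all element types.

    `elements` maps an element-type name to a (count, weight) pair.
    """
    return sum(count * weight for count, weight in elements.values())


# Counts and weights from the slide's example.
example = {
    "inputs_and_outputs": (3, 10),
    "user_interactions": (2, 5),
    "external_interfaces": (5, 3),
    "files": (2, 2),
}

print(unadjusted_function_count(example))  # 59
```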
COCOMO estimation model
COnstructive COst MOdel
Developed by B. Boehm in the 1980s
Recognizes 3 classes of projects:
• Organic mode
  • Small, simple projects; democratically configured teams
• Semi-detached mode
  • Intermediate projects, a mix of rigid and non-rigid requirements
• Embedded mode
  • Large projects, tight constraints
Defines 3 different levels
• Basic, intermediate, advanced
Levels of the COCOMO model
Basic
• Needs only the size in LOC
Intermediate
• Needs LOC and a set of cost drivers
Advanced
• Needs LOC and cost drivers
• Applies cost drivers to each activity of the software process
Example: output from COCOMO for a 100,000 LOC project (Hamlet, Maybee)
Distributions:
Phase                        Effort (man-months)   Schedule (months)   Personnel on board
Requirements/specification   (17%) 88.6            (27%) 6.0           14.7
Design/code                  (55%) 286.7           (44%) 9.8           29.2
Integration and test         (28%) 146.0           (29%) 6.5           22.5
Model mode: semidetached
Model size: large (100,000 lines of code)
Total effort: 521.3 man-months, 152 man-hours/man-month
Total schedule: 22.3 months
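The totals in the example output can be reproduced with the Basic COCOMO equations. The coefficients below are the ones Boehm published for Basic COCOMO; they are not given in these slides, so treat the sketch as an assumption about how the output was produced. The function and variable names are made up for this illustration.

```python
# Basic COCOMO (Boehm, 1981), not stated in the slides:
#   effort   = a * KLOC**b   (man-months)
#   schedule = c * effort**d (months)
COEFFICIENTS = {
    "organic":      (2.4, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded":     (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(loc, mode):
    """Return (effort in man-months, schedule in months) for a project size in LOC."""
    a, b, c, d = COEFFICIENTS[mode]
    kloc = loc / 1000
    effort = a * kloc ** b
    schedule = c * effort ** d
    return effort, schedule


effort, schedule = basic_cocomo(100_000, "semidetached")
print(f"{effort:.1f} man-months, {schedule:.1f} months")

# Doubling the size more than doubles the effort, because the exponent b > 1.
effort_200k, _ = basic_cocomo(200_000, "semidetached")
print(f"effort ratio 200K/100K LOC: {effort_200k / effort:.2f}")
```

For the semidetached mode this gives about 521.3 man-months and 22.3 months, matching the totals above. The per-phase percentages in the table come from separate distribution tables and are not computed here.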
Rule-of-thumb facts from the COCOMO model
Projects in the range of 100,000 LOC take about 2 years
Required effort is
• 20% for requirements/specification
• 50% for design/coding
• 30% for the rest
Staffing and distribution depend on the type of the project, but generally are
• About 500 man-months
• Distributed roughly 30-40-30% among the phases
Software quality metrics
Correctness
• The degree to which software performs the intended function
• Metric: number of defects per KLOC
Maintainability
• The ease with which software can be corrected, adapted, or enhanced
• Metric: mean time to change
Integrity
• The degree to which software is protected against attacks
• Metric: the success ratio of (known) attacks
Usability
• The degree of user-friendliness
• Metric: the time period required to become efficient in the use of the system
Defect-related quality metrics
Defect removal efficiency (DRE)
DRE = E/(E+D)
• E is the number of errors (found before delivery)
• D is the number of defects (found after delivery)
Can be used to estimate defect removal efficiency of process steps:
DRE_i = E_i/(E_i + E_{i+1})
• E_i is the number of errors discovered in step i
• E_{i+1} is the number of errors discovered in step i+1
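Both DRE formulas can be sketched in a few lines. The function names and the input numbers are hypothetical and chosen only for illustration; they are not measurements from any project mentioned in these slides.

```python
def dre(errors, defects):
    """DRE = E / (E + D)."""
    total = errors + defects
    if total == 0:
        raise ValueError("DRE is undefined when no errors or defects were recorded")
    return errors / total


def step_dre(errors_per_step):
    """DRE_i = E_i / (E_i + E_{i+1}) for each consecutive pair of process steps."""
    return [dre(e_i, e_next)
            for e_i, e_next in zip(errors_per_step, errors_per_step[1:])]


# Hypothetical: 90 errors and 10 defects.
print(dre(90, 10))

# Hypothetical: errors discovered in three consecutive process steps.
print(step_dre([40, 10, 10]))
```

A DRE close to 1 means that few problems escaped; a step whose DRE_i is low passed a relatively large share of its errors on to the next step.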
Using quality metrics in management
Performance of individuals and teams can be compared
• Team A found 112 errors in their software component; team B found 240 errors in their component
  • Which team is better?
• After the deployment of the system, 5 defects were traced to software produced by team A and 2 defects were traced to software produced by team B
  • Which team is better?
• The DRE metric for teams A and B is .9 and .8
  • Which team is better?
Using quality metrics for management is not easy and can be misleading
So, if managed software process is so great, how come Open Source is so successful?
Background
• Enthusiasts write software that is often quite good
• Often done in collaboration by large groups of people
  • Informally!
• Open Source Foundation and Free Software Foundation are organizations that support the notion of open source software
• High profile open source projects
  • Linux
  • Apache
A case study of open source software development: the Apache server
A. Mockus, R. Fielding, and J. Herbsleb
• Appeared in ICSE’2000
An attempt to investigate the claim that open source software development can successfully compete with traditional commercial development methods
Characteristics of open source software (OSS) development
Built by potentially large numbers (hundreds and even thousands) of volunteers
Extremely geographically distributed
• Participants rarely or never meet face to face
Work is not assigned
• People undertake the work they choose to undertake
There is no explicit system-level design, or even detailed design
There is no project plan, schedule, or list of deliverables
The Apache Web server
Began in February 1995
• An effort to coordinate existing fixes to the httpd program
• New architecture design by R. Thau in July 1995
• Apache httpd 1.0 released in January 1996
According to the Netcraft survey, the most widely deployed server
• Over 50% of the 7 million sites queried
Developer email list is used for communication among developers
Problem reporting database is used for communication between users and developers
CVS archive is used for version control
The Apache development process
The Apache Group (AG) is an informal organization of developers
• Only volunteers, with day jobs
• Each member can vote on the inclusion of any code change and has write access to CVS
Members
• People who have contributed for an extended period of time (usually >6 months)
• 25 as of April 2000
Core developers (about 15 at any given time)
• Only a subset of AG active (4-6 usually)
The Apache development process (cont.)
Each developer iterates through a common sequence of actions
• Discovering that a problem exists
• Determining whether a volunteer will work on it
• Identifying a solution
• Developing and testing the code within their local copy of the source
• Presenting the code changes to the AG for review
• Committing the code and documentation to the repository
The size of the Apache development community
Almost 400 different people contributed code
• 182 people contributed to 695 problem report (PR) related changes
• 249 people contributed to 6092 non-PR changes
3060 different people submitted 3975 problem reports
• 458 individuals submitted 591 reports that caused a change to the code or documentation
Distribution of the work within the development community
The top 15 developers contributed more than 88% of added lines and 91% of deleted lines of code
• A single person did about 20% of these
66% of the PR related changes were produced by the top 15 contributors
Code ownership
Hypothesis: a single person would write the vast majority of the code for a module
This didn’t happen!
• Of 42 .c files with >30 changes, 40 had at least two (and 20 had at least 4) developers making more than 10% of the changes
What is the defect density of Apache code?
It was higher than in the four other large (undisclosed) systems it was compared to
• The role of bloaty code is unclear, though
Apache did better than others in the number of defects in pre-test state
There is no provision for systematic system test in OSS
• Code inspection is better under OSS?
Hypotheses based on this study
OSS projects will have a core of developers who control the code base
A group larger by an order of magnitude than the core will repair defects and an even larger group will report problems
Projects with a small number of developers besides the core will fail because of a large number of defects
In successful OSS projects, developers are also users
OSS developments exhibit very rapid responses to customer problems
Defect density in OSS project releases will generally be lower than in commercial code that has only been feature tested