Two Case Studies of Open Source Software Development:
Apache and Mozilla
By Helen Gower, Drew Spencer, Mila Reid, Nigel Macarthur & Mohamed Hossain
Introduction
Development Process - Traditional
What is Open Source Software
Apache and Mozilla
Apache Process
Hypotheses
Mozilla Process
Hypotheses revisited
Conclusion
Research
Any questions
Development Process - Traditional
Basically the waterfall cycle
Predominantly used in commercial industry
Advantages: well established, structured procedures
Disadvantages: management related constraints, cannot go back a phase
What is Open Source Software (OSS)?
A new way to develop software
Differences from traditional development
– Source code is freely available
– Communicate exclusively by email/bulletin boards
– Geographically distributed development
Advantages: developer freedom, tacit knowledge
Disadvantages: lacks traditional methods to coordinate development
Open Source Software - Results
OSS development has proven to be equivalent/superior to traditional methods
– Defects found and fixed quicker
– Code written with more care/creativity
An example of successful OSS software is the Linux operating system
What is Apache?
Apache is a free, open source HTTP web server software system
Works well on open source operating systems such as UNIX and Linux
Also available for Windows and other operating systems
What is Apache? cont…
Supports the Perl and PHP languages
Provides services such as server-side scripting
Industry leaders such as DEC, UUNet and Yahoo use Apache
70% of the world's web servers run on Apache (http://news.netcraft.com/web_server_survey.html)
Why call it ‘Apache’?
In early 1995, developers of some high-visibility web sites decided to pool their patches and enhancements to the NCSA httpd 1.3 server to create…
‘A patchy server’
The Apache Group (AG) started 1995
What is Mozilla?
The Mozilla Project is an open source software project
Dedicated to development of the Mozilla web browser and application framework
Available for many operating systems
– Firefox, a cross-platform browser
– Camino, a web browser for Mac OS X
What is Mozilla? cont…
Includes mail and news reader (Mozilla Thunderbird), HTML editor and an IRC client
Supports many technologies including development tools
– CVS, Bugzilla, Bonsai, Tinderbox
It also builds toolkit type applications such as Komodo from ActiveState
What is Mozilla? cont…
Mozilla uses a development process with commercial roots
Mozilla.org exists as a group within Netscape
– Central point of contact responsible for coordinating development
The Development Process
Problems posed by OSS-style development
– Decentralised Workspaces
– Lack of communication & leadership
– Inconsistent dedication of time
Solutions
– Concurrent Version Control Archive (CVS)
– Mailing List
– Quorum Voting System
– Meritocracy
Identifying work to be done
Modification Requests (MRs) – mailing list
BUGDB
USENET groups
“Showstoppers” always addressed
Others discussed on mailing list
Assigning & Performing Work
Core developers have own areas
New developers take on disowned areas or new features
Great respect for core developers’ expertise and experience
No specific rights to code – meritocracy gives implicit ownership
The Development Community
400 individual contributors of code
– 182 people contributed 695 fixes
– 249 people contributed 6,092 new code submissions
3,060 people submitted 3,975 bug reports
– 458 people submitted 591 that caused a change in the code
Distribution of Work
Top 15 developers contributed:
– 83% of MRs for new features
– 66% of MRs for defects/bugs
The wider development community is significant in defect repair
Few outside the core group submit with any regularity
Developers contributing > 1 MR   Before   After   Both
New Features                     49       49      25
Fixes                            120      140     25
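The "top 15" shares quoted on this slide are concentration measures: the fraction of all MRs that come from the N most active developers. A minimal sketch of that calculation follows. The function name `top_n_share` and the per-developer counts are invented for illustration and are not Apache data.

```python
def top_n_share(mrs_per_developer: dict[str, int], n: int) -> float:
    """Fraction of all MRs contributed by the n most active developers."""
    counts = sorted(mrs_per_developer.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        raise ValueError("no MRs recorded")
    return sum(counts[:n]) / total


# Invented MR counts per developer, for illustration only (not Apache data).
example_counts = {"dev1": 50, "dev2": 30, "dev3": 10, "dev4": 5, "dev5": 5}

if __name__ == "__main__":
    print(f"Top 2 share: {top_n_share(example_counts, 2):.0%}")
```

Running the same function separately over new-feature MRs and defect MRs would give the two percentages reported for the top 15.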
Commercial Project Comparison
Project    MR      KLOCA   Dev   MR/top dev/yr   LOCA/top dev/yr
A          3,300   5,000   101   30              38,600
B          2,500   1,000   91    30              11,700
C          1,100   81      17    90              6,100
D          200     21      8     20              5,400
E          700     90      16    60              10,000
A-E Avg.   -       -       47    46              14,360
Apache     6,000   220     388   110             4,300
Commercial Project Comparison
“Top” Apache developers handle around twice as many MRs per year as top developers in the commercial projects
Apache's rate of development is at least 2/3 that of C & D in terms of LOCA
B & E are about twice as productive
A is 10 times more productive
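The comparisons on this slide can be checked directly against the table figures. The sketch below copies the per-top-developer columns from the table and divides each commercial project's value by Apache's. The names `LOCA_PER_DEV_YEAR`, `MR_PER_DEV_YEAR` and `ratio_to_apache` are introduced here for illustration only.

```python
# Figures copied from the comparison table: lines of code added (LOCA)
# and MRs, each per top developer per year.
LOCA_PER_DEV_YEAR = {
    "A": 38_600, "B": 11_700, "C": 6_100, "D": 5_400, "E": 10_000,
    "Apache": 4_300,
}
MR_PER_DEV_YEAR = {"A": 30, "B": 30, "C": 90, "D": 20, "E": 60, "Apache": 110}


def ratio_to_apache(table: dict[str, float]) -> dict[str, float]:
    """Return each commercial project's value divided by Apache's value."""
    apache = table["Apache"]
    return {name: value / apache for name, value in table.items() if name != "Apache"}


if __name__ == "__main__":
    for name, ratio in ratio_to_apache(LOCA_PER_DEV_YEAR).items():
        print(f"{name}: {ratio:.1f}x Apache's LOCA per top developer per year")
    commercial = [v for k, v in MR_PER_DEV_YEAR.items() if k != "Apache"]
    mean_mr = sum(commercial) / len(commercial)
    print(f"Apache MRs vs commercial mean: {MR_PER_DEV_YEAR['Apache'] / mean_mr:.1f}x")
```

On these figures A comes out at roughly 9 times Apache's LOCA rate, B and E at a little over 2 times, and C and D at about 1.4 and 1.3 times.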
Reporting Problems
The top 15 problem reporters contributed only 5% of PRs
Of these 15, only 3 are also core developers
Problem reporting belongs almost exclusively to the wider development community
Ownership of Code
Was thought likely that strong code ownership would evolve due to modular design and decentralisation
This was not supported by analysis of files (.c files)
– 75% had > 2 developers contributing 10% of lines
– 50% had > 4 developers contributing 10% of lines
– High level of trust and recognition of expertise
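The ownership analysis above amounts to counting, for each file, how many developers wrote a substantial share of its lines. A minimal sketch, assuming per-line authorship is available as a list of names and reading "10% of lines" as "at least 10%": the function `major_contributors` and the sample file are hypothetical, not taken from the study.

```python
from collections import Counter


def major_contributors(line_authors: list[str], threshold: float = 0.10) -> list[str]:
    """Return authors who wrote at least `threshold` of the lines in one file."""
    counts = Counter(line_authors)
    total = len(line_authors)
    return sorted(a for a, n in counts.items() if n / total >= threshold)


# Hypothetical per-line authorship for one .c file (invented for illustration).
example_file = ["ann"] * 50 + ["bob"] * 30 + ["cy"] * 15 + ["dee"] * 5

if __name__ == "__main__":
    print(major_contributors(example_file))
```

A file with a single major contributor would suggest strong ownership. Several major contributors per file, as reported on the slide, suggests shared ownership.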
Defect Density
Measured in defects/KLOCA
Apache has the same defect density for pre-release and post-release tests
Pre-release – fewer defects than commercial products
Post-release – more defects than commercial products
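The metric on this slide is a simple ratio. A minimal sketch of the calculation, with a hypothetical function name and invented example numbers (not figures from the paper):

```python
def defect_density(defects: int, lines_added: int) -> float:
    """Defects per thousand lines of code added (defects/KLOCA)."""
    if lines_added <= 0:
        raise ValueError("lines_added must be positive")
    return defects / (lines_added / 1000)


if __name__ == "__main__":
    # Invented example: 54 defects against 20,000 added lines.
    print(f"{defect_density(54, 20_000):.1f} defects/KLOCA")
```

Normalising by lines added rather than total code size keeps the comparison fair between projects of very different sizes.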
Resolving Problems
How long does it take to resolve problems?
– 50% resolved within 24hrs
– 75% resolved within 42 days
– 90% resolved within 140 days
Slightly lower for documentation, OS related and optional features
Over two periods the average resolution interval decreased significantly while the number of users increased
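Figures such as "50% resolved within 24hrs" are points on a cumulative distribution of resolution times. A minimal sketch of how such a figure could be derived from a list of resolution intervals: the function name and the sample durations are invented for illustration and are not the Apache data.

```python
def fraction_resolved_within(durations_days: list[float], limit_days: float) -> float:
    """Fraction of problems whose resolution interval is at most limit_days."""
    if not durations_days:
        raise ValueError("no resolution intervals given")
    resolved = sum(1 for d in durations_days if d <= limit_days)
    return resolved / len(durations_days)


# Invented resolution intervals in days, for illustration only.
sample = [0.1, 0.3, 0.5, 0.8, 1.0, 5, 20, 41, 100, 300]

if __name__ == "__main__":
    for limit in (1, 42, 140):
        share = fraction_resolved_within(sample, limit)
        print(f"within {limit} days: {share:.0%}")
```

Computing the same fractions for two time periods would show whether the resolution interval shortened as the user base grew.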
What has been analysed?
The structure of the development process
The number of participants
The distribution of work among different roles
Rules of ownership of code
Density of defects
Time taken to resolve problems
Hypothesis 1
Implicit coordination mechanism
– Detailed knowledge of who has expertise in what area
– Customs & habits regarding how things are done
– What core members are doing
Core of developers who control the code base
- No larger than 10-15 people
- Create approx 80% of the new functionality (not fixes or problem reporting)
Hypothesis 2
Satellite projects created
‘Divide & conquer’ – work split over core developers and satellite groups
Strict code ownership policy needs to be adopted
Hypothesis 3
A group around 10x larger than the core (10-15 people) will repair defects
- E.g. Apache, 182 people repaired defects
A group 10x larger or more will report problems
- E.g. Apache, 3,060 people submitted bug reports
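The Apache figures quoted for Hypothesis 3 can be compared against the hypothesised core size of 10-15 people. The sketch below computes the range of each group's size relative to the core. The function `ratio_to_core` is a name introduced here, and using the 10-15 range as the divisor is an assumption of this sketch.

```python
def ratio_to_core(group_size: int, core_range: tuple[int, int] = (10, 15)) -> tuple[float, float]:
    """Return (lowest, highest) ratio of group size to a core of the given size range."""
    low, high = core_range
    return group_size / high, group_size / low


FIXERS = 182       # people who repaired defects (from the slide)
REPORTERS = 3_060  # people who submitted bug reports (from the slide)

if __name__ == "__main__":
    for label, size in (("fixers", FIXERS), ("reporters", REPORTERS)):
        lo, hi = ratio_to_core(size)
        print(f"{label}: {lo:.0f}x to {hi:.0f}x the core")
```

On these numbers the fixers are roughly 12 to 18 times the core and the reporters roughly 200 to 300 times, in line with the order-of-magnitude steps the hypothesis describes.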
Hypothesis 4
Lack of resources = overburdened
Most people have only ever submitted 1 bug
– Apache: 3,060 people reported 3,975 bugs
Wider community needed to free up core developers' time so they can develop new functionality
Projects without a wider community finding and repairing defects will fail
Hypothesis 5
Defect density (per 1,000 lines of code) will be lower than commercial software
Hypothesis 6
Familiar with the features needed
Familiar with desirable user behaviour
Developers are also experienced users of the software they write
Hypothesis 7
‘Many eyeballs’ implies shallow bugs
‘Free-world’ of OSS – Patches available to all customers nearly as soon as they are made
Commercial developments
– Patches bundled into new releases and scheduled for release at specific times (long term projects)
OSS developments exhibit rapid responses to customer problems
Mozilla – How Things Happen
At the time the paper was written, development was done by 12 staff at mozilla.org
Non-development staff concentrate on issues like testing, or community milestone releases
The content of future releases is specified in a ‘road map’
Work within this is allocated according to developer preferences and expertise
How Things Happen cont…
Developers can browse Bugzilla to choose areas on which they would like to work
Mozilla web pages can be used to note areas where help is needed
Mozilla operates on a ‘daily build’
Each build is ‘smoke tested’ by one of 6 pre-release test teams
This is followed by inspections and managed release
Mozilla – Research Findings
The points below summarise the research questions originally considered
486 people contributed code
412 contributed code to fixes
6,873 communicated problems (external community very large, small core)
Code ownership is enforced
Mozilla – Findings cont…
The authors’ hypotheses 1 and 2 are supported by the Mozilla data
However, these hypotheses were modified as summarised below:
The core size is limited to 10-15 people if only informal coordination is used
– Original hypothesis did not discuss impact of coordination
Project cores larger than 10-15 might require other mechanisms in addition to code ownership to improve coordination
Mozilla – Findings cont…
Hypothesis 3 (relative sizes of core/fixers/reporters) is weakly supported
Core = 22 to 35 (larger than expected)
Fixers = 47 to 129
Reporters = 119 to 623
Mozilla defect density is lower than that of equivalent commercial projects
– Caution: more defects may be found later
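The "weakly supported" verdict on Hypothesis 3 can be made concrete by comparing the quoted ranges for core, fixers and reporters. The sketch below computes the smallest and largest ratio each pair of ranges allows. It assumes the range endpoints can be paired freely, which the slide does not state, and `ratio_bounds` is a name introduced here.

```python
def ratio_bounds(group: tuple[int, int], core: tuple[int, int]) -> tuple[float, float]:
    """Smallest and largest possible group/core ratio given (low, high) ranges."""
    group_low, group_high = group
    core_low, core_high = core
    return group_low / core_high, group_high / core_low


# Ranges quoted on the slide.
CORE = (22, 35)
FIXERS = (47, 129)
REPORTERS = (119, 623)

if __name__ == "__main__":
    for label, group in (("fixers", FIXERS), ("reporters", REPORTERS)):
        lo, hi = ratio_bounds(group, CORE)
        print(f"{label}: {lo:.1f}x to {hi:.1f}x the core")
```

Under that assumption the fixers are between about 1.3 and 5.9 times the core, short of the hypothesised factor of 10, which fits the "weakly supported" reading.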
Mozilla - Conclusions
Many commercial/OSS hybrids are possible
These hybrids will require a large open source community to fix bugs
They will also require an even larger community to find bugs
Research
A strong paper
Good approach to measuring the metrics needed to test the hypotheses
Citation frequency in the following years suggests that it is regarded as authoritative
However, the final conclusion, as mentioned previously, must be considered unproven as yet
References
“How to read and critique a technical paper”, Colorado State University, http://www.cs.colostate.edu/~cs656/reading/reading-paper.ppt#1
Greenhalgh, Trisha, “How to read a paper”, London, BMJ, 1997
The above is summarised at: http://www.bmj.com/archive/7102/7102ed.htm
Note: the BMJ references above concern evidence based medicine, but have some useful sections!
Any questions?
…preferably easy ones!