Empirical Project Monitor and Results from 100 OSS Development Projects
Masao Ohira
Empirical Software Engineering Research Laboratory, Nara Institute of Science and Technology
[email protected]
12/21/2003 CDEKA Workshop@UC Irvine, December 16-18, 2003 2
EASE Project
Empirical software development environment for tens of thousands of projects
• Massive data collection
• Intensive data analysis
• Feedback for software process improvement in organizations/communities (not only a single developer/project)
[Diagram: collection → analysis → improvement]
Empirical Environment
[Figure: overview of the Empirical Environment. Data from widely used development support tools, i.e. versioning (CVS), mailing (Mailman), issue tracking (GNATS), and other tool data, is collected from projects x, y, z, ... Format translators convert it into a process data archive (XML format) and a product data archive (CVS format). Analysis functions (code clone detection, component search, metrics measurement, project categorization, cooperative filtering) are offered to managers and developers through a GUI. EPM (under development) is marked as one part of this environment.]
EPM: Empirical Project Monitor
A partial implementation of the Empirical Environment
Collect, measure, and show various data for project control
Data sources are tools used in software development
• Versioning system (e.g. CVS)
• Mailing list manager (e.g. Mailman)
• Issue tracking tool (e.g. GNATS)
Architecture of EPM
[Figure: CVS, Mailman, and GNATS (ShareSource™) supply versioning history, mail history, and problem history. These are converted into standardized empirical SE data (in XML) and stored in a PostgreSQL repository. Analysis tools measure intra- and inter-project data and deliver prediction/schedule information, metrics values, other tool data, etc. to developers and managers.]
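As a rough illustration of the repository and the intra- and inter-project measurement in this architecture, the sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. The `commits` table, its columns, and the sample rows are assumptions made for the example, not EPM's actual schema.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL repository.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE commits (project TEXT, developer TEXT, day TEXT)")
conn.executemany(
    "INSERT INTO commits VALUES (?, ?, ?)",
    [
        ("project-x", "alice", "2003-11-01"),
        ("project-x", "bob", "2003-11-02"),
        ("project-x", "alice", "2003-11-03"),
        ("project-y", "carol", "2003-11-02"),
    ],
)

# Intra-project measurement: commits per developer within one project.
intra = conn.execute(
    "SELECT developer, COUNT(*) FROM commits "
    "WHERE project = ? GROUP BY developer ORDER BY developer",
    ("project-x",),
).fetchall()

# Inter-project measurement: commits per project across the repository.
inter = conn.execute(
    "SELECT project, COUNT(*) FROM commits GROUP BY project ORDER BY project"
).fetchall()

print(intra)
print(inter)
```

The same two queries would work against PostgreSQL through a database driver; only the connection setup and the parameter placeholder style would change.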
Characteristics of EPM
Use open source development tools → Easy to introduce
Small overhead of data collection
• Most data comes from versioning history
• Communication through e-mail, and recording issues with a tracking tool
Easy to transform other data formats into the standardized empirical SE data format
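To make the idea of a format translator concrete, here is a minimal sketch that turns tool records into a common XML form. The `event` and `archive` element names, the attributes, and the sample records are hypothetical; the slides do not show the real standardized empirical SE data format.

```python
import xml.etree.ElementTree as ET


def translate(source, project, record):
    """Wrap one tool record (a dict) in a common <event> element."""
    event = ET.Element("event", {"source": source, "project": project})
    for name, value in record.items():
        ET.SubElement(event, name).text = str(value)
    return event


# Hypothetical records as they might come out of two different tools.
commit = {"author": "alice", "date": "2003-12-01", "added": 42, "removed": 7}
mail = {"sender": "bob", "date": "2003-12-02", "subject": "build failure"}

archive = ET.Element("archive")
archive.append(translate("cvs", "project-x", commit))
archive.append(translate("mailman", "project-x", mail))

xml_text = ET.tostring(archive, encoding="unicode")
print(xml_text)
```

Because every tool is mapped to the same element shape, adding a new data source only needs a small adapter that produces the dict, which is one way to read the "easy to transform" claim.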
Application Area of EPM
Large project
• Share project status immediately
• Reduce project management load
• Reduce the risk of data tampering
Small project
• Apply at small cost
• Apply to various projects, including XP and distributed development
Data collection from OSS Development Projects
SourceForge.net
• Hosted projects: 72,853 (Dec. 15)
• Registered users: 753,428 (Dec. 15)
A variety of collaboration tools
• SourceForge Collaborative Development System (CDS) web tools
• Project Web Server
• Tracker: Tools for Managing Support
• Mailing lists and discussion forums
• MySQL Database Services
• Project CVS Services
• etc.
Available data sources for EPM
Overview of Collected Data
100 active projects @ SF.net
Data sources for EPM
• CVS data (only 40 projects)
• Mailing list data
• Issue (bug) report data
Project info. in a summary page
• number of developers
• period of a project
• development status
• intended audience
• programming language
• number of bugs
• number of CVS commits
• etc.
SourceForge.net
[Screenshot of a SourceForge.net project page, annotated with "information related to the project" and "links to available data sources for EPM".]
Summary of 100 OSS Projects @ SF.net: Evolution?
[Figure: plot with the axes "Registered Day of Projects" (Aug-99 to Jan-04) and "Current Developers" (0 to 50).]
Result of CVS Product Data: Lines of Code (history of software growth)
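The chart for this slide is not preserved in the transcript. As a hedged sketch of how a lines-of-code growth history could be derived from versioning data, the example below accumulates per-commit line changes over time. The tuple layout and the sample numbers are assumptions for illustration, not data from the 100 projects.

```python
from itertools import accumulate

# Hypothetical commit records: (date, lines added, lines removed).
commits = [
    ("2003-02-03", 40, 15),
    ("2003-01-10", 120, 0),
    ("2003-03-21", 5, 60),
]


def loc_history(commits):
    """Return (date, total lines of code) pairs in chronological order."""
    ordered = sorted(commits)  # ISO dates sort chronologically as strings
    deltas = [added - removed for _, added, removed in ordered]
    totals = accumulate(deltas)  # running sum of net line changes
    return [(date, total) for (date, _, _), total in zip(ordered, totals)]


history = loc_history(commits)
print(history)
```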
Result of CVS Process Data: Check in/out (history of developer’s activities)
How can we use such a large amount of data?
Gross Classification using EVIDII
EVIDII: Interactive interfaces that visualize relationships among three sets of data
(original application domain: face-to-face communication support between clients and designers)
Demo: organizing a dynamic community?
Project X
Project info.
numbers of developers, LOC, development terms, etc.
Scenario: organizing a dynamic community / providing feedback for improvement
1. Comparing other projects with a target project
2. Finding similarities and differences between them
3-a. Notifying related project leaders of the existence of communities
4-a. Asking them for help/advice for improvement
DynC approach
3-b. Identifying factors of the similarities and differences
4-b. Providing suggestions for improvement
EASE approach
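Steps 1 and 2 of the scenario (comparing a target project with others and finding similar ones) can be sketched as a simple nearest-neighbor ranking over project info. This is only an illustration under stated assumptions: the feature choice (number of developers, LOC, development term in months), the min-max scaling, the Euclidean distance, and the sample values are all hypothetical and are not the method used by EVIDII or EASE.

```python
import math

# Hypothetical project info: (developers, LOC, development term in months).
projects = {
    "project-x": (5, 20000, 24),
    "project-a": (6, 22000, 20),
    "project-b": (40, 300000, 48),
    "project-c": (2, 3000, 6),
}


def normalize(projects):
    """Min-max scale each feature to [0, 1] so no single unit dominates."""
    columns = list(zip(*projects.values()))
    lows = [min(column) for column in columns]
    spans = [(max(column) - min(column)) or 1 for column in columns]
    return {
        name: tuple((v - low) / span for v, low, span in zip(values, lows, spans))
        for name, values in projects.items()
    }


def rank_by_similarity(projects, target):
    """List the other projects, most similar to the target first."""
    scaled = normalize(projects)
    distances = {
        name: math.dist(scaled[target], vector)
        for name, vector in scaled.items()
        if name != target
    }
    return sorted(distances, key=distances.get)


ranking = rank_by_similarity(projects, "project-x")
print(ranking)
```

The closest projects would be candidates for the community in steps 3-a and 4-a, while the per-feature differences are a starting point for steps 3-b and 4-b.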