Topology and Evolution of the Open Source Software Community
Advisors:
Dr. Vincent W. Freeh
Dr. Kevin Bowyer
Supported in part by the National Science Foundation – Digital Science & Technology
Yongqin Gao
Outline
• Overview
• Data collection
• Network modeling
• Topological statistical analysis (real data)
• Simulations
• Publications
• Conclusions
Overview (about OSS)
• What is OSS
– Free to use, free to distribute
– Unlimited users and usage
– Source code available and modifiable
• Potential advantages over commercial software
– Higher quality
– Faster development
– Lower cost
– Transparent
Overview (about our research)
• Our goal
– Understanding the OSS phenomenon
• Approach
– SourceForge is the source of our empirical data
– Modeling as a social network
– Analysis of topological statistics
– Use simulation to verify and validate the model
Data Collection — Monthly
• Web crawler (scripts)
– Python
– Shell
– AWK
– Sed
• Monthly
• Since Jan 2001
• ProjectID
• DeveloperID
• Almost 2 million records
• Relational database
PROJ|DEVELOPER
8001|dev348
8001|dev8972
8001|dev9922
8002|dev27650
8005|dev31351
8006|dev12409
8007|dev19935
8007|dev4262
8007|dev36711
8008|dev8972
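As a rough illustration of how such records could be loaded for analysis, the sketch below parses pipe-delimited PROJ|DEVELOPER rows into two lookup tables. The sample rows are read off the slide's listing; the function name `parse_memberships` and the in-memory dictionaries are assumptions for illustration and stand in for the relational database mentioned on the slide.

```python
from collections import defaultdict

# Sample rows in the PROJ|DEVELOPER layout (read off the slide's listing).
RAW = """PROJ|DEVELOPER
8001|dev348
8001|dev8972
8001|dev9922
8002|dev27650
8005|dev31351
8006|dev12409
8007|dev19935
8007|dev4262
8007|dev36711
8008|dev8972"""


def parse_memberships(text):
    """Return (project -> developers, developer -> projects) from pipe-delimited rows."""
    devs_by_project = defaultdict(set)
    projects_by_dev = defaultdict(set)
    lines = text.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        project_id, developer_id = line.split("|")
        devs_by_project[project_id].add(developer_id)
        projects_by_dev[developer_id].add(project_id)
    return devs_by_project, projects_by_dev


devs_by_project, projects_by_dev = parse_memberships(RAW)
print(sorted(devs_by_project["8007"]))     # developers on project 8007
print(sorted(projects_by_dev["dev8972"]))  # projects that dev8972 belongs to
```

Keeping both directions of the mapping is a convenience: one serves the developer network and the other the project network.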
Modeling as Collaboration Network
• What is a collaboration network?
– A social network representing collaborative relationships.
– Examples: the movie actor network and the scientist collaboration network
• Differences of the SourceForge collaboration network
– Link detachment
– Virtual collaboration
– Voluntary
– Global
• Bipartite property of collaboration networks
Collaboration network - bipartite
Adapted from Newman, Strogatz and Watts, 2001
SourceForge Developer Network
OSS Developer Network (Part)
Developers are nodes / Projects are links
24 Developers, 5 Projects
2 hub Developers, 1 Cluster
Links (project, developer, developer):
15850 dev[46] dev[83]
15850 dev[46] dev[48]
15850 dev[46] dev[56]
15850 dev[46] dev[58]
6882 dev[58] dev[47]
6882 dev[47] dev[79]
6882 dev[47] dev[52]
6882 dev[47] dev[55]
7028 dev[46] dev[99]
7028 dev[46] dev[51]
7028 dev[46] dev[57]
7597 dev[46] dev[45]
7597 dev[46] dev[72]
7597 dev[46] dev[55]
7597 dev[46] dev[58]
7597 dev[46] dev[61]
7597 dev[46] dev[64]
7597 dev[46] dev[67]
7597 dev[46] dev[70]
9859 dev[46] dev[49]
9859 dev[46] dev[53]
9859 dev[46] dev[54]
9859 dev[46] dev[59]
[Figure: network diagram with node labels dev[46], dev[83], dev[56], dev[48], dev[52], dev[79], dev[72], dev[51], dev[57], dev[55], dev[99], dev[47], dev[80], dev[53], dev[58], dev[65], dev[45], dev[70], dev[67], dev[59], dev[54], dev[49], dev[64], dev[61] and project labels Project 6882, Project 9859, Project 7597, Project 7028, Project 15850]
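Treating developers as nodes and projects as links amounts to a one-mode projection of the bipartite project/developer data. The sketch below shows one plausible way to build it, assuming every pair of developers on the same project is linked; the slides do not spell out the exact construction, and the membership data and the names `one_mode_projection` and `invert` are hypothetical.

```python
from itertools import combinations

# Hypothetical memberships: project ID -> developers on that project.
memberships = {
    "P1": {"a", "b", "c"},
    "P2": {"c", "d"},
    "P3": {"e"},
}


def one_mode_projection(groups):
    """Link every pair of members that share a group; returns node -> set of neighbors."""
    neighbors = {}
    for members in groups.values():
        for member in members:
            neighbors.setdefault(member, set())  # keep isolated members as nodes
        for u, v in combinations(sorted(members), 2):
            neighbors[u].add(v)
            neighbors[v].add(u)
    return neighbors


def invert(groups):
    """Turn project -> developers into developer -> projects."""
    inverted = {}
    for group, members in groups.items():
        for member in members:
            inverted.setdefault(member, set()).add(group)
    return inverted


developer_network = one_mode_projection(memberships)        # developers as nodes
project_network = one_mode_projection(invert(memberships))  # projects as nodes
print(developer_network)
print(project_network)
```

Applying the same projection to the inverted mapping gives the dual view, in which projects are nodes and shared developers are links.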
Topological Analysis
• Statistics inspected
– Diameter
– Average degree
– Clustering coefficient
– Degree distribution
– Cluster size distribution
– Relative size of major cluster
– Fitness and life cycle
• Evolution of these statistics
• Dual networks
– Developer network and project network
Terminology
• Diameter
– Average length of the shortest paths between all pairs of vertices
• Degree
– The count of edges connected to a given vertex
• Average degree
– Average of the degrees of all vertices in the network
• Cluster
– A connected component of the network
• Clustering coefficient (CC)
– CCi: the number of links actually present among the vertices in the neighborhood of vertex i, as a fraction of the total possible number of such links.
– CC: average of all CCi in a network
• Degree distribution
– The distribution of degrees throughout a network
• Major cluster
– The largest cluster in the network
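The definitions above can be made concrete with a small pure-Python sketch on a toy graph. Two conventions here are assumptions rather than anything stated on the slides: the average shortest-path length (the "diameter" as defined above) is taken over connected pairs only, and a vertex with fewer than two neighbors gets CCi = 0. The graph and all function names are hypothetical.

```python
from collections import deque

# Toy network: triangle a-b-c, d attached to c, and an isolated vertex e.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
    "e": set(),
}


def average_degree(g):
    return sum(len(nbrs) for nbrs in g.values()) / len(g)


def clusters(g):
    """Connected components, largest first."""
    seen, result = set(), []
    for start in g:
        if start in seen:
            continue
        seen.add(start)
        component, queue = [], deque([start])
        while queue:
            v = queue.popleft()
            component.append(v)
            for w in g[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        result.append(component)
    return sorted(result, key=len, reverse=True)


def local_cc(g, v):
    """Links present among v's neighbors divided by the links possible among them."""
    nbrs = g[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumed convention for vertices with fewer than two neighbors
    links = sum(1 for u in nbrs for w in g[u] if w in nbrs) / 2  # each link seen twice
    return links / (k * (k - 1) / 2)


def clustering_coefficient(g):
    return sum(local_cc(g, v) for v in g) / len(g)


def average_path_length(g):
    """Mean shortest-path length over connected pairs, using BFS from every vertex."""
    total, pairs = 0, 0
    for source in g:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in g[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


components = clusters(graph)
relative_major_size = len(components[0]) / len(graph)
print(average_degree(graph), clustering_coefficient(graph),
      average_path_length(graph), relative_major_size)
```

Running BFS from every vertex is quadratic in the worst case, so on networks with tens of thousands of vertices a sampled or approximate calculation may be more practical.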
Diameter of Developer Network vs. Time
• Network size increased from 30,000 to 70,000
Diameter of Project Network vs. Time
• Network size increased from 20,000 to 50,000.
• Diameter decreases with time for both the developer network and the project network
Clustering Coefficient of Developer Network vs. Time
Clustering Coefficient of Project Network vs. Time
Degree Distribution (developers)
Degree Distribution (projects)
Cluster Size Distribution
• R² with major cluster is 0.7426
• R² without major cluster is 0.9799
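One common way to obtain an R² for a power-law fit is an ordinary least-squares line through the log-log points. The sketch below assumes that approach, which the slides do not confirm, and uses made-up cluster sizes and counts; `fit_power_law` is a hypothetical name. It shows how a single large cluster can pull the fit away from an otherwise clean power law.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares line through (log10 x, log10 y); returns (exponent, intercept, r_squared)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(lx, ly))
    ss_tot = sum((y - mean_y) ** 2 for y in ly)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical data: number of clusters of each size, following count = 1000 * size^-2.
sizes = [1, 2, 4, 8, 16]
counts = [1000 * s ** -2 for s in sizes]

# Same data plus one very large "major cluster" that occurs exactly once.
sizes_with_major = sizes + [10000]
counts_with_major = counts + [1]

exponent, _, r2_without = fit_power_law(sizes, counts)
_, _, r2_with = fit_power_law(sizes_with_major, counts_with_major)
print(exponent, r2_without, r2_with)
```

The major cluster is a single point far from the line traced by the small clusters, which is consistent with the fit improving once it is excluded.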
Relative Size of Major Cluster vs. Time
• The relative size of the major cluster increases
• The rate of increase is slowing
• May be an indication of the network evolution
Existence of Fitness
• Investigating the development of single projects can verify the existence of the “newcomer” phenomenon
• We tracked the development of every project that was new in July 2001 from then until now (1660 projects in total)
• Maximal monthly growth per project is 13, while average monthly growth per project is just 0.3639
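The slide does not define "monthly growth". A minimal sketch, assuming it means the change in a project's developer count between consecutive monthly snapshots, is shown below; the project names and counts are invented for illustration.

```python
# Hypothetical developer counts per project, one entry per monthly snapshot.
team_sizes = {
    "projA": [1, 1, 2, 2],
    "projB": [1, 4, 4, 5],
}


def monthly_growth(sizes):
    """Change in team size between consecutive snapshots (assumed definition of growth)."""
    return [later - earlier for earlier, later in zip(sizes, sizes[1:])]


all_growth = [g for sizes in team_sizes.values() for g in monthly_growth(sizes)]
max_growth = max(all_growth)
avg_growth = sum(all_growth) / len(all_growth)
print(max_growth, avg_growth)
```

A large gap between the maximum and the average, as reported above, suggests that a few projects attract developers far faster than the rest.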
Life Cycle of Project
Summary
Summary of Results
• Power law rules
– Degree distributions, cluster distribution
• Average degree increasing with time
• Diameter decreasing with time
• Clustering coefficient decreasing with time
• Fitness existed in SourceForge
• Projects have life cycle behaviors
Conceptual Framework
[Figure: conceptual framework diagram. Labels: Empirical data, Adjustment, Generation, Verification, Validation, Characterization, Description, Model, Simulation]
Agent-based Modeling
• EBM (equation-based modeling) vs. ABM (agent-based modeling)
– Heterogeneous individuals
– Complex network
• Experiment environment
– Hardware: computer cluster
– Software:
• Simulation toolkits: Swarm
• Database: Oracle
• Language: Java, PL/SQL
Model for SourceForge
• ABM based on bipartite graph
• Model description
– Agent: developer
– Behaviors: create, join, abandon and idle
– Preference: developer’s and project’s
– Fitness
• Four models in iterations
– ER, BA, BA with constant fitness and BA with dynamic fitness
• Comparison of empirical and simulated data
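To give a feel for the agent behaviors listed above, here is a minimal Python sketch, not the Swarm/Java implementation described in the slides. Each developer agent creates, joins, abandons or idles on every tick. Under the "ER" rule the project to join is picked uniformly; under the "BA" rule it is picked with probability proportional to team size. All probabilities, names and the rule that a project's last member never abandons it are assumptions made for illustration.

```python
import random


def pick_project(projects, rule, rng):
    """Choose a project to join: uniformly (ER) or proportionally to team size (BA)."""
    ids = sorted(projects)
    if rule == "ER":
        return rng.choice(ids)
    weights = [len(projects[p]) for p in ids]  # preferential attachment
    return rng.choices(ids, weights=weights, k=1)[0]


def step(projects, developer, rule, rng, p_create=0.1, p_join=0.3, p_abandon=0.05):
    """One agent decision; the probabilities are illustrative, not fitted values."""
    r = rng.random()
    if r < p_create or not projects:
        projects[len(projects)] = {developer}  # projects are never deleted, so IDs stay unique
        return "create"
    if r < p_create + p_join:
        projects[pick_project(projects, rule, rng)].add(developer)
        return "join"
    if r < p_create + p_join + p_abandon:
        # Assumed rule: a developer only leaves projects that keep at least one member.
        leavable = [p for p in sorted(projects)
                    if developer in projects[p] and len(projects[p]) > 1]
        if leavable:
            projects[rng.choice(leavable)].discard(developer)
            return "abandon"
    return "idle"


def simulate(rule, n_developers=50, n_ticks=20, seed=0):
    """Run every developer once per tick; returns project ID -> set of developers."""
    rng = random.Random(seed)
    projects = {}
    for _ in range(n_ticks):
        for dev in range(n_developers):
            step(projects, dev, rule, rng)
    return projects


for rule in ("ER", "BA"):
    result = simulate(rule)
    print(rule, len(result), max(len(members) for members in result.values()))
```

The output of such a run is again a project-to-developers mapping, so the same topological statistics can be computed on simulated and empirical data for comparison.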
ER Model - Diameter
• Average degree is decreasing, while it is increasing in the empirical data
• Diameter is increasing, while it is decreasing in the empirical data
ER Model – Clustering Coefficient
• Clustering coefficient is relatively low (under 0.3), while it is around 0.7 in the empirical data.
ER Model – Degree Distribution
• Degree distribution is a normal distribution, while it is a power law in the empirical data
ER Model – Cluster Size Distribution
• Power law distribution with R² of 0.6667 (0.9653 without the major cluster), while R² in the empirical data is 0.7426 (0.9799 without the major cluster)
• The actual distribution is different from the empirical data
BA Model – Diameter and Clustering Coefficient
• Small diameter and high clustering coefficient, like the empirical data
• Diameter and clustering coefficient are both decreasing, like the empirical data
BA Model – Degree Distribution
• Power laws in degree distributions, similar to the empirical data (o for simulated data and x for empirical data).
• For developer distribution: simulated data has R² of 0.9798 and empirical data has R² of 0.9714.
• For project distribution: simulated data has R² of 0.6650 and empirical data has R² of 0.9838.
BA Model with Constant Fitness
• Power laws in degree distributions, similar to the empirical data (o for simulated data and x for empirical data).
• For developer distribution: simulated data has R² of 0.9742 and empirical data has R² of 0.9714.
• For project distribution: simulated data has R² of 0.7253 and empirical data has R² of 0.9838.
BA Model with Dynamic Fitness
• Power laws in degree distributions, similar to the empirical data (o for simulated data and x for empirical data).
• For developer distribution: simulated data has R² of 0.9695 and empirical data has R² of 0.9714.
• For project distribution: simulated data has R² of 0.8051 and empirical data has R² of 0.9838.
Advantage of Dynamic Fitness
• Intuition: fitness should decrease with time.
• Statistics: projects have life cycle behavior, which cannot be replicated by the BA model with constant fitness but can be replicated by the BA model with dynamic fitness
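The difference between the two fitness variants can be sketched as follows. The slides do not give the functional form of the dynamic fitness, so the half-life decay below is purely an assumption, as are the function names and the choice to multiply fitness by team size to get a project's attractiveness.

```python
def constant_fitness(base, age_months):
    """Fitness that ignores project age."""
    return base


def dynamic_fitness(base, age_months, half_life=6.0):
    """Hypothetical decay: fitness halves every `half_life` months."""
    return base * 0.5 ** (age_months / half_life)


def join_weight(team_size, fitness):
    """Relative attractiveness of a project to a developer who is choosing what to join."""
    return team_size * fitness


# A 10-developer project as it ages under each variant.
for age in (0, 6, 12):
    print(age,
          join_weight(10, constant_fitness(1.0, age)),
          join_weight(10, dynamic_fitness(1.0, age)))
```

With a constant fitness an established project stays equally attractive indefinitely, whereas a decaying fitness lets its growth tail off, which is the kind of life cycle behavior the slide refers to.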
Summary
Summary of Results
• We use ABM to model and simulate the SourceForge collaboration network.
• A conceptual framework is proposed for agent-based modeling and simulation.
• Case study of this framework: SourceForge study through ER, BA, BA with constant fitness and BA with dynamic fitness.
Publications To-date
• Yongqin Gao, "Modeling and Simulation of the OSS Community", Seventh Annual Swarm Researchers Meeting (Swarm2003), Notre Dame, IN, 2003.
• Yongqin Gao, Vince Freeh, and Greg Madey, "Analysis and Modeling of the Open Source Software Community", NAACSOS Conference 2003, Pittsburgh.
• Yongqin Gao, Vince Freeh, and Greg Madey, "Conceptual Framework for Agent-based Modeling and Simulation", NAACSOS Conference 2003, Pittsburgh.
• Greg Madey, Vincent Freeh, Renee Tynan, Yongqin Gao, Chris Hoffman, "Agent-based Modeling and Simulation of Collaborative Social Networks", AMCIS 2003, Tampa, FL.
Possible Journals
• Chapter 3
– Physica A: Statistical Mechanics and its Applications
– Journal of Social Structure (JSS)
• Chapter 4
– Journal of Artificial Societies and Social Simulation (JASSS)
– Journal of Statistical Computation and Simulation (JSCS)
Conclusion
• Study of the SourceForge collaboration network can help us understand the OSS community
• We investigate not only the topological statistics but also the evolution of these statistics.
• Simulation is used to investigate the SourceForge collaboration network.
Contribution
• Statistical study of the SourceForge community (snapshot and evolution)
• Verification of the approximate method to calculate the diameter and CC
• Proposal of a model for the SourceForge community
• Improvement of the BA model with dynamic fitness
Future Work
• Data collection
– Database dump from SourceForge (PostgreSQL 8GB)
– All the possible attributes
– Database schema in UML
• More topology analysis (with more attributes)
– Discussion forum
– Task assignment
– Project management
– Active testing
• Behavior-based analysis
– Interaction between agents
– H. Peyton Young’s model
• Information entropy analysis
Acknowledgements
• Committee
• Advisors
• Colleagues
• SourceForge
• NSF
• Others
Thank you