Socio-technical evolution and migration in the Ruby ecosystem
Eleni Constantinou, Tom Mens
COMPLEXYS Research Institute, UMONS
BENEVOL 2016, Utrecht
Introduction
Software ecosystem
• Collection of software projects that are developed and evolve together in the same environment [1]
Ecosystem environment
• Development team ⇒ Social aspect
• Source code artefacts ⇒ Technical aspect
Modifications
• Social: Contributors joining/leaving
• Technical: New/obsolete source code files
[1] M. Lungu. Towards reverse engineering software ecosystems. Int'l Conf. Software Maintenance, pages 428-431, 2008.
Introduction
Evolution
• Longevity
• Growth
Ecosystem sustainability
Long-term effect of social/technical modifications
A sustainable software ecosystem can increase or maintain its
user/developer community over longer periods of time and can
survive inherent changes such as new technologies or new
products (e.g. from competitors) that can change the population
(the community of users, developers etc) [2]
[2] D. Dhungana, I. Groher, E. Schludermann, S. Biffl. Software ecosystems vs. natural ecosystems: learning from the ingenious mind of nature. Eur. Conf. on Software Architecture: Companion Volume, pages 96-102, 2010.
Background
[Figure: timeline from START to END divided into time units 1 to N; technical artefacts of projects P1–P4 shown per time unit]
Definitions
Project Metrics
ObsoleteProjects(t)
NewProjects(t)
ActiveProjects(t)
ProjectRenewal(t)
ProjectAbandonment(t)
[Figure: example with projects P1–P4 illustrating the project metrics]
Definitions
Team Metrics
Leavers(t)
Joiners(t)
Stayers(t)
Team(t)
TeamRenewal(t)
TeamAbandonment(t)
Definitions
File Metrics
Obsolete(t)
New(t)
Maintained(t)
FileRenewal(t)
FileAbandonment(t)
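The slides name the project, team and file metrics but do not spell out their formulas. The sketch below shows one plausible reading, under stated assumptions: an entity is "new" in time unit t if it was never active before t, "obsolete" if it was active in t-1 and is never active again, renewal is the share of new entities among those active in t, and abandonment is the share of obsolete entities among those active in t-1. The function name `turnover` and the example data are hypothetical.

```python
from typing import Hashable, List, Set, Tuple


def turnover(activity: List[Set[Hashable]], t: int) -> Tuple[Set[Hashable], Set[Hashable], float, float]:
    """Return (new, obsolete, renewal, abandonment) for time unit t.

    activity[i] is the set of entities (projects, contributors or files)
    active in time unit i. All definitions are assumptions, not taken
    verbatim from the slides.
    """
    if not 1 <= t < len(activity):
        raise ValueError("t must have a preceding time unit")
    current, previous = activity[t], activity[t - 1]
    seen_before = set().union(*activity[:t])   # active in any unit before t
    seen_from_t = set().union(*activity[t:])   # active in t or any later unit
    new = current - seen_before                # first appearance in t
    obsolete = previous - seen_from_t          # permanently gone as of t
    renewal = len(new) / len(current) if current else 0.0
    abandonment = len(obsolete) / len(previous) if previous else 0.0
    return new, obsolete, renewal, abandonment


# Two consecutive time units with hypothetical projects P1..P4.
units = [{"P1", "P2", "P3"}, {"P1", "P3", "P4"}]
new, obsolete, renewal, abandonment = turnover(units, 1)
```

The same function can be applied to contributors (Joiners, Leavers) or files by changing what the sets contain. Note that "obsolete" needs the full history, because an entity that merely pauses for one time unit should not count as having left.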
Source Code Files
Refactoring activities
• Renamed files
• Moved files
These affect the validity of the renewal and abandonment measurements
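A renamed or moved file would otherwise be counted once as obsolete (old path) and once as new (new path). The slides do not say how this was handled; the following is only a sketch of one possible mitigation, assuming rename events are available as chronologically ordered (old path, new path) pairs. The function name and the example paths are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def canonical_paths(renames: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each current path to the path the file was first known by.

    Assumes the rename events are given in chronological order.
    """
    origin: Dict[str, str] = {}
    for old, new in renames:
        # If `old` is itself the result of an earlier rename, carry its
        # original identity forward; otherwise `old` is the original.
        origin[new] = origin.pop(old, old)
    return origin


# Hypothetical history: one rename followed by one move.
renames = [
    ("lib/util.rb", "lib/utils.rb"),
    ("lib/utils.rb", "lib/support/utils.rb"),
]
origin = canonical_paths(renames)
```

Activity on `lib/support/utils.rb` can then be attributed to the identity `lib/util.rb`, so the file counts as maintained rather than as one obsolete file plus one new file.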
Research Questions
RQ1 How does the ecosystem grow over time?
RQ2 How do the technical artefacts of the ecosystem evolve?
RQ3 How does the ecosystem’s contributor team evolve?
RQ4 How do changes in the contributor team impact the technical artefacts?
Dataset
• Ruby ecosystem in GitHub
• GHTorrent dataset [2] (2016-09-05 dump)
• Timespan: October 2007 – September 2016
• Time unit: year quarters
• Commit activity
• Three levels: Base project/Forks/Ecosystem
[2] G. Gousios. The GHTorrent dataset and tool suite. Working Conf. Mining Software Repositories, pages 233-236, 2013.
Dataset Perils – Mitigation & filters
Filter | Description | Perils
1 | Eliminate non-Ruby projects |
2 | Eliminate inactive projects | Low project activity, inactive project, repository is not a project
3 | Eliminate isolated projects | Personal projects
4 | Eliminate forks without merges to the base project | Inactive project, few projects use pull requests
5 | Eliminate short-lived contributors | Noise of occasional/short-lived contributors
6 | Only consider source code files in commits | Non-software development project
Dataset
                 Base           Forks          Ecosystem
Projects         10,792         49,101         60,073
Contributors     42,206         34,317         55,924
Touched Files    681,539        191,016        712,300
Commits          2,638,097      887,030        3,525,127
LOC              389,930,604    77,510,268     467,440,872
RQ1 How does the ecosystem grow over time?
[Figure panels: Commits; Lines of Code (LOC)]
RQ1 How does the ecosystem grow over time?
[Figure panels: Base Projects; Forks]
Quarter 25 (November 2013 – February 2014)
Small number of new projects
RQ1 How does the ecosystem grow over time?
[Figure panels: Base Projects; Forks]
Before quarter 25
• Base Projects: 30-40% new projects, less than 10% abandoned
• Forks: more than 60% new forks
RQ1 How does the ecosystem grow over time?
Evidence of contributor migration to JavaScript
After quarter 17 (December 2011)
Larger growth of JavaScript ecosystem
RQ2 How do the technical artefacts (files) evolve?
[Figure panels: Base Projects; Forks]
Base projects: bulk of development activity
After quarter 25: decrease of new files
RQ3 How does the contributor team evolve?
[Figure panels: Base Projects; Forks; Ecosystem]
Contributors leave forks but continue to participate in base projects
After quarter 20: more Leavers; fewer Joiners
RQ3 How does the contributor team evolve?
[Figure panels: Base Projects; Forks; Ecosystem]
Decreasing renewal; increasing abandonment
After quarter 25: Abandonment > Renewal
RQ3 How does the contributor team evolve?
Ecosystem     Active in Ruby
JavaScript    18,038
Python        10,707
Java          7,363
C             6,406

Ecosystem     Abandoned Ruby    Percentage
JavaScript    13,814            77%
Python        8,131             76%
Java          5,132             70%
C             4,174             65%

Most Ruby Leavers…
• Worked in JavaScript projects in parallel to Ruby projects
• Continued to work in JavaScript after abandoning Ruby
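The slide does not state how the Percentage column is derived. Assuming it is the number of contributors who abandoned Ruby divided by the number of contributors of that ecosystem who were also active in Ruby, the reported values can be reproduced as follows (variable names are illustrative):

```python
# Counts copied from the two tables on the slide.
active_in_ruby = {"JavaScript": 18038, "Python": 10707, "Java": 7363, "C": 6406}
abandoned_ruby = {"JavaScript": 13814, "Python": 8131, "Java": 5132, "C": 4174}

# Assumed formula: share of dual-ecosystem contributors who left Ruby,
# rounded to the nearest whole percent.
percentage = {
    ecosystem: round(100 * abandoned_ruby[ecosystem] / active_in_ruby[ecosystem])
    for ecosystem in active_in_ruby
}

for ecosystem, pct in percentage.items():
    print(f"{ecosystem}: {pct}%")
```

Under this reading the rounded shares match the slide: 77%, 76%, 70% and 65%.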
RQ4 How do changes in the contributor team impact the technical artefacts?
Diversity index of Leavers (relative entropy)
Increased Leaver specialization throughout time:
Large contribution to important projects
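The slide names the diversity index but gives no formula. A minimal sketch, assuming "relative entropy" here means Shannon entropy normalised by its maximum (so the value lies between 0 and 1) and that it is computed over a Leaver's activity counts per project; both points are assumptions, and the term can also denote Kullback-Leibler divergence in other contexts.

```python
import math
from typing import Sequence


def relative_entropy(counts: Sequence[float]) -> float:
    """Shannon entropy of `counts`, divided by log(number of categories).

    Returns a value near 1 for evenly spread activity (high diversity)
    and near 0 for activity concentrated in one category (specialization).
    """
    total = sum(counts)
    if total <= 0 or len(counts) < 2:
        return 0.0
    # Zero counts contribute nothing to the entropy and are skipped.
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return entropy / math.log(len(counts))


# Hypothetical commit counts of two contributors over four projects.
generalist = relative_entropy([5, 5, 5, 5])
specialist = relative_entropy([20, 0, 0, 0])
```

With this definition, a decreasing index over time would correspond to the increased Leaver specialization reported on the slide.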
Threats to validity
• Multiple user accounts
  • Less common within the same GitHub repository
  • Identity merging [3]
• Programming language identification
• GitHub dataset
  • Filters to eliminate noise
  • Activity outside GitHub
  • Merged pull requests appear as non-merged in GitHub
  • Not all activity results from registered users
[3] M. Goeminne and T. Mens, “A comparison of identity merge algorithms for software repositories,” Science of Computer Programming, vol. 78, no. 8, pages 971–986, 2013
Conclusion
Ruby software ecosystem in GitHub
• Investigate the permanent modifications of the socio-technical network
• Impact of permanent changes in contributor team on the technical artefacts
• Preliminary evidence about contributor migration across different ecosystems (Ruby → JavaScript)
Identify risks in project/ecosystem evolution due to important team changes
Ongoing/Future Work
Contributor migration across different ecosystems
Advanced socio-technical analyses
• Socio-technical congruence
• Socio-technical debt
• Their effect on the ecosystem evolution
Thank you!