Software Quality Assessment (SQA) Profiles
Rule-Based Activity Profiles for Continuous Integration Environments
Department of Informatics
Martin Brandtner, Sebastian Müller, Philipp Leitner, and Harald C. Gall
University of Zurich, Switzerland
{brandtner, smueller, leitner, gall}@ifi.uzh.ch
IEEE SANER 2015, Montréal, Canada
Continuous Integration Environment
Issue tracker, version control system, CI platform, status dashboard
Dave - PMC Member
Ann - PMC Member
Activity data
Data recommendation
Stakeholder profiles
Stakeholder Roles ≠ Stakeholder Profiles
Stakeholder Roles
• are defined by the project management
• may not reflect the actual field of activity
• do not change during a project
Stakeholder Profiles
• can change over time
• are based on actual activity data
Research Question 1
Can activity data mined from the version
control system and issue tracking platform
be used for the extraction of profiles
within the Project Management Committee
role?
Research Question 2
What profiles of PMC members can be
extracted from the activity data, and how
can these profiles be described in a
rule-based model?
Approach
1) Extraction of profiles from 20 projects by
clustering
2) Definition of a rule-based model to describe
the extracted profiles (SQA-Profiles)
3) Evaluation of the rule-based profile model
Profile Extraction by Clustering
20 Apache projects → VCS data and issue data → VCS and issue data per stakeholder → Clustering → 4 profiles
Activity data:
# Commits
# Merges
# Issue state changes
# Issue comments
# Issue assignee changes
# Issue priority changes
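The per-stakeholder activity data listed above could be assembled roughly as follows. This is a minimal sketch, not the authors' tooling: the `(stakeholder, attribute)` event format, the attribute identifiers, and the function name `aggregate_activity` are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# The six activity attributes listed on the slide (identifier names are hypothetical).
ATTRIBUTES = (
    "commits",
    "merges",
    "issue_state_changes",
    "issue_comments",
    "issue_assignee_changes",
    "issue_priority_changes",
)


def aggregate_activity(events):
    """Count events per stakeholder and attribute.

    `events` is an iterable of (stakeholder, attribute) pairs; this input
    format is an assumption, not something the slides specify.
    """
    counts = defaultdict(Counter)
    for stakeholder, attribute in events:
        if attribute not in ATTRIBUTES:
            raise ValueError(f"unknown attribute: {attribute}")
        counts[stakeholder][attribute] += 1
    # Emit dense vectors so every stakeholder carries all six attributes.
    return {
        person: {attr: counter[attr] for attr in ATTRIBUTES}
        for person, counter in counts.items()
    }


events = [
    ("ann", "commits"),
    ("ann", "merges"),
    ("ann", "commits"),
    ("dave", "issue_comments"),
]
activity = aggregate_activity(events)
```

Each resulting vector of six counts is the kind of input a clustering step would consume; the slides do not name the clustering algorithm, so none is shown here.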
Profile Extraction by Rule Inference
Goal:
Rule-based and project-independent description of
activity profiles
Approach:
Attributes: commits, merges, issue state changes, etc.
Nominal scale for each attribute and project
Profiles defined based on attribute values
(e.g. commits: MEDIUM, merges: HIGH => Profile A)
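One way to picture the per-project nominal scale is sketched below. The slides do not say how raw counts are mapped to levels, so the cut-offs at one third and two thirds of the project maximum are purely hypothetical, as are the function names `to_nominal` and `discretize_project`.

```python
def to_nominal(value, project_max):
    """Map a raw count to LOW/MEDIUM/HIGH relative to the project maximum.

    The one-third and two-thirds cut-offs are assumptions for illustration.
    """
    if project_max == 0:
        return "LOW"
    ratio = value / project_max
    if ratio >= 2 / 3:
        return "HIGH"
    if ratio >= 1 / 3:
        return "MEDIUM"
    return "LOW"


def discretize_project(activity):
    """Convert {stakeholder: {attribute: count}} into nominal levels.

    Levels are computed per attribute and per project, so the same raw
    count can map to different levels in different projects.
    """
    attributes = {attr for counts in activity.values() for attr in counts}
    maxima = {
        attr: max(counts.get(attr, 0) for counts in activity.values())
        for attr in attributes
    }
    return {
        person: {
            attr: to_nominal(counts.get(attr, 0), maxima[attr])
            for attr in attributes
        }
        for person, counts in activity.items()
    }


project = {
    "ann": {"commits": 90, "merges": 30},
    "dave": {"commits": 40, "merges": 2},
}
levels = discretize_project(project)
```

Scaling against each project's own maximum is what makes the resulting rules project-independent: a rule refers to HIGH or MEDIUM, never to an absolute count.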
Extracted SQA-Profiles – Integrator
Integration of source code contributions
High merging activity
At least one other attribute with medium activity
HH = At least one attribute with high activity
HM = At least one attribute with medium activity
SH = Set of all stakeholders
A = Set of all attributes
Extracted SQA-Profiles
Bandleader
Keeps the show running
High activity in each attribute
Gatekeeper
Decides when the status of an issue changes
High status change activity and moderate activity in assignee changes or commits
Onlooker
Limited contributions (VCS and issue tracking)
At least one attribute with medium activity and at least two attributes with low activity
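The four profile descriptions can be read as rules over nominal attribute levels. The sketch below is one possible encoding under stated assumptions: "medium" and "moderate" are matched as exactly MEDIUM, the rules are checked in the order Bandleader, Integrator, Gatekeeper, Onlooker, and the attribute identifiers are hypothetical. The slides specify neither the precedence nor the exact matching semantics.

```python
def classify(levels):
    """Return the first matching profile name, or None if no rule applies.

    `levels` maps attribute name -> "LOW" | "MEDIUM" | "HIGH".
    Rule order and exact-MEDIUM matching are assumptions of this sketch.
    """
    values = list(levels.values())

    # Bandleader: high activity in each attribute.
    if values and all(v == "HIGH" for v in values):
        return "Bandleader"

    # Integrator: high merging activity plus at least one other
    # attribute with medium activity.
    other_medium = any(
        v == "MEDIUM" for attr, v in levels.items() if attr != "merges"
    )
    if levels.get("merges") == "HIGH" and other_medium:
        return "Integrator"

    # Gatekeeper: high status change activity and moderate activity
    # in assignee changes or commits.
    if levels.get("issue_state_changes") == "HIGH" and (
        levels.get("issue_assignee_changes") == "MEDIUM"
        or levels.get("commits") == "MEDIUM"
    ):
        return "Gatekeeper"

    # Onlooker: at least one medium attribute and at least two low ones.
    if values.count("MEDIUM") >= 1 and values.count("LOW") >= 2:
        return "Onlooker"

    return None


def make_levels(default, **overrides):
    """Build a six-attribute level vector for the examples below."""
    names = (
        "commits", "merges", "issue_state_changes",
        "issue_comments", "issue_assignee_changes", "issue_priority_changes",
    )
    levels = {name: default for name in names}
    levels.update(overrides)
    return levels


bandleader = classify(make_levels("HIGH"))
integrator = classify(make_levels("LOW", merges="HIGH", commits="MEDIUM"))
gatekeeper = classify(
    make_levels("LOW", issue_state_changes="HIGH", issue_assignee_changes="MEDIUM")
)
onlooker = classify(make_levels("LOW", commits="MEDIUM"))
unmatched = classify(make_levels("LOW"))
```

A stakeholder who matches no rule is returned as `None` here; how such stakeholders are treated in the original model is not stated on the slides.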
Evaluation
Rule-based profiles vs. baseline
Evaluation – Results
Profile      TP   FP   Total   Precision   Recall
Bandleader    3    1       3         75%     100%
Integrator    9    1       9         90%     100%
Gatekeeper    9    5      12         64%      75%
Onlooker     80    2     106         98%      75%
Total       101    9     130         92%      78%
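The percentages in the table follow from the counts if precision is taken as TP / (TP + FP) and recall as TP / Total. Reading "Total" as the number of stakeholders assigned to that profile in the baseline is an assumption, though it is consistent with every row. A short sketch that recomputes the values:

```python
# (TP, FP, Total) per profile, copied from the results table.
RESULTS = {
    "Bandleader": (3, 1, 3),
    "Integrator": (9, 1, 9),
    "Gatekeeper": (9, 5, 12),
    "Onlooker": (80, 2, 106),
}


def precision(tp, fp):
    """Share of rule-based assignments that agree with the baseline."""
    return tp / (tp + fp)


def recall(tp, total):
    """Share of baseline members of a profile that the rules recover."""
    return tp / total


def summarize(results):
    """Return {profile: (precision, recall)} plus an aggregated 'Total' row."""
    rows = {
        name: (precision(tp, fp), recall(tp, total))
        for name, (tp, fp, total) in results.items()
    }
    tp_sum = sum(tp for tp, _, _ in results.values())
    fp_sum = sum(fp for _, fp, _ in results.values())
    total_sum = sum(total for _, _, total in results.values())
    rows["Total"] = (precision(tp_sum, fp_sum), recall(tp_sum, total_sum))
    return rows


summary = summarize(RESULTS)
for name, (p, r) in summary.items():
    print(f"{name:<11}{p:>6.0%}{r:>6.0%}")
```

Rounded to whole percentages, the recomputed values match the table, including the aggregated row (101 true positives, 9 false positives, 130 stakeholders).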
Rule-based profiles overlap strongly with
machine-learning-based clusters
Evaluation – Results
Profile      PMC Member   Non-PMC Member
Bandleader            4                0
Integrator            9                4
Gatekeeper           11                9
Onlooker             20               20
Even non-PMC members perform PMC
member activities
Summary and Outlook
RQ1: Can activity data be mined to extract profiles?
RQ2: What kind of profiles can be described?
http://goo.gl/Jk01KR