Automation in the Bug Flow - Machine Learning for Triaging and Tracing
-
Upload
markus-borg -
Category
Science
-
view
286 -
download
1
description
Transcript of Automation in the Bug Flow - Machine Learning for Triaging and Tracing
Automation in the Bug Flow- MACHINE LEARNING FOR TRIAGING AND TRACING
MARKUS BORG, LUND UNIVERSITY
Bug tracker
The number of incoming bug reports can be overwhelming…
Bug trackerMachine Learning
Use machine learning
to provide actionable
recommendations
based on previous
patterns!
- Final year PhD student- MSc CS and engineering- ABB software developer (3 years)
process automationcompilers and editors
Per Runeson
The ChallengeThe SolutionThe Evaluation
Reqts. DB
Issue RepoCode Repo
Test DB
Doc. DB.
Developers in large projects must navigate complex information landscapes that continously change
One bug is not much of a problem…
Bug tracker
But a large simultaneous inflow of bug reports can make the best bug tracking system sweat!
Making the wrong prioritizations might result in bugs on your customers
In a safety-critical context
1. Issue Assignment
2. Change Impact Analysis
This talk addresses two tasks involved in issue management:
By safety-critical we refer to document-driven development with a rigid process…
… prior to changing source code, a formal change impact analysis has to be conducted and reported according to an approved template.
We want to increase the confidence at commit time even further!
The ChallengeThe SolutionThe Evaluation
We aim to harness the intrinsic navigational value of bug reports
Bug trackerMachine Learning
We leverage on the number of bug reports in the projects…
Bug tracker
Machine Learning
The more bugs, the better the machine learning gets!
Leif Jonsson
Automated Issue Assignment
• Goal:
Useful tool deployable with minimum configuration effort
• Approach:
Train classifiers on historical bug reports
Combine them using state-of-the-art ensemble learning
Joint work with
Ericsson Research
How to Represent a Bug Report?
• Component
• Severity
• System Version
• Submit Date
• Submitter Location
etc.
• … And the text! Title and description.
Automated Change Impact Analysis
• Goal:
Intuitive tool to jump start analyses based on historical data
Faster + more accurate analyses compared to fully manual work
• Approach
Step 1: Mine the history
Step 2: Recommend impact for new bug fix
Present recommendations Amazon-style:”Other developers working on this class also modified/tested…”
Construct network of previously reported impact
Index textual data with
Calculate centrality measures
Automated Impact Analysis
• Approach part 2: Recommend impact
Find similar bugs using Apache Lucene
Follow links to identify candidate impact set
Design Doc. X.Y
Req. X.Y
Test case UTC56
Req. Z.Y
Design Doc. X.Y
Automated Impact Analysis
• Approach part 2: Recommend impact
Follow links to identify candidate impact set
Use centrality measures to rank candidate impact
Find similar bugs using Apache Lucene
1. Requirement X.Y
2. Design Document X.Y
3. Test case UTC56
4. Design Document X.Y
5. Requirement Z.Y
Screen shot of prototype tool ImpRec
The ChallengeThe SolutionThe Evaluation
Experiment: Issue Assignment
• Five large datasets from two companies
– Telecom and Automation
– 50.000+ issue reports
• 10-fold cross-validation and simulation (”replaying history”)
Experiment: Issue Assignment
• Prediction accuracy in line with human activity
• Numbers depict number of teams in the projects
67
1764
2836
Experiment: Issue Assignment
• Warning! Some systems need fresh training data
Experiment: Issue Assignment
But the decay is not always exponential…
Experiment: Change Impact Analysis
• Experiment with historical impact
– Training set: 8 years, Test set: 2 years
ImpRec presents 30% of past impact among the top-5 recommendations(40%@10, 50%@20)
But what does that mean? User study needed.
Case Study: Change Impact Analysis
• Industrial case study
– Two units of analysis: Team Sweden & Team India
– Tool deployed in March 2014 & August 2014
– Interviews and user log files
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 200
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
Click Distribution, top-20 hits
IA Google
Initial result:Developers’ interaction with impact recommendations similar to Google searches
Case Study: Change Impact Analysis
• ”Finding these past bugs was exactly what I was looking for actually”
- Developer, India
• ”Directing attention to potential side-effects is very important”
- Project manager, Sweden
Some encouraging
comments from
developers…
Conclusions
Automated Issue Assignment
• Automated assignment as accurate as current manual work
– But instantaneous!
• At least 2.000 bug reports needed for training
• Continously monitor the accuracy
Favourable Unfavourable
Static team structure
Dynamic team structure
Maintenance project
New development
Automated Change Impact Analysis
• Recommendation system provides a useful starting point
• Recommending related issues is a popular feature
– Study previous issue resolutions
– Compare with previous impact analyses
• Recommendation recall for impact: 30-55%
– Reuse previous impact to jump-start analysis
– Provide warning if probable impact is missing
Embrace your bugs!
Machine learning canguide maintenance work
Much potential:- Severity prediction- Resolution times- ”Noise” filtering
PHOTO CREDITS
Brown stink bug- Marlin E. RiceIsopods- Omoshiro Aquarium- Flickr: littoraria, codaCubicles- Flickr: templetonelliot, ifl, danburgmurmurEightball girl- Flickr: mobilestreetlifeEvaluate- Flickr: theideadeskMy wife- My wife
Thank you!
[email protected]/markus_borg @_Troddel_