Understanding the Rationale for Updating a Function's Comment
Haroon Malik, Istehad Chowdhury, Hsiao-Ming Tsou, Zhen Ming Jiang, Ahmed E. Hassan
School of Computing, Queen’s University, Canada
Documentation is vital for the successful evolution of a software system
Why understand the rationale for updating a comment?
Because…
• Reduce the effort needed to understand code
• Reduce maintenance cost
• Prevent bugs
• Increase reliability
Likelihood of updating a comment

Function 1.
    function incrementValue($val) {
        return $val + 1;
    }

Function 2.
    function processInput($val) {
        // loop 11 times.
        for ($i = 0; $i < 10; $i++) {
            // loop executes for the upper bound of J
            for ($j = 0; $j < 10; $j++) {
                $val = ($val | $i) << 2;
                $val = $val & $j << 2;
            }
        }
        return $val;
    }

Which one?
It Depends!!
Study Dimensions
• Modified function characteristics (8 attributes)
  – Long vs. short functions
  – Long vs. short function names
  – Well-documented functions
  – Complex vs. simple functions (number of control statements)
• Change characteristics (8 attributes)
  – Complex vs. simple change
  – Large vs. small change
• Time and code ownership characteristics (9 attributes)
  – Do habits change over time? Weekdays vs. weekends
  – Same developer that changed it last time
Comment update: yes or no?
Modeled as a classification problem
Measuring Performance

| True class | Classified as YES | Classified as NO |
| --- | --- | --- |
| YES | a | b |
| NO | c | d |

We measure the overall misclassification rate = (b + c) / (a + b + c + d)
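The rate above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the cell counts are hypothetical and do not come from the study.

```python
def misclassification_rate(a: int, b: int, c: int, d: int) -> float:
    """Share of all cases that fall in the off-diagonal cells b and c."""
    total = a + b + c + d
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (b + c) / total


# Hypothetical counts, chosen only to illustrate the arithmetic.
rate = misclassification_rate(a=40, b=10, c=10, d=40)
print(rate)  # 0.2
```

Here b counts true YES cases classified as NO, and c counts true NO cases classified as YES, matching the table layout.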
Need
• Explainable model
• Resistant to noise
• Correlated attributes
• Minimum configuration
Random Forests
Project comment update history → Data set → Random sample → Random trees (Yes, No, No, No) → Vote → Prediction
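The sample, tree, and vote flow can be sketched with the Python standard library. This is a simplified stand-in, not the setup used in the study: each "tree" here is a one-level stump on a single randomly chosen attribute, whereas real random forests grow full trees. The attribute meanings and all data values are hypothetical.

```python
import random
from collections import Counter


def train_stump(rows, labels, rng):
    """Fit a one-level tree on one randomly chosen attribute."""
    attr = rng.randrange(len(rows[0]))
    best = None
    for threshold in sorted({row[attr] for row in rows}):
        left = [lab for row, lab in zip(rows, labels) if row[attr] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[attr] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(lab != left_label for lab in left)
        errors += sum(lab != right_label for lab in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda row: left_label if row[attr] <= threshold else right_label


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on its own random sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(train_stump(sample_rows, sample_labels, rng))
    return forest


def predict(forest, row):
    """Every tree votes; the majority label is the prediction."""
    votes = Counter(tree(row) for tree in forest)
    return votes.most_common(1)[0][0]


# Hypothetical attributes: (changed dependencies, total comments).
rows = [(0, 1), (1, 0), (1, 2), (2, 1), (8, 9), (9, 7), (7, 8), (9, 9)]
labels = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]
forest = train_forest(rows, labels)
print(predict(forest, (9, 9)), predict(forest, (0, 0)))
```

Because each tree sees a different random sample, individual trees can disagree; the vote is what smooths out those differences.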
Finding Top Attributes
• Sensitivity analysis for a particular attribute
• Randomly change the value in all samples
• Re-classify and compare performance
  – Drop in performance is relative to the importance of the attribute
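The steps above can be sketched as follows. Shuffling an attribute's column is one assumed way to "randomly change the value in all samples"; the slides do not say which method was used. The classifier, attribute meanings, and data are hypothetical.

```python
import random


def error_rate(classify, rows, labels):
    """Fraction of rows whose predicted label differs from the true label."""
    wrong = sum(classify(row) != lab for row, lab in zip(rows, labels))
    return wrong / len(rows)


def attribute_importance(classify, rows, labels, attr, repeats=20, seed=0):
    """Average rise in error rate after shuffling one attribute's column."""
    rng = random.Random(seed)
    baseline = error_rate(classify, rows, labels)
    rise = 0.0
    for _ in range(repeats):
        column = [row[attr] for row in rows]
        rng.shuffle(column)
        shuffled = [
            row[:attr] + (value,) + row[attr + 1:]
            for row, value in zip(rows, column)
        ]
        rise += error_rate(classify, shuffled, labels) - baseline
    return rise / repeats


def classify(row):
    """Hypothetical classifier that looks only at attribute 0."""
    return "yes" if row[0] >= 5 else "no"


# Hypothetical attributes: (changed dependencies, total comments).
rows = [(0, 1), (1, 0), (1, 2), (2, 1), (8, 9), (9, 7), (7, 8), (9, 9)]
labels = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]
print(attribute_importance(classify, rows, labels, attr=0))
print(attribute_importance(classify, rows, labels, attr=1))
```

An attribute the classifier never consults shows no drop in performance when shuffled, while one it relies on shows a clear drop, which is the ranking signal the slide describes.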
Case Study
• Used 4 open source projects with over 39 years of development:
  – PostgreSQL, FreeBSD, Gcluster and GCC
• Conducted 5 experiments
  – 1 for each dimension
  – 1 for all attributes of each project
  – 1 for total combined attributes of all projects
Exp. #1 Characteristics of changed function
• Intuition
  – Modifications to complex functions are trickier and more likely to introduce integration bugs
• Findings
  – Likelihood of comment update is higher in functions
    • With a large number of comments
    • That are complex
Exp. #2 Characteristics of the change
• Intuition
  – More extensive and complex changes will increase the probability that a comment will get updated
• Findings
  – Likelihood of comment update is higher for changes
    • That are bug fixes
    • With a large number of changed dependencies
    • Which increase the complexity of a function (control statements)
Exp. #3 Change time and code-ownership
• Intuition
  – To see if time has any impact on a developer's tendency to update a comment
  – To highlight the relation of a function with its developer
• Findings
  – Likelihood of comment update
    • Depends on weekday: developers are reluctant to update comments on certain weekdays
    • Does not depend on developer: non-creators of a function will update comments too
Exp. #4 All attributes
• Intuition
  – To find the general trend across all attributes instead of a specific trend per dimension
• Findings
  – The top attributes are consistent across projects
  – The top attributes are from the changed function and change characteristics dimensions
    • Number of changed dependencies
    • Percentage of changed dependencies
    • Total number of comments
Exp. #5 All Projects
• Intuition
  – Determine the most influential attributes across all projects
• Added an extra attribute “Project Name”
• Findings
  – Project name did not bubble up as an important attribute
How well did we do?
Numbers Speak
• Performance of the classifier improves when data from all projects is combined. Overall misclassification rate ~ 20%
Conclusion
Random Forests
[Diagram: the training set is sampled into n sets of random cases; a classification algorithm is trained on each, giving n classifiers; each classifier labels the test set (L1, L2, L3, …, Ln); the n labels are combined by vote into the final label (Lvote).]