Supporting Software Evolution Using Adaptive Change Propagation


Page 1: Supporting Software Evolution Using Adaptive Change Propagation

Supporting Software Evolution Using Adaptive Change Propagation Heuristics

Haroon Malik, Ahmed E. Hassan
School of Computing, Queen’s University, Canada

Page 2: Supporting Software Evolution Using Adaptive Change Propagation

What is Change Propagation?

It is the process of propagating code changes to other entities in a software system.

It ensures the consistency of assumptions in the system after changing an entity.

Mis-propagating a change is likely to introduce bugs.

Page 3: Supporting Software Evolution Using Adaptive Change Propagation

The Change Propagation Process

[Flowchart: a new requirement or bug fix (New Req., Bug Fix) starts the process. Steps: Determine Initial Entity To Change, Change Entity, Determine Other Entities To Change, Consult Guru for Advice. Edge labels: For Each Entity, Suggested Entity, No More Changes.]

“How does a change in one source code entity propagate to other entities?”
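
The loop in this flowchart can be sketched as a simple worklist. This is a minimal illustration and not the authors' tooling: `propagate_change`, the suggestion sources and `needs_change` are hypothetical names, and treating the guru as a second suggestion source consulted after the first is an assumed reading of the diagram.

```python
from typing import Callable, Iterable, List

# A suggestion source maps a changed entity to candidate entities.
Suggester = Callable[[str], Iterable[str]]


def propagate_change(
    initial: str,
    suggesters: List[Suggester],
    needs_change: Callable[[str], bool],
) -> List[str]:
    """Return the entities changed while propagating a change from `initial`.

    `suggesters` are consulted in order for every changed entity (for
    example a heuristic first, then a guru). `needs_change` stands in for
    the developer's decision on whether a suggested entity must change.
    """
    changed: List[str] = []
    seen = {initial}
    worklist = [initial]
    while worklist:  # loop until there are no more changes
        entity = worklist.pop(0)
        changed.append(entity)  # "Change Entity"
        for suggest in suggesters:  # "Determine Other Entities To Change"
            for candidate in suggest(entity):
                if candidate in seen:
                    continue
                seen.add(candidate)
                if needs_change(candidate):
                    worklist.append(candidate)
    return changed


# Made-up example data: a heuristic and a guru as lookup tables.
heuristic = {"A": ["B", "C"], "B": ["D"]}
guru = {"C": ["E"]}
result = propagate_change(
    "A",
    [lambda e: heuristic.get(e, []), lambda e: guru.get(e, [])],
    needs_change=lambda e: e != "D",  # D is a wasted suggestion
)
print(result)
```

In this toy run the suggestion of D costs the developer a look but leads to no change, which is the kind of wasted effort a more precise heuristic would avoid.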

Page 4: Supporting Software Evolution Using Adaptive Change Propagation

Consider a change set with A, B and C changing together

[Figure: entities A, B and C]

Page 5: Supporting Software Evolution Using Adaptive Change Propagation

Consider a change set with A, B and C changing together

[Figure: entities A, B, C, D and E, with the suggestions of the HIST heuristic and of the CUD heuristic (static dependency) marked as either HELPFUL or Wasted Developer time.]

Page 6: Supporting Software Evolution Using Adaptive Change Propagation

Consider a change set with A, B and C changing together

[Figure: entities A, B, C, D and E, with the suggestions of the HIST heuristic and of the CUD heuristic (static dependency) marked as either HELPFUL or Wasted Developer time.]

Which heuristic should we pick?

We should track the performance of a pool of heuristics over time for each entity

Page 7: Supporting Software Evolution Using Adaptive Change Propagation

Consider a change set with A, B and C changing together

[Figure: entities A, B, C, D and E, with the suggestions of the HIST heuristic and of the CUD heuristic (static dependency) marked as either HELPFUL or Wasted Developer time.]

Best Heuristic Table (BHT): tracks and updates

Page 8: Supporting Software Evolution Using Adaptive Change Propagation

Consider a change set with A, B and C changing together

[Figure: entities A, B, C, D and E with the suggestions of the HIST heuristic and of the CUD heuristic (static dependency), plus a Time axis labelled with A, E and D.]

HIST or CUD?

BHT says HIST always works well with A [A-Freq]. We use HIST.

BHT might also say HIST worked well with A last time [A-Rec].

Page 9: Supporting Software Evolution Using Adaptive Change Propagation

Consider a change set with A, B and D changing together

[Figure: entities E, D and A]

Page 10: Supporting Software Evolution Using Adaptive Change Propagation

Consider a change set with A, B and D changing together

[Figure: entities E, D, A and B]

Page 11: Supporting Software Evolution Using Adaptive Change Propagation

Consider a change set with A, B and D changing together

[Figure: entities E, D, A, B, X and Y]

Precision = 1/5 = 20%

Recall = 1/1 = 100%

We want high Precision and high Recall
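
The two ratios on this slide follow from comparing the suggested entities with the entities that actually had to change. A minimal sketch, assuming set-based definitions; `precision_recall` is a hypothetical helper, the entity names in the example are made up to reproduce the 1/5 and 1/1 ratios, and returning 0.0 for an empty set is a convention chosen here, not one stated in the slides.

```python
from typing import Iterable, Tuple


def precision_recall(suggested: Iterable[str], actual: Iterable[str]) -> Tuple[float, float]:
    """Precision and recall of a set of suggestions.

    precision = correct suggestions / all suggestions
    recall    = correct suggestions / entities that actually changed
    """
    suggested_set, actual_set = set(suggested), set(actual)
    hits = len(suggested_set & actual_set)
    # Assumption: an empty denominator yields 0.0.
    precision = hits / len(suggested_set) if suggested_set else 0.0
    recall = hits / len(actual_set) if actual_set else 0.0
    return precision, recall


# Five suggestions, one of them correct, and one entity that needed changing.
p, r = precision_recall(["B", "E", "X", "Y", "Z"], ["B"])
print(f"Precision = {p:.0%}, Recall = {r:.0%}")
```

High recall with low precision, as here, means the developer finds what must change but has to inspect four irrelevant entities to get there.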

Page 12: Supporting Software Evolution Using Adaptive Change Propagation

Change Propagation Challenge

Mostly manual and time-consuming process
Requires dependency on others:
  Knowledge of senior developers, who are usually too busy to guide every change
  Experience of a guru, who rarely exists in large projects
  Communication among different teams, which is itself a challenge in large projects
  Use of documentation and previous test suites, which are rarely up to date

Page 13: Supporting Software Evolution Using Adaptive Change Propagation

Shortcomings of Current Practices

Explore a single dimension
  HIST: Given a changed entity A, a HIST heuristic suggests all entities that changed often with A in the past.
  CUD: Given a modified entity A, a CUD heuristic returns all entities that depend on A or that A depends on.
  FILE: Given a modified entity A, a FILE heuristic returns all entities in the same file as A.
Static heuristics
  Do not adjust over time
  Do not adapt to the particular changed entity
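
The three basic heuristics can be sketched over simple in-memory data. This is an illustration under stated assumptions and not the authors' implementation: the data shapes (`history` as a list of change sets, `depends_on` as a dependency map, `file_of` as an entity-to-file map) and the `min_count` threshold standing in for "changed often" are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def hist(entity: str, history: List[Set[str]], min_count: int = 1) -> Set[str]:
    """HIST: entities that co-changed with `entity` at least `min_count` times."""
    counts: Counter = Counter()
    for change_set in history:
        if entity in change_set:
            counts.update(e for e in change_set if e != entity)
    return {e for e, n in counts.items() if n >= min_count}


def cud(entity: str, depends_on: Dict[str, Set[str]]) -> Set[str]:
    """CUD: entities that `entity` depends on, plus entities depending on it."""
    uses = set(depends_on.get(entity, ()))
    used_by = {e for e, targets in depends_on.items() if entity in targets}
    return (uses | used_by) - {entity}


def file_heuristic(entity: str, file_of: Dict[str, str]) -> Set[str]:
    """FILE: all other entities defined in the same file as `entity`."""
    path = file_of.get(entity)
    if path is None:
        return set()
    return {e for e, f in file_of.items() if f == path and e != entity}


# Made-up project data for illustration.
history = [{"A", "B", "C"}, {"A", "B"}, {"D", "E"}]
depends_on = {"A": {"D"}, "E": {"A"}, "B": {"C"}}
file_of = {"A": "f.c", "D": "f.c", "B": "g.c"}

print(hist("A", history))            # co-change history
print(cud("A", depends_on))          # static dependencies
print(file_heuristic("A", file_of))  # same-file entities
```

Each function looks at one source of evidence only and returns the same answer regardless of how well it has worked for that entity before, which is the single-dimension, static behaviour this slide criticises.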

Page 14: Supporting Software Evolution Using Adaptive Change Propagation

Proposed Approach

Adaptive co-change meta-heuristics:
  Track the best-performing heuristics for each entity in a Best Heuristic Table (BHT)
  Update the table as the project evolves

Page 15: Supporting Software Evolution Using Adaptive Change Propagation

BHT Update

BHT holds the best-performing heuristics
  A-Recency: for the last change of an entity
  A-Frequency: over all changes of an entity

By continuously updating the BHT, we ensure that we are always using the best-performing heuristic for an entity
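
One way to picture the table is as a small per-entity record that is refreshed after every change. The sketch below is an assumed reading of the slides, not the authors' data structure: the class and method names are hypothetical, "best over all changes" is interpreted as the heuristic that was best most often, ties are broken alphabetically, and falling back to HIST for unseen entities is an arbitrary default.

```python
from collections import Counter, defaultdict
from typing import Dict


class BestHeuristicTable:
    """Tracks, per entity, which heuristic performed best (hypothetical sketch)."""

    def __init__(self, default: str = "HIST") -> None:
        self.default = default  # assumed fallback for entities never seen
        self.last_best: Dict[str, str] = {}
        self.best_counts: Dict[str, Counter] = defaultdict(Counter)

    def update(self, entity: str, scores: Dict[str, float]) -> None:
        """Record the outcome of one change of `entity`.

        `scores` maps each heuristic name to its score (for example its
        F-measure) on this change.
        """
        if not scores:
            return
        best = max(sorted(scores), key=scores.get)  # alphabetical tie-break
        self.last_best[entity] = best
        self.best_counts[entity][best] += 1

    def pick_recent(self, entity: str) -> str:
        """A-Rec: the heuristic that was best for the last change."""
        return self.last_best.get(entity, self.default)

    def pick_frequent(self, entity: str) -> str:
        """A-Freq: the heuristic that was best most often over all changes."""
        counts = self.best_counts.get(entity)
        if not counts:
            return self.default
        return max(sorted(counts), key=counts.get)


bht = BestHeuristicTable()
bht.update("A", {"HIST": 0.5, "CUD": 0.1})
bht.update("A", {"HIST": 0.4, "CUD": 0.2})
bht.update("A", {"HIST": 0.0, "CUD": 0.3})
print(bht.pick_frequent("A"), bht.pick_recent("A"))
```

After these three changes the two policies disagree: HIST was best most often, but CUD was best last time.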

Page 16: Supporting Software Evolution Using Adaptive Change Propagation

Empirical Study

Used change sets from four open source projects with over 39 years of development: PostgreSQL, FreeBSD, GCluster and GCC

Recovered change sets from source control repositories (CVS)

Replayed the history to measure the performance

Page 17: Supporting Software Evolution Using Adaptive Change Propagation

Performance Measures of Heuristics

Project      HIST         CUD          FILE         A-Freq       A-Rec
             Rec   Prec   Rec   Prec   Rec   Prec   Rec   Prec   Rec   Prec
PostgreSQL   0.69  0.14   0.44  0.02   0.73  0.13   0.45  0.25   0.40  0.30
FreeBSD      0.70  0.12   0.40  0.02   0.76  0.11   0.41  0.27   0.41  0.30
GCluster     0.52  0.18   0.38  0.09   0.70  0.14   0.39  0.22   0.35  0.28
GCC          0.78  0.10   0.43  0.02   0.80  0.12   0.51  0.21   0.47  0.25
All          0.67  0.13   0.41  0.04   0.74  0.12   0.44  0.23   0.40  0.28
F-measure    0.23         0.06         0.21         0.30         0.33

Recall: Adaptive heuristics are similar to traditional heuristics
Precision: Adaptive heuristics outperform traditional heuristics
F-measure: Adaptive heuristics outperform traditional heuristics (23% better than the best heuristic, HIST)
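
The slides do not say how the F-measure row is computed. A minimal sketch, assuming the standard balanced F-measure (the harmonic mean of precision and recall); under that assumption the "All" values for the two adaptive heuristics reproduce the reported figures, though other columns may have been aggregated differently.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# "All" row of the table: A-Freq (Rec 0.44, Prec 0.23), A-Rec (Rec 0.40, Prec 0.28).
a_freq = round(f_measure(0.23, 0.44), 2)
a_rec = round(f_measure(0.28, 0.40), 2)
print(a_freq, a_rec)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, the adaptive heuristics' gain in precision outweighs their loss in recall.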

Page 18: Supporting Software Evolution Using Adaptive Change Propagation

Performance Characteristics of Adaptive Heuristics

To better understand our adaptive heuristics, we examined their performance along three directions:
  Performance over time
  BHT composition over time
  BHT suggestions vs. optimal suggestions

Page 19: Supporting Software Evolution Using Adaptive Change Propagation

Performance Over Time

[Figure: two line charts over the years 1993 to 2005, one plotting Precision (axis 0 to 0.35) and one plotting Recall (axis 0 to 0.8), each with series HIST, CUD, FILE, A-Freq and A-Rec.]

For Precision: Adaptive heuristics outperform traditional heuristics.

For Recall: Adaptive heuristics do not perform as well as the traditional heuristics. Overall, A-Rec has lower recall than A-Freq for all projects.

Page 20: Supporting Software Evolution Using Adaptive Change Propagation

BHT Composition over Time

[Figure: two panels, A-Freq and A-Rec, each plotting BHT composition (%) against Day(s) from 0 to 4000 for HIST, FILE and CUD.]

BHT for FreeBSD
All projects show the same trends
At the start, history (HIST) is not widely used
As the project evolves, HIST is most effective

Page 21: Supporting Software Evolution Using Adaptive Change Propagation

BHT Suggestion vs. Optimal

Since we are replaying historical change sets, we can compare the adaptive heuristics against an optimal heuristic
The optimal heuristic suggests the best heuristic 100% of the time
Suggestion: number of correctly suggested heuristics: 76-85%
Performance: 63% of the optimal F-measure
  HIST, the best-performing basic heuristic, is 44% of optimal
  37% room for improvement

Page 22: Supporting Software Evolution Using Adaptive Change Propagation

Improving the Performance of Adaptive Heuristics

Improve HIST, in the hope of improving the adaptive heuristics, by employing advanced techniques

Two improved HIST heuristics [Hassan, Holt: 2005]:
  RECN(M): given a changed entity E, RECN(M) suggests all entities that changed with E in the past M months.
  FREQ(A): given a changed entity E, FREQ(A) suggests all entities that changed with E at least twice in the past and changed more than A% of the time with E.
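
The two refined history heuristics can be sketched over a dated change log. This is an illustration and not the implementation from [Hassan, Holt: 2005]: the function names and the `(date, change set)` history format are hypothetical, a month is approximated as 30 days, and "A% of the time with E" is read as a share of all changes to E.

```python
from collections import Counter
from datetime import date, timedelta
from typing import List, Set, Tuple

History = List[Tuple[date, Set[str]]]


def recn(entity: str, history: History, months: int, today: date) -> Set[str]:
    """RECN(M): entities that changed with `entity` in the past `months` months."""
    cutoff = today - timedelta(days=30 * months)  # assumption: 30-day months
    result: Set[str] = set()
    for when, change_set in history:
        if cutoff <= when <= today and entity in change_set:
            result |= set(change_set) - {entity}
    return result


def freq(entity: str, history: History, percent: float) -> Set[str]:
    """FREQ(A): co-changed at least twice and in more than `percent`% of changes."""
    co_changes: Counter = Counter()
    total = 0  # number of change sets that touched `entity`
    for _, change_set in history:
        if entity in change_set:
            total += 1
            co_changes.update(e for e in change_set if e != entity)
    return {e for e, n in co_changes.items() if n >= 2 and 100 * n / total > percent}


# Made-up change log for illustration.
log: History = [
    (date(2005, 1, 10), {"E", "F", "G"}),
    (date(2005, 6, 1), {"E", "F"}),
    (date(2005, 8, 20), {"E", "H"}),
]
print(recn("E", log, months=4, today=date(2005, 9, 1)))
print(freq("E", log, percent=60))
```

In this log F changed with E in two of E's three changes (about 67%), so it passes FREQ(60) but would not pass FREQ(70); G is dropped by RECN(4) because its only co-change is older than four months.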

Page 23: Supporting Software Evolution Using Adaptive Change Propagation

Improved HIST heuristics

Integrated RECN(4) and FREQ(60) into the heuristic pool used by the adaptive meta-heuristics

Achieved 0.73 to 0.78 for Recall and 0.64 for Precision
Nearly 30% increase in performance:
  A-FREQ is within 91% of the optimal heuristic
  A-REC is within 93% of the optimal heuristic

RECN(M)   F-Measure    FREQ(A)    F-Measure
RECN(2)   0.39         FREQ(50)   0.39
RECN(4)   0.40         FREQ(60)   0.44
RECN(6)   0.34         FREQ(70)   0.42
RECN(8)   0.28         FREQ(80)   0.39

Page 24: Supporting Software Evolution Using Adaptive Change Propagation

Findings

Adaptive heuristics can achieve:
  0.73 to 0.78 for Recall and 0.64 for Precision
  57% improvement over traditional heuristics

Performance differences are statistically significant based on a paired Wilcoxon signed-rank test at the 5% significance level (alpha = 0.05)

Page 25: Supporting Software Evolution Using Adaptive Change Propagation


Conclusion