An Empirical Study of the Effect of File Editing Patterns on Software Quality

Post on 05-Jul-2015

48 views 2 download

description

While some developers like to work on multiple code change requests, others might prefer to handle one change request at a time. This juggling of change requests and the large number of developers working in parallel often lead to files being edited as part of different change requests by one or several developers. Existing research has warned the community about the potential negative impacts of some file editing patterns on software quality. For example, when several developers concurrently edit a file as part of different change requests, they are likely to introduce bugs due to limited awareness of other changes. However, very few studies have provided quantitative evidence to support these claims. In this paper, we identify four file editing patterns. We perform an empirical study on three open source software systems to investigate the individual and the combined impact of the four patterns on software quality. We find that: (1) files that are edited concurrently by many developers have on average 2.46 times more future bugs than files that are not concurrently edited; (2) files edited in parallel with other files by the same developer have on average 1.67 times more future bugs than files individually edited; (3) files edited over an extended period (i.e., above the third quartile) of time have 2.28 times more future bugs than other files; and (4) files edited with long interruptions (i.e., above the third quartile) have 2.1 times more future bugs than other files. When more than one editing patterns are followed by one or many developers during the editing of a file, we observe that the number of future bugs in the file can be as high as 1.6 times the average number of future bugs in files edited following a single editing pattern. These results can be used by software development teams to warn developers about risky file editing patterns.

Transcript of An Empirical Study of the Effect of File Editing Patterns on Software Quality

An Empirical Study of the

Effect of File Editing Patterns

on Software Quality

Feng Zhang, Foutse Khomh, Ying Zou

and Ahmed E. Hassan

2

Do developers follow some file editing patterns?

Concurrent?

Extended?

Interrupted?

File Editing

3

Concurrent?

Extended?

Interrupted?

?

?

?

What’s the impact on software quality?

File Editing

4

Examples

5

1. Example on concurrent editing

BugzillaTaskEditorPage.java

Frank Steffen Pingel David Green

6

1. Concurrent Editing

Several developers edit the same file concurrently.

7

2. Example on parallel editing

AbstractTaskEditorPage.java

Frank

TasksUiPlugin.java PlanningPerspectiveFactory.java

8

2. Parallel Editing

Multiple files are edited in parallel by the same developer.

9

3. Example on extended editing

DiscoveryViewer.java

4.5 hours

Mylyn Project

1.25 hours

3.6 times

10

3. Extended Editing

Developers spend longer time editing a file. (threshold: third quantile)

11

4. Example on interrupted editing

TaskCompareDialog.java

338 hours

Mylyn Project

2 hours

164 times

12

4. Interrupted Editing

There is a longer idle period during the editing of a file.

(threshold: third quantile)

13

File Editing Patterns Summary

1. Concurrent Editing

2. Parallel Editing

3. Extended Editing

4. Interrupted Editing

14

How to evaluate the effect of

file editing patterns?

Time

Split Date Density of Future Bug

File Editing Patterns

15

Subject Systems

3,883 logs

2,722 bugs

606 bugs

524 bugs

793 logs

638 logs

Mylyn

Eclipse Platform

PDE

16

Description of Data

Split Date: 2011-01-01

119 developers

98 bugs

2,140 files

5,070 CVS logs

17

1. Recovering File Edit History

Data Processing

2. Recovering File Change History

3. Identifying File Editing Patterns

bug Id

density of bug

18

Research Questions

RQ1: Are there different file editing patterns?

RQ2: Do file editing patterns lead to more bugs?

RQ3: Do interactions among file editing patterns lead to more bugs?

19

RQ1: Are there different file editing patterns?

Occurrences of file editing patterns and their interactions

0

500

1000

1500

2000

2500

1. Concurrent

2. Parallel

3. Extended

4. Interrupted

File editing patterns do exist

Combinations of patterns

Single Pattern

No

Patt

ern

20

RQ2: Do concurrent editing pattern lead to more bugs?

Odds Ratio of two groups:

NC: no concurrent editing

C: concurrent editing

2.1 times more likely to experience a future bug if concurrent editing pattern happens

21

RQ2: Level of involvement in concurrent editing pattern

Definition of Level average number of developers involved in a concurrent editing of a file.

0

1

2

3

No Pattern ≤3rd quantile > 3rd quantile

2.4* 1.6 Statistically significant

Not statistically significant

22

RQ2: Do parallel editing patterns lead to more bugs?

Odds Ratio of two groups:

NP: no parallel editing

P: parallel editing

1.9 times more likely to experience a future bug if parallel editing pattern happens

23

RQ2: Level of involvement in parallel editing pattern

Definition of Level average number of files edited in parallel with a file.

0

1

2

3

4

5

No Pattern ≤3rd quantile > 3rd quantile

1.3

3.8* Statistically significant

Not statistically significant

24

RQ2: Do extended editing patterns lead to more bugs?

1.9 times more likely to experience a future bug if extended editing pattern happens

Odds Ratio of two groups:

NE: no extended editing

E: extended editing

25

RQ2: Level of involvement in extended pattern

Definition of Level average editing time of a file.

0

0.5

1

1.5

2

No Pattern ≤3rd quantile > 3rd quantile

1.0

1.9* Statistically significant

Not statistically significant

26

RQ2: Do interrupted editing patterns lead to more bugs?

1.6 times more likely to experience a future bug if interrupted editing pattern happens

Odds Ratio of two groups:

NI: no interrupted editing

I: interrupted editing

27

RQ2: Level of involvement in interrupted pattern

Definition of Level average interruption time of a file.

0

0.5

1

1.5

2

No Pattern ≤3rd quantile > 3rd quantile

1.2 1.8*

Statistically significant

Not statistically significant

28

RQ3: Do interactions among file editing patterns lead to more bugs?

Odds ratio of 16 groups.

0

1

2

3

4

5

6

1. Concurrent

2. Parallel

3. Extended

4. Interrupted

Combinations of patterns are more risky than single pattern

Combinations of patterns

Single Pattern

No

Patt

ern

29

Conclusion

30

Concurrent Parallel

Extended Interrupted

2.1 Odds Ratio 1.9 Odds

Ratio

1.9 Odds Ratio 1.6 Odds

Ratio

31

Average Bug Density:

as high as 1.7 times

Single v.s. Interactions