Msr presentation final

An Empirical Study of the Copy and Paste Behavior during Development. Tarek M. Ahmed, Weiyi Shang, Ahmed E. Hassan

Transcript of Msr presentation final

1

An Empirical Study of the Copy and Paste Behavior during Development

Tarek M. Ahmed, Weiyi Shang, Ahmed E. Hassan

2

Copy & Paste leads to clones

[Diagram: a code fragment is copied (Copy) and pasted (Paste) as a second code fragment]

3

Copy & Paste leads to clones

[Diagram: a code fragment is copied (Copy) and pasted (Paste) as a second code fragment]

Large number of code clones

4

Clone detection tools detect C&P after it is performed

[Diagram: C&P → Source Code → Clone Detection → Code Clones]

5

There exists no large-scale C&P study on developers

• Controlled experiments
• Small number of participants
• Experienced developers only

6

A larger-scale study exists on regular users

• Regular computer users
• Non-software-development tasks

A large-scale C&P study is needed for software development tasks

7

Eclipse Usage Data Collector (UDC) enables a large scale C&P study

• 20 months
• More than 1 million users

8

How to detect C&P in Eclipse UDC

User ID   What       Kind      …   Description
104526    Executed   Command   …   org.eclipse.ui.edit.copy

User performs Copy

9

How to detect C&P in Eclipse UDC

User ID   What       Kind      …   Description
104526    Executed   Command   …   org.eclipse.ui.edit.copy
104526    Executed   Command   …   org.eclipse.ui.edit.paste

User performs Paste
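
The detection step on this slide can be sketched in Python. This is a minimal sketch, assuming the UDC events are available as CSV rows with `userId`, `what`, `kind`, and `description` columns; the column names, the lowercase values, and the helper names `classify_event` and `extract_cp_events` are assumptions made for illustration. Only the two command IDs come from the slide.

```python
import csv
import io

COPY_ID = "org.eclipse.ui.edit.copy"    # command ID shown on the slide
PASTE_ID = "org.eclipse.ui.edit.paste"  # command ID shown on the slide


def classify_event(row):
    """Return 'copy', 'paste', or None for one UDC row given as a dict."""
    # Assumed column names: 'what', 'kind', 'description'.
    if row["what"].lower() != "executed" or row["kind"].lower() != "command":
        return None
    if row["description"] == COPY_ID:
        return "copy"
    if row["description"] == PASTE_ID:
        return "paste"
    return None


def extract_cp_events(csv_text):
    """Yield (user_id, action) for every copy or paste command in the CSV text."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        action = classify_event(row)
        if action is not None:
            yield row["userId"], action


# Made-up sample rows in the assumed layout; the third row is not a C&P event.
sample = """userId,what,kind,description
104526,executed,command,org.eclipse.ui.edit.copy
104526,executed,command,org.eclipse.ui.edit.paste
104526,activated,view,org.eclipse.ui.views.ProblemView
"""
events = list(extract_cp_events(sample))
```

Matching on the full command ID rather than on a substring keeps other edit commands (cut, for example) out of the count.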

10

Our study focuses on users who frequently and actively use Eclipse

Create Development Sessions → Find Active Sessions → Find Frequent Users
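
The three filtering steps can be sketched in Python. The slides do not state how a session, an active session, or a frequent user is defined, so every threshold below (`SESSION_GAP_SECONDS`, `MIN_EVENTS_PER_SESSION`, `MIN_ACTIVE_SESSIONS`) is a hypothetical placeholder, as are the function names. Only the order of the steps is taken from the slide.

```python
from collections import defaultdict

SESSION_GAP_SECONDS = 30 * 60  # assumed idle gap that closes a session
MIN_EVENTS_PER_SESSION = 3     # assumed minimum size of an "active" session
MIN_ACTIVE_SESSIONS = 2        # assumed minimum for a "frequent" user


def create_sessions(events):
    """Group (user_id, timestamp_seconds) events into per-user sessions."""
    by_user = defaultdict(list)
    for user_id, ts in events:
        by_user[user_id].append(ts)
    sessions = {}
    for user_id, stamps in by_user.items():
        stamps.sort()
        user_sessions, current = [], [stamps[0]]
        for ts in stamps[1:]:
            if ts - current[-1] > SESSION_GAP_SECONDS:
                # The idle gap is too long: close the current session.
                user_sessions.append(current)
                current = [ts]
            else:
                current.append(ts)
        user_sessions.append(current)
        sessions[user_id] = user_sessions
    return sessions


def find_active_sessions(sessions):
    """Keep only sessions with at least the assumed number of events."""
    return {
        user: [s for s in user_sessions if len(s) >= MIN_EVENTS_PER_SESSION]
        for user, user_sessions in sessions.items()
    }


def find_frequent_users(active_sessions):
    """Return users with at least the assumed number of active sessions."""
    return sorted(
        user for user, user_sessions in active_sessions.items()
        if len(user_sessions) >= MIN_ACTIVE_SESSIONS
    )


# Made-up events: u1 has two busy sessions, u2 has one short session.
events = [("u1", 0), ("u1", 60), ("u1", 120),
          ("u1", 10000), ("u1", 10060), ("u1", 10120),
          ("u2", 0), ("u2", 50)]
sessions = create_sessions(events)
active = find_active_sessions(sessions)
frequent = find_frequent_users(active)
```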

11

• 20 months
• 4 million C&P
• 20,000 users

12

Uncovering the C&P behavior of IDE users

13

Average number of C&P per hour is different from recent studies

Our finding: 2.73
Previous finding: 16

Heavy Editing Sessions (#Commands > Average #Commands + 1 Standard Deviation): 11.39
Very Heavy Editing Sessions (#Commands > Average #Commands + 2 Standard Deviations): 13.18
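
The two thresholds on this slide can be sketched in Python. This is a minimal sketch, assuming one command count per session and the population standard deviation (the slides do not say which variant was used). The function name `label_sessions`, the label strings, and the sample counts are hypothetical.

```python
from statistics import mean, pstdev


def label_sessions(command_counts):
    """Label each session by how far its command count exceeds the average."""
    avg = mean(command_counts)
    sd = pstdev(command_counts)  # assumption: population standard deviation
    labels = []
    for count in command_counts:
        if count > avg + 2 * sd:
            labels.append("very heavy")
        elif count > avg + sd:
            labels.append("heavy")
        else:
            labels.append("regular")
    return labels


# Made-up counts: 17 quiet sessions, two busier ones, and one outlier.
counts = [10] * 17 + [25, 25, 50]
labels = label_sessions(counts)
```

With this rule, every very heavy session also clears the one-standard-deviation threshold; the sketch reports only the stricter label for it.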

14

Do IDE users follow the same C&P patterns as regular users?

How do IDE users copy and paste code across different file formats?

16

[Diagram: Copy and Paste of a code fragment in two cases]

• Inside the same file
• Between different files

17

IDE users often C&P within the same file

[Diagram: Copy and Paste of a code fragment in two cases]

                          IDE    Regular
Inside the same file      80%    23%
Between different files   20%    77%
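
The within-file share reported above can be computed along the following lines. This is a minimal sketch, assuming each C&P has already been resolved to a (source file, destination file) pair; the slides do not show how that resolution is done, and `within_file_share` and the file names are hypothetical.

```python
def within_file_share(pairs):
    """Return the fraction of (source, destination) pairs that stay in one file."""
    if not pairs:
        return 0.0
    within = sum(1 for source, destination in pairs if source == destination)
    return within / len(pairs)


# Made-up pairs: four C&P inside one file, one between two files.
pairs = [
    ("A.java", "A.java"),
    ("A.java", "A.java"),
    ("B.java", "B.java"),
    ("C.xml", "C.xml"),
    ("A.java", "B.java"),
]
share = within_file_share(pairs)
```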

18

IDE users perform consecutive C&P

Repeat: A (Copy) → B (Paste), then C (Copy) → D (Paste)

19

IDE users perform consecutive C&P

Distribution: A (Copy) → B (Paste) and C (Paste)

20

IDE users perform consecutive C&P

Relay: A (Copy) → B (Paste, then Copy) → C (Paste)

21

IDE users often perform relay on C&P

Pattern        Regular   IDE
Repeat         32%       9%
Distribution   36%       1%
Relay          2%        33%
Others         30%       57%
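
One way to tell the three patterns apart is to compare two consecutive C&P operations. The sketch below is one possible reading of the slide diagrams, not the study's actual definitions: it assumes each C&P is a (source, destination) pair of location labels such as A, B, C, and D, and the function `classify_pair` is hypothetical.

```python
def classify_pair(first, second):
    """Classify two consecutive (source, destination) C&P operations."""
    (src1, dst1), (src2, dst2) = first, second
    # Relay: the destination of the first C&P is the source of the second.
    if src2 == dst1 and len({src1, dst1, dst2}) == 3:
        return "relay"
    # Distribution: one source is pasted into two different destinations.
    if src1 == src2 and len({src1, dst1, dst2}) == 3:
        return "distribution"
    # Repeat (assumed reading): two C&P over four distinct locations.
    if len({src1, dst1, src2, dst2}) == 4:
        return "repeat"
    return "other"


relay = classify_pair(("A", "B"), ("B", "C"))
distribution = classify_pair(("A", "B"), ("A", "C"))
repeat = classify_pair(("A", "B"), ("C", "D"))
other = classify_pair(("A", "B"), ("A", "B"))
```

Counting these labels over all consecutive pairs of a user group would give shares comparable to the table above, under the stated assumptions.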

22

C&P behavior of IDE users is different from regular users

IDE Users
• Higher Within
• Higher Relay
• Lower Distribution

Regular Users
• Higher Between
• Lower Relay
• Higher Distribution

Eclipse IDE requires tailored C&P support tools that differ from regular users’ C&P tools

23

Do IDE users follow the same C&P patterns as regular users?

How do IDE users copy and paste code across different file formats?

There are major differences between C&P behavior of Eclipse IDE users and C&P behavior of regular users.

24

Do IDE users follow the same C&P patterns as regular users?

How do IDE users copy and paste code across different file formats?

There are major differences between C&P behavior of Eclipse IDE users and C&P behavior of regular users.

There is a large number of C&P between editors; hence, clone detection techniques should consider detecting clones across different languages.

25

Summary

26