ICSE 2014 NIER Track - New Opportunities for Extracting Insight from Cloud Based IDEs

New Opportunities for Extracting Insight from Cloud Based IDEs

Yi Wang (1), Patrick Wagstrom (2,3), Evelyn Duesterwald (3), David Redmiles (1)

(1) University of California - Irvine, (2) IBM Watson Group - Watson Life, (3) IBM Research

June 4, 2014

IBM Watson Life ©2014 IBM Corporation

@pridkett http://wagstrom.net/

description

It used to be that getting fine grained data about development practices was a complicated process that required convincing developers to install a plugin in their IDE. The shift to cloud based IDEs opens up numerous possibilities for a finer grained understanding of development practices. This research reports on a small preliminary study at IBM in which we tracked eight developers as they interacted with a cloud based IDE. It yields interesting insights into how much we can actually learn from these environments.

Transcript of ICSE 2014 NIER Track - New Opportunities for Extracting Insight from Cloud Based IDEs


Page 2: ICSE 2014 NIER Track - New Opportunities for Extracting Insight from Cloud Based IDEs


Spoiler Alert

• Novelty:

  • Identify new opportunities provided by cloud based IDEs

  • Demonstrate how to realize these opportunities with a case study

  • New data analysis technique applied to case study

• Emerging:

  • Insights extracted from data collected in our case study

  • Promising future directions for cloud based IDEs

• Impact:

  • Researchers - shift in focus to distributed environments

  • Practitioners - the future is awesome

  • IDE Developers - rich possibilities for tool enhancement based on findings



Page 5: ICSE 2014 NIER Track - New Opportunities for Extracting Insight from Cloud Based IDEs


Experimental Setup - Flow

• Lab Experiment with 8 IBM Employees

• Intake and Exit Surveys to Understand Development Skill and Experience

• Eight Programming Tasks of Increasing Difficulty

  • Similar to Interview Questions

• Written in JazzHub Orion Code Editor

• Instrumented to Record Audio, Screen, and Network Traffic of All Sessions

• Effort and Difficulty Self-Evaluated After Each Task

• Solutions Were Graded and Subjects Ranked after Experiment


Page 6: ICSE 2014 NIER Track - New Opportunities for Extracting Insight from Cloud Based IDEs


Experimental Setup - Man In the Middle Attack

[Diagram: components shown are Participant Computer, Execution Container, MITMProxy, Data Logger, JazzHub, Google, StackOverflow, and The Internet]
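To make the proxy-based capture concrete, here is a minimal sketch of a logging addon of the kind mitmproxy can load with `mitmdump -s`. It is not the study's actual logger: the class name `TrafficLogger`, the record fields, and the choice to log at the HTTP level (the slides mention TCP session traces) are all assumptions made for illustration.

```python
import json
import sys


class TrafficLogger:
    """Hypothetical addon: writes one JSON line per completed HTTP exchange."""

    def __init__(self, sink=None):
        # Any object with a write() method works; defaults to standard output.
        self.sink = sink if sink is not None else sys.stdout

    def response(self, flow):
        # mitmproxy calls the `response` hook once a full server response is read.
        body = flow.request.content or b""
        record = {
            "time": flow.request.timestamp_start,
            "method": flow.request.method,
            "url": flow.request.pretty_url,
            "status": flow.response.status_code,
            "request_bytes": len(body),
            "request_body": body.decode("utf-8", errors="replace"),
        }
        self.sink.write(json.dumps(record) + "\n")


# mitmproxy picks up addons from this module-level list.
addons = [TrafficLogger()]
```

Because the hook only reads attributes of the flow object, the class can be exercised without a running proxy by passing in a stand-in object with the same attributes.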

Page 7: ICSE 2014 NIER Track - New Opportunities for Extracting Insight from Cloud Based IDEs


Code Growth Extraction

• Captured TCP Session Traces Allow Replay of Code Growth

• Collected 41 Total <subject, task> Traces
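As a rough illustration of replaying code growth, the sketch below turns a list of captured editor saves into a (seconds since start, code size in bytes) series. It assumes each captured event carries the full file content at that moment, which the slides do not state; the function name `code_growth` and the event format are hypothetical.

```python
def code_growth(events):
    """Convert (timestamp_seconds, file_content_bytes) events into a growth trace.

    Returns a list of (seconds_since_first_event, size_in_bytes) tuples,
    ordered by time.
    """
    ordered = sorted(events, key=lambda event: event[0])
    if not ordered:
        return []
    start = ordered[0][0]
    return [(timestamp - start, len(content)) for timestamp, content in ordered]


# Example with three made-up saves for one <subject, task> trace.
trace = code_growth([(10.0, b"ab"), (5.0, b""), (12.5, b"abcd")])
```

One such series per <subject, task> pair gives the kind of time series that the later slides compare.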


Page 8: ICSE 2014 NIER Track - New Opportunities for Extracting Insight from Cloud Based IDEs


Developer Traces

[Figure: two plots of code size (bytes) versus time (seconds), one for Subject 4 (Worst) and one for Subject 6 (Best)]

Insight 1: Differing levels of expertise yield dramatically different code growth patterns

Page 9: ICSE 2014 NIER Track - New Opportunities for Extracting Insight from Cloud Based IDEs


Co-Integration Networks

• Cointegration is an Indicator of Shape Similarity of Two Time Series

• We Calculated Pairwise Cointegration for all 41 Traces

• Cointegration Network was Created Using Cases Where p-value < 0.05
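The network construction described above can be sketched as follows: test every pair of traces and keep an undirected edge when the p-value falls below the threshold. The function name `cointegration_network` and the pluggable `pvalue_fn` parameter are assumptions for illustration; the slides do not say which cointegration test or library was used.

```python
from itertools import combinations


def cointegration_network(traces, pvalue_fn, alpha=0.05):
    """Build an undirected edge set from pairwise test p-values.

    traces:    dict mapping a trace label (e.g. "S1.1") to its time series
    pvalue_fn: callable taking two series and returning a p-value
    alpha:     significance threshold; pairs with p-value < alpha get an edge
    """
    edges = set()
    for a, b in combinations(sorted(traces), 2):
        if pvalue_fn(traces[a], traces[b]) < alpha:
            edges.add(frozenset((a, b)))
    return edges
```

One possible choice for `pvalue_fn` is the Engle-Granger test in statsmodels, for example `lambda x, y: coint(x, y)[1]` with `from statsmodels.tsa.stattools import coint`, since `coint` returns the test statistic, the p-value, and critical values. That test needs both series on a common time grid and is not symmetric in its arguments, so testing each pair in one order only is a simplification.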

[Figure: cointegration network of the 41 traces, with nodes labeled S1.1 through S8.7]

Page 10: ICSE 2014 NIER Track - New Opportunities for Extracting Insight from Cloud Based IDEs


Co-Integration Networks

• The “best” developers (subjects 6 and 8) share few intra-subject edges (5.71%), while the other 6 subjects share 112 edges (34.36%).

• The “best” developers’ traces are mostly located on the periphery of the network in a Fruchterman-Reingold layout.

[Figure: cointegration network in Fruchterman-Reingold layout, with nodes labeled S1.1 through S8.7]

Insight 2: The “best” developers may use more programming strategies, making their traces more diverse. This may serve as a potential indicator of developers’ skill levels.

Page 11: ICSE 2014 NIER Track - New Opportunities for Extracting Insight from Cloud Based IDEs


Individual Programming Styles

• Question: Are code growth traces more similar for a collection of users doing the same task, or for an individual user doing a collection of tasks?

• Compute edge density in three categories:

  • I (subject-subject): 0.3085

  • II (task-task): 0.2479

  • III (task-subject): 0.2611

• Test the differences using the Kruskal-Wallis test: P(I, II) < 0.01, P(I, III) = 0.03.
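A minimal sketch of a per-category edge density computation is shown below. The slides do not define the three categories precisely, so the grouping here is an assumption: pairs of traces from the same subject, pairs from the same task, and all remaining pairs. Density is the number of edges in a group divided by the number of possible pairs in that group. The function name `edge_density_by_category` is hypothetical.

```python
from itertools import combinations


def edge_density_by_category(traces, edges):
    """Edge density for same-subject, same-task, and all other trace pairs.

    traces: list of unique (subject, task) tuples
    edges:  set of frozenset({trace_a, trace_b}) for connected pairs
    """
    # Each entry holds [edges found, possible pairs].
    counts = {"same_subject": [0, 0], "same_task": [0, 0], "different": [0, 0]}
    for a, b in combinations(traces, 2):
        if a[0] == b[0]:
            key = "same_subject"
        elif a[1] == b[1]:
            key = "same_task"
        else:
            key = "different"
        counts[key][1] += 1
        if frozenset((a, b)) in edges:
            counts[key][0] += 1
    return {k: (hit / total if total else 0.0) for k, (hit, total) in counts.items()}


# Toy example: two subjects, two tasks, two edges.
toy_traces = [(1, 1), (1, 2), (2, 1), (2, 2)]
toy_edges = {frozenset(((1, 1), (1, 2))), frozenset(((1, 1), (2, 1)))}
toy_density = edge_density_by_category(toy_traces, toy_edges)
```

The toy numbers are made up and are not meant to reproduce the densities reported on the slide. A significance test such as `scipy.stats.kruskal` could then be applied to per-group samples, though how the samples were formed in the study is not stated.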


Insight 3: Individual differences, rather than task differences, contribute more to the variation in code growth traces. Code growth traces may be a kind of “fingerprint” of developers.

Page 12: ICSE 2014 NIER Track - New Opportunities for Extracting Insight from Cloud Based IDEs


Major Impacts

• Shift from desktop to cloud opens up incredible opportunities for studying large numbers of developers in depth

• Centralized instrumentation allows data from one developer to benefit other developers in near real-time

• In depth knowledge of IDE user behavior from cloud IDEs will enable next leap in productivity for software development professionals
