Dependency Tracking in Software Systems

Presented by: Ashgan Fararooy

Related Papers

• Supporting Software Evolution Analysis with Historical Dependencies and Defect Information (ICSM 2008)

• A Flexible Framework to Support Collaborative Software Evolution Analysis (CSMR 2008)

• Mining Software Repositories for Traceability Links (ICPC 2007)

• Tracking Objects to Detect Feature Dependencies (ICPC 2007)

• Software Repositories: A Source for Traceability Links (TEFSE-GTC 2007)

• Mining Version Archives for Co-changed Lines (ICSE 2006)

• Understanding Semantic Impact of Source Code Changes: an Empirical Study

Mining Version Archives for Co-changed Lines

Thomas Zimmermann, Sunghun Kim, Andreas Zeller, E. James Whitehead Jr.

(ICSE 2006)

Abstract

• Research on co-change has mostly investigated files, classes, or methods

• This paper presents a first study of co-change at the level of lines

• Introduces the annotation graph, which captures how lines evolve over time

• Yields more fine-grained software evolution information (based on lines)

Overview

• Co-change: items that are changed together are related to each other

• Studied at any granularity: modules, files, classes, methods

• What about more fine-grained items: blocks, lines, …?

Co-Change in More Fine-Grained Items

• Seemed infeasible

• Hard to identify across different versions

• Line numbers are not suitable identifiers

• The annotation feature of SCM systems is not enough

• Line content is not a good identifier either

Annotation Graph

Definition:

– A multipartite graph where each part corresponds to one version of a file

– Within each part/version, every line is represented by a single node

– Edges between nodes indicate that a line originates from another, by modification or movement

– Node labels (e.g. a bold node) indicate a changed line

Annotation Graph

Construction:

– Every pair of subsequent revisions of a file must be compared

– Textual differences are computed with the GNU diff tool

– The diff tool returns a list of regions ("hunks") that differ between the two files
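The hunk computation can be sketched as follows. This is only an illustration: it uses Python's standard `difflib` module as a stand-in for GNU diff (the two tools may split regions differently), and the function name `hunks` and the sample revisions are assumptions made for this example.

```python
import difflib

def hunks(old_lines, new_lines):
    """Return the regions that differ between two revisions of a file.

    Each hunk is a tuple (kind, old_start, old_end, new_start, new_end)
    with 0-based, half-open line ranges; kind is 'replace', 'delete'
    or 'insert'. Unchanged regions are filtered out.
    """
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]

# Two hypothetical revisions of the same file.
rev1 = ["int a;", "int b;", "return a;"]
rev2 = ["int a;", "long b;", "int c;", "return a;"]

print(hunks(rev1, rev2))  # [('replace', 1, 2, 1, 3)]
```

Here one old line (index 1) is replaced by two new lines (indices 1 and 2), which is reported as a single modification hunk.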

Annotation Graph

Three different kinds of changes:

– Modifications
• Result in a complete bipartite subgraph

– Additions
• Do not result in any edges
• Positions of the following lines are updated

– Deletions
• Have the same effect as additions

Annotation Graph

Computation:

– Creates nodes for each revision and each line

– Two approaches
• 1. Forward-directed
• 2. Backward-directed

Annotation Graph

Computation (Forward-Directed Algorithm):

– Iterate over all pairs of subsequent revisions

– For each pair, compute the differences (hunks)

– Process the hunks to create edges
• Exactly one edge between unchanged lines (nodes)
• For modified lines, all possible edges
• For inserted and deleted lines, no edges

– Label the nodes of the later revision for modifications and additions
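The forward-directed steps above can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: it assumes Python's `difflib` in place of GNU diff, represents a node as a `(revision_index, line_index)` pair, and omits any special handling of large modifications. The name `build_annotation_graph` is hypothetical.

```python
import difflib

def build_annotation_graph(revisions):
    """Build a simple annotation graph, oldest revision first.

    revisions: list of file versions, each a list of lines.
    Returns (edges, changed): edges is a set of (older_node, newer_node)
    pairs; changed is the set of labeled nodes, i.e. lines that were
    modified or added in their revision.
    """
    edges, changed = set(), set()
    for r in range(len(revisions) - 1):
        old, new = revisions[r], revisions[r + 1]
        matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines: exactly one edge per line.
                for k in range(i2 - i1):
                    edges.add(((r, i1 + k), (r + 1, j1 + k)))
            elif tag == "replace":
                # Modified lines: all possible edges (complete bipartite).
                for i in range(i1, i2):
                    for j in range(j1, j2):
                        edges.add(((r, i), (r + 1, j)))
                changed.update((r + 1, j) for j in range(j1, j2))
            elif tag == "insert":
                # Added lines: no edges, but label the new nodes.
                changed.update((r + 1, j) for j in range(j1, j2))
            # Deleted lines: no edges and nothing to label.
    return edges, changed

# Two hypothetical revisions of the same file.
rev1 = ["int a;", "int b;", "return a;"]
rev2 = ["int a;", "long b;", "int c;", "return a;"]
edges, changed = build_annotation_graph([rev1, rev2])
print(sorted(edges))
print(sorted(changed))  # [(1, 1), (1, 2)]
```

Because line positions come straight from the diff ranges, the shift of following lines after an addition or deletion is handled implicitly.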

Annotation Graph

Problem:

– Some changes modify large parts of a file

– This results in a large number of edges

– Such edges are not reasonable for evolution analysis

Annotation Graph

– Solution: treat large modifications as combined deletions and additions

– No edges are created for them in the annotation graph

Annotation Graph

Recognizing Large Modifications:

Annotating Lines

Comparison

– Most SCM systems have an annotation feature that provides, for each line, information on its latest change

– Annotation graphs can be used to get such information

– Furthermore, they provide information on all past changes
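One way to read both kinds of information off an annotation graph is to walk the edges backward from a line and collect every revision in which the line or one of its ancestors is labeled as changed. The sketch below assumes nodes are `(revision_index, line_index)` pairs and uses a small hand-written graph; the function name `annotate` and the data are illustrative only.

```python
from collections import defaultdict

def annotate(node, edges, changed):
    """Return the set of revisions in which `node` or one of its
    ancestor lines was changed, following edges backward."""
    parents = defaultdict(set)
    for older, newer in edges:
        parents[newer].add(older)
    revisions, stack, seen = set(), [node], {node}
    while stack:
        current = stack.pop()
        if current in changed:
            revisions.add(current[0])
        for parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return revisions

# Hypothetical graph: two lines tracked over revisions 0, 1 and 2.
edges = {
    ((0, 0), (1, 0)), ((1, 0), (2, 0)),
    ((0, 1), (1, 1)), ((1, 1), (2, 1)),
}
changed = {(1, 0), (2, 0), (2, 1)}

history = annotate((2, 0), edges, changed)
print(sorted(history))  # [1, 2]  all past changes of the line
print(max(history))     # 2       latest change, as an SCM annotate shows
```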

Life Cycle of Lines

Investigated the life cycle of lines for the Eclipse project

– How frequently are lines changed?
• Computed the change count for each line
• Change count: the number of distinct revisions in its annotation

– How many developers change a line?

– What are the most frequently changed lines?
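Given per-line annotations, these three questions reduce to simple counting. The sketch below assumes each annotation is a set of revision ids and that a revision-to-developer mapping is available; the line names, revision ids, and developer names are made up for illustration.

```python
def line_statistics(annotation, authors):
    """Return (change_count, developer_count) for one line.

    annotation: set of revision ids in which the line was changed.
    authors: dict mapping revision id -> developer name.
    """
    change_count = len(annotation)
    developers = {authors[rev] for rev in annotation}
    return change_count, len(developers)

# Hypothetical annotations and commit authors.
annotations = {"line A": {1, 2, 5}, "line B": {2}, "line C": {3, 5}}
authors = {1: "alice", 2: "bob", 3: "alice", 5: "alice"}

stats = {line: line_statistics(revs, authors)
         for line, revs in annotations.items()}
most_changed = max(stats, key=lambda line: stats[line][0])

print(stats["line A"])  # (3, 2)
print(most_changed)     # line A
```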

Finding Related Lines

– Computed related lines using frequent pattern mining

– Used transaction ids instead of revision ids

– Used the Apriori algorithm

– Inferred useful association rules
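As a rough illustration of the mining step, the sketch below runs a small level-wise Apriori search over transactions, where each transaction is the set of lines changed together, and then derives single-consequent association rules. It is a textbook-style sketch under assumed thresholds, not the configuration used in the study; the line ids `L1` to `L3` are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support_count} for every itemset contained in
    at least min_support transactions (level-wise Apriori search)."""
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent, k = {}, 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets with exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = {c for c in joined
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
    return frequent

def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            antecedent = itemset - {item}
            confidence = support / frequent[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, item, confidence))
    return rules

# Each transaction: the set of lines changed together.
transactions = [{"L1", "L2"}, {"L1", "L2", "L3"}, {"L1", "L3"}, {"L2"}]
frequent = apriori(transactions, min_support=2)
rules = association_rules(frequent, min_confidence=0.9)
print(rules)  # [(frozenset({'L3'}), 'L1', 1.0)]
```

The single rule says that whenever `L3` was changed in these sample transactions, `L1` was changed as well.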

Thank you