Components - Graph-based Detection of Library API Imitations

Post on 13-Jan-2015

Paper: Graph-based Detection of Library API Imitations

Authors: Chengnian Sun, Siau-Cheng Khoo, Shao Jie Zhang (all from National University of Singapore)

Session: Research Track Session 7: Component

Transcript of Components - Graph-based Detection of Library API Imitations

Graph-based Detection of Library API Imitations

October 6, 2011

Chengnian Sun, Siau-Cheng Khoo, Shao Jie Zhang

National University of Singapore

Motivation – Software Libraries

Common practice to employ 3rd-party software libraries

Providing certain functionalities / hiding implementation details

Improving productivity

Well tested

Enhancing program quality

Application Programming Interfaces (APIs)

Exported by libraries

Ways for programmers to interact with libraries

Motivation – Problem

APIs are not always effectively used by programmers

Imitation: client code re-implements the behavior of library APIs

Reasons

Unfamiliarity with the library

Library evolution

Cost

Wasted resources, time and energy

Error-prone code and software maintenance issues

Motivation – Example from JBoss

Imitation (1): method.getInterceptors() == null || method.getInterceptors().length < 1

API: return (interceptors != null && interceptors.length > 0)

Refactor to: !method.hasAdvices()
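The equivalence between the imitation and the refactored call can be checked with a minimal sketch. MethodInfoSketch is a hypothetical stand-in for the JBoss class on the slide; only getInterceptors(), hasAdvices() and the interceptors field come from the talk, and everything else is assumed for illustration.

```java
// Hypothetical stand-in for the JBoss class shown on the slide.
class MethodInfoSketch {
    private final Object[] interceptors;

    MethodInfoSketch(Object[] interceptors) {
        this.interceptors = interceptors;
    }

    Object[] getInterceptors() {
        return interceptors;
    }

    // Library API body as quoted on the slide.
    boolean hasAdvices() {
        return interceptors != null && interceptors.length > 0;
    }
}

class ImitationDemo {
    // Client-side imitation: re-implements the negation of hasAdvices().
    static boolean imitation(MethodInfoSketch method) {
        return method.getInterceptors() == null
                || method.getInterceptors().length < 1;
    }

    // Refactored client code: delegates to the library API.
    static boolean refactored(MethodInfoSketch method) {
        return !method.hasAdvices();
    }

    public static void main(String[] args) {
        MethodInfoSketch[] cases = {
            new MethodInfoSketch(null),
            new MethodInfoSketch(new Object[0]),
            new MethodInfoSketch(new Object[] {"interceptor"})
        };
        for (MethodInfoSketch m : cases) {
            // Both columns agree for every case.
            System.out.println(imitation(m) + " " + refactored(m));
        }
    }
}
```

The two forms agree on the null, empty and non-empty cases, which is what makes the refactoring safe here.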

Motivation

A library API imitation can be

Not exactly the same

Inter-procedural

Goal: to accurately detect such imitations

Detection of Library API Imitations

Motivation

Definitions

Data Dependency Graph

Trace & Subtrace

Trace Subsumption

Potential Imitation

Algorithms

Pre-processing & Post-validation

Case Studies

Conclusion

Definitions – Overview

Employing Data Dependency Graphs (DDG) to represent code

Semantic representation

Capturing data flows within a method

Carrying a portion of control flow information

A library DDG is trace-subsumed by a client DDG ⇒ potential API imitation

Relaxation of sub-graph isomorphism

More efficient

Minor-difference tolerant

Definitions – Data Dependency Graph

DDG – a graphical representation of a method

Vertices: basic statements (three address form)

Edges v → u: direction represents data dependency

Vertex u is data dependent on vertex v if a variable var is defined at v and used at u, and there is an execution path P from v to u along which var is not redefined.
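A minimal sketch of this definition, restricted to straight-line code (an assumption made here so that the only execution path is the statement order; the talk's DDGs also cover branching code). The Stmt class and the statement labels s1..s4 are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DdgSketch {
    // One three-address statement: a label, the variable it defines
    // (null if none), and the variables it uses.
    static final class Stmt {
        final String label;
        final String def;
        final List<String> uses;

        Stmt(String label, String def, String... uses) {
            this.label = label;
            this.def = def;
            this.uses = Arrays.asList(uses);
        }
    }

    // Returns edges "v -> u", meaning u is data dependent on v.
    // Assumes straight-line code, so the latest definition of a
    // variable is the only one that reaches a later use.
    static List<String> edges(List<Stmt> body) {
        Map<String, Stmt> reachingDef = new HashMap<>();
        Set<String> edges = new LinkedHashSet<>();
        for (Stmt u : body) {
            for (String var : u.uses) {
                Stmt v = reachingDef.get(var);
                if (v != null) {
                    edges.add(v.label + " -> " + u.label);
                }
            }
            if (u.def != null) {
                reachingDef.put(u.def, u); // later uses see this definition
            }
        }
        return new ArrayList<>(edges);
    }

    public static void main(String[] args) {
        List<Stmt> body = Arrays.asList(
            new Stmt("s1", "a", "p"),       // a = p
            new Stmt("s2", "b", "a"),       // b = a + 1
            new Stmt("s3", "a", "q"),       // a = q (redefines a)
            new Stmt("s4", "c", "a", "b")); // c = a + b
        System.out.println(edges(body)); // [s1 -> s2, s3 -> s4, s2 -> s4]
    }
}
```

There is no edge s1 -> s4: a is redefined at s3 on the way from s1 to s4, so the definition at s1 does not reach the use at s4.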

Definitions – Trace & Subtrace

A trace in a data dependency graph

A path of vertices, <v1, v2, …, vm>

The first vertex is an entry of the graph

Given two traces T1 = <v1, v2, …, vm> and T2 = <u1, u2, …, un>, T1 is a subtrace of T2 (T1 ≤ T2) if there exists an integer i, 0 ≤ i ≤ n – m, such that

match(v1, u_{1+i}), match(v2, u_{2+i}), …, match(vm, u_{m+i})

Subtrace is a generalization of the substring relation.

T1 = <C, D, E>

T2 = <A, B, C, D, E, F>

i = 2
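The subtrace check can be sketched as a sliding-window comparison. The method name subtraceOffset and the use of label equality as the match predicate are assumptions for illustration; the talk does not spell out how match is defined.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

class SubtraceSketch {
    // Returns the smallest offset i (0 <= i <= n - m) at which every vertex
    // of t1 matches the corresponding vertex of t2, or -1 if there is none.
    static <V> int subtraceOffset(List<V> t1, List<V> t2, BiPredicate<V, V> match) {
        int m = t1.size();
        int n = t2.size();
        for (int i = 0; i <= n - m; i++) {
            boolean allMatch = true;
            for (int k = 0; k < m && allMatch; k++) {
                allMatch = match.test(t1.get(k), t2.get(k + i));
            }
            if (allMatch) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> t1 = Arrays.asList("C", "D", "E");
        List<String> t2 = Arrays.asList("A", "B", "C", "D", "E", "F");
        // Label equality stands in for the match predicate.
        System.out.println(subtraceOffset(t1, t2, String::equals)); // 2
    }
}
```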

Definitions – Trace Subsumption

A data dependency graph Glib

A data dependency graph Gclt

Gclt trace subsumes Glib if and only if, for each trace Tlib in Glib, there exists at least one trace Tclt in Gclt such that Tlib is a subtrace of Tclt (Tlib ≤ Tclt)
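Read literally, the definition is a nested "for each / there exists" check. The sketch below assumes the traces of both graphs have already been enumerated as lists of vertex labels and that match is label equality, so the standard Collections.indexOfSubList can serve as the subtrace test. This is a brute-force illustration of the definition, not the detection algorithm presented in the talk.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class TraceSubsumptionSketch {
    // True if every library trace is a subtrace (contiguous sub-list)
    // of at least one client trace.
    static <V> boolean traceSubsumes(List<List<V>> clientTraces,
                                     List<List<V>> libraryTraces) {
        for (List<V> tLib : libraryTraces) {
            boolean covered = false;
            for (List<V> tClt : clientTraces) {
                if (Collections.indexOfSubList(tClt, tLib) >= 0) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<String>> lib = Arrays.asList(
            Arrays.asList("C", "D"),
            Arrays.asList("C", "E"));
        List<List<String>> clt = Arrays.asList(
            Arrays.asList("A", "C", "D"),
            Arrays.asList("B", "C", "E", "F"));
        System.out.println(traceSubsumes(clt, lib));               // true
        System.out.println(traceSubsumes(clt.subList(0, 1), lib)); // false
    }
}
```

Note that different library traces may be covered by different client traces, which is what makes the relation more relaxed than sub-graph isomorphism.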

Definitions – Potential Imitation

A client method Clt potentially imitates a library method Lib if there exist

a DDG Gclt of Clt, resulting from inlining zero or more method calls into Clt, and

a DDG Glib of Lib, resulting from inlining zero or more method calls into Lib,

such that Gclt trace subsumes Glib

Detection of Library API Imitations

Motivation

Definitions

Algorithms

Overall Algorithm

Trace Subsumption Checking

Pre-processing & Post-validation

Case Studies

Conclusion

Algorithms – Overall Algorithm

Input

A library API Lib

A client method Clt

A set S of all method calls in both Lib and Clt

Output

true if Clt potentially imitates Lib, false otherwise

Body

for each subset s of S {
    Lib' = a copy of Lib with calls in s inlined
    Clt' = a copy of Clt with calls in s inlined
    if the DDG of Clt' trace subsumes the DDG of Lib'
        return true
}
return false
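The subset loop above can be sketched with a bitmask enumeration. The callback subsumesAfterInlining is hypothetical: it stands for "inline the calls in s into copies of Lib and Clt, build both DDGs, and run the trace subsumption check", none of which is implemented here. The loop is exponential in the number of calls, and this sketch assumes fewer than 63 of them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class OverallSketch {
    // Tries every subset s of the method calls and reports whether any
    // choice of inlined calls makes the client DDG trace-subsume the
    // library DDG (delegated to the hypothetical callback).
    static <C> boolean potentiallyImitates(List<C> calls,
                                           Predicate<List<C>> subsumesAfterInlining) {
        int n = calls.size(); // assumed < 63 so the mask fits in a long
        for (long mask = 0; mask < (1L << n); mask++) {
            List<C> s = new ArrayList<>();
            for (int k = 0; k < n; k++) {
                if ((mask & (1L << k)) != 0) {
                    s.add(calls.get(k));
                }
            }
            if (subsumesAfterInlining.test(s)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> calls = Arrays.asList("f", "g", "h");
        // Toy callback: pretend only inlining exactly {f, h} succeeds.
        System.out.println(potentiallyImitates(
            calls, s -> s.equals(Arrays.asList("f", "h")))); // true
        System.out.println(potentiallyImitates(calls, s -> false)); // false
    }
}
```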

Algorithms – Trace Subsumption

Input

A DDG of a library API Glib

A DDG of a client method Gclt

Output

true if Gclt trace subsumes Glib

Depth-first search with step-by-step checking

Algorithms – An Example

(The figures of the library and client DDGs are not included in this transcript. Each entry pairs a library vertex with the set of client vertices matching it.)

Step 1: initial state
Stack: (empty)
Current: (none)

Step 2: locating all vertices in the client matching each entry of the library
Stack: (A, {A, A})
Current: (none)

Step 3: locating client vertices matching library A's successor D
Stack: (empty)
Current: (A, {A, A})

Step 4: locating client vertices matching library A's successor D (match pushed)
Stack: (D, {D})
Current: (A, {A, A})

Step 5: locating client vertices matching library A's successor B
Stack: (D, {D})
Current: (A, {A, A})

Step 6: locating client vertices matching library A's successor B (match pushed)
Stack: (B, {B}), (D, {D}) (top first)
Current: (A, {A, A})

Step 7: locating client vertices matching B's successors in the library (B has none: {})
Stack: (D, {D})
Current: (B, {B})

Step 8: locating client vertices matching library D's successor M
Stack: (empty)
Current: (D, {D})
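The walkthrough above can be approximated with a stack of (library vertex, set of matching client vertices) pairs. This is a sketch under several assumptions not confirmed by the slides: vertices match when their labels are equal, every client vertex is reachable from a client entry, and a visited set is added to guard against cycles. The Ddg and Pair classes and all vertex ids are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class DfsSubsumptionSketch {
    static final class Ddg {
        final Map<Integer, String> label = new HashMap<>();
        final Map<Integer, List<Integer>> succ = new HashMap<>();
        final Set<Integer> hasPred = new HashSet<>();

        Ddg vertex(int id, String lbl) {
            label.put(id, lbl);
            succ.put(id, new ArrayList<>());
            return this;
        }

        Ddg edge(int from, int to) {
            succ.get(from).add(to);
            hasPred.add(to);
            return this;
        }
    }

    // Stack entry: a library vertex and the client vertices matched to it.
    static final class Pair {
        final int lib;
        final Set<Integer> clt;

        Pair(int lib, Set<Integer> clt) {
            this.lib = lib;
            this.clt = clt;
        }
    }

    static Set<Integer> matching(Ddg g, Collection<Integer> candidates, String lbl) {
        Set<Integer> out = new TreeSet<>();
        for (int v : candidates) {
            if (g.label.get(v).equals(lbl)) {
                out.add(v);
            }
        }
        return out;
    }

    static boolean traceSubsumes(Ddg clt, Ddg lib) {
        Deque<Pair> stack = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        // Seed: each library entry paired with all client vertices matching it.
        for (int e : lib.label.keySet()) {
            if (lib.hasPred.contains(e)) {
                continue; // not an entry of the library graph
            }
            Set<Integer> m = matching(clt, clt.label.keySet(), lib.label.get(e));
            if (m.isEmpty()) {
                return false;
            }
            stack.push(new Pair(e, m));
        }
        while (!stack.isEmpty()) {
            Pair cur = stack.pop();
            if (!seen.add(cur.lib + ":" + cur.clt)) {
                continue; // this state was already expanded
            }
            Set<Integer> cltSucc = new TreeSet<>();
            for (int c : cur.clt) {
                cltSucc.addAll(clt.succ.get(c));
            }
            // Every library successor needs at least one matching client successor.
            for (int s : lib.succ.get(cur.lib)) {
                Set<Integer> m = matching(clt, cltSucc, lib.label.get(s));
                if (m.isEmpty()) {
                    return false;
                }
                stack.push(new Pair(s, m));
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Library: A -> D, A -> B, D -> M.
        Ddg lib = new Ddg().vertex(1, "A").vertex(2, "D").vertex(3, "B").vertex(4, "M")
                .edge(1, 2).edge(1, 3).edge(2, 4);
        // Client with two A vertices but no M after D.
        Ddg clt = new Ddg().vertex(1, "A").vertex(2, "A").vertex(3, "D").vertex(4, "B")
                .edge(1, 3).edge(2, 4);
        System.out.println(traceSubsumes(clt, lib)); // false
        clt.vertex(5, "M").edge(3, 5);
        System.out.println(traceSubsumes(clt, lib)); // true
    }
}
```

The toy graphs are modeled loosely on the labels in the walkthrough (A, B, D, M); the actual graphs from the slides are not available, so the edges here are invented for illustration.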

Detection of Library API Imitations

Motivation

Definitions

Algorithms

Pre-processing & Post-validation

Case Studies

Conclusion

Pre-processing Libraries

Remove nullness checks

if (a == null) {
    return Constant;
} else {
    a.XXX();
}

Remove assertions

if (…)
    throw new Exception();
…

Remove exception handlers

try {
} catch (…) {}

Post-validating Reported Imitations

Reject the following two cases

Unmatched inlined vertices in the client

Matching All References to Library Locals

Case Studies

Evaluation measure

Subjects – 10 open-source Java projects

Testbed:

Intel Core 2 Quad CPU 3.00GHz and 8GB memory

Case Studies – Two Experiments

Detecting Imitations of Imported Libraries

Testing all method pairs (lib, clt), where the declaring class of lib is already imported in the client class

Precision = 313 / 383 = 82%

Runtime = 314 seconds

Detecting Imitations of Static Libraries

Testing all method pairs (lib, clt), where lib is a public static method

Precision = 116 / 155 = 75%

Runtime = 396 seconds

Case Studies – Example of Static API

Conclusion

A common practice to employ 3rd party software libraries

Client code re-implements behavior of existing APIs

An algorithm based on data dependency graphs to detect complex imitations

Average precision: 82% (imported libraries) and 75% (static libraries)

Thank you.

Q&A
