Dataflow Analysis for Software Product Lines Feb, 2013 DAGSTUHL Intraprocedural Dataflow Analysis...

Post on 24-Dec-2015


Dataflow Analysis for Software Product Lines Feb, 2013 DAGSTUHL

Intraprocedural Dataflow Analysis for Software Product Lines

Claus Brabrand
IT University of Copenhagen, Universidade Federal de Pernambuco
[ brabrand@itu.dk ]

Márcio Ribeiro
Universidade Federal de Alagoas, Universidade Federal de Pernambuco
[ mmr3@cin.ufpe.br ]

Társis Tolêdo
Universidade Federal de Pernambuco
[ twt@cin.ufpe.br ]

Johnni Winther
Aarhus University
[ jw@cs.au.dk ]

Paulo Borba
Universidade Federal de Pernambuco
[ phmb@cin.ufpe.br ]

< Outline >

Introduction (Software Product Lines)

Dataflow Analyses for Software Product Lines:
  A0 (brute force): (feature in-sensitive) [product-based]
  A1 (consecutive): (feature sensitive) [family-based]
  A2 (simultaneous): (feature sensitive) [family-based]
  A3 (shared simul.): (feature sensitive) [family-based]

Results:
  A0 vs A1 vs A2 vs A3 (total time, incl. compilation)
  A1 vs A2 vs A3 (analysis time, excl. compilation)
  How to combine the analyses: A*

Conclusion(s)

Software Product Line

SPLs based on Conditional Compilation:

  #ifdef ( φ )
    ...
  #endif

  φ : f ∈ F | ¬φ | φ ∧ φ

Example (SPL fragment):

  Logo logo;
  ...
  #ifdef (VIDEO)
    logo = new Logo();
  #endif
  ...
  logo.use();

*** null-pointer exception! in configurations: {Ø, {COLOR}}

Similarly for, e.g.:
■ uninitialized vars
■ unused variables
■ ...

Analysis of SPLs

The Compilation Process:
  [diagram not preserved; labels: compile, run, result, ERROR!, ANALYZE!]

...and for Software Product Lines:
  [diagram not preserved; labels: generate, compile, run, result, ERROR!, ANALYZE!, 2^F]

Feature-sensitive data-flow analysis!

Dataflow Analysis

Dataflow Analysis:
  1) Control-flow graph
  2) Lattice (finite height)
  3) Transfer functions (monotone)

Example: "sign-of-x analysis"
  [diagram of the lattice L not preserved]

Analyzing a Program

  1) Program
  2) Build CFG
  3) Make Equations
  4) Solve equations: fixed-point computation (iteration)
  5) SOLUTION (least fixed point)

Annotated with program points
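The steps from program to least fixed point can be sketched as a small worklist solver. This is only an illustrative sketch, not the implementation behind the slides: the sign lattice is modelled as sets of signs (an assumption, since the lattice figure is not preserved), and the loop program, node numbers, and all function names are hypothetical.

```python
# Abstract value = set of possible signs of x; frozenset() plays the role of bottom.
BOTTOM = frozenset()

def pointwise(table):
    """Build a monotone transfer function from a per-sign table."""
    return lambda v: frozenset(t for s in v for t in table[s])

INC = pointwise({"-": "-0", "0": "+", "+": "+"})   # x++
ZERO = lambda v: frozenset("0")                    # x = 0
IDENT = lambda v: v                                # branch and exit nodes

def solve(cfg, transfer):
    """Least fixed point by worklist iteration; cfg maps node -> successors."""
    preds = {n: [] for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].append(n)
    out = {n: BOTTOM for n in cfg}
    worklist = list(cfg)
    while worklist:
        n = worklist.pop()
        # The join of the lattice is set union over all predecessors.
        joined = frozenset().union(*(out[p] for p in preds[n]))
        new = transfer[n](joined)
        if new != out[n]:
            out[n] = new
            worklist.extend(cfg[n])   # successors must be recomputed
    return out

# Hypothetical program:  0: x = 0;  1: while (...) {  2: x++; }  3: exit
cfg = {0: [1], 1: [2, 3], 2: [1], 3: []}
transfer = {0: ZERO, 1: IDENT, 2: INC, 3: IDENT}
solution = solve(cfg, transfer)
```

Termination relies on the two properties listed for dataflow analyses: the lattice has finite height and the transfer functions are monotone, so values can only grow a bounded number of times.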


A0

A0 (brute force):

  void m() {
    int x = 0;
    ifdef(A) x++;
    ifdef(B) x--;
  }

ψFM = A ∨ B

Each valid configuration c is turned into its own product, and each product is analyzed separately (sign of x after each statement; ⊥ at method entry):

                 c = {A}    c = {B}    c = {A,B}
  (entry)        ⊥          ⊥          ⊥
  int x = 0;     0          0          0
  x++;           +          (absent)   +
  x--;           (absent)   -          0/+

feature in-sensitive!

N = O(2^|F|) compilations!
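As a rough sketch of the brute-force strategy A0, the code below enumerates the valid configurations of the example, derives one product per configuration, and runs an ordinary (feature-insensitive) sign analysis on each. The set-of-signs representation and all names (`generate_product`, `analyze`, and so on) are assumptions made for illustration; real product generation would involve the preprocessor and compiler.

```python
from itertools import combinations

BOTTOM = frozenset()   # abstract value = set of possible signs of x

def pointwise(table):
    """Build a transfer function from a per-sign table."""
    return lambda v: frozenset(t for s in v for t in table[s])

TRANSFER = {
    "int x = 0;": lambda v: frozenset("0"),
    "x++;": pointwise({"-": "-0", "0": "+", "+": "+"}),
    "x--;": pointwise({"-": "-", "0": "-", "+": "0+"}),
}

FEATURES = ["A", "B"]
# The method m() as (guarding feature or None, statement) pairs.
SPL = [(None, "int x = 0;"), ("A", "x++;"), ("B", "x--;")]

def feature_model(c):
    return "A" in c or "B" in c            # psi_FM = A or B

def valid_configurations():
    subsets = (frozenset(s) for r in range(len(FEATURES) + 1)
               for s in combinations(FEATURES, r))
    return [c for c in subsets if feature_model(c)]

def generate_product(c):
    """Stand-in for preprocessing and compiling: keep only enabled statements."""
    return [stmt for guard, stmt in SPL if guard is None or guard in c]

def analyze(product):
    """Feature-insensitive analysis of one straight-line product."""
    v = BOTTOM
    for stmt in product:
        v = TRANSFER[stmt](v)
    return v

results_a0 = {c: analyze(generate_product(c)) for c in valid_configurations()}
```

The cost driver is visible in `valid_configurations`: the number of products, and hence of generate-and-analyze rounds, can grow exponentially with the number of features.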

A1

A1 (consecutive):

  void m() {
    int x = 0;
    ifdef(A) x++;
    ifdef(B) x--;
  }

ψFM = A ∨ B

The annotated method is analyzed once per valid configuration c, without generating products. A statement's transfer function is applied only where its feature is enabled (✓); otherwise the incoming value is passed on unchanged:

                 c = {A}    c = {B}    c = {A,B}
  (entry)        ⊥          ⊥          ⊥
  int x = 0;     0          0          0
  A: x++;        + ✓        0          + ✓
  B: x--;        +          - ✓        0/+ ✓

feature sensitive!
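A minimal sketch of the consecutive strategy A1, under the same assumptions as the slide example (sign of x, ψFM = A ∨ B): the transfer functions are lifted so that they act as the identity when the guarding feature is not in the configuration, and the unmodified method is then analyzed once per configuration. The helper names (`lifted`, `analyze_a1`) and the set-of-signs encoding are hypothetical.

```python
BOTTOM = frozenset()   # abstract value = set of possible signs of x

def pointwise(table):
    """Build a transfer function from a per-sign table."""
    return lambda v: frozenset(t for s in v for t in table[s])

ALWAYS = lambda c: True
# The method m() as (guard over configurations, transfer function) pairs.
PROGRAM = [
    (ALWAYS, lambda v: frozenset("0")),                                   # int x = 0;
    (lambda c: "A" in c, pointwise({"-": "-0", "0": "+", "+": "+"})),     # A: x++;
    (lambda c: "B" in c, pointwise({"-": "-", "0": "-", "+": "0+"})),     # B: x--;
]
VALID = [frozenset("A"), frozenset("B"), frozenset("AB")]   # [[A or B]]

def lifted(guard, f):
    """Feature-sensitive transfer function: apply f only if c enables the statement."""
    return lambda c, v: f(v) if guard(c) else v

def analyze_a1(program, configs):
    results = {}
    for c in configs:                      # one complete analysis per configuration
        v, trace = BOTTOM, []
        for guard, f in program:
            v = lifted(guard, f)(c, v)
            trace.append(v)                # value after each statement
        results[c] = trace
    return results

results_a1 = analyze_a1(PROGRAM, VALID)
```

Compared with brute force, nothing is preprocessed or recompiled per configuration; only the analysis itself is repeated.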

A2

A2 (simultaneous):

  void m() {
    int x = 0;
    ifdef(A) x++;
    ifdef(B) x--;
  }

ψFM = A ∨ B

∀c ∈ {{A},{B},{A,B}}:

  (entry)        ( {A} = ⊥ ,   {B} = ⊥ ,   {A,B} = ⊥ )
  int x = 0;     ( {A} = 0 ✓ , {B} = 0 ✓ , {A,B} = 0 ✓ )
  A: x++;        ( {A} = + ✓ , {B} = 0 ,   {A,B} = + ✓ )
  B: x--;        ( {A} = + ,   {B} = - ✓ , {A,B} = 0/+ ✓ )

feature sensitive!
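The simultaneous strategy A2 can be sketched by lifting the lattice itself: a value is now a map from each valid configuration to a sign value, and the method is traversed only once. This is an illustrative sketch for the slide's example; the dictionary encoding and the name `analyze_a2` are assumptions, not the authors' implementation.

```python
BOTTOM = frozenset()   # abstract value = set of possible signs of x

def pointwise(table):
    """Build a transfer function from a per-sign table."""
    return lambda v: frozenset(t for s in v for t in table[s])

ALWAYS = lambda c: True
PROGRAM = [
    (ALWAYS, lambda v: frozenset("0")),                                   # int x = 0;
    (lambda c: "A" in c, pointwise({"-": "-0", "0": "+", "+": "+"})),     # A: x++;
    (lambda c: "B" in c, pointwise({"-": "-", "0": "-", "+": "0+"})),     # B: x--;
]
VALID = [frozenset("A"), frozenset("B"), frozenset("AB")]   # [[A or B]]

def analyze_a2(program, configs):
    # Lifted lattice element: configuration -> sign value.
    value = {c: BOTTOM for c in configs}
    trace = []
    for guard, f in program:               # a single pass over the method
        value = {c: (f(v) if guard(c) else v) for c, v in value.items()}
        trace.append(value)                # tuple of values after each statement
    return trace

trace_a2 = analyze_a2(PROGRAM, VALID)
```

The per-configuration results are the same as in the consecutive strategy; only the iteration order changes from "N passes over one value" to "one pass over N values".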

A3

A3 (shared):

  void m() {
    int x = 0;
    ifdef(A) x++;
    ifdef(B) x--;
  }

ψFM = A ∨ B

  (entry)        ( [[ψ]] = ⊥ )
  int x = 0;     ( [[ψ]] = 0 )
  A: x++;        ( [[ψ ∧ ¬A]] = 0 , [[ψ ∧ A]] = + )
  B: x--;        ( [[ψ ∧ ¬A ∧ ¬B]] = 0 , [[ψ ∧ A ∧ ¬B]] = + , [[ψ ∧ ¬A ∧ B]] = - , [[ψ ∧ A ∧ B]] = 0/+ )

(A ∨ B) ∧ ¬A ∧ ¬B ≡ false
i.e., invalid wrt. the feature model, ψ !

can use BDD representation! (compact + efficient)
(although our evaluation: bit vector representation)

feature sensitive!
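A hedged sketch of the shared strategy A3 on the same example: instead of one entry per configuration, a value maps a group of configurations (standing in for a formula such as ψ ∧ A) to a single shared sign value, and a group is split only when a statement distinguishes its members. Here a formula is represented by the explicit set of valid configurations satisfying it, which is an assumption made for brevity; the slides mention BDDs and bit vectors for this role. All names are hypothetical.

```python
BOTTOM = frozenset()   # abstract value = set of possible signs of x

def pointwise(table):
    """Build a transfer function from a per-sign table."""
    return lambda v: frozenset(t for s in v for t in table[s])

ALWAYS = lambda c: True
PROGRAM = [
    (ALWAYS, lambda v: frozenset("0")),                                   # int x = 0;
    (lambda c: "A" in c, pointwise({"-": "-0", "0": "+", "+": "+"})),     # A: x++;
    (lambda c: "B" in c, pointwise({"-": "-", "0": "-", "+": "0+"})),     # B: x--;
]
VALID = [frozenset("A"), frozenset("B"), frozenset("AB")]   # [[psi]], psi = A or B

def analyze_a3(program, configs):
    # Initially one entry: all valid configurations share the value bottom.
    value = {frozenset(configs): BOTTOM}
    trace = []
    for guard, f in program:
        new_value = {}
        for group, v in value.items():
            enabled = frozenset(c for c in group if guard(c))
            disabled = group - enabled
            if enabled:                    # part of the group that executes the statement
                new_value[enabled] = f(v)
            if disabled:                   # part of the group that skips it
                new_value[disabled] = v
            # An empty part corresponds to a formula that is false under psi; it is dropped.
        value = new_value
        trace.append(value)
    return trace

trace_a3 = analyze_a3(PROGRAM, VALID)
```

In this run the value has one entry after `int x = 0;`, two after `A: x++;`, and three after `B: x--;`, because the fourth combination ¬A ∧ ¬B is ruled out by the feature model.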

Summary

Analyzing program:

  void m() {
    int x = 0;
    ifdef(A) x++;
    ifdef(B) x--;
  }

ψFM = A ∨ B

[figure not preserved: the results of A0, A1, A2, and A3 side by side]


Intraprocedural Evaluation

Five (qualitatively different) SPL benchmarks:
  [table not preserved]

Intraprocedural implementation based on SOOT and CIDE

Total Time (incl. compile)

Tasks:
  [figure not preserved]

In practice (Reaching Definitions):
  [chart not preserved; labels on the chart: 4x, 7x, 3x, 1x, 1x]

Feature sensitive (A1, A2, and A3) all faster than A0 (no re-compile!)

Analysis Time (excl. compile)

Tasks:
  [figure not preserved]

In practice (Reaching Definitions):
  [chart not preserved]
  A2 faster than A1 (caching!)
  A3 faster than A2 (sharing!)

Beyond the Sum of all Methods

For a method with N valid configurations, which of the analyses A1, A2, and A3 is fastest?
  [table not preserved]

Statistically significant differences between A1, A2, and A3 for all N, except between A2 and A3 for N=4 (underlined in the table).

Combo Analysis Strategy: A*

Intraprocedurally combined analysis strategy, A*:
  [figure not preserved]

A* consistently fastest (combo!)


Overview

(intra-procedural), each step FASTER than the one before:
  A0 (brute force)
  A1 (consecutive)     no re-compile!
  A2 (simultaneous)    caching!
  A3 (shared)          sharing!
  A* (combo)           combo!

"SPLLIFT: Transparent and Efficient Reuse of IFDS-based Static Program Analyses for Software Product Lines" ( Bodden, Ribeiro, Tolêdo, Brabrand, Borba, Mezini ) PLDI 2013:
  IFDS ➞ IDE (lift)
  IFDS (graph repr)
  A3+BDD (esp. inter-procedural)
  graph encoding!
  repr!

Friday


Conclusion(s)

It is possible to analyze SPLs using DFAs

We can automatically "lift" any dataflow analysis and make it feature sensitive:

A1, A2, A3 are all faster than A0 (no re-compile!)

A2 is faster than A1 (caching!)

A3 is faster than A2 (sharing!)

A* is fastest (combo!)

A3 saves lots of memory vs A2 (sharing!)

A1 (consecutive) ➞ A2 (simultaneous) ➞A3 (shared) ➞ A* (combined)


< Obrigado* >

*) Thanks

A0 vs IFDS and A2 vs SPLLIFT

A0:
  λS . (S – {x}) ∪ {y}
  {x}  ➞  {y}

IFDS:
  [graph not preserved: the same transfer function drawn as edges over the nodes 0, x, y]

A2:
  A:  λS . (S – {x}) ∪ {y}
  ( {A} = {x} , {B} = {x} , {A,B} = {x,y} )  ➞  ( {A} = {y} , {B} = {x} , {A,B} = {y} )

SPLLIFT (IFDS ➞ IDE):
  #ifdef (A)
  [graph not preserved: edges over the nodes 0, x, y, labeled true, A, ¬A, ¬A]

LIFT:
  before:   0: true     x: true     y: A ∧ B
  after:    0: true
            x: true ∧ ¬A = ¬A
            y: [ (A ∧ B) ∧ ¬A ] ∨ [ true ∧ A ] = A
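The agreement between the A2 tuple and the SPLLIFT-style constraints on this slide can be checked mechanically. The sketch below assumes the reading that, before the statement, x holds under `true` and y holds under `A ∧ B`, and that the statement guarded by A has the transfer function λS . (S – {x}) ∪ {y}. The variable names are hypothetical, and constraints are modelled as plain Python predicates rather than BDDs.

```python
CONFIGS = [frozenset("A"), frozenset("B"), frozenset("AB")]

# A2 view: one fact set per configuration, before the statement guarded by A.
before = {frozenset("A"): {"x"}, frozenset("B"): {"x"}, frozenset("AB"): {"x", "y"}}
f = lambda S: (S - {"x"}) | {"y"}                     # transfer function of the statement
after_a2 = {c: (f(S) if "A" in c else S) for c, S in before.items()}

# Constraint view: one feature constraint per fact, before the statement.
x_before = lambda c: True                              # x holds under `true`
y_before = lambda c: "A" in c and "B" in c             # y holds under A and B

# x survives only where the statement is disabled: true and not A.
x_after = lambda c: x_before(c) and "A" not in c
# y survives where disabled, or is generated where enabled:
# [(A and B) and not A] or [true and A], which simplifies to A.
y_after = lambda c: (y_before(c) and "A" not in c) or (True and "A" in c)

after_lift = {
    c: {name for name, holds in (("x", x_after), ("y", y_after)) if holds(c)}
    for c in CONFIGS
}
```

Both views yield the same facts for every valid configuration, which is the correspondence the slide illustrates.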

INTRO:

Software Product Lines
Dataflow Analysis

Abstract

Software product lines (SPLs) developed using annotative approaches such as conditional compilation come with an inherent risk of constructing erroneous products. For this reason, it is essential to be able to analyze such SPLs. However, as dataflow analysis techniques are not able to deal with SPLs, developers must generate and analyze all valid products individually, which is expensive for non-trivial SPLs.

In this paper, we demonstrate how to take any standard intraprocedural dataflow analysis and automatically turn it into a feature-sensitive dataflow analysis in five different ways where the last is a combination of the other four. All analyses are capable of analyzing all valid products of an SPL without having to generate all of them explicitly.

We have implemented all analyses using SOOT’s intraprocedural dataflow analysis framework and experimentally evaluated four of them according to their performance and memory characteristics on five qualitatively different SPLs. On our benchmarks, the combined analysis strategy is up to almost eight times faster than the brute-force approach.

< Outline >

Introduction
  Software Product Lines
  Dataflow Analysis (recap)

Dataflow Analyses for Software Product Lines:
  feature in-sensitive (A0) vs feature sensitive (A1, A2, A3)

Results:
  A0 vs A1 vs A2 vs A3 (in theory and practice)

Related Work

Conclusion

Introduction

Traditional Software Development: One program = One product
  [figure not preserved: 1x CAR, 1x CELL PHONE, 1x APPLICATION]

Product Line: A "family" of products (of N "similar" products):
  [figure not preserved: CARS, CELL PHONES, APPLICATIONS, obtained via "customize"]

SPL: (Family of Programs)


Software Product Line

SPL:

Feature Model: (e.g.: ψFM ≡ VIDEO COLOR)

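Set of Features: F = { COLOR, VIDEO }

Configurations (2^F): Ø, {Color}, {Video}, {Color,Video}
  (VALID configurations are those that satisfy the feature model)

Family of Programs:
  [figure not preserved: the SPL is customized into one program per configuration, shown for Ø, { Color }, { Video }, and { Color, Video }]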

Software Product Line

SPL: Family of Programs

Conditional compilation:

  #ifdef ( φ )
    ...
  #endif

  φ : f ∈ F | ¬φ | φ ∧ φ

Alternatively, via Aspects (as in AOSD)





Related Work:



Related Work (DFA)

Path-sensitive DFA:

Idea of “conditionally executed statements”

Compute different analysis info along different paths (~ A1, A2, A3) to improve precision or to optimize “hot paths”

Predicated DFA:

Guard lattice values by propositional logic predicates (~ A3), yielding “optimistic dataflow values” that are kept distinct during analysis (~ A2 and A3)

“Constant Propagation with Conditional Branches”( Wegman and Zadeck ) TOPLAS 1991

“Predicated Array Data-Flow Analysis for Run-time Parallelization”( Moon, Hall, and Murphy ) ICS 1998

Our work: Automatically lift any DFA to SPLs (with ψFM) ⇒feature-sensitive analysis for analyzing entire program family

Related Work (Lifting for SPLs)

Model Checking:
  “Model Checking Lots of Systems: Efficient Verification of Temporal Properties in Software Product Lines” ( Classen, Heymans, Schobbens, Legay, and Raskin ) ICSE 2010
  Model checks all products of an SPL at the same time (3.5x faster than one by one!) (similar goal, diff techniques)

Type Checking:
  “Type Safety for Feature-Oriented Product Lines” ( Apel, Kästner, Grösslinger, and Lengauer ) ASE 2010
  “Type-Checking Software Product Lines - A Formal Approach” ( Kästner and Apel ) ASE 2008
  Type checking ↔ DFA (similar goal, diff techniques)
  Our: auto lift any DFA (uninit vars, null pointers, ...)

Parsing:
  “Variability-Aware Parsing in the Presence of Lexical Macros & C.C.” ( Kästner, Giarrusso, Rendel, Erdweg, Ostermann, and Berger ) OOPSLA 2011
  (similar techniques, diff goal): Split and merging parsing (~A3) and also uses instrumentation

Testing:
  “Reducing Combinatorics in Testing Product Lines” ( Kim, Batory, and Khurshid ) AOSD 2011
  Select relevant feature combinations for a given test case
  Uses (hardwired) DFA (w/o FM) to compute reachability

Emerging Interfaces

"A Tool for Improving Maintainability of Preprocessor-based Product Lines" ( Márcio Ribeiro, Társis Tolêdo, Paulo Borba, Claus Brabrand ) CBSoft 2011
*** Best Tool Award ***


BONUS SLIDES

Specification: A0, A1, A2, A3

[formal specifications of A0, A1, A2, and A3 not preserved]

Analysis Time (excl. compile)

In theory:
  [table not preserved]

TIME(A3): Depends on degree of sharing in SPL!

In practice (Reaching Definitions):
  [chart not preserved]
  A2 faster than A1 (caching!)
  A3 faster than A2 (sharing!)

Memory Usage

In theory:
  [table not preserved]

In practice (Reaching Definitions):
  [chart not preserved]

SPACE(A3): Depends on degree of sharing in SPL!

Analysis Time (excl. compile)

In practice (Reaching Definitions):
  [chart not preserved]

Nx1 ≠ 1xN ?!

Caching! A2 faster than A1


Caching (A1 vs A2)

Cache misses (A1 vs A2):

Cache enabled:This is the "normal condition" (for reference)

Cache disabled*:As hypothesized, this indeed affects A1 more than A2

i.e., A2 has better cache properties than A1

*) we flush the L2 cache, by traversing an 8MB “bogus array” to invalidate cache!

IFDEF normalization

Refactor "undisciplined" (lexical) ifdefs into "disciplined" (syntactic) ifdefs:
  [example not preserved]

Normalize "ifdef"s (by transformation):
  [example not preserved]

Lexical #ifdef ➞ Syntactic ifdef

Simple transformation:
  [example not preserved]

We do not handle non-syntactic '#ifdef's:
  [example not preserved]
  Fair assumption (also in CIDE)

Nested ifdef's also give rise to a conjunction of formulas

BDD (Binary Decision Diagram)

Compact and efficient representation for boolean functions (aka., set of set of names)

FAST: negation, conjunction, disjunction, equality!

Example: F(A,B,C) = A ∧ (B ∨ C)
  [diagrams not preserved: the BDD over A, B, C and the corresponding minimized BDD]

Formula ~ Set of Configurations

Definitions (given F, set of feature names):
  f ∈ F          feature name
  c ∈ 2^F        configuration (set of feature names), c ⊆ F
  X ∈ 2^(2^F)    set of config's (set of set of feature names), X ⊆ 2^F

Example ifdefs:
  F = {A,B}:      [[ B ∨ A ]] = { {A}, {B}, {A,B} }
  F = {A,B,C}:    [[ A ∧ (B ∨ C) ]] = { {A,B}, {A,C}, {A,B,C} }
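The correspondence between a formula and its set of configurations can be made concrete by enumeration. The following is a minimal sketch in which a formula is modelled as a Python predicate over a configuration; the function names `powerset` and `semantics` are hypothetical, and explicit enumeration is only practical for small feature sets.

```python
from itertools import combinations

def powerset(features):
    """All configurations 2^F, each as a frozenset of feature names."""
    fs = list(features)
    return [frozenset(s) for r in range(len(fs) + 1) for s in combinations(fs, r)]

def semantics(phi, features):
    """[[phi]]: the set of configurations over F that satisfy phi."""
    return {c for c in powerset(features) if phi(c)}

# The two example ifdef formulas from the slide.
a_or_b = semantics(lambda c: "A" in c or "B" in c, "AB")
a_and_b_or_c = semantics(lambda c: "A" in c and ("B" in c or "C" in c), "ABC")
```

With this view, conjunction of formulas corresponds to intersection of configuration sets and negation to complement within 2^F.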

Feature Model (Example)

Feature Model:
  [feature diagram not preserved; node labels: Engine, 1.0, 1.4, Air]

Feature set:
  F = {Car, Engine, 1.0, 1.4, Air}

Formula:
  ψFM ≡ Car ∧ Engine ∧ (1.0 ⊕ 1.4) ∧ (Air ⇒ 1.4)

Set of configurations:
  [[ψFM]] = { {Car, Engine, 1.0}, {Car, Engine, 1.4}, {Car, Engine, 1.4, Air} }

Note: | [[ψFM]] | = 3 < 32 = |2^F|
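The count on this slide can be reproduced by brute-force enumeration. The sketch below assumes the feature model reads as Car ∧ Engine ∧ (exactly one of 1.0 and 1.4) ∧ (Air implies 1.4); this reading is chosen because it yields exactly the three configurations listed, and the name `psi_fm` is hypothetical.

```python
from itertools import combinations

F = ["Car", "Engine", "1.0", "1.4", "Air"]

def psi_fm(c):
    """Assumed feature model for the car example."""
    return ("Car" in c and "Engine" in c
            and (("1.0" in c) != ("1.4" in c))       # exactly one engine size
            and ("Air" not in c or "1.4" in c))      # Air requires the 1.4 engine

# All 2^|F| configurations, then the valid ones.
all_configs = [frozenset(s) for r in range(len(F) + 1) for s in combinations(F, r)]
valid = {c for c in all_configs if psi_fm(c)}
```

Only 3 of the 32 possible feature combinations are valid here, which is why the feature-sensitive analyses restrict themselves to the valid configurations.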

Conditional Compilation

The 'ifdef' construction:
  STM : 'ifdef' '(' φ ')' STM
  Syntactic variant of lexical #ifdef

Propositional Logic:
  φ : f ∈ F | ¬φ | φ ∧ φ
  where f ∈ F (finite set of feature names)

Example:
  status.print("you die");
  ifdef (DeluxeVersion && ColorDisplay) {
    player.redraw(Color.red);
    Audio.play("crash.wav");
  }
  lives = lives - 1;

[figure not preserved: ifdef (A) { ... } shown with the label A]

A4: Lazy Splitting (using BDDs)

For a statement φ: S with transfer function fS, each entry ψ = l is handled by one of three cases, where l' = fS(l):

CASE 1: "COPY"    (when [[ψ ∧ φ]] = Ø)
  [ ψ = l , ... ]  ➞  [ ψ = l , ... ]

CASE 2: "APPLY"   (when [[ψ ∧ ¬φ]] = Ø)
  [ ψ = l , ... ]  ➞  [ ψ = l' , ... ]

CASE 3: "SPLIT"   (otherwise)
  [ ψ = l , ... ]  ➞  [ ψ ∧ ¬φ = l , ψ ∧ φ = l' , ... ]

A1, A2, A3, and A4

[figure not preserved: A1, A2, A3, and A4 side by side]