Spring 2014 Program Analysis and Verification Lecture 10: Abstract Interpretation II

Spring 2014 Program Analysis and Verification Lecture 10: Abstract Interpretation II Roman Manevich Ben-Gurion University


Spring 2014 Program Analysis and Verification

Lecture 10: Abstract Interpretation II

Roman Manevich, Ben-Gurion University

Syllabus
• Semantics
– Natural Semantics
– Structural semantics
– Axiomatic Verification
• Static Analysis
– Automating Hoare Logic
– Control Flow Graphs
– Equation Systems
– Collecting Semantics
• Abstract Interpretation fundamentals
– Lattices
– Fixed-Points
– Chaotic Iteration
– Galois Connections
– Widening/Narrowing
– Domain constructors
• Analysis Techniques
– Interprocedural Analysis
– Numerical Domains
– CEGAR
– Alias analysis
– Shape Analysis
• Crafting your own
– Soot
– From proofs to abstractions
– Systematically developing transformers

Previously
• Semantic domains
– Preorders
– Partial orders (posets)
– Pointed posets
– Ascending/descending chains
– The height of a poset
– Join and Meet operators
– Complete lattices
• Constructing new lattices from old
• Abstract Interpretation package: domains


Abstract domain types

A taxonomy of semantic domain types
• Complete lattice $(D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$: join/meet exist for every subset of $D$
• Lattice $(D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$: join/meet exist for every finite subset of $D$ (alternatively, binary join/meet)
• Join semilattice $(D, \sqsubseteq, \sqcup, \bot)$, where $\bot$ is the join of the empty set
• Meet semilattice $(D, \sqsubseteq, \sqcap, \top)$, where $\top$ is the meet of the empty set
• Complete partial order (CPO) $(D, \sqsubseteq, \bot)$: poset with LUB for all ascending chains
• Partial order (poset) $(D, \sqsubseteq)$: reflexive, transitive, and anti-symmetric ($d \sqsubseteq d'$ and $d' \sqsubseteq d$ implies $d = d'$)
• Preorder $(D, \sqsubseteq)$: reflexive ($d \sqsubseteq d$) and transitive ($d \sqsubseteq d'$ and $d' \sqsubseteq d''$ implies $d \sqsubseteq d''$)


Composing domains

Cartesian product of complete lattices
• For two complete lattices
$L_1 = (D_1, \sqsubseteq_1, \sqcup_1, \sqcap_1, \bot_1, \top_1)$
$L_2 = (D_2, \sqsubseteq_2, \sqcup_2, \sqcap_2, \bot_2, \top_2)$
• Define the poset $L_{cart} = (D_1 \times D_2, \sqsubseteq_{cart}, \sqcup_{cart}, \sqcap_{cart}, \bot_{cart}, \top_{cart})$ as follows:
– $(x_1, x_2) \sqsubseteq_{cart} (y_1, y_2)$ iff $x_1 \sqsubseteq_1 y_1$ and $x_2 \sqsubseteq_2 y_2$
– $\sqcup_{cart} = ?$ $\sqcap_{cart} = ?$ $\bot_{cart} = ?$ $\top_{cart} = ?$
• Lemma: $L_{cart}$ is a complete lattice
• Define the Cartesian constructor $L_{cart} = Cart(L_1, L_2)$
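One standard way to fill in the question marks is to define every operation componentwise. The sketch below illustrates this in Python; the `Lattice` record, `powerset_lattice`, and `cart` are hypothetical names chosen for illustration (they are not taken from the course package), and only binary join/meet are modeled, not joins of arbitrary subsets.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Lattice:
    """A lattice given by its order, binary join/meet, and extreme elements."""
    leq: Callable[[Any, Any], bool]
    join: Callable[[Any, Any], Any]
    meet: Callable[[Any, Any], Any]
    bot: Any
    top: Any


def powerset_lattice(universe) -> Lattice:
    """Subsets of a finite universe ordered by inclusion."""
    return Lattice(
        leq=lambda a, b: a <= b,
        join=lambda a, b: a | b,
        meet=lambda a, b: a & b,
        bot=frozenset(),
        top=frozenset(universe),
    )


def cart(l1: Lattice, l2: Lattice) -> Lattice:
    """Cartesian product: every operation acts componentwise on pairs."""
    return Lattice(
        leq=lambda a, b: l1.leq(a[0], b[0]) and l2.leq(a[1], b[1]),
        join=lambda a, b: (l1.join(a[0], b[0]), l2.join(a[1], b[1])),
        meet=lambda a, b: (l1.meet(a[0], b[0]), l2.meet(a[1], b[1])),
        bot=(l1.bot, l2.bot),
        top=(l1.top, l2.top),
    )


# Two small example lattices (illustrative only).
signs = powerset_lattice({"neg", "zero", "pos"})
parity = powerset_lattice({"even", "odd"})
product = cart(signs, parity)

a = (frozenset({"pos"}), frozenset({"even"}))
b = (frozenset({"zero"}), frozenset({"even", "odd"}))
joined = product.join(a, b)
met = product.meet(a, b)
```

Here `joined` combines the first components with the join of `signs` and the second components with the join of `parity`, which is the componentwise definition the lemma relies on.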

Disjunctive completion
• For a complete lattice $L = (D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$
• Define the powerset lattice $L_\vee = (2^D, \sqsubseteq_\vee, \sqcup_\vee, \sqcap_\vee, \bot_\vee, \top_\vee)$
– $\sqsubseteq_\vee = ?$ $\sqcup_\vee = ?$ $\sqcap_\vee = ?$ $\bot_\vee = ?$ $\top_\vee = ?$
• Lemma: $L_\vee$ is a complete lattice
• $L_\vee$ contains all subsets of $D$, which can be thought of as disjunctions of the corresponding predicates
• Define the disjunctive completion constructor $L_\vee = Disj(L)$

Relational product of lattices
• $L_1 = (D_1, \sqsubseteq_1, \sqcup_1, \sqcap_1, \bot_1, \top_1)$
$L_2 = (D_2, \sqsubseteq_2, \sqcup_2, \sqcap_2, \bot_2, \top_2)$
• Define $L_{rel} = (2^{D_1 \times D_2}, \sqsubseteq_{rel}, \sqcup_{rel}, \sqcap_{rel}, \bot_{rel}, \top_{rel})$ as follows:
– $L_{rel} = Disj(Cart(L_1, L_2))$
• Lemma: $L_{rel}$ is a complete lattice

Finite maps
• For a complete lattice $L = (D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$ and finite set $V$
• Define the poset $L_{V \to L} = (V \to D, \sqsubseteq_{V \to L}, \sqcup_{V \to L}, \sqcap_{V \to L}, \bot_{V \to L}, \top_{V \to L})$ as follows:
– $f_1 \sqsubseteq_{V \to L} f_2$ iff for all $v \in V$: $f_1(v) \sqsubseteq f_2(v)$
– $\sqcup_{V \to L} = ?$ $\sqcap_{V \to L} = ?$ $\bot_{V \to L} = ?$ $\top_{V \to L} = ?$
• Lemma: $L_{V \to L}$ is a complete lattice
• Define the map constructor $L_{V \to L} = Map(V, L)$
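As with the Cartesian product, the usual answer is to define the operations pointwise. The following Python sketch represents a finite map as a `dict` and uses sets of states as the underlying lattice; the helper names (`map_bottom`, `map_leq`, `map_join`) and the node names are made up for this illustration.

```python
def map_bottom(keys, bot):
    """The least map: every key is mapped to the bottom element."""
    return {v: bot for v in keys}


def map_leq(leq, f1, f2):
    """Pointwise order: f1 is below f2 iff f1(v) is below f2(v) for every v."""
    return all(leq(f1[v], f2[v]) for v in f1)


def map_join(join, f1, f2):
    """Pointwise join of two maps that share the same key set."""
    return {v: join(f1[v], f2[v]) for v in f1}


# Underlying lattice: sets of integers ordered by inclusion.
subset = lambda a, b: a <= b
union = lambda a, b: a | b

nodes = ["entry", "loop", "exit"]  # hypothetical CFG node names
f1 = {"entry": frozenset({0}), "loop": frozenset(), "exit": frozenset()}
f2 = {"entry": frozenset({0, 1}), "loop": frozenset({1}), "exit": frozenset()}

lowest = map_bottom(nodes, frozenset())
combined = map_join(union, f1, f2)
```

Because `f1` is pointwise below `f2`, their join is `f2` itself, and `lowest` is below both.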

The collecting lattice
• Lattice for a given control-flow node $v$: $L_v = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$
• Lattice for entire control-flow graph with nodes $V$: $L_{CFG} = Map(V, L_v)$
• We will use this lattice as a baseline for static analysis and define abstractions of its elements


Implementation

Software package: paver142
• Built on top of the Soot compiler framework for Java
• Download from web-site
– Includes all necessary Soot jar files

• Infrastructure for implementing static analysis
• Example analyses
• Soot-specific utilities


Existing analyses

Implementing abstract domains


Variable equalities analysis

Today
• Solving monotone systems
• Fixed-points
• Vanilla static analysis algorithm
• Chaotic iteration

Abstract interpretation via abstraction
[Diagram: the collecting semantics of statement S maps a set of states to a set of states; the abstract semantics of statement S maps an abstract representation of sets of states to an abstract representation of sets of states; abstraction arrows lead from each set of states to its abstract representation.]
• Generalizes axiomatic verification: {P} S {Q}, sp(S, P)

Abstract interpretation via concretization
[Diagram: the collecting semantics of statement S maps a set of states to a set of states; the abstract semantics of statement S maps an abstract representation of sets of states to an abstract representation of sets of states; concretization arrows lead from each abstract representation to the set of states it denotes.]
• For {P} S {Q}, the sets of states in the diagram are models(P), models(sp(S, P)), and models(Q)

Missing knowledge
• Collecting semantics
• Abstract semantics
• Connection between collecting semantics and abstract semantics
• Algorithm to compute abstract semantics


Review of collecting semantics

The collecting lattice (sets of states)
• Lattice for a given control-flow node $v$: $L_v = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$
• Lattice for entire control-flow graph with nodes $V$: $L_{CFG} = Map(V, L_v)$
• We will use this lattice as a baseline for static analysis and define abstractions of its elements

Collecting semantics as equation system
• A vector of variables R[0, 1, 2, 3, 4]
• A (recursive) system of equations:
$R[0] = \{x \in \mathbb{Z}\}$ // established input
$R[1] = R[0] \cup R[4]$
$R[2] = R[1] \cap \{s \mid s(x) > 0\}$
$R[3] = R[1] \cap \{s \mid s(x) \leq 0\}$
$R[4] = \llbracket x := x-1 \rrbracket\, R[2]$
• Intersecting with $\{s \mid s(x) > 0\}$ is the semantic function for assume x>0
• $\llbracket x := x-1 \rrbracket$ is the semantic function for x:=x-1 lifted to sets of states
[CFG: entry (R[0]) → if x > 0 (R[1]); true branch (R[2]) → x := x-1 (R[4]) → back to the test; false branch (R[3]) → exit]

General definition
• A vector of variables R[0, …, k], one per input/output of a node
– R[0] is for entry
• For node n with multiple predecessors add equation $R[n] = \bigcup \{R[k] \mid k \text{ is a predecessor of } n\}$
• For an atomic operation node $R[m] \xrightarrow{S} R[n]$ add equation $R[n] = \llbracket S \rrbracket\, R[m]$
• Transform if b then S1 else S2 to (assume b; S1) or (assume $\neg$b; S2)
[CFG: entry (R[0]) → if x > 0 (R[1]); true branch (R[2]) → x := x-1 (R[4]) → back to the test; false branch (R[3]) → exit]
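The rules above can be sketched as a small Python function that turns a list of CFG edges into the function F of the system R = F(R). This is only an illustration under simplifying assumptions: a state is just the integer value of x, the finite set {-1, 0, 1, 2} stands in for the input {x ∈ Z}, and the names `assume`, `assign`, `skip`, and `build_system` are hypothetical.

```python
def assume(pred):
    """Semantic function of `assume pred`: keep a state only if pred holds."""
    return lambda s: [s] if pred(s) else []


def assign(update):
    """Semantic function of an assignment: map a state to its successor."""
    return lambda s: [update(s)]


skip = assign(lambda s: s)  # edge without an operation (used at join points)


def build_system(entry_states, edges, nodes):
    """Return F such that a solution of the equation system satisfies R == F(R)."""
    entry = frozenset(entry_states)

    def F(R):
        result = {}
        for n in nodes:
            if n == 0:  # R[0] is the established input
                result[n] = entry
                continue
            # Lift each edge's semantic function to sets of states, then
            # take the union over all predecessors of n.
            incoming = [
                frozenset(t for s in R[m] for t in step(s))
                for (m, step, target) in edges
                if target == n
            ]
            result[n] = frozenset().union(*incoming)
        return result

    return F


# Edges (source variable, semantic function, target variable) of the loop example.
edges = [
    (0, skip, 1),
    (4, skip, 1),
    (1, assume(lambda x: x > 0), 2),
    (1, assume(lambda x: x <= 0), 3),
    (2, assign(lambda x: x - 1), 4),
]
F = build_system({-1, 0, 1, 2}, edges, nodes=range(5))

candidate = {
    0: frozenset({-1, 0, 1, 2}),
    1: frozenset({-1, 0, 1, 2}),
    2: frozenset({1, 2}),
    3: frozenset({-1, 0}),
    4: frozenset({0, 1}),
}
is_solution = F(candidate) == candidate
```

The `candidate` vector satisfies every equation, so applying `F` to it returns the same vector.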

Static analysis
• Given a system of equations for the collecting semantics:
$R[0] = \{x \in \mathbb{Z}\}$ // established input
$R[1] = R[0] \cup R[4]$
$R[2] = \llbracket \text{assume } x > 0 \rrbracket\, R[1]$
$R[3] = \llbracket \text{assume } x \leq 0 \rrbracket\, R[1]$
$R[4] = \llbracket x := x-1 \rrbracket\, R[2]$
• A static analysis solves a corresponding system of equations over an abstract domain:
$R[0]^\# = \{x \in \mathbb{Z}\}^\#$
$R[1]^\# = R[0]^\# \sqcup R[4]^\#$
$R[2]^\# = \llbracket \text{assume } x > 0 \rrbracket^\#\, R[1]^\#$
$R[3]^\# = \llbracket \text{assume } x \leq 0 \rrbracket^\#\, R[1]^\#$
$R[4]^\# = \llbracket x := x-1 \rrbracket^\#\, R[2]^\#$
• Questions:
– What is the relation between the solutions? (Next lecture)
– How do you solve the second system? (This lecture)


Solving equation systems

Equation systems in general
• Let $L$ be a complete lattice $(D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$
• Let $R$ be a vector of analysis variables $R[0, …, n] \in D \times … \times D$
• Let $F$ be a vector of functions of the type $F[i] : D \times … \times D \to D$
• A system of equations
$R[0] = F[0](R[0], …, R[n])$
…
$R[n] = F[n](R[0], …, R[n])$
• In vector notation $R = F(R)$
• Questions:
1. Does a solution always exist?
2. If so, is it unique?
3. If so, is it computable?
• For $R[i] = F[i](R)$: usually $F[i]$ reads only a small subset of $R$, denoted D[i]. We say that R[i] depends on D[i]
• Example:
$R[0] = \{x \in \mathbb{Z}\}$ // established input
$R[1] = R[0] \cup R[4]$
$R[2] = R[1] \cap \{s \mid s(x) > 0\}$
$R[3] = R[1] \cap \{s \mid s(x) \leq 0\}$
$R[4] = \llbracket x := x-1 \rrbracket\, R[2]$

• If a solution exists, it is a fixed point of $F$, that is, a vector $R$ with $R = F(R)$


Monotone systems

Monotone functions
• Let $L_1 = (D_1, \sqsubseteq)$ and $L_2 = (D_2, \sqsubseteq)$ be two posets
• A function $f : D_1 \to D_2$ is monotone if for every pair $x, y \in D_1$: $x \sqsubseteq y$ implies $f(x) \sqsubseteq f(y)$
• A special case: $L_1 = L_2 = (D, \sqsubseteq)$ and $f : D \to D$

Monotone function
[Diagram: f maps the elements x and y of L1 to f(x) and f(y) in L2, preserving their order.]

Important cases of monotonicity
• Join: $f(X, Y) = X \sqcup Y$ is monotone in each operand
– Prove it!
• Set lifting function: for a set $X$ and any function $g$, $F(X) = \{ g(x) \mid x \in X \}$ is monotone w.r.t. $\subseteq$
– Prove it!
• Notice that the collecting semantics function is defined in terms of
– Join (set union)
– Semantic function for atomic statements lifted to sets of states
• Conclusion: collecting semantics function is monotone
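A brute-force check does not replace the proofs requested above, but it can make the two claims concrete on a small example. The Python sketch below enumerates all subsets of {0, 1, 2} and tests monotonicity with respect to inclusion; `powerset`, `is_monotone`, and `lift` are names invented for this illustration.

```python
from itertools import combinations


def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_monotone(f, domain, leq_in, leq_out):
    """Check x <= y implies f(x) <= f(y) for all pairs in a finite domain."""
    return all(leq_out(f(x), f(y))
               for x in domain for y in domain if leq_in(x, y))


def lift(g):
    """Lift a function on elements to a function on sets of elements."""
    return lambda X: frozenset(g(x) for x in X)


subset = lambda a, b: a <= b
UNIVERSE = frozenset({0, 1, 2})
D = powerset(UNIVERSE)

lifted_decrement = lift(lambda x: x - 1)       # set lifting of x := x-1
join_with_fixed = lambda X: X | frozenset({1})  # join with one operand fixed
complement = lambda X: UNIVERSE - X             # a non-monotone function

lifting_ok = is_monotone(lifted_decrement, D, subset, subset)
join_ok = is_monotone(join_with_fixed, D, subset, subset)
complement_ok = is_monotone(complement, D, subset, subset)
```

The lifted function and the join pass the check, while set complement fails it, which shows that the check is not vacuous.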


Fixed points

Extensive/reductive functions
• Let $L = (D, \sqsubseteq)$ be a poset
• A function $f : D \to D$ is extensive if for every $x \in D$, we have that $x \sqsubseteq f(x)$
• A function $f : D \to D$ is reductive if for every $x \in D$, we have that $f(x) \sqsubseteq x$

Fixed points
• $L = (D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$
• $f : D \to D$ monotone
• $Fix(f) = \{ d \mid f(d) = d \}$
• $Red(f) = \{ d \mid f(d) \sqsubseteq d \}$
• $Ext(f) = \{ d \mid d \sqsubseteq f(d) \}$
• Theorem [Tarski 1955]
– $lfp(f) = \sqcap Fix(f) = \sqcap Red(f) \in Fix(f)$
– $gfp(f) = \sqcup Fix(f) = \sqcup Ext(f) \in Fix(f)$
[Diagram: the lattice with the regions Red(f), Ext(f), and Fix(f); lfp and gfp are marked as the least and greatest elements of Fix(f), together with the iterates $f^n(\bot)$.]
1. Does a solution always exist? Yes
2. If so, is it unique? No, but it has least/greatest solutions
3. If so, is it computable? Under some conditions…
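On a finite lattice the sets in Tarski's theorem can simply be enumerated. The following Python sketch does this for the powerset of {0, 1, 2, 3} and a monotone function `f` chosen only for illustration; it computes Fix, Red, and Ext by brute force and then takes the meet of Red and the join of Ext.

```python
from functools import reduce
from itertools import combinations

UNIVERSE = frozenset({0, 1, 2, 3})


def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def f(X):
    """An illustrative monotone function: always add 0, keep only 2 and 3."""
    return frozenset({0}) | (X & frozenset({2, 3}))


D = powerset(UNIVERSE)
fix = [d for d in D if f(d) == d]   # fixed points
red = [d for d in D if f(d) <= d]   # f(d) below d
ext = [d for d in D if d <= f(d)]   # d below f(d)

# Meet is intersection and join is union in the powerset lattice.
lfp = reduce(frozenset.intersection, red, UNIVERSE)
gfp = reduce(frozenset.union, ext, frozenset())
meet_of_fix = reduce(frozenset.intersection, fix, UNIVERSE)
join_of_fix = reduce(frozenset.union, fix, frozenset())
```

For this `f` there are four fixed points, so the solution is not unique, yet the meet of Red(f) and the join of Ext(f) are themselves fixed points, as the theorem states.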

Fixed point example
• Equations:
$R[0] = \{x \in \mathbb{Z}\}$
$R[1] = R[0] \cup R[4]$
$R[2] = R[1] \cap \{s \mid s(x) > 0\}$
$R[3] = R[1] \cap \{s \mid s(x) \leq 0\}$
$R[4] = \llbracket x := x-1 \rrbracket\, R[2]$
[Figure: the CFG (entry → if x>0 → x := x-1 → back to the test; otherwise → exit) shown twice, annotated with $d$ and with $F(d)$, where in both copies $(R[0], R[1], R[2], R[3], R[4]) = (\{x \in \mathbb{Z}\}, \{x \in \mathbb{Z}\}, \{x > 0\}, \{x \leq 0\}, \{x \geq 0\})$]
• $F(d) = d$: $d$ is a fixed point

Pre-fixed point example
• Equations:
$R[0] = \{x \in \mathbb{Z}\}$
$R[1] = R[0] \cup R[4]$
$R[2] = R[1] \cap \{s \mid s(x) > 0\}$
$R[3] = R[1] \cap \{s \mid s(x) \leq 0\}$
$R[4] = \llbracket x := x-1 \rrbracket\, R[2]$
[Figure: the CFG shown twice, annotated with $d$ and with $F(d)$:
$d = (\{x \in \mathbb{Z}\}, \{x \in \mathbb{Z}\}, \{x > 0\}, \{x \leq 0\}, \{x < -5\})$
$F(d) = (\{x \in \mathbb{Z}\}, \{x \in \mathbb{Z}\}, \{x > 0\}, \{x \leq 0\}, \{x \geq 0\})$]
• The figure compares $F(d)$ with $d$ and labels $d$ a pre-fixed point

Post-fixed point example
• Equations:
$R[0] = \{x \in \mathbb{Z}\}$
$R[1] = R[0] \cup R[4]$
$R[2] = R[1] \cap \{s \mid s(x) > 0\}$
$R[3] = R[1] \cap \{s \mid s(x) \leq 0\}$
$R[4] = \llbracket x := x-1 \rrbracket\, R[2]$
[Figure: the CFG shown twice, annotated with $d$ and with $F(d)$:
$d = (\{x \in \mathbb{Z}\}, \{x \in \mathbb{Z}\}, \{x > 0\}, \{x \leq 0\}, \{x < 9\})$
$F(d) = (\{x \in \mathbb{Z}\}, \{x \in \mathbb{Z}\}, \{x > 0\}, \{x \leq 0\}, \{x \geq 0\})$]
• The figure compares $F(d)$ with $d$ and labels $d$ a post-fixed point

Recap
• A system of equations of the form $R = F(R)$ where $R$ draws its elements from a complete lattice $L = (D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$
• Tarski's fixed point theorem ensures that there exists a least fixed point: $lfp(f) = \sqcap Fix(f)$
• However, it is not an algorithm since $D$ is often infinite
– Inefficient even when $D$ is finite
• We need a more constructive way of computing $lfp(f)$

Computing the least fixed point

Continuous functions
• Let $L = (D, \sqsubseteq, \sqcup, \bot)$ be a complete partial order
– Every ascending chain has a least upper bound
• A function $f$ is continuous if for every increasing chain $Y \subseteq D^*$, $f(\sqcup Y) = \sqcup \{ f(y) \mid y \in Y \}$
• Lemma: if $f$ is continuous then $f$ is monotone
• Proof: assume $x \sqsubseteq y$. Therefore $x \sqcup y = y$. Then $f(y) = f(x \sqcup y) = f(x) \sqcup f(y)$, which means $f(x) \sqsubseteq f(y)$

Kleene's fixed point theorem
• Let $L = (D, \sqsubseteq, \sqcup, \bot)$ be a complete partial order and $f : D \to D$ a continuous function; then $lfp(f) = \sqcup_{n \in \mathbb{N}} f^n(\bot)$
• That is, take the ascending chain $\bot \sqsubseteq f(\bot) \sqsubseteq f(f(\bot)) \sqsubseteq … \sqsubseteq f^n(\bot) \sqsubseteq …$ and return the supremum
– Why is this an ascending chain?
• But how do you know if a function $f$ is continuous?

Continuity and ACC condition
• Let $L = (D, \sqsubseteq, \sqcup, \bot)$ be a complete partial order
– Every ascending chain has a least upper bound
• $L$ satisfies the ascending chain condition (ACC) if every ascending chain eventually stabilizes: $d_0 \sqsubseteq d_1 \sqsubseteq … \sqsubseteq d_n = d_{n+1} = d_{n+2} = …$
• Lemma: Monotone functions on posets satisfying ACC are continuous
• Proof: We need to show that $f(\sqcup Y) = \sqcup \{ f(y) \mid y \in Y \}$
1. Every ascending chain $Y$ eventually stabilizes, $d_0 \sqsubseteq d_1 \sqsubseteq … \sqsubseteq d_n = d_{n+1} = …$, hence $d_n$ is the least upper bound of $\{d_0, d_1, …, d_n\}$, thus $f(\sqcup Y) = f(d_n)$
2. From monotonicity of $f$ we get that $f(d_0) \sqsubseteq f(d_1) \sqsubseteq … \sqsubseteq f(d_n) = f(d_{n+1}) = …$ Hence $f(d_n)$ is the least upper bound of $\{f(d_0), f(d_1), …, f(d_n)\}$, thus $\sqcup \{ f(y) \mid y \in Y \} = f(d_n)$

Resulting algorithm
• Kleene's fixed point theorem gives a constructive method for computing $lfp(f)$ over a poset with ACC when $f$ is monotone
• Mathematical definition: $lfp(f) = \sqcup_{n \in \mathbb{N}} f^n(\bot)$
• Algorithm:
d := ⊥
while f(d) ≠ d do
  d := f(d)
return d
[Diagram: the ascending chain $\bot, f(\bot), f^2(\bot), …, f^n(\bot)$ climbing to lfp]
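The loop above translates almost literally into Python. In this sketch the function name `kleene_lfp`, the `max_steps` safety bound, and the reachability example are all assumptions made for illustration; the lattice is the finite powerset of graph nodes, which trivially satisfies ACC.

```python
def kleene_lfp(f, bottom, max_steps=1000):
    """Iterate bottom, f(bottom), f(f(bottom)), ... until the value stabilizes."""
    d = bottom
    for _ in range(max_steps):
        nxt = f(d)
        if nxt == d:  # f(d) = d: a fixed point has been reached
            return d
        d = nxt
    raise RuntimeError("no fixed point reached within max_steps")


# Illustrative monotone function: nodes reachable from node 0 in a small graph.
EDGES = {0: {1}, 1: {2}, 2: {1}, 3: {0}}


def reach(X):
    """One step of reachability: node 0 plus all successors of nodes in X."""
    return frozenset({0}) | frozenset(t for s in X for t in EDGES[s])


lfp = kleene_lfp(reach, frozenset())
```

Starting from the empty set, the iterates are {0}, {0, 1}, and {0, 1, 2}, after which the value no longer changes; node 3 is never added because nothing reaches it.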

Our very first generic static analysis algorithm

Vanilla algorithm
Problem definition:
1. Lattice of properties $L$ of finite height (ACC)
2. For each statement define a monotone transformer
Preparation:
3. Parse program into AST
4. Convert AST into CFG
5. Generate system of equations from CFG
Analysis:
6. Initialize each analysis variable with $\bot$
7. Update all analysis variables of each equation until reaching a fixed point
• Non-incremental. Most variables don't change.


Chaotic iteration

Chaotic iteration
• Input:
– A cpo $L = (D, \sqsubseteq, \sqcup, \bot)$ satisfying ACC
– $L^n = L \times L \times … \times L$
– A monotone function $f : D^n \to D^n$
– A system of equations $\{ X[i] = F[i](X) \mid 1 \leq i \leq n \}$, where $F[i]$ is the $i$-th component of $f$
• Output: $lfp(f)$
• A worklist-based algorithm
for i := 1 to n do X[i] := ⊥
WL = {1, …, n}
while WL ≠ ∅ do
  i := pop WL // choose index non-deterministically
  N := F[i](X)
  if N ≠ X[i] then
    X[i] := N
    add all the indexes that directly depend on i to WL
    (X[j] depends on X[i] if F[j] contains X[i])
return X
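The worklist algorithm can be sketched in Python as follows. The function name `chaotic_iteration`, the dictionary-based encoding of the equations, and the example are assumptions made for illustration: the example reuses the loop equations from the collecting semantics slides, with a state simplified to the integer value of x and the finite set {-1, 0, 1, 2} standing in for the input.

```python
def chaotic_iteration(equations, dependents, bottom):
    """Solve X[i] = equations[i](X) starting from bottom.

    equations:  dict mapping index i to a function of the whole vector X.
    dependents: dict mapping index i to the indexes j whose equation reads X[i].
    """
    X = {i: bottom for i in equations}
    worklist = set(equations)
    while worklist:
        i = worklist.pop()  # arbitrary choice of the next index
        new = equations[i](X)
        if new != X[i]:
            X[i] = new
            worklist |= dependents.get(i, set())
    return X


ENTRY = frozenset({-1, 0, 1, 2})  # illustrative stand-in for {x in Z}
equations = {
    0: lambda X: ENTRY,
    1: lambda X: X[0] | X[4],
    2: lambda X: frozenset(s for s in X[1] if s > 0),
    3: lambda X: frozenset(s for s in X[1] if s <= 0),
    4: lambda X: frozenset(s - 1 for s in X[2]),
}
dependents = {0: {1}, 1: {2, 3}, 2: {4}, 4: {1}}

solution = chaotic_iteration(equations, dependents, frozenset())
```

Whatever order the worklist happens to use, the result is the same vector, because every equation is monotone and the iteration starts from the least element.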

Chaotic iteration for static analysis
• Specialize chaotic iteration for programs
• Create a CFG for program
• Choose a cpo of properties for the static analysis to infer: $L = (D, \sqsubseteq, \sqcup, \bot)$
• Define variables R[0,…,n] for input/output of each CFG node such that $R[i] \in D$
• For each node $v$ let $v_{out}$ be the variable at the output of that node: $v_{out} = F[v](\sqcup \{ u_{out} \mid (u,v) \text{ is a CFG edge} \})$
– Make sure each $F[v]$ is monotone
• Variable dependence determined by outgoing edges in CFG

Static analysis example: constant propagation

Constant propagation example
• Program: x := 4; while (y ≠ 5) do z := x; x := 4
[CFG: entry → x := 4 → if (*); one branch: assume y ≠ 5 → z := x → x := 4 → back to if (*); other branch: assume y = 5 → exit]

Constant propagation lattice
• For each variable x define $L$ as the flat lattice over the integers: $\top$ (not-a-constant) is above every integer, the integers …, -2, -1, 0, 1, 2, … are pairwise incomparable, and $\bot$ (no information yet) is below every integer
• For a set of program variables Var = x1,…,xn: $L^n = L \times L \times … \times L$
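The flat lattice for a single variable is small enough to write down directly. In this Python sketch the strings `"bot"` and `"top"` are an encoding chosen for illustration, standing for "no information yet" and "not-a-constant"; every other element is a Python integer.

```python
BOT, TOP = "bot", "top"  # illustrative encodings of the two extreme elements


def leq(a, b):
    """Order of the flat lattice: bot below everything, everything below top."""
    return a == BOT or b == TOP or a == b


def join(a, b):
    """Least upper bound: two different constants join to top."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP


same = join(4, 4)        # both paths agree on the constant 4
conflict = join(4, 5)    # the paths disagree, so the value is not a constant
from_bot = join(BOT, 4)  # no information on one path yet
```

Any ascending chain in this lattice has at most three distinct elements (bot, a constant, top), so the lattice satisfies ACC even though it has infinitely many elements.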

Write down variables
• Program: x := 4; while (y ≠ 5) do z := x; x := 4
[CFG: entry → x := 4 → if (*); one branch: assume y ≠ 5 → z := x → x := 4 → back to if (*); other branch: assume y = 5 → exit]

Write down equations
• Program: x := 4; while (y ≠ 5) do z := x; x := 4
[CFG annotated with variables: R0 at entry, R1 after the first x := 4, R2 at if (*), R3 after assume y ≠ 5, R4 after z := x, R5 after the second x := 4, R6 after assume y = 5]

Collecting semantics equations
$R_0 = State$
$R_1 = \llbracket x := 4 \rrbracket\, R_0$
$R_2 = R_1 \cup R_5$
$R_3 = \llbracket \text{assume } y \neq 5 \rrbracket\, R_2$
$R_4 = \llbracket z := x \rrbracket\, R_3$
$R_5 = \llbracket x := 4 \rrbracket\, R_4$
$R_6 = \llbracket \text{assume } y = 5 \rrbracket\, R_2$
[CFG annotated with the variables R0, …, R6]

Constant propagation equations
$R_0 = \top$
$R_1 = \llbracket x := 4 \rrbracket^\#\, R_0$
$R_2 = R_1 \sqcup R_5$
$R_3 = \llbracket \text{assume } y \neq 5 \rrbracket^\#\, R_2$
$R_4 = \llbracket z := x \rrbracket^\#\, R_3$
$R_5 = \llbracket x := 4 \rrbracket^\#\, R_4$
$R_6 = \llbracket \text{assume } y = 5 \rrbracket^\#\, R_2$
• Each $\llbracket \cdot \rrbracket^\#$ is an abstract transformer
[CFG annotated with the variables R0, …, R6]

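Abstract operations for CP
• Equations:
$R_0 = \top$
$R_1 = \llbracket x := 4 \rrbracket^\#\, R_0$
$R_2 = R_1 \sqcup R_5$
$R_3 = \llbracket \text{assume } y \neq 5 \rrbracket^\#\, R_2$
$R_4 = \llbracket z := x \rrbracket^\#\, R_3$
$R_5 = \llbracket x := 4 \rrbracket^\#\, R_4$
$R_6 = \llbracket \text{assume } y = 5 \rrbracket^\#\, R_2$
• Lattice elements have the form $(v_x, v_y, v_z)$
$\llbracket x := 4 \rrbracket^\# (v_x, v_y, v_z) = (4, v_y, v_z)$
$\llbracket z := x \rrbracket^\# (v_x, v_y, v_z) = (v_x, v_y, v_x)$
$\llbracket \text{assume } y \neq 5 \rrbracket^\# (v_x, v_y, v_z) = (v_x, v_y, v_z)$
$\llbracket \text{assume } y = 5 \rrbracket^\# (v_x, v_y, v_z) =$ if $v_y = k \neq 5$ then $(\bot, \bot, \bot)$ else $(v_x, 5, v_z)$
$R_1 \sqcup R_5 = (a_1, b_1, c_1) \sqcup (a_5, b_5, c_5) = (a_1 \sqcup a_5, b_1 \sqcup b_5, c_1 \sqcup c_5)$
• CP lattice for a single variable: $\bot$ below the integers …, -2, -1, 0, 1, 2, …, which are below $\top$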
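The abstract operations listed above can be written as plain Python functions on triples (vx, vy, vz). This is a sketch: the function names and the `"bot"`/`"top"` encoding are assumptions made for illustration, and the case analysis for `assume y=5` follows the definition given above literally (only a known constant different from 5 yields bottom).

```python
BOT, TOP = "bot", "top"  # illustrative encodings of the extreme elements


def join(a, b):
    """Join in the flat constant lattice of a single variable."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP


def join_state(s, t):
    """Componentwise join of two (vx, vy, vz) triples."""
    return tuple(join(a, b) for a, b in zip(s, t))


def assign_x_4(s):
    vx, vy, vz = s
    return (4, vy, vz)


def assign_z_x(s):
    vx, vy, vz = s
    return (vx, vy, vx)


def assume_y_neq_5(s):
    return s  # no information is gained in this domain


def assume_y_eq_5(s):
    vx, vy, vz = s
    if isinstance(vy, int) and vy != 5:  # y is a known constant other than 5
        return (BOT, BOT, BOT)
    return (vx, 5, vz)


start = (TOP, TOP, TOP)
after_assign = assign_x_4(start)
after_copy = assign_z_x(after_assign)
at_exit = assume_y_eq_5(after_assign)
merged = join_state((4, TOP, 4), (4, 5, 7))
```

Note that `assume_y_neq_5` is the identity: the domain tracks only single constants, so it cannot express that y differs from 5.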

Chaotic iteration for CP
• Program: x := 4; while (y ≠ 5) do z := x; x := 4
• Equations:
$R_0 = \top$
$R_1 = \llbracket x := 4 \rrbracket^\#\, R_0$
$R_2 = R_1 \sqcup R_5$
$R_3 = \llbracket \text{assume } y \neq 5 \rrbracket^\#\, R_2$
$R_4 = \llbracket z := x \rrbracket^\#\, R_3$
$R_5 = \llbracket x := 4 \rrbracket^\#\, R_4$
$R_6 = \llbracket \text{assume } y = 5 \rrbracket^\#\, R_2$
• Initialization: $R_0 = R_1 = … = R_6 = (\bot, \bot, \bot)$; WL = {R0, R1, R2, R3, R4, R5, R6}
• Process R0: $R_0 = (\top, \top, \top)$; WL = {R1, R2, R3, R4, R5, R6}
• Process R1: $R_1 = (4, \top, \top)$; WL = {R2, R3, R4, R5, R6}
• Process R2: $R_2 = R_1 \sqcup R_5 = (4, \top, \top) \sqcup (\bot, \bot, \bot) = (4, \top, \top)$; WL = {R3, R4, R5, R6}
• Process R3: $R_3 = (4, \top, \top)$; WL = {R4, R5, R6}
• Process R4: $R_4 = (4, \top, 4)$; WL = {R5, R6}
• Process R5: $R_5 = (4, \top, 4)$; R2 depends on R5 and is re-evaluated to $(4, \top, \top)$, which is unchanged; WL = {R6}
• Process R6: $R_6 = (4, 5, \top)$; WL = {}
• Fixed point: $R_0 = (\top, \top, \top)$, $R_1 = R_2 = R_3 = (4, \top, \top)$, $R_4 = R_5 = (4, \top, 4)$, $R_6 = (4, 5, \top)$
• In practice maintain a worklist of nodes
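The whole run can be reproduced with a short Python script. It is a sketch under the same assumptions as the slides' example (variables R0 to R6, triples (vx, vy, vz)); the names `EQUATIONS`, `DEPENDENTS`, and `chaotic_iteration` and the `"bot"`/`"top"` encoding are invented for illustration, and the worklist order is whatever Python's `set.pop` happens to choose rather than the order shown in the slides.

```python
BOT, TOP = "bot", "top"  # illustrative encodings of the extreme elements


def join(a, b):
    """Join in the flat constant lattice of a single variable."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP


def join_state(s, t):
    """Componentwise join of two (vx, vy, vz) triples."""
    return tuple(join(a, b) for a, b in zip(s, t))


def assume_y_eq_5(s):
    vx, vy, vz = s
    if isinstance(vy, int) and vy != 5:  # y is a known constant other than 5
        return (BOT, BOT, BOT)
    return (vx, 5, vz)


# One equation per variable; R is the current vector of triples.
EQUATIONS = {
    0: lambda R: (TOP, TOP, TOP),
    1: lambda R: (4, R[0][1], R[0][2]),        # x := 4
    2: lambda R: join_state(R[1], R[5]),       # join at the loop head
    3: lambda R: R[2],                         # assume y != 5
    4: lambda R: (R[3][0], R[3][1], R[3][0]),  # z := x
    5: lambda R: (4, R[4][1], R[4][2]),        # x := 4
    6: lambda R: assume_y_eq_5(R[2]),          # assume y = 5
}
# DEPENDENTS[i] lists the variables whose equation reads R[i].
DEPENDENTS = {0: {1}, 1: {2}, 2: {3, 6}, 3: {4}, 4: {5}, 5: {2}, 6: set()}


def chaotic_iteration(equations, dependents):
    R = {i: (BOT, BOT, BOT) for i in equations}
    worklist = set(equations)
    while worklist:
        i = worklist.pop()
        new = equations[i](R)
        if new != R[i]:
            R[i] = new
            worklist |= dependents[i]
    return R


result = chaotic_iteration(EQUATIONS, DEPENDENTS)
```

The result says that x is the constant 4 at every point after the first assignment, that z is 4 inside the loop body, and that y is 5 at exit.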

Complexity of chaotic iteration
• Parameters:
– n: number of CFG nodes
– k: maximum in-degree of a CFG node
– h: height of lattice $L$
– c: maximum cost of
  • Applying $F_v$
  • $\sqcup$
  • Checking fixed-point condition for lattice $L$
• Complexity: $O(n \cdot h \cdot c \cdot k)$
• Incremental (worklist) algorithm reduces the n factor
– Implement worklist by priority queue and order nodes by reversed topological order

Implementation

Major classes
• Variable per CFG node
• Converts CFG to equation system
• An equation per CFG edge and join point
• A system of equations
• Chaotic iteration algorithm to compute fixed point
• A transformer for assume statements
• A transformer for non-assume statements
• Combines all sub-algorithms to get entire static analysis

Soot: a Java Optimization Framework
• Developed at McGill University (Canada)
– http://www.sable.mcgill.ca/soot/
• Supports several input languages
– Java source code
– Java bytecode
– Dalvik bytecode (Android)
– Jimple intermediate language
• Supported output languages
– Java bytecode
– Dalvik bytecode (Android)
– Jimple intermediate language
• Supports several intermediate languages
– Jimple (what we will be using)
– Shimple
– Baf
– Grimp
• Supports static analysis: CFG, pointer-analysis, etc.
• Eclipse plug-in (useful for giving demos and teaching)

Soot documentation and resources
• Soot survivor's guide
• Soot tutorials
• Soot API
• Eric Bodden's blog
– Running Soot: http://www.bodden.de/2008/08/21/soot-command-line/

Jimple synopsis
• Three-address code (TAC) for Java: 15 statement types
• Core (intra-procedural) statements
– NopStmt
– IdentityStmt (r0 := @this: Foo; i0 := @parameter0: int;)
– AssignStmt ($r1 = new Foo;)
• Intra-procedural control-flow statements
– IfStmt
– GotoStmt
– TableSwitchStmt (JVM tableswitch instruction)
– LookupSwitchStmt (JVM lookupswitch instruction)
• Inter-procedural control-flow statements
– InvokeStmt
– ReturnStmt
– ReturnVoidStmt
• Monitor statements
– EnterMonitorStmt
– ExitMonitorStmt
• Exceptions
– ThrowStmt
– RetStmt


Jimple expressions


Java source

Running Soot
• Output .jimple files go in "sootOutput"

Jimple code
• (default) static class initializer
• Locals
• IdentityStmts
• Two variables with same name (w)?

Setting up for development
1. Set up Java
2. Set up Soot
3. Set up abstract interpretation package

Setting up Java
• Make sure you have version 1.7
• If you want to operate from command line make sure you have JDK 1.7
– Set environment variable JAVA_HOME to point to your JDK installation path

Setting up Soot
• Download
– sootclasses.jar
– jasminclasses.jar
– polyglotclasses.jar
• Recommended: Soot source (complete package)
• Add Soot jar files as External
• Attach Soot sources

Example inputs
• Store input files in a directory separate from the ones you use for implementing the analyses (otherwise, the front-end breaks)

Static analysis package
• Written for this course in the last few days
– Not fully debugged
• Implements
– Conversion of procedures to equation systems
– Abstract domain implementations
  • Some examples: variable equalities (VE), constant propagation (CP), more to come
– Chaotic iterations
  • Includes debugging information
– Domain combinators: Cartesian, Disjunctive completion, and Relational
– Code for displaying analysis results

Running the VE analysis
• Example: variable equalities

Running the VE analysis
• Adds the analysis to Soot's list of intra-procedural analyses
1. Creates the equation system
2. Runs chaotic iteration
3. Attaches results as StringTags

Running the VE analysis
• Command-line options:
-cp . : add the current directory to Soot's CLASSPATH
-pp : add Java's CLASSPATH to Soot's CLASSPATH
-f jimple : output jimple code
-p jb use-original-names : keep local variable names as they are
-keep-line-number : write source code line numbers in the resulting jimple code
-print-tags : write out tags for each jimple statement (analysis results)
TestClass : the class to analyze
• Enable assertions
• Which directory to run in


Debug printout


Analysis results inlined into .jimple

Next lecture: abstract interpretation III