Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks:...

56
CS738: Advanced Compiler Optimizations Foundations of Data Flow Analysis Amey Karkare [email protected] http://www.cse.iitk.ac.in/~karkare/cs738 Department of CSE, IIT Kanpur

Transcript of Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks:...

Page 1: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

CS738: Advanced Compiler Optimizations

Foundations of Data Flow Analysis

Amey Karkare

[email protected]

http://www.cse.iitk.ac.in/~karkare/cs738

Department of CSE, IIT Kanpur

Page 2: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Agenda

◮ Poset, Lattice, and Data Flow Frameworks: Review

◮ Connecting Tarski Lemma with Data Flow Analysis

◮ Soutions of Data Flow Analysis constraints

Page 3: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Knaster-Tarski Fixed Point Theorem

◮ Let f be a monotonic function on a complete lattice(S,

∧,∨). Define

◮ red(f ) = {v | v ∈ S, f (v) ≤ v}, pre fix-points◮ ext(f ) = {v | v ∈ S, f (v) ≥ v}, post fix-points◮ fix(f ) = {v | v ∈ S, f (v) = v}, fix-points

Then,◮

∧red(f ) ∈ fix(f ). Further,

∧red(f ) =

∧fix(f )

◮∨

ext(f ) ∈ fix(f ). Further,∨

ext(f ) =∧

fix(f )◮ fix(f ) is a complete lattice

Page 4: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Application of Fixed Point Theorem

◮ f : S → S is a monotonic function

◮ (S,∧) is a finite height semilattice

◮ ⊤ is the top element of (S,∧)

◮ Notation: f 0(x) = x , f i+1(x) = f (f i(x)), ∀i ≥ 0

◮ The greatest fixed point of f is

f k (⊤), where f k+1(⊤) = f k (⊤)

Page 5: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Fixed Point Algorithm

// monotonic function f on a meet semilattice

x := ⊤;while (x 6= f(x)) x := f(x);

return x;

Page 6: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Resemblance to Iterative Algorithm (Forward)

OUT[Entry ] = InfoEntry;

Page 7: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Resemblance to Iterative Algorithm (Forward)

OUT[Entry ] = InfoEntry;

for (other blocks B) OUT[B] = ⊤;

Page 8: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Resemblance to Iterative Algorithm (Forward)

OUT[Entry ] = InfoEntry;

for (other blocks B) OUT[B] = ⊤;while (changes to any OUT) {

Page 9: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Resemblance to Iterative Algorithm (Forward)

OUT[Entry ] = InfoEntry;

for (other blocks B) OUT[B] = ⊤;while (changes to any OUT) {

for (each block B) {

Page 10: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Resemblance to Iterative Algorithm (Forward)

OUT[Entry ] = InfoEntry;

for (other blocks B) OUT[B] = ⊤;while (changes to any OUT) {

for (each block B) {

IN(B) =∧

P∈PRED(B) OUT(P);

Page 11: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Resemblance to Iterative Algorithm (Forward)

OUT[Entry ] = InfoEntry;

for (other blocks B) OUT[B] = ⊤;while (changes to any OUT) {

for (each block B) {

IN(B) =∧

P∈PRED(B) OUT(P);

OUT(B) = fB(IN(B));}

}

Page 12: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Iterative Algorithm

◮ fB(X ) = X − KILL(B) ∪ GEN(B)

Page 13: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Iterative Algorithm

◮ fB(X ) = X − KILL(B) ∪ GEN(B)

◮ Backward:

Page 14: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Iterative Algorithm

◮ fB(X ) = X − KILL(B) ∪ GEN(B)

◮ Backward:◮ Swap IN and OUT everywhere

Page 15: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Iterative Algorithm

◮ fB(X ) = X − KILL(B) ∪ GEN(B)

◮ Backward:◮ Swap IN and OUT everywhere◮ Replace Entry by Exit

Page 16: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Iterative Algorithm

◮ fB(X ) = X − KILL(B) ∪ GEN(B)

◮ Backward:◮ Swap IN and OUT everywhere◮ Replace Entry by Exit◮ Replace predecessors by successors

Page 17: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Iterative Algorithm

◮ fB(X ) = X − KILL(B) ∪ GEN(B)

◮ Backward:◮ Swap IN and OUT everywhere◮ Replace Entry by Exit◮ Replace predecessors by successors

◮ In other words: just “invert” the flow graph!!

Page 18: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Acknowledgement

Rest of the slides based on the material at

http://infolab.stanford.edu/~ullman/dragon/w06/

w06.html

Page 19: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Solutions

◮ IDEAL solution = meet over all executable paths from

entry to a point (ignore unrealizable paths)

Page 20: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Solutions

◮ IDEAL solution = meet over all executable paths from

entry to a point (ignore unrealizable paths)

◮ MOP = meet over all paths from entry to a given point, of

the transfer function along that path applied to InfoEntry .

Page 21: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Solutions

◮ IDEAL solution = meet over all executable paths from

entry to a point (ignore unrealizable paths)

◮ MOP = meet over all paths from entry to a given point, of

the transfer function along that path applied to InfoEntry .

◮ MFP (maximal fixedpoint) = result of iterative algorithm.

Page 22: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Maximum Fixedpoint

◮ Fixedpoint = solution to the equations used in the

iteration:

IN(B) =∧

P∈PRED(B)

OUT(P)

OUT(B) = fB(IN(B))

Page 23: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Maximum Fixedpoint

◮ Fixedpoint = solution to the equations used in the

iteration:

IN(B) =∧

P∈PRED(B)

OUT(P)

OUT(B) = fB(IN(B))

◮ Maximum Fixedpoint = any other solution is ≤ the result if

the iterative algorithm (MFP)

Page 24: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Maximum Fixedpoint

◮ Fixedpoint = solution to the equations used in the

iteration:

IN(B) =∧

P∈PRED(B)

OUT(P)

OUT(B) = fB(IN(B))

◮ Maximum Fixedpoint = any other solution is ≤ the result if

the iterative algorithm (MFP)

◮ ≤: carries less information.

Page 25: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MOP and IDEAL

◮ All solutions are really meets of the result of starting with

InfoEntry and following some set of paths to the point in

question.

Page 26: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MOP and IDEAL

◮ All solutions are really meets of the result of starting with

InfoEntry and following some set of paths to the point in

question.

◮ If we don’t include at least the IDEAL paths, we have an

error.

Page 27: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MOP and IDEAL

◮ All solutions are really meets of the result of starting with

InfoEntry and following some set of paths to the point in

question.

◮ If we don’t include at least the IDEAL paths, we have an

error.

◮ But try not to include too many more.

Page 28: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MOP and IDEAL

◮ All solutions are really meets of the result of starting with

InfoEntry and following some set of paths to the point in

question.

◮ If we don’t include at least the IDEAL paths, we have an

error.

◮ But try not to include too many more.

◮ Less “ignorance,” but we “know too much.”

Page 29: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MOP Versus IDEAL

◮ Any solution that is ≤ IDEAL accounts for all executablepaths (and maybe more paths)

Page 30: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MOP Versus IDEAL

◮ Any solution that is ≤ IDEAL accounts for all executablepaths (and maybe more paths)◮ and is therefore conservative (safe)

Page 31: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MOP Versus IDEAL

◮ Any solution that is ≤ IDEAL accounts for all executablepaths (and maybe more paths)◮ and is therefore conservative (safe)◮ even if not accurate.

Page 32: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MFP vs MOP

◮ If MFP ≤ MOP?

Page 33: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MFP vs MOP

◮ If MFP ≤ MOP?◮ If so, then MFP ≤ MOP ≤ IDEAL, therefore MFP is safe.

Page 34: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MFP vs MOP

◮ If MFP ≤ MOP?◮ If so, then MFP ≤ MOP ≤ IDEAL, therefore MFP is safe.

◮ Yes, but . . .

Page 35: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MFP vs MOP

◮ If MFP ≤ MOP?◮ If so, then MFP ≤ MOP ≤ IDEAL, therefore MFP is safe.

◮ Yes, but . . .

◮ Requires two assumptions about the framework:

Page 36: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MFP vs MOP

◮ If MFP ≤ MOP?◮ If so, then MFP ≤ MOP ≤ IDEAL, therefore MFP is safe.

◮ Yes, but . . .

◮ Requires two assumptions about the framework:◮ “Monotonicity.”

Page 37: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MFP vs MOP

◮ If MFP ≤ MOP?◮ If so, then MFP ≤ MOP ≤ IDEAL, therefore MFP is safe.

◮ Yes, but . . .

◮ Requires two assumptions about the framework:◮ “Monotonicity.”◮ Finite height

Page 38: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MFP vs MOP

◮ If MFP ≤ MOP?◮ If so, then MFP ≤ MOP ≤ IDEAL, therefore MFP is safe.

◮ Yes, but . . .

◮ Requires two assumptions about the framework:◮ “Monotonicity.”◮ Finite height

no infinite chains . . . < x2 < x1 < x < . . .

Page 39: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MFP vs MOP

◮ Intuition: If we computed the MOP directly, we would

compose functions along all paths, then take a big meet.

Page 40: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

MFP vs MOP

◮ Intuition: If we computed the MOP directly, we would

compose functions along all paths, then take a big meet.

◮ But the MFP (iterative algorithm) alternates compositions

and meets arbitrarily.

Page 41: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Good News

◮ The frameworks we’ve studied so far are all monotone.

Page 42: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Good News

◮ The frameworks we’ve studied so far are all monotone.◮ Easy proof for functions in Gen-Kill form.

Page 43: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Good News

◮ The frameworks we’ve studied so far are all monotone.◮ Easy proof for functions in Gen-Kill form.

◮ And they have finite height.

Page 44: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Good News

◮ The frameworks we’ve studied so far are all monotone.◮ Easy proof for functions in Gen-Kill form.

◮ And they have finite height.◮ Only a finite number of defs, variables, etc. in any program.

Page 45: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Two Paths to B that Meet Early

Entry

OUT = x

OUT = y

IN = x∧

y B

f (x∧

y)

f (x)∧

f (y)f

f (x)

f (y)

Page 46: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Two Paths to B that Meet Early

Entry

OUT = x

OUT = y

IN = x∧

y B

f (x∧

y)

f (x)∧

f (y)f

f (x)

f (y)

◮ MOP considers paths independently and combines at the

last possible moment.

Page 47: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Two Paths to B that Meet Early

Entry

OUT = x

OUT = y

IN = x∧

y B

f (x∧

y)

f (x)∧

f (y)f

f (x)

f (y)

◮ MOP considers paths independently and combines at the

last possible moment.

◮ In MFP, Values x and y get combined too soon.

Page 48: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Two Paths to B that Meet Early

Entry

OUT = x

OUT = y

IN = x∧

y B

f (x∧

y)

f (x)∧

f (y)f

f (x)

f (y)

◮ MOP considers paths independently and combines at the

last possible moment.

◮ In MFP, Values x and y get combined too soon.

◮ Since f (x∧

y) ≤ f (x)∧

f (y), it is as we added non-existent

paths.

Page 49: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Distributive Frameworks

◮ Distributivity:

f (x∧

y) = f (x)∧

f (y)

Page 50: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Distributive Frameworks

◮ Distributivity:

f (x∧

y) = f (x)∧

f (y)

◮ Stronger than Monotonicity

Page 51: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Distributive Frameworks

◮ Distributivity:

f (x∧

y) = f (x)∧

f (y)

◮ Stronger than Monotonicity◮ Distributivity ⇒ Monotonicity

Page 52: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Distributive Frameworks

◮ Distributivity:

f (x∧

y) = f (x)∧

f (y)

◮ Stronger than Monotonicity◮ Distributivity ⇒ Monotonicity◮ But the reverse is not true

Page 53: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Even More Good News!

◮ The 4 example frameworks are distributive.

Page 54: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Even More Good News!

◮ The 4 example frameworks are distributive.

◮ If a framework is distributive, then combining paths earlydoesn’t hurt.

Page 55: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Even More Good News!

◮ The 4 example frameworks are distributive.

◮ If a framework is distributive, then combining paths earlydoesn’t hurt.◮ MOP = MFP.

Page 56: Foundations of Data Flow Analysis - GitHub Pages...Agenda Poset, Lattice, and Data Flow Frameworks: Review Connecting Tarski Lemma with Data Flow Analysis Soutions of Data Flow Analysis

Even More Good News!

◮ The 4 example frameworks are distributive.

◮ If a framework is distributive, then combining paths earlydoesn’t hurt.◮ MOP = MFP.◮ That is, the iterative algorithm computes a solution that

takes into account all and only the physical paths.