Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn
![Page 2: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/2.jpg)
Front End
source code → lexical analyzer → tokens → parser → abstract syntax tree → semantic analyzer → IR
![Page 3: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/3.jpg)
Middle End
AST → translation → IR1 → translation → IR2 → other IR and translation → asm
![Page 4: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/4.jpg)
Optimizations
AST → translation → IR1 → translation → IR2 → other IR and translation → asm
Figure: the same pipeline, with optimization passes (opt) attached to its stages.
![Page 5: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/5.jpg)
General Scheme for Optimization
- Analysis: control flow, data flow, dependency, … to obtain conservative static knowledge of the program being optimized (an approximation of its dynamic behavior)
- Rewriting: rewrite the program depending on the knowledge obtained above
Figure: IR → analysis → static information → rewriting → IR’
![Page 6: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/6.jpg)
“Conservative Static”
Cjump (x==5? L1: L2)
y = 1 y = 2
print (y)
Can we substitute y with the value 2?
This amounts to proving that x is always equal to 5!
Suppose x is an input from the user: then it is impossible to know its value statically. So one must be conservative when using static knowledge.
![Page 7: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/7.jpg)
Liveness Analysis
![Page 8: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/8.jpg)
Motivation
Low-level IRs assume an infinite number of abstract “registers”:
- good for code generation, but bad for execution on a real machine
- a real machine has a finite number of registers, so how do we bridge the gap?
The goal of register allocation (an optimization) is to put an unbounded number of variables into a few registers. This needs liveness analysis.
![Page 9: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/9.jpg)
Example
Consider this TAC:
a = 1
b = a + 2
c = b + 3
return c
Three variables: a, b, and c. Assume that the target machine has only one register: r.
Is it possible to put all three variables “a”, “b” and “c” in register “r”?
![Page 10: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/10.jpg)
Example
Calculate which variables are “live” at a given program point. For this TAC, the live set after each statement is:
a = 1          {a}
b = a + 2      {b}
c = b + 3      {c}
return c
The “liveness” information gives live ranges.
The live ranges don’t overlap, thus all three variables can be put into one register.
![Page 11: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/11.jpg)
Example
Consider this TAC, with the live set after each statement:
a = 1          {a}
b = a + 2      {b}
c = b + 3      {c}
return c
Register allocation:
a => r
b => r
c => r
Code rewriting:
r = 1
r = r + 2
r = r + 3
return r
![Page 12: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/12.jpg)
Data Flow Equations for Liveness
Inside basic blocks (backward):
in[n] = use[n] \/ (out[n] - def[n])
// Example:
a = 1
b = a + 2
c = b + 3
return c
// Example:
a = 1
b = a + 2
c = b + 3
return a + c
Figure: a statement n with its in set above it and its out set below it.
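The backward equation can be tried out on the two straight-line examples with a short Python sketch. The statement parsing (a regular expression over identifiers) and the name `live_in_sets` are assumptions made for illustration, not part of the slides.

```python
import re

def live_in_sets(stmts):
    """Walk straight-line TAC backward and return the live-in set of each statement."""
    live = set()                       # nothing is live after the final return
    result = []
    for stmt in reversed(stmts):
        if stmt.startswith("return"):
            defs, rhs = set(), stmt[len("return"):]
        else:
            lhs, rhs = stmt.split("=", 1)
            defs = {lhs.strip()}
        uses = set(re.findall(r"[A-Za-z_]\w*", rhs))   # identifiers read by the statement
        live = uses | (live - defs)    # in = use \/ (out - def)
        result.append(live)
    result.reverse()
    return result

tac1 = ["a = 1", "b = a + 2", "c = b + 3", "return c"]
tac2 = ["a = 1", "b = a + 2", "c = b + 3", "return a + c"]
print(live_in_sets(tac1))
print(live_in_sets(tac2))
```

In the second example "a" stays live until the return, so it overlaps with both "b" and "c" and the three variables can no longer share a single register.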
![Page 13: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/13.jpg)
For general CFG
Equations:
in[n] = use[n] \/ (out[n] - def[n])
out[n] = \/_{s ∈ succ[n]} in[s]
Fixpoint algorithm:
- initialize all in and out sets with {}
- loop until no set changes
Figure: a CFG node n annotated with use[n], def[n], in[n] and out[n].
![Page 14: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/14.jpg)
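Example
1: a = 0
2: b = a + 1
3: c = c + b
4: a = b * 2
5: a < N (branches back to 2 or on to 6)
6: return c

| node | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| def | {a} | {b} | {c} | {a} | {} | {} |
| use | {} | {a} | {b, c} | {b} | {a, N} | {c} |

in[n] = use[n] \/ (out[n] - def[n])
out[n] = \/_{s ∈ succ[n]} in[s]

Loop over the nodes in the order 1, 2, 3, 4, 5, 6. The sets below leave N out, i.e. N is treated as a constant.

| node | in/out (initial) | in/out (iteration 1) | in/out (iteration 2) | … |
|---|---|---|---|---|
| 1 | {} {} | {} {} | {} {a} | … |
| 2 | {} {} | {a} {} | {a} {b,c} | … |
| 3 | {} {} | {b,c} {} | {b,c} {b} | … |
| 4 | {} {} | {b} {} | {b} {a} | … |
| 5 | {} {} | {a} {a} | {a} {a,c} | … |
| 6 | {} {} | {c} {} | {c} {} | … |

Final live_out for nodes 1 to 5: {a,c}, {b,c}, {b,c}, {a,c}, {a,c}; {c} is live on entry to node 6.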
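As a rough illustration of the fixpoint algorithm, the sketch below iterates the two equations over the six-node example until nothing changes. The dictionary encoding of the CFG and the function name `liveness` are assumptions for this sketch, and N is treated as a constant, as the tabulated sets suggest.

```python
# Hypothetical encoding of the six-node example; N is treated as a constant.
succ = {1: [2], 2: [3], 3: [4], 4: [5], 5: [2, 6], 6: []}
use = {1: set(), 2: {"a"}, 3: {"b", "c"}, 4: {"b"}, 5: {"a"}, 6: {"c"}}
defs = {1: {"a"}, 2: {"b"}, 3: {"c"}, 4: {"a"}, 5: set(), 6: set()}

def liveness(nodes, succ, use, defs):
    """Iterate the in/out equations until no set changes."""
    live_in = {n: set() for n in nodes}
    live_out = {n: set() for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            new_in = use[n] | (live_out[n] - defs[n])
            new_out = set().union(*(live_in[s] for s in succ[n]))
            if new_in != live_in[n] or new_out != live_out[n]:
                live_in[n], live_out[n] = new_in, new_out
                changed = True
    return live_in, live_out

live_in, live_out = liveness([1, 2, 3, 4, 5, 6], succ, use, defs)
for n in sorted(live_out):
    print(n, sorted(live_in[n]), sorted(live_out[n]))
```

Passing the nodes in a different order changes only the number of iterations, not the final sets.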
![Page 15: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/15.jpg)
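Interference Graph
1: a = 0
2: b = a + 1
3: c = c + b
4: a = b * 2
5: a < N
6: return c
Final live_out for nodes 1 to 5: {a,c}, {b,c}, {b,c}, {a,c}, {a,c}; {c} is live on entry to node 6.
For any two variables x and y, if they are live simultaneously, then draw an (undirected) edge between x and y.
Figure: the interference graph over the nodes a, b and c.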
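The edge rule can be sketched in a few lines of Python: every pair of variables that appears together in some live set gets an edge. The function name `interference_edges` is made up for this sketch, and only the final live_out sets of the example are used, following the rule as stated on the slide.

```python
from itertools import combinations

# Final live_out sets of the six-node example.
live_out = {1: {"a", "c"}, 2: {"b", "c"}, 3: {"b", "c"},
            4: {"a", "c"}, 5: {"a", "c"}, 6: set()}

def interference_edges(live_sets):
    """Return undirected edges as sorted pairs of simultaneously live variables."""
    edges = set()
    for live in live_sets:
        for x, y in combinations(sorted(live), 2):
            edges.add((x, y))
    return edges

edges = interference_edges(live_out.values())
print(sorted(edges))
```

For this example "a" and "b" never appear in the same set, so they do not interfere and may share a register, while "c" needs one of its own.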
![Page 16: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/16.jpg)
Speeding up the analysis
- Ordering the nodes: for liveness analysis, use reverse top-sort order (you do this in lab 5)
- Once a variable
- Careful selection of set representation: careful data structure engineering, say, a bit-vector
- Basic blocks (you do this in lab 5)
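One way to picture the bit-vector representation is to give every variable a bit position and store each set as an integer, so that union, difference and the transfer function become single bitwise operations. The sketch below uses Python integers as bit-vectors; the helper names and the three-variable universe are assumptions chosen to match the running example.

```python
variables = ["a", "b", "c"]                    # hypothetical variable universe
index = {v: i for i, v in enumerate(variables)}

def to_bits(names):
    """Encode a set of variable names as an integer bit-vector."""
    bits = 0
    for v in names:
        bits |= 1 << index[v]
    return bits

def to_names(bits):
    """Decode a bit-vector back into a set of variable names."""
    return {v for v, i in index.items() if (bits >> i) & 1}

def transfer(use, out, defs):
    """Liveness transfer function: use united with (out minus def)."""
    return use | (out & ~defs)

# Node 4 of the example (a = b * 2) with out = {a, c}.
in4 = transfer(to_bits({"b"}), to_bits({"a", "c"}), to_bits({"a"}))
print(sorted(to_names(in4)))
```

A production compiler would more likely use fixed-width machine words or a dedicated bitset type, but the operations are the same.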
![Page 17: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/17.jpg)
Basic Blocks
Step 1: calculate def and use for each basic block b (one-pass backward calculation)
Step 2: do liveness analysis on each block, just as discussed above
Step 3: calculate liveness information for each statement in each block (one-pass backward calculation)
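Steps 1 and 3 are both single backward passes over one block, which the following sketch tries to make concrete. The statement encoding (a pair of defined variable and used variables), the function names, and the sample block `b = a + 1; c = c + b; a = b * 2; a < N` with N treated as a constant are assumptions for illustration.

```python
# Each statement is (defined variable or None, set of used variables).
block2 = [("b", {"a"}), ("c", {"c", "b"}), ("a", {"b"}), (None, {"a"})]

def block_def_use(stmts):
    """Step 1: one backward pass giving def and use for the whole block."""
    defs, use = set(), set()
    for d, u in reversed(stmts):
        if d is not None:
            defs.add(d)
            use.discard(d)     # later uses of d are satisfied inside the block
        use |= u
    return defs, use

def statement_live_out(stmts, block_live_out):
    """Step 3: one backward pass giving live_out for every statement."""
    live, result = set(block_live_out), []
    for d, u in reversed(stmts):
        result.append(set(live))
        live = u | (live - ({d} if d is not None else set()))
    result.reverse()
    return result

print(block_def_use(block2))
print(statement_live_out(block2, {"a", "c"}))
```

Variable "b" is defined in the block before it is ever used there, which is why it ends up in def but not in use.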
![Page 18: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/18.jpg)
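Example
Block 1: a = 0
Block 2: b = a + 1; c = c + b; a = b * 2; a < N
Block 3: return c

| block | 1 | 2 | 3 |
|---|---|---|---|
| def | {a} | {a,b,c} | {} |
| use | {} | {a,c} | {c} |

The set use[2] does NOT contain variable “b”. Why?

Blocks are reverse topo-sort ordered:

| block | out/in (initial) | out/in (iteration 1) | out/in (iteration 2) | out/in (iteration 3) |
|---|---|---|---|---|
| 3 | {} {} | {} {c} | {} {c} | {} {c} |
| 2 | {} {} | {c} {a,c} | {a,c} {a,c} | {a,c} {a,c} |
| 1 | {} {} | {a,c} {c} | {a,c} {c} | {a,c} {c} |

live_out for each block: block 1: {a,c}; block 2: {a,c}; block 3: {}
Backward calculation of live_out for each statement in block 2: a = b * 2: {a,c}; c = c + b: {b,c}; b = a + 1: {b,c}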
![Page 19: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/19.jpg)
Reaching Definition
![Page 20: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/20.jpg)
Reaching Definition
Block 1: a = 0
Block 2: b = a + 1; c = c + b; a = b * 2; a < N
Block 3: return c
The problem: at any program point, we’d like to know where the value of a variable x is defined.
E.g., can we substitute the variable a with 0? If so, we are doing the so-called constant propagation optimization.
![Page 21: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/21.jpg)
Implementation
Number each definition:
Block 1: 5: a = 0
Block 2: 6: b = a + 1; 7: c = c + b; 8: a = b * 2; a < N
Block 3: return c
(Here we number the four definitions 5, 6, 7 and 8. The numbers have no special meaning, just:
1. they are different from the block numbers, and
2. they are all unique.)
![Page 22: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/22.jpg)
Equations
Block 1: 5: a = 0
Block 2: 6: b = a + 1; 7: c = c + b; 8: a = b * 2; a < N
Block 3: return c
Calculate def and kill for each block, based on the equations for a statement:
def[d: x = …] = {d}
kill[d: x = …] = defs(x) - {d}
def[1] = {5}
kill[1] = {8}
def[2] = {6,7,8}
kill[2] = {5}
def[3] = {}
kill[3] = {}
![Page 23: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/23.jpg)
Data Flow Equation
Forward calculation:
in[b] = \/_{q ∈ pred(b)} out[q]
out[b] = def[b] \/ (in[b] - kill[b])
![Page 24: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/24.jpg)
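Fixpoint algorithm
Block 1: 5: a = 0
Block 2: 6: b = a + 1; 7: c = c + b; 8: a = b * 2; a < N
Block 3: return c

| block | 1 | 2 | 3 |
|---|---|---|---|
| def | {5} | {6,7,8} | {} |
| kill | {8} | {5} | {} |

in[b] = \/_{q ∈ pred(b)} out[q]
out[b] = def[b] \/ (in[b] - kill[b])

| block | in/out (initial) | in/out (iteration 1) | in/out (iteration 2) |
|---|---|---|---|
| 1 | {} {} | {} {5} | {} {5} |
| 2 | {} {} | {5} {6,7,8} | {5,6,7,8} {6,7,8} |
| 3 | {} {} | {6,7,8} {6,7,8} | {6,7,8} {6,7,8} |

Definitions reaching each statement: before 5: {}; before 6, 7 and 8: {5,6,7,8}; before a < N: {6,7,8}; before return c: {6,7,8}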
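The forward fixpoint computation can be sketched as follows. The predecessor map encodes the three-block CFG of the example, including the loop edge from block 2 back to itself; this encoding and the name `reaching_definitions` are assumptions for illustration.

```python
# Hypothetical encoding of the three-block example.
pred = {1: [], 2: [1, 2], 3: [2]}
defs = {1: {5}, 2: {6, 7, 8}, 3: set()}
kill = {1: {8}, 2: {5}, 3: set()}

def reaching_definitions(blocks, pred, defs, kill):
    """Iterate the forward equations until no set changes."""
    rd_in = {b: set() for b in blocks}
    rd_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            new_in = set().union(*(rd_out[q] for q in pred[b]))
            new_out = defs[b] | (new_in - kill[b])
            if (new_in, new_out) != (rd_in[b], rd_out[b]):
                rd_in[b], rd_out[b] = new_in, new_out
                changed = True
    return rd_in, rd_out

rd_in, rd_out = reaching_definitions([1, 2, 3], pred, defs, kill)
print(rd_in)
print(rd_out)
```

Visiting the blocks in the order 1, 2, 3 follows the direction of the data flow, the mirror image of the reverse ordering used for liveness.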
![Page 25: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/25.jpg)
Constant Propagation
Block 1: 5: a = 0
Block 2: 6: b = a + 1; 7: c = c + b; 8: a = b * 2; a < N
Block 3: return c
Definitions reaching each statement: before 5: {}; before 6, 7 and 8: {5,6,7,8}; before a < N: {6,7,8}; before return c: {6,7,8}
Can we substitute the variable a in “6: b = a + 1” with the constant “0”?
No! Because there are two definitions of “a” which may reach this point: 5 and 8.
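The check behind this answer can be phrased as: substitute a constant for a variable only if exactly one definition of that variable reaches the point and that definition assigns a constant. The sketch below is one simple way to write it; the `definitions` table and the function `constant_for` are hypothetical names, not something taken from the slides.

```python
# Hypothetical table: definition number -> (variable, constant value or None).
definitions = {5: ("a", 0), 6: ("b", None), 7: ("c", None), 8: ("a", None)}

def constant_for(var, reaching, definitions):
    """Return the constant that may replace var at this point, or None if unsafe."""
    defs_of_var = [d for d in reaching if definitions[d][0] == var]
    if len(defs_of_var) != 1:
        return None                    # several (or no) definitions reach the point
    return definitions[defs_of_var[0]][1]

# At statement 6 the reaching set is {5, 6, 7, 8}: both 5 and 8 define "a".
print(constant_for("a", {5, 6, 7, 8}, definitions))
# At a point reached only by definition 5 the substitution would be safe.
print(constant_for("a", {5}, definitions))
```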
![Page 26: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/26.jpg)
Available Expressions
![Page 27: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/27.jpg)
Available Expressions
Block 1: a = 0
Block 2: b = a + 1; c = c + b; a = a + 1; a < N
Block 3: return c
The problem: at a given program point, we’d like to know whether or not the value of an expression e has been calculated and is also available.
1. The expression e must be calculated on every path to the point, and
2. variables used in e must not have been redefined after the initial calculation.
E.g., has the right-side expression “a+1” been calculated and is it thus available at “a = a + 1”? If so, the second calculation can be avoided!
![Page 28: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/28.jpg)
Implementation
Block 1: a = 0
Block 2: b = a + 1; c = c + b; a = a + 1; a < N
Block 3: return c
Calculate gen and kill for each block, based on the equations for a statement. (Tiger table 17.4)
All possible expressions:
ALL = {a+1, c+b}
gen[1] = {}
kill[1] = {a+1}
gen[2] = {}
kill[2] = ALL
gen[3] = {}
kill[3] = {}
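To see why gen[2] is empty even though block 2 computes both expressions, it helps to walk the block statement by statement: each assignment kills every expression that reads its target, including the expression it has just computed. The sketch below does this under assumed encodings (expressions as strings, an `operands` table, and the comparison a<N left out because it defines nothing).

```python
ALL = {"a+1", "c+b"}
operands = {"a+1": {"a"}, "c+b": {"c", "b"}}          # variables each expression reads

# Each statement is (target variable, computed expression or None).
block1 = [("a", None)]                                 # a = 0
block2 = [("b", "a+1"), ("c", "c+b"), ("a", "a+1")]    # a<N is omitted: it defines nothing
block3 = []                                            # return c

def gen_kill(stmts):
    """Fold the statement-level gen/kill sets over one block, front to back."""
    gen, kill = set(), set()
    for target, expr in stmts:
        killed = {e for e in ALL if target in operands[e]}
        if expr is not None:
            gen.add(expr)
        gen -= killed          # expressions that read the target are no longer available
        kill |= killed
    return gen, kill

for name, block in (("1", block1), ("2", block2), ("3", block3)):
    print(name, gen_kill(block))
```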
![Page 29: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/29.jpg)
Implementation
Block 1: a = 0
Block 2: b = a + 1; c = c + b; a = a + 1; a < N
Block 3: return c
Calculate in/out for each block, based on the fixpoint algorithm.
gen[1] = {}
kill[1] = {a+1}
gen[2] = {}
kill[2] = ALL
gen[3] = {}
kill[3] = {}
All possible expressions:
ALL = {a+1, c+b}

| block | in/out (initial) | in/out (iteration 1) |
|---|---|---|
| 1 | {} ALL | {} {} |
| 2 | ALL ALL | {} {} |
| 3 | ALL ALL | {} {} |
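Unlike the two previous analyses, this one starts from the full set ALL (except at the entry block) and combines predecessors by intersection, since an expression must be available on every incoming path. A minimal sketch, assuming the same three-block CFG encoding as before and an illustrative function name:

```python
ALL = frozenset({"a+1", "c+b"})
pred = {1: [], 2: [1, 2], 3: [2]}          # hypothetical CFG encoding
gen = {1: set(), 2: set(), 3: set()}
kill = {1: {"a+1"}, 2: set(ALL), 3: set()}

def available_expressions(blocks, entry, pred, gen, kill):
    """Iterate from the optimistic initial value ALL down to the fixpoint."""
    av_in = {b: set() if b == entry else set(ALL) for b in blocks}
    av_out = {b: set(ALL) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                new_in = set()                 # nothing is available on entry
            else:
                new_in = set(ALL).intersection(*(av_out[q] for q in pred[b]))
            new_out = gen[b] | (new_in - kill[b])
            if (new_in, new_out) != (av_in[b], av_out[b]):
                av_in[b], av_out[b] = new_in, new_out
                changed = True
    return av_in, av_out

av_in, av_out = available_expressions([1, 2, 3], 1, pred, gen, kill)
print(av_in)
print(av_out)
```

At block granularity nothing is available anywhere in this example; the interesting facts only show up inside block 2, at statement level.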
![Page 30: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/30.jpg)
Implementation
Calculate in/out for each statement, based on the in/out for each block.
Available expressions before and after each statement:
Block 1: {} a = 0 {}
Block 2: {} b = a + 1 {a+1} c = c + b {a+1} a = a + 1 {} a < N {}
Block 3: {} return c {}
![Page 31: Data Flow Analysis Compiler Baojian Hua bjhua@ustc.edu.cn.](https://reader036.fdocuments.in/reader036/viewer/2022081503/56649e4d5503460f94b438ef/html5/thumbnails/31.jpg)
Common Sub-expression Elimination (CSE)
Available expressions before and after each statement:
Block 1: {} a = 0 {}
Block 2: {} b = a + 1 {a+1} c = c + b {a+1} a = a + 1 {} a < N {}
Block 3: {} return c {}
E.g., has the right-side expression “a+1” been calculated and is it thus available at “a = a + 1”?
After the available expression analysis, we know “a+1” is available there, so the second calculation can be omitted!
Figure: the same CFG after rewriting, with the second calculation of “a+1” replaced by “b”.
But with which variable should the expression “a+1” be substituted? We need to do reaching expression analysis... (Read the text and do homework!)