A Framework for Unrestricted Whole-Program Optimization
Spyridon Triantafyllis, Matthew J. Bridges, Easwaran Raman, Guilherme Ottoni, David I. August
The Liberty Research Group
Department of Computer Science
Princeton University
2 Princeton University
Velocity Compiler Research
Procedure-Based Compilation
[Figure: example program with three procedures. g() contains z=x*y, a call f(x,y,5), and …=z; h() contains a call f(1,2,3); f(a,b,c) contains if (EOB), d=a*b, fill B, and ret.]
Pros:
• Well known
Cons:
• Cannot exploit opportunities that cross procedures
Interprocedural Analysis
Interprocedural Analysis [Sharir’78] [Morel’78] [Reps’95]
Pros:
• Increases available information
• Enables some optimization across procedure boundaries
Cons:
• Has to analyze the entire program
• Optimizations need to respect the procedure boundary
[Figure: example program with three procedures. g() contains z=x*y, a call f(x,y,5), and …=z; h() contains a call f(1,2,3); f(a,b,c) contains if (EOB), d=a*b, fill B, and ret.]
Interprocedural Analysis & Interprocedural Optimization
Interprocedural Analysis [Sharir’78] [Morel’78] [Reps’95]
Pros:
• Increases available information
• Enables some optimization across procedure boundaries
Cons:
• Has to analyze the entire program
• Optimizations need to respect the procedure boundary
• Most optimizations will still be intraprocedural
[Figure: the example program after interprocedural optimization. f now takes a fourth parameter, f(a,b,c,z), and contains if (EOB), d=z, fill B, and ret; g() contains z=x*y, a call f(x,y,5,z), and …=z; h() contains a call f(1,2,3,2).]
Inlining
Inlining [Scheifler ’77] [Hwu’89] [Chang’92]
Pros:
• Increases optimization scope
• Enables specialization
• Doesn’t require optimizations to understand interprocedural concerns
Cons:
• Hard limit on procedure size
• Unnecessary code growth
[Figure: the example program after inlining f into g. g’() contains z=x*y, if (EOB), d=z, fill B, jump, and …=z; h() contains a call f(1,2,3); f(a,b,c) still contains if (EOB), d=a*b, fill B, and ret.]
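As a rough illustration of why inlining enlarges the optimization scope, the Python sketch below mirrors the slide's example. The function bodies are assumptions: the slide does not show what f returns or what `fill B` does, so `f` here simply returns `d + c`, and `g` returns `z` plus the call result (both hypothetical). Once the call is inlined, the product `a*b` is visibly the same value as `z` and can be reused, and the constant argument 5 can be folded in.

```python
def f(a, b, c):
    # Callee recomputes a product; compiled in isolation, nothing says
    # that a caller already holds the same value.
    d = a * b
    return d + c  # hypothetical return value, not shown on the slide


def g(x, y):
    # Original caller: computes z, then calls f with the same operands.
    z = x * y
    return z + f(x, y, 5)


def g_inlined(x, y):
    # g after inlining f and removing the redundant multiply.
    z = x * y
    d = z                 # a*b with a=x, b=y is the value already in z
    return z + (d + 5)    # parameter c specialized to the constant 5


if __name__ == "__main__":
    print(g(3, 4), g_inlined(3, 4))
```

The two versions compute the same result; the inlined one performs a single multiplication, at the cost of duplicating f's body in g.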
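Partial Inlining
Partial Inlining [Suganuma’03] [Way’00]
Pros:
• Can alleviate some code growth
Cons:
• Gains are limited
[Figure: the example program after partial inlining. g() contains z=x*y, if (EOB), d=z, a call f’(), jump, and …=z; f’() contains fill B and return; h() contains a call f(1,2,3); f(a,b,c) still contains if (EOB), d=a*b, fill B, and ret.]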
Why Procedures?
Procedures:
• Calling convention boundary
• Single-entry, single-exit
Pros:
• Implicit correlated edges - context sensitivity
• Natural unit for divide & conquer compilation
Cons:
• Optimized for software engineering
• Restricts optimization
We don’t have to use procedures!
[Figure: the example program again (g() with z=x*y, f(x,y,5), and …=z; h() with f(1,2,3); f(a,b,c) with if (EOB), fill B, and ret), with d=a*b shown next to d=z.]
The Whole-Program CFG
Retain useful traits of procedures:
• Correlated edges
• Compilation unit
Goal: Obtain an optimizable whole-program representation
• Increase optimization scope
• Allow all optimizations to operate on the increased scope without change
• Targeted code growth
[Figure: a single control-flow graph containing z=x*y, if (EOB), d=a*b, fill B, jump, and …=z.]
The Whole-Program CFG
[Figure: whole-program CFG with nodes A, B, C, D, E, F, G, and H; call arcs are labeled (1 and (2, and return arcs are labeled )1 and )2.]
Represent calls and returns as special control-flow transitions [Sharir’78]
Retain useful traits of procedures:
• Correlated edges
• Compilation unit
Goal: Obtain an optimizable whole-program representation
• Increase optimization scope
• Allow all optimizations to operate on the increased scope without change
• Targeted code growth
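The idea of calls and returns as special control-flow transitions can be sketched in Python as a graph whose edges carry optional call/return labels. This is a minimal illustration, not the authors' data structure: the edge layout is an assumed reading of the diagram (A calls into B via (1 and resumes at F via )1; G calls into B via (2 and resumes at H via )2), and `EDGES` and `is_realizable` are hypothetical names. A path counts as realizable when every return arc matches the most recent unmatched call arc, which is what keeps call and return edges correlated.

```python
# Edge label: ("call", i) opens call site i, ("ret", i) closes it,
# None marks an ordinary control-flow edge.
# The topology below is an assumed reading of the slide's diagram.
EDGES = {
    ("A", "B"): ("call", 1),
    ("G", "B"): ("call", 2),
    ("B", "C"): None,
    ("B", "D"): None,
    ("C", "E"): None,
    ("D", "E"): None,
    ("E", "F"): ("ret", 1),
    ("E", "H"): ("ret", 2),
}


def is_realizable(path):
    """Return True if the path follows graph edges and each return arc
    matches the most recent unmatched call arc on the path."""
    pending_calls = []
    for src, dst in zip(path, path[1:]):
        if (src, dst) not in EDGES:
            return False              # not an edge of the graph
        label = EDGES[(src, dst)]
        if label is None:
            continue                  # ordinary edge, nothing to match
        kind, site = label
        if kind == "call":
            pending_calls.append(site)
        elif pending_calls and pending_calls.pop() != site:
            return False              # returned to the wrong call site
        # A return with no pending call is accepted: the path may have
        # started inside the callee.
    return True


if __name__ == "__main__":
    print(is_realizable(["A", "B", "C", "E", "F"]))  # call 1, return 1
    print(is_realizable(["A", "B", "C", "E", "H"]))  # call 1, return 2
```

Accepting an unmatched return at the start of a path is a design choice made here for simplicity; an analysis that only considers paths from program entry would reject it.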
Whole-Program Optimizations
[Figure: the whole-program CFG after optimization. Node labels shown: A, B (twice), C, D, E, F, G, H, C’, E’, and F’; arc labels shown: (1, (2, )1 (twice), and )2.]
Optimization destroys the program’s procedural structure!
• Example: Superblock Formation [Hwu’92]
Unconventional call structures!
• Many-to-many call <-> return relation
• Must rediscover structure for summary edges
Analyzing the Whole-Program CFG
Context-Sensitive Interprocedural Analysis [Sharir’78]: meet over all realizable paths
Identify Entry-Exit Pairs (EEP):
• Correlated call & return arcs
• Allows use of summary edges
• Blocks may belong to more than one EEP
[Figure: the optimized whole-program CFG. Node labels shown: A, B, C, D, E, F, G, H, B’, C’, E’, and F’; arc labels shown: (1, (2, )1 (twice), and )2.]
Entry-exit pairs and the blocks listed with each:
• (BE): C’, B’, E’
• (BF’): C, B’, E, D
• (B’F’): C, B, E, D
Determining a Compilation Unit: Region Formation
[Figure: the optimized whole-program CFG. Node labels shown: A, B, C, D, E, F, G, H, B’, C’, E’, and F’; arc labels shown: (1, (2, )1 (twice), and )2.]
Region Formation [Hank’95]
• Arbitrarily shaped, compiler-selected compilation unit
Region Selection
• Select seed & add neighbors (profile, structure, dataflow …)
Success Criteria
• Optimizability vs. compile time
• Few regions that are too small or too big
• Intra-region transitions » inter-region transitions
Encapsulation
• Make regions independently optimizable
Compiler is free to select its own optimization units!
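The "select seed & add neighbors" step can be sketched as a greedy, profile-driven grouping. The Python below is only an illustration under stated assumptions, not the heuristic used in the talk: `form_regions`, `max_size`, and `ratio` are hypothetical names, block and edge weights stand in for profile counts, and the only admission rule is that an edge must be at least `ratio` times as hot as the seed block. The size cap is a crude stand-in for the "few too small or too big regions" criterion.

```python
def form_regions(block_weight, edge_weight, max_size=4, ratio=0.5):
    """Greedily partition blocks into regions.

    block_weight: {block: profile count}
    edge_weight:  {(src, dst): profile count}
    Hypothetical heuristic: seed with the hottest unassigned block, then
    add neighbors reached over edges at least `ratio` as hot as the seed.
    """
    # Treat edges as undirected when looking for neighbors.
    neighbors = {}
    for (u, v), w in edge_weight.items():
        neighbors.setdefault(u, []).append((v, w))
        neighbors.setdefault(v, []).append((u, w))

    unassigned = set(block_weight)
    regions = []
    while unassigned:
        # Hottest remaining block; ties broken by name for determinism.
        seed = max(sorted(unassigned), key=lambda b: block_weight[b])
        unassigned.remove(seed)
        region, frontier = [seed], [seed]
        while frontier and len(region) < max_size:
            block = frontier.pop()
            # Visit the hottest edges first.
            for nxt, w in sorted(neighbors.get(block, []), key=lambda p: -p[1]):
                hot_enough = w >= ratio * block_weight[seed]
                if nxt in unassigned and hot_enough and len(region) < max_size:
                    unassigned.remove(nxt)
                    region.append(nxt)
                    frontier.append(nxt)
        regions.append(region)
    return regions


if __name__ == "__main__":
    # Made-up profile: the path A-B-C-E is hot, D is cold.
    blocks = {"A": 100, "B": 100, "C": 90, "D": 10, "E": 100}
    edges = {("A", "B"): 100, ("B", "C"): 90, ("B", "D"): 10,
             ("C", "E"): 90, ("D", "E"): 10}
    print(form_regions(blocks, edges))
```

With this made-up profile the hot path ends up in one region and the cold block in another, so most transitions stay inside a region, in the spirit of the third success criterion.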
Evaluation Framework: The Velocity Compiler
[Figure: three compilation pipelines, labeled Baseline, Inlining, and PBE, each running from Frontend to Executable. Every pipeline includes a Classical & ILP Optimizer, a Superblock stage, and Scheduling. The Inlining pipeline adds an Inlining stage; the PBE pipeline adds WCFG and Region Form. stages and works on Regions rather than Procedures. The stages are grouped into "Determine Compilation Unit" and "Optimize Compilation Unit".]
Evaluation:
• Inliner & optimizations ported from IMPACT
• Targeting Itanium 2
Code Growth
[Figure: bar chart of code size (axis from 0.90 to 1.90) for Inlining and PBE on 124.m88ksim, 129.compress, 164.gzip, 179.art, 181.mcf, 183.equake, 188.ammp, 256.bzip2, and the geometric mean. Data labels shown: 1.45 and 1.23.]
Speedup - Train Input
[Figure: bar chart of speedup (axis from 0.95 to 1.30) for Inlining and PBE on 124.m88ksim, 129.compress, 164.gzip, 179.art, 181.mcf, 183.equake, 188.ammp, 256.bzip2, and the geometric mean. Data label shown: 1.07.]
Conclusion
Procedure boundaries restrict optimization!
Ways to deal with procedures exist, but they are limited
• Interprocedural analysis & optimization: scales badly, not always possible
• Inlining: unnecessary code growth
• Procedures are not the right compilation unit
PBE offers unrestricted and practical whole-program optimization
• An expanded form of interprocedural analysis
• New region formation framework and heuristics
• An interprocedural region encapsulation method