Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1...
Transcript of Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1...
![Page 1: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/1.jpg)
1
Register Allocation forIrregular Architectures
David KoesCALCM
4/20/2004
![Page 2: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/2.jpg)
2
Motivation
int i,j,k;short a,b,c;
r0r1r2
…
r30r31
Good register allocationcritical for performance
![Page 3: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/3.jpg)
3
Outline
• Register Allocation Overview• ...for Irregular Architectures• Previous Work• A New Hope?
![Page 4: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/4.jpg)
4
v <- 1
w <- v + 3
x <- w + v
u <- v
t <- u + x
<- x
<- w
<- t
<- u
Example
need to assign a register tohold the value of each variable
![Page 5: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/5.jpg)
5
Register Allocation
• For fixed number of registers does an assignment exist? if not, what should be spilled (later...)
• Find the assignment
When can we not assign two variablesto the same register?
![Page 6: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/6.jpg)
6
Liveness
• A variable is live at a point if thevariable might be used later in theprogram
• Variables live at the same point cannot be in the same register interfere
![Page 7: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/7.jpg)
7
Interference
v <- 1
w <- v + 3
x <- w + v
u <- v
t <- u + x
<- x
<- w
<- t
<- u
Variables that are liveat the same pointinterfereHow to represent this?
![Page 8: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/8.jpg)
7
Interference
v <- 1
w <- v + 3
x <- w + v
u <- v
t <- u + x
<- x
<- w
<- t
<- u
v
w
Variables that are liveat the same pointinterfereHow to represent this?
![Page 9: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/9.jpg)
8
Interference Graph
v <- 1
w <- v + 3
x <- w + v
u <- v
t <- u + x
<- x
<- w
<- t
<- u
v
x w
u
t
Variable → NodeInterference → Edge
Register Assignment →Graph Coloring
![Page 10: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/10.jpg)
9
Graph Coloring
v <- 1
w <- v + 3
x <- w + v
u <- v
t <- u + x
<- x
<- w
<- t
<- u
v
x w
u
t
Given k colors, is itpossible to color thenodes of a graphsuch that a nodedoes not have thesame color as any ofits neighbors?
Register → Color
![Page 11: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/11.jpg)
9
Graph Coloring
v <- 1
w <- v + 3
x <- w + v
u <- v
t <- u + x
<- x
<- w
<- t
<- u
v
x w
u
t
Given k colors, is itpossible to color thenodes of a graphsuch that a nodedoes not have thesame color as any ofits neighbors?
Register → Color
![Page 12: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/12.jpg)
10
Graph Coloring Register Allocation
• Compute liveness information• Build interference graph• Use heuristics to color graph
use local properties like node degree suboptimal: commit to coloring decisions
• Coloring succeeds - done• Coloring fails - must spill
spilling “removes” variable from graph
![Page 13: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/13.jpg)
11
Spilling
v <- 1
w <- v + 3
x <- w + v
u <- v
t <- u + x
<- x
<- w
<- t
<- u
v
x w
u
t
k = 3Impossible to colorSpill a variable
-allocate to memory-pick using heuristic
![Page 14: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/14.jpg)
12
Spilled
v
x
u
t
w1 w2 w3
v <- 1
w1 <- v + 3
Mw[]<- w1w2 <- Mw[]
x <- w2 + v
u <- v
t <- u + x
<- x
w3 <- Mw[]
<- w3 <- t
<- u
![Page 15: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/15.jpg)
13
Register Allocation Summary
• Build interference graph• Color using heuristics• If not colorable, spill
repeat process [Briggs94]
single pass all uncolored variables
spilled much faster compile time
Build
Color
Spill
unallocated program
allocated program
Graph Coloring Register Allocator
![Page 16: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/16.jpg)
14
Outline
• Register Allocation Overview• ...for Irregular Architectures• Previous Work• A New Hope
![Page 17: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/17.jpg)
15
Irregular Architectures
• Few registers• Register usage restrictions
address registers, hardwired registers...• Memory operands• Examples:
x86, 68k, ColdFire,ARM Thumb, MIPS16,V800, various DSPs...
eaxebxecxedxesiedi
ebpesp
![Page 18: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/18.jpg)
16
Register Allocation forIrregular Architectures
• Graph coloring register allocation used gcc, ORC, SUIF, GHS, IMPACT, IBM
• Assertion: Graph coloring is the wrong
representation for performing registerallocation on irregular architectures
![Page 19: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/19.jpg)
17
Fewer Registers → More Spills
• Used gcc to compile>10,000 functionsfrom Mediabench,Spec95, Spec2000,and micro-benchmarks
• Recorded for whichfunctions graphcoloring failed
Percent of functions with no spills
97.390
54.35
0
20
40
60
80
100
PPC(32)
68k(16)
x86 (8)
Per
cen
t
![Page 20: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/20.jpg)
18
PPC (32 registers)
![Page 21: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/21.jpg)
19
68k (16 registers)
![Page 22: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/22.jpg)
20
x86 (8 registers)
![Page 23: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/23.jpg)
21
Graph Coloring and Spills
• Graph coloring solves the registersufficiency problem Even if P=NP, suboptimal if spills
necessary• No optimization of spill code• Many spills may slow down allocator
has to rebuild interference graph
![Page 24: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/24.jpg)
22
Register Usage Restrictions
• Example68kMOVE (ptr),tmp1
EOR #3,tmp1
MOVEQ #32,tmp2 or MOVEA #32,tmp2ADD tmp2,ptr
MOVE tmp1,D0
address registerdata registerany register
A0 D0A1 D1A2 D2A3 D3A4 D4A5 D5A6 D6A7 D7
![Page 25: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/25.jpg)
23
Register Usage Restrictions
• Instructions may require or prefer a specificsubset of registers 68k address/data registers
MOVEA #1,A0 // 4 byte instruction MOVEQ #1,D0 // 2 byte instruction
x86 div instruction• Graph coloring assumes all colors are
equally applicable no principled way to express preferences requirements may be mutually exclusive
![Page 26: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/26.jpg)
24
Memory Operands
• A variable allocated to memory maynot require load/store to access depends upon instruction still less efficient than register access
• Graph coloring (usually) spillsvariables which make graph easier tocolor may not be an efficient variable to spill
![Page 27: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/27.jpg)
25
Graph Coloring Wrong Approach forIrregular Architectures
• Solves wrong problem focuses attention on preventing spills doesn’t optimize spill code
• No representation of irregular features• Variables assigned single register
complicates meeting usage restrictions live range splitting partial solution
![Page 28: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/28.jpg)
26
Outline
• Register Allocation Overview• ...for Irregular Architectures• Previous Work
Graph coloring improvements Integer Programming Separated IP PBQP
• A New Hope
![Page 29: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/29.jpg)
27
Graph Coloring Improvements
• Spill code optimization better heuristics [Bernstein et al 89] partial spilling [Bergner et al 97]
• Register usage constraints modified interference graph [Briggs 92] weighted interference graph/modified
heuristics [Smith and Holloway 01]
![Page 30: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/30.jpg)
28
Outline
• Register Allocation Overview• ...for Irregular Architectures• Previous Work
Graph coloring improvements Integer Programming Separated IP PBQP
• A New Hope
![Page 31: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/31.jpg)
29
Integer Programming (IP)
• Minimize/maximize linear function• Subject to linear constraints• Solution must be integer• Example
€
Maximize z = x1 + x2
subject to2x1 + 3x2 ≤12,x1 ≤ 4, x2 ≤ 3 5
1,4
:Solution
21
=
==
z
xx
![Page 32: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/32.jpg)
30
Register Allocation as IP
• Simplified example
1,,0
1
subject to
233min
≤≤
≥++
++∑
cba
cba
cba
mmm
mmm
mmma <-b <-c <- a + b <- a + c <- b
mvar is a decision variable0 means var is in register1 means var is in memory
![Page 33: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/33.jpg)
31
IP: Good News
• IP can precisely model registerallocation [Goodwin and Wilken 96] including irregular architecture features
[Kong and Wilken 98] can exploit structure of register allocation
problem to improve compile time[Fu and Wilken 2002]
• Can solve problem without integerconditions in polynomial time
![Page 34: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/34.jpg)
32
IP: Bad News
• With integer conditions problem isNP-complete
• No polynomial guarantee• Does not get feasible solution quickly
can’t just impose time limits and get ausable, if suboptimal, solution
![Page 35: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/35.jpg)
33
IP: Results
• SPEC92 (integer)• x86, models many irregular features
• 61% reduction in runtime spill codeoverhead
• >15 minutes on 2.4% of SPEC92functions
T. Kong and K. Wilken, “Precise Register Allocation for Irregular Architectures” 1998
![Page 36: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/36.jpg)
34
Outline
• Register Allocation Overview• ...for Irregular Architectures• Previous Work
Graph coloring improvements Integer Programming Separated IP PBQP
• A New Hope
![Page 37: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/37.jpg)
35
Separated IP
• Separate allocation and assignment[Appel and George 01]
• Use IP to optimally insert spill code also model some x86 features
• Result never has more than k livevariables at any point not necessarily k-colorable insert moves at every program point
![Page 38: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/38.jpg)
36
Separated IP: Second Pass
• Second pass performs assignment andremoves moves use heuristic solution [Park and Moon 98] optimal solution (IP) not tractable left as an open problem in paper
![Page 39: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/39.jpg)
37
Separated IP: Results
Overall 9.5% improvement in execution speed
A. W. Appel, L. George. “Optimal spilling for CISC machines with few registers.”
![Page 40: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/40.jpg)
38
Separated IP: Limitations
• Can still be prone to exponentialblow-up in first pass may not provide intermediate solution
• Second pass not optimal• Claims to be faster than full IP solution
different compilers, benchmarks, sourcelanguages, and target architectures
![Page 41: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/41.jpg)
39
Outline
• Register Allocation Overview• ...for Irregular Architectures• Previous Work
Graph coloring improvements Integer Programming Separated IP PBQP
• A New Hope
![Page 42: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/42.jpg)
40
Partitioned Boolen QuadraticOptimization Problem Formulation
• Similar to IP [Scholz and Eckstein 02] minimize quadratic function decision variables 0-1 constraints incorporated into function
![Page 43: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/43.jpg)
41
Partitioned Boolen QuadraticOptimization Problem Formulation
• Advantages Can fully model irregular features Fast, polynomial approximation performs
well in practice• Disadvantages
Approximation algorithm not bounded No iterative way to improve upon solution
![Page 44: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/44.jpg)
42
PBQP: Results
• Caramel 20xx DSP very irregular register requirements
• Geometric mean improvement Optimal: 5.85% Approximation: 3.93%
![Page 45: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/45.jpg)
43
Comparison
OptimalPolynomialrunningtime
Modelsirregularfeatures
Optimizes spillcode
Method
no/yesyes/noyesyesPBQP
nonoyesyesSeparated IP
yesnoyesyesIntegerProgramming
noyessome, withheuristics
withheuristics
GraphColoring
![Page 46: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/46.jpg)
44
Outline
• Register Allocation Overview• ...for Irregular Architectures• Previous Work
Graph coloring improvements Integer Programming Separated IP PBQP
• A New Hope
![Page 47: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/47.jpg)
45
New Problem FormulationGoals
• Explicitly represent architecturalirregularities and costs
• An optimum solution results inoptimal register allocation
• Suboptimal solution algorithm scales more computation → better solutionωdecent feasible solution obtained quickly competitive with current allocators
![Page 48: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/48.jpg)
46
One Possibility:Multicommodity Network Flow
• Given network (directed graph) with cost and capacity on each edge sources & sinks for multiple commodities
• Find lowest cost flow of commodities• Many different applications
communication networks, transportationnetworks, distribution networks, etc
• NP-complete for integer flows
![Page 49: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/49.jpg)
47
MCNF: Example
a b
a b
2
22 4
444
Thin edges havecapacity of oneThick edges haveinfinite capacityCost is zerounless labeled
![Page 50: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/50.jpg)
47
MCNF: Example
a b
a b
2
22 4
444
instruction
crossbarA D M
A D M
A D M
Thin edges havecapacity of oneThick edges haveinfinite capacityCost is zerounless labeled
![Page 51: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/51.jpg)
48
Register Allocation as MCNF
• Variables → Commodities• Variable Usage → Network Design• Registers Limits → Bundle Constraints• Spill Costs → Edge Costs• Variable Definition → Source• Variable Last Use → Sink
![Page 52: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/52.jpg)
49
Example
int foo(int a, int b)
{
int c = a-b;
return c/b;
}
![Page 53: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/53.jpg)
50
MCNF Representation
• Explicitly optimizes spill code,memory operands, and registerpreferences represented by edge costs
• Most restrictions on register usageeasily modeled capacity and bundle constraints
• Compact representation
![Page 54: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/54.jpg)
51
MCNF as Integer Program
• Variable for everycommodity forevery edge flow of that
commodity alongthat edge
• Flow constraints bundle network capacity
€
Minimize ckxkk∑
subject to
xijk
k∑ ≤ uij
Nxk = bk
0 ≤ xijk ≤ uij
k
![Page 55: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/55.jpg)
52
Solving an MCNF
• Can use standard IP solvers• Can exploit structure of problem
variety of MCNF specific solvers empirically faster than IP solvers integer solution still worst case exponential
• Noninteger solutions used to get integersolution used to reduce search space
branch and bound branch and cut
![Page 56: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/56.jpg)
53
Lagrangian Relaxation
• Bring constraints into min function
• Lagrangian multipliers: edge price subgradient optimization finds optimal price
• Relaxation removes bundle constraints€
L(w) =min ckxk + wij xijk −uij
k∑
i, j∑
k∑
L(w) =min cijk +wij( )xijk
i, j∑
k∑ − wijuij
i, j∑
![Page 57: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/57.jpg)
53
Lagrangian Relaxation
• Bring constraints into min function
• Lagrangian multipliers: edge price subgradient optimization finds optimal price
• Relaxation removes bundle constraints€
L(w) =min ckxk + wij xijk −uij
k∑
i, j∑
k∑
L(w) =min cijk +wij( )xijk
i, j∑
k∑ − wijuij
i, j∑
![Page 58: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/58.jpg)
54
Heuristic Solution
• Iterate solve independent single commodity
network flows in Lagrangian relaxation update Lagrangian multipliers
• Converge (or terminate at cutoff)• Use prices to guide greedy algorithm
build solution from single commodityflow subproblems
![Page 59: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/59.jpg)
55
Summary
• Graph coloring wrong approach forirregular architectures
• Other approaches can fully model architecture often optimal no performance guarantee
• Multicommodity network flow promising new formulation
![Page 60: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/60.jpg)
56
Questions?
![Page 61: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/61.jpg)
57
k-live, but not k-colorable
a <-
b <- <- a <- b c <- a <- <- c
c b
![Page 62: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/62.jpg)
57
k-live, but not k-colorable
a <-
b <- <- a <- b c <- a <- <- c
a
c b
![Page 63: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/63.jpg)
57
k-live, but not k-colorable
a <-
b <- <- a <- b c <- a <- <- c
a
c b
![Page 64: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/64.jpg)
57
k-live, but not k-colorable
a <-
b <- <- a <- b c <- a <- <- c
a
c b
![Page 65: Register Allocation for Irregular Architecturesdkoes/research/irregular.pdf · short a,b,c; r0 r1 r2 … r30 r31 Good register allocation ... EOR#3,tmp1 MOVEQ#32,tmp2or MOVEA #32,tmp2](https://reader033.fdocuments.in/reader033/viewer/2022060605/605aee5abf32cd3f175d8dd3/html5/thumbnails/65.jpg)
57
k-live, but not k-colorable
a <-
b <- <- a <- b c <- a <- <- c
a
c b