CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR...

48
CS153: Compilers Lecture 21: Register Allocation Stephen Chong https://www.seas.harvard.edu/courses/cs153 Contains content from lecture notes by Steve Zdancewic and Greg Morrisett

Transcript of CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR...

Page 1: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

CS153: Compilers Lecture 21: Register Allocation

Stephen Chong https://www.seas.harvard.edu/courses/cs153Contains content from lecture notes by Steve Zdancewic and Greg Morrisett

Page 2: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Pre-class Puzzle

•What’s the minimum number of colors needed to color a map of the USA?

2

•Every state is assigned one color

•Adjacent states must be given different colors

https://printable-maps.blogspot.com/2011/12/blank-map-of-united-states.html

Page 3: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Pre-class Puzzle Answer

•4

•Four-color theorem says ≤4

•Must be at least 4: •Suppose we had only 3 colors •Pick some colors for CA and OR

(Red and Green) •NV must be Blue •ID must be Red •AZ must be Green •UT!!!!!!

3

Red

Green

Blue

Red

Green

!!!

Page 4: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Announcements

•HW5: Oat v.2 out •Due in one week: Tue Nov 19

•HW6 released today •Due in 3 weeks: Tue Dec 3

4

Page 5: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Today

•HW6 overview •Register allocation

•Graph coloring by simplification •Coalescing •Coloring with coalescing • Pre-colored nodes to handle callee-save, caller-save, and special purpose registers

5

Page 6: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

HW6

•Analysis and optimization •Implement generic iterative dataflow •Implement Alias Analysis

•Implement Dead-Code Elimination •Implement Constant Propagation

•Register allocation

•Performance •Post a test case

•Optional: participate in leaderboard •More optimizations...

•Extra credit available! •Sophisticated register allocation, additional optimizations

6

Page 7: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Yet More General Dataflow Analysis

•Gen-kill framework suits many dataflow analyses •Forward: out[n] := gen[n] ∪ (in[n] – kill[n])

•Backward: in[n] := gen[n] ∪ (out[n] – kill[n])

•But some analyses can’t be phrased in this way •E.g., constant propagation, alias analysis

•Instead, characterize a (forward) dataflow analysis by: •Domain of dataflow values L

• The info we are computing

• E.g., for reaching definitions, L is the set of definitions

•Flow function for instruction n, Fn : L→L

•For gen-kill analyses, Fn(ℓ) = gen[n] ∪ (ℓ – kill[n])

•Combining operator ⊓ : L→L • “If either ℓ1 or ℓ2 holds just before node n, we know at most ℓ1 ⊓ ℓ2

• in[n] = ⨅n’∈pred[n] out[n’]

• E.g., for may analyses ⊓ is ∪ (set union), for must analyses ⊓ is ∩ (set intersection)

•(Backwards analysis is similar) 7

Page 8: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Generic Iterative Forward Analysis

•⊤ is the “top element” of L, typically the “maximum” amount of information

•Having “more” information enables more optimizations •“Maximum” information could be inconsistent with the constraints •Iteration refines the answer, eliminating inconsistencies

8

for all n, in[n] := ⊤, out[n] := ⊤ repeat until no change in ‘in’ and ‘out’ for all n in[n] := ⨅n’∈succ[n] out[n’] out[n] := Fn(in[n]) end end

Page 9: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Constant Propagation

•Domain •L = uid → SymConst

•SymConst = NonConst | Const i | Undef

•Flow function: •Fuid = ins (m) = m[ uid ↦ ⟦ins⟧m ]

•⟦o1 + o2⟧m = •NonConst if ⟦o1⟧m = NonConst or ⟦o2⟧m = NonConst

•Undef if ⟦o1⟧m = Undef or ⟦o2⟧m = Undef

•Const k where k = i + j and ⟦o1⟧m = Const i and ⟦o2⟧m = Const j

•⟦Null⟧m = NonConst

•⟦k⟧m = Const k (i.e., a constant integer operand

•⟦%u⟧m = m(u)

•...

•Combining operator: •m1, m2 : uid → SymConst •(m1 ⊓ m2)(%u) = m1(%u) ⊓ m2(%u)

9

NonConst

Undef

Const -1 Const 0 Const 1 Const 2... ...

Page 10: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Alias Analysis

•Domain •L = uid → SymPtr

•SymPtr = MayAlias | Unique | Undef

•Flow function: •Fuid = ins (m) = F(uid = ins, m)

•F(%s = alloca ..., m) = m[%s ↦ Unique]

•F(%s = load ..., m) = m[%s ↦ MayAlias]

•F(%s = store t %t o, m) = m[%t ↦ MayAlias] • (i.e., %t was stored in a location, and so it may no longer be a unique pointer)

•...

•Combining operator: •m1, m2 : uid → SymPtr •(m1 ⊓ m2)(%u) = m1(%u) ⊓ m2(%u)

10

MayAlias

Undef

Unique

Page 11: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Register Allocation Problem

•Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

•Find: a mapping from temporaries to machine registers such that •program semantics is preserved (i.e., behavior is the same) •register usage is maximized •moves between registers are minimized

•calling conventions / architecture requirements are obeyed

•Stack Spilling •If there are k registers available and m > k temporaries are live at the

same time, then not all of them will fit into registers.

•So: "spill" the excess temporaries to the stack.

11

Page 12: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Linear-Scan Register Allocation

•Simple, greedy register-allocation strategy:

•1. Compute liveness information: live_out(x) •recall: live_out(x)is the set of uids that are live immediately after x's definition

•2. Let pal be the set of usable registers •usually reserve a couple for spill code [our implementation uses rax,rcx]

•3. Maintain "layout" uid_loc that maps uids to locations •locations include registers and stack slots n, starting at n=0

•4. Scan through the program. For each instruction that defines a uid x •used = {r | reg r = uid_loc(y) s.t. y ∈ live_out(x)}

•available = pal - used •If available is empty: // no registers available, spill

uid_loc(x) := slot n ; n = n + 1

•Otherwise, pick r in available: // choose an available register uid_loc(x) := reg r

12

Page 13: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

For HW6

•HW 6 implements two naive register allocation strategies: •no_reg_layout: spill all registers

•greedy_layout: assign registers greedily using linear scan

•Your job: do “better” than these. •Quality Metric:

•registers other than rbp count positively

•rbp counts negatively (it is used for spilling)

•shorter code is better

•Linear scan register allocation should suffice •But… can we do better?

13

Page 14: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Register Allocation

•Register allocation is in generally an NP-complete problem •Can we allocate all these n temporaries to k registers?

•But we have a heuristic that is linear in practice! •Based on graph coloring •Given a graph, can we assign one of k colors to each node such that connected nodes have different colors?

•Here, nodes are temp variables, an edge between t1 and t2 means that t1 and t2 are live at the same time. Colors are registers.

•But graph coloring is also NP-complete! How does that work?

14

Page 15: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Coloring by Simplification

•Four phases

•Build: construct interference graph, using dataflow analysis to find for each program point vars that are live at the same time

•Simplify: color based on simple heuristic •If graph G has node n with k-1 edges, then G-{n} is k-colorable iff G is k-colorable •So remove nodes with degree <k

•Spill: if graph has only nodes with degree ≥k, choose one to potentially spill (i.e., that may need to be saved to stack)

•Then continue with Simplify

•Select: when graph is empty, start restoring nodes in reverse order and color them •When we encounter a potential spill node, try coloring it. If we can’t, rewrite program to

store it to stack after definition and load before use. Try again!

15

Build Simplify Spill Select

Page 16: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Example

16

{live-in: j, k}g := *(j+12)h := k - 1f := g * he := *(j+8)m := *(j+16)b := *(f+0)c := e + 8d := ck := m + 4j := b{live-out: d,j,k}

f

e

m

b

cd

kj

h

g

From Appel Interference graph

Page 17: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

g

Stephen Chong, Harvard University

Simplification (4 registers)

17

f

e

m

b

cd

kj

h

g

Stack:

g

Choose any node with degree <4

Page 18: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

h

Stephen Chong, Harvard University

Simplification (4 registers)

18

f

e

m

b

cd

kj

h

Stack:

gh

Choose any node with degree <4

Page 19: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

k

Stephen Chong, Harvard University

Simplification (4 registers)

19

f

e

m

b

cd

kj

Stack:

ghk

Choose any node with degree <4

Page 20: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

d

Stephen Chong, Harvard University

Simplification (4 registers)

20

f

e

m

b

cd

j

Stack:

ghkd

Choose any node with degree <4

Page 21: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

j

Stephen Chong, Harvard University

Simplification (4 registers)

21

f

e

m

b

c

j

Stack:

ghkdj

Choose any node with degree <4

Page 22: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

e

Stephen Chong, Harvard University

Simplification (4 registers)

22

f

e

m

b

c

Stack:

ghkdje

Choose any node with degree <4

Page 23: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

f

Stephen Chong, Harvard University

Simplification (4 registers)

23

f

m

b

c

Stack:

ghkdjef

Choose any node with degree <4

Page 24: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

b

Stephen Chong, Harvard University

Simplification (4 registers)

24

m

b

c

Stack:

ghkdjefb

Choose any node with degree <4

Page 25: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

c

Stephen Chong, Harvard University

Simplification (4 registers)

25

m

c

Stack:

ghkdjefbc

Choose any node with degree <4

Page 26: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

m

Stephen Chong, Harvard University

Simplification (4 registers)

26

m

Stack:

ghkdjefbcm

Choose any node with degree <4

Page 27: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Select (4 registers)

27

f

e

m

b

cd

kj

h

g

Stack:

ghkdjefbcm

f

e

m

b

cd

kj

h

g

Color nodes in order of stack

=t1 =t2 =t3 =t4

Graph is now empty!

Page 28: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Select (4 registers)

28

f

e

m

b

cd

kj

h

g

f

e

m

b

cd

kj

h

g

g := *(j+12)h := k - 1f := g * he := *(j+8)m := *(j+16)b := *(f+0)c := e + 8d := ck := m + 4j := b

=t1 =t2 =t3 =t4

Page 29: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Select (4 registers)

29

f

e

m

b

cd

kj

h

g

f

e

m

b

cd

kj

h

g

=t1 =t2 =t3 =t4

$t2 := *(t4+12)$t1 := $t1 - 1$t2 := $t2 * $t1$t3 := *($t4+8)$t1 := *($t4+16)$t2 := *($t2+0)$t3 := $t3 + 8$t3 := $t3$t1 := $t1 + 4$t4 := $t2

Some moves might subsequently be simplified...

Page 30: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Spilling

•This example worked out nicely! •Always had nodes with degree <k •Let’s try again, but now with only 3 registers...

30

Page 31: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Simplification (3 registers)

31

f

e

m

b

cd

kj

h

g

Stack:

h

Choose any node with degree <3

h

Page 32: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Simplification (3 registers)

32

f

e

m

b

cd

kj

g

Stack:

h

Choose any node with degree <3

c

c

Page 33: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Simplification (3 registers)

33

f

e

m

b

d

kj

g

Stack:

h

Choose any node with degree <3

g

cg

Page 34: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Simplification (3 registers)

34

f

e

m

b

d

kj

Stack:

h

Now we are stuck! No nodes with degree <3

cg

Pick a node to potentially spill

Choose any node with degree <3

Page 35: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Which Node to Spill?

•Want to pick a node (i.e., temp variable) that will make it likely we’ll be able to k color graph •High degree (≈ live at

many program points) •Not used/defined very

often (so we don’t need to access stack very often)

•E.g., compute spill priority of node

35

f

e

m

b

d

kj

Uses+defs outside loop

Uses+defs in loop ×10 +

degree of node

Page 36: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Which Node to Spill?

36

f

e

m

b

d

kj

Uses+defs outside loop

Uses+defs in loop ×10 +

degree of node

Spill priority =

{live-in: j, k}g := *(j+12)h := k - 1f := g * he := *(j+8)m := *(j+16)b := *(f+0)c := e + 8d := ck := m + 4j := b{live-out: d,j,k}

Page 37: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Simplification (3 registers)

37

f

e

m

b

d

kj

Stack:

h

Pick a node with small spill priority degree to potentially spill

cg

Choose any node with degree <3

d spill?

d

Page 38: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Simplification (3 registers)

38

f

e

m

bkj

Stack:

hcg

Choose any node with degree <3

d spill?kk

Page 39: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Simplification (3 registers)

39

f

e

m

bj

Stack:

hcg

Choose any node with degree <3

d spill? jkj

Page 40: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Simplification (3 registers)

40

f

e

m

b

Stack:

hcg

Choose any node with degree <3

d spill?bk

jb

Page 41: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Simplification (3 registers)

41

f

e

m

Stack:

hcg

Choose any node with degree <3

d spill?e

kjbe

Page 42: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Simplification (3 registers)

42

f

m

Stack:

hcg

Choose any node with degree <3

d spill?

f

kjbef

Page 43: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Simplification (3 registers)

43

m

Stack:

hcg

Choose any node with degree <3

d spill? m

kjbefm

Page 44: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Select (3 registers)

44

Stack:f

e

m

b

cd

kj

h

g

f

e

mjb

Color nodes in order of stack

=t1 =t2 =t3

Graph is now empty!

hcgd spill?kjbefm

k

Page 45: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Select (3 registers)

45

Stack:

hcgd spill?

f

e

m

b

cd

kj

h

g

f

e

m

=t1 =t2 =t3

We got unlucky!

In some cases a potential spill node is still colorable, and the Select phase can continue.

But in this case, we need to rewrite...

bj

k

Page 46: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Select (3 registers)

•Spill d

46

{live-in: j, k}g := *(j+12)h := k - 1f := g * he := *(j+8)m := *(j+16)b := *(f+0)c := e + 8d := ck := m + 4j := b{live-out: d,j,k}

{live-in: j, k}g := *(j+12)h := k - 1f := g * he := *(j+8)m := *(j+16)b := *(f+0)c := e + 8d := c*<fp+doff>:=dk := m + 4j := bd2:=*<fp+doff>{live-out: d2,j,k}

Page 47: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Build

47

{live-in: j, k}g := *(j+12)h := k - 1f := g * he := *(j+8)m := *(j+16)b := *(f+0)c := e + 8d := c*<fp+doff>:=dk := m + 4j := bd2:=*<fp+doff>{live-out: d2,j,k}

f

e

m

b

cd

kj

h

g d2

Page 48: CS153: Compilers Lecture 21: Register Allocation · Register Allocation Problem •Given: an IR program that uses an unbounded number of temporaries •e.g. the uids of our LLVM programs

Stephen Chong, Harvard University

Simplification (3 registers)

48

f

e

m

b

cd

kj

h

g d2

Stack:Choose any node with degree <3

hcgdd2kbmefj

This time we succeed and will be able to complete Select phase successfully!