Compiler Optimisation - 7 Register Allocation · 2019-01-23 · Register allocation Definitions...

Post on 25-May-2020

9 views 1 download

Transcript of Compiler Optimisation - 7 Register Allocation · 2019-01-23 · Register allocation Definitions...

Compiler Optimisation7 – Register Allocation

Hugh LeatherIF 1.18a

hleather@inf.ed.ac.uk

Institute for Computing Systems ArchitectureSchool of Informatics

University of Edinburgh

2019

Introduction

This lecture:Local Allocation - spill codeGlobal Allocation based on graph colouringTechniques to reduce spill code

Register allocation

Physical machines have limited number of registersScheduling and selection typically assume infinite registersRegister allocation and assignment ∞ → k registers

RequirementsProduce correct code that uses k (or fewer) registersMinimise added loads and storesMinimise space used to hold spilled valuesOperate efficiently

O(n), O(nlog2n), maybe O(n2), but not O(2n)

Register allocationDefinitions

Allocation versus assignmentAllocation is deciding which values to keep in registersAssignment is choosing specific registers for values

InterferenceTwo valuesa cannot be mapped to the same register wherever theyare both liveb

Such values are said to interfereaA value is stored in a variablebA value is live from its definition to its last use

Live rangeThe live range of a value is the set of statements at which it is liveMay be conservatively overestimated (e.g. just begin → end)

Register allocationDefinitions

SpillingSpilling saves a value from a register to memoryThat register is then free – Another value often loadedRequires F registers to be reserved

Clean and dirty valuesA previously spilled value is clean if not changed since last spillOtherwise it is dirtyA clean value can b spilled without a new store instruction

Spilling in ILOC

F is 0 (assuming rarp already reserved)Dirty valuestoreAI rx → rarp,@xloadAI rarp,@y ⇒ ry

Clean valueloadAI rarp,@y ⇒ ry

Local register allocation

Register allocation only on basic block

MAXLIVELet MAXLIVE be the maximum, over each instruction i in theblock, of the number of values (pseudo-registers) live at i.

If MAXLIVE ≤ k, allocation should be easyIf MAXLIVE ≤ k, no need to reserve F registers for spillingIf MAXLIVE > k, some values must be spilled to memoryIf MAXLIVE > k, need to reserve F registers for spilling

Two main forms:Top downBottom up

Local register allocationMAXLIVE

Example MAXLIVE computationSome simple code with virtual registers

Local register allocationMAXLIVE

Example MAXLIVE computationLive registers

Local register allocationMAXLIVE

Example MAXLIVE computationMAXLIVE is 4

Local register allocationTop down

Algorithm:If number of values > k

Rank values by occurrencesAllocate first k - F values to registersSpill other values

Local register allocationTop down

Example top downUsage counts

Local register allocationTop down

Example top downSpill rc . Now only 3 values live at once

Local register allocationTop down

Example top downSpill code inserted

Local register allocationTop down

Example top downRegister assignment straightforward

Local register allocationBottom up

Algorithm:Start with empty register setLoad on demandWhen no register is available, free one

Replacement:Spill the value whose next use is farthest in the futurePrefer clean value to dirty value

Local register allocationTop down

Example bottom downSpill ra. Now only 3 values live at once

Local register allocationTop down

Example bottom downSpill code inserted

Global register allocation

Local allocation does not capture reuse of values across multipleblocksMost modern, global allocators use a graph-colouring paradigm

Build a “conflict graph” or “interference graph”Data flow based liveness analysis for interference

Find a k-colouring for the graph, or change the code to anearby problem that it can k-colourNP-complete under nearly all assumptions1

1Local allocation is NP-complete with dirty vs clean

Global register allocationAlgorithm sketch

From live ranges construct an interference graphColour interference graph so that no two neighbouring nodeshave same colourIf graph needs more than k colours - transform code

Coalesce merge-able copiesSplit live rangesSpill

Colouring is NP-complete so we will need heuristicsMap colours onto physical registers

Global register allocationGraph colouring

DefinitionA graph G is said to be k-colourable iff the nodes can be labeledwith integers 1 ... k so that no edge in G connects two nodes withthe same label

Examples

Global register allocationInterference graph

The interference graph, GI = (NI ,EI)

Nodes in GI represent values, or live rangesEdges in GI represent individual interferences∀x , y ∈ NI , x → y ∈ EI iff x and y interfere2

A k-colouring of GI can be mapped into an allocation to k registers

2Two values interfere wherever they are both liveTwo live ranges interfere if their values interfere at any point

Global register allocationColouring the interference graph

Degree3 of a node (n°) is a loose upper bound on colourabilityAny node, n, such that n° < k is always trivially k-colourable

Trivially colourable nodes cannot adversely affect thecolourability of neighbours4

Can remove them from graphReduces degree of neighbours - may be trivially colourable

If left with any nodes such that n° ≥ k spill oneReduces degree of neighbours - may be trivially colourable

3Degree is number of neighbours4Proof as exercise

Global register allocationChaitin’s algorithm

1 While ∃ vertices with < k neighbours in GIPick any vertex n such that n° < k and put it on the stackRemove n and all edges incident to it from GI

2 If GI is non-empty (n° >= k, ∀n ∈ GI) then:Pick vertex n (heuristic), spill live range of nRemove vertex n and edges from GI , put n on “spill list”Goto step 1

3 If the spill list is not empty, insert spill code, then rebuild theinterference graph and try to allocate, again

4 Otherwise, successively pop vertices off the stack and colourthem in the lowest colour not used by some neighbour

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithmColour with k = 3 colours

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithma° = 2 < k Choose a

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithmPush a and remove from graph

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithmb° = 2 < k and c° = 2 < k Choose b

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithmPush b and remove from graph

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithmc° = 2 < k, d° = 2 < k, and e° = 2 < k Choose c

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithmPush c and remove from graph

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithmd° = 1 < k and e° = 1 < k Choose d

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithmPush d and remove from graph

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithme° = 0 < k Choose e

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithmPush e and remove from graph

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithmPop e, neighbours use no colours, choose red

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithmPop d , neighbours use red, choose green

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithmPop c, neighbours use red and green choose blue

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithmPop b, neighbours use red and green choose blue

Global register allocationChaitin’s algorithm

Example: colouring with Chaitin’s algorithmPop a, neighbours use blue choose red

Global register allocationOptimistic colouring

If Chaitins algorithm reaches a state where every node has kor more neighbours, it chooses a node to spill.

Example of Chaitin overzealous spilling

k = 2Graph is 2-colourable

Chaitin must immediately spill one of these nodes

Briggs said, take that same node and push it on the stackWhen you pop it off, a colour might be available for it!

Chaitin-Briggs algorithm uses this to colour that graph

Global register allocationChaitin-Briggs algorithm

1 While ∃ vertices with < k neighbours in GIPick any vertex n such that n° < k and put it on the stackRemove n and all edges incident to it from GI

2 If GI is non-empty (n° >= k, ∀n ∈ GI) then:Pick vertex n (heuristic) (Do not spill)Remove vertex n from GI , put n on stack (Not spill list)Goto step 1

3 Otherwise, successively pop vertices off the stack and colourthem in the lowest colour not used by some neighbour

If some vertex cannot be coloured, then pick an uncolouredvertex to spill, spill it, and restart at step 1

Step 3 is also different

Global register allocationChaitin-Briggs algorithm

Example: colouring with Chaitin-Briggs algorithmColour with k = 2 colours

Global register allocationChaitin-Briggs algorithm

Example: colouring with Chaitin-Briggs algorithma° = 2 ≥ k Don’t Spill! Choose a

Global register allocationChaitin-Briggs algorithm

Example: colouring with Chaitin-Briggs algorithmPush a and remove from graph

Global register allocationChaitin-Briggs algorithm

Example: colouring with Chaitin-Briggs algorithmb° = 1 < k and c° = 1 < k Choose b

Global register allocationChaitin-Briggs algorithm

Example: colouring with Chaitin-Briggs algorithmPush b and remove from graph

Global register allocationChaitin-Briggs algorithm

Example: colouring with Chaitin-Briggs algorithmc° = 1 < k, and d° = 1 < k Choose c

Global register allocationChaitin-Briggs algorithm

Example: colouring with Chaitin-Briggs algorithmPush c and remove from graph

Global register allocationChaitin-Briggs algorithm

Example: colouring with Chaitin-Briggs algorithmd° = 1 < k Choose d

Global register allocationChaitin-Briggs algorithm

Example: colouring with Chaitin-Briggs algorithmPush d and remove from graph

Global register allocationChaitin-Briggs algorithm

Example: colouring with Chaitin-Briggs algorithmPop d , neighbours use no colours, choose red

Global register allocationChaitin-Briggs algorithm

Example: colouring with Chaitin-Briggs algorithmPop c, neighbours use red choose green

Global register allocationChaitin-Briggs algorithm

Example: colouring with Chaitin-Briggs algorithmPop b, neighbours use red choose green

Global register allocationChaitin-Briggs algorithm

Example: colouring with Chaitin-Briggs algorithmPop a, neighbours use green choose red

Global register allocationSpill candidates

Minimise spill cost/ degreeSpill cost is the loads and stores needed. Weighted by scope -i.e. avoid inner loopsThe higher the degree of a node to spill the greater thechance that it will help colouringNegative spill cost load and store to same memory locationwith no other usesInfinite cost - definition immediately followed by use. Spillingdoes not decrease live range

Global register allocationAlternative spilling

Splitting live rangesCoalesce

Global register allocationLive range splitting

A whole live range may have many interferences, but perhapsnot all at the same timeSplit live range into two variables connected by copyCan reduce degree of interference graphSmart splitting allows spilling to occur in “cheap” regions

Global register allocationLive ranges splitting

Splitting exampleNon contiguous live ranges - cannot be 2 coloured

Global register allocationLive ranges splitting

Splitting exampleSplit live ranges - can be 2 coloured

Global register allocationCoalescing

If two ranges don’t interfere and are connected by a copycoalesce into one – opposite of splittingReduces degree of nodes that interfered with bothIf x := y and x → y ∈ GI then can combine LRx and LRy

Eliminates the copy operationReduces degree of LRs that interfere with both x and yIf a node interfered with both both before, coalescing helpsAs it reduces degree, often applied before colouring takes place

Global register allocationCoalescing

Coalescing can make the graph harder to colorTypically, LRxy ° > max(LRx °, LRy °)If max(LRx°, LRy°) < k and k < LRxy ° then LRxy might spill,while LRx and LRy would not spill

Global register allocationCoalescing

Observation led to conservative coalescingConceptually, coalesce x and y iff x → y ∈ GI and LRxy ° < kWe can do better

Coalesce LRx and LRy iff LRxy has < k neighbours withdegree > kOnly neighbours of “significant degree” can force LRxy to spill

Always safe to perform that coalesceCannot introduce a node of non-trivial degreeCannot introduce a new spill

Global register allocationOther approaches

Top-down uses high level priorities to decide on colouringHierarchical approaches - use control flow structure to guideallocationExhaustive allocation - go through combinatorial options -very expensive but occasional improvementRe-materialisation - if easy to recreate a value do so ratherthan spillPassive splitting using a containment graph to make spillseffectiveLinear scan - fast but weak; useful for JITs

Global register allocationOngoing work

Eisenbeis et al examining optimality of combined reg alloc andscheduling. Difficulty with general control-flowPartitioned register sets complicate matters. Allocation canrequire insertion of code which in turn affects allocation.Leupers investigated use of genetic algs for TM seriespartitioned reg sets.New work by Fabrice Rastello and others. Chordal graphsreduce complexityAs latency increases see work in combined code generation,instruction scheduling and register allocation

Summary

Local Allocation - spill codeGlobal Allocation based on graph colouringTechniques to reduce spill code

PPar CDT Advert

The biggest revolution in the technological landscape for fifty years

Now accepting applications! Find out more and apply at:

pervasiveparallelism.inf.ed.ac.uk

• • 4-year programme: 4-year programme: MSc by Research + PhDMSc by Research + PhD

• Collaboration between: ▶ University of Edinburgh’s School of Informatics ✴ Ranked top in the UK by 2014 REF

▶ Edinburgh Parallel Computing Centre ✴ UK’s largest supercomputing centre

• Full funding available

• Industrial engagement programme includes internships at leading companies

• Research-focused: Work on your thesis topic from the start

• Research topics in software, hardware, theory and

application of: ▶ Parallelism ▶ Concurrency ▶ Distribution