Coloring Heuristics for Register Allocation

RETROSPECTIVE: Coloring Heuristics for Register Allocation

Preston Briggs, Cray Research, Inc., 411 First Avenue South, Suite 600, Seattle, WA 98104, [email protected]

Keith D. Cooper, Department of Computer Science, Rice University, Houston, TX 77251-1982, [email protected]

Ken Kennedy, Department of Computer Science, Rice University, Houston, TX 77251-1982, [email protected]

Linda Torczon, Department of Computer Science, Rice University, Houston, TX, [email protected]

1. INTRODUCTION

From the earliest compilers, register allocation was recognized as an important optimization. Indeed, the original Fortran compiler spent two of its six passes on the problem [1]. (That compiler used an approach similar to the linear-scan algorithms being proposed today for just-in-time compilers.) In the early 1960s, the Soviet mathematician Lavrov made the intellectual connection between allocation problems and graph coloring [17]. Unfortunately, Lavrov gave no algorithm; instead, he suggested that it was possible to enumerate all the colorings of the graph and take the best one. Chaitin et al. took Lavrov’s fundamental ideas, developed them, and built the first graph-coloring register allocator [9, 8]. In essence, Chaitin’s allocator builds an interference graph where each node represents a value, computes an ordering on the nodes of that graph, and assigns colors to the nodes in that order.

Our paper introduced an improved coloring strategy that produced better allocations for many graphs on which Chaitin’s method fails. The key difference between our algorithm and Chaitin’s algorithm lies in the timing of spill decisions. While computing the order for coloring, the allocator can reach a point where it cannot find a node in the interference graph that it can provably color. At this point, Chaitin’s allocator spills the value associated with that node and excludes the node from the coloring order. Our method also chooses a spill candidate at this point. Instead of spilling it, the allocator inserts it into the coloring order. The spill candidate either receives a color or it does not, in which case the allocator spills it. Experimental evidence in the paper confirms the effectiveness of this deferred spilling approach. This technique has been adopted in many commercial and research compilers.

2. BACKGROUND

We implemented a graph-coloring register allocator in our compiler for the IBM RT-PC, an early RISC workstation with 16 general-purpose registers and 8 floating-point registers. We were, in general, pleased with the results. In detailed examination of the code for some inner loops, however, we noticed that the allocator overspilled, leaving some registers unused while spilling critical values. This shortcoming led us to re-examine the register allocator and its algorithms. In particular, we began to investigate live range splitting [11, 16, 13].

20 Years of the ACM/SIGPLAN Conference on Programming Language Design and Implementation (1979-1999): A Selection, 2003. Copyright 2003 ACM 1-58113-623-4 ...$5.00.

During this time, the authors attended a colloquium talk that Jorge Moré gave at Rice. Moré’s talk included Matula’s smallest-last coloring algorithm [18]. Our study of Chaitin’s algorithm had produced a simple four-node counterexample, the diamond graph.

[Figure: the diamond graph on nodes a, b, c, and d, with edges a-b, b-c, c-d, and d-a.]

We quickly realized that Matula’s algorithm would two-color this graph. This shifted our attention away from live-range splitting and back onto the fundamentals of Chaitin’s method.

At a high level, Chaitin’s algorithm builds an interference graph, uses the graph to order the nodes for color assignment, and then assigns colors in the specified order. The critical step, for our purposes, is when the algorithm picks the next node to add to the ordering. The algorithm selects any node n with fewer than k neighbors, where k is the number of registers available to the allocator. If n has fewer than k neighbors, any assignment of colors to n’s neighbors leaves at least one color for n. Thus, n must receive a color. If no such node remains in the graph, the algorithm picks a node, using some heuristic, and spills the corresponding value.
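The ordering step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from Chaitin’s allocator or from our paper: the function names, the dictionary-of-sets graph representation, the lowest-cost spill heuristic, and the tie-breaking by node name are all hypothetical choices made for the example. The graph used is the four-node diamond with k = 2.

```python
def chaitin_simplify(graph, k, spill_cost):
    """Return (order, spilled) for a Chaitin-style ordering pass.

    graph maps each node to the set of its neighbors. The spill
    heuristic (lowest cost, ties broken by name) is an assumption.
    """
    live = {n: set(adj) for n, adj in graph.items()}  # working copy
    order, spilled = [], []
    while live:
        # A node with fewer than k neighbors is guaranteed a color.
        easy = [n for n in live if len(live[n]) < k]
        if easy:
            n = min(easy)
            order.append(n)
        else:
            # No provably colorable node: spill one immediately.
            n = min(live, key=lambda m: (spill_cost[m], m))
            spilled.append(n)
        for m in live.pop(n):
            live[m].discard(n)
    return order, spilled


def assign_colors(graph, order, k):
    """Color nodes in reverse removal order; spilled nodes get no color."""
    colors = {}
    for n in reversed(order):
        used = {colors[m] for m in graph[n] if m in colors}
        colors[n] = next(c for c in range(k) if c not in used)
    return colors


diamond = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
cost = {n: 1 for n in diamond}
order, spilled = chaitin_simplify(diamond, 2, cost)
colors = assign_colors(diamond, order, 2)
print(spilled, colors)
```

With two registers, every node of the diamond starts with two neighbors, so the sketch spills one node before it can order the other three.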

With k = 2, this approach fails to find a two-coloring for the diamond graph because every node has two neighbors. The first time it tries to pick a node, it must spill one, say a. This lowers the degree of its two neighbors, b and d, to one. It can now construct a coloring order for b, c, and d. (Any order beginning with b or d works.) Clearly, this is pessimistic because a can always use the same color as c. Smallest-last coloring constructs an ordering; it picks any node first since they all have the same degree. Since any order will produce a two-coloring, it succeeds where Chaitin’s algorithm did not.
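For comparison, here is a minimal sketch of a smallest-last ordering followed by greedy color assignment, applied to the same diamond graph. It follows the general idea of Matula’s method as summarized here, not the text of the technical report [18]; the function names and the tie-breaking by node name are assumptions made for the example.

```python
def smallest_last_order(graph):
    """Repeatedly remove a minimum-degree node; color in reverse of removal."""
    live = {n: set(adj) for n, adj in graph.items()}  # working copy
    removed = []
    while live:
        n = min(live, key=lambda m: (len(live[m]), m))
        removed.append(n)
        for m in live.pop(n):
            live[m].discard(n)
    return removed[::-1]


def greedy_color(graph, order):
    """Give each node the lowest color not used by a colored neighbor."""
    colors = {}
    for n in order:
        used = {colors[m] for m in graph[n] if m in colors}
        c = 0
        while c in used:
            c += 1
        colors[n] = c
    return colors


diamond = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
order = smallest_last_order(diamond)
colors = greedy_color(diamond, order)
print(order, colors)
```

In this run, a and c end up sharing one color and b and d the other, so two colors suffice and nothing is spilled.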

Smallest-last coloring is not, in itself, the answer. For example, it provides no help in spilling when the graph cannot, in fact, be k-colored. The paper shows the insights and algorithms that we eventually derived. The resulting algorithm fits a stronger coloring heuristic into the basic structure of the original algorithm. We refer to this heuristic as optimistic coloring.
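The deferred spill decision at the core of optimistic coloring can be sketched as follows. This is a simplified illustration, not the algorithm as published: the function name, the lowest-cost choice of spill candidate, and the tie-breaking by node name are assumptions, and the sketch omits inserting spill code and rerunning the allocator after a spill.

```python
def optimistic_color(graph, k, spill_cost):
    """Return (colors, spilled) using deferred spill decisions.

    graph maps each node to the set of its neighbors. The candidate
    heuristic (lowest cost, ties broken by name) is an assumption.
    """
    live = {n: set(adj) for n, adj in graph.items()}  # working copy
    stack = []
    while live:
        easy = [n for n in live if len(live[n]) < k]
        if easy:
            n = min(easy)
        else:
            # Choose a spill candidate, but push it instead of spilling.
            n = min(live, key=lambda m: (spill_cost[m], m))
        stack.append(n)
        for m in live.pop(n):
            live[m].discard(n)

    colors, spilled = {}, []
    while stack:
        n = stack.pop()
        used = {colors[m] for m in graph[n] if m in colors}
        free = [c for c in range(k) if c not in used]
        if free:
            colors[n] = free[0]
        else:
            spilled.append(n)  # spill only when no color is left
    return colors, spilled


diamond = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
triangle = {"x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"}}
diamond_result = optimistic_color(diamond, 2, {n: 1 for n in diamond})
triangle_result = optimistic_color(triangle, 2, {n: 1 for n in triangle})
print(diamond_result, triangle_result)
```

On the diamond, the candidate a still finds a free color because its neighbors b and d share one. On the triangle, which cannot be two-colored, the candidate is spilled only after color assignment fails.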

3. INFLUENCE

This paper appeared at PLDI ’89, in a session with two other papers on improvements to graph-coloring allocators [3, 15]. The papers were notable because they all improved on Chaitin’s algorithm and each showed apples-to-apples comparisons; that is, the same allocator running in the same compiler with a single algorithmic change. (Most prior work on register allocation came in the form of experience papers, rather than experimental comparisons.) That session marked a resurgence of research interest in Chaitin-style register allocators.

Nickerson noted that optimistic coloring improved the behavior of the allocator for values that required multiple registers [19]. Koblenz and Callahan built a hierarchical register allocator that relied, at its heart, on optimistic coloring [7]. Norris and Pollock took another approach to hierarchical allocation with an allocator that operated on program-dependence graphs [20]. We introduced a number of improvements, including rematerialization, conservative coalescing, and biased coloring [5]. Appel and George invented iterated coalescing [14]. Park and Moon struck a middle ground between Chaitin’s “aggressive” coalescing and the less aggressive conservative and iterated schemes [21]. Bergner et al. invented interference-region spilling [2].

Of equal importance, a second thread in the literature has dealt with issues that arise in the implementation of graph-coloring register allocators. Gupta et al. showed that the compiler can use clique separators to reduce the memory requirements for allocation [15]. Choi et al. introduced sparse evaluation graphs, reducing the time required to compute LIVE information [10]. Briggs et al. explained how to encode multiple-register requirements into the interference graph [4] and how to implement some of the data structures [6]. Cooper et al. described fast techniques for constructing the interference graph [12]. These papers have made it easier to implement a fast and effective allocator.

The algorithm, along with various improvements, has been implemented in many compilers, both research systems and commercial products. Indeed, Hopkins reported (in conversation) that IBM’s own implementation of our algorithm in the Tobey compiler yielded almost exactly the improvements reported in our paper.

4. ACKNOWLEDGMENTS

IBM Corporation, through Dr. Horace Flatt of the Palo Alto Scientific Center, provided the support that let us explore these ideas.

REFERENCES

[1] J. W. Backus, R. J. Beeber, S. Best, R. Goldberg, L. M. Haibt, H. L. Herrick, R. A. Nelson, D. Sayre, P. B. Sheridan, H. Stern, I. Ziller, R. A. Hughes, and R. Nutt. The FORTRAN automatic coding system. In Proceedings of the Western Joint Computer Conference, pages 188–198. Institute of Radio Engineers, NY, NY, USA, Feb. 1957.

[2] P. Bergner, P. Dahl, D. Engebretsen, and M. O’Keefe. Spill code minimization via interference region spilling. SIGPLAN Notices, 32(6):287–295, June 1997. Proceedings of the ACM SIGPLAN ’97 Conference on Programming Language Design and Implementation.

[3] D. Bernstein, D. Q. Goldin, M. C. Golumbic, H. Krawczyk, Y. Mansour, I. Nahshon, and R. Y. Pinter. Spill code minimization techniques for optimizing compilers. SIGPLAN Notices, 24(7):258–263, July 1989. Proceedings of the ACM SIGPLAN ’89 Conference on Programming Language Design and Implementation.

[4] P. Briggs, K. D. Cooper, and L. Torczon. Coloring register pairs. ACM Letters on Programming Languages and Systems, 1(1):3–13, Mar. 1992.

[5] P. Briggs, K. D. Cooper, and L. Torczon. Rematerialization. SIGPLAN Notices, 27(7):311–321, July 1992. Proceedings of the ACM SIGPLAN ’92 Conference on Programming Language Design and Implementation.

[6] P. Briggs and L. Torczon. An efficient representation for sparse sets. ACM Letters on Programming Languages and Systems, 2(1–4):45–58, March–December 1993.

[7] D. Callahan and B. Koblenz. Register allocation via hierarchical graph coloring. SIGPLAN Notices, 26(6):192–203, June 1991. Proceedings of the ACM SIGPLAN ’91 Conference on Programming Language Design and Implementation.

[8] G. J. Chaitin. Register allocation and spilling via graph coloring. SIGPLAN Notices, 17(6):98–105, June 1982. Proceedings of the ACM SIGPLAN ’82 Symposium on Compiler Construction.

[9] G. J. Chaitin, M. A. Auslander, A. K. Chandra, J. Cocke, M. E. Hopkins, and P. W. Markstein. Register allocation via graph coloring. Computer Languages, 6(1):47–57, Jan. 1981.

[10] J.-D. Choi, R. Cytron, and J. Ferrante. Automatic construction of sparse data flow evaluation graphs. In Conference Record of the Eighteenth Annual ACM Symposium on Principles of Programming Languages, pages 55–66, Orlando, FL, USA, Jan. 1991.

[11] F. C. Chow and J. L. Hennessy. Register allocation by priority-based coloring. SIGPLAN Notices, 19(6):222–232, June 1984. Proceedings of the ACM SIGPLAN ’84 Symposium on Compiler Construction.

[12] K. D. Cooper, T. J. Harvey, and L. Torczon. How to build an interference graph. Software: Practice and Experience, 28(4):425–444, Apr. 1998.

[13] K. D. Cooper and L. T. Simpson. Live range splitting in a graph coloring register allocator. In Proceedings of the Seventh International Compiler Construction Conference, CC ’98, Lecture Notes in Computer Science 1383, pages 174–187, 1998.

[14] L. George and A. W. Appel. Iterated register coalescing. ACM Transactions on Programming Languages and Systems, 18(3):300–324, May 1996.

[15] R. Gupta, M. L. Soffa, and T. Steele. Register allocation via clique separators. SIGPLAN Notices, 24(7):264–274, July 1989. Proceedings of the ACM SIGPLAN ’89 Conference on Programming Language Design and Implementation.

[16] J. R. Larus and P. N. Hilfinger. Register allocation in the SPUR Lisp compiler. SIGPLAN Notices, 21(7):255–263, July 1986. Proceedings of the ACM SIGPLAN ’86 Symposium on Compiler Construction.

[17] S. S. Lavrov. Store economy in closed operator schemes. Journal of Computational Mathematics and Mathematical Physics, 1(4):687–701, 1961. English translation in U.S.S.R. Computational Mathematics and Mathematical Physics 3:810–828, 1962.

[18] D. Matula and L. Beck. Smallest-last ordering and clustering and graph coloring algorithms. Technical Report CSE-8104, Department of Computer Science and Engineering, Southern Methodist University, July 1981.

[19] B. R. Nickerson. Graph coloring register allocation for processors with multi-register operands. SIGPLAN Notices, 25(6):40–52, June 1990. Proceedings of the ACM SIGPLAN ’90 Conference on Programming Language Design and Implementation.

[20] C. Norris and L. L. Pollock. Register allocation over the program dependence graph. SIGPLAN Notices, 29(6):266–277, June 1994. Proceedings of the ACM SIGPLAN ’94 Conference on Programming Language Design and Implementation.

[21] J. Park and S.-M. Moon. Optimistic register coalescing. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT), pages 196–204. IEEE, 1998.

