Simplistic Code Generation - cs.tau.ac.il


Page 1: Simplistic Code Generation - cs.tau.ac.il

Simplistic Code Generation
Mooly Sagiv

Steven Muchnick: Advanced Compiler Design and Implementation
https://www.cis.upenn.edu/~stevez/ CS341
Aho, Sethi, Ullman, Compiler Design
https://en.wikipedia.org/wiki/Sethi%E2%80%93Ullman_algorithm

Page 2: Simplistic Code Generation - cs.tau.ac.il

Outline

• Recap activation frames

• X86 principles

• Direct AST → X86

• The labeling algorithm for register allocation

Page 3: Simplistic Code Generation - cs.tau.ac.il

Local/Temporary Variable Storage

• Need space to store
  • Global variables
  • Values passed as arguments to procedures
  • Local variables (either defined in the source program or introduced by the compiler)

• Processors provide two options
  • Registers: fast, small size (64 bits), very limited number
  • Memory: slow, very large amount of space (2 GB)
  • caching important

• In practice on X86
  • Registers are limited (and have restrictions)
  • Divide memory into regions including the stack and the heap

Page 4: Simplistic Code Generation - cs.tau.ac.il

The C memory model

• The code & data (or "text") segment
  • contains compiled code, constant strings, etc.

• The Heap
  • Stores dynamically allocated objects
  • Allocated via "malloc"
  • Deallocated via "free" or garbage collection
  • C runtime system

• The Stack
  • Stores local variables
  • Stores the return address of a function
  • Compiler generated code to create/delete new frames

[Figure: memory layout showing the Code, Heap, and Stack regions, with an arrow labeled "Larger address"]

Page 5: Simplistic Code Generation - cs.tau.ac.il

Questions

• Why store local variables in stack frames?

• Can we store stack frames in the heap (e.g., via malloc/new)?

• What cannot be stored in a stack frame?

• Why do we use two machine registers to implement stack frames?

• What security risks do stack frames raise?

Page 6: Simplistic Code Generation - cs.tau.ac.il

Compiling factorial

int factorial(int num) {
    if (num == 1) return 1;
    else return num * factorial(num - 1);
}

factorial(int):
    push rbp
    mov rbp, rsp
    sub rsp, 16
    mov DWORD PTR [rbp-4], edi
    cmp DWORD PTR [rbp-4], 1
    jne .L2
    mov eax, 1
    jmp .L3
.L2:
    mov eax, DWORD PTR [rbp-4]
    sub eax, 1
    mov edi, eax
    call factorial(int)
    imul eax, DWORD PTR [rbp-4]
.L3:
    leave
    ret

Page 7: Simplistic Code Generation - cs.tau.ac.il

Can we store activation frames in the heap?

Page 8: Simplistic Code Generation - cs.tau.ac.il

Limitations of Stack Frames

• A local variable of P cannot be stored in the activation record of P if its duration exceeds the duration of P

• Example 1: Static variables in C (own variables in Algol)
    void p(int x) {
        static int y = 6;
        y += x;
    }

• Example 2: Features of the C language
    int *f() {
        int x;
        return &x;
    }

• Example 3: Dynamic allocation
    int *f() { return (int *) malloc(sizeof(int)); }

Page 9: Simplistic Code Generation - cs.tau.ac.il

Compiling factorial without rbp

int factorial(int num) {
    if (num == 1) return 1;
    else return num * factorial(num - 1);
}

factorial(int):
    sub rsp, 24
    mov DWORD PTR [rsp+4], edi
    cmp DWORD PTR [rsp+4], 1
    jne .L2
    mov eax, 1
    jmp .L3
.L2:
    mov eax, DWORD PTR [rsp+4]
    sub eax, 1
    mov edi, eax
    call factorial(int)
    imul eax, DWORD PTR [rsp+4]
.L3:
    add rsp, 24
    ret

Page 10: Simplistic Code Generation - cs.tau.ac.il

Dynamic Frame Size

// crt_malloca_simple.c
#include <stdio.h>
#include <malloc.h>

void Fn() {
    char * buf = (char *)_malloca( 100 );
    // do something with buf
    _freea( buf );   // memory obtained from _malloca must be released with _freea
}

int main() {
    Fn();
}

Page 11: Simplistic Code Generation - cs.tau.ac.il

What are the security risks of frames?

Page 12: Simplistic Code Generation - cs.tau.ac.il

int foo() {
    int a, b;
    int *p = &a;
    scanf("%d", &b);
    *(p+b) = 5;
}

.LC0:
    .string "%d"
foo:
    push rbp
    mov rbp, rsp
    sub rsp, 16
    lea rax, [rbp-12]
    mov QWORD PTR [rbp-8], rax
    lea rax, [rbp-16]
    mov rsi, rax
    mov edi, OFFSET FLAT:.LC0
    mov eax, 0
    call __isoc99_scanf
    mov eax, DWORD PTR [rbp-16]
    cdqe
    lea rdx, [0+rax*4]
    mov rax, QWORD PTR [rbp-8]
    add rax, rdx
    mov DWORD PTR [rax], 5
    nop
    leave
    ret

Page 13: Simplistic Code Generation - cs.tau.ac.il

Buffer Overflow Exploits

void foo (char *x) {

char buf[2];

strcpy(buf, x);

}

int main (int argc, char *argv[]) {

foo(argv[1]);

}

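./a.out abracadabra
Segmentation fault

[Figure: stack cells in the slide's order: Previous frame, Return address, Saved FP, char* x, buf[2], with arrows labeled "Stack grows this way" and "Memory addresses". The input is shown overwriting the cells two characters at a time: ab, ra, ca, da, br.]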

Page 14: Simplistic Code Generation - cs.tau.ac.il

Buffer Overflow Exploits

int check_authentication(char *password) {
    int auth_flag = 0;
    char password_buffer[16];

    strcpy(password_buffer, password);
    if (strcmp(password_buffer, "brillig") == 0) auth_flag = 1;
    if (strcmp(password_buffer, "outgrabe") == 0) auth_flag = 1;
    return auth_flag;
}

int main(int argc, char *argv[]) {
    if (check_authentication(argv[1])) {
        printf("\n-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
        printf(" Access Granted.\n");
        printf("-=-=-=-=-=-=-=-=-=-=-=-=-=-\n");
    }
    else printf("\nAccess Denied.\n");
}

(source: “hacking – the art of exploitation, 2nd Ed”)

Page 15: Simplistic Code Generation - cs.tau.ac.il

Input Validation

[Figure: an "evil input" arrow pointing into the Application; the value 65 is shown next to the output]

Input: AAAAAAAAAAAA

Output:
-=-=-=-=-=-=-=-=-=-=-=-=-=-
Access Granted.
-=-=-=-=-=-=-=-=-=-=-=-=-=-

Page 16: Simplistic Code Generation - cs.tau.ac.il

Preventing buffer overflow exploits?
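One common answer is to bound every copy by the size of the destination. The sketch below assumes a hypothetical helper, copy_bounded (not from the lecture), built on the standard snprintf; it is one possible mitigation, not the lecture's prescribed one.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: copy src into dst without ever writing past
 * dst_size bytes. Returns 1 if the whole string fit, 0 if it was
 * truncated (or if dst_size is 0 and nothing could be written). */
int copy_bounded(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0) return 0;
    /* snprintf writes at most dst_size bytes including the terminating
     * NUL and returns the length the full string would have needed. */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

Replacing the strcpy in check_authentication with a call such as copy_bounded(password_buffer, sizeof password_buffer, password), and rejecting the input when it returns 0, would keep an over-long password from reaching auth_flag. Stack canaries, non-executable stacks, and address space layout randomization attack the same problem from the compiler and OS side.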

Page 17: Simplistic Code Generation - cs.tau.ac.il

The rest of this lecture

• X86 Principles

• AST → X86

• The labeling algorithm for register allocation

• Intermediate Representations

Page 18: Simplistic Code Generation - cs.tau.ac.il
Page 19: Simplistic Code Generation - cs.tau.ac.il

X86 Assembly

• CISC

• 2-address instructions: [op arg1, arg2] means arg1 ← op(arg1, arg2)

• Diverse data types 8-, 16-, 32-, 64-bit values + floating points, …

• Intel 64 and IA 32 architectures have a huge number of functions

• instructions range in size from 1 byte to 15 bytes

• Lots of hold-over design decisions for backwards compatibility

• Hard to understand

• The main ideas can be explained using a simple subset, X86lite:
  • Only 64-bit signed integers (no floating point, no 16-bit, no …)
  • 20 instructions

Page 20: Simplistic Code Generation - cs.tau.ac.il

X86lite Registers: 16 64-bit registers

register        usage
rax             general purpose accumulator
rbx             base register, pointer to data
rcx             counter register for strings & loops
rdx             data register for I/O
rsi             pointer register, string source register
rdi             pointer register, string destination register
rbp             base pointer, points to the stack frame
rsp             stack pointer, points to the top of the stack
r8-r15          general purpose registers
rip (virtual)   current machine instruction

Page 21: Simplistic Code Generation - cs.tau.ac.il

Jumps, Call and Return

Instruction   Informal                                            Formal
jmp dst       Control goes to dst                                 rip ← dst
call dst      Control goes to dst and returns to the following    push rip; rip ← dst
              instruction upon termination of dst
ret           Control returns to the caller                       pop rip

Page 22: Simplistic Code Generation - cs.tau.ac.il

Enter and Leave

Instruction    Informal                              Formal
enter #bytes   Open a stack frame of size #bytes     push rbp; mov rbp, rsp; sub rsp, #bytes
leave          Restore caller's stack frame          mov rsp, rbp; pop rbp
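The effect of enter and leave on rsp and rbp can be traced on a toy machine. This is a sketch under stated assumptions: the Machine type and the function names are hypothetical, the stack is counted in words rather than bytes, and there is no bounds checking.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Toy machine state: a word-addressed stack that grows toward lower indices. */
typedef struct {
    int64_t mem[STACK_WORDS];
    int rsp;   /* index of the current top of the stack */
    int rbp;   /* index of the current frame base */
} Machine;

void machine_init(Machine *m) { m->rsp = STACK_WORDS; m->rbp = STACK_WORDS; }

void push(Machine *m, int64_t v) { m->mem[--m->rsp] = v; }
int64_t pop(Machine *m) { return m->mem[m->rsp++]; }

/* enter: push rbp; mov rbp, rsp; sub rsp, words */
void enter(Machine *m, int words) {
    push(m, m->rbp);
    m->rbp = m->rsp;
    m->rsp -= words;
}

/* leave: mov rsp, rbp; pop rbp */
void leave(Machine *m) {
    m->rsp = m->rbp;
    m->rbp = (int)pop(m);
}
```

After enter, locals live at negative offsets from rbp and the saved caller rbp sits exactly at rbp; leave undoes both steps regardless of how far rsp moved in between, which is why a frame pointer makes dynamically sized frames easy to tear down.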

Page 23: Simplistic Code Generation - cs.tau.ac.il

Directly Translating AST to Assembly

• For simple languages, no need for intermediate representation

• Main Idea: Maintain invariants
  • Code emitted for a given expression computes the answer into rax

• Key Challenges:
  • storing intermediate values needed to compute complex expressions
  • some instructions use specific registers (e.g. shift)

Page 24: Simplistic Code Generation - cs.tau.ac.il

Calling Conventions

• Specify the locations (e.g. register or stack) of arguments passed to a function and returned by the function

• Designate registers either
  • Caller Save – e.g. freely usable by the called code
  • Callee Save – e.g. must be restored by the called code

• Define the protocol for deallocating stack-allocated arguments
  • Caller cleans up
  • Callee cleans up (makes variable arguments harder)

int64_t g(int64_t a, int64_t b) {    /* callee */
    return a + b;
}

int64_t f(int64_t x) {               /* caller */
    int64_t ans = g(3,4) + x;
    return ans;
}

Page 25: Simplistic Code Generation - cs.tau.ac.il

x64 Calling Conventions: Caller Protocol

Page 26: Simplistic Code Generation - cs.tau.ac.il
Page 27: Simplistic Code Generation - cs.tau.ac.il
Page 28: Simplistic Code Generation - cs.tau.ac.il
Page 29: Simplistic Code Generation - cs.tau.ac.il

Callee Prolog

Page 30: Simplistic Code Generation - cs.tau.ac.il

Callee Prolog

Page 31: Simplistic Code Generation - cs.tau.ac.il

Callee Invariant: function argument

Page 32: Simplistic Code Generation - cs.tau.ac.il

Callee Invariant: callee saved registers

Page 33: Simplistic Code Generation - cs.tau.ac.il

Callee epilogue

Page 34: Simplistic Code Generation - cs.tau.ac.il

Callee epilogue

Page 35: Simplistic Code Generation - cs.tau.ac.il

Callee epilogue

Page 36: Simplistic Code Generation - cs.tau.ac.il

Callee epilogue

Page 37: Simplistic Code Generation - cs.tau.ac.il

Caller-Save and Callee-Save Registers

• callee-save-registers (MIPS 16-23, X86 rbx, r12-r15, rbp, rsp)
  • Saved by the callee when modified
  • Values are automatically preserved across calls

• caller-save-registers
  • Saved by the caller when needed
  • Values are not automatically preserved

• Usually the architecture defines caller-save and callee-save registers
  • Separate compilation
  • Interoperability between code produced by different compilers/languages

• But compilers can decide when to use caller/callee registers

Page 38: Simplistic Code Generation - cs.tau.ac.il

Caller-Save vs. Callee-Save Registers

int foo(int a) {

int b=a+1;

f1();

g1(b);

return(b+2);

}

void bar (int y) {

int x=y+1;

f2(y);

g2(2);

}


Page 39: Simplistic Code Generation - cs.tau.ac.il

Syntax Directed Code Generation (Expressions)

• Generate code for arguments in a designated register and store in stack

• Generate code for expressions using stack operations

Page 40: Simplistic Code Generation - cs.tau.ac.il

Naïve Code Generation: Expression

generateCode(node: expression) {
    switch node: {
        case number(n: integer) {
            emit(mov eax, $n)
        }
        case localVariable(v: symbol) {
            let o: integer = offsetFrame(v)
            emit(mov eax, DWORD PTR [rbp-$o])
        }
        case e1: Node + e2: Node {
            generateCode(e1)      // Generate code for lhs into eax
            emit(push eax)        // Store lhs into the stack
            generateCode(e2)      // Generate code for rhs into eax
            emit(mov edx, eax)    // rhs into edx
            emit(pop eax)         // lhs into eax
            emit(add eax, edx)
        }
    }
}
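The same scheme can be written as compilable C. This is a minimal sketch: the Node type, the emit helper, and the choice to collect output in a caller-supplied buffer are assumptions made for illustration, and the mnemonics follow the slide's simplified notation.

```c
#include <stdio.h>
#include <string.h>

typedef enum { NUM, VAR, ADD } Kind;

typedef struct Node {
    Kind kind;
    int value;                        /* NUM: the constant; VAR: offset below rbp */
    const struct Node *left, *right;  /* ADD only */
} Node;

/* Append one line of assembly to the NUL-terminated buffer out (capacity cap). */
static void emit(char *out, size_t cap, const char *line) {
    size_t n = strlen(out);
    snprintf(out + n, cap - n, "%s\n", line);
}

/* Invariant: the code appended for e leaves the value of e in eax. */
void gen(const Node *e, char *out, size_t cap) {
    char line[64];
    switch (e->kind) {
    case NUM:
        snprintf(line, sizeof line, "mov eax, %d", e->value);
        emit(out, cap, line);
        break;
    case VAR:
        snprintf(line, sizeof line, "mov eax, DWORD PTR [rbp-%d]", e->value);
        emit(out, cap, line);
        break;
    case ADD:
        gen(e->left, out, cap);          /* lhs into eax */
        emit(out, cap, "push eax");      /* save lhs on the stack */
        gen(e->right, out, cap);         /* rhs into eax */
        emit(out, cap, "mov edx, eax");  /* rhs into edx */
        emit(out, cap, "pop eax");       /* lhs back into eax */
        emit(out, cap, "add eax, edx");
        break;
    }
}
```

Calling gen on the tree for 5 + (x + 7), with x at offset 16, appends the eleven-instruction sequence used in the Example Compilation slide. Every binary node costs a push and a pop even when a free register is available, which is the inefficiency the labeling algorithm addresses.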

Page 41: Simplistic Code Generation - cs.tau.ac.il

Abstract Syntax for Arithmetic Expressions

Exp → id               (IdExp)
Exp → num              (NumExp)
Exp → Exp Binop Exp    (BinExp)
Exp → Unop Exp         (UnExp)
Binop → +              (Plus)
Binop → -              (Minus)
Binop → *              (Times)
Binop → /              (Div)
Unop → -               (UnMin)

Page 42: Simplistic Code Generation - cs.tau.ac.il

package Absyn;

abstract public class Absyn { public int pos; }

class Exp extends Absyn {}

class IdExp extends Exp {
    String rep;
    IdExp(String r) { rep = r; }
}

class NumExp extends Exp {
    int number;
    NumExp(int n) { number = n; }
}

class OpExp {
    public final static int PLUS = 1;  public final static int Minus = 2;
    public final static int Times = 3; public final static int Div = 4;
}

class BinExp extends Exp {
    Exp left, right;
    int op;   // one of OpExp.PLUS, OpExp.Minus, OpExp.Times, OpExp.Div
    BinExp(Exp l, int o, Exp r) {
        left = l; op = o; right = r;
    }
}

Page 43: Simplistic Code Generation - cs.tau.ac.il

Java Code For Expressions

static void codeGen(Exp e) {
    if (e instanceof IdExp) {
        System.out.printf("mov eax, DWORD PTR [rbp-%d]\n", offset(((IdExp) e).rep));
    }
    else if (e instanceof NumExp) {
        System.out.printf("mov eax, %d\n", ((NumExp) e).number);
    }
    else if (e instanceof BinExp) {
        BinExp eb = (BinExp) e;
        codeGen(eb.left);                      // Generate code computing lhs into eax
        System.out.printf("push eax\n");       // Push eax onto the stack
        codeGen(eb.right);                     // Generate code computing rhs into eax
        System.out.printf("mov edx, eax\n");   // rhs into edx
        System.out.printf("pop eax\n");        // lhs into eax
        switch (eb.op) {
            case OpExp.PLUS: System.out.printf("add eax, edx\n"); break;
            …
        }
    }
    …

Page 44: Simplistic Code Generation - cs.tau.ac.il

Example Compilation

[Figure: expression tree for 5 + (x + 7), with the code emitted at each node]

mov eax, 5
push eax
mov eax, DWORD PTR [rbp-16]
push eax
mov eax, 7
mov edx, eax
pop eax
add eax, edx
mov edx, eax
pop eax
add eax, edx

Page 45: Simplistic Code Generation - cs.tau.ac.il

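Executing the generated code

mov eax, 5
push eax
mov eax, DWORD PTR [rbp-16]
push eax
mov eax, 7
mov edx, eax
pop eax
add eax, edx
mov edx, eax
pop eax
add eax, edx

[Figure: Code/Data and Stack regions with rbp and rsp marked; the slot at rbp-16 holds 80. Initial state, before any instruction has executed.]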

Page 46: Simplistic Code Generation - cs.tau.ac.il

Executing the generated code

[Figure: same code and memory layout as the previous slide. State after "mov eax, 5": eax = 5.]

Page 47: Simplistic Code Generation - cs.tau.ac.il

Executing the generated code

[Figure: same code and memory layout as the previous slide. State after the first "push eax": eax = 5; stack: 5.]

Page 48: Simplistic Code Generation - cs.tau.ac.il

Executing the generated code

[Figure: same code and memory layout as the previous slide. State after "mov eax, DWORD PTR [rbp-16]": eax = 80; stack: 5.]

Page 49: Simplistic Code Generation - cs.tau.ac.il

Executing the generated code

[Figure: same code and memory layout as the previous slide. State after the second "push eax": eax = 80; stack: 5, 80.]

Page 50: Simplistic Code Generation - cs.tau.ac.il

Executing the generated code

[Figure: same code and memory layout as the previous slide. State after "mov eax, 7": eax = 7; stack: 5, 80.]

Page 51: Simplistic Code Generation - cs.tau.ac.il

Executing the generated code

[Figure: same code and memory layout as the previous slide. State after the first "mov edx, eax": eax = 7; edx = 7; stack: 5, 80.]

Page 52: Simplistic Code Generation - cs.tau.ac.il

Executing the generated code

[Figure: same code and memory layout as the previous slide. State after the first "pop eax": eax = 80; edx = 7; stack cells shown: 5, 80.]

Page 53: Simplistic Code Generation - cs.tau.ac.il

Executing the generated code

[Figure: same code and memory layout as the previous slide. State after the first "add eax, edx": eax = 87; edx = 7; stack cells shown: 5, 80.]

Page 54: Simplistic Code Generation - cs.tau.ac.il

Executing the generated code

[Figure: same code and memory layout as the previous slide. State after the second "mov edx, eax": eax = 87; edx = 87; stack cells shown: 5, 80.]

Page 55: Simplistic Code Generation - cs.tau.ac.il

Executing the generated code

[Figure: same code and memory layout as the previous slide. State after the second "pop eax": eax = 5; edx = 87; stack cells shown: 5, 80.]

Page 56: Simplistic Code Generation - cs.tau.ac.il

Executing the generated code

[Figure: same code and memory layout as the previous slide. State after the final "add eax, edx": eax = 92; edx = 87; stack cells shown: 5, 80.]
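The trace on the preceding slides can be replayed with a tiny interpreter. This is a sketch under stated assumptions: the Op encoding, the names run and example_program, and the 16-entry stack are all hypothetical, and only the six instruction forms used in the example are modeled.

```c
#include <stdint.h>

typedef enum {
    MOV_EAX_IMM,   /* mov eax, imm */
    MOV_EAX_X,     /* mov eax, DWORD PTR [rbp-16] */
    PUSH_EAX, MOV_EDX_EAX, POP_EAX, ADD_EAX_EDX
} Op;

typedef struct { Op op; int32_t imm; } Instr;

/* The eleven instructions generated for 5 + (x + 7). */
static const Instr example_program[] = {
    {MOV_EAX_IMM, 5}, {PUSH_EAX, 0}, {MOV_EAX_X, 0}, {PUSH_EAX, 0},
    {MOV_EAX_IMM, 7}, {MOV_EDX_EAX, 0}, {POP_EAX, 0}, {ADD_EAX_EDX, 0},
    {MOV_EDX_EAX, 0}, {POP_EAX, 0}, {ADD_EAX_EDX, 0},
};

/* Execute n instructions with the variable at [rbp-16] equal to x and
 * return the final value of eax. Pushes beyond 16 entries and pops from
 * an empty stack are ignored in this toy model. */
int32_t run(const Instr *code, int n, int32_t x) {
    int32_t eax = 0, edx = 0, stack[16];
    int sp = 0;
    for (int i = 0; i < n; i++) {
        switch (code[i].op) {
        case MOV_EAX_IMM: eax = code[i].imm; break;
        case MOV_EAX_X:   eax = x; break;
        case PUSH_EAX:    if (sp < 16) stack[sp++] = eax; break;
        case MOV_EDX_EAX: edx = eax; break;
        case POP_EAX:     if (sp > 0) eax = stack[--sp]; break;
        case ADD_EAX_EDX: eax += edx; break;
        }
    }
    return eax;
}
```

With x = 80, run(example_program, 11, 80) ends with eax = 92, matching the last slide of the trace.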

Page 57: Simplistic Code Generation - cs.tau.ac.il

Can we generate more efficient code?

• How to better utilize machine registers

• Expression order does not matter
  • What is the result of "x = 1 ; ++x + (x + 1)"?

• The code of the right subtree can appear before the code of the left subtree

• Often leads to faster code with fewer registers and fewer load/store operations

• Dynamic programming can be used to compute an "optimal" solution

Page 58: Simplistic Code Generation - cs.tau.ac.il

Two Phase Solution: Dynamic Programming (Sethi & Ullman)

• Bottom-up (labeling)
  • Compute for every subtree
    • The minimal number of registers needed
    • Weight

• Top-Down
  • Generate the code using labeling by preferring "heavier" subtrees (larger labeling)

• Can integrate spilling

Page 59: Simplistic Code Generation - cs.tau.ac.il

The Labeling Principle

[Figure: a + node with one subtree needing m registers and the other needing n registers. If m > n, the whole tree needs m registers.]

Page 60: Simplistic Code Generation - cs.tau.ac.il

The Labeling Principle

[Figure: a + node with one subtree needing m registers and the other needing n registers. If m < n, the whole tree needs n registers.]

Page 61: Simplistic Code Generation - cs.tau.ac.il

The Labeling Principle

[Figure: a + node with one subtree needing m registers and the other needing n registers. If m = n, the whole tree needs m+1 registers.]
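The three cases on these slides combine into one recurrence. Writing m and n for the labels of the two subtrees, and taking every leaf to need one register as in the labeling algorithm:

```latex
\[
\mathrm{label}(\mathit{leaf}) = 1,
\qquad
\mathrm{label}(e_1 \;\mathit{op}\; e_2) =
\begin{cases}
  \max(m, n) & \text{if } m \neq n,\\
  m + 1      & \text{if } m = n,
\end{cases}
\quad\text{where } m = \mathrm{label}(e_1),\; n = \mathrm{label}(e_2).
\]
```

The intuition: when the subtrees differ, evaluating the heavier one first leaves its result in one register, and the lighter one then fits in the remaining registers. When they are equal, the first result occupies a register while the second subtree still needs all m.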

Page 62: Simplistic Code Generation - cs.tau.ac.il

The Labeling Algorithm

weight(node: expression): integer {
    switch node: {
        case number(n: integer): return 1;
        case localVariable(v: symbol): return 1;
        case e1: Node + e2: Node {
            let lw: integer = weight(e1);
            let rw: integer = weight(e2);
            if (lw < rw) return rw;
            else if (lw > rw) return lw;
            else return lw + 1;
        }
        …
    }
}
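A direct C rendering of the labeling pass might look as follows. It is a sketch: the Expr type is a hypothetical minimal AST in which op == 0 marks a leaf (number or variable) and any other value marks a binary operator.

```c
typedef struct Expr {
    char op;                          /* 0 for a leaf, else '+', '-', '*', '/' */
    const struct Expr *left, *right;  /* used only when op != 0 */
} Expr;

/* Number of registers needed to evaluate e without spilling. */
int weight(const Expr *e) {
    if (e->op == 0) return 1;               /* a leaf needs one register */
    int lw = weight(e->left);
    int rw = weight(e->right);
    if (lw != rw) return lw > rw ? lw : rw; /* the heavier side dominates */
    return lw + 1;                          /* a tie costs one extra register */
}
```

For the tree b*b - 4*(a*c), both products of two leaves get weight 2, 4*(a*c) also gets 2 (1 against 2), and the root gets 3 (2 against 2).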

Page 63: Simplistic Code Generation - cs.tau.ac.il

Labeling the example (weight)

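[Figure: expression tree for b*b - 4*(a*c), each node labeled with its weight]

-   weight 3
    *   weight 2
        b   weight 1
        b   weight 1
    *   weight 2
        4   weight 1
        *   weight 2
            a   weight 1
            c   weight 1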

Page 64: Simplistic Code Generation - cs.tau.ac.il

Top-Down

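[Figure: the tree for b*b - 4*(a*c), each node annotated with its weight, its target register T, and the code emitted for it]

-   weight 3, T=R1
    *   weight 2, T=R1
        b   weight 1, T=R1
        b   weight 1, T=R2
    *   weight 2, T=R2
        4   weight 1, T=R2
        *   weight 2, T=R3
            a   weight 1, T=R3
            c   weight 1, T=R2

Emitted code, in execution order:

move R1, b
move R2, b
mult R1, R2
move R3, a
move R2, c
mult R3, R2
move R2, 4
mult R2, R3
sub R1, R2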
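The top-down phase can be sketched in C as well. Everything here is an assumption made for illustration: the Expr layout, the emit helper, the convention that gen leaves its result in regs[0], and the destination-first "op Rdst, Rsrc" notation. It also assumes enough registers are supplied (n at least weight(e), at most MAX_REGS), so no spilling is handled.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MAX_REGS 8

typedef struct Expr {
    const char *op;     /* NULL for a leaf, else "add", "sub", "mult", ... */
    const char *name;   /* leaf only: variable name or constant */
    const struct Expr *left, *right;
} Expr;

static int weight(const Expr *e) {
    if (e->op == NULL) return 1;
    int lw = weight(e->left), rw = weight(e->right);
    return lw != rw ? (lw > rw ? lw : rw) : lw + 1;
}

/* Append formatted text to the NUL-terminated buffer out (capacity cap). */
static void emit(char *out, size_t cap, const char *fmt, ...) {
    size_t n = strlen(out);
    va_list ap;
    va_start(ap, fmt);
    vsnprintf(out + n, cap - n, fmt, ap);
    va_end(ap);
}

/* Compute e into register regs[0] using only regs[0..n-1]. */
void gen(const Expr *e, const int *regs, int n, char *out, size_t cap) {
    if (e->op == NULL) {
        emit(out, cap, "move R%d, %s\n", regs[0], e->name);
        return;
    }
    if (weight(e->left) >= weight(e->right)) {
        gen(e->left, regs, n, out, cap);           /* left into regs[0] */
        gen(e->right, regs + 1, n - 1, out, cap);  /* right into regs[1] */
    } else {
        int rest[MAX_REGS];
        /* Heavier right subtree first: target regs[1], all n registers free. */
        rest[0] = regs[1];
        rest[1] = regs[0];
        for (int i = 2; i < n; i++) rest[i] = regs[i];
        gen(e->right, rest, n, out, cap);
        /* Then the left subtree into regs[0], leaving regs[1] untouched. */
        rest[0] = regs[0];
        for (int i = 2; i < n; i++) rest[i - 1] = regs[i];
        gen(e->left, rest, n - 1, out, cap);
    }
    emit(out, cap, "%s R%d, R%d\n", e->op, regs[0], regs[1]);
}
```

Run on b*b - 4*(a*c) with registers R1, R2, R3, this sketch emits nine instructions: it computes b*b into R1, then a*c into R3 before loading 4 into R2, and finishes with "sub R1, R2". Recomputing weight at every node keeps the sketch short; a real implementation would store the labels from the bottom-up pass.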

Page 65: Simplistic Code Generation - cs.tau.ac.il

Generalizations

• More than two arguments for operators
  • Function calls

• Register/memory operations

• Multiple affected registers

• Spilling
  • Need more registers than available

Page 66: Simplistic Code Generation - cs.tau.ac.il

Register Memory Operations

• add R1, X

• mult R1, X

• No need for registers to store right operands

Page 67: Simplistic Code Generation - cs.tau.ac.il

Labeling the example (weight)

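[Figure: expression tree for b*b - 4*(a*c), labeled with weights when the right operand may come from memory]

-   weight 2
    *   weight 1
        b   weight 1
        b   weight 0
    *   weight 2
        4   weight 1
        *   weight 1
            a   weight 1
            c   weight 0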
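The weights in this figure follow from one change to the labeling rule: a leaf that is the right operand needs no register, because the instruction can read it straight from memory. A minimal sketch, assuming a hypothetical Expr type (op == 0 marks a leaf) and the hypothetical names label and weight_rm:

```c
typedef struct Expr {
    char op;                          /* 0 for a leaf, else '+', '-', '*', '/' */
    const struct Expr *left, *right;  /* used only when op != 0 */
} Expr;

/* is_right is nonzero when e is the right operand of its parent. */
static int label(const Expr *e, int is_right) {
    if (e->op == 0) return is_right ? 0 : 1;  /* right leaves stay in memory */
    int lw = label(e->left, 0);
    int rw = label(e->right, 1);
    if (lw != rw) return lw > rw ? lw : rw;
    return lw + 1;
}

/* Registers needed for a whole expression with register/memory operations. */
int weight_rm(const Expr *e) { return label(e, 0); }
```

For b*b - 4*(a*c) this gives 1 for b*b and a*c, 2 for 4*(a*c), and 2 for the root, one register fewer than with register-only operands.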

Page 68: Simplistic Code Generation - cs.tau.ac.il

Top-Down

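[Figure: the tree for b*b - 4*(a*c) with register/memory weights (root 2; b*b 1; 4*(a*c) 2; a*c 1; left leaves 1; right leaves 0), nodes annotated with target registers T=R1 or T=R2 and the code emitted for them]

Emitted code, in execution order (the heavier right subtree is evaluated first):

move R2, 4
move R1, a
mult R1, c
mult R2, R1
move R1, b
mult R1, b
sub R1, R2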

Page 69: Simplistic Code Generation - cs.tau.ac.il

Empirical Results

• Experience shows that for handwritten programs 5 registers suffice (Yuval 1977)

• But program generators may produce arbitrarily complex expressions

Page 70: Simplistic Code Generation - cs.tau.ac.il

Spilling

• Even an optimal register allocator can require more registers than available

• Need to generate code for every correct program

• The compiler can save temporary results
  • Spill registers into temporaries
  • Load when needed

• Many heuristics exist

Page 71: Simplistic Code Generation - cs.tau.ac.il

Simple Spilling Method

• Heavy tree – Needs more registers than available

• A "heavy" tree contains a "heavy" subtree whose dependents are "light"

• Generate code for the light tree

• Spill the content into memory and replace subtree by temporary

• Generate code for the resultant tree

Page 72: Simplistic Code Generation - cs.tau.ac.il

Summary (Register allocation)

• Register allocation of expressions is simple

• Good in practice

• Optimal under certain conditions
  • Uniform instruction cost
  • "Symbolic" trees

• Can handle non-uniform cost
  • Code-Generator Generators exist (BURS)

• Even simpler for 3-address machines

• Simple ways to determine best orders

• But misses opportunities to share registers between different expressions
  • Can employ certain conventions

• Better solutions exist
  • Graph coloring

Page 73: Simplistic Code Generation - cs.tau.ac.il

Why do something else?

• The resulting code quality is poor

• Richer source language features are hard to encode
  • Structured data types, objects, first-class functions, …

• Hard to optimize the resulting assembly code

• The representation is too concrete – e.g. it has committed to using certain registers and the stack
  • Only a fixed number of registers
  • Some instructions have restrictions on where the operands are located

• Control-flow is not structured:
  • Arbitrary jumps from one code block to another
  • Implicit fall-through makes sequences of code non-modular (i.e. you can't rearrange sequences of code easily)

• Retargeting the compiler to a new architecture is hard.

• Target assembly code is hard-wired into the translation

Page 74: Simplistic Code Generation - cs.tau.ac.il

Lecture Summary

• Simple X86 code generation from AST is conceptually easy

• But poor generated code

• No global optimizations

• No modularity
  • Hard to retarget to different machines
  • Hard to reuse for different source languages
  • Hard to maintain