Intermediate Code Generation
Mooly Sagiv
[email protected]
Schreiber 317
03-640-7606
Wed 10:00-12:00
html://www.math.tau.ac.il/~msagiv/courses/wcc02.html
Chapter 7
(Chapter 6 next week)
Basic Compiler Phases
Source program (string)
  -> lexical analysis -> Tokens
  -> syntax analysis -> Abstract syntax tree
  -> semantic analysis
  -> Translate -> Intermediate representation
  -> Instruction selection -> Assembly
  -> Register Allocation -> Final assembly
Why use intermediate languages?
• Simplify the compilation phase
  – ultimately leads to more efficient code
• Portability of the compiler front-end
• Reusability of the compiler back-end
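[Figure: two diagrams. Without an IR, each source language (Java, C, Pascal, C++, ML) is connected directly to each target (Pentium, MIPS, Sparc). With an IR, every source language is connected to the IR, and the IR to every target.]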
IR Design Goals
• Convenient to generate IR from the source
• Convenient to generate machine code from IR
  – Mismatches between source and target
• Clear operational meaning
Textbook Solution
• Simple intermediate instructions
• Tree-like expressions
A Grammar for the Tree IR
T_stm ::= T_stm T_stm (T_SEQ)
T_stm ::= Temp_label (T_LABEL)
T_stm ::= T_exp Temp_labelList (T_JUMP)
T_stm ::= T_relOp T_exp T_exp Temp_label Temp_label (T_CJUMP)
T_stm ::= T_exp T_exp (T_MOVE)
T_stm ::= T_exp (T_EXP)
T_exp ::= T_binOp T_exp T_exp (T_BINOP)
T_exp ::= T_exp (T_MEM)
T_exp ::= Temp_temp (T_TEMP)
T_exp ::= T_stm T_exp (T_ESEQ)
T_exp ::= Temp_label (T_NAME)
T_exp ::= int (T_CONST)
T_exp ::= T_exp T_expList (T_CALL)
/* tree.h */
typedef struct T_stm_ *T_stm;
typedef struct T_exp_ *T_exp;

/* Operators, declared before the constructors that use them */
typedef enum {T_plus, T_minus, T_mul, T_div, T_and, T_or,
              T_lshift, T_rshift, T_arshift, T_xor} T_binOp;
typedef enum {T_eq, T_ne, T_lt, T_gt, T_le, T_ge,
              T_ult, T_ule, T_ugt, T_uge} T_relOp;

struct T_stm_ {
  enum {T_SEQ, T_LABEL, T_JUMP, …, T_EXP} kind;
  union {
    struct {T_stm left, right;} SEQ;
    …
  } u;
};

/* Statement constructors */
T_stm T_Seq(T_stm left, T_stm right);
T_stm T_Label(Temp_label);
T_stm T_Jump(T_exp exp, Temp_labelList labels);
T_stm T_Cjump(T_relOp op, T_exp left, T_exp right,
              Temp_label _true, Temp_label _false);
T_stm T_Move(T_exp, T_exp);
T_stm T_Exp(T_exp);

struct T_exp_ {
  enum {T_BINOP, T_MEM, T_TEMP, …, T_CALL} kind;
  union {
    struct {T_binOp op; T_exp left; T_exp right;} BINOP;
    …
  } u;
};
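The constructors declared in tree.h can be sketched as follows. This is a minimal illustration covering only CONST, TEMP and BINOP; the trimmed struct layout, the int-valued temporaries and the new_exp helper are assumptions made for the example, not the course's actual code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down tree.h: three expression kinds only. */
typedef struct T_exp_ *T_exp;
typedef enum { T_plus, T_minus, T_mul, T_div } T_binOp;

struct T_exp_ {
    enum { T_BINOP, T_TEMP, T_CONST } kind;
    union {
        struct { T_binOp op; T_exp left, right; } BINOP;
        int TEMP;    /* assumption: a temporary is just a number here */
        int CONST;
    } u;
};

/* Allocate one uninitialized node (helper invented for this sketch). */
static T_exp new_exp(void) {
    T_exp e = malloc(sizeof *e);
    assert(e != NULL);
    return e;
}

T_exp T_Const(int value) {
    T_exp e = new_exp();
    e->kind = T_CONST;
    e->u.CONST = value;
    return e;
}

T_exp T_Temp(int temp) {
    T_exp e = new_exp();
    e->kind = T_TEMP;
    e->u.TEMP = temp;
    return e;
}

T_exp T_Binop(T_binOp op, T_exp left, T_exp right) {
    T_exp e = new_exp();
    e->kind = T_BINOP;
    e->u.BINOP.op = op;
    e->u.BINOP.left = left;
    e->u.BINOP.right = right;
    return e;
}
```

With these, T_Binop(T_plus, T_Temp(3), T_Const(5)) builds the tree for "temporary 3 plus 5"; each constructor only fills in the tag and the matching union member.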
Example factorial
let
  function nfactor (n: int): int =
    if n = 0 then 1
    else n * nfactor(n-1)
in
  nfactor(10)
end
Abstract Tiger Program
letExp(
  decList(
    functionDec(
      fundecList(
        fundec(nfactor,
          fieldList(field(n, int, fld-escaped=FALSE), fieldList()),
          int,
          ifExp(
            opExp(EQUAL, varExp(simpleVar(n)), intExp(0)),
            intExp(1),
            opExp(TIMES, varExp(simpleVar(n)),
              callExp(nfactor,
                expList(opExp(MINUS, varExp(simpleVar(n)), intExp(1)),
                        expList()))))),
        fundecList())),
    decList()),
  seqExp(expList(
    callExp(nfactor, expList(intExp(10), expList())),
    expList())))
IR for Main
/* prologue of main starts with l1 */
/* body of main */
MOVE(TEMP(RV),
     CALL(NAME(l2), ExpList(CONST(10), null /* next argument */)))
/* epilogue of main */
IR for nfactor
/* prologue of nfactor starts with l2 */
/* body of nfactor */
MOVE(TEMP(RV),
  ESEQ(
    SEQ(CJUMP(=, “n”, CONST(0), NAME(l3), NAME(l4)),
    SEQ(LABEL(l3), /* then-clause */
    SEQ(MOVE(TEMP(t1), CONST(1)),
    SEQ(JUMP(NAME(l5)),
    SEQ(LABEL(l4), /* else-clause */
    SEQ(MOVE(TEMP(t1),
             BINOP(MUL, “n”,
                   CALL(NAME(l2),
                        ExpList(BINOP(MINUS, “n”, CONST(1)),
                                null /* next argument */)))),
        LABEL(l5)))…),
    TEMP(t1)))
/* epilogue of nfactor */
Outline of the Translation (translate.c)
• Top-down traversal over the abstract syntax tree
• Generate code to allocate memory for declarations and initializations (next week)
• Generate code for function declarations:
  – Prologue
  – The body expression
  – Epilogue
• Generate code for expressions
  – Value expressions
    • x + y
  – Location expressions
    • x < y
• Statements
  – x := y
  – Control flow
The rest of this lecture
• L-values and R-values
• Arithmetic expressions
• Conditionals and Loops
• Conversions
• Complex data types
  – Arrays
  – Structures
• Memory Checks
L-values vs. R-values
• Assignment x := exp is compiled into:
  – Compute the address of x
  – Compute the value of exp
  – Store the value of exp into the address of x
• Generalization
  – R-value
      rval(x) = value of x
      rval(5) = 5
      rval(x + y) = rval(x) + rval(y)
  – L-value
      lval(x) = address of x
      lval(5) = undefined
      lval(x + y) = undefined
      lval(C-array a) = undefined
      lval(C-pointer a) = address of a
      lval(Pascal-array a) = base address of a
      lval(a[e]) = lval(a) + rval(e)
      lval(*e) = rval(e)
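The lval/rval rules can be tried out on a toy expression language. The sketch below is an illustration only: the Exp type, the memory array standing in for the store, and the use of array indices as addresses are all assumptions made for the example. A dereference is treated with lval(*e) = rval(e).

```c
#include <assert.h>

#define UNDEFINED (-1)

typedef enum { E_VAR, E_NUM, E_ADD, E_DEREF } ExpKind;

typedef struct Exp {
    ExpKind kind;
    int n;                           /* E_VAR: address of the variable; E_NUM: the literal */
    const struct Exp *left, *right;  /* E_ADD: operands; E_DEREF: left is the pointer expression */
} Exp;

int memory[16];                      /* toy store; an address is an index into it */

int rval(const Exp *e);

/* Address denoted by e, or UNDEFINED when e has no l-value. */
int lval(const Exp *e) {
    switch (e->kind) {
    case E_VAR:   return e->n;
    case E_DEREF: return rval(e->left);
    default:      return UNDEFINED;  /* literals and sums have no address */
    }
}

/* Value denoted by e. */
int rval(const Exp *e) {
    switch (e->kind) {
    case E_NUM: return e->n;
    case E_ADD: return rval(e->left) + rval(e->right);
    default:    return memory[lval(e)];  /* E_VAR, E_DEREF: load from the address */
    }
}

/* target := source, following the three steps on the slide. */
void assign(const Exp *target, const Exp *source) {
    int address = lval(target);      /* 1. compute the address of the target */
    int value = rval(source);        /* 2. compute the value of the source */
    assert(address != UNDEFINED);
    memory[address] = value;         /* 3. store the value at the address */
}
```

Assigning to a sum or a literal trips the assertion, which is the toy counterpart of a compiler rejecting an expression with an undefined l-value on the left of :=.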
Translating Expressions
• Straightforward by induction on the abstract expression tree

/* translate.c */
Tr_exp Tr_opExp(A_oper oper, Tr_exp left, Tr_exp right)
{
  switch (oper) {
  case A_plusOp:  return Tr_opArithExp(T_plus, left, right);
  case A_minusOp: return Tr_opArithExp(T_minus, left, right);
  case A_timesOp: …
  case A_eqOp:    return Tr_opCondExp(T_eq, left, right);
  case A_neqOp:   return Tr_opCondExp(T_ne, left, right);
  case A_ltOp:    …
  }
  assert(0); /* not reached for a valid operator */
  return NULL;
}
Conditional Expressions
• Translating expressions in conditions may be tricky
• Two options
  – Value computation
    • Compute the value of the Boolean expression
  – Location computation
    • Compute a label in the code that will be reached if the expression holds
    • Allows shortcut computations
Example C code
if (a < 6 && b+1 > 7)
  a = b * c;

CJUMP(<, “a”, CONST(6), l1, l2)
LABEL(l1)
CJUMP(>, BINOP(+, “b”, CONST(1)), CONST(7), l3, l2)
LABEL(l3)
MOVE(“a”, BINOP(*, “b”, “c”))
LABEL(l2)
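The shortcut translation of && can be sketched as a small recursive function that emits IR text. This is an illustration under assumptions: the Cond type, the string-based output and the new_label counter are invented for the example, and the label numbers it produces differ from the ones on the slide.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { C_REL, C_AND } CondKind;

typedef struct Cond {
    CondKind kind;
    const char *op, *left, *right;      /* C_REL: operator and already-translated operands */
    const struct Cond *first, *second;  /* C_AND: the two conjuncts */
} Cond;

/* Hand out fresh label numbers 1, 2, 3, ... */
int new_label(void) {
    static int next = 0;
    return ++next;
}

/* Append IR text for c to out (a NUL-terminated buffer of capacity cap).
   Control reaches label t if c holds and label f otherwise. */
void trans_cond(const Cond *c, int t, int f, char *out, size_t cap) {
    size_t used = strlen(out);
    if (c->kind == C_REL) {
        snprintf(out + used, cap - used, "CJUMP(%s, %s, %s, l%d, l%d)\n",
                 c->op, c->left, c->right, t, f);
    } else {
        /* first && second: if first fails jump straight to f,
           otherwise fall into the test of second. */
        int mid = new_label();
        trans_cond(c->first, mid, f, out, cap);
        used = strlen(out);
        snprintf(out + used, cap - used, "LABEL(l%d)\n", mid);
        trans_cond(c->second, t, f, out, cap);
    }
}
```

The second conjunct is never evaluated when the first one fails, because the first CJUMP already transfers control to the false label. No Boolean value is ever materialized.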
Conditional Expressions in Tiger
static Tr_exp Tr_opCondExp(T_relOp oper, Tr_exp left, Tr_exp right)
{
  struct Cx cx;
  /* Both jump targets are left empty (NULL) and filled in later. */
  cx.stm = T_Cjump(oper, left, right, NULL, NULL);
  /* Remember the addresses of the two empty label fields. */
  cx.trues = PatchList(&cx.stm->u.CJUMP._true, NULL);
  cx.falses = PatchList(&cx.stm->u.CJUMP._false, NULL);
  return Tr_Cx(cx.trues, cx.falses, cx.stm);
}
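The patch lists used by Tr_opCondExp can be sketched as linked lists of pointers to label fields that are still empty. The version below is an assumption-laden illustration: labels are plain strings, and the names patch_list, do_patch and join_patch are chosen for the example rather than taken from translate.c.

```c
#include <assert.h>
#include <stdlib.h>

typedef const char *Label;   /* assumption: a label is just a string here */

typedef struct PatchList_ *PatchList;
struct PatchList_ {
    Label *head;             /* address of a label field still waiting for its target */
    PatchList tail;
};

PatchList patch_list(Label *head, PatchList tail) {
    PatchList p = malloc(sizeof *p);
    assert(p != NULL);
    p->head = head;
    p->tail = tail;
    return p;
}

/* Fill every hole on the list with the same label. */
void do_patch(PatchList list, Label label) {
    for (; list != NULL; list = list->tail)
        *list->head = label;
}

/* Destructively append second to first, so that the jumps of
   several CJUMPs can later be sent to one common target. */
PatchList join_patch(PatchList first, PatchList second) {
    PatchList p = first;
    if (first == NULL)
        return second;
    while (p->tail != NULL)
        p = p->tail;
    p->tail = second;
    return first;
}
```

Storing the address of each label field, rather than its current value, is what lets the translator decide the target only once the surrounding if or while is known.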
if a > b then x := 5
SEQ(CJUMP(GT, “a”, “b”, NAME(t), NAME(f)),
    SEQ(LABEL(t),
        SEQ(code for x := 5,
            LABEL(f))))
Conversions
• Local translation may lead to converting representations
  – Value-computation ↔ Location-computation
• Examples
  if (x+5) then 0 else 1
  (a > b) + b
  x := if (a>b) then a else b
  x := (a > b)
  (if a>b then a else b) + 1
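Both directions of the conversion can be sketched as functions that print the resulting IR. This is a rough illustration, not the course's code: the function names are invented, and the temporary r and the labels t and f are fixed placeholder names where a real translator would allocate fresh ones.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Location-computation -> value-computation, as needed for x := (a > b):
   preset r to 1, jump over the reset when the test succeeds. */
int cond_to_value(char *out, size_t cap,
                  const char *op, const char *left, const char *right) {
    return snprintf(out, cap,
        "ESEQ(SEQ(MOVE(TEMP(r), CONST(1)),\n"
        "     SEQ(CJUMP(%s, %s, %s, t, f),\n"
        "     SEQ(LABEL(f),\n"
        "     SEQ(MOVE(TEMP(r), CONST(0)),\n"
        "         LABEL(t))))),\n"
        "     TEMP(r))",
        op, left, right);
}

/* Value-computation -> location-computation, as needed for if (x+5) ...:
   the condition holds when the value is nonzero. */
int value_to_cond(char *out, size_t cap, const char *value) {
    return snprintf(out, cap, "CJUMP(!=, %s, CONST(0), t, f)", value);
}
```

For example, value_to_cond applied to the text BINOP(+, x, CONST(5)) gives the test for if (x+5), and cond_to_value applied to >, a, b gives an expression whose value is 1 or 0.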
Complex Data Types
• Data types like arrays, strings, and records may require special treatment
• Important questions
  – Duration
  – Static vs. dynamic size
  – Structured L-values
Complex Data Types in Tiger
• Arrays, strings, and record fields are long-lived
  – Usually allocated in the heap
  – No structured L-values
• Example: Tiger Record Allocation

type foo = { a : ty1 , b : ty2}
... = foo {a = e1, b = e2}

ESEQ(SEQ(MOVE(TEMP r, CALL(NAME MALLOC, CONST 2*W)),
         SEQ(MOVE(MEM(+(0*W, TEMP r)), TransExp(e1)),
             MOVE(MEM(+(1*W, TEMP r)), TransExp(e2)))),
     TEMP r)
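Written by hand in C, the record-allocation IR corresponds roughly to the sketch below. The word type, the constant W and the function name alloc_foo are assumptions made for the example; the field values stand for the already-evaluated e1 and e2.

```c
#include <assert.h>
#include <stdlib.h>

typedef long word;            /* assumption: every field occupies one word */
enum { W = sizeof(word) };    /* word size in bytes */

/* Allocate a two-field record and initialize it, returning its address. */
word *alloc_foo(word e1, word e2) {
    char *r = malloc(2 * W);        /* MOVE(TEMP r, CALL(NAME MALLOC, CONST 2*W)) */
    assert(r != NULL);
    *(word *)(r + 0 * W) = e1;      /* MOVE(MEM(+(0*W, TEMP r)), e1) */
    *(word *)(r + 1 * W) = e2;      /* MOVE(MEM(+(1*W, TEMP r)), e2) */
    return (word *)r;               /* the ESEQ result: TEMP r */
}
```

The value of the record expression is the pointer itself, which is why assigning one record variable to another copies a single word and not the fields.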
Example Tiger Arrays
let
  type intArray = array of int
  var a := intArray[12] of 0
  var b := intArray[13] of 7
in
  a := b

SEQ(
 SEQ(
  CONST 0,
  SEQ(
   MOVE(TEMP ta, CALL(NAME initArray, CONST 12, CONST 0)),
   SEQ(
    MOVE(TEMP tb, CALL(NAME initArray, CONST 13, CONST 7)),
    MOVE(TEMP ta, TEMP tb)))))
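The IR above calls a runtime routine named initArray. A minimal sketch of what such a routine might do is shown below; the name init_array, the long element type and the signature are assumptions for illustration, not the course's runtime.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate size words on the heap and set each one to init. */
long *init_array(size_t size, long init) {
    long *a;
    size_t i;
    assert(size > 0);
    a = malloc(size * sizeof *a);
    assert(a != NULL);
    for (i = 0; i < size; i++)
        a[i] = init;
    return a;
}
```

Since ta and tb hold only the addresses returned by the routine, the final MOVE(TEMP ta, TEMP tb) makes both variables refer to the same 13-element array; nothing is copied element by element.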
L-values of Arrays and Structures (Tiger)
• The l-value of a[i]: MEM(+(“a”, *(CONST W, “i”)))
• For a structure s.f: MEM(+(“s”, *(CONST W, CONST kf)))
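The address arithmetic inside these MEM nodes can be checked in C. In the sketch below the word type, the constant W and both function names are assumptions for illustration; kf stands for the index of field f within the record.

```c
#include <assert.h>
#include <stddef.h>

typedef long word;            /* assumption: one word per element or field */
enum { W = sizeof(word) };    /* word size in bytes */

/* MEM(+(a, *(CONST W, i))): base address plus i words, computed in bytes. */
word *element_lvalue(word *a, size_t i) {
    return (word *)((char *)a + W * i);
}

/* MEM(+(s, *(CONST W, CONST kf))): the same arithmetic with a constant index. */
word *field_lvalue(word *s, size_t kf) {
    return element_lvalue(s, kf);
}
```

The only difference between the two cases is that kf is known at compile time, so the multiplication can be folded into a constant offset.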
Big L-values
• In some programming languages, more than one word needs to be copied or stored
• Examples:
  – C structures
  – Pascal arrays
• How can this be handled?
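One possible answer, given here as a sketch and not as the approach the lecture goes on to take, is to carry the size of the value along with its address and copy it word by word. The function name and the word type are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef long word;

/* Copy n_words words from src to dst: the kind of loop (or runtime call)
   a compiler could emit for an assignment whose value spans several words. */
void copy_words(word *dst, const word *src, size_t n_words) {
    size_t i;
    assert(dst != NULL && src != NULL);
    for (i = 0; i < n_words; i++)
        dst[i] = src[i];
}
```

In this scheme an l-value is a pair (address, size), and a single MOVE of one word becomes a block copy of size words.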
Memory checks
• Can the compiler guarantee that no invalid memory is referenced?
  – At compile time
  – At runtime
• Examples
  – Array references
    • Algol, Pascal, Java, PL/1: runtime checks
    • C: no checks
    • Ada, C#: user control
  – Field and pointer dereferences
• The best solutions combine runtime and compile-time checks
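A runtime check on an array reference can be sketched as follows. The CheckedArray layout, which stores the length next to the data, and the function name are assumptions for illustration; a compiler would typically inline the comparison before the load.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t length;    /* number of elements, kept alongside the data */
    long *data;
} CheckedArray;

/* Returns 1 and stores a->data[index] in *out when index is in bounds;
   otherwise returns 0 and leaves *out untouched. */
int checked_load(const CheckedArray *a, size_t index, long *out) {
    assert(a != NULL && out != NULL);
    if (index >= a->length)
        return 0;
    *out = a->data[index];
    return 1;
}
```

When the compiler can prove at compile time that the index is always in range, for instance in a loop running from 0 to length - 1, the comparison can be dropped, which is one way of combining the two kinds of checks.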