PSU CS322 HM 1 Languages and Compiler Design II IR Code Generation I Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU Spring 2010 rev.: 4/16/2010


Agenda

• Grammar G1
• CodeGen Overview
• Arithmetic Expression Translation
• Boolean Expression Translation
• Various Statement Translations


Grammar G1

Input: AST representation of MINI source

Output: Three-address code or IR tree code

Approach: Syntax-directed translation

Generic Grammar G1, start symbol: S

E -> E arithop E | E relop E | E logicop E

E -> ‘-’ E | ‘!’ E

E -> ‘newArray’ E // new int array size E1

E -> E ‘[’ E ‘]’ // indexed element

E -> id | num // end-nodes

S -> E ‘:=’ E ‘;’ // assignment statement

S -> ‘if’ ‘(’ E ‘)’ ‘then’ S ‘else’ S

S -> ‘while’ ‘(’ E ‘)’ S

S -> ‘print’ E ‘;’

S -> ‘return’ E ‘;’


CodeGen Overview

• Arithmetic Expressions:
  – Preserve precedence and associativity
  – Pay attention to whether the language requires a check for zero-divide
• Boolean Expressions:
  – Define short-circuit evaluation vs. complete evaluation
  – Discern bit-wise vs. logical and, or, xor
  – Multiple unary NOT allowed?
• Array definition:
  – 1D is simple for compiler
  – Per dimension, record: element size, low-bound, high-bound, total size, index type and type
• Array element reference:
  – L-value or r-value?
  – Nested array reference: index expression can in turn be array element
  – Discern pass by value or reference, other
• Statements:
  – Goto into other scope, out of current scope: an issue in FTN, C
  – Return: long-jump in C non-trivial
• Parameters:
  – Function parameters in PL/I and Pascal hard
  – Easy to confuse pointer type parameters with reference parameters (in C)


Arithmetic Expression Translation

Generate three-address code:
– Get a new temp per operation
– E.s holds the statements that evaluate E
– E.t is the temp that holds E’s value

E -> E1 arithop E2

    t = new Temp();
    E.s := [ E1.s; E2.s; t := E1.t arithop E2.t; ]
    E.t := t;

E -> unaryop E1

    t = new Temp();
    E.s := [ E1.s; t := unaryop E1.t; ]
    E.t := t;
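The two rules above can be sketched as a small recursive generator. This is only an illustration, not the course's reference implementation: the AST classes `Id`, `Unary`, `Binary` and the class `TacGen` are hypothetical names, and the temp is allocated after the operand code is emitted so that temps are numbered in evaluation order.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Id:
    name: str


@dataclass
class Unary:
    op: str
    operand: object


@dataclass
class Binary:
    op: str
    left: object
    right: object


class TacGen:
    """Collects three-address statements (E.s) and hands out fresh temps."""

    def __init__(self):
        self._ids = count(1)
        self.code = []

    def new_temp(self):
        return f"t{next(self._ids)}"

    def gen(self, e):
        """Append E.s to self.code and return E.t."""
        if isinstance(e, Id):
            return e.name                      # leaf: no code, value is the name
        if isinstance(e, Unary):
            a = self.gen(e.operand)            # E1.s
            t = self.new_temp()
            self.code.append(f"{t} := {e.op}{a}")
            return t
        left = self.gen(e.left)                # E1.s
        right = self.gen(e.right)              # E2.s
        t = self.new_temp()
        self.code.append(f"{t} := {left} {e.op} {right}")
        return t


# b * -c + b * d / e, with * and / grouped left to right
gen = TacGen()
result = gen.gen(
    Binary("+",
           Binary("*", Id("b"), Unary("-", Id("c"))),
           Binary("/", Binary("*", Id("b"), Id("d")), Id("e"))))
print("\n".join(gen.code))
```

The generator returns E.t and accumulates E.s in one shared list, which is one common way to realize synthesized attributes without building a separate list per node.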


Arithmetic Expression Translation, Cont’d

To generate IR trees, embed expression subtrees into the current root. Attribute E.tr holds the IR tree for E.

E -> E1 arithop E2

    E.tr := ( BINOP arithop E1.tr E2.tr )

E -> unaryop E1

    E.tr := ( UNOP unaryop E1.tr null )

Example:

    b * -c + b * d / e // assume l-2-r associativity of * / %

=> three-address code:

    t1 := -c
    t2 := b * t1
    t3 := b * d
    t4 := t3 / e
    t5 := t2 + t4

=> IR tree:

    (BINOP + (BINOP * b (UNOP - c ) ) (BINOP / (BINOP * b d ) e ) )

The IR tree form places the operator first, so it is similar to Polish (prefix) notation, after Łukasiewicz, 1920s
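As a rough sketch of the IR tree rules, the function below builds E.tr as a prefix string. The nested-tuple AST encoding and the name `ir_tree` are assumptions made for this example; the trailing `null` child of UNOP is omitted, as in the slide's worked example.

```python
def ir_tree(e):
    """Return E.tr as a prefix string.

    `e` is a str leaf, a pair (op, x) for a unary operator,
    or a triple (op, x, y) for a binary operator.
    """
    if isinstance(e, str):
        return e
    if len(e) == 2:
        op, x = e
        return f"(UNOP {op} {ir_tree(x)})"
    op, x, y = e
    return f"(BINOP {op} {ir_tree(x)} {ir_tree(y)})"


# b * -c + b * d / e, with * and / grouped left to right
expr = ("+", ("*", "b", ("-", "c")), ("/", ("*", "b", "d"), "e"))
tree = ir_tree(expr)
print(tree)
```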


Boolean Expression Translation

• Rely on target machine with conditional branches
  – Condition can be part of instruction
  – Or condition can be inquired by using machine flags
  – Or condition can be evaluated separately (canonical execution) and then be provided as one of the arguments
  – Operands are: condition, target address, and *+1
• CodeGen uses temps for intermediate booleans
• Or CodeGen uses flow of control, so “code locations” imply state of boolean subexpressions
• Or combination of both
• Target computer may provide boolean or even bitwise operations:
  – And
  – Or
  – Xor
  – Not, etc


Boolean Expression Translation, Quads

Relational operations need to record their result, e.g. in machine flags.

Logical operations can be realized through computation proper or via control flow. See sample expression: a < 5 || b > 2

    a < 5 || b > 2 // source

=> Pure code mapping, with logical and, or, xor instruction:

    t1 := a < 5 // e.g. encode 0 as false, 1 as true
    t2 := b > 2
    t3 := t1 || t2 // generate jump out if t3 is false

=> Control flow mapping, w/o logical and, or, xor:

    t1 := 1 // guess t1 is true, override if needed
    if a < 5 goto l1 // could be quad: Cond_jump_if_less
    t1 := 0 // guess was wrong, override to false
l1:
    t2 := 1 // guess: set t2 to true initially
    if b > 2 goto l2
    t2 := 0 // wrong guess, set t2 to false
l2:
    t3 := 1 // final guess
    if t1 goto l3
    if t2 goto l3
    t3 := 0 // final guess computed as false
l3: // use t3


Better Representation of Booleans, and IR

Use target machine’s native logical operations for: and, or, xor; also in sample expression: a < 5 || b > 2

    t1 := 1
    if a < 5 goto l1
    t1 := 0
l1:
    t2 := 1
    if b > 2 goto l2
    t2 := 0
l2:
    t3 := t1 or t2
    // use t3

    MOVE t3 ( (BINOP ||
        (ESEQ [ [MOVE t1 (CONST 1) ]
                [CJUMP < (NAME a) (CONST 5) l1 ]
                [MOVE t1 (CONST 0) ]
                [LABEL l1] ] t1 )
        (ESEQ [ [MOVE t2 (CONST 1) ]
                [CJUMP > (NAME b) (CONST 2) l2 ]
                [MOVE t2 (CONST 0) ]
                [LABEL l2] ] t2 ) ) )


Value Representation, Relational

E -> E1 relop E2

• Three-Address Code:

    L := new Label();
    t := new Temp();
    E.s := [ E1.s; E2.s;
             t := 1;
             if ( E1.t relop E2.t ) goto L;
             t := 0;
             L: ... ]
    E.t := t;

• IR Tree Code:

    L := new NAME();
    t := new TEMP();
    E.tr := ( ESEQ [ [MOVE t (CONST 1) ]
                     [CJUMP relop E1.tr E2.tr L ]
                     [MOVE t (CONST 0) ]
                     [LABEL L] ] t )


Value Representation, Three-Address Code

E -> E1 ‘||’ E2

    L := new Label(); t := new Temp();
    E.s := [ E1.s; E2.s; t := 1;
             if ( E1.t == 1 ) goto L;
             if ( E2.t == 1 ) goto L;
             t := 0;
             L: ... ]
    E.t := t;

E -> E1 ‘&&’ E2

    L := new Label(); t := new Temp();
    E.s := [ E1.s; E2.s; t := 0;
             if ( E1.t == 0 ) goto L;
             if ( E2.t == 0 ) goto L;
             t := 1;
             L: ... ]
    E.t := t;

E -> ‘!’ E1

    t := new Temp();
    E.s := [ E1.s; t := 1 - E1.t; ]
    E.t := t;
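A minimal sketch of the value-representation rules for relational operators, `||`, `&&` and `!` follows. The class name `ValueGen` and the tuple encoding of expressions are assumptions for illustration. Note that both operands are always evaluated (complete evaluation), exactly as in the rules above.

```python
from itertools import count


class ValueGen:
    """Emits three-address code that leaves 0 or 1 in a temp."""

    def __init__(self):
        self._temps = count(1)
        self._labels = count(1)
        self.code = []

    def temp(self):
        return f"t{next(self._temps)}"

    def label(self):
        return f"L{next(self._labels)}"

    def gen(self, e):
        """Append E.s to self.code and return E.t."""
        if isinstance(e, (str, int)):
            return str(e)                       # id or num leaf
        op = e[0]
        if op == "!":
            a = self.gen(e[1])
            t = self.temp()
            self.code.append(f"{t} := 1 - {a}")
            return t
        a = self.gen(e[1])                      # E1.s
        b = self.gen(e[2])                      # E2.s (always evaluated)
        t, lab = self.temp(), self.label()
        if op == "||":
            self.code += [f"{t} := 1",
                          f"if ({a} == 1) goto {lab}",
                          f"if ({b} == 1) goto {lab}",
                          f"{t} := 0", f"{lab}:"]
        elif op == "&&":
            self.code += [f"{t} := 0",
                          f"if ({a} == 0) goto {lab}",
                          f"if ({b} == 0) goto {lab}",
                          f"{t} := 1", f"{lab}:"]
        else:                                   # relational operator
            self.code += [f"{t} := 1",
                          f"if ({a} {op} {b}) goto {lab}",
                          f"{t} := 0", f"{lab}:"]
        return t


vg = ValueGen()
value_temp = vg.gen(("||", ("<", "a", 5), (">", "b", 2)))
print("\n".join(vg.code))
```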


Value Representation, IR Tree Code

E -> E1 ‘||’ E2

    L = new NAME(); t = new TEMP();
    E.tr := (ESEQ [ [MOVE t (CONST 1) ]
                    [CJUMP == E1.tr (CONST 1) L ]
                    [CJUMP == E2.tr (CONST 1) L ]
                    [MOVE t (CONST 0) ]
                    [LABEL L] ] t )

E -> E1 ‘&&’ E2

    L = new NAME(); t = new TEMP();
    E.tr := (ESEQ [ [MOVE t (CONST 0) ]
                    [CJUMP == E1.tr (CONST 0) L ]
                    [CJUMP == E2.tr (CONST 0) L ]
                    [MOVE t (CONST 1) ]
                    [LABEL L] ] t )

E -> ‘!’ E1

    t = new TEMP();
    E.tr := (ESEQ [MOVE t (BINOP - (CONST 1) E1.tr)] t )


Control-Flow Mapping, Long If Version

Booleans used in programs to direct flow of control, e.g.

    if ( a < 5 || b > 2 ) S1; else S2;

Frequently, the Boolean result is not needed afterwards. Thus possible to generate positional code. Instead of:

    // assume ( a < 5 || b > 2 ) stored in t3
    // code to compute t3, includes boolean OR ||
    if ( t3 == 0 ) goto L2
L1:
    code for S1
    goto L3
L2:
    code for S2
L3:
    ... Successor of if-statement
    ... t3 not needed; if machine flag: overridden by S1, S2


Control-Flow Mapping, Shorter If Version

1. No need to create temps to compute boolean value
2. How does code-gen know where to branch to? Use “back-patching!” Can be done by buffering code, or after CodeGen. Ramifications would be a good Midterm question

    if ( a < 5 ) goto L4
    if ( b > 2 ) goto L4
    goto L5
L4:
    code for S1
    goto L6
L5:
    code for S2
L6:
    ... Code after if-statement


Control-Flow Mapping, Nested If Statements

Data structures needed to back-patch?

Object Code Skeleton

    if ( a >= 5 ) goto L8 // back-patch
    if ( b <= 2 ) goto L7 // back-patch
    code for S1
    goto L10 // back-patch
L7: // L7 resolved
    code for S2
    goto L10 // back-patch
L8: // L8 resolved
    if ( c >= 6 ) goto L9 // back-patch
    code for S3
    goto L10 // back-patch
L9: // L9 resolved
    code for S4
L10: // L10 resolved

Source Code Skeleton

    if ( a < 5 )
        if ( b > 2 )
            S1;
        else
            S2;
        //end if
    else
        if ( c < 6 )
            S3;
        else
            S4;
        //end if
    //end if


Control-Flow Mapping, Elsif Clauses (Ada)

How can linked-list of to-be-back-patched addresses be created?

Object Code Skeleton

    if ( a >= 5 ) goto L11
    code for S1
    goto L14
L11:
    if ( b <= 2 ) goto L12
    code for S2
    goto L14
L12:
    if ( c >= 6 ) goto L13
    code for S3
    goto L14
L13:
    code for S4
L14:

Source Code Skeleton

    if a < 5 then
        S1;
    elsif b > 2 then
        S2;
    elsif c < 6 then
        S3;
    else
        S4;
    end if;


Control-Flow Mapping, While

Are there size (of code) limitations to back-patching?

Object Code Skeleton

    // R1 holds induction variable
L15:
    if ( R1 >= 10 ) goto L16
    code for S
    R1++
    goto L15
L16:

Source Code Skeleton

    while ( i < 10 ) {
        S;
        i++;
    } //end while
    // assume i NOT needed after
    // “i” is pure IV


Control-Flow Mapping, Repeat (Pascal)

Object Code Skeleton

    // R1 holds induction variable
L15:
    code for S
    R1++
    if ( R1 < 10 ) goto L15 // loop back until R1 >= 10
    // fall through: loop is done

Source Code Skeleton

    //Pascal source
    repeat
        S;
        i++;
    until i >= 10;
    // again no use of “i” after

“Fall-Through” in Repeat vs. initial test in While


Control-Flow Mapping, For

• What happens if “i” (induction variable AKA IV) is defined outside, and used as loop parameter?
• What is its value after for loop completion?
• Can it be referenced? i.e. value be printed?
• What happens if IV is assigned inside loop?
• What should happen, if IV value is > end-value at start?

Object Code Skeleton

    mov R1, #0
L17:
    if ( R1 >= 10 ) goto L18
    code for S
    R1++
    goto L17
L18:

Source Code Skeleton

    for( int i=0; i<10; i++ ) {
        S;
    } //end for
    // i is undefined/not used
    // can be IV in reg


Back Patching Example

    if ( a < 5 || b > 2 ) S1; else S2;

• Handling a<5:

    if (a < 5) goto <Lx>; // <Lx> needs to be patched; addr. insertion

• Handling b>2:

    if (b > 2) goto <Ly>; // <Ly> needs to be patched

• Handling ..||..:

    if (a < 5) goto <Lx>; // .. else fall through
    if (b > 2) goto <Lx>; // <Ly> is patched to <Lx>
    goto <Lz> // <Lz> needs to be patched

• Handling if .. S1 else S2:

    if (a < 5) goto L4; // <Lx> is patched to L4
    if (b > 2) goto L4;
    goto L5; // <Lz> is patched to L5
L4: [code for S1] goto L6; // then clause
L5: [code for S2] // else clause
L6: // end of If Statement
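One way to realize the patching steps above is to buffer the code and keep lists of the jump instructions whose targets are still unknown. The sketch below is an assumption-laden illustration, not the course's implementation: `Emitter`, `gen_relop`, `gen_or`, `gen_if_else` and the `<?>` placeholder are hypothetical names, and jump targets are instruction indices rather than symbolic labels.

```python
HOLE = "<?>"  # placeholder for a jump target that is not yet known


class Emitter:
    """Buffers instructions so that jump targets can be filled in later."""

    def __init__(self):
        self.code = []

    def emit(self, instr):
        self.code.append(instr)
        return len(self.code) - 1          # index of the new instruction

    def backpatch(self, holes, target):
        for i in holes:
            self.code[i] = self.code[i].replace(HOLE, str(target))


def gen_relop(em, cond):
    """Jumping code for a comparison; returns (true_holes, false_holes)."""
    t = em.emit(f"if ({cond}) goto {HOLE}")
    f = em.emit(f"goto {HOLE}")
    return [t], [f]


def gen_or(em, gen_left, gen_right):
    t1, f1 = gen_left(em)
    em.backpatch(f1, len(em.code))         # E1 false: continue with E2
    t2, f2 = gen_right(em)
    return t1 + t2, f2                     # true holes merge, E2 decides false


def gen_if_else(em, gen_cond, s1, s2):
    true_holes, false_holes = gen_cond(em)
    em.backpatch(true_holes, len(em.code)) # then clause starts here
    em.emit(s1)
    end = em.emit(f"goto {HOLE}")
    em.backpatch(false_holes, len(em.code))  # else clause starts here
    em.emit(s2)
    em.backpatch([end], len(em.code))      # first instruction after the if


em = Emitter()
gen_if_else(
    em,
    lambda e: gen_or(e,
                     lambda x: gen_relop(x, "a < 5"),
                     lambda x: gen_relop(x, "b > 2")),
    "code for S1",
    "code for S2")
for i, instr in enumerate(em.code):
    print(f"{i}: {instr}")
```

This unoptimized version emits a `goto` to the immediately following instruction (index 1 jumps to index 2); the slide's hand-written code simply falls through there.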


Back Patching: Jump Labels

• Three-Address Code: Add two attributes

    E.true is the position to jump to when E evaluates to true;
    E.false is the position to jump to when E evaluates to false.

E -> E1 relop E2

    E.s := [ E1.s;
             E2.s;
             if ( E1.t relop E2.t ) goto E.true;
             goto E.false; ]

E -> E1 ‘||’ E2

    E1.true := E.true;
    E1.false := new Label();
    E2.true := E.true;
    E2.false := E.false;
    E.s := [ E1.s; E1.false: E2.s; ]


Back Patching: Three-Address Code Cont’d

E -> E1 ‘&&’ E2

    E1.true := new Label();
    E1.false := E.false;
    E2.true := E.true;
    E2.false := E.false;
    E.s := [ E1.s; E1.true: E2.s; ]

E -> ‘!’ E1

    E1.true := E.false;
    E1.false := E.true;
    E.s := E1.s;
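The jump-label rules can be sketched as a recursive function that receives E.true and E.false as inherited attributes. The class `JumpGen` and the tuple encoding are hypothetical; this sketch also assumes that a comparison ends with an explicit `goto` to its false label, so that no code path relies on fall-through.

```python
from itertools import count


class JumpGen:
    """Generates short-circuit jumping code from inherited true/false labels."""

    def __init__(self):
        self._labels = count(1)

    def label(self):
        return f"L{next(self._labels)}"

    def jump_code(self, e, true, false):
        """Return E.s as a list of lines, given E.true and E.false."""
        op = e[0]
        if op == "||":
            mid = self.label()              # E1.false: where E2 starts
            return (self.jump_code(e[1], true, mid)
                    + [f"{mid}:"]
                    + self.jump_code(e[2], true, false))
        if op == "&&":
            mid = self.label()              # E1.true: where E2 starts
            return (self.jump_code(e[1], mid, false)
                    + [f"{mid}:"]
                    + self.jump_code(e[2], true, false))
        if op == "!":
            return self.jump_code(e[1], false, true)   # swap the targets
        # relational operator with simple operands
        return [f"if ({e[1]} {op} {e[2]}) goto {true}", f"goto {false}"]


jg = JumpGen()
or_code = jg.jump_code(("||", ("<", "a", 5), (">", "b", 2)), "Ltrue", "Lfalse")
print("\n".join(or_code))
```

No temp is created for the Boolean result; the position reached in the code encodes the outcome.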


Back Patching: Jump Labels Cont’d

IR Tree Code:

E -> E1 relop E2

    E.tr := ( ESEQ [CJUMP relop E1.tr E2.tr E.true ] null )

E -> E1 ‘||’ E2

    E1.true := E.true;
    E1.false := new NAME();
    E2.true := E.true;
    E2.false := E.false;
    E.tr := (ESEQ [stmt( E1.tr); LABEL( E1.false ); stmt( E2.tr); ] null )


Back Patching: IR Tree Cont’d

E -> E1 ‘&&’ E2

    E1.true := new NAME();
    E1.false := E.false;
    E2.true := E.true;
    E2.false := E.false;
    E.tr := (ESEQ [stmt( E1.tr ); LABEL( E1.true ); stmt( E2.tr ); ] null)

E -> ‘!’ E1

    E1.true := E.false;
    E1.false := E.true;
    E.tr := E1.tr;


Converting Back to Value

Actual Boolean values are sometimes needed in programs, e.g.

    boolean x = a < 5 || b > 2;

We still need to generate a value for the Boolean expression!

This can be implemented by patching the two labels E.true and E.false for the Boolean expression E with two assignment statements that assign 1 and 0, respectively.

Boolean expression E

    t = new Temp();
    E.true := new Label();
    E.false := new Label();
    L := new Label();
    E.s := [ E.true: t := 1; goto L;
             E.false: t := 0; L: ]
    E.t := t;
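The conversion can be sketched by wrapping jumping code with the two assignments. This sketch assumes that the jumping code for E is emitted first and that every comparison ends with an explicit `goto` to its false label; `Gen`, `jump_code` and `to_value` are hypothetical names, and only comparisons and `||` are handled to keep the example short.

```python
from itertools import count


class Gen:
    def __init__(self):
        self._labels = count(1)

    def label(self):
        return f"L{next(self._labels)}"

    def jump_code(self, e, true, false):
        """Jumping code for a comparison or an '||' of sub-expressions."""
        op = e[0]
        if op == "||":
            mid = self.label()
            return (self.jump_code(e[1], true, mid)
                    + [f"{mid}:"]
                    + self.jump_code(e[2], true, false))
        return [f"if ({e[1]} {op} {e[2]}) goto {true}", f"goto {false}"]

    def to_value(self, e, t):
        """Jumping code for e, followed by code that stores 1 or 0 in temp t."""
        true, false, done = self.label(), self.label(), self.label()
        return (self.jump_code(e, true, false)
                + [f"{true}:", f"{t} := 1", f"goto {done}",
                   f"{false}:", f"{t} := 0",
                   f"{done}:"])


g = Gen()
value_code = g.to_value(("||", ("<", "a", 5), (">", "b", 2)), "t1")
print("\n".join(value_code))
```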


New Arrays

• Storage allocation for E: Follow Java’s array storage convention. The length of the array is stored as the 0th element, so storage for a 10-element array actually requires 11 cells
• Cell initialization: All elements are automatically initialized to 0; you emit code

Pseudo IR Code:

E -> ‘newArray’ E1

    L: new Label; t1,t2,t3: new Temps; // wdSize == 4
    E.s := [ E1.s;
             t1 := ( E1.t + 1 ) * wdSize; // size in bytes, incl. length cell
             t2 := malloc( t1 ); // t2 points to cell 0
             t2[0] := E1.t; // store array length
             t3 := t2 + ( E1.t * wdSize ); // t3 points to last cell
          L: t3[0] := 0; // init a cell to 0
             t3 := t3 - wdSize; // move down a cell
             if ( t3 > t2 ) goto L; ] // loop back
    E.t := t2;
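To check the pointer arithmetic, the pseudo IR can be simulated on a toy heap. Everything here is an assumption for illustration: `Heap` is a hypothetical word-addressed allocator whose fresh cells hold -1 as "garbage", and `new_array` mirrors the pseudo IR statement by statement.

```python
WD_SIZE = 4


class Heap:
    """Toy heap: addresses are byte offsets, each cell is one word."""

    def __init__(self):
        self.words = []

    def malloc(self, nbytes):
        base = len(self.words) * WD_SIZE
        self.words += [-1] * (nbytes // WD_SIZE)   # uninitialized cells
        return base

    def store(self, addr, value):
        self.words[addr // WD_SIZE] = value


def new_array(heap, n):
    """Allocate an n-element array; returns the address of cell 0 (E.t)."""
    t1 = (n + 1) * WD_SIZE          # bytes needed, including the length cell
    t2 = heap.malloc(t1)            # t2 points to cell 0
    heap.store(t2, n)               # store array length
    t3 = t2 + n * WD_SIZE           # t3 points to last cell
    while True:                     # L:
        heap.store(t3, 0)           # init a cell to 0
        t3 -= WD_SIZE               # move down a cell
        if not (t3 > t2):           # if ( t3 > t2 ) goto L
            break
    return t2


heap = Heap()
a = new_array(heap, 3)
b = new_array(heap, 0)
print(a, b, heap.words)
```

For a zero-length array the loop body still runs once and writes 0 into the length cell, which already holds 0, so the result is unchanged.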


Array Element Reference

• Calculate address for E: addr( a[i] ) = base a + (i+1) * wdSize
• Bounds check: i >= 0 and i < num-elements. i is general expression!

E -> E1 ‘[’ E2 ‘]’

    L1,L2: new Label; t1,t2,t3,t4: new Temps;
    E.s := [ E1.s; E2.s;
             t1 := E1.t[ 0 ]; // t1 holds num elements
             if ( E2.t < 0 ) goto L1; // too low?
             if ( E2.t >= t1 ) goto L1; // too high?
             t2 := E2.t + 1; // must be OK
             t3 := t2 * wdSize; // compute offset
             t4 := E1.t[ t3 ]; // address = start + offset
             goto L2; // bypass exception handler
         L1: param E1.t; param E2.t;
             call arrayError, 2;
         L2: ] // t4 holds final address
    E.t := t4;
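The address calculation and bounds check can be exercised with a short sketch. It assumes memory is a Python list of words and that the array's length sits in its cell 0, as on the New Arrays slide; `ArrayIndexError` is a hypothetical stand-in for the call to arrayError, and the function reads the element (r-value use) rather than returning its address.

```python
WD_SIZE = 4


class ArrayIndexError(Exception):
    """Stands in for the run-time call to arrayError."""


def load_element(words, base, i):
    """Return a[i] for the array whose cell 0 is at byte address `base`."""
    n = words[base // WD_SIZE]                 # t1 := E1.t[0]
    if i < 0 or i >= n:                        # too low? too high?
        raise ArrayIndexError(f"index {i} out of range for length {n}")
    offset = (i + 1) * WD_SIZE                 # skip the length cell
    return words[(base + offset) // WD_SIZE]   # start + offset


memory = [3, 10, 20, 30]                       # length 3, then the elements
print(load_element(memory, 0, 2))
```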


Statements

Assignment Statement

S -> E1 ‘:=’ E2 ‘;’

=>

    S.s := [ E1.s; E2.s; E1.t := E2.t; ]

If Statement with Else Clause

S -> ‘if’ ‘(’ E ‘)’ ‘then’ S1 ‘else’ S2 ‘;’

=>

    L1, L2, L3: new Labels;
    E.true := L1;
    E.false := L2;
    S.s := [ E.s; L1: S1.s; goto L3; L2: S2.s; L3: ; ]
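A minimal sketch of the if-else rule, restricted to a single comparison as the condition. The helper names `relop_jump` and `gen_if_else` are hypothetical, and the comparison is assumed to end with an explicit `goto` to E.false.

```python
from itertools import count


def relop_jump(cond, true, false):
    """Jumping code for a single comparison."""
    return [f"if ({cond}) goto {true}", f"goto {false}"]


def gen_if_else(cond, s1, s2, labels):
    """S.s for: if (cond) then S1 else S2. `labels` yields fresh label names."""
    l1, l2, l3 = next(labels), next(labels), next(labels)
    return (relop_jump(cond, l1, l2)       # E.s with E.true = L1, E.false = L2
            + [f"{l1}:"] + s1 + [f"goto {l3}"]
            + [f"{l2}:"] + s2
            + [f"{l3}:"])


labels = (f"L{i}" for i in count(1))
if_code = gen_if_else("a < 5", ["x := 1"], ["x := 2"], labels)
print("\n".join(if_code))
```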


Statements, Cont’d

While Statement

S -> ‘while’ ‘(’ E ‘)’ S1 ‘;’

=>

    L1, L2, L3: new labels; // no explicit jump to L2
    E.true := L2;
    E.false := L3;
    S.s := [ L1: E.s; L2: S1.s; goto L1; L3: ]

Print Statement with 1 argument

S -> ‘print’ E ‘;’

=>

    S.s := [ E.s; param E.t; call prInt, 1; ]
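The while and print rules can be sketched in the same style. `gen_while` and `gen_print` are hypothetical helpers; the condition is limited to one comparison, which is assumed to end with an explicit `goto` to E.false.

```python
from itertools import count


def relop_jump(cond, true, false):
    """Jumping code for a single comparison."""
    return [f"if ({cond}) goto {true}", f"goto {false}"]


def gen_while(cond, body, labels):
    """S.s for: while (cond) S1. `labels` yields fresh label names."""
    l1, l2, l3 = next(labels), next(labels), next(labels)
    return ([f"{l1}:"] + relop_jump(cond, l2, l3)   # E.true = L2, E.false = L3
            + [f"{l2}:"] + body + [f"goto {l1}"]
            + [f"{l3}:"])


def gen_print(expr_code, expr_temp):
    """S.s for: print E, given E.s and E.t."""
    return expr_code + [f"param {expr_temp}", "call prInt, 1"]


labels = (f"L{i}" for i in count(1))
body = gen_print([], "i") + ["i := i + 1"]
while_code = gen_while("i < 10", body, labels)
print("\n".join(while_code))
```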