lec09-Code generation


Transcript of lec09-Code generation

  • 8/7/2019 lec09-Code generation

    1/36

    CODE GENERATION

    Three-address statements -> Code generator -> Object program

  • 8/7/2019 lec09-Code generation

    2/36

    General issues in code generation:

    Deciding what machine instructions to generate.

    Deciding in what order computations should be done.

    Deciding which registers to use.

  • 8/7/2019 lec09-Code generation

    3/36

    Forms of Object program

    Absolute machine code

    Relocatable Machine code

    Assembly language code

  • 8/7/2019 lec09-Code generation

    4/36

    Basic Block

    A basic block is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end, without halting or the possibility of branching except at the end.

  • 8/7/2019 lec09-Code generation

    5/36

    1. a := b+c

    2. d := d-b

    3. e := a+f

    4. if a>b goto 7

    5. f := a-d

    6. goto 10

    7. b := d+f

    8. e := a-c

    9. if b>c goto 15

    10. b := d+c

    11. if a>b goto 1

  • 8/7/2019 lec09-Code generation

    6/36

    Algorithm: Partition into basic blocks

    Input: A sequence of three-address statements.

    Output: A list of basic blocks with each three-address statement in exactly one block.

    Method:

    1) We first determine the set of leaders, the first statements of basic blocks. The rules are:

    a) The first statement is a leader.

    b) Any statement that is the target of a conditional or unconditional goto is a leader.

    c) Any statement that immediately follows a goto or conditional goto statement is a leader.

    2) For each leader, its basic block consists of the leader and all statements up to but not including the next leader or the end of the program.
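    The leader rules and the partitioning step can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the function names find_leaders and partition are made up here, statements are assumed to be plain strings numbered from 1, and a jump is assumed to be written as "goto N". Targets that fall outside the fragment (such as statement 15 in the example) are simply ignored.

```python
import re

def find_leaders(stmts):
    """Return the sorted 1-based numbers of the statements that start a basic block."""
    n = len(stmts)
    leaders = {1} if n else set()            # rule (a): the first statement
    for i, text in enumerate(stmts, start=1):
        jump = re.search(r"\bgoto\s+(\d+)", text)
        if jump:
            target = int(jump.group(1))
            if 1 <= target <= n:             # rule (b): target of a goto
                leaders.add(target)
            if i < n:                        # rule (c): statement after a goto
                leaders.add(i + 1)
    return sorted(leaders)

def partition(stmts):
    """Each block runs from a leader up to, but not including, the next leader."""
    leaders = find_leaders(stmts)
    bounds = leaders + [len(stmts) + 1]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(len(leaders))]

program = [
    "a := b+c", "d := d-b", "e := a+f", "if a>b goto 7",
    "f := a-d", "goto 10", "b := d+f", "e := a-c",
    "if b>c goto 15", "b := d+c", "if a>b goto 1",
]
print(find_leaders(program))   # [1, 5, 7, 10]
print(partition(program))      # [[1, 2, 3, 4], [5, 6], [7, 8, 9], [10, 11]]
```

    Run on the eleven-statement fragment used in these slides, the sketch yields four blocks of statement numbers.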

  • 8/7/2019 lec09-Code generation

    7/36

    1. a := b+c

    2. d := d-b

    3. e := a+f

    4. if a>b goto 7

    5. f := a-d

    6. goto 10

    7. b := d+f

    8. e := a-c

    9. if b>c goto 15

    10. b := d+c

    11. if a>b goto 1

    Fragment of code to be partitioned into basic

    blocks

  • 8/7/2019 lec09-Code generation

    8/36

    1. a := b+c

    2. d := d-b

    3. e := a+f

    4. if a>b goto 7

    5. f := a-d

    6. goto 10

    7. b := d+f

    8. e := a-c

    9. if b>c goto 15

    10. b := d+c

    11. if a>b goto 1

    Applying the algorithm, we identify the following leaders: statements 1, 5, 7 and 10.

  • 8/7/2019 lec09-Code generation

    9/36

    B1:
    1. a := b+c
    2. d := d-b
    3. e := a+f
    4. if a>b goto 7

    B2:
    5. f := a-d
    6. goto 10

    B3:
    7. b := d+f
    8. e := a-c
    9. if b>c goto 15

    B4:
    10. b := d+c
    11. if a>b goto 1

  • 8/7/2019 lec09-Code generation

    11/36

    Computing next uses of variables in a basic block

    i: x:= +;

    (no intervening assignments to x)

    j: y:=+x*;

    Statement j uses the value of x computed at i.

    The next use of x is j.

    To find next uses in a basic block, we perform a backward scan from the end of the basic block.

  • 8/7/2019 lec09-Code generation

    12/36

    Computing next uses of variables in a basic block

    Suppose we reach three-address statement i: x := y OP z. Then do the following:

    1. Attach to statement i the information currently found in the symbol table regarding the next use and liveness of x, y and z.

    2. In the symbol table, set x to not live and no next use.

    3. In the symbol table, set y and z to live and the next uses of y and z to i.
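    The three steps of the backward scan can be sketched as follows. This is an illustration under assumptions that are not in the slides: the function name next_use_info is hypothetical, statements are given as (number, x, y, z) tuples for x := y OP z, the set of names live on exit from the block is supplied by the caller, and a next use of 0 stands for "no next use in the block".

```python
def next_use_info(block, live_on_exit):
    """block: list of (number, x, y, z) for statements `x := y OP z`.
    Returns {number: {name: (live, next_use)}}; next_use 0 means none."""
    names = {v for _, x, y, z in block for v in (x, y, z)} | set(live_on_exit)
    # Symbol table as it stands at the end of the block.
    table = {v: (v in live_on_exit, 0) for v in names}
    attached = {}
    for number, x, y, z in reversed(block):                   # backward scan
        attached[number] = {v: table[v] for v in (x, y, z)}   # step 1
        table[x] = (False, 0)                                 # step 2
        table[y] = table[z] = (True, number)                  # step 3
    return attached

block = [(1, "a", "b", "c"), (2, "d", "d", "b"), (3, "e", "a", "f")]
info = next_use_info(block, live_on_exit={"a", "c", "d", "e", "f"})
print(info[1])   # a is live with next use 3, b is live with next use 2, c is live with no next use
```

    Step 2 is applied before step 3 on purpose: for a statement such as d := d - b, the name d ends up live with its next use at that statement, because it is read there.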

  • 8/7/2019 lec09-Code generation

    13/36

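    The following shows the next-use information for the basic blocks considered earlier.

    Each name carries a pair "liveness,next use": 1 means live and d means dead (not live); the next use is a statement number, and 0 means no next use in the block. The symbol-table contents are listed before and after each statement.

    Statements 1-3:
        a d,0; b 1,1; c 1,1; d 1,2; e d,0; f 1,3
    1.  a 1,3 := b 1,2 + c 1,0
        a 1,3; b 1,2; c 1,0; d 1,2; e d,0; f 1,3
    2.  d 1,0 := d 1,0 - b d,0
        a 1,3; b d,0; c 1,0; d 1,0; e d,0; f 1,3
    3.  e 1,0 := a 1,0 + f 1,0
        a 1,0; b d,0; c 1,0; d 1,0; e 1,0; f 1,0

    Statement 5:
        a 1,5; b d,0; c 1,0; d 1,5; e 1,0; f d,0
    5.  f 1,0 := a d,0 - d 1,0
        a d,0; b d,0; c 1,0; d 1,0; e 1,0; f 1,0

    Statements 7-8:
        a 1,8; b d,0; c 1,8; d 1,7; e d,0; f 1,7
    7.  b 1,0 := d 1,0 + f 1,0
        a 1,8; b 1,0; c 1,8; d 1,0; e d,0; f 1,0
    8.  e 1,0 := a d,0 - c 1,0
        a d,0; b 1,0; c 1,0; d 1,0; e 1,0; f 1,0

    Statement 10:
        a d,0; b d,0; c 1,10; d 1,10; e 1,0; f 1,0
    10. b 1,0 := d 1,0 + c 1,0
        a d,0; b 1,0; c 1,0; d 1,0; e 1,0; f 1,0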

  • 8/7/2019 lec09-Code generation

    14/36

    MACHINE MODEL

    The machine for which we generate code is a byte-addressable machine with 2^16 bytes (2^15 16-bit words) of memory.

    There are 8 general purpose registers numbered 0 to 7,

    each capable of holding a 16-bit quantity.

    The instructions are of the form

    OP       source   destination
    4 bits   6 bits   6 bits

    The bit patterns in the fields specify the nature of the operands, and the words that follow the instruction contain the operands.

    The op codes we refer to are MOV, ADD, SUB.

    The length of the instruction in words is regarded as the cost

    of the instruction for analytical purpose.

  • 8/7/2019 lec09-Code generation

    15/36

    The following table illustrates the addressing modes and instruction formats:

    Addressing mode                 Operand bit pattern  Extra cost  Meaning
    1. Register mode (r)            001xxx               0           Operand is in register xxx.
    2. Indirect register mode (*r)  010xxx               0           Address of the operand is in register xxx.
    3. Indexed mode X(r)            011xxx               1           Address of the operand is the contents of register xxx + the value X found in the word that follows the instruction.
    4. Indirect indexed mode *X(r)  100xxx               1           Address of the operand is in the location obtained as the contents of register xxx + the value X found in the word that follows the instruction.
    5. Immediate #X                 101$$$               1           The word that follows the instruction contains the immediate operand.
    6. Absolute X                   110$$$               1           The word that follows the instruction contains the address of the operand.

  • 8/7/2019 lec09-Code generation

    16/36

    Some example instructions and their costs:

    MOV R0,R1 1

    MOV R5,M 2

    ADD #1,R3 2

    SUB 4(R0),*5(R1) 3
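    The costs above follow from one word for the instruction itself plus the extra cost of each operand's addressing mode. A small sketch of that arithmetic, assuming registers are written R0 to R7 and instructions are written as "OP source,destination" (the helper names are made up for this illustration):

```python
import re

# Register mode (r) and indirect register mode (*r) need no extra word.
REGISTER = re.compile(r"^\*?R[0-7]$")

def operand_extra_cost(operand):
    """0 for r and *r; 1 for X(r), *X(r), #X and absolute X."""
    return 0 if REGISTER.match(operand) else 1

def instruction_cost(instruction):
    """Length of the instruction in words: 1 + extra cost of both operands."""
    _opcode, operands = instruction.split(None, 1)
    source, destination = (part.strip() for part in operands.split(","))
    return 1 + operand_extra_cost(source) + operand_extra_cost(destination)

for text in ["MOV R0,R1", "MOV R5,M", "ADD #1,R3", "SUB 4(R0),*5(R1)"]:
    print(text, instruction_cost(text))
```

    For the four example instructions this reproduces the costs 1, 2, 2 and 3 listed above.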

  • 8/7/2019 lec09-Code generation

    17/36

    A Code-Generation Algorithm

    For each quadruple A := B op C we perform the following:

    1) Invoke a function GETREG() to determine the location L where the computation B op C should be performed. L will usually be a register, but it could also be a memory location.

    2) If the value of B is not in L, generate the instruction MOV B1, L to place a copy of B in L. Consult the address descriptor for B to determine B1, (one of) the current location(s) of B. Prefer the register for B1 if the value of B is currently both in memory and a register.

  • 8/7/2019 lec09-Code generation

    18/36

    A Code-Generation Algorithm

    For each quadruple A := B op C we perform the following:

    3) Generate the instruction OP C1, L where C1 is the current location of C. Update the address descriptor of A to indicate that A is in location L. If L is a register, update its descriptor to indicate that it will contain at run time the value of A.

    4) If the current values of B and/or C have no next uses, are not live on exit from the block, and are in registers, alter the register descriptor to indicate that, after execution of A := B OP C, those registers no longer will contain B and/or C, respectively.

  • 8/7/2019 lec09-Code generation

    19/36

    GETREG( ):

    1) If the name B is in a register that holds the value of no other names (recall that copy instructions such as X := Y could cause a register to hold the value of two or more variables simultaneously), and B is not live and has no next use after execution of A := B+C, then return the register of B for L. Update the address descriptor of B to indicate that B is no longer in L.

    2) Failing (1), return an empty register for L if there is one.

  • 8/7/2019 lec09-Code generation

    20/36

    GETREG( ):

    3) Failing (2), if A has a next use in the block, or OP is an operator, such as indexing, that requires a register, find an occupied register R. Store the value of R into a memory location (by MOV R,M) if it is not already in the proper memory location M, update the address descriptor for M, and return R. A suitable occupied register might be one whose datum is referenced furthest in the future, or one whose value is also in memory. The exact choice is open, since there is no one proven best way to make the selection.

    4) If A is not used in the block, or no suitable occupied register can be found, select the memory location of A as L.
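    A much simplified sketch of the code-generation steps together with GETREG rules 1 and 2 is given below. It is not the lecture's implementation and rests on several assumptions: statements are (A, op, B, C) tuples with op being + or -, each register holds at most one name (no copy statements), the names live on exit are supplied by the caller, and spilling (GETREG rules 3 and 4) is left out, so the sketch raises an error when no register is free. All function and variable names are hypothetical.

```python
OPCODES = {"+": "ADD", "-": "SUB"}

def liveness_after(block, live_on_exit):
    """Backward scan: the set of names live just after each statement."""
    live, result = set(live_on_exit), [None] * len(block)
    for i in range(len(block) - 1, -1, -1):
        a, _op, b, c = block[i]
        result[i] = set(live)
        live.discard(a)
        live.update((b, c))
    return result

def generate(block, live_on_exit, num_regs=8):
    live_after = liveness_after(block, live_on_exit)
    reg_of = {}   # address descriptor: name -> register holding its current value
    holds = {}    # register descriptor: register -> name
    code = []

    def where(name):
        return reg_of.get(name, name)      # prefer the register copy over memory

    for i, (a, op, b, c) in enumerate(block):
        c_loc = where(c)                   # fix C's location before descriptors change
        if b in reg_of and (b == a or b not in live_after[i]):
            L = reg_of.pop(b)              # GETREG rule 1: reuse B's register
        else:
            free = [f"R{k}" for k in range(num_regs) if f"R{k}" not in holds]
            if not free:
                raise NotImplementedError("spilling (GETREG rules 3 and 4) is omitted")
            L = free[0]                    # GETREG rule 2: an empty register
            code.append(f"MOV {where(b)},{L}")
        code.append(f"{OPCODES[op]} {c_loc},{L}")
        if a in reg_of:                    # an older register copy of A is now stale
            del holds[reg_of[a]]
        reg_of[a], holds[L] = L, a         # step 3: A is now in L
        for name in (b, c):                # step 4: release registers of dead operands
            if name != a and name in reg_of and name not in live_after[i]:
                del holds[reg_of.pop(name)]

    for name, reg in reg_of.items():       # at block exit, store live values to memory
        if name in live_on_exit:
            code.append(f"MOV {reg},{name}")
    return code

block = [("a", "+", "b", "c"), ("d", "-", "d", "b"), ("e", "+", "a", "f")]
for line in generate(block, live_on_exit={"a", "c", "d", "e", "f"}):
    print(line)
```

    In this sketch a name enters a register only as the result of a computation, so every register value that is live at the end of the block is stored back to memory.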

  • 8/7/2019 lec09-Code generation

    21/36

    By applying the code generation algorithm to one of the basic blocks obtained above, we obtain the following code.

    Statements                  Code generated  Register descriptors  Address descriptors
                                                Registers empty       a:M; b:M; c:M; d:M; e:M; f:M
    1. a 1,3 := b 1,2 + c 1,0   MOV b,R0        R0:a                  a:R0; b:M; c:M; d:M; e:M; f:M
                                ADD c,R0
    2. d 1,0 := d 1,0 - b d,0   MOV d,R1        R0:a; R1:d            a:R0; b:M; c:M; d:R1; e:M; f:M
                                SUB b,R1
    3. e 1,0 := a 1,0 + f 1,0   MOV R0,R2       R0:a; R1:d; R2:e      a:R0; b:M; c:M; d:R1; e:R2; f:M
                                ADD f,R2
    a, c, d, e, f live          MOV R0,a
                                MOV R1,d
                                MOV R2,e

  • 8/7/2019 lec09-Code generation

    22/36

    Code optimization: A transformation to a program to make it run faster and/or take up less space.

    Optimization should be safe: it must preserve the meaning of a program.

    Example: peephole optimization, a simple technique to improve target code.

    Peephole: a small moving window on the target program.

    Technique: examine a short sequence of target instructions (the peephole) and try to replace it with a faster or shorter sequence.

  • 8/7/2019 lec09-Code generation

    23/36

    Peephole optimization:

    Redundant instruction elimination

    Flow of control optimization

    Algebraic simplifications

    Instruction selection

    Examples:

    Redundant loads and stores

    MOV R0, a
    MOV a, R0

    Unreachable code

    if debug = 1 goto L1
    goto L2
    L1: print debugging info
    L2:
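    The redundant load and store case can be sketched as a two-instruction peephole. This is an illustration under stated assumptions, not the lecture's code: instructions are strings of the form "MOV source, destination", only plain register and variable names are considered (indexed and indirect operands are left alone, since the first move could change the address the second one uses), and the caller passes the positions of instructions that are jump targets, which must be kept because control can reach them without executing the preceding move. The names parse and remove_redundant_moves are hypothetical.

```python
import re

SIMPLE = re.compile(r"^[A-Za-z_]\w*$")   # a plain register or variable name

def parse(instr):
    """Split 'OP src, dst' into (op, src, dst); None for any other shape."""
    pieces = instr.split(None, 1)
    if len(pieces) != 2:
        return None
    operands = [p.strip() for p in pieces[1].split(",")]
    if len(operands) != 2 or not all(SIMPLE.match(o) for o in operands):
        return None
    return (pieces[0], operands[0], operands[1])

def remove_redundant_moves(code, jump_targets=frozenset()):
    """Drop `MOV a,R` when it directly follows `MOV R,a` and is not a jump target."""
    out, prev = [], None
    for index, instr in enumerate(code):
        cur = parse(instr)
        redundant = (
            cur is not None and prev is not None
            and cur[0] == prev[0] == "MOV"
            and (cur[1], cur[2]) == (prev[2], prev[1])
            and index not in jump_targets
        )
        if not redundant:
            out.append(instr)
            prev = cur
    return out

print(remove_redundant_moves(["MOV R0, a", "MOV a, R0", "ADD b, R0"]))
```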

  • 8/7/2019 lec09-Code generation

    24/36

    Examples:

    Flow of control optimization:

    goto L1
    L1: goto L2

    =>

    goto L2
    L1: goto L2

    if a < b goto L1

    L1: goto L2

    if a

  • 8/7/2019 lec09-Code generation

    25/36

    Algebraic simplification:

    x := x+0 and x := x*1 are each equivalent to a nop.

    Reduction in strength

    x^2 => x * x

    X * 4 x
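    The identities x := x+0 and x := x*1 and the rewrite of x^2 as x * x can be sketched on three-address statements. The tuple layout (dst, left, op, right), the use of strings for constants, and the name simplify are assumptions made for this illustration only.

```python
def simplify(stmt):
    """stmt is (dst, left, op, right).
    Returns None when the statement is a no-op, otherwise a possibly cheaper statement."""
    dst, left, op, right = stmt
    # x := x+0 and x := x*1 leave x unchanged, so they can be dropped.
    if dst == left and (op, right) in {("+", "0"), ("*", "1")}:
        return None
    # Reduction in strength: a square becomes a single multiplication.
    if op == "^" and right == "2":
        return (dst, left, "*", left)
    return stmt

print(simplify(("x", "x", "+", "0")))   # None
print(simplify(("y", "x", "^", "2")))   # ('y', 'x', '*', 'x')
```

    A statement such as y := x+0 is deliberately kept, because it still copies x into y.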

  • 8/7/2019 lec09-Code generation

    26/36

    Code optimization can be either high level or low level:

    High level code optimizations:

    Loop unrolling, loop fusion, procedure inlining

    Low level code optimizations:

    Instruction selection, register allocation

    Some optimizations can be done at both levels:

    Common subexpression elimination, strength reduction, etc.

    The flow graph is a common intermediate representation for code optimization.

  • 8/7/2019 lec09-Code generation

    27/36

    Basic block: a sequence of consecutive statements with exactly 1 entry and 1 exit.

    Flow graph: a directed graph where the nodes are basic blocks and there is an edge B1 -> B2 if and only if B2 can be executed immediately after B1.

    Algorithm to construct flow graph:

    Finding leaders of the basic blocks:

    The first statement is a leader.

    Any statement that is the target of a conditional or unconditional goto is a leader.

    Any statement that immediately follows a goto or conditional goto statement is a leader.

    For each leader, its basic block consists of all statements up to the next leader.

    B1 -> B2 if and only if B2 can be executed immediately after B1.
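    Once the leaders are known, the edges of the flow graph can be read off the last statement of each block: a goto adds an edge to the block that starts at its target, and anything other than an unconditional goto also falls through to the next block. The sketch below assumes statements are strings numbered from 1 with jumps written as "goto N", and that the sorted list of leaders is given; the function name flow_graph is hypothetical.

```python
import re

def flow_graph(stmts, leaders):
    """stmts: statement texts numbered from 1; leaders: sorted leader numbers.
    Returns the set of edges (i, j) between 0-based block indices."""
    bounds = leaders + [len(stmts) + 1]
    block_of = {leader: k for k, leader in enumerate(leaders)}
    edges = set()
    for k in range(len(leaders)):
        last = stmts[bounds[k + 1] - 2]          # text of the block's last statement
        jump = re.search(r"\bgoto\s+(\d+)", last)
        if jump and int(jump.group(1)) in block_of:
            edges.add((k, block_of[int(jump.group(1))]))   # jump edge
        falls_through = not last.lstrip().startswith("goto")
        if falls_through and k + 1 < len(leaders):
            edges.add((k, k + 1))                # fall-through edge
    return edges

program = [
    "a := b+c", "d := d-b", "e := a+f", "if a>b goto 7",
    "f := a-d", "goto 10", "b := d+f", "e := a-c",
    "if b>c goto 15", "b := d+c", "if a>b goto 1",
]
print(sorted(flow_graph(program, [1, 5, 7, 10])))
```

    Jump targets outside the fragment, such as statement 15 in the earlier example, produce no edge in this sketch.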

  • 8/7/2019 lec09-Code generation

    28/36

    Example:

    100: sum = 0
    101: j = 0
    102: goto 107
    103: t1 = j

  • 8/7/2019 lec09-Code generation

    29/36

    Optimization within a basic block is called local optimization.

    Optimization across basic blocks is called global optimization.

    Some common optimizations:

    Instruction selection

    Register allocation

    Common subexpression elimination

    Code motion

    Strength reduction

    Induction variable elimination

    Dead code elimination

    Branch chaining

    Jump elimination

    Instruction scheduling

    Procedure inlining

    Loop unrolling

    Loop fusing

    Code hoisting

  • 8/7/2019 lec09-Code generation

    30/36

    Instruction selection: using a more efficient instruction to replace a sequence of instructions (space and speed).

    Example:

    Mov R2, (R3)
    Add R2, #1, R2
    Mov (R3), R2

    =>

    Add (R3), 1, (R3)

  • 8/7/2019 lec09-Code generation

    31/36

    Register allocation: allocate variables to registers (speed)

    Example:

    M[R13+sum] = 0
    M[R13+j] = 0
    GOTO L18
    L19:
    R0 = M[R13+j]

  • 8/7/2019 lec09-Code generation

    32/36

    Code motion: move a loop-invariant computation before the loop

    Example:

    R2 = 0
    R1 = 0
    GOTO L18
    L19:
    R0 = R1

  • 8/7/2019 lec09-Code generation

    33/36

    Strength reduction: replace expensive operations by equivalent cheaper operations

    Example:

    R2 = 0
    R1 = 0
    R4 = M[_n]
    GOTO L18
    L19:
    R0 = R1

  • 8/7/2019 lec09-Code generation

    34/36

    Induction variable elimination: the value of one variable can be derived from another variable.

    Example:

    R2 = 0
    R1 = 0
    R4 = M[_n]
    R3 = _a
    GOTO L18
    L19:
    R2 = R2+M[R3]
    R3 = R3 + 4
    R1 = R1+1
    L18:
    NZ = R1 - R4
    if NZ < 0 goto L19

    R2 = 0
    R4 = M[_n]

  • 8/7/2019 lec09-Code generation

    35/36

    Common subexpression elimination: when an expression was previously calculated and the variables in the expression have not changed, recomputing the expression can be avoided.

    Example:

    R1 = M[R13+I]

  • 8/7/2019 lec09-Code generation

    36/36

    The End