32bit power PC.ppt

  • 7/27/2019 32bit power PC.ppt

    1/22

    Comparison instructions

    Branch and jump instructions

    Simple Code Sequences

    2/22

    Where Are Branches Used?

    In C control statements:

    If statement
        if (n > 0) {
        } else {
        }

    While loop
        while (s != NULL) {
        }

    For loop
        for (i = 0; i < N; i++) {
        }

    Do loop
        do {
        } while (s != NULL);

    Others, e.g.
        max = (x > y) ? x : y;

    3/22

    Comparison Instructions

    Conditions are set up in CR or XER bits:
      Set by arithmetic/logic/shift instructions with the . suffix
      Set by comparison instructions

    Compare signed word and unsigned word
        cmpw   r3, r4     ; set CR0 as for signed r3-r4
        cmplw  r3, r4     ; set CR0 as for unsigned r3-r4
      cmplw: compare logical

    Compare using immediate values
        cmpwi  r3, 200    ; set CR0 as for signed r3-200
        cmplwi r3, 200    ; set CR0 as for unsigned r3-200
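The difference between the signed and the logical compare can be sketched in C. This is a minimal model, not the hardware definition: the function names and the bit masks are hypothetical, the SO bit (copied from XER on real hardware) is left out, and the cast to `int32_t` assumes the usual two's-complement conversion.

```c
#include <stdint.h>

/* One 4-bit CR field, in the order LT GT EQ SO (masks chosen for this sketch). */
enum { CR_LT = 8, CR_GT = 4, CR_EQ = 2, CR_SO = 1 };

/* Model of cmpw: operands are compared as signed 32-bit words. */
static unsigned cmpw_model(uint32_t ra, uint32_t rb)
{
    int32_t a = (int32_t)ra;
    int32_t b = (int32_t)rb;
    return a < b ? CR_LT : a > b ? CR_GT : CR_EQ;
}

/* Model of cmplw: operands are compared as unsigned 32-bit words. */
static unsigned cmplw_model(uint32_t ra, uint32_t rb)
{
    return ra < rb ? CR_LT : ra > rb ? CR_GT : CR_EQ;
}
```

With r3 = 0xFFFFFFFF and r4 = 1, the signed model reports LT (r3 is -1) while the logical model reports GT (r3 is the largest unsigned word).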

    4/22

    Comparison Instructions

    Compare and set specific condition registers

    Comparison may specify which CR field to use
        cmpw   cr3, r3, r4     ; set CR3 instead of CR0
        cmplwi cr2, r3, 200    ; logical and using immediate; and set CR2
        cmpw   cr0, r3, r4     ; equivalent to cmpw r3, r4

    CR layout: eight 4-bit fields
        CR0 CR1 CR2 CR3 CR4 CR5 CR6 CR7
    Each field holds the bits
        LT GT EQ SO

    5/22

    Branch Basic Terms

    Branch condition, branch target

    Unconditional branches: always jump to the target address

    Conditional branches: take the branch only if some condition holds

    Target address: determining the address of the next instruction

    6/22

    Unconditional Branches

    C                     Assembly
    while (1) {           loop: addi r9, r9, 1
        X = X + 1;              b    loop        ; offset (-4)
    }

    The target loop is specified as an offset from the current
    instruction (PC-relative).
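To see how the offset of -4 ends up in the instruction word, here is a small sketch of an encoder for the unconditional branch. The helper name `encode_b` is made up for illustration; the field layout (opcode 18, a 24-bit offset field, then the AA and LK bits) follows the instruction format shown later in the deck.

```c
#include <stdint.h>

/* Encode b/ba/bl/bla for a byte offset (or absolute address when aa = 1).
 * The low two bits of the offset are always zero and are not stored;
 * their place in the word is taken by AA and LK. */
static uint32_t encode_b(int32_t offset, unsigned aa, unsigned lk)
{
    uint32_t li = (uint32_t)offset & 0x03FFFFFCu;   /* 24-bit field, already in place */
    return (18u << 26) | li | ((aa & 1u) << 1) | (lk & 1u);
}
```

Under these assumptions `encode_b(-4, 0, 0)` gives 0x4BFFFFFC for the `b loop` above, and `encode_b(8, 0, 0)` gives 0x48000008, which is the word listed for `b *+8` in the disassembly slide.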

    7/22

    Conditional Branches

    Commonly used branches

    Use condition register CR0: LT, GT, EQ, SO

    Common forms: ble target_address
      ble: branch if less than or equal          GT=0
      blt: branch if less than                   LT=1
      beq: branch if equal                       EQ=1
      bne: branch if not equal                   EQ=0
      bge: branch if greater than or equal to    LT=0
      bgt: branch if greater than                GT=1

    All encoded in the same instruction format (see next)
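Each mnemonic in the list tests a single bit of the CR field, either for 1 or for 0. A compact way to express that in C, with hypothetical predicate names and bit masks picked for this sketch:

```c
#include <stdbool.h>

/* One 4-bit CR field, in the order LT GT EQ SO (masks chosen for this sketch). */
enum { CR_LT = 8, CR_GT = 4, CR_EQ = 2, CR_SO = 1 };

static bool blt_taken(unsigned cr) { return (cr & CR_LT) != 0; }  /* LT=1 */
static bool bgt_taken(unsigned cr) { return (cr & CR_GT) != 0; }  /* GT=1 */
static bool beq_taken(unsigned cr) { return (cr & CR_EQ) != 0; }  /* EQ=1 */
static bool bge_taken(unsigned cr) { return (cr & CR_LT) == 0; }  /* LT=0 */
static bool ble_taken(unsigned cr) { return (cr & CR_GT) == 0; }  /* GT=0 */
static bool bne_taken(unsigned cr) { return (cr & CR_EQ) == 0; }  /* EQ=0 */
```

Note that ble does not check LT or EQ directly: since a compare sets exactly one of LT, GT, EQ, "GT is 0" already means "less than or equal".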

    8/22

    Conditional Branches

    Using CR fields
        bne cr2, target        ; branch if EQ of CR2 is zero

    Example: using branch with comparison instructions
    loop:
        addi r3, r3, 1         ; increase r3
        cmpw r3, r4            ; compare with r4
        bne  target            ; branch if r3 != r4

    Example: using different CR field
    loop:
        addi r3, r3, 1         ; increase r3
        cmpw cr3, r3, r4       ; compare using cr3
        bne  cr3, target       ; branch if r3 != r4

    9/22

    Determining Target Address

    1. PC-relative: next PC = PC + EXTS(PC-Offset || 0b00)
    2. Absolute: next PC = EXTS(PC-Offset || 0b00)
    3. Register: next PC = value of register
       Can use two special registers: LR or CTR

    Why sign-extension of an address (for absolute)?

    Are addresses ever negative?

    Upper address space is usually reserved for I/O
    addresses (say 0xff000000 onwards).

    0xff00 gets sign-extended to 0xffffff00.
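The two formulas can be checked with a short C sketch. `exts` and `next_pc` are hypothetical helper names; `field` stands for the raw PC-Offset field (24 bits for b, 14 bits for bc) before the two zero bits are appended.

```c
#include <stdint.h>

/* Sign-extend the low `bits` bits of v to 32 bits. */
static uint32_t exts(uint32_t v, unsigned bits)
{
    uint32_t sign = 1u << (bits - 1);
    v &= (sign << 1) - 1u;          /* keep only the low `bits` bits */
    return (v ^ sign) - sign;       /* propagate the sign bit upward */
}

/* next PC = (AA ? 0 : PC) + EXTS(field || 0b00) */
static uint32_t next_pc(uint32_t pc, uint32_t field, unsigned field_bits, int aa)
{
    uint32_t disp = exts(field << 2, field_bits + 2);
    return aa ? disp : pc + disp;
}
```

For example, a 14-bit field of 0x3FC0 becomes 0xff00 once the two zero bits are appended, and `next_pc(0, 0x3FC0, 14, 1)` yields 0xffffff00: an absolute branch with a "negative" field lands in the upper address space.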

    10/22

    Determining Target Address

    Use PC-relative or absolute addressing: a suffix
      Use PC-relative address: b loop
      Use absolute address: ba loop

    Update LR option: l suffix
      If updating, save PC+4 into LR
      Do not update LR: b target_addr
      Update LR: bl func_addr
      Update LR and use absolute address: bla func_addr

    When do we want to save PC+4?

    11/22

    Underlying Details

    bx: encodes 24-bit address (26-bit effective)
    bcx: encodes 14-bit address (16-bit effective)
    bclrx: uses LR register as target address
    bcctrx: uses CTR register as target address
    x: represents the AA and LK bits, e.g. l, a, la

    Instruction format (bit positions in parentheses)

    bx       18 (0-5)   PC-Offset (6-29)                        AA (30)   LK (31)
    bcx      16 (0-5)   BO (6-10)   BI (11-15)   BD (16-29)     AA (30)   LK (31)
    bclrx    19 (0-5)   BO (6-10)   BI (11-15)   00000   16     LK (31)
    bcctrx   19 (0-5)   BO (6-10)   BI (11-15)   00000   528    LK (31)

    12/22

    Underlying Details

    BO: Branch options
      Encodes branching on TRUE or FALSE or on CTR values

    BI: Index of the CR bit to use
      Five-bit index into the 32 CR bits: 3 bits for the CR field
      index, 2 bits to select LT, GT, EQ, or SO

    BD: Branch displacement
      14-bit (16-bit effective), sign-extended

    AA: absolute address bit
      1: use absolute addressing; 0: use PC-relative addressing

    LK: link bit
      1: update LR with PC+4; 0: do not update

    bcx      16   BO   BI   BD   AA   LK

    Instruction Fields
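As a rough illustration of how these fields pack into a 32-bit word, the sketch below assembles a bcx instruction from its parts. `encode_bc` is a hypothetical helper; it takes the displacement in bytes and, like the hardware format, stores only bits 2 to 15 of it.

```c
#include <stdint.h>

/* Pack opcode 16, BO, BI, BD, AA and LK into one instruction word.
 * Bit 0 is the most significant bit in PowerPC numbering, so the opcode
 * occupies the top 6 bits and LK is the least significant bit. */
static uint32_t encode_bc(unsigned bo, unsigned bi, int32_t bd_bytes,
                          unsigned aa, unsigned lk)
{
    return (16u << 26)
         | ((bo & 31u) << 21)
         | ((bi & 31u) << 16)
         | ((uint32_t)bd_bytes & 0xFFFCu)   /* 14-bit BD, low 2 bits implied zero */
         | ((aa & 1u) << 1)
         | (lk & 1u);
}
```

With BO=4, BI=1 and a displacement of +12 bytes this produces 0x4081000C, the word that the disassembly slide lists for `ble *+12`.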

    13/22

    Underlying Details

    Frequently used BO encodings in bc, bclr, and bcctr
      BO=00100 (4):  branch if the condition is false
      BO=01100 (12): branch if the condition is true
      BO=10100 (20): branch always
      BO=10000 (16): decrement CTR, then branch if CTR != 0

    Examples:
      blt target_addr         is   bc 12, 0, target_addr
      blt cr3, target_addr    is   bc 12, 12, target_addr
      blr                     is   bclr 20, 0 (unconditional branch to the address in LR)
      bnelr                   is   bclr 4, 2 (branch to LR if not equal)

    Explanation: bc 4, 14, target_addr branches if bit 14 in
    CR (CR3[EQ]) is false (because BO=4), i.e. bne cr3, target_addr

    BO and BI Fields
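The BI arithmetic and the four BO encodings listed here can be modelled in a few lines of C. All names are hypothetical, and the BO logic is a sketch of the general rule (one bit says "ignore the condition", one gives the condition value to match, one says "do not touch CTR", one selects CTR == 0 instead of CTR != 0) rather than a full description of every encoding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Position of each bit inside a 4-bit CR field. */
enum { BIT_LT = 0, BIT_GT = 1, BIT_EQ = 2, BIT_SO = 3 };

/* BI = 4 * CR field number + bit position within the field. */
static unsigned bi_index(unsigned crf, unsigned bit) { return 4u * crf + bit; }
static unsigned bi_field(unsigned bi) { return bi >> 2; }
static unsigned bi_bit(unsigned bi)   { return bi & 3u; }

/* Decide whether a bc/bclr/bcctr with this BO is taken.
 * cr_bit is the value of the CR bit selected by BI; *ctr models CTR. */
static bool bo_taken(unsigned bo, bool cr_bit, uint32_t *ctr)
{
    if (!(bo & 4u))
        --*ctr;                                        /* decrement CTR first */
    bool ctr_ok  = (bo & 4u)  || ((*ctr != 0) != ((bo & 2u) != 0));
    bool cond_ok = (bo & 16u) || (cr_bit == ((bo & 8u) != 0));
    return ctr_ok && cond_ok;
}
```

For instance `bi_index(3, BIT_EQ)` is 14, matching the `bc 4, 14, target_addr` explanation, and `bi_index(3, BIT_LT)` is 12, matching `blt cr3, target_addr`.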

    14/22

    Underlying Details

    Branch examples using AA and LK bits (zeros by default)

        bl  target_addr    ; branch and save PC+4 in LR
        ba  target_addr    ; branch using absolute addressing
        bla target_addr    ; branch using absolute addressing
                           ; and save PC+4 in LR

    AA and LK fields (bit positions in parentheses)

    bx       18 (0-5)   Offset (6-29)                           AA (30)   LK (31)
    bcx      16 (0-5)   BO (6-10)   BI (11-15)   BD (16-29)     AA (30)   LK (31)
    bclrx    19 (0-5)   BO (6-10)   BI (11-15)   00000   16     LK (31)
    bcctrx   19 (0-5)   BO (6-10)   BI (11-15)   00000   528    LK (31)

    15/22

    Support Procedure Call/Return

    Link Register

    Supporting function calls

    1. A parent function calls a child function: bl child_func
       (the return address PC+4 is saved in LR)

    16/22

    Simple Code Sequences

    How to translate:

    C arithmetic expressions

    C if statement

    C for loops

    Function calls (next week)

    17/22

    C Arithmetic Expressions

    Basic operations
        static int sum;
        static int x1, x2;
        static int y1, y2;
        sum = (x1+x2)-(y1+y2)+100;

    Assembly
        lwz  r3, 4(r13)     ; load x1
        lwz  r0, 8(r13)     ; load x2
        add  r4, r3, r0     ; x1+x2
        lwz  r3, 12(r13)    ; load y1
        lwz  r0, 16(r13)    ; load y2
        add  r0, r3, r0     ; y1+y2
        subf r3, r0, r4     ; (x1+x2)-(y1+y2)
        addi r0, r3, 100    ; add 100
        stw  r0, 0(r13)     ; store sum

    Q: What would happen if signed is changed to unsigned?
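One way to explore the question is to compute the expression both ways in C and compare the resulting bit patterns. The two function names are hypothetical, and the signed version is only meaningful for inputs that do not overflow `int32_t`.

```c
#include <stdint.h>

/* The slide's expression on unsigned 32-bit words (wraps modulo 2^32). */
static uint32_t sum_unsigned(uint32_t x1, uint32_t x2, uint32_t y1, uint32_t y2)
{
    return (x1 + x2) - (y1 + y2) + 100u;
}

/* The same expression on signed 32-bit words (assumes no overflow). */
static int32_t sum_signed(int32_t x1, int32_t x2, int32_t y1, int32_t y2)
{
    return (x1 + x2) - (y1 + y2) + 100;
}
```

For word-sized operands the two results have the same 32-bit pattern, which is why add, subf and addi need no separate signed and unsigned forms; the distinction only shows up in comparisons and in loads of narrower types.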

    18/22

    C Arithmetic Expressions

    Sign extension

    static short sum;

    static short x1, x2;

    static short y1, y2;

    sum = (x1+x2)-(y1+y2) + 100;

    Assembly

    lha r3, 2(r13) ; load x1

    lha r0, 4(r13) ; load x2

    add r4, r3, r0 ; x1+x2

    lha r3, 6(r13) ; load y1

    lha r0, 8(r13) ; load y2

    add r0, r3, r0 ; y1+y2

    subf r3, r0, r4 ; minus

    addi r0, r3, 100 ; add 100

    sth r0, 0(r13) ; store sum
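The reason for lha (load halfword algebraic) rather than lhz (load halfword and zero) can be shown with a small C model. The function names are hypothetical; each one returns the 32-bit register value that would result from loading or storing the given halfword.

```c
#include <stdint.h>

/* lha: the 16-bit value is sign-extended into the 32-bit register. */
static uint32_t lha_model(uint16_t half)
{
    return (half ^ 0x8000u) - 0x8000u;
}

/* lhz: the 16-bit value is zero-extended into the 32-bit register. */
static uint32_t lhz_model(uint16_t half)
{
    return half;
}

/* sth: only the low 16 bits of the register are stored. */
static uint16_t sth_model(uint32_t reg)
{
    return (uint16_t)reg;
}
```

A short holding -2 is stored as 0xFFFE; lha turns it into 0xFFFFFFFE (still -2 as a word), whereas lhz would give 0x0000FFFE (65534), so the following add and subf would work on the wrong value.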

    19/22

    If-then-else

    C Program
        if (x > y)
            z = 1;
        else
            z = 0;

    Assembly
               cmpw r3, r4
               ble  skip1
               li   r31, 1
               b    skip2
        skip1: li   r31, 0
        skip2:

    Notes:

    Code generated by CodeWarrior and then revised

    x in r3; y in r4; z in r31

    li r31, 1 => addi r31, 0, 1; li is called a simplified mnemonic
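The li to addi expansion can be made concrete with a tiny encoder sketch. `encode_addi` and `encode_li` are hypothetical helper names; the layout assumed is opcode 14, then the target register, the source register, and a 16-bit signed immediate. In addi, a source register field of 0 means the constant 0 rather than the contents of r0, which is what makes li possible.

```c
#include <stdint.h>

/* addi rt, ra, simm: opcode 14 | RT | RA | 16-bit signed immediate. */
static uint32_t encode_addi(unsigned rt, unsigned ra, int16_t simm)
{
    return (14u << 26) | ((rt & 31u) << 21) | ((ra & 31u) << 16)
         | (uint16_t)simm;
}

/* li rt, value is the simplified mnemonic for addi rt, 0, value. */
static uint32_t encode_li(unsigned rt, int16_t value)
{
    return encode_addi(rt, 0, value);
}
```

Under these assumptions `encode_li(31, 1)` gives 0x3BE00001, the word that the disassembly listing shows for `li r31,1`.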

    20/22

    If-then-else

    C Program
        static int x, y;
        static int max;
        if (x - y > 0)
            max = x;
        else
            max = y;

    Assembly
               lwz   r4, 4(r13)    ; load y
               lwz   r0, 0(r13)    ; load x
               subf  r0, r4, r0    ; x-y
               cmpwi r0, 0x0000    ; x-y>0?
               ble   skip1         ; no, skip max=x
               lwz   r0, 0(r13)    ; load x
               stw   r0, 8(r13)    ; max=x
               b     skip2         ; skip max=y
        skip1: lwz   r0, 4(r13)    ; load y
               stw   r0, 8(r13)    ; max=y
        skip2:

    Notes:

    Generated by CodeWarrior and then revised

    Can you optimize the code? i.e. reduce the number of
    instructions but produce the same output

    21/22

    If-then-else

    Disassembled code:

    Address     Binary      Assembly
    00000048:   7C001800    cmpw r0,r3
    0000004C:   4081000C    ble  *+12
    00000050:   3BE00001    li   r31,1
    00000054:   48000008    b    *+8
    00000058:   3BE00000    li   r31,0
    0000005C:

    Assembly Source:
               cmpw r0, r3
               ble  skip1
               li   r31, 1
               b    skip2
        skip1: li   r31, 0
        skip2:

    Binary code
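The binary column can be cross-checked against the labels by decoding the two branch words by hand or with a few lines of C. This is a sketch with hypothetical helper names; it only understands opcode 18 (b) and opcode 16 (bc) and treats anything else as a bc.

```c
#include <stdint.h>

/* Sign-extend the low `bits` bits of v to 32 bits. */
static uint32_t sext(uint32_t v, unsigned bits)
{
    uint32_t sign = 1u << (bits - 1);
    return ((v & ((sign << 1) - 1u)) ^ sign) - sign;
}

static unsigned bo_of(uint32_t word) { return (word >> 21) & 31u; }
static unsigned bi_of(uint32_t word) { return (word >> 16) & 31u; }

/* Target of the branch `word` located at address `addr`. */
static uint32_t branch_target(uint32_t addr, uint32_t word)
{
    uint32_t disp;
    if ((word >> 26) == 18u)
        disp = sext(word & 0x03FFFFFCu, 26);   /* b: 24-bit field || 00 */
    else
        disp = sext(word & 0x0000FFFCu, 16);   /* bc: 14-bit field || 00 */
    return (word & 2u) ? disp : addr + disp;   /* AA bit selects absolute */
}
```

Decoding 4081000C at address 4C gives BO=4, BI=1 (branch if CR0[GT] is 0, i.e. ble) with target 58, the address of skip1; decoding 48000008 at address 54 gives target 5C, the address of skip2.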

    22/22

    For loop

    C code
        static int sum;
        static int X[100];
        int i;

        sum = 0;
        for (i = 0; i < 100; i++)
            sum += X[i];

    Assembly
               li    r0, 0           ; sum = 0
               stw   r0, 0(r13)      ; sum = 0
               li    r31, 0          ; i in r31
               b     cmp_
        loop:  slwi  r4, r31, 2      ; r4 = i*4
               lis   r3, X@h         ; load X address
               ori   r3, r3, X@l     ; load X address
               add   r3, r3, r4      ; X[i] address
               lwz   r4, 0(r3)       ; load X[i]
               lwz   r0, 0(r13)      ; load sum
               add   r0, r0, r4      ; sum += X[i]
               stw   r0, 0(r13)      ; store sum
               addi  r31, r31, 1     ; increase i
        cmp_:  cmpwi r31, 0x0064     ; 0x64 = 100
               blt   loop

    (generated by CodeWarrior and then revised)

    Exercise: (1) How many instructions will be executed? (2) Optimize the code
    to reduce the loop body to 4 instructions; (3) further reduce the loop body to 3
    instructions. Loop body includes the branch instruction.
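For part (1) of the exercise, one way to check a hand count is to mirror the control flow of the assembly in C and add up the instructions along the path. The sketch below is only that; the function name is hypothetical, and the constants 4, 2 and 9 are the number of instructions before the loop, in the cmp_ block, and between loop: and cmp_ in the listing above.

```c
/* Count instructions executed by the listing for an array of n elements. */
static unsigned long count_executed(int n)
{
    unsigned long count = 4;      /* li, stw, li, b cmp_ */
    int i = 0;
    for (;;) {
        count += 2;               /* cmp_: cmpwi, blt */
        if (!(i < n))
            break;                /* blt not taken: fall through to the end */
        count += 9;               /* loop: slwi ... addi */
        i++;
    }
    return count;
}
```

Calling `count_executed(100)` reproduces the count for the slide's loop; note that the compare and branch pair runs once more than the loop body because the code enters at cmp_.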