3.Computer.arithmetic

8/8/2019 3.Computer.arithmetic


Computer Arithmetic

Electrical and Computer Engineering Department


Number Representation

Binary numbers (base 2) - integers

0000 -> 0001 -> 0010 -> 0011 -> 0100 -> 0101 -> 0110 -> 0111 -> 1000 -> 1001 -> ...

In decimal, from 0 to 2^n - 1 for n bits.

MIPS represents numbers as 32-bit constants. 0 ten in MIPS is
0000 0000 0000 0000 0000 0000 0000 0000 two
(bit 31, the most significant bit, is on the left; bit 0 is on the right)

Thus the largest (unsigned) number is
1111 1111 1111 1111 1111 1111 1111 1111 two, or 4,294,967,295 ten = 2^32 - 1


Number Representation

But numbers can also be signed: if the sign bit is 0 the number is non-negative; if it is 1 the number is negative. Thus the signed number
1111 1111 1111 1111 1111 1111 1111 1111 two is -1 ten

Bits are just bits (they have no inherent meaning); conventions define the relationship between bits and numbers.

With 32 bits the range of signed integers runs from -2^31 (-2,147,483,648), which is
1000 0000 0000 0000 0000 0000 0000 0000 two,
to 2^31 - 1 (2,147,483,647).

Computers use two's complement, in which the signed number X is
\[ X = (x_{31} \times -2^{31}) + (x_{30} \times 2^{30}) + \cdots + (x_1 \times 2^1) + (x_0 \times 2^0) \]
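The weighted-sum formula can be checked with a short Python sketch. The function name twos_complement_value is illustrative and not part of the slides; the two bit patterns are the ones shown above.

```python
def twos_complement_value(bits: str) -> int:
    """Evaluate an n-bit two's complement pattern with the weighted-sum formula."""
    digits = [int(b) for b in bits.replace(" ", "")]
    n = len(digits)
    # The most significant bit carries weight -2^(n-1); all other weights are positive.
    total = digits[0] * -(2 ** (n - 1))
    for i, d in enumerate(digits[1:], start=1):
        total += d * 2 ** (n - 1 - i)
    return total

all_ones = "1111 1111 1111 1111 1111 1111 1111 1111"
most_negative = "1000 0000 0000 0000 0000 0000 0000 0000"
print(twos_complement_value(all_ones))       # -1
print(twos_complement_value(most_negative))  # -2147483648
```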


MIPS Representations

32-bit signed numbers (two's complement):

0000 0000 0000 0000 0000 0000 0000 0011 two = 3 ten
1111 1111 1111 1111 1111 1111 1111 1101 two = -3 ten
0000 0000 0000 0000 0000 0000 0000 0000 two = 3 ten - 3 ten = 0 (the sum of the two patterns above)

To get the two's complement, all 0s become 1s and all 1s become 0s, then 1 is added.

Converting n-bit numbers into numbers with more than n bits:

The MIPS 16-bit immediate gets converted to 32 bits for arithmetic.
Copy the most significant bit (the sign bit) into the other bits:
0010 -> 0000 0010
1010 -> 1111 1010
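Both rules above (invert and add 1; copy the sign bit) can be sketched in Python on plain integers used as bit patterns. The helper names negate32 and sign_extend are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF

def negate32(x: int) -> int:
    """Two's complement negation of a 32-bit pattern: invert every bit, then add 1."""
    return ((x ^ MASK32) + 1) & MASK32

def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Copy the sign bit of a from_bits-wide pattern into the new upper bits."""
    sign = (value >> (from_bits - 1)) & 1
    if sign:
        upper = ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
        value |= upper
    return value

print(f"{negate32(3):032b}")                # pattern for -3: thirty 1s followed by 01
print(f"{sign_extend(0b0010, 4, 8):08b}")   # 00000010
print(f"{sign_extend(0b1010, 4, 8):08b}")   # 11111010
```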


MIPS Representations

Exercise: what is the decimal equivalent of the two's complement number
1111 1111 1111 1111 1111 1110 0000 1100 two ?

The sign bit is 1, so we negate first (invert the bits, then add 1):
0000 0000 0000 0000 0000 0001 1111 0011 two + 1 =
0000 0000 0000 0000 0000 0001 1111 0100 two

Then we apply the formula
\[ (x_{31} \times -2^{31}) + (x_{30} \times 2^{30}) + \cdots + (x_1 \times 2^1) + (x_0 \times 2^0) \]
to the negated value:

1x2^8 + 1x2^7 + 1x2^6 + 1x2^5 + 1x2^4 + 1x2^2 = 256 + 128 + 64 + 32 + 16 + 4 = 500

so the original number is -500 ten.


Number Representation

Of course, it gets more complicated:
- storage locations (e.g., register file words) are finite, so we have to worry about overflow (i.e., when the number is too big to fit into 32 bits)
- we have to be able to represent negative numbers, e.g., how do we specify -8 in
  addi $sp, $sp, -8 # $sp = $sp - 8
- real systems have to provide for more than just integers, e.g., fractions and real numbers (floating point)


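More instructions

Since MIPS uses signed arithmetic, we need unsigned versions of operations such as slt: sltu, and for the immediate form slti, sltiu.

[Figure: a signed word (1 sign bit, 31 magnitude bits) and the instruction formats. slt: 0 | rs | rt | rd | 0 | 42. slti: 10 | rs | rt | constant. sltiu: 11 | rs | rt | constant.]

The results of slt and sltu when comparing registers differ when exactly one of the operands has its most significant bit set to 1.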
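A small Python sketch of the difference, treating registers as 32-bit patterns. The function names mirror the MIPS mnemonics, but the code models only the comparison, not the instruction encoding.

```python
def to_signed32(x: int) -> int:
    """Reinterpret a 32-bit pattern as a two's complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x

def slt(a: int, b: int) -> int:
    """Signed comparison: 1 if a < b when both are read as two's complement."""
    return 1 if to_signed32(a) < to_signed32(b) else 0

def sltu(a: int, b: int) -> int:
    """Unsigned comparison: 1 if a < b when both are read as unsigned."""
    return 1 if a < b else 0

a, b = 0xFFFFFFFF, 0x00000001     # a is -1 signed, 4,294,967,295 unsigned
print(slt(a, b), sltu(a, b))      # 1 0
```

When both operands have the same most significant bit, the two comparisons agree.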


MIPS Arithmetic and Logic Instructions

R-type: op | Rs | Rt | Rd | funct   (bit boundaries marked at 31, 25, 20, 15, 5, 0)
I-type: op | Rs | Rt | Immed 16

INST    op      funct
ADDI    001000  xx
addiu   001001  xx
SLTI    001010  xx
sltiu   001011  xx
ANDI    001100  xx
ORI     001101  xx
XORI    001110  xx
LUI     001111  xx

INST    op      funct
ADD     000000  100000
addu    000000  100001
SUB     000000  100010
subu    000000  100011
AND     000000  100100
OR      000000  100101
XOR     000000  100110
NOR     000000  100111

INST    op      funct
-       000000  101000
-       000000  101001
SLT     000000  101010
sltu    000000  101011
-       000000  101100


Addition and Subtraction

Addition is done by carrying 1s to the left:

  0000100 (4 ten)
+ 0000100 (4 ten)
  0001000 (8 ten)

Subtraction is like addition with the two's complement of the second number:

  0000100 (4 ten)         0000100 (4 ten)
- 0000011 (3 ten)   ->  + 1111101 (-3 ten)
                          0000001 (1 ten)


Overflow

- Adding two numbers of different sign does not yield an overflow.
- Subtracting operands of the same sign does not yield an overflow.
- Overflow occurs when the number of available bits is not sufficient: addition causes a carry out of the MSB and subtraction causes a borrow into the MSB.

Examples of overflow:

[Table not recovered; only the operation label A + B survives]
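One common software check for signed overflow, consistent with the first two rules above, compares the signs of the operands and of the result. This is a sketch with hypothetical function names, not the hardware detection circuit from the slides.

```python
MASK32 = 0xFFFFFFFF

def sign32(x: int) -> int:
    """Sign bit (bit 31) of a 32-bit pattern."""
    return (x >> 31) & 1

def add_overflows(a: int, b: int) -> bool:
    """A + B overflows only if A and B share a sign and the sum's sign differs."""
    s = (a + b) & MASK32
    return sign32(a) == sign32(b) and sign32(s) != sign32(a)

def sub_overflows(a: int, b: int) -> bool:
    """A - B overflows only if A and B differ in sign and the result's sign differs from A's."""
    d = (a - b) & MASK32
    return sign32(a) != sign32(b) and sign32(d) != sign32(a)

print(add_overflows(0x7FFFFFFF, 0x00000001))  # True: largest positive + 1
print(add_overflows(0x7FFFFFFF, 0xFFFFFFFF))  # False: operands of different sign
print(sub_overflows(0x80000000, 0x00000001))  # True: most negative - 1
```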


MIPS Instructions

Category                     Instr                      Op Code   Example              Meaning
Arithmetic (R & I format)    add unsigned               0 and 33  addu $s1, $s2, $s3   $s1 = $s2 + $s3
                             subt unsigned              0 and 35  subu $s1, $s2, $s3   $s1 = $s2 - $s3
                             add imm. unsigned          9         addiu $s1, $s2, 6    $s1 = $s2 + 6
Data Transfer                load byte unsigned         36        lbu $s1, 25($s2)     $s1 = Memory($s2+25)
Cond. Branch (I & R format)  set on less than unsigned  0 and 43  sltu $s1, $s2, $s3   if ($s2


Overflow Detection and Effects

Some instructions (add, addi, sub) detect overflow and cause an interrupt (exception). When an overflow interrupt occurs, control jumps to a predefined memory address for exceptions.

The address of the instruction causing the overflow is saved for possible resumption. The offending address is stored in the exception program counter $epc (not part of the register file).

To debug the program, the contents of this register can be moved to a general-purpose register with move from system control (an R-type operation):

mfc0 $k1, $epc # $k1 = $epc

$k0 and $k1 are OS-reserved registers in the register file.

After debugging and restoring registers, control can return with a jump to the OS-reserved register:

jr $k1


Examples

Find the shortest sequence of instructions to determine if there is a carry-out from the addition of $t3 and $t4. Place 0 in $t2 if the carry-out is 0 and 1 in $t2 if the carry-out is 1.

addu $t2, $t3, $t4
sltu $t2, $t2, $t4

or

addu $t2, $t3, $t4
sltu $t2, $t2, $t3
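Why the comparison works: if the 32-bit sum wraps around, it ends up smaller than either operand. A Python model of the two-instruction sequence (the function name carry_out is illustrative) can be checked against the true carry bit:

```python
MASK32 = 0xFFFFFFFF

def carry_out(t3: int, t4: int) -> int:
    """Model of: addu $t2, $t3, $t4 ; sltu $t2, $t2, $t4."""
    t2 = (t3 + t4) & MASK32       # addu: 32-bit sum, carry discarded
    return 1 if t2 < t4 else 0    # sltu: the wrapped sum is smaller than an operand

print(carry_out(0xFFFFFFFF, 0x00000001))  # 1
print(carry_out(0x7FFFFFFF, 0x00000001))  # 0
```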

Find the shortest sequence of instructions to perform double precision (64-bit) integer addition. Assume one two's complement integer is in registers $t4 and $t5 and another in registers $t6 and $t7. The sum is to be placed in registers $t2 and $t3. The most significant word is in the even-numbered registers.


Example - continued

If no overflow detection is required:

addu $t3, $t5, $t7 # least significant 32 bits
sltu $t2, $t3, $t5 # $t2 holds the carry-out (0 or 1)
addu $t2, $t2, $t4
addu $t2, $t2, $t6

If overflow detection is needed:

addu $t3, $t5, $t7 # least significant 32 bits
sltu $t2, $t3, $t5 # $t2 holds the carry-out (0 or 1)
add $t2, $t2, $t4
add $t2, $t2, $t6 (or add $t2, $t4, $t6)
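The sequence without overflow detection can be modelled in Python and compared with ordinary 64-bit addition. This is a sketch: the function add64 and its argument order (high word, low word of each operand) are assumptions for illustration.

```python
MASK32 = 0xFFFFFFFF

def add64(t4: int, t5: int, t6: int, t7: int):
    """Add ($t4:$t5) + ($t6:$t7); returns (high word $t2, low word $t3)."""
    t3 = (t5 + t7) & MASK32        # addu $t3, $t5, $t7  (low words)
    t2 = 1 if t3 < t5 else 0       # sltu $t2, $t3, $t5  (carry out of the low words)
    t2 = (t2 + t4) & MASK32        # addu $t2, $t2, $t4
    t2 = (t2 + t6) & MASK32        # addu $t2, $t2, $t6
    return t2, t3

print(add64(0, 0xFFFFFFFF, 0, 1))  # (1, 0): the carry ripples into the high word
```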


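Arithmetic Logic Unit (ALU)

The device that performs the computer arithmetic and logic operations.

It has four building blocks.

[Figure: ALU symbol with inputs a and b, an operation select input, and a result such as a+b; some building blocks are for logic operations only]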


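1-bit Binary Adder

\[ \text{Sum} = \bar{a}\,\bar{b}\,c + \bar{a}\,b\,\bar{c} + a\,\bar{b}\,\bar{c} + a\,b\,c \qquad \text{carry\_out} = ab + ac + bc \]
(Sum is a sum of minterms; c stands for carry_in)

a  b  carry_in  carry_out  Sum
0  0  0         0          0
0  0  1         0          1
0  1  0         0          1
0  1  1         1          0
1  0  0         0          1
1  0  1         1          0
1  1  0         1          0
1  1  1         1          1

[Figure: 1-bit adder block with inputs a, b, carry_in and outputs Sum, carry_out]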
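The two Boolean equations can be checked against the truth table by brute force. A minimal Python sketch (full_adder is an illustrative name) confirms that 2 x carry_out + Sum equals a + b + carry_in for all eight input rows:

```python
from itertools import product

def full_adder(a: int, b: int, c: int):
    """Return (carry_out, sum) from the sum-of-minterms and majority equations."""
    na, nb, nc = 1 - a, 1 - b, 1 - c                      # complemented inputs
    s = (na & nb & c) | (na & b & nc) | (a & nb & nc) | (a & b & c)
    carry = (a & b) | (a & c) | (b & c)
    return carry, s

for a, b, c in product((0, 1), repeat=3):
    carry, s = full_adder(a, b, c)
    print(a, b, c, carry, s)
```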


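1-bit Binary Adder

carry_out = a·b + a·CarryIn + b·CarryIn

is implemented by the gates

[Figure: carry-out gate network with inputs CarryIn, a, b]

The 1-bit ALU with the operations a & b, a | b, a + b and 0 (selected by the operation input) is implemented with the gates

[Figure: 1-bit ALU with operation select]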


Modifying the ALU Cell for slt

Subtraction is the same as adding negative b, thus Binvert goes high and CarryIn is 1.

This is the MSB ALU cell.

[Figure: MSB ALU cell with a Less input, a Set output, and overflow detection driving the Overflow output]


Modifying the ALU for slt

- First perform a subtraction.
- Make the result 1 if the subtraction yields a negative result.
- Make the result 0 if the subtraction yields a positive result.

[Figure: multi-bit ALU with Binvert, CarryIn and Operation control inputs and Set and Overflow outputs]


Modifying the ALU for beq

The Zero output goes high for equality.

[Figure: ALU with Bnegate and Operation control inputs and a Zero output]


MIPS Multiply Instruction

mult $s2, $s3 # hi || lo = $s2 * $s3

op      rs   rt   rd     shamt  funct
000000  $s2  $s3  00000  00000  24     (mult)
000000  $s2  $s3  00000  00000  25     (multu)

The low-order word of the product is placed in the dedicated processor register lo and the high-order word is placed in the processor register hi.

mult uses signed integers and the result is a signed 64-bit number. Overflow is checked in software.

MIPS uses multu for unsigned products.


Multiply Instruction

Multiplication is more complicated than addition; it is done via shifting and addition.

      0010 ten (multiplicand)
    x 1011 ten (multiplier)
      0010
     0010      (partial product
    0000        array)
   0010
  00010110 ten (product)

m bits x n bits = m+n bit product. 32 + 32 = 64 bits: a double precision product is produced, which takes more time to compute.

The product needs to be moved to general purpose registers to become available for other operations. Instructions mfhi $s0 and mflo $s5 are provided for this.

mfhi $s0    000000 00000 00000 $s0 00000 16
mflo $s5    000000 00000 00000 $s5 00000 18


Multiply Instruction

Binary numbers make it easy:
0 => place 0s (0 x multiplicand) in the proper place
1 => place a copy of the multiplicand in the proper place

[Figure: 64-bit register holding 32 0s followed by the 32-bit multiplicand]

- at each stage, shift the multiplicand left (x 2)
- use the next LSB of b to determine whether to add in the shifted multiplicand
- accumulate the 2n-bit partial product at each stage

The process is repeated 32 times in MIPS.


Unsigned shift-add multiplier (version 1)

64-bit Multiplicand register, 64-bit ALU, 64-bit Product register (initialized to 0s), 32-bit Multiplier register

[Figure: version 1 datapath; the Multiplicand register initially holds 32 0s followed by the 32-bit multiplicand; the control tests whether Multiplier0 = 1]


Multiply Algorithm Version 1

1001 two x 1001 two

Iter.  Multiplier  Multiplicand  Product
0      1001        00001001      00000000
1      1001        00001001      00000000
       1001        00010010      00001001
       0100        00010010      00001001
2      0100        00010010      00001001
       0010        00100100      00001001
3      0010        00100100      00001001
       0001        01001000      00001001
4      0001        01001000      00001001
       0000        10010000      01010001
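The version 1 loop can be sketched in Python with n-bit operands (n = 4 to match the 1001 x 1001 example). The function name and the use of plain integers as registers are assumptions for illustration.

```python
def multiply_v1(multiplicand: int, multiplier: int, n: int = 4) -> int:
    """Unsigned shift-add multiply: 2n-bit multiplicand and product registers."""
    mask = (1 << (2 * n)) - 1
    mcand = multiplicand               # starts in the right half of a 2n-bit register
    product = 0
    for _ in range(n):
        if multiplier & 1:             # test Multiplier0
            product = (product + mcand) & mask
        mcand = (mcand << 1) & mask    # shift the multiplicand left
        multiplier >>= 1               # shift the multiplier right
    return product

print(f"{multiply_v1(0b1001, 0b1001):08b}")  # 01010001
```

The printed pattern is 81 ten, the final Product entry of the trace above.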


Observations on Multiply Version 1

- 1 clock cycle per step => about 100 clocks per multiply
- Ratio of multiply to add: 1:5 to 1:100
- 1/2 of the bits in the multiplicand are always 0 => the 64-bit adder is wasted
- 0s are inserted at the right of the multiplicand as it is shifted left => the least significant bits of the product never change once formed
- Instead of shifting the multiplicand to the left, shift the product to the right?


Multiply Hardware Version 2

32-bit Multiplicand register, 32-bit ALU, 64-bit Product register, 32-bit Multiplier register

[Figure: version 2 datapath; the control tests whether Multiplier0 = 1]


Multiply Algorithm Version 2

- The multiplicand stays still and the product moves right.
- The Product register wastes space that exactly matches the size of the multiplier.
- So we can combine the Multiplier register and the Product register.


Multiply Hardware Version 3

32-bit Multiplicand register, 32-bit ALU, 64-bit Product register (0-bit Multiplier register)

- 2 steps per bit because the Multiplier and Product registers are combined
- MIPS registers Hi and Lo are the left and right halves of Product
- This gives us the MIPS instruction multu

[Figure: version 3 datapath; the Product register initially holds 32 0s followed by the 32-bit multiplier; the control tests whether Product0 = 1]


Multiply Algorithm Version 3

1001 two x 1001 two

Iter.   Multiplicand  Product
0       1001          0000 1001
1       1001          0000 1001
 add    1001          1001 1001
 shift  1001          0100 1100
2       1001          0100 1100
 shift  1001          0010 0110
3       1001          0010 0110
 shift  1001          0001 0011
4       1001          0001 0011
 add    1001          1010 0011
 shift  1001          0101 0001
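A Python sketch of version 3, where the multiplier starts in the right half of the product register and the multiplicand is added into the left half. The names are illustrative; the carry out of the add is kept in the Python integer and shifted back in, which stands in for the hardware's carry bit.

```python
def multiply_v3(multiplicand: int, multiplier: int, n: int = 4) -> int:
    """Unsigned multiply with a combined product/multiplier register of 2n bits."""
    product = multiplier                    # left half 0s, right half multiplier
    for _ in range(n):
        if product & 1:                     # test Product0
            product += multiplicand << n    # add the multiplicand to the left half
        product >>= 1                       # shift the whole register right
    return product

print(f"{multiply_v3(0b1001, 0b1001):08b}")  # 01010001
```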


Multiplication of signed integers

What about signed multiplication?
- The easiest solution is to make both operands positive and remember whether to complement the product when done (leave out the sign bit, run for 31 steps).
- Apply the definition of two's complement: need to sign-extend partial products and subtract at the end.
- Booth's Algorithm is an elegant way to multiply signed numbers using the same hardware as before and save cycles.
- It can handle multiple bits at a time, thus it is faster.


Motivation for Booth's Algorithm

Example 2 x 6 = 0010 x 0110 two:

      0010
    x 0110
  +   0000   shift (0 in multiplier)
  +  0010    add (1 in multiplier)
  + 0010     add (1 in multiplier)
  +0000      shift (0 in multiplier)
  00001100

An ALU with add or subtract gets the same result in more than one way:
6 = -2 + 8
0110 = -00010 + 01000 = 11110 + 01000


Motivation for Booth's Algorithm

For example

      0010
    x 0110
      0000   shift (0 in multiplier)
  -  0010    sub (first 1 in multiplier)
    0000     shift (mid string of 1s)
  +0010      add (prior step had last 1)
  00001100

Booth's algorithm handles signed products by looking at the strings of 1s.


Booth's Algorithm

Now the test in the algorithm depends on two bits. Results are placed in the left half of the product register.

Current Bit  Bit to the Right  Explanation          Example        Op
1            0                 Begins run of 1s     000111[10]00   subtract
1            1                 Middle of run of 1s  00011[11]000   no op
0            1                 End of run of 1s     00[01]111000   add
0            0                 Middle of run of 0s  0[00]1111000   no op

Originally for speed (when shift was faster than add).

Replace a string of 1s in the multiplier with an initial subtract when we first see a 10, and then later an add for the first 01.


Booth's Example (2 x 7)

The rightmost bit of the product register is the extra ("mythical") bit to the right of the multiplier.

Operation         Multiplicand  Product register   next operation?
0. initial value  0010          0000 0111 0        10 -> subtract
1a. P = P - m     1110          + 1110
                                1110 0111 0        shift P (sign extend)
1b.               0010          1111 0011 1        11 -> nop, shift
2.                0010          1111 1001 1        11 -> nop, shift
3.                0010          1111 1100 1        01 -> add
4a.               0010          + 0010
                                0001 1100 1        shift
4b.               0010          0000 1110 0        done


Booth's Example (2 x -3)

The product is 1111 1010 two.

Operation         Multiplicand  Product            next?
0. initial value  0010          0000 1101 0        10 -> subtract
1a. P = P - m     1110          + 1110
                                1110 1101 0        shift P (sign ext)
1b.               0010          1111 0110 1        01 -> add multiplicand
                                + 0010
2a.                             0001 0110 1        shift P and sign ext.
2b.               0010          0000 1011 0        10 -> sub multiplicand
                                + 1110
3a.               0010          1110 1011 0        shift and sign ext
3b.               0010          1111 0101 1        11 -> no op
4a.                             1111 0101 1        shift
4b.               0010          1111 1010 1        done
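The two traces can be reproduced with a Python sketch of Booth's algorithm on n-bit two's complement operands. The function name and register modelling are assumptions; like the n-bit hardware, the sketch does not handle the most negative multiplicand (-8 for n = 4), whose negation does not fit in n bits.

```python
def booth_multiply(multiplicand: int, multiplier: int, n: int = 4) -> int:
    """Booth multiply of n-bit two's complement values; returns the signed product."""
    width = 2 * n + 1                        # product register plus the extra bit
    mask_w = (1 << width) - 1
    m = multiplicand & ((1 << n) - 1)
    p = (multiplier & ((1 << n) - 1)) << 1   # left half 0s, multiplier, extra bit 0
    for _ in range(n):
        pair = p & 0b11                      # current bit and the bit to its right
        if pair == 0b10:                     # begins a run of 1s: subtract
            p = (p - (m << (n + 1))) & mask_w
        elif pair == 0b01:                   # ends a run of 1s: add
            p = (p + (m << (n + 1))) & mask_w
        sign = p >> (width - 1)              # arithmetic shift right by one place
        p = (p >> 1) | (sign << (width - 1))
    result = p >> 1                          # drop the extra bit
    if result >> (2 * n - 1):                # reinterpret the 2n bits as signed
        result -= 1 << (2 * n)
    return result

print(booth_multiply(2, 7))    # 14
print(booth_multiply(2, -3))   # -6
```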


MIPS Divide Instruction

div $s2, $s3

op      rs   rt   rd     shamt  funct
000000  $s2  $s3  00000  00000  26     (div)
000000  $s2  $s3  00000  00000  27     (divu)

The division quotient is placed in the dedicated processor register lo and the remainder is placed in the processor register hi.

div uses signed integers; the quotient and the remainder are each signed 32-bit numbers. Overflow and division by 0 are checked in software.

MIPS uses divu for unsigned divisions.


Division: Paper & Pencil

                      1001 ten   Quotient
Divisor 1000 ten | 1001010 ten   Dividend
                  -1000
                      10
                      101
                      1010
                     -1000
                        10 ten   Remainder (or modulo result)

At each step a multiple of the divisor is subtracted, creating a quotient bit.
Binary => 1 * divisor or 0 * divisor

Dividend = Quotient x Divisor + Remainder
=> |Dividend| = |Quotient| + |Divisor|   (here |x| denotes the number of bits of x)

We assume for now unsigned 32-bit integers.
3 versions of divide, successive refinement.


Division Hardware (Version 1)

64-bit Divisor register, 64-bit ALU, 64-bit Remainder register, 32-bit Quotient register

[Figure: version 1 datapath. The Remainder register initially holds the dividend, the Quotient register initially holds 0s, and the Divisor register initially holds the divisor.]

If Remainder63 = 1, Quotient0 = 0
If Remainder63 = 0, Quotient0 = 1



Div ide Algorithm Version 1


Observations on Divide Version 1

- 1/2 of the bits in the divisor are always 0
  => 1/2 of the 64-bit adder is wasted
  => 1/2 of the divisor is wasted
- Instead of shifting the divisor to the right, shift the remainder to the left?
- The 1st step cannot produce a 1 in the quotient bit (otherwise the quotient is too big for the register)
  => switch the order to shift first and then subtract; this can save 1 iteration


Division Hardware (Version 2)

32-bit Divisor register, 32-bit ALU, 64-bit Remainder register, 32-bit Quotient register

If Remainder63 = 1, Quotient0 = 0
If Remainder63 = 0, Quotient0 = 1


Observations on Divide Version 2

- We can eliminate the Quotient register by combining it with the Remainder register as it is shifted left.
- Start by shifting the Remainder left as before.
- Thereafter the loop contains only two steps, because the shifting of the Remainder register shifts both the remainder in the left half and the quotient in the right half.
- The consequence of combining the two registers and of the new order of the operations in the loop is that the remainder will be shifted left one time too many.
- Thus the final correction step must shift back only the remainder in the left half of the register.


Division Hardware (Version 3)

32-bit Divisor register, 32-bit ALU, 64-bit Remainder register (0-bit Quotient register)

[Figure: version 3 datapath. At the start the dividend is in the Remainder register; at the end its right half holds the quotient.]


Divide Algorithm Version 3

Divide 0000 0111 by 0010 (two's complement of the divisor: 1110)

Iteration  Divisor  Remainder reg  Operation
0.         0010     0000 0111      initial value
0a         0010     0000 1110      shift left 1
1a         0010     1110 1110      Rem = Rem - Div
1b         0010     0000 1110      Rem = Rem + Div
                    0001 1100      sll Rem, R0 = 0
2a         0010     1111 1100      Rem = Rem - Div
2b         0010     0001 1100      Rem = Rem + Div
                    0011 1000      sll Rem, R0 = 0
3a         0010     0001 1000      Rem = Rem - Div
3b         0010     0011 0001      sll Rem, R0 = 1
4a         0010     0001 0001      Rem = Rem - Div
4b         0010     0010 0011      sll Rem, R0 = 1
                    0001 0011      shift left half of Rem right 1
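The trace corresponds to a restoring division with a combined remainder/quotient register. A Python sketch follows (names are illustrative; the restore step is modelled by simply not committing a negative subtraction result):

```python
def divide_v3(dividend: int, divisor: int, n: int = 4):
    """Unsigned restoring division; returns (quotient, remainder). Divisor must be nonzero."""
    low_mask = (1 << n) - 1
    rem = dividend << 1                      # dividend in the right half, then shift left 1
    for _ in range(n):
        left = (rem >> n) - divisor          # subtract the divisor from the left half
        if left < 0:
            rem = rem << 1                   # restore, shift left, new bit R0 = 0
        else:
            rem = (((left << n) | (rem & low_mask)) << 1) | 1   # keep, shift, R0 = 1
    quotient = rem & low_mask                # right half holds the quotient
    remainder = rem >> (n + 1)               # left half, shifted right 1 to undo the extra shift
    return quotient, remainder

print(divide_v3(0b0111, 0b0010))  # (3, 1)
```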


Observations on Divide Version 3

- Same hardware as multiply: just need an ALU to add or subtract, and a 64-bit register to shift left or shift right.
- Hi and Lo registers in MIPS combine to act as the 64-bit register for multiply and divide.
- Signed divides: the simplest approach is to remember the signs, make the operands positive, and complement the quotient and remainder if necessary.
  Note: Dividend and Remainder must have the same sign.
  Note: Quotient is negated if the Divisor sign and the Dividend sign disagree.
  e.g., -7 ÷ 2 = -3, remainder = -1
  and 7 ÷ (-2) = -3, remainder = 1
- It is possible for the quotient to be too large: if we divide a 64-bit integer by 1, the quotient is 64 bits (called saturation).


Review: MIPS Instructions, so far

Category                   Instruction             Op Code   Example              Meaning
Arithmetic (R & I format)  add                     0 and 32  add $s1, $s2, $s3    $s1 = $s2 + $s3
                           add unsigned            0 and 33  addu $s1, $s2, $s3   $s1 = $s2 + $s3
                           subtract                0 and 34  sub $s1, $s2, $s3    $s1 = $s2 - $s3
                           subt unsigned           0 and 35  subu $s1, $s2, $s3   $s1 = $s2 - $s3
                           add immediate           8         addi $s1, $s2, 6     $s1 = $s2 + 6
                           add immediate unsigned  9         addiu $s1, $s2, 6    $s1 = $s2 + 6
                           multiply                0 and 24  mult $s1, $s2        hi || lo = $s1 * $s2
                           multiply unsigned       0 and 25  multu $s1, $s2       hi || lo = $s1 * $s2
                           divide                  0 and 26  div $s1, $s2         lo = $s1/$s2, remainder in hi
                           divide unsigned         0 and 27  divu $s1, $s2        lo = $s1/$s2, remainder in hi


Review: MIPS ISA, continued

Category                  Instr               Op Code   Example            Meaning
Shift (R format)          sll                 0 and 0   sll $s1, $s2, 4    $s1 = $s2 << 4
                          sra                 0 and 3   sra $s1, $s2, 4    $s1 = $s2 >> 4
Data Transfer (I format)  load word           35        lw $s1, 24($s2)    $s1 = Memory($s2+24)
                          store word          43        sw $s1, 24($s2)    Memory($s2+24) = $s1
                          load byte           32        lb $s1, 25($s2)    $s1 = Memory($s2+25)
                          load byte unsigned  36        lbu $s1, 25($s2)   $s1 = Memory($s2+25)
                          store byte          40        sb $s1, 25($s2)    Memory($s2+25) = $s1
                          load upper imm      15        lui $s1, 6         $s1 = 6 * 2^16
                          move from hi        0 and 16  mfhi $s1           $s1 = hi
                          move to hi          0 and 17  mthi $s1           hi = $s1
                          move from lo        0 and 18  mflo $s1           $s1 = lo
                          move to lo          0 and 19  mtlo $s1           lo = $s1


Review: MIPS ISA - continued

Category                     Instr             Op Code   Example             Meaning
Cond. Branch (I & R format)  br on equal       4         beq $s1, $s2, L     if ($s1==$s2) go to L
                             br on not equal   5         bne $s1, $s2, L     if ($s1 !=$s2) go to L
                             set on less than  0 and 42  slt $s1, $s2, $s3   if ($s2


Floating-Point

What can be represented in N bits?
Unsigned: 0 to 2^N - 1
2s Complement: -2^(N-1) to 2^(N-1) - 1

With MIPS signed integers the 32-bit architecture allows us to represent numbers in the range ±2.15 x 10^9.

But what about very large numbers?
9,349,398,989,787,762,244,859,087,678

What about very small numbers?
0.0000000000000000000000045691

Floating point representation allows a much larger range at the expense of accuracy.


Scientific Notation

Decimal notation: 1.02 x 10^23, -1.673 x 10^-24
- radix (base): 10
- normalized notation: a single digit to the left of the decimal point (no leading 0s)
- significand: sign, magnitude
- exponent: how many digits the decimal point is moved to the left to get to 1 digit; sign, magnitude

Binary notation (binary floating point): 1.xxxxx... x 2^yyyyyyy
- radix (base 2)
- the number of x's determines accuracy
- the number of y's determines range


MIPS Register bit allocation

Representation of floating point means that the binary point floats, to get a non-0 bit before it. The binary point is not fixed. Since the number of bits in a register is fixed, we need to compromise.

Single precision:
S (1 bit, sign) | E (8 bits, signed exponent) | M (23 bits, mantissa: sign + magnitude, normalized binary significand)

This represents 2.0 ten x 10^-38 to 2.0 ten x 10^38.

In double precision two registers are used. This increases the range to 2.0 ten x 10^-308 to 2.0 ten x 10^308. The significand now has 52 bits.

Double precision:
S (1 bit, sign) | E (11 bits, signed exponent) | M (20 bits)
M (32 bits)

When the exponent is too large or too small an exception occurs: overflow or underflow.


IEEE 754 Floating Point Standard

Representation of floating point numbers in the IEEE 754 standard assures uniformity across computer implementations.

The 1 in the significand is implicit; 0 ten is given a reserved representation 00...000 two.

The rest of the numbers are
\[ N = (-1)^{S} \times 2^{E-127} \times (1.M) \]

Placing the sign bit and exponent first makes integer comparisons easier for sorting.

Using an exponent bias allows the exponent to be unsigned, the smallest being 00...000 two and the largest being 11...111 two. This makes comparisons easier.

The double precision bias is 1023 ten. Underflow and overflow can still occur.


IEEE 754 - continued

If the 23 significand bits are numbered from left to right, then the floating point number represented by these bits is
\[ N = (-1)^{S} \times (1 + s_1 2^{-1} + s_2 2^{-2} + \cdots + s_{23} 2^{-23}) \times 2^{(E - \text{bias})} \]

So the register containing the bits

1 1000 0011 111000...0

represents
\[ N = (-1)^{1} \times (1 + 2^{-1} + 2^{-2} + 2^{-3}) \times 2^{(2^7 + 2^1 + 2^0 - 127)} \]
\[ N = -1.875 \times 2^{(131 - 127)} \]
\[ N = -1.875 \times 2^{4} = -1.875 \times 16 = -30 \]
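The decoding can be sketched in Python by applying the formula field by field and cross-checking against the machine's own single-precision interpretation via the standard struct module. The function name decode_single is illustrative, and the sketch covers normalized numbers only (no zero, denormals, infinities or NaN).

```python
import struct

def decode_single(bits: str) -> float:
    """Apply N = (-1)^S x (1 + fraction) x 2^(E - 127) to a 32-bit pattern string."""
    b = bits.replace(" ", "")
    sign = int(b[0])
    exponent = int(b[1:9], 2)
    fraction = sum(int(bit) * 2.0 ** -i for i, bit in enumerate(b[9:], start=1))
    return (-1) ** sign * (1 + fraction) * 2.0 ** (exponent - 127)

pattern = "1" + "10000011" + "111" + "0" * 20
value = decode_single(pattern)
hardware = struct.unpack(">f", int(pattern, 2).to_bytes(4, "big"))[0]
print(value, hardware)  # -30.0 -30.0
```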


Exercise

Show the IEEE 754 representation of 10 ten in single and double precision.

10 ten = 1010 two = 1.01 x 2^3 in normalized notation

The sign bit is 0, the exponent is 3 + 127 = 130 = 1000 0010 two

Single precision (sp):
0 | 1000 0010 | 01000...0 (23 bits)

In double precision the exponent is 3 + 1023 = 1026 = 100 0000 0010 two

Double precision (dp):
0 | 100 0000 0010 | 01000...0 (20 bits)
0000 00...00 (32 bits)
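The answer can be confirmed on any machine with IEEE 754 floats by packing the value and splitting the bit string into fields. A minimal sketch using Python's struct module; bit_fields is a hypothetical helper name.

```python
import struct

def bit_fields(value: float, fmt: str, exp_bits: int):
    """Pack value with struct format fmt and return (sign, exponent, fraction) bit strings."""
    raw = struct.pack(fmt, value)
    bits = "".join(f"{byte:08b}" for byte in raw)
    return bits[0], bits[1:1 + exp_bits], bits[1 + exp_bits:]

print(bit_fields(10.0, ">f", 8))    # single: sign, 8-bit exponent, 23-bit fraction
print(bit_fields(10.0, ">d", 11))   # double: sign, 11-bit exponent, 52-bit fraction
```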


MIPS Floating Point Instructions

Use special registers $f0 to $f31. For double precision, $f0, $f2, ... $f30 (which are in fact pairs of registers).

The advantages of having separate registers for floating point operations:
- we have twice as many registers available, and
- we maintain the same number of bits in the instruction format.

Disadvantage: more instructions are needed, since special instructions are required to load words into the floating point registers.


MIPS Floating Point Instructions

[Figure: R2000 CPU]


MIPS Floating Point Instructions

Floating point registers have special load and store instructions, but still use integer registers to store the base address for the loaded word.

Load word in coprocessor 1:
lwc1 $f1, 100($s2) # $f1 = memory[$s2+100]

Store word from coprocessor 1:
swc1 $f1, 100($s2) # memory[$s2+100] = $f1

There are move instructions to move data between integer and floating point registers:
mtc1 $t1, $f5 # moves $t1 into $f5
mfc1.d $t2, $f2 # moves $f2, $f3 into $t2, $t3


MIPS Floating Point Instructions

The suffix .s or .d in an instruction specifies that it is floating point: .s if it is single and .d if it is double precision.

add.s $f1, $f4, $f5 # $f1 = $f4 + $f5
sub.s $f2, $f4, $f6 # $f2 = $f4 - $f6
mul.s $f2, $f4, $f6 # $f2 = $f4 x $f6
div.s $f2, $f4, $f6 # $f2 = $f4 / $f6

To load two floating point numbers, subtract them and place the result into memory, the code is

lwc1 $f4, 100($s2) # loads a 32-bit floating point number into $f4
lwc1 $f6, 200($s2) # loads another 32-bit floating point number into $f6
sub.s $f10, $f4, $f6 # $f10 = $f4 - $f6
swc1 $f10, 240($s2) # stores a 32-bit floating point number into memory


MIPS Floating Point Instructions

Flow operations also have floating point variants: compare less than single and compare less than double.

c.lt.s $f2, $f4 # if $f2 < $f4 condition bit = true, otherwise condition bit is false
c.lt.d $f2, $f4 # compare in double precision

These are R-type instructions. Other conditions can also be tested (eq, ne, gt, etc.).

Floating point has branch operations based on the state of the condition bit.

Branch if FP comparison is true (true==1):
bc1t 25 # jump to address PC+4+100 if true

Branch if FP comparison is false (true==0):
bc1f 100 # jump to address PC+4+400 if false

These are I-type instructions.


Basic Addition Algorithm

For addition (or subtraction) there are specific steps that are taken to make sure the proper digits are added:

(1) the decimal (or binary) point has to be aligned
(2) this means that the significand of the smaller number is shifted to the right until the decimal points are aligned
(3) then the addition of the significands takes place
(4) the result needs to be normalized, which means the decimal point is shifted left and the exponent increases
(5) the result needs to be truncated to the available number of digits and rounded off (add 1 to the last available digit if the number to the right is 5 or larger)


Basic Addition Algorithm

Addition example: add 9.999 x 10^1 and 0.1610 x 10^0

first shift to align the decimal point: 0.01610 x 10^1

then truncate: 0.016 x 10^1

then add:
   9.999 x 10^1
+  0.016 x 10^1
  10.015 x 10^1

after normalization: 1.0015 x 10^2

after rounding: 1.002 x 10^2
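The five steps can be followed in a short Python sketch built on the standard decimal module. It assumes four significand digits, truncation after alignment and round-half-up at the end, as in the example; the function name is illustrative and only a carry-out normalization (one shift) is handled.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

def fp_add_decimal(sig_a, exp_a, sig_b, exp_b, digits=4):
    """Add sig_a x 10^exp_a and sig_b x 10^exp_b; returns (significand, exponent)."""
    a, b = Decimal(sig_a), Decimal(sig_b)
    if exp_a < exp_b:                               # make a the operand with the larger exponent
        a, exp_a, b, exp_b = b, exp_b, a, exp_a
    quantum = Decimal(1).scaleb(-(digits - 1))      # 0.001 for four digits
    # Steps 1-2: shift the smaller number right to align, then truncate.
    b = b.scaleb(exp_b - exp_a).quantize(quantum, rounding=ROUND_DOWN)
    total, exp = a + b, exp_a                       # step 3: add the significands
    if abs(total) >= 10:                            # step 4: normalize
        total, exp = total.scaleb(-1), exp + 1
    total = total.quantize(quantum, rounding=ROUND_HALF_UP)   # step 5: round
    return total, exp

print(fp_add_decimal("9.999", 1, "0.1610", 0))  # (Decimal('1.002'), 2)
```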


Arithmetic Unit for FP addition

[Figure: FP adder datapath. Labels: subtract to determine which exponent is smaller; output the larger exponent; significand of the smaller number; significand of the larger number; add significands; round off significand; final result.]


Extra Bits for rounding

IEEE 754 allows the use of guard and round bits, added to the allowed number of bits, to reduce round-off errors.

How many extra bits? IEEE: as if the result were computed exactly and then rounded.

Guard digits: digits to the right of the first P digits of the significand, to guard against loss of digits; they can later be shifted left into the first P places during normalization.
- Addition: carry-out shifted in
- Subtraction: borrow digit and guard
- Multiplication: carry and guard
- Division requires guard

The sticky bit is set to 1 if there are nonzero bits to the right of the round bit; it helps deal with rounding numbers like 2.345.


Extra Bits for rounding

Exercise: add the numbers 8.76 x 10^1 and 1.47 x 10^2 with only three allowed significand digits: a) use guard and round digits; b) do not use these two digits.

a) We extend the numbers with the two digits:
8.7600 x 10^1, and align: 0.87600 x 10^2 and 1.4700 x 10^2

Then we add the significands:
  0.8760
+ 1.4700
  2.3460

After rounding off to three significand digits: 2.35 x 10^2

b) Without guard and round digits:
  0.87
+ 1.47
  2.34   or 2.34 x 10^2

Results differ.


Using the sticky bit in rounding

Exercise: add the numbers 5.01 x 10^-1 and 1.34 x 10^2 with only three allowed significand digits: a) use guard, round and sticky bits; b) use only guard and round bits.

a) We extend the numbers with the two digits:
5.0100 x 10^-1, and align: 0.0050100 x 10^2 and 1.3400 x 10^2

Then we add the significands:
  0.0050
+ 1.3400
  1.3450

After rounding off to three significand digits: 1.35 x 10^2, because the sticky bit was 1.

b) Without the sticky bit the result is 1.34 x 10^2.

Results differ.
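Both rounding exercises can be replayed with Python's decimal module. The sketch below is an illustration, not the hardware mechanism: it assumes three significand digits, truncation of the aligned operand to the working width, and ties rounded to even when no sticky information is available (which is what makes part b of the sticky-bit exercise come out as 1.34). The function and parameter names are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

def add_rounded(a: str, b: str, shift: int, extra: int = 2, sticky: bool = False) -> Decimal:
    """Add a and b (b shifted right by `shift` digits), keeping `extra` digits before rounding."""
    work = Decimal(1).scaleb(-(2 + extra))           # smallest digit kept while adding
    exact = Decimal(b).scaleb(-shift)                # b aligned to a's exponent
    aligned = exact.quantize(work, rounding=ROUND_DOWN)
    total = Decimal(a) + aligned
    if sticky and exact != aligned:                  # nonzero digits were dropped
        total += work / 2                            # breaks a tie upward, nothing else
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)

print(add_rounded("1.47", "8.76", 1, extra=2))                # 2.35 (guard and round digits)
print(add_rounded("1.47", "8.76", 1, extra=0))                # 2.34 (no extra digits)
print(add_rounded("1.34", "5.01", 3, extra=2, sticky=True))   # 1.35 (with sticky)
print(add_rounded("1.34", "5.01", 3, extra=2, sticky=False))  # 1.34 (without sticky)
```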


Floating Point Multiplication

Example: 1.110 x 10^10 x 9.200 x 10^-5

- Biased exponents are added, then one bias is subtracted. The new exponent is 10 - 5 = 5; when using the bias: 137 (which is 10 + 127) + 122 (which is -5 + 127) - 127 = 132
- Significands are multiplied: 10.212000 ten
- The unnormalized product is 10.212 ten x 10^5
- Normalized it is 1.0212 ten x 10^6; rounded to four digits, 1.021 ten x 10^6
- The sign is determined by the signs of both operands: +1.021 ten x 10^6


Exercise

A single-precision IEEE number is stored at memory address X. Write a sequence of MIPS instructions to multiply the number at X by 2 and store the result back at location X. Accomplish this without using any floating point instructions (don't worry about overflow).

a) Multiplying by 2 is the same as adding 1 to the exponent field.

$t0:             0 | 1000 0100 | 1000...0 (23 bits)
$s0 after addi:  0 | 0000 0000 | 0000...1 (23 bits)

lw $t0, X($0)
addi $s0, $0, 1
sll $s0, $s0, 23
addu $t0, $t0, $s0
sw $t0, X($0)
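The same trick can be tried in Python by reinterpreting a single-precision value as a 32-bit integer with the struct module. This is a sketch with an illustrative function name; as in the exercise, overflow is ignored, and zero, denormals, infinities and NaN are not handled.

```python
import struct

def double_via_exponent(x: float) -> float:
    """Multiply a single-precision value by 2 using integer operations only."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]     # lw: raw 32-bit pattern
    one_in_exponent = 1 << 23                               # addi + sll: 1 at exponent bit 0
    bits = (bits + one_in_exponent) & 0xFFFFFFFF            # addu
    return struct.unpack(">f", struct.pack(">I", bits))[0]  # sw, then read back as float

print(double_via_exponent(10.0))   # 20.0
print(double_via_exponent(-1.5))   # -3.0
```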


Exercise

X = 0100 0110 1101 1000 0000 0000 0000 0000 two and
Y = 1011 1110 1110 0000 0000 0000 0000 0000 two
represent single-precision IEEE 754 floating point numbers.
a) What is x + y ?
b) What is x * y ?

a) Remember the field layout:
S (1 bit, sign) | E (8 bits, signed exponent) | M (23 bits, mantissa: sign + magnitude, normalized binary significand)

Convert: x = +1.1011 x 2^14 and y = -1.11 x 2^-2. Align y to the exponent of x and subtract its magnitude:

  1.1011 0000 0000 0000 0000 000
- 0.0000 0000 0000 0001 1100 000
  1.1010 1111 1111 1110 0100 000

Result: 0100 0110 1101 0111 1111 1111 0010 0000
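Part a) can be cross-checked with Python's struct module, assuming the two patterns are read as full 32-bit words padded with trailing zeros (0x46D80000 and 0xBEE00000). The helper names f32 and b32 are illustrative.

```python
import struct

def f32(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", bits.to_bytes(4, "big"))[0]

def b32(value: float) -> int:
    """Return the 32-bit single-precision pattern of a value."""
    return int.from_bytes(struct.pack(">f", value), "big")

x = f32(0x46D80000)   # +1.1011 two x 2^14
y = f32(0xBEE00000)   # -1.11 two x 2^-2
print(x, y)                   # 27648.0 -0.4375
print(f"{b32(x + y):032b}")   # 01000110110101111111111100100000
```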


Exercise - continued

b) What is x * y ?

First we calculate the new exponent:

   1000 1101
+  0111 1101
 1 0000 1010
-  0111 1111   minus bias
   1000 1011   new exponent

Then we multiply the significands 1.1011 and 1.11:

    1.1011
x   1.11
    0.011011
    0.11011
+   1.1011
   10.111101


Exercise - continued

Normalize and round:
exponent 1000 1100
significand 1.0111 1010 0000 0000 0000 000

Signs differ, so the result is negative:
1100 0110 0011 1101 0000 0000 0000 0000


Exercise

IA-32 uses 82-bit registers, allowing 64-bit significands and 16-bit exponents:
- What is the bias in the exponent?
- Range of numbers?
- How much greater accuracy compared to double precision?

a) There are 15 bits available for the exponent size, thus the bias is 2^15 - 1 = 32767.

b) The range of numbers is 2.0 x 10^-9864 to 2.0 x 10^9864. The double precision range was 2.0 ten x 10^-308 to 2.0 ten x 10^308, so the range is 32 times larger (9864/308).

c) Accuracy is 20% better.
