Advanced ALU Lecture 5 CS365. D. Barbara Advanced ALU CS465 F08 2 RoadMap Implementation of MIPS...
-
Upload
primrose-singleton -
Category
Documents
-
view
214 -
download
0
Transcript of Advanced ALU Lecture 5 CS365. D. Barbara Advanced ALU CS465 F08 2 RoadMap Implementation of MIPS...
Advanced ALU CS465 F082
D. Barbara
RoadMap Implementation of MIPS ALU
Signed and unsigned numbers (Section 3.2) Addition and subtraction (Section 3.3) Constructing an arithmetic logic unit (Appendix
B.5, B.6) Multiplication (Section 3.4) Division (Section 3.5) Floating point (Section 3.6)
This lecture
Advanced ALU CS465 F083
D. Barbara
Paper and pencil example (unsigned): Multiplicand 1000
Multiplier 1001 1000 0000 0000 1000
Product 01001000 m bits x n bits = m+n bit product (at most) Binary makes it easy:
0 => place 0 ( 0 x multiplicand) 1 => place a copy ( 1 x multiplicand)
4 versions of multiply hardware & algorithm: Successive refinement
Unsigned Multiply
Advanced ALU CS465 F084
D. Barbara
B0
A0A1A2A3
A0A1A2A3
A0A1A2A3
A0A1A2A3
B1
B2
B3
P0P1P2P3P4P5P6P7
0 0 0 0
Unsigned Combinational Multiplier
Stage i accumulates A * 2i if Bi == 1
Advanced ALU CS465 F085
D. Barbara
B0
A0A1A2A3
A0A1A2A3
A0A1A2A3
A0A1A2A3
B1
B2
B3
P0P1P2P3P4P5P6P7
0 0 0 00 0 0
How Does It Work?
At each stage shift A left ( x 2) Use next bit of B to determine whether to add in shifted
multiplicand Accumulate 2n bit partial product at each stage
Advanced ALU CS465 F086
D. Barbara
Product
Multiplier
Multiplicand
64-bit ALU
Shift Left
Shift Right
WriteControl
32 bits
64 bits
64 bits
Unsigned Multiplier (Version 1) Shift-add multiplier
64-bit Multiplicand reg, 64-bit ALU, 64-bit Product reg, 32-bit Multiplier reg
Figure 3.5
Advanced ALU CS465 F087
D. Barbara
3. Shift the Multiplier register right 1 bit.
Done
Yes: 32 repetitions
2. Shift the Multiplicand register left 1 bit.
No: < 32 repetitions
1. TestMultiplier0
Multiplier0 = 0Multiplier0 = 1
1a. Add multiplicand to product & place the result in Product register
32nd repetition?
Start
Multiply Algorithm Version 1
0010 x 0011
Product Multiplier Multiplicand0000 0000 0011 0000 00100000 0010 0001 0000 01000000 0110 0000 0000 10000000 0110 0000 0001 00000000 0110
Figure 3.6
Advanced ALU CS465 F088
D. Barbara
Observations on Multiply Version 1 One clock cycle per step => ~100 clocks per
multiply
Half bits in multiplicand always 0 => 64-bit adder is wasted
0 is inserted when shift the multiplicand left=> least significant bits of product never changed once formed
Instead of shifting multiplicand to left, shift product to right?
Advanced ALU CS465 F089
D. Barbara
B0
B1
B2
B3
P0P1P2P3P4P5P6P7
0 0 0 0
A0A1A2A3
A0A1A2A3
A0A1A2A3
A0A1A2A3
How Does It Work?
Multiplicand stays still and product moves right
Advanced ALU CS465 F0810
D. Barbara
Product
Multiplier
Multiplicand
32-bit ALU
Shift Right
WriteControl
32 bits
32 bits
64 bits
Shift Right
Unsigned Multiplier Version 2 Refined from the first version: 32-bit
Multiplicand reg, 32 -bit ALU, 64-bit Product reg, 32-bit Multiplier reg
Advanced ALU CS465 F0811
D. Barbara
3. Shift the Multiplier register right 1 bit.
Done
Yes: 32 repetitions
2. Shift the Product register right 1 bit.
No: < 32 repetitions
1. TestMultiplier0
Multiplier0 = 0Multiplier0 = 1
1a. Add multiplicand to the left half of product & place the result in the left half of Product register
32nd repetition?
Start
0000 0000 0011 00101: 0010 0000 0011 00102: 0001 0000 0011 00103: 0001 0000 0001 00101: 0011 0000 0001 00102: 0001 1000 0001 00103: 0001 1000 0000 00101: 0001 1000 0000 00102: 0000 1100 0000 00103: 0000 1100 0000 00101: 0000 1100 0000 00102: 0000 0110 0000 00103: 0000 0110 0000 0010
0010 x 0011
Product Multiplier Multiplicand
Multiply Algorithm Version 2
Advanced ALU CS465 F0812
D. Barbara
Observations on Multiply Version 2 Useful portion of multiplier shrinks while
valid portion of product grows in the process of multiplication
Product register wastes space that exactly matches size of multiplier
=>Combine Multiplier register and Product register?
Advanced ALU CS465 F0813
D. Barbara
Product (Multiplier)
Multiplicand
32-bit ALU
WriteControl
32 bits
64 bits
Shift Right
MULTIPLY Hardware Version 3 32-bit Multiplicand reg, 32 -bit ALU, 64-bit
Product reg, (0-bit Multiplier reg)
Advanced ALU CS465 F0814
D. BarbaraDone
Yes: 32 repetitions
2. Shift the Product register right 1 bit.
No: < 32 repetitions
1. TestProduct0
Product0 = 0Product0 = 1
1a. Add multiplicand to the left half of product & place the result in the left half of Product register
32nd repetition?
Start
Multiply Algorithm Version 3
Multiplicand Product0010 0000 0011
0010 00110010 0001 0001
0011 00010010 0001 10000010 0000 11000010 0000 0110
0010 x 0011
Advanced ALU CS465 F0815
D. Barbara
Multiply Version 3 Two steps per bit because Multiplier &
Product combined MIPS multiplication (unsigned)multu $t1, $t2 # t1 * t2 No destination register: product could be ~264;
need two special registers to hold it Registers Hi and Lo are left and right half of
Product To use the result:
mfhi $t3mflo $t4
Advanced ALU CS465 F0816
D. Barbara
Signed Multiplication What about signed multiplication?
Easiest solution is to make both positive & remember whether to complement product when done
Apply definition of 2’s complement Need to sign-extend partial products
Booth’s algorithm is an elegant way to multiply signed and unsigned numbers using the same hardware and save cycles
Can handle multiple bits at a time
Advanced ALU CS465 F0818
D. Barbara
Example 2 x 6 = 0010 x 0110: 0010
x 0110 + 0000 shift (0 in multiplier)
+ 0010 add (1 in multiplier)+ 0010 add (1 in multiplier)+ 0000 shift (0 in multiplier) 00001100
Basic idea: replace a string of 1s with an initial subtract on seeing a one and add after last one
0010 x 0110 0000 shift (0 in multiplier)
– 0010 sub (first 1 in multiplier) 0000 shift (mid string of 1s) + 0010 add (prior step had last 1) 00001100
Motivation for Booth’s Algorithm
Can get the same result in more than one way:6= – 2 + 8 0110 = – 0010 + 1000
0010 x (-2)
0010 x 8
Advanced ALU CS465 F0819
D. Barbara
0 1 1 1 1 0beginning of runend of run
middle of run
–1+ 1000001111
Booth’s Algorithm
Current Bit to Explanation Example Op bit right
1 0 Begins run of 1s 0001111000 sub 1 1 Middle of run of 1s 0001111000 none 0 1 End of run of 1s 0001111000 add 0 0 Middle of run of 0s 0001111000 none
Originally for speed when shift was faster than add Big advantage: it handles signed number
Why it works?
Advanced ALU CS465 F0820
D. Barbara
Booth’s Algorithm1. Depending on the current and previous
bits, do one of the following:00: Middle of a string of 0s, no arithmetic op01: End of a string of 1s, so add multiplicand to the left half of the product10: Beginning of a string of 1s, so subtract multiplicand from the left half of the product11: Middle of a string of 1s, so no arithmetic op
2. As in the previous algorithm, shift the Product register right (arithmetically) 1 bit
Advanced ALU CS465 F0821
D. Barbara
step Multiplicand Product operation
0. initial value 0010 0000 0111 0 10 -> sub
Booths Example (2 x 7)
1. P = P - m 1110 +1110
1110 0111 0 shift P(sign ext)
2. 0010 1111 0011 1 11 -> nop, shift
3. 0010 1111 1001 1 11 -> nop, shift
4. 0010 1111 1100 1 01 -> add
0010 +0010
0001 1100 1 shift
0010 0000 1110 0 done
Advanced ALU CS465 F0822
D. Barbara
step Multiplicand Product operation
0. initial value 0010 0000 1101 0 10 -> sub
Booths Example (2 x -3)
1. P = P - m 1110 +1110 1110 1101 0 shift P(sign
ext)2. 0010 1111 0110 1 01 -> add
0010 + 0010 0001 0110 1 shift P
3. 0010 0000 1011 0 10 -> sub 1110 +1110
0010 1110 1011 0 shift4. 0010 1111 0101 1 11 -> nop
1111 0101 1 shift 0010 1111 1010 1 done
Advanced ALU CS465 F0823
D. Barbara
Outline Implementation of MIPS ALU
Signed and unsigned numbers (Section 3.2) Addition and subtraction (Section 3.3) Constructing an arithmetic logic unit (Appendix
B.5, B.6) Multiplication (Section 3.4)
Booth’s algorithm: CD: 3.23 In More Depth Division (Section 3.5) Floating point (Section 3.6)
Advanced ALU CS465 F0824
D. Barbara
1001 QuotientDivisor 1000 1001010 Dividend
–1000 10 101 1010 –1000 10 Remainder (or Modulo result)
See how big a number can be subtracted, creating quotient bit on each step Binary => 1 * divisor or 0 * divisor
Dividend = Quotient x Divisor + Remainder 3 versions of divide, successive refinement
DIVIDE: Paper & Pencil
Advanced ALU CS465 F0825
D. Barbara
Remainder
Quotient
Divisor
64-bit ALU
Shift Right
Shift Left
WriteControl
32 bits
64 bits
64 bits
DIVIDE Hardware Version 1 64-bit Divisor reg, 64-bit ALU, 64-bit
Remainder reg, 32-bit Quotient reg
Advanced ALU CS465 F0826
D. Barbara
2b. Restore the original value by adding the Divisor register to the Remainder register, &place the sum in the Remainder register. Alsoshift the Quotient register to the left, setting the new least significant bit to 0.
Test Remainder
Remainder < 0Remainder 0
1. Subtract the Divisor register from the Remainder register, and place the result in the Remainder register.
2a. Shift the Quotient register to the left, setting the new rightmost bit to 1.
3. Shift the Divisor register right 1 bit.
Done
Yes: n+1 repetitions (n = 4 here)
Start: Place Dividend in Remainder
n+1repetition?
No: < n+1 repetitions
Divide Algorithm Version 1 Takes n+1 iterations for n-
bit Quotient & RemainderQuot. Divisor Rem.
0000 00100000 00000111 11100111 000001110000 00010000 00000111 11110111 000001110000 00001000 00000111 11111111 000001110000 00000100 00000111 000000110001 000000110001 00000010 00000011 000000010011 000000010011 00000001 00000001
Advanced ALU CS465 F0827
D. Barbara
Observations on Divide Version 1 1st step cannot produce a 1 in quotient bit
(otherwise quotient is too big for the register) => switch order to shift first and then subtract => save 1 iteration
1/2 bits in divisor always 0 => 1/2 of 64-bit adder is wasted => 1/2 of divisor is wasted
Instead of shifting divisor to right, shift remainder to left?
Advanced ALU CS465 F0828
D. Barbara
Remainder
Quotient
Divisor
32-bit ALU
Shift Left
WriteControl
32 bits
32 bits
64 bits
Shift Left
DIVIDE Hardware Version 2 32-bit Divisor reg, 32-bit ALU, 64-bit
Remainder reg, 32-bit Quotient reg
Advanced ALU CS465 F0829
D. Barbara
3b. Restore the original value by adding the Divisor register to the left half of the Remainder register, &place the sum in the left half of the Remainder register. Also shift the Quotient register to the left, setting the new least significant bit to 0.
Test Remainder
Remainder < 0Remainder 0
2. Subtract the Divisor register from the left half of the Remainder register, & place the result in the left half of the Remainder register.
3a. Shift the Quotient register to the left setting the new rightmost bit to 1.
1. Shift the Remainder register left 1 bit.
Done
Yes: n repetitions
nthrepetition?
No: < n repetitions
Start: Place Dividend in Remainder
Divide Algorithm Version 2
Advanced ALU CS465 F0830
D. Barbara
Observations on Divide Version 2 Remainder shrinks while Quotient grows Eliminate Quotient register by combining
with Remainder register Start by shifting the Remainder left as before Only shifting once for both Remainder(left half)
and Quotient(right half): 2 steps each iteration Another consequence is that the remainder
will be shifted left one more time Thus the final correction step must shift back only
the remainder in the left half of the register
Advanced ALU CS465 F0831
D. Barbara
Remainder (Quotient)
Divisor
32-bit ALU
WriteControl
32 bits
64 bits
Shift Left“HI” “LO”
DIVIDE HARDWARE Version 3 32-bit Divisor reg, 32 -bit ALU, 64-bit
Remainder reg, (0-bit Quotient reg)
Advanced ALU CS465 F0832
D. Barbara
3b. Restore the original value by adding the Divisor register to the left half of the Remainder register, &place the sum in the left half of the Remainder register. Also shift the Remainder register to the left, setting the new least significant bit to 0.
Test Remainder
Remainder < 0Remainder ≥ 0
2. Subtract the Divisor register from the left half of the Remainder register, & place the result in the left half of the Remainder register.
3a. Shift the Remainder register to the left setting the new rightmost bit to 1.
1. Shift the Remainder register left 1 bit.
Done. Shift left half of Remainder right 1 bit
Yes: n repetitions (n = 4 here)
nthrepetition?
No: < n repetitions
Start: Place Dividend in Remainder
Divide Algorithm Version 3Step Remainder Div.0 0000 0111 00101.1 0000 11101.2 1110 11101.3b 0001 11002.2 1111 11002.3b 0011 10003.2 0001 10003.3a 0011 0001
4.2 0001 00014.3a 0010 0011 0001 0011
Advanced ALU CS465 F0833
D. Barbara
More Division Issues Signed divides: simplest is to remember signs,
make positive, and complement quotient and remainder if necessary Satisfy Dividend =Quotient x Divisor + Remainder Note: Dividend and Remainder must have same sign Note: Quotient negated if Divisor sign & Dividend sign
disagree, e.g., –7 ÷ 2 = –3, remainder = –1 Quotient = 4 and remainder =1 is not correct
Possible for quotient to be too large If divide 64-bit integer by 1, quotient is 64 bits (“called
saturation”) MIPS ignores overflow, SW detection and processing
Advanced ALU CS465 F0834
D. Barbara
Recap Same hardware for Divide as Multiply: just need
ALU to add or subtract, and 64-bit register to shift left or shift right Hi and Lo registers in MIPS combine to act as 64-bit
register for multiply and divide div $s2, $s3 #Lo=$S2/$s3, Hi=$s2 mod $s3
Successive refinement to see final design Booth’s algorithm to handle signed multiplies
Overflow Both MIPS multiply and divide instructions ignore
overflow Software must check for overflow scenarios
Advanced ALU CS465 F0836
D. Barbara
Outline Implementation of MIPS ALU
Signed and unsigned numbers (Section 3.2) Addition and subtraction (Section 3.3) Constructing an arithmetic logic unit (Appendix
B.5, B.6) Multiplication (Section 3.4, CD: 3.23 In More
Depth) Division (Section 3.5) Floating point (Section 3.6)
Advanced ALU CS465 F0837
D. Barbara
Floating-Point: Motivation What can be represented in N bits?
Unsigned 0 to 2N-1 2’s Complement - 2N-1 to 2N-1 - 1 1’s Complement -2N-1+1to 2N-1-1 Binary Coded Decimal 0 to 10N/4 –1
But, what about? Very large numbers?
9,349,398,989,787,762,244,859,087,678 Very small number?
0.0000000000000000000000045691 Rational numbers 2/3 Irrational numbers pi
Advanced ALU CS465 F0838
D. Barbara
1.01 x 223
exponent
radix (base)Significand(Mantissa)
binary point
Recall Scientific Notation: Binary
Normalized form: no leading 0s, exactly one digit to left of decimal/binary point
Alternatives to represent 1/1,000,000,000 Normalized: 1.0 x 10-9
Not normalized: 0.1 x 10-8, 10.0 x 10-10
Computer arithmetic that supports scientific notation numbers is called floating point, because the binary point is not fixed, as it is for integers
Advanced ALU CS465 F0839
D. Barbara
S represents signExponent represents y’sFraction represents x’s
Represent numbers as small as 2.0 x 10-38 to as large as 2.0 x 1038
Normalized format: 1.xxxxxxxxxxtwo 2yyyytwo
Want to put it into multiple words: 32 bits for single-precision and 64 bits for double-precision
A simple single-precision representation:031
S Exponent30 2322
Fraction
1 bit 8 bits 23 bits
FP Representation
Advanced ALU CS465 F0840
D. Barbara
Next multiple of word size (64 bits)
Double precision (vs. single precision) Represent numbers almost as small as
2.0 x 10-308 to almost as large as 2.0 x 10308 But primary advantage is greater accuracy
due to larger fraction
031S Exponent
30 20 19Fraction
1 bit 11 bits 20 bits
Fraction (cont’d)
32 bits
Double Precision Representation
Advanced ALU CS465 F0841
D. Barbara
Regarding single precision, DP similar Sign bit:
1 means negative0 means positive
Significand: To pack more bits, leading 1 implicit for normalized
numbers Significand = 1 + fraction: 1 + 23 bits single, 1 + 52
bits double Always true: 0 < fraction < 1
(for normalized numbers) Note: 0 has no leading 1, so reserve exponent
value 0 just for number 0
IEEE 754 Standard (1/4)
Advanced ALU CS465 F0842
D. Barbara
0 1111 1111000 0000 0000 0000 0000 00001/20 0000 0001000 0000 0000 0000 0000 00002
Exponent: Need to represent positive and negative exponents Also want to compare FP numbers as if they were integers,
to help in value comparisons If use 2’s complement to represent?
e.g., 1.0 x 2-1 versus 1.0 x2+1 (1/2 versus 2)
IEEE 754 Standard (2/4)
If we use integer comparison for these two words, we will conclude that 1/2 > 2!!!
Advanced ALU CS465 F0843
D. Barbara
Biased (Excess) Notation Remapping: shift all values by subtracting a bias from the
unsigned representation Biased 7
•All-zero pattern is the smallest; all-one pattern is the largest •Now integer comparison can be used directly to compare exponents
Bit-pattern biased-7 value
0000 -70001 -60010 -50011 -40100 -30101 -20110 -10111 0
Bit-pattern biased-7 value
1000 11001 21010 31011 41100 51101 61110 71111 8
Advanced ALU CS465 F0844
D. Barbara
1/2 0 0111 1110000 0000 0000 0000 0000 00000 1000 0000000 0000 0000 0000 0000 00002
Use biased notation for exponents, where bias is the number subtracted to get the real number IEEE 754 uses bias of 127 for single precision:
subtract 127 from Exponent field to get actual value for exponent
1023 is the bias for double precision
All-zero and all-one exponent patterns are reserved for special numbers (Fig 3.15)
IEEE 754 Standard (3/4)
Advanced ALU CS465 F0845
D. Barbara
Summary (single precision):
(-1)S x (1.Fraction) x 2(Exponent-127)
Double precision identical, except longer fraction and exponent segments with exponent bias of 1023
031S Exponent30 2322
Fraction
1 bit 8 bits 23 bits
IEEE 754 Standard (4/4)
Advanced ALU CS465 F0846
D. Barbara
0 0110 1000101 0101 0100 0011 0100 0010
Sign: 0 => positive Exponent:
0110 1000two = 104ten
Bias adjustment: 104 - 127 = -23 Fraction:
1+2-1+2-3 +2-5 +2-7 +2-9 +2-14 +2-15 +2-17 +2-22
= 1.0 + 0.666115
Represents: 1.666115ten2-23 1.986 10-7
Example: FP to Decimal
Advanced ALU CS465 F0847
D. Barbara
1 0111 1110100 0000 0000 0000 0000 0000
Example 1: Decimal to FP Number = - 0.75
= - 0.11two 20 (scientific notation)
= - 1.1two 2-1 (normalized scientific notation)
Sign: negative => 1 Exponent:
Bias adjustment: -1 +127 = 126 126ten = 0111 1110two
Advanced ALU CS465 F0848
D. Barbara
A more difficult case: representing 1/3?= 0.33333…10 = 0.0101010101… 2 20
= 1.0101010101… 2 2-2
Sign: 0 Exponent = -2 + 127 = 12510=011111012
Fraction = 0101010101…
Example 2: Decimal to FP
0 0111 11010101 0101 0101 0101 0101 010
Advanced ALU CS465 F0849
D. Barbara
Special Numbers What have we defined so far? (single precision)
Exponent Fraction Object0 0 00 nonzero ???1-254 anything +/- floating-point255 0 ???255 nonzero ???
Range:1.0 2-126 1.8 10-38
What if result too small? (>0, < 1.8x10-38 => Underflow!)
(2 – 2-23) 2127 3.4 1038
What if result too large? (> 3.4x1038 => Overflow!)
Advanced ALU CS465 F0850
D. Barbara
Denormalized Numbers For single-precision f. p. (page 217)
0 is 00000000000000000000000000000000 Smallest positive normalized number is
1.0000 0000 0000 0000 0000 000 x 2-126
This is actually a pretty big gap... Gradual underflow: allow a number to degrade in
significance until it becomes 0 By getting rid of implicit leading 1 in front of the fraction...
we can represent a smaller number We denote a denormalized number by 0 exponent and a
non-zero fraction The smallest denormalized number is
0.0000 0000 0000 0000 0000 001 x 2-126 or 1.0 x 2-149
For double-precision f. p. Smallest denormalized number: 1.0 x 2-1074
Advanced ALU CS465 F0851
D. Barbara
Special Numbers What have we defined so far? (single
precision)
Exponent Fraction Object
0 0 0
0 nonzero denorm
1-254 anything +/- floating-point
255 0 ???
255 nonzero ???
Advanced ALU CS465 F0852
D. Barbara
Representation for +/- Infinity In FP, divide by zero should produce +/-
infinity, not overflow Why?
OK to do further computations with infinity, e.g., X/0 > Y may be a valid comparison
IEEE 754 represents +/- infinity by all-one exponent and all-zero fraction
S 1111 11110000 0000 0000 0000 0000 000
Advanced ALU CS465 F0853
D. Barbara
Representation for Not a Number What do I get if I calculate sqrt(-4.0) or 0/0?
If infinity is not an error, these should not be either They are called Not a Number (NaN) IEEE 754: all-one Exponent (255), nonzero Fraction
Why is this useful? A symbol for invalid operations: allow programmers to
postpone some tests and decisions to later time They contaminate: op(NaN,X) = NaN
Advanced ALU CS465 F0854
D. Barbara
Special Numbers What have we defined so far? (single-precision)
Exponent Fraction Object
0 0 0
0 nonzero denom
1-254 anything +/- fl. pt. #
255 0 +/- infinity
255 nonzero NaN
Fig. 3.15
Advanced ALU CS465 F0855
D. Barbara
FP Operations Complexities Operations are somewhat more complicated In addition to overflow we can have “underflow”
Absolute value too small Accuracy can be a big problem
IEEE 754 keeps two extra bits, guard and round Four rounding modes Special cases:
Positive divided by zero yields “infinity” Zero divide by zero yields “not a number”
Implementing the standard can be tricky Not using the standard can be even worse
See text for description of 80x86 and Pentium bug!
Advanced ALU CS465 F0856
D. Barbara
(1) We'll need to convert from fraction to significand(2) We'll need to shift the smaller number so that we have the same exponent:num of shifts= Larger Number's exponent - Smaller Number's exponent
-- right shift smaller number that many positions(3) Add or subtract the two numbers(4) Normalize: a. left shift result, decrement result exponent (e.g. 0.0001) b. right shift result, increment result exponent (e.g. 101.1)
continue until MSB of data is 1 (will not be stored)(5) Check for overflow or underflow(6) If the result is a 0 significand, set the exponent to zero by special step
Basic Addition Algorithm
Advanced ALU CS465 F0857
D. Barbara
Example: Add 9.999 101 and 1.610 10-1 assuming 4 decimal digits
Floating Point Addition Example
Align decimal point of number with smaller exponent 1.610 10-1 = 0.161 100 = 0.0161 101
Shift smaller number to right Add significands 9.999
+ 0.016 10.015 SUM = 10.015 101
NOTE: One digit of precision lost during shifting Shift sum to put it in normalized form 1.0015 102
Since significand only has 4 digits, we need to round the sum SUM = 1.002 102
NOTE: normalization maybe needed again after rounding, e.g, rounding 9.9999 you get 10.000
Advanced ALU CS465 F0858
D. BarbaraDone
2. Add the significands
4. Round the significand to the appropriatenumber of bits
Still normalized?
Start
Yes
No
No
YesOverflow orunderflow?
Exception
3. Normalize the sum, either shifting right andincrementing the exponent or shifting left
and decrementing the exponent
1. Compare the exponents of the two numbers.Shift the smaller number to the right until itsexponent would match the larger exponent
FP Addition Algorithm: Fig 3.16
Advanced ALU CS465 F0859
D. Barbara
0 10 1 0 1
Control
Small ALU
Big ALU
Sign Exponent Significand Sign Exponent Significand
Exponentdifference
Shift right
Shift left or right
Rounding hardware
Sign Exponent Significand
Increment ordecrement
0 10 1
Shift smallernumber right
Compareexponents
Add
Normalize
Round
Fig 3.17
Advanced ALU CS465 F0860
D. Barbara
Accurate Arithmetic Round accurately requires HW to include extra
bits in the calculation IEEE 754 standard specifies the use of 2 extra
bits on the right during intermediate calculations – Guard bit and Round bit
Example: Add 2.56 100 and 2.34 102, assuming 3 significant digits Shift the smaller number
2.56 100 = 0.0256 102(guard holds 5 and round holds 6) Without guard and round bits
2.34+ 0.02 = 2.36 102
With guard and round bits 2.34+ 0.0256 = 2.3656 102
ROUND 2.37102
Advanced ALU CS465 F0861
D. Barbara
Floating-Point Multiplication Algorithm: Z = X * Y
Product exponent = X's exponent + Y's exponent
Doubly-biased exponent must be corrected Product exponent = sum of stored exponents – bias
Xe = 7Ye = -3Bias 8
Multiply the significands: Zm = Xm x Ym Normalization, rounding and exception
checking (overflow and underflow) Determine the sign bit
Xe = 1111Ye = 0101 10100
= 15= 5 20
= 7 + 8= -3 + 8 4 + 8 + 8
Advanced ALU CS465 F0862
D. Barbara
MIPS Floating-Point Instructions Floating-point instructions
Computation: add.s, add.d, sub.s, sub.d, mul.s, mul.d, div.s, div.d
Comparison: c.x.s, c.x.d (x=eq, neq, lt, le, gt, ge) FP conditional branch: bc1t, bc1f Floating-point data transfer: lwc1, swc1, ldc1, sdc1
32 floating-point registers: $f0, …, $f31 Double precision: by convention, even/odd pair
contain one DP FP number: $f0/$f1, $f2/$f3 See page 206 for details
Advanced ALU CS465 F0863
D. Barbara
Summary Multiplication
Refined hardware and algorithms Booth’s algorithm
Division Similar hardware as multiplication
Floating point Representing numbers Addition and multiplication Precision is a big issue