ECE152B AU 1
Multiplication for 2’s Complement System – Booth Algorithm
Consider an unsigned five-bit number:
B = B4 B3 B2 B1 B0
  = B4×16 + B3×8 + B2×4 + B1×2 + B0×1
For a 2’s complement number:
B(2’s comp) = B4×(−16) + B3×8 + B2×4 + B1×2 + B0×1
which can be re-expressed as:
B(2’s comp) = (−16)×B4 + (16−8)×B3 + (8−4)×B2 + (4−2)×B1 + (2−1)×B0
            = −16×(B4 − B3) − 8×(B3 − B2) − 4×(B2 − B1) − 2×(B1 − B0) − 1×(B0 − 0)
Each value in parentheses is the difference of two consecutive bits, which can be +1, 0, or −1.
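As a quick check of the recoding identity, the following Python sketch evaluates a bit pattern both ways: with the usual 2’s complement weights and with the differences of consecutive bits. The function names are illustrative and do not come from the slides.

```python
def twos_complement_value(bits):
    """Value of a 2's complement pattern given MSB first, e.g. [B4, B3, B2, B1, B0]."""
    n = len(bits)
    value = -bits[0] * 2 ** (n - 1)              # the MSB carries a negative weight
    for k, b in enumerate(bits[1:]):
        value += b << (n - 2 - k)                # remaining bits carry positive weights
    return value


def booth_value(bits):
    """Same value computed as the sum of -2^i * (Bi - B(i-1)), with B(-1) = 0."""
    value = 0
    prev = 0                                     # implicit bit to the right of B0
    for i, b in enumerate(reversed(bits)):       # i = 0 is B0
        value += -(2 ** i) * (b - prev)
        prev = b
    return value


# The two evaluations agree for every 5-bit pattern.
for v in range(32):
    bits = [(v >> k) & 1 for k in range(4, -1, -1)]
    assert booth_value(bits) == twos_complement_value(bits)
```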
Example: Use the Booth recoding scheme to perform the multiplication:
25 (base 10) × −19 (base 10)
0 1 1 0 0 1 -- A = 25 (base 10), multiplicand
1 0 1 1 0 1 -- B = −19 (base 10), multiplier (B5 B4 B3 B2 B1 B0)
P = A×B = −32×(B5−B4)×A − 16×(B4−B3)×A − 8×(B3−B2)×A − 4×(B2−B1)×A − 2×(B1−B0)×A − 1×(B0−0)×A
  = −32×A + 16×A + 0×A − 4×A + 2×A − 1×A
  = −19×A = −475
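The worked example can be reproduced with a short Python sketch. It forms one signed multiple of A per multiplier bit, exactly as in the expansion of P above; `booth_multiply` and its parameters are hypothetical names chosen for illustration.

```python
def booth_multiply(a, b, n):
    """Multiply a by the n-bit 2's complement multiplier b using Booth recoding.

    Returns (product, terms); terms lists the signed multiples of a that are
    accumulated, starting from the most significant bit pair.
    """
    pattern = b & ((1 << n) - 1)                 # n-bit 2's complement pattern of b
    terms = []
    for i in range(n - 1, -1, -1):
        bit = (pattern >> i) & 1
        prev = (pattern >> (i - 1)) & 1 if i > 0 else 0
        terms.append(-(1 << i) * (bit - prev) * a)
    return sum(terms), terms


product, terms = booth_multiply(25, -19, 6)
print(terms)    # [-800, 400, 0, -100, 50, -25]
print(product)  # -475
```

A hardware implementation would shift and add or subtract one term per clock cycle rather than build the list of terms first.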
[Figure: datapath of the Booth multiplier. Blocks shown: three 8-bit registers, an 8-bit shift register, an 8-bit ALU with select lines S1 and S3, a 1-bit flip-flop (D, Q), and combinational logic driven by the multiplier bits Bi and Bi-1. Signal labels: Multiplicand, Multiplier, PCAND_CLK, PIER_CLK, PIER_LD, prod_clr, prod_clk, and data bits D7 ... D0.]
Multiplier bits      ALU control set
B1-H  B0-H    S3-H  S2-H  S1-H  S0-H    Function
 0     0       0     0     0     0      pass product value
 0     1       1     0     0     1      product plus multiplicand
 1     0       0     1     1     0      product minus multiplicand
 1     1       0     0     0     0      pass product value
The logic for the ALU select lines is implemented by two NAND gates.
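Reading the table row by row gives S3 = S0 = (NOT B1) AND B0 and S2 = S1 = B1 AND (NOT B0). The sketch below encodes those two equations in Python; it models only the truth table, not the gate-level NAND implementation mentioned on the slide, and `alu_select` is an illustrative name.

```python
def alu_select(b1, b0):
    """Return (S3, S2, S1, S0) for the multiplier bit pair (B1, B0)."""
    add = int((not b1) and b0)     # pair 01: product plus multiplicand
    sub = int(b1 and (not b0))     # pair 10: product minus multiplicand
    return (add, sub, sub, add)    # pairs 00 and 11 give 0000: pass product


for b1 in (0, 1):
    for b0 in (0, 1):
        print(b1, b0, alu_select(b1, b0))
```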
Multiplication
A3 A2 A1 A0
× B3 B2 B1 B0
R0,3 R0,2 R0,1 R0,0
R1,3 R1,2 R1,1 R1,0
R2,3 R2,2 R2,1 R2,0
R3,3 R3,2 R3,1 R3,0
Sum of partial products
Using adders to add rows
Multiplication Using Adders
A3 A2 A1 A0
× B3 B2 B1 B0
R0,3 R0,2 R0,1 R0,0
R1,3 R1,2 R1,1 R1,0
R2,3 R2,2 R2,1 R2,0
R3,3 R3,2 R3,1 R3,0
Sum of partial products
1st level adder
2nd level adder
3rd level adder
Row Reduction Method for Multiplication
A3 A2 A1 A0
× B3 B2 B1 B0
R0,3 R0,2 R0,1 R0,0
R1,3 R1,2 R1,1 R1,0
R2,3 R2,2 R2,1 R2,0
R3,3 R3,2 R3,1 R3,0
Sum of partial products
Using Carry Save Adders
1st level adder (row-reduction unit)
Row Reduction Method for Multiplication
A3 A2 A1 A0
× B3 B2 B1 B0
R0,3 R0,2 R0,1 R0,0
R1,3 R1,2 R1,1 R1,0
R2,3 R2,2 R2,1 R2,0
F5 F4 F3 F2 F1 F0
C5 C4 C3 C2 C1 C0
1st level adder (row-reduction unit)
Outputs of 1st level adder
Row Reduction Method for Multiplication
A3 A2 A1 A0
× B3 B2 B1 B0
R0,3 R0,2 R0,1 R0,0
R1,3 R1,2 R1,1 R1,0
R2,3 R2,2 R2,1 R2,0
F5 F4 F3 F2 F1 F0
C5 C4 C3 C2 C1 C0
R3,3 R3,2 R3,1 R3,0
F6 F5 F4 F3 F2 F1 F0
C6 C5 C4 C3 C2 C1 C0
2nd level adder (row-reduction unit)
Outputs of 2nd level adder
Use a regular adder to add these two rows
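The two-level reduction above can be sketched in Python, assuming each row-reduction unit is a 3-2 carry-save adder that turns three rows into a sum row F and a carry row C. Rows are held as integers already shifted to their column positions; all function names are illustrative.

```python
def partial_product_rows(a, b, n):
    """Row i is (a AND bit i of b), shifted left by i columns."""
    return [(a if (b >> i) & 1 else 0) << i for i in range(n)]


def carry_save_add(x, y, z):
    """3-2 row reduction: three rows in, sum row F and carry row C out."""
    f = x ^ y ^ z                                # column sums, no carry propagation
    c = ((x & y) | (x & z) | (y & z)) << 1       # column carries move one place left
    return f, c


def multiply_row_reduction(a, b, n=4):
    rows = partial_product_rows(a, b, n)
    while len(rows) > 2:                         # one reduction level per pass
        f, c = carry_save_add(rows[0], rows[1], rows[2])
        rows = [f, c] + rows[3:]
    return sum(rows)                             # stands in for the regular adder


print(multiply_row_reduction(13, 11))  # 143
```

For n = 4 the loop runs twice: the first pass reduces R0, R1, R2 to F and C, and the second pass reduces F, C, R3 to the final two rows.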
Generalized row reduction method:
[Figure: MULTIPLICAND × MULTIPLIER forms the partial product array; intermediate partial product sums are formed in parallel, then summed to give the PRODUCT.]
Example: design a high-speed multiplier
• 56×56 bit
• longest available row reduction unit: 15-4
• the final stage is a LACA with 8-bit basic adders.
[Figure: the 56 PARTIAL PRODUCT ROWS feed five 15-4 row reduction units, followed by units labeled 7-3, 3-2, and FA, which produce the PRODUCT.]
DELAY = 1 + 2×T(15-4 rru) + T(7-3 rru) + T(3-2 rru) + T(LACA)
where
  1 is the gate delay to form the partial products,
  rru stands for row reduction unit, and
  T(LACA) = 2 + 4×(⌈log8(112)⌉ − 1) = 2 + 4×(3 − 1) = 10 gate delays.
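The final-adder term can be evaluated with a small helper. This is a sketch that simply encodes the slide formula 2 + 4×(⌈log_g(w)⌉ − 1), where w = 112 is the width of a 56×56 product and g = 8 is the basic adder size; the function name is an assumption, and the input width is assumed to be greater than 1.

```python
def laca_gate_delay(width, group):
    """Gate delay 2 + 4*(ceil(log_group(width)) - 1), per the slide formula."""
    levels = 0
    span = 1
    while span < width:          # integer loop avoids floating-point log rounding
        span *= group
        levels += 1
    return 2 + 4 * (levels - 1)


print(laca_gate_delay(112, 8))  # 10
```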
Multiplication with Sectioning
Design an 8×8 multiplier using 4×4 multipliers.
X3 X2 X1 X0
× Y3 Y2 Y1 Y0
R0,3 R0,2 R0,1 R0,0
R1,3 R1,2 R1,1 R1,0
R2,3 R2,2 R2,1 R2,0
R3,3 R3,2 R3,1 R3,0
P7 P6 P5 P4 P3 P2 P1 P0
X7 X6 X5 X4 X3 X2 X1 X0
× Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0
R0,7 R0,6 R0,5 R0,4 R0,3 R0,2 R0,1 R0,0
R1,7 R1,6 R1,5 R1,4 R1,3 R1,2 R1,1 R1,0
[Figure: the remaining partial-product rows follow the same pattern, and the array is divided into four blocks, each produced by one 4×4 multiplier. Two blocks are labeled P1 = X3-0 × Y3-0 and P2 = X7-4 × Y3-0; P3 and P4 label the other two blocks.]
P1,7 P1,6 P1,5 P1,4 P1,3 P1,2 P1,1 P1,0
P2,7 P2,6 P2,5 P2,4 P2,3 P2,2 P2,1 P2,0
P3,7 P3,6 P3,5 P3,4 P3,3 P3,2 P3,1 P3,0
P4,7 P4,6 P4,5 P4,4 P4,3 P4,2 P4,1 P4,0
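A minimal Python sketch of the sectioning idea follows. P1 and P2 match the labels that survive on the slide; the definitions of P3 and P4 are assumptions (X3-0 × Y7-4 and X7-4 × Y7-4), chosen because they are the only two remaining 4×4 blocks. `mul4` is a hypothetical stand-in for one 4×4 multiplier.

```python
def mul4(x, y):
    """Stand-in for one 4x4 multiplier block with an 8-bit result."""
    assert 0 <= x < 16 and 0 <= y < 16
    return x * y


def mul8_sectioned(x, y):
    """8x8 unsigned multiply built from four 4x4 products."""
    x_lo, x_hi = x & 0xF, x >> 4
    y_lo, y_hi = y & 0xF, y >> 4
    p1 = mul4(x_lo, y_lo)        # X3-0 x Y3-0, weight 2^0
    p2 = mul4(x_hi, y_lo)        # X7-4 x Y3-0, weight 2^4
    p3 = mul4(x_lo, y_hi)        # assumed: X3-0 x Y7-4, weight 2^4
    p4 = mul4(x_hi, y_hi)        # assumed: X7-4 x Y7-4, weight 2^8
    return p1 + ((p2 + p3) << 4) + (p4 << 8)


print(mul8_sectioned(200, 123))  # 24600
```

In hardware the four 8-bit products would be aligned by their weights and summed, for example with the row reduction method described earlier.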
Division
DD = Q × DS + R
where DD is the dividend, Q the quotient, DS the divisor, and R the remainder.
The most straightforward method is to mimic the operations of paper-and-pencil long division for positive numbers.
Example:
                     1011     Quotient, Q
Divisor, DS  101 ) 111010     ← dividend, DD
                   101        Q3×DS
                    10010     R > DS, continue
                    000       Q2×DS, shifted
                    10010     R > DS, continue
                     101      Q1×DS, shifted
                     1000     R > DS, continue
                      101     Q0×DS, shifted
                       11     R < DS, done
A block diagram for such a divider:
[Figure: registers DS, R, and Q, two INPUT ports, an ALU (subtract), and a "bit from control" signal.]
Termination: quotient in Q, remainder in R.
The division process involves repeated shift and subtract operations.
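[Flowchart: control sequence for the divider, drawn beside the DS, R, Q registers and the ALU (subtract).]
1. Load DS and Q; clear count; clear R.
2. Begin: shift Q, R left one bit; increment count.
3. Is R − DS positive?
   Yes: prepare 1 for the Q register; R = R − DS.
   No: prepare 0 for the Q register.
4. Is count = N?
   No: shift Q, R left one bit and continue from the count increment.
   Yes: done.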
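The shift-and-subtract procedure can be sketched in Python as follows. This is one reading of the flowchart, under two assumptions: Q is loaded with the dividend, and the test "R − DS positive" is treated as "R − DS is not negative" so that an exact match also produces a quotient bit of 1. The function name and argument order are illustrative.

```python
def shift_subtract_divide(dd, ds, n):
    """Divide the n-bit dividend dd by ds; returns (quotient, remainder)."""
    if ds <= 0:
        raise ValueError("divisor must be positive")
    mask = (1 << n) - 1
    r, q = 0, dd & mask                    # clear R, load Q with the dividend
    for _ in range(n):                     # count runs from 1 to N
        r = (r << 1) | (q >> (n - 1))      # shift the R,Q pair left one bit
        q = (q << 1) & mask
        if r - ds >= 0:                    # trial subtraction succeeds
            r -= ds
            q |= 1                         # quotient bit is 1
        # otherwise the quotient bit stays 0 and R is unchanged
    return q, r


print(shift_subtract_divide(0b111010, 0b101, 6))  # (11, 3)
```

The printed result is Q = 1011 and R = 11 in binary, matching the long-division example.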
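Fig 6.10 Parallel Array Divider
[Figure: array of controlled-subtract cells. Each cell has inputs D, d, bi, and c and outputs R and bo, with
R := (c → D : ¬c → (D − d − bi) mod 2)
and the borrow always computed. Array labels: d1, d2, ..., dm; D1, D2, ..., Dm, Dm+1, ..., D2m; q1, q2, ..., qm; r1, r2, ..., rm.]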
Division by Repeated Multiplication
Cost-effective if the system contains a high-speed multiplier.
Q = DD / DS
• In each iteration, a factor fi is generated and used to multiply both the divisor DS and the dividend DD.
• Q = (DD×f0×f1×f2 …)/(DS×f0×f1×f2 …)
• fi is chosen so that DS×f0×f1×f2 … converges rapidly toward 1.
• If the denominator converges toward 1, the numerator converges toward Q.
For simplicity, assume DD and DS are positive normalized fractions: DS = 1 − x, where x < 1.
Set f0 = 1 + x
=> DS×f0 = 1 − x^2 (closer to 1 than DS)
=> Q = (DD×(1 + x)) / (1 − x^2)
Set f1 = 1 + x^2
=> DS×f0×f1 = 1 − x^4 (even closer to 1)
=> Q = (DD×(1 + x)×(1 + x^2)) / (1 − x^4)
f0 = 1 + x = 1 + (1 − DS) = 2 − DS (2’s complement of DS)
f1 = 1 + x^2 = 1 + (1 − DS×f0) = 2 − DS×f0 = 2 − DS0
=> fi = 2 − DS×f0×…×f(i−1) = 2 − DS(i−1)
Here DS(i−1) denotes the running product DS×f0×…×f(i−1).
Example (1): 0.4/0.7
(In these tables DD0 and DS0 are the original operands, and each row uses fi = 2 − DSi.)
 i   DDi         DSi         fi
 0   0.4000000   0.7000000   1.3000000
 1   0.5200000   0.9099999   1.0900000
 2   0.5668000   0.9918999   1.0081000
 3   0.5713911   0.9999344   1.0000656
 4   0.5714286   0.9999999   1.0000000
 5   0.5714286   1.0000000   1.0000000
 6   0.5714286   1.0000000
Example (2): 0.7/0.4
 i   DDi         DSi         fi
 0   0.7000000   0.4000000   1.5999999
 1   1.1199999   0.6400000   1.3599999
 2   1.5231999   0.8704000   1.1295999
 3   1.7206066   0.9832038   1.0167962
 4   1.7495062   0.9997178   1.0002821
 5   1.7499998   0.9999999   1.0000001
 6   1.7499999   1.0000000
Example (3): 0.1/0.15
 i   DDi         DSi         fi
 0   0.1000000   0.1500000   1.8499999
 1   0.1850000   0.2775000   1.7224999
 2   0.3186625   0.4779938   1.5220062
 3   0.4850063   0.7275094   1.2724905
 4   0.6171659   0.9257489   1.0742511
 5   0.6629912   0.9944868   1.0055132
 6   0.6666464   0.9999696
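The iteration behind these tables can be sketched in a few lines of Python. The sketch uses double-precision floats, so its digits will differ slightly from the single-precision values shown in the tables; the function name is illustrative.

```python
def divide_by_repeated_multiplication(dd, ds, iterations):
    """Approximate dd/ds for 0 < ds < 2 by driving the divisor toward 1."""
    for _ in range(iterations):
        f = 2.0 - ds         # factor: 2's complement of the current divisor
        dd *= f              # numerator converges toward the quotient
        ds *= f              # denominator converges toward 1
    return dd


for dd, ds in [(0.4, 0.7), (0.7, 0.4), (0.1, 0.15)]:
    print(dd, ds, divide_by_repeated_multiplication(dd, ds, 6))
```

After six iterations the first two cases have converged, while 0.1/0.15 is still off by about 2×10^−5, which illustrates how the number of iterations depends on how far DS starts from 1.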
The number of iterations required is determined by the value of DS.
It is better to use a fixed number of iterations:
• To ensure that the process converges to the correct answer for all data, instead of using 2−DS to calculate f0, use a ROM to find an appropriate value for f0.
• This guarantees correct results after a fixed number of iterations.
Suppose the ROM has 2^8 words.
(a) If DS is 8-bit, one iteration is sufficient
    => f0 = 1/DS
(b) If DS is more than 8 bits, more than one iteration is required: DS•f0 = 1−x and x < 2^−8
• At the 2nd iteration, DS•f0•f1 = 1−x^2;
  the difference from 1 is < 2^−16
• At the ith iteration (i>2), DS•f0•f1•…•f(i−1) = DS(i−1) = 1−x^(2^(i−1));
  the difference from 1 is < (2^−8)^(2^(i−1))
  (3rd iteration error < 2^−32)
  (4th iteration error < 2^−64)
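The error bounds can be checked numerically with a hypothetical ROM model. The sketch assumes DS is normalized to [0.5, 1), that the 2^8-word ROM is addressed by the eight fraction bits after the leading 1, and that each word holds the reciprocal of its interval midpoint; none of these details are specified on the slide.

```python
ROM_BITS = 8  # assumed: 2^8 ROM words


def rom_seed(ds):
    """Hypothetical ROM lookup: 1/midpoint of the interval of [0.5, 1) holding ds."""
    index = int((ds - 0.5) * 2 ** (ROM_BITS + 1))          # 0 .. 255
    midpoint = 0.5 + (index + 0.5) / 2 ** (ROM_BITS + 1)
    return 1.0 / midpoint


def denominator_errors(ds, iterations):
    """Return |1 - DS*f0*...*f(i-1)| after each of the first `iterations` steps."""
    errors = []
    f = rom_seed(ds)
    for _ in range(iterations):
        ds *= f
        errors.append(abs(1.0 - ds))
        f = 2.0 - ds                                       # next factor
    return errors


print(denominator_errors(0.7, 3))
```

Only the first three iterations are examined because the 4th-iteration bound of 2^−64 is below double-precision resolution.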
[Figure: divider built from multipliers. Labels: inputs DD and DS; a ROM producing f0; two 2’s complement units producing f1 and f2; five multipliers (mult); output Q.]
Fig 6.14 Floating-Point Number Format
[Figure: an m-bit word with three fields: Sign s (1 bit), Exponent e (me bits), Fraction f (mf bits).]
1 + me + mf = m, Value(s, e, f) = (−1)^s × f × 2^e
s is the sign, e is the exponent, and f is the significand (mantissa).
We will assume a fraction mantissa, but some representations have used integers.
Floating Point Arithmetic
Floating point addition
The difficulty when adding two floating point numbers stems from the fact that the mantissas, in general, have different significance.
A = B + C
  = MB × rS^EB + MC × rS^EC
Before the two numbers can be properly added together, the mantissas must be aligned.
A = (MB × rS^(EB−EC) + MC) × rS^EC (assume |B| < |C|)
This involves determining which operand value is smaller, and then aligning the mantissa of that operand appropriately with the mantissa of the larger operand.
The alignment is accomplished by shifting the mantissa of the smaller operand to line up with the digits of the same significance in the larger operand.
The amount of the alignment, i.e. the number of positions to shift, is determined by the difference in the exponents.
A block diagram:
[Figure: inputs Exponent B, Exponent C, Mantissa B, Mantissa C; blocks Exponent Compare, Select, Select and align, Add/Subtract, Exponent Adjust, Post Normalization; outputs Result Exponent and Result Mantissa.]
The selection of the appropriate mantissa to be aligned is made based on a comparison of the magnitudes of the two exponents.
The result of the addition/subtraction is provided to the “Post-Normalization” unit.
Examples:
    0.8045    Input A is normalized
  + 0.7133    Input B is normalized
    1.5178    Result is not normalized

    0.8045
  − 0.8032
    0.0013
The post-normalization unit must be capable of shifting the mantissa by one or more positions and of adjusting the exponent to reflect the normalization.
Floating point addition, then, requires many more operations, and hence more hardware, than its integer counterpart.
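The align, add, and post-normalize steps can be sketched in Python using four-digit decimal mantissas, as in the examples. The representation is an assumption made for illustration: a mantissa such as 0.8045 is passed as the integer 8045, the radix is 10, and digits shifted out during alignment are truncated.

```python
DIGITS = 4       # mantissa digits, as in the four-digit examples
RADIX = 10       # assumed radix


def truncate_shift(m, positions):
    """Shift a signed mantissa right, dropping the shifted-out digits."""
    sign = -1 if m < 0 else 1
    return sign * (abs(m) // RADIX ** positions)


def fp_add(mb, eb, mc, ec):
    """Add mb*RADIX**eb and mc*RADIX**ec; returns (mantissa, exponent)."""
    if eb > ec:                                  # exponent compare: (mb, eb) gets
        mb, eb, mc, ec = mc, ec, mb, eb          # the smaller exponent
    mb = truncate_shift(mb, ec - eb)             # align the smaller operand
    m, e = mb + mc, ec
    limit = RADIX ** DIGITS
    if abs(m) >= limit:                          # overflow: shift right one digit
        m, e = truncate_shift(m, 1), e + 1
    while m != 0 and abs(m) < limit // RADIX:    # leading zeros: shift left
        m, e = m * RADIX, e - 1
    return m, e


print(fp_add(8045, 0, 7133, 0))    # (1517, 1)
print(fp_add(8045, 0, -8032, 0))   # (1300, -2)
```

The first call needs a one-digit right shift after the addition, while the second needs a two-digit left shift, which is why the post-normalization unit must handle shifts of more than one position.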
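Floating-point adder of IBM System/360 Model 91
[Figure: three stages between the Input Bus and the Output Bus: exponent comparison and mantissa alignment, mantissa addition-subtraction, and result normalization. Labeled blocks: exponents E1 and E2, mantissas M1 and M2, Adder 1, Adder 2, Adder 3, Shifter 1, Shifter 2, zero digit checker, R, the difference E1−E2, and the results E3 and M3.]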
Design a network to align the smaller mantissa to be added to the larger mantissa.
Assume that the mantissa is 24 bits.
The alignment network must be capable of shifting any number of bits, from 0 to 24 (shift left).
Assume the adders used to compare the exponents provide a binary number (size: 0 to 24; hence 5 bits S4 S3 S2 S1 S0) which indicates how far the number needs to be shifted in the alignment process.
Fig 6.11 An N × N Bit Crossbar Design for Barrel Rotator
[Figure: a decoder driven by the shift count selects the crosspoints that connect the inputs x0 ... x5 to the outputs y0 ... y5 (x: input, y: output).]
Properties of the Crossbar Barrel Shifter
There is a 2-gate delay for any length shift.
Each output line is effectively an n-way multiplexer for shifts of up to n bits.
There are n^2 3-state drivers for an n-bit shifter.
For n = 32, this means 1024 3-state drivers.
For 32 bits, the decoder is 5 bits (1 out of 32).
The minimum delay but large number of gates in the crossbar prompts a compromise: the logarithmic barrel shifter.
Logarithmic Barrel Shifter
[Figure: the input word x0 x1 x2 ... x29 x30 x31 passes through five rows of shift/bypass cells to the output word y0 y1 y2 ... y29 y30 y31. The shift count bits s4 s3 s2 s1 s0 control the rows: bypass/shift 1 bit right, 2 bits right, 4 bits right, 8 bits right, and 16 bits right.]
The LSB of the shift count is used by the first level of MUXs to shift the number by 1 bit (the 1 condition) or to provide no shift at all (the 0 condition).
Similarly, the second LSB controls the 2nd set of MUXs to shift the number by 2 more bits or not to shift.
Finally, the MSB controls the 5th set of MUXs to shift the number by 16 more bits or not to shift.
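The stage-by-stage behavior can be modeled with a short Python sketch: stage k either bypasses the word or shifts it by 2^k positions, depending on bit k of the shift count. The sketch performs a logical right shift with zero fill; the function name and the default 32-bit width are assumptions for illustration.

```python
def log_barrel_shift_right(word, count, width=32):
    """Shift word right by count using one shift/bypass stage per count bit."""
    word &= (1 << width) - 1
    stage = 0
    while (1 << stage) < width:              # 5 stages for a 32-bit word
        if (count >> stage) & 1:             # control bit is 1: shift
            word >>= (1 << stage)            # this stage shifts by 2**stage
        stage += 1                           # control bit is 0: bypass
    return word


print(hex(log_barrel_shift_right(0x80000000, 21)))  # 0x400
```

A shift by 21 = 16 + 4 + 1 activates the 1-bit, 4-bit, and 16-bit stages and bypasses the other two.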
Floating Point Multiplication
A = B × C
  = MB × rS^EB × MC × rS^EC
  = MB × MC × rS^(EB+EC)
[Figure: block diagram with inputs Exponent B, Exponent C, Mantissa B, Mantissa C; blocks Exponent Add, Exponent Adjust, Multiply, Post Normalization; outputs Result Exponent and Result Mantissa.]
The post-normalization unit only needs to shift the result by at most one digit position.
Consider two extreme cases:

Largest × largest
   Base 2      Base 10
   0.1111      0.9999
 × 0.1111    × 0.9999
   0.1110      0.9998    Aligned properly => no post-normalization.

Smallest × smallest
   Base 2      Base 10
   0.1000      0.1000
 × 0.1000    × 0.1000
   0.0100      0.0100    Not aligned properly => post-normalization of one digit position.
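Both extreme cases can be reproduced with a small Python sketch. It assumes non-negative, normalized four-digit mantissas passed as integers (0.1111 in base 2 is 0b1111, 0.9999 in base 10 is 9999) and truncates the low-order product digits; the function name is illustrative.

```python
DIGITS = 4   # mantissa digits, as in the examples


def fp_multiply(mb, eb, mc, ec, radix=10):
    """Multiply normalized mantissas, add exponents, post-normalize if needed."""
    limit = radix ** DIGITS
    product = mb * mc                        # double-length mantissa product
    m, e = product // limit, eb + ec         # keep the top DIGITS digits
    if m != 0 and m < limit // radix:        # leading zero digit
        m = (product * radix) // limit       # shift left one digit position
        e -= 1
    return m, e


print(fp_multiply(9999, 0, 9999, 0))                 # (9998, 0)
print(fp_multiply(1000, 0, 1000, 0))                 # (1000, -1)
print(fp_multiply(0b1111, 0, 0b1111, 0, radix=2))    # (14, 0), i.e. 0.1110
print(fp_multiply(0b1000, 0, 0b1000, 0, radix=2))    # (8, -1), i.e. 0.1000 x 2^-1
```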
Floating Point Division
A = B / C
  = MB × rS^EB / (MC × rS^EC)
  = (MB / MC) × rS^(EB−EC)
[Figure: block diagram with inputs Exponent B, Exponent C, Mantissa B, Mantissa C; blocks Exponent subtract, Exponent Adjust, Divide, Post-Normalization.]
The result of the mantissa division may require post-normalization by at most one digit position, in the opposite direction from multiplication.

Largest / smallest
   Base 2      Base 10
   0.1111      0.9999
 ÷ 0.1000    ÷ 0.1000
   1.1110      9.9990    Not aligned properly => post-normalization is required.

Smallest / largest
   Base 2      Base 10
   0.1000      0.1000
 ÷ 0.1111    ÷ 0.9999
   0.1000      0.1000    Aligned properly => no post-normalization.
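The division cases can be sketched the same way. The sketch again assumes positive, normalized four-digit mantissas passed as integers and truncates the quotient; a quotient of 1 or more is shifted right one digit position and the exponent is incremented. The function name is illustrative.

```python
DIGITS = 4   # mantissa digits, as in the examples


def fp_divide(mb, eb, mc, ec, radix=10):
    """Divide normalized mantissas, subtract exponents, post-normalize if needed."""
    limit = radix ** DIGITS
    m = (mb * limit) // mc       # quotient scaled to DIGITS fraction digits
    e = eb - ec
    if m >= limit:               # quotient is 1 or more: shift right one digit
        m //= radix
        e += 1
    return m, e


print(fp_divide(9999, 0, 1000, 0))                 # (9999, 1)
print(fp_divide(1000, 0, 9999, 0))                 # (1000, 0)
print(fp_divide(0b1111, 0, 0b1000, 0, radix=2))    # (15, 1), i.e. 0.1111 x 2^1
print(fp_divide(0b1000, 0, 0b1111, 0, radix=2))    # (8, 0), i.e. 0.1000
```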