Post on 30-Apr-2018
Lec 13 Systems Architecture 1
Systems Architecture
Lecture 13: Integer Multiplication and Division
Jeremy R. Johnson
Anatole D. Ruslanov
William M. Mongan
Some or all figures from Computer Organization and Design: The Hardware/Software Approach, Third Edition, by David Patterson and John Hennessy, are copyrighted material (COPYRIGHT 2004 MORGAN KAUFMANN PUBLISHERS, INC. ALL RIGHTS RESERVED).
Introduction
• Objective: To provide hardware support for the MIPS integer multiply and divide instructions, and to understand how multiplication and division are implemented in hardware.
• Topics
– Review MIPS ALU design
– Review integer multiplication and division
– MIPS integer multiply and divide instructions
– Multiplication algorithms
– Division algorithms
– Multiply/Divide unit
Support for SLT and Overflow Detection
[Figure: 1-bit ALU cells. (a) Standard cell with inputs a, b, Less, and CarryIn and control lines Binvert and Operation; outputs Result and CarryOut. (b) Most-significant-bit cell, which also produces Set and, through an overflow-detection block, Overflow.]
MIPS ALU
[Figure: 32-bit ALU built from 1-bit cells ALU0 to ALU31. Each cell takes a and b bits, Less, and CarryIn; each CarryOut feeds the next cell's CarryIn. Bnegate and Operation go to every cell, Set from ALU31 feeds Less of ALU0, and the outputs are Result0 to Result31, Zero, and Overflow.]
ALU control lines Function
000 and
001 or
010 add
110 subtract
111 set on less than
[Figure: ALU symbol with inputs a, b, and ALU operation; outputs Result, Zero, Overflow, and CarryOut.]
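The function table above can be read as a behavioural specification. The following is a minimal Python sketch of it, not a gate-level model: the names `alu`, `to_signed`, and `MASK32` are invented for illustration, and set on less than is computed by a direct signed comparison rather than from the Set output of the most significant cell.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def alu(control, a, b):
    """Return (result, zero, overflow) for a 3-bit ALU control value."""
    a &= MASK32
    b &= MASK32
    overflow = False
    if control == 0b000:            # and
        result = a & b
    elif control == 0b001:          # or
        result = a | b
    elif control == 0b010:          # add
        result = (a + b) & MASK32
        overflow = to_signed(a) + to_signed(b) != to_signed(result)
    elif control == 0b110:          # subtract: a + (not b) + 1
        result = (a + (~b & MASK32) + 1) & MASK32
        overflow = to_signed(a) - to_signed(b) != to_signed(result)
    elif control == 0b111:          # set on less than (signed compare)
        result = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError("unused control code")
    return result, result == 0, overflow
```

For example, `alu(0b010, 0x7FFFFFFF, 1)` reports overflow because the true sum does not fit in 32 signed bits.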
MIPS Integer Multiply and Divide
• Hi and Lo registers
– mfhi
– mflo
• Signed and unsigned multiply
– mult
– multu
• Divide instructions
– div
– divu
– quotient is available in Lo and remainder in Hi
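As an illustration of how results land in Hi and Lo, here is a small Python sketch. The function names mirror the instruction mnemonics but the functions themselves are hypothetical helpers, each returning a `(hi, lo)` pair. Signed `div` is left out, and the divisor is assumed to be nonzero.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def multu(rs, rt):
    """Unsigned multiply: 64-bit product split into (Hi, Lo)."""
    product = (rs & MASK32) * (rt & MASK32)
    return (product >> 32) & MASK32, product & MASK32

def mult(rs, rt):
    """Signed multiply: operands are treated as two's complement."""
    product = (to_signed(rs) * to_signed(rt)) & MASK64
    return (product >> 32) & MASK32, product & MASK32

def divu(rs, rt):
    """Unsigned divide: remainder in Hi, quotient in Lo (rt must be nonzero)."""
    quotient, remainder = divmod(rs & MASK32, rt & MASK32)
    return remainder, quotient
```

The same bit pattern gives different Hi values under `mult` and `multu`: `0xFFFFFFFF` is -1 for `mult` but 4294967295 for `multu`. In a program, `mfhi` and `mflo` would then copy the two halves into general-purpose registers.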
Multiplication in Hardware
• More complicated than addition
– accomplished via shifting and addition
• More time and more chip area
• We will look at 3 versions based on a simple algorithm we learned in elementary school:
    0010 (multiplicand)
  x 1011 (multiplier)
• Negative numbers: convert to positive, multiply, then fix the sign of the product
– there are better techniques, but we won't look at them
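The elementary-school method can be sketched in a few lines of Python, using the slide's operands 0010 and 1011. The helper names `partial_products` and `signed_multiply` are made up for this example, and the sign handling is only the simple convert-and-multiply approach mentioned above.

```python
def partial_products(multiplicand, multiplier, width=4):
    """One shifted copy of the multiplicand per 1 bit of the multiplier."""
    return [(multiplicand << i) if (multiplier >> i) & 1 else 0
            for i in range(width)]

def signed_multiply(a, b):
    """Multiply magnitudes, then negate if exactly one operand was negative."""
    negative = (a < 0) != (b < 0)
    magnitude = sum(partial_products(abs(a), abs(b), width=32))
    return -magnitude if negative else magnitude

rows = partial_products(0b0010, 0b1011)
print([format(r, "08b") for r in rows])   # the rows of the paper-and-pencil sum
print(format(sum(rows), "08b"))           # 00010110, i.e. 2 x 11 = 22
```

A 4-bit by 4-bit multiply needs up to 8 product bits, which is why the hardware versions that follow pair 32-bit operands with a 64-bit Product register.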
Multiplication Hardware – Algorithm 1
Datapath
• Multiplicand register: 64 bits, shift left
• Multiplier register: 32 bits, shift right
• Product register: 64 bits, write
• 64-bit ALU
• Control test
Control
Start
1. Test Multiplier0
– Multiplier0 = 1: go to step 1a
– Multiplier0 = 0: go to step 2
1a. Add multiplicand to product and place the result in Product register
2. Shift the Multiplicand register left 1 bit
3. Shift the Multiplier register right 1 bit
32nd repetition?
– No: < 32 repetitions (return to step 1)
– Yes: 32 repetitions (Done)
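The control steps of Algorithm 1 can be traced with a short Python sketch. Registers are modelled as plain integers masked to their widths, and the function name `multiply_v1` is an assumption made for this example.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def multiply_v1(multiplicand, multiplier):
    """Unsigned 32 x 32 -> 64 bit multiply following Algorithm 1."""
    mcand = multiplicand & MASK32    # 64-bit register, value starts in right half
    mplier = multiplier & MASK32     # 32-bit register
    product = 0                      # 64-bit register
    for _ in range(32):
        if mplier & 1:                           # 1.  test Multiplier0
            product = (product + mcand) & MASK64 # 1a. 64-bit add
        mcand = (mcand << 1) & MASK64            # 2.  shift Multiplicand left
        mplier >>= 1                             # 3.  shift Multiplier right
    return product
```

Note that the add is a full 64-bit operation even though only 32 bits of the multiplicand are ever nonzero, which is the inefficiency the next version removes.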
http://www.cs.rpi.edu/~hollingd/comporg.2000/Notes/Mult.PDF
Multiplication Hardware – Algorithm 2
Datapath
• Multiplicand register: 32 bits
• Multiplier register: 32 bits, shift right
• Product register: 64 bits, shift right, write
• 32-bit ALU
• Control test
Control
Start
1. Test Multiplier0
– Multiplier0 = 1: go to step 1a
– Multiplier0 = 0: go to step 2
1a. Add multiplicand to the left half of the product and place the result in the left half of the Product register
2. Shift the Product register right 1 bit
3. Shift the Multiplier register right 1 bit
32nd repetition?
– No: < 32 repetitions (return to step 1)
– Yes: 32 repetitions (Done)
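A Python sketch of Algorithm 2 follows, with the hypothetical name `multiply_v2`. One detail is an assumption about the hardware rather than something the flowchart states: the carry-out of the 32-bit add has to be shifted into the Product register. Here Python's unbounded integers hold that extra bit until the shift.

```python
MASK32 = 0xFFFFFFFF

def multiply_v2(multiplicand, multiplier):
    """Unsigned 32 x 32 -> 64 bit multiply following Algorithm 2."""
    mcand = multiplicand & MASK32    # 32-bit register, never shifted
    mplier = multiplier & MASK32     # 32-bit register
    product = 0                      # 64-bit register
    for _ in range(32):
        if mplier & 1:               # 1.  test Multiplier0
            # 1a. add to the left half; a carry-out becomes bit 64
            product += mcand << 32
        product >>= 1                # 2.  shift Product right (carry moves in)
        mplier >>= 1                 # 3.  shift Multiplier right
    return product
```

Shifting the product right instead of the multiplicand left is what lets the ALU and the Multiplicand register shrink to 32 bits.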
Multiplication Hardware – Algorithm 3
Datapath
• Multiplicand register: 32 bits
• Product register: 64 bits, shift right, write
• 32-bit ALU
• Control test
Note: Multiplier starts in right half of product.
Control
Start
1. Test Product0
– Product0 = 1: go to step 1a
– Product0 = 0: go to step 2
1a. Add multiplicand to the left half of the product and place the result in the left half of the Product register
2. Shift the Product register right 1 bit
32nd repetition?
– No: < 32 repetitions (return to step 1)
– Yes: 32 repetitions (Done)
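Algorithm 3 can be sketched the same way. The function name `multiply_v3` is invented for this example, and as before the carry-out of the 32-bit add is assumed to be shifted into the Product register, which the unbounded Python integer models implicitly.

```python
MASK32 = 0xFFFFFFFF

def multiply_v3(multiplicand, multiplier):
    """Unsigned 32 x 32 -> 64 bit multiply following Algorithm 3."""
    mcand = multiplicand & MASK32
    product = multiplier & MASK32    # multiplier starts in the right half
    for _ in range(32):
        if product & 1:              # 1.  test Product0
            product += mcand << 32   # 1a. add to the left half only
        product >>= 1                # 2.  shift Product right 1 bit
    return product
```

Each right shift discards one multiplier bit that has already been tested and frees one more position for the growing product, so no separate Multiplier register is needed.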
http://www.cs.rpi.edu/~hollingd/comporg.2000/Notes/Mult.PDF
Fast Multiplication Hardware
• Unroll the addition "loop"
• Use 32 32-bit adders
• Each adder produces 32 bits and a carry-out
• The least significant bit of each intermediate sum is a bit of the product.
• The other 31 bits and the carry-out are passed along to the next adder.
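The bullets above can be modelled in software as follows. This is only a sketch: the Python loop stands in for 32 physically separate adders, one stage per iteration, and the names `unrolled_multiply`, `passed_on`, and `low_bits` are illustrative.

```python
MASK32 = 0xFFFFFFFF

def unrolled_multiply(multiplicand, multiplier):
    """Model a chain of 32 adders, one per multiplier bit."""
    multiplicand &= MASK32
    multiplier &= MASK32
    passed_on = 0    # 32 bits handed from one adder to the next
    low_bits = 0     # product bits collected from each stage
    for i in range(32):
        addend = multiplicand if (multiplier >> i) & 1 else 0
        total = passed_on + addend        # 32-bit sum plus carry-out
        low_bits |= (total & 1) << i      # LSB of the sum is product bit i
        passed_on = total >> 1            # other 31 bits and the carry-out
    return (passed_on << 32) | low_bits   # last stage supplies the upper half
```

In hardware there is no clocked loop: the stages are wired in sequence, so the multiply finishes in the time it takes a signal to pass through all the adders.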
22 December 2011, Chapter 3: Arithmetic for Computers
Faster Multiplier
• Uses multiple adders
– Cost/performance tradeoff
• Can be pipelined
– Several multiplications performed in parallel
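One way to arrange multiple adders is as a tree that sums the partial products pairwise. The slide does not spell out the organization, so the tree shape here is an assumption, as are the names `tree_multiply` and `levels`.

```python
MASK32 = 0xFFFFFFFF

def tree_multiply(multiplicand, multiplier):
    """Sum 32 partial products pairwise; return (product, adder levels)."""
    multiplicand &= MASK32
    multiplier &= MASK32
    # One partial product per multiplier bit, already shifted into place.
    terms = [(multiplicand << i) if (multiplier >> i) & 1 else 0
             for i in range(32)]
    levels = 0
    while len(terms) > 1:
        # Every addition within a level is independent of the others,
        # so hardware could perform them all at the same time.
        terms = [terms[j] + terms[j + 1] for j in range(0, len(terms), 2)]
        levels += 1
    return terms[0], levels
```

With 32 partial products the tree is 5 levels deep, compared with 32 adder delays for a linear chain. The cost is more adders, which is the tradeoff the slide mentions, and placing registers between levels is one way such a design could be pipelined.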
Division Hardware – Algorithm 1
Datapath
• Divisor register: 64 bits, shift right
• Quotient register: 32 bits, shift left
• Remainder register: 64 bits, write
• 64-bit ALU
• Control test
Control
Start
1. Subtract the Divisor register from the Remainder register and place the result in the Remainder register
Test Remainder
– Remainder ≥ 0: go to step 2a
– Remainder < 0: go to step 2b
2a. Shift the Quotient register to the left, setting the new rightmost bit to 1
2b. Restore the original value by adding the Divisor register to the Remainder register and place the sum in the Remainder register. Also shift the Quotient register to the left, setting the new least significant bit to 0
3. Shift the Divisor register right 1 bit
33rd repetition?
– No: < 33 repetitions (return to step 1)
– Yes: 33 repetitions (Done)
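The division flowchart translates into Python as sketched below. Two things are assumptions not shown on the slide: the divisor starts in the left half of the 64-bit Divisor register with the dividend in the Remainder register, and a zero divisor is rejected up front. The name `divide_v1` is illustrative.

```python
MASK32 = 0xFFFFFFFF

def divide_v1(dividend, divisor):
    """Unsigned 32-bit restoring division; returns (quotient, remainder)."""
    if divisor & MASK32 == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    remainder = dividend & MASK32        # 64-bit register holding the dividend
    div = (divisor & MASK32) << 32       # divisor starts in the left half
    quotient = 0                         # 32-bit register
    for _ in range(33):
        remainder -= div                 # 1.  subtract Divisor from Remainder
        if remainder >= 0:
            quotient = (quotient << 1) | 1   # 2a. shift in a 1
        else:
            remainder += div                 # 2b. restore, then shift in a 0
            quotient = quotient << 1
        div >>= 1                        # 3.  shift Divisor right
    return quotient & MASK32, remainder
```

The loop runs 33 times rather than 32 because the first subtraction, with the divisor entirely in the left half, can never succeed; it always contributes a leading 0 that is shifted out of the 32-bit quotient.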
Division Hardware – Algorithm 3
Datapath
• Divisor register: 32 bits
• Remainder register: 64 bits, shift left, shift right, write
• 32-bit ALU
• Control test
Control
Start
1. Shift the Remainder register left 1 bit
2. Subtract the Divisor register from the left half of the Remainder register and place the result in the left half of the Remainder register
Test Remainder
– Remainder ≥ 0: go to step 3a
– Remainder < 0: go to step 3b
3a. Shift the Remainder register to the left, setting the new rightmost bit to 1
3b. Restore the original value by adding the Divisor register to the left half of the Remainder register and place the sum in the left half of the Remainder register. Also shift the Remainder register to the left, setting the new rightmost bit to 0
32nd repetition?
– No: < 32 repetitions (return to step 2)
– Yes: 32 repetitions (Done. Shift left half of Remainder right 1 bit)
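A Python sketch of this version is given below, under the assumption that the dividend starts in the right half of the Remainder register. The name `divide_v3` is hypothetical. Python integers are unbounded, so the extra bit that the left half can briefly need after a shift is kept automatically; real hardware has to account for it.

```python
MASK32 = 0xFFFFFFFF

def divide_v3(dividend, divisor):
    """Unsigned 32-bit restoring division; returns (quotient, remainder)."""
    divisor &= MASK32
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    rem = (dividend & MASK32) << 1       # 1.  shift Remainder left 1 bit
    for _ in range(32):
        left = (rem >> 32) - divisor     # 2.  subtract from the left half
        right = rem & MASK32
        if left >= 0:
            # 3a. keep the difference, shift left, new rightmost bit is 1
            rem = (((left << 32) | right) << 1) | 1
        else:
            # 3b. restore (keep the old value), shift left, new bit is 0
            rem = rem << 1
    quotient = rem & MASK32              # right half holds the quotient
    remainder = (rem >> 32) >> 1         # undo the extra shift of the left half
    return quotient, remainder
```

The Remainder register plays three roles here: it starts as the dividend, accumulates quotient bits in its right half, and ends with the remainder in its left half.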
http://www.cs.rpi.edu/~hollingd/comporg.2000/Notes/Mult.PDF