Lecture 8: Sequential Multipliers - George Mason...


ECE 645 – Computer Arithmetic

Lecture 8: Sequential Multipliers


3/25/08

2

Lecture Roadmap

• Sequential Multipliers

• Unsigned

• Signed

• Radix-2 Booth Recoding

• High-Radix Multiplication Principles

• Radix-4

• Radix-4 Booth Recoding

• High-Radix Sequential Multipliers

• Using Carry-Save Adders

• Serial Multipliers

• Modular Multiplication

3

Required Reading

• B. Parhami, Computer Arithmetic: Algorithms and Hardware Design

• Chapter 9, Basic Multiplication Schemes

• Chapter 10, High-Radix Multipliers

• Chapter 12, Variations in Multipliers

• Note errata at: http://www.ece.ucsb.edu/~parhami/text_comp_arit.htm#errors

Sequential Multipliers


5

Multiplication Architectures

SEQUENTIAL MULTIPLIERS

Area

6

Notation

a   Multiplicand   $a_{k-1} a_{k-2} \ldots a_1 a_0$

x   Multiplier   $x_{k-1} x_{k-2} \ldots x_1 x_0$

p   Product (a ⋅ x)   $p_{2k-1} p_{2k-2} \ldots p_2 p_1 p_0$

If the multiplicand and multiplier have different sizes, the multiplier is usually the smaller one.

7

Multiplication of Two 4-bit Unsigned

Binary Numbers in Dot Notation

Partial Product 0

Partial Product 1

Partial Product 2

Partial Product 3

Number of partial products = number of bits in multiplier x

Bit-width of each partial product = bit-width of multiplicand a

8

Basic Multiplication Equations

$$x = \sum_{i=0}^{k-1} x_i \cdot 2^i$$

$$p = a \cdot x = \sum_{i=0}^{k-1} a \cdot x_i \cdot 2^i = x_0 a 2^0 + x_1 a 2^1 + x_2 a 2^2 + \ldots + x_{k-1} a 2^{k-1}$$

9

Sequential Shift-and-Add Multipliers: Right-Shift Algorithm (Unsigned)

$$p = a \cdot x = x_0 a 2^0 + x_1 a 2^1 + x_2 a 2^2 + \ldots + x_{k-1} a 2^{k-1}$$

$$= (\ldots((0 + x_0 a 2^k)/2 + x_1 a 2^k)/2 + \ldots + x_{k-1} a 2^k)/2 \quad (k \text{ times})$$

$$p^{(0)} = 0$$

$$p^{(j+1)} = (p^{(j)} + x_j a 2^k)/2, \quad j = 0, \ldots, k-1$$

$$p = p^{(k)}$$
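The right-shift recurrence p(j+1) = (p(j) + x_j a 2^k) / 2 can be traced in software. The following is only a minimal Python sketch, assuming unsigned k-bit operands; the function name `right_shift_multiply` is made up for illustration and does not come from the slides.

```python
def right_shift_multiply(a: int, x: int, k: int) -> int:
    """Unsigned multiply using p(j+1) = (p(j) + x_j * a * 2^k) / 2."""
    p = 0                                  # p(0) = 0
    for j in range(k):
        x_j = (x >> j) & 1                 # multiplier bits, LSB first
        # Add the multiplicand aligned to the upper half, then shift right.
        # The sum is always even here, so the shift loses no information.
        p = (p + x_j * (a << k)) >> 1
    return p                               # p = p(k)


if __name__ == "__main__":
    print(right_shift_multiply(10, 11, 4))
```

In hardware the shifted-out low bit is kept in the lower half of the product register; the sketch keeps everything in one Python integer instead.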

10

Sequential Shift-and-Add Multipliers: Right-Shift

Algorithm

11

Right-shift multiplication algorithm: Example

12

Area Optimization for the Sequential Shift-and-Add

Multiplier with the Right-shift Algorithm

13

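Sequential Shift-and-Add Multipliers: Left-Shift Algorithm (Unsigned)

$$p = a \cdot x = x_0 a 2^0 + x_1 a 2^1 + x_2 a 2^2 + \ldots + x_{k-1} a 2^{k-1}$$

$$= (\ldots((0 \cdot 2 + x_{k-1} a) \cdot 2 + x_{k-2} a) \cdot 2 + \ldots + x_1 a) \cdot 2 + x_0 a \quad (k \text{ times})$$

$$p^{(0)} = 0$$

$$p^{(j+1)} = p^{(j)} \cdot 2 + x_{k-1-j}\, a, \quad j = 0, \ldots, k-1$$

$$p = p^{(k)}$$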
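The left-shift recurrence p(j+1) = 2 p(j) + x_{k-1-j} a consumes the multiplier from its most significant bit. A minimal Python sketch under the same unsigned k-bit assumption follows; `left_shift_multiply` is a name chosen for this example.

```python
def left_shift_multiply(a: int, x: int, k: int) -> int:
    """Unsigned multiply using p(j+1) = 2 * p(j) + x_{k-1-j} * a."""
    p = 0                                  # p(0) = 0
    for j in range(k):
        bit = (x >> (k - 1 - j)) & 1       # multiplier bits, MSB first
        p = (p << 1) + bit * a             # shift left, then add
    return p                               # p = p(k)


if __name__ == "__main__":
    print(left_shift_multiply(10, 11, 4))
```

Because the partial product is shifted before each addition, the addition spans the full 2k-bit width, which is one way to see why this version needs a wider adder than the right-shift version.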

14

Sequential Shift-and-Add Multipliers: Left-Shift

Algorithm

The left-shift algorithm needs more hardware than the right-shift algorithm, so right-shift is preferred

15

Left-shift multiplication algorithm: Example

16

Sequential Shift-and-Add Multipliers: Right-Shift Algorithm for Multiply-Add

$$p^{(0)} = y 2^k$$

$$p^{(j+1)} = (p^{(j)} + x_j a 2^k)/2, \quad j = 0, \ldots, k-1$$

$$p = p^{(k)}$$

$$= (\ldots((y 2^k + x_0 a 2^k)/2 + x_1 a 2^k)/2 + \ldots + x_{k-1} a 2^k)/2 \quad (k \text{ times})$$

$$= y + x_0 a 2^0 + x_1 a 2^1 + x_2 a 2^2 + \ldots + x_{k-1} a 2^{k-1} = y + a \cdot x$$
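For the right-shift multiply-add, only the initial value changes: p(0) = y 2^k instead of 0. The Python sketch below illustrates this under the assumption of unsigned operands; `right_shift_multiply_add` is a hypothetical name.

```python
def right_shift_multiply_add(a: int, x: int, y: int, k: int) -> int:
    """Compute y + a*x with the right-shift recurrence, starting from p(0) = y * 2^k."""
    p = y << k                             # p(0) = y * 2^k
    for j in range(k):
        x_j = (x >> j) & 1
        p = (p + x_j * (a << k)) >> 1      # same step as plain multiplication
    return p                               # after k halvings, y * 2^k has become y


if __name__ == "__main__":
    print(right_shift_multiply_add(10, 11, 7, 4))
```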

17

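Sequential Shift-and-Add Multipliers: Left-Shift Algorithm for Multiply-Add

$$p^{(0)} = y 2^{-k}$$

$$p^{(j+1)} = p^{(j)} \cdot 2 + x_{k-(j+1)}\, a, \quad j = 0, \ldots, k-1$$

$$p = p^{(k)}$$

$$= (\ldots((y 2^{-k} \cdot 2 + x_{k-1} a) \cdot 2 + x_{k-2} a) \cdot 2 + \ldots + x_1 a) \cdot 2 + x_0 a \quad (k \text{ times})$$

$$= y + x_{k-1} a 2^{k-1} + x_{k-2} a 2^{k-2} + \ldots + x_1 a 2^1 + x_0 a = y + a \cdot x$$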
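In the left-shift multiply-add the initial value y 2^{-k} is a fraction, and the k doublings bring it back to y. The sketch below uses Python's `fractions.Fraction` to keep that value exact; this is a modelling choice for the example, not something the slides prescribe, and `left_shift_multiply_add` is a made-up name.

```python
from fractions import Fraction


def left_shift_multiply_add(a: int, x: int, y: int, k: int) -> Fraction:
    """Compute y + a*x with the left-shift recurrence, starting from p(0) = y * 2^-k."""
    p = Fraction(y, 2 ** k)                # p(0) = y * 2^(-k), held exactly
    for j in range(k):
        bit = (x >> (k - (j + 1))) & 1     # multiplier bits, MSB first
        p = p * 2 + bit * a
    return p                               # denominator is 1 after k doublings


if __name__ == "__main__":
    print(left_shift_multiply_add(10, 11, 7, 4))
```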

18

Signed Multiplication

• Previous sequential multipliers are for unsigned multiplication

• For signed multiplication:

• Right-shift sequential algorithms (shift-add) work directly if the 2's complement multiplier is POSITIVE

• Also assume sign-extended operation for $p^{(i)} + x_i a$

• If the 2's complement multiplier is NEGATIVE, then one must use the "negative weight" representation and subtract $x_{k-1} a$ instead of adding it in the last cycle

• Also assume sign-extended operation for $p^{(i)} + x_i a$

• Slight increase in area due to control and one-bit sign extension on inputs of adder

• Unsigned: k-bit number + k-bit number → (k+1)-bit number

• Signed: (k+1)-bit sign-extended number + (k+1)-bit sign-extended number → (k+1)-bit number
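The signed rule (add in every cycle except the last, where x_{k-1} a is subtracted because the sign bit has negative weight) can be checked with a short Python sketch. It assumes a and x are ordinary Python integers in the k-bit two's complement range; the function name is illustrative only.

```python
def signed_right_shift_multiply(a: int, x: int, k: int) -> int:
    """Two's complement multiply; a and x must lie in [-2^(k-1), 2^(k-1))."""
    x_bits = x & ((1 << k) - 1)            # two's complement bit pattern of x
    p = 0
    for j in range(k):
        x_j = (x_bits >> j) & 1
        term = x_j * a * (1 << k)
        if j == k - 1:
            p = (p - term) >> 1            # last cycle: sign bit has weight -2^(k-1)
        else:
            p = (p + term) >> 1            # Python's >> on ints is an arithmetic shift
    return p


if __name__ == "__main__":
    print(signed_right_shift_multiply(-6, -5, 4))
```

Python integers are unbounded, so the sign extension that the hardware adder needs happens implicitly here.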

19

Sequential multiplication of 2's-complement numbers with right shifts (positive multiplier)

20

Sequential multiplication of 2's-complement numbers with right shifts (negative multiplier)

21

Sequential Signed Multiplication with Left-Shifts

Left shifts are not as efficient for two's complement because the multiplicand must be sign-extended by k bits

22

Sequential Shift-and-Add Multiplier

with a Carry Save Adder

Radix-2 Booth Recoding


24

Radix-2 Booth Recoding

• Can be used to recode unsigned multipliers or signed (two's complement) multipliers

• Can reduce average number of additions required

• Not normally used in practice due to variable delay, but it helps in understanding radix-4 Booth recoding, which is often used in practice

25

Radix-2 Booth Recoding

[Table: radix-2 Booth recoding of x (i.e. multiplier)]

26

Radix-2 Booth Recoding

Unsigned multiplication requires the leading digit in parentheses. Two's complement multiplication does not.
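Radix-2 Booth recoding replaces each multiplier bit by the digit y_i = x_{i-1} - x_i with x_{-1} = 0, so every y_i is in {-1, 0, 1}. The Python sketch below follows that rule and appends the extra leading digit only for unsigned operands; the function names and the digit order (least significant first) are choices made for this example.

```python
def booth_radix2(x: int, k: int, signed: bool) -> list:
    """Radix-2 Booth digits of the k-bit pattern x, least significant digit first."""
    digits = []
    prev = 0                               # x_{-1} = 0
    for i in range(k):
        bit = (x >> i) & 1
        digits.append(prev - bit)          # y_i = x_{i-1} - x_i
        prev = bit
    if not signed:
        digits.append(prev)                # leading digit y_k = x_{k-1} (x_k taken as 0)
    return digits


def digits_value(digits: list, radix: int = 2) -> int:
    """Value of a signed-digit string given least significant digit first."""
    return sum(d * radix ** i for i, d in enumerate(digits))


if __name__ == "__main__":
    y = booth_radix2(0b0110, 4, signed=True)
    print(list(reversed(y)), digits_value(y))
```

Without the leading digit, the recoded string evaluates to the two's complement value of the bit pattern, which is why signed operands do not need it.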

27

Sequential multiplication of 2's-complement numbers with right shifts using Booth's recoding

High-Radix Multipliers


29

High-Radix Notation

a   Multiplicand   $(a_{k-1} a_{k-2} \ldots a_1 a_0)_r$

x   Multiplier   $(x_{k-1} x_{k-2} \ldots x_1 x_0)_r$

p   Product (a ⋅ x)   $(p_{2k-1} p_{2k-2} \ldots p_2 p_1 p_0)_r$

30

Radix-4, or Two-Bit-at-a-Time,

Multiplication in Dot Notation

31

Basic Multiplication Equations

$$x = \sum_{i=0}^{k-1} x_i \cdot r^i$$

$$p = a \cdot x = \sum_{i=0}^{k-1} a \cdot x_i \cdot r^i = x_0 a r^0 + x_1 a r^1 + x_2 a r^2 + \ldots + x_{k-1} a r^{k-1}$$

32

High-Radix Shift/Add Algorithms: Right-Shift High-Radix Algorithm

$$p = a \cdot x = x_0 a r^0 + x_1 a r^1 + x_2 a r^2 + \ldots + x_{k-1} a r^{k-1}$$

$$= (\ldots((0 + x_0 a r^k)/r + x_1 a r^k)/r + \ldots + x_{k-1} a r^k)/r \quad (k \text{ times})$$

$$p^{(0)} = 0$$

$$p^{(j+1)} = (p^{(j)} + x_j a r^k)/r, \quad j = 0, \ldots, k-1$$

$$p = p^{(k)}$$
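The radix-r right-shift recurrence handles one radix-r digit of the multiplier per step, so k now counts digits rather than bits. A minimal Python sketch, assuming unsigned operands with k digits in radix r (the name `high_radix_right_shift_multiply` is hypothetical):

```python
def high_radix_right_shift_multiply(a: int, x: int, k: int, r: int = 4) -> int:
    """Unsigned multiply using p(j+1) = (p(j) + x_j * a * r^k) / r, x having k radix-r digits."""
    p = 0
    scale = r ** k
    for j in range(k):
        x_j = (x // r ** j) % r            # j-th radix-r digit, least significant first
        p = (p + x_j * a * scale) // r     # the sum is a multiple of r, so // is exact
    return p


if __name__ == "__main__":
    # Three radix-4 digits cover a 6-bit multiplier in three steps instead of six.
    print(high_radix_right_shift_multiply(45, 27, 3, r=4))
```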

33

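High-Radix Shift/Add Algorithms: Left-Shift High-Radix Algorithm

$$p = a \cdot x = x_0 a r^0 + x_1 a r^1 + x_2 a r^2 + \ldots + x_{k-1} a r^{k-1}$$

$$= (\ldots((0 \cdot r + x_{k-1} a) \cdot r + x_{k-2} a) \cdot r + \ldots + x_1 a) \cdot r + x_0 a \quad (k \text{ times})$$

$$p^{(0)} = 0$$

$$p^{(j+1)} = p^{(j)} \cdot r + x_{k-1-j}\, a, \quad j = 0, \ldots, k-1$$

$$p = p^{(k)}$$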
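When r is a power of two, r = 2^b, the left-shift recurrence reduces to b-bit shifts and masks. The sketch below assumes that case and unsigned operands; `left_shift_multiply_pow2_radix` and the parameter `b` are names introduced for this example.

```python
def left_shift_multiply_pow2_radix(a: int, x: int, k: int, b: int = 2) -> int:
    """Unsigned multiply using p(j+1) = p(j) * r + x_{k-1-j} * a with r = 2^b."""
    mask = (1 << b) - 1
    p = 0
    for j in range(k):
        digit = (x >> (b * (k - 1 - j))) & mask   # radix-2^b digits, most significant first
        p = (p << b) + digit * a                  # multiply by r is a b-bit left shift
    return p


if __name__ == "__main__":
    print(left_shift_multiply_pow2_radix(45, 27, 3, b=2))
```

The product `digit * a` stands in for the multiple-selection step; in hardware it would be a choice among precomputed multiples of a.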

34

The multiple generation part of a radix-4

multiplier with precomputation of 3a

35

Example of Radix-4 Multiplication

using the 3a multiple (unsigned)

36

The multiple generation part of a radix-4 multiplier

Based on replacing 3a with 4a (carry into the next higher radix-4 multiplier digit) and -a
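Replacing 3a with 4a - a means a radix-4 digit of 3 is rewritten as -1 with a carry of 1 into the next higher digit, so only the multiples 0, a, 2a and -a are needed. The Python sketch below models that digit rewriting; it assumes an unsigned multiplier with an even bit count k, and the function names are illustrative.

```python
def radix4_digits_without_3(x: int, k: int) -> list:
    """Rewrite the radix-4 digits of x into {-1, 0, 1, 2}, least significant first."""
    digits = []
    carry = 0
    for i in range(0, k, 2):
        t = ((x >> i) & 3) + carry         # two multiplier bits plus incoming carry
        if t >= 3:
            digits.append(t - 4)           # 3 -> -1 and 4 -> 0, with a carry out
            carry = 1
        else:
            digits.append(t)
            carry = 0
    digits.append(carry)                   # a final carry becomes one extra digit
    return digits


def multiply_without_3a(a: int, x: int, k: int) -> int:
    """Accumulate the product using only the multiples 0, a, 2a and -a."""
    multiples = {-1: -a, 0: 0, 1: a, 2: a << 1}
    return sum(multiples[d] << (2 * i)
               for i, d in enumerate(radix4_digits_without_3(x, k)))


if __name__ == "__main__":
    print(radix4_digits_without_3(0b1110, 4), multiply_without_3a(5, 0b1110, 4))
```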

37

Higher Radix Multiplication

• In radix-8, one must precompute 3a, 5a, 7a

• Overhead becomes prohibitive and does not help

• However, when we discuss CSA this may be useful

38

Radix-4 Booth Recoding

• Typically used for two's complement multiplication, but can also be used for unsigned multiplication

• Radix-4 Booth recoding is also called "modified" Booth recoding

• Goal is to reduce the number of partial products (see next slide)

• Increases the complexity of the "multiple-forming circuits"

• Formerly these were AND gates in normal tree multiplication

• Reduces the number of partial products by approximately half

• Formerly, a k-bit two's complement multiplier implied k partial products

• Now a k-bit two's complement multiplier is recoded into ceil(k/2) digits, so there are ceil(k/2) partial products

39

Full Tree Architecture

Designs are distinguished by variations in three elements:

1. Multiple-forming circuits

2. Partial-products reduction tree (multi-operand addition tree)

3. Redundant-to-binary converter

[Figure: the multiplier and the multiplicand a feed the multiple-forming circuits; the partial-products reduction tree produces a redundant result; the redundant-to-binary converter produces the higher-order product bits. Some lower-order product bits are generated directly.]

40

Radix-2 and Radix-4 Booth Recoding

(1) -1 0 1 0 0 -1 1 0 -1 1 -1 1 0 0 -1 0   Recoded radix-2 version y

41

1 -1

42

Radix-4 Booth Recoding Examples: Unsigned, $x_{k-1} = 0$

Always assume $x_{-1} = 0$. If unsigned and $x_k$ is needed, assume $x_k = 0$.

• k = 12 bits, $x_{k-1} = 0$ → 6 digits

Multiplier bits: 0 1 1 0 1 0 1 0 1 0 1 0

Recoded radix-4 digits (most significant first): 2 -1 -1 -1 -1 -2

Unsigned: $(011010101010)_2 = (1706)_{10} = 2 \cdot 4^5 + (-1) \cdot 4^4 + (-1) \cdot 4^3 + (-1) \cdot 4^2 + (-1) \cdot 4^1 + (-2) \cdot 4^0$

• k = 11 bits, $x_{k-1} = 0$ → 6 digits

Multiplier bits: 0 1 0 1 0 1 0 1 0 1 0

Recoded radix-4 digits (most significant first): 1 -1 -1 -1 -1 -2

Unsigned: $(01010101010)_2 = (682)_{10} = 1 \cdot 4^5 + (-1) \cdot 4^4 + (-1) \cdot 4^3 + (-1) \cdot 4^2 + (-1) \cdot 4^1 + (-2) \cdot 4^0$

Unsigned with $x_{k-1} = 0$ requires ceil(k/2) digits, i.e. partial products!

43

Radix-4 Booth Recoding Examples: Unsigned, $x_{k-1} = 1$

Always assume $x_{-1} = 0$. If unsigned and $x_k$ is needed, assume $x_k = 0$.

If unsigned, $x_{k-1} = 1$, and k is even, then an additional digit is needed; assume $x_{k+1} = x_k = 0$.

• k = 12 bits, $x_{k-1} = 1$ → 7 digits

Multiplier bits: 1 1 1 0 1 0 1 0 1 0 1 0

Recoded radix-4 digits (most significant first): 1 0 -1 -1 -1 -1 -2

Unsigned: $(111010101010)_2 = (3754)_{10} = 1 \cdot 4^6 + 0 \cdot 4^5 + (-1) \cdot 4^4 + (-1) \cdot 4^3 + (-1) \cdot 4^2 + (-1) \cdot 4^1 + (-2) \cdot 4^0$

• k = 11 bits, $x_{k-1} = 1$ → 6 digits

Multiplier bits: 1 1 0 1 0 1 0 1 0 1 0

Recoded radix-4 digits (most significant first): 2 -1 -1 -1 -1 -2

Unsigned: $(11010101010)_2 = (1706)_{10} = 2 \cdot 4^5 + (-1) \cdot 4^4 + (-1) \cdot 4^3 + (-1) \cdot 4^2 + (-1) \cdot 4^1 + (-2) \cdot 4^0$

Unsigned with $x_{k-1} = 1$ requires ceil((k+1)/2) digits, i.e. partial products!

44

Radix-4 Booth Recoding Examples: Signed, $x_{k-1} = 0$

Always assume $x_{-1} = 0$. If signed and $x_k$ is needed, sign extend so that $x_k = x_{k-1}$. In this case, $x_k = x_{k-1} = 0$.

• k = 12 bits, $x_{k-1} = 0$ → 6 digits

Multiplier bits: 0 1 1 0 1 0 1 0 1 0 1 0

Recoded radix-4 digits (most significant first): 2 -1 -1 -1 -1 -2

Signed: $(011010101010)_2 = (1706)_{10} = 2 \cdot 4^5 + (-1) \cdot 4^4 + (-1) \cdot 4^3 + (-1) \cdot 4^2 + (-1) \cdot 4^1 + (-2) \cdot 4^0$

• k = 11 bits, $x_{k-1} = 0$ → 6 digits

Multiplier bits: 0 1 0 1 0 1 0 1 0 1 0

Recoded radix-4 digits (most significant first): 1 -1 -1 -1 -1 -2

Signed: $(01010101010)_2 = (682)_{10} = 1 \cdot 4^5 + (-1) \cdot 4^4 + (-1) \cdot 4^3 + (-1) \cdot 4^2 + (-1) \cdot 4^1 + (-2) \cdot 4^0$

Signed with $x_{k-1} = 0$ requires ceil(k/2) digits, i.e. partial products!

45

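Radix-4 Booth Recoding Examples: Signed, $x_{k-1} = 1$

Always assume $x_{-1} = 0$. If signed and $x_k$ is needed, sign extend so that $x_k = x_{k-1}$. In this case, $x_k = x_{k-1} = 1$.

• k = 12 bits, $x_{k-1} = 1$ → 6 digits

Multiplier bits: 1 1 1 0 1 0 1 0 1 0 1 0

Recoded radix-4 digits (most significant first): 0 -1 -1 -1 -1 -2

Signed: $(111010101010)_2 = (-342)_{10} = 0 \cdot 4^5 + (-1) \cdot 4^4 + (-1) \cdot 4^3 + (-1) \cdot 4^2 + (-1) \cdot 4^1 + (-2) \cdot 4^0$

• k = 11 bits, $x_{k-1} = 1$ → 6 digits

Multiplier bits: 1 1 0 1 0 1 0 1 0 1 0

Recoded radix-4 digits (most significant first): 0 -1 -1 -1 -1 -2

Signed: $(11010101010)_2 = (-342)_{10} = 0 \cdot 4^5 + (-1) \cdot 4^4 + (-1) \cdot 4^3 + (-1) \cdot 4^2 + (-1) \cdot 4^1 + (-2) \cdot 4^0$

Signed with $x_{k-1} = 1$ requires ceil(k/2) digits, i.e. partial products!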

46

Radix-4 Unsigned and Signed Summary

• Unsigned

• If $x_{k-1} = 0$, requires ceil(k/2) digits

• If $x_{k-1} = 1$, requires ceil((k+1)/2) digits

• Or you can always add a '0' to the MSB of both the multiplicand and multiplier, treat the multiplication as signed, then remove the two '0' MSBs of the output

• Signed

• Requires ceil(k/2) digits
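The digit counts above can be checked with a small radix-4 Booth recoder. The sketch below uses the standard digit rule z_i = -2 x_{2i+1} + x_{2i} + x_{2i-1}, with x_{-1} = 0 and the bits above x_{k-1} filled by 0 (unsigned) or by the sign bit (signed). The function name and the least-significant-first digit order are choices made for this example.

```python
def booth_radix4(x: int, k: int, signed: bool) -> list:
    """Radix-4 Booth digits in {-2..2} of the k-bit pattern x, least significant first."""
    msb = (x >> (k - 1)) & 1
    if signed:
        n = -(-k // 2)                     # ceil(k/2) digits
        ext = msb                          # sign extension: x_k = x_{k-1}
    else:
        n = -(-(k + msb) // 2)             # ceil(k/2), or ceil((k+1)/2) if x_{k-1} = 1
        ext = 0                            # unsigned: bits above x_{k-1} are 0

    def bit(i: int) -> int:
        if i < 0:
            return 0                       # x_{-1} = 0
        if i >= k:
            return ext
        return (x >> i) & 1

    return [-2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1) for i in range(n)]


if __name__ == "__main__":
    z = booth_radix4(0b011010101010, 12, signed=False)
    print(list(reversed(z)), sum(d * 4 ** i for i, d in enumerate(z)))
```

Run on the 12-bit pattern 011010101010, it yields the six digits 2 -1 -1 -1 -1 -2 (most significant first), which evaluate to 1706.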

47

Radix-4 Booth Multiplication (Two's Complement):

Sequential Right-Shift

48

Booth Recoding and Multiple Selection Logic

for High-Radix Multiplication (Sequential Multiplier)

49

Full Tree Architecture

Designs are distinguished by variations in three elements:

1. Multiple-forming circuits (now these are more complicated than AND gates, but there are fewer partial products)

2. Partial-products reduction tree (multi-operand addition tree)

3. Redundant-to-binary converter

[Figure: the multiplier and the multiplicand a feed the multiple-forming circuits; the partial-products reduction tree produces a redundant result; the redundant-to-binary converter produces the higher-order product bits. Some lower-order product bits are generated directly.]

50

PARTIAL PRODUCT

Booth Recoding and Multiple Selection Logic

for High-Radix Multiplication (Tree Adder)

FOR MORE INFO: http://www.geoffknagge.com/fyp/booth.shtml

High-Radix Sequential Multipliers


52

Multiplication Architectures

HIGH-RADIX SEQUENTIAL MULTIPLIERS

Area

53

Sequential Shift-and-Add Multiplier

with a Carry Save Adder

54

Radix-4 multiplication with a carry-save adder used to combine the cumulative partial product, $x_i a$, and $2 x_{i+1} a$ into two numbers
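A carry-save adder reduces three operands to a sum word and a carry word without propagating carries. The Python sketch below models one such 3-to-2 step per radix-4 iteration, followed by an ordinary addition that stands in for the carry-propagate adder. It assumes unsigned operands and an even bit count k; the function names are made up for the example.

```python
def carry_save_add(u: int, v: int, w: int) -> tuple:
    """3-to-2 reduction: returns (sum_word, carry_word) with u + v + w == sum + carry."""
    sum_word = u ^ v ^ w                             # bitwise sum without carries
    carry_word = ((u & v) | (u & w) | (v & w)) << 1  # majority bits, moved up one place
    return sum_word, carry_word


def radix4_csa_multiply(a: int, x: int, k: int) -> int:
    """Unsigned multiply, two multiplier bits per step, k assumed even."""
    p = 0
    for i in range(0, k, 2):
        x_i = (x >> i) & 1
        x_i1 = (x >> (i + 1)) & 1
        # Combine the cumulative partial product with x_i * a and 2 * x_{i+1} * a.
        s, c = carry_save_add(p, (x_i * a) << k, (2 * x_i1 * a) << k)
        p = (s + c) >> 2                   # carry-propagate add, then radix-4 right shift
    return p


if __name__ == "__main__":
    print(radix4_csa_multiply(45, 27, 6))
```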

55

Radix-4 multiplication with a carry-save adder

and Radix-4 Booth-recoding

56

Radix-4 multiplier with two carry-save adders

57

Radix-16 multiplier with carry-save adders

Bit-Serial Multipliers


59

Bit-Serial Multipliers

• small area

• reduced pin count

• reduced wire length

• high clock rate

The main disadvantage is that they are slow

60

Systolic Array

• Systolic array: synchronous arrays of processing elements that are interconnected by only short, local wires thus allowing very high clock rates

61

Semisystolic Bit-Serial Multiplier (1)

62

Semisystolic Bit-Serial Multiplier (2)

Terms formed in each cycle       Output bit

a3x0   a2x0   a1x0   a0x0        p0

a3x1   a2x1   a1x1   a0x1        p1

a3x2   a2x2   a1x2   a0x2        p2

a3x3   a2x3   a1x3   a0x3        p3

a3·0   a2·0   a1·0   a0·0        p4

a3·0   a2·0   a1·0   a0·0        p5

a3·0   a2·0   a1·0   a0·0        p6

a3·0   a2·0   a1·0   a0·0        p7

63

Retiming

[Figure: retiming example; delay labels d, k, k+n, k+d, k+n+d]

64

Retimed Semisystolic Bit-Serial Multiplier (1)

65

Retimed Semisystolic Bit-Serial Multiplier (2)

Terms formed in each cycle       Output bit

a3·0   a2·0   a1·0   a0x0        p0

a3·0   a2·0   a1x0   a0x1        p1

a3·0   a2x0   a1x1   a0x2        p2

a3x0   a2x1   a1x2   a0x3        p3

a3x1   a2x2   a1x3   a0·0        p4

a3x2   a2x3   a1·0   a0·0        p5

a3x3   a2·0   a1·0   a0·0        p6

a3·0   a2·0   a1·0   a0·0        p7

66

Systolic Bit-Serial Multiplier

Modular Multiplication


68

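Modular Multiplication: Special Cases

$$a \cdot x = p = p_H 2^k + p_L \quad (p_H \text{ and } p_L \text{ are the upper and lower } k \text{ bits of } p)$$

$$a \cdot x \bmod 2^k = p_L$$

$$a \cdot x \bmod (2^k - 1) = p_L + p_H + \text{carry}$$

$$a \cdot x \bmod (2^k + 1) = p_L - p_H + \text{borrow}$$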

69

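Modular Multiplication: Special Case (1)

$$a \cdot x \bmod (2^k - 1) = (p_H 2^k + p_L) \bmod (2^k - 1)$$

$$= (p_H (2^k \bmod (2^k - 1)) + p_L) \bmod (2^k - 1) = (p_H + p_L) \bmod (2^k - 1)$$

$$= \begin{cases} p_H + p_L & \text{if } p_H + p_L < 2^k - 1 \\ p_H + p_L - (2^k - 1) & \text{if } p_H + p_L \ge 2^k - 1 \end{cases}$$

$$= p_L + p_H + \text{carry}$$

carry = carry from addition $p_L + p_H$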
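The reduction modulo 2^k - 1 amounts to adding the two halves of the product and feeding the carry-out back in (end-around carry). The Python sketch below assumes k-bit unsigned operands. With a k-bit end-around-carry adder the all-ones pattern is a second encoding of 0, so the sketch maps it to 0 explicitly; that normalization step and the function name are assumptions of this example.

```python
def mulmod_2k_minus_1(a: int, x: int, k: int) -> int:
    """(a * x) mod (2^k - 1) for k-bit unsigned a and x, via p_L + p_H + carry."""
    mask = (1 << k) - 1
    p = a * x
    p_low, p_high = p & mask, p >> k       # p = p_H * 2^k + p_L
    s = p_low + p_high
    carry = s >> k                         # carry out of the k-bit addition
    s = (s & mask) + carry                 # end-around carry
    if s == mask:                          # all ones is congruent to 0 mod 2^k - 1
        s = 0
    return s


if __name__ == "__main__":
    print(mulmod_2k_minus_1(12, 13, 4))
```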

70

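Modular Multiplication: Special Case (2)

$$a \cdot x \bmod (2^k + 1) = (p_H 2^k + p_L) \bmod (2^k + 1)$$

$$= (p_H ((2^k + 1) - 1) + p_L) \bmod (2^k + 1) = (p_L - p_H) \bmod (2^k + 1)$$

$$= \begin{cases} p_L - p_H & \text{if } p_L - p_H \ge 0 \\ p_L - p_H + (2^k + 1) & \text{if } p_L - p_H < 0 \end{cases}$$

$$= p_L - p_H + \text{borrow}$$

borrow = borrow from subtraction $p_L - p_H$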
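Since 2^k is congruent to -1 modulo 2^k + 1, the upper half of the product is subtracted from the lower half, and the modulus is added back when the difference goes negative. A minimal Python sketch, assuming both operands are k-bit unsigned values (so the residue 2^k itself is not used as an input); the function name is illustrative.

```python
def mulmod_2k_plus_1(a: int, x: int, k: int) -> int:
    """(a * x) mod (2^k + 1) for k-bit unsigned a and x, via p_L - p_H + borrow."""
    mask = (1 << k) - 1
    p = a * x
    p_low, p_high = p & mask, p >> k       # p = p_H * 2^k + p_L
    d = p_low - p_high
    if d < 0:                              # a borrow occurred
        d += (1 << k) + 1                  # add the modulus back once
    return d


if __name__ == "__main__":
    print(mulmod_2k_plus_1(15, 15, 4))
```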

71

Modulo $(2^b - 1)$ Carry Save Adder

72

4 x 4 Modulo 15 Multiplier

[Figure: 4-bit inputs, two mod-15 CSAs, and a mod-15 CPA; annotation "Divide by 16"]

73

4 x 4 Modulo 13 Multiplier