Spring 2006 EE 5324 - VLSI Design II - © Kia Bazargan 1
EE 5324 – VLSI Design II
Kia Bazargan
University of Minnesota
Part II: Adders
References and Copyright
• Textbooks referenced
  – [WE92] N. H. E. Weste, K. Eshraghian, “Principles of CMOS VLSI Design: A System Perspective”, Addison-Wesley, 2nd Ed., 1992.
  – [Rab96] J. M. Rabaey, “Digital Integrated Circuits: A Design Perspective”, Prentice Hall, 1996.
  – [Par00] B. Parhami, “Computer Arithmetic: Algorithms and Hardware Designs”, Oxford University Press, 2000.
References and Copyright (cont.)
• Slides used
  – [©Hauck] © Scott A. Hauck, 1996-2000; G. Borriello, C. Ebeling, S. Burns, 1995, University of Washington
  – [©Prentice Hall] © Prentice Hall 1995, © UCB 1996. Slides for [Rab96]: http://bwrc.eecs.berkeley.edu/Classes/IcBook/instructors.html
  – [©Oxford U Press] © Oxford University Press, New York, 2000. Slides for [Par00], with permission from the author: http://www.ece.ucsb.edu/Faculty/Parhami/files_n_docs.htm
Outline
• One-bit adder, basic ripple-carry adder
• Carry-Lookahead adders (CLA)
• Manchester carry chain
• Carry bypass
• Carry select adder
• Brent-Kung adder
Why Adders?
• Addition: a fundamental operation
  – Basic block of most arithmetic operations
  – Address calculation
• Faster, faster and faster
• How?
  – Architectural-level optimization
  – Gate-level optimization
  – Speed/area trade-off
Adding Two One-bit Operands
• One-bit Half Adder (HA): inputs A, B; outputs Sum, Cout
  Sum = A ⊕ B
  Cout = A.B

  A B | Sum Cout
  0 0 |  0   0
  0 1 |  1   0
  1 0 |  1   0
  1 1 |  0   1

• One-bit Full Adder (FA): inputs A, B, Cin; outputs Sum, Cout
  Sum = A ⊕ B ⊕ Cin
  Cout = A.B + B.Cin + A.Cin

  Cin A B | Sum Cout
   0  0 0 |  0   0
   0  0 1 |  1   0
   0  1 0 |  1   0
   0  1 1 |  0   1
   1  0 0 |  1   0
   1  0 1 |  0   1
   1  1 0 |  0   1
   1  1 1 |  1   1
N-Bit Ripple-Carry Adder: Series of FA Cells
• To add two n-bit numbers
[Figure: chain of n FA cells; cell i takes Ai, Bi and carry Ci and produces Si and Ci+1; C0 enters at bit 0 and Cn leaves bit n-1]
• Note: adder delay = Tc * n
• Tc = Cin-to-Cout delay of one FA
4-bit Ripple Carry Addition: Example
[Figure: four FA cells (bits 0-3) with carries C0 ... C4, shown at successive time steps]
A = 0011, B = 0101, C0 = 0

  T=0: S=0000
  T=1: S=0110
  T=2: S=0100
  T=3: S=0000
  T=4: S=1000
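The time steps of the example above (A=0011, B=0101) can be replayed with a small simulation. This is only a sketch under a simplifying assumption: every FA cell takes exactly one time unit, all carry wires start at 0, and all cells re-evaluate in lockstep. The function names are illustrative and do not come from the slides.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, cout)."""
    s = a ^ b ^ cin
    cout = (a & b) | (b & cin) | (a & cin)
    return s, cout

def ripple_trace(a_bits, b_bits, c0=0):
    """Replay a ripple-carry adder, one FA delay per step.

    a_bits and b_bits are lists with the LSB at index 0. At each step every
    FA cell re-evaluates using the carries produced in the previous step.
    Returns the sum word after each step as an MSB-first string.
    """
    n = len(a_bits)
    carries = [c0] + [0] * n            # carries[i] is the carry into bit i
    history = []
    for _ in range(n):
        results = [full_adder(a_bits[i], b_bits[i], carries[i]) for i in range(n)]
        sums = [s for s, _ in results]
        carries = [c0] + [c for _, c in results]
        history.append("".join(str(s) for s in reversed(sums)))
    return history

def bits_lsb_first(word):
    """Convert an MSB-first bit string such as '0011' to an LSB-first list."""
    return [int(ch) for ch in reversed(word)]

trace = ripple_trace(bits_lsb_first("0011"), bits_lsb_first("0101"))
for step, word in enumerate(trace, start=1):
    print(f"T={step} S={word}")
```

The sum only settles once the carry generated at bit 0 has rippled through all four cells, which is the linear Tc * n behaviour noted earlier.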
One-bit Full Adder Implementation
• Direct gate implementation
Cout = A.B + B.Cin + A.Cin = A.B + Cin. (A+B)
Sum = A ⊕ B ⊕ Cin
[Figure: gate-level schematic producing Sum and Cout from A, B and Cin]
32 transistors used
[WE92] p516
One-Bit Full Adder: Share Logic
• An observation: almost always, Sum = NOT Cout
  – The only exceptions are inputs 000 and 111

  Cin A B | Sum Cout
   0  0 0 |  0   0
   0  0 1 |  1   0
   0  1 0 |  1   0
   0  1 1 |  0   1
   1  0 0 |  1   0
   1  0 1 |  0   1
   1  1 0 |  0   1
   1  1 1 |  1   1

Sum = A.B.Cin + (A+B+Cin).Cout’
  – The term A.B.Cin includes the 111 case
  – The factor (A+B+Cin) excludes the 000 case
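The shared-logic expression can be checked exhaustively. The sketch below assumes the intended form is Sum = A.B.Cin + (A+B+Cin).Cout’ (Cout complemented) and compares it with the XOR definition for all eight input combinations; the function name is illustrative.

```python
from itertools import product

def shared_logic_adder(a, b, cin):
    """Full adder where Sum reuses the complemented carry (Cout')."""
    cout = (a & b) | (cin & (a | b))
    not_cout = 1 - cout
    s = (a & b & cin) | ((a | b | cin) & not_cout)
    return s, cout

inputs = list(product((0, 1), repeat=3))

# Input combinations where the shared-logic form disagrees with the reference adder.
mismatches = [
    (a, b, c) for a, b, c in inputs
    if shared_logic_adder(a, b, c) != (a ^ b ^ c, (a + b + c) >> 1)
]

# Input combinations where Sum is NOT simply the complement of Cout.
exceptions = [
    (a, b, c) for a, b, c in inputs
    if shared_logic_adder(a, b, c)[0] != 1 - shared_logic_adder(a, b, c)[1]
]

print(mismatches, exceptions)
```

The two exceptional rows are exactly the ones the extra terms A.B.Cin and (A+B+Cin) take care of.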
One-Bit Full Adder: Transistor Implementation
Sum = A.B.C + (A+B+C).Cout’
Cout = A.B + C.(A+B)
[Figure: transistor-level schematic of the two complex CMOS gates, with inputs A, B, C and outputs Cout and Sum]
– Use inverters to get Cout and Sum
– C transistors close to output
– Cout delay: 2 inverting stages (1-stage possible?)
– Sum delay: 3 inverting stages (not an issue, though)
28 transistors
[WE92] p517, [Rab96] p390
One-Bit Full Adder: Inverted Inputs
• An observation: inverting all inputs inverts both outputs
• Exploit this property: get rid of the inverter on the carry critical path

  Cin A B | Sum Cout
   0  0 0 |  0   0
   0  0 1 |  1   0
   0  1 0 |  1   0
   0  1 1 |  0   1
   1  0 0 |  1   0
   1  0 1 |  0   1
   1  1 0 |  0   1
   1  1 1 |  1   1

[Figure: an FA with inverted inputs and inverted outputs is equivalent to a plain FA]
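The inverting property is easy to confirm by brute force. A minimal sketch, using an illustrative `full_adder` helper that is not part of the slides:

```python
from itertools import product

def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, cout)."""
    return a ^ b ^ cin, (a & b) | (b & cin) | (a & cin)

# Inverting all three inputs must invert both outputs, for every input row.
inverting_holds = all(
    full_adder(1 - a, 1 - b, 1 - c) == tuple(1 - x for x in full_adder(a, b, c))
    for a, b, c in product((0, 1), repeat=3)
)
print(inverting_holds)
```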
Ripple Carry Adder: Inverting Property
• FA’ is similar to FA, but with no inverters on the outputs
• Much faster (1 stage per bit)
• Disadvantage: not a regular data path
[Figure: chain of four FA’ cells for bits 0-3 (inputs Ai, Bi, outputs Si) with carries C0 ... C4]
Summary: Ripple-Carry Adder
• Basic ripple carry: AND-OR gates
  – Area: 32 transistors (per bit position)
  – Delay: 2 stages of inverting logic (per bit position)
• Direct CMOS logic, share Cout’
  – Area: 28 transistors
  – Delay: 2 stages
• Use “inverting” property
  – Area: 27 (odd bits: 26, even bits: 28)
  – Delay: ~1 stage
• So far: transistor/logic manipulation
• Is that all we can do?
Carry-Lookahead Adder: Idea
• New look: carry propagation
• Idea:
  – Try to “predict” Ck earlier than Tc*k
  – Instead of passing through k stages, compute Ck separately using 1-stage CMOS logic
• Carry propagation: an example

  Bit position   7 6 5 4 3 2 1 0
  Carry          1 0 0 1 1 1 1
  A              0 1 0 0 1 1 0 1
  B            + 0 1 0 0 0 1 1 1
  Sum            1 0 0 1 0 1 0 0
Carry-Lookahead Adder (CLA): One Bit
• What happens to the propagating carry in bit position k?

  A B | Cin Cout
  0 0 |  -   0     (kill)
  0 1 |  C   C     (propagate)
  1 0 |  C   C     (propagate)
  1 1 |  -   1     (generate)

p = A+B (or A ⊕ B)
g = A.B
[Figure: carry circuit built from A, B and C transistors with paths labelled kill, generate, 0-propagate and 1-propagate; output Cout] [Rab96] p391
CLA: Propagation Equations
• If C4=1, then either:
  – g3: generated at bit pos 3
  – g2.p3: generated at bit pos 2, propagated through 3
  – g1.p2.p3: generated at bit pos 1, propagated through 2, 3
  – g0.p1.p2.p3: generated at bit pos 0, propagated through 1, 2, 3
  – Cin.p0.p1.p2.p3: input carry, propagated through 0, 1, 2, 3
• C4 = g3 + g2.p3 + g1.p2.p3 + g0.p1.p2.p3 + Cin.p0.p1.p2.p3
• Implement C4 as one-stage CMOS logic: delay = 1 (or is it?)
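The propagation equations can be sketched in software and checked against ordinary addition. The C4 expression below follows the slide; the C1, C2 and C3 expressions are written by the same pattern and are an assumption of this sketch, as are the function names. Here p is taken as A+B (OR).

```python
def cla4_carries(a, b, cin):
    """Carries of a 4-bit lookahead block; a and b are 4-bit integers."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(4)]   # generate: A.B
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]   # propagate: A+B
    c1 = g[0] | (cin & p[0])
    c2 = g[1] | (g[0] & p[1]) | (cin & p[0] & p[1])
    c3 = g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2]) | (cin & p[0] & p[1] & p[2])
    c4 = (g[3] | (g[2] & p[3]) | (g[1] & p[2] & p[3])
          | (g[0] & p[1] & p[2] & p[3]) | (cin & p[0] & p[1] & p[2] & p[3]))
    return [cin, c1, c2, c3, c4]

def cla4_add(a, b, cin=0):
    """4-bit add using lookahead carries: returns (sum, carry_out)."""
    c = cla4_carries(a, b, cin)
    s = 0
    for i in range(4):
        s |= (((a >> i) ^ (b >> i) ^ c[i]) & 1) << i
    return s, c[4]

print(cla4_add(0b0011, 0b0101))
```

Each carry depends only on the p, g signals and Cin, never on another carry, which is what removes the ripple inside the block.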
CLA: Static Logic Implementation
[Figure: static CMOS gate computing C4 from p0-p3, g0-g3 and Cin; transistors labelled d through x; annotations “p3.g2” and “p1.g2.g3” point at paths to C4]
[©Hauck] [Rab96] p405
CLA: Dynamic Logic Implementation
• Dynamic gate implementation: C4 = g3 + p3 . (g2 + p2 . (g1 + p1 . (g0 + p0.Cin)))
• 6 transistors in series
[Figure: dynamic gate with inputs Cin, p0-p3, g0-g3 and output C4]
[©Hauck] [WE92] p529
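The nested form used by the dynamic gate is algebraically the same function as the five-term sum of products. A quick exhaustive check, with illustrative function names and treating the nine signals as independent bits:

```python
from itertools import product

def c4_flat(g, p, cin):
    """C4 as a sum of products."""
    return (g[3] | (g[2] & p[3]) | (g[1] & p[2] & p[3])
            | (g[0] & p[1] & p[2] & p[3]) | (cin & p[0] & p[1] & p[2] & p[3]))

def c4_nested(g, p, cin):
    """C4 in the nested form implemented by the series transistor stack."""
    return g[3] | (p[3] & (g[2] | (p[2] & (g[1] | (p[1] & (g[0] | (p[0] & cin)))))))

# Compare the two forms over all 2^9 assignments of g0-g3, p0-p3 and Cin.
forms_agree = all(
    c4_flat(bits[0:4], bits[4:8], bits[8]) == c4_nested(bits[0:4], bits[4:8], bits[8])
    for bits in product((0, 1), repeat=9)
)
print(forms_agree)
```

The nesting depth is what shows up as series transistors in the pull-down stack.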
CLA: Dynamic Logic Implementation
• Can we reuse logic? Can we get C1, C2 and C3 from the same circuit?
[Figure: the C4 dynamic gate (inputs Cin, p0-p3, g0-g3) with its internal nodes marked C1?, C2?, C3?] [©Hauck]
• No! C1, C2 and C3 may be floating (not precharged)
• Charge sharing problem
CLA: Dynamic Logic Implementation
[Figure: four separate dynamic gates, one per carry:
  C1 from g0, p0, Cin
  C2 from g0-g1, p0-p1, Cin
  C3 from g0-g2, p0-p2, Cin
  C4 from g0-g3, p0-p3, Cin]
[WE92] p529
CLA: Basic Block (4 Bits) Architecture
• Block of 4-bit p, g, Cout
[Figure: four p,g cells take (Ai, Bi) for bits 0-3 and produce Si; their outputs p0 g0 ... p3 g3, together with C0, feed the carry logic, which returns C1, C2, C3 to the cells and outputs C4]
CLA: N-Bit Architecture
• Put it all together:
[Figure: 4-bit CLA blocks chained together; bits 0-3 with a carry generator taking C0 and producing C4, bits 4-7 with a carry generator taking C4 and producing C8, and so on]
CLA: 12-Bit Example
[Figure: 12-bit CLA made of three 4-bit blocks (bits 0-3, 4-7, 8-11), each with four p,g cells and a carry generator; carries C0, C4, C8, C12]
Timing example (operand digits as extracted: 01111101 01101001 11011010, C0 = 0):
  T=0: 00000 00000 00000
  T=2: 01001 11110 01111
  T=3: 01001 00001 01111
  T=4: 01011 00001 01111
Summary: Carry Lookahead Adder
• CLA compared to ripple-carry adder:
  – Faster (“4 times”?), but delay still linear (w.r.t. # of bits)
  – Larger area
    o P, G signal generation
    o Carry generation circuits
    o Carry generation ckt for each bit position (no re-use)
• Limitation: cannot go beyond 4 bits of look-ahead
  – Large p,g fan-out slows down carry generation
• Next: Manchester carry chains
  – Tries to reuse logic by pre-charging each carry position
Recap: Carry Look-Ahead
• Charge sharing problem
[Figure: the C4 dynamic gate (inputs Cin, p0-p3, g0-g3) with internal nodes marked C1?, C2?, C3?]
Manchester Carry Chain: First Shot
• Improvement over CLA: precharge internal nodes to avoid the charge-sharing problem
[Figure: Manchester chain from Cin to C4 with p0-p3 and g0-g3 transistors and precharged intermediate nodes C1, C2, C3] [©Hauck]
• Fastest way to do small adders
  – 6 transistors on the critical path
Manchester Carry Chain: Sizing
[Figure: RC model of the chain: transistors M0-M4 and discharge transistor MC drive Out through nodes 1-6, with resistances R1-R6 and capacitances C1-C6]

$$t_p = 0.69 \sum_{i=1}^{N} C_i \left( \sum_{j=1}^{i} R_j \right)$$

[Plots: speed (delay normalized by 0.69RC) and area (in minimum size devices) versus the sizing factor k, for k from 1 to 3.0] [© Prentice Hall]
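The delay expression for the chain, tp = 0.69 Σ Ci (R1 + ... + Ri) summed over i = 1..N, can be evaluated directly. The sketch below is illustrative only; the function name and the unit R and C values are assumptions, not data from the slide.

```python
def chain_delay(resistances, capacitances):
    """Delay of an RC chain: 0.69 * sum_i C_i * (R_1 + ... + R_i)."""
    total = 0.0
    upstream_r = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_r += r            # resistance between the source and node i
        total += c * upstream_r
    return 0.69 * total

# With identical unit stages the sum is N(N+1)/2, so delay grows quadratically.
uniform = chain_delay([1.0] * 6, [1.0] * 6)
print(uniform)
```

With equal stages a 6-node chain gives 0.69 * 21 RC, which illustrates why long unbuffered chains become slow.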
Manchester Carry Chain: An Improvement
• Problem: Cin arrives late, so move it closer to the output
• Use bypass logic:
[Figure: Manchester chain (Cin, g0-g3, p0-p3, output C4) with an added bypass branch controlled by p0, p1, p2, p3 and Cin] [©Hauck]
Manchester Carry Chain: the Improvement
• Direct implementation
[Figure: chain from Cin through the p0 g0 ... p3 g3 stages to C4, with intermediate nodes C1, C2, C3]
• Carry bypass circuitry
[Figure: bypass branch with p0 p1 p2 p3 and Cin driving C4] [©Hauck]
• Advantages of the carry bypass circuitry
  – Only 5 series transistors
  – Less capacitance in internal nodes
  – Cin close to the output
Manchester Carry Chain: Summary
• Compared to CLA:
  – Smaller area
    o Pre-charge internal nodes
    o Reuse logic for intermediate carry signals
  – Cin close to the output
• Carry chain can be any length
  – Series propagate is slow (O(n^2) delay), so buffer every 4 bits
• Compact adder: good for up to 16 bits
• Using carries to compute sum slows down MCC
  – Use two carry chains: one for sum, one for carry propagation
[©Hauck]
Carry Bypass Adder: Idea
• The “bypass” idea is general
  – Not just for Manchester carry chain
  – The local carry chain could be a ripple-carry adder
• Structure
  – Could be static, dynamic, pass transistor
  – Carry and sum paths shown in different colors
  – Bypass logic determines: “pass” or “kill/generate”?
[Figure: block for bits i to i+k: Setup, Local Carry Chain and Sum; a “Bypass?” selector chooses between Ci and the local carry to produce Ci+k+1]
Carry Bypass Adder: Cell Examples
• Static implementation, using ripple carry adder
[Figure: local carry chain of four FA cells; the bypass is selected by p0.p1.p2.p3]
• Dynamic, Manchester (mux = wire!)
[Figure: Manchester chain with g0-g3 and p0-p3 producing C4, plus a bypass branch with p0 p1 p2 p3 and Cin] [Rab96] p398
Carry Bypass Adder: Cell Examples (cont.)
• Static (pass transistor logic), Manchester
  T1 = (p0.p1.p2).p3   T2 = p3   T3 = p0.p1.p2.p3
[Figure: pass-transistor Manchester chain from C0 to C4 using p0-p2, g0-g3 and the control signals T1, T2, T3] [WE92] p531
Carry Bypass Adder: the Structure and Timing
[Figure: 16-bit adder built from four bypass blocks (bits 0-3, 4-7, 8-11, 12-15), each with Setup, Local Carry Chain and Sum; C0 enters at bits 0-3] [Rab96] p.399
• Timing (critical path shown in different color):
  1. Setup
  2. Local carry generate/kill, MUX select line ready
  3. C0-C16 carry propagate (if applicable)
Carry Bypass Adder: Timing of a Sub-block
• For an intermediate stage (here bits 8-11), after setup:
  – If in pass mode
    o Local carry vector computes intermediate carries (possibly incorrectly)
    o At the same time, mux selection set to pass
    o When input carry arrives, intermediate carries might be recomputed
    o Meanwhile, input carry is sent to Cout
  – If not in pass mode (assume bit 10 generates)
    o Local carry vector computes intermediate carries (bits 10, 11 correct)
    o At the same time, mux selection set to local
    o Meanwhile, output carry is sent to Cout correctly
    o When input carry arrives, intermediate carries C8 and C9 (S8, S9, S10) will be recomputed correctly
[Figure: the bits 8-11 block (Setup, Local Carry Chain, Sum) shown in pass mode and in local-generate mode]
Carry Bypass Adder: Timing
[Figure: the four bypass blocks (bits 0-3, 4-7, 8-11, 12-15), each with Setup, Local Carry Chain and Sum; C0 enters at bits 0-3]
Delay = tsetup + max { tselect , 4 x tFA } + 3 x tmux_pass + 3 x tFA + tsum
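The 16-bit delay expression, tsetup + max{tselect, 4 x tFA} + 3 x tmux_pass + 3 x tFA + tsum, can be wrapped in a small helper. Generalizing the constants 4 and 3 to "bits per block", "blocks minus one" and "bits per block minus one" is an assumption of this sketch, as are the parameter names and the unit delays used in the example.

```python
def bypass_adder_delay(t_setup, t_select, t_fa, t_mux_pass, t_sum,
                       blocks=4, bits_per_block=4):
    """Worst-case delay of a carry-bypass adder built from equal blocks.

    The defaults (4 blocks of 4 bits) reproduce the 16-bit expression:
    setup, then the first block's carry (or the select line, if later),
    then bypass muxes of the remaining blocks, then the ripple inside the
    last block, then the final sum.
    """
    first_block = max(t_select, bits_per_block * t_fa)
    bypass_path = (blocks - 1) * t_mux_pass
    last_block = (bits_per_block - 1) * t_fa
    return t_setup + first_block + bypass_path + last_block + t_sum

unit_delay = bypass_adder_delay(1, 1, 1, 1, 1)
print(unit_delay)
```

Only the mux term grows with the number of blocks, so the delay is still linear in the adder width but with a smaller slope than a plain ripple adder.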
Carry Bypass Adder: Pros and Cons
• Speed:
  – Faster than ripple adder
  – Still linear!
• Area overhead:
  – Mux (setup?)
  – Not worth it for small adders (N<8)
  – 10-20% for large adders
[Plot: propagation delay versus number of bits for the ripple adder and the bypass adder; the curves cross at about 4..8 bits] [Rab96] p.399
Carry Select Adder: the Idea
• Similar to bypass
  – Instead of “waiting” for the input carry, “precompute” the carry output
  – Compute Ci+k for both cases Ci=0 and Ci=1
  – When Ci arrives, select the appropriate result
  – Sum computed in one step after the intermediate carry signals are ready
[Figure: k-bit block: Setup (p,g) feeds a 0-carry propagation chain and a 1-carry propagation chain; multiplexers controlled by Ci select the carry vector and Ci+k, followed by Sum Generation] [Rab96] p.400
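The select idea can be sketched behaviourally: each block computes its result for both possible input carries, and the real carry merely picks one. The block itself is modelled with integer arithmetic rather than gates, and the function names, the 16-bit width and the 4-bit block size are assumptions of this sketch (the width must be a multiple of the block size).

```python
def add_block(a, b, cin, k):
    """Reference k-bit block: returns (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << k) - 1), total >> k

def carry_select_add(a, b, n=16, k=4, c0=0):
    """n-bit carry-select addition with k-bit blocks: returns (sum, carry_out)."""
    mask = (1 << k) - 1
    carry = c0
    result = 0
    for lo in range(0, n, k):
        a_blk = (a >> lo) & mask
        b_blk = (b >> lo) & mask
        # Both candidates depend only on A and B, so hardware can compute
        # them before the block's input carry is known.
        sum0, cout0 = add_block(a_blk, b_blk, 0, k)
        sum1, cout1 = add_block(a_blk, b_blk, 1, k)
        # The arriving carry acts as the multiplexer select.
        blk_sum, carry = (sum1, cout1) if carry else (sum0, cout0)
        result |= blk_sum << lo
    return result, carry

print(carry_select_add(0xFFFF, 0x0001))
```

In hardware the only work left on the critical path after the first block is one multiplexer per block.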
Linear Carry Select Adder: Structure
[Figure: four 4-bit select blocks (bits 0-3, 4-7, 8-11, 12-15), each with Setup, a 0-Carry chain, a 1-Carry chain, a multiplexer and Sum; the carries C0, C4, C8, C12, C16 link the multiplexers]
[Rab96] p.401
Linear Carry Select Adder: Timing
[Figure: the same four blocks (bits 0-3, 4-7, 8-11, 12-15) with the critical path running from C0 through the multiplexers at C4, C8, C12 to C16 and the final Sum]
Delay = 3 + 1 + 1 + 1 + 1 = 7 (16 bits)
[Rab96] p.401
Square Root Carry Select Adder: the Idea
• Later stages have to wait for the multiplexers in the earlier stages
• Why not give them bigger chunks of data to compute?
  – Balances the delay paths
  – Sub-linear delay (we will see why)
Square Root Carry Select Adder: the Structure
• Assuming the following delays: Setup=1, carry propagate=1/bit, mux=1
[Figure: select stages of growing width: bits 0-1 (C0 in, C2 out), bits 2-4 (C5 out), bits 5-8 (C9 out), bits 9-13 (C14 out), bits 14-19 (C20 out); the carry chains of the five stages are ready at times 3, 4, 5, 6 and 7]
Delay from all paths = 8 (20 bits)
[Rab96] p.402
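The balanced timing can be reproduced with a simple arrival-time model. This is a sketch under the stated unit delays (setup = 1, carry propagate = 1 per bit, mux = 1) plus one further assumption: every stage, including the first, passes its carry through a multiplexer. The function name is illustrative.

```python
def select_carry_times(widths, setup=1, per_bit=1, mux=1):
    """Time at which each stage's selected carry-out becomes valid.

    Both carry chains of a stage are ready at setup + per_bit * width; the
    stage's multiplexer can switch once the chains and the incoming carry
    are both available.
    """
    carry_in_ready = 0
    times = []
    for width in widths:
        chains_ready = setup + per_bit * width
        carry_in_ready = max(chains_ready, carry_in_ready) + mux
        times.append(carry_in_ready)
    return times

sqrt_times = select_carry_times([2, 3, 4, 5, 6])   # the 20-bit example
print(sqrt_times)
```

Because each stage is one bit wider than the previous one, its chains finish exactly when the incoming carry arrives, so no stage waits on the other path.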
Square Root Carry Select Adder: Delay
• Assume
  – N-bit adder
  – P stages (delay directly depends on P)
  – First stage computes M bits

$$N = M + (M+1) + (M+2) + \cdots + (M+P-1) = MP + \frac{P(P-1)}{2} = \frac{P^2}{2} + P\left(M - \frac{1}{2}\right)$$

• For M<<N (e.g. N=64, M=2), the first term dominates: $N \approx P^2/2$, so

$$P \approx \sqrt{2N}$$
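The relation N = MP + P(P-1)/2 can be inverted numerically to see how the stage count compares with the approximation P ≈ sqrt(2N). A small sketch with illustrative function names:

```python
import math

def bits_covered(p, m):
    """Total width of p stages whose sizes are m, m+1, ..., m+p-1."""
    return m * p + p * (p - 1) // 2

def stages_needed(n, m):
    """Smallest number of stages that covers at least n bits."""
    p = 1
    while bits_covered(p, m) < n:
        p += 1
    return p

exact = stages_needed(64, 2)
approx = math.sqrt(2 * 64)
print(exact, round(approx, 2))
```

For N=64 and M=2 the exact count is a little below sqrt(2N), since the dropped term P(M - 1/2) is not negligible at this size.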
Carry Select Adder: Trade-offs
• Area overhead:
  – An additional carry path and a multiplexer (not the whole adder)
  – About 30% more than a ripple-carry
• Delay
  – Sub-linear (we can beat that too!)
[Plot: delay versus number of bits (0 to 60) for the ripple adder, linear select and square root select] [© Prentice Hall]
Binary Carry-Lookahead or Brent-Kung Adder
• Idea: use a binary tree for carry propagation, giving logarithmic delay
[Figure: a function F of A0-A7 computed by a linear chain (tp ∝ N) versus a binary tree (tp ∝ log2(N))] [© Prentice Hall]
Brent-Kung Adder
• Basic component: concatenation of an MSB-side (left) pair and an LSB-side (right) pair
  (g, p) = (gleft, pleft) • (gright, pright)
  g = gleft + pleft • gright
  p = pleft • pright
[©Hauck]
Brent-Kung Adder: Structure
• Define (Gi, Pi): generate and propagate for bits 0 through i
  – gi = Ai.Bi, pi = Ai ⊕ Bi
  – (G0, P0) = (g0, p0)
  – for i>0: (Gi, Pi) = (gi, pi) • (Gi-1, Pi-1) = (gi, pi) • (gi-1, pi-1) • . . . • (g0, p0)
• Key to Brent-Kung adder: use tree structure to perform concatenations
[Figure: tree over bits 7 ... 0: pairs 7-6, 5-4, 3-2, 1-0, then 7-4 and 3-0, then 7-0] [©Hauck]
• C5? No! Doesn’t know about C0-3 yet!
Brent-Kung: the Complete Tree
[Figure: complete tree taking (g0,p0) ... (g7,p7) and producing the carries C0-C7] [© Prentice Hall]
tadd ∝ log2(N)
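The concatenation operator and the tree can be sketched as a parallel-prefix computation. The code below is an illustration rather than the slides' exact wiring: it assumes a power-of-two width, C0 = 0, and a carry numbering in which carries[i] is the carry into bit i. All function names are hypothetical.

```python
def combine(left, right):
    """(g, p) = (gleft, pleft) • (gright, pright); left is the MSB-side group."""
    g_left, p_left = left
    g_right, p_right = right
    return (g_left | (p_left & g_right), p_left & p_right)

def brent_kung_prefix(gp):
    """Return (Gi, Pi) for every bit i; len(gp) must be a power of two."""
    n = len(gp)
    x = list(gp)
    d = 1
    while d < n:                       # forward tree: 1-0, 3-0, 7-0, ...
        for i in range(2 * d - 1, n, 2 * d):
            x[i] = combine(x[i], x[i - d])
        d *= 2
    d = n // 4
    while d >= 1:                      # reverse tree fills in the other prefixes
        for i in range(3 * d - 1, n, 2 * d):
            x[i] = combine(x[i], x[i - d])
        d //= 2
    return x

def brent_kung_add(a, b, n=8):
    """n-bit addition with C0 = 0: returns (sum, carry_out)."""
    gp = [(((a >> i) & (b >> i)) & 1, ((a >> i) ^ (b >> i)) & 1) for i in range(n)]
    prefix = brent_kung_prefix(gp)
    carries = [0] + [g for g, _ in prefix]      # carry into bit i+1 is Gi
    s = 0
    for i in range(n):
        s |= (gp[i][1] ^ carries[i]) << i       # sum bit = pi XOR carry-in
    return s, carries[n]

print(brent_kung_add(0b01001101, 0b01000111))
```

The forward tree alone only yields prefixes ending at bits 1, 3, 7, and so on; the reverse tree supplies the remaining carries, which is why the complete network needs more than log2(N) levels.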
Brent-Kung: Timing
[Figure: 16-bit Brent-Kung network with inputs x0-x15 and outputs s0-s15, drawn over levels 1 to 6]
[©Oxford U Press] [Par00] p.102
Brent-Kung Adder: Summary
• Area
  – On average, twice as large as ripple adder
  – Layout of the cells is very compact
• Delay
  – Logarithmic time
  – Once carry signals are ready, sum bits derived in constant time
  – Good for wide adders
Comparing Adder Designs
[Plots: propagation delay tp (sec) and area (mm2) versus number of bits (0 to 20) for the static, mirror, Manchester, bypass, select and Brent-Kung adders] [© Prentice Hall]
Combining Different Adders
[©Oxford U Press][Par00] p.103
Combining Different Adders
• Two-level carry skip adder
  – Delay = 8 cycles
  – Number of bits: 30
[Figure: six blocks bF ... bA between cout (t=8) and cin (t=0), joined by second-level skip logic S2; the blocks are annotated with {Tproduce, Tassimilate} pairs {8, 1}, {7, 2}, {6, 3}, {5, 4}, {4, 5}, {3, 8}]
[©Oxford U Press] [Par00] p.113
Combining Different Adders
[Figure: 64-bit adder producing EA from RA and RB: a 40-bit carry select adder for the MSBs (RA(63:24), RB(63:24), output EA(63:24)) and a 24-bit differential carry lookahead adder for the LSBs (RA(23:0), RB(23:0), output EA(23:0)), linked by cout23; EA feeds the TLB and the data cache, whose compare blocks produce real_add(40:0) and hit/miss/data]
© Dan Stasiak, IBM Rochester, 2001
Combining Different Adders
© Dan Stasiak, IBM Rochester, 2001
[Figure: the 40-bit adder section, EA(24:63), next to the 24-bit adder section, EA(0:23) & EA_L(0:23)]
Combining Different Adders
• Ripple+skip adder: delay=8. Max adder width?
  – Assume: p,g, ripple, skip signal, skipping: 1 unit delay each
  – Carry signals
    o Pass mode: ready at time x through skip logic, which limits the number of blocks
    o Local gen mode: blocks can process y bits and still have time to deliver locally generated carry by time x for the next block
  – Sum signals
    o If in local generation mode, y is OK
    o If in pass mode, y not OK for left bits (e.g., bE receives cin at x=5, can process at most z=3 bits to meet the delay bound of 8 on the sum bits)
[Figure: single-level ripple+skip adder with blocks bG ... bA between Cout and Cin, skip logic S between blocks, and carry arrival times and block widths annotated]
[©Oxford U Press] [Par00] p.112
CLA Static Logic: Trimmed Down
[Figure: static CMOS gate computing C1 from p0, g0 and Cin; transistors labelled h, j, k, s, t, u]
[©Hauck] [Rab96] p405