How does CLA (carry look-ahead adder) work?
Wei-jen Hsu
TA for EE457 at USC, fall 2004
Modified fall 2005
A simple one-bit full adder
[Figure: a one-bit full adder block (+) with inputs A, B, Cin and outputs S, Cout]
It takes A, B, and Cin as input and generates S and Cout in 2 gate delays (SOP)
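As a rough illustration of the two-level SOP form mentioned above, here is a minimal Python sketch. The function name `full_adder` is an assumption made for this example, and complemented inputs are assumed to be available for free.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """One-bit full adder written as two-level sum-of-products (SOP)."""
    na, nb, nc = 1 - a, 1 - b, 1 - cin  # complemented inputs (assumed free)
    # S is 1 when an odd number of inputs are 1: four AND terms, one OR.
    s = (na & nb & cin) | (na & b & nc) | (a & nb & nc) | (a & b & cin)
    # Cout is 1 when at least two inputs are 1: three AND terms, one OR.
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


# Exhaustive check against ordinary integer addition.
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            s, cout = full_adder(a, b, cin)
            assert 2 * cout + s == a + b + cin
```

Each output is one layer of AND gates feeding one OR gate, which is where the 2 gate delays come from.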
4-bit RCA
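[Figure: four one-bit full adders (+) in a chain. Bit i takes Ai, Bi, and Ci and produces Si. C0 enters at bit 0, the carries C1, C2, C3 link adjacent adders, and C4 leaves bit 3.]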
• Work from lowest bit to highest bit sequentially.
• With A0, B0, and C0, the lowest bit adder generates S0 and C1 in 2 gate delays.
• With A1, B1, and C1 ready, the second bit adder generates S1 and C2 in 2 gate delays.
• Each bit adder has to wait for the lower bit adder to propagate the carry.
Carry propagation forms a long sequential wait chain, hence RCA is slow!!
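The sequential wait chain can be sketched in Python as follows. This is only a behavioral model under the slide's assumption of 2 gate delays per stage; the name `ripple_carry_add` and the little-endian bit-list convention are choices made for this example.

```python
def ripple_carry_add(a_bits, b_bits, c0=0):
    """Add two little-endian bit lists; return (sum_bits, carry_out, delay)."""
    carry, sum_bits, delay = c0, [], 0
    for a, b in zip(a_bits, b_bits):
        total = a + b + carry
        sum_bits.append(total & 1)  # S_i
        carry = total >> 1          # C_(i+1), needed by the next stage
        delay += 2                  # worst case: each stage waits on its carry-in
    return sum_bits, carry, delay


# 0b0111 + 0b0001: a carry is passed from bit 0 all the way up to bit 3.
print(ripple_carry_add([1, 1, 1, 0], [1, 0, 0, 0]))  # ([0, 0, 0, 1], 0, 8)
```

Under this model the worst-case delay grows linearly with the operand width: 8 gate delays for 4 bits, 32 for 16 bits.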
Observations
• The critical component each bit adder waits for is the carry input.
• Instead of generating and propagating carry bit-by-bit, can we generate all of them in parallel and break the sequential chain?
• This is exactly the idea of CLA (carry look-ahead adder).
Carry Look Ahead Logic
• Now even before the carry in (Cin) is available, based on the inputs (A,B) only, can we say anything about the carry out?
• Under what condition will the bit propagate an outgoing carry (Cout), if there is an incoming carry (Cin)?
• Under what condition will the bit generate an outgoing carry (Cout), regardless of whether there is an incoming carry (Cin)?
1-bit CLA adder
[Figure: a 1-bit CLA adder block (+) with inputs A, B, Cin and outputs S, p, g]
• Instead of Cout, a 1-bit CLA adder block takes A, B as inputs and generates p, g.
• p = propagator => I will propagate the Cin to the next bit. p = A+B (If either A or B is 1, Cin=1 causes Cout=1)
• g = generator => I will generate a Cout independent of what Cin is. g = AB (If both A and B are 1, Cout=1 for sure)
• p,g are generated in 1 gate delay after we have A,B. Note that Cin is not needed to generate p,g.
• S is generated in 2 gate delays after we get Cin (SOP).
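To see that p and g carry enough information to recover the carry out, the following small Python check compares Cout = g + p·Cin with the carry of ordinary addition for all eight input combinations. The helper name `pg` is hypothetical.

```python
def pg(a: int, b: int) -> tuple[int, int]:
    """Propagate and generate signals of one bit position."""
    p = a | b  # propagate: an incoming carry would be passed on
    g = a & b  # generate: a carry goes out regardless of Cin
    return p, g


for a in (0, 1):
    for b in (0, 1):
        p, g = pg(a, b)  # computed without looking at Cin
        for cin in (0, 1):
            cout = g | (p & cin)
            assert cout == (a + b + cin) >> 1
```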
4-bit CLA
[Figure: four 1-bit CLA adder blocks (+), each with inputs A, B and outputs p, g. All p,g's and C0 feed a CLL (carry look-ahead logic) block, which outputs C1, C2, C3, and C4.]
• The CLL takes p,g from all 4 bits and C0 as input to generate all C’s in 2 gate delays.
• C1=g0+p0C0
• C2=g1+p1g0+p1p0C0
• C3=g2+p2g1+p2p1g0+p2p1p0C0
• C4=g3+p3g2+p3p2g1+p3p2p1g0+p3p2p1p0C0 (Note: this C4 is too complicated to generate in 2-level SOP representation)
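The four equations come from repeatedly substituting the bit-level relation C(i+1) = gi + pi·Ci into itself. The Python sketch below checks that the flattened equations match that relation for every combination of p, g, and C0. The function names `cll4` and `ripple_carries` are assumptions made for this example.

```python
from itertools import product


def cll4(p, g, c0):
    """Flattened carry equations; p and g are 4-element lists, index 0 = lowest bit."""
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    return c1, c2, c3, c4


def ripple_carries(p, g, c0):
    """Reference: apply C(i+1) = g_i + p_i * C_i one bit at a time."""
    carries, c = [], c0
    for pi, gi in zip(p, g):
        c = gi | (pi & c)
        carries.append(c)
    return tuple(carries)


# All 2**9 combinations of p0..p3, g0..g3, and C0.
for bits in product((0, 1), repeat=9):
    p, g, c0 = list(bits[0:4]), list(bits[4:8]), bits[8]
    assert cll4(p, g, c0) == ripple_carries(p, g, c0)
```

Note that in `cll4` no carry depends on another carry, only on p, g, and C0, which is what lets the hardware evaluate them all at once.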
4-bit CLA
[Figure: four 1-bit CLA adder blocks (+) with inputs A0 B0 through A3 B3 and C0. Their outputs p0 g0 through p3 g3 feed the CLL (carry look-ahead logic), which returns C1, C2, C3 to the blocks. The blocks output S3 S2 S1 S0.]
• Given A,B’s, all p,g’s are generated in 1 gate delay in parallel.
• Given all p,g’s, all C’s are generated in 2 gate delays in parallel.
• Given all C’s, all S’s are generated in 2 gate delays in parallel.
• Key virtue of CLA: sequential operation in RCA is broken into parallel operation!!
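The three stages can be put together in a short behavioral sketch. This is a software model written for illustration, not a gate-level design: the names `lookahead_carries` and `cla4_add` are hypothetical, the sum bit is computed with XOR for brevity rather than as an explicit SOP, and the returned delay simply adds up the per-stage figures quoted above.

```python
def lookahead_carries(p, g, c0):
    """Each carry is built only from p, g, and c0, never from another carry."""
    carries = []
    for k in range(len(p)):
        # Term for C0 surviving through bits 0..k.
        terms = [c0 & int(all(p[:k + 1]))]
        # Term for a carry generated at bit j and propagated through bits j+1..k.
        terms += [g[j] & int(all(p[j + 1:k + 1])) for j in range(k + 1)]
        carries.append(int(any(terms)))
    return carries  # [C1, C2, ..., C(len(p))]


def cla4_add(a: int, b: int, c0: int = 0):
    """4-bit CLA on integers 0..15; returns (sum, C4, gate_delays)."""
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    # Stage 1 (1 gate delay): all p, g in parallel.
    p = [x | y for x, y in zip(a_bits, b_bits)]
    g = [x & y for x, y in zip(a_bits, b_bits)]
    # Stage 2 (2 gate delays): all carries in parallel.
    c = [c0] + lookahead_carries(p, g, c0)
    # Stage 3 (2 gate delays): all sum bits in parallel.
    s_bits = [a_bits[i] ^ b_bits[i] ^ c[i] for i in range(4)]
    total = sum(bit << i for i, bit in enumerate(s_bits))
    return total, c[4], 1 + 2 + 2


# Exhaustive check against integer addition.
for a in range(16):
    for b in range(16):
        for c0 in (0, 1):
            total, c4, _ = cla4_add(a, b, c0)
            assert total + 16 * c4 == a + b + c0
```

Under these assumptions the 4-bit CLA finishes in 5 gate delays, compared with 8 for the 4-bit RCA.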
Observation
• The CLL block cannot be made too big (at most 4 bits), because if the equations for the C’s are too long they cannot be evaluated in 2 gate delays.
• So how about long operands, say 16 bits?
• We add another layer of CLL and make a multi-level CLA.
16-bit CLA
• Same as before, all p,g’s are generated in parallel in 1 gate delay.
• Now, without an input carry, the first-tier CLLs cannot generate C’s yet. Instead they generate P,G’s (group propagator and group generator) in 2 gate delays.
  P => This group will propagate its input carry to the next group. P=p0p1p2p3
  G => This group will generate an output carry. G=g3+p3g2+p3p2g1+p3p2p1g0
• The second-tier CLL takes the P,G’s from the first-tier CLLs and C0 to generate “seed C’s” for the first-tier CLLs in 2 gate delays. (Note that the logic for generating “seed C’s” from P,G’s is exactly the same as that for generating C’s from p,g’s!)
• With the seed C’s as input, the first-tier CLLs use Cin and p,g’s to generate C’s in 2 gate delays.
• With all C’s in place, S’s are calculated in 2 gate delays.
Therefore, it takes 1+2+2+2+2=9 gate delays in total to finish the whole thing!!
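A behavioral sketch of the two-tier structure, under the same caveats as any software model: it reproduces the data flow (p,g, then group P,G, then seed C's, then the remaining C's, then S's) but not the gate timing. The names `lookahead`, `group_pg`, and `cla16_add` are assumptions made for this example, and the sum bit uses XOR for brevity.

```python
import random


def lookahead(p, g, cin):
    """Flattened carry logic, reused for both tiers (p,g or P,G as input)."""
    out = []
    for k in range(len(p)):
        terms = [cin & int(all(p[:k + 1]))]
        terms += [g[j] & int(all(p[j + 1:k + 1])) for j in range(k + 1)]
        out.append(int(any(terms)))
    return out


def group_pg(p, g):
    """Group propagator P and group generator G of one 4-bit first-tier CLL."""
    big_p = p[0] & p[1] & p[2] & p[3]
    big_g = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    return big_p, big_g


def cla16_add(a: int, b: int, c0: int = 0):
    """16-bit two-tier CLA on integers 0..65535; returns (sum, C16)."""
    a_bits = [(a >> i) & 1 for i in range(16)]
    b_bits = [(b >> i) & 1 for i in range(16)]
    p = [x | y for x, y in zip(a_bits, b_bits)]             # 1 gate delay
    g = [x & y for x, y in zip(a_bits, b_bits)]
    groups = [(p[i:i + 4], g[i:i + 4]) for i in range(0, 16, 4)]
    pg_pairs = [group_pg(gp, gg) for gp, gg in groups]      # +2: P, G per group
    seeds = lookahead([x[0] for x in pg_pairs],
                      [x[1] for x in pg_pairs], c0)         # +2: C4, C8, C12, C16
    c = []
    for (gp, gg), cin in zip(groups, [c0] + seeds[:3]):     # +2: remaining C's
        c += [cin] + lookahead(gp, gg, cin)[:3]
    s_bits = [a_bits[i] ^ b_bits[i] ^ c[i] for i in range(16)]  # +2: S's
    return sum(bit << i for i, bit in enumerate(s_bits)), seeds[3]


# Spot check against integer addition on random operands.
rng = random.Random(0)
for _ in range(1000):
    a, b, c0 = rng.randrange(1 << 16), rng.randrange(1 << 16), rng.randrange(2)
    total, c16 = cla16_add(a, b, c0)
    assert total + (c16 << 16) == a + b + c0
```

The same `lookahead` function serves both tiers, which mirrors the remark that the seed-carry logic is identical to the bit-level carry logic.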
Now, how about 64-bit CLA?
• You can visualize that in your mind by yourself now, I guess.
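As a hedged extrapolation of the counting used for the 16-bit case (not a figure stated in these slides), each extra tier adds 2 gate delays on the way up for group P,G's and 2 on the way down for carries. The Python sketch below applies that rule, assuming 4-bit CLL blocks at every tier; `cla_levels`, `cla_delay`, and `rca_delay` are hypothetical helper names.

```python
def cla_levels(width: int, group: int = 4) -> int:
    """Number of CLL tiers needed when every CLL block handles `group` inputs."""
    levels, span = 1, group
    while span < width:
        span *= group
        levels += 1
    return levels


def cla_delay(width: int) -> int:
    """Worst-case gate delays under the slide's counting model."""
    levels = cla_levels(width)
    up = 2 * (levels - 1)   # group P,G's computed tier by tier, bottom to top
    down = 2 * levels       # carries computed tier by tier, top to bottom
    return 1 + up + down + 2  # 1 for p,g at the start, 2 for S at the end


def rca_delay(width: int) -> int:
    """Ripple-carry adder: 2 gate delays per bit, strictly in sequence."""
    return 2 * width


for width in (4, 16, 64):
    print(width, cla_delay(width), rca_delay(width))
# 4 5 8
# 16 9 32
# 64 13 128
```

Under this model a 64-bit CLA needs three tiers and 13 gate delays, so the delay grows with the number of tiers rather than with the number of bits.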
A bit more details
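[Figure: the 4-bit CLA again. Four 1-bit CLA adder blocks (+) with inputs A0 B0 through A3 B3 and C0 send p0 g0 through p3 g3 to the CLL (carry look-ahead logic), which produces C1, C2, C3, and C4. The blocks output S3 S2 S1 S0.]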
• Do all 4 S’s (S3, S2, S1, S0) arrive together? Actually no! Since C0 is available from the beginning, S0 can be calculated in 2 gate delays, before S3, S2, and S1, using the original SOP expression for the S bit of a single-bit adder.
A bit more details (Cont’d)
• Again, not all the S’s arrive together!
• C0 is readily available, so S0 can be calculated in 2 gate delays.
• Since C0 is readily available, the lowest first-tier CLL can generate C1, C2, C3 independently of the second-tier CLL. Since C1, C2, C3 are done earlier, so are S1, S2, S3 (in 5 gate delays: 1 for (p,g), 2 for (C1,C2,C3), 2 for S).
• When we get C4, C8, C12, we can start to calculate S4, S8, S12 and get them 2 gate delays later. That is the same time we get the other C’s (purple guys), at 7 gate delays.
• Timing for items generated (in terms of gate delay): black=already available (and special case for S0=4), orange=1, green=3, blue=5, purple=7, brown=9
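[Figure: the 16-bit CLA with every signal color-coded by the gate delay at which it becomes available, starting from C0]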
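The arrival times described in the bullets can be tabulated with a short Python sketch. It assumes the 16-bit two-tier structure with 4-bit groups and takes S0 at 2 gate delays, following the statement that S0 is computed directly from A0, B0, and C0. The function name `arrival_times` is hypothetical.

```python
def arrival_times() -> dict[str, int]:
    """Gate delay at which each carry and sum bit of the 16-bit CLA is ready."""
    t = {"C0": 0}            # available from the beginning
    pg_ready = 1             # all p_i, g_i
    group_pg_ready = pg_ready + 2  # P, G of every first-tier CLL
    # The lowest group already has its input carry, so it skips the second tier.
    for i in (1, 2, 3):
        t[f"C{i}"] = pg_ready + 2
    # Seed carries come from the second-tier CLL.
    for i in (4, 8, 12, 16):
        t[f"C{i}"] = group_pg_ready + 2
    # The remaining carries wait for their group's seed carry.
    for base in (4, 8, 12):
        for i in range(base + 1, base + 4):
            t[f"C{i}"] = t[f"C{base}"] + 2
    # Every sum bit is ready 2 gate delays after its own carry-in.
    for i in range(16):
        t[f"S{i}"] = t[f"C{i}"] + 2
    return t


times = arrival_times()
for name in ("S0", "C3", "S3", "C4", "S4", "C5", "S5", "S15"):
    print(name, times[name])
# S0 2, C3 3, S3 5, C4 5, S4 7, C5 7, S5 9, S15 9
```

The last sum bits arrive at 9 gate delays, matching the total worked out for the 16-bit CLA.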
Thanks!!
(You can distribute these slides as one whole file to anywhere you
feel it may be useful.)