Adapted from EE271 notes, Stanford University


  • Delay Calculation

    Kenneth Yun, UC San Diego

    Adapted from EE271 notes, Stanford University

  • Overview

    - Review (RC model)
    - Elmore delay
    - Transmission gates
    - Shared contacts
    - Transistor sizing
    - Gate sizing
    - Reading
      - W&E 4.1-4.3.6, 4.5.4

  • Review of RC Model

    - RC model
      - Model the transistor as a linear resistor
        - R = R□ x L/W, where R□ is the resistance per square
      - Model the load as a capacitor
        - Delay = RC
      - Recall that R is adjusted so that RC gives the correct value
    - Sources of load capacitance
      - Gate cap of the driven transistors
      - Diffusion cap of the source/drain region of the driving transistor
      - Wire cap

  • Rule of Thumb Cap Table

    Transistor cap                    Cap
      gate (poly over diff)           2.0fF/µm
      ndiff (5λ or 6λ wide)           2.0fF/µm
      pdiff (5λ or 6λ wide)           2.0fF/µm

    Wire cap                          Cap
      poly wire (2λ wide)             0.2fF/µm
      metal 1 (3λ or 4λ wide)         0.3fF/µm
      metal 2 (3λ or 4λ wide)         0.2fF/µm

  • Simple Diffusion Cap Model

    - Diffusion cap = W x (µm/λ) x 2fF/µm
      - assuming that the width of the diffusion contact is 5λ (6λ)
      - W in λ
      - µm/λ = 0.2 in a 0.35µm process
    - For example, for W = 16λ in a 0.35µm process
      - Diffusion cap = 16 x 0.2 x 2fF = 6.4fF

    [Figure: diffusion contact region of width W and length 5λ]
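
The diffusion cap rule above can be sketched as a small helper. This is a minimal sketch, not part of the original notes: the function name and parameter names are made up for illustration, and the defaults (0.2 µm per λ, 2 fF/µm) are the slide's 0.35µm rule-of-thumb values.

```python
def diffusion_cap_fF(width_lambda, um_per_lambda=0.2, cap_per_um_fF=2.0):
    """Rule-of-thumb diffusion cap in fF for a contact region W lambda wide.

    Defaults are the slide's 0.35um numbers; other processes need other values.
    """
    return width_lambda * um_per_lambda * cap_per_um_fF


# W = 16 lambda, the slide's worked example (about 6.4 fF)
print(round(diffusion_cap_fF(16), 3))
# An 8 lambda wide region gives half of that (about 3.2 fF)
print(round(diffusion_cap_fF(8), 3))
```

The same helper applies to gate cap, since the table uses 2 fF/µm for both gate and diffusion.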

  • Folding Transistors

    - To reduce diffusion cap
      - Cap of region x for a 16λ transistor = 6.4fF
      - Cap of region x for a 16λ folded transistor = 3.2fF

    [Figure: unfolded 16λ-wide transistor and its folded layout, with region x and the Gnd contacts labeled]

  • RC Delay

    - Rpu = 30K/12 = 2.5K; Rpd = 15K/8 = 1.875K
    - Cload = 100 x 0.3fF + (16+8) x 0.2 x 2fF + (24+16) x 0.2 x 2fF = 30 + 9.6 + 16 = 55.6fF
    - Delay in→x (x falling) = 1.875K x 55.6fF = 104.25ps; Delay in→x (x rising) = 2.5K x 55.6fF = 139ps

    [Figure: inverter (24:2 pMOS, 16:2 nMOS) driving node x over 100µm of M1 into an inverter (16:2 pMOS, 8:2 nMOS)]
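
The arithmetic on this slide can be reproduced with a few lines of Python. This is only a sketch of the slide's lumped RC model: the helper names are hypothetical, and the constants are the rule-of-thumb values used in these notes (30K and 15K per square, 0.2 µm per λ, 2 fF/µm for gate and diffusion, 0.3 fF/µm for metal 1).

```python
R_SQ = {"p": 30e3, "n": 15e3}   # ohms per square (slide values)
UM_PER_LAMBDA = 0.2             # 0.35um process
CAP_PER_UM = 2.0e-15            # F/um, gate and diffusion
M1_CAP_PER_UM = 0.3e-15         # F/um, metal 1


def resistance(kind, w, l=2):
    """Transistor resistance R = Rsq * L / W for a W:L device."""
    return R_SQ[kind] * l / w


def tcap(total_width_lambda):
    """Gate or diffusion cap for a total width given in lambda."""
    return total_width_lambda * UM_PER_LAMBDA * CAP_PER_UM


# wire + gate cap of the driven inverter + diffusion cap of the driver
c_load = 100 * M1_CAP_PER_UM + tcap(16 + 8) + tcap(24 + 16)
fall = resistance("n", 16) * c_load   # pull-down through the 16:2 nMOS
rise = resistance("p", 24) * c_load   # pull-up through the 24:2 pMOS

print(f"Cload = {c_load * 1e15:.1f} fF")
print(f"fall = {fall * 1e12:.2f} ps, rise = {rise * 1e12:.2f} ps")
```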

  • RC Delay (Continued)

    - RP1 = 30K/4 = 7.5K; RN1 = 15K/2 = 7.5K
    - Cx = 100 x 0.3fF + (16+8) x 0.2 x 2fF + (8+4) x 0.2 x 2fF = 30 + 9.6 + 4.8 = 44.4fF
    - Delay in→x = 7.5K x 44.4fF = 333ps

    [Figure: three-inverter chain. P1/N1 (8:2, 4:2) drives node x over 100µm of M1; P2/N2 (16:2, 8:2) drives node y over 200µm of M1; P3/N3 (32:2, 16:2) is the final load]

  • RC Delay (Continued)

    - RP2 = 30K/8 = 3.75K = RN2
    - Cy = 200 x 0.3fF + (32+16) x 0.2 x 2fF + (16+8) x 0.2 x 2fF = 60 + 19.2 + 9.6 = 88.8fF
    - Delay x→y = 3.75K x 88.8fF = 333ps
    - Delay in→y = Delay in→x + Delay x→y = 666ps

    [Figure: same three-inverter chain as on the previous slide]
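
As a cross-check of the two-stage example, the per-stage delays can be computed with one function and summed. This is a sketch under the slide's assumptions; the function and argument names are invented here. Working in kΩ and fF is convenient because their product is directly in ps.

```python
UM_PER_LAMBDA = 0.2     # um per lambda (0.35um process)
CAP_PER_UM_FF = 2.0     # fF/um for gate and diffusion
M1_FF_PER_UM = 0.3      # fF/um for metal 1


def stage_delay_ps(r_kohm, wire_um, load_gate_w, driver_diff_w):
    """Lumped RC delay of one stage; widths are total p+n widths in lambda."""
    c_ff = (wire_um * M1_FF_PER_UM
            + (load_gate_w + driver_diff_w) * UM_PER_LAMBDA * CAP_PER_UM_FF)
    return r_kohm * c_ff   # kOhm * fF = ps


d_in_x = stage_delay_ps(7.5, 100, 16 + 8, 8 + 4)      # P1/N1 driving node x
d_x_y = stage_delay_ps(3.75, 200, 32 + 16, 16 + 8)    # P2/N2 driving node y
print(round(d_in_x, 2), round(d_x_y, 2), round(d_in_x + d_x_y, 2))
```

Both stages come out equal because each inverter is twice the size of its predecessor and drives twice the wire, which anticipates the sizing argument later in the notes.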

  • Series Stacks

    - If C3 >> C1 + C2, then the delay is approximately (R1+R2+R3)C3
    - Otherwise, ???
      - Distributed RC

    [Figure: RC ladder with series resistors R1, R2, R3 and caps C1, C2, C3 to ground]

  • Elmore Delay

    - For a distributed RC network
      - Delay = R1C1 + (R1+R2)C2 + (R1+R2+R3)C3 + (R1+R2+R3+R4)C4
      - Sum of the delays to charge (discharge) the individual capacitors

    $$\text{Delay} = \sum_{k=1}^{n} C_k \sum_{i=1}^{k} R_i$$

    [Figure: four-stage RC ladder with series resistors R1 to R4 and caps C1 to C4]
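
The Elmore sum for a simple RC ladder translates directly into a loop that accumulates the upstream resistance. A minimal sketch (the function name is illustrative, and it only handles an unbranched ladder like the one in the figure):

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay of an RC ladder: sum over k of (R1 + ... + Rk) * Ck.

    resistances[k] is the series resistor feeding node k,
    capacitances[k] is the cap from node k to ground.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need exactly one capacitor per resistor")
    delay = 0.0
    upstream_r = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_r += r            # total resistance between source and this node
        delay += upstream_r * c    # contribution of this node's capacitor
    return delay


# Four equal stages with R = C = 1: 1 + 2 + 3 + 4 = 10
print(elmore_delay([1, 1, 1, 1], [1, 1, 1, 1]))
```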

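  • Distributed RC Model for Wire

    - R and C are not really lumped
    - Use the π model
    - Break the wire into many lumped elements
    - Delay is independent of the number of segments

    $$\text{Delay} = R_{\text{trans}} C + \frac{RC}{2}$$

    [Figure: three models of a wire driven through Rtrans: a single lumped C; one π segment (C/2, R, C/2); two π segments (C/4, R/2, C/2, R/2, C/4)]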
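
The claim that the π-model delay does not depend on the number of segments can be checked numerically by applying the Elmore sum to a wire split into equal π segments. This is a sketch with a hypothetical function name; the driver and wire values in the usage lines are made-up numbers chosen only for illustration.

```python
def pi_wire_delay(r_trans, r_wire, c_wire, segments):
    """Elmore delay of a driver (r_trans) plus a wire cut into equal pi segments."""
    r_seg = r_wire / segments
    c_seg = c_wire / segments
    # The node right after the driver carries half of the first segment's cap.
    delay = r_trans * c_seg / 2
    upstream = r_trans
    for k in range(1, segments + 1):
        upstream += r_seg
        # Interior nodes collect C/2 from each neighbouring segment;
        # the far end only gets C/2 from the last segment.
        node_c = c_seg if k < segments else c_seg / 2
        delay += upstream * node_c
    return delay


# Hypothetical values: 1 kOhm driver, 500 Ohm wire, 100 fF wire cap
for n in (1, 2, 4, 16):
    print(n, pi_wire_delay(1e3, 500.0, 100e-15, n))
```

Every segment count gives the closed form Rtrans x C + R x C / 2 up to floating-point rounding.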

  • Series Stack

    - Gate cap
      - Cap between gate and channel
      - Case I: gate cap is seen only from the gate side if the source is grounded
      - Case II: gate cap is also seen on the other side if the source voltage is in transition (and the gate voltage is fixed)

    [Figure: Case I and Case II, showing gate cap Cg and node cap Cx relative to the source]

  • Series Stack Delay

    - R = 15K/4 = 3.75K
    - C = Cdiff + Cgate = 2 x 8 x 0.2 x 2fF = 6.4fF
    - Cout = Cdiff + Cload = 3.2fF + Cload
    - Delay = RC + 2RC + 3RC + 4RCout = 6RC + 4RCout = 144ps + 4RCout
    - For n stages, delay = n(n-1)RC/2 + nRCout
      - O(n²): quadratic in n

    [Figure: four series transistors between Gnd and out, modeled as an RC ladder with three internal caps C and output cap Cout]
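
The quadratic growth is easy to see by evaluating the Elmore sum for a stack of n identical transistors. A minimal sketch with an illustrative function name; R and C are the slide's 3.75K and 6.4fF, and Cout is left as a parameter.

```python
def series_stack_delay(n, r, c, c_out):
    """Elmore delay of n identical series transistors.

    The n-1 internal nodes each carry c; the output node carries c_out.
    """
    internal = sum(k * r * c for k in range(1, n))   # RC + 2RC + ... + (n-1)RC
    return internal + n * r * c_out


r, c = 3.75e3, 6.4e-15
# Internal-node part for the 4-high stack on the slide: 6RC, about 144 ps
print(series_stack_delay(4, r, c, 0.0) * 1e12)
# Doubling the stack height roughly quadruples the internal part
print(series_stack_delay(8, r, c, 0.0) * 1e12)
```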

  • Resistance of Transmission Gate

    - Two transistors in parallel
    - When passing 0, the resistance of the pMOS doubles (roughly) because |Vgs| goes from Vdd to |Vth| for the pMOS
      - Rp = 2 x 30K/□ = 60K/□
      - Rn = 15K/□
      - For 1:1 p to n ratio, R = Rp || Rn = 12K/□
    - When passing 1, the resistance of the nMOS doubles (roughly) because Vgs goes from Vdd to Vth for the nMOS
      - Rn = 2 x 15K/□ = 30K/□
      - Rp = 30K/□
      - For 1:1 p to n ratio, R = Rp || Rn = 15K/□
    - So R ≈ 15K/□
      - For an 8:2 T-gate, R ≈ 3.75K
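
The two parallel combinations can be checked in a few lines. This is a sketch of the slide's rough model only; taking the larger of the two cases as the working value is an assumption made here to match the slide's "R ≈ 15K/□" conclusion, and the names are illustrative.

```python
def parallel(a, b):
    """Resistance of two resistors in parallel."""
    return a * b / (a + b)


RP_SQ, RN_SQ = 30e3, 15e3   # ohms per square, slide values

passing_0 = parallel(2 * RP_SQ, RN_SQ)   # pMOS weakened: 60K || 15K
passing_1 = parallel(RP_SQ, 2 * RN_SQ)   # nMOS weakened: 30K || 30K

# Assumption: use the worse (larger) case as the T-gate resistance per square
r_sq = max(passing_0, passing_1)
r_tgate_8_2 = r_sq * 2 / 8               # 8:2 device, R = Rsq * L / W

print(passing_0, passing_1, r_tgate_8_2)
```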

  • Capacitance of Transmission Gate

    - When off, the cap on node A (or B) comes entirely from two diffusion contacts
      - 2 x 8 x 0.2 x 2fF = 6.4fF
    - When on, an additional gate cap is also seen because a source voltage is in transition
      - But only one gate cap matters because Vgs = 0 for the pMOS when passing 0, and Vgs = 0 for the nMOS when passing 1
      - Total cap = 3 x 3.2fF = 9.6fF

    [Figure: T-gate built from two 8:2 transistors and its model: 3.75K between nodes A and B, with 9.6fF on each node]

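  • Transmission Gate Example

    - Ca = 2 diff contacts (INV1) + TG cap (on) = 2 x 3.2fF + 9.6fF = 16fF
    - Cb = TG cap (on) + TG cap (off) + gate cap (INV2) = 9.6fF + 6.4fF + 6.4fF = 22.4fF
      - Why is the second TG cap smaller?
    - All transistors are 8:2

    [Figure: inverter 1 drives node a; an on T-gate (3.75K, 9.6fF on each side) connects a to b; an off T-gate and inverter 2 also attach to b; signals in, c, d and select values 0 and 1 are labeled]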

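  • RC Model of Example

    - Pull-up case (7.5K driver): t in→b = 7.5K x 16fF + (7.5K + 3.75K) x 22.4fF = 372ps
    - Pull-down case (3.75K driver): t in→b = 3.75K x 16fF + (3.75K + 3.75K) x 22.4fF = 228ps

    [Figure: two RC ladders. Driver resistance (7.5K or 3.75K) to node a with 16fF, then 3.75K to node b with 22.4fF]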

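  • Shared Contacts

    - Cap on node a reduced by 6.4fF
      - Inverter and TG share diffusion contacts
    - Cap on node b reduced by 6.4fF
      - Two TGs share diffusion contacts

    [Figure: layout of INV, TG, TG with shared diffusion contacts at nodes a and b; resulting model has 9.6fF on a, 3.75K for the TG, and 16fF on b]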

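  • RC Delay with Shared Contacts

    - Pull-up case (7.5K driver): t in→b = 7.5K x 9.6fF + (7.5K + 3.75K) x 16fF = 252ps
    - Pull-down case (3.75K driver): t in→b = 3.75K x 9.6fF + (3.75K + 3.75K) x 16fF = 156ps
    - > 30% reduction in delay!

    [Figure: two RC ladders. Driver resistance (7.5K or 3.75K) to node a with 9.6fF, then 3.75K to node b with 16fF]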
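
The delay reduction from shared contacts can be reproduced by evaluating the two-node Elmore delay with the node caps before sharing (16fF and 22.4fF, from the transmission gate example) and after sharing (9.6fF and 16fF). A small sketch with an illustrative function name; kΩ times fF gives ps.

```python
def two_node_delay_ps(r_drv_k, r_tg_k, c_a_ff, c_b_ff):
    """Elmore delay from in to b: driver -> node a -> T-gate -> node b."""
    return r_drv_k * c_a_ff + (r_drv_k + r_tg_k) * c_b_ff


for r_drv in (7.5, 3.75):   # pull-up and pull-down driver resistance in kOhm
    before = two_node_delay_ps(r_drv, 3.75, 16.0, 22.4)
    after = two_node_delay_ps(r_drv, 3.75, 9.6, 16.0)
    saving = 100 * (1 - after / before)
    print(f"{r_drv}K driver: {before:.0f} ps -> {after:.0f} ps ({saving:.1f}% lower)")
```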

  • Transistor Sizing

    - Need delay estimates to size transistors
    - Need to know
      - Load the transistor drives
      - Load the transistor presents to its predecessor
    - For example, to drive a large cap (2pF in the figure)
      - need a large driver (400:2 and 200:2 for p and n)
    - But a large driver also slows down the predecessor stage

    [Figure: three driving scenarios]
      - 8:2 / 4:2 inverter driving 2pF (10mm of M2) directly: 15ns delay
      - 400:2 / 200:2 inverter driving 2pF: 300ps delay
      - 8:2 / 4:2 inverter driving the 400:2 / 200:2 inverter: 7.5K x 600 x 0.2 x 2fF = 1.8ns delay
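
The three delays quoted in the figure follow from the same RC rules used earlier. The sketch below recomputes them; the helper name is invented here, only the pull-up resistance is used (as on the slide), and diffusion cap of the drivers is ignored, which is a simplification the slide's numbers also appear to make.

```python
def r_pullup(w, l=2, r_sq=30e3):
    """pMOS pull-up resistance for a W:L device, 30K per square."""
    return r_sq * l / w


C_BIG = 2e-12                                   # 2 pF load (10 mm of M2 at 0.2 fF/um)

small_direct = r_pullup(8) * C_BIG              # 8:2 driving 2 pF directly
big_driver = r_pullup(400) * C_BIG              # 400:2 driving 2 pF
big_gate_cap = (400 + 200) * 0.2 * 2e-15        # gate cap of the 400:2 / 200:2 inverter
small_into_big = r_pullup(8) * big_gate_cap     # 8:2 driving the big inverter

print(f"direct: {small_direct * 1e9:.1f} ns")
print(f"two stages: {(small_into_big + big_driver) * 1e9:.1f} ns")
```

Under these assumptions the two-stage path is several times faster than driving the load directly, but most of its delay now sits in the small stage, which motivates spreading the size increase over several stages.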

  • Optimum Transistor Sizing

    - Minimize the delay of the chain
      - Equalize the delay of every gate
    - How?
      - Each gate drives a gate f times larger

    [Figure: inverter chain with relative sizes 1, f, f², f³]

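  • Optimum Transistor Sizing Justification

    - Introduce an irregularity in the chain by making the second inverter g times as large as the first (instead of f times)
    - Assume Wp = 2Wn so that rise time = fall time
    - Then the delay becomes

    $$\text{Delay} = R\,(gC) + \frac{R}{g}\,(f^2 C) + \frac{R}{f^2}\,(f^3 C) + \frac{R}{f^3}\,C_{\text{load}} = gRC + \frac{f^2}{g}RC + fRC + \frac{1}{f^3}RC_{\text{load}}$$

    - Differentiate it with respect to g

    $$\frac{\partial\,\text{Delay}}{\partial g} = \left(1 - \frac{f^2}{g^2}\right) RC = 0$$

    - The optimum value of g is then equal to f

    [Figure: chain with sizes 1, g, f², f³; driver resistances R, R/g, R/f², R/f³; load caps gC, f²C, f³C, Cload]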
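
The result g = f can be confirmed numerically by sweeping g for a fixed f and picking the value with the smallest chain delay. This is an illustrative sketch: the function name, the choice f = 3, and the sweep range are all arbitrary assumptions, and delays are expressed in units of RC.

```python
def chain_delay(g, f, rc=1.0, r_cload=0.0):
    """Delay of the 1, g, f^2, f^3 chain: (g + f^2/g + f) RC + R*Cload / f^3."""
    return (g + f ** 2 / g + f) * rc + r_cload / f ** 3


f = 3.0                                             # hypothetical stage ratio
candidates = [1.0 + 0.01 * i for i in range(500)]   # g from 1.00 to 5.99
best_g = min(candidates, key=lambda g: chain_delay(g, f))

print(round(best_g, 2))
```

The last term does not depend on g, so the load has no effect on where the minimum falls.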

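  • Choosing f

    - Each inverter drives an inverter f times its size
    - N inverters in the chain
    - R = resistance of a driving transistor in the first inverter (assume that Wp = 2Wn)
    - C = input cap of the first inverter

    $$f^N = \frac{C_{\text{load}}}{C} \quad\Rightarrow\quad N = \frac{\ln(C_{\text{load}}/C)}{\ln f}$$

    Total delay:

    $$\text{Delay} = N f RC = \frac{\ln(C_{\text{load}}/C)}{\ln f}\, f RC$$

    $$\frac{\partial\,\text{Delay}}{\partial f} = \frac{\ln f - 1}{(\ln f)^2}\,\ln\!\left(\frac{C_{\text{load}}}{C}\right) RC = 0 \quad\Rightarrow\quad f = e$$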
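
Evaluating the total delay for a few stage ratios shows the shallow minimum at f = e. A sketch only: the function name and the Cload/C ratio of 1000 are hypothetical, and N is treated as a real number rather than rounded to a whole number of stages.

```python
import math


def chain_total_delay(f, cload_over_c, rc=1.0):
    """Total delay N * f * RC with N = ln(Cload/C) / ln(f), N not rounded."""
    n = math.log(cload_over_c) / math.log(f)
    return n * f * rc


ratio = 1000.0   # hypothetical Cload / C
for f in (2.0, math.e, 4.0, 8.0):
    n = math.log(ratio) / math.log(f)
    print(f"f = {f:.3f}: N = {n:.2f}, delay = {chain_total_delay(f, ratio):.2f} RC")
```

The delay at f = 2 or f = 4 is only a few percent above the minimum, so a somewhat larger f costs little delay and saves stages.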

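  • Choosing f (continued)

    - So far, we ignored diffusion and wire cap
    - Assuming that Cdiff is the diffusion cap of the first inverter
      - Inverter delay ≈ fRC + RCdiff = (f+γ)RC
        - where γ = Cdiff / C
      - Why is the second factor (RCdiff) independent of f?
    - Total delay:

    $$\text{Delay} = N (f + \gamma) RC = \frac{\ln(C_{\text{load}}/C)}{\ln f}\,(f + \gamma)\, RC$$

    $$\frac{\partial\,\text{Delay}}{\partial f} = \frac{\ln f - 1 - \gamma/f}{(\ln f)^2}\,\ln\!\left(\frac{C_{\text{load}}}{C}\right) RC = 0$$

    $$\ln f = 1 + \frac{\gamma}{f} \quad\Longleftrightarrow\quad f = e^{\,1 + \gamma/f}$$

    - For small γ, the optimum f lies between e and about 4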
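
The optimum condition ln f = 1 + γ/f has no closed-form solution, but it is easy to solve numerically. The sketch below uses plain fixed-point iteration on f = exp(1 + γ/f); this solver choice and the function name are assumptions made for illustration, and the iteration is only expected to converge for modest γ (the update has slope -γ/f at the solution).

```python
import math


def optimum_fanout(gamma, iterations=200):
    """Solve ln(f) = 1 + gamma / f by iterating f <- exp(1 + gamma / f).

    Intended for small gamma (a few units at most), where the iteration contracts.
    """
    f = math.e   # exact answer when gamma = 0
    for _ in range(iterations):
        f = math.exp(1.0 + gamma / f)
    return f


for gamma in (0.0, 0.5, 1.0, 2.0):
    print(f"gamma = {gamma}: f = {optimum_fanout(gamma):.3f}")
```

With γ = 0 this returns e, and the optimum grows as the self-loading term γ increases.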

  • Gate Sizing

    - Equalize the loaded delay in every stage
    - If the delays are not equal,
      - Make the gate with the longest delay larger,
      - which decreases its delay but increases its predecessor's delay
      - But the overall delay decreases as long as the delay reduction is greater than the increase in the predecessor's delay
      - Repeat until all delays are equal

  • Gate Sizing Justification

    - Suppose N2 has the longest delay
    - Make N2 x times larger

    [Equation: comparison of the total delay before and after scaling N2 by x, written in terms of R1C1 and R2C2]

  • Standard Cell Transistor Size

    - Delay should not be too sensitive to placement
      - which implies that wire cap should be small compared to the total
    - Long wires are in mms (0.2pF/mm)
    - Thus transistors should be large

  • Wire Delay

    - For short wires, Rwire

  • Long Wires

    - Break them into three regions
    - For the middle region, optimal spacing is determined when
      - the added buffer delay matches the reduction in wire delay

    [Figure: plot with labels "Optimal repeater distance" and "Optimal repeater size"]

  • Rules of Thumb for Fast Designs

    - Keep fanouts of all gates less than 5
    - Keep delays of gates in the critical path roughly the same
    - Large fanin gates should have fewer fanouts
    - Limit fanin
    - Use short buffer chains (sometimes one inverter) when necessary
    - Use bubble shuffling to reduce logic