ASICs the Course Book

download ASICs the Course Book

of 189

Transcript of ASICs the Course Book

  • 8/9/2019 ASICs the Course Book

    1/505

    ASICs... the course

    Michael John Sebastian Smith

    This course is based on ASICs... the book

    Application-Specic Integrated CircuitsMichael J. S. SmithVLSI Design Series1,040 pagesISBN 0-201-50022-1LOC TK7874.6.S63Addison Wesley Longman, http://www.awl.com

    Additional material (gures, resources, source code) is located atASICs... the website

    http://spectra.eng.hawaii.edu/~msmith/ASICs/HTML/ASICs.htm

  • 8/9/2019 ASICs the Course Book

    2/505

    Some material in this work is reprinted from IEEE Std 1149.1-1990, “IEEE Standard Test Access Port and Boundary-Scan Archi-tecture,” Copyright © 1990; IEEE Std 1076/INT-1991 “IEEE Standards Interpretations: IEEE Std 1076-1987, IEEE StandardVHDL Language Reference Manual,” Copyright © 1991; IEEE Std 1076-1993 “IEEE Standard VHDL Language ReferenceManual,” Copyright © 1993; IEEE Std 1164-1993 “IEEE Standard Multivalue Logic System for VHDL Model Interoperability(Std_logic_1164),” Copyright © 1993; IEEE Std 1149.1b-1994 “Supplement to IEEE Std 1149.1-1990, IEEE Standard TestAccess Port and Boundary-Scan Architecture,” Copyright © 1994; IEEE Std 1076.4-1995 “IEEE Standard for VITAL Applica-tion-Specic Integerated Circuit (ASIC) Modeling Specication,” Copyright © 1995; IEEE 1364-1995 “IEEE Standard Descrip-tion Language Based on the Verilog ® Hardware Description Language,” Copyright © 1995; and IEEE Std 1076.3-1997 “IEEEStandard for VHDL Synthesis Packages,” Copyright © 1997; by the Institute of Electrical and Electronics Engineers, Inc. TheIEEE disclaims any responsibility or liability resulting from the placement and use in the described manner. Information isreprinted with the permission of the IEEE . Figures describing Xilinx FPGAs are courtesy of Xilinx, Inc. ©Xilinx, Inc. 1996,1997, 1998. All rights reserved. Figures describing Altera CPLDs are courtesy of Altera Corporation. Altera is a trademark andservice mark of Altera Corporation in the United States and other countries. Altera products are the intellectual property of AlteraCorporation and are protected by copyright laws and one or more U.S. and foreign patents and patent applications. Figuresdescribing Actel FPGAs iare courtesy of Actel Corporation.

    The programs and applications presented in this work have been included for their instructional value. They have been tested with

    care but are not guaranteed for any particular purpose. The author does not offer any warranties, representations, or accept any lia-bilities with respect to the programs or applications.

    Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where thosedesignations appear in this work, and the author was aware of a trademark claim, the designations have been printed in initial caps orall caps.

    Figures copyright © 1997 by Addison Wesley Longman, Inc. Text copyright © 1997, 1998 by Michael John Sebastian Smith.

  • 8/9/2019 ASICs the Course Book

    3/505

    ASICs...THE COURSE (1 WEEK)

    1

    INTRODUCTIONTO ASICs

    An ASIC (“a-sick”) is an application-specific integrated circuit

    A gate equivalent is a NAND gate F = A • B (IBM uses a NOR gate), or four transistors

    History of integration: small-scale integration (SSI , ~10 gates per chip, 60’s), medium-scale integration (MSI, ~100–1000 gates per chip, 70’s), large-scale integration (LSI ,~1000–10,000 gates per chip, 80’s), very large-scale integration (VLSI , ~10,000–100,000gates per chip, 90’s), ultralarge scale integration (ULSI , ~1M–10M gates per chip)

    History of technology: bipolar technology and transistor–transistor logic (TTL) precededmetal-oxide-silicon (MOS ) technology because it was difcult to make metal-gate n-chan-nel MOS ( nMOS or NMOS ); the introduction of complementary MOS (CMOS , never cMOS)greatly reduced power

    The feature size is the smallest shape you can make on a chip and is measured in λ orlambda

    Origin of ASICs: the standard parts , initially used to design microelectronic systems ,were gradually replaced with a combination of glue logic , custom ICs , dynamic random-access memory (DRAM) and static RAM (SRAM )

    History of ASICs: The IEEE Custom Integrated Circuits Conference (CICC) and IEEE Inter-national ASIC Conference document the development of ASICs

    Application-specific standard products (ASSPs ) are a cross between standard parts andASICs

    1.1 Types of ASICsICs are made on a wafer . Circuits are built up with successive mask layers . The number ofmasks used to dene the interconnect and other layers is different between full-customICs and programmable ASICs

    Key concepts: The difference between full-custom and semicustom ASICs • The difference

    between standard-cell, gate-array, and programmable ASICs • ASIC design flow • Design

    economics • ASIC cell library

  • 8/9/2019 ASICs the Course Book

    4/505

    2 SECTION 1 INTRODUCTION TO ASICs ASICS... THE COURSE

    1.1.1 Full-Custom ASICsAll mask layers are customized in a full-custom ASIC .

    It only makes sense to design a full-custom IC if there are no libraries available.

    Full-custom offers the highest performance and lowest part cost (smallest die size) with thedisadvantages of increased design time, complexity, design expense, and highest risk.

    Microprocessors were exclusively full-custom, but designers are increasingly turning tosemicustom ASIC techniques in this area too.

    Other examples of full-custom ICs or ASICs are requirements for high-voltage (automobile),analog/digital (communications), or sensors and actuators.

    1.1.2 Standard-Cell–Based ASICs

    In datapath (DP ) logic we may use a datapath compiler and a datapath library . Cells suchas arithmetic and logical units (ALUs ) are pitch-matched to each other to improve timingand density.

    A silicon chip or integrated cicuit (IC) is more properly called a die

    A cell-based ASIC (CBIC —“sea-bick”)• Standard cells

    • Possibly megacells , megafunctions , full-custom blocks , system-level macros (SLMs ),fixed blocks , cores , or Functional StandardBlocks (FSBs )

    • All mask layers are customized—transistors andinterconnect

    • Custom blocks can be embedded

    • Manufacturing lead time is about eight weeks.

    silicondie

    (a) (b)0.1 inch

    4 5

    standard-cellarea

    2

    fixedblocks

    3

    0.02in500 µm

    1

  • 8/9/2019 ASICs the Course Book

    5/505

    ASICs... THE COURSE 1.1 Types of ASICs 3

    1.1.3 Gate-Array–Based ASICs

    A gate array , masked gate array , MGA, or prediffused array uses macros (books ) toreduce turnaround time and comprises a base array made from a base cell or primitivecell . There are three types:

    • Channeled gate arrays

    • Channelless gate arrays

    • Structured gate arrays

    Looking down on the layout of a standard cell from a standard-cell library

    pdiff

    n-well

    p-well

    ndiff

    pdiff

    ndiff

    VDD

    GND

    via

    cell bounding box(BB)

    m1

    contact

    poly

    A1 B1Z

    10λ

    (AB)cell abutment box

    pdiff

    metal2

  • 8/9/2019 ASICs the Course Book

    6/505

    4 SECTION 1 INTRODUCTION TO ASICs ASICS... THE COURSE

    Routing a CBIC (cell-based IC)

    • A “wall” of standard cells forms a flexible block

    • metal2 may be used in a feedthrough cell to cross over cell rows that use metal1 for wir-

    ing• Other wiring cells: spacer cells , row-end cells , and power cells

    A note on the use of hyphens and dashes in the spelling (orthography) of compound nouns: Be

    careful to distinguish between a “high-school girl” (a girl of high-school age) and a “high school

    girl” (is she on drugs or perhaps very tall?).

    We write “channeled gate array,” but “channeled gate-array architecture” because the gate

    array is channeled; it is not “channeled-gate array architecture” (which is an array of chan-

    neled-gates) or “channeled gate array architecture” (which is ambiguous).

    We write gate-array–based ASICs (with a en-dash between array and based) to mean (gate

    array)-based ASICs.

    expanded viewof part of flexible

    block 1

    rows of standard cells

    terminal250 λ

    50 λ

    VDDVSS

    Z

    cell A.11

    cell A.132

    I1

    VDDVSS

    metal1

    metal2 power cell

    row-endcells

    spacer

    cells

    to powerpads

    metal2

    metal1

    cell A.23cell A.14

    to powerpads

    metal2

    metal1

    noconnection

    connection

    1

    feedthrough

  • 8/9/2019 ASICs the Course Book

    7/505

    ASICs... THE COURSE 1.1 Types of ASICs 5

    1.1.4 Channeled Gate Array

    1.1.5 Channelless Gate Array

    1.1.6 Structured Gate Array

    A channeled gate array

    • Only the interconnect is customized

    • The interconnect uses predefined spaces between rowsof base cells

    • Manufacturing lead time is between two days and twoweeks

    A channelless gate array (channel-free gate array , sea-of-gates array , or SOG array)

    • Only some (the top few) mask layers are customized—the interconnect

    • Manufacturing lead time is between two days and twoweeks.

    array ofbase cells(not allshown)

    base cell

    array ofbase cells(not allshown)

    base cell

  • 8/9/2019 ASICs the Course Book

    8/505

    6 SECTION 1 INTRODUCTION TO ASICs ASICS... THE COURSE

    1.1.7 Programmable Logic Devices

    An embedded gate array or structured gatearray (masterslice or masterimage )

    • Only the interconnect is customized

    • Custom blocks (the same for each design)can be embedded

    • Manufacturing lead time is between two daysand two weeks.

    Examples and types of PLDs: read-only memory (ROM ) • programmable ROM or PROM •

    electrically programmable ROM , or EPROM • An erasable PLD (EPLD) • electrically eras-

    able PROM , or EEPROM • UV-erasable PROM , or UVPROM • mask-programmable ROM

    • A mask-programmed PLD usually uses bipolar technology

    Logic arrays may be either a Programmable Array Logic (PAL ® , a registered trademark of

    AMD) or a programmable logic array (PLA); both have an AND plane and an OR plane

    A programmable logic device (PLD )

    • No customized mask layers or logic cells

    • Fast design turnaround

    • A single large block of programmable intercon-nect

    • A matrix of logic macrocells that usually consist ofprogrammable array logic followed by a flip-flop orlatch

    embeddedblock

    array ofbase cells(not allshown)

    macrocell

    programmableinterconnect

  • 8/9/2019 ASICs the Course Book

    9/505

    ASICs... THE COURSE 1.2 Design Flow 7

    1.1.8 Field-Programmable Gate Arrays

    1.2 Design FlowA design flow is a sequence of steps to design an ASIC

    1. Design entry . Using a hardware description language (HDL) or schematic entry.

    2. Logic synthesis . Produces a netlist —logic cells and their connections.

    3. System partitioning . Divide a large system into ASIC-sized pieces.

    4. Prelayout simulation . Check to see if the design functions correctly.

    5. Floorplanning . Arrange the blocks of the netlist on the chip.

    6. Placement . Decide the locations of cells in a block.

    7. Routing . Make the connections between cells and blocks.

    8. Extraction . Determine the resistance and capacitance of the interconnect.

    9. Postlayout simulation . Check to see the design still works with the added loads of theinterconnect.

    1.3 Case StudySPARCstation 1: Better performance at lower cost • Compact size, reduced power, and quietoperation • Reduced number of parts, easier assembly, and improved reliability

    A field-programmable gate array (FPGA ) orcomplex PLD

    • None of the mask layers are customized• A method for programming the basic logiccells and the interconnect

    • The core is a regular array of programmablebasic logic cells that can implement combina-tional as well as sequential logic (flip-flops)

    • A matrix of programmable interconnect sur-rounds the basic logic cells

    • Programmable I/O cells surround the core

    • Design turnaround is a few hours

    programmablebasic logiccell

    programmableinterconnect

  • 8/9/2019 ASICs the Course Book

    10/505

    8 SECTION 1 INTRODUCTION TO ASICs ASICS... THE COURSE

    ASIC design ow. Steps 1–4 are logical design , and steps 5–9 are physical design

    The ASICs in the Sun Microsystems SPARCstation 1

    SPARCstation 1 ASIC Gates (k-gates)1 SPARC integer unit (IU) 202 SPARC oating-point unit (FPU) 503 Cache controller 94 Memory-management unit (MMU) 55 Data buffer 36 Direct memory access (DMA) controller 97 Video controller/data buffer 48 RAM controller 19 Clock generator 1

    design entry

    systempartitioning

    floorplanning

    placement

    routing

    logic synthesis

    VHDL/Verilog

    chip

    block

    logic cells

    netlist

    prelayoutsimulation

    circuitextraction

    postlayoutsimulation

    back-annotatednetlist finish

    start

    physicaldesign

    logicaldesign

    A B

    A

    14

    2

    3

    59

    6

    78

  • 8/9/2019 ASICs the Course Book

    11/505

    ASICs... THE COURSE 1.4 Economics of ASICs 9

    1.4 Economics of ASICsWe’ll compare the most popular types of ASICs: an FPGA, an MGA, and a CBIC. The g-ures in the following sections are approximate and used to illustrate the different compo-nents of cost.

    1.4.1 Comparison Between ASIC Technologies

    Example of an ASIC part cost : A 0.5 µm, 20k-gate array might cost 0.01–0.02 cents/gate(for more than 10,000 parts) or $2–$4 per part, but an equivalent FPGA might be $20.

    When does it make sense to use a more expensive part? This is what we shall examinenext.

    The CAD tools used in the design of the Sun Microsystems SPARCstation 1

    Design level Function ToolASIC design ASIC physical design LSI Logic

    ASIC logic synthesis Internal tools and UC Berkeley toolsASIC simulation LSI Logic

    Board design Schematic capture Valid LogicPCB layout Valid Logic AllegroTiming verication Quad Design Motive and internal tools

    Mechanical design Case and enclosure AutocadThermal analysis Pacic NumerixStructural analysis Cosmos

    Management Scheduling SuntracDocumentation Interleaf and FrameMaker

  • 8/9/2019 ASICs the Course Book

    12/505

    10 SECTION 1 INTRODUCTION TO ASICs ASICS... THE COURSE

    1.4.2 Product CostIn a product cost there are fixed costs and variable costs (the number of products sold isthe sales volume ):

    In a product made from parts the total cost for any part is

    For example, suppose we have the following (imaginary) costs:

    • FPGA: $21,800 (xed) $39 (variable)

    • MGA: $86,000 (xed) $10 (variable)

    • CBIC $146,000 (xed) $8 (variable)

    Then we can calculate the following break-even volumes :

    • FPGA/MGA ≈ 2000 parts

    • FPGA/CBIC ≈ 4000 parts

    • MGA/CBIC ≈ 20,000 parts

    total product cost = xed product cost + variable product cost × products sold

    total part cost = xed part cost + variable cost per part × volume of parts

    Break-even graph

    cost of parts

    number of parts or volume

    $10,000

    $100,000

    $1,000,000

    10 100 1000 10,000 100,000

    break-evenFPGA/MGA

    FPGA

    MGA

    CBIC

    break-evenFPGA/CBIC

    break-evenMGA/CBIC

  • 8/9/2019 ASICs the Course Book

    13/505

    ASICs... THE COURSE 1.4 Economics of ASICs 11

    1.4.3 ASIC Fixed Costs

    Spreadsheet, “Fixed Costs”

    Examples of xed costs: training cost for a new electronic design automation (EDA ) sys-

    tem • hardware and software cost • productivity • production test and design for test •programming costs for an FPGA • nonrecurring-engineering (NRE ) • test vectors and

    test-program development cost • pass (turn or spin ) • profit model represents the profit

    flow during the product lifetime • product velocity • second source

    FPGA MGA CBICTraining: $800 $2,000 $2,000

    Days 2 5 5Cost/day $400 $400 $400

    Hardware $10,000 $10,000 $10,000Software $1,000 $20,000 $40,000Design: $8,000 $20,000 $20,000

    Size (gates) 10,000 10,000 10,000Gates/day 500 200 200

    Days 20 50 50Cost/day $400 $400 $400

    Design for test: $2,000 $2,000Days 5 5

    Cost/day $400 $400

    NRE: $30,000 $70,000 Masks $10,000 $50,000 Simulation $10,000 $10,000

    Test program $10,000 $10,000Second source: $2,000 $2,000 $2,000

    Days 5 5 5Cost/day $400 $400 $400

    Total fixed costs $21,800 $86,000 $146,000

  • 8/9/2019 ASICs the Course Book

    14/505

    12 SECTION 1 INTRODUCTION TO ASICs ASICS... THE COURSE

    Prot model

    delay to market, d

    peak sales

    end ofproduct life

    sales perquarter, s

    timeQ1 Q2 Q3 Q4 Q1 Q2

    $10M

    $20M productintroduction

    t1 t2 t3

    s1

    s2

    lost sales

  • 8/9/2019 ASICs the Course Book

    15/505

    ASICs... THE COURSE 1.4 Economics of ASICs 13

    1.4.4 ASIC Variable Costs

    Spreadsheet, “Variable Costs”

    Factors affecting xed costs: wafer size • wafer cost • Moore’s Law (Gordon Moore of Intel)

    • gate density • gate utilization • die size • die per wafer • defect density • yield • die cost• profit margin (depends on fab or fabless ) • price per gate • part cost

    FPGA MGA CBIC UnitsWafer size 6 6 6 inchesWafer cost 1,400 1,300 1,500 $Design 10,000 10,000 10,000 gatesDensity 10,000 20,000 25,000 gates/sq.cmUtilization 60 85 100 %

    Die size 1.67 0.59 0.40 sq.cmDie/wafer 88 248 365Defect density 1.10 0.90 1.00 defects/sq.cmYield 65 72 80 %Die cost 25 7 5 $Profit margin 60 45 50 %Price/gate 0.39 0.10 0.08 cents

    Part cost $39 $10 $8

  • 8/9/2019 ASICs the Course Book

    16/505

    14 SECTION 1 INTRODUCTION TO ASICs ASICS... THE COURSE

    Example price per gate gures

    0.01

    0.10

    1.00

    cents/gate

    1984 1986 1988 1990 1992 1994 1996

    CBIC 2 µm

    CBIC 1.5 µm

    CBIC 1 µm

    CBIC 0.6 µm

    FPGA 1 µm

    FPGA 0.6 µm –32%/year

  • 8/9/2019 ASICs the Course Book

    17/505

    ASICs... THE COURSE 1.5 ASIC Cell Libraries 15

    1.5 ASIC Cell Libraries

    You can:

    (1) use a design kit from the ASIC vendor(2) buy an ASIC-vendor library from a library vendor

    (3) you can build your own cell library

    (1) is usually a phantom library —the cells are empty boxes, or phantoms , you hand off your

    design to the ASIC vendor and they perform phantom instantiation (Synopsys CBA)

    (2) involves a buy-or-build decision . You need a qualified cell library (qualified by the ASIC

    foundry ) If you own the masks (the tooling ) you have a customer-owned tooling (COT , pro-

    nounced “see-oh-tee”) solution (which is becoming very popular)

    (3) involves a complex library development process: cell layout • behavioral model • Ver-

    ilog/VHDL model • timing model • test strategy • characterization • circuit extraction • pro-

    cess control monitors (PCMs ) or drop-ins • cell schematic • cell icon • layout versus

    schematic (LVS ) check • cell icon • logic synthesis • retargeting • wire-load model • rout-

    ing model • phantom

  • 8/9/2019 ASICs the Course Book

    18/505

  • 8/9/2019 ASICs the Course Book

    19/505

    ASICs... THE COURSE 1.9 References 17

    1.9 ReferencesGlasser, L. A., and D.W.Dobberpuhl. 1985. The Design and Analysis of VLSI Circuits.

    Reading, MA: Addison-Wesley, 473 p. ISBN 0-201-12580-3. TK7874.G573. Detailed anal-ysis of circuits, but largely nMOS.

    Mead, C. A., and L. A. Conway. 1980. Introduction to VLSI Systems. Reading, MA: Addison-Wesley, 396 p. ISBN 0-201-04358-0. TK7874.M37.

    Weste, N. H. E., and K. Eshraghian. 1993. Principles of CMOS VLSI Design: A Systems Per-spective. 2nd ed. Reading, MA: Addison-Wesley, 713 p. ISBN 0-201-53376-6.TK7874.W46. Concentrates on full-custom design.

  • 8/9/2019 ASICs the Course Book

    20/505

  • 8/9/2019 ASICs the Course Book

    21/505

  • 8/9/2019 ASICs the Course Book

    22/505

    2 SECTION 2 CMOS LOGIC ASICS... THE COURS

    CMOS logic • a two-inputNAND gate • a two-inputNOR gate • Good '1's • Good '0's

    off

    off

    0 1A

    B

    1 0

    1 10

    1

    F=NAND(A, B)

    VDD

    off off

    F =1B=0

    A=0 on

    on

    VDD

    off on

    F =0B=0

    A=1 off

    on

    B=1

    VDD

    A=1

    off off

    on

    on

    F=0

    B=0

    VDD

    A=1

    on off

    off

    on

    F=1

    B=1

    VDD

    A=0

    off on

    on

    off

    F=1

    VDD

    on off

    F =0B=1

    A=0 on

    off

    VDD

    on on

    F =0B=1

    A=1

    0 1AB

    0 0

    1 00

    1

    F=NOR(A, B)

    p-channeln-channel

    p-channeln-channel

    (a)

    (b)

    F=1

    B=0

    VDD

    A=0

    on on

    off

    off

  • 8/9/2019 ASICs the Course Book

    23/505

    ASICs... THE COURSE 2.1 CMOS Transistors 3

    2.1 CMOS Transistors

    • Channel charge = Q (imagine taking a picture and counting the electrons)• t f is time of flight or transit time

    • µn is the electron mobility (µ p is the hole mobility )• E is the electric eld (units Vm –1)

    An n-channel transistor •channel • source • drain • depletion region • gate • bulk

    current (amperes) = charge (coulombs) per unit time (second)

    The drain-to-source current I DSn =Q / t f

    The (vector) velocity of the electronsv = –µnE

    L L2

    t f = ––– = –––––––v x µnV DS

    GND orVSS

    +

    V DS

    LW

    V GS

    bulksource drain

    Tox

    E x

    electrons

    ++

    V DS

    bulk

    drain

    gate

    sourceV GS

    +

    mobile channel charge

    depletionregion

    p-type

    n-type n-type

    gate

    fixed depletion charge

  • 8/9/2019 ASICs the Course Book

    24/505

    4 SECTION 2 CMOS LOGIC ASICS... THE COURS

    • The linear region (triode region) extends until V DS =V GS –Vtn• V DS =V GS –Vtn =V DS (sat) (saturation voltage )• V DS >V GS –Vtn (the saturation region , or pentode region, of operation)• saturation current , I DSn (sat)

    Q = C(V GC – Vtn) =C [ (V GS – Vtn) – 0.5V DS ] = WLCox [ (V GS – Vtn) – 0.5V DS ]

    I DSn = Q/ t f = (W/L)µnCox[ (V GS – Vtn)–0 .5 V DS ]V DS = (W/L)k'n [ (V GS – Vtn)–0 .5 V DS ]V DS

    k'n = µnCox is the process transconductance parameter (orintrinsic transconductance )

    βn = k'n(W/L) is thetransistor gain factor (or justgain factor )

    I DSn (sat) = (βn /2)(V GS – Vtn)2 ; V GS > Vtn

  • 8/9/2019 ASICs the Course Book

    25/505

    ASICs... THE COURSE 2.1 CMOS Transistors 5

    2.1.1 P-Channel Transistors

    • Vt p is negative• V DS and V GS are normally negative (and –3V

  • 8/9/2019 ASICs the Course Book

    26/505

    6 SECTION 2 CMOS LOGIC ASICS... THE COURS

    2.1.2 Velocity Saturation

    • vmaxn =105ms –1

    • velocity saturation

    • t f =Leff /vmaxn• mobility degradation

    2.1.3 SPICE Models

    • KP (in µAV –2) = k'n (k' p)• VT0 and TOX = Vtn (Vt p) and Tox• U0 (in cm2V –1s –1) = µn (and µ p)

    I DSn (sat) = WvmaxnCox (V GS – Vtn) ; V DS >V DS (sat) (velocity saturated).

    SPICE parameters

    .MODEL CMOSN NMOS LEVEL=3 PHI=0.7 TOX=10E-09 XJ=0.2U TPG=1 VTO=0.65DELTA=0.7+ LD=5E-08 KP=2E-04 UO=550 THETA=0.27 RSH=2 GAMMA=0.6 NSUB=1.4E+17NFS=6E+11+ VMAX=2E+05 ETA=3.7E-02 KAPPA=2.9E-02 CGDO=3.0E-10 CGSO=3.0E-10

    CGBO=4.0E-10+ CJ=5.6E-04 MJ=0.56 CJSW=5E-11 MJSW=0.52 PB=1.MODEL CMOSP PMOS LEVEL=3 PHI=0.7 TOX=10E-09 XJ=0.2U TPG=-1 VTO=-0.92 DELTA=0.29+ LD=3.5E-08 KP=4.9E-05 UO=135 THETA=0.18 RSH=2 GAMMA=0.47NSUB=8.5E+16 NFS=6.5E+11+ VMAX=2.5E+05 ETA=2.45E-02 KAPPA=7.96 CGDO=2.4E-10 CGSO=2.4E-10CGBO=3.8E-10+ CJ=9.3E-04 MJ=0.47 CJSW=2.9E-10 MJSW=0.505 PB=1

  • 8/9/2019 ASICs the Course Book

    27/505

  • 8/9/2019 ASICs the Course Book

    28/505

    8 SECTION 2 CMOS LOGIC ASICS... THE COURS

    2.2 The CMOS Process

    The CMOS manufacturing process

    Key words: boule • wafer • boat • silicon dioxide • resist • mask • chemical etch • isotropicplasma etch • anisotropic • ion implantation • implant energy and dose •polysilicon • chemicalvapor deposition (CVD) • sputtering • photolithography • submicron and deep-submicrprocess • n-well process • p-well process • twin-tub (or twin-well) • triple-well •substratecontacts (well contacts or tub ties) •active (CAA) •gate oxide • field • field implant or chan-nel-stop implant •field oxide (FOX) • bloat • dopant •self-aligned process • positive resist •negative resist • drain engineering • LDD process • lightly doped drain • LDD diffusion or Limplant • stipple-pattern

    1

    2 43

    6

    As+

    5

    7 8 9 10 11 12

    1hour

    grow crystal saw

    resistspinfurnace

    mask

    etchresistoxide

    wafer

    grow oxide

  • 8/9/2019 ASICs the Course Book

    29/505

    ASICs... THE COURSE 2.2 The CMOS Process 9

    Mask/layer nameDerivation

    from drawn

    layers

    Alternative names for mask/layer Mask label

    n-well =nwell bulk, substrate, tub, n-tub, moat CWNp-well =pwell bulk, substrate, tub, p-tub, moat CWPactive =pdiff+ndiff thin oxide, thinox, island, gate oxide CAA

    polysilicon =poly poly, gate CPGn-diffusion

    implant =grow(ndiff) ndiff, n-select, nplus, n+ CSN

    p-diffusionimplant =grow(pdiff) pdiff, p-select, pplus, p+ CSP

    contact =contact contact cut, poly contact, diffusion con-tact CCP and CCAmetal1 =m1 rst-level metal CMFmetal2 =m2 second-level metal CMS

    via2 =via2 metal2/metal3 via, m2/m3 via CVSmetal3 =m3 third-level metal CMTglass =glass passivation, overglass, pad COG

  • 8/9/2019 ASICs the Course Book

    30/505

    10 SECTION 2 CMOS LOGIC ASICS... THE COURS

    (a) nwell (b) pwell (c) ndiff (d) pdiff

    (e) poly (f) contact (g) m1 (h) via

    (i) m2 (j) cell (k) phantom

    The mask layers of a standard cell

  • 8/9/2019 ASICs the Course Book

    31/505

  • 8/9/2019 ASICs the Course Book

    32/505

    12 SECTION 2 CMOS LOGIC ASICS... THE COURS

    Drawn layers and stipple patterns

    The transistor layers

    pdiff polynwell pwell ndiff contact

    via1 via2m1 m2 m3 glass

    (or solid)

    (or solid)(or solid)

    pdiff

    nwell

    poly

    p-diffusion

    polysiliconfield oxide

    n-well (or substrate)

    gate oxide

    (a) (b)

    y

    x

    field implant

    source/drain diffusionLDD diffusion

    x

    y z

  • 8/9/2019 ASICs the Course Book

    33/505

    ASICs... THE COURSE 2.2 The CMOS Process 13

    2.2.1 Sheet Resistance

    The interconnect layers

    Sheet resistance (1 µm ) Sheet resistance (0.35 µm)

    Layer Sheetresistance Units LayerSheet

    resistance Units

    n-well 1.15± 0.25 kΩ /square n-well 1± 0.4 kΩ /squarepoly 3.5± 2.0 Ω /square poly 10± 4.0 Ω /square

    n-diffusion 75± 20 Ω /square n-diffusion 3.5± 2.0 Ω /squarep-diffusion 140± 40 Ω /square p-diffusion 2.5± 1.5 Ω /square

    m1/2 70± 6 mΩ /square m1/2/3 60± 6 mΩ /squarem3 30± 3 mΩ /square metal4 30± 3 mΩ /square

    Key words: diffusion • Ω /square (ohms per square) •sheet resistance • silicide • self-aligned silicide (salicide ) • LI, white metal, local interconnect, metal0, or m0 •m1 or metal1• diffusion contacts • polysilicon contacts • barrier metal • contact plugs (via plugs ) •chemical–mechanical polishing (CMP ) • intermetal oxide (IMO) • interlevel dielectric(ILD) • metal vias , cuts , or vias • stacked vias and stacked contacts • two-level metal(2LM) • 3LM (m3 or metal3) •via1 • via2 • metal pitch • electromigration • contact resis-tance and via resistance

    via1

    via2m3

    m2

    m1

    contact

    W plug(4000Å)

    AlCu(3000Å)

    Pt barrier(200Å)

    m3m2

    (a) (b)

    TiW

    y x x y z

    +via1

    contact+m1

    +m2

    m2+via2 +m3

  • 8/9/2019 ASICs the Course Book

    34/505

    14 SECTION 2 CMOS LOGIC ASICS... THE COURS

    2.3 CMOS Design Rules

    Scalable CMOS design rules

    nwell

    pwell

    nwell

    pwell

    ndiffpdiff

    pdiffndiff

    ndiffpdiff

    pdiffnwell

    poly

    nwell

    pwell

    p-selectn-select

    ndiffpoly

    poly

    nwell

    poly

    metal2

    m1

    polycontact

    pdiff

    polyactivecontact

    m3

    via2m3

    glass

    m2

    m1

    n-select

    pdiff

    p-select

    ndiff

    pdiff

    1. well 2. active 3. poly

    4. select5. polycontact

    6. activecontact

    7. metal1

    9. metal2 15. metal3 10. overglass (microns)

    pwell

    nwell

    hot

    poly

    ndiff

    m1 m2

    via1

    8. via1

    m2

    m3via2

    via1

    m2

    14. via2

    m2

    via1

    m1

    21

    4

    3

    5

    6

    7 8

    15 10149

    m3

    0 (1.4) 9 (1.2)

    10 (1.1)

    0 or 6 (1.3)

    3 (2.1)

    3(2.2)

    5 (2.3)0 or 4(2.5)

    0 or 4(2.5)

    3 (2.4)3

    (2.2)

    3 (2.1) 5 (2.3) 3 (2.4)

    2 (3.2)

    2 (3.1)

    2 (3.3)

    1 (3.5)

    3 (3.4)

    1.5(5.2a)

    2 × 2 (5.1a)

    2 (5.3a)

    1.5 (6.2a)

    2 × 2(6.1a)

    2 (6.4a)

    1.5(6.2a)

    3 (8.2)2 × 2 (8.1)2 (8.5)

    2 (8.5)2 (8.4)

    1 (8.3)

    2 (6.3a)1 (4.3)

    2 (4.2)

    3 (7.1)1 (7.3)

    3(7.2a)

    1 (7.4)

    3 (4.1)

    2(7.2b)

    3(14.2)

    2 (14.4)1(14.3)

    2 × 2 (14.1)3 (9.1)

    4(9.2a)

    1(9.3)

    3(9.2b)

    6(15.1)

    4 (15.2)

    2 (15.3)

    6 (10.3) 30 (10.4)

    15(10.5)

    100 ×100 (10.1)

  • 8/9/2019 ASICs the Course Book

    35/505

    ASICs... THE COURSE 2.4 Combinational Logic Cells15

    2.4 Combinational Logic Cells

    2.4.1 Pushing Bubbles

    2.4.2 Drive StrengthWe ratio a cell to adjust itsdrive strength and make βn =β p to create equal rise and falltimes

    Naming of complex CMOS com-binational logic cells

    The AOI family of cells with three index numbers or less

    Cell type1

    Cells Number of unique cellsXa1 X21, X31 2Xa11 X211, X311 2Xab X22, X33, X32 3Xab1 X221, X331, X321 3Xabc X222, X333, X332, X322 4

    Total 141Xabc: X={AOI, AO, OAI, OA}; a, b, c = {2, 3}; {} means “choose one.”

    BCDE

    AZ

    BCDE

    A

    F

    Z

    AOI221

    AOI221 OAI321

    OAI321

    (a) (b)

    OR AND INVERTORAND INVERT

  • 8/9/2019 ASICs the Course Book

    36/505

    16 SECTION 2 CMOS LOGIC ASICS... THE COURS

    2.4.3 Transmission Gates

    Charge sharing : suppose C BIG=0.2pF and C SMALL=0.02pF, V BIG=0V and V SMALL=5V;then

    Constructing a CMOS logic cell—an AOI221 • pushing bubbles •de Morgan’s theorem •network duals

    CMOS transmission gate (TG, TX gate, pass gate, coupler)

    (0.2× 10 –12) (0) + (0.02× 10 –12) (5)

    V F = –––––––––––––––––––––––––––

    – = 0.45 V(0.2× 10 –12) + (0.02× 10 –12)

    ZABCDE

    VDD

    Z

    A

    C

    E

    B

    D

    E A

    B

    C

    D

    ZABCDE

    push bubbles to the inputs

    OR = parallelAND = series

    OR = parallelAND = series

    1

    3

    VDD

    6/1 6/16/16/1

    6/1

    1/1 2/1

    2/1

    2/1

    2/1

    6/(1+1+1) =2/1

    2

    adjustsizes4

    (a) (c)(b)

    (a)

    A

    '1'

    Z

    CBIGCSMALL

    charge sharing

    VBIG→VFVSMALL→VF

    (c)

    '0'A

    S'

    ZA Z

    S=0

    ZA S=1A

    S

    ZS'

    (b)

    S

    strong '1'

    strong '0'

  • 8/9/2019 ASICs the Course Book

    37/505

    ASICs... THE COURSE 2.5 Sequential Logic Cells17

    2.5 Sequential Logic CellsTwo choices for sequential logic:multiphase clocks or synchronous design . We choosethe latter.

    2.5.1 Latch

    CMOS latch •enable • transparent • static • sequential logic cell • storage • initial value

    CLKNCLKCLKNI4 CLKPI5

    CLKNQ

    CLKPI2

    I3

    I1D

    CLKP

    QI2

    I3

    I1D QI2

    I3

    I1Dstorageloop

    (a) (b) (c)

    D

    CLK

    Q

    t

    D

    CLK

    Q

    t

    latch is transparent

    1DC1

  • 8/9/2019 ASICs the Course Book

    38/505

    18 SECTION 2 CMOS LOGIC ASICS... THE COURS

    2.5.2 Flip-Flop

    CMOS ip-op• master latch • slave latch• active clock edge • negative-edge–triggered flip-flop• setup time (tSU) • hold time (tH) • clock-to-Q propagation delay (tPD)

    • decision window

    CLKN

    CLKN

    CLKPI2

    I3

    I1D

    CLKP(a)

    D

    CLK

    t

    CLKP

    CLKNI6

    I7

    CLKCLKN

    I4CLKP

    I5

    CLKP

    Q

    QN

    I8

    I9

    S

    (b)MI2

    I3

    I1D

    load master

    SI6

    I7

    storeQ

    QN

    I8

    I9

    (c)MI2

    I3

    I1D

    load slave

    SI6

    I7

    storeQ

    QN

    I8

    I9

    CLK=1

    CLK=0

    M

    Q

    (d)

    load master load slave load master load slave

    tSUtH

    50%

    tPD

    1DC1

    decisionwindow

    master slave

    M

    CLKN

  • 8/9/2019 ASICs the Course Book

    39/505

    ASICs... THE COURSE 2.6 Datapath Logic Cells19

    2.6 Datapath Logic Cells

    • parity function ('1' for an odd numbers of '1's)• majority function ('1' if the majority of the inputs are '1')

    full adder (FA ): SUM = A B CIN = SUM(A, B, CIN) = PARITY(A, B, CIN) ,COUT = A · B + A · CIN + B · CIN = MAJ(A, B, CIN).

    S[i ] = SUM (A[i ], B[i ], CIN)COUT = MAJ (A[i ], B[i ], CIN)

    A datapath adder• Ripple-carry adder (RCA)• Data signals •control signals •datapath • datapath cell or datapath element• Datapath advantages: predictable and equal delay for each bit • built-in interconnect• Disadvantages of a datapath: overhead • harder design • software is more complex

    (a)

    SUMB[1]A[1]

    B[0]A[0]

    B[2]A[2]B[3]A[3]

    VSS

    COUT[3]

    (b)

    S[3]

    S[2]

    S[1]

    S[0]AB

    CIN

    COUTADD

    (d)

    A B COUTCIN

    (c)

    COUT[3]

    VSS

    control

    datam2

    m1

    S

    m2m1

    COUT[2]

    m1

    m2

    CIN

    COUT[2]

    CIN[0]

  • 8/9/2019 ASICs the Course Book

    40/505

    20 SECTION 2 CMOS LOGIC ASICS... THE COURS

    2.6.1 Datapath Elements

  • 8/9/2019 ASICs the Course Book

    41/505

  • 8/9/2019 ASICs the Course Book

    42/505

    22 SECTION 2 CMOS LOGIC ASICS... THE COURS

    2.6.2 AddersGenerate , G[i ] and propagate , P[i ]

    Carry signal:

    Carry chain using two-input NAND gates, one per cell:

    Carry-save adder (CSA ) cell CSA(A1[i ], A2[i ], A3[i ], CIN, S1[i ], S2[i ], COUT) has three out-puts:

    subtractionresult:OV=overow,

    OR=out of range

    OR=BOUT[MSB]BOUT is bor-row out

    as in addition as in addition as in addition

    negation:Z=–A (negate)

    NA Z=A;SG(Z)=NOT(SG(A))

    Z=NOT(A) Z=NOT(A)+1

    method 1 method 2

    G[i] = A[i] · B[i] G[i ] = A[i ] · B[i ]

    P[i ] = A[i ] B[i P[i ] = A[i ] + B[i ]C[i ] = G[i ] + P[i ] · C[i –1] C[i ] = G[i ] + P[i ] · C[i –1]S[i ] = P[i ] C[i –1] S[i ] = A[i ] B[i ] C[i –1]

    either C[i ] = A[i ] · B[i ] + P[i ] · C[i – 1]or C[i ] = (A[i ] + B[i ]) · (P[i ]' + C[i – 1]), where P[i ]'=NOT(P[i ])

    even stages odd stages

    C1[ i ]' = P[ i ] · C3[ i – 1] · C4[ i – 1] C3[ i]' = P[ i ] · C1[ i – 1] · C2[ i – 1]

    C2[ i ] = A[ i ] + B[ i ] C4[ i]' = A[ i ] · B[ i ]

    C[ i ] = C1[ i ] · C2[ i ] C[ i ] = C3[ i ]'+ C4[ i ]'

    S1[i ] = CIN ,

    S2[i ] = A1[i ] A2[i ] A3[i ] = PARITY(A1[i ], A2[i ], A3[i ])COUT = A1[i ] · A2[i ] + [(A1[i ] + A2[i ]) · A3[i ]] = MAJ(A1[i ], A2[i ], A3[i ])

  • 8/9/2019 ASICs the Course Book

    43/505

    ASICs... THE COURSE 2.6 Datapath Logic Cells23

    Carry-propagate adder (CPA )

    carry-bypass adders (CBA ):

    carry-skip adder :

    The carry-save adder (CSA) •pipeline • latency • bit slice

    C[7]=(G[7]+P[7]·C[6])·BYPASS'+C[3]·BYPASS

    CSKIP[i ] = (G[i ] + P[i ] · C[i – 1]) · SKIP' + C[i – 2] · SKIP

    (a)

    S1A1A2

    CIN

    COUT

    CSAS2A3

    COUT[MSB]

    CIN[0]

    (b)

    COUT[MSB–1]

    Σ

    A2[MSB:0]

    A4[MSB:0]

    +

    +A3[MSB:0] +

    Σ+

    ++

    A1[MSB:0]

    Σ S[MSB:0]+

    +

    (c)

    (d)

    ΣS[MSB:0]+

    A2[MSB:0]

    A4[MSB:0]

    +

    +A3[MSB:0] +

    A1[MSB:0]

    Σ+

    ++

    CLKCLK

    (e)

    Σ

    A1[MSB:0]

    A3[MSB:0]

    S1[MSB:0]+

    +A2[MSB:0]

    S2[MSB:0]+

    OV

    (f) (g)

    CSA1

    CSA2RCA

    CSA1

    CSA2 RCA

    pipeline registers

    RCACLK CLK

    pipeline registers

    CSA1 CSA2

    1

    2

    3

    4

    5 1 2 3 4 5

    n

    n

    n

    n

    n

    n

    n

    n

    n

    n

    n

    n

    n

    n

    n

    n

    n

    n

    n

    n

    n

    COUT[MSB]COUT[MSB–1]

    CSA1 CSA2 RCA

    bit sliceMSB

    LSB

  • 8/9/2019 ASICs the Course Book

    44/505

    24 SECTION 2 CMOS LOGIC ASICS... THE COURS

    Carry-lookahead adder (CLA , for example theBrent–Kung adder ):

    Carry-select adder duplicates two small adders for the cases CIN='0' and CIN='1' and thenuses a MUX to select the case that we need

    C[1] = G[1] + P[1] · C[0]= G[1] + P[1] · (G[0] + P[1] · C[–1])= G[1] + P[1] · G[0]

    C[2] = G[2] + P[2] · G[1] + P[2] · P[1] · G[0] ,C[3] = G[3] + P[2] · G[2] + P[2] · P[1] · G[1] + P[3] · P[2] · P[1] · G[0]

    The Brent–Kung carry-lookahead adder

    A[i ] B[i ]

    G[i ]

    P[i ]

    G[i +1]

    P[i +1]

    G[0]P[0]G[1]P[1]

    C[1] =G[1]+P[0]

    P[2]

    P[0]P[1]G[2]

    C[2] =G[2]+P[2]G[1]+P[2]P[1]G[0]

    P[3]

    P[0]P[1]P[2]G[3]

    C[3]= G[3]+P[3]G[2]+ P[3]P[2]G[1]+P[3]P[2]P[1]G[0]P[0]P[1]P[2]P[3]

    CLG

    CLG CLG CLG

    G[i +1]+P[ i ]

    P[i ]P[i + 1]

    G[0]P[0]G[1]P[1]

    G[2]P[2]G[3]P[3]

    CLG

    C[3]

    C[2]C[1]

    L1 L2 L3

    L4

    01 2 3

    123

    01

    23

    1

    3

    2

    (a)

    (b) (c)

    (d)

    (e) (f)

    CLG

    CLG CLG

    L1

    L2 L3

    01234567

    0

    0

    012345

    6

    7

    G[i ]/P[i ] in C[i ] out

    Each wire is a bundle ofG[i +1]+P[ i ] and P[i ]P[i +1].

    (g)

    A[i ] B[i ]

    G[i ]P[i ]

    Sum[i ]C[i ]

    orP[i ]

    Create generate and propagate signals.Create carry signals.

    Create sum signals.

  • 8/9/2019 ASICs the Course Book

    45/505

    ASICs... THE COURSE 2.6 Datapath Logic Cells25

    The conditional-sum adder

    A[0] B[0]

    C1_0_0

    H0

    C[0]

    C1_0_1

    A[1] B[1]H1

    stage0

    1

    2

    S[1] C[2] S[0]

    bit 1 0

    Q1_0

    Q2_1

    A[i ] B[i ]H

    A[i ] B[i ]

    (A[i ] B[i ])'

    A[i ].B[i ]

    A[i ]+B[ i ]

    (a) (c)

    Ci _ j _ k

    Si _ j _1 orCi _ j _1

    Si _ j _0 orCi _ j _0

    G111

    Si _ j _ k orCi _ j _ k

    Si _ j _ k or Ci _ j _ k Qi _ j

    (b)

    (k =0 or 1)

    Ci _ j _ k =carry in to thei th bit assuming the carry in to the j th bit isk (k =0 or 1)S

    i _

    j _

    k =sum at the

    i th bit assuming the carry in to the

    j th bit is

    k (

    k =0 or 1)

    Ci _ j _ k Si _ j _0 orCi _ j _0Si _ j _1 orCi _ j _1

    carry out (carry in=0)sum (carry in =0)

    sum (carry in =1)carry out (carry in=1)

    Q1_1

  • 8/9/2019 ASICs the Course Book

    46/505

    26 SECTION 2 CMOS LOGIC ASICS... THE COURS

    2.6.3 A Simple Example

    2.6.4 Multipliers• Mental arithmetic: 15 (multiplicand ) × 19 (multiplier ) = 1 5×(20–1) = 15×21• Suppose we want to multiply by B=00010111 (decimal 16+4+2+1=23)• Use the canonical signed-digit vector (CSD vector ) D=00101001(decimal 32–8+1=23)• B h a s a weight of 4, but D has a weight of 3 — and saves hardware

    An 8-bit conditional-sum adder

    module m8bitCSum (C0, a, b, s, C8); // Verilog conditional-sum adderfor an FPGA //1input [7:0] C0, a, b; output [7:0] s; output C8; //2wire A7,A6,A5,A4,A3,A2,A1,A0,B7,B6,B5,B4,B3,B2,B1,B0,S8,S7,S6,S5,S4,S3,S2,S1,S0; //3wire C0, C2, C4_2_0, C4_2_1, S5_4_0, S5_4_1, C6, C6_4_0, C6_4_1,C8; //4assign {A7,A6,A5,A4,A3,A2,A1,A0} = a; assign {B7,B6,B5,B4,B3,B2,B1,B0} = b; //5assign s = { S7,S6,S5,S4,S3,S2,S1,S0 }; //6assign S0 = A0^B0^C0 ; // start of level 1: & = AND, ^ = XOR, | =OR, ! = NOT //7assign S1 = A1^B1^(A0&B0|(A0|B0)&C0) ; //8assign C2 = A1&B1|(A1|B1)&(A0&B0|(A0|B0)&C0) ; //9assign C4_2_0 = A3&B3|(A3|B3)&(A2&B2) ; assign C4_2_1 =A3&B3|(A3|B3)&(A2|B2) ; //10assign S5_4_0 = A5^B5^(A4&B4) ; assign S5_4_1 = A5^B5^(A4|B4) ; //11assign C6_4_0 = A5&B5|(A5|B5)&(A4&B4) ; assign C6_4_1 =A5&B5|(A5|B5)&(A4|B4) ; //12assign S2 = A2^B2^C2 ; // start of level 2 //13

    assign S3 = A3^B3^(A2&B2|(A2|B2)&C2) ; //14assign S4 = A4^B4^(C4_2_0|C4_2_1&C2) ; //15assign S5 = S5_4_0&!(C4_2_0|C4_2_1&C2)|S5_4_1&(C4_2_0|C4_2_1&C2) ; //16assign C6 = C6_4_0|C6_4_1&(C4_2_0|C4_2_1&C2) ; //17assign S6 = A6^B6^C6 ; // start of level 3 //18assign S7 = A7^B7^(A6&B6|(A6|B6)&C6) ; //19assign C8 = A7&B7|(A7|B7s)&(A6&B6|(A6|B6)&C6) ; //20endmodule //21

  • 8/9/2019 ASICs the Course Book

    47/505

    ASICs... THE COURSE 2.6 Datapath Logic Cells27

    Datapath adders

    To recode (or encode) any binary number, B, as a CSD vector, D: Di = Bi + Ci – 2Ci + 1 ,where Ci +1 is the carry from the sum of Bi +1+Bi +Ci (we start with C0=0).

    If B=011 (B2=0, B1=1, B0=1; decimal 3), then: D0 = B0 + C0 – 2C1 = 1 + 0 – 2 =1,D1 = B1 + C1 – 2C2 = 1 + 1 – 2 = 0,D2 = B2 + C2 – 2C3 = 0 + 1 – 0 = 1,

    so that D= 101 (decimal 4–1=3).

    We can use a radix other than 2, for exampleBooth encoding (radix-4):B=101001 (decimal 9–32=–23) E=1 21 (decimal –16–8+1=–23)

    B=01011 (eleven) E=11 1 (16–4–1)B=101 E=11

    normalizeddelay

    bits8 16 32 64

    120

    80

    40

    area/kλ2

    bits8 16 32 64

    3000

    2000

    1000

    2-inputNAND =1

    ripple-carry

    carry-select

    carry-save

    ripple-carry

    carry-select

    carry-save

    (b)(a)

  • 8/9/2019 ASICs the Course Book

    48/505

    28 SECTION 2 CMOS LOGIC ASICS... THE COURS

    Tree-based multiplication – at each stage we have the following three choices:(1) sum three outputs using a full adder(2) sum two outputs using a half adder(3) pass the outputs to the next stage

    FA

    A B

    Sum

    COUT CIN

    full adderS31S51 S41

    S22S42 S32

    S23

    S14

    S13

    S04

    S33

    S24

    S50

    P5 P4P6

    '0''0'

    S40

    '0'

    '0'

    S15 S05

    '0'

    '0'

    S41S50

    S32S14

    S23

    S05

    P5

    '0'

    a0

    b1

    c2

    d3

    e4

    f5

    b0

    c1

    d2

    e3

    f4

    a1

    b2

    c3

    d4

    e5

    f6

    5.1 5.2

    5.3

    5.4

    5.5

    Wallace treecarry-save chain

    (a) (b)

    (c)

    fulladder

    halfadder

    Each dotrepresentsan output ofone stageand aninput to thenext.

    P5

    S50S41S32S23S14S05

    5.1

    5.4

    5.5

    5.2

    5.3

    1

    2

    3

    4

    0

    5

    6

    redundantcarry

  • 8/9/2019 ASICs the Course Book

    49/505

    ASICs... THE COURSE 2.6 Datapath Logic Cells29

    A Wallace-tree multiplier works forward from the multiplier inputs• Full adder is a3:2 compressor or (3, 2) counter• Half adder is a(2, 2) counter

    FA

    A B

    Sum

    COUT CIN

    fulladder

    S50S32 S05

    P5

    '0'

    1

    2

    34567

    0

    S41

    S15

    S33

    S24

    P6

    P7P8P9P10P11

    S55

    S42S51S45S54 S44S53 S43S52S34S35

    S25

    P4

    P3

    P2

    P1

    '0''0'

    '0'

    '0'

    '0'S04 S03 S02

    S31 S30

    S14S23 S13S22 S12S21 S11S20S01

    S10

    S40

    1

    2

    3

    4

    5

    6

    7

    15

    P5

    S00

    P0

    1 2 3

    10 11 12

    17

    4 5

    13

    18 19

    22 23

    25

    26

    27 28 29 30

    6 7 8 9

    1614

    20

    24

    21

  • 8/9/2019 ASICs the Course Book

    50/505

    30 SECTION 2 CMOS LOGIC ASICS... THE COURS

    The Dadda multiplier works backward from the nal product• Each stage has a maximum of 2, 3, 4, 6, 9, 13, 19, ...outputs (each successive stage is3/2 times larger—rounded down to an integer

    The number of stages and thus delay (in units of an FA delay—excluding the CPA) for ann-bittree-based multiplier using (3, 2) counters islog1.5 n = log10 n /log10 1.5 = log10 n /0.176

    P1P2P3P4P5P6P8 P7P9P10P11

    S04S13

    S03 S12

    S40

    '0'

    S22

    S25S34

    '0'S10S01

    S02 S11

    S35S44

    S55

    S54

    S52

    S53 S50 S21

    S31

    S30

    S24'0'S33

    S42S51S15 S14'0'S23

    S32S41S05S43

    P0

    S00

    S20

    '0'

    S45

    1

    234

    0

    '0'

    1

    7 8 9 10

    2 3 4

    13 14 15 16 17

    21 22 23 24 25 26

    6

    11 12

    5

    18 19

    27 28

    20

    29 30

  • 8/9/2019 ASICs the Course Book

    51/505

  • 8/9/2019 ASICs the Course Book

    52/505

    32 SECTION 2 CMOS LOGIC ASICS... THE COURS

    2.6.5 Other Arithmetic Systems

    • 101 (decimal) is1100101 (in binary and CSD vector) or11100111• 188 (decimal) is10111100 (in binary),111000100, 101001100, or 101000100(CSDvector)• 101 is represented as 010010 (using sign magnitude) — rather wasteful

    Residue number system

    • 11 (decimal) is represented as [1, 2] residue (5, 3)• 11R5=11 mod 5=1 and 11R3=11 mod 3=2• The size of this system is 3×5=15• We can now add, subtract, or multiply without using any carry

    binary decimal redundant binary CSD vector1010111 87 10101001 10101001 addend

    + 1100101 101 + 11100111 + 01100101 augend01001110 = 11001100 intermediate sum11000101 11000000 intermediate carry

    = 10111100 = 188 111000100 101001100 sum

    Redundant binary addition • redundant binary encoding avoids carry propagation

    A[i ] B[ i ] A[i –1] B[ i –1] Intermediate

    sum

    Intermediate

    carry1 1 x x 0 1

    1 0A[i–1]=0/1 and

    B[i–1]=0/1 1 0

    0 1 A[i–1]= 1 or B[i–1]= 1 1 1

    1 1 x x 0 0

    1 1 x x 0 0

    0 0 x x 0 0

    0 1

    A[i–1]=0/1 and

    B[i–1]=0/1 1 11 0 A[i–1]= 1 or B[i–1]= 1 1 0

    1 1 x x 0 1

  • 8/9/2019 ASICs the Course Book

    53/505

    ASICs... THE COURSE 2.6 Datapath Logic Cells33

    4 [4, 1] 12 [2, 0] 3 [3, 0]+ 7 + [2, 1] – 4 – [4, 1] × 4 × [4, 1]= 11 = [1, 2] = 8 = [3, 2] = 12 = [2, 0

    The 5, 3 residue number system

    n residue 5 residue 3 n residue 5 residue 3 n residue 5 residue 30 0 0 5 0 2 10 0 11 1 1 6 1 0 11 1 22 2 2 7 2 1 12 2 03 3 0 8 3 2 13 3 1

    4 4 1 9 4 0 14 4 2

  • 8/9/2019 ASICs the Course Book

    54/505

    34 SECTION 2 CMOS LOGIC ASICS... THE COURS

    2.6.6 Other Datapath Operators

    Symbols for datapath elements

    Full subtracter DIFF = A NOT(B) ΝΟΤ(BIN)= SUM(A, NOT(B), NOT(BIN))

    NOT(BOUT) = A · NOT(B) + A · NOT(BIN) + NOT(B) · NOT(B

    = MAJ(NOT(A), B, NOT(BIN))

    Keywords: adder/subtracter • barrel shifter • normalizer • denormalizer • leading-one detecto• priority encoder • exponent correcter • accumulator • multiplier–accumulator (MACincrementer • decrementer • incrementer/decrementer • all-zeros detector • all-ones detecto• register file • first-in first-out register (FIFO) • last-in first-out register (LIFO)

    Q[MSB:0]CLK PRE

    D[MSB:0]

    S

    01

    A[MSB:0]

    B[MSB:0]

    Z[MSB:0]Σ

    S[MSB:0]A[MSB:0]

    B[MSB:0]

    +

    +/-Z[MSB:0]+/-1 =1Z

    =0Z

    (a)

    A[MSB:0]B[MSB:0]

    Z[MSB:0] B[MSB:0]

    A Z[MSB:0]

    (b)

    (c)

    (d) (e) (f) (g) (h)

  • 8/9/2019 ASICs the Course Book

    55/505

    ASICs... THE COURSE 2.7 I/O Cells 35

    2.7 I/O Cells

    2.8 Cell Compilers

    2.9 Summary• The use of transistors as switches• The difference between a ip-op and a latch• The meaning of setup time and hold time

    Keywords: Tri-State ® is a registered trademark of National Semiconductor) •drivers • con-

    tention • bus keeper or bus-hold cell (TI calls this Bus-Friendly logic) •slew rate • power-supply bounce • simultaneously switching outputs (SSOs) •quiet-I/O • bidirectional I/O• open-drain • level shifter • electrostatic discharge , orESD • electrical overstress (EOS)• ESD implant • human-body model (HBM) •machine model (MM) •charge-device model(CDM, also called device charge–discharge) •latch-up • undershoot • overshoot • guardrings

    A three-state bidirectional output buffer

    Keywords: silicon compilers • RAM compiler • multiplier compiler • single-port RAM • duaRAMs • multiport RAMs • asynchronous • synchronous • model compiler • netlist compicorrect by construction

    I/Opad

    VDD

    OE

    DATAout

    M1

    M2

    ND1

    NR1

    I2

    DATAinI1

    outputenable

    to corelogic

    from corelogic

  • 8/9/2019 ASICs the Course Book

    56/505

    36 SECTION 2 CMOS LOGIC ASICS... THE COURS

    • Pipelines and latency• The difference between datapath, standard-cell, and gate-array logic cells• Strong and weak logic levels

    • Pushing bubbles• Ratio of logic• Resistance per square of layers and their relative values in CMOS• Design rules and λ

    2.10 ProblemsSuggested homework: 2.1, 2.2, 2.38, 2.39 (from ASICs... the book )

  • 8/9/2019 ASICs the Course Book

    57/505

    ASICs...THE COURSE (1 WEEK)

    1

    ASIC LIBRARYDESIGN

    ASIC design uses predened and precharacterized cells from a library—so we need todesign or buy a cell library. A knowledge of ASIC library design is not necessary but makesit easier to use library cells effectively.

    3.1 Transistors as Resistors

    Key concepts: Tau, logical effort, and the prediction of delay • Sizes of cells, and their drive

    strengths • Cell importance • The difference between gate-array macros, standard cells, and

    datapath cells

    – t PDf 0.35 V DD = V DD exp –––––––––––––––––

    R pd (Cout + C p)

    An output trip point of 0.35 is convenient because ln(1/0.35)=1.04 ≈1 and thust PDf = R pd (Cout + C p) ln (1/0.35) ≈ R pd (Cout + C p)

    For output trip points of 0.1/0.9 we multiply by –ln(0.1) = 2.3, because exp (–2.3) = 0.100

  • 8/9/2019 ASICs the Course Book

    58/505

    2 SECTION 3 ASIC LIBRARY DESIGN ASICS... THE COURSE

    A linear model for CMOS logic delay• Ideal switches = no delay • Resistance and capacitance causes delay

    • Load capacitance, Cout • parasitic output capacitance, C p • input capacitance, C

    • Linearize the switch resistance • Pull-up resistance, R pu • pull-down resistance, R pd

    • Measure and compare the input, v(in1) and output, v(out1)

    • Input trip point of 0.5 • output trip points are 0.35 (falling) and 0.65 (rising)

    • The linear prop–ramp model: falling propagation delay, t PDf ≈R pd (C p + Cout )

    (c)

    R pu

    R pd

    C p

    V DD

    in1 out1

    CoutC

    (a)

    V DD

    in1 out1

    Cout

    m1

    m2

    v(in1)v(out1)

    t PDf

    V DD

    0.5 V DD

    0.35 V DD

    saturation linearoff

    0

    (b)

    t'

    m2

    m1

    m1 :t' =0

    t' =0 ≈ R pd (C p + Cout )

    V DD exp[– t' / (R pd (C p + C out ))]

    t' =0 – I DSp

    I DSn –( I DSp + I DSn )

  • 8/9/2019 ASICs the Course Book

    59/505

    ASICs... THE COURSE 3.1 Transistors as Resistors 3

    (a) (b)

    CMOS inverter characteristics

    • Equilibrium switching

    • Non-equilibrium switching

    • Nonlinear switching resistance

    • Switching current

    (c)

    v(in1) /V

    v(out1) / V

    0

    1

    2

    3

    1 2 30

    equilibriumpath

    nonequilibrium path

    I DSn =–I DSp

    1

    0

    3

    1

    23

    0

    3

    2

    0

    1

    v(in1) /V

    v(out1) / V

    nonequilibrium pathmax( I DSn , –I DSp ) /mA

    I DSn

    –I DSp

    1

    2

    equilibriumpath

    v(in1) /V

    0.2

    0.4

    0.01 2 30

    equilibriumpath

    2

    I DSn =–I DSp

    max( I DSn , –I DSp ) /mA

  • 8/9/2019 ASICs the Course Book

    60/505

  • 8/9/2019 ASICs the Course Book

    61/505

    ASICs... THE COURSE 3.2 Transistor Parasitic Capacitance 5

    NAME m1 m2MODEL CMOSN CMOSPID 7.49E-11 -7.49E-11VGS 0.00E+00 -3.00E+00

    VDS 3.00E+00 -4.40E-08VBS 0.00E+00 0.00E+00VTH 4.14E-01 -8.96E-01VDSAT 3.51E-02 -1.78E+00GM 1.75E-09 2.52E-11GDS 1.24E-10 1.72E-03GMB 6.02E-10 7.02E-12CBD 2.06E-15 1.71E-14CBS 4.45E-15 1.71E-14CGSOV 1.80E-15 2.88E-15CGDOV 1.80E-15 2.88E-15

    CGBOV 2.00E-16 2.01E-16CGS 0.00E+00 1.10E-14CGD 0.00E+00 1.10E-14CGB 3.88E-15 0.00E+00

    • ID (I DS ), VGS , VDS , VBS , VTH (Vt), and VDSAT (V DS (sat) ) are DC parameters

    • GM, GDS , and GMB are small-signal conductances (corresponding to ∂I DS / ∂V GS ,∂I DS / ∂V DS , and ∂I DS / ∂V BS , respectively)

  • 8/9/2019 ASICs the Course Book

    62/505

    6 SECTION 3 ASIC LIBRARY DESIGN ASICS... THE COURSE

  • 8/9/2019 ASICs the Course Book

    63/505

    ASICs... THE COURSE 3.2 Transistor Parasitic Capacitance 7

    Calculations of parasitic capacitances for an n-channel MOS transistor.

    PSpice Equation Values 1 for V GS =0V, V DS =3V, V SB =0V

    CBD

    CBD = CBDJ + CBDSW CBD = 1.855 × 10

    –13

    + 2.04 × 10 –16

    = 2.06 × 10 –13 F

    CBDJ + AD CJ ( 1 + V DB / φB) –mJ (φB =

    PB )

    CBDJ = (4.032 × 10 –15 )(1 + (3/1)) –0.56 = 1.86 ×

    10 –15 F

    CBDSW = P D CJSW (1 + V DB / φB) –mJSW

    (P D may or may not include channeledge)

    CBDSW = (4.2 × 10 –16 )(1 + (3/1)) –0.5 = 2.04 ×

    10 –16 FCBS

    CBS = CBSJ + CBSSW

    CBS = 4.032 × 10 –15 + 4.2 × 10 –16 = 4.45 ×

    10 –15

    F

    CBSJ + AS CJ ( 1 + V SB / φB) –mJ

    AS CJ = (7.2 × 10 –15 )(5.6 × 10 –4 ) = 4.03 ×

    10 –15 F

    CBSSW = P S CJSW (1 + V SB / φB) –mJSW

    P S CJSW = (8.4 × 10 –6 )(5 × 10 –11 ) = 4.2 ×

    10 –16 FCGSOV CGSOV =W EFF CGSO ; W EFF =W–2W

    D CGSOV = (6 × 10 –6 )(3 × 10 –10 ) = 1.8 × 10 –16 F

    CGDOV CGDOV =W EFF CGSO CGDOV = (6 × 10 –6 )(3 × 10 –10 ) = 1.8 × 10 –15 F

    CGBOV CGBOV =L EFF CGBO ; L EFF =L–2L D CGDOV = (0.5 × 10 –6 )(4 × 10 –10 ) = 2 × 10 –16 FCGS CGS /CO = 0 (off), 0.5 (lin.), 0.66 (sat.)

    CO (oxide capacitance) = W EF LEFF εox / Tox

    CO = (6 × 10 –6 )(0.5 × 10 –6 )(0.00345) = 1.03 ×

    10 –14 FCGS = 0.0 F

    CGD CGD /CO = 0 (off), 0.5 (lin.), 0 (sat.) CGD = 0.0 F

    CGB CGB = 0 (on), = C O in series with CGS (off)

    CGB = 3.88 × 10 –15 F, C S =depletion capaci-

    tance1Input .MODEL CMOSN NMOS LEVEL=3 PHI=0.7 TOX=10E-09 XJ=0.2U TPG=1

    VTO=0.65 DELTA=0.7+ LD=5E-08 KP=2E-04 UO=550 THETA=0.27 RSH=2 GAMMA=0.6NSUB=1.4E+17 NFS=6E+11+ VMAX=2E+05 ETA=3.7E-02 KAPPA=2.9E-02 CGDO=3.0E-10CGSO=3.0E-10 CGBO=4.0E-10+ CJ=5.6E-04 MJ=0.56 CJSW=5E-11 MJSW=0.52 PB=1m1 out1 in1 0 0 cmosn W=6U L=0.6U AS=7.2P AD=7.2P PS=8.4UPD=8.4U

  • 8/9/2019 ASICs the Course Book

    64/505

    8 SECTION 3 ASIC LIBRARY DESIGN ASICS... THE COURSE

    3.2.1 Junction Capacitance• Junction capacitances, C BD and C BS , consist of two parts: junction area and sidewall

    • Both C BD and C BS have different physical characteristics with parameters: CJ and MJ

    for the junction, CJSW and MJSW for the sidewall, and PB is common• C BD and C BS depend on the voltage across the junction ( V DB and V SB )

    • The sidewalls facing the channel ( CBSJ GATE and C BDJ GATE ) are different from the side-walls that face the eld

    • It is a mistake to exclude the gate edge assuming it is in the rest of the model—it is not

    • In HSPICE there is a separate mechanism to account for the channel edge capaci-tance (using parameters ACM and CJGATE )

    3.2.2 Overlap Capacitance

    • The overlap capacitance calculations for C GSOV and C GDOV account for lateral diffusion• SPICE parameter LD=5E-08 or LD =0.05 µm

    • Not all SPICE versions use the equivalent parameter for width reduction, WD, in calcu-lating C GDOV• Not all SPICE versions subtract W D to form W EFF

    3.2.3 Gate Capacitance• The gate capacitance depends on the operating region

    • The gate–source capacitance C GS varies from zero (off) to 0.5C O in the linear region to

    (2/3)C O in the saturation region• The gate–drain capacitance C GD varies from zero (off) to 0.5C O (linear region) andback to zero (saturation region)

    • The gate–bulk capacitance C GB is two capacitors in series: the xed gate-oxide capaci-tance, C O, and the variable depletion capacitance, C S• As the transistor turns on the channel shields the bulk from the gate—and C GB falls tozero

    • Even with V GS =0V, the depletion width under the gate is nite and thus C GB is less thanCO

  • 8/9/2019 ASICs the Course Book

    65/505

    ASICs... THE COURSE 3.2 Transistor Parasitic Capacitance 9

    The variation of n-channel transistor parasitic capacitance

    • PSpice v5.4 ( LEVEL=3 )

    • Created by varying the input voltage, v(in1) , of an inverter

    • Data points are joined by straight lines

    • Note that CGSOV = CGDOV

    0

    2

    4

    6

    0 0.5 1 1.5 2 2.5 3

    CBD CBS CGSOV CGDOV

    CGBOV CGS CGD CGB

    capacitance/fF

    inverter input voltage, v(in1) /V

    off saturation linear

  • 8/9/2019 ASICs the Course Book

    66/505

    10 SECTION 3 ASIC LIBRARY DESIGN ASICS... THE COURSE

    3.2.4 Input Slew Rate

    (a)

    (b)

    (c)

    Measuring the input capacitance of an inverter

    (a) Input capacitance is measured by monitoring the input current to the inverter, i(Vin)

    (b) Very fast (non-equilibrium) switching: input current of 40fA = input capacitance of 40fF

    (c) Very slow (equilibrium) switching: input capacitance is now equal for both transitions

  • 8/9/2019 ASICs the Course Book

    67/505

    ASICs... THE COURSE 3.2 Transistor Parasitic Capacitance 11

    (a) (c)

    (b)(d)

    Parasitic capacitance measurement

    (a) All devices in this circuit include parasitic capacitance

    (b) This circuit uses linear capacitors to model the parasitic capacitance of m9/10 .

    • The load formed by the inverter ( m5 and m6 ) is modeled by a 0.0335pF capacitor ( c2 )

    • The parasitic capacitance due to the overlap of the gates of m3 and m4 with their source,drain, and bulk terminals is modeled by a 0.01pF capacitor ( c3 )

    • The effect of the parasitic capacitance at the drain terminals of m3 and m4 is modeled by a0.025pF capacitor ( c4 )

    (c) Comparison of (a) and (b). The delay (1.22–1.135=0.085ns) is equal to t PDf for the in-verter m3/4

    (d) An exact match would have both waveforms equal at the 0.35 trip point (1.05V).

  • 8/9/2019 ASICs the Course Book

    68/505

    12 SECTION 3 ASIC LIBRARY DESIGN ASICS... THE COURSE

    3.3 Logical EffortWe extend the prop–ramp model with a “catch all” term, t q, that includes:

    • delay due to internal parasitic capacitance

    • the time for the input to reach the switching threshold of the cell

    • the dependence of the delay on the slew rate of the input waveform

    • R and C will change as we scale a logic cell, but the RC product stays the same

    • Logical effort is independent of the size of a logic cell

    • We can nd logical effort by scaling a logic cell to have the same drive as a 1Xminimum-size inverter

    • Then the logical effort, g , is the ratio of the input capacitance, C in, of the 1X logic cell toC inv

    t PD = R(Cout + C p) + t qWe can scale any logic cell by a scaling factor s : t PD = (R / s )·(Cout + sC p) + st q

    Cout

    t PD = RC –––––– + RC p + st q C in

    (RC ) (Cout / C in ) + RC p + st qNormalizing the delay: d = ––––––––––––––––––––––––––––––– = f + p + q

    τ

    The time constant tau , τ = Rinv C inv , is a basic property of any CMOS technology

    The delay equation is the sum of three terms, d = f + p + q or

    delay = effort delay + parasitic delay + nonideal delayThe effort delay f is the product of logical effor t, g , and electrical effort , h: f = gh

    Thus, delay = logical effort × electrical effort + parasitic delay + nonideal delay

  • 8/9/2019 ASICs the Course Book

    69/505

    ASICs... THE COURSE 3.3 Logical Effort 13

    The h depends only on the load capacitance C out connected to the output of the logiccell and the input capacitance of the logic cell, C in; thus

    Logical effort • For a two-input NAND cell, the logical effort, g =4/3

    (a) Find the input capacitance, Cinv, looking into the input of a minimum-size inverter interms of the gate capacitance of a minimum-size device

    (b) Size a logic cell to have the same drive strength as a minimum-size inverter (assuminga logic ratio of 2). The input capacitance looking into one of the logic-cell terminals is thenC in

    (c) The logical effort of a cell is C in / C inv

    electrical effort h = Cout / C in

    parasitic delay p = RC p / τ (the parasitic delay of a minimum-size inverter is: pinv = C p /Cinv )

    nonideal delay q = st q / τ

    Cell effort, parasitic delay, and nonideal delay (in units of τ) for single-stage CMOS cells

    Cell Cell effort(logic ratio=2)Cell effort

    (logic ratio=r) Parasitic delay/ τ Nonideal delay/ τ

    inverter 1 (by denition) 1 (by denition) p inv (by denition) q inv (by denition)n-input NAND ( n +2)/3 ( n + r )/(r +1) np inv nq invn-input NOR (2 n +1)/3 ( nr +1)/( r +1) np inv nq inv

    (b)

    Cin =2+2=4

    Measure the inputcapacitance of aminimum-sizeinverter.

    Make the cell have the samedrive strength as aminimum-size inverter.

    g = C in / C inv = 4/3

    1X

    2/1

    2/1

    2/1 2/1

    (a)

    2/1

    1/1

    2 units ofgate capacitance

    1 unitCinv

    1X

    C inv

    (c)

    C inv =2+1=3

    A

    A

    Measure ratio of cell inputcapacitance to that of aminimum-size inverter.

    1

    2 3

    ZNZN

  • 8/9/2019 ASICs the Course Book

    70/505

    14 SECTION 3 ASIC LIBRARY DESIGN ASICS... THE COURSE

    3.3.1 Predicting Delay• Example: predict the delay of a three-input NOR logic cell

    • 2X drive

    • driving a net with a fanout of four• 0.3pF total load capacitance (input capacitance of cells we are driving plus the inter-connect)

    • p =3 p inv and q =3 q inv for this cell

    • the input gate capacitance of a 1X drive, three-input NOR logic cell is equal to gC inv• for a 2X logic cell, C in = 2 gC inv

    The delay of the NOR logic cell, in units of τ , is thus

    Cout g ·(0.3 pF) (0.3 pF) gh = g ––––– = ––––––––––– = –––––––––––– (Notice g cancels out in this equation)

    C in 2 gC inv (2)·(0.036 pF)

    0.3 × 10 –12

    d = gh + p + q = –––––––––––––––––––– + (3)·(1) + (3)·(1.7)

    (2)·(0.036 × 10 –12 )

    = 4.1666667 + 3 + 5.1

    = 12.266667 τ equivalent to an absolute delay, t PD ≈12.3 ×0.06ns=0.74ns

    The delay for a 2X drive, three-input NOR logic cell is t PD = (0.03 + 0.72 Cout + 0.60) ns

    With C out =0.3pF, t PD = 0.03 + (0.72)·(0.3) + 0.60 = 0.846 ns compared to our prediction of0.74ns

  • 8/9/2019 ASICs the Course Book

    71/505

    ASICs... THE COURSE 3.3 Logical Effort 15

    3.3.2 Logical Area and Logical Efciency

    An OAI221 logic cell

    • Logical-effort vector g =(7/3, 7/3,5/3)

    • The logical area is 33 logicalsquares

    An AOI221 logic cell

    • g =(8/3, 8/3, 7/3)

    • Logical area is 39 logical squares

    • Less logically efficient than OAI221

    Z

    4/1 4/1

    4/14/1

    A

    B

    C

    D

    VDD2/1E

    3/1A

    3/1C

    3/1B

    3/1D

    Z

    ABCDE 3/1

    E

    VDD

    Z

    6/1 6/1

    6/16/1

    6/1

    A

    C

    E

    B

    D

    1/1E

    2/1A

    2/1B

    2/1C

    2/1D

    Z

    ABCD

    E

  • 8/9/2019 ASICs the Course Book

    72/505

    16 SECTION 3 ASIC LIBRARY DESIGN ASICS... THE COURSE

    3.3.3 Logical Paths

    3.3.4 Multistage Cells

    path delay D = ∑ g i h i + ∑ ( p i + q i )

    i path i path

    Logical paths • Comparison of multistage and single-stage implementations

    (a) An AOI221 logic cell constructed as a multistage cell, d 1 = 20 + CL

    (b) A single-stage AOI221 logic cell, d 1 = 18.8 + CL

    (a)

    (b)

    d 1=(1 × 2.6+1+1.7) +(1 × C L +5+8.5)=18.8+ C L

    d 1=( g 0 h0 + p0 + q0) +( g2 h2 + p2 + q2) +( g3h 3 + p3 + q3) +( g4h4 + p4 + q4)

    =(1 × 1.4 +1+1.7)+(1.4 × 1+2+3.4)+(1.4 × 0.7+ 2+3.4) +(1 × C L +1+1.7)=20+ CL

    AOI21 g4 =1 p4 =1q4 =1.7

    ZN

    C

    A1A2

    B1B2

    ZN

    1.01.4

    1.41.0

    2.0 1.6

    C LC

    A1A2

    B1B2

    g1 =(2, 1.6) p1 =3q1 =5.4

    g0 =1 p0 =1q0 =1.7

    g2 =1.4 p2 =2q2 =3.4

    delay d1

    g3 =1.4 p3 =2q3 =3.4

    ZN

    C

    A1A2B1B2

    AOI221

    2.6

    g0 =1 p0 =1

    q0 =1.7 g1 =(2.6, 2.6, 2.2) p1 =5q1 =8.5

    CL

    delay d1

    h0 =1.41.4

    h3 =0.7

    h4 =C L

    2

    13 4 43

    1

    2

    AOI221 AOI221

    1.0

    (b) isslightlyfasterthan (a)

    h2 =1.0

  • 8/9/2019 ASICs the Course Book

    73/505

    ASICs... THE COURSE 3.3 Logical Effort 17

    3.3.5 Optimum Delay

    3.3.6 Optimum Number of Stages

    • Chain of N inverters each with equal stage effort, f=gh

    • Total path delay is Nf=Ngh=Nh , since g =1 for an inverter

    path logical effort G = ∏ g ii path

    Cout

    path electrical effort H = ∏ h i –––––

    i path C inCout is the load and C in is the rst input capacitance on the path

    path effort F = GH

    optimum effort delay f^i = g i h i = F 1/ N

    optimum path delay D^ = NF 1/ N = N (GH )1/ N + P + Q

    P + Q = ∑ p i + h i

    i path

    Stage effort

    h h/(ln h)1.5 3.72 2.9

    2.7 2.73 2.7

    4 2.95 3.1

    10 4.3

    0

    2

    4

    6

    8

    10

    12

    1 2 3 4 5 6 7 8 9 10

    = h /(ln h)

    stage electrical effort, h=H 1/ N

    Delay of N inverter stages drivinga path effort of H = C out / C in .

    C in Cout

    1 N2

    h

    delay/(ln H )

    h

  • 8/9/2019 ASICs the Course Book

    74/505

    18 SECTION 3 ASIC LIBRARY DESIGN ASICS... THE COURSE

    • To drive a path electrical effort H, h N = H , or N ln h =lnH

    • Delay, Nh = h lnH /ln h

    • Since ln H is xed, we can only vary h /ln( h)

    • h /ln( h) is a shallow function with a minimum at h =e ≈2.718• Total delay is N e=eln H

    3.4 Library-Cell Design• A big problem in library design is dealing with design rules

    • Sometimes we can waive design rules

    • Symbolic layout , sticks or logs can decrease the library design time (9 months forVirtual Silicon–currently the most sophisticated standard-cell library)

    • Mapping symbolic layout uses 10–20 percent more area (5–10 percent with compac-tion)

    • Allowing 45°layout decreases silicon area (some companies do not allow 45° layout)

  • 8/9/2019 ASICs the Course Book

    75/505

    ASICs... THE COURSE 3.5 Library Architecture 19

    3.5 Library Architecture

    (a) (b )

    (c) (d)

    Cell library statistics

    • 80percent of an ASIC uses less than20percent of the cell library

    • Cell importance

    • A D flip-flop (with a cell importance of 3.5)contributes 3.5 times as much area on a typi-cal ASIC than does an inverter (with a cell im-portance of 1)

    (e)

    cell numberordered bycell use

    normalized cell use(minimum-size inverter=1)

    0

    1

    cell numberordered bycell use

    50 normalized cell area(minimum-size inverter=1)

    0

    cell area × cell use(minimum-size inverter=1)

    cell numberordered bycell use

    0

    4

    0

    1

    cell numberordered bycell importance

    normalized cell importance(D flip-flop=1)

    cell importance =cell area × cell use(D flip-flop=1)

    cell use (minimum-size inverter=1)

    0

    1

    cell numberordered bycell use andby cell importance

  • 8/9/2019 ASICs the Course Book

    76/505

    20 SECTION 3 ASIC LIBRARY DESIGN ASICS... THE COURSE

    3.6 Gate-Array Design

    Key words: gate-array base cell (or base cell) • gate-array base (or base) • horizontal tracks •

    vertical track • gate isolation • isolator transistor • oxide isolation • oxide-isolated gate array

    The construction of a gate-isolated gate array(a) The one-track-wide base cell containing one p-channel and one n-channel transistor

    (b) The center base cell is isolating the base cells on either side from each other

    (c) The base cell is 21 tracks high (high for a modern cell library)

    n-well

    p-well

    n-diff

    p-diff

    poly

    m1

    m2

    contact

    (a) (b) (c)

    continuousp-diff strip

    continuousn-diff strip

    VDD

    VSS

    n-wellcontact

    p-wellcontact

    bent gate

    1

    2

    3

    4

    5

    6

    7

    8

    9

    10

    11

    12

    13

    14

    15

    16

    17

    18

    19

    20

    21

    contact forisolator

  • 8/9/2019 ASICs the Course Book

    77/505

    ASICs... THE COURSE 3.6 Gate-Array Design 21

    An oxide-isolated gate-array base cell

    • Two base cells, each contains eight transistors and two well contacts

    • The p-channel and n-channel transistors are each 4 tracks high

    • The cell is 12 tracks high (8–12 is typical for a modern library)

    • The base cell is 7 tracks wide

    VDD

    GND

    n-wellp-welln-diff

    p-diffpoly

    m1

    m2contact

    n-wellcontactp-wellcontact

    base cell

    break in diffusion

    12(3)456789(10)1112

    1 2 3 4 5 6 7 poly

    p-diff

    n-diff

    p-diff

  • 8/9/2019 ASICs the Course Book

    78/505

    22 SECTION 3 ASIC LIBRARY DESIGN ASICS... THE COURSE

    An oxide-isolated gate-array base cell

    • 14 tracks high and 4 tracks wide

    • VDD (tracks 3 and 4) and GND (tracks 11 and 12) are each 2 tracks wide

    • 10 horizontal routing tracks (tracks 1, 2, 5–10, 13, 14)—unusually large number for mod-ern cells

    • p-channel and n-channel polysilicon bent gates are tied together in the center of the cell

    • The well contacts leave room for a poly cross-under in each base cell.

    n-well

    p-well

    n-diff

    p-diff

    poly

    m1

    m2

    contact

    poly cross-under

    VDD

    VSS

    1

    2

    (3)

    (4)

    5

    6

    7

    8

    9

    10

    (11)

    (12)

    13

    14

    n-well contact

    1 2 3 4

  • 8/9/2019 ASICs the Course Book

    79/505

    ASICs... THE COURSE 3.6 Gate-Array Design 23

    Flip-op macro in a gate-isolated gate-array library

    • Only the first-level metallization and contact pattern, the personalization , is shown, butthis is enough information to derive the schematic

    • This is an older topology for 2LM (cells for 3LM are shorter in height)

    D

    Q

    contact forisolator

    VDD

    VSS

    connector

    CLR

    QN

    CLK

  • 8/9/2019 ASICs the Course Book

    80/505

    24 SECTION 3 ASIC LIBRARY DESIGN ASICS... THE COURSE

    The SiARC/Synopsys cell-based array (CBA) basic cell

    • This is CBA I for 2LM (CBA II is intended for 3LM and salicide proceses)

    n-well

    p-well

    n-diff

    p-diff

    poly

    m1

    m2

    1

    2

    3

    4

    5

    6

    7

    8

    9

    10

    11

  • 8/9/2019 ASICs the Course Book

    81/505

    ASICs... THE COURSE 3.6 Gate-Array Design 25

    A simple gate-array base cell

    aa bb cc dd ee ff gg hh ii jj kk ll

    ab

    c

    def

    g

    hi

    j

    k

    l

    m

    n

    o

    p

    q

    mm

    nn

    oo

    xx

    yy

    aa+bb

    = 2.75= (0.5 × P.4 + C.3)

    = 1.25

    = 0.5 × P.4

    = xx + yy

    = 1.5= C.3

    n-wellpoly

    ndiff

    contact

    ndiff pdiff

    pdiff

    BB

    BB BB

    basecell 1

    basecell 2

    BB = cell bounding box

    p-well

  • 8/9/2019 ASICs the Course Book

    82/505

    26 SECTION 3 ASIC LIBRARY DESIGN ASICS... THE COURSE

    3.7 Standard-Cell Design

    A D ip-op standard cell

    • Performance-optimized library • Area-optimized library

    • Wide power buses and transistors for a performance-optimized cell

    • Double-entry cell intended for a 2LM process and channel routing

    • Five connectors run vertically through the cell on m2

    • The extra short vertical metal line is an internal crossover

    • bounding box (BB) • abutment box (AB) • physical connector • abut

  • 8/9/2019 ASICs the Course Book

    83/505

    ASICs... THE COURSE 3.7 Standard-Cell Design 27

    A D ip-op from a 1.0 µm standard-cell library

    VDD

    GND

  • 8/9/2019 ASICs the Course Book

    84/505

    28 SECTION 3 ASIC LIBRARY DESIGN ASICS... THE COURSE

    D ip-op(Top) n-diffusion, p-diffusion, poly, contact (n-well and p-well are not shown)

    (Bottom) m1, contact, m2, and via layers

    poly

    ndiff

    pdiff

    contactt1 t2 t3

    t4 t5t6 t7 t8

    t9 t10 t11

    t12 t13

    t14 t15

    t20 t21t22

    t23

    t24

    t25 t26 t27

    t28

    t29t30

    t31

    t32

    t33

    t34

    1 1

    1

    1 1 1 1

    00

    00

    2

    2

    3

    3

    3

    44

    4

    5

    5

    5 5

    6

    6

    6 6

    6

    7

    7

    7

    7

    8

    8

    8

    9

    9

    9

    10

    10

    10

    c

    d

    d

    a

    b

    b

    e

    e

    10

    9

    10

    via

    m1

    m2

    contactVDD

    VSS

    ba

    1

    0

    2 3

    4 5

    5 5

    35

    6

    67

    7

    8

    9

    10

    c

    d6

    8

    c d e

  • 8/9/2019 ASICs the Course Book

    85/505

    ASICs... THE COURSE 3.8 Datapath-Cell Design 29

    3.8 Datapath-Cell Design

    A datapath D ip-op cell

    VDD

    VDD

    VSS

    VSS

  • 8/9/2019 ASICs the Course Book

    86/505

  • 8/9/2019 ASICs the Course Book

    87/505

    ASICs... THE COURSE 3.9 Summary 31

    3.9 Summary

    Key concepts:

    • Tau, logical effort, and the prediction of delay• Sizes of cells, and their drive strengths

    • Cell importance

    • The difference between gate-array macros, standard cells, and datapath cells

  • 8/9/2019 ASICs the Course Book

    88/505

    32 SECTION 3 ASIC LIBRARY DESIGN ASICS... THE COURSE

  • 8/9/2019 ASICs the Course Book

    89/505

    ASICs...THE COURSE (1 WEEK)

    1

    PROGRAMMABLEASICs

    4.1 The Antifuse

    Key concepts: programmable logic devices (PLDs) • eld-programmable gate arrays

    (FPGAs) • programming technology • basic logic cells • I/O logic cells • programmable inter-

    connect • software to design and program the FPGA

    Actel antifuse

    antifuse • programming current (about 5mA) • (PLICE‘) • oxide–nitride–oxide (ONO) dielec-tric • Activator • in-system programming (ISP) • gang programmers • one-time programma-ble (OTP) FPGAs

    n+ antifusediffusion

    antifusepolysilicon

    2 λ

  • 8/9/2019 ASICs the Course Book

    90/505

  • 8/9/2019 ASICs the Course Book

    91/505

    ASICs... THE COURSE 4.1 The Antifuse 3

    4.1.1 Metal–Metal Antifuse

    Metal–metal antifuse

    QuickLogic metal–metal antifuse (ViaLink‘) • alloy of tungsten, titanium, and silicon • bulk re-sistance of about 500m Ω cm

    Resistance values for the QuickLogicmetal–metal antifuse

    m1

    m2

    SiO2

    SiO2via

    link

    link

    m2

    amorphous Si

    (a) (b)

    SiO2

    tungstenplug

    m3

    4 λ

    4 λ

    amorphous Si2 λ

    m3

    m2

    2 λ2 λ

    antifuse resistance/ Ω

    0

    100percentage

  • 8/9/2019 ASICs the Course Book

    92/505

    4 SECTION 4 PROGRAMMABLE ASICs ASICS... THE COURSE

    4.2 Static RAM

    4.3 EPROM and EEPROM Technology

    Xilinx SRAM (static RAM) congura-tion cell

    • use in reconfigurable hardware

    • use of programmable read-onlymemory or PROM to hold configu-ration

    An EPROM transistor

    (a) With a high (>12V) programming voltage, V PP , applied to the drain, electrons gainenough energy to “jump” onto the floating gate (gate1)

    (b) Electrons stuck on gate1 raise the threshold voltage so that the transistor is always offfor normal operating voltages

    (c) UV light provides enough energy for the electrons stuck on gate1 to “jump” back to thebulk, allowing the transistor to operate normally

    Facts and keywords: Altera MAX 5000 EPLDs and Xilinx EPLDs both use UV-erasable

    electrically programmable read-only memory (EPROM) • hot-electron injection or avalancheinjection • floating-gate avalanche MOS (FAMOS)

    DATA

    READ orWRITE

    Q

    Q'

    configurationcontrol

    source drain

    +VPPGND

    electronsGND

    +V GS > V tn

    gate1

    gate2

    source drain

    +V DSGND

    no channelbulk

    GND

    +V GS > Vtn h ν

    UV light

    (a) (b) (c)

    bulkbulk

  • 8/9/2019 ASICs the Course Book

    93/505

    ASICs... THE COURSE 4.4 Practical Issues 5

    4.4 Practical Issues

    4.4.1 FPGAs in Use• inventory

    • risk inventory or safety supply

    •just-in-time

    (JIT

    )

    • printed-circuit boards (PCBs )

    • pin locking or I/O locking

    4.5 Specications• qualification kit

    • down-binning

    4.6 PREP Benchmarks• Programmable Electronics Performance Company (PREP )

    • http://www.prep.org

    Hardware security keycomputer-aided engineering (CAE ) tools • PC vs. workstation •ease of use • cost of ownership

  • 8/9/2019 ASICs the Course Book

    94/505

    6 SECTION 4 PROGRAMMABLE ASICs ASICS... THE COURSE

    4.7 FPGA Economics

    Xilinx part-naming convention

    Not all parts are available in all packag-es

    Some parts are packaged with fewerleads than I/Os

    Programmable ASIC part codes

    Item Code Description Code DescriptionManufac-turer’s

    code

    A Actel ATT AT&T (Lucent)XC Xilinx isp Lattice Logic

    EPM Altera MAX M5 AMD MACH 5 is on thedevice

    EPF Altera FLEX QL QuickLogicCY7C Cypress

    Package

    type

    PL or

    PC

    plastic J-leaded chip carrier,

    PLCC

    VQ very thin quad atpack,

    VQFPPQ plastic quad atpack, PQFP TQ thin plastic atpack, TQFPCQ orCB

    ceramic quad atpack, CQFP PP plastic pin-grid array, PPGA

    PG ceramic pin-grid array, PGA WB,PB

    ball-grid array, BGA

    Application C commercial B MIL-STD-883I industrial E extendedM military

    XC4010-10 PG156C

    device typespeedpackagenumber of pinstemperature range

  • 8/9/2019 ASICs the Course Book

    95/505

  • 8/9/2019 ASICs the Course Book

    96/505

  • 8/9/2019 ASICs the Course Book

    97/505

    ASICs... THE COURSE 4.7 FPGA Economics 9

    4.7.2 Pricing Examplesbase prices and adjustment factors • “sticker price”

    • Marshall at http://marshall.com , carry Xilinx

    • Hamilton-Avnet, at http://www.hh.avnet.com , carry Xilinx

    • Wyle, at http://www.wyle.com carries Actel and Altera

    Example Actel part-price calculation

    Example: A1020A-2-PQ100 in (100–999) quantity, purchased 1H92.Factor Example ValueBase price A1020A $43.30Quantity 100–999 84%Time 1H92 100%Qualication type Industrial (I) 120%

    Speed bin 1 2 140%

    Package PQ100 125%Estimated price (1H92) $76.38Actual Actel price (1H92) $75.601The speed bin is a manufacturer’s code (usually a number) that follows the family part numberand indicates the maximum operating speed of the device

  • 8/9/2019 ASICs the Course Book

    98/505

    10 SECTION 4 PROGRAMMABLE ASICs ASICS... THE COURSE

    4.8 Summary

    All FPGAs have the following key elements:

    • The programming technology

    • The basic logic cells

    • The I/O logic cells

    • Programmable interconnect

    • Software to design and program the FPGA

    Programmable ASIC technologies

    Actel Xilinx LCA1

    Altera EPLD Xilinx EPLDProgrammingtechnology

    Poly–diffusionantifuse, PLICE

    Erasable SRAM

    ISP

    UV-erasableEPROM (MAX 5k)

    EEPROM (MAX7/9k)

    UV-erasableEPROM

    Size ofprogrammingelement

    Small but requirescontacts to metal

    Two inverters pluspass and switchdevices. Largest.

    One n -channelEPROM device.

    Medium.

    One n -channelEPROM device.

    Medium.Process Special: CMOS

    plus three extramasks.

    Standard CMOS Standard EPROMand EEPROM

    Standard EPROM

    Program-ming method

    Special hardware PC card, PROM,or serial port

    ISP (MAX 9k) orEPROM program-mer

    EPROM program-mer

    QuickLogic Crosspoint Atmel Altera FLEXProgrammingtechnology

    Metal–metalantifuse, ViaLink

    Metal–polysiliconantifuse

    Erasable SRAM.

    ISP.

    Erasable SRAM.

    ISP.

    Size ofprogramming

    element

    Smallest Small Two inverters pluspass and switchdevices. Largest.

    Two inverters pluspass and switchdevices. Largest.

    Process Special, CMOSplus ViaLink

    Special, CMOSplus antifuse

    Standard CMOS Standard CMOS

    Program-ming method

    Special hardware Special hardware PC card, PROM,or serial port

    PC card, PROM,or serial port

    1Lucent (formerly AT&T) FPGAs have almost identical properties to the Xilinx LCA family

  • 8/9/2019 ASICs the Course Book

    99/505

    ASICs... THE COURSE 4.9 Problems 11

    4.9 Problems

  • 8/9/2019 ASICs the Course Book

    100/505

    12 SECTION 4 PROGRAMMABLE ASICs ASICS... THE COURSE

  • 8/9/2019 ASICs the Course Book

    101/505

    ASICs...THE COURSE (1 WEEK)

    1

    PROGRAMMABLEASIC LOGICCELLS

    5.1 Actel ACT5.1.1 ACT 1 Logic Module

    Key concepts: basic logic cell • multiplexer-based cell • look-up table (LUT) • programmable

    array logic (PAL) • inuence of programming technology • timing • worst-case design

    The Actel ACT architecture

    (a) Organization of the basic logic cells

    (b) The ACT 1 Logic Module (LM, the Actel basic logic cell). The ACT 1 family uses justone type of LM. ACT 2 and ACT 3 FPGA families both use two different types of LM

    (c) An example LM implementation using pass transistors (without any buffering)

    (d) An example logic macro. Connect logic signals to some or all of the LM inputs, the re-maining inputs to VDD or GND

    F

    (b) (c) (d)

    S3

    FA0

    SA F1

    A1

    B0

    SB

    F2

    B1

    S0

    S1 O1

    S

    A0

    A1

    SA

    01

    01

    SB0

    B1

    SB

    01S

    S0S1

    M1

    M2

    O1

    M3

    S3

    F1

    F2

    01

    01

    01

    D

    '1'

    D

    A

    '1'

    C

    '0'B

    F

    (a)

    Actel ACT

    Logic Module Logic Module Logic Module

    F=(A ·B) +(B' ·C)+D

    F1

    F2

  • 8/9/2019 ASICs the Course Book

    102/505

    2 SECTION 5 PROGRAMMABLE ASIC LOGIC CELLS ASICS... THE COURSE

    5.1.2 Shannon’s Expansion Theorem• We can use the Shannon expansion theorem to expand F =A·F(A='1') +A'·F(A='0')

    Example: F =A' ·B + A·B·C' + A'·B' ·C = A·(B·C' ) + A' ·(B + B'·C)• F(A='1')=B·C' is the cofactor of F with respect to ( wrt ) A or FA• If we expand F wrt B, F =A