Unit3_ghongade

Programmable Logic Devices R.B.Ghongade UNIT 3


  • Programmable Logic Devices

    R.B.Ghongade

    UNIT 3

  • Key Terms

    Field-Programmable Device (FPD) a general term that refers to any type of integrated circuit used for implementing digital hardware, where the chip can be configured by the end user to realize different designs. Programming of such a device often involves placing the chip into a special programming unit, but some chips can also be configured in-system. Another name for FPDs is programmable logic devices (PLDs); although PLDs encompass the same types of chips as FPDs, we prefer the term FPD because historically the word PLD has referred to relatively simple types of devices.

    PLA a Programmable Logic Array (PLA) is a relatively small FPD that contains two levels of logic, an AND-plane and an OR-plane, where both levels are programmable.

  • Key Terms

    PAL a Programmable Array Logic (PAL) is a relatively small FPD that has a programmable AND-plane followed by a fixed OR-plane.

    SPLD refers to any type of Simple PLD, usually either a PLA or PAL

    CPLD a more Complex PLD that consists of an arrangement of multiple SPLD-like blocks on a single chip. Alternative names sometimes adopted for this style of chip are Enhanced PLD (EPLD), Super PAL, Mega PAL, and others.

    FPGA a Field-Programmable Gate Array is an FPD featuring a general structure that allows very high logic capacity. Whereas CPLDs feature logic resources with a wide number of inputs (AND planes), FPGAs offer more narrow logic resources. FPGAs also offer a higher ratio of flip-flops to logic resources than do CPLDs.

  • Key Terms

    HCPLDs high-capacity PLDs: a single acronym that refers to both CPLDs and FPGAs. This term has been coined in trade literature for providing an easy way to refer to both types of devices.

    Interconnect the wiring resources in an FPD.

    Programmable Switch a user-programmable switch that can connect a logic element to an interconnect wire, or one interconnect wire to another.

    Logic Block a relatively small circuit block that is replicated in an array in an FPD. When a circuit is implemented in an FPD, it is first decomposed into smaller sub-circuits that can each be mapped into a logic block. The term logic block is mostly used in the context of FPGAs, but it could also refer to a block of circuitry in a CPLD.

  • Key Terms

    Logic Capacity the amount of digital logic that can be mapped into a single FPD. This is usually measured in units of equivalent number of gates in a traditional gate array. In other words, the capacity of an FPD is measured by the size of gate array that it is comparable to. In simpler terms, logic capacity can be thought of as the number of 2-input NAND gates.

    Logic Density the amount of logic per unit area in an FPD.

    Speed-Performance measures the maximum operable speed of a circuit when implemented in an FPD. For combinational circuits, it is set by the longest delay through any path, and for sequential circuits it is the maximum clock frequency for which the circuit functions properly.

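  • Digital VLSI Chips - classification

    [Figure: classification tree of digital VLSI chips. ASIC branch: gate array, standard cell, full custom. FPD (Field Programmable Device) branch: SPLD, CPLD, FPGA, with arrows marked "Increasing complexity". Device types listed: PLA, PAL, GAL, PROM, EPLD, E2PLD. The figure also marks the generic usage and the typical usage of the terms.]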

  • General Programmable Logic Device

    [Figure: a block of logic gates and programmable switches, with inputs (logic variables) entering and outputs (logic functions) leaving.]

    Consists of a set of inputs (the logic variables) and set of outputs (logic functions).

    The job of the designer is to simply program the switches and hence configure the logic gates to perform the desired function

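  • AND-OR realization of logic functions

    Y = AB + AC + BC

    C B A | Y
    0 0 0 | 0
    0 0 1 | 0
    0 1 0 | 0
    0 1 1 | 1
    1 0 0 | 0
    1 0 1 | 1
    1 1 0 | 1
    1 1 1 | 1

    [Figure: AND-OR circuit with inputs A, B, C and output Y.]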

    Thus given a logic function in SOP form , it can be implemented by using AND and OR arrays. This forms the basic working principle of programmable logic devices
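
    A minimal Python sketch of this AND-OR structure, assuming the function on the slide is Y = AB + AC + BC. The function names `and_plane`, `or_plane` and `majority` are illustrative only and do not come from the slides.

```python
from itertools import product

def and_plane(a, b, c):
    """Form the three product terms AB, AC and BC."""
    return [a & b, a & c, b & c]

def or_plane(terms):
    """OR all product terms together to give the output Y."""
    y = 0
    for t in terms:
        y |= t
    return y

def majority(a, b, c):
    """Y = AB + AC + BC, built as an AND plane followed by an OR plane."""
    return or_plane(and_plane(a, b, c))

# Print the truth table with columns C B A -> Y
for c, b, a in product((0, 1), repeat=3):
    print(c, b, a, "->", majority(a, b, c))
```

    Any other SOP expression can be handled the same way by changing only the list of product terms, which is the idea a programmable device builds on.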

  • General form of programmable function device

    [Figure: programmable AND structure. Labels: inputs a and b, Logic 1, pull-up resistors, links that can be programmed, output y.]

    Inputs are available in their true as well as inverted (complementary) forms. This is an important development, since every input polarity is readily available.

    The user can now place links and construct the desired function. Placing a link (or removing it) is called programming the device.

  • Programming technologies

    The type of links gives rise to two different technologies: fusible link and antifuse.

  • Fusible Link Technology

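    [Figure: the programmable AND structure (inputs a and b, Logic 1, output y) shown twice. Un-programmed device: all fuses intact. Programmed device: selected fuses blown, giving y = a.b'.]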

  • Fusible link technologies

    Devices based on fusible-link technologies are said to be one-time programmable, or OTP, because once a fuse has been blown, it cannot be replaced. This places a severe limitation on the usage of the device.
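
    The fusible-link example can be sketched in Python as follows. This is a simplified model, assuming that a blown fuse leaves the corresponding AND input pulled up to logic 1; the function name `and_row` and the dictionary layout are hypothetical.

```python
def and_row(inputs, fuses):
    """Evaluate one AND gate of a fusible-link array.

    inputs: signal values, e.g. {"a": 1, "b": 0}
    fuses:  maps each literal ("a", "a'", "b", "b'") to True if its fuse
            is intact, or False if the fuse has been blown.
    Assumption: a blown fuse leaves that AND input pulled up to logic 1.
    """
    y = 1
    for literal, intact in fuses.items():
        value = inputs[literal.rstrip("'")]
        if literal.endswith("'"):
            value = 1 - value           # complemented form of the input
        y &= value if intact else 1     # pull-up supplies 1 when blown
    return y

# Un-programmed device: every fuse intact, so y = a.a'.b.b' = 0 always.
unprogrammed = {"a": True, "a'": True, "b": True, "b'": True}

# Programmed device: the a' and b fuses are blown, leaving y = a.b'
programmed = {"a": True, "a'": False, "b": False, "b'": True}

for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_row({"a": a, "b": b}, programmed))
```

    Because the model only ever changes a fuse from intact to blown, it also reflects why such a device is one-time programmable.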

  • Antifuse technologies

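    [Figure: the programmable AND structure (inputs a and b, Logic 1, output y) shown twice. Un-programmed device: antifuse links not yet grown. Programmed device: programmed antifuse links in place, giving y = a.b'.]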

  • Simplified Antifuse

    An antifuse is a microscopic column of amorphous (non-crystalline) silicon linking two metal tracks. In its un-programmed state, the amorphous silicon acts as an insulator with a very high resistance, in excess of one billion ohms.

    [Figure: antifuse in its un-programmed and programmed states.]

  • Types of antifuse technologies

    There are two classes of antifuse technologies:

    Poly-diffusion antifuse (used by Actel)

    Metal-metal antifuse (used by QuickLogic)

  • Poly-diffusion Anti-fuse

    An Oxide-Nitride-Oxide dielectric normally prevents current from flowing between diffusion and poly-silicon layers

    When a programming pulse is applied the dielectric melts and a circuit is formed between the diffusion and poly-silicon

  • Metal-Metal Anti-fuse

    The link is an alloy of tungsten, titanium and silicon.

    The conductive link usually forms at the corner of the via, where the electric field is highest during programming.

  • Programming!

    The act of programming this particular element effectively grows a link, known as a via, by converting the insulating amorphous silicon into conducting polysilicon.

    Devices based on antifuse technologies are OTP, because once an antifuse has been grown, it cannot be removed. Again this is a severe limitation of the technology, but antifuse technology has found its way into space applications because of its high reliability.

  • PLD Notation

    [Figure: PLD notation. Input columns a, b, c, d cross two product-term rows, a.b'.d and a'.c'. Legend: link, no link.]

  • Non-programmable link

    [Figure: input columns a, b, c, d crossing the product-term rows a.b'.d and a'.c', with the symbol for a non-programmable connection.]

  • Programmable Logic Array (PLA)

    The AND array along with an OR array can be put together to form a Programmable Logic Array (PLA)

    We have already explored the technique of realizing any logic expression by using AND and OR gates. This is the underlying principle of PLA.

    In a PLA, both the AND and the OR arrays are programmable.

    PLAs are specified in terms of:

    Number of inputs (n)

    Number of outputs (m)

    Number of product terms (p)

  • PLA

    [Figure: PLA structure. Inputs a, b, c, d feed a programmable AND array, followed by a programmable OR array producing outputs f1, f2, f3.]

  • PLA programmed for various logic expressions

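    [Figure: PLA with inputs a, b, c, d programmed for three product terms and three outputs.]

    Product terms: a.b.c'.d', a'.b'.c', b'.c

    Outputs: a.b.c'.d' + a'.b'.c', a'.b'.c' + b'.c, a.b.c'.d' + b'.c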
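
    The programmed PLA above can be modelled with two small tables, one per array. This is an illustrative sketch: the names `AND_ARRAY`, `OR_ARRAY`, `p0` to `p2`, and the assignment of the three sums to f1, f2 and f3 are assumptions, not taken from the slide.

```python
# AND array: each product term lists the literals it is linked to.
# 1 = true input, 0 = complemented input, absent = no link.
AND_ARRAY = {
    "p0": {"a": 1, "b": 1, "c": 0, "d": 0},   # a.b.c'.d'
    "p1": {"a": 0, "b": 0, "c": 0},           # a'.b'.c'
    "p2": {"b": 0, "c": 1},                   # b'.c
}

# OR array: each output lists the product terms linked into it.
# The output names f1..f3 are assumed for illustration.
OR_ARRAY = {
    "f1": ["p0", "p1"],   # a.b.c'.d' + a'.b'.c'
    "f2": ["p1", "p2"],   # a'.b'.c' + b'.c
    "f3": ["p0", "p2"],   # a.b.c'.d' + b'.c
}

def pla(inputs):
    """Evaluate every output for one input assignment."""
    terms = {
        name: all(inputs[var] == want for var, want in literals.items())
        for name, literals in AND_ARRAY.items()
    }
    return {out: int(any(terms[p] for p in linked))
            for out, linked in OR_ARRAY.items()}

print(pla({"a": 1, "b": 1, "c": 0, "d": 0}))
```

    Changing an entry in either table corresponds to programming a link in that array, which is what distinguishes a PLA from a device with a fixed OR array.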

  • PLA QP82S100

    Consists of 16 dedicated inputs and 8 dedicated outputs

    Each output can be actively controlled by any or all of the 48 product terms. The True, Complement, or Don't Care condition of each of the 16 inputs can be ANDed together to form one product term.

    All 48 product terms can be selectively ORed to each output.

    n = 16, m = 8, p = 48

  • PLA QP82S100

  • Programmable AND Array Logic (PAL)

    Many applications do not require that both the AND as well as OR arrays be programmable.

    Programmable links are slower than permanent links owing to the considerable resistance shown by the fusible material.

    Hence another option for design engineers: the AND array is kept programmable, as in a PLA, but the OR array has no programmability.

    The OR array contains only permanent connections, which pre-define the sum terms.

    This reduces the flexibility but greatly improves the speed and reduces the manufacturing cost.

  • PAL

    [Figure: PAL structure. Inputs a, b, c, d feed a programmable AND array, followed by a fixed OR array producing outputs f1, f2, f3.]

  • Additional Features

    Tri-state outputs

      give programmable bi-directional pins

      save pin count

    Registered outputs

      enable the use of the PAL in finite state machines

      increase the versatility of the device

  • Macrocell

  • PAL16L8A

  • Specifications

    Part Number = PAL16L8A

    Description = Programmable array logic device

    Fuse type = titanium-tungsten

    Manufacturer = Texas Instruments

    Number of Inputs = Up to 16

    Prod. Terms Max. = 64

    No. of Outputs = Up to 8

    Nom. Supp (V) = 5.0

    Package = DIP, Leadless Ceramic Chip Carrier (FK)

    Pins = 20

    Technology = Advanced Low-Power Schottky

    Bi-directional pins = 6

  • Programming the PLD

    Programming a traditional PLD is easy because there are computer programs and associated tools specially created for the task.

    The user first creates a computer file, known as a PLD source file, containing a textual description of the required functionality.

    In addition to Boolean equations, the PLD source file may also support truth tables, state tables, and other constructs, all in textual format.

    A development program processes the source file and generates a JEDEC file describing which links to program. The program can also select a device automatically on a variety of criteria, such as the speed, cost, and power consumption of the devices.

    The program may also be used to partition a large design across several devices, in which case it will output a separate JEDEC file for each device.

    Finally, the designer takes a new device of the appropriate type and places it in a socket on a special tool, which may be referred to as a programmer, blower, or burner.

    The main computer passes the JEDEC file to the programmer, which uses the contents of the file to determine which fuses to blow.

    JEDEC: Joint Electron Device Engineering Council

  • Setup for programming

  • Reprogrammable PLDs

    The basic (and most severe) limitation of fusible link and antifuse technologies is that the device cannot be re-programmed.

    This may be a severe short-coming especially during the development phases of the system

  • Technologies for re-programmable PLD

    EPROM( Erasable Programmable Read-Only Memory )

    E2PROM( Electrically Erasable Programmable Read-Only Memory )

    FLASH

    SRAM (Static Random Access Memory)

  • EPROM

    An EPROM transistor has the same basic structure as a standard MOS transistor, but with the addition of a second polysilicon floating gate isolated by layers of oxide

    [Figure: two transistor cross-sections on silicon (Si) with SiO2 insulation. MOS transistor: source, gate and drain, with source, control gate and drain terminals. EPROM transistor: the same structure with an added floating gate.]

  • EPROM

    In its un-programmed state, the floating gate is uncharged and doesn't affect the normal operation of the control gate.

    To program the transistor, a relatively high voltage, on the order of 12 V, is applied between the control gate and drain terminals.

    This causes the transistor to be turned hard on, and excited electrons push through the oxide into the floating gate in a process known as hot (high energy) electron injection.

    When the programming signal is removed, a negative charge remains on the floating gate.

    This charge is very stable and will not dissipate for more than a decade under normal operating conditions.

    The stored charge on the floating gate inhibits the normal operation of the control gate, and thus distinguishes those cells that have been programmed from those which have not.

  • EPROM

  • E2PROM

    An E2PROM cell is approximately 2.5 times larger than an EPROM cell because it contains two transistors.

    One of the transistors is similar to that of an EPROM transistor in that it contains a floating gate, but the insulating oxide layers surrounding the floating gate are very much thinner.

    The second transistor can be used to erase the cell electrically, and E2PROM devices can typically be erased and reprogrammed on a word-by-word basis.

  • E2PROM

  • FLASH

    The name FLASH was originally coined to reflect the technology's rapid erasure times compared to EPROM.

    These devices can be electrically erased, but only by erasing the whole device or a large portion of it.

    Some architectures have a two-transistor cell which is very similar to that of an E2PROM cell, allowing them to be erased and reprogrammed on a word-by-word basis.

  • FLASH

  • SRAM

    It consists of two cross-coupled inverters and two access transistors.

    The SRAM cell drives the gates of other transistors on the chip, either ON to make a connection or OFF to break it.

    The access transistors are connected to the READ/WRITE line at their gate terminals, and to the DATA lines at their source/drain terminals.

    The READ/WRITE line is used to select the cell, while the DATA lines are used to perform read or write operations on the cell.

    Internally, the cell holds the stored value on one side and its complement on the other side.

    To store data, READ/WRITE is set to 1 (5 V); the NMOS access transistor then passes the data from one side of the transistor to the other. After the data stabilizes around the two NOT gates, READ/WRITE is set to 0 and the data stays latched for as long as power is applied.

    Note that the lower NOT gate is labeled WEAK, meaning it is built from weaker transistors. When new data is written and the logic level has to change, the STRONG NOT gate overrides the WEAK one.

  • SRAM Cell

  • SRAM

    SRAM cells are used for the following:

    1. They can store a logic value of 0 or 1.

    2. They can store a value of an LUT.

    3. They configure the interconnection switches of the FPGA
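
    As a rough illustration of the second use, the sketch below builds a 2-input look-up table from four SRAM cells. The class names `SramCell` and `Lut2` are hypothetical, and the model ignores timing and the transistor-level behaviour described earlier.

```python
class SramCell:
    """One configuration bit; its value is lost when power is removed."""

    def __init__(self):
        self.value = 0

    def write(self, bit):
        self.value = bit & 1


class Lut2:
    """2-input LUT: four SRAM cells addressed by the inputs (a, b)."""

    def __init__(self, truth_table):
        # truth_table holds the outputs for ab = 00, 01, 10, 11
        self.cells = [SramCell() for _ in range(4)]
        for cell, bit in zip(self.cells, truth_table):
            cell.write(bit)

    def read(self, a, b):
        # The inputs act as an address selecting one stored bit
        return self.cells[(a << 1) | b].value


xor_lut = Lut2([0, 1, 1, 0])   # cell contents for a XOR b
print(xor_lut.read(1, 0))
```

    Loading a different list of four bits turns the same hardware into any other 2-input function, which is why the cell contents must be reloaded after every power-up.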

  • Programming technologies: summary

    Type          Re-programmable       Volatile  Technology  Associated with
    Fusible link  No                    No        Bipolar     SPLD
    Anti-fuse     No                    No        CMOS        FPGA
    EPROM         Yes (out of circuit)  No        UVCMOS      SPLD & CPLD
    E2PROM        Yes (in-circuit)      No        EECMOS      SPLD & CPLD
    SRAM          Yes (in-circuit)      Yes       CMOS        FPGA
    FLASH         Yes (in-circuit)      No        CMOS        FPGA

    [Symbol column: graphics not reproduced.]

  • Comparison between programming technologies

    SRAM

      Advantages

        Base logic process, so it uses leading-edge processing

        Re-programmable

        100% testable

        No programmer

        No socket

      Limitations

        Largest area element, using 5 to 6 transistors plus switch = 30u per node @ 0.25u

        Switch is medium impedance: 3k ohms per square (500uA/micron)

        High capacitance: 1.6 fF per micron per node @ 0.25u

        Volatile

        Requires external memory to load

        Designs easily copied

        Dead until loaded

        Software is difficult

    EPROM

      Advantages

        Mainstream technology

        Reprogrammable

        100% testable

        Non-volatile

        Software is simple

      Limitations

        Requires high voltage: 1 generation below SRAM

        Requires programmer

        Requires socket

        High impedance: 80uA per minimum gate (12K ohm)

        Impact ionization limits voltage across the device

  • Comparison between programming technologies (continued)

    FLASH

      Advantages

        Re-programmable in the board

        No socket

        Non-volatile

        One transistor instead of 6 for routing control, i.e. denser parts

        Passes full Vcc without pump

        Live at power up

        Difficult to reverse engineer

      Limitations

        Requires high voltages

        About the same speed as SRAM

        Radiation hardness is expected to be similar to EPROM, but has not been tested yet

    Antifuse

      Advantages

        Highest density: a mere cross point, 10X the density of SRAM

        Lowest switch resistance: 25 ohms

        Very low capacitance: 1 fF per node, approaching the metal line capacitance

        Non-volatile

        Nearly impossible to reverse engineer

        Radiation hard

        Live within 1 millisecond of the power supply reaching spec voltage

        Software is easy to place and route

      Limitations

        Requires programmer

        Requires a socket, a problem for devices with > 200 pins, solved with BGA

        Those who design by test will throw out a lot of parts

        Requires one to two transistors per wire for programming: ~10 mA for metal antifuses; ONO antifuses require less, only 5 mA, so they can be programmed from the edge

        Some antifuse defects are not testable until programming, hence only 98% to 99% programming yield, but 100% functional

  • CPLD

    R.B.Ghongade

  • Key Terms

    CPLD: Complex programmable logic device. A programmable logic device consisting of several interconnected programmable blocks.

    Logic Array Block (LAB): A group of macrocells that share common resources in a CPLD.

    Programmable Interconnect Array (PIA): An internal bus with programmable connections that link together the Logic Array Blocks of a CPLD.

    Buried logic: Logic circuitry in a PLD that has no connection to the input or output pins of the PLD, but is used solely as internal logic.

    I/O Control Block: A circuit in a CPLD that controls the type of tri-state switching used in a macrocell output.

  • Key Terms

    Parallel logic expanders: Product terms that are borrowed from neighbouring macrocells in the same LAB.

    Shared logic expanders: Product terms that are inverted and fed back into the programmable AND matrix of an LAB for use by any other macrocell in the LAB.

    Specifications: There are several performance specifications for complex programmable logic devices.

    Internal frequency is the speed at which CPLDs can perform operations or transfer data internally.

    The propagation delay is the time interval between the application of an input signal and the occurrence of the corresponding output in a logic circuit.

    Speed grade indicates the delay in nanoseconds (ns) through a macrocell in the CPLD. For example, a CPLD with a speed grade of 10 has a delay of 10 ns through a macrocell. CPLDs with low speed-grade numbers run faster than devices with high speed-grade numbers.

  • CPLD

    The term complex PLD (CPLD) is generally taken to refer to a class of devices that contain a number of simple PLA or PAL functions (generically referred to as simple PLDs, or SPLDs) that share a common programmable interconnection matrix.

    Thus CPLDs consist of multiple SPLD-like blocks on a single chip.

    However, CPLD products are much more sophisticated than SPLDs, even at the level of their basic SPLD-like blocks.

    While each manufacturer has a different variation, in general they are all similar in that they consist of function blocks, input/output blocks, and an interconnect matrix.

    The devices are programmed using programmable elements that, depending on the technology of the manufacturer, can be:

    EPROM cells

    EEPROM cells

    Flash EPROM cells

  • Generic building blocks

    PLD blocks (also called Function Blocks)

    Interconnection matrix

    I/O blocks

  • Altera MAX7000S Complex PLD

  • Some tricks!

    Using XOR gate as programmable NOT gate

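    [Figure: the output of a logic circuit feeds an XOR gate whose other input is a programmable bit. Two cases are shown: one passes the value unchanged, the other inverts it.]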
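
    The XOR trick follows from the identities x XOR 0 = x and x XOR 1 = NOT x. A minimal sketch, with the hypothetical function name `programmable_not`:

```python
def programmable_not(signal, program_bit):
    """XOR the logic-circuit output with a programmable bit.

    program_bit = 0 passes the signal through unchanged;
    program_bit = 1 inverts it.
    """
    return signal ^ program_bit

for bit in (0, 1):
    for signal in (0, 1):
        print("bit", bit, "in", signal, "out", programmable_not(signal, bit))
```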

  • Some tricks!

    Using MUX as programmable switch

    [Figure: a 4:1 MUX whose select inputs are driven by programmable cells.]
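
    A short sketch of the same idea, assuming the two select inputs of the 4:1 MUX come from two stored configuration bits. The names `mux4`, `programmable_cells` and `wires` are illustrative.

```python
def mux4(inputs, s1, s0):
    """4:1 multiplexer: the two select bits choose one of four inputs."""
    return inputs[(s1 << 1) | s0]

# The select bits come from programmable cells rather than live signals,
# so the chosen route stays fixed until the device is reprogrammed.
programmable_cells = (1, 0)   # hypothetical stored configuration: select input 2
wires = [0, 1, 1, 0]          # example values on the four candidate wires
routed = mux4(wires, *programmable_cells)
print(routed)
```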

  • Packages

    PQFP: Plastic Quad Flat Package

    PLCC: Plastic Leaded Chip Carrier

    TQFP: Thin Quad Flat Pack

  • PGA: Pin Grid Array

  • Device number

    EPM7128SLC84

    EPM7: MAX 7000 family

    128: number of macrocells

    S: in-system programmable

    LC84: 84-pin PLCC package
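
    The field breakdown above can be expressed as a small parser. This is only a sketch that covers the fields shown on the slide; the regular expression and the function name `decode` are assumptions and do not represent the vendor's full ordering code.

```python
import re

# Fields as shown on the slide: EPM7 | macrocells | optional S | LC + pin count
PART_RE = re.compile(r"^EPM7(?P<macrocells>\d+)(?P<isp>S?)LC(?P<pins>\d+)$")

def decode(part):
    """Split a part number such as EPM7128SLC84 into its fields."""
    m = PART_RE.match(part)
    if m is None:
        raise ValueError(f"unrecognised part number: {part}")
    return {
        "family": "MAX 7000",
        "macrocells": int(m.group("macrocells")),
        "in_system_programmable": m.group("isp") == "S",
        "package": f"{m.group('pins')}-pin PLCC",
    }

print(decode("EPM7128SLC84"))
```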

  • MAX 7000 family

    Features

    Advanced CMOS technology

    EEPROM-based

    provides 600 to 5,000 usable gates

    In System Programmable

    pin-to-pin delays as low as 5 ns

    counter speeds of up to 175.4 MHz

  • Architecture

    The MAX 7000 architecture includes the following elements:

    Logic array blocks (LAB)

    Macrocells

    Expander product terms (shareable and parallel)

    Programmable interconnect array

    I/O control blocks

  • CLOCK & RESET pins

    The MAX7000S family has four pins that can be configured as control signals or inputs.

    GCLK1 is a global clock that is common to all macrocells in the device and can be used to synchronously clock all registers.

    OE1 is an output enable that can globally activate or disable the tristate outputs of the device macrocells.

    GCLRn is an active-LOW global clear function.

    The fourth control pin can be configured as an input, as can the other three pins, or as a second global clock (GCLK2) or output enable (OE2).

    If the control functions are not used, these pins add four inputs to the available total.

  • Architecture

    [Figure: architecture diagram. Labels: Global Clock, Active-LOW Global Clear.]

  • Logic Array Block

    LABs consist of 16-macrocell arrays.

    Multiple LABs are linked together via the programmable interconnect array (PIA), a global bus that is fed by all dedicated inputs, I/O pins, and macrocells.

    Each LAB in a MAX7000S device has from 6 to 16 I/O pins

    For EPM7128SLC84 there are only 60 I/Os available

  • Macrocell

  • Macrocell

    The macrocell is similar to that of a GAL or Universal PAL in that it provides a sum-of-products function with active-HIGH or active-LOW options and the choice of registered or combinational output.

    Registered outputs can be clocked with one of two global clocks or by a product term from the AND matrix.

    The register can be cleared globally or by a product term and preset with a product term.

    The macrocell has five dedicated product terms, which is fewer than found in the PAL and GAL.

    This is generally sufficient to implement most logic functions. If more terms are required, they can be supplied by a set of shared logic expanders or parallel logic expanders.

  • Shareable Expanders

  • Shareable Expanders

    Shared logic expanders do not add more product terms to a given macrocell.

    They do make the programming of the entire LAB more efficient by allowing a product term to be programmed once and used in several macrocells of the same LAB.

    One product term per macrocell is inverted and fed back into the shared expander pool of product terms. Since there are 16 macrocells per LAB, the shared logic expander pool has up to 16 product terms

  • Parallel Expanders

  • Parallel Expanders

    Parallel logic expanders allow a macrocell to borrow up to 15 product terms from its three lower-numbered neighbours (5 product terms per neighboring macrocell). For example, macrocell 4 can borrow up to 5 terms each from macrocells 3, 2, and 1.

    By using its 5 dedicated product terms and the maximum number of parallel expanders, a macrocell can have up to 20 product terms at its disposal. These borrowed terms are not usable by the macrocell from which they were borrowed.

    The parallel expanders are set up so that a lower-number cell lends product terms to a higher-number cell, so the number of available terms depends on how close to the end of a chain a macrocell is.
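
    The borrowing rule can be turned into a small calculation. The sketch assumes macrocells are numbered from 1 along a single chain, so that macrocell k has k - 1 lower-numbered neighbours; the slides do not state how chains are laid out within a LAB, and the constant and function names are hypothetical.

```python
DEDICATED_TERMS = 5       # product terms owned by each macrocell
TERMS_PER_NEIGHBOUR = 5   # terms that can be borrowed from one neighbour
MAX_NEIGHBOURS = 3        # only three lower-numbered neighbours can lend

def max_product_terms(macrocell):
    """Upper bound on product terms for a macrocell numbered from 1.

    Assumption: a single chain starting at macrocell 1.
    """
    lenders = min(MAX_NEIGHBOURS, macrocell - 1)
    return DEDICATED_TERMS + TERMS_PER_NEIGHBOUR * lenders

for k in (1, 2, 3, 4, 16):
    print("macrocell", k, "->", max_product_terms(k), "terms")
```

    Under this assumption the first macrocell in a chain has only its 5 dedicated terms, while macrocell 4 and later ones reach the full 20.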

  • Programmable Interconnect Array

  • PIA

    Logic is routed between LABs via the programmable interconnect array (PIA).

    This global bus is a programmable path that connects any signal source to any destination on the device.

    All MAX 7000 dedicated inputs, I/O pins, and macrocell outputs feed the PIA, which makes the signals available throughout the entire device.

    Only the signals required by each LAB are actually routed from the PIA into the LAB.

    An EEPROM cell controls one input to a 2-input AND gate, which selects a PIA signal to drive into the LAB.

    While the routing delays of channel-based routing schemes in masked or field-programmable gate arrays (FPGAs) are cumulative, variable, and path-dependent, the MAX 7000 PIA has a fixed delay.

    The PIA thus eliminates skew between signals and makes timing performance easy to predict.
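
    The selection mechanism, in which an EEPROM cell gates each PIA signal through a 2-input AND gate, can be sketched as below. The function name `pia_select` and the example values are hypothetical.

```python
def pia_select(pia_signals, eeprom_bits):
    """AND each PIA signal with the EEPROM bit that enables it."""
    return [sig & bit for sig, bit in zip(pia_signals, eeprom_bits)]

# Hypothetical example: four PIA signals, of which only the second and
# fourth are enabled for this LAB.
pia_signals = [1, 1, 0, 1]
eeprom_bits = [0, 1, 0, 1]
to_lab = pia_select(pia_signals, eeprom_bits)
print(to_lab)
```

    Every signal passes through the same single gate on its way into the LAB, which is consistent with the fixed delay described above.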

  • I/O Block

  • I/O Block

    The I/O control block allows each I/O pin to be individually configured for input, output, or bidirectional operation.

    All I/O pins have a tri-state buffer that is individually controlled by one of the global output enable signals or directly connected to ground or VCC.

    The I/O control block of EPM7032, EPM7064, and EPM7096 devices has two global output enable signals that are driven by two dedicated active-low output enable pins (OE1 and OE2).

    The I/O control block of MAX 7000E and MAX 7000S devices has six global output enable signals that are driven by the true or complement of two output enable signals, a subset of the I/O pins, or a subset of the I/O macrocells

  • I/O Control

  • I/O Block

    When the tri-state buffer control is connected to ground, the output is tri-stated (high impedance) and the I/O pin can be used as a dedicated input.

    When the tri-state buffer control is connected to VCC, the output is enabled.

    The MAX 7000 architecture provides dual I/O feedback, in which macrocell and pin feedbacks are independent.

    When an I/O pin is configured as an input, the associated macrocell can be used for buried logic

  • Output Configuration

    MAX 7000 device outputs can be programmed to meet a variety of system-level requirements.

    MultiVolt I/O Interface

    MAX 7000 devices, except 44-pin devices, support the MultiVolt I/O interface feature, which allows MAX 7000 devices to interface with systems that have differing supply voltages.

    The 5.0-V devices in all packages can be set for 3.3-V or 5.0-V I/O pin operation.

    These devices have one set of VCC pins for internal operation and input buffers (VCCINT), and another set for I/O output drivers (VCCIO).

  • Output Configuration

    Open-Drain Output Option (MAX 7000S Devices Only)

    This open-drain output enables the device to provide system-level control signals (e.g., interrupt and write enable signals) that can be asserted by any of several devices.

    It can also provide an additional wired-OR plane

  • Output Configuration

    Slew-Rate Control

    The output buffer for each MAX 7000E and MAX 7000S I/O pin has an adjustable output slew rate that can be configured for low-noise or high-speed performance.

    A faster slew rate provides high-speed transitions for high-performance systems

    However, these fast transitions may introduce noise transients into the system.

    A slow slew rate reduces system noise, but adds a nominal delay of 4 to 5 ns.

  • Xilinx XC95XX/XC95XXX Complex PLD

    PLD-like blocks are called FUNCTION BLOCKS

  • Available packages Xilinx CPLD

    Package       XC9536  XC9572  XC95108  XC95144  XC95216  XC95288
    44-Pin VQFP   34      -       -        -        -        -
    44-Pin PLCC   34      34      -        -        -        -
    48-Pin CSP    34      -       -        -        -        -
    84-Pin PLCC   -       69      69       -        -        -
    100-Pin TQFP  -       72      81       81       -        -
    100-Pin PQFP  -       72      81       81       -        -
    160-Pin PQFP  -       -       108      133      133      -
    208-Pin HQFP  -       -       -        -        166      168
    352-Pin BGA   -       -       -        -        166      192

    [Entries are I/O pin counts; "-" means the device is not offered in that package.]

  • More packages

    VQFP: Very Fine Pitch Quad Flat Pack/ Very Thin Quad Flat Package

  • CSP: Chip Scale Package

  • HQFP: Heat-sinked Quad Flat Pack

  • BGA: Ball Grid Array

  • Device marking

  • Features

    High-performance: 5 ns pin-to-pin logic delays on all pins, fCNT to 125 MHz

    Large density range: 36 to 288 macrocells with 800 to 6,400 usable gates

    5V in-system programmable: Endurance of 10,000 program/erase cycles

    Enhanced pin-locking architecture

    Flexible 36V18 Function Block: 90 product terms drive any or all of 18 macrocells within the Function Block; global and product-term clocks, output enables, set and reset signals

    Extensive IEEE Std 1149.1 boundary-scan (JTAG) support

    Slew rate control on individual outputs

    User-programmable ground pin capability

    Extended pattern security features for design protection

    High-drive 24 mA outputs

    3.3V or 5V I/O capability

    Advanced CMOS 5V FLASH technology

    Supports parallel programming of multiple XC9500 devices

  • XC9500 Architecture

  • CLOCK, RESET, TRI-STATE pins

    The pins labeled GCK (three), GSR (one), GTS (two or four) can be used for special purposes

    GCK: global clock

    GSR: global set/reset

    GTS: global three-state controls

  • Function Blocks

  • Function Blocks

    The AND plane still exists as shown by the crossing wires.

    The AND plane can accept inputs from the I/O blocks, other function blocks, or feedback from the same function block.

    The terms are then ORed together using a fixed number of OR gates, and terms are selected via a large multiplexer.

    The outputs of the mux can then be sent straight out of the block, or through a clocked flip-flop.

    This particular block includes additional logic such as a selectable exclusive OR and a master reset signal, in addition to being able to program the polarity at different stages

  • Function Blocks

    Each Function Block comprises 18 independent macrocells, each capable of implementing a combinatorial or registered function.

    The FB also receives global clock, output enable, and set/reset signals.

    The FB generates 18 outputs that drive the Fast CONNECT switch matrix.

    These 18 outputs and their corresponding output enable signals also drive the IOB.

    Logic within the FB is implemented using a sum-of-products representation.

    Thirty-six inputs provide 72 true and complement signals into the programmable AND-array to form 90 product terms.

    Any number of these product terms, up to the 90 available, can be allocated to each macrocell by the product term allocator.
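
    The numbers quoted for one Function Block can be checked with a few lines of Python. The helper `check_allocation` is a hypothetical illustration; it assumes each of the 90 product terms is allocated to at most one macrocell, which the slides do not spell out.

```python
INPUTS = 36          # inputs entering one Function Block
MACROCELLS = 18      # macrocells per Function Block
PRODUCT_TERMS = 90   # product terms available in the AND-array

# Each input is available in true and complement form
and_array_signals = 2 * INPUTS

def check_allocation(terms_per_macrocell):
    """Validate a hypothetical product-term allocation for one FB.

    Assumption: a product term is given to at most one macrocell.
    """
    if len(terms_per_macrocell) != MACROCELLS:
        raise ValueError("expected one entry per macrocell")
    if sum(terms_per_macrocell) > PRODUCT_TERMS:
        raise ValueError("more product terms requested than the FB provides")
    return True

print(and_array_signals)                     # 72 signals into the AND-array
print(check_allocation([5] * MACROCELLS))    # an even split uses all 90 terms
```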

  • XC9500 macrocell

    Up to 5 product terms

    Programmable inversion or XOR product term

    Global clock or product-term clock

    Set control

    Reset control

    OE control

  • Macrocell Clock and Set/Reset Capability

  • Product term allocator

  • IOB

  • Switch matrix

  • Xilinx CoolRunner-II CPLD Family

    Features

    Optimized for 1.8V systems: low-power CPLD, densities from 32 to 512 macrocells

    0.18 micron CMOS CPLD: optimized architecture for effective logic synthesis, multi-voltage I/O operation (1.5V to 3.3V)

    Advanced system features:

      Fast in-system programming

      On-The-Fly Reconfiguration (OTF)

      Boundary scan test

      Multiple I/O banks on all devices

      Low-power management: external signal control

      Flexible clocking modes: clock divider (2, 4, 6, 8, 10, 12, 14, 16)

      Global signal options with macrocell control: multiple global clocks with phase selection per macrocell, multiple global output enables, global set/reset

      Abundant product term clocks, output enables and set/resets

      Efficient control term clocks, output enables and set/resets for each macrocell and shared across function blocks

      Advanced design security

      Open-drain output option for Wired-OR and LED drive

      Optional bus-hold, 3-state or weak pullup on select I/O pins

      Optional configurable grounds on unused I/Os

      Mixed I/O voltages compatible with 1.5V, 1.8V, 2.5V, and 3.3V logic levels on all parts

    Wide package availability including fine pitch: Chip Scale Package (CSP) BGA, Fine Line BGA, TQFP, PQFP, VQFP, PLCC, and QFN packages

    Guaranteed 1,000 program/erase cycles

    Guaranteed 20 year data retention

  • CoolRunner-II CPLD Architecture

  • Coolrunner-II family Function Block

  • Macrocell

  • New control signals

    Control Terms (CT) are available to be shared for key functions within the FB, and are generally used whenever the exact same logic function would be repeatedly created at multiple macrocells.

    The CT product terms are available for FB clocking (CTC), FB asynchronous set (CTS), FB asynchronous reset (CTR), and FB output enable (CTE).

  • Advanced Interconnect Matrix

    The Advanced Interconnect Matrix is a highly connected low power rapid switch.

    The AIM is directed by the software to deliver up to a set of 40 signals to each FB for the creation of logic.

    Results from all FB macrocells, as well as all pin inputs, circulate back through the AIM, where they are available for connection to all other FBs as dictated by the design software.

    The AIM minimizes both propagation delay and power as it makes attachments to the various FBs.

  • I/O blocks

    Output Banking: The output pins are grouped in large banks, which allows easy interfacing to 3.3V, 2.5V, 1.8V, and 1.5V logic in a single part. Thus these CPLDs can be widely used as voltage interface translators.

  • DataGate

  • DataGATE

    DataGATE is used for power reduction.

    Each I/O pin has a series switch that can block the arrival of free-running signals that are not of interest.

    Signals that serve no purpose may increase power consumption, so they can be disabled.

    Users are free to complete their design and then choose which sections participate in the DataGATE function.

    DataGATE is a logic function that drives an assertion rail threaded through the medium- and high-density CoolRunner-II CPLD parts.

    Designers can select inputs to be blocked under the control of the DataGATE function, effectively blocking controlled switching signals so they do not drive internal chip capacitances.

    Output signals that do not switch are held by the bus-hold feature.

    Any set of input pins can be chosen to participate in the DataGATE function.

  • Choice of CPLD

    When considering a CPLD for use in a design, the following issues should be taken into account:

    1. The programming technology
      EPROM, EEPROM, or Flash EPROM? This will determine the equipment needed to program the devices and whether they can be programmed only once or many times.

    2. The function block capability
      How many function blocks are there in the device?
      How many product and sum terms can be used?
      What are the minimum and maximum delays through the logic?
      What additional logic resources are there, such as XNORs, ALUs, etc.?
      What kind of register controls are available (e.g., clock enable, reset, preset, polarity control)?
      How many inputs are local to the function block and how many are global, chip-wide inputs?
      What kind of clock drivers are in the device and what is the worst-case skew of the clock signal on the chip? This will help determine the maximum frequency at which the device can run.

    3. The I/O capability
      How many I/Os are independent and usable for any function, and how many are dedicated to clock input, master reset, etc.?
      What is the output drive capability in terms of voltage levels and current?
      What kind of logic is included in an I/O block that can be used to increase the functionality of the design?

  • FPGA

    R.B.Ghongade

  • Key terms

    Look-up table (LUT): A circuit that implements a combinational logic function by storing a list of output values that correspond to all possible input combinations.

    CLB: Configurable Logic Block is the name for the programmable logic block in an FPGA.

    Logic element (LE): A circuit internal to an FPGA used to implement a logic function as a look-up table.

    Cascade chain: A circuit in an FPGA that allows the input width of a Boolean function to expand beyond the width of one logic element.

    Carry chain: A circuit in an FPGA that is optimized for efficient operation of carry functions between logic elements.

    DCM: The Digital Clock Manager is a circuit that offers various clock management functions in an FPGA.

    Clock trees: Distribution of clock signal lines along the FPGA architecture.

  • Field Programmable Gate Arrays

    Structure much like a gate array ASIC

    Visualized as islands of programmable logic in a sea of programmable interconnect

    Closer to programmable ASICs

    Can be scaled to large sizes

    Large emphasis is laid on interconnection routing

    Timing performance is difficult to predict

  • Generic FPGA architecture

    Contains the following blocks:
      Programmable logic blocks
      I/O blocks
      Programmable interconnect

    In addition the FPGA has:
      Clock distribution circuit
      Embedded memory blocks
      Special purpose blocks:
        DSP blocks: hardware multipliers, adders and registers
        Embedded microprocessors/microcontrollers
        High-speed serial transceivers

  • FPGA architecture

    [Figure labels: Programmable logic block, Programmable interconnect]

    The FPGA is often described in terms of its fabric, which means the underlying structure of the device.

  • Programming

    FPGAs can use any one of the following programming technologies:
      SRAM
      Antifuse
      FLASH
      Hybrid FLASH-SRAM

  • FPGA fabric

    [Figure: array of programmable logic blocks surrounded by programmable interconnects]

  • Types of architectures

    Fine grained

    Each programmable logic block can be used to implement only a very simple function. For example, it might be possible to configure the block to act as any 3-input function, such as a primitive logic gate (AND, OR, NAND, etc.) or a storage element (D-type flip-flop, D-type latch, etc.).

    Fine-grained architectures are said to be particularly efficient when executing systolic algorithms (functions that benefit from massively parallel implementations).

    Fine-grained implementations require a relatively large number of connections into and out of each block compared to the amount of functionality that can be supported by those blocks.

  • Types of architectures

    Coarse grained

    In the case of a coarse-grained architecture, each logic block contains a relatively large amount of logic compared to its fine-grained counterparts. For example, a logic block might contain four 4-input LUTs, four multiplexers, four D-type flip-flops, and some fast carry logic.

    As the granularity of the blocks increases to medium-grained and higher, the number of connections into the blocks decreases compared to the amount of functionality they can support.

  • Logic realization techniques

    There are two fundamental methods employed by vendors for the programmable logic blocks used to form the medium-grained architectures referenced in the previous section:
      MUX (multiplexer) based
      LUT (lookup table) based

  • MUX-based

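    This is based on Shannon's decomposition theorem, which states that:

    Let f(a) be a switching function on n variables a = (a_1, a_2, ..., a_n). Then f(a) can be factored about any variable a_i as

    $$f(a) = \bar{a}_i \cdot f_1 + a_i \cdot f_2$$

    or, taking i = 1,

    $$f(a_1, a_2, \ldots, a_n) = \bar{a}_1 \cdot f(0, a_2, \ldots, a_n) + a_1 \cdot f(1, a_2, \ldots, a_n)$$

    where f_1 and f_2 are f evaluated with a_i = 0 and a_i = 1 respectively.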
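    The decomposition can be checked exhaustively in software. The following is a minimal Python sketch, not taken from the slides: the helper names `cofactor`, `shannon` and `equivalent`, and the `majority` test function, are illustrative assumptions.

```python
from itertools import product

def cofactor(f, i, value):
    """Return f with its i-th argument fixed to `value` (0 or 1)."""
    def g(*args):
        fixed = list(args)
        fixed[i] = value
        return f(*fixed)
    return g

def shannon(f, i):
    """Rebuild f as (NOT a_i AND f0) OR (a_i AND f1), expanding about a_i."""
    f0 = cofactor(f, i, 0)  # f with a_i = 0
    f1 = cofactor(f, i, 1)  # f with a_i = 1
    def g(*args):
        a_i = args[i]
        return ((1 - a_i) & f0(*args)) | (a_i & f1(*args))
    return g

def equivalent(f, g, n):
    """Compare two n-input functions over all 2**n input combinations."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=n))

# Hypothetical test function: 3-input majority.
def majority(a, b, c):
    return (a & b) | (b & c) | (a & c)

# The expansion should hold whichever variable is chosen.
ok = all(equivalent(majority, shannon(majority, i), 3) for i in range(3))
```

    Because the check enumerates every input combination, it is only practical for small n, which is the range relevant to a single logic block.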

  • Example (MUX implementation)

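    Consider a 3-input function

    $$y = a \cdot b + c$$

    a b c | y
    1 1 1 | 1
    1 1 0 | 1
    1 0 1 | 1
    1 0 0 | 0
    0 1 1 | 1
    0 1 0 | 0
    0 0 1 | 1
    0 0 0 | 0

    For a = 1:

    b c | y
    1 1 | 1
    1 0 | 1
    0 1 | 1
    0 0 | 0

    $$y_2 = b + c$$

    For a = 0:

    b c | y
    1 1 | 1
    1 0 | 0
    0 1 | 1
    0 0 | 0

    $$y_1 = c$$

    Using Shannon's decomposition theorem we can write y as

    $$y = (\bar{a} \cdot c) + a \cdot (b + c)$$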

  • Example

    b c | y_2
    1 1 | 1
    1 0 | 1
    0 1 | 1
    0 0 | 0

    $$y_2 = \bar{b} \cdot y_3 + b \cdot y_4$$

    $$y_2 = \bar{b} \cdot c + b \cdot 1$$
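    The two decomposition steps of this example map directly onto two 2:1 multiplexers. The sketch below models that in Python as an illustration only; the function names `mux2`, `y_reference` and `y_mux` are assumptions, not names from the slides.

```python
from itertools import product

def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d0 when sel is 0 and d1 when sel is 1."""
    return d1 if sel else d0

def y_reference(a, b, c):
    """The example function y = a.b + c, written directly."""
    return (a & b) | c

def y_mux(a, b, c):
    """The same function built only from 2:1 multiplexers."""
    y1 = c               # sub-function for a = 0
    y2 = mux2(b, c, 1)   # sub-function for a = 1: b'.c + b.1 = b + c
    return mux2(a, y1, y2)

# Compare both versions on all eight input combinations.
matches = all(
    y_mux(a, b, c) == y_reference(a, b, c)
    for a, b, c in product((0, 1), repeat=3)
)
```

    The variable a selects between the two sub-functions, and b in turn selects between c and the constant 1, so no gates other than multiplexers are needed.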

  • MUX implementation

  • Another possible implementation

  • LUT-based

    The key property of an n-input LUT is that it can implement any possible n-input combinational function.

    The underlying concept behind a LUT is relatively simple.

    A group of input signals is used as an index (pointer) to a lookup table.

    The contents of this table are arranged such that the cell pointed to by each input combination contains the desired output value.
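    The indexing idea can be sketched in a few lines of Python. This is an illustrative model only: `make_lut` and `lut_read` are hypothetical helper names, and treating input 0 as the least significant index bit is an arbitrary assumption.

```python
def make_lut(func, n):
    """Precompute the 2**n stored bits of an n-input LUT for `func`."""
    table = []
    for index in range(2 ** n):
        # Assumption: input k is bit k of the table index (input 0 = LSB).
        bits = [(index >> k) & 1 for k in range(n)]
        table.append(func(*bits) & 1)
    return table

def lut_read(table, *inputs):
    """Use the input signals as an index (pointer) into the table."""
    index = sum(bit << k for k, bit in enumerate(inputs))
    return table[index]

# Example: a 3-input XOR stored as 8 table cells.
xor3 = make_lut(lambda a, b, c: a ^ b ^ c, 3)
```

    Changing the implemented function only means loading different table contents; the read path (`lut_read`) stays the same, which is what makes the block programmable.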

  • LUT implementation

    Using pass transistors

    Using transmission gates

  • # of LUTs?

    Statistical studies have concluded that a 4-input LUT is the best choice for FPGA devices.

    One additional advantage of a LUT-based programmable block is that the SRAM cells forming the LUT can be used as a small block of RAM (the 16 cells forming a 4-input LUT, for example, could be used as a 16 x 1 RAM). This is referred to as distributed RAM.

    Also, all the SRAM cells are effectively connected in a chain in order to facilitate programming. This offers the further possibility of using the chain as a shift register.

    Because of all these advantages, the majority of today's FPGA architectures are LUT based.
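    The two alternative uses of the LUT cells can be modelled as follows. This is a behavioural Python sketch under stated assumptions: the class name `Lut4Cells` and its method names are hypothetical, and real devices add write-enable and clocking details that are omitted here.

```python
class Lut4Cells:
    """The 16 storage cells of a 4-input LUT, viewed as RAM or as a shift register."""

    def __init__(self):
        self.cells = [0] * 16

    def ram_write(self, addr, bit):
        """Distributed RAM use: store one bit at a 4-bit address."""
        self.cells[addr & 0xF] = bit & 1

    def ram_read(self, addr):
        """Distributed RAM use: read the bit at a 4-bit address."""
        return self.cells[addr & 0xF]

    def shift_in(self, bit):
        """Shift register use: push a bit into cell 0, return the bit leaving cell 15."""
        out = self.cells[-1]
        self.cells = [bit & 1] + self.cells[:-1]
        return out

# 16 x 1 RAM behaviour
ram = Lut4Cells()
ram.ram_write(5, 1)

# 16-bit shift register behaviour: a bit reappears 16 shifts after it entered
sr = Lut4Cells()
pattern = [1, 0, 1, 1] + [0] * 12
first_outputs = [sr.shift_in(b) for b in pattern]
delayed = [sr.shift_in(0) for _ in range(4)]
```

    In this model the bits collected in `delayed` are the first four bits of `pattern`, delayed by 16 shifts.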

  • Major FPGA Vendors

    SRAM-based FPGAs:
      Xilinx, Inc.
      Altera Corp.
      Atmel
      Lattice Semiconductor

    Flash & antifuse FPGAs:
      Actel Corp.
      Quick Logic Corp.

  • Xilinx FPGA Devices

    Old families:
      XC3000, XC4000, XC5200
      Old 0.5 µm, 0.35 µm and 0.25 µm technology (not recommended for modern designs)

    Low Cost Family:
      Spartan/XL, derived from XC4000
      Spartan-II, derived from Virtex
      Spartan-IIE, derived from Virtex-E
      Spartan-3

    High-performance families:
      Virtex (0.22 µm)
      Virtex-E, Virtex-EM (0.18 µm)
      Virtex-II, Virtex-II PRO (0.13 µm)
      Virtex-4 (0.09 µm)
      Virtex-5 (0.065 µm)

    Virtex-5 flavours:
      LX: Logic
      LXT: Logic/Serial
      SXT: DSP/Serial
      FXT: Embedded/Serial

  • Xilinx devices

    [Figure: Xilinx device complexity over time, 1985 to 2006]

    Device        | Speed   | Capacity
    XC2000        | 50 MHz  | 1K gates
    XC3000        | 85 MHz  | 7.5K gates
    XC4000        | 100 MHz | 250K gates
    XC5200        | 50 MHz  | 23K gates
    Spartan       | 80 MHz  | 40K gates
    Spartan-II    | 200 MHz | 200K gates
    Spartan-3     | 326 MHz | 5M gates
    Virtex        | 200 MHz | 1M gates
    Virtex-E      | 240 MHz | 4M gates
    Virtex-II     | 450 MHz | 8M gates
    Virtex-II Pro | 450 MHz | 8M gates*
    Virtex-4      | 500 MHz | 16M gates*
    Virtex-5      | 550 MHz | 24M gates*

    Time axis: 1985, 1987, 1991, 1995, 1998, 1999, 2000, 2002, 2003, 2004, 2006

    Process technology axis: 0.35 µm, 0.3 µm, 0.25 µm, 0.22 µm, 0.18 µm, 0.13 µm, 0.13 µm, 90 nm, 65 nm

  • Xilinx FPGA devices

    All Xilinx FPGAs contain the same basic resources:

    Logic cells (LCs), grouped into slices, which are grouped into Configurable Logic Blocks (CLBs): contain combinatorial logic and register resources

    I/O Blocks: interface between the FPGA and the outside world

    Programmable interconnect

    Other resources: memory, multipliers, global clock buffers, boundary scan logic

  • Xilinx logic cell (LC)

    The core building block in a modern FPGA from Xilinx is called a logic cell.

    [Figure: logic cell containing a 4-input LUT (also usable as a 16 x 1 RAM or a 16-bit shift register), a MUX and a flip-flop; signals a, b, c, d, e, clock, clock enable and set/reset; outputs y and q]

  • Logic Cell

    The register can be configured to act as a flip-flop, or as a latch.

    The polarity of the clock (rising-edge triggered or falling-edge triggered) can be configured, as can the polarity of the clock enable and set/reset signals (active-high or active-low).

    In addition to the LUT, MUX, and register, the LC also contains other elements, including some special fast carry logic for use in arithmetic operations.

  • The Slice

    A slice contains two LCs.

    Each logic cell's LUT, MUX, and register have their own data inputs and outputs; the slice has one set of clock, clock enable, and set/reset signals common to both logic cells.

  • Configurable Logic Block (CLB)

    Xilinx FPGAs can have two or four slices in each CLB.

    There is also some fast programmable interconnect within the CLB. This interconnect is used to connect neighboring slices.

  • Why the hierarchy?

    The reason for having this type of logic-block hierarchy (LC, then slice with two LCs, then CLB with four slices) is that it is complemented by an equivalent hierarchy in the interconnect.

    Thus, there is fast interconnect between the LCs in a slice, then slightly slower interconnect between slices in a CLB, followed by the interconnect between CLBs.

    The aim is to achieve the optimum trade-off between making it easy to connect things together and avoiding excessive interconnect-related delays.

  • Fast carry chains

    A key feature of modern FPGAs is that they include the special logic and interconnect required to implement fast carry chains.

    Each LC contains special carry logic.

    This is complemented by dedicated interconnect between the two LCs in each slice, between the slices in each CLB, and between the CLBs themselves.

    This special carry logic and dedicated routing boost the performance of logical functions such as counters and arithmetic functions such as adders.

    The availability of these fast carry chains, in conjunction with features like the shift register use of LUTs and embedded multipliers, is useful when FPGAs are used for applications like DSP.
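    To see why the carry path matters, consider a ripple-carry adder: each bit position needs the carry from the position below it, so the carry signal passes through every stage in sequence. The Python sketch below models only this behaviour; `full_adder` and `ripple_add` are illustrative names, and the model says nothing about actual FPGA timing.

```python
def full_adder(a, b, carry_in):
    """One bit position: the sum bit plus the carry passed to the next position."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out

def ripple_add(x, y, width):
    """Add two unsigned `width`-bit numbers one bit at a time, LSB first."""
    carry = 0
    result = 0
    for k in range(width):
        a = (x >> k) & 1
        b = (y >> k) & 1
        s, carry = full_adder(a, b, carry)  # carry ripples to position k + 1
        result |= s << k
    return result, carry  # final carry is the overflow bit

total, overflow = ripple_add(200, 100, 8)
```

    The loop-carried variable `carry` plays the role of the carry chain: it is the one signal that links every bit position to the next, which is why dedicated carry routing between LCs, slices and CLBs speeds up adders and counters.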

  • Embedded RAM

  • Embedded RAM

    A lot of applications require the use of memory, so FPGAs may include relatively large chunks of embedded RAM called block RAM.

    Depending on the architecture of the component, these blocks might be positioned around the periphery of the device, scattered across the face of the chip in relative isolation, or organized in columns.

    Each block of RAM can be used independently, or multiple blocks can be combined together to implement larger blocks.

    These blocks can be used for a variety of purposes, such as implementing standard single- or dual-port RAMs, first-in first-out (FIFO) functions, and state machines.
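    As an illustration of the FIFO use case, the following Python sketch builds a FIFO on top of a fixed-size memory array with separate read and write pointers. It is a behavioural model under stated assumptions, not vendor code: the class name `RamFifo`, its methods, and the depth of 4 used in the example are all hypothetical.

```python
class RamFifo:
    """First-in first-out buffer built on a fixed-size RAM array."""

    def __init__(self, depth):
        self.ram = [0] * depth   # stands in for one block of embedded RAM
        self.depth = depth
        self.wr = 0              # write pointer (address of next write)
        self.rd = 0              # read pointer (address of next read)
        self.count = 0           # number of words currently stored

    def push(self, word):
        """Write a word; returns False if the FIFO is full."""
        if self.count == self.depth:
            return False
        self.ram[self.wr] = word
        self.wr = (self.wr + 1) % self.depth  # wrap around at end of RAM
        self.count += 1
        return True

    def pop(self):
        """Read the oldest word; returns None if the FIFO is empty."""
        if self.count == 0:
            return None
        word = self.ram[self.rd]
        self.rd = (self.rd + 1) % self.depth
        self.count -= 1
        return word

fifo = RamFifo(depth=4)
accepted = [fifo.push(w) for w in (10, 20, 30, 40, 50)]  # fifth push finds it full
drained = [fifo.pop() for _ in range(5)]                 # fifth pop finds it empty
```

    Separate read and write pointers correspond naturally to a dual-port RAM, where one port is used for writing and the other for reading.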

  • Embedded multipliers, adders, MACs

    MAC

  • Embedded multipliers, adders, MACs

    Some functions, like multipliers, are inherently slow if they are implemented by connecting a large number of programmable logic blocks together.

    Since these functions are required by a lot of applications, many FPGAs incorporate special hardwired multiplier blocks.

    These are typically located in close proximity to the embedded RAM blocks because these functions are often used in conjunction with each other.

    Similarly, some FPGAs offer dedicated adder blocks.

    One operation that is very common in DSP-type applications is called a multiply-and-accumulate (MAC).

    As its name would suggest, this function multiplies two numbers together and adds the result to a running total stored in an accumulator.
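    The MAC operation can be sketched as a short loop. The Python below is a minimal model for non-negative integers; the function name `mac` and the 48-bit accumulator width are assumptions chosen for illustration, not figures from the slides.

```python
def mac(samples, coeffs, acc_bits=48):
    """Multiply pairs of numbers and add each product to a running total.

    acc_bits is an assumed accumulator width; the total wraps around
    when it no longer fits, as a fixed-width hardware register would.
    """
    mask = (1 << acc_bits) - 1
    acc = 0
    for x, c in zip(samples, coeffs):
        acc = (acc + x * c) & mask  # one multiply-and-accumulate step
    return acc

# Example: a 3-tap sum of products, as used in FIR filtering
result = mac([1, 2, 3], [4, 5, 6])
```

    Each loop iteration corresponds to one use of the multiplier followed by one use of the adder, with `acc` playing the role of the accumulator register.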

  • Embedded processor cores

    Some functions such as reading switch positions and flashing light-emitting diodes (LEDs) require low speed counters.

    Slowing the hardware down to implement this sort of function (using huge counters to generate delays, for example) is often impracticable. Thus, it's often better to implement these tasks with microprocessors.

    High-end FPGAs contain one or more embedded microprocessors, which are typically referred to as microprocessor cores.

    In this case, it often makes sense to move all of the tasks that used to be performed by the external microprocessor into the internal core.

    This provides a number of advantages: it saves the cost of having two devices, eliminates large numbers of tracks, pads, and pins on the circuit board, and makes the board smaller and lighter.

  • Types of microprocessor cores

    There are two types of microprocessor cores:

    Hard microprocessor core: Implemented as a dedicated, predefined block.

    Soft microprocessor core: It is possible to configure a group of programmable logic blocks to act as a microprocessor. These are typically called soft cores, but they may be more precisely categorized as either soft or firm depending on the way in which the microprocessor's functionality is mapped onto the logic blocks.

  • Clock trees

    All of the synchronous elements inside an FPGA (for example, the registers configured to act as flip-flops inside the programmable logic blocks) need to be driven by a clock signal.

    Such a clock signal typically originates in the outside world, comes into the FPGA via a special clock input pin, and is then routed through the device and connected to the appropriate registers.

  • Clock trees

  • Clock managers

    Some FPGA clock managers are based on phase-locked loops (PLLs), while others are based on digital delay-locked loops (DLLs).

  • Clock manager functions

    Jitter removal

  • Jitter removal

  • Skew correction

  • Digital frequency synthesis & Phase shifting

  • General-purpose I/O

  • I/O

    Each bank can be configured individually to support a particular I/O standard.

    This allows the FPGA to work with devices using multiple I/O standards.

    The FPGA can actually be used to interface between different I/O standards (and also to translate between different protocols that may be based on particular electrical standards).

  • Configurable I/O impedances

    Modern FPGA output signals with fast edge rates require termination to prevent reflections and maintain signal integrity.

    High pin count packages (especially ball grid arrays) cannot accommodate external termination resistors.

    Thus the Digitally Controlled Impedance (DCI) circuit is employed.

    DCI eliminates the need for external resistors and improves signal integrity.

    The DCI feature can be used on any IOB by selecting one of the DCI I/O standards.

    When applied to inputs, DCI provides input parallel termination.

    When applied to outputs, DCI provides controlled impedance drivers (series termination) or output parallel termination.

    DCI operates independently on each I/O bank.

  • Core versus I/O supply voltages

    Over time, the geometries of the structures on silicon chips became smaller because smaller transistors have lower costs, higher speed, and lower power consumption. However, these processes demanded lower supply voltages, which have continued to fall over the years.

    This supply (which is actually provided using large numbers of power and ground pins) is used to power the FPGA's internal logic. For this reason, it is known as the core voltage.

    However, different I/O standards may use signals with voltage levels significantly different from the core voltage, so each bank of general-purpose I/Os can have its own additional supply pins.

  • Core voltages

  • Gigabit transceivers

    The traditional way to move large amounts of data between devices is to use a bus, a collection of signals that carry similar data and perform a common function.

    Buses grew to 16 bits in width, then 32 bits, then 64 bits, and so forth.

    The problem is that this requires a lot of pins on the device and a lot of tracks connecting the devices together. Routing these tracks so that they all have the same length and impedance becomes increasingly difficult as boards grow in complexity.

    Furthermore, it becomes increasingly difficult to manage signal integrity issues (such as susceptibility to noise) when we are dealing with large numbers of bus-based tracks.

  • Today's high-end FPGAs include special hard-wired gigabit transceiver blocks.

    These blocks use one pair of differential signals (which means a pair of signals that always carry opposite logical values) to transmit (TX) data and another pair to receive (RX) data.

  • Interconnect and routing

    A programmable switch matrix (PSM) forms the heart of the interconnect in an FPGA.

    [Figure: grid of CLBs interleaved with programmable switch matrices (PSMs)]

  • The Switch

    The actual switching matrix employs a structure of six pass transistors per cross point. Connectivity is thus established by controlling these transistors.

  • Various types of connections

  • Various types of connections

    Single lines: Used to connect a CLB to another CLB that is one hop away. These wires have to go through a programmable switch, which adds delay.

    Double lines: These wires travel past two CLBs before hitting the switch, hence they provide shorter delays for longer connections.

    Long lines: Wires in long groups do not go through any programmable switch at all; instead they travel all the way across or down a row or column and are driven by three-state drivers near the CLB.

    Direct connect lines: These are CLB outputs that are directly connected to the CLBs immediately below and to the right.

    Global clock lines: These lines are optimized for use as clock inputs to the CLB, providing short delay and minimal skew.

  • FPGA

    R.B.Ghongade

  • Key terms

    Look-up table (LUT): A circuit that implements a combinational logic function by storing a list of output values that correspond to all possible input combinations.

    CLB: Configurable Logic Block is the name for programmable logic block in a FPGA.

    Logic element (LE): A circuit internal to a FPGA used to implement a logic function as a look-up table.

    Cascade chain: A circuit in a FPGA that allows the input width of a Boolean function to expand beyond the width of one logic element.

    Carry chain: A circuit in a FPGA that is optimized for efficient operation of carry functions between logic elements.

    DCM: Digital clock manager is a very important circuit that offers various clock management functions in a FPGA.

    Clock trees: Distribution of clock signal lines along the FPGA architecture.

  • Field Programmable Gate Arrays

    Structure much like a gate array ASIC Visualized as islands of programmable

    logic in a sea of programmable interconnect.

    More closer to programmable ASICs Can be scaled to large sizes Large emphasis is laid on interconnection

    routing Timing performance is difficult to predict

  • Generic FPGA architecture

    Contain the following blocks: Programmable logic block I/O blocks Programmable interconnect

    In addition the FPGA has: Clock distribution circuit Embedded memory blocks Special purpose blocks:

    DSP blocks: Hardware multipliers, adders and registers

    Embedded microprocessors/microcontrollers High-speed serial transceivers

  • FPGA architectureProgrammable logic block

    Programmable interconnect

    Many times the FPGA is described in terms of the fabricwhich means the underlying structure of the device

  • Programming

    FPGAs can use any one of the following programming technologies: SRAM

    Antifuse

    FLASH

    Hybrid FLASH-SRAM

  • FPGA fabric

    Programmable Logic Block

    Programmable Logic Block

    Programmable Logic Block

    Programmable Logic Block

    Programmable Logic Block

    Programmable Logic Block

    Programmable interconnects

  • Types of architectures

    Fine grained Each programmable logic block can be used to

    implement only a very simple function. For example, it might be possible to configure the block to act as any 3-input function, such as a primitive logic gate (AND,OR, NAND, etc.) or a storage element (D-type flip-flop, D-type latch, etc.).

    fine-grained architectures are said to be particularly efficient when executing systolic algorithms (functions that benefit from massively parallel implementations).

    Fine-grained implementations require a relatively large number of connections into and out of each block compared to the amount of functionality that can be supported by those blocks

  • Types of architectures

    Coarse grained In the case of a coarse-grained architecture, each

    logic block contains a relatively large amount of logic compared to their fine-grained counterparts. For example, a logic block might contain four 4-input LUTs, four multiplexers, four D-type flip-flops, and some fast carry logic.

    As the granularity of the blocks increases to medium-grained and higher, the amount of connections into the blocks decreases compared to the amount of functionality they can support.

  • Logic realization techniques

    There are two fundamental methods employed by vendors for the programmable logic blocks used to form the medium-grained architectures referenced in the previous section: MUX (multiplexer) based

    LUT (lookup table) based

  • MUX-based

    This is based on the Shannons decomposition theorem which states that:

    Let f(x) be a switching function on n variables. Then f(a) can be factored as

    OR

    1 2( ) i if a a f a f

    1 2 1 2 1 2( , ,..., ) (0, ,... ) (1, ,... )n n nf a a a a f a a a f a a

  • Example (MUX implementation)

    Consider a 3-input function

    y a b c

    1111

    1011

    1101

    0001

    1110

    0010

    1100

    0000

    ycba

    111

    101

    110

    000

    111

    001

    110

    000

    ycb

    2y b c

    1y c

    ( ) ( )y a c a b c Using Shannons decomposition theorem we can write y as

  • Example

    111

    101

    110

    000

    ycb

    2 3 4

    2 1

    y y y

    y b c b

  • MUX implementation

  • Another possible implementation

  • LUT-based

    An n-input LUT is that it can implement any possible n-input combinational.

    The underlying concept behind a LUT is relatively simple.

    A group of input signals is used as an index (pointer) to a lookup table.

    The contents of this table are arranged such that the cell pointed to by each input combination contains the desired value

  • LUT implementation

    Using pass transistors Using transmission gates

  • # of LUTs?

    It has been statistically concluded that a 4-input LUT is best for FPGA devices.

    One additional advantage of LUT based programmable block is that the SRAM the cells forming the LUT can be used as a small block of RAM (the 16 cells forming a 4-input LUT, for example, could be used as a 16 X 1 RAM). This is referred to as distributed RAM.

    Also all the SRAM cells are effectively connected in a chain. This is so as to facilitate the programming. But this offers a new possibility of using this chain as a shift register.

    Because of all these advantages , majority of todays FPGA architectures are LUT based

  • Major FPGA Vendors

    Lattice SemiconductorQuick Logic Corp

    Atmel

    Altera Corp.Actel Corp.

    Xilinx, Inc.

    Flash & antifuse FPGAsSRAM-based FPGAs

  • Xilinx FPGA Devices Old families

    XC3000, XC4000, XC5200 Old 0.5m, 0.35m and 0.25m technology. (Not recommended

    for modern designs) Low Cost Family

    Spartan/XL derived from XC4000 Spartan-II derived from Virtex Spartan-IIE derived from Virtex-E Spartan-3

    High-performance families Virtex (0.22m) Virtex-E, Virtex-EM (0.18m) Virtex-II, Virtex-II PRO (0.13m) Virtex-4 (0.09m) Virtex-5 (0.065m)

    FXTSXTLXTLXEmbedded/SerialDSP/SerialLogic/SerialLogic

    Virtex 5 flavours

  • Xilinx devices

    1985

    Xil

    inx

    Dev

    ice

    Com

    ple

    xity

    XC200050 MHz1K gates

    XC4000100 MHz

    250K gates

    Virtex200 MHz1M gates

    Virtex-II 450 MHz8M gates

    Spartan80 MHz

    40K gates

    Spartan-II200 MHz

    200K gates

    Spartan-3326 MHz5M gates

    19911987

    XC300085 MHz

    7.5K gates

    Virtex-E240 MHz4M gates

    XC520050 MHz

    23K gates

    1995 1998 1999 2000 2002 2003

    Virtex-II Pro450 MHz8M gates*

    2004 2006

    Virtex-4500 MHz

    16M gates*

    Virtex-5550 MHz

    24M gates*

    0.35m

    0.3m

    0.25m

    0.22m

    0.18m

    0.13m

    0.13m 90nm 65nm

  • Xilinx FPGA devices

    All Xilinx FPGAs contain the same basic resources: Logic cells (LCs) grouped into Slices which are grouped into

    Configurable Logic Blocks (CLBs) Contain combinatorial logic and register resources I/O Blocks Interface between the FPGA and the outside world Programmable interconnect Other resources Memory Multipliers Global clock buffers Boundary scan logic

  • Xilinx logic cell (LC)

    MUX

    0

    1

    FLIP-FLOP

    16-bit Shift Register

    16 X 1 RAM

    4-input LUT

    y

    q

    abcd

    e

    clock

    clock enable

    set/reset

    The core building block in a modern FPGA from Xilinx is called a logic cell

  • Logic Cell

    The register can be configured to act as a flip-flop, or as a latch.

    The polarity of the clock (rising- edge triggered or falling-edge triggered) can be configured, as can the polarity of the clock enable and set/reset signals (active-high or active-low).

    In addition to the LUT, MUX, and register, the LC also contains other elements, including some special fast carry logic for use in arithmetic operations.

  • The Slice

    A slice contains two LCs Each logic cells LUT, MUX, and

    register have their own data inputs and outputs; the slice has one set of clock, clock enable, and set/reset signals common to both logic cells.

  • Configurable Logic Block (CLB)

    Xilinx FPGAs can have two or four slices in each CLB

    There is also some fast programmable interconnect within the CLB. This interconnect is used to connect neighboring slices.

  • Why the hierarchy?

    The reason for having this type of logic-block hierarchyLC Slice (with two LCs) CLB (with four slices)is that it is complemented by an equivalent hierarchy in the interconnect.

    Thus, there is fast interconnect between the LCsin a slice, then slightly slower interconnect between slices in a CLB, followed by the interconnect between CLBs.

    This is to achieve the optimum trade-off between making it easy to connect things together without incurring excessive interconnect-related delays.

  • Fast carry chains

    A key feature of modern FPGAs is that they include the special logic and interconnect required to implement fast carry chains.

    Each LC contains special carry logic. This is complemented by dedicated interconnect

    between the two LCs in each slice, between the slices in each CLB, and between the CLBs themselves.

    This special carry logic and dedicated routing boosts the performance of logical functions such as counters and arithmetic functions such as adders.

    The availability of these fast carry chainsin conjunction with features like the shift register use of LUTs and embedded multipliers are useful when the FPGAs are to be used for applications like DSP

  • Embedded RAM

  • Embedded RAM

    A lot of applications require the use of memory, so FPGAs may include relatively large chunks of embedded RAM called block RAM.

    Depending on the architecture of the component, these blocks might be positioned around the periphery of the device, scattered across the face of the chip in relative isolation, or organized in columns.

    Each block of RAM can be used independently, or multiple blocks can be combined together to implement larger blocks.

    These blocks can be used for a variety of purposes, such as implementing standard single- or dual-port RAMs, first-in first-out (FIFO) functions and state machines

  • Embedded multipliers, adders, MACs

    MAC

  • Embedded multipliers, adders, MACs

    Some functions, like multipliers, are inherently slow if they are implemented by connecting a large number of programmable logic blocks together.

    Since these functions are required by a lot of applications, many FPGAs incorporate special hardwired multiplier blocks.

    These are typically located in close proximity to the embedded RAM blocks because these functions are often used in conjunction with each other

    Similarly, some FPGAs offer dedicated adder blocks. One operation that is very common in DSP-type

    applications is called a multiply-and-accumulate (MAC). As its name would suggest, this function multiplies two

    numbers together and adds the result to a running total stored in an accumulator

  • Embedded processor cores

    Some functions such as reading switch positions and flashing light-emitting diodes (LEDs) require low speed counters.

    Slowing the hardware down to implement this sort of function (using huge counters to generate delays, for example) is often impracticable. Thus, its often better to implement these tasks with microprocessors.

    High-end FPGAs contain one or more embedded microprocessors, which are typically referred to as microprocessor cores.

    In this case, it often makes sense to move all of the tasks that used to be performed by the external microprocessor into the internalcore.

    This provides a number of advantages, saves the cost of having two devices; eliminates large numbers of tracks, pads, and pins on the circuit board makes the board smaller and lighter

  • Types of microprocessor cores

    There are two types of microprocessor cores:

    Hard microprocessor core: Implemented as a dedicated, predefined block.

    Soft microprocessor core: It is possible to configure a group of programmable logic blocks to act as a microprocessor. These are typically called soft cores, but they may be more precisely categorized as either soft or firm depending on the way in which the microprocessor's functionality is mapped onto the logic blocks.

  • Clock trees

    All of the synchronous elements inside an FPGA (for example, the registers configured to act as flip-flops inside the programmable logic blocks) need to be driven by a clock signal.

    Such a clock signal typically originates in the outside world, comes into the FPGA via a special clock input pin, and is then routed through the device and connected to the appropriate registers.

  • Clock managers

    Some FPGA clock managers are based on phase-locked loops (PLLs), while others are based on digital delay-locked loops (DLLs).

  • Clock manager functions

    Jitter removal

  • Jitter removal

  • Skew correction

  • Digital frequency synthesis & phase shifting

  • General-purpose I/O

  • I/O

    Each bank can be configured individually to support a particular I/O standard.

    This allows the FPGA to work with devices that use different I/O standards.

    The FPGA can also be used to interface between different I/O standards (and to translate between different protocols that may be based on particular electrical standards).

  • Configurable I/O impedances

    Modern FPGA output signals with fast edge rates require termination to prevent reflections and maintain signal integrity.

    High pin count packages (especially ball grid arrays) cannot accommodate external termination resistors.

    Thus the Digitally Controlled Impedance (DCI) circuit is employed.

    DCI eliminates the need for external resistors, and improves signal integrity.

    The DCI feature can be used on any IOB by selecting one of the DCI I/O standards.

    When applied to inputs, DCI provides input parallel termination.

    When applied to outputs, DCI provides controlled impedance drivers (series termination) or output parallel termination.

    DCI operates independently on each I/O bank.

  • Core versus I/O supply voltages

    Over time, the geometries of the structures on silicon chips became smaller because smaller transistors have lower costs, higher speed, and lower power consumption. However, these processes demanded lower supply voltages, which have continued to fall over the years.

    This supply (which is actually provided using large numbers of power and ground pins) is used to power the FPGA's internal logic.

    For this reason, this is known as the core voltage.

    However, different I/O standards may use signals with voltage levels significantly different from the core voltage, so each bank of general-purpose I/Os can have its own additional supply pins.

  • Core voltages

  • Gigabit transceivers

    The traditional way to move large amounts of data between devices is to use a bus, a collection of signals that carry similar data and perform a common function

    Buses grew to 16 bits in width, then 32 bits, then 64 bits, and so forth.

    The problem is that this requires a lot of pins on the device and a lot of tracks connecting the devices together. Routing these tracks so that they all have the same length and impedance becomes increasingly difficult as boards grow in complexity.

    Furthermore, it becomes increasingly difficult to manage signal integrity issues (such as susceptibility to noise) when we are dealing with large numbers of bus-based tracks.

  • Today's high-end FPGAs include special hard-wired gigabit transceiver blocks.

    These blocks use one pair of differential signals (which means a pair of signals that always carry opposite logical values) to transmit (TX) data and another pair to receive (RX) data

  • Interconnect and routing

    A programmable switch matrix (PSM) forms the heart of the interconnect in an FPGA.

    [Figure: array of CLBs interconnected through Programmable Switch Matrices (PSMs)]

  • The Switch

    The actual switching matrix employs a structure of six pass transistors per cross point. Thus connectivity can be established by controlling the transistors
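
    One way to see why six transistors suffice is to count pairings. The Python sketch below assumes (this is not stated on the slide) that four wire ends, labelled north, east, south, and west, meet at each cross point and that each pass transistor joins one pair of them. The side names and the function `connected_groups` are hypothetical.

```python
from itertools import combinations

SIDES = ("N", "E", "S", "W")  # assumed: four wire ends meet at a cross point

# One pass transistor per unordered pair of sides: C(4, 2) = 6.
PASS_TRANSISTORS = list(combinations(SIDES, 2))


def connected_groups(enabled):
    """Group the four wire ends into electrically connected sets.

    `enabled` holds the (side, side) pairs whose pass transistor is on.
    """
    groups = [{side} for side in SIDES]
    for a, b in enabled:
        group_a = next(g for g in groups if a in g)
        group_b = next(g for g in groups if b in g)
        if group_a is not group_b:
            group_a |= group_b  # merge the two nets
            groups = [g for g in groups if g is not group_b]
    return sorted(sorted(g) for g in groups)


print(len(PASS_TRANSISTORS))           # 6
print(connected_groups({("N", "S")}))  # straight-through vertical connection
```

    Turning on a single transistor makes a straight or corner connection; turning on several creates a fan-out between three or four wire ends.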

  • Various types of connections

    Single lines: used to connect a CLB to another CLB that is one hop away. These wires have to go through a programmable switch, which adds delay.

    Double lines: These wires travel past two CLBs before hitting the switch, hence they provide shorter delays for longer connections.

    Long lines: Wires in Long groups do not go through any programmable switch at all; instead they travel all the way across or down a row or column and are driven by three-state drivers near the CLB.

    Direct connect lines: These are CLB outputs that are directly connected to the CLBs immediately below and to the right.

    Global clock lines: These lines are optimized for use as clock inputs to the CLB, providing short delay and minimal skew.

  • FPGA II

    R.B.Ghongade

  • Spartan-II FPGA Family Features

    Second generation ASIC replacement technology
      - Densities as high as 5,292 logic cells with up to 200,000 system gates
      - Streamlined features based on Virtex architecture
      - Unlimited re-programmability
      - Very low cost
      - 0.18 micron process

    System level features
      - SelectRAM+ hierarchical memory: 16 bits/LUT distributed RAM, configurable 4K bit block RAM, fast interfaces to external RAM
      - Fully PCI compliant
      - Low-power segmented routing architecture

  • Spartan-II FPGA Family Features

    System level features (continued)
      - Full readback ability for verification/observability
      - Dedicated carry logic for high-speed arithmetic
      - Efficient multiplier support
      - Cascade chain for wide-input functions
      - Abundant registers/latches with enable, set, reset
      - Four dedicated DLLs for advanced clock control
      - Four primary low-skew global clock distribution nets
      - IEEE 1149.1 compatible boundary scan logic

    Versatile I/O and packaging
      - Low-cost packages available in all densities
      - Family footprint compatibility in common packages
      - 16 high-performance interface standards
      - Hot swap Compact PCI friendly
      - Zero hold time simplifies system timing

  • Spartan II family

    Device    Logic Cells  System Gates (Logic and RAM)  CLB Array (R x C)  Total CLBs  Maximum Available User I/O  Total Distributed RAM Bits  Total Block RAM Bits
    XC2S15    432          15,000                        8 x 12             96          86                          6,144                       16K
    XC2S30    972          30,000                        12 x 18            216         92                          13,824                      24K
    XC2S50    1,728        50,000                        16 x 24            384         176                         24,576                      32K
    XC2S100   2,700        100,000                       20 x 30            600         176                         38,400                      40K
    XC2S150   3,888        150,000                       24 x 36            864         260                         55,296                      48K
    XC2S200   5,292        200,000                       28 x 42            1,176       284                         75,264                      56K

  • Available packages

    Available User I/O According to Package Type

    Device    Maximum User I/O  VQ100/VQG100  TQ144/TQG144  CS144/CSG144  PQ208/PQG208  FG256/FGG256  FG456/FGG456
    XC2S15    86                60            86            -             -             -             -
    XC2S30    92                60            92            92            -             -             -
    XC2S50    176               -             92            -             140           176           -
    XC2S100   176               -             92            -             140           176           -
    XC2S150   260               -             -             -             140           176           260
    XC2S200   284               -             -             -             140           176           284

  • Spartan II FPGA architecture

  • Slice

  • BUFT

    Each Spartan-II CLB contains two 3-state drivers (BUFTs) that can drive on-chip busses.

    Each Spartan-II BUFT has an independent 3-state control pin and an independent input pin.

  • Block RAM

    Spartan-II FPGAs incorporate several large block RAM memories.

    These complement the distributed RAM Look-Up Tables (LUTs) that provide shallow memory structures implemented in CLBs.

    Block RAM memory blocks are organized in columns.

    All Spartan-II devices contain two such columns, one along each vertical edge.

    These columns extend the full height of the chip.

    Each memory block is four CLBs high. Consequently, a Spartan-II device eight CLBs high will contain two memory blocks per column, and a total of four blocks.
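
    The block count follows directly from the CLB array height. The Python sketch below applies the rules stated above (two columns, blocks four CLBs high) together with the 4K-bit block size from the family features; the function name `block_ram_summary` is hypothetical, and the row counts are taken from the CLB Array column of the Spartan II family table.

```python
BLOCK_RAM_BITS = 4096      # one configurable 4K-bit block
CLBS_PER_BLOCK_HEIGHT = 4  # each memory block is four CLBs high
BLOCK_RAM_COLUMNS = 2      # one column along each vertical edge


def block_ram_summary(clb_rows):
    """Return (blocks per column, total blocks, total bits) for a device."""
    blocks_per_column = clb_rows // CLBS_PER_BLOCK_HEIGHT
    total_blocks = blocks_per_column * BLOCK_RAM_COLUMNS
    return blocks_per_column, total_blocks, total_blocks * BLOCK_RAM_BITS


for device, rows in [("XC2S15", 8), ("XC2S100", 20), ("XC2S200", 28)]:
    per_column, total, bits = block_ram_summary(rows)
    print(f"{device}: {per_column} per column, {total} blocks, {bits // 1024}K bits")
```

    For these three devices the computed totals (16K, 40K, and 56K bits) agree with the Total Block RAM Bits column of the family table.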

  • Programmable Routing Matrix

    Five levels of routing hierarchy are used in the Spartan II family:

    Local

    General purpose

    I/O

    Dedicated

    Global

  • Local Routing

    Local routing resources provide the following three types of connections:

    Interconnections among the LUTs, flip-flops, and General Routing Matrix (GRM)

    Internal CLB feedback paths that provide high-speed connections to LUTs within the same CLB, chaining them together with minimal routing delay.

    Direct paths that provide high-speed connections between horizontally adjacent CLBs, eliminating the delay of the GRM

  • General Purpose Routing

    Most Spartan-II signals are routed on the general purpose routing, and consequently, the majority of interconnect resources are associated with this level of the routing hierarchy.

    The general routing resources are located in horizontal and vertical routing channels associated with the rows and columns of CLBs.

    The general-purpose routing resources are listed below.

    Adjacent to each CLB is a General Routing Matrix (GRM). The GRM is the switch matrix through which horizontal and vertical routing resources connect, and is also the means by which the CLB gains access to the general purpose routing.

    24 single-length lines route GRM signals to adjacent GRMs in each of the four directions.

    96 buffered Hex lines route GRM signals to other GRMs six blocks away in each one of the four directions. Organized in a staggered pattern, Hex lines may be driven only at their endpoints. Hex-line signals can be accessed either at the endpoints or at the midpoint (three blocks from the source). One third of the Hex lines are bidirectional, while the remaining ones are unidirectional.

    12 Long lines are buffered, bidirectional wires that distribute signals across the device quickly and efficiently. Vertical Long lines span the full height of the device, and horizontal ones span the full width of the device.
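
    To make the reach of single-length and Hex lines concrete, the following Python sketch lists which GRM positions a signal can arrive at in one hop from a given GRM. It is a simplification under stated assumptions: it ignores the staggered pattern, the split between unidirectional and bidirectional Hex lines, and Long lines, and the function name and (row, column) coordinates are hypothetical.

```python
def reachable_grms(row, col, rows, cols):
    """GRMs reachable in one hop from (row, col) over single and Hex lines."""
    # Single-length lines end at the adjacent GRM. Hex lines can be accessed
    # at the midpoint (3 blocks away) and at the far endpoint (6 blocks away).
    tap_distances = {"single": (1,), "hex": (3, 6)}
    directions = ((-1, 0), (1, 0), (0, -1), (0, 1))  # four directions
    reached = {}
    for kind, distances in tap_distances.items():
        targets = set()
        for d_row, d_col in directions:
            for distance in distances:
                r, c = row + d_row * distance, col + d_col * distance
                if 0 <= r < rows and 0 <= c < cols:  # stay inside the array
                    targets.add((r, c))
        reached[kind] = sorted(targets)
    return reached


# Corner GRM of an 8 x 12 CLB array (the XC2S15 array size).
print(reachable_grms(0, 0, 8, 12))
```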

  • I/O Routing

    Spartan-II devices have additional routing resources around their periphery that form an interface between the CLB array and the IOBs.

    This additional routing, called the VersaRing, facilitates pin-swapping and pin-locking, such that logic redesigns can adapt to existing PCB layouts.

    Time-to-market is reduced, since PCBs and other system components can be manufactured while the logic design is still in progress.

  • Dedicated Routing

    Some classes of signal require dedicated routing resources to maximize performance.

    In the Spartan-II architecture, dedicated routing resources are provided for two classes of signal.

    Horizontal routing resources are provided for on-chip 3-state busses. Four partitionable bus lines are provided per CLB row, permitting multiple busses within a row.

    Two dedicated nets per CLB propagate carry signals vertically to the adjacent CLB.

  • Global Routing

    Global Routing resources distribute clocks and other signals with very high fanout throughout the device.

    Spartan-II devices include two tiers of global routing resources referred to as primary and secondary global routing resources.

    The primary global routing resources are four dedicated global nets with dedicated input pins that are designed to distribute high-fanout clock signals with minimal skew. Each global clock net can drive all CLB, IOB, and block RAM clock pins. The primary global nets may only be driven by global buffers. There are four global buffers, one for each global net.

    The secondary global routing resources consist of 24 backbone lines, 12 across the top of the chip and 12 across the bottom. From these lines, up to 12 unique signals per column can be distributed via the 12 long lines in the column. These secondary resources are more flexible than the primary resources since they are not restricted to routing only to clock pins.

  • Spartan II clock distribution scheme

  • Input/Output Block

  • I/O Banking

  • Boundary Scan

    Spartan-II devices support all the mandatory boundary-scan instructions specified in the IEEE standard 1149.1

    A Test Access Port (TAP) and registers are provided that implement the EXTEST, SAMPLE/PRELOAD, and BYPASS instructions

  • Virtex IV family

    Contain the same basic resources
      - Slices (grouped into CLBs): contain combinatorial logic and register resources
      - IOBs: interface between the FPGA and the outside world
      - Programmable interconnect
      - Other resources: memory, multipliers, global clock buffers, boundary scan logic

  • Overview of Virtex IV

    The Virtex-4 Family is a new generation FPGA from Xilinx.

    The innovative Advanced Silicon Modular Block (ASMBL) column-based architecture is unique in the programmable logic industry.

    Virtex-4 FPGAs contain three families (platforms): LX, FX, and SX.

    A wide array of hard-IP core blocks complete the system solution. These cores include the PowerPC processors, Tri-Mode Ethernet MACs, 622 Mb/s to 10+ Gb/s serial transceivers, dedicated DSP slices, high-speed clock management circuitry, and source-synchronous interface blocks.

    The basic Virtex-4 building blocks are an enhancement of those found in the popular Virtex-based product families: Virtex, Virtex-E, Virtex-II, Virtex-II Pro, and Virtex-II Pro X, allowing upward compatibility of existing designs.

    Virtex-4 devices are produced on a 90-nm copper process, using 300 mm (12 inch) wafer technology.

  • Features of Virtex IV family

    Three families LX/SX/FX
      - Virtex-4 LX: High-performance logic applications solution
      - Virtex-4 FX: High-performance, full-featured solution for embedded platform applications
      - Virtex-4 SX: High-performance solution for Digital Signal Processing (DSP) applications

    Xesium Clock Technology
      - Digital Clock Manager (DCM) blocks
      - Additional Phase-Matched Clock Dividers (PMCD)
      - Differential Global Clocks

    XtremeDSP Slice
      - 18x18, two's complement, signed Multiplier
      - Optional pipeline stages
      - Built-In Accumulator (48-bits) & Adder/Subtracter

  • Features of Virtex IV family

    Smart RAM Memory Hierarchy
      - Distributed RAM
      - Dual-Port 18-Kbit RAM blocks: optional pipeline stages; optional programmable FIFO logic that automatically remaps RAM signals as FIFO signals
      - High-speed memory interface support: DDR and DDR-2 SDRAM, QDR-II, and RLDRAM-II

    SelectIO Technology
      - 1.5 to 3.3 V I/O Operation
      - Built-In ChipSync Source-Synchronous Technology
      - Digitally-controlled impedance (DCI) active termination
      - Fine grained I/O banking (Configuration in one bank)

    Flexible Logic Resources

    Secure Chip AES Bitstream Encryption

    90-nm copper CMOS process

    1.2V core voltage

    Flip-Chip Packaging

    RocketIO 622 Mb/s to 10+ Gb/s Multi-Gigabit Transceivers (MGT) (FX only)

    IBM PowerPC RISC Processor Core (FX only)
      - PowerPC 405 (PPC405) Core
      - Auxiliary Processor Unit Interface (User Coprocessor)

    Multiple Tri-Mode Ethernet MACs (FX only)

  • Virtex Architecture

    I/O Blocks (IOBs)

    Configurable Logic Blocks (CLBs)

    Clock Management (DCMs, BUFGMUXes)

    Block SelectRAM resource

    Dedicated multipliers

    Programmable interconnect

  • Slices and CLBs

  • Simplified Slice Structure

    [Figure: Slice 0, showing two LUTs with carry logic, each followed by a storage element with D, CE, PRE, CLR, and Q pins]

    Each slice has four outputs

    Two registered outputs, two non-registered outputs

    Two BUFTs associated with each CLB, accessible by all 16 CLB outputs

    Carry logic runs vertically, up only

    Two independent carry chains per CLB

  • Detailed Slice Structure

  • SLICEM & SLICEL

    The elements common to both slice pairs (SLICEM and SLICEL) are two logic-function generators (or look-up tables), two storage elements, wide-function multiplexers, carry logic, and arithmetic gates. These elements are used by both SLICEM and SLICEL to provide logic, arithmetic, and ROM functions.

    SLICEM supports two additional functions: storing data using distributed RAM and shifting data with 16-bit registers.

    SLICEM represents a superset of elements and connections found in all slices.

  • Logic Resources in One CLB

    Slices  LUTs  Flip-Flops  MULT_ANDs  Arithmetic & Carry-Chains  Distributed RAM  Shift Registers
    4       8     8           8          2                          64 bits          64 bits

  • MULT_AND gate

    [Figure: MULT_AND gate alongside the LUT, CY_MUX, and CY_XOR, with inputs A and B, product A x B, and carry signals CI and CO]

    A new feature introduced in the Virtex family is the MULT_AND gate for efficient multiply-and-add implementation. Earlier FPGA architectures require two LUTs per bit to perform the multiplication and addition.

    The MULT_AND gate enables an area reduction by performing the multiply and the add in one LUT per bit.
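
    The following Python sketch models, bit by bit, how a partial product can be added to a running sum with one LUT per bit. The wiring is an assumption inferred from the figure labels (MULT_AND driving the DI input of CY_MUX, the LUT driving its select, CY_XOR forming the sum); the function name `add_partial_product` is hypothetical.

```python
def add_partial_product(partial, a, b_bit, width):
    """Return partial + a * b_bit, modelling one LUT and one MULT_AND per bit."""
    total, carry = 0, 0
    for i in range(width):
        p_i = (partial >> i) & 1
        a_i = (a >> i) & 1
        and_i = a_i & b_bit               # MULT_AND gate output (assumed to feed DI)
        lut_out = p_i ^ (a_i & b_bit)     # LUT: AND and XOR in a single table
        total |= (lut_out ^ carry) << i   # CY_XOR: sum bit
        # CY_MUX: pass the incoming carry when the LUT output is 1,
        # otherwise take the MULT_AND output as the new carry.
        carry = carry if lut_out else and_i
    return total | (carry << width)       # final carry-out becomes the top bit


print(add_partial_product(0b1011, 0b0110, 1, 4))  # 11 + 6 = 17
```

    Without the dedicated AND gate, the product term needed at DI would have to come from a second LUT, which matches the two-LUTs-per-bit cost mentioned above.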

  • Look-Up Table (LUT)

    Virtex-4 function generators are implemented as 4-input look-up tables (LUTs). There are four independent inputs for each of the two function generators in a slice (F and G). The function generators are capable of implementing any arbitrarily defined four-input Boolean function. The propagation delay through a LUT is independent of the function implemented.

    Signals from the function generators can exit the slice (through the X or Y output), enter the dedicated XOR gate, enter the select line of the carry-logic multiplexer, feed the D input of the storage element, or go to the MUXF5.
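
    A 4-input LUT is simply a 16-entry truth table addressed by its inputs, which is why any four-input function fits and why the delay does not depend on the function. The Python sketch below illustrates this; the helper names and the bit ordering of the address are hypothetical choices, not taken from the device documentation.

```python
def make_lut4(func):
    """Precompute the 16-entry truth table for a 4-input Boolean function."""
    table = []
    for index in range(16):
        a1, a2, a3, a4 = ((index >> bit) & 1 for bit in range(4))
        table.append(int(bool(func(a1, a2, a3, a4))))
    return table


def read_lut4(table, a1, a2, a3, a4):
    """Evaluate the function: the four inputs form a 4-bit address."""
    return table[a1 | (a2 << 1) | (a3 << 2) | (a4 << 3)]


# Two different functions, same structure, same lookup cost.
parity = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)
and_or = make_lut4(lambda a, b, c, d: (a & b) | (c & d))

print(read_lut4(parity, 1, 0, 0, 0))  # 1
print(read_lut4(and_or, 0, 0, 1, 1))  # 1
```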

  • Connecting LUTs

    [Figure: one CLB with slices S0 to S3, each containing MUXF5 plus one wider multiplexer (MUXF6, MUXF7, or MUXF8)]

    MUXF8 combines the two MUXF7 outputs (from the CLB above or below)

    MUXF6 combines slices S2 and S3

    MUXF7 combines the two MUXF6 outputs

    MUXF6 combines slices S0 and S1

    MUXF5 combines LUTs in each slice
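
    The effect of the first multiplexer level can be sketched in Python: two 4-input LUTs share their four inputs, and a fifth input drives the MUXF5 select, giving any function of five inputs. Each further level (MUXF6, MUXF7, MUXF8) adds one more select input in the same way. Which LUT is chosen when the select is 1 is an assumption here, and all names are hypothetical.

```python
def lut4(table, address):
    """Read a 16-entry LUT at a 4-bit address."""
    return table[address & 0xF]


def muxf5(f_out, g_out, select):
    """2:1 multiplexer combining the two LUT outputs in a slice."""
    return g_out if select else f_out  # assumed polarity


def five_input_function(table32, inputs):
    """Evaluate a 5-input function stored as a 32-entry truth table.

    The low four input bits address both LUTs; bit 4 drives the MUXF5 select.
    """
    f_table, g_table = table32[:16], table32[16:]
    low, select = inputs & 0xF, (inputs >> 4) & 1
    return muxf5(lut4(f_table, low), lut4(g_table, low), select)


and5 = [0] * 31 + [1]  # 5-input AND: only the all-ones address is true
print(five_input_function(and5, 0b11111))  # 1
print(five_input_function(and5, 0b01111))  # 0
```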

  • Fast Carry Logic

    Simple, fast, and complete arithmetic Logic

    Dedicated XOR gate for single-level sum completion

    Uses dedicated routing resources

    All synthesis tools can infer carry logic

    [Figure: the two carry chains in a CLB. One chain runs through slices S0 and S1 and continues to S0 of the next CLB; the other runs through slices S2 and S3 and continues to CIN of S2 of the next CLB.]
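
    A behavioural Python sketch of an adder built on a carry chain is shown below. It assumes the usual arrangement in which the LUT computes the per-bit propagate signal, a carry multiplexer selects the carry-out, and the dedicated XOR gate forms the sum; the function name `carry_chain_add` is hypothetical.

```python
def carry_chain_add(a, b, width, carry_in=0):
    """Add two unsigned numbers bit by bit, returning (sum, carry_out)."""
    total = 0
    carry = carry_in
    for i in range(width):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        propagate = a_i ^ b_i       # computed in the LUT
        sum_i = propagate ^ carry   # dedicated XOR gate completes the sum
        # Carry multiplexer: pass the incoming carry when the bits differ,
        # otherwise the carry-out equals either input bit.
        carry = carry if propagate else a_i
        total |= sum_i << i
    return total, carry


print(carry_chain_add(5, 3, 4))   # (8, 0)
print(carry_chain_add(15, 1, 4))  # (0, 1)
```

    Because the carry only passes through one multiplexer per bit on dedicated routing, the chain is much faster than carrying through general-purpose interconnect.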

  • Flexible Sequential Elements

    [Figure: symbols for the FDCPE and FDRSE flip-flops and the LDCPE latch]

    Either flip-flops or latches

    Two in each slice; eight in each CLB

    Inputs come from LUTs or from an independent CLB input

    Separate set and reset controls

    Can be synchronous or asynchronous

    All controls are shared within a slice

    Control signals can be inverted locally within a slice

  • Shift Register LUT (SRL16CE)

    Dynamically addressable serial shift registers

    Maximum delay of 16 clock cycles per LUT (128 per CLB)

    Cascadable to other LUTs or CLBs for longer shift registers

    Dedicated connection from Q15 to D input of the next SRL16CE

    Shift register length can be changed asynchronously by toggling address A

    [Figure: SRL16CE implemented in a LUT, with inputs D, CE, CLK, and A[3:0] and outputs Q and Q15 (cascade out)]
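
    The behaviour listed above can be modelled in a few lines of Python. This is a behavioural sketch only, with a hypothetical class layout: `clock` models one rising clock edge, `q` models the addressable output, and `q15` models the cascade output.

```python
class SRL16CE:
    """Behavioural model of a 16-bit shift register with clock enable."""

    def __init__(self):
        self.bits = [0] * 16  # bits[0] holds the most recently shifted-in value

    def clock(self, d, ce=1):
        """One clock edge: shift in `d` when the clock enable is high."""
        if ce:
            self.bits = [d & 1] + self.bits[:-1]

    def q(self, address):
        """Addressable output: A[3:0] selects a delay of address + 1 clocks."""
        return self.bits[address & 0xF]

    @property
    def q15(self):
        """Cascade output, always the oldest bit."""
        return self.bits[15]


srl = SRL16CE()
srl.clock(1)           # shift in a single 1 ...
for _ in range(3):
    srl.clock(0)       # ... followed by zeros
print(srl.q(3))        # 1: the bit appears at address 3 after 4 clocks
```

    Changing the address changes the tapped position immediately, without a clock edge, which is what makes the effective length adjustable on the fly.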

  • IOB Element

    Input path

    Two DDR registers

    Output path

    Two DDR registers

    Two 3-state enable DDR registers

    Separate clocks and clock enables for I and O

    Set and reset signals are shared

    [Figure: IOB with three register pairs connected to the PAD: the 3-state path (clocks OCK1, OCK2, DDR MUX), the output path (clocks OCK1, OCK2, DDR MUX), and the input path (clocks ICK1, ICK2)]

  • SelectIO Standard

    Allows direct connections to external signals of varied voltages and thresholds
      - Optimizes the speed/noise tradeoff
      - Saves having to place interface components onto your board

    Differential signaling standards
      - LVDS, BLVDS, ULVDS
      - LDT
      - LVPECL

    Single-ended I/O standards
      - LVTTL, LVCMOS (3.3V, 2.5V, 1.8V, and 1.5V)
      - PCI-X at 133 MHz, PCI (3.3V at 33 MHz and 66 MHz)
      - GTL, GTLP

  • Digitally Controlled Impedance (DCI)

    DCI provides

    Output drivers that match the impedance of the traces

    On-chip termination for receivers and transmitters

    DCI advantages

    Improves signal integrity by eliminating stub reflections

    Reduces board routing complexity and component count by eliminating external resistors

    Eliminates the effects of temperature, voltage, and process variations by using an internal feedback circuit

  • Distributed SelectRAM Resources

    Uses a LUT in a slice as memory

    Synchronous write

    Asynchronous read

    Accompanying flip-flops can be used to create synchronous read
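
    The difference between the two read styles can be sketched in Python. The model below is an assumption-laden illustration, not a vendor primitive: the class name, the 16 x 1 organisation, and the choice that the flip-flop captures the value present before the write takes effect are all hypothetical modelling decisions.

```python
class DistributedRAM16x1:
    """Behavioural model of a LUT used as a 16 x 1 RAM."""

    def __init__(self):
        self.cells = [0] * 16
        self.read_register = 0  # accompanying flip-flop for synchronous read

    def read(self, address):
        """Asynchronous read: the output follows the address immediately."""
        return self.cells[address & 0xF]

    def clock(self, address, data=0, write_enable=0):
        """One clock edge: register the read value, then perform the write."""
        # Assumed: the flip-flop samples the LUT output as it was before
        # this edge's write updates the cell.
        self.read_register = self.cells[address & 0xF]
        if write_enable:
            self.cells[address & 0xF] = data & 1  # synchronous write


ram = DistributedRAM16x1()
ram.clock(5, data=1, write_enable=1)
print(ram.read(5))        # 1: asynchronous read sees the new value at once
print(ram.read_register)  # 0: registered read lags by one clock
```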