ASIC & FPGA

Semi Custom Integrated Circuit Design

Introduction :

The semiconductor industry has evolved from the first ICs of the early 1970s and has grown rapidly to its present state. The first small-scale integration (SSI) ICs contained a few (1 to 10) logic gates (NAND gates, NOR gates, and so on), amounting to a few tens of transistors. The era of medium-scale integration (MSI) increased the range of integrated logic available to counters and similar, larger-scale logic functions. The era of large-scale integration (LSI) packed even larger logic functions, such as the first microprocessors, into a single chip. The era of very large-scale integration (VLSI) now offers 64-bit microprocessors, complete with cache memory and floating-point arithmetic units (well over a million transistors) on a single piece of silicon. With the rapid developments in CMOS process technology, transistors continue to get smaller and ICs hold more and more transistors. Some people (especially in Japan) use the term ultra-large-scale integration (ULSI), but most people stop at the term VLSI.

The earliest ICs used bipolar technology, and the majority of logic ICs used either transistor-transistor logic (TTL) or emitter-coupled logic (ECL). Although invented before the bipolar transistor, the metal-oxide-silicon (MOS) transistor was initially difficult to manufacture because of problems with the oxide interface. As these problems were gradually solved, metal-gate n-channel MOS (nMOS or NMOS) technology developed in the 1970s. At that time MOS technology required fewer masking steps, was denser, and consumed less power than equivalent bipolar ICs. This meant that, for a given performance, a MOS IC was cheaper than a bipolar IC, which led to investment in and growth of the MOS IC market. The introduction of polysilicon as a gate material was a major improvement in CMOS technology, making it easier to make two types of transistors, n-channel MOS and p-channel MOS transistors, on the same IC: a complementary MOS (CMOS) technology. The principal advantage of CMOS over NMOS is lower power consumption. Another advantage of a polysilicon gate was a simplification of the fabrication process, allowing devices to be scaled down in size.

With the advent of VLSI in the 1980s, engineers began to realize the advantages of designing an IC that was customized or tailored to a particular system or application rather than using standard ICs alone. Microelectronic system design then becomes a matter of defining the functions that you can implement using standard ICs and then implementing the remaining logic functions (sometimes called glue logic) with one or more custom ICs. As VLSI became possible, you could build a system from a smaller number of components by combining many standard ICs into a few custom ICs. Building a microelectronic system with fewer ICs allows you to reduce cost and improve reliability.

In the early 1990s the IC industry recognized the importance of custom integrated circuits. As different types of custom ICs began to evolve for different types of applications, these new ICs gave rise to a new term: application-specific IC, or ASIC. Examples of ICs that are not ASICs include standard parts such as memory chips (ROMs, DRAM, and SRAM), microprocessors, and TTL or TTL-equivalent ICs at SSI, MSI, and LSI levels. Examples of ICs that are ASICs include a chip for a toy bear that talks; a chip for a satellite; a chip designed to handle the interface between memory and a microprocessor for a workstation CPU; and a chip containing a microprocessor as a cell together with other logic.

Types of ASICs : ASIC stands for Application-Specific Integrated Circuit (pronounced "a-sick"). Based on the design technology, ASICs are broadly classified into three types: (i) full-custom, (ii) semicustom, and (iii) programmable ASICs. The further classification of ASICs is shown below.

Full Custom IC

In a full-custom IC, all the mask layers are customized and some (possibly all) of the logic cells are customized; that is, some or all of the logic cells, circuits, or layout are designed specifically for one ASIC. This means the designer does not use pretested and precharacterized cells for all or part of the design. This approach is considered only when there are no suitable existing cell libraries that can be used for the entire design. This might be because the existing cell libraries are not fast enough, or because the logic cells are not small enough or consume too much power.

A microprocessor is an example of a full-custom IC: designers spend many hours squeezing the most out of every last square micron of microprocessor chip space by hand. Customizing all of the IC features in this way allows designers to include analog circuits, optimized memory cells, or mechanical structures on an IC, for example. Full-custom ICs are the most expensive to manufacture and to design. The manufacturing lead time (the time it takes just to make an IC, not including design time) is typically eight weeks for a full-custom IC. These specialized full-custom ICs are often intended for a specific application, so we might call some of them full-custom ASICs.

A cell-based ASIC (cell-based IC, or CBIC) uses predesigned logic cells (AND gates, OR gates, multiplexers, and flip-flops, for example) known as standard cells. We could apply the term CBIC to any IC that uses cells, but it is generally accepted that a cell-based ASIC or CBIC means a standard-cell-based ASIC. The standard-cell areas (also called flexible blocks) in a CBIC are built of rows of standard cells, like a wall built of bricks. The standard-cell areas may be used in combination with larger predesigned cells, perhaps microcontrollers or even microprocessors, known as megacells. Megacells are also called megafunctions, full-custom blocks, system-level macros (SLMs), fixed blocks, cores, or Functional Standard Blocks (FSBs).

The important features of this type of ASIC are as follows:

All mask layers are customized: transistors and interconnect.

Custom blocks can be embedded.

Manufacturing lead time is about eight weeks.


Fig. Cell based ASIC

The diagram shows a cell-based ASIC (CBIC) die with a single standard-cell area together with four fixed blocks. The flexible block contains rows of standard cells. The small squares around the edge of the die are bonding pads that are connected to the pins of the ASIC package. Each standard cell in the library is constructed using full-custom design methods. This design style provides the same performance and flexibility advantages as a full-custom ASIC but reduces design time and risk.

Gate-Array-Based ASICs :

In a gate-array-based ASIC the transistors are predefined on the silicon wafer. The predefined pattern of transistors on a gate array is the base array, and the smallest element that is replicated to make the base array is the base cell (sometimes called a primitive cell). Only the top few layers of metal, which define the interconnect between transistors, are defined by the designer using custom masks. To distinguish this type of gate array from other types of gate array, it is often called a masked gate array (MGA). The designer chooses from a gate-array library of predesigned and precharacterized logic cells. The logic cells in a gate-array library are often called macros. The reason for this is that the base-cell layout is the same for each logic cell, and only the interconnect (inside cells and between cells) is customized, so that there is a similarity between gate-array macros and a software macro.

There are three different types of MGA or gate-array-based ASICs. They are:

Channeled gate arrays.

Channelless gate arrays.

Structured gate arrays.


The names of these arrays describe their construction. There are two common ways of arranging (or arraying) the transistors on an MGA: in a channeled gate array we leave space between the rows of transistors for wiring; the routing on a channelless gate array uses rows of unused transistors. The channeled gate array was the first to be developed, but the channelless gate-array architecture is now more widely used. A structured (or embedded) gate array can be either channeled or channelless, but it includes (or embeds) a custom block.

Channeled Gate Array : The important features of this type of MGA are:

(i) Only the interconnect is customized.

(ii) The interconnect uses predefined spaces between rows of base cells.

(iii) Manufacturing lead time is between two days and two weeks.

A channeled gate array is similar to a CBIC: both use rows of cells separated by channels used for interconnect. One difference is that the space for interconnect between rows of cells is fixed in height in a channeled gate array, whereas the space between rows of cells may be adjusted in a CBIC.

Channelless Gate Array :

The channelless gate array is also known as a channel-free gate array, sea-of-gates array, or SOG array. The important features of this type of MGA are as follows:

Only some (the top few) mask layers are customized: the interconnect.

Manufacturing lead time is between two days and two weeks.

The diagram below shows a channelless gate-array or sea-of-gates (SOG) array die. The core area of the die is completely filled with an array of base cells (the base array).

The key difference between a channelless gate array and a channeled gate array is that there are no predefined areas set aside for routing between cells on a channelless gate array. Instead we route over the top of the gate-array devices. We can do this because we customize the contact layer that defines the connections between metal1, the first layer of metal, and the transistors. When we use an area of transistors for routing in a channelless array, we do not make any contacts to the devices lying below; we simply leave the transistors unused.

Structured Gate Array :

An embedded gate array or structured gate array (also known as masterslice or masterimage) combines some of the features of CBICs and MGAs. One of the disadvantages of the MGA is the fixed gate-array base cell. This makes the implementation of memory, for example, difficult and inefficient. In an embedded gate array we set aside some of the IC area and dedicate it to a specific function. This embedded area can either contain a different base cell that is more suitable for building memory cells, or it can contain a complete circuit block, such as a microcontroller.

The figure below shows an embedded gate array. The important features of this type of MGA are the following:

Only the interconnect is customized.

Custom blocks (the same for each design) can be embedded.

Manufacturing lead time is between two days and two weeks.

Programmable Logic Devices (PLDs):

Programmable logic devices (PLDs) are standard ICs that are available in standard configurations. However, PLDs may also be configured or programmed to create a part customized to a specific application, and so they are also considered a family of ASICs. PLDs use different technologies to allow programming of the device. The important features of PLDs are outlined below.

No customized mask layers or logic cells

Fast design turnaround

A single large block of programmable interconnect

A matrix of logic macro cells that usually consist of programmable array logic followed

by a flip-flop or latch.

The simplest type of programmable IC is a read-only memory (ROM). The most common types of ROM use a metal fuse that can be blown permanently (a programmable ROM or PROM). An electrically programmable ROM, or EPROM, uses programmable MOS transistors whose characteristics are altered by applying a high voltage. One can erase an EPROM either by using another high voltage (an electrically erasable PROM, or EEPROM) or by exposing the device to ultraviolet light (a UV-erasable PROM, or UVPROM). There is another type of ROM that can be placed on any ASIC: a mask-programmable ROM (mask-programmed ROM or masked ROM). A masked ROM is a regular array of transistors permanently programmed using custom mask patterns. An embedded masked ROM is thus a large, specialized logic cell.

Field-Programmable Gate Arrays :

FPGAs are the newest member of the ASIC family and are rapidly growing in importance, replacing TTL in microelectronic systems. Even though an FPGA is a type of gate array, we do not consider the term gate-array-based ASICs to include FPGAs.

The important characteristics of an FPGA are

None of the mask layers are customized.

A method for programming the basic logic cells and the interconnect.

The core is a regular array of programmable basic logic cells that can implement

combinational as well as sequential logic (flip-flops).

A matrix of programmable interconnect surrounds the basic logic cells.

Programmable I/O cells surround the core.

Design turnaround is a few hours.


As shown in the figure above, all FPGAs contain a regular structure of programmable basic logic cells surrounded by programmable interconnect.

The ASIC Design process :

The ASIC design process is described by a design flow diagram, which shows the sequence of steps used to design an ASIC. In the diagram the process is divided into two stages: the first is logical design and the second is physical design. The steps involved in the ASIC design process are explained below.

1. Design entry. Enter the design into an ASIC design system, either using a hardware description language (HDL) or schematic entry.

2. Logic synthesis. Use an HDL (VHDL or Verilog) and a logic synthesis tool to produce a netlist, a description of the logic cells and their connections.

3. System partitioning. Divide a large system into ASIC-sized pieces.

4. Prelayout simulation. Check to see if the design functions correctly.

5. Floorplanning. Arrange the blocks of the netlist on the chip.

6. Placement. Decide the locations of cells in a block.
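To make steps 2 and 4 more concrete, here is a minimal sketch of what a netlist and a prelayout functional check might look like. The cell library `CELLS`, the instance and net names, and the `simulate` helper are all hypothetical and chosen only for illustration; real ASIC tools use HDL simulators and vendor cell libraries.

```python
# Hypothetical two-cell library: cell name -> function of its input bits.
CELLS = {
    "NAND2": lambda a, b: 1 - (a & b),
    "INV": lambda a: 1 - a,
}

# A netlist lists the logic cells and their connections.
# Each entry: (instance name, cell type, input nets, output net).
# Entries are assumed to be in topological order (inputs before users).
NETLIST = [
    ("u1", "NAND2", ("a", "b"), "n1"),
    ("u2", "INV", ("n1",), "y"),
]


def simulate(netlist, stimulus):
    """Evaluate every cell once and return the value of every net."""
    nets = dict(stimulus)
    for _instance, cell, inputs, output in netlist:
        nets[output] = CELLS[cell](*(nets[name] for name in inputs))
    return nets


# Functional check: NAND followed by an inverter should behave as AND.
for a in (0, 1):
    for b in (0, 1):
        assert simulate(NETLIST, {"a": a, "b": b})["y"] == (a & b)
```

The check at the end plays the role of prelayout simulation: it compares the netlist's behavior with the intended function before any physical design is done.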


Complex Programmable Logic Device (CPLD) :

The CPLD is an advancement in complexity over the simple PLD (SPLD). It is built on the SPLD architecture and supports much larger designs, so a CPLD can be used to integrate the functions of a number of SPLDs into a single device. The CPLD architecture is based on a small number of logic blocks and a global programmable interconnect.

Generic CPLD architecture :

The CPLD consists of a number of logic blocks (also called functional blocks), each of which contains a macro cell and either a PLA or PAL circuit arrangement. In the diagram below, eight logic blocks are shown. The macro cell provides additional circuitry to accommodate registered or nonregistered outputs, along with signal polarity control. Polarity control provides an output that is either the true signal or its complement. The actual number of logic blocks within a CPLD varies; the more logic blocks available, the larger the design that can be configured. In the center of the design is a global programmable interconnect. This interconnect allows connections to the logic block macro cells and the I/O cell arrays.

Fig. Generic CPLD architecture
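The macro cell behavior described above (a sum-of-products term from the PAL/PLA array, polarity control, and a choice of registered or nonregistered output) can be sketched as follows. This is an illustrative model only; the function name `macrocell` and its parameters are assumptions, not any vendor's architecture.

```python
def macrocell(inputs, product_terms, invert, registered, q_prev=0):
    """Hypothetical model of one macro cell evaluation.

    inputs:        dict mapping signal name -> 0 or 1
    product_terms: list of AND terms; each term is a list of
                   (signal name, negated?) literals
    invert:        polarity control (True gives the complemented signal)
    registered:    True selects the flip-flop output, False the
                   combinational output
    q_prev:        flip-flop state before the clock edge

    Returns (output seen now, flip-flop state after the next clock edge).
    """
    def literal(name, negated):
        return inputs[name] ^ int(negated)

    # OR of AND terms, as in a PAL or PLA array.
    sum_of_products = int(
        any(all(literal(n, neg) for n, neg in term) for term in product_terms)
    )
    d = sum_of_products ^ int(invert)       # polarity control
    output = q_prev if registered else d    # output select
    return output, d


# Example: f = a AND (NOT b)
terms = [[("a", False), ("b", True)]]
print(macrocell({"a": 1, "b": 0}, terms, invert=False, registered=False))
print(macrocell({"a": 1, "b": 0}, terms, invert=True, registered=False))
```

In the registered case the output lags by one clock: the cell presents the stored state while the new value waits at the flip-flop input.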

The programmable interconnect is usually based on either array-based interconnect or

multiplexer-based interconnect:

• Array-based interconnect allows any signal within the programmable interconnect to connect to any logic block within the CPLD. This is achieved by allowing horizontal and vertical routing within the programmable interconnect and allowing the crossover points to be connected or unconnected (the same idea as with the PLA and PAL), depending on the CPLD configuration.

• Multiplexer-based interconnect uses digital multiplexers connected to each of the macro cell

inputs within the logic blocks. Specific signals within the programmable interconnect are

connected to specific inputs of the multiplexers. It would not be practical to connect all internal

signals within the programmable interconnect to the inputs of all multiplexers due to size

and speed of operation considerations.

FPGA Architecture:

FPGA stands for Field-Programmable Gate Array. FPGAs are popular with microsystems designers because they fill the gap between TTL and PLD design on one side and expensive ASICs on the other. FPGAs are ideal for prototyping systems or for low-volume production. An FPGA normally comprises:

• Programmable logic blocks which implement logic functions.

• Programmable routing that connects these logic functions.

• I/O blocks that are connected to logic blocks through routing interconnect and that make off-

chip connections.


FPGAs have a regular array of basic logic cells that are configured using a programming technology. The chip inputs and outputs use special I/O logic cells that are different from the basic logic cells. A programmable interconnect scheme forms the wiring between the two types of logic cells. Finally, the designer uses custom software, tailored to each programming technology and FPGA architecture, to design and implement the programmable connections. The programming technology in an FPGA determines the type of basic logic cell and the interconnect scheme. The logic cells and interconnection scheme, in turn, determine the design of the input and output circuits as well as the programming scheme.

Configurable Logic Block (CLB):

A configurable logic block (CLB) is a basic component of an FPGA that provides the basic logic and storage functionality for a target application design. Exact numbers and features vary from device to device, but every CLB consists of a configurable switch matrix with 4 or 6 inputs, some selection circuitry (MUX, etc.), and flip-flops. The switch matrix is highly flexible and can be configured to handle combinatorial logic, shift registers, or RAM. The CLB acts as the main logic resource for implementing logic circuits. Generally, CLBs contain RAM-based look-up tables (LUTs) to implement logic, and storage elements that can be used as flip-flops or latches. CLBs can be programmed to perform various logical functions as well as to store data.

The diagram below shows the Xilinx XC3000 CLB, which has five logic inputs (A to E), a common clock input (K), an asynchronous direct-reset input (RD), and an enable clock (EC). Using programmable MUXes connected to the SRAM programming cells, one can independently connect each of the two CLB outputs (X and Y) to the output of the flip-flops (QX and QY) or to the output of the combinational logic (F and G).


Normally the CLB is designed such that the CLB propagation delay is fixed, and is equal to the

LUT access time, and independent of the logic function being implemented.

There are seven inputs for the combinational logic in the XC3000 CLB: the five CLB inputs (A to E) and the two flip-flop outputs (QX and QY). There are two outputs from the LUT (F and G). Since a 32-bit LUT requires only five variables to form a unique address (32 = 2^5), there are several ways to use the LUT:

(i) Use five of the seven possible inputs (A to E, QX, QY) with the entire 32-bit LUT. The CLB outputs (F and G) are then identical.

(ii) Split the 32-bit LUT in half to implement two functions of four variables each. The four input variables for each function are chosen from the seven inputs (A to E, QX, QY), and two of them must come from the five CLB inputs (A to E). One function output connects to F and the other connects to G.

(iii) Split the 32-bit LUT in half and use one of the seven input variables as the select input to a 2:1 MUX that switches between F and G. This allows some functions of six and seven variables to be implemented.
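The three ways of using the 32-bit LUT can be sketched in a few lines. This is a behavioral illustration only: the function names, the bit ordering (first input is the most significant address bit), and the choice of which half feeds F or G are assumptions, not the actual XC3000 bit layout.

```python
def lut_read(bits, address_bits):
    """Read one entry; the first address bit is taken as the MSB (assumed)."""
    address = 0
    for bit in address_bits:
        address = (address << 1) | bit
    return bits[address]


def five_input_mode(lut32, a, b, c, d, e):
    """Option (i): one function of five variables; F and G are identical."""
    f = lut_read(lut32, (a, b, c, d, e))
    return f, f


def split_mode(lut32, f_inputs, g_inputs):
    """Option (ii): two independent functions of four variables each."""
    f = lut_read(lut32[:16], f_inputs)
    g = lut_read(lut32[16:], g_inputs)
    return f, g


def mux_mode(lut32, f_inputs, g_inputs, select):
    """Option (iii): a 2:1 MUX chooses between the two half-LUT outputs."""
    f, g = split_mode(lut32, f_inputs, g_inputs)
    return g if select else f


# Option (i): five-input odd parity stored as a 32-entry truth table.
parity = [bin(i).count("1") % 2 for i in range(32)]
print(five_input_mode(parity, 1, 0, 1, 1, 0))

# Option (ii): lower half is a 4-input AND, upper half a 4-input OR.
and_or = [0] * 15 + [1] + [0] + [1] * 15
print(split_mode(and_or, (1, 1, 1, 1), (0, 0, 0, 0)))
```

In option (iii) the two halves may use different four-input subsets, which is how the MUX output can depend on more than five variables in total.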

Interconnect:

While the CLB provides the logic capability, flexible interconnect routes the signals between CLBs and to and from I/Os. Routing comes in several flavors, from that designed to interconnect between CLBs, to fast horizontal and vertical long lines spanning the device, to global low-skew routing for clocking and other global signals.


Modern CMOS ASICs use two, three, or more levels (or layers) of metal for interconnect. This allows wires to cross over different layers in the same way that we use copper traces on different layers on a printed-circuit board. In a two-level metal CMOS technology, connections to the standard-cell inputs and outputs are usually made using the second level of metal. With additional metal layers, more sophisticated routing programs can take advantage of the extra metal to route interconnect over the top of the logic cells. A connection that needs to cross over a row of standard cells uses a feedthrough.

All FPGAs contain some type of programmable interconnect. The structure and complexity of the interconnect is largely determined by the programming technology and the architecture of the basic logic cell. The raw material used in building the interconnect is aluminum-based metallization, which has a sheet resistance of approximately 50 mΩ/square and a line capacitance of 0.2 pF/cm. The first programmable ASICs were constructed using two layers of metal; newer programmable ASICs use three or more layers of metal interconnect.
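As a rough worked example of what these two figures imply, the sketch below estimates the resistance, capacitance, and lumped RC product of a metal wire. The sheet resistance and line capacitance come from the text; the wire length and width used in the example (1 cm long, 1 µm wide) and the helper name `wire_rc` are assumptions chosen only for illustration.

```python
SHEET_RESISTANCE_OHM_PER_SQ = 0.050   # 50 milliohm/square, from the text
LINE_CAPACITANCE_F_PER_CM = 0.2e-12   # 0.2 pF/cm, from the text


def wire_rc(length_cm, width_um):
    """Return (resistance in ohms, capacitance in farads, RC in seconds).

    The number of squares is length / width; 1 cm = 10,000 um.
    """
    squares = (length_cm * 1e4) / width_um
    resistance = SHEET_RESISTANCE_OHM_PER_SQ * squares
    capacitance = LINE_CAPACITANCE_F_PER_CM * length_cm
    return resistance, capacitance, resistance * capacitance


# Assumed example wire: 1 cm long, 1 um wide (10,000 squares).
r, c, rc = wire_rc(length_cm=1.0, width_um=1.0)
print(f"R = {r:.0f} ohm, C = {c * 1e12:.2f} pF, RC = {rc * 1e12:.0f} ps")
```

For this assumed wire the result is about 500 Ω, 0.2 pF, and an RC product of about 100 ps. The lumped product overestimates the delay of a distributed line, and in an FPGA the programmable switches along the path usually add far more resistance than the metal itself.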

In the diagram below the Xilinx interconnect architecture is shown. In this architecture:

The vertical lines and horizontal lines run between CLBs.

The general-purpose interconnect joins switch boxes (also known as magic boxes or switching matrices).

The long lines run across the entire chip. It is possible to form internal buses using long lines and the three-state buffers that are next to each CLB.

The direct connections (not used on the XC4000) bypass the switch matrices and directly connect adjacent CLBs.

The Programmable Interconnection Points (PIPs) are programmable pass transistors that connect the CLB inputs and outputs to the routing network.

The bidirectional (BIDI) interconnect buffers restore the logic level and logic strength on long interconnect paths.

Switching Technology:

Switching is the most important mechanism used for routing between the logic blocks (or logic clusters) in FPGA architectures. The basic Switch Block is shown in the diagram below.

Each Switch Block programmably connects each incoming track from a channel to a number of outgoing tracks in other channels. The transistors in the Switch Block add capacitive and resistive loading to each track in a channel, so the Switch Block has a significant effect on the speed of each routable connection and thus a major impact on the speed of the FPGA as a whole.

In addition, since such a large portion of an FPGA is devoted to the routing, the chip area

required by each Switch Block will have a significant effect on the achievable logic density of

the device. Thus, the design of a good Switch Block is of the utmost importance.
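One way to picture a Switch Block is as a set of programmable switches, each joining two tracks when it is turned on. The sketch below is a purely illustrative model: the track names (W0, N0, E0, S0) and the function `connected_tracks` are hypothetical, and real switch-block topologies restrict which pairs of tracks have a switch at all.

```python
def connected_tracks(closed_switches, start):
    """Return every track electrically joined to `start`.

    closed_switches: iterable of (track_a, track_b) pairs whose
                     pass transistor is programmed ON.
    """
    adjacency = {}
    for a, b in closed_switches:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = {start}
    stack = [start]
    while stack:
        track = stack.pop()
        for neighbor in adjacency.get(track, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen


# Hypothetical configuration: west track 0 turns north and also continues east.
closed = [("W0", "N0"), ("N0", "E0")]
print(sorted(connected_tracks(closed, "W0")))
```

Each closed switch in such a path corresponds to a pass transistor in series with the signal, which is why the number of switches a connection passes through matters for speed.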

Xilinx Field Programmable Gate Arrays :

The basic structure of Xilinx FPGAs is array-based, meaning that each chip comprises a two-dimensional array of logic blocks that can be interconnected via horizontal and vertical routing channels. An illustration of this type of architecture is shown in the figure below. Xilinx introduced the first FPGA family, called the XC2000 series, around 1985 and now offers three more generations: XC3000, XC4000, and XC5000. Although the XC3000 devices are still widely used, the more recent and more popular family is the XC4000. The XC5000 is similar to the XC4000 but has been designed to offer similar features at a lower cost, with slightly lower speed. Xilinx has recently introduced an FPGA family based on anti-fuses, called the XC8100. The XC8100 has many interesting features, but it is not yet in widespread use. The Xilinx 4000 family devices range in capacity from about 2000 to more than 15,000 equivalent gates.

The XC4000 features a logic block (called a Configurable Logic Block (CLB) by Xilinx) that is based on look-up tables (LUTs). A LUT is a small one-bit-wide memory array, where the address lines for the memory are inputs of the logic block and the one-bit output from the memory is the LUT output. A LUT with K inputs would then correspond to a 2^K x 1 bit memory, and can realize any logic function of its K inputs by programming the logic function's truth table directly into the memory. The XC4000 CLB contains three separate LUTs, in the configuration shown in Figure 18. There are two 4-input LUTs that are fed by CLB inputs, and the third LUT can be used in combination with the other two. This arrangement allows the CLB to implement a wide range of logic functions of up to nine inputs, two separate functions of four inputs, or other possibilities. Each CLB also contains two flip-flops.
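The idea of a K-input LUT as a 2^K x 1 bit memory, and of combining two 4-input LUTs with a third LUT, can be sketched as follows. The helper names are hypothetical, and the assumption that the third LUT takes the two LUT outputs plus one extra CLB input (giving 4 + 4 + 1 = 9 inputs) is an illustration of the nine-input case, not a full description of the XC4000 CLB.

```python
from itertools import product


def program_lut(func, k):
    """Store the truth table of a k-input function as 2**k one-bit entries."""
    return [int(bool(func(*bits))) for bits in product((0, 1), repeat=k)]


def read_lut(memory, *inputs):
    """Use the inputs as address lines (first input is the MSB)."""
    address = 0
    for bit in inputs:
        address = (address << 1) | bit
    return memory[address]


def three_lut_clb(f_mem, g_mem, h_mem, f_inputs, g_inputs, extra_input):
    """Two 4-input LUTs feed a third 3-input LUT (assumed arrangement)."""
    f = read_lut(f_mem, *f_inputs)
    g = read_lut(g_mem, *g_inputs)
    return read_lut(h_mem, f, g, extra_input)


# A 3-input majority function needs 2**3 = 8 bits of memory.
majority = program_lut(lambda a, b, c: a + b + c >= 2, 3)
print(majority)

# A nine-input AND built from two 4-input ANDs and one 3-input AND.
and4 = program_lut(lambda a, b, c, d: a and b and c and d, 4)
and3 = program_lut(lambda a, b, c: a and b and c, 3)
print(three_lut_clb(and4, and4, and3, (1, 1, 1, 1), (1, 1, 1, 1), 1))
```

Changing the function implemented by a LUT means only rewriting its memory contents; the lookup path, and so the delay, stays the same.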


The XC4000 chips have “system oriented” features. For example each CLB contains circuitry

that allows it to efficiently perform arithmetic (i.e., a circuit that can implement a fast carry

operation for adder-like circuits) and also the LUTs in a CLB can be configured as read/write

RAM cells. Also, each XC4000 chip includes very wide AND-planes around the periphery of the

logic block array to facilitate implementing circuit blocks such as wide decoders.
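To show why a carry path matters for adder-like circuits, here is a minimal sketch of a ripple-carry adder in which each bit position produces a sum bit and passes a carry to the next position. It models only the logic of the carry chain; the function names are hypothetical and nothing here represents the actual XC4000 carry circuit.

```python
def full_adder(a, b, carry_in):
    """One bit position: returns (sum bit, carry out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_add(x_bits, y_bits, carry_in=0):
    """Add two equal-length bit lists given least-significant bit first."""
    result = []
    carry = carry_in
    for a, b in zip(x_bits, y_bits):
        sum_bit, carry = full_adder(a, b, carry)
        result.append(sum_bit)
    return result, carry


# 5 + 3 = 8 on four bits (LSB first): 0101 + 0011 = 1000
print(ripple_add([1, 0, 1, 0], [1, 1, 0, 0]))
```

The carry must pass through every bit position in turn, so the delay of the carry path sets the speed of the whole adder; dedicated carry circuitry keeps that path short.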

The other key feature that characterizes an FPGA is its interconnect structure. The XC4000

interconnect is arranged in horizontal and vertical channels. Each channel contains some number

of short wire segments that span a single CLB (the number of segments in each channel depends

on the specific part number), longer segments that span two CLBs, and very long segments that

span the entire length or width of the chip. Programmable switches are available to connect the

inputs and outputs of the CLBs to the wire segments, or to connect one wire segment to another.

Acknowledgment: The author is thankful to Prof. Michael Smith, without whose book these class notes would not have been prepared.

These class notes are based on the book "Application-Specific Integrated Circuits" by Michael Smith and on other internet resources. They are meant purely for academic purposes, not for any commercial use.
