
7/27/2019 Chapter 3 Sizing the Design.pdf


Chapter 3 Sizing the Design

o Functional Specification - A Closer Look
o Review the Available Arrays
o Architectural Specification or Hardware Specification
o Array Sizing
o Cell Capabilities
o Array Architecture
o Netlist
o Example: AMCC Interface Options
o Example - AMCC Arrays - Power Supply Options
o Interface Cell Functionality
o Examples
o Internal Cell Functionality
o Multi-Cell Macros
o Hard and soft macros
o Refining Interface Requirements
o Adding Extra Power and Ground Macros
o Dual-Function I/O Macros
o Example - Simultaneously Switching Outputs
o Thermal Diodes
o The AMCC Speed Monitor
o Final Interface Cell Utilization
o Drivers
o Exercises



Sizing the Design - Selecting the Array

Functional Specification - A Closer Look

The functional or target specification is the first level of description of the project that may encompass one or more arrays when the design is partitioned. There may be a specification tree with the total project at the top node and individual circuit blocks or modules detailed underneath. Topics included in a functional specification are listed in Table 3-1.

At this early stage, a functional description of what is to be accomplished is created along with some of the top-level circuit requirements.

Array Interfacing

For the partitioned project (multiple arrays), the individual array specifications would

include a description of array interfacing.

Interconnection between arrays is faster when done with ECL. When choosing between single-rail and dual-rail (differential) ECL, use the following guidelines:

If the arrays will be placed on the same board and will be adjacent to each other,

single rail (non-differential) ECL may be acceptable.

If the arrays will communicate across a backplane or be remote on the board,

differential ECL may be required.

Differential ECL is required if the operating speeds exceed the maximum

frequency specifications for single rail ECL.

The potential need for differential ECL should be indicated at the functional specification level.
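The three guidelines above can be sketched as a small decision function. This is only an illustration: the function name, the return strings and the sample frequencies are hypothetical, and the single-rail frequency limit is assumed to come from the vendor's data sheet.

```python
def choose_ecl_rail(same_board_adjacent: bool,
                    crosses_backplane_or_remote: bool,
                    operating_mhz: float,
                    single_rail_max_mhz: float) -> str:
    """Apply the inter-array ECL guidelines, strictest rule first."""
    # Speed beyond the single-rail limit makes differential ECL mandatory.
    if operating_mhz > single_rail_max_mhz:
        return "differential required"
    # Backplane or remote placement may require differential ECL.
    if crosses_backplane_or_remote:
        return "differential may be required"
    # Adjacent arrays on the same board may get by with single rail.
    if same_board_adjacent:
        return "single rail may be acceptable"
    # Not covered by the guidelines; flag the case for review.
    return "review with vendor"


# Hypothetical numbers, for illustration only.
print(choose_ecl_rail(True, False, operating_mhz=200.0, single_rail_max_mhz=300.0))
```

Whatever the outcome, the result belongs in the functional specification, since it affects the I/O count of every array involved.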

Partitioned circuits should attempt to balance

the distribution of I/O and internal cell usage

between the different arrays while maintaining

critical paths within one array if possible.

This is still the rule to follow - no matter how big the arrays get.

It is also a good guideline for how to break up a 6-8 million gate array into top-level blocks - keep the critical paths inside the block if possible.

Interblock connections today are what interarray connections were yesterday.



Table 3-1 Components of the Functional specification

Functional Specification

Block diagram to the module level, including any partitioning into more than one array
Description of the boundaries between the modules and the rest of the system
Initial sizing of the I/O interface by type - ECL, TTL, etc.
Functional description of the modules
Description of the interface between the circuit modules - busses, control, critical interconnects
The overall performance requirements
- - - the maximum frequency of operation
- - - target clock speed (per clock)
- - - path propagation delay requirements set by modules external to this design
I/O toggle rates
Synchronous/asynchronous signals
Allowed or available power supplies
Power restrictions
Physical size restrictions
Environmental requirements - Commercial, Military, Industrial, Other
Packaging requirements
Derating for junction temperature
Prioritized design objectives

Hard Specifications

Design criteria that are considered as hard (inflexible) specifications should be clearly documented as such. Specifications that might be alterable should also be clearly identified. If a tradeoff or judgment call needs to be made during the remainder of the design project, such information can save time and possibly the project.



Design Objectives

Overall design objectives should be clearly identified and documented. These include optimization for speed, power or die size, which translates to minimized internal cell utilization and minimized I/O utilization. Since these objectives are in conflict, they should be prioritized.

As a last step, there should be a careful design review of the circuit and system functional specifications and the partitioning.

Review the Available Arrays

With a clear understanding of the design description and overall objectives, review the

arrays currently available that could be used.

For a listing of currently available array series, check the latest ASIC vendor surveys run by several of the engineering magazines. These buyer's guides provide a cursory look at what is available and allow a first-pass sort of available arrays into feasible and non-feasible, a starting point from which the designer can proceed. They have limited space to review technology, die size, cell counts, metal layers, number of macros, interface levels, second sources and the EWS workstations the array vendor supports. They may not have the latest updates on an array series. They can provide addresses and phone numbers for array vendors.

Once one or more vendors have been selected, the designer should obtain data sheets and

design guides from the prospective vendors for the most promising array series and begin

a more in-depth review.

Example - The AMCC Arrays - as of 1991

The industry shows an evolutionary trend as designers drive vendors to develop larger, faster and cooler arrays.

There have been five bipolar array families from AMCC since 1984, (see Table 3-2)

increasing in cell size and speed while reducing die size and power. The most recent is

the AMCC Q20000 series, officially released in September 1989.

Table 3-2 AMCC Bipolar Array Series

AMCC Array Series    Year
Q20000               1989
Q5000                1987
Q3500                1986
Q1500                1985
Q700                 1981



The Q20000 Series speed estimates list its internal toggle rate at 1.25GHz, at least twice as fast as that of the previous Q5000 Series, with an enhanced drive and much lower power. Individual macros have been found to run at 1.4GHz and higher.

There are two AMCC BiCMOS array families, the Q14000 Series and the Q24000

Series, a partial shrink of the Q14000, as shown in Table 3-3.

Table 3-3 AMCC BiCMOS Array Series

AMCC Array Series Year

Q24000 Series 1990

Q14000 Series 1988

The current BiCMOS families were preceded by three CMOS array series, each faster than its predecessor. The BiCMOS arrays combine the drive and interface ability of

bipolar with the cooler operation of CMOS. The newer BiCMOS Series must be larger,

faster and cooler.

Comparing the arrays

The items that define the differences between array series include those shown in Table

3-4.

Table 3-4 Features for Array Series Comparison

Array Series Comparison Topics

The process technology

Metal layers routed (2, 3)

Series gating techniques

Sea-of-cells versus routing track architectures, also called channelless vs. channelled
Overall Maximum Speed of Operation specified as I/O and internal toggle rates

Frequency Ft (frequency at which beta for transistor

becomes unity)

Noise immunity

Edge rates - programmable or not

Symmetry in rise and fall times

Power-supply options allowed

Power-supply variation stability



Maximum number of I/O cells available

I/O modes allowed (TTL, ECL, MIXED, etc.)

ECL terminations

On-chip translators

Maximum number of internal cells or gates available

Macro Options - Standard (S); Power (P); Low-

power (L); High-speed (H); Fast (V), Drivers (D);

super-drivers (D)

- - - or lack of options; i.e., speed-power

programmability

Variety in the macro library available

Wire-ORs (dot-wire) allowed or not

Design constraints

Power dissipation per gate

Packaging Available

Autoplace, Autoroute

Engineering Workstation support

Simulators supported

Second source

Military compatible

Commercial compatible

Military qualified testing

Other topics as dictated by the arrays, their

technology and the design issues

The arrays within a series refine these differences with specific information on size,

number of cells by type, and details about interfacing, as shown in Table 3-5.

Data sheets, product profiles and macro library design guides or design manuals supply

the specific information for an array series. The design manual, supplied with the array

library media, is the controlling document.

Architectural Specification or Hardware Specification

Once a clear definition exists of the circuit or circuits that will be placed on one array,

then the planned design can be developed. This is on a smaller module scale than the



block-level functional specification, e.g., at the level of counters, adders, latches,

registers, sequencers, etc. The performance requirements defined in the functional

specification can be used to select the technology.

Table 3-5 Array-Specific Specifications

Hardware or Architectural Specification

Number of internal cells

Number of I/O cells

Number of outputs

Number of bidirectional macros

Number of fixed power and grounds

Rules for adding power and ground

Packaging Options

Maximum internal current limits

On-chip memory

Macro-type design use restrictions such as number of

Darlington; CML outputs

Placement rules that affect design

Variable bonding

The review of available arrays is conducted in parallel with the creation of the hardware specification.

With the descriptions developed for the modules, equivalent gate estimates can be made

for the circuit, or estimated cell usages can be computed for the circuit on a specific

array. The array vendor Applications Engineer can help with the sizing estimate.

The hardware design specification details what the designer intends to do to meet the

target functional specification. This level of specification can be equated to a PDL (program design language) description of software and is the basis for the evolution of HDL, hardware description language, and its derivatives.

If a particular testing methodology is being enforced, the sizing estimates must take this additional logic into account. If additional testing logic, such as a parametric gate tree or parity logic, is to be used, it must be included in the sizing estimates.
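As a rough illustration of such a sizing estimate, the sketch below sums per-module cell estimates, adds the test logic, and looks for the smallest array that still fits. The internal cell counts are the Q20000 Series figures listed in Table 3-10 (non-PLL arrays only). The sub-module numbers, the function name and the 85% utilization ceiling are hypothetical placeholders, not vendor limits.

```python
# Internal cell counts for the non-PLL Q20000 Series arrays (Table 3-10).
Q20000_INTERNAL_CELLS = {
    "Q20010": 267,
    "Q20025": 733,
    "Q20045": 1227,
    "Q20080": 2044,
    "Q20120": 3414,
}


def smallest_fitting_array(submodule_cells, test_logic_cells, max_utilization=0.85):
    """Return (array name or None, cells required).

    max_utilization is an assumed planning ceiling, not a vendor figure.
    """
    required = sum(submodule_cells.values()) + test_logic_cells
    # Try the arrays from smallest to largest.
    for name, cells in sorted(Q20000_INTERNAL_CELLS.items(), key=lambda kv: kv[1]):
        if required <= cells * max_utilization:
            return name, required
    return None, required


# Hypothetical sub-module estimates, in internal cells.
estimate = {"counter": 120, "alu": 400, "registers": 300}
print(smallest_fitting_array(estimate, test_logic_cells=60))
```

A result of `None` suggests the circuit needs to be partitioned or a larger series considered. The vendor's Applications Engineer remains the authority on the actual utilization limits.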

The specification may include proposed vendors and arrays.



Table 3-6 Components of The Hardware Specification

Hardware Specification Components

The selected technology or technologies

Potential array series (1-3 at the most)

Block level diagram to the sub-module level

The functional description of the different circuit sub-modules such as adders, counters, registers, etc.

Sub-module sizing

--- equivalent gates or estimated internal cell

utilization

--- estimated I/O cell utilization

--- estimated pad utilization

--- estimated internal pin counts

Refined details on the array interface

--- number of CMOS I/O

--- number of TTL I/O

--- number of ECL 10K I/O

--- number of ECL 100K I/O

--- all four types partitioned into inputs and outputs

and bidirectionals

--- number of outputs switching simultaneously (by

type) (SSOs)

--- maximum toggle frequencies for each I/O

--- external set-up and hold window unless this circuit will establish the window specification for the driving circuit

Critical path throughput performance

Estimated power - DC and AC as required

Package to be used

Heatsinks required and/or air cooling required

Estimated junction temperature

There should be a design review of the architectural or hardware specification before final selection of an array series. On final selection, the specification should be revised to show that series and all computations performed for that series.



Note that a workstation can provide some assistance. The critical path may be captured in

more than one version and comparisons made based on an annotated simulation.

Power and sizing details can be run against a macro list rather than a full interconnect netlist. (This tool is vendor-dependent.) Check if such a pre-capture tool is available to help size the circuit.

Array Sizing

Cell Structure

Each cell in an array consists of a number of uncommitted transistors, resistors and other

discrete components and is designed around the performance criteria for the intended

macro library. The cells will vary between array series, regardless of the vendor.

Equivalent gates

The number of equivalent gates has been a design measure dating from the days of

discrete designs first converting into SSI-level ICs. Integrated circuits were classed as SSI, MSI and LSI based on their equivalent gate counts. Circuits were "sized" based on

the number of equivalent gates it would take to create them. CMOS arrays carried on

with the equivalent gate count and it was reasonable because the internal cell in a CMOS

array can be sized as 1, 2 or 3 gates.

Bipolar arrays carry equivalent gate counts on their data sheets as a sizing measure but it

serves only to show relative sizing between arrays in the same series. Bipolar array cell complexities render equivalent gates a rough measure at best. BiCMOS cells are more

complex than CMOS and equivalent gate estimates are not recommended for them either.

To complicate the problem, vendors use many different methods for computing

equivalent gates. The designer would need the algorithms before a rational comparison

based on equivalent gates can be made between any two array series, even from the same

vendor.

Example - Method 1

One approach to array sizing is to count the number of transistors in the internal core cells, assume that 2.5 transistors are equivalent to a gate (Digital Equipment's definition),

and compute the number of equivalent gates per cell. The product of the number of cells

times the number of gates per cell provides the equivalent gates per array.

equivalent gates = ( number of transistors in core / 2.5 )

Example - Method 2

Another method is to use the D flip/flop. Sizing the D flip/flop as 11 gates, the Q20000

Series D flip/flop uses 2 internal cells.

equivalent gates = ( number of internal cells / 2 ) * 11



Example - Method 3

The usual AMCC method is to size a 3:1 MUX-D flip/flop macro as 11 gates. The

Q20000 Series 3:1 MUX-D flip/flop uses 3 internal cells.

equivalent gates = (number of internal cells / 3) * 11

Example - Method 4

The last method discussed here is to size a full adder at 16 gates. For the Q20000 Series,

a 1-bit full adder takes 3 internal cells.

equivalent gates = (number of internal cells / 3) * 16

or:

equivalent gates = [(number of internal cells / number of cells required for measuring

function) * number of gates in function]
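The four methods can be compared side by side with a short script. The cells-per-function and gates-per-function pairs below are the Q20000 Series figures quoted in Methods 2 to 4. The function names and the 300-cell array are hypothetical and serve only to show how far the answers diverge for the same array.

```python
def equivalent_gates_by_transistors(transistors_in_core, transistors_per_gate=2.5):
    """Method 1: divide the core transistor count by transistors per gate."""
    return transistors_in_core / transistors_per_gate


def equivalent_gates_by_function(internal_cells, cells_per_function, gates_per_function):
    """Methods 2-4: scale by a reference function of known gate count."""
    return internal_cells / cells_per_function * gates_per_function


# (cells required, gates assumed) for each Q20000 Series measuring function.
Q20000_REFERENCE_FUNCTIONS = {
    "D flip/flop": (2, 11),
    "3:1 MUX-D flip/flop": (3, 11),
    "1-bit full adder": (3, 16),
}

internal_cells = 300  # hypothetical array size, for illustration only
for name, (cells, gates) in Q20000_REFERENCE_FUNCTIONS.items():
    count = equivalent_gates_by_function(internal_cells, cells, gates)
    print(f"{name}: {count:.0f} equivalent gates")
```

For the same 300 cells the three function-based methods give 1650, 1100 and 1600 equivalent gates, which is why a comparison between series is only meaningful when the counting method is known.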

AMCC ASIC Product Selection Guide with Equivalent Gates Listed (1996)

AMCC ASIC PRODUCT SELECTION GUIDE (1990's)

Part Number   Technology         Equivalent Gates      Number of I/O   Structured Array Blocks
                                 (Full Adder Method)
Q20004        1 Micron Bipolar   671                   28              None
Q20P010       1 Micron Bipolar   928                   34              1 GHz PLL
Q20P025       1 Micron Bipolar   3120                  51              1 GHz PLL
Q20M100       1 Micron Bipolar   13475                 195             RAM

I/O cell contributions

None of these methods for estimating equivalent gates takes the logic capability of the interface cells into account. Some vendors do count them in their published equivalent gate counts and others do not.



Example - AMCC cell design

AMCC cell design is optimized for MUX, latch and flip/flop implementations. Each cell is designed to support high-speed requirements so that there are no placement restrictions

on the high-speed option macros due to cell limitations. No power is used by a cell in its

base configuration.

For the AMCC BiCMOS arrays, a cell is roughly 3 gates. For the bipolar arrays, a logic cell is a more complex structure and varies with the series.

Cell capabilities

Cells for each array have different capabilities. The cells for different array series, same

technology (bipolar, BiCMOS or CMOS), from the same vendor may also differ widely

in the approach used in their design and in their functional complexity.

Example - AMCC cell capabilities

An internal cell for the Q5000 Series can support a complex D flip/flop, a 3:1 MUX and D flip/flop, a triple latch, two simple (no RESET, single output) D flip/flops, or triple 2:1 MUXs with common select. The Q20000 Series internal cell alone cannot support a D flip/flop. S- and L-option D flip/flops use two cells while H-option D flip/flops require three.

The Q20000 Series internal cell is roughly comparable to a half-cell for the Q5000 Series

if size of function alone is considered as the basis for comparison. The logic cell for the Q20000 Series is defined as the smallest partition possible and each internal cell supports

one Turbo macro output. Turbo is a Q20000 feature that provides high drive (18 loads)

with less power and less skew.

Cell types and resources

The vendor data sheet and design guide or design manual should clearly identify cell

types and the number of each on each array in the series. Any restrictions in the use of the

cells, either utilization limits or cell count limits should also be readily available.

Included in these descriptions should be a measure of cell functionality, either in a table

summarizing the array cell capability or through the macro library documentation.

As a part of the cell resources identification, the vendor should supply a clear description of the fixed power and ground pads and procedures to add additional power and ground pads. These added power and ground macros usually reside on an I/O cell and pad and can affect the number of cells left for circuit signals.

Example - AMCC Cell types

The basic AMCC logic array is composed of two classes of cells: the internal cells, which are composed of logic (L) and memory (M) cells for bipolar arrays or basic (B) cells for BiCMOS arrays; and the perimeter cells, composed of input, output or bidirectional (I/O) cells. Older AMCC arrays had buffer cells internally and specialized input or output-only



interface cells. An array may or may not have specialized I/O cells. AMCC cell types are

shown in Table 3-7.

The QM1600S (now the QM1600T) was the first of the AMCC arrays to incorporate

memory on a logic array.

Table 3-7 Cell Types

INTERNAL: Logic, Basic, Buffer, Memory

PERIPHERAL: Input, Output, I/O, Special-I/O

Refer to the cell resources table for an approximate idea of the array cell capacity for three series and note the differences. Cell resources for the Q24000 Series are shown in Table 3-8, for the Q5000 Series in Table 3-9 and for the Q20000 Series in Table 3-10.

Note that no two series are alike!

Table 3-8 AMCC Q24000 Series Arrays - Cell Resources

Array Name   Internal B Cells   I/O Cells   Pads
Q24280       6880               300         256
Q24140       3360               226         226
Q24091       2268               160         160
Q24060       1440               132         132
Q24021       540                80          80
Q24008       190                66          44

Usage restrictions: Refer to the Q24000 Design Manual for details.


Table 3-9 AMCC Q5000 Series Arrays - Cell Resources

Array Name   Internal L Cells   I/O Cells   Output Limit   Memory Cells
Q5000T       352                160         120            -
Q3500T       242                120         -              -
Q1300T       84                 76          -              -
QM1600T      114                106         -              2 (1240 bits)




Table 3-10 AMCC Q20000 Series Arrays - Cell Resources

Array Name   Internal Cells   I/O Cells       I/O Cells     Signals -     Signals -
                              (For Signals)   (Fixed) (1)   PLL Related   Loop Filter (2)
Q20120       3414             198             4             -             -
Q20080       2044             162             4             -             -
Q20045       1227             128             4             -             -
Q20P025      595              76              4             13            8
Q20025       733              100             4             -             -
Q20P010      177              54              4             13            8
Q20010       267              66              4             -             -

(1) Two pads are used by the AC Speed Monitor and two by the thermal diode.
(2) Only for the largest arrays, 100_LDCC for the Q20P010 and 132_LDCC for the Q20P025
Add last four columns to find total I/O cells and pads.

Array Name   ECL Outputs Limit   TTL Outputs Limit   PLL Power/Ground   Power/Ground (1)
Q20120       172                 100                 -                  78
Q20080       130                 80                  -                  52
Q20045       100                 64                  -                  52
Q20P025      45 (2)              45 (2)              8                  26
Q20025       80                  48                  -                  36
Q20P010      23 (3)              23 (3)              8                  20
Q20010       50                  24                  -                  32

(1) Add last two columns to find total number of fixed power and grounds.
(2) 51 for external loop
(3) 34 for external loop
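The two "add the last columns" notes for the Q20000 Series tables amount to simple sums. The sketch below works them out for three of the arrays. The dictionary and function names are hypothetical, and a dash in the table is assumed to mean zero.

```python
# (signal I/O cells, fixed I/O cells, PLL-related signals, loop-filter signals)
# A "-" entry in the table is treated as 0 (an assumption).
IO_COLUMNS = {
    "Q20120": (198, 4, 0, 0),
    "Q20P025": (76, 4, 13, 8),
    "Q20P010": (54, 4, 13, 8),
}

# (PLL power/ground, other fixed power/ground)
POWER_GROUND_COLUMNS = {
    "Q20120": (0, 78),
    "Q20P025": (8, 26),
    "Q20P010": (8, 20),
}


def total_io_cells_and_pads(array_name):
    """Sum the last four columns of the cell resources table."""
    return sum(IO_COLUMNS[array_name])


def total_fixed_power_ground(array_name):
    """Sum the last two columns of the output limits table."""
    return sum(POWER_GROUND_COLUMNS[array_name])


for name in IO_COLUMNS:
    print(name, total_io_cells_and_pads(name), total_fixed_power_ground(name))
```

On these figures the Q20P025 comes to 101 I/O cells and pads and 34 fixed power and grounds, against 202 and 78 for the Q20120.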

Array architecture

The base arrays for the various series are similar in their design concept in that the core

of most arrays is composed of an array or matrix of logic or basic cells organized in a row-column configuration. Arrays that contain memory place the RAM blocks in the core



area, with the rest of the core designated for internal logic cells. Phase-Lock loop arrays,

the PLL arrays, have PLL locations that straddle both core and interface areas. Interface (I/O) cells are placed around the perimeter of the array interspersed with power and

ground.

There are different base arrays for different power supply configurations. The base array

for a single +5V supply will be different from that for a mixed-mode +5V/-5.2V dual

supply. A generic die plot for the Q20080 array is shown in Figure 3-1 and one for the BiCMOS Q24091 is shown in Figure 3-2, with the interconnect pattern in Figure 3-3.

Figure 3-1 Q20080 Die Plot

Figure 3-2 Q24140 Die Plot



Figure 3-3 BiCMOS Macro Interconnect Pattern

Macro configurations

Macros are individually configured by interconnecting the components within a cell with

one layer of metal to form the selected macro function. Macros can occupy a cell, a partial cell (usually 0.5 cell), or require several cells. The internal interconnect for a

simple macro is generally confined to one layer of metal. The particular layer will depend

on the array series.

Cell Interconnect

The process of interconnecting macros is called routing. For channeled architectures,

routing is performed following specific routing tracks. The interconnect is on the first and second layers of metal in a two layer metalization array. Horizontal and vertical tracks

are assigned to specific metal layers.

For an array with three layers of metal, the second and third layers will be used for inter-macro routing and the first layer for intra-macro routing. In practice, the hard definition

of which layer of metalization is restricted to which operation can be blurred.

Channelless architecture

Channelless architectures have been developed to avoid some of the limitations imposed

by a restricted number of routing tracks.

The Q24000 sea-of-gates and Q20000 sea-of-cells (Channelless) architectures use three

layers of metal. The components within a macro are interconnected on one level and interconnect between macros occurs on the other two, the specific layers being array and series dependent.



For the Q20000 Series arrays, the internal macro connects (intraconnect) are on second

and third metal with macro and I/O interconnects on the first layer. Routing on all three

layers is possible and four layers of metal is a future possibility.


Netlist

The combination of the macro layout patterns (component interconnect) and the macro interconnect forms the metalization pattern required to implement the circuit on a given

array. This pattern is described in a netlist.

Each workstation produces a netlist in its own format, carrying along whatever information the workstation vendor has decided was necessary. There is no standard workstation or simulator netlist format although efforts are directed toward that goal (see EDIF) and some success has been recently attained.

Parametric information that is included in the netlist is array and array-vendor dependent.

A library such as the Q20000 is shipped to customers with a Macro Parameter File, which supplies the parameters for each macro in the library. These parameters are

included in the netlist for each occurrence of each macro used in the design.

Example: The AMCC netlist

To accommodate transfer of designs from any workstation or from any of the supported netlisters (Laser 6 and Verilog) to the mainframe-based place and route system, netlist conversion is performed, where the workstation netlist is translated into a standard interface format. AMCC refers to this as AGIF - AMCC generic interface format. A different conversion program is required for each workstation or simulator that AMCC supports.

The standardized netlist is named circuit.sdi. This netlist is used as input to the AMCC

MacroMatrix software as listed in Table 3-11.

Table 3-11 AMCC MacroMatrix and Design Support

Software - using circuit.sdi

MacroMatrix AMCCERC rules check

MacroMatrix AMCCPACKAGE (Package Check and Data)
MacroMatrix AMCCANN annotation

MacroMatrix AMCCSIMFMT simulation file formatter

MacroMatrix AMCCVRC vector check

MacroMatrix AMCCSUBMIT submission check

AMCCAD place and route system



Test vector transfer software

Verilog simulator

Interface options - I/O modes

Interface combinations required for the design should be compared to those offered by

the arrays under evaluation. The power supply and the interface combination define the

I/O mode of the array. Not all arrays support all possible I/O modes with all possible

power-supply combinations.

Interface types

Once it is seen that the interface mix can be supported on an array series, the type of TTL and ECL outputs that will be required is used to help size the I/O requirements of the

array.

Example: AMCC interface options

For all AMCC arrays, TTL and ECL translators are included in the I, O, or I/O cells for external interfacing to both ECL and TTL. Each I, O, or I/O cell can be configured to be

either TTL, ECL 10KH, ECL 10K, ECL 100K or as a power or ground pad. I/O cells can

usually be used for input macros, output macros or bidirectional macros. Table 3-12

shows the possible I/O combinations allowed on AMCC arrays while Table 3-13 details

the TTL output options and Table 3-14 the ECL output options.

Table 3-12 AMCC Interface Combinations

IF INPUT IS OF TYPE: OUTPUT CAN BE ANY OF:

TTL ECL 10K ECL 100K TTL ECL 10K ECL 100K

X X X X

X X X X

X X X X

X X X X X

X X X X X

Table 3-13 TTL OUTPUT OPTIONS

standard TTL

open-collector

three-state or 3-state also called tri-state

standard TTL output bidirectional

open-collector output bidirectional



The 3-state outputs and TTL bidirectional macros have an enable pin that is either restricted to being driven by a specific macro type (a 3-state enable driver) or unrestricted and driveable by any internal-level signal. The restriction depends on the array and on the mode (100% TTL or Mixed ECL/TTL) of the circuit.

Table 3-14 ECL Output Options

Output Options

ECL 10K, 50 ohm termination




ECL 10K, 50 ohm termination bidirectional

ECL 100K, 50 ohm termination bidirectional

CML outputs (> 600MHz), ECL's version of an open-collector

On-chip 50 ohm series termination ECL 10K




Darlington ECL 10K, 50 ohm termination




Darlington ECL Hi-Z 10K

Darlington ECL Hi-Z 100K

Darlington On-chip 50 ohm series termination ECL 10K




The types from CML onward in the above list are those identified as possible for the Q20000 Series. Standard ECL 10K, 100K, CML and Darlington outputs were in the first release of the macro library for the series.



Power supply options

In addition to the types of interface required, the power supply or power supplies to be used should be compared to the supplies allowed for the array. The supplies, the number

of fixed power and ground pads and their locations should be reviewed for their

applicability to the design in question.

There is often a need to have an array interface with several types of I/O while keeping power supply requirements in line with what is already provided on the target PCB

(printed circuit board). This can lead to operation of a technology with non-standard

voltages.

Effects on Parametrics

When non-standard voltages are used, such as -4.5V with ECL 10K and -5.2V with ECL 100K, the DC parametrics for the array will be affected. The data sheet for the array series will call out the parametrics for standard supplies.

The vendor must be consulted for computational procedures to be used

when non-standard power supplies are used.

Example - AMCC Arrays - Power Supply Options

The power-supply and interface type matrix for the AMCC arrays shows a very flexible approach to solving interface problems. Many of the AMCC arrays can be used with a single power supply (+5V) or dual supplies (+5V/-5.2V or +5V/-4.5V) as shown in Table 3-15.

The Q5000 and Q20000 Series arrays are bipolar arrays. They use an internal ECL core

(0.5V ECL) and can externally interface to either Schottky TTL, ECL 10K or to ECL 100K. AMCC arrays allow for the mixed mode operation of ECL/TTL on the same array,

either ECL 10K/TTL or ECL 100K/TTL or all three. Only one type of ECL may be used

for input on a single array. Both ECL types may be used for output on the same array.
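The input rule stated above is easy to check mechanically before submitting a design. The following is a minimal sketch, with a hypothetical function name and message text. It treats only ECL 10K and ECL 100K as the ECL types in question and places no restriction on outputs, as the text describes.

```python
ECL_TYPES = {"ECL 10K", "ECL 100K"}


def check_ecl_inputs(input_types):
    """Return a list of violations of the one-ECL-type-on-inputs rule."""
    ecl_used = sorted(set(input_types) & ECL_TYPES)
    if len(ecl_used) > 1:
        return ["inputs mix " + " and ".join(ecl_used)
                + "; only one ECL type is allowed on inputs"]
    return []


# TTL with one ECL type is a legal mixed-mode input set.
print(check_ecl_inputs(["TTL", "ECL 10K"]))
# Two ECL types on inputs is flagged.
print(check_ecl_inputs(["ECL 10K", "ECL 100K", "TTL"]))
```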

Table 3-15 AMCC Power Supply Options

SINGLE POWER SUPPLY DUAL POWER SUPPLY

I/O MODE +5V -5.2V -4.5V +5V/-5.2V +5V/-4.5V

100% TTL x

100% ECL 10K x x x x x

100% ECL 100K x x x x x

ECL10K/TTL x x x

ECL100K/TTL x x x



100% ECL run with dual power supplies is called "DECL".

Table 3-15, with the exception of "DECL", also applies to the Q24000 Series BiCMOS

arrays. They have a CMOS core and bipolar I/O and they can interface to CMOS.

The concept of mixed ECL-TTL interface on a single array originated as a result of customer demand. The idea of operating ECL 10K at ECL 100K power supplies and vice versa was also the result of customer requests.

Example - communicating to the software

AMCC uses dummy macros called chip macros that allow a user to specify precisely what array is to be used in what interface mode with what power supplies. (See Figure 3-4.) The chip macro communicates parameters to the AMCC MacroMatrix software modules that are performing design validation, including population and cell type limit checks. The array-specific checks use chip macro parameters to set limits for TTL outputs, Darlington outputs, simultaneously switching outputs, bidirectional macro counts, and other checks.

The AMCCERC software can spot mismatched interface macros and exceeded macro

type limits and issue appropriate error messages. It can also adjust the DC power module

to use the correct power supply in the power computation.
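To make the idea of array-specific limit checks concrete, here is a hypothetical sketch of one such check. It is not the AMCCERC implementation, and the names and message format are invented. The limits are the ECL and TTL output limits listed for the non-PLL Q20000 Series arrays in Table 3-10, and the array name stands in for the parameter a chip macro would supply.

```python
# ECL and TTL output limits for the non-PLL Q20000 Series arrays (Table 3-10).
OUTPUT_LIMITS = {
    "Q20120": {"ECL": 172, "TTL": 100},
    "Q20080": {"ECL": 130, "TTL": 80},
    "Q20045": {"ECL": 100, "TTL": 64},
    "Q20025": {"ECL": 80, "TTL": 48},
    "Q20010": {"ECL": 50, "TTL": 24},
}


def check_output_limits(array_name, output_counts):
    """Return one error message per output type that exceeds its limit."""
    limits = OUTPUT_LIMITS[array_name]
    return [
        f"{array_name}: {kind} outputs {count} exceed limit {limits[kind]}"
        for kind, count in output_counts.items()
        if count > limits[kind]
    ]


# Hypothetical design: 40 ECL and 30 TTL outputs targeted at the Q20010.
for message in check_output_limits("Q20010", {"ECL": 40, "TTL": 30}):
    print(message)
```

A real rules checker would apply the same pattern to Darlington outputs, simultaneously switching outputs and bidirectional macro counts, using limits from the design manual.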

Figure 3-4 A Chip Macro (1994)

Interface cell functionality

Interface cells are designed to support TTL-translators, ECL-translators and most of the

required buffers for external interfacing to both ECL and TTL. The amount of buffering, the capability of the cell to support high fan-out drivers, single-cell bidirectional macros,

ECL output terminations to 25 or 50 ohms, and elementary logic possible in an interface

cell varies by array series.

For many of the arrays, the input macros also provide simple AND/NAND or OR/NOR

logic or high fan-out driver operations. The output macros for TTL contain OR or NOR

operations and those for ECL may contain these operations plus others as complex as a

latch or a 2:1 MUX. This is in addition to level translation and buffering functions. The amount of logic contained within an interface macro is series-dependent; it is a function

of the I/O cell complexity and the components available within the cell.



Variability in I/O design

The various array series and even arrays within a series differ in their approach to interface. The following gives an idea of the choices that have existed on the arrays from

one vendor. Similar variability and evolution can be traced for other vendors.

The Q700 Series used unbuffered I (input-only) and I/O (input, output or bidirectional) cells that required a buffer for each input and each output macro. The buffer macros were placed on internal cells (L or B), reducing the L-cells available for internal logic functions. There was a D-cell on one array in the series to provide a pin-restrictive three-state enable driver that could drive more than eight loads. A bidirectional macro was composed of one interface and two buffer cells.

The Q1500A array used I (input-only) and O (output-only) cells, with buffering

either in the input or output macro or in a separate macro. The BExx macros were for ECL output buffering, for example, and were placed on a B cell. TTL input

buffers were part of the input macro that was placed on an I cell. Bidirectionals

were constructed from two macros on two adjacent cells using the same methods now used on the Q14000 Series arrays.

The QH1500A array used I and I/O cells, with buffering included in the input, output, and bidirectional macros. This was the first time all buffers were removed from the

internal cell area. The I/O cell could support single-cell bidirectional macros.

The Q3500 and Q5000 Series use I/O cells only, with buffering included in the

input, output and bidirectional macros.

The Q1500, Q3500 and Q5000 Series also provide unbuffered ECL input and the

buffered logic macros to support it. The BIxx series macros are made up of

representative logical functions from the rest of the macro library (gates, EXOR networks, latches, flip/flops, MUXs and decoders) and also include the ECL

input buffering function on selected input pins. The BIxx macros are placed on internal cells (L or B). The selected pins are pin-restricted to be driven by any

unbuffered ECL input macro.

The unbuffered ECL input macro does not suffer any degradation in speed due to

loading delay, the only macro to behave in this manner. It can drive eight loads.

Load capacitance presented to the source driving the unbuffered ECL input increases by 1 pF per fan-out.

The Q14000 Series uses I/O cells, with buffering and logic as is used in the Q5000 Series. Single-cell bidirectional macros can only be used on the Q9100B or

Q2100B and then only in specific "special-I/O" cell locations. Additional

bidirectional macros must be built from one input and one output macro.

The Q20000 Series uses I/O cells, with buffering but no logic functions. TTL

outputs (output macros and bidirectional macros) are limited to a number that

varies per array. ECL outputs are also limited. The bidirectional macros use two cells and provide an added ground pad by using the left-over pad.



Most 25 ohm termination macros require two I/O cells. The Q20000 Series

provides a single-cell 25 ohm termination macro but limits its use to arrays using two power supplies. Darlington macros are limited to arrays with two power

supplies.

The Q20000 Series uses four fixed I/O signals per array. These signals are used

by the on-chip thermal diode (one anode and one cathode) and the on-chip AC

speed monitor (one is power and the other is an output signal). These four pads and cells are not available for use with any other function.

Bidirectional macros

Bidirectional macros can be two-pin, one-pin, one-cell or two-cell macros. If an array series has no bidirectional macros, they may need to be constructed. Watch out for

incompatibility with the workstations - a work-around may be required for proper

simulation of bidirectional macros.

If more bidirectional macros are needed, they are constructed from two macros, one input

and one output, and placed on two adjacent I/O cells. The two macros can be tied together into one package pin, but this requires two test vector sets, one for wafer sort and

one for packaged part testing. They are usually tied together outside of the package to

keep testing simplified, but this requires two package pins.

A third approach not liked by the array vendors is to stitch two macros together in the

interconnect so that only one pad and one pin are used. Anytime that hand-edits or

customization of the interconnect or base is involved, both time and money are required,

and debugging time may need to be increased.

Examples

The Q20000 Series arrays support a bidirectional macro that sits on two I/O cells, unlike the single-cell approach of the Q5000 Series. In this case, the internal macro routing

eliminates the need for two sets of test vectors or an extra bonded-out pad.

Each bidirectional macro also contains either an IEVCC pad (ECL VCC) or an ITGND

(TTL GROUND) pad. (Refer to "Added power and grounds" for a discussion of

pad-plane interconnections for added power and ground pads.)

Internal Cell Functionality

The logic (bipolar) and basic (BiCMOS) cells are organized to provide logic functions such as basic logic gates and buffers, high-fan-out drivers, EXOR and EXNOR

networks, gate networks, multiplexors, decoders, latches, and flip/flops.

These cells can support a 3:1 MUX-D flip/flop combination, triple latch-common clock,

triple 2:1 MUX-common select and dual D F/Fs. As stated before (see "Cell structure"),

the number of cells required for any of these functions will vary by array series.



The number of cells required to implement a function depends on the component mix

present in the cell and that required by the function. Arrays are designed for a specific set

of applications or targets and base array design is optimized for those applications.

An array cell size may be divisible so that half-cell macros are possible, which also

allows sizes such as 4.5 cells. A cell may be designated as the smallest divisible or

addressable unit (SAU), in which case a one cell macro is the smallest macro allowed.

Multi-Cell Macros

Groups of internal and/or interface cells can also be combined into large multi-cell

macros for higher functionality. The larger multi-cell macros, named MSI macros by

AMCC, interconnect components spread across several cells more efficiently than the schematic interconnection of the equivalent function formed from basic macros. The

result is a denser functionality with the resultant speed improvement.

Design density, measured by the cell utilization per functionality, can be increased by 20-

40% while reducing design partitioning and macro conversion efforts. The large MSI

macros include MSI and LSI functions.

Example MSI macros are 6-bit comparators, 4-bit carry-look-ahead adders and their

companion carry-look-ahead generator, 4-bit up and down counters, 4-bit registers and

8-bit latches. Different array series offer different MSI macros.

The simple and MSI macros available with a specific array series are documented, along

with any use or placement restrictions, in the appropriate Design Guide or Design Manual. Always refer to the latest version of these manuals when performing an

evaluation.

Hard and soft macros

There are two types of MSI or multi-cell macros. One type is hard, where the cell

interconnect is treated as one large macro and no variations in layout are permitted. The other type is soft, where the cells composing the macro have a preferred, specified-to

layout pattern but which requires the interconnect to be routed as if it were any other

interconnect net.

The MSI macros in the Q5000 Series were originally designed to allow placement in

several different configurations to facilitate the auto-place algorithm (best-fit approach),

while closely maintaining the specified performance for the macro. This is a soft-macro.

A preferred placement is documented.

The soft MSI macro approach is unattractive for two reasons. Improper placement

invalidates the timing specifications and, therefore, the simulation model. In addition,

the internal nets must be weighted and prioritized for the router so that the

interconnect delay can be kept minimal.

Both the BiCMOS Q14000 and Q20000 Series MSI macros use hard-macros, where an

MSI macro is laid out as a single multi-cell unit and handled by the placement software



as an inflexible black box. Hard-macros facilitate automated placement. Future AMCC

arrays will use the hard macro approach. Figure 3-5 shows an MSI-based 16-bit adder.

Figure 3-5 16-Bit MSI Adder (1994)

Refining Interface Requirements

When the interface types and their power supply requirements are documented and one or

more arrays chosen as candidates for the final selection, the interface requirements must be refined. There are several conditions under which additional power or ground pads

will need to be added to an array beyond the fixed power and ground pads provided.

These include:

simultaneously switching outputs,

package restrictions,

high-speed signal isolation and

ECL - TTL isolation.

Simultaneous switching of TTL or ECL outputs is a potential source of system noise, which

can be reduced by the addition of TTL VCC - TTL Ground pairs and/or ECL VCC.



Some arrays require that drivers be placed next to ground. Others require that a ground

exist between simultaneously switching TTL outputs and ECL inputs, or between any TTL output and an ECL input. Isolation of CMOS inputs from the faster switching TTL

and ECL signals may also be required. When a fixed ground is not available, then one

must be added. The design rules for any array series are called out in the Design Manual

for the array.

Variable Requirements for Power and Ground

Bipolar arrays require that all fixed power and ground be used or bonded out to the

package. Additional power and grounds are based on simultaneously switching outputs

or isolation requirements.

CMOS arrays have some or all of their fixed power and ground pads under

user-placement control. The vendor provides a list of how many must be used

depending on the signals used by the design. This type of flexibility is detrimental to standard packaging; it is time consuming and expensive.

In spite of the drawbacks, recent BiCMOS designs have returned to this approach, providing the minimal number of power and grounds and allowing other fixed-position

power and grounds to go unbonded (unconnected). The criteria for requiring that these fixed positions be used or that additional power and grounds be added are based on the

number and types of interface macros used.

When the power busses supporting the internal core are isolated from the busses supporting the peripheral I, O or I/O cells, noise feedback due to output switching is

minimized. The threshold and reference voltage generators for the logic array internal

cells and I, O and I/O cells should also be independent to ensure steady operation.

Adding Extra Power and Ground pads

Adding a power pad or a ground pad to an array can be accomplished by placing a power

or ground macro on the desired pad (an array-specific procedure). AMCC arrays use the ITPWR (+5V), ITGND (0V) and IEVCC (ECL VCC) macros to add power or ground.

(See Figure 3-6.) For standard reference ECL, IEVCC represents a ground pad. For +5V

REF ECL, IEVCC represents a power pad.

Figure 3-6 Added power and Ground macros (AMCC)

Dual-Function I/O Macros

Each added power and ground macro uses a pad and disables the cell that is associated

with that pad, reducing the number of these cells and pads available for I/O operations.

To offset this waste, many macro libraries include dual-function macros that use the I/O

cell for one function and the pad for added ground.

Silicon efficiency can be achieved with the dual function macros. The macros available

are array series-specific and vary widely. If any of these functions applies to the design,

they can reduce silicon requirements while maintaining functionality. (See Figure 3-7.)



Example macros include:

input function with 3-state enable driver

3-state enable driver with added ground

bidirectional input with added ground

Figure 3-7 Example Dual-Function I/O Macro

Example - Simultaneously Switching Outputs

All AMCC arrays, with the exception of the Q20000 Bipolar Series and the BiCMOS

Q24008 array, use the following rules for adding power and ground due to simultaneously switching outputs (SSO), called an output group.

Allow 8 TTL SSO outputs per quadrant, then add one TTLPWR and one TTLGND

macro for each group of 1-8 after the first eight. This requires two cells, two pads and,

depending on the package, two package pins. Add another pair for the next group of 1-8 and another for the next group of 1-8 and so on. All TTL output counts are converted to

"equivalent" 8 mA outputs. (See Table 3-16.)

For packages with internal power and ground planes, place the TTLPWR and TTLGND macros so that they are interspersed with the simultaneously switching outputs and can be

bonded to the power or ground package plane.

Table 3-16 Sample Rules for Adding TTL Power and Ground

PER TTL SSO    ADD TTLPWR, TTLGND PAIRS
0-8            do nothing
9-16           add 1 pair
17-24          add 2 pairs
Etc.
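
The TTL rule above amounts to a ceiling division on the outputs beyond the first eight. The following Python sketch is illustrative only: the function name added_ttl_pairs and its group_size parameter are assumptions made for this example, and the count passed in is assumed to be the per-quadrant total already converted to "equivalent" 8 mA outputs.

```python
def added_ttl_pairs(equivalent_sso: int, group_size: int = 8) -> int:
    """Return the number of TTLPWR/TTLGND pairs to add for one quadrant.

    equivalent_sso is assumed to be the count of simultaneously switching
    TTL outputs, already converted to equivalent 8 mA outputs.
    """
    if equivalent_sso < 0:
        raise ValueError("output count cannot be negative")
    extra = max(0, equivalent_sso - group_size)  # outputs beyond the first group
    return -(-extra // group_size)               # ceiling division


for count in (8, 9, 16, 17, 24):
    pairs = added_ttl_pairs(count)
    # Each added pair costs two cells and two pads.
    print(f"{count:2d} SSO -> {pairs} pair(s), {2 * pairs} cells blocked")
```

Each pair returned by the function removes two I/O cells and two pads from the pool available for signals, so the result feeds directly into the interface cell count.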

Allow 8 ECL SSO outputs per quadrant, then add one ECLVCC macro for each group of

1-8 after the first eight. This requires one cell, one pad and, depending on the package, one package pin. Add another ECLVCC macro for the next group of 1-8 and another for the next

group and so on.

For packages with internal power and ground planes, place the ECLVCC macro so that it is interspersed with the simultaneously switching outputs and can be bonded to the power

or ground package plane as required. Note that ECLVCC is a power pad in a +5V



reference ECL circuit (5V REF ECL) and a ground pad in a standard reference ECL

circuit. (See Table 3-17.)

Table 3-17 Sample Rules for Adding ECL Power OR Ground

PER ECL SSO    ADD ECLVCC    Q20000 Rules
0-4            do nothing    do nothing
5-8            do nothing    add 1
9-12           add 1         add 2
13-16          add 1         add 3
17-20          add 2         add 4
21-24          add 2         add 5
Etc.           Etc.

The Q20000 Series requires one ECLVCC per additional 1-4 ECL SSO after the first

group of four. All output counts are converted to "equivalent" 50 ohm outputs. The

extremely high speeds of these arrays require design procedures to ensure minimal noise.
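
The two ECL rules differ only in the group size: eight outputs per group for most series and four for the Q20000 Series. A minimal Python sketch follows; the function name added_eclvcc and the q20000 flag are hypothetical names chosen for this example, and the input is assumed to be the count already converted to "equivalent" 50 ohm outputs.

```python
def added_eclvcc(equivalent_sso: int, q20000: bool = False) -> int:
    """Return the number of ECLVCC macros to add for ECL SSO outputs.

    Assumes equivalent_sso is already expressed in equivalent 50 ohm outputs.
    The group size is 4 for the Q20000 Series rule and 8 otherwise.
    """
    if equivalent_sso < 0:
        raise ValueError("output count cannot be negative")
    group = 4 if q20000 else 8
    extra = max(0, equivalent_sso - group)  # outputs beyond the first group
    return -(-extra // group)               # ceiling division


for count in (4, 8, 12, 16, 20, 24):
    print(count, added_eclvcc(count), added_eclvcc(count, q20000=True))
```

Each added ECLVCC macro uses one cell and one pad, so the Q20000 rule roughly doubles the interface overhead for the same number of switching outputs.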

Thermal Diodes

As the arrays have become larger and dissipate more power, thermal characterization becomes an increasingly important issue. Some means of evaluating array junction

temperature must be developed for each array series.

For some of these series, macros have been developed that allow the designer to add one or more thermal diodes to the design. The macros are treated as any other macro and are

placed on interface cells.

Newer arrays, such as the Q20000 Series, have thermal diodes built into the base array.

The Q20000 Series arrays have a thermal diode structure embedded in the base and

brought out to dedicated or fixed pads. These pads must be brought out to package pins.

These pads are not accessible to any other macro function.

Example - AMCC thermal diodes (1994)

Thermal diode macros exist for the Q14000 and Q5000 Series libraries and the designer is required to add one thermal diode macro pair per circuit. Using more than one was

found to be unnecessary as the thermal gradient across the chips was found to be

insignificant. Where there might be doubt, additional thermal diode pairs can be added.

Each pair uses two I/O cells. (See Figure 3-8.) One earlier version of the implementation also used one internal cell. No differences were found to exist between these two

versions.

Thermal diode macros also exist for the Q20000 Series for those cases where a second

thermal diode measurement is felt to be necessary.



Figure 3-8 Thermal Diode Pair

The AMCC AC Speed Monitor

AC testing is a problem for both the designer and the vendor. To reduce the problems

associated with it, each Q20000 Series array has a built-in AC speed monitor with

two fixed pads assigned to it. These pads must be brought out to package pins.

Threshold generators - routable generators

The designer is not usually concerned with the threshold generators. In cases where they

are required, they may only need identification and routing connections rather than actual

cell placement.

VBB Reference voltages

There are some instances where VBB reference voltages are desired, where I/O utilization is high and the designer is using single-rail ECL where differential ECL is

required. These reference voltages are supplied with a macro and are placed on an

interface cell. They will connect to external package pins.

Speed and testing interface cell utilization

Maximum speed of operation and testing requirements will have an effect on the final

interface cell count. For very high speeds, differential ECL may be required by the array

vendor, doubling the cell and pad counts of those signals.

Testing may require that parts of the circuit are degated while other parts are being tested.

This will occur when a simultaneously switching group is very large, including the simultaneous enable-disable of three-state or bidirectional macros. Test-enables may be

required to partition the circuit for testing, and test enables will use cells and pads.

Population or cell type limits and utilization

Where population restrictions exist, circumvention of the limits may include the addition

of interface macros. For example, a single-cell bidirectional macro limit would result in

two-cell bidirectional macros being used for additional bidirectional signals. Similarly, if dual power supplies are not available, the restriction on the single-cell 25 ohm ECL termination would result in

two-cell 25 ohm terminations being used.



Placement restrictions

High-frequency signals in particular will often require placement in specific cell locations and require that these macros be isolated with added grounds. Added grounds use pads

and disable the accompanying cell.

Where placement restrictions require the addition of macros or a change in the macros

selected, the effects on cell utilization must be anticipated in the initial estimate.

Final Interface Cell Utilization

The final interface cell count for the circuit in its estimated stage should look at all the

factors that could increase interface cell requirements. The interface cell utilization for a

non-captured circuit should be less than 100% if possible to allow for adjustments and expansion. If this is not possible, then the rest of the design must be completed using I/O

cell utilization minimization as a priority design objective.

In the ideal situation, an array chosen for a design should be somewhere in the middle of an array series. This is to provide a smaller option if I/O minimization can reduce the

requirements and to provide a larger option should the interface requirements grow out of

the original selection.

If not, then the interface utilization should be no more than 90% during development,

with no more than 100% interface utilization for the final design.

Interface Cell Utilization (general)

To find interface cell utilization, add the items in the list in Table 3-18.

Table 3-18 Interface Cell Utilization

Interface Cell Utilization

cells for input signals

cells for output signals

cells for bidirectional signals

cells for thermal diodes (I/O)

cells for AC speed monitor (I/O)

cells for reference generators

cells blocked by added power pads

cells blocked by added ground pads

cells dedicated to fixed I/O signals

Divide this sum by the number of interface cells available on the array of choice.



Interface cell utilization = (number of interface cells used by the circuit) / (number

of interface cells available on the array)
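
The sum-and-divide procedure of Table 3-18 can be sketched in a few lines of Python. The cell counts and the 100-cell array below are invented purely for illustration and do not describe any real array; the function name interface_cell_utilization is likewise an assumption of this example.

```python
from typing import Mapping


def interface_cell_utilization(cells_used: Mapping[str, int],
                               cells_available: int) -> float:
    """Sum the Table 3-18 items and divide by the interface cells available."""
    if cells_available <= 0:
        raise ValueError("array must have at least one interface cell")
    return sum(cells_used.values()) / cells_available


# Hypothetical estimate for a hypothetical array with 100 interface cells.
estimate = {
    "input signals": 40,
    "output signals": 30,
    "bidirectional signals": 8,
    "thermal diodes": 2,
    "AC speed monitor": 2,
    "reference generators": 1,
    "blocked by added power pads": 3,
    "blocked by added ground pads": 4,
    "fixed I/O signals": 0,
}

utilization = interface_cell_utilization(estimate, cells_available=100)
print(f"interface cell utilization: {utilization:.0%}")
print("within 90% development guideline:", utilization <= 0.90)
```

Keeping the items in a named mapping makes it easy to see which category grows when the estimate is revised, for example when added grounds are needed for isolation.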

Example - BiCMOS Cell/PAD Utilization

When an array does not have a one-to-one ratio of I/O cells and pads, then PAD

utilization may also be required. The Q24008 and Q24280 arrays have 2-cell-1-pad

structures. Certain macros placed on these structures are very efficient, others are not. Depending on the macros used, single-cell or multi-cell, either pads or cells may be

rendered inaccessible. These arrays have a complex algorithm available to allow sizing.

The algorithm requires a check on both cells and pads.

PAD utilization = (number of PADS used by the circuit) / (number of PADS

available on the array)

Fan-out load limits

Internal cell usage will depend on the macros required to implement the desired

functions. Refinements to that estimate come when the fan-out load limits, hook-up and pin restrictions for those macros are evaluated. If an interface macro is driving too many

loads, internal macro buffers will be needed to divide that load or additional interface macros will be needed. If internal macros are driving too many loads, the same approach

is used. These buffer trees use cells and current.
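
A rough feel for the cost of a buffer tree can be had from the sketch below. It rests on simplifying assumptions that are not taken from any design manual: every buffer presents exactly one load to its driver, every buffer has the same fan-out load limit as the source, and each buffer is loaded as fully as allowed. The function name buffer_tree_levels is hypothetical.

```python
from typing import List


def buffer_tree_levels(loads: int, fanout_limit: int) -> List[int]:
    """Return the number of buffers per level, from the load side back to the source.

    Assumes each buffer counts as one load and shares the source's fan-out limit.
    An empty list means the source can drive the loads directly.
    """
    if fanout_limit < 2:
        raise ValueError("fan-out limit must be at least 2 to build a tree")
    levels = []
    to_drive = loads
    while to_drive > fanout_limit:
        buffers = -(-to_drive // fanout_limit)  # ceiling division
        levels.append(buffers)
        to_drive = buffers                      # the next level up drives these buffers
    return levels


print(buffer_tree_levels(8, 9))    # no buffers needed
print(buffer_tree_levels(32, 9))   # one level of buffers
print(buffer_tree_levels(100, 9))  # two levels of buffers
```

The total of the returned list is the number of extra macros, and therefore the extra cells and current, that the load division adds to the estimate. A real tree would also be checked against the derated limits for clock and other sensitive paths.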

Macros will be specified with both fan-in and fan-out load limits. The fan-in numbers

represent the load that the macro presents to the macro driving it. The fan-out limit is the number of loads that the macro can safely drive before signal degradation becomes a

predominant factor. A load unit can be considered to be equivalent to one pico-farad.

Check with the array vendor for their definition.

Derated fan-out load limits

Clock paths, distortion-sensitive and high-speed paths should be designed with a derated

fan-out load limit, i.e., with macros operating well below their specified limits. The array

may be specified with a guideline as to the frequency - derating schedule.

Each AMCC array series is different in the value of the breakpoint frequency but each

has the same basic rule. For sensitive and clock paths, derate the fan-out load limit by

20% up to the breakpoint and 40% at or above the breakpoint frequency.

Example - fan-out derating

For the Q20000 Series, all internal macros have the Turbo speed enhancement allowing a

fan-out load limit of 18 loads. The TTL input and bidirectional input macros are the only interface macros that do not have this Turbo enhancement and their fan-out load limit is 9

loads.

Assume that the breakpoint frequency is currently set at 400MHz.



For an ECL input toggling at 500MHz, the derated fan-out load limit would be:

(1.0 - 0.4) * 18 = 10 (truncated)
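
The derating rule (20% below the breakpoint, 40% at or above it, result truncated) can be written as a small helper. This is a sketch only: the function name derated_fanout is an assumption, and integer arithmetic is used so that the truncation does not depend on floating-point rounding.

```python
def derated_fanout(limit: int, freq_mhz: float, breakpoint_mhz: float) -> int:
    """Return the derated fan-out load limit for a clock or sensitive path.

    Derates by 20% below the breakpoint frequency and by 40% at or above it,
    truncating the result to a whole number of loads.
    """
    derate_percent = 40 if freq_mhz >= breakpoint_mhz else 20
    return limit * (100 - derate_percent) // 100


# Q20000 example from the text: limit of 18 loads, 400MHz breakpoint.
print(derated_fanout(18, 500, 400))  # 10
print(derated_fanout(18, 300, 400))  # 14
print(derated_fanout(9, 500, 400))   # 5
```

The last two calls apply the same rule to a path below the breakpoint and to a macro with the non-Turbo limit of 9 loads.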

Drivers

Special driver macros may be provided in a library. These "super-drivers" are not derated. They are designed to provide a clean edge even when loaded to their rated limit. These

drivers will use more current and more cells than the non-driver but fewer of them are required to drive the same load. The result may be the same cell utilization and the same

power.

Another feature of drivers should be considered. When timing analysis is performed, the

super-drivers and drivers will be seen to have a lower k-factor (drive factor) than the non-driver macros, resulting in lower inter-macro delays for the same load than a non-driver

macro could provide.

Drivers may be interface macros or internal macros.

Hook-up or interconnect restrictions

Hook-up is used here to define the rules on grounding an input pin to a macro. CMOS

and BiCMOS technologies require that all unused macro input pins (non-primary array

inputs) be clipped to VDD or VSS, no exceptions.

Bipolar technologies allow the unused input pins to be tied to global ground. The ground symbol on the schematic is for human comprehension and to allow checking software to

understand that the designer meant to leave the pin unattached.

For some arrays, a macro input pin connected to global ground on a schematic will mean

that the pin "floats", or is unattached to anything when silicon is built. For others, these pins are physically attached to a confirmed logical low by connecting to a rail (CMOS) or

by strapping the base to the emitter (bipolar) through conditional geometry.

For the Q20000 Series these pins are base input to transistors and when unused are tied to

the emitter to ensure a logical low. For the Q5000 Series, the pins were allowed to float.

Whether or not the pins are allowed to float, there will be cases where specific macro

pins are restricted, i.e., these pins cannot be attached to global ground but must be driven

low by another macro. This is a hook-up restriction.

When hook-up restrictions exist, some macro must be added to the schematic to drive these pins low (or high). The number added will depend on the number of loads that must

be driven low or high.

Pin restrictions - interconnect restrictions

Some macros are pin-restricted in that they may not be freely connected to any other

macro but must be driven by or drive a specific class of macro. As an example, TTL

three-state outputs and TTL bidirectional macros in some macro libraries must have their



enable pins driven by a macro known as a three-state enable driver. No other macro may

drive that enable pin. The three-state enable drivers can only be connected to drive these

specific pins; they may not be used to drive other macros.

In the Q5000 library, three-state enable drivers may only be placed on interface I/O cells,

even when they are driven by internal signals, leaving the pad unused in this case.

When pin-restrictions cause the use of specific macros and these macros have restricted placements, the impact on cell utilization must be considered.

Internal cell utilization

When the paths have all been checked for fan-out, pin restrictions, hook-up restrictions,

placement rules, etc., the internal cell utilization can be estimated. As stated in Chapter 2, this is the sum of all the internal cells used divided by the number of internal cells

available.

Internal cell utilization = (number of internal cells used by the circuit) / (number of

internal cells available on the array)

Further changes

Other factors that can change the estimated cell utilization include adjustments made for

power reduction, for speed enhancement, or for cell utilization reduction for either

interface or internal cells.

Exercises

1. Select a semi-custom array series (any).

List:

the processing technology

available power supply configurations

types of TTL input and outputs allowed

types of ECL input and output allowed

how bidirectional macros are handled

2. For the selected series, what cell usage restrictions exist?

a. Any limits on inputs

b. Any limits on outputs

o TTL

o ECL

c. Any limits on bidirectionals

d. Any rules for simultaneously switching outputs

Are the rules easy to find?

3. For the selected series, how many fixed power and ground pads are on each array in

the series? How are additional power and ground pads added?



4. For the selected series, what types of cells are available on each array and how many

of each type?

5. How many internal cells would be required by the selected array series macros to

implement an 8-bit barrel shift register (8 2:1 MUXs with 8 4:1 MUXs, 8 D flip/flops)?

6. Given a 16-bit fast adder design using carry-look ahead, 16 DATAA and 16 DATAB

inputs, necessary controls (clock, reset, carry-in), a registered output, 17 outputs (sum plus carry out), size the design for the macro library for the selected array series. Assume a COMMERCIAL environment, single -5.2V power supply, ECL is ECL 10K or ECL

10KH.

Fast adder: four 4-bit fast adders with carry-propagate outputs; one 4-bit carry-look ahead

unit; 17 D flip/flops; 35 ECL inputs; 17 outputs; buffers and gates as required; added

power/ground as required.

7. Given a 32-bit register, 35 ECL inputs (32 data, clock, reset, 3-state enable), dual

ECL-TTL outputs (32 TTL 3-state and 32 ECL, same signals), size the design for the selected

array series. Assume a MILITARY environment, dual-power supplies of +5V and -5.2V, ECL is ECL 10K or ECL 10KH.

Register: 32 D flip/flops, 35 ECL inputs; 64 ECL outputs; buffers and gates as required;

added power and ground as required



Chapter 3 Appendix: Case Study in Sizing a Design

o Target Array: AMCC Q20080 (1994 Data)

o Solution - Q20000

o Selecting a Flip/Flop - First Pass

o Selecting the ECL Output

o Clock Input

o 16:1 MUX

o Parity Tree

o Review Status so far

o Exercise

o Simultaneously Switching Outputs

o Fanout Loads

Select Lines for 16:1 MUX

Reset Loading

Clock

Static Driver

o Review of Size - Second Pass

o Package Size

o Problems

o Alternative Solution

o Power

o Further Thought

o The Schematics

TARGET ARRAY: AMCC's Q20080 {Based on 1994 data}

The following exercise is not intended as a practical circuit for actual construction on an array; however, this exercise will examine nearly every design rule and restriction for the

example array series. It will be solved here using a Q20080 array as the intended target solution but could be solved with any macro library provided one of the supported arrays in that series can accommodate approx. 160 I/O signals and toggle at 500MHz. See

Figure A-1.



Figure A-1 Sample Classroom Exercise

THE DESIGN

Using the following list of requirements, design a circuit using AMCC macros for the

Q20000 Series and size the design to fit the Q20080 array in that series:

A pipelined structure two flip/flops deep is to be 32 bits wide.

Each data input to the first flip/flop stage is to be driven by a 2:1 MUX, the inputs of which are driven by ECL 10KH inputs.

All flip/flops are required to be reset by way of a master reset signal.

The common clock is to be a differential signal, if possible.

All 32 multiplexors are to have a common select.

The target maximum speed of operation is 500MHz.

(Design Objective.)



All dataA inputs (32 of them) are to be fed in groups of four into two 16:1

multiplexors. There are four common select lines for the two 16:1 multiplexors

and two outputs, controlled by enables (one per signal).

All input signals, data and controls, are to be fed into a parity tree, a gate tree that will

produce a single output. This structure is to be used for parametric testing.

A six-bit pass-through bus (input to output without logic) is included which uses ECL inputs and outputs.

The flip/flop output stage is connected to non-Darlington ECL 10KH outputs. Both

true and complementary outputs are to be brought out to external pins.

This is a military, standard reference ECL -5.2V single-supply circuit.

Note: Keep your data. This problem or a similar one will be referred to in other chapters.

Exercise

Review the selected design manual, select macros and compute cell utilization. Pick an

array from the series that would fit the design. Perform all required population checking

for that series.

SOLUTION - Q20000

Check for I/O mode and power supply.

This is a 100% ECL circuit and uses no Darlingtons so that a single -5.2V supply is

allowed.

The AMCC chip macro is Q20080ECL10K, which sets the I/O mode at 100% ECL with ECL 10K/KH inputs. The power supply parameter is set at STD5 for standard reference

-5.2V supply. The product grade parameter is set at MIL for military. Between them, these

parameters define this circuit as a MIL5 circuit, using the MIL5 library and annotation

data. The chip macro is shown in Figure A-2.

Figure A-2 AMCC Icon for the Chip Macro



Selecting a flip/flop - first pass

The need for a master reset will reduce the set of available flip/flop macros that could be

used to those with a synchronous or asynchronous reset (or set). The use of a 2:1

MUX-flip/flop combination will further reduce the choices for the first stage of the circuit.

For the chosen Q20000 macro library, FF46S is a D flip/flop with a 2:1 MUX on the data

input and an asynchronous reset. It is more silicon-efficient to use a combination MUX-

F/F macro than to implement the design with individual multiplexor and flip/flop macros.

The second stage flip/flop needs a reset and at this stage in the design process needs both

Q and QN outputs. FF10S was chosen as the appropriate macro. See Figure A-3.

Figure A-3 MUX and two F/Fs in Two Macros



Selecting the ECL input

All inputs (reset, selects, output enables and data) except the clock will use the IE93S, a simple buffered input that produces both Y and YN outputs shown in Figure A-4. The

YN output will be used to input to the gate tree to keep loading off the Y path. To reduce

power, the IE94 version with only the Y output could have been chosen. This option would use three loads on the Y path, two to the main circuit (register input and 16:1

MUX input) and one to the parametric tree.

Figure A-4 Output Macro with Complementary Outputs

For this circuit, the saving of one load is not significant in that the loads are not in the

critical path. In another instance, the reduction of one load could be the difference

between meeting or failing specification.

There are 64 data inputs (32 dataA and 32 dataB), one select for the input 2:1 MUX and four selects for the 16:1 MUX controls. Together with the reset, output enable and pass-through inputs, Table A-1 gives a total of 78 IE93S macros. Each macro uses one I/O cell and one pad. (See Table A-1.)

Table A-1 Required IE93S Inputs

32 data A

32 data B

1 reset control

5 data MUX control select

2 output enables (MUX outputs)

6 pass-through inputs

78 IE93S inputs

Clock input

The clock input will use IE34H, a differential high-speed input with a Y and YN output.

For CML-compatible input, use IE31H. The clock will have two loads. It uses two I/O

cells and two pads. The clock is in the critical path.



Other options that could be considered include the use of the driver version of the differential input, IE32D. The driver handles 32 loads and has k-factors with less skew than those of the H-option IE34H. If the IE34H proves to be too slow or the inter-macro delays too long, the IE32D would be the choice for a speed upgrade. The driver is shown in Figure A-5.

Figure A-5 Differential Input Macro

ECL outputs - first pass

All outputs in the initial version of the circuit were the OE42S, a cut-off (ECL output with an enable) macro used with the enable tied low (always on) except for the two controlled outputs. (This macro was the only 50 ohm non-Darlington termination in the initial release of the library.) The (111) version of the library added OE11S, a NOR-input 50 ohm termination, rated for 350MHz. The other option is to have a custom 50 ohm macro created. That is not worth the effort for the case study but should be reviewed in a real circuit where power and cell space are at a premium.

The OE42S enable is tied low by way of the GT87D static driver, a macro that supplies

steady HIGH and LOW signals when unused macro pins cannot be "clipped" low or

allowed to float.

There will be 64 data outputs for the pipeline, six outputs for the pass-through signals, two MUX outputs and an output for the parametric gate tree for a total of 73 outputs. Each OE42S uses one I/O cell and one pad.

The fan-out load limit for the GT87D is 50 loads so two will be required to supply the OE42S enable pins in this first version of the design. The basic module is shown in Figure A-6.

Note that the OE11S is easier to use and uses less power - reasons to consider challenging

the initial solution.



Figure A-6 Macro Design Restrictions

16:1 MUX

The 16:1 MUX is constructed from five MX21S macros, each a 4:1 MUX with two selects. This is the largest multiplexor in the first release. Four of these will feed into the fifth to form the 16:1 MUX structure. Since there are two 16:1 MUXs, there will be 10 MX21S macros required. An 8:1 or 16:1 MUX MSI macro would simplify the design. The basic design is shown in Figure A-7.
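The two-level structure described above can be sketched behaviorally. This is only an illustrative model, not the MX21S macro itself: the function names are hypothetical, and the assignment of the two low-order selects to the first level and the two high-order selects to the fifth MUX is an assumption.

```python
def mux4(inputs, s1, s0):
    """Behavioral 4:1 multiplexer with two selects (stands in for one MX21S)."""
    assert len(inputs) == 4
    return inputs[(s1 << 1) | s0]


def mux16(inputs, select):
    """16:1 multiplexer built from five 4:1 multiplexers.

    select is [s3, s2, s1, s0]; which selects drive which level is an
    assumption made for this sketch.
    """
    assert len(inputs) == 16 and len(select) == 4
    s3, s2, s1, s0 = select
    # First level: four 4:1 MUXs share the two low-order selects.
    level1 = [mux4(inputs[i:i + 4], s1, s0) for i in range(0, 16, 4)]
    # Second level: the fifth 4:1 MUX picks one of the four first-level outputs.
    return mux4(level1, s3, s2)


data = list(range(16))
print(mux16(data, [1, 0, 1, 1]))  # select = binary 1011, so input 11 is passed
```

Two such structures give the 10 MX21S macros counted in the text.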



Figure A-7 Schematic Page for the 16:1 MUX

Copyright @ 2001, 2002 Donnamaie E. White, White Enterprises

For problems or questions on these pages, contact [email protected]

Parity tree

A parity tree of all inputs (required for parametric VIL, VIH testing) can be formed from NOR gates using the GT60L or GT60S, an 8-input NOR macro. The L-option is slower and uses less power. The speed of the gate tree is not important since testing is functional at 100ns intervals. The first estimate for the tree is to use eleven GT60S macros in a three-level structure to accommodate the 79 input signals. (The 78 data signals plus the clock are required.) The parity tree is shown in Figure A-8.



Figure A-8 Parity Tree

REVIEW STATUS SO FAR

The first sizing estimate provides the cell counts shown in Table A-2.

Table A-2 First Sizing Estimates

# MACROS Macro # I/O Cells Required

78 IE93S 78

73 OE42S 73

1 IE31H 2

TOTAL I/O CELLS: 153



# MACROS Macro # L Cells Required

10 MX21S 20

11 GT60L 33

32 FF10S 96

32 FF46S 96

TOTAL L CELLS: 245

The number of macros is not the same as the number of cells, even for the I/O macros.
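The totals in Table A-2 can be reproduced with a small tally. This is a sketch only: the cells-per-macro figures are derived by dividing each row's cell count by its macro count, and the helper name is hypothetical.

```python
# (macro, number of macros, cells per macro), derived from Table A-2
io_macros = [
    ("IE93S", 78, 1),
    ("OE42S", 73, 1),
    ("IE31H", 1, 2),   # one differential input macro occupies two I/O cells
]
logic_macros = [
    ("MX21S", 10, 2),
    ("GT60L", 11, 3),
    ("FF10S", 32, 3),
    ("FF46S", 32, 3),
]


def total_cells(macros):
    """Sum count * cells-per-macro over a list of macro entries."""
    return sum(count * cells for _name, count, cells in macros)


print("I/O cells:", total_cells(io_macros))    # 153
print("L cells:", total_cells(logic_macros))   # 245
```

The IE31H row shows why macro counts and cell counts differ even for I/O macros.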

Exercise

Check the cell counts against the current design manual for the Q20000 Series. Check for

new MSI macros or new I/O macros that might be used in place of those selected (such as

OE11S). Consider size, speed and power in making changes. (Changes should be made!)

If you are designing with a different array series, create the same table for the chosen library.

SIMULTANEOUSLY SWITCHING OUTPUTS

Since 64 outputs are switching simultaneously in the worst case (master reset is one example), additional IEVCC macros (added ground) will be required according to the Q20000 Series design rules. A total of 16 IEVCC macros is required for these outputs and each blocks off one I/O cell and uses one pad.

This is the minimum number of added power and grounds recommended for worst-case

conditions.

Adding two more outputs for the 16:1 MUX Y outputs, six for the pass-through and one

for the gate tree, requires two more IEVCC macros.

If the outputs switch within one macro delay (or within 2 ns, whichever is larger)

of the other switching group additional IEVCC is required.

If they switch well separated in time from the other group, then the added IEVCC

for this group will not be required.

By tagging the switching groups and the added power and ground macros that belong to the groups with a SWGROUP parameter or property, the AMCC MacroMatrix can check for sufficient added power and grounds.

For this design, assume that the groups are not simultaneously switching more than 32 signals, allowing a reduction in added ground. Allowing 8 IEVCC for the pipeline outputs (switching group AAA) and one for the rest of the circuit (switching group BBB), nine IEVCC macros are required.

Adding these 9 IEVCC macros to the previous counts (153 + 9) brings the number of I/O cells used to 162. This is exactly the number of I/O cells available for circuit use on the Q20080 array. (This does not count the four fixed I/O signals for the AC Speed Monitor and the thermal diode that have pre-assigned PADs.) The added ground macro is shown in Figure A-9.

Figure A-9 Added IEVCC Macro

Note: Using fewer than the recommended number of added grounds is not a good idea. It will require engineering approval before design submission and could cause other problems later. Think about another solution!

FAN-OUT LOADS

The final step toward an estimate of circuit size requires that fan-out loads be examined.

Most macros in the Q20000 library will have a fan-in of one except for H-option macros that will have a higher fan-in (and larger cell size). This is not always the case but should be considered when examining macro options.

Select lines for 16:1 MUX

Select lines to each 16:1 have at most four loads. No buffering is required for the IE93S

macros that can drive 18 loads each.

Select lines to 2:1 MUX structure

The select to the 2:1 MUX structure has 32 loads and will need buffering. One macro can drive 18 loads; adding a gate buffer tree such as two GT09S macros allows one primary input to drive 32 loads. (See Figure A-10.)



Figure A-10 Buffer Tree for the 2:1 MUX (32 Loads)

The other option is to switch the IE93S for an IE23D driver that can drive 32 loads directly. The IE23D driver uses twice as much current as an IE93S macro but would save the internal cells that the GT09S macros would have used.

Reset loading

RESET requires the same decision process. In this case, the signal goes to 64 flip/flops. The AR pin for the FF46S is two loads and the AR pin for the FF10S is one load, for a total of 96 loads. Either six GT09S macros or three GT55D macros can provide the drive. The GT55D driver uses twice as much current as a GT09S macro and is twice as large. Since half as many are required, the two solutions are equivalent in cell usage and power.
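The buffer counts for the 2:1 MUX select and the reset can be checked with a ceiling division. This is a rough sketch under stated assumptions: the GT09S limit of 18 loads and the GT55D limit of 32 loads are inferred from the counts given in the text rather than quoted from the macro documentation, and the function name is hypothetical.

```python
import math


def drivers_needed(loads, drive_limit):
    """Smallest number of identical buffers whose combined drive covers the loads."""
    return math.ceil(loads / drive_limit)


GT09S_LIMIT = 18   # assumed load limit per GT09S
GT55D_LIMIT = 32   # assumed load limit per GT55D (underated)

# 2:1 MUX select: 32 loads
select_buffers = drivers_needed(32, GT09S_LIMIT)

# Reset: 32 FF46S at two loads each plus 32 FF10S at one load each
reset_loads = 32 * 2 + 32 * 1
gt09s_count = drivers_needed(reset_loads, GT09S_LIMIT)
gt55d_count = drivers_needed(reset_loads, GT55D_LIMIT)

# Relative cost: a GT55D is taken as twice the current and size of a GT09S
print(select_buffers, reset_loads, gt09s_count, gt55d_count)  # 2 96 6 3
print(gt09s_count * 1 == gt55d_count * 2)                     # True
```

The last line reflects the observation that six single-size buffers and three double-size drivers cost the same in cells and current.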

On the schematic, eight GT09S macros were used to simplify the schematic design (eight

pages are replicated). (See Figure A-11.)

Figure A-11 Reset Signal Buffer Tree

RESET STRUCTURE - ONE OPTION



Reset structures are often treated as clock structures without the need for speed. This

structure is only one level in depth. Current synthesis systems will create the necessary

buffer trees to support the load being driven.

Clock

The clock is handled differently since all clock nets must be derated. There are 64 loads from the flip/flops, plus 1 load due to the parametric gate tree, for a total of 65 loads. The IE31H can drive 10 loads with a 40% derating. The GT55D driver, derated, drives 19 loads and presents a fan-in load of two to the driving macro. Four GT55D macros would provide the drive capability with full 40% derating down the path as shown in Figure A-12.
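The clock tree choice can be sanity-checked at both levels of the tree. The sketch below uses only the derated figures quoted in the text; the function and variable names are hypothetical.

```python
clock_loads = 64 + 1       # flip/flop clock pins plus the parametric gate tree
IE31H_DERATED_DRIVE = 10   # loads the input macro can drive after 40% derating
GT55D_DERATED_DRIVE = 19   # loads each driver can handle after derating
GT55D_FAN_IN = 2           # load each driver presents to the input macro


def clock_tree_ok(num_drivers):
    """Check a one-level tree: input macro -> num_drivers GT55D -> clock loads."""
    input_ok = num_drivers * GT55D_FAN_IN <= IE31H_DERATED_DRIVE
    output_ok = num_drivers * GT55D_DERATED_DRIVE >= clock_loads
    return input_ok and output_ok


for n in range(1, 7):
    print(n, clock_tree_ok(n))
```

Under these figures, three drivers fall short of the 65 loads and six drivers overload the input macro, so the choice of four sits inside the workable range.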

Figure A-12 Clock Tree

CLOCK STRUCTURE - ONE OPTION

Derating guidelines are part of the array design rules. Macro load limits are listed in the

macro documentation.

Place & Route software today creates the clock tree structure based on the commands in a control script. The commands specify the suggested buffer or macro to be used and the clock tree depth. In the near future, Floorplanners will incorporate this function. Clock trees have priority during layout, depending on the design constraints supplied to the Place & Route tool.

When the clock tree is to be constructed by the Place & Route software, all timing analysis prior to the routing is done using a modeled clock, approximating what the final clock tree behavior might be.

Static Driver

The static driver required to drive the always-on output enable inputs can handle 50 loads

but 64 are required in this version of the design. Two GT87D macros can be used. One is

shown in Figure A-13.



Figure A-13 Static Driver

Static driver is not a term that shows up in macro lists today. Rather, high-drive options

on various macros are used. If no one macro can handle the load to be driven, then a

buffer tree is constructed by the synthesis tool.




REVIEW OF SIZE - SECOND PASS

The revised estimate (one version of the solution) shows the circuit requirements as they

are now understood.

Table A-3 Second Sizing Estimates

Number of Cells Required

# MACROS   MACRO   CELLS PER MACRO   TOTAL CELLS

78   IE93S   1   78

73   OE42S   1   73

1   IE31H   2   2

9   IEVCC   1   9

TOTAL I/O CELLS REQUIRED   162

10   MX21S   2   20

11   GT60S   3   33

10   GT09S   1   10

4   GT55D   2   8

2   GT87D   2   4

32   FF10S   3   96

32   FF46S   3   96

TOTAL L CELLS REQUIRED   267

Change OE42S to OE11S and delete the 2 GT87Ds.

This fits into the Q20080 array that has 162 I/O cells and 2044 L cells. This is a severely

I/O-bound design (of course!). A design is either core-limited or I/O limited.
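The fit against the Q20080 can be expressed as a utilization check. This sketch uses only the totals and array capacities quoted above; the variable names are hypothetical.

```python
io_cells_required = 162
l_cells_required = 267

Q20080_IO_CELLS = 162   # I/O cells available for circuit use
Q20080_L_CELLS = 2044   # internal logic (L) cells available

fits = (io_cells_required <= Q20080_IO_CELLS
        and l_cells_required <= Q20080_L_CELLS)
io_utilization = io_cells_required / Q20080_IO_CELLS
l_utilization = l_cells_required / Q20080_L_CELLS

print(fits)                                              # True
print(f"I/O cell utilization: {io_utilization:.1%}")     # 100.0%
print(f"L cell utilization:   {l_utilization:.1%}")      # 13.1%
```

Every I/O cell is used while most of the core is empty, which is what makes the design I/O-bound.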

Note: When vectors are written for this array, they should be designed so that no more than 16-32 of the outputs switch at any one time. These are AMCC-specific vector design rules.



Table A-4 AMCCERC Population ERC

PACKAGE SIZE

The minimum number of signal pins that should be available on a package for this circuit is 157 (162 signals plus the 4 fixed signals minus the 9 added grounds). The worst-case number of signal pins that could be required on a package for this circuit is 166 (162 signals plus the 4 fixed signals). The truth is in the middle and is placement-dependent.
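The two bounds follow directly from the counts already established. A minimal sketch of the arithmetic, with hypothetical variable names:

```python
io_cells_used = 162    # signals, including the added-ground macros
fixed_signals = 4      # AC Speed Monitor and thermal diode pads
added_grounds = 9      # IEVCC macros

# Best case: none of the added grounds needs a signal pin on the package.
min_signal_pins = io_cells_used + fixed_signals - added_grounds
# Worst case: every added ground takes a signal pin.
max_signal_pins = io_cells_used + fixed_signals

print(min_signal_pins, max_signal_pins)  # 157 166
```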

PROBLEMS

The OE42S is limited to a toggle frequency of 350MHz. If the clock is running at 500MHz, the outputs could be toggling slower. If not, then the OE42S is not a correct choice if speed is to be maintained. Neither is the OE11S!

Insufficient added grounds are not a minor problem.

The circuit uses nearly 8 Watts - much too high.

ALTERNATIVE SOLUTION

The differential output OE14S could be used in place of two OE42S macros and the GT87D driver (at least one) could be deleted. This reduces the OE42S macros from 73 to 9, and the 7 always-on enables could be driven by a GT08L NOR gate instead of a static driver macro.

The use of OE14S provides a cleaner solution (less skew) plus it frees internal cells. The maximum frequency of the OE14S is 1.2GHz. One output pad can be used as the true signal and the other as the complement.



Another advantage is the reduced requirement for added grounds. The 32 differential outputs count as 32 outputs and not as 64, reducing the requirement for this group to 8 added IEVCC, which is what was provided. The ninth IEVCC applies to the miscellaneous other outputs. AMCCERC will issue a warning that there might not be sufficient added grounds for these miscellaneous outputs, since the algorithm defined by AMCC requires that two IEVCC macros be added.

Table A-5 OE42S Solution   Table A-6 OE14S Solution

IE93S 78 IE93S 78

OE42S 73 OE42S 9

OE14S 32

IE31H 1 IE31H 1

IEVCC 9 IEVCC 9

MX21S 10 MX21S 10

GT87D 2 GT87D 1

GT60S 11

GT09S 8 GT09S 8

GT55D 4 GT55D 4

FF10S 32 FF10S 32

FF46S 32 FF46S 32

POWER

The maximum worst-case MILITARY DC power dissipation for the OE42S version of the circuit was estimated to be over 8 Watts.

The DC power computation for the OE14S version, same conditions, is estimated to be

5.88 Watts. (This number is based on the circuit as shown in the schematics and the

February 1991 library specifications.)

Reducing the GT08S macros to GT08L macros can further reduce power.

FURTHER THOUGHT

For cell usage, timing, power, and added ground requirements, the basic OE14S solution is the best proposed so far.



Table A-7 OE14S Solution

Table A-8 OE14S Solution

This version used GT87D instead of a GT08L. It uses GT60S macros in the gate tree instead of GT60L macros. Do the MUX and reset buffer trees need S-macros or could L-option macros be used? (Watch it - the options have different maximum frequency of operation numbers! This is often overlooked in choosing options.)

The DC power computed by the AMCCERC program is summarized below. Remember - AC power dissipation must be added to this. The AC power computations required depend on the array series.



Table A-8b Macro Occurrence Report (Continued)

Exercise

Add a design objective to reduce power to 5 Watts or as close to it as possible and modify this circuit using the latest library information. The frequency of operation requirement remains.

This same exercise was used in the AMCC training classes through several library

releases. This problem, or one close to it, was actually used for over eleven years with

several technology libraries, bipolar, Bi-squared MOS and CMOS. It demonstrates nearly

85% of the array design rules.

Today's designers would create this circuit in Verilog or VHDL and a control script for

the synthesis tool. Constraints can drive area reduction, speed improvements or power

reduction. The script can also set the priority for the different design constraints.

THE SCHEMATICS

Page 1 - Chip Macro and added Ground (IEVCC for ECL VCC);

AAA is switch group tag; GT87D a static driver

Page 2 - Clock tree; RESET tree; 2:1 MUX select tree. Buffer trees go to various pages.

Note the inputs to the parametric gate tree. "40"s are FOD values. (Figure A-10, Figure

A-11, Figure A-12, Figure A-13)

Page 3 - 2:1 MUX selects and enable