Post on 20-Dec-2015
© Digital Integrated Circuits2nd Design Methodologies
Digital Integrated Digital Integrated CircuitsCircuitsA Design PerspectiveA Design Perspective
DesignDesignMethodologiesMethodologies
Jan M. RabaeyAnantha ChandrakasanBorivoje Nikolic
December 10, 2002
© Digital Integrated Circuits2nd Design Methodologies
The Design Productivity ChallengeThe Design Productivity Challenge
Source: sematech97
A growing gap between design complexity and design productivity
58%/Yr. compoundComplexity growth rate
21%/Yr. compoundProductivity growth rate
198
1
10
Log
ic T
ran
sist
ors
pe
r C
hip
(K
)
Pro
du
ctiv
ity (
Tra
ns.
/Sta
ff-M
on
th)
100
1,000
10,000
100,000
1,000,000
10,000,000
1
XX
X XX
X
x
100
1,000
10,000
100,000
1,000,000
10,000,000
100,000,000
10
2.5m
.35m
.10m
198
3
198
5
198
7
198
9
199
1
199
3
199
5
199
7
199
9
200
1
200
3
200
5
200
7
200
9
Transistor/Staff Month
© Digital Integrated Circuits2nd Design Methodologies
A Simple ProcessorA Simple Processor
MEMORY
DATAPATH
CONTROL
INP
UT
/OU
TP
UT
© Digital Integrated Circuits2nd Design Methodologies
A System-on-a-Chip: ExampleA System-on-a-Chip: Example
Courtesy: Philips
© Digital Integrated Circuits2nd Design Methodologies
Impact of Implementation ChoicesImpact of Implementation ChoicesE
nerg
y E
ffic
ienc
y (i
n M
OP
S/m
W)
Flexibility(or application scope)
0.1-1
1-10
10-100
100-1000
None Fullyflexible
Somewhatflexible
Har
dwire
d cu
stom
Con
figur
able
/Par
amet
eriz
able
Dom
ain
-spe
cific
pro
cess
or(e
.g.
DS
P)
Em
bedd
ed m
icro
proc
ess
or
© Digital Integrated Circuits2nd Design Methodologies
Design MethodologyDesign Methodology
• Design process traverses iteratively between three abstractions: behavior, structure, and geometry• More and more automation for each of these steps
© Digital Integrated Circuits2nd Design Methodologies
Implementation ChoicesImplementation Choices
Custom
Standard CellsCompiled Cells Macro Cells
Cell-based
Pre-diffused(Gate Arrays)
Pre-wired(FPGA's)
Array-based
Semicustom
Digital Circuit Implementation Approaches
© Digital Integrated Circuits2nd Design Methodologies
The Custom Approach The Custom Approach
Intel 4004
Courtesy Intel
© Digital Integrated Circuits2nd Design Methodologies
Transition to Automation and Regular StructuresTransition to Automation and Regular Structures
Intel 4004 (‘71)Intel 4004 (‘71)Intel 8080Intel 8080 Intel 8085Intel 8085
Intel 8286Intel 8286 Intel 8486Intel 8486Courtesy Intel
© Digital Integrated Circuits2nd Design Methodologies
Cell-based Design (or standard cells)Cell-based Design (or standard cells)
Routing channel requirements arereduced by presenceof more interconnectlayers
Functionalmodule(RAM,multiplier,…)
Routingchannel
Logic cellFeedthrough cellR
ow
s o
f ce
lls
© Digital Integrated Circuits2nd Design Methodologies
Standard Cell — ExampleStandard Cell — Example
[Brodersen92]
© Digital Integrated Circuits2nd Design Methodologies
Standard Cell – The New GenerationStandard Cell – The New Generation
Cell-structurehidden underinterconnect layers
© Digital Integrated Circuits2nd Design Methodologies
Standard Cell - ExampleStandard Cell - Example
3-input NAND cell(from ST Microelectronics):C = Load capacitanceT = input rise/fall time
© Digital Integrated Circuits2nd Design Methodologies
Automatic Cell GenerationAutomatic Cell Generation
Courtesy Acadabra
Initial transistorgeometries
Placedtransistors
Routedcell
Compactedcell
Finishedcell
© Digital Integrated Circuits2nd Design Methodologies
A Historical Perspective: the PLAA Historical Perspective: the PLA
x0 x1 x2
ANDplane
x0x1
x2
Product terms
ORplane
f0 f1
© Digital Integrated Circuits2nd Design Methodologies
Two-Level LogicTwo-Level Logic
Inverting format (NOR-NOR) more effective
Every logic function can beexpressed in sum-of-productsformat (AND-OR)
minterm
© Digital Integrated Circuits2nd Design Methodologies
PLA Layout – Exploiting RegularityPLA Layout – Exploiting Regularity
f0 f1x0 x0 x1 x1 x2 x2
Pull-up devices Pull-up devices
VDD GNDAnd-Plane Or-Plane
© Digital Integrated Circuits2nd Design Methodologies
Breathing Some New Life in PLAsBreathing Some New Life in PLAsRiver PLAs A cascade of multiple-output PLAs. Adjacent PLAs are connected via river routing.
PRE-CHARGE
PR
E-
CH
AR
GE
PRE-CHARGE
PR
E-C
HA
RG
E
BUFFER
BUFFER
BU
FF
ER
BU
FF
ER
PRE-CHARGE
PR
E-C
HA
RG
E
BUFFER
BU
FF
ER
PRE-CHARGE
PR
E-
CH
AR
GE
BUFFERB
UF
FE
R
• No placement and routing needed. • Output buffers and the input buffers
of the next stage are shared.
Courtesy B. Brayton
© Digital Integrated Circuits2nd Design Methodologies
Experimental ResultsExperimental Results
Layout of C2670
Network of PLAs, 4 layers OTC
River PLA,2 layers no additional routing
Standard cell, 2 layers channel routing
Standard cell,3 layers OTC
0.2
0.6
1
1.4
0 2 4 6 area
dela
y
SC N PLA R PLA
Area: RPLAs (2 layers) 1.23 SCs (3 layers) - 1.00, NPLAs (4 layers) 1.31 DelayRPLAs 1.04SCs 1.00 NPLAs 1.09 Synthesis time: for RPLA , synthesis time equals design time; SCs and NPLAs still need P&R.
Also: RPLAs are regular and predictable
© Digital Integrated Circuits2nd Design Methodologies
MacroModulesMacroModules
25632 (or 8192 bit) SRAMGenerated by hard-macro module generator
© Digital Integrated Circuits2nd Design Methodologies
““Soft” MacroModulesSoft” MacroModules
Synopsys DesignCompiler
© Digital Integrated Circuits2nd Design Methodologies
““Intellectual Property”Intellectual Property”
A Protocol Processor for Wireless
© Digital Integrated Circuits2nd Design Methodologies
Semicustom Design FlowSemicustom Design Flow
HDLHDL
Logic SynthesisLogic Synthesis
FloorplanningFloorplanning
PlacementPlacement
RoutingRouting
Tape-out
Circuit ExtractionCircuit Extraction
Pre-Layout Simulation
Pre-Layout Simulation
Post-Layout Simulation
Post-Layout Simulation
StructuralStructural
PhysicalPhysical
BehavioralBehavioralDesign Capture
Des
ign
Iter
atio
nD
esig
n It
erat
ion
© Digital Integrated Circuits2nd Design Methodologies
The “Design Closure” ProblemThe “Design Closure” Problem
Courtesy Synopsys
Iterative Removal of Timing Violations (white lines)
© Digital Integrated Circuits2nd Design Methodologies
Integrating Synthesis with Integrating Synthesis with Physical DesignPhysical Design
Physical SynthesisPhysical Synthesis
RTL (Timing) Constraints
Place-and-RouteOptimization
Place-and-RouteOptimization
Artwork
Netlist with Place-and-Route Info
MacromodulesFixed netlists
© Digital Integrated Circuits2nd Design Methodologies
Pre-diffused(Gate Arrays)
Pre-wired(FPGA's)
Array-based
Late-Binding ImplementationLate-Binding Implementation
© Digital Integrated Circuits2nd Design Methodologies
Gate Array — Sea-of-gatesGate Array — Sea-of-gates
rows of
cells
routing channel
uncommitted
VDD
GND
polysilicon
metal
possiblecontact
In1 In2 In3 In4
Out
UncommitedCell
CommittedCell(4-input NOR)
© Digital Integrated Circuits2nd Design Methodologies
Sea-of-gate Primitive CellsSea-of-gate Primitive Cells
NMOS
PMOS
Oxide-isolation
PMOS
NMOS
NMOS
Using oxide-isolation Using gate-isolation
© Digital Integrated Circuits2nd Design Methodologies
Example: Base Cell of Gate-Isolated GAExample: Base Cell of Gate-Isolated GA
n-well
contact
21GND2019181716151413121110987654321VDD
m2m1polyp-diffn-diffp-well
contact forisolator
continuousn-diff strip
continuousp-diff strip
From Smith97
© Digital Integrated Circuits2nd Design Methodologies
Example: Flip-Flop in Gate-Isolated GAExample: Flip-Flop in Gate-Isolated GA
CLK
D
Q
GND
VDD
Q
CLR
From Smith97
© Digital Integrated Circuits2nd Design Methodologies
Sea-of-gatesSea-of-gates
Random Logic
MemorySubsystem
LSI Logic LEA300K(0.6 m CMOS)
Courtesy LSI Logic
© Digital Integrated Circuits2nd Design Methodologies
The return of gate arrays?The return of gate arrays?
metal-5 metal-6
Via-programmable cross-point
programmable via
Via programmable gate array(VPGA)
[Pileggi02]
Exploits regularity of interconnect
© Digital Integrated Circuits2nd Design Methodologies
Prewired ArraysPrewired ArraysClassification of prewired arrays (or field-programmable devices): Based on Programming Technique
Fuse-based (program-once) Non-volatile EPROM based RAM based
Programmable Logic Style Array-Based Look-up Table
Programmable Interconnect Style Channel-routing Mesh networks
© Digital Integrated Circuits2nd Design Methodologies
Fuse-Based FPGAFuse-Based FPGA
antifuse polysilicon ONO dielectric
n+ antifuse diffusion
2 l
From Smith97
Open by default, closed by applying current pulse
© Digital Integrated Circuits2nd Design Methodologies
Array-Based Programmable LogicArray-Based Programmable Logic
PLA PROM PAL
I 5 I 4
O0
I 3 I 2 I 1 I 0
O1O2O3
Programmable AND array
ProgrammableOR array I5 I4
O0
I3 I2 I1 I0
O1O2O3
Programmable AND array
Fixed OR array
Indicates programmable connection
Indicates fixed connection
O0
I3 I2 I1 I0
O1O2O3
Fixed AND array
ProgrammableOR array
© Digital Integrated Circuits2nd Design Methodologies
Programming a PROMProgramming a PROM
f0
1 X 2 X 1 X 0
f1NANA
: programmed node
© Digital Integrated Circuits2nd Design Methodologies
More Complex PALMore Complex PAL
From Smith97
programmable AND array (2i 3 jk) k macrocells
j -wide OR array
j
macrocell
productterms
D Q
A
1
j
B
CLK
OUT
C i i inputs
i inputs, j minterms/macrocell, k macrocells
© Digital Integrated Circuits2nd Design Methodologies
2-input mux 2-input mux as programmable logic blockas programmable logic block
FA 0
B
S
1
Configuration
A B S F=
0 0 0 00 X 1 X0 Y 1 Y0 Y X XYX 0 YY 0 XY 1 X X 1 Y1 0 X1 0 Y1 1 1 1
XYXY
XY
© Digital Integrated Circuits2nd Design Methodologies
Logic Cell of Actel Fuse-Based FPGALogic Cell of Actel Fuse-Based FPGA
A
B
SA Y
1
C
D
SB
1
S0S1
1
© Digital Integrated Circuits2nd Design Methodologies
Look-up Table Based Logic CellLook-up Table Based Logic Cell
Out
ln1 ln2
Me
mory In Out
00 00
01 1
10 1
11 0
© Digital Integrated Circuits2nd Design Methodologies
LUT-Based Logic CellLUT-Based Logic Cell
Courtesy Xilinx
D4
C1....C4
xxxxxx
D3
D2
D1
F4
F3
F2
F1
Logicfunction
ofxxx
Logicfunction
ofxxx
Logicfunction
ofxxx
xx
xx
4
xxxxxx
xxxxxxxx
xxx
xxxx xxxx xxxx
HP
Bitscontrol
Bitscontrol
Multiplexer Controlledby Configuration Program
x
xx
x
xx
xxx xx
xxxx
x
xxxxxx
xx
x
xx
xxx
xx
Xilinx 4000 Series
Figure must be updated
© Digital Integrated Circuits2nd Design Methodologies
Array-Based Programmable WiringArray-Based Programmable Wiring
Input/output pinProgrammed interconnection
InterconnectPoint
Horizontaltracks
Vertical tracks
Cell
M
© Digital Integrated Circuits2nd Design Methodologies
Mesh-based Interconnect NetworkMesh-based Interconnect NetworkSwitch Box
Connect Box
InterconnectPoint
Courtesy Dehon and Wawrzyniek
© Digital Integrated Circuits2nd Design Methodologies
Transistor Implementation of MeshTransistor Implementation of Mesh
Courtesy Dehon and Wawrzyniek
© Digital Integrated Circuits2nd Design Methodologies
Hierarchical Mesh NetworkHierarchical Mesh Network
Use overlayed meshto support longer connections
Reduced fanout and reduced resistance
Courtesy Dehon and Wawrzyniek
© Digital Integrated Circuits2nd Design Methodologies
EPLD Block DiagramEPLD Block Diagram
MacrocellPrimary inputs
Courtesy Altera
© Digital Integrated Circuits2nd Design Methodologies
Altera MAXAltera MAX
From Smith97
© Digital Integrated Circuits2nd Design Methodologies
Altera MAX Interconnect ArchitectureAltera MAX Interconnect Architecture
LAB2
PIA
LAB1
LAB6
tPIA
tPIA
row channelcolumn channel
LAB
Courtesy Altera
Array-based(MAX 3000-7000)
Mesh-based(MAX 9000)
© Digital Integrated Circuits2nd Design Methodologies
Field-Programmable Gate ArraysField-Programmable Gate ArraysFuse-basedFuse-based
I/O Buffers
P rogram/Test/Diag nostics
I/O Buffers
I/O B
uffe
rs
I/O B
uffe
rs
Vertical ro utes
Rows o f logic m odule s
Routing channels
Standard-cell likefloorplan
© Digital Integrated Circuits2nd Design Methodologies
Xilinx 4000 Interconnect ArchitectureXilinx 4000 Interconnect Architecture
2
12
8
4
3
2
3
CLB
8 4 8 4
Quad
Single
Double
Long
DirectConnect
DirectConnect
Quad Long GlobalClock
Long Double Single GlobalClock
CarryChain
Long
12 4 4
Courtesy Xilinx
© Digital Integrated Circuits2nd Design Methodologies
RAM-based FPGA RAM-based FPGA
Xilinx XC4000ex
Courtesy Xilinx
© Digital Integrated Circuits2nd Design Methodologies
A Low-Energy FPGA (UC Berkeley)A Low-Energy FPGA (UC Berkeley)
Array Size: 8x8 (2 x 4 LUT)
Power Supply: 1.5V & 0.8V
Configuration: Mapped as RAM
Toggle Frequency: 125MHz
Area: 3mm x 3mm
© Digital Integrated Circuits2nd Design Methodologies
Larger Granularity FPGAsLarger Granularity FPGAs
1-mm 2-metalCMOS tech
1.2 x 1.2 mm2
600k transistors
208-pin PGA
fclock = 50 MHz
Pav = 3.6 W @ 5V
Basic Module: Datapath
PADDI-2 (UC Berkeley)
© Digital Integrated Circuits2nd Design Methodologies
Design at a crossroadDesign at a crossroad
System-on-a-ChipSystem-on-a-Chip
RAM
500 k Gates FPGA+ 1 Gbit DRAMPreprocessing
Multi-
SpectralImager
Csystem+2 GbitDRAMRecog-
nition
Ana
log
64 SIMD ProcessorArray + SRAM
Image Conditioning100 GOPS
Embedded applications where cost, performance, and energy are the real issues!
DSP and control intensive Mixed-mode Combines programmable and
application-specific modules Software plays crucial role
© Digital Integrated Circuits2nd Design Methodologies
Addressing the Design Complexity IssueAddressing the Design Complexity IssueArchitecture ReuseArchitecture Reuse
Reuse comes in generationsGeneration Reuse element Status
1st Standard cells Well established
2nd IP blocks Being introduced
3rd Architecture Emerging
4th IC Early research
Source: Theo Claasen (Philips) – DAC 00
© Digital Integrated Circuits2nd Design Methodologies
Architecture ReUseArchitecture ReUse
Silicon System Platform Flexible architecture for hardware and software Specific (programmable) components Network architecture Software modules Rules and guidelines for design of HW and SW
Has been successful in PC’s Dominance of a few players who specify and control architecture
Application-domain specific (difference in constraints) Speed (compute power) Dissipation Costs Real / non-real time data
© Digital Integrated Circuits2nd Design Methodologies
Platform-Based DesignPlatform-Based Design
A platform is a restriction on the space of possible implementation choices, providing a well-defined abstraction of the underlying technology for the application developer
New platforms will be defined at the architecture-micro-architecture boundary
They will be component-based, and will provide a range of choices from structured-custom to fully programmable implementations
Key to such approaches is the representation of communication in the platform model
““Only the consumer gets freedom of choice;Only the consumer gets freedom of choice;designers need freedomdesigners need freedom fromfrom choice”choice”
(Orfali, et al, 1996, p.522)(Orfali, et al, 1996, p.522)
Source:R.Newton
© Digital Integrated Circuits2nd Design Methodologies
Berkeley Pleiades ProcessorBerkeley Pleiades Processor
• 0.25um 6-level metal CMOS
• 5.2mm x 6.7mm
• 1.2 Million transistors
• 40 MHz at 1V
• 2 extra supplies: 0.4V, 1.5V
• 1.5~2 mW power dissipationInterface
Reconfigurable
Data-path
FPGA
ARM8 Core
© Digital Integrated Circuits2nd Design Methodologies
Heterogeneous Programmable PlatformsHeterogeneous Programmable Platforms
Xilinx Vertex-II Pro
Courtesy Xilinx
High-speed I/O
Embedded PowerPcEmbedded memories
Hardwired multipliers
FPGA Fabric
© Digital Integrated Circuits2nd Design Methodologies
SummarySummary
Digital CMOS Design is kicking and healthy Some major challenges down the road
caused by Deep Sub-micron Super GHz design Power consumption!!!! Reliability – making it workSome new circuit solutions are bound to emerge
Who can afford design in the years to come? Some major design methodology change in the making!
© Digital Integrated Circuits2nd Design Methodologies
Insert F - Design synthesisInsert F - Design synthesis
© Digital Integrated Circuits2nd Design Methodologies
Circuit synthesisCircuit synthesis derivation of the transistors schematics from logic
functions- complementary CMOS
- pass transistor- dynamic
- DCVSL (differential cascode voltage switch logic)
transistor sizing - performance modeling using RC
equivalent circuits - layout generation synthesis not popular due to designers reluctance
© Digital Integrated Circuits2nd Design Methodologies
Logic synthesisLogic synthesis
state transition diagrams, FSM, schematics, Boolean equations, truth tables, and HDL used
synthesis - combinational or sequential - multi level, PLA, or FPGA
logic optimization for - area, speed , power- technology mapping
© Digital Integrated Circuits2nd Design Methodologies
Logic optimizationLogic optimization
Expresso - two level minimization tool (UCB)
state minimization and state encoding MIS - multilevel logic synthesis (UCB)
Example : S = (AB) Ci
Co= AB + ACi + BCi
© Digital Integrated Circuits2nd Design Methodologies
Logic optimizationLogic optimizationMultilevel implementation of adder generated by MIS II cell library from University of Mississippi
© Digital Integrated Circuits2nd Design Methodologies
Architecture synthesisArchitecture synthesis
behavioral or high level synthesis optimizing translation e.g. pipelining Cathedral and HYPER tools HYPER tutorial and synthesis example:
http://infopad.eecs.berkeley.edu/~hyper
© Digital Integrated Circuits2nd Design Methodologies
Architecture synthesis exampleArchitecture synthesis example
© Digital Integrated Circuits2nd Design Methodologies
Architecture synthesisArchitecture synthesis
© Digital Integrated Circuits2nd Design Methodologies
Emerging TechnologiesEmerging Technologies
Complementary Orthogonal Complementary Orthogonal Stacked MOS - Stacked MOS - COSMOSCOSMOS Stack two MOSFETs under a common gate Improve only hole mobility by using strained SiGe channel
– pMOS transconductance equal to nMOS
Reduce parasitics due to wiring and isolating the sub-nets
Conventional CMOS
Complementary Orthogonal
Stacked MOS
Savas Kaya
Technology BaseTechnology Base Strained Si/SiGe layers
Built-in strain traps more carriers and increases mobility– Equal+high electron and hole mobilities (Jung et al.,p.460,EDL’03)
SOI (silicon-on-Insulator) substrates active areas on buried oxide (BOX) layer Reduces unwanted DC leakage and AC parasitics
Mizuno et al., p.988, TED’03
Cheng et al., p.L48, SST’04
COSMOS StructureCOSMOS Structure Single common gate: mid-gap metal or poly-SiGe Ultra-thin channels: 2-6nm to control threshold/leakage
Strained Si1-xGex for holes (x0.3) Strained or relaxed Si for electrons
Substrate: SOI
COSMOS Structure - 3D View ICOSMOS Structure - 3D View I Single gate stack: mid-gap metal or poly-SiGe
Must be engineered for a symmetric threshold
In units of m
COSMOS Structure - 3D View IICOSMOS Structure - 3D View II Conventional self-aligned contacts
Doped S/D contacts: p- (blue) or n- (red) type Inter-dependence between gate dimensions:
W
L
nMOS
L
W
pMOS
© Digital Integrated Circuits2nd Design Methodologies
COSMOS Gate ControlCOSMOS Gate Control A single gate to control both channels
High-mobility strained Si1-xGex (x0.3) buried hole channel
– High Ge% eliminates parallel conduction and improves mobility
– Lowers the threshold voltage VT
Electrons are in a surface channel Requires fine tuning for symmetric operation
0
0.5
1
1.5
2
2.5
0
0.5
1
1.5
2
2.5
3
-1.2 -0.8 -0.4 0 0.4 0.8 1.2
electronsholes
Vgate [V]
nVT
= 1011
[cm-2
]
© Digital Integrated Circuits2nd Design Methodologies
3D Characteristics: 40nm Device 3D Characteristics: 40nm Device
Symmetric operation No QM corrections
– Lower VT Features in sub-threshold
operation– Related to p-i-n parasitic
diode included in 3D
COSMOS Inverter COSMOS Inverter
Top view Peel-off top views
No additional processing Just isolate COSMOS layers and establish proper contacts Significantly shorter output metallization
3D TCAD Verification3D TCAD Verification Inverter operation verified in 3D
40nm COSMOS NOT gate driving CL=1fF
load
ApplicationsApplications Low power static CMOS:
Should outperform conventional devices in terms of speed– Multiple input circuit example: NOR gate
Area tight designs : FPGA, Sensing/testing, power etc. ?