579.Trends.lect1.08
7/29/2019 579.Trends.lect1.08
1/32
Lecture 1RAS 1
Lecture 1
Design and Technology Trends
R. Saleh
Dept. of ECE
University of British Columbia
-
Recently Designed Chips
Itanium chip (Intel): 2B transistors, 700 mm2, 8-layer 65nm CMOS (4 processors)
TILE64 processor: 64-core SoC with mesh NoC interconnect, 90nm CMOS
153Mb SRAM (Intel): 45nm, high-k metal-gate CMOS
FPGAs recently fabricated in 45nm
What are the major technology and design issues that are driving the IC industry?
Let's start from the simple rules of MOS scaling.
-
MOS Transistor Scaling (1974 to present)
Scaling factor s = 0.7 per node (0.5x per 2 nodes)
[Figure: metal pitch (interconnect) and poly width (transistor)]
Technology node: set by 1/2 metal pitch (interconnect)
Gate length: set by poly width (transistor)
-
Ideal Technology Scaling (constant field)
Quantity               Before Scaling   After Scaling
Channel length         L                L' = L * s
Channel width          W                W' = W * s
Gate oxide thickness   tox              tox' = tox * s
Junction depth         xj               xj' = xj * s
Power supply           Vdd              Vdd' = Vdd * s
Threshold voltage      Vth              Vth' = Vth * s
Doping density, p      NA               NA' = NA / s
Doping density, n+     ND               ND' = ND / s
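As a rough illustration of the scaling rules in this table, the Python sketch below applies one constant-field step to a set of device parameters. The `Device` class, the function names and the starting values are illustrative assumptions, not something taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    length_nm: float   # channel length L
    width_nm: float    # channel width W
    tox_nm: float      # gate oxide thickness
    xj_nm: float       # junction depth
    vdd_v: float       # power supply
    vth_v: float       # threshold voltage
    na_cm3: float      # p doping density
    nd_cm3: float      # n+ doping density


def scale_constant_field(d: Device, s: float = 0.7) -> Device:
    """Dimensions and voltages shrink by s; doping densities grow by 1/s."""
    return Device(
        length_nm=d.length_nm * s,
        width_nm=d.width_nm * s,
        tox_nm=d.tox_nm * s,
        xj_nm=d.xj_nm * s,
        vdd_v=d.vdd_v * s,
        vth_v=d.vth_v * s,
        na_cm3=d.na_cm3 / s,
        nd_cm3=d.nd_cm3 / s,
    )


def oxide_field(d: Device) -> float:
    """Vertical field across the gate oxide in V/nm."""
    return d.vdd_v / d.tox_nm


# Hypothetical starting point, chosen only for illustration.
before = Device(130.0, 1000.0, 3.0, 67.5, 1.2, 0.34, 1.0e16, 1.0e19)
after = scale_constant_field(before)
print(after)
print(oxide_field(before), oxide_field(after))
```

Because Vdd and tox shrink by the same factor, the oxide field is unchanged after the step, which is what "constant field" refers to.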
-
Technology Nodes 1999-2019
Node   180nm  130nm  90nm  65nm  45nm  32nm  22nm  16nm
Year   1999   2001   2004  2007  2010  2013  2016  2019
Each node is 0.7x the previous one (N-1 to N, N to N+1), so 0.5x across two nodes (N-1 to N+1).
Two-year cycle between nodes until 2001, then a three-year cycle begins.
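The node names follow the 0.7x rule only approximately. A minimal Python sketch, assuming an ideal shrink of exactly 0.7 per node starting from 180nm (the function name is hypothetical), shows how close the named nodes are to the ideal sequence:

```python
def node_sequence(start_nm: float, count: int, s: float = 0.7) -> list[float]:
    """Ideal feature sizes when every node is s times the previous one."""
    nodes = [start_nm]
    for _ in range(count - 1):
        nodes.append(nodes[-1] * s)
    return nodes


# Node names from the slide.
named = [180, 130, 90, 65, 45, 32, 22, 16]
ideal = node_sequence(180, len(named))

for name, value in zip(named, ideal):
    print(f"{name:>4} nm node, ideal 0.7x value {value:6.1f} nm")

# Two ideal steps give 0.7 * 0.7 = 0.49, the "0.5x per 2 nodes" on the slide.
print(ideal[2] / ideal[0])
```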
-
Forecast Technology Parameters
Year  Node(nm)  Physical gate(nm)  tox(nm)  Dielectric K  Vdd(V)  Vth(V)  Na(/cm3)  Nd(/cm3)  xj(nm)
2001  130       90                 3.0      3.7           1.2     0.34    1.0e16    1.0e19    67.5
2004  90        53                 2.4      3.0           1.1     0.32    1.4e16    1.4e19    46.7
2007  65        32                 1.7      2.5           0.9     0.29    2.0e16    2.0e19    33.8
2010  45        22                 1.5      2.0           0.8     0.29    2.9e16    2.9e19    23.4
2013  32        16                 1.4      1.9           0.7     0.25    4.0e16    4.0e19    16.6
2016  22        11                 1.3      1.7           0.6     0.22    5.9e16    5.9e19    11.4
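One way to read this table is to compare each column with the ideal 0.7x shrink per node. The Python sketch below computes the average (geometric-mean) shrink factor per node for several columns; the data are copied from the table, while the dictionary layout and function name are assumptions made for this example.

```python
# year -> (node nm, physical gate nm, tox nm, Vdd V, Vth V, xj nm), from the table
forecast = {
    2001: (130, 90, 3.0, 1.2, 0.34, 67.5),
    2004: (90, 53, 2.4, 1.1, 0.32, 46.7),
    2007: (65, 32, 1.7, 0.9, 0.29, 33.8),
    2010: (45, 22, 1.5, 0.8, 0.29, 23.4),
    2013: (32, 16, 1.4, 0.7, 0.25, 16.6),
    2016: (22, 11, 1.3, 0.6, 0.22, 11.4),
}
columns = ("node", "gate", "tox", "Vdd", "Vth", "xj")


def average_ratio_per_node(index: int) -> float:
    """Geometric-mean shrink factor per node for one column of the table."""
    years = sorted(forecast)
    first = forecast[years[0]][index]
    last = forecast[years[-1]][index]
    steps = len(years) - 1
    return (last / first) ** (1.0 / steps)


ratios = {name: average_ratio_per_node(i) for i, name in enumerate(columns)}
for name, ratio in ratios.items():
    print(f"{name:>4}: {ratio:.2f}x per node")
```

With these values the node and xj columns come out close to 0.7x per node, while Vdd, Vth and tox shrink more slowly than ideal constant-field scaling would require.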
-
Where are we now?
130nm and 90nm CMOS volume production
Early production of 65nm, Leading-edge use of 45nm
Scaling of gate is leading scaling of wire
Scaling is driven by DIGITAL design needs
-
Making Photolithography Work
Extensive use of OPC (optical proximity correction) and PSM (phase-shift masks) in 90nm and below
-
Deep Submicron Technology Generations
Table 1: Time overlap of semiconductor generations (350nm, 250nm, 180nm, 130nm, 90nm, 65nm, 45nm), 1995 to 2012
Phases for each generation: university research, industry development, production
Negative entries count the years before production starts; positive entries count the years in production
Each generation spans ~17 years, so we are unlikely to be totally surprised
-
MPU Trends - Moore's Law
[Figure: transistors (MT, log scale from 0.001 to 10,000) vs. year (1970 to 2010) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium proc and P6]
Transistors double every two years (2X growth in 2 years)
Source: Intel
-
More MPU Trends
[Figure: die size (mm, log scale from 1 to 100) vs. year (1970 to 2010) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium proc and Pentium Pro proc]
~7% growth per year
~2X growth in 10 years
~40mm die in 2010?
Source: Intel
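The two growth figures quoted on the MPU trend slides (transistor count doubling every two years, die size growing about 7% per year) can be checked with simple compound-growth arithmetic. The following Python sketch is only a sanity check; the function names are hypothetical.

```python
import math


def annual_rate_from_doubling(doubling_years: float) -> float:
    """Yearly growth rate implied by doubling every `doubling_years` years."""
    return 2.0 ** (1.0 / doubling_years) - 1.0


def doubling_time(annual_rate: float) -> float:
    """Years needed to double at a fixed yearly growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


def growth_over(annual_rate: float, years: float) -> float:
    """Total growth factor after `years` of compounding."""
    return (1.0 + annual_rate) ** years


transistor_rate = annual_rate_from_doubling(2.0)  # transistor count
die_doubling = doubling_time(0.07)                # die size
die_growth_decade = growth_over(0.07, 10)

print(f"transistors: {transistor_rate:.1%} per year")
print(f"die size: doubles in {die_doubling:.1f} years, "
      f"{die_growth_decade:.2f}x in 10 years")
```

Doubling every two years corresponds to roughly 41% growth per year, and 7% per year compounds to just under 2X in a decade, consistent with the "~2X growth in 10 years" note.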
-
Delay Metric - FO4 Concept
[Figure: Delay vs Fanout; delay (0 to 6) against fanout (0 to 8), one curve each for parameter values 0.0, 0.5, 1.0 and 2.0]
[Figure: inverter chain sized 1X, 4X, 16X, driving from CIN to Cload]
Use FO4 delay as optimal delay
-
FO4 INV Delay Scaling
[Figure: FO4 gate delay (ps, 0 to 700) vs. technology Ldrawn (um, 1.2 down to 0.2); fanout = 4 inverter delay at TT, 90% Vdd, 125 °C]
FO4 delay ~ 425ps * Ldrawn (Ldrawn in um)
For scaling purposes, the alpha-power model is very useful:
Idsat = K * W * Leff^-0.5 * Tox^-0.8 * (Vgs - Vth)^1.25
If L, Tox and V all scale (note that V scaling will be limited by Vth scaling), current should remain constant per micron of width (approx. 600 to 800 uA/um).
Delay t = C*V/i scales to s*t, since C, V and i all scale down by s.
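To make the two rules on this slide concrete, the Python sketch below evaluates the FO4 rule of thumb for a few drawn gate lengths and checks how the alpha-power current changes under an ideal shrink. It assumes that W, Leff, Tox and (Vgs - Vth) all scale by the same factor s, which is the idealized case; the function names and the example gate lengths are hypothetical.

```python
def fo4_delay_ps(l_drawn_um: float) -> float:
    """Rule of thumb from the slide: FO4 delay ~ 425 ps per um of drawn length."""
    return 425.0 * l_drawn_um


def idsat_scale_factor(s: float) -> float:
    """Change in Idsat when W, Leff, Tox and (Vgs - Vth) all scale by s.

    Idsat = K * W * Leff**-0.5 * Tox**-0.8 * (Vgs - Vth)**1.25
    so the combined exponent of s is 1 - 0.5 - 0.8 + 1.25 = 0.95.
    """
    return s * s ** -0.5 * s ** -0.8 * s ** 1.25


s = 0.7
for l_um in (0.25, 0.18, 0.13):  # example drawn lengths, chosen for illustration
    print(f"Ldrawn = {l_um} um -> FO4 ~ {fo4_delay_ps(l_um):.1f} ps")

# Current per unit width: total current change divided by the width change.
current_per_width = idsat_scale_factor(s) / s
print(f"Idsat scales by {idsat_scale_factor(s):.3f}, "
      f"per-width current by {current_per_width:.3f}")
```

The combined exponent of 0.95 is close to 1, so total current shrinks by about s while current per micron of width stays roughly constant, matching the statement on the slide.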
-
MPU Clock Frequency Trend
[Figure: MPU clock frequency (log scale, 10 to 1000) vs. date (Dec-83 to Dec-98) for the 80386, 80486, Pentium and Pentium II]
Source: Intel (Borkar/Parkhurst)
-
MPU Clock Frequency Trend
[Figure: MPU clock frequency (log scale, 10 to 10000) vs. date (Dec-83 to Dec-02) for the 80386, 80486, Pentium, Pentium II and P4, with an exponential fit (Expon.)]
Forward projection may be too optimistic
Source: Intel (Borkar/Parkhurst)
-
MPU Clock Cycle Trend (FO4 Delays)
[Figure: clock cycle in FO4 delays (log scale, 10 to 100) vs. date (Dec-83 to Dec-98) for the 80386, 80486, Pentium and Pentium II]
Source: Intel (Borkar/Parkhurst)
-
MPU Clock Cycle Trend (FO4 Delays)
[Figure: clock cycle in FO4 delays (log scale, 10 to 100) vs. date (Dec-83 to Dec-02) for the 80386, 80486, Pentium and Pentium II, with an exponential fit (Expon.)]
Forward projection does not make sense
Curve actually flattens at 14-16 FO4
Source: Intel (Borkar/Parkhurst)
-
Power Trend - Ever Increasing
[Figure: power per chip (W, log scale from 0.01 to 1000) vs. year (1980 to 2000) for processors published in ISSCC]
MPU: x4 every 3 years
DSP: x1.4 every 3 years
-
Dynamic vs. Leakage Power
[Figure: dynamic power and leakage power (watts) vs. technology node (250nm, 180nm, 130nm, 90nm, 65nm)]
Source: Krishnamurthy, et al., CICC 2002
-
Leakage Current Contributions
[Figure: leakage current contributions at 130nm, 90nm and 65nm]
-
MPU Diminishing Returns
Power knob running out
  Speed == Power
  10W/cm2 limit for convection cooling, 50W/cm2 limit for forced-air cooling
  Large currents, large power surges on wakeup
  Cf. 125A supply current, 150W total power at 1.2V Vdd for EV8 (Compaq)
  Die size will not continue to increase unless more memory is used to occupy the additional area
  Additional power dissipation coming from subthreshold leakage
Speed knob running out
  Historically, 2x clock frequency every process generation
    1.4x from device scaling
    1.4x from pipelining, hence fewer logic stages (from 40-100 down to around 16 FO4 INV delays)
  Clocks cannot be generated with period < 6-8 FO4 INV delays
  Around 14-16 FO4 INV delays is limit for clock period
Unrealistic to continue 2x frequency trend!
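The numbers on this slide can be cross-checked with a few lines of arithmetic. The Python sketch below reuses the FO4 rule of thumb quoted earlier in the lecture (FO4 delay ~ 425ps * Ldrawn); the 0.13um drawn length and the function names are assumptions picked only for illustration.

```python
def supply_power_w(current_a: float, vdd_v: float) -> float:
    """Total power drawn from the supply: P = I * V."""
    return current_a * vdd_v


def fo4_delay_ps(l_drawn_um: float) -> float:
    """Rule of thumb from earlier in the lecture: FO4 delay ~ 425 ps * Ldrawn (um)."""
    return 425.0 * l_drawn_um


def max_clock_ghz(l_drawn_um: float, fo4_per_cycle: float) -> float:
    """Clock frequency when one cycle lasts `fo4_per_cycle` FO4 inverter delays."""
    period_ps = fo4_per_cycle * fo4_delay_ps(l_drawn_um)
    return 1000.0 / period_ps


# EV8 figures from the slide: 125 A at 1.2 V.
print(supply_power_w(125, 1.2), "W")

# Historical 2x per generation: 1.4x from devices times 1.4x from pipelining.
print(1.4 * 1.4)

# Hypothetical 0.13 um drawn gate length at different cycle depths.
for n_fo4 in (40, 16, 14, 8):
    print(f"{n_fo4:>2} FO4 per cycle -> {max_clock_ghz(0.13, n_fo4):.2f} GHz")
```

Once the cycle is already near 14-16 FO4, pipelining can no longer contribute its 1.4x, so further frequency gains have to come from device scaling alone.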
-
Low-Power Design Techniques
Supply Voltage Scaling
Frequency Scaling
Multiple Supply Voltages (Voltage Islands)
Clock Gating
Power Gating
Multiple Threshold Voltages: LVT, SVT, HVT
Substrate Biasing
Power Shut Off
HW/SW Power Management
-
Low-Power Application: PDA
0.18um / 400MHz / 470mW (typical)
[Figure: PDA SoC block diagram, 6.5M transistors]
On-chip blocks: CPU, I-cache (32KB), D-cache (32KB), DMA controller, LCD Cnt., MEM Cnt., PWR, CPG, I2C, FICP, USB, MMC, UART, AC97, I2S, OST, GPIO, SSP, PWM, RTC
Clock areas: processor area (max 400MHz), data transfer area (100MHz), peripheral area (4 to 48MHz)
Connected devices: SDRAM 64MB, Flash 32MB, LCD, USB, MMC, key, sound
MM applications: MP3, JPEG, simple moving picture
Available time: 6-10 hr
-
Trends in Low-Power Design Content
Today, such designs contain embedded processing engines such as CPU and DSP, and memory blocks such as SRAM and embedded DRAM
As we scale technology and keep power constant, how does the amount of logic vs. memory change?
Consider the following assumptions to develop trends for on-chip logic/memory percentages:
  Die size is 100mm2
  Clock frequency starts at 150MHz and increases by about 40% per technology node
  Average power dissipation is limited to 100mW at 100 °C
  Initial condition at Year 2001: area percentage 75% logic, 25% memory
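Two of these assumptions translate directly into numbers. The Python sketch below tabulates the assumed clock frequency per node and the power density implied by 100mW on a 100mm2 die. The year/node pairs come from the technology node slide, starting at the 2001 initial condition; the function names are hypothetical, and the sketch does not attempt to reproduce the logic/memory split itself.

```python
# (year, node in nm) pairs from the technology node slide, starting at 2001.
nodes = [(2001, 130), (2004, 90), (2007, 65), (2010, 45), (2013, 32), (2016, 22)]


def clock_mhz(node_index: int, start_mhz: float = 150.0, growth: float = 1.4) -> float:
    """Clock frequency after `node_index` nodes at about 40% growth per node."""
    return start_mhz * growth ** node_index


def power_density_w_per_cm2(power_mw: float = 100.0, die_mm2: float = 100.0) -> float:
    """Average power density for the assumed power budget and die size."""
    return (power_mw / 1000.0) / (die_mm2 / 100.0)


for i, (year, node_nm) in enumerate(nodes):
    print(f"{year}: {node_nm:>3} nm, clock ~ {clock_mhz(i):.0f} MHz")

print(f"power density: {power_density_w_per_cm2():.2f} W/cm2")
```

The resulting 0.1 W/cm2 is two orders of magnitude below the 10W/cm2 convection-cooling limit quoted for MPUs, which shows how tight the power budget is for this class of design.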
-
Logic/Memory Content Trend
[Figure: percentage of area (0% to 100%) vs. year (2001 to 2016) for logic area contribution (LSTP) and total memory area (LSTP); die size = 1cm2]
-
ASIC Logic/Memory Content Trends
[Figure: ASIC Core Composition Breakout; percentage of die area (I/Os excluded, 0 to 60) for 1999, 2000 and 2001, split into random logic, memory, analog and cores]
Source: Dataquest (2001)
-
Design Trend: Productivity Gap
Year  Technology  Chip Complexity  ASIC Frequency
1997  250 nm      50M Tr.          100MHz
1999  180 nm      150M Tr.         200MHz
2002  130 nm      250M Tr.         400MHz
2004  90 nm       500M Tr.         600MHz
-
Designing a 50M Transistor IC
Gates Required                     ~12.5M
Gates/Day (Verified)               1K (including memory)
Total Eng. Days                    12,500
Total Eng. Years                   35
Cost/Eng./Year                     $200K
Total People Cost                  $7M
Other Costs (masks, tools, etc.)   $8M
Actual cost is $10-15M to get actual prototypes after fabrication.
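The cost estimate on this slide is a chain of divisions and multiplications, sketched below in Python. Two parameters are not stated on the slide and are inferred from its numbers: about 4 transistors per gate (50M transistors giving ~12.5M gates) and about 357 engineering days per year (12,500 days giving 35 years). The function name and defaults are hypothetical.

```python
def design_effort(transistors: float,
                  transistors_per_gate: float = 4.0,    # inferred: 50M Tr. -> 12.5M gates
                  gates_per_day: float = 1000.0,        # verified gates per engineer-day
                  eng_days_per_year: float = 357.0,     # inferred: 12,500 days -> 35 years
                  cost_per_eng_year: float = 200_000.0) -> dict:
    """Engineering effort and people cost for a chip of the given size."""
    gates = transistors / transistors_per_gate
    eng_days = gates / gates_per_day
    eng_years = eng_days / eng_days_per_year
    return {
        "gates": gates,
        "eng_days": eng_days,
        "eng_years": eng_years,
        "people_cost": eng_years * cost_per_eng_year,
    }


base = design_effort(50e6)
print(base)

# Same productivity applied to the chip complexities in the productivity gap table.
for year, tr in ((1997, 50e6), (1999, 150e6), (2002, 250e6), (2004, 500e6)):
    effort = design_effort(tr)
    print(f"{year}: {effort['eng_years']:.0f} eng. years, "
          f"${effort['people_cost'] / 1e6:.0f}M people cost")
```

With productivity fixed at about 1K gates per day, effort grows in direct proportion to chip complexity, which is the productivity gap the lecture goes on to describe.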
-
Productivity Gap
Deep submicron (DSM) technology allows hundreds of millions of transistors to be integrated on a single chip
Number of transistors that a designer can design per day (~1000 gates/day) is not going up significantly
New design methodologies are needed to address the integration/productivity issues
System on a chip design with reusable IP
  new design methodology, IP development
  new HW/SW design and verification issues
  new test issues
-
SoC Design Hierarchy
SOC consists of new logic blocks and existing IP
New Logic blocks
Existing IP including memory
Each logic block can be implemented by a newly designed portion and a re-use portion based on IPs
Newly designed portion
Re-use portion including memory
-
SoC Platform Design Concept
[Figure: SoC platform design concept]
Application space
Methodology / flows:
  System-level performance evaluation
  Rapid prototype for end-customer evaluation
  SoC derivative design methodologies
  System-level performance evaluation environment
  HW/SW co-synthesis
  SoC verification flow
  SoC IC design flows
  Foundry-specific pre-qualification
Foundation block + reference design:
  MEM, FPGA, CPU
  Processor(s), RTOS(es) and SW architecture
  Scaleable bus, test, power, IO, clock, timing architectures
  Programmable IP, SW IP, hardware IP
  Pre-qualified/verified foundation IP*
*IP can be hardware (digital or analog) or software. IP can be hard, soft or firm (HW), source or object (SW).
-
Purpose of this Course
This course addresses SoC/IP design in DSM technologies
It is a very broad subject, one that industry is grappling with on a daily basis; one course cannot address all the issues properly
The goal is to present an overview of the various issues from Systems to Silicon to provide a perspective on what is happening in technology and design.
We will begin with the Systems Level and work our way down to the Silicon Level
The projects, presentations, and assignments will provide in-depth analysis of the subjects that are of interest to you