Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant...
Transcript of Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant...
![Page 1: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/1.jpg)
Kaushik RoyElectrical & Computer Engineering
Purdue University
Process-Tolerant Low-Power Design for the Nano-meter
Regime
![Page 2: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/2.jpg)
Exponential Increase in Leakage
Non-Silicon Technology
Silicon Micro- electronics
Silicon Nano- electronics
1 µm 100 nm 10 nm
1970 1980 2000 2010 2020
5 µm
Junctionleakage
GateSource
n+n+
Bulk
Drain
Subthreshold Leakage
Gate Leakage
A. Grove, IEDM 2002
Technology ( µµµµ)
Leak
age
Pow
er (
% o
f Tot
al)
0%
10%
20%
30%
40%
50%
1.5 0.7 0.35 0.18 0.09 0.05
Must stop at 50%
610ON
OFF
I
I====
310ON
OFF
I
I==== 2~6~ 10ON
OFF
I
I
![Page 3: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/3.jpg)
Technology Trend
Buried Oxide (BOX)
Substrate
Fully-depleted body
Gate
VG
VS VD
DrainSource
Vback
Buried Oxide (BOX)
Substrate
Fully-depleted body
Gate
VG
VS VD
DrainSource
Vback
Bulk-CMOS
FD/SOI
Nano devices Carbon nanotubeIII-V devices nano-wiresSpintronics
Single gate device
20032009
2020
DGMOS
FinFET Trigate
Multi-gate devices
Buried Oxide (BOX)
Substrate
Source Floating Body Drain
GateVS
VG
VD
Buried Oxide (BOX)
Substrate
Source Floating Body Drain
GateVS
VG
VD
PD/SOI
Design methods to exploit the advantages of technology innovations
![Page 4: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/4.jpg)
Variation in Process Parameters
Inter and Intra-die Variations
Device parameters are no longer deterministic
Device 1 Device 2
Channel length
130nm
30 %
5X
0.90 .9
1 .01 .0
1 .11 .1
1 .21 .2
1 .31 .3
1 .41 .4
11 22 33 44 55N orm alized Leakage (N o rm a lized Leakage ( IsbIsb ))
Nor
mal
ized
Fre
quen
cyN
orm
aliz
ed F
requ
ency
Delay and Leakage Spread
Source: Intel
10
100
1000
10000
1000 500 250 130 65 32Technology Node (nm)
# do
pant
ato
ms Source: Intel
Random dopant fluctuation
![Page 5: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/5.jpg)
ReliabilityTemporal degradation of performance -- NBTI
Tech. generation
Fai
lure
pro
babi
lity
Time
Defects Life time degradation
![Page 6: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/6.jpg)
Pessimistic Design Hurts Performance
worst-case corner
(150nm CMOS Measurements, 110 °°°°C)
0
50
100
150
200
Normalized I OFF
Num
ber
of d
ies
0 1 2 3 4 5 7
nominal corner
� Substantial variation in leakage across dies� 4X variation between nominal and worst-case leakage� Performance determined at nominal leakage� Robustness determined at worst-case leakage
![Page 7: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/7.jpg)
7
Global and Local Variations
− −− −− −− −= ∆ += ∆ += ∆ += ∆ +t t GLOBAL t LOCALV V Vδ δδ δδ δδ δinter-die
GLOBALσσσσ
−−−−∆∆∆∆ t GLOBALV
intra-die
LOCALσσσσ
−−−−t LOCALVδδδδ
Random Dopant Fluctuation
![Page 8: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/8.jpg)
8
Process Tolerance: Memories
S. Mukhopadhaya, Mahmoodi, RoyVLSI Circuit Symposium 2006, JSSC 2006, TCAD
![Page 9: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/9.jpg)
9
Parametric Failures: Read Failure
Read failure => Flipping of Cell Data while Reading
BL BR
WL
VL=‘1’
VR=‘0’VREAD
+∆∆∆∆
-∆∆∆∆
+∆∆∆∆
-∆∆∆∆NR
PR
NL
PLAXRAXL
VTRIPRDVR=VREAD
VL
WL
Vol
tage
Time ->
WL
VR
VL
Vol
tage
Time ->( )TRIPRDREADRF VVPP >=
![Page 10: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/10.jpg)
10
Parametric Failures in SRAM
BL BR
WL
NR
PR
NL
PL
AXRAXL‘1’ ‘0’
High-Vt Low-Vt
Parametric failures– Read Failures– Write Failures– Access Failures– Hold Failures
Parametric failures can degrade SRAM yieldFaulty chips Working chips
Test & Repair using Redundancy
![Page 11: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/11.jpg)
11
Process Variations in On-chip SRAM
Parametric failures →→→→Yield degradation
σVt ≈ 30mv, using BPTM 45nm technology
Yield ≈ 33%
0000
50505050
100100100100
150150150150
200200200200
250250250250
300300300300
3503503503500 52 105
157
210
262
315
367
419
472
524
577
629
682
734
786
839
890
944
996
1049
Chi
p C
ount
Chi
p C
ount
Fault statisticsFault statisticsFault statisticsFault statistics
Number of faulty cells (Number of faulty cells ( NNFaultyFaulty --CellsCells ))
Simulation of an 64KB Cache
A. Agarwal, et. al, JSSC, 05
![Page 12: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/12.jpg)
12
Inter-die Variation & Cell Failures
inter-die Vt shift ( ∆∆∆∆Vth-GLOBAL )
GLOBALσσσσ
“1” “0”
High–Vt Corners− Access failure ↑↑↑↑− Write failure ↑↑↑↑
“1” “0”
Low–Vt Corners− Read failure ↑↑↑↑− Hold failure ↑↑↑↑
![Page 13: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/13.jpg)
13
Inter-die Variation & Memory Failure
Memory failure probabilities are high when inter-die shift in process is high
LVT HVTNom. Vt
BPTM 70nm Devices
![Page 14: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/14.jpg)
14
Self-Repairing SRAM Array
Reduce the dominant failures at different inter-die corners to increase width of low failure region
Region A Region B Region CRegion C
HVT Corner
Access & Write failures dominate
Reduce AF & WF
Region A LVT Corner
Read & Hold failures dominate
Reduce RF & HF
LVT HVTNom. Vt
![Page 15: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/15.jpg)
15
Post-Silicon Repair: Proposed Approach
inter-die
GLOBALσσσσ
intra-die
L O C A Lσσσσ
intra-die
L O C A Lσσσσ
Apply correction to the global variation to reduce number of failures due to local variations
![Page 16: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/16.jpg)
16
Self-Repairing SRAM Array
Reduce the dominant failures at different inter-die corners to increase width of low failure region
Region A Region B Region C
LVT HVTNom. Vt
ZBB
RBB FBB
![Page 17: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/17.jpg)
17
How to identify the inter-die Vt corner under a large intra-die variation ?
Effect of inter-die variation can be masked by intra-die variation
Monitor circuit parameters, e.g. leakage current
BLBL BRBR
WLWL
VDDVDD
GNDGND
![Page 18: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/18.jpg)
18
Leakage of entire SRAM array is a reliable indicator of the inter-die Vt corner
1
1NY X
ii Y X
Y XN
σ σσ σσ σσ σµ µµ µµ µµ µ====
= => == => == => == => =∑∑∑∑
Array Leakage Monitoring
• Adding a large number of random variables reduces the effect of intra-die variation
![Page 19: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/19.jpg)
19
Self-Repair using Leakage Monitoring
SRAM ARRAY
VOUT
Entire array leakage is monitored to detect inter-die corner and proper body-bias
is selected
BypassSwitch
On-chipLeakageMonitor
CalibrateSignal
VDD
SRAMArray
Comparator
Bodybias Body-Bias
selection
VoutV
REF1 VREF2
FBB ZBB RBB
![Page 20: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/20.jpg)
20
Yield Enhancement using Self-Repair
Self-Repairing SRAM using body-bias can significantly improve design yield
![Page 21: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/21.jpg)
21
Test-Chip of Self-Repairing SRAMVCO
Technology : IBM 0.13 µµµµm
128KB SRAMDual-Vt Triple-well tech. Number of Trans: ~ 7 million Die size: 16mm2VLSI CKT Symp. 2006, ITC 2005
64 KB LVT
Array
16 KB block
VCO
Isol
ated
cel
l
Sensor + Ref. gen. BB gen
Simulation results for 1MB array designed in IBM 0.13 µµµµm
![Page 22: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/22.jpg)
22
Continuous vs Quantized Body Bias
Quantized (3 Level: FBB, ZBB, RBB) body bias scheme is a cost effective solution with good
yield enhancement possibility
![Page 23: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/23.jpg)
Process Tolerance: Register Files
Kim et. al. VLSI Circuit Symposium 2004
![Page 24: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/24.jpg)
Process Compensating Dynamic Circuit Technology
clk
. . .RS0 RS7
D0 D7
RS1
D1
LBL0
LBL1
N0
� Keeper upsizing degrades average performance
Conventional Static Keeper
![Page 25: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/25.jpg)
Process Compensating Dynamic Circuit Technology
3-bit programmable keeper
clk
. . .RS0 RS7
D0 D7
RS1
D1
LBL0
LBL1
N0
b[2:0]
W 2W 4Ws s s
� Opportunistic speedup via keeper downsizing
C. Kim et al. , VLSI Circuits Symp. ‘03
![Page 26: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/26.jpg)
� 5X reduction in robustness failing dies
0
50
100
150
200
250
0.7 0.8 0.9 1.0 1.1 1.2Normalized DC robustness
Num
ber
of d
ies
ConventionalThis work
Noise floor
saved dies
Robustness Squeeze
![Page 27: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/27.jpg)
� 10% opportunistic speedup
0
50
100
150
200
250
300
0.8 0.9 1.0 1.1 1.2Normalized delay
Num
ber
of d
ies
ConventionalThis work
PCD μμμμ = 0.90Conv. μμμμ = 1.00
μμμμ : avg. delay
Delay Squeeze
![Page 28: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/28.jpg)
On-Die Leakage Sensor For Measuring Process Variation
C. Kim et al. , VLSI Circuits Symp. ‘04
83μμμμm
73μμ μμ
m
current reference
comparators
current m
irrors
VBIASgen.
NMOS device
test interface
� High leakage sensing gain – 90nm dual-Vt, Vdd=1.2V, 7 level resolution, 0.66 mW @80Cº
![Page 29: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/29.jpg)
Output codes from leakage sensor
001 010 011 100 101 110 111
Leakage Binning Results
![Page 30: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/30.jpg)
Process detection
Self-Contained Process Compensation
Fab
Assembly
Wafer test
Burn inPackage testCustomer
Leakage measurement
On-die leakage sensor
Program PCD using fuses
![Page 31: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/31.jpg)
Self-Repair: Architecture Level
Agarawal, Roy TVLSI 2005
![Page 32: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/32.jpg)
Fault-Tolerant Cache Architecture
� BIST detects the faulty blocks� Config Storage stores the fault informationIdea is to resize the cache to avoid faulty blocks
during regular operation
Faulty
![Page 33: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/33.jpg)
Mapping Issue
New TAG
INDEX
Off
OffTAG INDEX
Column Address
More than one INDEXare mapped to same
block
Include column address bits into
TAG bits
Resizing is transparent to processor → same memory address
![Page 34: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/34.jpg)
Fault Tolerant Capability
0
50
100
150
200
250
300
350
0 105 210 315 419 524 629 734 839 944 1049
Chi
p C
ount
(N
chip
) Fault statistics
Chips saved by the proposed + redundancy (R=8, r=3)
Chips saved by ECC + redundancy ( R=16)
NFaulty-Cells
More number of saved chipsas compare to ECC ECC fails to save
any chips
� Proposed architecture can handle more number of faulty cells than ECC, as high as 890 faulty cells
� Saves more number of chips than ECC for a givenNFaulty-Cells
![Page 35: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/35.jpg)
CPU Performance Loss
0.0
0.5
1.0
1.5
2.0
2.5
0 105 210 315 419 524 629 734 839
% C
PU
Per
form
ance
Los
s For a 64K cache averaged over SPEC
2000 benchmarks
NFaulty-Cells
� Increase in miss rate due to downsizing of cache
� Average CPU performance loss over all SPEC 2000benchmarks for a cache with 890 faulty cells is ~ 2 %
![Page 36: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/36.jpg)
Logic: Process Tolerance
![Page 37: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/37.jpg)
37
Logic: A New Paradigm for Low-Voltage, Variation Tolerant Circuit Synthesis Using
Critical Path Isolation (CRISTA)
Ghosh, Bhunia, Roy -- ICCAD 2006
![Page 38: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/38.jpg)
38
• Post-Silicon technique for dynamic supply scaling and timing error detection/correction
• Error correction overhead is 1% for a 10% error rate
RAZOR: Dan Ernst et. al., MICRO 2003.
Razor Approach
D
CLK
DelayE
Q
Shadow Latch
Standard Latch
![Page 39: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/39.jpg)
39
Vdd Scaling and Process Tolerance: Conventional Solutions
• Low power:
– Reduce the supply voltage• Error rate increases
– Dual-Vt/dual-VDD assignment• Number of critical paths increases
• Robustness:
– Increase supply voltage• Power dissipation increases
– Upsize the gates• Switching capacitance increases
Low power and robustness: conflicting requirements
Low power and robustness: conflicting requirements
![Page 40: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/40.jpg)
40
CRISTA: Basic Idea
CLK
VDD=1V
VDD<1VVDD<1V
evaluateevaluate based on prediction
non-critical path activationcritical path activation
• Important points:
– Scale down the supply while making delay failures predictable
– Avoid the failures by adaptive clock stretching– Ensure that critical paths are activated rarely
Tc 2Tc
![Page 41: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/41.jpg)
41
path delay
Num
ber
of p
aths
predictable and restricted to a logic section having low activation probability
S2+S3
S1
Design ATc
CLK
S1
S2 S3
Design B
VDD=1V
Design BVDD<1V
Design Considerations for CRISTA
• Few predictable critical paths• Low activation probability of critical paths• Slack between critical and non-critical paths under variations
Design A: conventional design
Design B: proposed design
VDD=1V
![Page 42: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/42.jpg)
42
Case Study: AdderP0 G1 P G1 P2 G2 P3 G3
Co,3Co,2Co,1Co,0Ci,0
P0 G1 P1 G1 P2 G2 P3 G3
Co,3Co,2Co,1Co,0Ci,0 FAFA FAFA FA FA
Crit. path delay=260pslongest non-crit. path delay=165psP = 13uW (1-cycle)
VDD = 1V, TCLK = 260ps
Crit. path delay=330pslongest non-crit. path delay=260psP = 7.4uW (rare 2-cycles, decoder)
VDD = 0.8V, TCLK = 260ps
• Interesting features:
– Single critical path (activated by P0P1P2P3=1 & Ci,0=1)– Low activation probability of critical path
44% power saving by reducing voltage and, operating critical path at 2-cycle and other paths at 1-cycle
44% power saving by reducing voltage and, operating critical path at 2-cycle and other paths at 1-cycle
Can we apply same technique to any random logic?
![Page 43: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/43.jpg)
43
Carry Select Adder
Ci0=
0
Cin= 0
Cout
Co0
Co1
A2
Long latency path (LLP)
Short latency path (SLP1)
Short latency path (SLP2)
A2A3A4A5A2A2A3A4A5
A2A2A3A4A5A2A2A3A4A5
Ci1=
1
M1M2M3M4M5M6M7M8M9M10
A i : i-bit Adder, M k: k-stage MUX
Stage 1 Stage 10 Stage 6
LATENCYPREDICTOR
BLOCK
• ~20% power saving with ~6% area overhead
![Page 44: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/44.jpg)
44
Carry Save Multiplier
HAHA
FA
HA
HA
FAFA
FAFA FA
FA FA
Vector Merging Adder
Critical Path
Longest off-critical path
LATENCYPREDICTOR
BLOCK
• 25% power saving with ~5% area overhead
![Page 45: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/45.jpg)
45
Wallace Tree Multiplier
Vector Merging Adder
Full Adders
Half Adder
Vector Merging Adder
Partial Products
Stage 1
Stage 2
Stage 3
Final Product
Critical pathLongest off-critical path
• 29% power saving with ~4% area overhead
![Page 46: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/46.jpg)
46
Simulation Results
ISO Yield = 92%
20
25
30
35
40
45
12 bits 16 bits 32 bits
% P
ower
sav
ings
5
6
7
8
9
10
11
% A
rea
over
head
ISO Yield = 90%
0
5
10
15
20
25
12 bits 16 bits 32 bits
% P
ower
sav
ings
6
7
8
9
10
%A
rea
over
head
ISO Yield = 96%
26
27
28
29
8 bits 12 bits 16 bits
%P
ower
sav
ings
2.5
3.5
4.5
5.5
6.5
7.5%
Are
a ov
erhe
ad
0
2
4
6
8
10
12
14
6 bits 8 bits 10 bits 12 bits 20 bits
% T
hrou
ghpu
t pen
alty
3.7
3.8
3.9
4
4.1
4.2
4.3
4.4
% A
rea
Ove
rhea
d
% Throughput penalty% Area Overhead
Ripple Carry Adder Carry Save Multiplier
Wallace Tree Multiplier Performance penalty (WTM)
![Page 47: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/47.jpg)
47
1 1 1
1 2
1
'
2
'
1 1
( ,..., ,..., ) ( ,..., 1,..., ) ( ,..., 0,..., )
( ,..., 1,..., ); ( ,..., 0,..., )
n n ni i i
n ni
i i
i
i i
f x x x f x x x f x x x
CF CF
CF f x x x CF f
x x
x
x x
x
x
•
= • = + • == + •
= = = =
CF1(x i=1)
CF2(x i=0)
MU
X
f
inputs
f1
f2x i
Activation probability of cofactors can be reducedHow to choose Control Variable ?
Prob =50%
Prob =50%CF11
CF12
MU
X
x j
f1Prob =25%
Prob =25%
Random Logic: Shannon’s Expansion
![Page 48: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/48.jpg)
48
Original Circuit
CF11
CF21
CF32
CF63
CF53
CF42
MU
X N
etw
ork
Inputs
PO
Inputs
Further Isolation and Slack Creation by Sizing
• Slack creation strategy – Lagrangian Relaxation based sizing (B.C. Paul et. al., DAC 2004) is
used– Non-critical paths are selectively made faster– Critical paths are slightly slowed down
![Page 49: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/49.jpg)
49
0
20
40
60
80
cht sct pcle mux decod cm150a x2 alu2 count
% Im
p. in
pow
er
% imp in power with switching activity = 0.2 100
% imp in power with switching activity = 0.5
MCNC benchmarks, 70nm Process
Simulation Results
• Average power saving = ~50%• Average area overhead = 18%• Avg performance penalty=5.9% (with 4 control variables) for signal
prob=0.5
0
1.0
2.0
3.0
4.0
5.0
6.0
7.0
cht sct pcle mux decod cm150a x2 alu2 count
Are
a (x
103 )
um^2
Original design
Proposed design
MCNC benchmarks, 70nm Process
Power Area
![Page 50: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/50.jpg)
50
Two-Stage Pipeline with Test Logic
Outputs
Com
para
tor
Pre-decoder
Pre-decoder
gclk
SF
Fs
SF
Fs
SF
Fs
Car
ry-L
ook-
ahea
d A
dder
freeze
Stalling Logic
●
LFSR
TM1TM2
CLK
● ●
VDDm
●●
4:1
mux
Outputs
Com
para
tor
SF
Fs
SF
Fs
SF
Fs
Car
ry-L
ook-
ahea
d A
dder
●
TM1
● ●VDDo ●●
2:1
Mux
●●
Low Power Robust Pipeline
Conventional Pipeline
fixed vectors
fixed vectors
●
TM1
TM2
CLK
Clock generator
Test logic
Proposed pipeline
Test logic
Regular pipeline
GDS Layout
Power measurement of proposed pipeline
20% reduction in conventional pipeline
40% extra reduction using CRISTA
~40% power saving with ~13% performance penalty~40% power saving with ~13% performance penalty
![Page 51: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/51.jpg)
51
VDD Scaling, Process Variation, and Quality Trade-off: DCT
Banerjee, Karakonstantis, RoyDesign Automation and Test in Europe (DATE) 2007
![Page 52: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/52.jpg)
52
Basic Idea• All computations are “ not equally important ” for determining outputs
• Identify important and unimportant computations based on output “ sensitivity ”
• Compute important computations with “ higher priority ”
• Delay errors due to variations/ Vdd scaling “ affect only ” non-important computations
• “ Gradual degradation ” in output with voltage scaling and process variations
![Page 53: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/53.jpg)
53
DCT Based Image Compression Process
• DCT is used in current international image/video co ding standards
- JPEG, MPEG, H.261, H.263
QuantizerFDCT
Source image X
Compressed
Image Data
Z = T• X • T '
Z EntropyEncoder
RoundT• Z • T '
Q
JPEG Encoder Block Diagram
V
8××××8 blocks
512××××512 image
1D DCTTranspose
Memory1D DCTX Y ZW
![Page 54: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/54.jpg)
54
High energy components (important outputs 75% energ y)Low energy components (less important outputs)
64
1 2
4
3 5
6 7
9
8
10
11
12
13
14
15
19
20
17
18
2816
27
29
4330
3126 42 44
3225 4541
24
54
4033 5546 53
2321 3934 5247 6156
3522 4838 5751 6260
3736 5049 5958 63
Energy Distribution of a 2D-DCT Output
Can important components be computed with higher priority ?
![Page 55: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/55.jpg)
55(c)(d)
1D- intermediate DCT outputsInput Block
Transpose Memory
(a) (b)
1D-DCT
1D-DCT
Faster
Slo
wer
Com
puta
tion
Fas
ter
Com
puta
tion
Slower
x0
x2
x1
x3
x4
x5
x6
x7
w0
w2
w1
w3
w4
w5
w6
w7
w8 w16 w24 w32 w40 w48 w56
y0
y1
y2
y3
y4
y5
y6
y7
Final DCT outputs
Slo
wer
Com
puta
tion
Fas
ter
Com
puta
tionz0
z1
z2
z3
z4
Computation
Computation
Design Methodology
T.xt
![Page 56: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/56.jpg)
56
<< (3 adders delay)
(2 adders delay)
52( + )x x d•
4w1 6( + )x x d•
70( + )x x d•
3 4( + )x x d•0w
─
+
Path Delays for 1D-DCT outputs
(4 adders delay)
<< ─+
<<
(3 adders delay)
─
70( + )x x e•
3 4( + )x x e•
70( + )x x f•
1 6( + )x x e•
1 6( + )x x f•
3 4( + )x x f•
52( + )x x f•
3 4( + )x x f•
1 6( + )x x f•
52( + )x x e•
2w
6w
─+
+
+
+
─
─
(4 adders delay)
>>
1w
+─<<
>> +─
+─<<
<<+─
70( - )x x a•
3 4( - )x x f•
1 6( - )x x a•
1 6( - )x x f•52( - )x x e•
1 6( - )x x e•
70( - )x x f•
3 4( - )x x a•52( - )x x a•52( - )x x e•
52( - )x x f•
(3 adders delay)
7w
>>
3w+70( - )x x a•
(3 adders delay)
<<>>
3 4( - )x x e•
1 6( - )x x a•3 4( - )x x a•
5w
3 4( - )x x f•
52( - )x x f• <<
<<+
52( - )x x a•
70( - )x x e•
70( - )x x f•3 4( - )x x e•
52( - )x x f• +
+
+─
─
─
─
─
(4 adders delay)
![Page 57: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/57.jpg)
57
w0
w1
w2
w3
w4
w5
w6
w7
Longer Delays
Important Computations
Paths Not Computed
Delay=D1
@ Vdd1
w0
w1
w2
w3
w4
w5
w6
w7
D2 >D1
@Vdd2
D1
@Vdd2
Proposed DCT under Vdd scalingProposed Design with high/low delay paths Scaled Vdd: Longer paths under Vdd scaling
Extreme Scaled Vdd: Shorter paths affected
Paths Not Computed
w0
w1
w2
w3
w4
w5
w6
w7
D4 >D1
@Vdd3
D3 > D1
@Vdd3
Vdd3 < Vdd2 < Vdd1(nominal)D1 @Vdd3
Only DC component
![Page 58: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/58.jpg)
58
0
0.5
1
1.5
2
2.5
3
3.5
4
Pat
h1(w
0)
Pat
h2(w
1)
Pat
h3(w
2)
Pat
h4(w
3)
Pat
h5(w
4)
Pat
h6(w
5)
Pat
h7(w
6)
Pat
h8(w
7)
Computation Paths
Del
ay(n
s)
Conventional DCT Proposed DCT
1D-DCT Path Delay Comparisons
![Page 59: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/59.jpg)
59
Effect of Vdd Scaling
Conventional WTM DCT
1.0 V
0.9 V
0.8 V
FAILS
FAILSFAILS
FAILS
CSHM DCT (2 alphabet)
Proposed DCT
33.2233.2321.97PSNR (dB)
9033710873880490Area (um 2)
3.573.643.2Delay (ns)
2629.825.1Power (mW)
ProposedDCT
DCT with WTM
CSHM DCT(2 alphabets)
1.0V
23.4129PSNR (dB)
11.09(62.8%)17.53(41.2%)Power (mW)
Proposed DCTVdd=0.8VProposed DCT
Vdd=0.9V
Proposed Architecture at Reduced Voltage
Different Architectures at Nominal Voltage
• Graceful degradation of proposed DCT architecture under Vdd scaling ( Vdd can be scaled to 0.75V)
• Conventional architectures fails
![Page 60: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/60.jpg)
Temporal Degradation: NBTI
Kang, Roy, et. al. – TCAD, DAC-07
![Page 61: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/61.jpg)
Temporal Reliability Issues in CMOS Technology
• HCI – Hot Carrier Injection• NBTI – Negative Bias Temperature Instability
� Increase in V T of PMOS with time� The dominant reliability factors in
scaled tech.• TDDB, etc.
![Page 62: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/62.jpg)
NBTI: Negative Bias Temperature Instability
• Interface trap (N IT) generation at the channel interface due to the Si-H bond breaking, when negative gate bias is applied
• With time , VT increases , subthreshold slope (S) increases, mobility degrades,
• Drive current (I DS) reduces and affect the PMOS speed• Overall reduces the lifetime of PMOS
Interface trap generation due to Si-H bond breaking
![Page 63: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/63.jpg)
NBTI: Experimental Data
• PMOS VT degrades as a power of time due to NBTI• Fixed exponent of 1/6 matches the simulation data*
100
101
102
103
104
10-3
10-2
10-1
Stress time [s]
VT
h[V
]ExperimentalSimulation
~t0.17 trend line
* V. Huard and M. Denais, IRPS 2004
![Page 64: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/64.jpg)
Power-law V T degradation Model
NBTI degrades in time of exponent 1/6
Distance into oxide
y
NH
⋅⋅⋅⋅HD t
NH(0)
0
Reaction rate:
Conservation of hydrogen:
(((( ))))012IT H HN N D t= ⋅= ⋅= ⋅= ⋅
(0)0[ ]F
RIT IT H
kk
N N N N⋅ ⋅⋅ ⋅⋅ ⋅⋅ ⋅− =− =− =− =
(((( )))) (((( )))) (((( ))))0
0
,HD t
IT HN t N y t dy⋅⋅⋅⋅
= ⋅= ⋅= ⋅= ⋅∫∫∫∫
(0)0 0[ ] R
ITF IT IT H
dNk N N k N N
dt≈≈≈≈= − −= − −= − −= − −
(((( )))) (((( ))))10 4
2F
IT HR
k NN t D t
k====IT
TOX
q NV
C
⋅ ∆⋅ ∆⋅ ∆⋅ ∆∆ =∆ =∆ =∆ =
![Page 65: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/65.jpg)
Mobility degradation factor
• Mobility degradation due to NBTI is expressed in an additional V T shift, noted as m
• Overall temporal V T shift model is expressed as,
0 0.25
(1 )
oxE
Eox
TOX
mqχ E e t
VC
⋅⋅⋅⋅
∆ = +∆ = +∆ = +∆ = +
![Page 66: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/66.jpg)
Impact of NBTI on circuit performance
![Page 67: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/67.jpg)
Circuit Performance Degradation
• Performance (delay) degradation also follows the po wertrend with same 0.17 exponent
• In CMOS logic, only the rising (L2H) delay’s are affected
100
105
10-2
10-1
100
101
102
time (s)
% c
hang
e
VtInverter DelayROSC delay
109
![Page 68: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/68.jpg)
Circuit Performance degradation cont.
• Delay degradation in ISCAS c432 • Activity factor (switching activity) does not affec t much on the
delay degradation� In reality, activity factor’s are balanced in the n ormal operations
100
102
104
106
108
10-2
10-1
100
101
102
Time (s)
% C
hang
e
Si = 1 (worst case)
Si < 1 (with activity)
Vth
![Page 69: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/69.jpg)
Design method considering the NBTI degradation
![Page 70: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/70.jpg)
NBTI-aware design method
• Over-design is required to guarantee a lifetime stability of the circuit
• LR sizing is used to optimize the circuit� Size the circuit considering the worst-case V T degradation over the
lifetime
Delay Constraint
Required lifetime of the design
Reduced lifetime due to NBTI degradation
time
max
ckt
. del
ay
NBTI-aware over-design
![Page 71: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/71.jpg)
LR Sizing considering NBTI
* C. Chen et. al., TCAD 1999
1. Delay Constraint (DMAX)
2. Required Lifetime (TLife)
Calibrate switching activity’s (Si) in each node
Compute VT shift in each node
considering Si
LR sizing with delay constraint
DMAX
NBTI-aware Design
Lifetime ConstraintA new design constraint
Power-law V T model
Optimal sizing fromLagrangian Relaxation*
Guarantee a lifetime stabilityunder NBTI degradations
![Page 72: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/72.jpg)
Simulation results
7.609.50115.1148c74L85
6.837.90131.9188c74283
9.6310.3577.292c74182
8.689.89194.6372c74181
7.869.00597.33638c3540
8.539.18513.51582c1908
8.069.203681816c499
7.328.90525590c432
Si < 1Si = 1
% delay degrad. (10 yrs)Nominal delay (ps)
No. ofTrans.
Circuit
* All benchmarks are synthesized in BPTM 70nm techn ology
1. Delay degradation in ISCAS85 benchmark circuits after 10 years
![Page 73: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/73.jpg)
Simulation results cont.
5.85.8542.59120c74L85
10.010.066.71125c74283
11.211.331.180c74182
9.09.45111.1180c74181
3.313.441146.5500c3540
6.687.13489.67470c1908
6.717.82581.47340c499
13.614.8196.7385c432
Si < 1Si = 1
% Area overheadNominal area (um)
Nominal delay (ps)
Circuit
* All benchmarks are synthesized in BPTM 70nm techn ology
2. Area overhead in NBTI-aware sizing
![Page 74: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/74.jpg)
Negative Bias Temperature Instability
• PMOS specific Aging Effect• Generation of (+) traps• Reaction-Diffusion (RD) model*• Time exponent ~ 1/6
x
P+P+
GATE
OXIDE
DRAIN
VBODY = 0V
x
H
Si Si Si Si
HH
H
VGS < 0V
VS = 0VVD = 0V
n-sub
xx x HH H
H2
H2
H2
H2
H2
*M. A. Alam, IEDM’03
After NBTI degradation
(((( )))) (((( ))))10 6
2F
IT HR
k NN t D t
k====
ITT
OX
q NV
C⋅ ∆⋅ ∆⋅ ∆⋅ ∆
∆ =∆ =∆ =∆ =
![Page 75: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/75.jpg)
NBTI in Digital Circuits
VL
WL
BLBLB
VR
1
5
o1
o2
3
24
8
7
6
Logic Circuits• fMAX decreases ↓↓↓↓• Timing failure with time
Memory Circuits• Static Noise Margin (SNM) ↓↓↓↓• Read & Write Stability• Parametric Yield ↓↓↓↓
Temporal VTh increase in PMOS affects critical performance factors of digital VLSI circuits
i1
i3
i2
![Page 76: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/76.jpg)
NBTI: Random Logic Circuits
• ISCAS’85 Benchmark Circuits, PTM 65nm • Gate delay: analytical delay model considering NBTI• Circuit delay: NBTI-aware Static Timing Analysis (STA)
• Circuit fMAX � time exponent n ~ 1/6
PTM 65nmISCAS’85
Benchmarks
3 years Lifetime
8% fMAX decrease
n~1/6
100
102
104
106
108
100
Time (s)f M
AX
redu
ctio
n (%
)
c2670c5315c1908c499c3540c74181
26.930.1923.803NOR
26.821.8917.262NOR
14.822.4519.573NAND
17.919.8816.862NAND
21.816.7713.771INV
3 yearst=0Δ (%)
Delay (ps)fanin
LogicCell
Delay Degrad. STD cells
![Page 77: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/77.jpg)
NBTI: 6T SRAM Cell
104 106 108100
101
Time (s)
Deg
rada
tion
in S
NM
(%
)
PTM 60nm, 125 °°°°CCR = 1.33PR = 0.67
1/6 trend line
Static Noise Margin (SNM)
� SNM degrades by more than 10% in 3 years� % SNM Degradation ���� time exponent n ~ 1/6� WM improves with time under NBTI
0.43 0.44 0.45 0.46 0.47 0.48 0.490
0.2
0.4
0.6
0.8
1
Write Margin (V)
CD
F
t=0t=108
t=107
t=106
WM improveswith time
Distribution in Write Margin
![Page 78: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/78.jpg)
Design for Reliability under NBTI
580 600 620 640 660 680 700 720
550
600
650
700
750
800
Delay (ps)
Are
a (u
m)
INITTR-BasedCell-Based
Cell-based
Area SavingFrom TR-based
10psc1908
� Gate Sizing applied to guarantee lifetime functiona lity of design
� 11.7% overhead for Cell-based sizing� 6.13% overhead for TR-based sizing
� 45% improvement in area overhead� Runtime complexity for TR-based sizing is identical to that of Cell-based sizing
� Simulation Setup� Synthesized in PTM 65nm� 1/6 VTh degradation model� 125°°°°C Stress temperature� 50% Signal Probability at PI’s
![Page 79: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/79.jpg)
IDDQ based NBTI Characterization
• Test Circuit Fabricated• 1000 stage INV chain• DC Stress signal @Vin
• IDDQ measurement @GND
1.6 (nm)Tox
1.2 (V)VDD
209I/O Pin
20 (mm2)Die Size
CMOS 130nmTechnology
Inverter Chain
MicrophotographLayout
Vin
VDD
IDDQ Measurement1000 stages
![Page 80: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/80.jpg)
Correlation between IDDQ & fMAX
• DM < 3ms, Temp=125°C, Vstress=1.7V• IDDQ degradation � n~1/6 during
• Clear signature of NBTI
• Correlation between IDDQ and fMAX can be used to predict circuit performance degradation under NBTI
MAX – MIN < 1.2%
10
100
101
% I D
DQ
(Vin
=0.0
) de
grad
atio
n
Time (s)100 101 102 103 104 105
-2
0
2
4
6
8
% I D
DQ
(Vin
=1.7
) de
grad
atio
nIDDQ (Vin=0.0)IDDQ (Vin=1.7)
Vstress = 1.7V @150°°°°C
n ~ 1/6
n > 1/6
S1
S2
Linear relationship
103
107
105
1075°°°°C, 65nm PTMISCAS Benchmarks
100100
101
102
fMAX reduction (%)
I DD
Qre
duct
ion
(%)
c2670c5315c1908c499c3540c74181
1/6( ) ( )
(0) (0)
( :constant)
leak MAXleak freq
leak M
freq leak
AX
I t f tR t R
I f
R K KR
∆ ∆= ∝
= ×
∝ =
% IDDQ decrease% fMAX increase n~1/6
![Page 81: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/81.jpg)
IDDQ based Characterization Technique
Des
ign
phas
eP
ost-
silic
on p
hase
• Circuit-level NBTI Reliability Characterization
• IDDQ test is used• Expensive fMAX testing is avoided
(or minimized)• Accurate circuit level performance
degradation can be predicted
• IC specific burn-in to qualify the target produce
• Efficient way of field monitoring: dynamic local signature of produce usage
• Possible usage in other reliability sources; HCI
Initial Characterization•Compute Rleak, RfMAX
•Compute KfMAX = Rleak / RfMAX
Reliability Extrapolation•For each IDDQ measurement sample, Rleak is computed
•RfMAX = KfMAX X Rleak
•Estimate fMAX degradation
Lifetime Projection•Project IDDQ using KfMAX
NBTI Characterization Report
Temp. Dependency
![Page 82: Process-Tolerant Low-Power Design for the Nano …vlsi/courses/ee695kr/s...Process-Tolerant Low-Power Design for the Nano-meter Regime Exponential Increase in Leakage Non-Silicon Technology](https://reader030.fdocuments.in/reader030/viewer/2022021717/5b34f7df7f8b9a7e4b8ca720/html5/thumbnails/82.jpg)
82
Conclusions
�Process Variation and Process Tolerance is becoming important
�There is a need to optimize designs considering power/performance/yield