EE241B : Advanced Digital Circuits
Transcript of EE241B : Advanced Digital Circuits
inst.eecs.berkeley.edu/~ee241b
Borivoje Nikolić
EE241B : Advanced Digital Circuits
Lecture 20 – Dynamic Voltage Scaling
EECS241B L20 DYNAMIC VOLTAGE SCALING 1
Supreme Court hands Google a victory in a multibillion-dollar case against OracleBy Brian Fung, CNN Business, Mon April 5, 2021
The Supreme Court has handed Google a win in a decade-old case in software development, holding that the technology giant did not commit copyright infringement against Oracle when it copied snippets of programming language to build its Android operating system.
Announcements
• Assignment 3 due this week• Quiz next Tuesday
EECS241B L20 DYNAMIC VOLTAGE SCALING 2
Outline
• Power-performance tradeoffs through supply voltage• Multiple supplies
• Dynamic voltage scaling
3EECS241B L20 DYNAMIC VOLTAGE SCALING
5.F Multiple Supplies
4EECS241B L20 DVS
Power /Energy Optimization Space
Constant Throughput/Latency Variable Throughput/Latency
Energy Design Time Sleep Mode Run Time
Active
Logic designScaled VDD
Trans. sizingMulti-VDD
Clock gatingDFS, DVS
Leakage
Stack effectsTrans sizingScaling VDD
+ Multi-VTh
Sleep T’sMulti-VDD Variable VTh
+ Input control
DVS Variable VTh
EECS241B L20 DVS 5
Multiple Supply Voltages• Block-level supply assignment (“power domains” or “voltage islands”)
• Higher throughput/lower latency functions are implemented in higher VDD
• Slower functions are implemented with lower VDD
• Often called “Voltage islands”• Separate supply grids, level conversion performed at block boundaries
• Multiple supplies inside a block• Non-critical paths moved to lower supply voltage• Level conversion within the block• Physical design challenging
EECS241B L20 DVS 6
Power Domains
EECS241B L20 DVS 7
Practical Examples
• Intel Skylake (ISSCC’16)• Four power planes indicated by colors
EECS241B L20 DVS 8
Practical Examples
• Intel 28-core Skylake-SP (ISSCC’18)
EECS241B L20 DVS 9
Leakage Issue
• Driving from VDDL to VDDH Level converter
EECS241B L20 DVS 10
Multiple Supplies Within A Block• Downsizing, lowering the supply on the critical path will lower the
operating frequency• Downsize (lowering supply) non-critical paths
• Narrows down the path delay distribution• Increases impact of variations
EECS241B L20 DVS 11
Multiple Supplies in a Block
Lower VDD portion is shaded
CVS StructureConventional Design
Critical Path
Level-Shifting F/F
Critical Path
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
FF
M.Takahashi, ISSCC’98. “Clustered voltage scaling”EECS241B L20 DVS 12
Multiple Supplies in a Block
CVS Layout:
Usami’98
EECS241B L20 DVS 13
Level-Converting Flip-Flop
CK
CK
CLK CKCK
D
VL
CK
Q
CK
VH
M1 M2
CK
EECS241B L20 DVS 14
5.F Dynamic Voltage Scaling
15EECS241B L21 DVS2
Power /Energy Optimization Space
Constant Throughput/Latency Variable Throughput/Latency
Energy Design Time Sleep Mode Run Time
Active
Logic designScaled VDD
Trans. sizingMulti-VDD
Clock gatingDFS, DVS
Leakage
Stack effectsTrans sizingScaling VDD
+ Multi-VTh
Sleep T’sMulti-VDD Variable VTh
+ Input control
DVSVariable VTh
EECS241B L21 DVS2 16
Adaptive Supply Voltages
EECS241B L21 DVS2 17
Processors for Portable Devices
1000
100
10
1Perfo
rman
ce (M
IPS)
Processor Energy (Watt*sec)1 100.1
• Eliminate performance ↔ energy trade-off
PDAs
Pocket-PCs
NotebookComputers
DynamicVoltageScaling
BurdISSCC’00
EECS241B L21 DVS2 18
Typical MPEG IDCT Histogram
EECS241B L21 DVS2 19
timeSystem Idle
DesiredThroughput
Maximum Processor Speed
Background andhigh-latency processes
Compute-intensive andlow-latency processes
• Maximize Peak Throughput• Minimize Average Energy/operation
System Optimizations:BurdISSCC’00
Processor Usage Model
EECS241B L21 DVS2 20
Compute ASAP:
Deliv
ered
Thr
ough
put
Clock Frequency Reduction:
Excessthroughput
Always high throughput
Energy/operation remains unchanged…while throughput scaled down with fCLK
fCLKReduced
time
time
Common Design Approaches (Fixed VDD)
EECS241B L21 DVS2 21
0
0.5
1
0 0.5 1
Ener
gy/o
pera
tion
Throughput ( fCLK)
Constant supply voltage
Reduce VDD, slowcircuits down.
~10x EnergyReduction
3.3V
1.1V
BurdISSCC’00
Scale VDD with Clock Frequency
∝EECS241B L21 DVS2 22
InverterRingOscRegFileSRAM
1.0
0.5
0VT 2VT 3VT 4VT
Norm
alize
d m
ax. f
CLK
VDD
Delay tracks within +/- 10%BurdISSCC’00
CMOS Circuits Track Over VDD
EECS241B L21 DVS2 23
time
• Dynamically scale energy/operation with throughput.• Always minimize speed → minimize average energy/operation.• Extend battery life up to 10x with the exact same hardware!
Vary fCLK,VDDDe
liver
edTh
roug
hput
1 2 Dynamically adapt
BurdISSCC’00
Dynamic Voltage Scaling (DVS)
EECS241B L21 DVS2 24
• DVS requires a voltage scheduler (VS).• VS predicts workload to estimate CPU cycles.• Applications supply completion deadlines.
0
20
40
60
80
0 0.2 0.4 0.6 1.0 1.2 1.40.8
Processor Speed (MPEG)
F DESI
RED
(MHz
)
Time (sec)
Operating System Sets Processor Speed
DESIREDCPU cycles F
time=
∆
EECS241B L21 DVS2 25
RST Counter
Latch
Digital Loop Filter
L CDD
VDD
PENAB
NENABΣFERR
FMEAS
f1MHz
0110
100 FDES
+Register
fCLK
Ring Oscillator Processor
IDD
• Feedback loop sets VDD so that FERR → 0.• Ring oscillator delay-matched to CPU critical paths.• Custom loop implementation → Can optimize CDD.
7
Buck converter
Set byO.S.
BurdISSCC’00
Converter Loop Sets VDD, fCLK
EECS241B L21 DVS2 26
• Circuit design constraints. (Functional verification)
• Circuit delay variation. (Timing verification)
• Noise margin reduction. (Power grid, coupling)
• Delay sensitivity. (Local power distribution)
Design verification complexity similar to high-performance processor design @ fixed VDD
Design Over Wide Range of Voltages
EECS241B L21 DVS2 27
• Cannot use NMOS pass gates – fails for VDD < 2VT.
• Functional verification only needed at one VDD value.
InverterRingOscRegFileSRAM
1.0
0.5
0VT 2VT 3VT 4VT
Norm
alize
d m
ax. f
CLK
VDD
BurdISSCC’00
Delay Variation & Circuit Constraints
EECS241B L21 DVS2 28
+40
+20
0
-20Perc
ent D
elay V
ariat
ion
VDDVT 2VT 3VT 4VT
• Timing verification only needed at min. & max. VDD.
Delay relative to ring oscillator
Gate
Interconnect
DiffusionSeries
Four extreme cases ofcritical paths:
All vary monotonically with VDD.
BurdISSCC’00
Relative Delay Variation
EECS241B L21 DVS2 29
Multiple Path Tracking
A. Drake, ISSCC’07EECS241B L21 DVS2 30
Multiple Path Tracking
Cho, ISSCC’16EECS241B L21 DVS2 31
Tracking with SRAM in Critical Path
Mismatch between logic and SRAM
SRAM multiplictive replica
Niki, JSSC’11EECS241B L21 DVS2 32
Alternative: Error Detection
Bull, ISSCC’2010EECS241B L21 DVS2 33
• Static CMOS logic.
• Ring oscillator.
• Dynamic logic (& tri-state busses).
• Sense amp (& memory cell).
Max. allowed |dVDD/dt| → Min. CDD = 100nF (0.6µm)Circuits continue to properly operate as VDD changes
Design for Dynamically Varying VDD
EECS241B L21 DVS2 34
VDD
• Static CMOS robustly operates with varying VDD.
Vin = 0 Vout = VDDrds|PMOS
CL
Vout
max. τ = 4ns
0.6µm CMOS: |dVDD/dt| < 200V/µs
Static CMOS Logic
EECS241B L21 DVS2 35
Ring Oscillator
• Output fCLK instantaneously adapts to new VDD.
60 80 100 120 140 160 180 200 220 240 260
0
1
2
3
4Vo
lts
Time (ns)
fCLK
VDD
Simulated with dVDD/dt = 20V/µs
EECS241B L21 DVS2 36
VDD
Vout
Vin
clk
clk
Volts
Time
VoutVDDFalse logic low: ∆VDD > VTP
Latch-up: ∆VDD > Vbe
Errors
• Cannot gate clock in evaluation state.• Tri-state busses fail similarly → Use hold circuit.
0.6µm CMOS: |dVDD/dt| < 20V/µs
clk = 1
∆VDD
−∆VDD
Dynamic Logic
EECS241B L21 DVS2 37
100
80
60
40
20
00 1 2 3 4 5 6
Dhry
ston
e 2.1
MIPS
Energy (mW/MIPS)
85 MIPS @5.6 mW/MIPS
(3.8V)
6 MIPS @0.54 mW/MIPS
(1.2V)
• Dynamic operation can increase energy efficiency > 10x.
x
Static VDD
Dynamic VDD
BurdISSCC’00
Measured System Performance & Energy
EECS241B L21 DVS2 38
VDD-Hopping
MPEG-4 encoding
Nor
mal
ized
pow
er
0
0.2
0.4
0.6
0.8
1
2 3 8
# of frequency levels1
Transition time
between ƒ levels
= 200µs
Time
n-th slice finished hereNext milestone
#n #n+1
Application slicing and software feedback guarantee real-time operation.
Two hopping levels are sufficient.EECS241B L21 DVS2 39
Dithering Between Supply Levels
Keller et al, ESSCIRC’16
EECS241B L21 DVS2 40
• Done with switched-capacitor DC-DC converters which efficiently work only at discrete levels
Dithering Between Supply Levels
• Dithering fills in between fixed DC-DC modes
EECS241B L21 DVS2 41
Keller et al, ESSCIRC’16
Next Lecture
• Low-power design• Leakage
EECS241B L20 DYNAMIC VOLTAGE SCALING 42