![Page 1: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/1.jpg)
Energy Efficient Computing in Nanoscale CMOS
Vivek De
Intel Fellow
Director of Circuit Technology Research
Intel Labs
![Page 2: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/2.jpg)
Internet of Everything (IoE)
Need end-to-end energy efficiency
![Page 3: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/3.jpg)
Moore’s Law scaling
Continued benefits from Moore’s Law: more, better transistors; more cores
[Figure: scaling trend on a log axis (10^3 to 10^9) from 45nm (2007) to 14nm Trigate (2014)]
![Page 4: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/4.jpg)
Dynamic platform control
Deliver best user experience under constraints
[Figure: SoC block diagram, repeated per panel: scalable on-die interconnect fabric; graphics, video, and special-purpose engines; integrated memory controllers; off-die interconnect; caches and last-level caches]
Dynamic V/F control
Independent V/F control regions
Workload-based core activation & shutdown
Scenario-based power allocation
Maximize performance & efficiency
![Page 5: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/5.jpg)
Near Threshold Voltage (NTV) computing
[Figure: Normalized energy/cycle vs. voltage/frequency operating points for the 32nm CPU process and the 32nm SOC process, spanning the sub-threshold, NTV, and normal operating ranges; annotations: 400mV, 500mV, 3X, 5X]
![Page 6: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/6.jpg)
NTV IA processor
Technology: 32nm High-K Metal Gate
Interconnect: 1 Poly, 9 Metal (Cu)
Transistors: 6 Million (Core)
Core Area: 2 mm²
[Figure: 5 mm × 5 mm die photo showing the IA-32 core, PLL, JTAG, and I/O areas; the IA-32 core (1.1 mm × 1.8 mm) contains logic, scan, ROM, L1$-I, L1$-D, and level shifters + clk spine]
![Page 7: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/7.jpg)
NTV design techniques
[Figure: Modified Register File Cell (L1$): transistors m1 to m10 with wrwl, wrwl#, rdwl, wrbl#, rdbl, bit, and bitx signals]
Robust Flop Topologies
Multi-corner design optimizations (SCL)
[Figure: Optimization Corners: frequency vs. voltage (V), 0.5V to 1.1V]
Variation-aware design: 2X min Z, 40% lib cells used
[Figure: Delay spread due to random variations: 2-input NAND gate delay vs. voltage (V), 0.4V to 1.0V]
[Figure: 4:1 Mux]
Narrow muxes, no stack height > 2
[Figure: Robust level converters between the vccl and vcch domains]
![Page 8: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/8.jpg)
NTV IA – powered by solar cell!
![Page 9: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/9.jpg)
Power performance measurements
[Figure: Frequency (MHz, log scale) and total power (mW) vs. Logic Vcc / Memory Vcc (V); 32nm CMOS, 25°C]
Logic Vcc: 0.2V to 1.2V; Memory Vcc: 0.55V for Logic Vcc up to 0.5V, equal to Logic Vcc from 0.6V to 1.2V
Frequency labels: 915MHz, 500MHz, 100MHz, 3MHz
Power labels: 737mW, 174mW, 17mW, 2mW
![Page 10: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/10.jpg)
Power components
[Figure: Pie charts of power components (Logic Dynamic Power, Logic Leakage Power, Memory Dynamic Power, Memory Leakage Power) at three operating points]
Vcc-max (Super-Threshold): Logic Vcc 1.2V, Memory Vcc 1.2V
Vcc-opt (Near-Threshold): Logic Vcc 0.45V, Memory Vcc 0.55V
Vcc-min (Sub-Threshold): Logic Vcc 0.28V, Memory Vcc 0.55V
Chart labels: 81%, 11%, 3%, 5%; 53%, 27%, 15%, 5%; 4%, 33%, 62%, 1%
![Page 11: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/11.jpg)
Minimum energy operation
[Figure: Energy/cycle (nJ) vs. Logic Vcc / Memory Vcc (V), showing total energy, leakage energy, and dynamic energy; 32nm CMOS, 25°C; annotation: 4.7X]
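As a rough cross-check of the 4.7X annotation, energy per cycle can be estimated as total power divided by frequency. The sketch below assumes the labels on the power-performance slide pair up in the order listed (915MHz with 737mW, 500MHz with 174mW, 100MHz with 17mW, 3MHz with 2mW). That pairing and the function name are assumptions for illustration; the slides do not state them.

```python
# (frequency in MHz, total power in mW). The pairing of labels is an
# assumption based on the order in which they appear on the slide.
MEASURED_POINTS = [(915.0, 737.0), (500.0, 174.0), (100.0, 17.0), (3.0, 2.0)]


def energy_per_cycle_nj(freq_mhz: float, power_mw: float) -> float:
    """Energy per cycle in nJ: mW / MHz = (1e-3 J/s) / (1e6 cycles/s) = 1e-9 J."""
    return power_mw / freq_mhz


# Energy per cycle at each assumed operating point, keyed by frequency.
energies = {f: energy_per_cycle_nj(f, p) for f, p in MEASURED_POINTS}

# Operating point with the lowest energy per cycle.
best_freq = min(energies, key=energies.get)

# How much lower the minimum is than the highest-frequency point.
improvement = energies[915.0] / energies[best_freq]

for f, e in energies.items():
    print(f"{f:6.0f} MHz: {e:.3f} nJ/cycle")
print(f"minimum at {best_freq:.0f} MHz, {improvement:.1f}X lower than at 915 MHz")
```

Under that pairing, the lowest energy per cycle is about 0.17 nJ at the 100MHz point, roughly 4.7 times lower than at 915MHz, which agrees with the annotation on the plot. Energy per cycle rises again at 3MHz because each cycle takes longer and accumulates more leakage energy.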
![Page 12: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/12.jpg)
NTV and variability
[Figure: Energy/cycle (pJ, log scale) vs. frequency (MHz) for fast, medium, and slow dies; 32nm CMOS, 25°C; annotations: 16%, 30%, 22%, 28%, 18%]
Leakage Comparison
Slow: 1.0X
Medium: 2.5X
Fast: 7.5X
![Page 13: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/13.jpg)
Voltage-frequency margins
[Figure: four voltage-frequency panels, each marking a V margin and an F margin]
V variation: IR drop, inductive droops, load line variations
T variation: nominal T vs. worst T
Aging: BOL vs. EOL
Path activity: nominal vs. worst; MIS, signal coupling, critical path
![Page 14: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/14.jpg)
Dynamic adaptation & reconfiguration
Adapt & reconfigure for best power-performance
[Figure: Processor sensors (aging, thermal, voltage, current) feed a Dynamic Control Unit, which drives voltage control (change V), frequency control (change F), and configuration control (reconfigure)]
![Page 15: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/15.jpg)
Dynamic V & F adaptation
[Figure: Prototype chip in 90nm: TCP/IP processor core with clocking control (PLL0, PLL1, PLL2, divider, core clk gate, I/O clk), DAB control, thermal sensor, droop sensor, PMOS and NMOS CBG for body bias, noise generators and noise injector, input buffer, output port, JTAG, and VRM]
[Figure: Temperature vs. time and Vcc vs. time (1st, 2nd, and 3rd droop), with PLL commands selecting among F0, F1, and F2]
Environment-aware dynamic adaptation
• Adapt F/V to V/T change → reduce V/T margin
• Adapt F/V to aging → reduce aging margin
Source: Intel
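The margin argument can be illustrated with a toy model: a static design must clock for the worst-case voltage, temperature, and aging condition, while an adaptive design clocks for the condition its sensors currently report. Everything in the sketch below is hypothetical (the linear model, the coefficients, the sensor values, and all names); none of it comes from the prototype chip.

```python
from dataclasses import dataclass


@dataclass
class SensorReadings:
    temperature_c: float  # thermal sensor reading
    droop_mv: float       # droop sensor reading (supply drop below nominal)
    aging_pct: float      # aging sensor estimate of delay degradation, in %


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
FMAX_NOMINAL_MHZ = 1000.0   # max frequency at 25C, no droop, no aging
MHZ_PER_DEG_C = 1.0         # frequency lost per degree above 25C
MHZ_PER_MV_DROOP = 2.0      # frequency lost per mV of supply droop


def safe_frequency_mhz(r: SensorReadings) -> float:
    """Highest frequency the toy model considers safe for these readings."""
    f = FMAX_NOMINAL_MHZ
    f -= MHZ_PER_DEG_C * max(0.0, r.temperature_c - 25.0)
    f -= MHZ_PER_MV_DROOP * r.droop_mv
    f -= FMAX_NOMINAL_MHZ * r.aging_pct / 100.0
    return max(f, 0.0)


# A static design is clocked for the worst case it may ever see.
WORST_CASE = SensorReadings(temperature_c=110.0, droop_mv=100.0, aging_pct=5.0)
static_mhz = safe_frequency_mhz(WORST_CASE)

# An adaptive design re-evaluates the limit from current sensor readings.
typical = SensorReadings(temperature_c=60.0, droop_mv=20.0, aging_pct=1.0)
adaptive_mhz = safe_frequency_mhz(typical)

print(f"static guardband: {static_mhz:.0f} MHz")
print(f"adaptive, typical conditions: {adaptive_mhz:.0f} MHz")
print(f"margin recovered: {adaptive_mhz - static_mhz:.0f} MHz")
```

In this toy setup the adaptive setting runs faster than the static one whenever conditions are better than worst case, and it falls back to the static frequency when the sensors report the worst case. The same margin could instead be spent on a lower voltage at fixed frequency.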
![Page 16: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/16.jpg)
Resilient platforms
Resiliency for performance, efficiency & reliability
[Figure: Resiliency framework across the platform stack: Circuit & Design, Microarchitecture, Microcode, Firmware, VM, OS, Programming System, Applications; arrows labeled "Lower error rate", "Less recovery overhead", and "Less silicon overhead"]
Resilient platform features:
System adaptation
System recovery
Error correction
Fault confinement
Fault diagnosis
Error detection
System reconfiguration
![Page 17: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/17.jpg)
Resilient & adaptive core
[Figure: Core block diagram: pipeline stages IF, DE, RA, EX, MEM, X, WB with PC and error replay; 16KB I$, 16KB D$, RF; Error Control Unit; Adaptive Clock Control; clock generator with refclk, PLL, duty cycle, ÷, and ½ FCLK]
[Figure: Die photo: I/O, DCache, ICache, Core, Clock, RF, JTAG]
Technology: 45nm CMOS
Die Area: 13.64 mm²
Core Area: 0.39 mm²
Core FMAX: 1.45GHz at 1.0V
Core Power: 135mW at 1.0V
![Page 18: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/18.jpg)
Performance & efficiency gains
[Figure: Throughput gain (%) by application (edgedetect, linkedlist, bubble) for EDS and TRC]
[Figure: Total energy (mJ) vs. throughput (BIPS) for Conventional, EDS, and TRC with 10% VCC droop; annotations: 22%, 41%]
[Figure: Input image and output images with resiliency ON and resiliency OFF]
![Page 19: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/19.jpg)
Integrated voltage regulators
![Page 20: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/20.jpg)
Fully integrated VR
![Page 21: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/21.jpg)
Energy efficient interconnects
[Figure: Energy efficiency (pJ/bit, log scale) vs. channel loss at symbol rate (dB); green = Intel (research)]
[Figure: Optical I/O link: laser, driver, modulator, fiber, PIN diode, TIA, receiver]
![Page 22: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/22.jpg)
Memory capacity & bandwidth
![Page 23: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/23.jpg)
Neuromorphic computing
![Page 24: Energy Efficient Computing in Nanoscale CMOScdworkshop.eit.lth.se/fileadmin/eit/group/71/2016/2016_Lund_Works… · Near Threshold Voltage (NTV) computing 0 0. 2 0. 4 0. 6 0. 8 1](https://reader034.fdocuments.in/reader034/viewer/2022052013/602a8b1397688603e96e94ba/html5/thumbnails/24.jpg)
End-to-end efficiency for IoE