Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew...
![Page 1: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/1.jpg)
Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and
Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S.
Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525.
Sapan Agarwal, Alexander Hsia, Robin Jacobs-Gedrim, David R. Hughart,
Steven J. Plimpton, Conrad D. James, Matthew J. Marinella
Sandia National Laboratories
Designing an Analog Crossbar based
Neuromorphic Accelerator
“For Internal E3S Use Only. These Slides May Contain Prepublication Data and/or
Confidential Information.”
![Page 2: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/2.jpg)
The Von Neumann Bottleneck
[Figure: CPU and memory connected by a communications bus; processor layer stacked with a photonic layer]
• Current transistors ~ 10 aJ
• 40kT noise limit ~ 0.2 aJ
• Optical interconnects 100 fJ to 1 pJ
• Cross chip communications ~ 1 pJ
• DRAM access > 10 pJ
• Ethernet ~ 1 nJ
Communications require orders of magnitude more energy!
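The gap between switching and communication energy can be checked with a few lines of arithmetic. This is a rough sketch only: the temperature of 300 K is an assumption (the slide does not state one), and the dictionary labels are illustrative names for the approximate figures quoted above.

```python
K_B = 1.380649e-23   # Boltzmann constant in J/K
TEMPERATURE = 300.0  # assumed room temperature in kelvin (not stated on the slide)

# 40 kT noise limit, in joules
noise_limit = 40 * K_B * TEMPERATURE

# Approximate energies quoted on the slide, in joules
energies = {
    "current transistor": 10e-18,
    "optical interconnect (low end)": 100e-15,
    "cross chip communication": 1e-12,
    "DRAM access (lower bound)": 10e-12,
    "Ethernet": 1e-9,
}

print(f"40 kT at {TEMPERATURE:.0f} K = {noise_limit * 1e18:.2f} aJ")
for name, energy in energies.items():
    ratio = energy / energies["current transistor"]
    print(f"{name}: {ratio:.0e} x a transistor switch")
```

Under these assumptions the 40 kT limit comes out just under 0.2 aJ, and a cross chip transfer costs several orders of magnitude more than a transistor switch.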
![Page 3: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/3.jpg)
Use Resistive Memories for Local Computation
• A resistive memory or ReRAM is a programmable resistor
• Applying small voltages allows the conductance to be read: I = G × V
• Applying large voltages changes the resistance
[Figure: Ta / TaOx / Pt device in the ON and OFF states]
[Figure: current vs. voltage curve marking VREAD, VSET, and VRESET, with the read window and the SET and RESET write regions]
Multiplication: I = G×V (equivalently V = I×R)
Addition: I = I1 + I2
![Page 4: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/4.jpg)
Directly Process in the Memory Itself
Analog is efficiently and naturally
able to combine computation and
data access
Effectively, large-scale processing in
memory with a multiplier and adder
at each real-valued memory location
[Figure: 4×4 crossbar with weights w11 … w44; row voltages V1=x1, V2=x2, V3=x3, V4=x4 drive the rows and each column current is read out]
I1 = x1*w11 + … + x4*w41
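The column current above is a vector matrix multiply, which can be checked numerically. The following is a minimal NumPy sketch, assuming ideal devices with no wire resistance or noise; the function name `crossbar_column_currents` and the example values are illustrative and do not come from the slides.

```python
import numpy as np

def crossbar_column_currents(voltages, conductances):
    """Ideal crossbar read: I_j = sum_i V_i * G_ij.

    Each device multiplies (Ohm's law) and each column wire adds
    (Kirchhoff's current law).
    """
    voltages = np.asarray(voltages, dtype=float)
    conductances = np.asarray(conductances, dtype=float)
    return voltages @ conductances

# Row inputs x1..x4 and a 4x4 weight array, where w[i, j] holds w_(i+1)(j+1)
x = np.array([0.0, 0.33, 0.66, 1.0])
w = np.arange(1.0, 17.0).reshape(4, 4)

currents = crossbar_column_currents(x, w)
print(currents)
```

All four column currents are produced by one parallel read, which is the property the crossbar exploits.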
![Page 5: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/5.jpg)
Crossbars Can Perform Parallel Reads and Writes
[Figure: parallel read of a 4×4 crossbar with weights w11 … w44 and row voltages V1=x1 … V4=x4]
[Figure: parallel write of a 4×4 crossbar; row pulses (V vs. t) encode x1=0, x2=0.33, x3=0.66, x4=1 and column pulses encode y1=0.25, y2=0.5, y3=0.75, y4=1]
[Figure: crossbar with N rows and M columns]
Energy to charge the crossbar is CV²
E ∝ C ∝ number of RRAMs ∝ N×M
E ~ O(N×M)
![Page 6: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/6.jpg)
SRAM Arrays Require Charging Columns Multiple Times
[Figure: SRAM array with word lines WL[0] … WL[2] and bit lines BL[0] … BL[2]; N rows, M columns]
SRAMs must be read one row at a time, charging M columns
Each column wire length is O(N).
Energy = N Rows × M Columns × O(N) wire length
Energy ~ O(N²×M)
O(N) times worse than a crossbar!
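The scaling argument for the crossbar and the SRAM array can be written as two small cost functions. This is a sketch of the asymptotic argument only: the unit costs are arbitrary placeholders, and the function names are hypothetical.

```python
def crossbar_read_energy(n_rows, m_cols, energy_per_cell=1.0):
    """Parallel read: every cell is charged once, so E ~ N*M."""
    return energy_per_cell * n_rows * m_cols

def sram_read_energy(n_rows, m_cols, energy_per_unit_wire=1.0):
    """Row-by-row read: N reads, each charging M column wires of length ~N."""
    return energy_per_unit_wire * n_rows * m_cols * n_rows

for n in (64, 256, 1024):
    ratio = sram_read_energy(n, n) / crossbar_read_energy(n, n)
    print(f"N = {n}: SRAM / crossbar energy ratio = {ratio:.0f}")
```

With equal unit costs the ratio is exactly N, which is the O(N) advantage claimed for the crossbar.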
![Page 7: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/7.jpg)
Want To Accelerate Many Different Neural Algorithms
• Backpropagation
• Sparse Coding
• Liquid State Machine
[Figure: liquid state machine with input nodes, reservoir, and output nodes]
![Page 9: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/9.jpg)
General Purpose Neural Architecture
Neuromorphic core:
• Evaluate vector matrix multiplies along
rows or columns
• Train based on input vectors
Digital Core:
• Process neural core inputs/outputs
• For an N×N crossbar, the crossbar accelerates O(N²) operations, leaving only O(N) operations for the digital core
Run any neural algorithm on the
same hardware
[Figure: tiled architecture in which neural core(s) and digital cores are connected by buses and routers (R)]
[Figure: neural core with D/A converters on the inputs, A/D converters on the outputs, and separate arrays for positive weights and negative weights]
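The neural core figure shows separate arrays for positive and negative weights. One common way to use such a pair, sketched here as an assumption because the slides do not give the mapping, is to store the positive part of each weight in one array, the magnitude of the negative part in the other, and subtract the two column outputs. The function names are hypothetical.

```python
import numpy as np

def split_signed_weights(weights):
    """Split a signed matrix into two non-negative arrays (assumed mapping)."""
    w = np.asarray(weights, dtype=float)
    g_pos = np.clip(w, 0.0, None)   # positive weights, zero elsewhere
    g_neg = np.clip(-w, 0.0, None)  # magnitudes of negative weights
    return g_pos, g_neg

def differential_vmm(x, g_pos, g_neg):
    """Subtract the negative array's column output from the positive array's."""
    return x @ g_pos - x @ g_neg

w = np.array([[0.5, -1.0],
              [-0.25, 2.0]])
x = np.array([1.0, 2.0])

g_pos, g_neg = split_signed_weights(w)
y = differential_vmm(x, g_pos, g_neg)
print(y)
```

Both arrays hold only non-negative values, as a conductance must, while the subtracted result matches the signed product `x @ w`.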
![Page 10: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/10.jpg)
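Can Run Neural Networks on this Architecture
Neural Core: $z_j = \sum_i y_i w_{ij}$, O(N²) operations
Digital Core: $y_j = \frac{1}{1 + e^{-z_j}}$, O(N) operations
[Figure: network layers i, j, k; the input $y_i$ enters a 3×3 crossbar of weights in the neural core, and the digital core produces the output $y_j$]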
![Page 11: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/11.jpg)
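Back Propagation
• Neural core, O(N²) read operations: forward read $z_j = \sum_i y_i w_{ij}$ and transpose read of the error, $\sum_k w_{jk} \delta_k$
• Digital core, O(N) operations: $\delta_j = \frac{dy}{dz}(z_j) \sum_k w_{jk} \delta_k$
• Neural core, O(N²) write operations: $\Delta w_{ij} = \eta \, y_i \delta_j$
[Figure: network layers i, j, k with the error $\delta_k$ propagating backward; three 3×3 crossbars illustrate the forward read, the transpose read, and the write]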
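The three crossbar operations used in training (forward read, transpose read, and outer product write) can be sketched for a small two-layer network. This is a plain NumPy illustration under stated assumptions: sigmoid neurons, a squared error loss, and a hypothetical `train_step` helper. It does not model any device or circuit behavior.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(w_ij, w_jk, y_i, target, lr=0.5):
    # Forward pass: vector matrix multiplies (O(N^2), crossbar),
    # followed by the sigmoid (O(N), digital core)
    y_j = sigmoid(y_i @ w_ij)
    y_k = sigmoid(y_j @ w_jk)
    # Output error times the sigmoid derivative (O(N), digital core)
    delta_k = (target - y_k) * y_k * (1.0 - y_k)
    # Transpose read: matrix vector multiply (O(N^2), crossbar)
    delta_j = (w_jk @ delta_k) * y_j * (1.0 - y_j)
    # Outer product updates (O(N^2) writes, done in parallel on a crossbar)
    w_jk = w_jk + lr * np.outer(y_j, delta_k)
    w_ij = w_ij + lr * np.outer(y_i, delta_j)
    return w_ij, w_jk, y_k

rng = np.random.default_rng(0)
w_ij = rng.normal(scale=0.5, size=(3, 4))
w_jk = rng.normal(scale=0.5, size=(4, 2))
y_i = np.array([0.2, 0.7, 1.0])
target = np.array([1.0, 0.0])

# Output before any training, for comparison
_, _, first_output = train_step(w_ij, w_jk, y_i, target)
first_error = np.sum((target - first_output) ** 2)

for _ in range(200):
    w_ij, w_jk, y_k = train_step(w_ij, w_jk, y_i, target)
final_error = np.sum((target - y_k) ** 2)
print(first_error, final_error)
```

Only the sigmoid and its derivative touch O(N) values; every O(N²) step is a read or write that the crossbar would perform in parallel.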
![Page 12: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/12.jpg)
Design & Model Detailed Architecture
[Figure panels: Vector Matrix Multiply, Matrix Vector Multiply, Outer Product Update]
Neuron circuitry: current conveyor based integrator, comparator
![Page 13: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/13.jpg)
Row & Column Driver Circuitry
Row Driver Logic
Column Driver Logic
Voltage level shifter (drive
high V transistor with low V)
Array driver pass transistors
![Page 14: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/14.jpg)
Compare Architectures
1024 x1024 = 1M array operations, sum over 1 training cycle, 3 operations:
• Vector Matrix Multiply
• Matrix Vector Multiply
• Outer Product Update
• Latency: 35 – 800X over SRAM
• Energy: 430 – 6,900X over SRAM
• Area: 11 – 20X over SRAM
Used a commercial 14/16 nm PDK
***Requires 100 MΩ on state devices
![Page 15: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/15.jpg)
Neural Core Energy Analysis
| | 8 bits in/out, 8 bit weights | 4 bits in/out, 8 bit weights | 2 bits in/out, 8 bit weights |
| --- | --- | --- | --- |
| SRAM | 12,010 nJ | 10,150 nJ | 8,970 nJ |
| Digital ReRAM | 7,520 nJ | 5,580 nJ | 4,340 nJ |
| Analog ReRAM | 28 nJ | 2.7 nJ | 1.3 nJ |
![Page 16: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/16.jpg)
Multiscale Model of a Neural Training Accelerator
Levels: Algorithms, Architecture, Circuits, Devices, Materials
Target algorithms:
• Deep Learning
• Sparse Coding
• Liquid State Machines
Models and tools:
• Sandia Cross-Sim: translates device measurements and crossbar circuits to algorithm-level performance
• Modified McPAT/CACTI: model performance and energy requirements
• Sandia’s Xyce Circuit Sim: simulate crossbar circuits based on our devices
• Memristor fabrication and measurements in MESAFab
• Drift-diffusion model of ReRAM band diagram & transport (REOS, Charon)
• DFT model of oxide physics, bands
• In situ TEM of filament switching: use DFT model to interpret EELS signature
[Figure: crossbar schematic with inputs and weights w1,1 … w4,x; Pt / TaOx / Ta device band diagram with top electrode voltage VTE]
![Page 17: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/17.jpg)
[Figure: CrossSim structure. A learning algorithm calls the neural core simulator, which runs on either the numeric crossbar simulator (fast but approximate) or the Xyce crossbar circuit model (detailed but slow). Measured devices feed the crossbar model, linking the physical hardware crossbar to algorithmic performance]
[Figure: TaOx – MNIST accuracy for Periodic Carry, Single Device, and Ideal Numeric]
Simple Python API:
# Do a matrix vector multiplication
result = neural_core.run_xbar_mvm(vector)
https://cross-sim.sandia.gov
![Page 18: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/18.jpg)
Simple API to model crossbars
# ************** set parameters defining the crossbar
params.algorithm_params.weights.sim_type = "XYCE" # Use a XYCE based sim
params.algorithm_params.weights.maximum = 10 # clipping limits
params.algorithm_params.weights.minimum = -10 # clipping limits
params.xyce_parameters.xbar.device.TAHA_A1 = 4e-4 # Xyce Parameters
…
…
# ************** API for running neural operations
# All crossbar details are transparent to the user
# Create a neural_core object that models a crossbar
neural_core = MakeCore(params=params)
neural_core.set_matrix(weights) # set the initial weights
result = neural_core.run_xbar_vmm(vector) # Do a vector matrix multiply
result = neural_core.run_xbar_mvm(vector) # Do the transpose, a matrix vector mult.
neural_core.update_matrix(vector1,vector2) # Do an outer product update
https://cross-sim.sandia.gov
![Page 19: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/19.jpg)
Go from Measurement to Accuracy
[Figure panels: Fabricate Device (TiN / Ta 50 nm / TaOx 10 nm / TiN stack); Measured Pulsing; ΔG Scatterplot; Cumulative Probability of ΔG; neural core with D/A and A/D converters and positive and negative weight arrays; tiled architecture of neural core(s), digital cores, buses, and routers (R)]
![Page 20: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/20.jpg)
Multi-ReRAM Synapse: Periodic Carry
If we need more bits per synapse, use multiple memristors
• Three 10 level ReRAMs could represent 1-1000!
• Adding to the weight requires reading every ReRAM to account for any carries and serially programming each ReRAM: VERY EXPENSIVE
• Use >10 levels to represent a base 10 system
• Ignore carry and program the crossbar in parallel.
• Periodically (once every few hundred cycles) read the ReRAM and perform the carry
[Figure: neuron fed by three ReRAMs scaled ×100, ×10, ×1; along the conductance axis, 10 levels represent the weight and extra levels store the carry]
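The periodic carry scheme can be sketched with three simulated devices holding base 10 digits. This is a simplified illustration, not the authors' implementation: the number of extra levels (`MAX_LEVEL`), the unit increment, and the short carry period are assumptions chosen so the example stays small, and write noise and nonlinearity are ignored.

```python
BASE = 10
MAX_LEVEL = 15  # assumed: 10 weight levels plus extra levels that hold the carry

def apply_update(digits, increment):
    """Add to the least significant device only, saturating at its top level."""
    digits[0] = min(digits[0] + increment, MAX_LEVEL)

def perform_carry(digits):
    """Read each device and move overflow into the next more significant one."""
    for position in range(len(digits) - 1):
        carry, digits[position] = divmod(digits[position], BASE)
        digits[position + 1] = min(digits[position + 1] + carry, MAX_LEVEL)

def weight_value(digits):
    """Combine the devices as x1, x10, x100 digits."""
    return sum(level * BASE**position for position, level in enumerate(digits))

digits = [0, 0, 0]  # x1, x10, x100 devices
for step in range(1, 124):
    apply_update(digits, 1)   # cheap update, no reads required
    if step % 5 == 0:         # carry period short enough that no device saturates
        perform_carry(digits)
perform_carry(digits)

print(digits, weight_value(digits))
```

Between carries only one device is written per update, and the extra levels keep the total weight exact as long as the carry runs before a device saturates.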
![Page 21: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/21.jpg)
Periodic Carry Compensates for Write Noise
• Read and reset every 100 pulses
• Do 300,000 small (0.02% of weight range) updates, with a net of 1500 positive training pulses
• Noise Sigma = 1.4% for single device
• (from )
• Write noise applied during updates and carries
[Plot legend: Periodic Carry, Single Device]
rangerangenoise GGG 1.0
[Figure: Learn from a 0.5% Signal]
[Figure: weight configuration with digit ranges ±1, ±1/5, ±1/25, ±1/125 linked by carry]
![Page 22: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/22.jpg)
Periodic Carry Mitigates Write Nonlinearity
[Figure: Write Nonlinearity. Weight vs. pulse number for positive pulses and for alternating pulses]
Alternating Pulses Cause Weight Decay
Use center linear range of weights
• Train with 1% signal
• Ideal result is 0.6
[Figure: training result for Single Device and Periodic Carry; weight configuration with digit ranges ±1, ±1/5, ±1/25, ±1/125 linked by carry]
![Page 23: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/23.jpg)
TaOx Results
A/D and D/A are modeled, serial operations are modeled
• When resetting a weight, need to adjust pulse size based on current state to compensate for nonlinearity
• When reading a single weight, need to adjust readout range to be smaller (change capacitor on the integrator)
Carry once every 1000 updates
[Figure: weight configuration with digit ranges ±1 and ±1/4 linked by carry]
[Figure: accuracy for TaOx – File Types and TaOx – MNIST, each comparing Periodic Carry, Single Device, and Ideal Numeric]
![Page 24: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/24.jpg)
Li-Ion Synaptic Transistor for Analog Computation (LISTA)
E. J. Fuller, et al, "Li-Ion Synaptic Transistor for Low Power Analog Computing," Advanced
Materials, vol. 29, no. 4, p. 1604310, 2017.
Off by ~ 1%
from Ideal
![Page 25: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/25.jpg)
Summary
• Fundamental O(N) energy scaling advantage
• Latency: 35 – 800X over SRAM
• Energy: 430 – 6,900X over SRAM
• Area: 11 – 20X over SRAM
• Use CrossSim to co-design materials to algorithms
• Use periodic carry to overcome noisy devices
• Need high resistance 10-100 MΩ devices
• Need low write nonlinearities
https://cross-sim.sandia.gov
![Page 26: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/26.jpg)
Extra Slides
![Page 27: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/27.jpg)
Overcoming the Power Limit
[Figure: CPU and memory connected by a communications bus]
Integrate Processing and Memory
[Figure: three die stack, from Richard Goering, “Three Die Stack -- A Big Step “Up” for 3D-ICs with TSVs,” Cadence blog]
[Figure: 4×4 crossbar with weights w11 … w44 and row voltages V1=x1 … V4=x4]
![Page 28: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/28.jpg)
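Measure N resistors and determine the total output current with some signal to noise ratio (SNR)*
What is the minimum energy?

Each resistor carries $I_o = G_o V$.

Energy is the power in each resistor × the number of resistors × the measurement time:

$$\text{Energy} = N G_o V^2 \cdot \frac{1}{\Delta f}$$

The measurement time $1/\Delta f$ is determined by noise and SNR. For thermal noise:

$$\langle \Delta I^2 \rangle = 4 k_b T \, N G_o \, \Delta f$$

$$\text{SNR}^2 = \frac{N^2 I_o^2}{\langle \Delta I^2 \rangle}$$

$$\frac{1}{\Delta f} = 4 k_b T \, \text{SNR}^2 \cdot \frac{1}{N G_o V^2}$$

$$\text{Energy} = 4 k_b T \, \text{SNR}^2$$

If we double the number of resistors, we can double the speed to get the same energy and SNR.
This is because the noise scales as sqrt(N) while the signal scales as N.

*we are assuming we need some fixed precision on the output, and don’t need full floating point accuracy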
The Noise Limited Energy to Read a Crossbar Column is Independent of Crossbar Size
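The size independence of the noise limited read energy can be checked numerically. The sketch below assumes only thermal noise, a temperature of 300 K, and example device values (100 MΩ devices read at 100 mV); the function name `column_read` is hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def column_read(n_resistors, g_on, v_read, snr, temperature=300.0):
    """Return (bandwidth, energy) for reading N resistors at a target SNR."""
    # SNR^2 = N * G * V^2 / (4 k_b T * bandwidth), solved for the bandwidth
    bandwidth = n_resistors * g_on * v_read**2 / (4.0 * K_B * temperature * snr**2)
    # Energy = total power in the resistors * measurement time (1 / bandwidth)
    energy = n_resistors * g_on * v_read**2 / bandwidth
    return bandwidth, energy

# Assumed example values: 100 MOhm devices (1e-8 S) read at 100 mV, SNR of 10
bw_small, e_small = column_read(100, 1e-8, 0.1, snr=10.0)
bw_large, e_large = column_read(200, 1e-8, 0.1, snr=10.0)

print(f"N=100: bandwidth {bw_small:.3e} Hz, energy {e_small:.3e} J")
print(f"N=200: bandwidth {bw_large:.3e} Hz, energy {e_large:.3e} J")
```

Doubling the number of resistors doubles the allowed bandwidth while the energy stays at 4 k_b T SNR².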
![Page 29: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/29.jpg)
Experimental Device Non-idealities
Device: Write Variability, Write Nonlinearity, Asymmetry, Read Noise
Circuit: A/D, D/A noise, parasitics
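[Figure: Variability and Nonlinearity. G ∝ W vs. pulse number (Vwrite = 1 V, tpulse = 1 µs), showing the ideal response, the variability range, and the nonlinear response]
[Figure: Read Noise. I ∝ G ∝ W vs. time (Vread = 100 mV), fluctuating between I0−∆I and I0+∆I around I0]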
![Page 30: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/30.jpg)
Combined Effects of Nonidealities
[Figure: accuracy (color scale 0, 90, 98) vs. read noise (σRN) and write noise (σWN) for MNIST and for File Types. Panels for each data set: Linear; Asymmetric, ν = 1; Asymmetric, ν = 5; Symmetric, ν = 5]
![Page 31: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/31.jpg)
What are the Neural ReRAM Device Requirements?
| | Small Images | Large Images | File Types |
| --- | --- | --- | --- |
| Read Noise σ (% Range) | 3% | 5% | 9% |
| Write Noise σ (% Range) | 0.3% | 0.4% | 0.4% |
| Asymmetric Nonlinearity (ν) | 0.1 | 0.1 | 0.1 |
| Symmetric Nonlinearity (ν) | >20 | 5 | 5 |
| Maximum Current | 160 nA | 13 nA | 40 nA |

[Figure: Symmetric Nonlinearity and Asymmetric Nonlinearity. Weight (conductance) between wmin and wmax vs. normalized pulse number, for positive pulses (0 to 1) followed by negative pulses (1 to 0)]
![Page 32: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/32.jpg)
Full System Simulation
| Signal | Range | Bits |
| --- | --- | --- |
| Row Input | -1 to 1 | 8 |
| Col Output | -6 to 6 | 8 |
| Col Input | -1 to 1 | 8 |
| Row Output | -4 to 4 | 8 |
| Row Update | -0.01 to 0.01 | 7 |
| Col Update | -1 to 1 | 5 |
A/D & D/A Have
Minimal Impact
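The ranges and bit widths listed for the A/D and D/A converters can be modeled as a clip followed by uniform rounding. This is a sketch under the assumption of an ideal uniform quantizer; the slides do not specify the converter design, and the function name `quantize` is illustrative.

```python
import numpy as np

def quantize(values, low, high, bits):
    """Clip to [low, high] and round to one of 2**bits evenly spaced levels."""
    levels = 2**bits - 1
    clipped = np.clip(np.asarray(values, dtype=float), low, high)
    codes = np.round((clipped - low) / (high - low) * levels)
    return low + codes * (high - low) / levels

# Row input: range -1 to 1 at 8 bits
row_input = quantize([-1.5, -0.2, 0.0, 0.999], low=-1.0, high=1.0, bits=8)
# Column output: range -6 to 6 at 8 bits
col_output = quantize([7.3, -2.5], low=-6.0, high=6.0, bits=8)

print(row_input)
print(col_output)
```

Values outside the range saturate at the limits, and values inside it move by at most half a quantization step.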
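[Figure: neural core with D/A converters on the inputs, A/D converters on the outputs, and separate arrays for positive weights and negative weights]

| Data set | # Training / Test Examples | Network Size |
| --- | --- | --- |
| File Types | 4,501 / 900 | 256×512×9 |
| MNIST | 60,000 / 10,000 | 784×300×10 |

[Figure: MNIST accuracy plot]
[Figure: conductance vs. weight; weight axis labeled -Wmax, 0, Wmax and conductance axis labeled 0, Gmax/2, Gmax]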
![Page 33: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/33.jpg)
TaOx Results
A/D and D/A are modeled, serial operations are modeled
• When resetting a weight, need to adjust pulse size based on current state to compensate for nonlinearity
• When reading a single weight, need to adjust readout range to be smaller (change capacitor on the integrator)
Carry once every 1000 updates for the LSB, and every 2 updates on others
[Figure: weight configuration with digit ranges ±1 and ±1/4 linked by carry]
[Figure: accuracy for TaOx – File Types and TaOx – MNIST, each comparing Periodic Carry, Ideal Numeric, and Single Device]
[Figure: Weights During Training. Digit 1 weight and digit 0 weight vs. update count (× 10,000)]
![Page 34: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/34.jpg)
LISTA Results
• Carry once every 1000 updates
• Use a single device per weight and subtract a reference current
[Figure: weight configuration (base 7) with three digits linked by carry: ×1 with levels -14, -7, 0, 7, 14; ×7 with levels -98, -49, 0, 49, 98; ×49 with levels -686, -343, 0, 343, 686]
[Figure: LISTA – MNIST accuracy (axis from 95 to 99) for Periodic Carry, Ideal Numeric, Ideal w/ A/D, and Single Device]
![Page 35: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/35.jpg)
Neural Core Latency Analysis
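| Architecture | Precision | Latency | Relative latency |
| --- | --- | --- | --- |
| Analog ReRAM | 8 bit in/out | 1.28 µs | x1 |
| Analog ReRAM | 4 bit in/out | 0.08 µs | x0.06 |
| Analog ReRAM | 2 bit in/out | 0.054 µs | x0.04 |
| Digital ReRAM | All bit precisions | 1335 µs | x1040 |
| SRAM | All bit precisions | 44 µs | x35 |

• Min write time of 8 ns vs 1 ns incremental write
• SRAM transpose read expensive
OPU = Outer Product Update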
![Page 36: Designing an Analog Crossbar based · 2017. 12. 7. · Steven J. Plimpton, Conrad D. James, Matthew J. Marinella Sandia National Laboratories Designing an Analog Crossbar based Neuromorphic](https://reader035.fdocuments.in/reader035/viewer/2022071008/5fc58bee6555a1313225ee61/html5/thumbnails/36.jpg)
Neural Core Area Analysis
| Architecture | Area | Relative area |
| --- | --- | --- |
| Analog ReRAM | 75k µm² | x1 |
| Digital ReRAM | 137k µm² | x1.8 |
| SRAM | 836k µm² | x11.1 |

[Figure: area breakdown charts for 8, 4, and 2 bits in/out with 8 bit weights; legend: SRAM array, MAC, array drivers, array; annotation: ReRAM array on logic]
For the ReRAM, high voltage transistors require 8X area; improving this could give ~2X area savings