Performance Comparison of Hybrid CNN-SVM and CNN-XGBoost ...
Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal...
Transcript of Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal...
![Page 1: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/1.jpg)
Mixed-Signal Techniques for Embedded
Machine Learning Systems
Boris Murmann
June 16, 2019
![Page 2: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/2.jpg)
Applications
Speed of response
Bandwidth utilized
Privacy
Power consumed
CloudEdge
Conversational
Interfaces
…natural?
…seamless?
…real-time?
2
![Page 3: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/3.jpg)
Task Complexity, Memory and Classification Energy
Face Detection
SVM
Digit RecognitionImage Classification
10 classes
Binary CNN
Image Classification
1000 classes
MobileNet v1
Self-Driving
ResNet-50 HD
pJ
nJ
µJ
mJ
J
3
![Page 4: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/4.jpg)
Task Complexity, Memory and Classification Energy
pJ
nJ
µJ
mJ
J
My main interest
4
![Page 5: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/5.jpg)
Edge Inference System
5
A/D
A/D
Sensors
Wake-up
Detector
Pre-
Processin
g
Deep
Neural
Network
DRA
M
SRA
M
Buffer
PE
Array
Register
File
Compute
MAC
![Page 6: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/6.jpg)
Opportunities for Analog/Mixed-Signal Design
6
A/D
A/D
Sensors
Wake-up
Detector
Pre-
Processin
g
Deep
Neural
Network
DRA
M
SRA
M
Buffer
PE
Array
Register
File
Compute
MAC
Analog
Feature
Extraction
Analog
MAC
In-Memory
Computing
![Page 7: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/7.jpg)
Outline
7
Data-Compressive Imager for Object Detection
› Omid-Zohoor & Young, TCSVT 2018 & ISSCC 2019
Mixed-Signal ConvNet
› Bankman, ISSCC 2018 & JSSC 2019
RRAM-based ConvNet with In-Memory Compute
› Ongoing work
![Page 8: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/8.jpg)
Wake-Up Detector with Hand-Crafted Features
Feature
Extractor
Simple
Classifier
Sensor
Signal LabelA/D
Training
Algorithm
Labeled
Training Data
High-dimensional
data
Low-dimensional
representation
Data Deluge
8
![Page 9: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/9.jpg)
Analog Feature Extractor
Low-rate and/or low–resolution ADC
Low data rate digital I/O
Reduced memory requirements
Feature
Extractor
Simple
Classification
Sensor
SignalLabelA/D
Training
Algorithm
Labeled
Training Data
Low-dimensional
representation
9
![Page 10: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/10.jpg)
Histogram of Oriented Gradients
10
![Page 11: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/11.jpg)
Analog Gradient Computation
H1 H2𝐺𝐻 =
V1 V2𝐺𝑉 =
horizontal
vertical
V2
H
1
PH
2
V1
𝐺𝐻 = 400𝑚𝑉 − 100𝑚𝑉 = 𝟑𝟎𝟎𝒎𝑽
𝐺𝐻 =1
4400𝑚𝑉 −
1
4100𝑚𝑉 = 𝟕𝟓𝒎𝑽
Dark patch
Bright patch
11
![Page 12: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/12.jpg)
High Dynamic Range Images
12
Gradient magnitude varies
significantly across image
Would require high-
resolution ADCs (≥ 9b) to
digitize computed gradients
› But, we want to produce
as little data as possible
![Page 13: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/13.jpg)
Ratio-Based (“Log”) Gradients
V2
H
1
PH
2
V1
H1
H2
𝐺𝐻 =V1
V2
𝐺𝑉 =
H1
H2
𝐺𝐻 = =𝛼 ×
𝛼 ×
H1
H2
Illumination Invariant
13
![Page 14: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/14.jpg)
Ratio Quantization
V2
H
1
PH
2
V1
H1
H2
𝐺𝐻 =
V1
V2
𝐺𝑉 =
𝑉𝑝𝑖𝑥1𝑉𝑝𝑖𝑥2
1
22
Negative
edge
Positive
edge
No
edge
*Empirically determined
thresholds
𝑉𝑝𝑖𝑥1𝑉𝑝𝑖𝑥2
1
22
1.5b
2.75b
4.21.31
1.3
1
4.2
H1
H2
V1
V2
𝐺𝑉 = 𝑄𝐺𝐻 = 𝑄
14
![Page 15: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/15.jpg)
HOG Feature Compression with 1.5b Gradients
Quant
Fewer angle bins
Quantizing histogram
magnitudes
25× less data in HOG
features compared to 8-bit
image
15
![Page 16: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/16.jpg)
Log vs. Linear GradientsLess Illu
min
atio
n
16
![Page 17: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/17.jpg)
Log vs. Linear GradientsLess Illu
min
atio
n
17
![Page 18: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/18.jpg)
Log vs. Linear GradientsLess Illu
min
atio
n
18
![Page 19: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/19.jpg)
Prototype Chip
19
![Page 20: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/20.jpg)
Row Buffers with Pixel Binning (Image Pyramid)
20
![Page 21: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/21.jpg)
Ratio-to-Digital Converter (RDC)
21
![Page 22: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/22.jpg)
Data-Driven Spec Derivation
22
𝐻1 + 𝝐𝟏𝐻2 + 𝝐𝟐
![Page 23: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/23.jpg)
Chip Summary
• 0.13 µm CIS 1P4M
• 5µm 4T pixels
• QVGA 320(V) x 240(H)
• 229 mW @ 30 FPS
Supply Voltages
Pixel: 2.5V
Analog: 1.5V, 2.5V
Digital: 0.9V
23
![Page 24: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/24.jpg)
Sample Images
24
![Page 25: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/25.jpg)
25
Results using Deformable Parts Model detection & custom database (PascalRAW)
![Page 26: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/26.jpg)
Comparison to State of the Art
26
![Page 27: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/27.jpg)
Information Preservation
27
![Page 28: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/28.jpg)
Use Log Gradients as ConvNet Input?
28
Ongoing work; comparable performance using ResNet-10 (PascalRaw dataset)
![Page 29: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/29.jpg)
Digital ConvNet Fabric
Can We Play Mixed-Signal Tricks in a ConvNet?
29
![Page 30: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/30.jpg)
BinaryNet
Courbariaux et al., NIPS 2016
Weights and activations constrained to +1
and -1, multiplication becomes XNOR
Minimizes D/A and A/D overhead
Nice option for small/medium-size
problems and mixed-signal exploration
30
![Page 31: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/31.jpg)
Bankman et al., ISSCC 2018
Mixed-Signal Binary CNN Processor
1. Binary CNN with “CMOS-inspired” topology, engineered for minimal circuit-level path loading
2. Hardware architecture amortizes memory access across many computations, with all memory on chip (328 KB)
3. Energy-efficient switched-capacitor neuron for wide vector summation, replacing digital adder tree
31
CIFAR-10
![Page 32: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/32.jpg)
Zhao et al., FPGA 2017
Original BinaryNet Topology
88.54% accuracy on CIFAR-10
1.67 MB weight memory (68% FC layers)
27.9 mJ/classification with FPGA
32
![Page 33: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/33.jpg)
Mixed-Signal BinaryNet Topology
Sacrificed accuracy for regularity and energy efficiency
86.05% accuracy on CIFAR-10
328 KB weight memory
3.8 mJ per classification
33
![Page 34: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/34.jpg)
Neuron
34
![Page 35: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/35.jpg)
Naïve Sequential Computation
35
![Page 36: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/36.jpg)
Weight-Stationary
256
x2 (north/south)
x4 (neuron mux)
36
![Page 37: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/37.jpg)
Weight-Stationary and Data-ParallelParallel
broadcast
37
![Page 38: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/38.jpg)
Complete Architecture
38
![Page 39: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/39.jpg)
𝑧 = sgn
𝑖=0
1023
𝑤𝑖𝑥𝑖 + 𝑏 𝑧
𝑤 𝑏Filter Weights
𝑥Image Patch Filter Bias
9b sign-magnitude
𝑏 = −1 𝑠
𝑖=0
7
2𝑖𝑚𝑖
Output
2×2×256 2×2×256
Neuron Function
39
![Page 40: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/40.jpg)
𝑣diff𝑉𝐷𝐷
=𝐶𝑢𝐶𝑡𝑜𝑡
𝑖=0
1023
𝑤𝑖𝑥𝑖 + 𝑏
𝑏 = −1 𝑠
𝑖=0
7
2𝑖𝑚𝑖
Switched-Capacitor Implementation
Bias & offset cal.Weights x inputs
Batch normalization
folded into weight
signs and bias
40
![Page 41: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/41.jpg)
Behavioral Simulations
41
Significant margin in noise, offset, and mismatch (VFS = 460 mV)
𝐶𝑢 = 1 fF𝑣𝑜𝑠 = 4.6 mV RMS𝑣𝑛 = 460 µV RMS
![Page 42: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/42.jpg)
“Memory-Cell-Like” Processing Element
1 fF metal-oxide-metal fringe capacitorStandard-cell-based
42 transistors
24107 F2
42
![Page 43: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/43.jpg)
TSMC 28nm HPL 1P8M
6 mm2 area
328 KB SRAM
10 MHz clock
Supply Voltages
VDD – Digital Logic, 0.6V – 1.0V
VMEM – SRAM, 0.53V – 1.0V
VNEU – Neuron Array, 0.6V
VCOMP – Comparators, 0.8V
Die Photo
43
![Page 44: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/44.jpg)
μ = 86.05%
σ = 0.40%
Measured Classification Accuracy
10 chips, 180 runs each through
10,000 CIFAR-10 test images
VDD = 0.8V, VMEM = 0.8V
3.8 μJ/classification
237 FPS, 899 μW
0.43 μJ in 1.8V I/O
Mean accuracy μ = 86.05% same
as perfect digital model
44
![Page 45: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/45.jpg)
Comparison to Synthesized Digital
BinarEye(Moons et al., CICC 2018)
Synthesized Digital Mixed-Signal
45
![Page 46: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/46.jpg)
Digital vs. Mixed-Signal Binary CNN Processor
Synthesized
Digital
Moons et al.
CICC 2018
Hand-Designed
Digital
(Projected)
Mixed-Signal
Bankman et al.
ISSCC 2018
Energy @
86.05%
CIFAR-10
46
![Page 47: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/47.jpg)
CIFAR-10 Energy vs. Accuracy Neuromorphic› [1] TrueNorth, Esser PNAS
2016
GPU› [2] Zhao FPGA 2017
FPGA› [2] Zhao FPGA 2017
› [3] Umuroglu FPGA 2017
MCU› [4] CMSIS-NN, Lai arXiv 2018
Memory-like, mixed-signal› [5] Bankman ISSCC 2018
BinarEye, digital› [6] Moons CICC 2018
In-memory, mixed-signal› [7] Jia arXiv 2018
*energy excludes off-chip DRAM
47
![Page 48: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/48.jpg)
Limitations of Mixed-Signal BinaryNet
48
Poor programmability
Relatively limited accuracy (even on CIFAR-10) due to 1b arithmetic
Energy advantage over customized digital is not revolutionary
› Same SRAM, essentially same dataflow
Need a more “analog” memory system to unleash larger gains
› In-memory computing
![Page 49: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/49.jpg)
BinaryNet Synapse versus Resistive RAM
• TBD
• 25 F2
• Multi-bit (?)
49
• 0.93 fJ per 1b-MAC in 28 nm
• 24107 F2
• Single-bit
![Page 50: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/50.jpg)
Tsai, 2018
Matrix-Vector Multiplication with Resistive Cells
50
Typically use two cells to
achieve pos/neg weights
(other schemes possible)
![Page 51: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/51.jpg)
256 columns
…
10
24
row
s
25 F2 cell @ F = 90 nm
Side length s = 0.45 mm
2 x 1024 x 0.45 um x 256 x 0.45 um = 0.106 mm2
per layer
0.84 mm2 for the complete 8-layer network
RRAM Density – BinaryNet Example
51
(2x)
![Page 52: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/52.jpg)
Ongoing Research
52
What is the best architecture?
How many levels can be stored in each cell?
What is the most efficient readout?
Can we cope with nonidealities using training techniques?
![Page 53: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/53.jpg)
53
(Content deleted for online distribution)
![Page 54: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/54.jpg)
54
(Content deleted for online distribution)
![Page 55: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/55.jpg)
55
(Content deleted for online distribution)
![Page 56: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/56.jpg)
56
(Content deleted for online distribution)
![Page 57: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/57.jpg)
VGG-7 Experiment (4.8 Million Parameters)
57
8192
10
3×3×128
3×3×128
3×3×256 3×3×512
3×3×2563×3×3
2-bit weights, 2-bit
activations
Accuracy on CIFAR-10
2b quantization only: 93%
2b quantization + RRAM/ADC model: 92%
Work in progress!
![Page 58: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/58.jpg)
Optimal energy/MAC = 0.38 fJ
Energy Model for Column in Conv6 Layer
58
![Page 59: Mixed-Signal Techniques for Embedded Machine Learning Systems · 2020. 12. 5. · Mixed-Signal Binary CNN Processor 1. Binary CNN with “CMOS-inspired” topology, engineered for](https://reader036.fdocuments.in/reader036/viewer/2022081411/60b044cf9d9c2b2d8b723533/html5/thumbnails/59.jpg)
Summary
59
Analog feature extraction is attractive for wake-up detectors
Adding analog compute in ConvNets can be beneficial when it
simultaneously lets us reduce data movement
› In-memory analog compute looks most promising
› Can consider SRAM or emerging memories (e.g. RRAM)
Expect significant progress as more application drivers for “machine
learning at the edge” emerge