Advances in Energy Efficient Neuromorphic Computing: Ready for ... · • IBM’s TrueNorth...

1 E. Corti and A.M. Ionescu, @ Semicon 2019 Advances in Energy Efficient Neuromorphic Computing: Ready for Artificial Intelligence at the Edge? Elisabetta Corti 1,2 & Adrian M. Ionescu 2 1 IBM Research Zurich, Switzerland 2 EPFL, Lausanne, Switzerland

Page 1

Advances in Energy Efficient

Neuromorphic Computing:

Ready for Artificial

Intelligence at the Edge?

Elisabetta Corti 1,2 & Adrian M. Ionescu 2

1 IBM Research Zurich, Switzerland

2 EPFL, Lausanne, Switzerland

Page 2

Outline

• Digitalization Era: from smartphone to cloud and edge computing

• Neuromorphic computing

❑ What it is and is not

❑ Digital and analog implementation

❑ Disruptive computing: pattern recognition with coupled oscillators and stochastic neuromorphic computing

• Conclusion

Page 3

Nanoelectronics: ~10 nm 3D transistors

Today: 10 nm

❑ 100 million transistors/mm²

Next: 7 nm

© Intel

Virus: negatively stained influenza virus, usually spherical or ovoid in shape, 80 to 150 nm.

Page 4

2007: First wireless computer with sensors

2011: Cloud technologies

Future: Edge computing of the cloud technologies

2019: 80% of US citizens own a smartphone

2018: 30% revenue growth for cloud computing services, $183 billion in revenue

Source: IDC

Page 5

2007: First wireless computer with sensors

2011: Cloud technologies

Future: Edge computing of the cloud technologies

Why is edge computing needed?

• Applications for which latency is a problem

• When security is a concern

• When dimensions and power consumption are limited

Page 6

Future: billions of energy-efficient IoT edge & fog devices…

CLOUD: Data Centers (thousands)

FOG: Nodes (millions)

EDGE: Devices (billions)

Source: IDC

Page 7

What is required to compute AI?

Deep Learning is energy expensive!

Hardware accelerators are needed

Source: J. Weiss, IBM.

Page 8

2025: AI-related semiconductors will account for 20% of all demand, $67 billion in revenue.

5X

Page 9

How do we solve this problem? How do we make computing more efficient?

Page 10

Neuromorphic computing: what it is and what it is not

A neuromorphic computer is not a brain but a brain-like, energy-efficient system for machine learning & AI.

[Figure: Von Neumann architecture, with CPU and memory connected by a bus]

In a standard Von Neumann architecture:

• Separation of CPU and memory

• Slower computation

• High power consumption (GPU)

Dedicated architecture:

• CPU and memory in the same place

• Faster computation

• Reduced power consumption

• Reconfigurable, fault-tolerant …

Page 11

Technologies for neuromorphic computing

We require NEW ARCHITECTURES combined with NEW TECHNOLOGIES

1. Standard CMOS-based solutions, but bringing memory near the computation

2. Analog computing promises 100x improvements; example: multiply-accumulate operations with memristors

3. Other solution: disruptive architectures

Page 12

1. Tremendous recent progress in CMOS neuromorphic computers

Standard CMOS technology, but with a reorganized architecture

• IBM’s TrueNorth (DARPA’s SyNAPSE project): 65 mW real-time neurosynaptic processor, 4096 neurosynaptic cores tiled in a 2-D array, 1 million digital neurons and 256 million synapses, with a computational energy efficiency of 400 GSOPS/Watt.

• Intel’s Loihi (September 2017): 130,000 neurons, 130 million synapses

• SpiNNaker 2: 64 KB SRAM, 18 cores, 16,000 neurons with eight million plastic synapses per chip

Potential future applications: cognitive prosthetics, BMI, wearables, smart in situ imaging facilities.

C. Liu et al., Frt. Neurosc., 2018.
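To put the TrueNorth figures in perspective, the quoted efficiency can be turned into an energy per synaptic operation. The short sketch below only does that arithmetic; it assumes, for illustration, that the 400 GSOPS/Watt efficiency and the 65 mW power figure apply at the same operating point, which the slide does not state.

```python
# Figures quoted on the slide for IBM's TrueNorth.
POWER_W = 65e-3                 # 65 mW
EFFICIENCY_SOPS_PER_W = 400e9   # 400 GSOPS/Watt

# Energy per synaptic operation is the reciprocal of the efficiency.
energy_per_sop_pj = 1e12 / EFFICIENCY_SOPS_PER_W

# Assumption (not from the slide): both figures hold at the same operating point.
throughput_sops = POWER_W * EFFICIENCY_SOPS_PER_W

print(f"{energy_per_sop_pj:.2f} pJ per synaptic operation")
print(f"{throughput_sops / 1e9:.0f} GSOPS at 65 mW")
```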

Page 13

2. Why analog computing

▪ Network with inputs $N_i$

▪ Each multiplied by weight $w_{ij}$

▪ Sum all products: $M_j = \sum_i N_i \times w_{ij}$

▪ Apply filter function (activation, threshold, …)

Sum all products: $M_j = \sum_i N_i \times w_{ij}$

Multiply-accumulate instruction
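The four steps above can be sketched in a few lines of Python. This is a minimal digital illustration, not code from the talk: the function names, the example values and the simple step threshold used as the filter function are all assumptions.

```python
def multiply_accumulate(inputs, weights):
    """Return M_j = sum_i N_i * w_ij for every output j.

    inputs  -- list of N_i values
    weights -- weights[i][j] is w_ij
    """
    n_outputs = len(weights[0])
    sums = [0.0] * n_outputs
    for i, n_i in enumerate(inputs):
        for j in range(n_outputs):
            sums[j] += n_i * weights[i][j]  # one multiply-accumulate
    return sums


def threshold(values, theta=0.0):
    """Hypothetical filter function: 1 if the sum exceeds theta, else 0."""
    return [1 if v > theta else 0 for v in values]


# Made-up example: three inputs, two outputs.
inputs = [1.0, 0.5, -1.0]
weights = [[0.2, -0.4],
           [0.6, 0.8],
           [0.1, 0.3]]
sums = multiply_accumulate(inputs, weights)
outputs = threshold(sums)
print(sums, outputs)
```

A digital processor executes the inner line once per weight, so the number of multiply-accumulate operations grows with the product of inputs and outputs.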

Page 14

2. Why analog computing

Sum all products: $M_j = \sum_i N_i \times w_{ij}$

Multiply-accumulate instruction

[Figure: Computational efficiency of various technologies, marking the digital design space and the brain]

J. Hasler, B. Marr, Frt. Neurosc., 2013.

Page 15

2. Why analog computing

[Figure: Computational efficiency of various technologies, marking the digital design space and the brain]

Multiplication and summation in an analog circuit

Mult.: $I_i = V_i \times G_{ij}$ (Ohm’s law)

Sum: $I_j = \sum_i I_i$ (Kirchhoff’s law)

[Circuit: V1 applied to R11 and V2 applied to R21, output current I = I1 + I2]

J. Hasler, B. Marr, Frt. Neurosc., 2013.
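As a rough numerical illustration of the analog multiply-and-sum, the sketch below evaluates the column currents of an idealized resistive crossbar. It ignores wire resistance, device non-linearity and noise; the function name and the component values are made up for the example.

```python
def crossbar_currents(voltages, conductances):
    """Column currents of an idealized resistive crossbar.

    voltages     -- V_i applied to row i, in volts
    conductances -- conductances[i][j] is G_ij, in siemens
    """
    n_cols = len(conductances[0])
    currents = []
    for j in range(n_cols):
        # Ohm's law per device, Kirchhoff's current law per column.
        currents.append(
            sum(v * g_row[j] for v, g_row in zip(voltages, conductances))
        )
    return currents


# Two rows feeding one column, as in the V1/R11, V2/R21 circuit.
V1, V2 = 0.2, 0.1          # volts (illustrative values)
R11, R21 = 10e3, 20e3      # ohms (illustrative values)
G = [[1.0 / R11], [1.0 / R21]]
(I,) = crossbar_currents([V1, V2], G)
print(f"I = {I * 1e6:.1f} uA")
```

In the physical array all products and the sum happen at once in the wires, which is where the promised efficiency gain over a sequential digital loop comes from.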

Page 16

Device Candidates for Cross-Bar Arrays

Jonas Weiss, IBM Zurich Research Lab

Synaptic Devices & Materials – Memristors

[Figure: device cross-sections between top and bottom electrodes for PCM (crystalline and amorphous phases with heater), filaments & oxides, interfaces, FE tunnel junctions, magnetic tunnel junctions (ferromagnet / tunnel barrier / ferromagnet, domain modulation), and ion intercalation (gate, electrolyte/ion source, channel, cations, read/write paths, "very insulating" oxide)]

ReRAM (Resistive RAM)

+ Inline oxides

+ Small cell size

- Variability

- Asymmetry

PCM (Phase Change Memory)

+ Commercial SCM

- Abrupt reset

- Conductance drift

MRAM (Magnetic RAM)

+ High endurance

+ Low power

- Small on/off ratio

- Limited # of states

ECRAM (Electro-Chemical RAM)

+ Low power

+ Symmetry

- Scalability

- No CMOS compatibility

Source: V. Bragaglia, IBM

FeFET-based memristors: memristive functions with FeFETs with doped high-k are transferable to advanced scaling.

Halid Mulaosmanovic et al., Applied Mat. & Interfaces, 2017.

Additional ref.: I. Boybat, Nat. Comm. 2018

Page 17

Disruptive architecture (I): computing with time in coupled oscillator systems

• Idea: timing rather than amplitude information is used for computation.

• Coupled oscillators lock in frequency, and the phase relation can be adjusted by the coupling resistance. Applications in pattern recognition.

With VO2:

• Single oscillator footprint: 200 x 200 nm²

• Power consumption decreases with scaling of the device (now around 20 μW)
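The locking behaviour can be illustrated with a generic phase model. The sketch below uses two Kuramoto oscillators as a stand-in; it is not a model of the VO2 relaxation oscillators, and treating the coupling strength as the counterpart of the coupling resistance is an assumption made only for illustration. In this model the pair locks when the frequency mismatch is at most twice the coupling strength, and the locked phase difference satisfies sin(phi) = (omega2 - omega1) / (2 * coupling).

```python
import math


def locked_phase_difference(omega1, omega2, coupling, dt=0.01, steps=5000):
    """Integrate two Kuramoto phase oscillators with forward Euler and
    return the final phase difference theta2 - theta1 (radians)."""
    theta1, theta2 = 0.0, 0.0
    for _ in range(steps):
        d1 = omega1 + coupling * math.sin(theta2 - theta1)
        d2 = omega2 + coupling * math.sin(theta1 - theta2)
        theta1 += dt * d1
        theta2 += dt * d2
    return theta2 - theta1


# Same frequency mismatch, two coupling strengths (illustrative values).
weak = locked_phase_difference(1.0, 1.2, coupling=0.5)
strong = locked_phase_difference(1.0, 1.2, coupling=2.0)
print(f"weak coupling:   {weak:.4f} rad")
print(f"strong coupling: {strong:.4f} rad")
```

Stronger coupling pulls the two phases closer together, so the steady-state phase difference encodes the coupling. That is the sense in which information is carried by timing rather than amplitude.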

Page 18

Disruptive architecture (II): computing with time in coupled oscillator systems

A. Parihar et al., Nanophotonics, 2017: image filter with capacitively coupled VO2 oscillators.

Corti et al., ICRC 2018: associative memory computation with VO2 coupled oscillators, resistively coupled for compatibility with RRAM or PCM technology.

[Plot: probability to recognize image (%) versus allowed grey deviation (%); 20 images per grey scale tested]

[Plot: Vout/VDD versus time (ms) for oscillators 1 to 3, showing the initial delay and the settling time between Input 1 and Output 1]

Page 19

Spiking neuromorphic computation: towards probabilistic MIT neurons for SSM

W. Maass, Proc. IEEE, 2015.

Benchmarking challenge for artificial neuron technology:

• Energy efficiency: > 10^13 spikes/Joule (< 0.1 pJ/spike)

• Footprint: < 100 µm²

• RT to 100°C

Human brain:

• 1.8 x 10^14 spikes/Joule

• Neuron < 3 µm²

• ~36-37°C (deep brain temperature is less than 1°C higher than body temperature)

Source: A.M. Ionescu, EPFL.
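The two energy figures are related by a simple unit conversion, sketched below as a quick check. The helper name is hypothetical; the input values are the ones quoted on the slide.

```python
def picojoules_per_spike(spikes_per_joule):
    """Convert an efficiency in spikes/Joule to an energy in pJ/spike."""
    return 1e12 / spikes_per_joule


target_pj = picojoules_per_spike(1e13)    # benchmark target
brain_pj = picojoules_per_spike(1.8e14)   # human brain figure
ratio = target_pj / brain_pj

print(f"target: {target_pj:.3f} pJ/spike")
print(f"brain:  {brain_pj * 1e3:.2f} fJ/spike")
print(f"brain is {ratio:.0f}x more efficient than the target")
```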

Page 20

VO2 probabilistic neurons

Page 21

VO2 probabilistic neurons

Stochastic VO2 neuron

S. Datta group @ VLSI 2017:

• probabilistic hardware: the IMT neuron is stochastic because of nucleation (filament formation) in the IMT/MIT process.

• implementation of SSM for pattern recognition
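What "stochastic neuron" means in practice can be shown with a generic threshold-plus-noise model. The sketch below is not the device physics and not the Datta group's implementation; the Gaussian noise, the threshold and all parameter values are assumptions chosen for illustration.

```python
import random


def firing_rate(drive, threshold=1.0, noise_std=0.3, trials=10_000, seed=0):
    """Fraction of trials in which a noisy neuron fires.

    The neuron fires when drive plus Gaussian noise exceeds the threshold.
    The noise stands in for the random switching of the device.
    """
    rng = random.Random(seed)
    fired = sum(
        1 for _ in range(trials)
        if drive + rng.gauss(0.0, noise_std) > threshold
    )
    return fired / trials


low = firing_rate(0.7)
mid = firing_rate(1.0)
high = firing_rate(1.3)
print(low, mid, high)
```

The firing probability rises smoothly with the input instead of jumping at a hard threshold, which is the property probabilistic networks rely on for sampling.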

Page 22

Recent advances in Ge-doped VO2 material @ EPFL

Grains ~ 50 – 200 nm

Sputtered Ge-VO2

• Phase-change transition temperature near 90°C, enabling future implementations of advanced neuromorphic hardware.

Further grain engineering & control needed

Page 23

The Future of Energy Efficient Computing will be Hybrid: CMOS + Neuromorphic + Quantum

[Figure: CLOUD, FOG and EDGE layers, labeled "CMOS & Beyond"]

Source: A.M. Ionescu @ IEDM 2017

Page 24

Conclusions

• Energy-efficient technologies are the next driver in the zettabyte era.

• Paradigm change of distributed computation from Cloud to Edge.

• Future neuromorphic hardware needed: CMOS + newly emerging material, device and architecture concepts.

• The future of computing will probably be hybrid, with CMOS, neuromorphic and quantum computing serving from edge to cloud.

E. Corti: [email protected]

Page 25

Thank you! Questions?

Page 26

Efficiency is also a lot about (not) Moving Data

[Figure: memory hierarchy with Energy and Performance axes. Levels: Registers, On-Chip Cache, Main Memory (DRAM), Storage Class Memory, Hard Disk Drive, Storage/Backup Tape. Capacity labels: kB, MB, GB, GB, TB, PB. Energy labels: <1, <10, 100, 10^5-10^6, 10^7-10^8, 10^10-10^12, with units pJ, pJ, nJ, nJ, mJ, mJ-J. Further tick values: 1024, 256, 64, 64, 16, 4. Values are trends. Compute energy (ALU): Horowitz, ISSCC 2014.]

For better:

- Power efficiency

- Performance

We need to:

- Not move data

- Bring «stuff» closer together

Jonas Weiss, IBM Zurich Research Lab

Source: J. Weiss IBM

Page 27

Anatomy of “Heavy” Workloads – The Actual Problem

Scientific Workloads: (Electro)chemistry, Drug Discovery, Weather/Climate, PDEs

AI & Machine Learning: Graph Analytics, Clustering Algorithms, Image Classification, Time-Series Predictions, Backpropagation

Matrix-vector multiplication is common to all workloads, and computationally expensive!

Jonas Weiss, IBM Zurich Research Lab

Page 28

RRAM Development in IBM Zurich

Baseline stack: Substrate / TiN / HfO2 / Ti / TiN

Tuned stack: Substrate / TiN / HfO2 / Metal-Oxides / TiN

Goal (symmetry & linearity): replace the scavenging layer with intercalation metal-oxides

[Plot: conductance during potentiation (Pot.) and depression (Depr.)]

[Plot: device DC characteristics, current [I] versus voltage [V], showing set and reset]

Optimized device characteristics: new results to be presented at MEMRISYS 2019, Dresden

Current status: 60 x 60 µm² and 0.1 x 0.1 µm²

In-house:

- Material growth

- Device fabrication

- Characterization

Jonas Weiss, IBM Zurich Research Lab