Computing beyond a Million Processors - bio-inspired massively-parallel architectures

44
ACACES 12 July 2009 1 Computing beyond a Million Processors - bio-inspired massively-parallel architectures Steve Furber The University of Manchester [email protected] SBF is supported by a Royal Society-Wolfson Research Merit Award Andrew Brown The University of Southampton [email protected]

description

Computing beyond a Million Processors - bio-inspired massively-parallel architectures. Andrew Brown The University of Southampton [email protected]. Steve Furber The University of Manchester [email protected]. SBF is supported by a Royal Society- Wolfson Research Merit Award. - PowerPoint PPT Presentation

Transcript of Computing beyond a Million Processors - bio-inspired massively-parallel architectures

Page 1: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 1

Computing beyond a Million Processors

- bio-inspired massively-parallel architectures

Steve FurberThe University of [email protected]

SBF is supported by a Royal Society-Wolfson Research Merit Award

Andrew BrownThe University of Southampton

[email protected]

Page 2: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 2

Outline

• Computer Architecture Perspective• Building Brains• Living with Failure• Design Principles• The SpiNNaker system• Concurrency• Conclusions

Page 3: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 3

Multi-core CPUs• High-end uniprocessors

– diminishing returns from complexity– wire vs transistor delays

• Multi-core processors– cut-and-paste– simple way to deliver more MIPS

• Moore’s Law– more transistors– more cores

… but what about the software?

Page 4: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 4

Multi-core CPUS

• General-purpose parallelization– an unsolved problem– the ‘Holy Grail’ of computer science for half a century?– but imperative in the many-core world

• Once solved– few complex cores, or many simple cores?– simple cores win hands-down on power-efficiency!

Page 5: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 5

Back to the future

• Imagine…– a limitless supply of (free) processors– load-balancing is irrelevant– all that matters is:

• the energy used to perform a computation• formulating the problem to avoid synchronisation• abandoning determinism

• How might such systems work?

Page 6: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 6

Bio-inspiration

• How can massively parallel computing resources accelerate our understanding of brain function?

• How can our growing understanding of brain function point the way to more efficient parallel, fault-tolerant computation?

Page 7: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 7

Outline

• Computer Architecture Perspective• Building Brains• Living with Failure• Design Principles• The SpiNNaker system• Concurrency• Conclusions

Page 8: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 8

Building brains

• Brains demonstrate– massive parallelism (1011 neurons)– massive connectivity (1015 synapses)– excellent power-efficiency

• much better than today’s microchips– low-performance components (~ 100 Hz)– low-speed communication (~ metres/sec)– adaptivity – tolerant of component failure– autonomous learning

Page 9: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 9

(www.ship.edu/ ~cgboeree/theneuron.html)

Neurons• Multiple inputs

(dendrites)• Single output (axon)

– digital “spike”– fires at 10s to 100s

of Hz– output connects to

many targets• Synapse at

input/output connection

0 20 40 60 80 100 120 140 160 180 200-80

-60

-40

-20

0

20

40

Page 10: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 10

Neurons

• A flexible biological control component– very simple

animals have a handful

– bees: 850,000– humans: 1011

(photo courtesy of the Brain Mind Institute, EPFL)

Page 11: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 11

• Regular high-level structure– e.g. 6-level cortical

microachitecture• low-level vision, to• language, etc.

• Random low-level structure– adapts over time (f

acul

ty.w

ashi

ngto

n.ed

u/

rhev

ner/

Mis

cella

ny.h

tml)

Neurons

Page 12: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 12

Neural Computation• To compute we need:

– Processing– Communication– Storage

• Processing:abstract model– linear sum of

weighted inputs• ignores non-linear

processes in dendrites– non-linear output function– learn by adjusting synaptic weights

w1

x1

w2

x2

w3

x3

w4

x4

yf

Page 13: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 13

0&

/

)(

AsetfireAif

IAA

xwI

ttx

A

A

iii

kiki

• Leaky integrate-and-fire model– inputs are a series of spikes– total input is a weighted

sum of the spikes– neuron activation is the

input with a “leaky” decay– when activation exceeds

threshold, output fires– habituation, refractory

period, …?

Processing

Page 14: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 14

• Izhikevich model– two variables, one fast, one slow:

– neuron fires whenv > 30; then:

– a, b, c & d select behaviour

)(140504.0 2

ubvauIuvvv

duucv

( www.izhikevich.com )

Processing

0 20 40 60 80 100 120 140 160 180 200-80

-60

-40

-20

0

20

40

0 20 40 60 80 100 120 140 160 180 200-14

-12

-10

-8

-6

-4

-2

0

2

v

u

Page 15: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 15

Communication

• Spikes– biological neurons communicate principally

via ‘spike’ events– asynchronous– information is only:

• which neuron fires, and• when it fires

0 20 40 60 80 100 120 140 160 180 200-80

-60

-40

-20

0

20

40

Page 16: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 16

Storage• Synaptic weights

– stable over long periods of time• with diverse decay properties?

– adaptive, with diverse rules• Hebbian, anti-Hebbian, LTP, LTD, ...

• Axon ‘delay lines’• Neuron dynamics

– multiple time constants• Dynamic network states

Page 17: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 17

Outline

• Building Brains• Computer Architecture Perspective• Living with Failure• Design Principles• The SpiNNaker system• Concurrency• Conclusions

Page 18: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 18

The Good News...Transistors per Intel chip

0.001

0.01

0.1

1

10

100

1970 1975 1980 1985 1990 1995 2000

Year

Mill

ions

of t

rans

isto

rs p

er c

hip

8008

8080

8086

286386

486Pentium

4004

Pentium II

Pentium IIIPentium 4

Page 19: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 19

...and the Bad News

• Device variability

&

• Component failure

0.0 0.2 0.4 0.6 0.8 1.00.0

0.2

0.4

0.6

0.8

1.0

Vou

t2(V

)

Vout1

(V)

Page 20: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 20

Atomic Scale devices

The simulationParadigm now

A 4.2 nm MOSFETIn production 2023

A 22 nm MOSFETIn production 2008

Page 21: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 21

A view from Intel

• The Good News:– we will have 100 billion transistor ICs

• The Bad News:– billions will fail in manufacture

• unusable due to parameter variations– billions more will fail over the first year of operation

• intermittent and permanent faults(Shekhar Borkar, Intel Fellow)

Page 22: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 22

A view from Intel

• Conclusions:– one-time production test will be out– burn-in to catch infant mortality will be impractical– test hardware will be an integral part of the design– dynamically self-test, detect errors, reconfigure,

adapt, ...(Shekhar Borkar, Intel Fellow)

Page 23: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 23

Outline

• Building Brains• Computer Architecture Perspective• Living with Failure• Design Principles• The SpiNNaker system• Concurrency• Conclusions

Page 24: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 24

Design principles

• Virtualised topology– physical and logical connectivity are

decoupled• Bounded asynchrony

– time models itself• Energy frugality

– processors are free– the real cost of computation is energy

Page 25: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 25

Outline

• Building Brains• Computer Architecture Perspective• Living with Failure• Design Principles• The SpiNNaker system• Concurrency• Conclusions

Page 26: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 26

SpiNNaker project

• Multi-core CPU node– 20 ARM968 processors– to model large-scale systems of spiking neurons

• Scalable up to systems with 10,000s of nodes– over a million processors– >108 MIPS total

• Power ~ 25mw/neuron

Page 27: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 27

SpiNNaker project

Page 28: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 28

• Fault-tolerant architecture for large-scale neural modelling

• A billion neurons in real time

• A step-function increase in the scale of neural computation

• Cost- and energy-efficient

SpiNNaker project

Page 29: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 29

SpiNNaker system

Page 30: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 30

CMP node

Page 31: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 31

ARM968 subsystem

Page 32: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 32

GALS organization

• clocked IP blocks• self-timed

interconnect• self-timed inter-

chip links

Page 33: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 33

Outline

• Building Brains• Computer Architecture Perspective• Living with Failure• Design Principles• The SpiNNaker system• Concurrency• Conclusions

Page 34: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 34

Circuit-level concurrency

• Delay-insensitive comms– 3-of-6 RTZ on chip– 2-of-7 NRZ off chip

• Deadlock resistance– Tx & Rx circuits have high deadlock

immunity– Tx & Rx can be reset independently

• each injects a token at reset• true transition detector filters surplus

token

din

(2 phase)

dout

(4 phase)

¬reset ¬ack

Tx Rxdata

ack

Page 35: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 35

System-level concurrency

• Breaking symmetry– any processor can be Monitor Processor

• local ‘election’ on each chip, after self-test– all nodes are identical at start-up

• addresses are computed relative to node with host connection (0,0)

– system initialised using flood-fill• nearest-neighbour packet type• boot time (almost) independent of system scale

Page 36: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 36

Application-level concurrency

• Event-driven real-time software– spike packet arrived

• initiate DMA– DMA of synaptic

data completed• process inputs• insert axonal delay

– 1ms Timer interrupt

update_Neurons();

update_Stimulus();

sleeping

event

goto_Sleep();

Priority 1 Priority 2Priority 3

Timer MillisecondInterrupt

DMA CompletionInterrupt

Packet ReceivedInterrupt

fetch_Synaptic_

Data();

Page 37: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 37

Application-level concurrency

• Cross-system delay << 1ms– hardware

routing– ‘emergency’

routing• failed links• Congestion

– if all else fails• drop packet

Page 38: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 38

• Firing rate population codes– N neurons– diverse tuning– collective coding of

a physical parameter– accuracy– robust to neuron

failure(Neural Engineering,

Eliasmith & Anderson 2003)

-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 10

50

100

150

200

250

300

350

400

450

500

firin

g ra

te

parameter

N

Biological concurrency

Page 39: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 39

Unordered N-of-M

Ordered N-of-M

M-bit binary

Number of codes CNM M!/(M-N)! 2M

e.g. M=100, N=20 1021 1039 1030

e.g. M=1000, N=200 10216 10591 10301

• Single spike/neuron codes– choose N to fire from a population of M– order of firing may or may not matter

Biological concurrency

Page 40: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 40

Outline

• Building Brains• Computer Architecture Perspective• Living with Failure• Design Principles• The SpiNNaker system• Concurrency• Conclusions

Page 41: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 41

Software progress• ARM SoC Designer

SystemC model• 4 chip x 2 CPU top-

level Verilog model • Running:

• boot code• Izhikevich model• PDP2 codes

• Basic configuration flow

• …it all works!

Page 42: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 42

Where might this lead?

• Robots– iCub EU project– open humanoid

robot platform– mechanics, but no

brain!

Page 43: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 43

Conclusions

• Many-core processing is coming– soon we will have far more processors than we can program

• When (if?) we crack parallelism…– more small processors are better than fewer large

processors– synchronization, coherent global memory, determinism, are

all impediments• Biology suggests a way forward!

– but we need new theories of biological concurrency

Page 44: Computing beyond a Million Processors - bio-inspired massively-parallel architectures

ACACES 12 July 2009 44

UoM SpiNNaker team