HETEROGENEOUS IMPLEMENTATION OF NEURAL NETWORK ALGORITHMS
Dmitri Yudanov (AMD), Leon Reznik (RIT)

Description: Presentation HC-4016, Heterogeneous Implementation of Neural Network Algorithms, by Dmitri Yudanov and Leon Reznik at the AMD Developer Summit (APU13), November 11-13, 2013.

Transcript of HC-4016, Heterogeneous Implementation of Neural Network Algorithms, by Dmitri Yudanov and Leon Reznik

Page 1

HETEROGENEOUS IMPLEMENTATION OF NEURAL NETWORK ALGORITHMS

Dmitri Yudanov (AMD), Leon Reznik (RIT)

Page 2

AGENDA  

Neural Networks: Origin, Features, Applications

Spiking Neural Networks (SNN): Simulation Principles

SNN: Heterogeneous Implementation

Page 3

Neural Networks: Origin, Features, Applications

Page 4

OUTLINE

- From Biological to Artificial Neural Networks (ANN)

- ANN Applications
  ‒ Application categories
  ‒ Examples

- Why ANN?

- Why Spiking Neural Network (SNN)?

Page 5

FROM BIOLOGICAL TO ARTIFICIAL NEURAL NETWORK (ANN)

- An ANN is a simplification of a biological neural network.

- An ANN consists of simple elements (neurons) analogous to the biological neurons in the brain.

- The neurons are connected by weighted links and form a network.

- The links pass signals (numbers) from one neuron to another. Neurons operate on the weighted signals and retransmit the results.

- The network can learn by adjusting the weights (the behavior is encoded in the weights); see the sketch below.
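As a concrete illustration of the weighted links described above (not taken from the slides), here is a minimal C++ sketch of a single artificial neuron: a weighted sum of input signals passed through a sigmoid activation. Learning would consist of adjusting the weights; all values below are arbitrary.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// One artificial neuron: weighted sum of inputs followed by an activation.
// Weights, inputs, and bias are arbitrary illustrative values.
double neuron_output(const std::vector<double>& inputs,
                     const std::vector<double>& weights, double bias) {
    double sum = bias;
    for (std::size_t i = 0; i < inputs.size(); ++i)
        sum += weights[i] * inputs[i];          // weighted link
    return 1.0 / (1.0 + std::exp(-sum));        // sigmoid activation
}

int main() {
    std::vector<double> x = {0.5, -1.0, 2.0};   // signals from other neurons
    std::vector<double> w = {0.8, 0.3, -0.4};   // link weights (learned)
    std::printf("output = %f\n", neuron_output(x, w, 0.1));
}
```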


Page 6

ANN APPLICATION CATEGORIES

- Based on a patent and application search (US Patent and Trademark Office, EU Patent Office, Google Patent Search), conducted in 2012 by students of the Machine Learning class (Dr. Leon Reznik, RIT)

(Chart: share of ANN patents and applications by category; axis 0%-18%.)

Page 7

WHY ANN? EXAMPLES

- Recognition
  ‒ Character (e.g. mail), speech, image (e.g. image clustering), odor (e.g. locust antennal lobe), face and emotion

- Gaming
  ‒ AI features in games

- Robotics
  ‒ Vision, spatial navigation and planning (e.g. mental maps with place cells), positioning, decision making

- Control
  ‒ Missile guidance
  ‒ Anti-lock brakes (Ford)
  ‒ Self-driving cars, UAVs

- Crime prevention and security
  ‒ Bomb sniffer (JFK airport)
  ‒ Credit card fraud detection (Visa)

- Biomedical
  ‒ Neuroscience: brain modeling and simulation
    ‒ US BRAIN Initiative (expected 300 EB/day)
    ‒ EU Human Brain Project
  ‒ Neurology (e.g. disease modeling and forecasting, ModelDB)
  ‒ Cardiology (e.g. adaptive biventricular pacemaker)
  ‒ Prosthesis: BCI, neuromorphic chips

- Financial analysis
  ‒ Mortgage risk evaluation (AVCO, Irvine)
  ‒ Currency trading (Citibank)

- Difficulties
  ‒ Need to compute fast, but the problem size is large
  ‒ How to get the right ANN circuit for an application?

Page 8

WHY ANN?

- Novel algorithms
  ‒ The performance of conventional algorithms is not satisfactory in numerous problems with dynamic changes (e.g. face recognition may fail if the view angle is different or the person is smiling).

- Learning, adaptability
  ‒ Continuously learn from the available data and adapt to new conditions.

- Reliability
  ‒ Performance tends to degrade gracefully under partial damage. Parts of the network can learn to perform the function of damaged parts. In contrast, most programs and engineered systems are brittle: if you remove some arbitrary parts, very likely the whole system ceases to function.

- Low power. Neuromorphic engineering
  ‒ Switching speed of biological neurons is less than 1 kHz (CPU: 3 GHz)
  ‒ Switching energy of biological neurons ~ 1.0E-17 joules/op (CPU: 1.0E-5 joules/op)
  ‒ Conduction speed of a biological neural network ~ 100 m/s

- Parallel
  ‒ The brain performs massively parallel computations very efficiently. Data and processing have global impact. For example, complex visual perception occurs within less than 100 ms, that is, about 10 processing steps.

- AI. Consciousness. Intelligence. Self-awareness.

Page 9

WHY SNN? NEURAL NETWORK CATEGORIES

- Which level of abstraction to choose?

- Which one is the right one for the target application?

- Point-to-point connected spiking neural network (SNN): time (spikes), polychronization (memory capacity), unsupervised learning (synaptic plasticity)


(Figure: chart positioning neural network categories (Rosenblatt, ADALINE, MLP, RBF, LVQ, Neural Gas, SOM, ASNN, Hopfield recurrent, SNN, biological) along axes of complexity, learning ability, and time dynamics.)

Page 10

Spiking Neural Networks: Simulation Principles

Page 11

OUTLINE  

- SNN Models

- Synaptic plasticity

- Simulation Types
  ‒ Time-driven (synchronous) simulation
  ‒ Event-driven (asynchronous) simulation
  ‒ Timed event-driven (hybrid) simulation

- Numerical Integration Methods
  ‒ Euler
  ‒ Parker-Sochacki

- Summary

 


Page 12

HETEROGENEOUS IMPLEMENTATION: SIMULATORS AND ABSTRACTION LEVEL

- Population model
  ‒ Nengo

- Point-neuron network models
  ‒ NEST
  ‒ PCSIM
  ‒ Brian

- Compartmental neuron and membrane models
  ‒ NEURON
  ‒ GENESIS

- Reaction-diffusion model of biochemical signaling pathways
  ‒ STEPS

Page 13

SNN MODELS: TRADEOFFS

- Integrate-and-Fire (IF): simple, but has a poor spiking response

- Hodgkin-Huxley (HH): has a rich response, but is complex

- Izhikevich (IZ): simple, has a rich response, but is phenomenological (see the sketch below)

(Figure: example spiking responses of the IF, HH, and IZ models.)
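The Izhikevich (IZ) model mentioned above is compact enough to show in full. Below is a minimal time-driven sketch of its update rule, using the regular-spiking parameter set from the Izhikevich (2003) paper cited on the Literature slide; the input current, time step, and run length are illustrative choices, not values from the presentation.

```cpp
#include <cstdio>

// Izhikevich neuron model (Izhikevich, 2003):
//   v' = 0.04*v^2 + 5*v + 140 - u + I
//   u' = a*(b*v - u)
//   if v >= 30 mV: emit a spike, then v <- c, u <- u + d
// a, b, c, d below are the regular-spiking set from the paper; the input
// current I and the step dt are illustrative.
int main() {
    const double a = 0.02, b = 0.2, c = -65.0, d = 8.0;
    const double dt = 0.5;            // ms, time-driven (synchronous) grid
    double v = -65.0, u = b * v;      // membrane potential and recovery variable

    for (int step = 0; step < 2000; ++step) {   // 1000 ms of simulated time
        const double I = 10.0;                  // constant input current
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I);
        u += dt * (a * (b * v - u));
        if (v >= 30.0) {                        // spike detected
            std::printf("spike at t = %.1f ms\n", step * dt);
            v = c;
            u += d;
        }
    }
}
```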

Page 14

SYNAPTIC PLASTICITY

- Long-term plasticity (minutes – hours)
  ‒ LTP (signaling pathways, post- and pre-synaptic activity correlation)
  ‒ LTD (strong or persistent weak stimulation, inactivity, drugs)

- STDP (see the sketch below)

- Synapse: how it works
  ‒ Spikes → vesicles → fusing → transmitter crossing the cleft → binding
  ‒ Synaptic strength → PSP strength

- Synaptic strength:
  ‒ Transmitter release volume
  ‒ Connections: number, size
  ‒ Channels, receptors: density, type, conductance

- Short-term plasticity (milliseconds – minutes)
  ‒ Facilitation (spiking rate → presynaptic Ca²⁺ → fusing rate)
  ‒ Fatigue (transmitter release vs. recycle rate → depletion of vesicles)
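The slides name STDP but do not give a formula; the sketch below uses the generic pair-based exponential STDP rule (potentiation when the presynaptic spike precedes the postsynaptic spike, depression otherwise). The amplitudes and time constants are illustrative, not values from the presentation.

```cpp
#include <cmath>
#include <cstdio>

// Generic pair-based STDP window (not taken from the slides):
//   pre before post (dt > 0):  dw = +A_plus  * exp(-dt / tau_plus)   (LTP)
//   post before pre (dt < 0):  dw = -A_minus * exp( dt / tau_minus)  (LTD)
// dt = t_post - t_pre in milliseconds; all constants are illustrative.
double stdp_dw(double dt_ms) {
    const double A_plus = 0.01, A_minus = 0.012;
    const double tau_plus = 20.0, tau_minus = 20.0;
    if (dt_ms > 0.0) return  A_plus  * std::exp(-dt_ms / tau_plus);
    else             return -A_minus * std::exp( dt_ms / tau_minus);
}

int main() {
    double w = 0.5;                       // synaptic weight
    // Presynaptic spike at 10 ms, postsynaptic spike at 15 ms -> potentiation.
    w += stdp_dw(15.0 - 10.0);
    std::printf("weight after pairing: %f\n", w);
}
```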

 

Page 15

TIME-DRIVEN (SYNCHRONOUS) SIMULATION

- Events aligned to a time grid
  ‒ Can update all neurons at the same time
  ‒ Good for parallel implementation

- Time quantization error
  ‒ Delayed or missing events
  ‒ Can be controlled by the size of dt: the smaller the step, the smaller the error, but the more computation per unit of simulated time

Page 16

EVENT-DRIVEN (ASYNCHRONOUS) SIMULATION

- Events are unique in time:
  ‒ A single event can change the state of the whole system
  ‒ Neurons have to be updated sequentially in the order of events (see the event-queue sketch below)
  ‒ The minimum transmission latency is unknown
  ‒ Assumes an analytical solution for the model equations ...
  ‒ ... or a timed event-driven update

- Time quantization error
  ‒ No error caused by the simulation type
  ‒ Better event accuracy
  ‒ Good for STDP
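A minimal sketch of the event-driven idea: synaptic events sit in a priority queue ordered by delivery time, and neurons are updated strictly in that order. The Event structure, the toy state update, and the initial events are illustrative placeholders, not the presenters' implementation.

```cpp
#include <cstdio>
#include <functional>
#include <queue>
#include <vector>

// Event-driven (asynchronous) simulation skeleton: process synaptic events
// strictly in time order from a priority queue. The state update is a stub.
struct Event {
    double time;     // arrival time of the event (ms)
    int    target;   // target neuron ID
    double weight;   // synaptic weight carried by the event
    bool operator>(const Event& o) const { return time > o.time; }
};

int main() {
    std::priority_queue<Event, std::vector<Event>, std::greater<Event>> queue;
    std::vector<double> potential(4, 0.0);   // toy neuron state

    // Illustrative initial events (time, target, weight).
    queue.push({1.0, 0, 0.5});
    queue.push({0.3, 2, 1.2});
    queue.push({2.7, 1, -0.4});

    while (!queue.empty()) {
        Event e = queue.top();
        queue.pop();
        potential[e.target] += e.weight;     // stand-in for a real model update
        std::printf("t=%.1f ms: neuron %d receives event, V=%.2f\n",
                    e.time, e.target, potential[e.target]);
        // A real simulator would check for a threshold crossing here and push
        // the resulting spike's delivery events back into the queue.
    }
}
```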

Page 17

TIMED EVENT-DRIVEN (HYBRID) SIMULATION

- Events are unique in time:
  ‒ A single event can change the state of the whole system, but not within the minimum transmission delay
  ‒ Time grid: dt is equal to the minimum delay
  ‒ Update all neurons at the same time every dt increment
  ‒ Also, between dt increments, update every neuron in the order of the events it receives within the increment
  ‒ Good for parallel implementation, but there is computation divergence across neurons

- Time quantization error
  ‒ No error caused by the simulation type
  ‒ Better event accuracy
  ‒ Good for STDP

Page 18

NUMERICAL INTEGRATION METHODS

- Motivation. Need to solve an initial value problem (IVP).

- Euler. Compute the next y based on the tangent at the current y.

- Modified Euler. Predict with Euler, correct with the average slope.

- Runge-Kutta (4th order). Evaluate and average.

- Bulirsch-Stoer
  ‒ Uses the modified midpoint method with evaluation and an error-tolerance check based on extrapolation with rational functions. Provides adaptive order. Generally better suited for smooth functions.

- Parker-Sochacki
  ‒ Expresses the IVP in terms of power series. Provides adaptive order.

Page 19

NUMERICAL INTEGRATION METHODS: EULER

$y'(t) = f(t, y(t)), \qquad y(t_0) = y_0, \qquad y_{n+1} = y_n + h\, f(t_n, y_n)$
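A minimal sketch of the Euler update above, applied to an illustrative leaky-membrane equation dV/dt = (-(V - V_rest) + R*I) / tau; the equation and all constants are examples, not the model used in the talk.

```cpp
#include <cstdio>

// Forward Euler: y_{n+1} = y_n + h * f(t_n, y_n).
// Here f is an illustrative leaky-membrane equation:
//   dV/dt = (-(V - V_rest) + R * I) / tau
int main() {
    const double V_rest = -65.0, R = 10.0, I = 2.0, tau = 20.0;  // illustrative
    const double h = 0.1;            // step size (ms)
    double V = V_rest;

    for (int n = 0; n < 1000; ++n) {
        double dVdt = (-(V - V_rest) + R * I) / tau;   // f(t_n, y_n)
        V += h * dVdt;                                 // Euler update
    }
    std::printf("V after 100 ms: %f (steady state: %f)\n", V, V_rest + R * I);
}
```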

Page 20

NUMERICAL INTEGRATION METHODS: PARKER-SOCHACKI

- A typical IVP: $y'(t) = f(t, y(t)), \quad y(0) = y_0$

- Assume that the solution function can be represented with a power series: $y(t) = \sum_{k=0}^{\infty} a_k t^k$, with $a_0 = y_0$

- Therefore, by the properties of the Maclaurin series, its derivative is $y'(t) = \sum_{k=0}^{\infty} (k+1)\, a_{k+1}\, t^k$

- As a result, matching the series of $y'(t)$ against the series of $f(t, y(t))$ gives each coefficient from the previous ones: $a_{k+1} = [f]_k / (k+1)$, where $[f]_k$ is the $k$-th series coefficient of $f$

Page 21

NUMERICAL INTEGRATION METHODS: PARKER-SOCHACKI

- If $f$ is linear (a sketch of the simplest linear case follows below):

- Shift it to eliminate the constant term:

- With finite order N:

- Parallelism:
  ‒ Loop-level parallelism
  ‒ Parallel reduction

- As a result, the equation becomes:

- Benefit: adaptive order and error-tolerance control
  ‒ The local Lipschitz constant determines the number of iterations needed to achieve a given error tolerance:
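The slide leaves the linear-case equations to the figures; the sketch below shows the power-series recurrence for the simplest linear IVP y' = lambda*(y - y_inf). After shifting by y_inf to eliminate the constant term, the Maclaurin coefficients satisfy a_{k+1} = lambda*a_k / (k+1), and the finite order N plays the role of the adaptive order. The equation and constants are illustrative, not the model from the presentation.

```cpp
#include <cmath>
#include <cstdio>

// Parker-Sochacki idea for a linear IVP (illustrative example):
//   y' = lambda * (y - y_inf),  y(0) = y0.
// Shift z = y - y_inf to eliminate the constant term, so z' = lambda * z.
// With z(t) = sum_k a_k t^k, matching series gives a_{k+1} = lambda*a_k/(k+1).
int main() {
    const double lambda = -0.5, y_inf = -65.0, y0 = -55.0;  // illustrative
    const double h = 1.0;   // step size
    const int N = 12;       // series order (adaptive in the real method)

    double a = y0 - y_inf;  // a_0 after the shift
    double t_pow = 1.0;     // h^k
    double y = y_inf + a;   // accumulate y(h) = y_inf + sum_k a_k h^k
    for (int k = 0; k < N; ++k) {
        a *= lambda / (k + 1);   // a_{k+1} = lambda * a_k / (k + 1)
        t_pow *= h;
        y += a * t_pow;
    }

    double exact = y_inf + (y0 - y_inf) * std::exp(lambda * h);
    std::printf("series: %.10f  exact: %.10f\n", y, exact);
}
```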

Page 22

SUMMARY

- Neuron/Synapse Model

- Simulation Type

- Integration Method

- Application

- Requirements

- Result

Page 23

Spiking Neural Networks: Heterogeneous Implementation

Page 24

OUTLINE  

- Simulation Flow
  ‒ Synchronous
  ‒ Hybrid
  ‒ Combined

- Implementation of Hybrid Simulation Type
  ‒ Simulation Flow
  ‒ Simulation Phases (Update, Expand, Sort)
  ‒ Results

- Heterogeneous Implementation of Synchronous Simulation Type
  ‒ NEST Simulator
  ‒ Software Architecture


Page 25

SYNCHRONOUS SIMULATION FLOW

- The simulation step (dt) has two phases:
  ‒ Update:
    ‒ Compute the new state for all neurons.
    ‒ Detect spiked neurons and process them separately to update the spike history (divergence reduction).
  ‒ Propagation:
    ‒ Expand spikes into arriving events.


Page 26

HYBRID SIMULATION FLOW

- The simulation step (dt) has two phases:
  ‒ Update:
    ‒ Compute the new state for all neurons at the times of arriving spikes (event-driven).
    ‒ Detect spiked neurons and process them separately to compute the spike time and update the spike history (divergence reduction).
  ‒ Propagation:
    ‒ Expand spikes into arriving events.
    ‒ Sort the events that are due for delivery in the current time step by arrival time, for each neuron.
    ‒ Create a pointer array that maps neurons to their sorted events.


Page 27

COMBINED SIMULATION FLOW

- Exchange spikes between compute nodes (MPI)
  ‒ A spike is (time stamp, source neuron ID)

- Store spikes in the spike ring buffer (see the sketch below)
  ‒ How many ring segments? int(max delay / min delay)
  ‒ The ring 'rotates' every step by one segment

- Expand spikes
  ‒ Spike segments are matched with the relevant delay segments (synaptic connectivity matrix)
  ‒ Arrival time is computed
  ‒ Synaptic events that are due are filtered

- Sort synaptic events by arrival time for each target neuron (event-driven only)

- Update neurons

- Update synapses

- Gather new spikes
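A minimal sketch of the spike ring buffer described above: int(max delay / min delay) segments, each holding the events due in one future step, with the ring rotating by one segment per step. The event payload (a bare weight) and the delays used in main() are illustrative.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Spike ring buffer: int(max_delay / min_delay) segments, one per future step.
// schedule() files an event 'delay_steps' ahead; advance() returns the events
// due in the current step and then rotates the ring by one segment.
class SpikeRingBuffer {
public:
    explicit SpikeRingBuffer(std::size_t max_delay_steps)
        : segments_(max_delay_steps), head_(0) {}

    void schedule(std::size_t delay_steps, double weight) {
        std::size_t slot = (head_ + delay_steps) % segments_.size();
        segments_[slot].push_back(weight);
    }

    std::vector<double> advance() {                // called once per dt step
        std::vector<double> due;
        due.swap(segments_[head_]);                // deliver and clear segment
        head_ = (head_ + 1) % segments_.size();    // rotate by one segment
        return due;
    }

private:
    std::vector<std::vector<double>> segments_;
    std::size_t head_;
};

int main() {
    SpikeRingBuffer ring(4);        // e.g. max delay 4*dt, min delay 1*dt
    ring.schedule(1, 0.5);          // arrives next step
    ring.schedule(3, -1.0);         // arrives three steps from now
    for (int step = 0; step < 4; ++step) {
        for (double w : ring.advance())
            std::printf("step %d: delivered event with weight %.1f\n", step, w);
    }
}
```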


Page 28

IMPLEMENTATION OF HYBRID SIMULATION: UPDATE PHASE

- Wave-fronts (WFs) work on their segments of neurons, represented by parameters and state stored in global memory (GM)

- A work-item (WI) takes a neuron and updates its state at every arriving event

- The state is stored back to GM

- Spike data is accumulated in the local data store (LDS) and flushed to GM periodically

- Spiked neurons are processed in a separate kernel (divergence reduction)
  ‒ Spike time is computed with the Newton-Raphson method (NR)
  ‒ Spiked neurons are updated for the rest of the arriving events


Page 29

IMPLEMENTATION OF HYBRID SIMULATION: EXPAND PHASE

- Load source spike packets from GM and store them in a contiguous array in LDS.

- Load the synaptic pointer to LDS.
  ‒ Each neuron is connected to 100s or even 1000s of other neurons. The synaptic pointer describes where to get the synaptic data of the target neurons for a known spike source neuron.

- Main loop
  ‒ A WF picks a source spike (time stamp, source neuron ID) and the pointer
  ‒ A WI loads the synaptic data for a target neuron, computes the arrival time, and stores the synaptic event in the ring buffer in GM.

- Along the way, the sort histogram (required in radix sort) is loaded and stored in LDS and updated to reflect the newly created synaptic events.


Page 30

IMPLEMENTATION OF HYBRID SIMULATION: SORT PHASE

- We need to order synaptic events by arrival time and by target ID

- Radix sort: select the next radix from LSD to MSD and group numbers by radix value from smallest to largest (see the sketch below)
  ‒ Group numbers based on the current radix and compute a histogram (count of numbers with the same radix value)
  ‒ Scan the histogram: compute the prefix sum (the global offset for the next grouping)

- 8 passes for 32-bit addressing and a 4-bit radix


(Figure: radix sort example with a 1-bit radix, LSD sort.)
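A minimal CPU sketch of the LSD radix sort described above, using a 4-bit radix so that eight passes cover a 32-bit key as stated on the slide. Each pass builds a 16-bin histogram of the current digit, converts it to an exclusive prefix sum of offsets, and scatters the keys stably. The sample keys are arbitrary; the GPU version parallelizes the histogram and scan steps across work-groups.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// LSD radix sort with a 4-bit radix: 8 passes over 32-bit keys.
// Each pass: histogram of the current digit, exclusive prefix sum (offsets),
// then a stable scatter into the output buffer.
void radix_sort(std::vector<uint32_t>& keys) {
    std::vector<uint32_t> tmp(keys.size());
    for (int pass = 0; pass < 8; ++pass) {
        const int shift = pass * 4;
        uint32_t hist[16] = {0};
        for (uint32_t k : keys)                     // histogram of digit values
            ++hist[(k >> shift) & 0xF];
        uint32_t offset = 0;                        // exclusive prefix sum
        for (int d = 0; d < 16; ++d) {
            uint32_t count = hist[d];
            hist[d] = offset;
            offset += count;
        }
        for (uint32_t k : keys)                     // stable scatter by digit
            tmp[hist[(k >> shift) & 0xF]++] = k;
        keys.swap(tmp);
    }
}

int main() {
    std::vector<uint32_t> keys = {170, 45, 75, 90, 802, 24, 2, 66};  // sample
    radix_sort(keys);
    for (uint32_t k : keys) std::printf("%u ", k);
    std::printf("\n");
}
```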

Page 31

IMPLEMENTATION OF HYBRID SIMULATION: PERFORMANCE

| Network Size (neurons) | Average Synapses per Neuron | Average Events per Step | Average Spikes per Step | Total Synapse Count (millions) | "Tahiti" GPU Time per Step (ms) |
|---|---|---|---|---|---|
| 2,100,000 | 90 | 230,000 | 2,522 | 190 | 13.5 |
| 131,000 | 1,458 | 370,000 | 257 | 191 | 5.7 |
| 16,000 | 11,677 | 300,000 | 25 | 191 | 3.2 |

- Size-connection scalability in multi-precision networks with per-WF precision allocation

- 1000 iterations, 250 µs step

- Randomly connected SNN with only AMPA synapses

- Speedups of up to 100x, depending on the configuration and the devices compared

Page 32

HETEROGENEOUS IMPLEMENTATION: SIMULATOR ARCHITECTURE

- Interface: Python – SLI – Network class (C++)

- Object-oriented: Nodes – Connections – Events

- Network: administrates node connections

- Scheduler: orchestrates the simulation
  ‒ Node management: update, prepare, finalize
  ‒ Execution type selection: serial, p-threads, OpenMP
  ‒ Step scheduling
  ‒ Event transmission via the Communicator

- Communicator
  ‒ Inter-process communication
  ‒ MPI

- Features
  ‒ Primarily used as a vehicle for neuroscience research
  ‒ Generic, suitable for SNN applications
  ‒ Both time- and event-driven simulation types
  ‒ Flexible node dynamics, a variety of built-in models
  ‒ Communication infrastructure to deliver both discrete and continuous events at the same time
  ‒ Emphasis on correctness, performance, and scalability

Page 33

HETEROGENEOUS IMPLEMENTATION: SOFTWARE ARCHITECTURE

- Simplified UML diagram for the heterogeneous part of the implementation

- Neuron model templates (single and double precision) with an OpenCL™ update phase

- Object-oriented design with shared vector members (data redundancy reduction)

- STL-like containers with OpenCL™ memory/buffer types underneath

- On-the-fly CPU-GPU execution steering: adaptability

- Data structure size stability: statistical monitoring, steering, error reporting

 

Page 34

CONCLUSION

Thank You!

Page 35

LITERATURE

- R. Brette et al., "Simulation of networks of spiking neurons: A review of tools and strategies," Journal of Computational Neuroscience, vol. 23, no. 3, pp. 349-398, 2007.

- B. Gaster, D. R. Kaeli, L. Howes, and P. Mistry, Heterogeneous Computing with OpenCL™. Morgan Kaufmann, 2011.

- T. Harada and L. Howes, "Introduction to GPU Radix Sort," Heterogeneous Compute, Dec. 2011. [Online].

- E. M. Izhikevich, "Simple model of spiking neurons," IEEE Transactions on Neural Networks, vol. 14, pp. 1569-1572, 2003.

- R. Stewart and W. Bair, "Spiking neural network simulation: numerical integration with the Parker-Sochacki method," Journal of Computational Neuroscience, vol. 27, no. 1, pp. 115-133, August 2009.

- D. Yudanov and L. Reznik, "Scalable multi-precision simulation of spiking neural networks on GPU with OpenCL™," in The 2012 International Joint Conference on Neural Networks (IJCNN), IEEE, 2012.

Page 36

THANKS

- Wayne Burleson
- Mayank Daga
- Markus Diesmann
- Joseph Dinh
- Tan Ho
- Austin Hung
- Jeremy Johnson
- John Keaty
- Bingley Li
- Gewaltig Marc-Oliver
- Saul Martinez
- Haibin Niu
- Kyle Pour
- Jason Shantz
- Jason Tang
- Yury Zaytsev

Page 37

DISCLAIMER & ATTRIBUTION

The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.

The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.

AMD  MAKES  NO  REPRESENTATIONS  OR  WARRANTIES  WITH  RESPECT  TO  THE  CONTENTS  HEREOF  AND  ASSUMES  NO  RESPONSIBILITY  FOR  ANY  INACCURACIES,  ERRORS  OR  OMISSIONS  THAT  MAY  APPEAR  IN  THIS  INFORMATION.    

AMD  SPECIFICALLY  DISCLAIMS  ANY  IMPLIED  WARRANTIES  OF  MERCHANTABILITY  OR  FITNESS  FOR  ANY  PARTICULAR  PURPOSE.  IN  NO  EVENT  WILL  AMD  BE  LIABLE  TO  ANY  PERSON  FOR  ANY  DIRECT,  INDIRECT,  SPECIAL  OR  OTHER  CONSEQUENTIAL  DAMAGES  ARISING  FROM  THE  USE  OF  ANY  INFORMATION  CONTAINED  HEREIN,  EVEN  IF  AMD  IS  EXPRESSLY  ADVISED  OF  THE  POSSIBILITY  OF  SUCH  DAMAGES.  

 

ATTRIBUTION  

© 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. OpenCL™ is a trademark of Apple Inc. Other names are for informational purposes only and may be trademarks of their respective owners.