
Learning, Inference, and Replay of Hidden State Sequences in Recurrent Spiking Neural Networks

Dane Corneil, Emre Neftci, Giacomo Indiveri & Michael Pfeiffer

Institute of Neuroinformatics, ETH Zurich & University of Zurich; Laboratory of Computational Neuroscience, EPFL


Abstract

Learning to recognize, predict, and generate spatio–temporal patterns and sequences of spikes is a key feature of nervous systems, and essential for solving basic tasks like localization and navigation. How this can be done by a spiking network, however, remains an open question. Here we present an STDP-based framework, extending a previous model [1], that can simultaneously learn to abstract hidden states from sensory inputs and learn transition probabilities [2] between these states in recurrent connection weights.

Network Structure

[Figure: Network architecture. A Winner-Take-All (WTA) circuit of spiking output neurons z_1 … z_K with lateral inhibition receives sensory evidence from spiking input neurons y_1 … y_n (driven by inputs x_1 … x_m) through feedforward weights W^ff, trained with an STDP-based learning rule, and recurrent excitation through weights W^rec. The corresponding graphical model is a chain of hidden states z_{t-1}, z_t, z_{t+1} with observations x_{t-1}, x_t, x_{t+1}.]

Each output neuron $z_k$ encodes a hidden cause over the input, where

$$p(z_k \text{ fires at time } t) \propto \exp\big(u_k(t) - I(t)\big)$$

$$u_k(t) = \sum_{i=1}^{N} w^{\mathrm{ff}}_{ki} \cdot \tilde{y}_i(t) + \sum_{k'=1}^{K} w^{\mathrm{rec}}_{kk'} \cdot \tilde{z}_{k'}(t)$$

$$w^{\mathrm{ff}}_{ki} = \log p(y_i = 1 \mid z_k = 1, \mathbf{w}), \qquad w^{\mathrm{rec}}_{kk'} = \log p(z_k = 1 \mid \tilde{z}_{k'} = 1, \mathbf{w}).$$

Trained on input generated by a Hidden Markov Model, the activity of the Winner-Take-All (WTA) network evolves as a single–sample (unitary) particle filter [3].
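As a sketch of one inference step (with random placeholder weights and dimensions standing in for a trained network), the WTA computes the membrane potentials $u_k$ and samples a single winner, so that firing probabilities follow the normalization that lateral inhibition $I(t)$ enforces:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: N input neurons, K output (state) neurons.
N, K = 225, 4
w_ff = rng.normal(-1.0, 0.1, size=(K, N))   # feedforward log-likelihood weights
w_rec = rng.normal(-1.0, 0.1, size=(K, K))  # recurrent log-transition weights

def wta_step(y_trace, z_trace):
    """One inference step: compute membrane potentials u_k and sample a
    single winner, so that p(z_k fires) is proportional to exp(u_k - I)."""
    u = w_ff @ y_trace + w_rec @ z_trace
    # I(t) acts as the log normalizer; subtracting the max before
    # exponentiating keeps the computation numerically stable.
    p = np.exp(u - u.max())
    p /= p.sum()
    return rng.choice(K, p=p)

y_trace = rng.random(N)                  # stand-in for low-pass filtered input spikes
z_trace = np.zeros(K); z_trace[0] = 1.0  # previous winner's trace
winner = wta_step(y_trace, z_trace)
```

Iterating this step over time yields the single-sample trajectory through the hidden states that the particle-filter interpretation describes.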

Recurrent Learning Rule

Depression on every presynaptic spike; weight– and time–dependent potentiation if the postsynaptic neuron fires within a time window.

$$\Delta w^{\mathrm{rec}}_{kk'} \propto \begin{cases} -1 & \text{after a presynaptic spike, and} \\ e^{-w^{\mathrm{rec}}_{kk'}} \cdot \tilde{z}_{k'}(t) & \text{after a pre–post pair} \end{cases}$$

$$E[\Delta w^{\mathrm{rec}}_{kk'}] = 0 \iff w^{\mathrm{rec}}_{kk'} = \log p(z_k = 1 \mid \tilde{z}_{k'} = 1, \mathbf{w})$$
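The fixed point of this rule can be checked with a toy simulation (not the poster's experiment): for a single recurrent synapse, simplifying the presynaptic trace $\tilde{z}$ to 1, depression by $-1$ on every presynaptic spike plus potentiation by $e^{-w}$ on the fraction of spikes followed by a postsynaptic spike drives $w$ toward $\log p(\text{post} \mid \text{pre})$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative fixed-point check for one synapse. p_post is the true
# conditional probability that the postsynaptic neuron fires within the
# window after a presynaptic spike.
p_post = 0.3
w, eta = 0.0, 0.02
for _ in range(20000):
    dw = -1.0                      # depression on every presynaptic spike
    if rng.random() < p_post:      # pre-post pair occurred
        dw += np.exp(-w)           # weight-dependent potentiation
    w += eta * dw

# At equilibrium E[dw] = p_post * exp(-w) - 1 = 0, i.e. w = log(p_post),
# matching the log-conditional-probability interpretation of the weights.
```

The weight therefore converges (up to learning-rate noise) to $\log 0.3 \approx -1.20$ in this example.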

Learning a Hidden Markov Model

A four–state HMM was presented to the network, with states defined by differing Poisson firing statistics over 225 input neurons.
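Input of this kind can be mimicked with a small generator; all rates and transition probabilities below are illustrative placeholders, not the values used on the poster:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-in for the training input: a 4-state HMM where each hidden
# state drives the 225 input neurons with its own Poisson rates.
n_states, n_inputs, dt = 4, 225, 0.001                     # 1 ms time step
T = rng.dirichlet(np.ones(n_states), size=n_states)        # row-stochastic transitions
rates = rng.uniform(2.0, 40.0, size=(n_states, n_inputs))  # Hz, per state

def sample_hmm_spikes(n_steps, s0=0):
    """Sample a hidden-state sequence and per-step Bernoulli spike
    indicators approximating Poisson spiking at the state's rates."""
    states, spikes = [], []
    s = s0
    for _ in range(n_steps):
        states.append(s)
        spikes.append(rng.random(n_inputs) < rates[s] * dt)
        s = rng.choice(n_states, p=T[s])
    return np.array(states), np.array(spikes)

states, spikes = sample_hmm_spikes(1000)
```

Each row of `spikes` is one time step of input activity; the network sees only these spikes, never the hidden state labels.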

[Figure: Spike rasters of input neurons and state neurons over a 4 s segment, with the four hidden states labeled A–D; learned four-state transition graph, with optimal transition probabilities and learned values in parentheses. Self-transitions: 0.953 (0.951), 0.947 (0.947), 0.965 (0.963), 0.959 (0.956); cross-transitions: 0.018 (0.019), 0.023 (0.025), 0.035 (0.035), 0.035 (0.034), 0.018 (0.018), 0.047 (0.047).]

State–dependent Inference

A network model with additional transition neurons for encoding and learning Finite State Machines (FSMs) was trained on observations from a four–state maze traversal task. After training, the network resolved the current state given only transition symbols.
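This resolution behavior can be illustrated directly on the four-room maze FSM used in this task (the symbol-to-transition table is taken from the poster; the set-intersection computation is a deterministic stand-in for the network's probabilistic inference):

```python
# Each transition symbol maps (room before -> room after), as in the
# four-room maze of the poster.
FSM = {
    "Right": {1: 2, 4: 3},
    "Left":  {2: 1, 3: 4},
    "Up":    {4: 1, 3: 2},
    "Down":  {1: 4, 2: 3},
}

def resolve_state(symbols):
    """Track the set of rooms consistent with the observed transition
    symbols; a singleton set means the state is fully resolved."""
    belief = {1, 2, 3, 4}
    for sym in symbols:
        belief = {FSM[sym][s] for s in belief if s in FSM[sym]}
    return belief

# E.g. after "Right" the agent may be in room 2 or 3; a following "Down"
# is only possible from room 2, which resolves the state to room 3.
```

In the spiking network the same computation happens implicitly: transition neurons gate which state neurons can win the WTA competition, so ambiguity shrinks with each observed symbol.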

[Figure: Left, the four-room maze FSM with transition symbols Right (1 to 2, 4 to 3), Left (2 to 1, 3 to 4), Up (4 to 1, 3 to 2), and Down (1 to 4, 2 to 3). Middle, trained vs. learned probability of a counter-clockwise turn (range 0.30–0.55) under symmetric and asymmetric training conditions. Right, inference results: spike rasters over 10 s of input (transition-symbol) neurons and state neurons (Rooms 1–4) for the stationary and moving conditions.]

Temporal Replay

The network was trained on input defined by a spatio–temporal pattern with two stochastically branching trajectories.

- Input neurons indicated the current color (black or white) of the pixels
- Network learned both the spatial attributes (feedforward weights) and the temporal progression (recurrent weights) of the input pattern

[Figure: The branching pattern, with trained branch probabilities 67% / 33% and learned branch probabilities 31.7%, 35.0%, 65.0%, 68.3%.]

After training, the network produced samples of states corresponding to a stochastic “replay” of observed trajectories.

[Figure: Network activity during a sampled replay sequence over 2.5 s.]
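Such replay amounts to ancestral sampling from the learned transition structure. A minimal sketch, assuming a hypothetical 5-state chain whose branch probability loosely matches the two-trajectory pattern (the states and probabilities below are illustrative, not the poster's):

```python
import numpy as np

rng = np.random.default_rng(3)

# Learned transition matrix: state 0 branches into trajectory A (via
# state 1) or trajectory B (via state 2); both return to the start.
T = np.array([
    [0.0, 0.67, 0.33, 0.0, 0.0],
    [0.0, 0.0,  0.0,  1.0, 0.0],
    [0.0, 0.0,  0.0,  0.0, 1.0],
    [1.0, 0.0,  0.0,  0.0, 0.0],
    [1.0, 0.0,  0.0,  0.0, 0.0],
])

def replay(n_steps, s=0):
    """Sample a state trajectory from the learned chain, as the WTA does
    spontaneously when recurrent input alone drives the competition."""
    traj = [s]
    for _ in range(n_steps):
        s = rng.choice(len(T), p=T[s])
        traj.append(s)
    return traj

traj = replay(30)
```

Over many samples, the two branches are visited in proportion to their learned probabilities, reproducing the branching statistics of the training input.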

Conclusions

We have presented a recurrent spiking neural network architecture which can be trained to perform dynamical Bayesian inference of hidden states. The same network can be used to generate random sample trajectories through the state space by spontaneous activity in the WTA (using the recurrent connections as a generative Bayesian model), or can be used as a probabilistic FSM in which transitions are actively triggered by movement signals.

References

[1] B. Nessler, M. Pfeiffer, L. Büsing, and W. Maass. Bayesian computation emerges in generic cortical microcircuits through spike-timing-dependent plasticity. PLoS Computational Biology, 9(4):e1003037, 2013.

[2] D. Kappel, B. Nessler, and W. Maass. STDP installs in winner-take-all circuits an online approximation to Hidden Markov Model learning. PLoS Computational Biology, 2014, in press.

[3] C. Fox and T. Prescott. Hippocampal formation as unitary coherent particle filter. BMC Neuroscience, 10(Suppl 1):P275, 2009.

[4] J. Brea, W. Senn, and J.-P. Pfister. Sequence learning with hidden units in spiking neural networks. In Advances in Neural Information Processing Systems 24, pages 1422–1430, 2011.

[5] P. Baldi and Y. Chauvin. Smooth on-line learning algorithms for Hidden Markov Models. Neural Computation, 6(2):307–318, 1994.

[email protected]