
Learning, Inference, and Replay of Hidden State Sequences in Recurrent Spiking Neural Networks

Dane Corneil, Emre Neftci, Giacomo Indiveri & Michael Pfeiffer

Institute of Neuroinformatics, ETH Zurich & University of Zurich; Laboratory of Computational Neuroscience, EPFL


Abstract

Learning to recognize, predict, and generate spatio–temporal patterns and sequences of spikes is a key feature of nervous systems, and essential for solving basic tasks like localization and navigation. How this can be done by a spiking network, however, remains an open question. Here we present an STDP-based framework, extending a previous model [1], that can simultaneously learn to abstract hidden states from sensory inputs and learn transition probabilities [2] between these states in recurrent connection weights.

Network Structure

[Figure: Network architecture. A Winner-Take-All (WTA) circuit of spiking output neurons z_1 … z_K with lateral inhibition receives sensory evidence from spiking input neurons y_1 … y_n (driven by inputs x_1 … x_m) through feedforward weights W^ff, trained with an STDP-based learning rule, and recurrent excitation through weights W^rec. The corresponding graphical model is a chain of hidden states z_{t-1}, z_t, z_{t+1} with observations x_{t-1}, x_t, x_{t+1}.]

Each output neuron $z_k$ encodes a hidden cause over the input, where

$$p(z_k \text{ fires at time } t) \propto \exp\big(u_k(t) - I(t)\big)$$

$$u_k(t) = \sum_{i=1}^{N} w^{\mathrm{ff}}_{ki} \cdot \tilde{y}_i(t) + \sum_{k'=1}^{K} w^{\mathrm{rec}}_{kk'} \cdot \tilde{z}_{k'}(t)$$

$$w^{\mathrm{ff}}_{ki} = \log p(y_i = 1 \mid z_k = 1, \mathbf{w}), \qquad w^{\mathrm{rec}}_{kk'} = \log p(z_k = 1 \mid \tilde{z}_{k'} = 1, \mathbf{w}).$$

Trained on input generated by a Hidden Markov Model, the activity of the Winner-Take-All (WTA) network evolves as a single–sample (unitary) particle filter [3].
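As a sketch of one inference step (with random placeholder weights and dimensions standing in for a trained network), the WTA computes the membrane potentials $u_k$ and samples a single winner, so that firing probabilities follow the normalization that lateral inhibition $I(t)$ enforces:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: N input neurons, K output (state) neurons.
N, K = 225, 4
w_ff = rng.normal(-1.0, 0.1, size=(K, N))   # feedforward log-likelihood weights
w_rec = rng.normal(-1.0, 0.1, size=(K, K))  # recurrent log-transition weights

def wta_step(y_trace, z_trace):
    """One inference step: compute membrane potentials u_k and sample a
    single winner, so that p(z_k fires) is proportional to exp(u_k - I)."""
    u = w_ff @ y_trace + w_rec @ z_trace
    # I(t) acts as the log normalizer; subtracting the max before
    # exponentiating keeps the computation numerically stable.
    p = np.exp(u - u.max())
    p /= p.sum()
    return rng.choice(K, p=p)

y_trace = rng.random(N)                  # stand-in for low-pass filtered input spikes
z_trace = np.zeros(K); z_trace[0] = 1.0  # previous winner's trace
winner = wta_step(y_trace, z_trace)
```

Iterating this step over time yields the single-sample trajectory through the hidden states that the particle-filter interpretation describes.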

Recurrent Learning Rule

Depression on every presynaptic spike; weight– and time–dependent potentiation if the postsynaptic neuron fires within a time window.

$$\Delta w^{\mathrm{rec}}_{kk'} \propto \begin{cases} -1 & \text{after a presynaptic spike, and} \\ e^{-w^{\mathrm{rec}}_{kk'}} \cdot \tilde{z}_{k'}(t) & \text{after a pre–post pair} \end{cases}$$

$$E[\Delta w^{\mathrm{rec}}_{kk'}] = 0 \iff w^{\mathrm{rec}}_{kk'} = \log p(z_k = 1 \mid \tilde{z}_{k'} = 1, \mathbf{w})$$
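The fixed point of this rule can be checked with a toy simulation (not the poster's experiment): for a single recurrent synapse, simplifying the presynaptic trace $\tilde{z}$ to 1, depression by $-1$ on every presynaptic spike plus potentiation by $e^{-w}$ on the fraction of spikes followed by a postsynaptic spike drives $w$ toward $\log p(\text{post} \mid \text{pre})$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative fixed-point check for one synapse. p_post is the true
# conditional probability that the postsynaptic neuron fires within the
# window after a presynaptic spike.
p_post = 0.3
w, eta = 0.0, 0.02
for _ in range(20000):
    dw = -1.0                      # depression on every presynaptic spike
    if rng.random() < p_post:      # pre-post pair occurred
        dw += np.exp(-w)           # weight-dependent potentiation
    w += eta * dw

# At equilibrium E[dw] = p_post * exp(-w) - 1 = 0, i.e. w = log(p_post),
# matching the log-conditional-probability interpretation of the weights.
```

The weight therefore converges (up to learning-rate noise) to $\log 0.3 \approx -1.20$ in this example.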

Learning a Hidden Markov Model

A four–state HMM was presented to the network, with states defined by differing Poisson firing statistics over 225 input neurons.
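Input of this kind can be mimicked with a small generator; all rates and transition probabilities below are illustrative placeholders, not the values used on the poster:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-in for the training input: a 4-state HMM where each hidden
# state drives the 225 input neurons with its own Poisson rates.
n_states, n_inputs, dt = 4, 225, 0.001                     # 1 ms time step
T = rng.dirichlet(np.ones(n_states), size=n_states)        # row-stochastic transitions
rates = rng.uniform(2.0, 40.0, size=(n_states, n_inputs))  # Hz, per state

def sample_hmm_spikes(n_steps, s0=0):
    """Sample a hidden-state sequence and per-step Bernoulli spike
    indicators approximating Poisson spiking at the state's rates."""
    states, spikes = [], []
    s = s0
    for _ in range(n_steps):
        states.append(s)
        spikes.append(rng.random(n_inputs) < rates[s] * dt)
        s = rng.choice(n_states, p=T[s])
    return np.array(states), np.array(spikes)

states, spikes = sample_hmm_spikes(1000)
```

Each row of `spikes` is one time step of input activity; the network sees only these spikes, never the hidden state labels.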

[Figure: Spike rasters of input neurons and state neurons over a 4 s segment, with the four hidden states labeled A–D; learned four-state transition graph, with optimal transition probabilities and learned values in parentheses. Self-transitions: 0.953 (0.951), 0.947 (0.947), 0.965 (0.963), 0.959 (0.956); cross-transitions: 0.018 (0.019), 0.023 (0.025), 0.035 (0.035), 0.035 (0.034), 0.018 (0.018), 0.047 (0.047).]

State–dependent Inference

A network model with additional transition neurons for encoding and learning Finite State Machines (FSMs) was trained on observations from a four–state maze traversal task. After training, the network resolved the current state given only transition symbols.
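This resolution behavior can be illustrated directly on the four-room maze FSM used in this task (the symbol-to-transition table is taken from the poster; the set-intersection computation is a deterministic stand-in for the network's probabilistic inference):

```python
# Each transition symbol maps (room before -> room after), as in the
# four-room maze of the poster.
FSM = {
    "Right": {1: 2, 4: 3},
    "Left":  {2: 1, 3: 4},
    "Up":    {4: 1, 3: 2},
    "Down":  {1: 4, 2: 3},
}

def resolve_state(symbols):
    """Track the set of rooms consistent with the observed transition
    symbols; a singleton set means the state is fully resolved."""
    belief = {1, 2, 3, 4}
    for sym in symbols:
        belief = {FSM[sym][s] for s in belief if s in FSM[sym]}
    return belief

# E.g. after "Right" the agent may be in room 2 or 3; a following "Down"
# is only possible from room 2, which resolves the state to room 3.
```

In the spiking network the same computation happens implicitly: transition neurons gate which state neurons can win the WTA competition, so ambiguity shrinks with each observed symbol.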

[Figure: Left, the four-room maze FSM with transition symbols Right (1 to 2, 4 to 3), Left (2 to 1, 3 to 4), Up (4 to 1, 3 to 2), and Down (1 to 4, 2 to 3). Middle, trained vs. learned probability of a counter-clockwise turn (range 0.30–0.55) under symmetric and asymmetric training conditions. Right, inference results: spike rasters over 10 s of input (transition-symbol) neurons and state neurons (Rooms 1–4) for the stationary and moving conditions.]

Temporal Replay

The network was trained on input defined by a spatio–temporal pattern with two stochastically branching trajectories.

- Input neurons indicated the current color (black or white) of the pixels
- Network learned both the spatial attributes (feedforward weights) and the temporal progression (recurrent weights) of the input pattern

[Figure: The branching pattern, with trained branch probabilities 67% / 33% and learned branch probabilities 31.7%, 35.0%, 65.0%, 68.3%.]

After training, the network produced samples of states corresponding to a stochastic “replay” of observed trajectories.

[Figure: Network activity during a sampled replay sequence over 2.5 s.]
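Such replay amounts to ancestral sampling from the learned transition structure. A minimal sketch, assuming a hypothetical 5-state chain whose branch probability loosely matches the two-trajectory pattern (the states and probabilities below are illustrative, not the poster's):

```python
import numpy as np

rng = np.random.default_rng(3)

# Learned transition matrix: state 0 branches into trajectory A (via
# state 1) or trajectory B (via state 2); both return to the start.
T = np.array([
    [0.0, 0.67, 0.33, 0.0, 0.0],
    [0.0, 0.0,  0.0,  1.0, 0.0],
    [0.0, 0.0,  0.0,  0.0, 1.0],
    [1.0, 0.0,  0.0,  0.0, 0.0],
    [1.0, 0.0,  0.0,  0.0, 0.0],
])

def replay(n_steps, s=0):
    """Sample a state trajectory from the learned chain, as the WTA does
    spontaneously when recurrent input alone drives the competition."""
    traj = [s]
    for _ in range(n_steps):
        s = rng.choice(len(T), p=T[s])
        traj.append(s)
    return traj

traj = replay(30)
```

Over many samples, the two branches are visited in proportion to their learned probabilities, reproducing the branching statistics of the training input.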

Conclusions

We have presented a recurrent spiking neural network architecture which can be trained to perform dynamical Bayesian inference of hidden states. The same network can be used to generate random sample trajectories through the state space by spontaneous activity in the WTA (using the recurrent connections as a generative Bayesian model), or can be used as a probabilistic FSM in which transitions are actively triggered by movement signals.

References

[1] B. Nessler, M. Pfeiffer, L. Büsing, and W. Maass. Bayesian computation emerges in generic cortical microcircuits through spike-timing-dependent plasticity. PLoS Computational Biology, 9(4):e1003037, 2013.

[2] D. Kappel, B. Nessler, and W. Maass. STDP installs in winner-take-all circuits an online approximation to Hidden Markov Model learning. PLoS Computational Biology, 2014, in press.

[3] C. Fox and T. Prescott. Hippocampal formation as unitary coherent particle filter. BMC Neuroscience, 10(Suppl 1):P275, 2009.

[4] J. Brea, W. Senn, and J.-P. Pfister. Sequence learning with hidden units in spiking neural networks. In Advances in Neural Information Processing Systems 24, pages 1422–1430, 2011.

[5] P. Baldi and Y. Chauvin. Smooth on-line learning algorithms for Hidden Markov Models. Neural Computation, 6(2):307–318, 1994.

[email protected]