Computational Neuromodulation

Peter Dayan, Gatsby Computational Neuroscience Unit, University College London; Nathaniel Daw, Sham Kakade, Read Montague, John O’Doherty, Wolfram Schultz, Ben Seymour, Terry Sejnowski, Angela Yu


Page 1: Computational Neuromodulation

Computational Neuromodulation

Peter Dayan Gatsby Computational Neuroscience Unit

University College London

Nathaniel Daw Sham Kakade Read Montague

John O’Doherty Wolfram Schultz Ben Seymour

Terry Sejnowski Angela Yu

Page 2: Computational Neuromodulation

2

5. Diseases of the Will

• Contemplators
• Bibliophiles and Polyglots
• Megalomaniacs
• Instrument addicts
• Misfits
• Theorists

Page 3: Computational Neuromodulation

3

There are highly cultivated, wonderfully endowed minds whose wills suffer from a particular form of lethargy. Its undeniable symptoms include a facility for exposition, a creative and restless imagination, an aversion to the laboratory, and an indomitable dislike for concrete science and seemingly unimportant data… When faced with a difficult problem, they feel an irresistible urge to formulate a theory rather than question nature.

As might be expected, disappointments plague the theorist…

Theorists

Page 4: Computational Neuromodulation

4

Computation and the Brain
• statistical computations
– representation from density estimation (Terry)
– combining uncertain information over space, time, modalities for sensory/memory inference
– learning as a hierarchical Bayesian problem
– learning as a filtering problem
• control theoretic computations
– optimising rewards, punishments
– homeostasis/allostasis

Page 5: Computational Neuromodulation

5

Conditioning

• Ethology
• Psychology
– classical/operant conditioning
• Computation
– dynamic programming
– Kalman filtering
• Algorithm
– TD/delta rules
• Neurobiology
– neuromodulators; amygdala; OFC; nucleus accumbens; dorsal striatum

prediction: of important events (policy evaluation)
control: in the light of those predictions (policy improvement)

Page 6: Computational Neuromodulation

6

Dopamine

[Figure: dopamine cell recordings in three conditions: no prediction; prediction, reward; prediction, no reward. Event markers L and R (Schultz et al)]

• drug addiction, self-stimulation

• effect of antagonists

• effect on vigour

• link to action

• ‘scalar’ signal

Page 7: Computational Neuromodulation

7

Prediction, but What Sort?

• Sutton: predict sum future reward

$V(t) = \sum_{s \ge t} r(s) = r(t) + \sum_{s \ge t+1} r(s) = r(t) + V(t+1)$

$\delta(t) = r(t) + V(t+1) - V(t)$ (TD error)
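The recursion and the TD error on this slide can be checked numerically. The following is a minimal sketch, assuming a single finite trial with no discounting and a made-up reward sequence; the function names are illustrative and not from the talk.

```python
def returns(rewards):
    """V(t) = sum of r(s) for s >= t, built backwards via V(t) = r(t) + V(t+1)."""
    values = [0.0] * (len(rewards) + 1)  # V beyond the end of the trial is 0
    for t in reversed(range(len(rewards))):
        values[t] = rewards[t] + values[t + 1]
    return values


def td_errors(rewards, values):
    """delta(t) = r(t) + V(t+1) - V(t) for each time step of the trial."""
    return [rewards[t] + values[t + 1] - values[t] for t in range(len(rewards))]


rewards = [0.0, 0.0, 1.0, 0.0]       # hypothetical trial: reward at t = 2
exact = returns(rewards)             # correct predictions
naive = [0.0] * (len(rewards) + 1)   # no predictions at all

print(td_errors(rewards, exact))     # correct predictions leave no TD error
print(td_errors(rewards, naive))     # without predictions the error sits at the reward
```

With correct predictions the error vanishes at every step; with no predictions it equals the reward itself, which is the contrast the next slides draw for dopamine.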

Page 8: Computational Neuromodulation

8

Rewards rather than Punishments

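[Figure: TD error $\delta(t)$ and value $V(t)$ in three conditions: no prediction; prediction, reward; prediction, no reward. Event markers L and R. Compared with dopamine cells in VTA/SNc (Schultz et al)]

$\delta(t) = r(t) + V(t+1) - V(t)$ (TD error)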
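The three conditions on this slide can be reproduced with a small simulation. This is a sketch under stated assumptions: time within a trial is the state, only time steps from cue onset onwards can carry a learned value, and the trial layout, learning rate and number of training trials are hypothetical.

```python
def run_trial(V, cue_t, reward_t, rewarded, alpha=0.2, learn=True):
    """Run one trial and return the TD error at every time step."""
    deltas = []
    for t in range(len(V) - 1):
        r = 1.0 if (rewarded and t == reward_t) else 0.0
        delta = r + V[t + 1] - V[t]
        deltas.append(delta)
        if learn and t >= cue_t:  # only steps from cue onset carry a stimulus trace
            V[t] += alpha * delta
    return deltas


T, CUE_T, REWARD_T = 6, 2, 4  # hypothetical trial layout
V = [0.0] * (T + 1)

# Condition 1: no prediction (before any learning).
no_prediction = run_trial(V, CUE_T, REWARD_T, rewarded=True, learn=False)

# Train on cue followed by reward.
for _ in range(500):
    run_trial(V, CUE_T, REWARD_T, rewarded=True)

# Condition 2: prediction, reward. Condition 3: prediction, no reward.
predicted_reward = run_trial(V, CUE_T, REWARD_T, rewarded=True, learn=False)
omitted_reward = run_trial(V, CUE_T, REWARD_T, rewarded=False, learn=False)

for name, d in [("no prediction", no_prediction),
                ("prediction, reward", predicted_reward),
                ("prediction, no reward", omitted_reward)]:
    print(name, [round(x, 3) for x in d])
```

In this sketch the error sits at the reward before learning, moves to the step into the cue after learning, and dips below zero at the expected reward time when the reward is omitted.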

Page 9: Computational Neuromodulation

9

Prediction, but What Sort?

• Sutton: predict sum future reward

$V(t) = \sum_{s \ge t} r(s) = r(t) + \sum_{s \ge t+1} r(s) = r(t) + V(t+1)$

$\delta(t) = r(t) + V(t+1) - V(t)$ (TD error)

• Watkins: policy evaluation

$V(x) = E_{a \sim \pi(x)}\left[ r(x,a) + \sum_y P_{xy}(a)\, V(y) \right]$
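The policy-evaluation equation can be solved by repeated backups. The sketch below assumes a made-up three-state episodic problem (state 2 is terminal with value 0) and a uniformly random policy; the transition table, rewards and names are illustrative only.

```python
# P[x][a] maps next state y -> probability; r[x][a] is the immediate reward.
P = {
    0: {"stay": {0: 0.5, 1: 0.5}, "go": {1: 1.0}},
    1: {"stay": {1: 0.5, 2: 0.5}, "go": {2: 1.0}},
}
r = {
    0: {"stay": 0.0, "go": 0.0},
    1: {"stay": 0.0, "go": 1.0},
}
policy = {x: {"stay": 0.5, "go": 0.5} for x in P}  # pi(x): action probabilities


def evaluate_policy(P, r, policy, terminal, sweeps=200):
    """Iterate V(x) = E_{a ~ pi(x)}[ r(x,a) + sum_y P_xy(a) V(y) ]."""
    V = {x: 0.0 for x in P}
    V[terminal] = 0.0  # nothing further is earned after the trial ends
    for _ in range(sweeps):
        for x in P:
            V[x] = sum(
                p_a * (r[x][a] + sum(p_y * V[y] for y, p_y in P[x][a].items()))
                for a, p_a in policy[x].items()
            )
    return V


V = evaluate_policy(P, r, policy, terminal=2)
print({x: round(v, 4) for x, v in V.items()})
```

For this toy problem both decision states are worth 2/3 under the random policy, which can be confirmed by solving the two linear equations by hand.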

Page 10: Computational Neuromodulation

10

Policy Improvement

• Sutton: define $\pi(x; M)$; do R-M on:

$E_{a \sim \pi(x;M)}\left[ r(x,a) + \sum_y P_{xy}(a)\, V(y) \right] - V(x)$

uses the same TD error $\delta(t)$

• Watkins: value iteration with $Q(x,a)$

$Q^*(x,a) = r(x,a) + \sum_y P_{xy}(a) \max_b Q^*(y,b)$

$\delta_Q(t) = r(t) + \max_b Q(t+1,b) - Q(t,a)$
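The Watkins error $\delta_Q$ can be turned into a tabular learning rule. The following sketch assumes a hypothetical three-state corridor with a goal at the right end, a small step cost so that the undiscounted problem has a unique best policy, and purely random exploration; none of these choices come from the talk.

```python
import random

N_STATES, TERMINAL = 3, 3  # states 0..2, absorbing goal state 3
ACTIONS = ("left", "right")


def step(x, a):
    """Deterministic corridor: 'left' is blocked at state 0, 'right' moves towards the goal."""
    y = max(x - 1, 0) if a == "left" else x + 1
    reward = -0.1 + (1.0 if y == TERMINAL else 0.0)  # step cost plus goal reward
    return y, reward


def q_learning(episodes=2000, alpha=0.5, seed=0):
    rng = random.Random(seed)
    Q = {(x, a): 0.0 for x in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        x = 0
        while x != TERMINAL:
            a = rng.choice(ACTIONS)  # random behaviour; the update is off-policy
            y, reward = step(x, a)
            best_next = 0.0 if y == TERMINAL else max(Q[(y, b)] for b in ACTIONS)
            delta = reward + best_next - Q[(x, a)]  # delta_Q from the slide
            Q[(x, a)] += alpha * delta
            x = y
    return Q


Q = q_learning()
greedy = {x: max(ACTIONS, key=lambda a: Q[(x, a)]) for x in range(N_STATES)}
print(greedy)
```

Because the update bootstraps from the best next action rather than the action actually taken, the greedy policy read from Q heads for the goal even though behaviour during learning was random.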

Page 11: Computational Neuromodulation

11

Active Issues

• exploration/exploitation
• model-based (PFC)/cached (striatal) methods
• motivational influences
• vigour
• hierarchical control (PFC)
• hyperbolic discounting, Pavlovian misbehavior and ‘the will’
• representational learning
• appetitive/aversive opponency
• links with behavioural economics

Page 12: Computational Neuromodulation

12

Computation and the Brain
• statistical computations
– representation from density estimation (Terry)
– combining uncertain information over space, time, modalities for sensory/memory inference
– learning as a hierarchical Bayesian problem
– learning as a filtering problem
• control theoretic computations
– optimising rewards, punishments
– homeostasis/allostasis
– exploration/exploitation trade-offs

Page 13: Computational Neuromodulation

13

Uncertainty

Computational functions of uncertainty:
• weaken top-down influence over sensory processing
• promote learning about the relevant representations

We focus on two different kinds of uncertainties:
• expected uncertainty from known variability or ignorance (ACh)
• unexpected uncertainty due to gross mismatch between prediction and observation (NE)

Page 14: Computational Neuromodulation

14

Norepinephrine

• vigilance

• reversals

• modulates plasticity? exploration?

• scalar

Page 15: Computational Neuromodulation

15

Aston-Jones: Target Detection
detect and react to a rare target amongst common distractors
• elevated tonic activity for reversal
• activated by rare target (and reverses)
• not reward/stimulus related? more response related?

Page 16: Computational Neuromodulation

16

Vigilance Task

• variable time in start
• η controls confusability
• one single run
• cumulative is clearer
• exact inference
• effect of 80% prior

Page 17: Computational Neuromodulation

18

Phasic NE

• onset response from timing uncertainty (SET)

• growth as P(target)/0.2 rises

• act when P(target)=0.95

• stop if P(target)=0.01

• arbitrarily set NE=0 after 5 timesteps (small prob of reflexive action)
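The thresholds on this slide can be illustrated with a simple sequential Bayesian update. This is a simplified sketch, not the model from the talk: it assumes binary observations that favour the target with probability η and the distractor with probability 1 − η, ignores the variable time in the start state, and uses the value η = 0.675 quoted on the Task Difficulty slide. The function names are hypothetical.

```python
def posterior_target(p_target, obs, eta):
    """One Bayes update of P(target) from a binary observation.

    Assumed likelihoods: P(obs=1 | target) = eta, P(obs=1 | distractor) = 1 - eta.
    """
    like_target = eta if obs == 1 else 1.0 - eta
    like_distractor = 1.0 - eta if obs == 1 else eta
    numerator = like_target * p_target
    return numerator / (numerator + like_distractor * (1.0 - p_target))


def run_trial(observations, eta=0.675, prior=0.2, act=0.95, stop=0.01):
    """Accumulate evidence until P(target) crosses the act or stop threshold."""
    p = prior
    ne_signal = []
    for t, obs in enumerate(observations):
        p = posterior_target(p, obs, eta)
        ne_signal.append(p / prior)  # NE-like signal: P(target) relative to its 0.2 prior
        if p >= act:
            return "respond", t, ne_signal
        if p <= stop:
            return "reject", t, ne_signal
    return "undecided", len(observations) - 1, ne_signal


print(run_trial([1] * 10)[:2])  # consistent target-like evidence
print(run_trial([0] * 10)[:2])  # consistent distractor-like evidence
```

Lowering η towards 0.5 makes each observation less informative, so more time steps are needed before either threshold is reached.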

Page 18: Computational Neuromodulation

19

Four Types of Trial

[Figure: one panel per trial type; proportions of trials: 19%, 1.5%, 1%, 77%]

fall is rather arbitrary

Page 19: Computational Neuromodulation

20

Response Locking

slightly flatters the model – since no further response variability

Page 20: Computational Neuromodulation

21

Interrupts/Resets (SB)

LC

PFC/ACC

Page 21: Computational Neuromodulation

22

Active Issues

• approximate inference strategy
• interaction with expected uncertainty (ACh)
• other representations of uncertainty
• finer gradations of ignorance

Page 22: Computational Neuromodulation

23

Computation and the Brain
• statistical computations
– representation from density estimation (Terry)
– combining uncertain information over space, time, modalities for sensory/memory inference
– learning as a hierarchical Bayesian problem
– learning as a filtering problem
• control theoretic computations
– optimising rewards, punishments
– homeostasis/allostasis
– exploration/exploitation trade-offs

Page 23: Computational Neuromodulation

24

Computational Neuromodulation

• general: excitability, signal/noise ratios

• specific: prediction errors, uncertainty signals

Page 24: Computational Neuromodulation

25

Learning and Inference

• Learning: predict; control

Δ weight ∝ (learning rate) × (error) × (stimulus)

– dopamine: phasic prediction error for future reward
– serotonin: phasic prediction error for future punishment
– acetylcholine: expected uncertainty boosts learning
– norepinephrine: unexpected uncertainty boosts learning
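One standard way to let uncertainty set the learning rate in a rule of this form is a Kalman filter, which the Conditioning slide lists under Computation. The sketch below is a scalar Kalman filter for a single weight, offered as an illustration rather than as the model used in the talk; the noise and drift parameters and the function name are assumptions.

```python
def kalman_update(w, var, stimulus, outcome, obs_noise=1.0, drift=0.0):
    """One scalar Kalman step: change in weight = gain x error, gain set by uncertainty.

    w, var    : current weight estimate and its variance (uncertainty)
    obs_noise : assumed variance of the outcome noise
    drift     : assumed variance added per trial if the true weight can change
    """
    var = var + drift                     # uncertainty grows if the world may have changed
    error = outcome - w * stimulus        # prediction error
    gain = var * stimulus / (var * stimulus ** 2 + obs_noise)  # effective learning rate x stimulus
    w = w + gain * error
    var = (1.0 - gain * stimulus) * var   # observing the outcome reduces uncertainty
    return w, var, gain


# Same prediction error, different uncertainty about the weight.
w_hi, var_hi, gain_hi = kalman_update(0.0, 1.0, stimulus=1.0, outcome=1.0)
w_lo, var_lo, gain_lo = kalman_update(0.0, 0.1, stimulus=1.0, outcome=1.0)
print(gain_hi, gain_lo)
```

The same error produces a larger weight change when the variance is high. Raising `drift` after a suspected change in the environment has a similar effect on later trials, which is one way to read the claim that uncertainty boosts learning.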

Page 25: Computational Neuromodulation

26

Learning and Inference

[Diagram: variables z, y and x; labels: context, cortical processing, sensory inputs, top-down processing, bottom-up processing. ACh: expected uncertainty. NE: unexpected uncertainty. Outputs: prediction, learning, ...]

Page 26: Computational Neuromodulation

27

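Temporal Difference Prediction Error

[Figure: cue sequence leading to High Pain or Low Pain; transition probability labels 0.8, 1.0 and 0.2]

predict sum future pain:

$V(t) = \sum_{s \ge t} r(s) = r(t) + \sum_{s \ge t+1} r(s) = r(t) + V(t+1)$

$\delta(t) = r(t) + V(t+1) - V(t)$ (TD error)

Δ weight ∝ (learning rate) × (error) × (stimulus)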

Page 27: Computational Neuromodulation

28

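Temporal Difference Prediction Error

[Figure: cue sequence leading to High Pain or Low Pain (transition probability labels 0.8, 1.0 and 0.2), with traces of value and prediction error]

$\delta(t) = r(t) + V(t+1) - V(t)$ (TD error)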

Page 28: Computational Neuromodulation

29

Temporal Difference Prediction Error

experimental sequence:
A – B – HIGH
C – D – LOW
C – B – HIGH
A – B – HIGH
A – D – LOW
C – D – LOW
A – B – HIGH
A – B – HIGH
C – D – LOW
C – B – HIGH
…

[Diagram: experimental sequence → TD model → prediction error; MR scanner → brain responses; prediction error ? brain responses]

Ben Seymour; John O’Doherty
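The trial sequence listed on this slide can be run through a small TD learner to produce trial-by-trial prediction errors of the kind that would be compared with brain responses. This is a sketch under stated assumptions: pain magnitudes of 1.0 (HIGH) and 0.0 (LOW), a learning rate of 0.5, one value per cue, and no error computed at the onset of the first cue. None of these settings are given in the slides.

```python
SEQUENCE = [
    ("A", "B", "HIGH"), ("C", "D", "LOW"), ("C", "B", "HIGH"), ("A", "B", "HIGH"),
    ("A", "D", "LOW"), ("C", "D", "LOW"), ("A", "B", "HIGH"), ("A", "B", "HIGH"),
    ("C", "D", "LOW"), ("C", "B", "HIGH"),
]
PAIN = {"HIGH": 1.0, "LOW": 0.0}  # assumed magnitudes


def td_prediction_errors(sequence, alpha=0.5):
    """Return per-trial (error at second cue, error at outcome) and the final values."""
    V = {cue: 0.0 for cue in "ABCD"}
    errors = []
    for first, second, outcome in sequence:
        delta_cue = V[second] - V[first]            # no pain between cues: r = 0
        delta_outcome = PAIN[outcome] - V[second]   # trial ends: V(t+1) = 0
        V[first] += alpha * delta_cue
        V[second] += alpha * delta_outcome
        errors.append((delta_cue, delta_outcome))
    return errors, V


errors, V = td_prediction_errors(SEQUENCE)
for trial, (d_cue, d_out) in zip(SEQUENCE, errors):
    print(trial, round(d_cue, 3), round(d_out, 3))
```

The two error columns form the kind of time series that could serve as a regressor against the imaging data; the mapping onto the scanner signal is outside this sketch.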

Page 29: Computational Neuromodulation

30

TD prediction error:

ventral striatum

Z=-4 R

Page 30: Computational Neuromodulation

31

Temporal Difference Values

right anterior insula dorsal raphe?

Page 31: Computational Neuromodulation

32

Rewards rather than Punishments

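[Figure: TD error $\delta(t)$ and value $V(t)$ in three conditions: no prediction; prediction, reward; prediction, no reward. Event markers L and R. Compared with dopamine cells in VTA/SNc (Schultz et al)]

$\delta(t) = r(t) + V(t+1) - V(t)$ (TD error)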

Page 32: Computational Neuromodulation

33

TD Prediction Errors

• computation: dynamic programming and optimal control

• algorithm: ongoing error in predictions of the future

• implementation:
– dopamine: phasic prediction error for reward; tonic punishment
– serotonin: phasic prediction error for punishment; tonic reward

• evident in VTA; striatum; raphe?

• next: action; motivation; addiction; misbehavior

Page 33: Computational Neuromodulation

35

Task Difficulty

• set η=0.65 rather than 0.675
• information accumulates over a longer period
• hits more affected than correct rejections (cr’s)
• timing not quite right

Page 34: Computational Neuromodulation

36

Intra-trial Uncertainty

• phasic NE as unexpected state change within a model

• relative to prior probability; against default

• interrupts (resets) ongoing processing

• tie to ADHD?

• close to alerting (AJ) – but not necessarily tied to behavioral output (onset rise)

• close to behavioural switching (PR) – but not DA

• farther from optimal inference (EB)

• phasic ACh: aspects of known variability within a state?

Page 35: Computational Neuromodulation

37

Where Next

• dopamine
– tonic release and vigour
– appetitive misbehaviour and hyperbolic discounting
– actions and habits
– psychosis
• serotonin
– aversive misbehaviour and psychiatry
• norepinephrine
– stress, depression and beyond

Page 36: Computational Neuromodulation

38

Experimental Data

ACh & NE have similar physiological effects:
• suppress recurrent & feedback processing (e.g. Kimura et al, 1995; Kobayashi et al, 2000)
• enhance thalamocortical transmission (e.g. Gil et al, 1997)
• boost experience-dependent plasticity (e.g. Bear & Singer, 1986; Kilgard & Merzenich, 1998)

ACh & NE have distinct behavioral effects:
• ACh boosts learning to stimuli with uncertain consequences (e.g. Bucci, Holland, & Gallagher, 1998)
• NE boosts learning upon encountering global changes in the environment (e.g. Devauges & Sara, 1990)

Page 37: Computational Neuromodulation

39

Model Schematics

[Diagram: variables z, y and x; labels: context, cortical processing, sensory inputs, top-down processing, bottom-up processing. ACh: expected uncertainty. NE: unexpected uncertainty. Outputs: prediction, learning, ...]

Page 38: Computational Neuromodulation

40

Attention: attentional selection for (statistically) optimal processing, above and beyond the traditional view of resource constraint

Example 1: Posner’s Task (Phillips, McAlonan, Robb, & Brown, 2000)

[Figure: trial timeline cue → target → response, with interval labels 0.2-0.5s, 0.1s, 0.1s, 0.15s; panels relating cue, stimulus location and sensory input for high-validity and low-validity cues]

generalize to the case that cue identity changes with no notice

Page 39: Computational Neuromodulation

41

Formal Framework

cues: vestibular, visual, ... ($c_1, c_2, c_3, c_4$)

target $S$: stimulus location, exit direction...

ACh: variability in quality of relevant cue

NE: variability in identity of relevant cue

Sensory information: avoid representing full uncertainty

[Equations: approximate posteriors of the form $P^*(\cdot \mid D_t)$ for the ACh and NE signals]

Page 40: Computational Neuromodulation

42

Simulation Results: Posner’s Task

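$\mathrm{VE} \propto (1 - \mathrm{NE})(1 - \mathrm{ACh})$

Model setup (cues $c_1, c_2, c_3$; target $S$): vary cue validity → vary ACh; fix relevant cue → low NE

[Figure: validity effect as a function of ACh. Model: increase ACh (100, 120, 140 % normal level) and decrease ACh (100, 80, 60 % normal level). Data: validity effect vs nicotine concentration and vs scopolamine concentration (Phillips, McAlonan, Robb, & Brown, 2000)]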

Page 41: Computational Neuromodulation

43

Maze Task

Example 2: attentional shift (Devauges & Sara, 1990)

[Figure: maze with reward, cue 1 and cue 2, in two phases: cue 1 relevant and cue 2 irrelevant; then cue 1 irrelevant and cue 2 relevant]

no issue of validity

Page 42: Computational Neuromodulation

44

Simulation Results: Maze Navigation

Model setup (cues $c_1, c_2, c_3$; target $S$): fix cue validity → no explicit manipulation of ACh; change relevant cue → NE

[Figure: % rats reaching criterion vs no. days after shift from spatial to visual task; panels: experimental data (Devauges & Sara, 1990) and model data]

Page 43: Computational Neuromodulation

45

Simulation Results: Full Model

[Figure: true & estimated relevant stimuli; neuromodulation in action; validity effect (VE); x-axis: trials]

Page 44: Computational Neuromodulation

46

Simulated Psychopharmacology

[Figure: simulated depletion to 50% NE and to 50% ACh/NE. Annotations: ACh compensation; NE can nearly catch up]

Page 45: Computational Neuromodulation

47

Summary

• single framework for understanding ACh, NE and some aspects of attention

• ACh/NE as expected/unexpected uncertainty signals

• experimental psychopharmacological data replicated by model simulations

• implications from complex interactions between ACh & NE

• predictions at the cellular, systems, and behavioral levels

• activity vs weight vs neuromodulatory vs population representations of uncertainty