Page 1:

Zangwill Club Seminar - Lent Term

The Bayesian brain, surprise and free-energy

Karl Friston, Wellcome Centre for Neuroimaging, UCL

Abstract

Value-learning and perceptual learning have been an important focus over the past decade, attracting the concerted attention of experimental psychologists, neurobiologists and the machine learning community. Despite some formal connections (e.g., the role of prediction error in optimizing some function of sensory states), both fields have developed their own rhetoric and postulates. In this work, we show that perception is, literally, an integral part of value learning, in the sense that it is necessary to integrate out dependencies on the inferred causes of sensory information. This enables the value of sensory trajectories to be optimized through action. Furthermore, we show that acting to optimize value and perception are two aspects of exactly the same principle; namely, the minimization of a quantity (free-energy) that bounds the probability of sensations, given a particular agent or phenotype. This principle can be derived, in a straightforward way, from the very existence of biological agents, by considering the probabilistic behavior of an ensemble of agents belonging to the same class. Put simply, we sample the world to maximize the evidence for our existence.

Page 2:

“Objects are always imagined as being present in the field of vision as would have to be there in order to produce the same impression on the nervous mechanism” - Hermann Ludwig Ferdinand von Helmholtz

Thomas Bayes

Geoffrey Hinton

Richard Feynman

From the Helmholtz machine to the Bayesian brain and self-organization

Hermann Haken

Richard Gregory

Page 3:

Overview

Ensemble dynamics: Entropy and equilibria; Free-energy and surprise

The free-energy principle: Action and perception; Hierarchies and generative models

Perception: Birdsong and categorization; Simulated lesions

Action: Active inference; Goal directed reaching

Policies: Control and attractors; The mountain-car problem

Page 4:

What is the difference between a snowflake and a bird?

[Figure labels: temperature (axis); Phase-boundary]

…a bird can act (to avoid surprises)

Page 5:

What is the difference between snowfall and a flock of birds?

Ensemble dynamics, clumping and swarming

…birds (biological agents) stay in the same place

They resist the second law of thermodynamics, which says that their entropy should increase

Page 6:

This means biological agents must self-organize to minimise surprise; in other words, to ensure they occupy a limited number of states (an attracting set; cf. homeostasis).

But what is the entropy?

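…entropy is just average surprise

$$H = \int_0^T dt\, L(t) = -\int p(\tilde{s} \mid m) \ln p(\tilde{s} \mid m)\, d\tilde{s}, \qquad L = -\ln p(\tilde{s} \mid m)$$

[Figure: attracting set $A$ of sensory states $s = g(\vartheta)$. Low surprise (we are usually here); High surprise (I am never here)]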
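The claim that entropy is just average surprise can be checked numerically. The sketch below assumes a hypothetical agent with three discrete sensory states and made-up occupancy probabilities; the names `surprise`, `entropy` and `p` are illustrative and do not come from the talk. It compares the entropy of the distribution with the surprise averaged over a long sampled trajectory.

```python
import math
import random


def surprise(p_s):
    """Surprise (self-information) of a state with probability p_s, in nats."""
    return -math.log(p_s)


def entropy(p):
    """Entropy as the expected surprise under the distribution p."""
    return sum(p_s * surprise(p_s) for p_s in p if p_s > 0)


# Hypothetical occupancy probabilities: the agent is usually in state 0.
p = [0.90, 0.09, 0.01]

# Average surprise over a long trajectory of states sampled from p.
rng = random.Random(0)
T = 100_000
trajectory = rng.choices(range(len(p)), weights=p, k=T)
time_average = sum(surprise(p[s]) for s in trajectory) / T

H = entropy(p)
print(f"entropy = {H:.3f} nats, time-averaged surprise = {time_average:.3f} nats")
```

Under these assumptions the two numbers should agree closely, and the rarely visited state carries the highest surprise.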

Page 7:

But there is a small problem… agents cannot measure their surprise

But they can measure their free-energy, which is always bigger than surprise

$$F(t) \ge L(t)$$

This means agents should minimize their free-energy. So what is free-energy?

Page 9:

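Action to minimise a bound on surprise

$$a = \arg\min_a F(\tilde{s}, \mu)$$

Perception to optimise the bound

$$\mu = \arg\min_\mu F(\tilde{s}, \mu)$$

[Figure: loop linking external states in the world, sensations, internal states of the agent (m) and action]

More formally,

$$F = D\big(q(\vartheta \mid \mu) \,\|\, p(\vartheta)\big) - \big\langle \ln p(\tilde{s}(a) \mid \vartheta, m) \big\rangle_q = \text{Complexity} - \text{Accuracy}$$

$$a = \arg\max_a \text{Accuracy}$$

$$F = L(\tilde{s} \mid m) + D\big(q(\vartheta \mid \mu) \,\|\, p(\vartheta \mid \tilde{s})\big) = \text{Surprise} + \text{Divergence}$$

$$\mu = \arg\min_\mu \text{Divergence}$$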
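The two readings of free-energy on this slide (complexity minus accuracy, and surprise plus divergence) can be compared on a toy problem. The sketch below assumes a hypothetical generative model with two hidden causes and a single observed sensation; the numbers in `prior` and `likelihood` are invented for illustration. It evaluates free-energy for a few proposal densities and shows that it never falls below surprise, meeting it when the proposal equals the posterior.

```python
import math

# Hypothetical generative model m: two hidden causes, one observed sensation s.
prior = [0.7, 0.3]        # p(cause | m)
likelihood = [0.2, 0.9]   # p(s | cause, m) for the sensation actually observed


def kl(q, p):
    """Kullback-Leibler divergence D(q || p) between discrete densities."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def free_energy(q):
    """Free-energy of proposal density q, written as complexity minus accuracy."""
    complexity = kl(q, prior)
    accuracy = sum(qi * math.log(li) for qi, li in zip(q, likelihood))
    return complexity - accuracy


evidence = sum(pi * li for pi, li in zip(prior, likelihood))   # p(s | m)
surprise = -math.log(evidence)                                  # L(s | m)
posterior = [pi * li / evidence for pi, li in zip(prior, likelihood)]

for q in ([0.5, 0.5], [0.9, 0.1], posterior):
    F = free_energy(q)
    bound = surprise + kl(q, posterior)   # surprise plus divergence
    print(f"q = {[round(x, 3) for x in q]}  F = {F:.4f}  surprise + divergence = {bound:.4f}")
```

Because the divergence term is never negative, changing the proposal density (perception) can only tighten the bound; it cannot change surprise itself, which is why action is needed as well.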

Page 10:

Free-energy is a function of sensations and a proposal density over hidden causes

$$F(\tilde{s}, q) = \text{Energy} - \text{Entropy} = \langle G \rangle_q + \langle \ln q \rangle_q$$

and can be evaluated, given a generative model comprising a likelihood and prior:

$$G(\tilde{s}, \vartheta) = -\ln p(\tilde{s}, \vartheta \mid m) = -\ln p(\tilde{s} \mid \vartheta, m) - \ln p(\vartheta \mid m)$$

So what models might the brain use?

Page 11:

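Hierarchical models in the brain

$$\tilde{v}^{(i-1)} = \tilde{g}^{(i)} + \tilde{\omega}^{(v,i)}$$

$$D\tilde{x}^{(i)} = \tilde{f}^{(i)} + \tilde{\omega}^{(x,i)}$$

$$\vartheta \supseteq \{\tilde{x}(t), \tilde{v}(t), \theta, \gamma\}$$

[Figure: hierarchy with sensory input $\tilde{s}$ and states $\tilde{x}^{(1)}, \tilde{v}^{(1)}, \tilde{x}^{(2)}, \tilde{v}^{(2)}$, linked by forward (driving), backward (modulatory) and lateral connections]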

Page 12:

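Hierarchical form

$$\begin{aligned} s &= g(x^{(1)}, v^{(1)}) + \omega^{(v,1)} \\ \dot{x}^{(1)} &= f(x^{(1)}, v^{(1)}) + \omega^{(x,1)} \\ &\;\;\vdots \\ v^{(i-1)} &= g(x^{(i)}, v^{(i)}) + \omega^{(v,i)} \\ \dot{x}^{(i)} &= f(x^{(i)}, v^{(i)}) + \omega^{(x,i)} \end{aligned}$$

Likelihood and empirical priors

$$p(\tilde{s}, \tilde{x}, \tilde{v} \mid m) = p(\tilde{s} \mid \tilde{x}, \tilde{v})\, p(\tilde{x}, \tilde{v})$$

$$p(\tilde{x}, \tilde{v}) = p(\tilde{v}^{(n)}) \prod_{i=1}^{n-1} p(\tilde{x}^{(i)} \mid \tilde{v}^{(i)})\, p(\tilde{v}^{(i)} \mid \tilde{x}^{(i+1)}, \tilde{v}^{(i+1)})$$

Dynamical priors

$$p(D\tilde{x}^{(i)} \mid \tilde{x}^{(i)}, \tilde{v}^{(i)}) = N(\tilde{f}^{(i)}, \tilde{\Sigma}^{(x,i)})$$

Structural priors

$$p(\tilde{v}^{(i)} \mid \tilde{x}^{(i+1)}, \tilde{v}^{(i+1)}) = N(\tilde{g}^{(i+1)}, \tilde{\Sigma}^{(v,i+1)})$$

Prediction errors

$$\varepsilon^{(v,i)} = \tilde{v}^{(i-1)} - \tilde{g}^{(i)}, \qquad \varepsilon^{(x,i)} = D\tilde{x}^{(i)} - \tilde{f}^{(i)}, \qquad \vartheta \supseteq \{\tilde{x}(t), \tilde{v}(t), \theta, \gamma\}$$

Gibbs energy - a simple function of prediction error

$$G = -\ln p(\tilde{s}, \tilde{x}, \tilde{v} \mid m) = \tfrac{1}{2} \tilde{\varepsilon}^{T} \tilde{\Pi} \tilde{\varepsilon} - \tfrac{1}{2} \ln |\tilde{\Pi}|$$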

Page 13:

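The proposal density and its sufficient statistics

Laplace approximation:

$$q(\vartheta \mid \mu) = N(\mu, \Sigma(\mu))$$

Perception and inference: synaptic activity $\mu^{(\tilde{x}, \tilde{v})}$

$$\dot{\mu}^{(\tilde{x})} = D\mu^{(\tilde{x})} - \partial_{\tilde{x}} G, \qquad \dot{\mu}^{(\tilde{v})} = D\mu^{(\tilde{v})} - \partial_{\tilde{v}} G$$

Learning and memory: synaptic efficacy $\mu^{(\theta)}$ (activity-dependent plasticity; functional specialization)

Attention and salience: synaptic gain $\mu^{(\gamma)}$ (attentional gain; enabling of plasticity)

[The update equations for $\mu^{(\theta)}$ and $\mu^{(\gamma)}$, both written in terms of $G$, are not recoverable]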

Page 14:

How can we minimize prediction error (free-energy)?

Prediction error = sensations – predictions

Action: change sensory input

Perception: change predictions

…prediction errors drive action and perception to suppress themselves
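The two routes to a smaller prediction error can be made concrete with a scalar toy problem. The sketch below is an illustration under simplifying assumptions (one sensation, one prediction, a plain gradient descent on the squared error); the function name `minimise_prediction_error` and its parameters are hypothetical and do not reproduce the scheme used in the talk.

```python
def minimise_prediction_error(sensation, prediction, mode, rate=0.1, steps=200):
    """Gradient descent on 0.5 * (sensation - prediction) ** 2.

    mode="perception" changes the prediction and leaves the sensation alone;
    mode="action" changes the sensation and leaves the prediction alone.
    """
    for _ in range(steps):
        error = sensation - prediction
        if mode == "perception":
            prediction += rate * error   # move the prediction towards the sensation
        elif mode == "action":
            sensation -= rate * error    # move the sensation towards the prediction
        else:
            raise ValueError(f"unknown mode: {mode}")
    return sensation, prediction


s_p, mu_p = minimise_prediction_error(2.0, 0.5, "perception")
s_a, mu_a = minimise_prediction_error(2.0, 0.5, "action")
print(f"perception: sensation = {s_p:.3f}, prediction = {mu_p:.3f}")
print(f"action:     sensation = {s_a:.3f}, prediction = {mu_a:.3f}")
```

Both runs end with essentially no prediction error, but they settle in different places: perception brings the prediction to the sensation, whereas action brings the sensation to the prediction.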

Page 15:

So how do prediction errors change predictions?

…by hierarchical message passing in the brain

Forward connections convey feedback (prediction errors)

Backward connections return predictions

[Figure labels: sensory input; prediction; Adjust hypotheses]

Page 16:

Synaptic activity and message-passing

More formally,

$$\dot{\mu}^{(v,i)} = D\mu^{(v,i)} - \varepsilon_v^{(i)T} \xi^{(i)} - \xi^{(v,i+1)}$$

$$\dot{\mu}^{(x,i)} = D\mu^{(x,i)} - \varepsilon_x^{(i)T} \xi^{(i)}$$

$$\xi^{(v,i)} = \Pi^{(v,i)} \varepsilon^{(v,i)} = \Pi^{(v,i)} \big(\mu^{(v,i-1)} - g(\mu^{(i)})\big)$$

$$\xi^{(x,i)} = \Pi^{(x,i)} \varepsilon^{(x,i)} = \Pi^{(x,i)} \big(D\mu^{(x,i)} - f(\mu^{(i)})\big)$$

[Figure: message passing between hierarchical levels from the sensory input $\tilde{s}(t)$ upwards, with arrows labelled "Backward predictions" and "Forward prediction error"; further panels "Synaptic plasticity" and "Synaptic gain" (equations not recoverable); portrait of David Mumford]

cf Hebb's Law; cf Rescorla-Wagner; cf Predictive coding
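A stripped-down version of this message passing may help fix ideas. The sketch below assumes a hypothetical static model with a single hidden cause: the sensation equals the cause plus noise with precision `pi_s`, and the cause has a prior expectation `eta` with precision `pi_v`. It omits hidden states, dynamics and the hierarchy, so it illustrates precision-weighted prediction errors and is not the scheme on the slide.

```python
def infer_cause(s, eta, pi_s, pi_v, rate=0.05, steps=2000):
    """Gradient descent driven by precision-weighted prediction errors.

    Hypothetical model: s = v + noise (precision pi_s), v = eta + noise (precision pi_v).
    """
    mu = eta                          # start at the prior expectation
    for _ in range(steps):
        xi_s = pi_s * (s - mu)        # weighted sensory prediction error
        xi_v = pi_v * (mu - eta)      # weighted prediction error on the prior
        mu += rate * (xi_s - xi_v)    # the errors drive the expectation
    return mu


s, eta, pi_s, pi_v = 4.0, 0.0, 3.0, 1.0
mu = infer_cause(s, eta, pi_s, pi_v)
exact = (pi_s * s + pi_v * eta) / (pi_s + pi_v)   # posterior mean of this Gaussian model
print(f"expectation after descent = {mu:.4f}, exact posterior mean = {exact:.4f}")

# A higher sensory precision pulls the expectation closer to the sensation.
mu_precise = infer_cause(s, eta, 9.0, pi_v)
print(f"with pi_s = 9: {mu_precise:.4f}")
```

In this toy case the fixed point is where the two weighted errors cancel, which is the familiar precision-weighted average of sensation and prior.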

Page 17:

What about action?

Action can only suppress (sensory) prediction error. This means action fulfils our (sensory) predictions

$$\dot{a} = -\varepsilon_a^{T} \xi, \qquad \xi^{(v,1)} = \Pi^{(v,1)} \big(\tilde{s}(a) - g(\mu)\big)$$

Reflexes to action

[Figure labels: predictions; sensory error; dorsal root; ventral horn; action $a$; sensations $\tilde{s}(a)$]

Page 18:

Summary

Biological agents resist the second law of thermodynamics

They must minimize their average surprise (entropy)

They minimize surprise by suppressing prediction error (free-energy)

Prediction error can be reduced by changing predictions (perception)

Prediction error can be reduced by changing sensations (action)

Perception entails recurrent message passing in the brain to optimise predictions

Action makes predictions come true (and minimises surprise)

Page 19:

Overview

Ensemble dynamics: Entropy and equilibria; Free-energy and surprise

The free-energy principle: Action and perception; Hierarchies and generative models

Perception: Birdsong and categorization; Simulated lesions

Action: Active inference; Goal directed reaching

Policies: Control and attractors; The mountain-car problem

Page 20:

Making bird songs with Lorenz attractors

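[Figure: vocal centre driving the syrinx; sonogram (frequency against time in sec); causal states $v = (v_1, v_2)$ and hidden states]

$$f^{(1)} = \begin{bmatrix} 18x_2^{(1)} - 18x_1^{(1)} \\ v_1^{(1)} x_1^{(1)} - 2x_3^{(1)} x_1^{(1)} - x_2^{(1)} \\ 2x_1^{(1)} x_2^{(1)} - v_2^{(1)} x_3^{(1)} \end{bmatrix}$$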
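To see what these equations of motion produce, they can be integrated numerically. The sketch below is an assumption-laden illustration: it reads the flow off the slide as a Lorenz system with hidden states (x1, x2, x3) and causes (v1, v2), fixes the causes at the assumed values v1 = 32 and v2 = 8/3, and uses a hand-written fourth-order Runge-Kutta step. The function names, initial state and step size are hypothetical.

```python
def lorenz_flow(x, v):
    """Flow of the hidden states x = (x1, x2, x3) given causes v = (v1, v2)."""
    x1, x2, x3 = x
    v1, v2 = v
    return (
        18.0 * x2 - 18.0 * x1,
        v1 * x1 - 2.0 * x3 * x1 - x2,
        2.0 * x1 * x2 - v2 * x3,
    )


def rk4_step(x, v, dt):
    """One fourth-order Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))

    k1 = lorenz_flow(x, v)
    k2 = lorenz_flow(shifted(x, k1, dt / 2), v)
    k3 = lorenz_flow(shifted(x, k2, dt / 2), v)
    k4 = lorenz_flow(shifted(x, k3, dt), v)
    return tuple(
        xi + dt / 6 * (a + 2 * b + 2 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    )


def simulate(v, x0=(1.0, 1.0, 8.0), dt=0.001, steps=20000):
    """Return the trajectory of hidden states, including the initial state."""
    trajectory = [x0]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], v, dt))
    return trajectory


trajectory = simulate(v=(32.0, 8.0 / 3.0))
x2_values = [x[1] for x in trajectory]
x3_values = [x[2] for x in trajectory]
print(f"x2 range: {min(x2_values):.2f} to {max(x2_values):.2f}")
print(f"x3 range: {min(x3_values):.2f} to {max(x3_values):.2f}")
```

With the causes held fixed the states keep moving within a bounded region rather than settling or diverging; changing `v` changes the shape of the trajectory.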

Page 21:

Perception and message passing

[Figure: stimulus sonogram (2000 to 5000 on the frequency axis, against time in seconds); panels "prediction and error", "hidden states" and "causal states"; arrows labelled "Backward predictions" and "Forward prediction error"]

Page 22:

Perceptual categorization

[Figure: sonograms of Song a, Song b and Song c (frequency in Hz against time in seconds), with labels $v_1$ and $v_2$]

Page 23:

Hierarchical (itinerant) birdsong: sequences of sequences

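[Figure: neuronal hierarchy driving the syrinx; sonogram (frequency in kHz against time in sec)]

Second level:

$$f^{(2)} = \begin{bmatrix} 18x_2^{(2)} - 18x_1^{(2)} \\ 32x_1^{(2)} - 2x_3^{(2)} x_1^{(2)} - x_2^{(2)} \\ 2x_1^{(2)} x_2^{(2)} - \tfrac{8}{3} x_3^{(2)} \end{bmatrix}, \qquad \begin{bmatrix} v_1^{(1)} \\ v_2^{(1)} \end{bmatrix} = g^{(2)} = \begin{bmatrix} x_2^{(2)} \\ x_3^{(2)} \end{bmatrix}$$

First level:

$$f^{(1)} = \begin{bmatrix} 18x_2^{(1)} - 18x_1^{(1)} \\ v_1^{(1)} x_1^{(1)} - 2x_3^{(1)} x_1^{(1)} - x_2^{(1)} \\ 2x_1^{(1)} x_2^{(1)} - v_2^{(1)} x_3^{(1)} \end{bmatrix}, \qquad \begin{bmatrix} s_1 \\ s_2 \end{bmatrix} = g^{(1)} = \begin{bmatrix} x_2^{(1)} \\ x_3^{(1)} \end{bmatrix}$$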

Page 24:

Simulated lesions and false inference

[Figure: three conditions, each shown as a sonogram (frequency in Hz against time in seconds) and an LFP trace (micro-volts against peristimulus time in ms): "percept", "no top-down messages" and "no lateral messages". Further labels: "no structural priors"; "no dynamical priors"]

Page 25:

Action, predictions and priors

$$\dot{a} = -\varepsilon_a^{T} \xi, \qquad \xi^{(v,1)} = \Pi^{(v,1)} \big(\tilde{s}(a) - g(\mu)\big)$$

[Figure: reaching arm with labels $J_1$, $J_2$, $x_1$, $x_2$, $(0,0)$ and $V = (v_1, v_2, v_3)$; visual input; proprioceptive input; descending predictions; action $a$]

Page 26:

Itinerant behavior and action-observation

[Figure: two panels, "action" and "observation", each plotting position (y) against position (x); descending predictions; hidden attractor states (Lotka-Volterra)]

Page 27:

Overview

Perception: Birdsong and categorization; Simulated lesions

Action: Active inference; Goal directed reaching

Policies: Control and attractors; The mountain-car problem

Does minimising surprise preclude searching (itinerant or wandering) behaviour?

No – not if you expect circumstances to change

Page 28:

The mountain car problem

[Figure: "The environment" and "True motion", with height plotted against position (-2 to 2); "The cost-function" $c(x, h)$, with labels position and happiness; "Policy (predicted motion)"; portraits of Adriaan Fokker and Max Planck. The equations of true and predicted motion are not recoverable]

“I expect to move faster when cost is positive”

Page 29:

Exploring & exploiting the environment

With cost (i.e., exploratory dynamics)

Page 30:

Using just the free-energy principle and itinerant priors on motion, we have solved a benchmark problem in optimal control theory (without any learning).

Policies and prior expectations
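For readers unfamiliar with the benchmark, the sketch below shows why the mountain-car problem is hard. It uses the classic reinforcement-learning form of the dynamics (as described by Sutton and Barto), swapped in here as an assumption because the equations of motion from the talk are not legible in this transcript. The velocity-following rule is a simple hand-written heuristic, not the itinerant-prior policy of the talk, and all function names are hypothetical.

```python
import math


def step(x, v, a):
    """One step of the classic mountain-car dynamics with force a in [-1, 1]."""
    v += 0.001 * a - 0.0025 * math.cos(3 * x)
    v = max(-0.07, min(0.07, v))          # velocity limits
    x += v
    if x < -1.2:                          # inelastic wall on the left
        x, v = -1.2, 0.0
    return x, v


def run(policy, x=-0.5, v=0.0, goal=0.5, max_steps=1000):
    """Return the number of steps taken to reach the goal, or None on failure."""
    for t in range(max_steps):
        if x >= goal:
            return t
        x, v = step(x, v, policy(x, v))
    return None


def always_right(x, v):
    """Push towards the goal all the time."""
    return 1.0


def follow_velocity(x, v):
    """Push in the direction of travel, moving away from the goal when needed."""
    return 1.0 if v >= 0 else -1.0


print("always push right:", run(always_right))
print("push along the current velocity:", run(follow_velocity))
```

Under these assumed dynamics the direct strategy never reaches the goal, because the force is too weak to climb the hill from rest, whereas rocking back and forth builds up enough momentum. The car has to move away from its goal before it can reach it, which is what makes the problem a useful test of exploratory behaviour.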

Page 31:

Thank you

And thanks to collaborators:

Jean Daunizeau, Harriet Feldman, Lee Harrison, Stefan Kiebel, James Kilner, Jérémie Mattout, Klaas Stephan

And colleagues:

Peter Dayan, Jörn Diedrichsen, Paul Verschure, Florentin Wörgötter

And many others

Page 32:

The selection of adaptive predictions

Darwinian evolution of virtual block creatures. A population of several hundred creatures is created within a supercomputer, and each creature is tested for its ability to perform a given task, such as the ability to swim in a simulated water environment. The successful survive, and their virtual genes are copied, combined, and mutated to make offspring. The new creatures are again tested, and some may be improvements on their parents. As this cycle of variation and selection continues, creatures with more and more successful behaviours can emerge.

…we inherit them
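The cycle of variation and selection described on this slide can be sketched in a few lines. The example below is a toy illustration only: the genome is a list of bits, the task is a stand-in (fitness is simply the number of 1-genes), and the population size, mutation scheme and function name `evolve` are all assumptions rather than details of the block-creature simulation.

```python
import random


def evolve(generations=40, population_size=30, genome_length=20, seed=0):
    """Return the best fitness found at the start of each generation."""
    rng = random.Random(seed)
    fitness = sum   # stand-in task: count the 1-genes in a genome
    population = [
        [rng.randint(0, 1) for _ in range(genome_length)]
        for _ in range(population_size)
    ]
    best_per_generation = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)          # test every creature
        best_per_generation.append(fitness(population[0]))
        survivors = population[: population_size // 2]      # the successful survive
        offspring = []
        while len(survivors) + len(offspring) < population_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, genome_length)
            child = mother[:cut] + father[cut:]              # genes are copied and combined
            i = rng.randrange(genome_length)
            child[i] = 1 - child[i]                          # and mutated
            offspring.append(child)
        population = survivors + offspring
    return best_per_generation


history = evolve()
print("best fitness, first generation:", history[0])
print("best fitness, last generation:", history[-1])
```

Because the survivors are carried over unchanged, the best fitness in this sketch can never decrease from one generation to the next.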

Page 33:

Time-scale (axis marks: $10^{-3}$ s, $10^{0}$ s, $10^{3}$ s, $10^{6}$ s, $10^{15}$ s) and free-energy minimisation leading to…

Perception and Action: The optimisation of neuronal and neuromuscular activity to suppress prediction errors (or free-energy) based on generative models of sensory data.

Learning and attention: The optimisation of synaptic gain and efficacy over seconds to hours, to encode the precisions of prediction errors and causal structure in the sensorium. This entails suppression of free-energy over time.

Neurodevelopment: Model optimisation through activity-dependent pruning and maintenance of neuronal connections that are specified epigenetically.

Evolution: Optimisation of the average free-energy (free-fitness) over time and individuals of a given class (e.g., conspecifics) by selective pressure on the epigenetic specification of their generative models.