Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs....

25
Model-based classification through generative embedding Kay Henning Brodersen Machine Learning and Pattern Recognition Group Department of Computer Science, ETH Zurich Computational Neuroeconomics Group Department of Economics, University of Zurich http://people.inf.ethz.ch/bkay/

Transcript of Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs....

Page 1: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

Model-based classification through generative embedding

Kay Henning Brodersen

Machine Learning and Pattern Recognition Group Department of Computer Science, ETH Zurich

Computational Neuroeconomics Group Department of Economics, University of Zurich

http://people.inf.ethz.ch/bkay/

Page 2: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

2

Computational science and psychiatry

Spectrum diseases

diverse genetic basis

strong gene-environment interactions

variability in treatment response and outcome

Consequences

multiple pathophysiological mechanisms

even when symptoms are similar, causes may differ across patients

need to infer pathophysiological mechanisms in individual patients!

Page 3: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

4

Model-based inference on individual pathophysiology

model-based

diagnostic tests

application to brain activity data from individual patients

model of neuronal (patho)physiology

spectrum disease

type 2

type 3

type 1

diagnostic classification

Page 4: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

5

Conventional vs. model-based classification

-10

0

10

-0.5

0

0.5

-0.1

0

0.1

0.2

0.3

0.4

-0.4

-0.2

0 -0.5

0

0.5-0.4

-0.35

-0.3

-0.25

-0.2

-0.15

-10

0

10

-0.5

0

0.5

-0.1

0

0.1

0.2

0.3

0.4

-0.4

-0.2

0 -0.5

0

0.5-0.4

-0.35

-0.3

-0.25

-0.2

-0.15

ge

ne

rativ

e

em

be

ddin

g

L.H

G →

L.H

G

Vo

xe

l (6

4,-

24

,4)

mm

L.MGB → L.MGB Voxel (-42,-26,10) mm

Voxel (-56,-20,10) mm R.HG → L.HG

controls patients

Conventional classification Model-based classification

Page 5: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

6

Colleagues & collaborators

Joachim M Buhmann ETH Zurich

Klaas Enno Stephan University of Zurich · University College London

Kate Lomakina University of Zurich · ETH Zurich

Alexander Leff University College London

Cheng Soon Ong ETH Zurich

Thomas Schofield University College London

Page 6: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

7

Model-based classification

Can we exploit the rich discriminative information encoded in individual patterns of connection strengths?

Data representations in classification analyses

Structure-based classification

• mild traumatic brain injury • Alzheimer’s disease • autistic spectrum disorder • frontotemporal

dementia • mild cognitive

impairment • schizophrenia • aphasia

Activation-based classification

• depression • schizophrenia • mild cognitive

impairment

Page 7: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

8

Prediction versus inference

The goal of prediction is to find a highly accurate encoding or decoding function.

The goal of inference is to decide between competing hypotheses about mechanisms or representations in the brain.

predicting a cognitive state using a

brain-machine interface

predicting a subject-specific

diagnostic status

comparing a model that links distributed neuronal

activity to a cognitive state with a model that does not

weighing the evidence for sparse

coding vs. dense coding

powerful discriminative algorithms for classification

mechanistically interpretable generative models of brain function

Page 8: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

9

Model-based classification through generative embedding

Brodersen, Haiss, Ong, Jung, Tittgemeyer, Buhmann, Weber, Stephan (2011) NeuroImage Brodersen, Schofield, Leff, Ong, Lomakina, Buhmann, Stephan (2011) PLoS Comp Biol

step 2 — kernel construction

step 1 — model inversion

measurements from an individual

subject

subject-specific inverted generative model

subject representation in the generative score space

A → B

A → C

B → B

B → C

A

C B

step 3 — classification

separating hyperplane to discriminate between groups

A

C B

jointly discriminative connection strengths

step 4 — interpretation

-2 0 2 4 6 8-1

0

1

2

3

4

5

Voxel 1

Voxe

l 2

-0.4 -0.35 -0.3 -0.25 -0.2 -0.15-0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

(1) L.MGB -> L.MGB

(14)

R.H

G -

> L

.HG

Page 9: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

10

activity 𝑧1(𝑡)

Choosing a generative model: DCM for fMRI

intrinsic connectivity

direct inputs

modulation of connectivity

neural state equation

CuzBuAz j

j )( )(

u

zC

z

z

uB

z

zA

j

j

)(

haemodynamic forward model

𝑥 = 𝑔(𝑧, 𝜃ℎ)

observed BOLD signal

neuronal states

t

driving input 𝑢1(𝑡)

modulatory input 𝑢2(𝑡)

t

activity 𝑧2(𝑡)

activity 𝑧3(𝑡)

signal 𝑥1(𝑡)

signal 𝑥2(𝑡)

signal 𝑥3(𝑡)

Jansen & Rit (1995) Biological Cybernetics Friston, Harrison & Penny (2003) NeuroImage

Stephan & Friston (2007), Handbook of Brain Connectivity

Page 10: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

12

Choosing a generative model: DCM for LFP/EEG

4

4

3

3

1

1

2

2

1

2

4914

41

2))(( xxuaxsHx

xx

eeee

1

2

4914

41

2))(( xxuaxsHx

xx

eeee

Excitatory spiny cells in granular layers

Exogenous input u

4

4

3

3

1

1

2

2

Intrinsic

connections5

5

Excitatory spiny cells in granular layers

Excitatory pyramidal cells in agranular layers

Inhibitory cells in agranular layers

),( uxfx

11812

10

2

1112511

1110

7

2

8938

87

2)(

2)()(

xxx

xxxSHx

xx

xxxSAAHx

xx

iiii

ee

LB

ee

1

2

4914

41

2))()(( xxCuxSAAHx

xx

ee

LF

ee

Synaptic ‘alpha’ kernelSynaptic ‘alpha’ kernel

Sigmoid functionSigmoid function

659

3

2

61246

63

2

2

51295

52

2)(

2))()()((

xxx

xxxSHx

x

xxxSxSAAHx

xx

iiii

ee

LB

ee

Extrinsic

Connections:

Forward

Backward

Lateral

Moran et al. 2009 NeuroImage

Page 11: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

13

Training and testing a model-based classifier

n

i

n

j

n

i ijijiji xxkcc1 1 1

),(2

1)(max L

n

i iicts1

0..

niCi ,...,10

Training a kernel-based discriminant classifier:

n

i niin bxxkxf1

*

1

*

1 ),()(

))(sgn(:ˆ11 nn xfc

Using the model to make predictions:

Linear SVM

In the case of generative embedding:

𝑘 𝑥𝑖 , 𝑥𝑗 = 𝑥𝑖𝑇𝑥𝑗

Page 12: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

14

1 ROI definition and n model inversions

unbiased estimate

Repeat n times: 1 ROI definition and n model inversions

unbiased estimate

1 ROI definition and n model inversions

slightly optimistic estimate: voxel selection for training set and test set based on test data

Repeat n times: 1 ROI definition and 1 model inversion

slightly optimistic estimate: voxel selection for training set based on test data and test labels

Repeat n times: 1 ROI definition and n model inversions

unbiased estimate

1 ROI definition and n model inversions

highly optimistic estimate: voxel selection for training set and test set based on test data and test labels

Specifying and inverting the model – how?

F

D

E C

B

A

Definition of ROIs

Are regions of interest defined anatomically or functionally?

anatomically functionally

Functional contrasts

Are the functional contrasts defined across all subjects or between groups?

across subjects

between groups

Page 13: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

15

Model We model the likelihood functions for 𝑘+ positive and 𝑘− negative correct predictions as:

𝑝 𝑘+ 𝜋+, 𝑛+ = Bin 𝑘+ 𝜋+, 𝑛+ 𝑝 𝑘− 𝜋−, 𝑛− = Bin(𝑘−|𝜋−, 𝑛−)

The class-specific accuracies 𝜋+ and 𝜋− can be modelled as latent random variables with conjugate Beta priors:

𝑝 𝜋+ 𝛼+, 𝛽+ = Beta 𝜋+ 𝛼+, 𝛽+

𝑝 𝜋− 𝛼−, 𝛽− = Beta 𝜋− 𝛼−, 𝛽−

This prior is uninformative when using the hyperparameters 𝛼+ = 𝛽+ = 𝛼− = 𝛽− = 1. The balanced

accuracy is given by 𝜙 ≔1

2𝜋+ + 𝜋− .

Full Bayesian approach to performance evaluation

Beta(𝜋+|1,1)

Beta(𝜋+|7,3)

Brodersen, Chumbley, Mathys, Daunizeau, Ong, Buhmann & Stephan (in preparation)

𝑘+ 𝑘−

𝜋+ 𝜋−

𝛼+, 𝛽+ 𝛼−, 𝛽−

𝜋 + 𝜋 −

~

𝜙 =𝜋 + + 𝜋 −

2

Inference Inverting the model yields the posterior balanced classification accuracy,

𝑝 𝜙 𝑘+, 𝑘−, 𝑛+, 𝑛−, 𝛼+, 𝛽+, 𝛼−, 𝛽−

= Beta 2 𝜙 − 𝑧 𝛼𝑛+, 𝛽𝑛

+ Beta 2𝑧 𝛼𝑛−, 𝛽𝑛

− 𝑑𝑧1

0.

Page 14: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

16

Summary of the analysis

pre-processing

estimation of group contrasts based on all subjects except subject j selection of voxels for regions of interest

unsupservised DCM inversion for each subject

training the SVM on all subjects except subject j testing the SVM on subject j

performance evaluation

1 2 3

repeat for each subject

A

C B

Page 15: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

17

Example: diagnosis of moderate aphasia

Page 16: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

18

Regions of interest

x = –56 mm y = –20 mm z = 8 mm

L R

Page 17: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

19

Neuronal model

Schofield, Penny, Stephan, Crinion, Thompson, Price & Leff (under review) Brodersen, Schofield, Leff, Ong, Lomakina, Buhmann & Stephan (2011) PLoS Comp Biol

L.MGB

L.PT

L.HG (A1)

R.MGB

R.PT

R.HG (A1)

Page 18: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

20

Univariate analysis

range(d1$x, d2$x) range(d1$x, d2$x) range(d1$x, d2$x) range(d1$x, d2$x) range(d1$x, d2$x) range(d1$x, d2$x) range(d1$x, d2$x) range(d1$x, d2$x)

range(d1$x, d2$x) range(d1$x, d2$x) range(d1$x, d2$x) range(d1$x, d2$x) range(d1$x, d2$x) range(d1$x, d2$x) range(d1$x, d2$x) range(d1$x, d2$x)

L.MGB → L.MGB L.MGB → L.HG L.MGB → L.PT L.HG → L.HG *** L.HG → L.PT *** L.HG → R.HG L.PT → L.MGB L.PT → L.HG

L.PT → L.PT L.PT → R.PT R.MGB → R.MGB R.MGB → R.HG R.MGB → R.PT *** R.HG → L.HG *** R.HG → R.HG R.HG → R.PT

R.PT → L.PT R.PT → R.MGB R.PT → R.HG R.PT → R.PT input to L.MGB input to R.MGB patients controls

Page 19: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

21

Connectional fingerprints

patients controls

Page 20: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

22

16218917734729133230781 2433893600.4

0.5

0.6

0.7

0.8

0.9

1

bala

nced

accu

racy

0 0.5 10

0.2

0.4

0.6

0.8

1

FPR (1 - specificity)T

PR

(sen

sitiv

ity)

0 0.5 10

0.2

0.4

0.6

0.8

1

TPR (recall)

PP

V (

pre

cis

ion

)

Classification performance

16218917734729133230781 2433893600.4

0.5

0.6

0.7

0.8

0.9

1

bala

nced

accu

racy

0 0.5 10

0.2

0.4

0.6

0.8

1

FPR (1 - specificity)T

PR

(sen

sitiv

ity)

0 0.5 10

0.2

0.4

0.6

0.8

1

TPR (recall)

PP

V (

pre

cis

ion

)

anatomical contrast searchlight PCA means correlations eigenvariates correlations eigenvariates z-correlations gen.embed., original model gen.embed., feedforward gen.embed., left hemisphere gen.embed., right hemisphere

activation- based

correlation- based

model- based

a c s p m e z o f l r

Page 21: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

23

Biologically implausible models perform poorly

A B

L.MGB

L.PT

L.HG (A1)

R.MGB

R.PT

R.HG (A1)

C

L.MGB

L.PT

L.HG (A1)

R.MGB

R.PT

R.HG (A1)

auditory stimuli auditory stimuli auditory stimuli

Page 22: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

24

Discriminative features in model space

L.MGB

L.PT

L.HG (A1)

R.MGB

R.PT

R.HG (A1)

Page 23: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

25

Discriminative features in model space

L.MGB

L.PT

L.HG (A1)

R.MGB

R.PT

R.HG (A1)

Page 24: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

26

Illustration of the generative score space

-10

0

10

-0.5

0

0.5

-0.1

0

0.1

0.2

0.3

0.4

-0.4

-0.2

0 -0.5

0

0.5-0.4

-0.35

-0.3

-0.25

-0.2

-0.15

-10

0

10

-0.5

0

0.5

-0.1

0

0.1

0.2

0.3

0.4

-0.4

-0.2

0 -0.5

0

0.5-0.4

-0.35

-0.3

-0.25

-0.2

-0.15

ge

ne

rativ

e

em

be

ddin

g

L.H

G →

L.H

G

Vo

xe

l (6

4,-

24

,4)

mm

L.MGB → L.MGB Voxel (-42,-26,10) mm

Voxel (-56,-20,10) mm R.HG → L.HG

controls patients

Voxel-based input space Generative score space

Page 25: Model-based classification Kay Henning Brodersen · 2020. 5. 17. · evidence for sparse coding vs. dense coding ... The class-specific accuracies ... Creation of a low-dimensional,

27

❶ Strong classification performance

Generative embedding exploits the rich discriminative information encoded in ‘hidden’ quantities, such as coupling parameters. It may therefore outperform conventional schemes.

❷ Creation of a low-dimensional, interpretable feature space

The approach replaces high-dimensional fMRI data by a low-dimensional subject-specific fingerprint, where each dimension has a specific biological interpretation.

❸ Domains of application

Generative embedding can be used both for trial-by-trial decoding (EEG, MEG, or LFP data) and for subject-by-subject classification analyses (fMRI data).

Summary