Download - Hierarchical Models and Variance Components

Hierarchical Models and Variance Components

Hierarchical Models and Variance Components

Will Penny

Wellcome Department of Imaging Neuroscience, University College London, UK

SPM Course, London, May 2003

Outline Random Effects Analysis

Summary statistic approach (t-tests @ 2nd level) General Framework

Multiple variance components and Hierarchical models Multiple variance components

F-tests and conjunctions @2nd levelModelling fMRI serial correlation @1st level

Hierarchical models for Bayesian InferenceSPMs versus PPMs

1st Level 2nd Level

^

1^

^

2^

^

11^

^

12^

Data Design Matrix Contrast Images

)ˆ(ˆ

ˆ

craV

ct

Random Effects Analysis:Summary-Statistic Approach

SPM(t)

One-samplet-test @2nd level

Validity of approach

Gold Standard approach is EM – see later –estimates population mean effect as MEANEM

the variance of this estimate as VAREM

For N subjects, n scans per subject and equal within-subject variancewe have

VAREM = Var-between/N + Var-within/Nn

In this case, the SS approach gives the same results, on average:

Avg[MEANEM

Avg[Var()] =VAREM

In other cases, with N~12, and typical ratios of between-subject to within-subject variance found in fMRI, the SS approach will give very similar results to EM.

^^

Example: Multi-session study of auditory processing

SS results EM results

Friston et al. (2003) Mixed effects and fMRI studies, Submitted.

Two populations

Contrast images

Estimatedpopulation means

Two-samplet-test @2nd level

y = X + N 1 N L L 1 N 1

2 Basic AssumptionsIdentityIndependence

The General Linear Model

IC

N

N

Error covariance

We assume ‘sphericity’

y = X + N 1 N L L 1 N 1

Multiple variance components

N

N

Error covariance

QC kk

k

Errors can now have different variances and there can be correlations

We allow for ‘nonsphericity’

Errors are independent but not identical

Errors are not independent and not identical

Error Covariance

Non-Sphericity

)()()()1(

)2()2()2()1(

)1()1()1(

nnnn X

X

Xy

General Framework

)()()( i

k

i

kk

i QC

Hierarchical Models Multiple variance componentsat each level

With hierarchical models we can define priors and make Bayesian inferences.

If we know the variance components we can compute the distributions over the parameters at each level.

)()()()1(

)2()2()2()1(

)1()1()1(

nnnn X

X

Xy

E-Step

yCXC

XCXC

T

yy

T

y

1

11

M-Stepy

Xyr

for i and j {

}{

}{}{

11

11111

CQCQtrJ

XCQCXCtrrCQCrCQtrg

ijij

i

T

yi

T

ii

}

kkQCC

gJ

1

Friston, K. et al. (2002), Neuroimage

EM algorithm

)()()( i

k

i

kk

i QC

Estimation

)()()()1(

)2()2()2()1(

)1()1()1(

nnnn X

X

Xy

)()()1()()1()1()2()1()1( nnnn XXXXXy

Hierarchical model

Single-level model

ParametricEmpiricalBayes (PEB)

RestrictedMaximimumLikelihood(ReML)

Algorithm Equivalence

EM=PEB=ReML

Errors are independent but not identical

Errors are not independent and not identical

Error Covariance

Non-Sphericity

Error can be Independent but Non-Identical when…

1) One parameter but from different groups

e.g. patients and control groups

2) One parameter but design matrices differ across subjects

e.g. subsequent memory effect

Non-Sphericity

Error can be Non-Independent and Non-Identical when…

1) Several parameters per subject e.g. Repeated Measurement design

2) Conjunction over several parameters e.g. Common brain activity for different cognitive processes

3) Complete characterization of the hemodynamic response e.g. F-test combining HRF, temporal derivative and dispersion regressors

Non-Sphericity

jump touch koob

Stimuli: Auditory Presentation (SOA = 4 secs) of

(i) words and (ii) words spoken backwards

Subjects: (i) 12 control subjects(ii) 11 blind subjects

Scanning: fMRI, 250 scans per subject, block design

Example I

“click”

Q. What are the regions that activate for real words relative to reverse words in both blind and control groups?

U. Noppeney et al.

2nd Level

Controls Blinds

Independent but Non-Identical Error

1st Level

Conjunctionbetween the

2 groups

Controls and Blinds

jump touch

motion actionvisualsound

Stimuli: Auditory Presentation (SOA = 4 secs) of words

Subjects: (i) 12 control subjects


Example 2

“click”“jump” “click” “pink” “turn”

Q. What regions are affected by the semantic content of the words ?

U. Noppeney et al.

Non-Independent and Non-Identical Error

1st Leve visual sound hand motion

2nd Level

?=

?=

?=

F-test


Stimuli: (i) Sentences presented visually

(ii) False fonts (symbols)

Example III

Some of the sentences are syntactically primed

U. Noppeney et al.

Q. Which brain regions of the “sentence reading system” are affected by Priming?

1st Level Sentence > Symbols No-Priming>Priming

Orthogonal contrasts

2nd Level

Non-Independent and Non-Identical Error

Conjunction of 2 contrasts

LeftAnteriorTemporal

Modelling serial correlation in fMRI time series

Model errors for each subjectas AR(1) + white noise.

Example IV

Bayes Rule

Example 2:Univariate model

Likelihood and Prior

)2()2()1(

)1()1(

y

),()(

),()|()2()2()1(

)1()1()1(

Np

Nyp

)2(

)1(

)2(

)1(

)1(

)1(

)1(

)2()1()1(

)1()1()1( ),()|(

PPm

P

PmNyp

Posterior

)2( m )1( )1(

Relative Precision Weighting

Example 2:Univariate model

Likelihood and Prior

)2()2()1(

)1()1(

y

),()(

),()|()2()2()1(

)1()1()1(

Np

Nyp

)2(

)1(

)2(

)1(

)1(

)1(

)1(

)2()1()1(

)1()1()1( ),()|(

PPm

P

PmNyp

Posterior

Similar expressions exist for posterior distributionsin multivariate models

AIM: Make inferences based onposterior distribution

But how do we compute thevariance components or ‘hyperparameters’ ?

)()()()1(

)2()2()2()1(

)1()1()1(

nnnn X

X

Xy

E-Step

yCXC

XCXC

T

yy

T

y

1

11

M-Stepy

Xyr

for i and j {

}{

}{}{

11

11111

CQCQtrJ

XCQCXCtrrCQCrCQtrg

ijij

i

T

yi

T

ii

}

kkQCC

gJ

1

Friston, K. et al. (2002), Neuroimage

EM algorithm

)()()( i

k

i

kk

i QC

Estimation

Estimating mean and variance

N

nny

N 1

1

N

nny

N 1

2)(

11

N

nny

N 1

1

Maximum Likelihood (ML), maximises p(Y|,)

Expectation-Maximisation (EM),maximises

N

nny

N 1

2)(

1

11

dpYpYp )(),|()|(

for ‘vague’ prior on

Estimating mean and variance

N

nny

N 1

2)(

11

N

nny

N 1

For a prior on with prior mean 0 and prior precision Expectation-Maximisation (EM) gives

where

10

N

N

Larger more shrinkage

Estimating mean and variance atmultiple voxels

N

nini

ii

yN 1

2

, )(11

10

i

ii N

N

N

nni

ii y

N 1,

For a prior on over voxels with prior mean 0 and prior precision Expectation-Maximisation (EM) gives at voxel i=1..V, scan n=1..N

where

V

ii

ii 1

211

Prior precision can be estimated from data. If mean activation overall voxels is 0 then these EM estimates are more accurate than ML

The Interface

WLSParameters,REMLHyperparameters

PEBParametersandHyperparameters

No Priors Shrinkagepriors

Bayesian Inference

1st level = within-voxel

2nd level = between-voxels

Likelihood

Shrinkage Prior

)2()2()2()1(

)1()1()1(

X

Xy

In the absence of evidenceto the contrary parameterswill shrink to zero

LikelihoodLikelihood PriorPriorPosteriorPosterior

SPMsSPMs

PPMsPPMs

u

)(yft

)0|( tp)|( yp

Bayesian Inference: Posterior Probability Maps

)()|()|( pypyp

SPMs and PPMsS

PM

mip

[0, 0

, 0]

<

< <

PPM2.06

rest [2.06]

SPMresults:C:\home\spm\analysis_PET

Height threshold P = 0.95

Extent threshold k = 0 voxels

Design matrix1 4 7 10 13 16 19 22

147

1013161922252831343740434649525560

contrast(s)

4

SP

Mm

ip[0

, 0, 0

]

<

< <

SPM{T39.0

}

rest

SPMresults:C:\home\spm\analysis_PET

Height threshold T = 5.50

Extent threshold k = 0 voxels

Design matrix1 4 7 10 13 16 19 22

147

1013161922252831343740434649525560

contrast(s)

3

PPMs: Show activationsof a given size

SPMs: show voxels with non-zero activations

PPMs

Advantages Disadvantages

One can infer a causeDID NOT elicit a response

SPMs conflate effect-size and effect-variability

P-values don’t change withsearch volume

For reasonable thresholds have intrinsically high specificity

Use of shrinkagepriors over voxelsis computationally demanding

Utility of Bayesian approach is yetto be established

Summary Random Effects Analysis

Summary statistic approach (t-tests @ 2nd level) Multiple variance components



General FrameworkMultiple variance components and Hierarchical models