Effect Estimation & Testing

Thomas Nichols, Ph.D., Assistant Professor, Department of Biostatistics (http://www.sph.umich.edu/~nichols). fMRI Course, OHBM 2004.


Page 1: Effect Estimation & Testing

1

Effect Estimation & Testing

Thomas Nichols, Ph.D.
Assistant Professor

Department of Biostatistics

http://www.sph.umich.edu/~nichols

fMRI Course
OHBM 2004

Page 2: Effect Estimation & Testing

2

Outline

• Data Modeling
– General Linear Model
– GLM Issues
• Statistical Inference
– Statistic Images & Hypothesis Testing
– Multiple Testing Problem

Page 3: Effect Estimation & Testing

3

Basic fMRI Example

• Data at one voxel
– Rest vs. passive word listening
• Is there an effect?

Page 4: Effect Estimation & Testing

4

A Linear Model

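$\text{Intensity} = \beta_1 x_1 + \beta_2 x_2 + \text{error}$

[Figure: voxel intensity over time, shown as the sum of the regressors x1 and x2 (scaled by their parameters) plus error]

• “Linear” in parameters $\beta_1$ & $\beta_2$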

Page 5: Effect Estimation & Testing

5

Linear model, in image form…

$Y = \beta_1 x_1 + \beta_2 x_2 + \epsilon$

Page 6: Effect Estimation & Testing

6

… in image matrix form…

$Y = X \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} + \epsilon$

Page 7: Effect Estimation & Testing

7

… in matrix form.

$\underset{N \times 1}{Y} = \underset{N \times p}{X}\;\underset{p \times 1}{\beta} + \underset{N \times 1}{\epsilon}$

N: Number of scans, p: Number of regressors

General Linear Model

• Really general
– Correlation
– ANOVA
– ANCOVA
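The matrix form above can be sketched at a single voxel with ordinary least squares. This is a minimal illustration, not the course's own code: the block design, the parameter values and the noise level are all assumptions made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: N scans, a boxcar "task" regressor plus a constant term.
N = 84
block = np.tile(np.r_[np.zeros(6), np.ones(6)], N // 12)  # 6 scans rest, 6 scans task
X = np.column_stack([block, np.ones(N)])                  # N x p design matrix, p = 2

# Simulated voxel time series following Y = X beta + error (assumed values).
beta_true = np.array([2.0, 100.0])
Y = X @ beta_true + rng.normal(scale=1.0, size=N)

# Ordinary least squares estimate of beta.
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Residuals and the residual variance estimate with N - p degrees of freedom.
residuals = Y - X @ beta_hat
p = X.shape[1]
s2 = residuals @ residuals / (N - p)
```

Here `beta_hat[0]` estimates the task effect and `beta_hat[1]` the baseline; the same fit is repeated independently at every voxel.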

Page 8: Effect Estimation & Testing

8

Linear Model Issues

• Signal Predictors
– Block
– Event-related
• Nuisance Predictors
– Drift
– Motion parameters

• Autocorrelation

• Random effects

Page 9: Effect Estimation & Testing

9

Temporal Autocorrelation In Brief

Method        Advantage              Disadvantage                     Software
Indep.        Simple                 Inflated significance            All
Precoloring   Avoids autocorr. est.  Statistically inefficient        SPM99
Whitening     Statistically optimal  Requires precise autocorr. est.  FSL, SPM2
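To make the whitening row of the table concrete, the sketch below applies the standard AR(1) (Prais-Winsten) transform to both data and design before an ordinary least-squares fit. It assumes an AR(1) error model with a known coefficient `rho`, which is exactly the part the table flags as difficult in practice. The function name `ar1_whiten` and the simulated values are hypothetical, and this is not how SPM2 or FSL implement it.

```python
import numpy as np

def ar1_whiten(A, rho):
    """Apply the AR(1) whitening (Prais-Winsten) transform along axis 0."""
    A = np.asarray(A, dtype=float)
    out = np.empty_like(A)
    out[0] = np.sqrt(1.0 - rho**2) * A[0]   # rescale the first scan
    out[1:] = A[1:] - rho * A[:-1]          # remove the lag-1 dependence
    return out

rng = np.random.default_rng(0)
N, rho = 120, 0.4                            # assumed values for illustration

# Simulate AR(1) noise: e[t] = rho * e[t-1] + white noise.
e = np.zeros(N)
e[0] = rng.normal() / np.sqrt(1.0 - rho**2)
for t in range(1, N):
    e[t] = rho * e[t - 1] + rng.normal()

block = np.tile(np.r_[np.zeros(10), np.ones(10)], N // 20)
X = np.column_stack([block, np.ones(N)])
Y = X @ np.array([1.0, 50.0]) + e

# Whiten both sides of Y = X beta + error, then fit by ordinary least squares.
Yw, Xw = ar1_whiten(Y, rho), ar1_whiten(X, rho)
beta_hat, *_ = np.linalg.lstsq(Xw, Yw, rcond=None)
```

With `rho = 0` the transform leaves the data unchanged, which corresponds to the "Indep." row of the table.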

Page 10: Effect Estimation & Testing

10

Random Effects Models

• GLM has only one source of randomness
– Residual error
• But people are another source of error
– Everyone activates somewhat differently…

$Y = X\beta + \epsilon$

Page 11: Effect Estimation & Testing

11

Fixed vs. Random Effects

• Fixed Effects
– Intra-subject variation suggests all these subjects different from zero
• Random Effects
– Intersubject variation suggests population not very different from zero

[Figure: distribution of each subject’s estimated effect (Subj. 1 to Subj. 6) relative to 0, and distribution of population effect]

Page 12: Effect Estimation & Testing

12

Random Effects for fMRI

• Summary Statistic Approach
– Easy
  • Create contrast images for each subject
  • Analyze contrast images with one-sample t
– Limited
  • Only allows one scan per subject
  • Assumes balanced designs and homogeneous measurement error
• Full Mixed Effects Analysis
– Harder
  • Requires iterative fitting
  • REML to estimate inter- and intra-subject variance
– SPM2 & FSL3 implement this differently
– Very flexible
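The summary statistic approach listed above reduces to a one-sample t test across subjects at each voxel. The following is a minimal sketch under the stated assumptions (one contrast image per subject, balanced design); the array `con_images` is simulated, hypothetical data.

```python
import numpy as np

def one_sample_t(con):
    """One-sample t statistic per voxel; con is (n_subjects, n_voxels)."""
    n = con.shape[0]
    mean = con.mean(axis=0)
    se = con.std(axis=0, ddof=1) / np.sqrt(n)   # between-subject standard error
    return mean / se

rng = np.random.default_rng(1)
# Hypothetical first-level contrast values: 12 subjects, 1000 voxels.
con_images = rng.normal(loc=0.5, scale=1.0, size=(12, 1000))

# One t value per voxel, to be compared with a t distribution on n - 1 = 11 df.
t_map = one_sample_t(con_images)
```

Because only the between-subject spread of the contrast values enters the standard error, the inference refers to the population rather than to this particular cohort.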

Page 13: Effect Estimation & Testing

13

Random Effects for fMRI: Random vs. Fixed

• Fixed isn’t “wrong”, just usually isn’t of interest
• If it is sufficient to say “I can see this effect in this cohort”, then fixed effects are OK
• If you need to say “If I were to sample a new cohort from the population I would get the same result”, then random effects are needed

Page 14: Effect Estimation & Testing

14

Building Statistic Images

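• Contrast
– A linear combination of parameters
– Truth: $c'\beta$; Estimate: $c'\hat\beta$
– Example: $c' = [1\ 0\ 0\ 0\ 0\ 0\ 0\ 0]$ applied to $\hat\beta = (\hat\beta_1, \hat\beta_2, \hat\beta_3, \hat\beta_4, \hat\beta_5, \ldots)'$

$T = \dfrac{\text{contrast of estimated parameters}}{\sqrt{\text{variance estimate}}} = \dfrac{c'\hat\beta}{\sqrt{s^2\, c'(X'X)^{-1} c}}$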
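The T statistic on this slide can be computed directly from its definition. The sketch below assumes independent, equal-variance errors (no autocorrelation correction); `contrast_t` is a hypothetical helper name and the data are simulated.

```python
import numpy as np

def contrast_t(X, Y, c):
    """t statistic for contrast c, assuming independent, equal-variance errors."""
    N, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ Y
    resid = Y - X @ beta_hat
    s2 = resid @ resid / (N - p)                 # residual variance estimate
    return (c @ beta_hat) / np.sqrt(s2 * (c @ XtX_inv @ c))

rng = np.random.default_rng(0)
N = 84
block = np.tile(np.r_[np.zeros(6), np.ones(6)], N // 12)
X = np.column_stack([block, np.ones(N)])
Y = X @ np.array([2.0, 100.0]) + rng.normal(size=N)   # assumed true effect of 2

c = np.array([1.0, 0.0])                         # picks out the task effect
T = contrast_t(X, Y, c)                          # refer to t with N - p = 82 df
```

Evaluating `contrast_t` at every voxel yields the statistic image that the rest of the lecture thresholds.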

Page 15: Effect Estimation & Testing

16

Hypothesis Testing

• Assume Null Hypothesis of no signal
• Given that there is no signal, how likely is our measured T?
• P-value measures this
– Probability of obtaining T as large or larger
• α level
– Acceptable false positive rate

[Figure: null distribution of T, with the P-val area marked at the observed T]

Page 16: Effect Estimation & Testing

17

Hypothesis Testing in fMRI

• Massively Univariate Modeling
– Fit model at each voxel
– Create statistic images of effect
• Which of 100,000 voxels are significant?
– α = 0.05 ⇒ 5,000 false positives expected!

[Figure: statistic image thresholded at t > 0.5, t > 1.5, t > 2.5, t > 3.5, t > 4.5, t > 5.5, t > 6.5]

Page 17: Effect Estimation & Testing

18

MCP Solutions: Measuring False Positives

• Familywise Error Rate (FWER)
– Familywise Error
  • Existence of one or more false positives
– FWER is probability of familywise error
• False Discovery Rate (FDR)
– R voxels declared active, V falsely so
  • Observed false discovery rate: V/R
– FDR = E(V/R)

Page 18: Effect Estimation & Testing

19

FWER MCP Solutions

• Bonferroni

• Maximum Distribution Methods
– Random Field Theory
– Permutation
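As a small worked example of the Bonferroni option, the arithmetic below reuses the 100,000-voxel, α = 0.05 figures from the earlier slide. The independence calculation at the end is an added assumption for illustration only; Bonferroni itself controls FWER without requiring independence.

```python
alpha = 0.05          # desired familywise error rate
n_voxels = 100_000    # number of tests, as in the slide's example

# Uncorrected: expected number of false positives if no voxel has signal.
expected_false_positives = alpha * n_voxels

# Bonferroni: test each voxel at alpha / n_voxels.
bonferroni_p = alpha / n_voxels

# If the tests were independent, the resulting FWER would sit just under alpha.
fwer_if_independent = 1.0 - (1.0 - bonferroni_p) ** n_voxels

print(expected_false_positives, bonferroni_p, fwer_if_independent)
```

The per-voxel threshold of 5e-7 is what makes Bonferroni conservative for smooth images, which motivates the maximum distribution methods.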


Page 20: Effect Estimation & Testing

21

FWER MCP Solutions: Controlling FWER w/ Max

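• FWER & distribution of maximum

$\mathrm{FWER} = P(\mathrm{FWE}) = P(\text{One or more voxels} \ge u \mid H_0) = P(\text{Max voxel} \ge u \mid H_0)$

• 100(1-α)%ile of max distn controls FWER

$\mathrm{FWER} = P(\text{Max voxel} \ge u_\alpha \mid H_0) \le \alpha$

[Figure: distribution of the maximum, with threshold $u_\alpha$ cutting off an upper tail of area α]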

Page 21: Effect Estimation & Testing

22

FWER MCP Solutions: Random Field Theory

• Euler Characteristic $\chi_u$
– Topological Measure
  • #blobs - #holes
– At high thresholds, just counts blobs
– $\mathrm{FWER} = P(\text{Max voxel} \ge u \mid H_0) = P(\text{One or more blobs} \mid H_0) \approx P(\chi_u \ge 1 \mid H_0) \approx E(\chi_u \mid H_0)$

[Figure: Random Field, Threshold, Suprathreshold Sets]

Page 22: Effect Estimation & Testing

23

Random Field Intuition

• Corrected P-value for voxel value t

$P_c = P(\max T > t) \approx E(\chi_t) \propto \lambda(\Omega)\, |\Lambda|^{1/2}\, t^2 \exp(-t^2/2)$

• Statistic value t increases
– Pc decreases (of course!)
• Search volume increases ($\lambda(\Omega)$ larger)
– Pc increases (more severe MCP)
• Smoothness increases ($|\Lambda|^{1/2}$ smaller)
– Pc decreases (less severe MCP)

Page 23: Effect Estimation & Testing

24

Random Field Theory: Strengths & Weaknesses

• Closed form results for $E(\chi_u)$
– Z, t, F, Chi-Squared Continuous RFs
• Results depend only on volume & smoothness
• Smoothness assumed known
• Sufficient smoothness required
– Results are for continuous random fields
• Multivariate normality
• Several layers of approximations

[Figure: Lattice Image Data vs. Continuous Random Field]

Page 24: Effect Estimation & Testing

25

FWER MCP Solutions

• Bonferroni

• Maximum Distribution Methods
– Random Field Theory
– Permutation

Page 25: Effect Estimation & Testing

26

Nonparametric Permutation Test

• Parametric methods
– Assume distribution of statistic under null hypothesis
• Nonparametric methods
– Use data to find distribution of statistic under null hypothesis
– Any statistic!

[Figure: Parametric Null Distribution and Nonparametric Null Distribution, each with a 5% tail marked]

Page 26: Effect Estimation & Testing

27

Controlling FWER: Permutation Test

• Parametric methods
– Assume distribution of max statistic under null hypothesis
• Nonparametric methods
– Use data to find distribution of max statistic under null hypothesis
– Any max statistic!

[Figure: Parametric Null Max Distribution and Nonparametric Null Max Distribution, each with a 5% tail marked]

Page 27: Effect Estimation & Testing

28

Permutation Test: Strengths

• Requires only assumption of exchangeability
– Under Ho, distribution unperturbed by permutation
• Subjects are exchangeable
– Under Ho, each subject’s A/B labels can be flipped
• fMRI scans not exchangeable under Ho
– Due to temporal autocorrelation
– Need to de-correlate, then permute (Brammer, Bullmore et al, 1997)
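For the subject-level case, where each subject’s A/B labels can be flipped under Ho, the maximum-statistic permutation test can be sketched as below. This is an illustrative sketch on simulated data, not the code used for the results in this talk. Flipping a subject's labels is implemented as flipping the sign of that subject's A-B difference image, and the function names, array sizes and effect size are assumptions.

```python
import itertools
import numpy as np

def one_sample_t(d):
    """One-sample t statistic at each voxel; d is (n_subjects, n_voxels)."""
    n = d.shape[0]
    return d.mean(axis=0) / (d.std(axis=0, ddof=1) / np.sqrt(n))

def max_t_sign_flip(diff, alpha=0.05):
    """FWER threshold and corrected p-values from the permutation max-t distribution."""
    n = diff.shape[0]
    t_obs = one_sample_t(diff)
    # Under Ho each subject's A/B labels can be swapped, which flips the sign
    # of that subject's A-B difference; enumerate all 2**n relabelings.
    max_null = np.array([
        one_sample_t(diff * np.array(signs)[:, None]).max()
        for signs in itertools.product([1.0, -1.0], repeat=n)
    ])
    # 100(1 - alpha) percentile of the max distribution, chosen conservatively.
    k = int(np.ceil((1.0 - alpha) * max_null.size)) - 1
    u = np.sort(max_null)[k]
    # Corrected p-value: share of relabelings whose max t reaches the observed t.
    p_corr = (max_null[None, :] >= t_obs[:, None]).mean(axis=1)
    return t_obs, u, p_corr

rng = np.random.default_rng(0)
diff = rng.normal(size=(12, 200))      # hypothetical data: 12 subjects, 200 voxels
diff[:, :10] += 3.0                    # assumed true effect in the first 10 voxels
t_obs, u, p_corr = max_t_sign_flip(diff)
significant = t_obs > u
```

With 12 subjects there are only 2^12 = 4,096 relabelings, so full enumeration is cheap; for larger groups a random subset of relabelings is the usual compromise.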

Page 28: Effect Estimation & Testing

29

Permutation Test: Limitations

• Computational Intensity
– Analysis repeated for each relabeling
– Not so bad on modern hardware
  • No analysis discussed below took more than 3 hours
• Implementation Generality
– Each experimental design type needs unique code to generate permutations
  • Not so bad for population inference with t-tests

Page 29: Effect Estimation & Testing

30

Measuring False Positives

• Familywise Error Rate (FWER)
– Familywise Error
  • Existence of one or more false positives
– FWER is probability of familywise error
• False Discovery Rate (FDR)
– R voxels declared active, V falsely so
  • Observed false discovery rate: V/R
– FDR = E(V/R)

Page 30: Effect Estimation & Testing

31

False Discovery Rate Illustration:

[Figure: three panels labeled Signal, Signal+Noise, Noise]

Page 31: Effect Estimation & Testing

32

[Figure: ten simulated realizations under each of three error-control schemes]

Control of Per Comparison Rate at 10%
Percentage of Null Pixels that are False Positives:
11.3% 11.3% 12.5% 10.8% 11.5% 10.0% 10.7% 11.2% 10.2% 9.5%

Control of Familywise Error Rate at 10%
Occurrence of Familywise Error: realizations with an error are marked “FWE”

Control of False Discovery Rate at 10%
Percentage of Activated Pixels that are False Positives:
6.7% 10.4% 14.9% 9.3% 16.2% 13.8% 14.0% 10.5% 12.2% 8.7%

Page 32: Effect Estimation & Testing

33

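Controlling FDR: Benjamini & Hochberg

• Select desired limit q on FDR
• Order p-values, $p_{(1)} \le p_{(2)} \le \ldots \le p_{(V)}$
• Let r be largest i such that $p_{(i)} \le (i/V) \times q$
• Reject all hypotheses corresponding to $p_{(1)}, \ldots, p_{(r)}$.

[Figure: ordered p-values p(i) plotted against i/V, both axes from 0 to 1, with the line (i/V) × q]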
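The Benjamini & Hochberg steps above translate almost line for line into code. This is a minimal sketch; the function name `benjamini_hochberg` and the example p-values are made up for illustration.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level q."""
    p = np.asarray(pvals, dtype=float)
    V = p.size
    order = np.argsort(p)                       # indices that sort the p-values
    p_sorted = p[order]
    below = p_sorted <= (np.arange(1, V + 1) / V) * q
    reject = np.zeros(V, dtype=bool)
    if below.any():
        r = np.max(np.nonzero(below)[0])        # zero-based position of largest i
        reject[order[: r + 1]] = True           # reject p(1), ..., p(r)
    return reject

# The smallest p-value (0.06) misses its own cutoff (0.05), but the second
# smallest (0.07) is under its cutoff (0.10), so both are rejected.
mask = benjamini_hochberg([0.5, 0.07, 0.9, 0.06], q=0.2)
print(mask.tolist())   # -> [False, True, False, True]
```

The example shows the step-up character of the procedure: rejection is decided by the largest i that passes, not by each p-value against its own cutoff.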

Page 33: Effect Estimation & Testing

34

Example – Working Memory

• fMRI Study of Working Memory
– 12 subjects, block design (Marshuetz et al, 2000)
– Item Recognition
  • Active: View five letters, 2s pause, view probe letter, respond
  • Baseline: View XXXXX, 2s pause, view Y or N, respond
• Second Level RFX
– Difference image, A-B constructed for each subject
– One sample t test

[Figure: Active trial (UBKDA ... D ... yes) and Baseline trial (XXXXX ... N ... no)]

Page 34: Effect Estimation & Testing

35

Example – Working Memory: RFT Result

• Threshold
– S = 110,776
– 2 × 2 × 2 voxels
– 5.1 × 5.8 × 6.9 mm FWHM
– u = 9.870
• Result
– 5 voxels above the threshold

[Figure: -log10 p-value image]

Page 35: Effect Estimation & Testing

36

Example – Working Memory: Non-Parametric Result

• Threshold
– u = 7.67
• Result
– 58 voxels above the threshold

[Figure: -log10 p-value image; Permutation Distribution of Maximum t]

Page 36: Effect Estimation & Testing

37

Example – Working Memory: FDR Result

• FDR Threshold
– u = 3.83
• Result
– 3,073 voxels above threshold

Page 37: Effect Estimation & Testing

38

Conclusions

• Must account for multiple comparisons
• FWER
– Random Field Theory
  • Simple to apply, but heavy on assumptions
– Nonparametric
  • Exact, but requires more computation
• FDR
– More lenient measure of false positives, hence more powerful
– Sociological calibration still underway (5%? 1%? 0.1%?)

Page 38: Effect Estimation & Testing

39

Thanks

• Slide help
– Stefan Kiebel, Rik Henson, JB Poline, Andrew Holmes