Multiple testing

25
Multiple testing Justin Chumbley Laboratory for Social and Neural Systems Research University of Zurich With many thanks for slides & images to: FIL Methods group

description

Multiple testing. Justin Chumbley Laboratory for Social and Neural Systems Research University of Zurich. With many thanks for slides & images to: FIL Methods group. Detect an effect of unknown extent & location. Design matrix. Statistical parametric map (SPM). Image time-series. Kernel. - PowerPoint PPT Presentation

Transcript of Multiple testing

Page 1: Multiple testing

Multiple testing

Justin ChumbleyLaboratory for Social and Neural Systems ResearchUniversity of Zurich

With many thanks for slides & images to:FIL Methods group

Page 2: Multiple testing

Detect an effect of unknown extent & location

Realignment Smoothing

Normalisation

General linear model

Statistical parametric map (SPM)Image time-series

Parameter estimates

Design matrix

Template

Kernel

• Voluminous• Dependent

p <0.05

Statisticalinference

Page 3: Multiple testing

t =

contrast ofestimated

parameters

varianceestimate

t

Error at a single voxel

Page 4: Multiple testing

H0 , H1: zero/non-zero activation

t =

contrast ofestimated

parameters

varianceestimate

t

Error at a single voxel

Page 5: Multiple testing

Decision:H0 , H1: zero/non-zero activation

t =

contrast ofestimated

parameters

varianceestimate

t

h

Error at a single voxel

Page 6: Multiple testing

Decision:H0 , H1: zero/non-zero activation

t =

contrast ofestimated

parameters

varianceestimate

t

h

h

Error at a single voxel

Page 7: Multiple testing

Decision:H0 , H1: zero/non-zero activation

t =

contrast ofestimated

parameters

varianceestimate

t

h

h

Error at a single voxel

Page 8: Multiple testing

Decision:H0 , H1: zero/non-zero activation

t =

contrast ofestimated

parameters

varianceestimate

t

h

Decision rule (threshold) h, determines related error rates ,

Convention: Penalize complexityChoose h to give acceptable under H0

hh

h h

h

Error at a single voxel

Page 9: Multiple testing

Types of error Reality

H1

H0

H0 H1

True negative (TN)

True positive (TP)False positive (FP)

False negative (FN)

specificity: 1- = TN / (TN + FP)= proportion of actual negatives which are correctly identified

sensitivity (power): 1- = TP / (TP + FN)= proportion of actual positives which are correctly identified

h

h

hh

Decision

Page 10: Multiple testing

Multiple tests

t =

contrast ofestimated

parameters

varianceestimate

t

h

ht

h

h

h

t

h

h

What is the problem?

Page 11: Multiple testing

Multiple tests

t =

contrast ofestimated

parameters

varianceestimate

t

h

ht

h

h

h

t

h

h

( 1 )

( )

hp or more FP FWERFPE FDR

All positives

Penalize each independentopportunity for error.

Page 12: Multiple testing

Multiple tests

t =

contrast ofestimated

parameters

varianceestimate

t

h

ht

h

h

h

t

h

h hh

FWERN

h h

Bonferonni

FWER N

Convention: Choose h to limit assuming family-wise H0

hFWER

Page 13: Multiple testing

Issues1. Voxels or regions2. Bonferroni too harsh (insensitive)

• Unnecessary penalty for sampling resolution (#voxels/volume)

• Unnecessary penalty for independence

Page 14: Multiple testing

• intrinsic smoothness– MRI signals are aquired in k-space (Fourier space); after projection on anatomical

space, signals have continuous support– diffusion of vasodilatory molecules has extended spatial support

• extrinsic smoothness– resampling during preprocessing– matched filter theorem

deliberate additional smoothing to increase SNR– Robustness to between-subject anatomical differences

Page 15: Multiple testing

1. Apply high threshold: identify improbably high peaks

2. Apply lower threshold: identify improbably broad peaks

3. Total number of regions?

Acknowledge/estimate dependence Detect effects in smooth landscape, not voxels

Page 16: Multiple testing

Null distribution?1. Simulate null experiments 2. Model null experiments

Page 17: Multiple testing

Use continuous random field theory• image discretised continuous random field

Discretisation(“lattice

approximation”)

Smoothness quantified: resolution elements (‘resels’)• similar, but not identical to # independent observations• computed from spatial derivatives of the residuals

Page 18: Multiple testing

Euler characteristic (h)

– threshold an image at high h

- h # blobs

FWER E [h] = p (blob)

Page 19: Multiple testing

• General form for expected Euler characteristic• 2, F, & t fields

E[h(W)] = Sd Rd (W) rd (h)

Small volumes: Anatomical atlas, ‘functional localisers’, orthogonal contrasts, volume around previously reported coordinates…

Unified Formula

Rd (W): d-dimensional Minkowskifunctional of W

– function of dimension,space W and smoothness:

R0(W) = (W) Euler characteristic ofWR1(W) = resel diameterR2(W) = resel surface areaR3(W) = resel volume

rd (W): d-dimensional EC density of Z(x)– function of dimension and threshold,

specific for RF type:E.g. Gaussian RF:

r0(h) = 1- (h)

r1(h) = (4 ln2)1/2 exp(-h2/2) / (2p)

r2(h) = (4 ln2) exp(-h2/2) / (2p)3/2

r3(h) = (4 ln2)3/2 (h2 -1) exp(-h2/2) / (2p)2

r4(h) = (4 ln2)2 (h3 -3h) exp(-h2/2) / (2p)5/2

W

Page 20: Multiple testing

Euler characteristic (EC) for 2D images

)5.0exp()2)(2log4(ECE 22/3 hhR pR = number of reselsh = threshold

Set h such that E[EC] = 0.05

Example: For 100 resels, E [EC] = 0.049 for a Z threshold of 3.8. That is, the probability of getting one or more blobs where Z is greater than 3.8, is 0.049.

Expected EC values for an image of 100 resels

Page 21: Multiple testing

Spatial extent: similar

Page 22: Multiple testing

Voxel, cluster and set level tests

e

u h

Page 23: Multiple testing
Page 24: Multiple testing

ROI Voxel Field‘volume’ resolution*

volume independence

FWE FDR

*voxels/volume

Height Extent

ROI Voxel Field

Height Extent

There is a multiple testing problem (‘voxel’ or ‘blob’ perspective).More corrections needed as ..

Detect an effect of unknown extent & location

Volume , Independence

Page 25: Multiple testing

Further reading• Friston KJ, Frith CD, Liddle PF, Frackowiak RS. Comparing

functional (PET) images: the assessment of significant change. J Cereb Blood Flow Metab. 1991 Jul;11(4):690-9.

• Genovese CR, Lazar NA, Nichols T. Thresholding of statistical maps in functional neuroimaging using the false discovery rate. Neuroimage. 2002 Apr;15(4):870-8.

• Worsley KJ Marrett S Neelin P Vandal AC Friston KJ Evans AC. A unified statistical approach for determining significant signals in images of cerebral activation. Human Brain Mapping 1996;4:58-73.