Learning and comparing multi-subject models of brain functional connectivity
Gael Varoquaux INSERM/Unicog – INRIA/Parietal – Neurospin
Intrinsic brain structures in on-going activity? (cognitive and systems neuroscience research)
Diagnostic markers in resting-state? (medical applications)
Need population-level models:
Statistical (generative) models + explicit subject variability
In order to:
Accumulate data in a group
Compare subjects
G Varoquaux 2
Outline
1 Spatial modes of ongoing activity
2 Graphical models of brain connectivity
3 Detecting differences in connectivity
1 Spatial modes of ongoing activity
1 Decomposing in spatial modes: a model
$Y = E \cdot S + N$ [figure: matrices laid out as time × voxels]
Decomposing time series into:
covarying spatial maps, S
uncorrelated residuals, N
ICA: minimize mutual information across S
1 ICA on multiple subjects: group ICA
[Calhoun HBM 2001]
Estimate common spatial maps S:
$Y_1 = E_1 \cdot S + N_1$
...
$Y_s = E_s \cdot S + N_s$
Concatenate images, minimize norm of residuals
Corresponds to fixed-effects modeling: i.i.d. residuals $N_s$
1 ICA: Noise model
Observation noise: minimize group residuals (PCA):
$Y_{\text{concat}} = W \cdot B + O$ [figure: matrices laid out as time × voxels]
Learn interesting maps (ICA):
$B = M \cdot S$ [figure: matrices laid out as sources × voxels]
1 CanICA: random effects model
[Varoquaux NeuroImage 2010]
Subject level
Observation noise: minimize subject residuals (PCA):
$Y_s = W_s \cdot P_s + O_s$
Group level
Select signal similar across subjects (CCA):
$\begin{bmatrix} P_1 \\ \vdots \\ P_s \end{bmatrix} = \Lambda \cdot B + R$
Learn interesting maps (ICA):
$B = M \cdot S$
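The first two stages of this pipeline (subject-level PCA, then selection of the signal shared across subjects) can be sketched with NumPy alone. This is a minimal illustration on synthetic data, not the CanICA implementation: it assumes that the CCA step can be approximated by an SVD of the stacked, whitened subject patterns, and it leaves out the final ICA step. All names (`subject_patterns`, `S_true`, the array sizes) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_time, n_voxels, n_comp = 4, 60, 200, 3

# Hypothetical synthetic data: every subject mixes the same spatial maps.
S_true = rng.laplace(size=(n_comp, n_voxels))
subjects = [
    rng.normal(size=(n_time, n_comp)) @ S_true
    + 0.5 * rng.normal(size=(n_time, n_voxels))
    for _ in range(n_subjects)
]


def subject_patterns(Y, k):
    """Subject-level PCA: the k leading whitened spatial patterns P_s."""
    Y = Y - Y.mean(axis=0)                       # remove the temporal mean
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return Vt[:k]                                # (k, n_voxels), orthonormal rows


# Stack the subject patterns: shape (n_subjects * k, n_voxels).
P = np.vstack([subject_patterns(Y, n_comp) for Y in subjects])

# Group level: SVD of the stacked patterns. Each P_s has orthonormal rows,
# so a singular value close to sqrt(n_subjects) marks a spatial direction
# that is present in every subject.
_, singular_values, Vt = np.linalg.svd(P, full_matrices=False)
B = Vt[:n_comp]                                  # group subspace, (k, n_voxels)

# Values near 1 indicate patterns shared by all subjects.
print(singular_values[:n_comp] / np.sqrt(n_subjects))
```

An ICA (for example FastICA) would then be run on `B` to obtain the maps S.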
1 CanICA: experimental validation
[Varoquaux NeuroImage 2010]
Reproducibility across control groups:
no CCA: .36 (.02)
CanICA: .72 (.05)
MELODIC: .51 (.04)
Qualitative observation: fewer 'noise' components
1 Noise in the ICA maps
[Varoquaux ISBI 2010]
How to describe noise versus signal?
Joint distribution:
Blobs standing out = long-tailed distribution
Background noise = isotropic central mode
⇒ Thresholding
1 ICA as a sparse decomposition
[Varoquaux ISBI 2010]
$B = M \cdot (S + Q)$ [figure: matrices laid out as sources × voxels]
Interesting sources S are sparse
Q: Gaussian noise
Thresholding ICA = sparse recovery
Experimental validation on sub-sampled signal: more robust than other approaches
1 The group-level ICA maps
[Varoquaux NeuroImage 2010]
Visual system
map 0, reproducibility: 0.54, V1 (cut coordinates 0, -74, 9)
map 1, reproducibility: 0.52, V1-V2 (cut coordinates 3, -91, -3)
map 3, reproducibility: 0.47, extrastriate (cut coordinates 40, -80, 4)
map 25, reproducibility: 0.34, superior parietal (cut coordinates -30, -78, 24)
Motor system
map 4, reproducibility: 0.47, part of motor (cut coordinates -1, -25, 62)
map 21, reproducibility: 0.36, part of motor (cut coordinates -42, -21, 54)
map 32, reproducibility: 0.30, part of motor (cut coordinates -54, -8, 29)
Frontal structures
map 18, reproducibility: 0.37, frontal (cut coordinates -30, 43, 28)
map 23, reproducibility: 0.35, dorsal medial wall (cut coordinates 0, 10, 54)
map 29, reproducibility: 0.31, pre-frontal (cut coordinates 0, 21, 24)
map 39, reproducibility: 0.26, part of prefronto-insular (cut coordinates -34, 21, -8)
map 37, reproducibility: 0.28, part of prefronto-insular (cut coordinates -42, 15, -3)
ICA extracts a brain parcellation
However:
No overall control of residuals
Does not select for what we interpret
1 Multi-subject dictionary learning
[Varoquaux Inf Proc Med Imag 2011]
[Figure: time series, subject maps, group maps]
Subject level spatial patterns:
$Y_s = U_s V_s^T + E_s, \quad E_s \sim \mathcal{N}(0, \sigma I)$
Group level spatial patterns:
$V_s = V + F_s, \quad F_s \sim \mathcal{N}(0, \zeta I)$
Sparsity and spatial-smoothness prior:
$V \sim \exp(-\xi\, \Omega(V)), \quad \Omega(v) = \|v\|_1 + \frac{1}{2} v^T L v$
1 Multi-subject dictionary learning
[Varoquaux Inf Proc Med Imag 2011]
Estimation: maximum a posteriori
$\operatorname{argmin}_{U_s, V_s, V} \sum_{\text{subjects}} \left( \|Y_s - U_s V_s^T\|_{\text{Fro}}^2 + \mu \|V_s - V\|_{\text{Fro}}^2 \right) + \lambda\, \Omega(V)$
First term: data fit
Second term: subject variability
Third term: penalization, sparse and smooth maps
Alternate optimization on $U_s$, $V_s$, $V$:
Update $U_s$: standard dictionary learning procedure [Mairal2010]
Update $V_s$: ridge regression on $(V_s - V)^T$
Update $V$: proximal operator for $\lambda\Omega$:
$\operatorname{argmin}_v \sum_{s=1}^{S} \frac{1}{2} \|v_s - v\|_2^2 + \gamma\, \Omega(v) = \operatorname{prox}_{\frac{\gamma}{S} \Omega}(\bar{v}), \quad \bar{V} = \operatorname{mean}_s V_s$
Parameter selection
µ: comparing variance (PCA spectrum) at subject and group level
λ: cross-validation
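To make the group-map update concrete, here is a small sketch of the proximal step under a simplifying assumption: the smoothness term of Ω is dropped, so that Ω(v) = ‖v‖₁ and the proximal operator reduces to soft-thresholding of the mean subject map. With the full penalty (ℓ1 plus the Laplacian term) there is no such closed form and an iterative solver would be needed. The function names are hypothetical.

```python
import numpy as np


def soft_threshold(x, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def update_group_map(V_subjects, gamma):
    """Solve argmin_v sum_s 0.5 * ||v_s - v||^2 + gamma * ||v||_1.

    V_subjects has shape (S, n_voxels): one row per subject map v_s.
    Assumption: Omega is the l1 norm only (no smoothness term).
    """
    n_subjects = V_subjects.shape[0]
    v_bar = V_subjects.mean(axis=0)              # mean subject map
    return soft_threshold(v_bar, gamma / n_subjects)


V_subjects = np.array([[1.0, -0.2, 0.5],
                       [3.0, 0.2, 0.1]])
v_group = update_group_map(V_subjects, gamma=1.0)
print(v_group)
```

Voxels whose mean loading across subjects is below γ/S are set exactly to zero, which is what produces sparse group maps.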
1 Multi-subject dictionary learning
[Varoquaux Inf Proc Med Imag 2011]
[Figure: individual maps + atlas of functional regions]
[Figure: multi-subject dictionary learning versus ICA]
[Figure: default mode; basal ganglia]
Spatial modes: from fluctuations to a parcellation
$Y = E \cdot S + N$ [figure: matrices laid out as time × voxels]
Associated time series: E
2 Graphical models of brain connectivity
Modeling the correlations between regions
2 Graphical model for correlation
Specify the probability of observing fMRI data
Multivariate normal: $P(X) \propto \sqrt{|\Sigma^{-1}|}\; e^{-\frac{1}{2} X^T \Sigma^{-1} X}$
Parametrized by inverse covariance matrix $K = \Sigma^{-1}$
Observations: covariance matrix [figure: graph on nodes 0 to 4]
Direct connections: inverse covariance [figure: graph on nodes 0 to 4]
[Smith 2011, Varoquaux NIPS 2010]
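The distinction between covariance (observations) and inverse covariance (direct connections) can be checked on a toy example. The sketch below builds a hypothetical 5-node chain graph 0-1-2-3-4, where only neighbours are directly connected, and shows that the covariance is dense while the inverse covariance keeps the sparse chain structure. The coupling value 0.4 is an arbitrary choice for illustration.

```python
import numpy as np

# Hypothetical chain graph 0-1-2-3-4: only neighbours interact directly.
p = 5
K = np.eye(p)
for i in range(p - 1):
    K[i, i + 1] = K[i + 1, i] = -0.4

Sigma = np.linalg.inv(K)          # covariance implied by the model


def partial_correlations(K):
    """Partial correlation between i and j given all other nodes."""
    d = np.sqrt(np.diag(K))
    P = -K / np.outer(d, d)
    np.fill_diagonal(P, 1.0)
    return P


P = partial_correlations(K)

# Nodes 0 and 4 are correlated through the chain...
print("covariance(0, 4):         ", Sigma[0, 4])
# ...but have no direct connection once the other nodes are accounted for.
print("partial correlation(0, 4):", P[0, 4])
```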
2 Penalized sparse inverse covariance estimation
[Varoquaux NIPS 2010]
Maximum a posteriori: fit models with a prior
$\hat{K} = \operatorname{argmax}_{K \succ 0}\; L(\hat{\Sigma} \mid K) - f(K)$
Standard sparse inverse-covariance estimation:
Prior: many pairs of regions are not connected
Lasso-like problem: $\ell_1$ penalization $f(K) = \sum_{i \neq j} |K_{i,j}|$
Our contribution (with A. Gramfort): population prior:
same independence structure across subjects
⇒ Estimate together all $K^s$ from $\Sigma^s$
Group-lasso (mixed norms): $\ell_{21}$ penalization $f(\{K^s\}) = \lambda \sum_{i \neq j} \sqrt{\sum_s (K^s_{i,j})^2}$
Convex optimization problem
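The effect of the ℓ21 mixed norm is easiest to see through its proximal operator, which shrinks each edge jointly across subjects: an edge is either kept in all subjects or removed in all of them. The sketch below computes the penalty and this group soft-thresholding step. It is only the penalty part, not the full estimator (which also involves the Gaussian likelihood and the positive-definiteness constraint); function names and the toy matrices are hypothetical.

```python
import numpy as np


def l21_penalty(Ks, lam):
    """lam * sum_{i != j} sqrt(sum_s K^s_{ij}^2) for Ks of shape (S, p, p)."""
    norms = np.sqrt((Ks ** 2).sum(axis=0))          # (p, p), norm across subjects
    off_diagonal = ~np.eye(Ks.shape[1], dtype=bool)
    return lam * norms[off_diagonal].sum()


def group_soft_threshold(Ks, t):
    """Proximal operator of the l21 penalty: joint shrinkage of each edge."""
    norms = np.sqrt((Ks ** 2).sum(axis=0))
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    np.fill_diagonal(scale, 1.0)                    # the diagonal is not penalized
    return Ks * scale                               # broadcast over subjects


Ks = np.array([
    [[1.0, 0.3, 0.0], [0.3, 1.0, 0.05], [0.0, 0.05, 1.0]],
    [[1.0, 0.4, 0.0], [0.4, 1.0, 0.0], [0.0, 0.0, 1.0]],
])
penalty = l21_penalty(Ks, lam=1.0)
Ks_shrunk = group_soft_threshold(Ks, t=0.1)
print(penalty)
print(Ks_shrunk)
```

In this toy case the weak edge (1, 2), present in one subject only, is removed for both subjects, while the strong edge (0, 1) is kept in both with reduced amplitude.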
2 Population-sparse graphs perform better
[Varoquaux NIPS 2010]
[Figure: Σ⁻¹, sparse inverse, population prior]
Likelihood of new data (nested cross-validation):
Subject data, Σ⁻¹: -57.1
Subject data, sparse inverse: 43.0
Group average data, Σ⁻¹: 40.6
Group average data, sparse inverse: 41.8
Population prior: 45.6
2 Brain graphs
[Varoquaux NIPS 2010]
[Figure: brain graphs from raw correlations and with the population prior]
2 Graphs of brain function?
Cognitive function arises from the interplay of specialized brain regions:
"The functional segregation of local areas [...] contrasts sharply with their global integration during perception and behavior" [Tononi 1994]
A proposed measure of functional segregation:
Graph modularity = divide in communities to maximize intra-class connections versus extra-class
2 Graph cuts to isolate functional communities
Find communities to maximize modularity:
$Q = \sum_{c=1}^{k} \left[ \frac{A(V_c, V_c)}{A(V, V)} - \left( \frac{A(V, V_c)}{A(V, V)} \right)^2 \right]$
$A(V_a, V_b)$ is the sum of edges going from $V_a$ to $V_b$
Rewrite as an eigenvalue problem [White 2005]
⇒ Spectral clustering = spectral embedding + k-means
Similar to normalized graph cuts
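As a check on the definition, the modularity Q of a given partition can be computed directly from the adjacency matrix. The sketch below only evaluates Q; it does not perform the spectral optimization. The example graph (two triangles joined by one edge) and the function name are hypothetical.

```python
import numpy as np


def modularity(A, labels):
    """Q = sum_c [ A(Vc, Vc) / A(V, V) - (A(V, Vc) / A(V, V))^2 ]."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    total = A.sum()                                  # A(V, V)
    q = 0.0
    for c in np.unique(labels):
        in_c = labels == c
        within = A[np.ix_(in_c, in_c)].sum()         # A(Vc, Vc)
        attached = A[:, in_c].sum()                  # A(V, Vc)
        q += within / total - (attached / total) ** 2
    return q


# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

q_two_communities = modularity(A, [0, 0, 0, 1, 1, 1])
q_single_community = modularity(A, [0, 0, 0, 0, 0, 0])
print(q_two_communities, q_single_community)
```

Splitting the graph along the bridging edge gives a clearly positive Q, whereas putting every node in one community gives Q = 0.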
2 Brain graphs and communities
[Figure: communities in brain graphs from raw correlations and with the population prior]
2 Brain integration between communities
[Varoquaux NIPS 2010]
Proposed measure for functional integration: mutual information (Tononi)
Integration: $I_{c_1} = \frac{1}{2} \log\det(K_{c_1})$
Mutual information: $M_{c_1, c_2} = I_{c_1 \cup c_2} - I_{c_1} - I_{c_2}$
[Figure: integration within and mutual information between communities, with the population prior and with raw correlations. Communities: posterior inferior temporal 1, posterior inferior temporal 2, lateral visual areas, medial visual areas, occipital pole visual areas, default mode network, fronto-parietal networks, fronto-lateral network, pars opercularis, dorsal motor, ventral motor, auditory, basal ganglia, left putamen, cingulo-insular network, right thalamus]
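For Gaussian variables, the mutual information between two groups of regions has a closed form in terms of log-determinants. The sketch below uses the textbook expression written on covariance sub-blocks, MI = ½ (log det Σ_a + log det Σ_b − log det Σ_{a∪b}); this is an assumption about convention, and it may differ from the K-based notation of the slide by how the sub-matrices are defined. The function name and toy matrices are hypothetical.

```python
import numpy as np


def gaussian_mutual_information(Sigma, idx_a, idx_b):
    """Mutual information between two sets of Gaussian variables.

    Sigma is the joint covariance; idx_a and idx_b are disjoint index lists.
    """
    idx_a, idx_b = list(idx_a), list(idx_b)

    def logdet(idx):
        return np.linalg.slogdet(Sigma[np.ix_(idx, idx)])[1]

    return 0.5 * (logdet(idx_a) + logdet(idx_b) - logdet(idx_a + idx_b))


# Two variables with correlation 0.6: MI = -0.5 * log(1 - 0.6 ** 2).
Sigma_pair = np.array([[1.0, 0.6],
                       [0.6, 1.0]])
mi_pair = gaussian_mutual_information(Sigma_pair, [0], [1])

# Two independent blocks: no shared information.
Sigma_blocks = np.array([[1.0, 0.5, 0.0, 0.0],
                         [0.5, 1.0, 0.0, 0.0],
                         [0.0, 0.0, 1.0, 0.2],
                         [0.0, 0.0, 0.2, 1.0]])
mi_blocks = gaussian_mutual_information(Sigma_blocks, [0, 1], [2, 3])
print(mi_pair, mi_blocks)
```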
Map functional connections of individuals in a population
After a stroke, functional connections distant from the lesion are modified
Outcome prognosis in ongoing activity?
3 Detecting differences in connectivity
3 Failure of univariate approach on correlations
Subject variability spread across correlation matrices
[Figure: correlation matrices of three controls and of one patient with a large lesion]
Cannot apply univariate statistics
$d\Sigma = \Sigma_2 - \Sigma_1$ is not positive definite
⇒ Describes impossible observations (negative variance)
In contradiction with Gaussian models:
parameters not independent
Σ does not live in a vector space
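A two-by-two example is enough to see why the plain difference of covariance matrices is problematic. In the sketch below, both matrices are valid covariances (all eigenvalues positive), yet their difference has a negative eigenvalue, so it cannot itself be read as a covariance. The numerical values are arbitrary illustrations.

```python
import numpy as np

# Two valid covariance matrices (hypothetical values).
Sigma1 = np.array([[2.0, 0.5],
                   [0.5, 1.0]])
Sigma2 = np.array([[1.0, -0.3],
                   [-0.3, 2.0]])

d_Sigma = Sigma2 - Sigma1

eig1 = np.linalg.eigvalsh(Sigma1)
eig2 = np.linalg.eigvalsh(Sigma2)
eig_diff = np.linalg.eigvalsh(d_Sigma)

print("eigenvalues of Sigma1:         ", eig1)
print("eigenvalues of Sigma2:         ", eig2)
print("eigenvalues of Sigma2 - Sigma1:", eig_diff)
```

The negative eigenvalue of the difference corresponds to a direction with "negative variance", which no Gaussian model can produce.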
3 Simulation on a toy problem
Simulate two processes with different inverse covariance
[Figure: K1, K1 − K2, Σ1, Σ1 − Σ2]
Add jitter in observed covariance... sample
[Figure: MSE(K1 − K2), MSE(Σ1 − Σ2)]
Non-local effects and non-homogeneous noise
3 Theoretical settings: comparison of estimates
Observations in 2 populations: $X_1$ and $X_2$
Goal: comparing estimates: $\hat{\theta}(X_1)$ and $\hat{\theta}(X_2)$
Asymptotic normality: $\hat{\theta}(X_1) \sim \mathcal{N}\left(\theta_1, I(\theta_1)^{-1}\right)$
[Figure: manifold of models, with $\theta^1$, $\theta^2$ and their error ellipses $I(\theta^1)^{-1}$, $I(\theta^2)^{-1}$]
[Rao 1945] Fisher information I defines a metric on the manifold of models.
We use it to choose a global parametrization for comparisons
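3 Covariance manifold: $\mathrm{Sym}^+_n$
Metric tensor (Fisher information) [Lenglet 2006]:
$\langle d\Sigma_1, d\Sigma_2 \rangle_{\Sigma} = \frac{1}{2} \operatorname{trace}(\Sigma^{-1}\, d\Sigma_1\, \Sigma^{-1}\, d\Sigma_2)$
Nice properties of the $\mathrm{Sym}^+_n$ manifold (Lie group):
metric can be fully integrated, gives rise to global mapping to a vector space (logarithmic map).
$\|\Sigma_1, \Sigma_2\|^2_{\Sigma_1} = \left\| \log\left( \Sigma_1^{-\frac{1}{2}} \Sigma_2 \Sigma_1^{-\frac{1}{2}} \right) \right\|^2$
Locally: $\|\Sigma_1, \Sigma_2\|_{\Sigma_1} \propto \left| \operatorname{trace}\left( \Sigma_1^{-\frac{1}{2}} \Sigma_2 \Sigma_1^{-\frac{1}{2}} \right) - p \right| = \|d\Sigma\|_{\text{Fro}}$
where $d\Sigma = \Sigma_1^{-1/2}\, \Sigma_2\, \Sigma_1^{-1/2}$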
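The distance induced by the logarithmic map can be computed with eigendecompositions only. The sketch below is a minimal NumPy version of ‖log(Σ₁^{-1/2} Σ₂ Σ₁^{-1/2})‖_Fro; the helper names (`_sym_apply`, `whiten`, `riemannian_distance`) are hypothetical, and no claim is made that this is how the talk's software computes it.

```python
import numpy as np


def _sym_apply(S, fn):
    """Apply a scalar function to a symmetric matrix through its eigenvalues."""
    w, U = np.linalg.eigh(S)
    return (U * fn(w)) @ U.T


def whiten(Sigma_ref, Sigma):
    """Sigma_ref^{-1/2} Sigma Sigma_ref^{-1/2}."""
    inv_sqrt = _sym_apply(Sigma_ref, lambda w: 1.0 / np.sqrt(w))
    return inv_sqrt @ Sigma @ inv_sqrt


def riemannian_distance(Sigma1, Sigma2):
    """Frobenius norm of the matrix logarithm of the whitened Sigma2."""
    w = np.linalg.eigvalsh(whiten(Sigma1, Sigma2))
    return np.sqrt(np.sum(np.log(w) ** 2))


Sigma1 = np.array([[2.0, 0.5], [0.5, 1.0]])
Sigma2 = np.array([[1.0, -0.3], [-0.3, 2.0]])

d12 = riemannian_distance(Sigma1, Sigma2)
d21 = riemannian_distance(Sigma2, Sigma1)

# The distance is unchanged by any invertible change of coordinates A.
A = np.array([[1.0, 2.0], [0.0, 3.0]])
d_transformed = riemannian_distance(A @ Sigma1 @ A.T, A @ Sigma2 @ A.T)
print(d12, d21, d_transformed)
```

Unlike the Euclidean difference, this distance is symmetric and invariant to linear re-mixing of the regions' signals.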
3 Reparametrization for uniform error geometry
Logarithmic mapping:
$\Sigma_1 \in \mathrm{Sym}^+_n,\ \Sigma_2 \in \mathrm{Sym}^+_n \;\rightarrow\; \overrightarrow{\Sigma_1 \Sigma_2} \in \mathbb{R}^{\frac{1}{2} p (p-1)}$
$d(\Sigma_1, \Sigma_2) = \|\overrightarrow{\Sigma_1 \Sigma_2}\|_2$
[Figure: controls and patient on the manifold; dΣ in the tangent space]
3 Statistics...
Do intrinsic statistics on the parameterization:
Mean (Fréchet mean)
PDF
Parameter-level hypothesis testing
3 Random effects on the covariance manifold
[Varoquaux MICCAI 2010]
Population-level covariance distribution
Generalized isotropic normal distribution:
$p(\Sigma) = k(\sigma) \exp\left( -\frac{1}{2\sigma^2} \|\overrightarrow{\Sigma^\star \Sigma}\|^2_{\Sigma^\star} \right)$ (1)
Population mean:
$\Sigma^\star = \operatorname{argmin}_{\Sigma} \sum_i \|\overrightarrow{\Sigma \Sigma_i}\|^2_{\Sigma}$ (2)
Efficient gradient descent algorithm
Principled computation of:
group mean $\Sigma^\star$ and spread σ
likelihood of new data
Edge-level statistics
Under null hypothesis: subject ∈ group model (1)
$\overrightarrow{d\Sigma} \sim \mathcal{N}(0, \sigma I)$: independent coefficients
⇒ Univariate statistics on $d\Sigma_{i,j}$
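The population mean of equation (2) has no closed form in general, but a simple fixed-point iteration converges to it: map all matrices to the tangent space at the current estimate, average them there, and map the average back to the manifold. The sketch below is one standard way to do this with NumPy; it is an illustrative assumption, not necessarily the gradient descent used in the talk, and the function names are hypothetical.

```python
import numpy as np


def _sym_apply(S, fn):
    """Apply a scalar function to a symmetric matrix through its eigenvalues."""
    w, U = np.linalg.eigh(S)
    return (U * fn(w)) @ U.T


def frechet_mean(covs, n_iter=50, tol=1e-10):
    """Fixed-point iteration for the mean of covariances on the manifold."""
    mean = np.mean(covs, axis=0)                 # Euclidean mean as a start
    for _ in range(n_iter):
        sqrt = _sym_apply(mean, np.sqrt)
        inv_sqrt = _sym_apply(mean, lambda w: 1.0 / np.sqrt(w))
        # Log-map every matrix to the tangent space at the current mean,
        # then average the tangent vectors.
        tangent = np.mean(
            [_sym_apply(inv_sqrt @ C @ inv_sqrt, np.log) for C in covs],
            axis=0,
        )
        # Exp-map the average back to the manifold.
        mean = sqrt @ _sym_apply(tangent, np.exp) @ sqrt
        mean = 0.5 * (mean + mean.T)             # guard against round-off
        if np.linalg.norm(tangent) < tol:
            break
    return mean


# For commuting matrices the result is the element-wise geometric mean.
covs = [np.diag([1.0, 4.0]), np.diag([4.0, 1.0])]
mean_diag = frechet_mean(covs)

# A non-commuting example (hypothetical values).
mean_general = frechet_mean([
    np.array([[2.0, 0.5], [0.5, 1.0]]),
    np.array([[1.0, -0.3], [-0.3, 2.0]]),
])
print(mean_diag)
print(mean_general)
```

Once the mean is available, each subject's tangent vector at the mean gives the coefficients dΣ on which edge-level univariate statistics can be run.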
3 Discriminating stroke patients from controls
20 controls, 10 stroke patients, all different
(with A. Kleinschmidt, F. Baronnet)
3 Discriminating stroke patients from controls
Leave one out likelihood
[Figure: log-likelihood of controls and patients, in $\mathbb{R}^{n \times n}$ and in tangent space]
Probabilistic model on manifold discriminates patients better
3 Residuals
[Figure: correlation matrices Σ (color scale -1.0 to 1.0) and residuals dΣ (color scale -1.0 to 1.0) for three controls and one patient with a large lesion]
3 Number of edge-level differences detected
[Figure: number of detections (0 to 10) per patient (patients 1 to 10), in tangent space and in $\mathbb{R}^{n \times n}$]
p-value: $5 \cdot 10^{-2}$, Bonferroni-corrected
3 Post-stroke covariance modifications
p-value: $5 \cdot 10^{-2}$, Bonferroni-corrected
Thanks
B. Thirion, J.B. Poline, A. Kleinschmidt
Resting state analysis: S. Sadaghiani
Dictionary learning: F. Bach, R. Jenatton
Sparse inverse covariance: A. Gramfort
Strokes: F. Baronnet
Matrix-variate MFX: P. Fillard
Software: in Python
scikit-learn: machine learning (F. Pedregosa, O. Grisel, M. Blondel . . .)
Mayavi: 3D plotting (P. Ramachandran)
Multi-subject functional connectivity mapping
A consistent full-brain model:
Probabilistic generative model
With explicit inter-subject variability
Suitable for inference
$Y = E \cdot S + N$
Population-level data analysis:
Functional atlases
Large-scale graphical models
Inter-subject discrimination
Bibliography
[Varoquaux NeuroImage 2010] G. Varoquaux, S. Sadaghiani, P. Pinel, A. Kleinschmidt, J.B. Poline, B. Thirion, A group model for stable multi-subject ICA on fMRI datasets, NeuroImage 51 p. 288 (2010) http://hal.inria.fr/hal-00489507/en
[Varoquaux MICCAI 2010] G. Varoquaux, F. Baronnet, A. Kleinschmidt, P. Fillard and B. Thirion, Detection of brain functional-connectivity difference in post-stroke patients using group-level covariance modeling, MICCAI (2010) http://hal.inria.fr/inria-00512417/en
[Varoquaux NIPS 2010] G. Varoquaux, A. Gramfort, J.B. Poline and B. Thirion, Brain covariance selection: better individual functional connectivity models using population prior, NIPS (2010) http://hal.inria.fr/inria-00512451/en
[Varoquaux IPMI 2011] G. Varoquaux, A. Gramfort, F. Pedregosa, V. Michel, and B. Thirion, Multi-subject dictionary learning to segment an atlas of brain spontaneous activity, Information Processing in Medical Imaging p. 562 (2011) http://hal.inria.fr/inria-00588898/en
[Ramachandran 2011] P. Ramachandran, G. Varoquaux, Mayavi: 3d visualization of scientific data, Computing in Science & Engineering 13 p. 40 (2011) http://hal.inria.fr/inria-00528985/en