Estimation of causal direction in the presence of latent...
![Page 1: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/1.jpg)
Shohei Shimizu
Osaka University, Japan
with
Kenneth Bollen
University of North Carolina, Chapel Hill, USA
Estimation of causal direction
in the presence of latent confounders
and linear non-Gaussian SEMs
Causal Modeling and Machine Learning
Beijing, China, June 2014
![Page 2: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/2.jpg)
Abstract
• Estimation of causal direction of two
observed variables in the presence of
latent confounders
• A key challenge in causal discovery
• We propose a non-Gaussian method
• It does not require specifying the number of latent
confounders
• Experiments on artificial and sociology data
![Page 3: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/3.jpg)
Background
![Page 4: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/4.jpg)
Motivation
• Causality is a main interest in many empirical sciences
• Many recent methods for estimating causal directions (with no temporal information)
– Linear non-Gaussian model (Dodge & Rousson, 2001; Shimizu et al., 2006)
– Nonlinear model (Hoyer et al., 2009; Zhang & Hyvarinen, 2009; Peters et al., 2011)
• Another important challenge: Latent confounders
[Figure: Sleep problems → Depression mood, or Sleep problems ← Depression mood? Which is dominant? Example from epidemiology (Rosenstrom et al., 2012)]
![Page 5: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/5.jpg)
Structural equation modeling
(SEM) (Bollen, 1989; Pearl, 2000, 2009)
• A framework for describing causal relations
• An example (of linear cases):
– The value of 𝑥2 is determined by the values of 𝑥1 and the
error/exogenous variable 𝑒2 through the linear function
$$x_2 := f(x_1, e_2) = b_{21} x_1 + e_2$$
• Generally speaking, if the value of 𝑥1 is changed
and that of 𝑥2 also changes, then 𝑥1 causes 𝑥2
[Figure: path diagram x1 → x2 ← e2]
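The interventional reading of the structural equation can be sketched in a few lines. This is only an illustration under assumed values: the coefficient `b21 = 0.8`, the uniform error, and the function names are hypothetical choices, not taken from the slides.

```python
import random

def mean(values):
    return sum(values) / len(values)

def simulate_x2(x1_value, b21=0.8, n=10_000, seed=0):
    """Draw x2 := b21 * x1 + e2 while x1 is held fixed by intervention."""
    rng = random.Random(seed)
    return [b21 * x1_value + rng.uniform(-1.0, 1.0) for _ in range(n)]

def simulate_x1(x2_value, n=10_000, seed=1):
    """x1 is exogenous here: its equation does not involve x2,
    so the value forced on x2 is deliberately ignored."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

# Changing x1 from 0 to 1 shifts the mean of x2 by b21.
effect_of_x1_on_x2 = mean(simulate_x2(1.0)) - mean(simulate_x2(0.0))

# Changing x2 from 0 to 1 leaves x1 unchanged.
effect_of_x2_on_x1 = mean(simulate_x1(1.0)) - mean(simulate_x1(0.0))
```

Under these assumptions the first difference equals the coefficient and the second is zero, which is the asymmetry that "x1 causes x2" refers to.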
![Page 6: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/6.jpg)
Major challenges
1. Estimation of causal direction when temporal information is not available
[Figure: x1 → x2 or x1 ← x2?]
2. Coping with latent confounders
[Figure: x1 → x2 or x1 ← x2, with a latent confounder f1 affecting both x1 and x2?]
![Page 7: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/7.jpg)
Non-Gaussian approach: LiNGAM
(Linear Non-Gaussian Acyclic Model) (Shimizu et al., 2006)
• Acyclic SEMs with different directions are distinguishable (Dodge & Rousson, 2001; Shimizu et al., 2006)
Model 1:
$$x_1 = e_1, \qquad x_2 = b_{21} x_1 + e_2$$
or
Model 2:
$$x_1 = b_{12} x_2 + e_1, \qquad x_2 = e_2$$
where $e_1$ and $e_2$ are error/exogenous variables
• Fundamental assumptions:
– e1 and e2 are non-Gaussian
– Independence between e1 and e2 (no latent confounders)
[Figure: path diagrams for Model 1 (x1 → x2) and Model 2 (x1 ← x2), each with errors e1 and e2]
![Page 8: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/8.jpg)
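Different directions give different data distributions
Model 1:
$$x_1 = e_1, \qquad x_2 = 0.8\, x_1 + e_2$$
Model 2:
$$x_1 = 0.8\, x_2 + e_1, \qquad x_2 = e_2$$
with $E(e_1) = E(e_2) = 0$ and $\mathrm{var}(x_1) = \mathrm{var}(x_2) = 1$
[Figure: scatter plots of x1 against x2 under Model 1 and Model 2, for Gaussian errors and for non-Gaussian (uniform) errors]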
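A small simulation can make the contrast concrete. The sketch below draws data from Model 1 with coefficient 0.8 and unit variances, fits a least-squares regression in each direction, and measures how strongly the residual still depends on the regressor. The dependence measure (correlation between squared residual and squared regressor) is an illustrative assumption, not the estimator used in the cited papers, and all function names are hypothetical.

```python
import math
import random

def corr(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def residual_dependence(regressor, target):
    """Least-squares fit of target on regressor, then the correlation
    between the squared residual and the squared (centered) regressor."""
    n = len(regressor)
    mr, mt = sum(regressor) / n, sum(target) / n
    slope = (sum((r - mr) * (t - mt) for r, t in zip(regressor, target))
             / sum((r - mr) ** 2 for r in regressor))
    resid = [(t - mt) - slope * (r - mr) for r, t in zip(regressor, target)]
    return corr([e * e for e in resid], [(r - mr) ** 2 for r in regressor])

def uniform(rng, sd):
    half_width = sd * math.sqrt(3.0)  # uniform on [-w, w] has sd w / sqrt(3)
    return rng.uniform(-half_width, half_width)

def gaussian(rng, sd):
    return rng.gauss(0.0, sd)

def simulate_model1(draw, n=5000, seed=0):
    """x1 = e1, x2 = 0.8 * x1 + e2, scaled so var(x1) = var(x2) = 1."""
    rng = random.Random(seed)
    x1 = [draw(rng, 1.0) for _ in range(n)]
    x2 = [0.8 * a + draw(rng, 0.6) for a in x1]  # var(e2) = 1 - 0.64 = 0.36
    return x1, x2

x1_u, x2_u = simulate_model1(uniform)
forward_uniform = residual_dependence(x1_u, x2_u)    # correct direction
backward_uniform = residual_dependence(x2_u, x1_u)   # wrong direction

x1_g, x2_g = simulate_model1(gaussian)
forward_gaussian = residual_dependence(x1_g, x2_g)
backward_gaussian = residual_dependence(x2_g, x1_g)
```

With uniform errors, only the correct direction leaves a residual that looks independent of the regressor. With Gaussian errors both directions look equally good, so the direction cannot be told apart.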
![Page 9: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/9.jpg)
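LiNGAM with latent confounders (Hoyer, Shimizu & Kerminen, 2008)
• Extension to incorporate non-Gaussian latent
confounders
$$x_i = \mu_i + \sum_{q=1}^{Q} \lambda_{iq} f_q + \sum_{j \neq i} b_{ij} x_j + e_i$$
where, without loss of generality (WLG), the latent confounders $f_q$ $(q = 1, \ldots, Q)$ are independent
• Two-variable case:
$$x_1 = \mu_1 + \sum_{q=1}^{Q} \lambda_{1q} f_q + e_1$$
$$x_2 = \mu_2 + \sum_{q=1}^{Q} \lambda_{2q} f_q + b_{21} x_1 + e_2$$
[Figure: path diagram x1 → x2 with latent confounders f1, f2 and errors e1, e2]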
![Page 10: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/10.jpg)
Previous estimation approaches
• Explicitly model latent confounders and
compare two models with opposite directions of
causation
– Maximum likelihood principle (Hoyer et al., 2008)
– Bayesian model selection (Henao & Winther, 2011)
– Laplace / finite mixture of Gaussians for $p(e_i)$
• Require specifying the number of latent
confounders, which is difficult in general
[Figure: x1 → x2 or x1 ← x2, each with latent confounders f1, …, fQ and errors e1, e2]
![Page 11: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/11.jpg)
Our proposal
Reference:
Shimizu and Bollen (2014)
Journal of Machine Learning Research
In press
![Page 12: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/12.jpg)
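Key idea (1/2)
• Another look at the LiNGAM with latent confounders:
m-th obs.:
$$x_2^{(m)} = \mu_2 + \sum_{q=1}^{Q} \lambda_{2q} f_q^{(m)} + b_{21} x_1^{(m)} + e_2^{(m)}$$
Observations are generated from the LiNGAM
model with possibly different intercepts $\mu_2 + \mu_2^{(m)}$, where $\mu_2^{(m)} = \sum_{q=1}^{Q} \lambda_{2q} f_q^{(m)}$
[Figure: path diagram x1 → x2 with latent confounders f1, …, fQ and errors e1, e2, unrolled per observation: $x_1^{(1)} \to x_2^{(1)}$, …, $x_1^{(m)} \to x_2^{(m)}$, each with the same coefficient $b_{21}$, its own errors $e_1^{(m)}$, $e_2^{(m)}$, and its own intercept $\mu_2 + \mu_2^{(m)}$]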
![Page 13: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/13.jpg)
Key idea (2/2)
• Include the sums of latent confounders as
the observation-specific intercepts:
m-th obs.:
$$x_2^{(m)} = \mu_2 + \underbrace{\sum_{q=1}^{Q} \lambda_{2q} f_q^{(m)}}_{\text{obs.-specific intercept } \mu_2^{(m)}} + b_{21} x_1^{(m)} + e_2^{(m)}$$
• Latent confounders are not modeled explicitly
• It is necessary neither to specify the number
of latent confounders Q nor to estimate the
coefficients $\lambda_{2q}$
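To see why folding the confounder sum into an observation-specific intercept helps, the sketch below simulates the two-variable model with Q = 3 Laplace confounders and compares two least-squares slopes: a naive one, and one computed after removing each observation's confounder sum from x2. All numeric values, the zero common intercepts, and the function names are assumptions made for illustration. The true confounder sums are used only because the data are simulated; in the approach on the slides they are unobserved and receive a prior instead.

```python
import math
import random

def laplace(rng, sd=1.0):
    """Zero-mean Laplace draw with the given standard deviation."""
    scale = sd / math.sqrt(2.0)
    return rng.choice((-1.0, 1.0)) * rng.expovariate(1.0 / scale)

def ols_slope(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

def simulate(n=5000, b21=0.8, lam1=(1.0, 1.0, 1.0), lam2=(1.0, 1.0, 1.0), seed=0):
    """Two-variable LiNGAM with len(lam1) latent confounders, common intercepts 0."""
    rng = random.Random(seed)
    x1, x2, intercept2 = [], [], []
    for _ in range(n):
        f = [laplace(rng) for _ in lam1]                    # latent confounders
        sum1 = sum(l * fq for l, fq in zip(lam1, f))
        sum2 = sum(l * fq for l, fq in zip(lam2, f))        # obs.-specific intercept of x2
        a = sum1 + laplace(rng)                             # x1 for this observation
        b = sum2 + b21 * a + laplace(rng)                   # x2 for this observation
        x1.append(a)
        x2.append(b)
        intercept2.append(sum2)
    return x1, x2, intercept2

x1, x2, intercept2 = simulate()

# Ignoring the confounders inflates the estimated effect of x1 on x2.
naive_slope = ols_slope(x1, x2)

# Once each observation's intercept is accounted for, the slope is close to b21,
# and neither Q nor the individual coefficients were needed, only the sums.
adjusted_slope = ols_slope(x1, [b - m for b, m in zip(x2, intercept2)])
```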
![Page 14: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/14.jpg)
Our approach
• Compare these two LiNGAM models with opposite
directions:
Model 3 (x1 → x2):
$$x_1^{(m)} = \mu_1 + \mu_1^{(m)} + e_1^{(m)}$$
$$x_2^{(m)} = \mu_2 + \mu_2^{(m)} + b_{21} x_1^{(m)} + e_2^{(m)}$$
Model 4 (x1 ← x2):
$$x_1^{(m)} = \mu_1 + \mu_1^{(m)} + b_{12} x_2^{(m)} + e_1^{(m)}$$
$$x_2^{(m)} = \mu_2 + \mu_2^{(m)} + e_2^{(m)}$$
• Many additional parameters: $\mu_i^{(m)}$ $(i = 1, 2;\ m = 1, \ldots, n)$
• Prior for the observation-specific intercepts $\mu_i^{(m)}$
• Other parameters: low-informative priors (Gaussian with large sd)
• Bayesian model selection (marginal likelihoods)
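The final comparison step can be sketched as follows, assuming the log marginal likelihood of each model has already been computed by some other routine. The numbers below are made up, the equal prior weight on the two models is an assumption, and `posterior_model_probs` is a hypothetical helper, not part of any published implementation.

```python
import math

def posterior_model_probs(log_marginals, log_priors=None):
    """Turn log marginal likelihoods into posterior model probabilities."""
    if log_priors is None:
        log_priors = [0.0] * len(log_marginals)  # equal prior weight (assumption)
    scores = [lm + lp for lm, lp in zip(log_marginals, log_priors)]
    top = max(scores)                            # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log marginal likelihoods for the two candidate directions.
log_ml_x1_to_x2 = -1412.7
log_ml_x2_to_x1 = -1415.9

p_forward, p_backward = posterior_model_probs([log_ml_x1_to_x2, log_ml_x2_to_x1])
chosen = "x1 -> x2" if p_forward > p_backward else "x1 <- x2"
```

Working in log space avoids underflow, since marginal likelihoods of a few hundred observations are far too small to exponentiate directly.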
![Page 15: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/15.jpg)
Prior for the observation-specific
intercepts
$$\mu_1^{(m)} = \sum_{q=1}^{Q} \lambda_{1q} f_q^{(m)}, \qquad \mu_2^{(m)} = \sum_{q=1}^{Q} \lambda_{2q} f_q^{(m)}$$
• Motivation: Central limit theorem
– Sums of independent variables tend to be more Gaussian
• Approximate the density by a bell-shaped curve distribution:
$$\left(\mu_1^{(m)}, \mu_2^{(m)}\right)^{T} \sim \text{t-distribution with sd } \tau_1, \tau_2, \text{ correlation } \gamma_{12}, \text{ and DOF } \nu$$
• Select the hyper-parameter values that maximize the
marginal likelihood: Empirical Bayes
– $\tau_l \in \{0,\ 0.2 \times \mathrm{sd}(x_l),\ \ldots,\ 1.0 \times \mathrm{sd}(x_l)\}$, $\gamma_{12} \in \{0,\ 0.1,\ \ldots,\ 0.9\}$
– DOF $\nu$ fixed to be 6 in the experiments below
• Small $\tau_l$ means similar intercepts
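The central-limit motivation can be checked numerically. The sketch below sums Q independent uniform confounders with unit coefficients (both choices are assumptions for illustration) and reports the excess kurtosis of the sum, which is zero for a Gaussian.

```python
import random

def excess_kurtosis(values):
    """Sample excess kurtosis: fourth standardized moment minus 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0

def confounder_sum(num_confounders, n=20_000, seed=0):
    """Draw n values of f_1 + ... + f_Q with independent uniform f_q."""
    rng = random.Random(seed)
    return [sum(rng.uniform(-1.0, 1.0) for _ in range(num_confounders))
            for _ in range(n)]

kurt_q1 = excess_kurtosis(confounder_sum(1))  # a single uniform: clearly non-Gaussian
kurt_q6 = excess_kurtosis(confounder_sum(6))  # a sum of six: much closer to Gaussian
```

The excess kurtosis shrinks toward zero as Q grows, which is why a bell-shaped prior is a reasonable approximation for the intercepts even though each confounder is non-Gaussian.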
![Page 16: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/16.jpg)
Experiments on artificial data
![Page 17: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/17.jpg)
Experimental results (100 obs.)
• Data generated from LiNGAM with latent confounders
• Various non-Gaussian distributions
– Laplace, Uniform, asymmetric dist. etc.
• Our method uses Laplace for $p(e_i)$
[Figure: path diagram x1 → x2 with latent confounders f1, …, fQ]
Numbers of successful discoveries (100 rep.)
[Figure: two bar charts, one for N. latent confounders = 1 and one for N. latent confounders = 6; y-axis from 0 to 100; bars for Our method, Henao: 1, 4, 10 conf., and Hoyer: 1, 4 conf. Bar values as extracted: 80, 72, 58, 47, 34, 39 and 86, 55, 58, 55, 51, 54]
![Page 18: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/18.jpg)
Experiment on sociology data
![Page 19: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/19.jpg)
Sociology data
• Source: General Social Survey (n=1380)
– Non-farm background, ages 35-44, white,
male, in the labor force, no missing data for
any of the covariates, 1972-2006
[Figure: Status attainment model (Duncan et al., 1972); x2: Son’s Income]
![Page 20: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/20.jpg)
Evaluation of our method
using the sociology data
Known (temporal) orderings of 15 pairs
[Figure: example pairs with known temporal orderings: Father’s Education and Son’s Education; Father’s Education and Son’s Income; Son’s Occupation and Son’s Income; ……]
![Page 21: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/21.jpg)
Conclusions
![Page 22: Estimation of causal direction in the presence of latent ...people.tuebingen.mpg.de/causal-learning/slides/CaMaL_Shimizu.pdf · • Estimation of causal direction in the presence](https://reader033.fdocuments.in/reader033/viewer/2022042920/5f64c1530075480c2d320014/html5/thumbnails/22.jpg)
Conclusions
• Estimation of causal direction in the presence of
latent confounders is a major challenge in
causal discovery
• Our proposal: Fit linear non-Gaussian SEM with
possibly different intercepts to data
• Future work
– Test other informative priors for observation-specific
intercepts
– Implement a wider variety of error/prior distributions
(e.g., learn DOF of t dist.)
– Develop extensions using nonlinear/cyclic models (Hoyer et al., 2009; Zhang & Hyvarinen, 2009; Lacerda et al., 2008)
instead of LiNGAM