Challenges posed by Structural Equation Models
Thomas Richardson
Department of Statistics
University of Washington
Joint work with Mathias Drton (UC Berkeley) and Peter Spirtes (CMU)
Overview
Challenges for Likelihood Inference
Problems in Model Selection and Interpretation
Partial Solution
  sub-class of path diagrams: ancestral graphs
Problems for Likelihood Inference
Likelihood may be multimodal
  e.g. the bivariate Gaussian Seemingly Unrelated Regression (SUR) model:
  [Figure: path diagram over X1, X2, Y1, Y2]
  may have up to 3 local maxima.
Consistent starting value does not guarantee iterative procedures will find the MLE.
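One way to look for several modes numerically is to profile out the error covariance and scan the remaining two coefficients on a grid. The following is a minimal sketch, assuming the standard bivariate SUR parameterization y1 = beta1*x1 + e1, y2 = beta2*x2 + e2 with jointly Gaussian errors (the slide does not spell out the parameterization). The function names `sur_profile_loglik` and `grid_local_maxima` are hypothetical helpers introduced here, not part of any package.

```python
import math


def sur_profile_loglik(beta1, beta2, x1, x2, y1, y2):
    """Profile log-likelihood of an assumed bivariate SUR model.

    Model (assumption): y1 = beta1*x1 + e1, y2 = beta2*x2 + e2,
    (e1, e2) ~ N(0, Sigma). For fixed (beta1, beta2) the maximizing
    Sigma is the residual covariance S, which gives
    -n/2 * log det(S) - n * (1 + log(2*pi)).
    """
    n = len(y1)
    r1 = [y - beta1 * x for x, y in zip(x1, y1)]
    r2 = [y - beta2 * x for x, y in zip(x2, y2)]
    s11 = sum(a * a for a in r1) / n
    s22 = sum(b * b for b in r2) / n
    s12 = sum(a * b for a, b in zip(r1, r2)) / n
    det = s11 * s22 - s12 * s12
    if det <= 0.0:
        raise ValueError("residual covariance is singular at these coefficients")
    return -0.5 * n * math.log(det) - n * (1.0 + math.log(2.0 * math.pi))


def grid_local_maxima(f, grid1, grid2):
    """Return interior grid points whose value strictly exceeds all 8 neighbours."""
    vals = [[f(b1, b2) for b2 in grid2] for b1 in grid1]
    found = []
    for i in range(1, len(grid1) - 1):
        for j in range(1, len(grid2) - 1):
            neighbours = [
                vals[i + di][j + dj]
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
            ]
            if vals[i][j] > max(neighbours):
                found.append((grid1[i], grid2[j]))
    return found
```

Given data vectors, `grid_local_maxima(lambda b1, b2: sur_profile_loglik(b1, b2, x1, x2, y1, y2), grid, grid)` lists candidate modes. A grid scan is only a rough diagnostic: it can miss modes between grid points and says nothing about which mode an iterative fitting procedure will reach.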
Problems for Likelihood Inference
Discrete latent variable models are not curved exponential families
  [Figure: ternary latent class variable C; binary observed variables X1, X2, X3, X4]
  15 parameters in saturated model
  14 model parameters
  BUT model has 2 d.f. (Goodman)
Usual asymptotics may not apply
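The two parameter counts on this slide follow from simple arithmetic, sketched below. The helper names are hypothetical. The counting rule for the latent class model (class probabilities plus, within each class, one free probability per binary variable) is the usual one and is assumed here rather than taken from the slide.

```python
def saturated_params(levels):
    """Free parameters of the saturated multinomial model: number of cells minus 1."""
    cells = 1
    for k in levels:
        cells *= k
    return cells - 1


def latent_class_params(levels, n_classes):
    """Free parameters of a latent class model with conditionally independent items.

    (n_classes - 1) class probabilities, plus (k - 1) free probabilities
    per observed variable with k levels, within each class.
    """
    return (n_classes - 1) + n_classes * sum(k - 1 for k in levels)


observed_levels = [2, 2, 2, 2]   # four binary observed variables X1..X4
print(saturated_params(observed_levels))        # 2*2*2*2 - 1
print(latent_class_params(observed_levels, 3))  # (3 - 1) + 3 * 4
```

Naive subtraction of the two counts would suggest 15 - 14 = 1 degree of freedom, whereas the slide reports 2 d.f. (Goodman); the mismatch is the point of the example.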
Problems for Likelihood Inference
Likelihood may be highly multimodal in the asymptotic limit
  After accounting for label switching/aliasing
  [Figure: latent class variable C; observed variables X1, X2, X3, X4]
Why report one mode?
d.f. may vary as a function of model parameters
Problems for Model Selection
SEM models with latent variables are not curved exponential families
Standard χ² asymptotics do not necessarily apply
  e.g. for LRTs
Model selection criteria such as BIC are not asymptotically consistent
The effective degrees of freedom may vary depending on the values of the model parameters
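For reference, the two quantities at issue can be written as follows, in generic notation that is not taken from the slides: ℓ is the maximized log-likelihood, d the number of free parameters, n the sample size, and subscripts 0 and 1 mark the null and alternative models. Both the χ² reference distribution and the BIC penalty presuppose a fixed, well-defined model dimension d, which is what fails when the effective degrees of freedom vary with the parameter values.

```latex
\begin{align*}
\lambda_n &= 2\bigl\{\ell(\hat\theta_1) - \ell(\hat\theta_0)\bigr\}
  \;\xrightarrow{d}\; \chi^2_{d_1 - d_0}
  \quad \text{(regular nested models)}, \\
\mathrm{BIC} &= -2\,\ell(\hat\theta) + d \log n .
\end{align*}
```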
Problems for Model Selection
Many models may be equivalent:
  [Figure: four path diagrams over X1, X2, Y1, Y2]
Problems for Model Selection
Models with different numbers of latents may be equivalent:
  e.g. unrestricted error covariance within blocks
  [Figure: two path diagrams over X1, ..., Xp and Y1, ..., Yq]
  Wegelin & Richardson (2001)
Two scenarios
A single SEM model is proposed and fitted. The results are reported.
The researcher fits a sequence of models, making modifications to an original specification.
Model equivalence implies:
  Final model depends on initial model chosen
  Sequence of changes is often ad hoc
  Equivalent models may lead to very different substantive conclusions
Often many equivalence classes of models give reasonable fit. Why report just one?
Partial Solution
Embed each latent variable model in a ‘larger’ model without latent variables characterized by conditional independence restrictions.
We ignore non-independence constraints and inequality constraints.
[Figure: sets of distributions; the latent variable model sits inside the model imposing only independence constraints on observed variables]
The Generating graph
Begin with a graph, and associated set of independences
Toy example: graph G over a, b, t, c, d, with independences
  a ⊥ c
  b ⊥ d
  a ⊥ d
  a ⊥ d | c
  a ⊥ d | b
  a ⊥ c | d
  b ⊥ d | a
  a ⊥ t
  d ⊥ t
  b ⊥ c | t
  + others
(t is hidden; the ‘unobserved’ independencies involving t are shown in red on the slide)
Marginalization
Suppose now that some variables are unobserved
Find the independence relations involving only the observed variables
Toy example: graph G with t hidden. ‘Unobserved’ independencies (shown in red on the slide):
  a ⊥ t
  d ⊥ t
  b ⊥ c | t
  + others
Independences involving only the observed variables a, b, c, d:
  a ⊥ c
  b ⊥ d
  a ⊥ d
  a ⊥ d | c
  a ⊥ d | b
  a ⊥ c | d
  b ⊥ d | a
[Figure: graph G over a, b, t, c, d and graph G* over a, b, c, d]
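The marginalization step can be checked mechanically with d-separation. The sketch below assumes the generating graph is the DAG a -> b <- t -> c <- d. This edge set is an assumption: the slide figure is not recoverable from the transcript, and the graph was chosen because it is consistent with the independences listed in the toy example. The function names are hypothetical. The d-separation test uses the standard moralized ancestral graph criterion.

```python
from itertools import combinations

# ASSUMPTION: generating DAG a -> b <- t -> c <- d (not stated in the transcript).
PARENTS = {"b": ("a", "t"), "c": ("t", "d")}
OBSERVED = ("a", "b", "c", "d")   # t is treated as hidden


def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        v = stack.pop()
        for p in parents.get(v, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(parents, x, y, z):
    """True if x and y are d-separated given z in the DAG described by `parents`."""
    keep = ancestors(parents, {x, y} | set(z))
    # Moralize the ancestral subgraph: link each child to its parents
    # and link every pair of parents that share a child.
    adj = {v: set() for v in keep}
    for v in keep:
        ps = parents.get(v, ())
        for p in ps:
            adj[v].add(p)
            adj[p].add(v)
        for p, q in combinations(ps, 2):
            adj[p].add(q)
            adj[q].add(p)
    # x and y are d-separated iff every path in the moral graph hits z.
    seen = {x}
    stack = [x]
    while stack:
        v = stack.pop()
        if v == y:
            return False
        for w in adj[v]:
            if w not in seen and w not in z:
                seen.add(w)
                stack.append(w)
    return True


def observed_independences(parents, observed):
    """List (x, y, z) with x independent of y given z, using observed variables only."""
    found = []
    for x, y in combinations(observed, 2):
        rest = [v for v in observed if v not in (x, y)]
        for k in range(len(rest) + 1):
            for z in combinations(rest, k):
                if d_separated(parents, x, y, z):
                    found.append((x, y, z))
    return found


for x, y, z in observed_independences(PARENTS, OBSERVED):
    print(x, "_||_", y, "|", ",".join(z) if z else "(nothing)")
```

Under this assumed graph the enumeration should reproduce the seven statements among a, b, c, d listed in the toy example, and no others.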
‘Graphical Marginalization’
Now construct a graph that represents the conditional independence relations among the observed variables.
Bi-directed edges are required.
Toy example: the resulting graph represents all and only the distributions in which these independencies hold
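To illustrate how a graph with a bi-directed edge can carry the same independences without the hidden variable, here is a small path-based m-separation check. It assumes the marginal graph is a -> b <-> c <- d. That edge set is an assumption consistent with the toy example, not something recoverable from the transcript, and the helper names are hypothetical. Enumerating all simple paths is only sensible for very small graphs.

```python
from itertools import combinations

# ASSUMPTION: marginal mixed graph a -> b <-> c <- d (not stated in the transcript).
DIRECTED = {("a", "b"), ("d", "c")}      # (tail, head)
BIDIRECTED = {frozenset(("b", "c"))}
VERTICES = ("a", "b", "c", "d")


def adjacent(u, v):
    return (u, v) in DIRECTED or (v, u) in DIRECTED or frozenset((u, v)) in BIDIRECTED


def arrowhead_at(u, v):
    """True if the edge between u and v has an arrowhead at v."""
    return (u, v) in DIRECTED or frozenset((u, v)) in BIDIRECTED


def ancestors(nodes):
    """Return the given nodes plus everything with a directed path into them."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        v = stack.pop()
        for tail, head in DIRECTED:
            if head == v and tail not in seen:
                seen.add(tail)
                stack.append(tail)
    return seen


def simple_paths(x, y, path=None):
    """Yield every simple path from x to y as a list of vertices."""
    path = path or [x]
    if path[-1] == y:
        yield list(path)
        return
    for w in VERTICES:
        if w not in path and adjacent(path[-1], w):
            yield from simple_paths(x, y, path + [w])


def m_separated(x, y, z):
    """True if no path between x and y is m-connecting given z."""
    an_z = ancestors(z)
    for p in simple_paths(x, y):
        connecting = True
        for i in range(1, len(p) - 1):
            collider = arrowhead_at(p[i - 1], p[i]) and arrowhead_at(p[i + 1], p[i])
            if collider and p[i] not in an_z:
                connecting = False   # collider that is not an ancestor of z blocks
            if not collider and p[i] in z:
                connecting = False   # conditioned non-collider blocks
        if connecting:
            return False
    return True


def all_independences():
    """List (x, y, z) with x m-separated from y given z."""
    found = []
    for x, y in combinations(VERTICES, 2):
        rest = [v for v in VERTICES if v not in (x, y)]
        for k in range(len(rest) + 1):
            for z in combinations(rest, k):
                if m_separated(x, y, z):
                    found.append((x, y, z))
    return found
```

Calling `all_independences()` on this assumed graph should return the same seven statements among a, b, c, d as the toy example, which is the sense in which the graph over observed variables stands in for the latent variable graph.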
Equivalence re-visited
Restrict model class to path diagrams including only observed variables, characterized by conditional independence: Ancestral Graph Markov models
For such models we can:
  Determine the entire class of equivalent models
  Identify which features they have in common
Models are curved exponential: usual asymptotics do apply
[Figure, built up over several slides: graphs with latent variables (T, U, V, P, Q, R, L, M, N, + infinitely many others) over the observed variables A, B, C, D, shown alongside the independences
  A ⊥ C
  B ⊥ D
  A ⊥ D
  A ⊥ D | C
  A ⊥ D | B
  A ⊥ C | D
  B ⊥ D | A
Panel labels: Ancestral Graph; Equivalent ancestral graphs; Markov Equiv. Class of Graphs with Latent Variables; Equivalence Classes; Equivalence class of Ancestral Graphs; Partial Ancestral Graph]
Measurement models
If we have pure measurement models with several indicators per latent:
  May apply similar search methods among the latent variables (Spirtes et al. 2001; Silva et al. 2003)
Other Related Work
Iterative ML estimation methods exist
  Guaranteed convergence
  Multimodality is still possible
  Implemented in R package ggm (Drton & Marchetti, 2003)
Current work:
  Extension to discrete data
    Parameterization and ML fitting for binary bi-directed graphs already exist
  Implementing search procedures in R
References
Richardson, T., Spirtes, P. (2002) Ancestral graph Markov models. Ann. Stat., 30: 962-1030.
Richardson, T. (2003) Markov properties for acyclic directed mixed graphs. Scand. J. Statist., 30(1), 145-157.
Drton, M., Richardson, T. (2003) A new algorithm for maximum likelihood estimation in Gaussian graphical models for marginal independence. UAI 03, 184-191.
Drton, M., Richardson, T. (2004) Iterative conditional fitting in Gaussian ancestral graph models. UAI 04, 130-137.
Drton, M., Richardson, T. (2004) Multimodality of the likelihood in the bivariate seemingly unrelated regressions model. Biometrika, 91(2), 383-92.
Marchetti, G., Drton, M. (2003) ggm package. Available from http://cran.r-project.org