Challenges posed by Structural Equation Models

Thomas Richardson
Department of Statistics
University of Washington

Joint work with Mathias Drton (UC Berkeley) and Peter Spirtes (CMU)

Overview

- Challenges for likelihood inference
- Problems in model selection and interpretation
- Partial solution: a sub-class of path diagrams, ancestral graphs

Problems for Likelihood Inference

- Likelihood may be multimodal, e.g. the bivariate Gaussian Seemingly Unrelated Regression (SUR) model

[Figure: path diagram of the SUR model over X1, X2, Y1, Y2]

  may have up to 3 local maxima.
- A consistent starting value does not guarantee that iterative procedures will find the MLE.
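One rough way to look for several modes is to evaluate the SUR profile log-likelihood on a grid of slope values and count the grid-level local maxima. The sketch below assumes the usual bivariate form Y1 = b1*X1 + e1, Y2 = b2*X2 + e2 with an unrestricted error covariance and no intercepts. The data are simulated, and the names `profile_loglik` and `grid_local_maxima` are hypothetical helpers, not taken from the talk. Whether more than one peak shows up depends on the particular sample.

```python
import numpy as np

def profile_loglik(b1, b2, x1, x2, y1, y2):
    """Profile log-likelihood of (b1, b2), up to an additive constant.

    The error covariance is unrestricted, so it is profiled out as the
    average outer product of the residuals.
    """
    resid = np.column_stack([y1 - b1 * x1, y2 - b2 * x2])
    sigma_hat = resid.T @ resid / len(y1)
    _, logdet = np.linalg.slogdet(sigma_hat)
    return -0.5 * len(y1) * logdet

def grid_local_maxima(values):
    """Interior grid cells that are strictly larger than all 8 neighbours."""
    peaks = []
    rows, cols = values.shape
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            window = values[i - 1:i + 2, j - 1:j + 2]
            is_max = values[i, j] == window.max()
            is_unique = (window == window.max()).sum() == 1
            if is_max and is_unique:
                peaks.append((i, j))
    return peaks

# Hypothetical small sample (simulated; not data from the talk).
rng = np.random.default_rng(1)
n = 8
x1, x2 = rng.normal(size=n), rng.normal(size=n)
errors = rng.multivariate_normal([0, 0], [[1.0, 0.9], [0.9, 1.0]], size=n)
y1, y2 = errors[:, 0], errors[:, 1]  # true slopes are zero in this simulation

grid = np.linspace(-3, 3, 121)
values = np.array([[profile_loglik(b1, b2, x1, x2, y1, y2) for b2 in grid]
                   for b1 in grid])
peaks = grid_local_maxima(values)
print(len(peaks), "grid-level local maxima found")
```

A grid search only approximates the set of stationary points, so a count above one is a prompt for closer inspection rather than proof of multimodality.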

Problems for Likelihood Inference

- Discrete latent variable models are not curved exponential families

[Figure: latent class model with a ternary latent class variable C and binary observed variables X1, X2, X3, X4]

- 15 parameters in the saturated model
- 14 model parameters
- BUT the model has 2 d.f. (Goodman)
- Usual asymptotics may not apply
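The parameter counts above can be checked numerically. The sketch below counts the saturated and nominal parameters for four binary items and three latent classes. It then computes the rank of the Jacobian of the 16 cell probabilities with respect to the 14 free parameters at one arbitrary parameter point. The function name `cell_jacobian`, the chosen class weights, and the random conditional probabilities are assumptions made for illustration. If the 2 d.f. figure attributed to Goodman holds at this point, the rank should fall one short of the nominal count.

```python
import itertools
import numpy as np

N_ITEMS, N_CLASSES = 4, 3

def cell_jacobian(weights, probs):
    """Jacobian of the 16 cell probabilities w.r.t. the 14 free parameters.

    weights[c] = P(C = c); the last weight is determined by the others.
    probs[j, c] = P(X_j = 1 | C = c).
    """
    rows = []
    for cell in itertools.product([0, 1], repeat=N_ITEMS):
        x = np.array(cell)[:, None]
        factor = np.where(x == 1, probs, 1.0 - probs)  # P(X_j = x_j | C = c)
        joint = factor.prod(axis=0)                    # P(X = x | C = c)
        # Derivatives w.r.t. the free class weights.
        row = [joint[c] - joint[-1] for c in range(N_CLASSES - 1)]
        # Derivatives w.r.t. each conditional probability probs[j, c].
        for j in range(N_ITEMS):
            sign = 1.0 if cell[j] == 1 else -1.0
            for c in range(N_CLASSES):
                others = np.delete(factor[:, c], j).prod()
                row.append(sign * weights[c] * others)
        rows.append(row)
    return np.array(rows)

n_saturated = 2 ** N_ITEMS - 1                      # 15
n_nominal = (N_CLASSES - 1) + N_ITEMS * N_CLASSES   # 14

# Arbitrary interior parameter point (an assumption, chosen for illustration).
rng = np.random.default_rng(0)
weights = np.array([0.5, 0.3, 0.2])
probs = rng.uniform(0.2, 0.8, size=(N_ITEMS, N_CLASSES))

jac = cell_jacobian(weights, probs)
singular_values = np.linalg.svd(jac, compute_uv=False)
rank = int((singular_values > 1e-9 * singular_values[0]).sum())
print("saturated:", n_saturated, "nominal:", n_nominal, "Jacobian rank:", rank)
print("d.f. implied by the rank:", n_saturated - rank)
```

The rank is computed at a single point with a fixed tolerance, so it illustrates the local dimension there and says nothing about singular points of the model.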

Problems for Likelihood Inference

- Likelihood may be highly multimodal in the asymptotic limit
  - After accounting for label switching/aliasing

[Figure: latent class model with latent variable C and observed variables X1, X2, X3, X4]

- Why report one mode?
- d.f. may vary as a function of model parameters

Problems for Model Selection

- SEM models with latent variables are not curved exponential families
- Standard χ² asymptotics do not necessarily apply, e.g. for LRTs
- Model selection criteria such as BIC are not asymptotically consistent
- The effective degrees of freedom may vary depending on the values of the model parameters


Many models may be equivalent:

[Figure: four path diagrams over X1, X2, Y1, Y2]
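The simplest instance of equivalence involves two variables: the linear Gaussian models X -> Y and Y -> X can each reproduce any positive definite 2x2 covariance matrix, so no covariance data can distinguish them. The sketch below is a minimal two-variable illustration, not the four-variable diagrams from the slide. The covariance matrix `S` and the helper names are hypothetical.

```python
import numpy as np

def fit_cause_effect(S):
    """Fit cause -> effect to a 2x2 covariance ordered (cause, effect).

    Returns the variance of the cause, the slope, and the residual variance.
    """
    var_cause = S[0, 0]
    beta = S[0, 1] / S[0, 0]
    resid_var = S[1, 1] - beta ** 2 * var_cause
    return var_cause, beta, resid_var

def implied_cov(var_cause, beta, resid_var):
    """Covariance implied by cause -> effect, ordered (cause, effect)."""
    return np.array([
        [var_cause, beta * var_cause],
        [beta * var_cause, beta ** 2 * var_cause + resid_var],
    ])

# Hypothetical covariance matrix of (X, Y).
S = np.array([[2.0, 0.8],
              [0.8, 1.5]])

# Model X -> Y, fitted in the original ordering.
forward = implied_cov(*fit_cause_effect(S))

# Model Y -> X: swap the variables, fit, then swap back.
swap = np.array([[0, 1], [1, 0]])
backward = swap @ implied_cov(*fit_cause_effect(swap @ S @ swap)) @ swap

print(np.allclose(forward, S), np.allclose(backward, S))
```

Both directions fit `S` exactly while giving opposite causal readings, which is the sense in which equivalent models can support very different substantive conclusions.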


Problems for Model Selection

- Models with different numbers of latents may be equivalent, e.g. unrestricted error covariance within blocks

[Figure: two path diagrams over X1, ..., Xp and Y1, ..., Yq]

Wegelin & Richardson (2001)

Two scenarios

- A single SEM model is proposed and fitted. The results are reported.
- The researcher fits a sequence of models, making modifications to an original specification.

Model equivalence implies:

- Final model depends on initial model chosen
- Sequence of changes is often ad hoc
- Equivalent models may lead to very different substantive conclusions

Often many equivalence classes of models give reasonable fit. Why report just one?

Partial Solution

- Embed each latent variable model in a ‘larger’ model without latent variables, characterized by conditional independence restrictions.
- We ignore non-independence constraints and inequality constraints.

[Figure: sets of distributions, with the latent variable model inside the model imposing only independence constraints on observed variables]

Toy example:

[Figure: graph G over a, b, c, d and t]

Independencies among a, b, c, d: a ⊥ c, b ⊥ d, a ⊥ d, a ⊥ d | c, a ⊥ d | b, a ⊥ c | d, b ⊥ d | a
Independencies involving t: a ⊥ t, d ⊥ t, b ⊥ c | t, + others

The Generating graph

- Begin with a graph, and associated set of independences

[Figure: graph G over a, b, c, d and t, with t hidden]

Independencies among a, b, c, d: a ⊥ c, b ⊥ d, a ⊥ d, a ⊥ d | c, a ⊥ d | b, a ⊥ c | d, b ⊥ d | a
‘Unobserved’ independencies (in red on the slide): a ⊥ t, d ⊥ t, b ⊥ c | t, + others

Marginalization

- Suppose now that some variables are unobserved
- Find the independence relations involving only the observed variables

Toy example:

[Figure: graph G over a, b, c, d and t, alongside graph G* over a, b, c, d]

Independencies among a, b, c, d: a ⊥ c, b ⊥ d, a ⊥ d, a ⊥ d | c, a ⊥ d | b, a ⊥ c | d, b ⊥ d | a
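The marginalization step can be sketched in code: enumerate the d-separation statements of a DAG that involve only the observed variables. The slide's drawing is not reproduced in this transcript, so the edge set used below (a -> b <- t -> c <- d, with t hidden) is an assumption: it is one DAG consistent with the toy example. The d-separation test uses the standard moralized ancestral graph criterion, and the function names are hypothetical.

```python
from itertools import combinations

# Assumed DAG for the toy example: each node maps to its list of parents.
PARENTS = {"a": [], "d": [], "t": [], "b": ["a", "t"], "c": ["t", "d"]}
OBSERVED = ["a", "b", "c", "d"]  # t is treated as hidden

def ancestors(nodes, parents):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents[v])
    return seen

def d_separated(x, y, z, parents):
    """Test whether x and y are d-separated given z.

    Moralize the subgraph on the ancestors of {x, y} and z, delete z,
    and check whether x can still reach y.
    """
    keep = ancestors({x, y} | set(z), parents)
    adj = {v: set() for v in keep}
    for v in keep:
        for p in parents[v]:                      # parent-child edges
            adj[v].add(p)
            adj[p].add(v)
        for p, q in combinations(parents[v], 2):  # marry co-parents
            adj[p].add(q)
            adj[q].add(p)
    seen, stack = {x}, [x]
    while stack:
        v = stack.pop()
        if v == y:
            return False
        for w in adj[v]:
            if w not in seen and w not in z:
                seen.add(w)
                stack.append(w)
    return True

def observed_independences(parents, observed):
    """All (x, y, z) with x and y d-separated given z, using observed nodes only."""
    found = []
    for x, y in combinations(observed, 2):
        rest = [v for v in observed if v not in (x, y)]
        for k in range(len(rest) + 1):
            for z in combinations(rest, k):
                if d_separated(x, y, z, parents):
                    found.append((x, y, z))
    return found

for x, y, z in observed_independences(PARENTS, OBSERVED):
    print(x, "indep", y, "given", list(z))
```

Under the assumed edge set this lists seven statements, matching the seven independencies among a, b, c, d in the toy example. The pair b, c is never separated, because every connection between them passes through the hidden t.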

‘Graphical Marginalization’

- Now construct a graph that represents the conditional independence relations among the observed variables.
- Bi-directed edges are required.

Toy example:

[Figure: graph over the observed variables a, b, c, d]
represents all and only the distributions in which these independencies hold
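For the special case where every hidden variable is a root (has no parents), graphical marginalization reduces to a simple rule: keep the directed edges among observed variables and join every pair of children of a hidden variable by a bi-directed edge. The sketch below implements only this restricted case. The edge set for the toy example (a -> b <- t -> c <- d) is again an assumption, and `project_root_latents` is a hypothetical name. The general construction of an ancestral graph handles hidden variables with parents and is more involved.

```python
from itertools import combinations

def project_root_latents(parents, latents):
    """Marginalize parentless hidden variables out of a DAG.

    Returns (directed, bidirected): directed edges as (parent, child) pairs
    and bi-directed edges as sorted pairs, both among observed variables.
    """
    for h in latents:
        if parents[h]:
            raise ValueError(f"latent {h!r} has parents; not handled here")
    directed = set()
    children = {h: [] for h in latents}
    for v, pa in parents.items():
        if v in latents:
            continue
        for p in pa:
            if p in latents:
                children[p].append(v)   # v shares the hidden cause p
            else:
                directed.add((p, v))    # observed edge is kept as is
    bidirected = set()
    for h in latents:
        for u, w in combinations(sorted(children[h]), 2):
            bidirected.add((u, w))
    return directed, bidirected

# Assumed DAG for the toy example, with t hidden.
PARENTS = {"a": [], "d": [], "t": [], "b": ["a", "t"], "c": ["t", "d"]}
directed, bidirected = project_root_latents(PARENTS, {"t"})
print("directed:", sorted(directed))
print("bi-directed:", sorted(bidirected))
```

With the assumed edges, the result has directed edges a -> b and d -> c plus a single bi-directed edge between b and c. In general the output of this rule is a mixed graph that need not be ancestral, so it should be read as an illustration of why bi-directed edges arise and not as a full search procedure.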

Equivalence re-visited

- Restrict model class to path diagrams including only observed variables, characterized by conditional independence: Ancestral Graph Markov models
- For such models we can:
  - Determine the entire class of equivalent models
  - Identify which features they have in common
- Models are curved exponential: usual asymptotics do apply

[Figure sequence over the observed variables A, B, C, D, built up across several slides. Each slide repeats the list of independencies among A, B, C, D. Panel captions, in order: a graph with the additional variable T; "Ancestral Graph"; "Equivalent ancestral graphs"; "Markov Equiv. Class of Graphs with Latent Variables" (graphs with additional variables labelled T, U, V, P, Q, R, L, M, N, "+ infinitely many others"); "Equivalence Classes"; "Equivalence class of Ancestral Graphs"; "Partial Ancestral Graph"]


Measurement models

- If we have pure measurement models with several indicators per latent:
  - May apply similar search methods among the latent variables (Spirtes et al. 2001; Silva et al. 2003)

Other Related Work

- Iterative ML estimation methods exist
  - Guaranteed convergence
  - Multimodality is still possible
  - Implemented in R package ggm (Drton & Marchetti, 2003)
- Current work:
  - Extension to discrete data
  - Parameterization and ML fitting for binary bi-directed graphs already exist
  - Implementing search procedures in R

References

Richardson, T., Spirtes, P. (2002) Ancestral graph Markov models. Ann. Stat., 30: 962-1030.

Richardson, T. (2003) Markov properties for acyclic directed mixed graphs. Scand. J. Statist., 30(1), pp. 145-157.

Drton, M., Richardson, T. (2003) A new algorithm for maximum likelihood estimation in Gaussian graphical models for marginal independence. UAI 03, 184-191.

Drton, M., Richardson, T. (2004) Iterative conditional fitting in Gaussian ancestral graph models. UAI 04, 130-137.

Drton, M., Richardson, T. (2004) Multimodality of the likelihood in the bivariate seemingly unrelated regressions model. Biometrika, 91(2), 383-92.

Marchetti, G., Drton, M. (2003) ggm package. Available from http://cran.r-project.org
