10 SEM is Based on the Analysis of Covariances! Why?Analysis of correlations represents loss of...
-
Upload
blaise-christopher-curtis -
Category
Documents
-
view
215 -
download
0
Transcript of 10 SEM is Based on the Analysis of Covariances! Why?Analysis of correlations represents loss of...
10
SEM is Based on the Analysis of Covariances!Why? Analysis of correlations represents loss of information.
0
20
40
60
80
100
0 10 20 30
x
y
0
20
40
60
80
100
0 10 20 30
x
y
A B
r = 0.86 r = 0.50
illustration with regressions having same slope and intercept
Analysis of covariances allows for estimation of both standardized and unstandardized parameters.
11
I. SEM Essentials:1. SEM is a form of graphical modeling, and therefore, a system in which relationships
can be represented in either graphical or equational form.
x1 y1
111graphicalform
y1 = γ11x1 + ζ1equationalform
2. An equation is said to be structural if there exists sufficient evidence from all available sources to support the interpretation that x1 has a causal effect on y1.
12
y2y1
x1 y3
ζ1 ζ2
ζ3
Complex Hypothesis
e.g.
y1 = γ11x1 + ζ1
y2 = β 21y1 + γ 21x1 + ζ 2
y3 = β 32y2 + γ31x1 + ζ 3
CorrespondingEquations
3. Structural equation modeling can be defined as the use of two or more structural equations to represent complex hypotheses.
14
a. manipulations of x can repeatably be demonstrated to be followed by responses in y, and/or
b. we can assume that the values of x that we have can serve as indicators for the values of x that existed when effects on y were being generated, and/or
c. if it can be assumed that a manipulation of x would result in a subsequent change in the values of y
Relevant References:
Pearl (2000) Causality. Cambridge University Press. Shipley (2000) Cause and Correlation in Biology. Cambridge
4. Some practical criteria for supporting an assumption of causal relationships in structural equations:
15
The Methodological Side of SEM
0
20
40
60
80
100
softwarehyp testingstat modelingfactor analysisregression
16
6. SEM is a framework for building and evaluating multivariate hypotheses about multiple processes. It is not dependent on a particular estimation method.
7. When it comes to statistical methodology, it is important to distinguish between the priorities of the methodology versus those of the scientific enterprise. Regarding the diagram below, in SEM we use statistics for the purposes of the scientific enterprise.
Statistics and other Methodological
Tools, Procedures, and Principles.
The Scientific Enterprise
17
The Relationship of SEM to the Scientific Enterprise
modified from Starfield and Bleloch (1991)
Understanding of Processes
univariatedescriptive statistics
exploration,methodology and
theory development
realisticpredictive models
simplisticmodels
multivariatedescriptive statistics
detailed processmodels
univariate datamodeling
Data
structuralequationmodeling
18
8. SEM seeks to progress knowledge through cumulative learning. Current work is striving to increase the capacity for model memory and model generality.
exploratory/model-building
applications
structural equation modeling
confirmatory/hypothesis-testing
applicationsone aim ofSEM
19
11. An interest in systems under multivariate control motivates us to explicitly consider the relative importances of multiple processes and how they interact. We seek to consider simultaneously the main factors that determine how system responses behave.
12. SEM is one of the few applications of statistical inference where the results of estimation are frequently “you have the wrong model!”. This feedback comes from the unique feature that in SEM we compare patterns in the data to those implied by the model. This is an extremely important form of learning about systems.
20
13. Illustrations of fixed-structure protocol models:
Univariate Models
x1
x2
x3
x4
x5
y1
Multivariate Models
x1
x2
x3
x4
x5
F
y1
y2
y3
y4
y5
Do these model structures match the causal forces that influencedthe data? If not, what can they tell you about the processes operating?
21
14. Structural equation modeling and its associated scientific goals represent an ambitious undertaking. We should be both humbled by the limits of our successes and inspired by the learning that takes place during the journey.
22
x1
y1
y2
1
2
Some Terminology
exogenousvariable
endogenousvariables
21
11 21
path coefficients
direct effect of x1 on y2
indirect effect of x1 on y2
is 11 times 21
23
First Rule of Path Coefficients: the path coefficients forunanalyzed relationships (curved arrows) between exogenous variables are simply the correlations (standardized form) or covariances (unstandardized form).
x1
x2
y1.40
x1 x2 y1
-----------------------------x1 1.0
x2 0.40 1.0
y1 0.50 0.60 1.0
24
x1 y1 y2
11 = .50 21 = .60
(gamma) used to represent effect of exogenous on endogenous. (beta) used to represent effect of endogenous on endogenous.
Second Rule of Path Coefficients: when variables areconnected by a single causal path, the pathcoefficient is simply the standardized or unstandardized regression coefficient (note that a standardized regression coefficient = a simple correlation.)
x1 y1 y2
-------------------------------------------------x1 1.0y1 0.50 1.0y2 0.30 0.60 1.0
25
Third Rule of Path Coefficients: strength of a compound path is the product of the coefficients along the path.
x1 y1 y2
.50 .60
Thus, in this example the effect of x1 on y2 = 0.5 x 0.6 = 0.30
Since the strength of the indirect path from x1 to y2 equals the
correlation between x1 and y2, we say x1 and y2 are
conditionally independent.
26
What does it mean when two separated variables are not conditionally independent?
x1 y1 y2
-------------------------------------------------x1 1.0
y1 0.55 1.0
y2 0.50 0.60 1.0
x1 y1 y2
r = .55 r = .60
0.55 x 0.60 = 0.33, which is not equal to 0.50
27
The inequality implies that the true model is
x1
y1
y2
Fourth Rule of Path Coefficients: when variables areconnected by more than one causal pathway, the path coefficients are "partial" regression coefficients.
additional process
Which pairs of variables are connected by two causal paths?
answer: x1 and y2 (obvious one), but also y1 and y2, which are connected by the joint influence of x1 on both of them.
28
And for another case:
x1
x2
y1
A case of shared causal influence: the unanalyzed relationbetween x1 and x2 represents the effects of an unspecified
joint causal process. Therefore, x1 and y1 connected by two
causal paths. x2 and y1 likewise.
29
x1
y1
y2
.40
.31
.48
How to Interpret Partial Path Coefficients: - The Concept of Statistical Control
The effect of y1 on y2 is controlled for the joint effects of x1.
I have an article on this subject that is brief and to the point.Grace, J.B. and K.A. Bollen 2005. Interpreting the results from multiple
regression and structural equation models. Bull. Ecological Soc. Amer. 86:283-295.
30
Interpretation of Partial CoefficientsAnalogy to an electronic equalizer
from Sourceforge.net
With all other variables in model held to their means, how much does a response variable change when a predictor is varied?
31
x1
y1
y2
Fifth Rule of Path Coefficients: paths from error variables are correlations or covariances.
R2 = 0.16
.92
R2 = 0.44
.73
2
1
.31
.40 .48
21iy
R
equation for pathfrom error variable
.56
alternative is toshow values for zetas,which = 1-R2
.84
32
x1
y1
y2
R2 = 0.16
R2 = 0.25
2
1
.50
.40x1 y1 y2
-------------------------------x1 1.0
y1 0.40 1.0
y2 0.50 0.60 1.0
Now, imagine y1 and y2
are joint responses
Sixth Rule of Path Coefficients: unanalyzed residual correlations between endogenous variables are partial correlations or covariances.
34
Seventh Rule of Path Coefficients: total effect one variable has on another equals the sum of its direct and indirect effects.
y1x2
x1 y2
ζ1
ζ2.80
.15
.64
-.11
.27
x1 x2 y1
-------------------------------y1 0.64 -0.11 ---
y2 0.32 -0.03 0.27
Total Effects:
Eighth Rule of Path Coefficients:sum of all pathways between two variables (causal and noncausal) equals the correlation/covariance.
note: correlation betweenx1 and y1 = 0.55, which
equals 0.64 - 0.80*0.11
35
Path Tracing Rules
• No loops• No going forward & then backward• A maximum of one curved arrow per path
37
Suppression Effect - when presence of another variable causes path coefficient to strongly differ from bivariate correlation.
x1 x2 y1 y2 -----------------------------------------------x1 1.0x2 0.80 1.0y1 0.55 0.40 1.0y2 0.30 0.23 0.35 1.0
y1x2
x1 y2
ζ1
ζ2.80
.15
.64
-.11
.27
path coefficient for x2 to y1 very different from correlation,
(results from overwhelming influence from x1.)
42
Σ = { σ11
σ12 σ22
σ13 σ23 σ33 }Model-Implied CorrelationsObserved Correlations*
{1.0.24 1.0.01 .70 1.0}S =
* typically the unstandardized correlations, or covariances
2. Estimation (cont.) – analysis of covariance structure
The most commonly used method of estimation over the past 3 decades has been through the analysis of covariance structure (think – analysis of patterns of correlations among variables).
compare
43
x1
y1
y2
Hypothesized Model
Σ = { σ11
σ12 σ22
σ13 σ23 σ33 }Implied Covariance Matrix
Observed Covariance Matrix
{1.3.24 .41.01 9.7 12.3}S =
compareModel FitEvaluations
+
ParameterEstimates
estimation
(e.g., maximum likelihood)
3. Evaluation
44
1. The Multiequational Framework
(a) the observed variable model
We can model the interdependences among a set of predictors and responses using an extension of the general linear model that accommodates the dependences of response variables on other response variables.
y = p x 1 vector of responses
α = p x 1 vector of intercepts
Β = p x p coefficient matrix of ys on ys
Γ = p x q coefficient matrix of ys on xs
x = q x 1 vector of exogenous predictors
ζ = p x 1 vector of errors for the elements of y
Φ = cov (x) = q x q matrix of covariances among xs
Ψ = cov (ζ) = q x q matrix of covariances among errors
y = α + Βy + Γx + ζ
47
The LISREL Equations
Jöreskög 1973
(b) the latent variable model
η = α + Β η + Γξ + ζ
x = Λxξ + δy = Λyη + ε
where: η is a vector of latent responses,ξ is a vector of latent predictors,Β and Γ are matrices of coefficients,ζ is a vector of errors for η, andα is a vector of intercepts for η
(c) the measurement model
where: Λx is a vector of loadings that link observed x variables to latent predictors,Λy is a vector of loadings that link observed y variables to latent responses, andδ and ε are vectors are errors
Notation
• Covariance Matrixes of Interest:– Φ Exogenous Constructs– Ψ Structural Error Terms– Θδ Measurement Error X
– Θε Measurement Error Y
Trust in Individuals
people arehelpful
(x1)
people canbe trusted
(x2)
people areFair (x3)
1
ξ1
δ1 δ2 δ3
11111 x
21212 x
313 x
Confirmatory Factor Analysis
δξΛx x
λ11 λ21
11
)(00
)(0
)(
3
2
1
VAR
VAR
VAR
G89.2247 Lecture 5 56
Form of ( )S q• The form of ( )S q can be worked out with
expectation operators
111
1
)()()(
)(
BIBIBI
BI
YYYX
XYXX
G89.2247 Lecture 5 57
Fitting Functions
• ML minimizes
• ULS minimizes
• GLS minimizes
• ADF minimizes a weighted least squares criterion, but with a weight different than GLS
)(lnln)( 1 qpStrSF
SStrF
1121 SSSStrF
G89.2247 Lecture 5 61
Estimation Example using Excel
0 0.3205 0.4485 0 1.0452 0.5004 0.566 -0.22 0.3205 10.5059 0 0 0.3975 0.5004 1.1538 -0.22 0.67 0.5059 2
0.4485 3S X1 X2 Y1 Y2 0.3975 4
X1 1.0452 0.5004 0.6356 0.5204 ML= 0.0000 LR= 6E-07 1.0452 5X2 0.5004 1.1538 0.4433 0.6829 ULS= 0.0000 1.1538 6Y1 0.6356 0.4433 1.1075 0.7256 GLS= 0.0000 LR= 6E-07 0.5004 7Y2 0.5204 0.6829 0.7256 1.3033 0.566 8
N= 250 0.6703 91.0452 0.5004 0.6356 0.5205 -0.224 100.5004 1.1538 0.4433 0.68290.6356 0.4433 1.1075 0.72560.5205 0.6829 0.7256 1.3033 Y1= 0.3205 Y2+ 0.448 X1+ 0 X2+E1
S- Y2= 0.5059 Y1+ 0 X2+ 0.4 X2+E20.000 0.000 0.000 0.0000.000 0.000 0.000 0.0000.000 0.000 0.000 0.0000.000 0.000 0.000 0.000
G89.2247 Lecture 5 62
Measures of Fit: Chi Square
• If model is not saturated, and• If residuals of Y have multivariate normal
distribution– ML*(N-1) and GLS*(N-1) have large sample chi
squared distributions– Degrees of freedom given by difference in number
of parameters in model compared to saturated model
63
Illustration of the use of Χ2
X2 = 3.64 with 1 df and 100 samplesP = 0.056
X2 = 7.27 with 1 df and 200 samplesP = 0.007
x
y1
y2
1.00.4 1.00.35 0.5 1.0
rxy2 expected to be 0.2 (0.40 x 0.50)
X2 = 1.82 with 1 df and 50 samplesP = 0.18
correlation matrix
issue: should there be a path from x to y2?
0.40 0.50
Essentially, our ability to detect significant differences from our base model, depends as usual on sample size.
G89.2247 Lecture 5 64
Chi Square Test Issues
• Appeal of Chi Square Test– Makes model fit appear confirmatory– Can reject an ill fitting model– Can compare nested models– Can calculate power
• Problems with Chi Square Test– Global test will reject a good model if data are not
multivariate normal– Usual issues of significance testing
65
Residuals: Most fit indices represent average of residuals between observed and predicted covariances. Therefore, individual residuals should be inspected.
Correlation Matrix to be Analyzed y1 y2 x -------- -------- --------y1 1.00y2 0.50 1.00 x 0.40 0.35 1.00
Fitted Correlation Matrix y1 y2 x -------- -------- --------y1 1.00y2 0.50 1.00 x 0.40 0.20 1.00
residual = 0.15
Diagnosing Causes of Lack of Fit (misspecification)
Modification Indices: Predicted effects of model modification on model chi-square.
Six Steps to Modeling
• Specification• Implied Covariance Matrix• Identification• Estimation• Model Fit• Respecification
Specification
• Theorize your model– What observed variables?
• How many observed variables?
– What latent variables?• How many latent variables?
– Relationship between latent variables?– Relationship between latent variables and
observed variables?– Correlated errors of measurement?
Identification
• Are there unique values for parameters?
• Property of model, not data
• 10 = x + y
x = y
2, 8 -1, 11 4, 6
Identification
• Underidentified• Just identified• Overidentified
Identification
• Rules for Identification– By type of model
• Classic econometric– e.g., recursive rule
• Confirmatory factor analysis– e.g., three indicator rule
• General Model– e.g., two-step rule
Identification
• Identified? Yes (just) by 3-indicator rule.
Trust in Individuals
people arehelpful
(x1)
people canbe trusted
(x2)
people areFair (x3)
1
ξ1
δ1 δ2 δ3
λ11 λ21
A Perspective on “Fit”