Proactive Monte Carlo Analysis in Structural Equation Modeling
James H. Steiger
Vanderbilt University
Some Unhappy Scenarios
A Confirmatory Factor Analysis
  – You fit a 3-factor model to 9 variables with N = 150
  – You obtain a Heywood case
Comparing Two Correlation Matrices
  – You wish to test whether two population matrices are equivalent, using ML estimation
  – You obtain an unexpected rejection
Some Unhappy Scenarios
Fitting a Trait-State Model
  – You fit the Kenny-Zautra TSE model to 4 waves of panel data with N = 200
  – You obtain a variance estimate of zero
Writing a Program Manual
  – You include an example analysis in your widely distributed computer manual
  – The analysis remains in your manuals for more than a decade
  – The analysis is fundamentally flawed, and gives incorrect results
Some Common Elements
Models of covariance or correlation structure
Potential problems could have been identified before data were ever gathered, using “proactive Monte Carlo analysis”
Confirmatory Factor Analysis
Variable   Factor 1  Factor 2  Factor 3
VIS_PERC   X
CUBES      X
LOZENGES   X
PAR_COMP             X
SEN_COMP             X
WRD_MNG              X
ADDITION                       X
CNT_DOT                        X
ST_CURVE                       X
Confirmatory Factor Analysis
Variable   Factor 1  Factor 2  Factor 3  Unique Var.
VIS_PERC   0.46                          0.79
CUBES      0.65                          0.58
LOZENGES   0.25                          0.94
PAR_COMP             1.00                0.00
SEN_COMP             0.41                0.84
WRD_MNG              0.22                0.95
ADDITION                       0.38      0.85
CNT_DOT                        1.00      0.00
ST_CURVE                       0.30      0.91
Confirmatory Factor Analysis
Variable   Factor 1  Factor 2  Factor 3  Unique Var.
VIS_PERC   0.60                          0.64
CUBES      0.60                          0.64
LOZENGES   0.60                          0.64
PAR_COMP             0.60                0.64
SEN_COMP             0.60                0.64
WRD_MNG              0.60                0.64
ADDITION                       0.60      0.64
CNT_DOT                        0.60      0.64
ST_CURVE                       0.60      0.64
Proactive Monte Carlo Analysis
Take the model you anticipate fitting
Insert reasonable parameter values
Generate a population covariance or correlation matrix and fit this matrix, to assess identification problems
Examine Monte Carlo performance over a range of sample sizes that you are considering
Assess convergence problems, frequency of improper estimates, Type I error, accuracy of fit indices
Preliminary investigations may take only a few hours
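The first steps of this recipe can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the talk: it assumes a 9-variable, 3-factor model with every loading at 0.60 (so each unique variance is 0.64), and it assumes factor correlations of 0.3, a planning value chosen here only for illustration. The names `implied_correlation` and `simulate_sample_correlations` are hypothetical helpers.

```python
import numpy as np

# Planning values: 9 indicators, 3 factors, every loading 0.60.
loadings = np.zeros((9, 3))
for j in range(3):
    loadings[3 * j:3 * j + 3, j] = 0.6

# ASSUMPTION: factor correlations of 0.3 (not taken from the slides).
phi = np.full((3, 3), 0.3)
np.fill_diagonal(phi, 1.0)

def implied_correlation(lam, phi):
    """Population correlation matrix Lambda Phi Lambda' + Psi, unit variances."""
    common = lam @ phi @ lam.T
    psi = np.diag(1.0 - np.diag(common))  # unique variances
    return common + psi

def simulate_sample_correlations(sigma, n, reps, seed=0):
    """Yield `reps` sample correlation matrices from multivariate normal data."""
    rng = np.random.default_rng(seed)
    p = sigma.shape[0]
    for _ in range(reps):
        x = rng.multivariate_normal(np.zeros(p), sigma, size=n)
        yield np.corrcoef(x, rowvar=False)

sigma = implied_correlation(loadings, phi)
samples = list(simulate_sample_correlations(sigma, n=150, reps=5))
```

Fitting `sigma` itself checks identification. Fitting each matrix in `samples` with an SEM program, over the sample sizes under consideration, gives the Monte Carlo performance figures.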
Confirmatory Factor Analysis
(Speed)-1{.3}->[VIS_PERC] (Speed)-2{.4}->[CUBES] (Speed)-3{.5}->[LOZENGES]
(Verbal)-4{.6}->[PAR_COMP] (Verbal)-5{.3}->[SEN_COMP] (Verbal)-6{.4}->[WRD_MNG]
(Visual)-7{.5}->[ADDITION] (Visual)-8{.6}->[CNT_DOT] (Visual)-9{.3}->[ST_CURVE]
Confirmatory Factor Analysis
(Speed)-1{.53}->[VIS_PERC] (Speed)-2{.54}->[CUBES] (Speed)-3{.55}->[LOZENGES]
(Verbal)-4{.6}->[PAR_COMP] (Verbal)-5{.3}->[SEN_COMP] (Verbal)-6{.4}->[WRD_MNG]
(Visual)-7{.5}->[ADDITION] (Visual)-8{.6}->[CNT_DOT] (Visual)-9{.3}->[ST_CURVE]
Confirmatory Factor Analysis
Variable   Factor 1  Factor 2  Factor 3  Unique Var.
VIS_PERC   0.60                          0.64
CUBES      0.60                          0.64
LOZENGES   0.60                          0.64
PAR_COMP             0.60                0.64
SEN_COMP             0.60                0.64
WRD_MNG              0.60                0.64
ADDITION                       0.60      0.64
CNT_DOT                        0.60      0.64
ST_CURVE                       0.60      0.64
Proactive Monte Carlo Analysis
Percentage of Heywood Cases
N     Loading .4  Loading .6  Loading .8
75    80%         30%         0%
100   78%         11%         0%
150   62%         3%          0%
300   21%         0%          0%
500   1%          0%          0%
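The pattern in this table (improper solutions become rarer as loadings and N grow) can be illustrated with a much simpler model than the one in the talk. The sketch below assumes a single factor with three indicators, which is just-identified and has a closed-form solution (for example, the squared loading of variable 1 is r12·r13/r23), so no SEM software is needed. It is a deliberately simplified stand-in: it does not fit the 9-variable, 3-factor model and is not expected to reproduce the percentages above. `improper_rate` is a hypothetical helper name.

```python
import numpy as np

def improper_rate(loading, n, reps=1000, seed=0):
    """Share of replications whose closed-form one-factor solution is improper."""
    rng = np.random.default_rng(seed)
    lam = np.full(3, loading)
    sigma = np.outer(lam, lam)      # common part: lambda_i * lambda_j
    np.fill_diagonal(sigma, 1.0)    # standardized indicators
    improper = 0
    for _ in range(reps):
        x = rng.multivariate_normal(np.zeros(3), sigma, size=n)
        r = np.corrcoef(x, rowvar=False)
        r12, r13, r23 = r[0, 1], r[0, 2], r[1, 2]
        # Closed-form squared loadings for the just-identified one-factor model.
        squared = np.array([r12 * r13 / r23, r12 * r23 / r13, r13 * r23 / r12])
        # Improper: squared loading >= 1 (unique variance <= 0) or negative.
        if np.any(squared >= 1.0) or np.any(squared < 0.0):
            improper += 1
    return improper / reps

for n in (75, 150, 300):
    print(n, [improper_rate(lam, n, reps=500) for lam in (0.4, 0.6, 0.8)])
```

Running the loop over a grid of loadings and sample sizes gives a table of the same shape as the one above, which is the kind of output a proactive analysis is meant to produce before data collection.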
Standard Errors
Distribution of Estimates
[Figure: histogram of estimates for Parameter 1 (PAR_1), N = 72. Horizontal axis: estimate, from about -0.04 to 1.00. Vertical axis: number of observations, 0 to 140.]
Standard Errors (N = 300)
Distribution of Estimates
[Figure: histogram of estimates for Parameter 1 (PAR_1), N = 300. Horizontal axis: estimate, from about 0.38 to 0.85. Vertical axis: number of observations, 0 to 140.]
Correlational Pattern Hypotheses
“Pattern Hypothesis”
  – A statistical hypothesis that specifies that parameters or groups of parameters are equal to each other, and/or to specified numerical values
Advantages of Pattern Hypotheses
  – They concern only equality, so they are invariant under nonlinear monotonic transformations (e.g., the Fisher transform)
Correlational Pattern Hypotheses
Caution! You cannot use the Fisher transform to construct confidence intervals for differences of correlations
  – For an example of this error, see Glass and Stanley (1970, pp. 311-312).
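One way to see why the caution holds is a small numerical check. The Fisher transform preserves equality, but it does not preserve differences. Back-transforming a difference of z values gives (r1 - r2) / (1 - r1·r2), not r1 - r2. The values 0.8 and 0.5 below are arbitrary illustrations, not taken from the talk.

```python
import math

def fisher_z(r):
    """Fisher transform of a correlation (inverse hyperbolic tangent)."""
    return math.atanh(r)

r1, r2 = 0.8, 0.5  # arbitrary illustrative correlations

# Equality hypotheses survive the transform: z1 == z2 exactly when r1 == r2.
equal_on_both_scales = (fisher_z(r1) == fisher_z(r2)) == (r1 == r2)

# Differences do not: tanh(z1 - z2) is not r1 - r2.
raw_difference = r1 - r2                                # about 0.3
back_transformed = math.tanh(fisher_z(r1) - fisher_z(r2))
identity_value = (r1 - r2) / (1 - r1 * r2)              # tanh subtraction formula

print(raw_difference, back_transformed, identity_value)
```

Because the back-transformed quantity depends on the product r1·r2, an interval built on the z scale and mapped back through tanh is not an interval for the difference of the two correlations.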
Comparing Two Correlation Matrices in Two Independent Samples
Jennrich (1970)
  – Method of Maximum Likelihood (ML)
  – Method of Generalized Least Squares (GLS)
  – Example
      Two 11x11 matrices
      Sample sizes of 40 and 89
Comparing Two Correlation Matrices in Two Independent Samples
ML Approach
Minimizes ML discrepancy function
Can be programmed with standard SEM software packages that have multi-sample capability
$\Sigma_1 = D_1 R D_1; \quad \Sigma_2 = D_2 R D_2$
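A minimal sketch of the quantity being minimized, assuming the two-group model is Σ_g = D_g R D_g (a common correlation matrix R rescaled by group-specific diagonal matrices D_g) and the usual ML discrepancy F(S, Σ) = ln|Σ| - ln|S| + tr(SΣ⁻¹) - p. Weighting each group by N_g - 1 is one common convention and is an assumption here. The function names are hypothetical, and no optimizer is included.

```python
import numpy as np

def ml_discrepancy(s, sigma):
    """ML discrepancy between sample covariance s and model covariance sigma."""
    p = s.shape[0]
    _, logdet_sigma = np.linalg.slogdet(sigma)
    _, logdet_s = np.linalg.slogdet(s)
    return logdet_sigma - logdet_s + np.trace(s @ np.linalg.inv(sigma)) - p

def implied_cov(d, r):
    """Model covariance D R D for a vector of scale factors d."""
    dm = np.diag(d)
    return dm @ r @ dm

def two_group_statistic(s1, s2, n1, n2, d1, d2, r):
    """Weighted sum of group discrepancies (ASSUMPTION: weights N_g - 1)."""
    f1 = ml_discrepancy(s1, implied_cov(d1, r))
    f2 = ml_discrepancy(s2, implied_cov(d2, r))
    return (n1 - 1) * f1 + (n2 - 1) * f2

# Tiny check with a 2x2 example: a perfectly fitting model gives zero.
r = np.array([[1.0, 0.5], [0.5, 1.0]])
d1, d2 = np.array([1.0, 2.0]), np.array([3.0, 0.5])
perfect = two_group_statistic(implied_cov(d1, r), implied_cov(d2, r), 40, 89, d1, d2, r)
```

An SEM program with multi-sample capability minimizes this kind of function over D_1, D_2, and R.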
Comparing Two Correlation Matrices in Two Independent Samples
Generalized Least Squares Approach
Minimizes GLS discrepancy function
SEM programs will iterate the solution
Freeware (Steiger, 2005, in press) will perform direct analytic solution
Monte Carlo Results – Chi-Square Statistic
Mean S.D.
Observed 75.8 13.2
Expected 66 11.5
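The Expected row is consistent with a chi-square reference distribution whose mean equals its degrees of freedom and whose standard deviation is the square root of twice the degrees of freedom. The value of 66 degrees of freedom used below is read off the table's expected mean. It is an inference, not a figure stated in the slides.

```python
import math

df = 66  # ASSUMPTION: inferred from the expected mean in the table

expected_mean = df                  # mean of a chi-square variate
expected_sd = math.sqrt(2 * df)     # standard deviation of a chi-square variate

observed_mean = 75.8                # from the Monte Carlo table
inflation = observed_mean / expected_mean

print(expected_mean, round(expected_sd, 1), round(inflation, 2))
```

The ratio of observed to expected mean gives a quick sense of how far the statistic's sampling distribution has drifted from its nominal reference distribution at these sample sizes.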
Monte Carlo Results – Distribution of p-Values
[Figure: histogram of p-values, Comparing Two Correlation Matrices (ML), N1 = 40, N2 = 89. Horizontal axis: p value. Vertical axis: number of observations.]
Monte Carlo Results – Distribution of Chi-Square Statistics
[Figure: observed vs. expected frequencies of the chi-square statistic.]
Monte Carlo Results (ML) – Empirical vs. Nominal Type I Error Rate
Nominal .010 .050
Empirical .076 .208
Monte Carlo Results (ML) – Empirical vs. Nominal Type I Error Rate, N = 250 per Group
Nominal .010 .050
Empirical .011 .068
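Empirical rates like these are simply the share of Monte Carlo replications whose p-value falls below the nominal level. The sketch below shows that calculation together with a binomial Monte Carlo standard error, which helps judge whether a gap such as .068 versus .050 exceeds simulation noise. The slides do not report the number of replications, so the demonstration uses simulated uniform p-values (the ideal null case) rather than the talk's results. `empirical_type1` is a hypothetical helper name.

```python
import numpy as np

def empirical_type1(p_values, alpha):
    """Return (rejection rate, Monte Carlo standard error) at level alpha."""
    p = np.asarray(p_values, dtype=float)
    rate = float(np.mean(p < alpha))
    se = float(np.sqrt(rate * (1.0 - rate) / p.size))
    return rate, se

# Demonstration only: under a correct null, p-values are uniform on (0, 1).
rng = np.random.default_rng(0)
null_p_values = rng.uniform(size=20000)

for alpha in (0.01, 0.05):
    rate, se = empirical_type1(null_p_values, alpha)
    print(alpha, rate, se)
```

With well-behaved p-values the empirical rate sits within a few standard errors of the nominal level. A rate many standard errors above it points to a statistic that rejects too often at that sample size.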
Monte Carlo Results – Chi-Square Statistic, N = 250 per Group
Mean S.D.
Observed 67.7 11.6
Expected 66 11.5
Kenny-Zautra TSE Model
[Figure: path diagram of the TSE model, with trait factor T, occasion factors O1, O2, O3, …, OJ, and observed variables Y1, Y2, Y3, …, YJ.]
Likelihood of Improper Values in the TSE Model
Constraint Interaction
Steiger, J.H. (2002). When constraints interact: A caution about reference variables, identification constraints, and scale dependencies in structural equation modeling. Psychological Methods, 7, 210-227.
Constraint Interaction
[Figure: path diagram. Observed variables X1 to X6 (Respondent's and Best Friend's Parental Aspiration, Intelligence, and Socioeconomic Status) and Y1 to Y4 (Respondent's and Best Friend's Occupational and Educational Aspiration); latent variables Respondent's Ambition and Best Friend's Ambition.]
Constraint Interaction
[Figure: path diagram with observed variables X1, X2 and Y1, Y2, Y3, Y4, with loadings 1,1 and 3,2 fixed to 1.]
Constraint Interaction
[Figure: path diagram with observed variables X1, X2 and Y1, Y2, Y3, Y4, with no loadings fixed to 1.]
Constraint Interaction – Model without ULI Constraints (Constrained Estimation)
(XI1)-1->[X1] (XI1)-2->[X2] (XI1)-{1}-(XI1)
(DELTA1)-->[X1] (DELTA2)-->[X2]
(DELTA1)-3-(DELTA1) (DELTA2)-4-(DELTA2)
(ETA1)-98->[Y1] (ETA1)-5->[Y2]
(ETA2)-99->[Y3] (ETA2)-6->[Y4]
(EPSILON1)-->[Y1] (EPSILON2)-->[Y2] (EPSILON3)-->[Y3] (EPSILON4)-->[Y4]
(EPSILON1)-7-(EPSILON1) (EPSILON2)-8-(EPSILON2) (EPSILON3)-9-(EPSILON3) (EPSILON4)-10-(EPSILON4)
(ZETA1)-->(ETA1) (ZETA2)-->(ETA2)
(ZETA1)-11-(ZETA1) (ZETA2)-12-(ZETA2)
(XI1)-13->(ETA1) (XI1)-13->(ETA2) (ETA1)-15->(ETA2)
Constraint Interaction – Model With ULI Constraints
(XI1)-->[X1] (XI1)-2->[X2] (XI1)-1-(XI1)
(DELTA1)-->[X1] (DELTA2)-->[X2]
(DELTA1)-3-(DELTA1) (DELTA2)-4-(DELTA2)
(ETA1)-->[Y1] (ETA1)-5->[Y2]
(ETA2)-->[Y3] (ETA2)-6->[Y4]
(EPSILON1)-->[Y1] (EPSILON2)-->[Y2] (EPSILON3)-->[Y3] (EPSILON4)-->[Y4]
(EPSILON1)-7-(EPSILON1) (EPSILON2)-8-(EPSILON2) (EPSILON3)-9-(EPSILON3) (EPSILON4)-10-(EPSILON4)
(ZETA1)-->(ETA1) (ZETA2)-->(ETA2)
(ZETA1)-11-(ZETA1) (ZETA2)-12-(ZETA2)
(XI1)-13->(ETA1) (XI1)-13->(ETA2) (ETA1)-15->(ETA2)
Typical Characteristics of Statistical Computing Cycles
Back-loaded
  – Occur late in the research cycle, after data are gathered
Reactive
  – Often occur in support of analytic activities that are reactions to previous analysis results
Traditional Statistical World-View
Data come first
Analyses come second
Analyses are well-understood and will work
Before the data arrive, there is nothing to analyze and no reason to start analyzing
Modern Statistical World View
Planning comes first
  – Power Analysis, Precision Analysis, etc.
Planning may require some substantial computing
  – Goal is to estimate required sample size
Data analysis must wait for data
Proactive SEM Statistical World View
SEM involves interaction between specific model(s) and data
  – Some models may not “work” with many data sets
Planning involves:
  – Power Analysis
  – Precision Analysis
  – Confirming Identification
  – Proactive Analysis of Model Performance
Without proper proactive analysis, research can be stopped cold with an “unhappy surprise.”
Barriers
Software
  – Design
  – Availability
Education