Date posted: 22-Dec-2015
FACTOR ANALYSIS
LECTURE 11
EPSY 625
PURPOSES
SUPPORT VALIDITY OF TEST SCALE WITH RESPECT TO UNDERLYING TRAITS (FACTORS)
• EFA- EXPLORE/UNDERSTAND UNDERLYING FACTORS FOR A TEST
• CFA- CONFIRM THEORETICAL STRUCTURE IN A TEST
HISTORICAL DEVELOPMENT
PEARSON (1901): eigenvalue/eigenvector problem (dimensional reduction), the “method of principal axes”
SPEARMAN (1904): “General Intelligence, Objectively Determined and Measured”
Others: Burt, Thomson, Garnett, Holzinger, Harman, Thurstone
FACTOR MODELS
[Table: factor models classified by whether SUBJECTS and VARIABLES are FIXED or a SAMPLE. Entries: principal components, common factors; image; alpha factor analysis; canonical factor analysis]
EXPLORATORY FACTOR ANALYSIS
USE PRINCIPAL AXIS METHOD:
• ASSUMES THERE ARE 3 VARIANCE COMPONENTS IN EACH ITEM:
• COMMUNALITY ($h^2$)
• UNIQUENESS:
  • SPECIFICITY ($s^2$)
  • ERROR ($e^2$)
SINGLE FACTOR
REQUIRES AT LEAST 3 ITEMS OR MEASUREMENTS TO UNIQUELY DETERMINE
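[Figure: path diagram of one FACTOR with arrows to ITEM1, ITEM2, ITEM3; loadings .7, .8, .6; error (e) paths .714, .6, .8]
PATH FROM FACTOR TO ITEM IS CALLED THE FACTOR LOADING: THE CORRELATION BETWEEN ITEM AND FACTOR
SPECIFICITY ASSUMED = 0 FOR PARALLEL ITEMS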
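[Figure repeated: FACTOR with loadings .7, .8, .6 on ITEM1, ITEM2, ITEM3; error (e) paths .714, .6, .8; specificity assumed = 0 for parallel items]
ALPHA = SPEARMAN-BROWN STEPPED-UP AVERAGE INTER-ITEM CORRELATION:
mean r = (.56 + .42 + .48)/3 = .49
ALPHA = 3(.49)/[1 + 2(.49)] = .74
ERROR PATH FOR ITEM1: .714 = $\sqrt{1 - .7^2}$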
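The alpha arithmetic above can be checked with a short sketch. It assumes a single-factor model in which each inter-item correlation is the product of the two loadings; the function name `standardized_alpha` is illustrative, not from the lecture.

```python
from itertools import combinations

def standardized_alpha(loadings):
    """Spearman-Brown stepped-up mean inter-item correlation (one-factor model)."""
    k = len(loadings)
    # Under a single factor, r(i, j) is the product of the two loadings.
    pair_rs = [a * b for a, b in combinations(loadings, 2)]
    mean_r = sum(pair_rs) / len(pair_rs)
    return k * mean_r / (1 + (k - 1) * mean_r)

alpha = standardized_alpha([0.7, 0.8, 0.6])
print(round(alpha, 2))
```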
TWO FACTORS
NEED AT LEAST 2 ITEMS OR MEASUREMENTS PER FACTOR, ASSUMING FACTORS ARE CORRELATED
[Figure: path diagram of FACTOR 1 and FACTOR 2 with correlation between factors .5; ITEM1 and ITEM2 under FACTOR 1, ITEM 3 and ITEM 4 under FACTOR 2; loadings .7, .8, .6, .7; each item has an error term e]
CORRELATION BETWEEN ANY TWO ITEMS = PRODUCT OF ALL PATHS BETWEEN THEM;
EX. R(ITEM1, ITEM4) = .7 x .5 x .7 = .245
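A minimal sketch of the path-tracing rule in matrix form follows. It assumes ITEM1 and ITEM2 load on FACTOR 1 (.7, .8) and ITEM 3 and ITEM 4 on FACTOR 2 (.6, .7); that assignment is read off the diagram order and is an assumption, as are the variable names.

```python
import numpy as np

# Assumed pattern: items 1-2 on factor 1, items 3-4 on factor 2.
loadings = np.array([[0.7, 0.0],
                     [0.8, 0.0],
                     [0.0, 0.6],
                     [0.0, 0.7]])
phi = np.array([[1.0, 0.5],    # correlation between the two factors
                [0.5, 1.0]])

# Off-diagonal entries are the products of the paths connecting two items.
implied = loadings @ phi @ loadings.T
r14 = implied[0, 3]            # ITEM1 with ITEM4: .7 x .5 x .7
print(round(r14, 3))
```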
SIMPLE STRUCTURE
TRY TO CREATE SCALE IN WHICH EACH ITEM CORRELATES WITH ONLY ONE FACTOR:
ITEM      FACTOR
          1  2  3
ITEM 1    1  0  0
ITEM 2    1  0  0
ITEM 3    0  1  0
ETC
CRITERIA FOR SIMPLE STRUCTURE
Structural equation modeling provides chi square test of fit
Compares observed covariance (correlation) matrix with predicted/fitted matrix
Alternatively, look at RMSEA (Root mean square error of approximation) of deviations from fitted matrix
MATHEMATICAL MODEL
Z = persons by variables matrix (p x k) of standardized variables (mean = 0, SD = 1)
Z'Z = NR, where R is the k x k covariance (correlation) matrix
$z_i = a_i F_i + e_i$
MATHEMATICAL MODEL
$Z = AF + U = C + U$
$R = ZZ'/N = A(FF'/N)A' + U^2$
$S = ZF'/N$ (structure matrix: correlations between Z and F)
$= AFF'/N = A\Phi$, where $\Phi = FF'/N$ (correlations among factors) and A = pattern matrix
MATHEMATICAL MODEL
$S = A\Phi$, so $A = S\Phi^{-1}$, where $\Phi$ is the factor correlation matrix
(If factors are uncorrelated, $\Phi = I$ and A = S: pattern matrix = structure matrix)
$R = ZZ'/N = CC'/N + U^2$
MATHEMATICAL MODEL
If we take the covariance matrix of F to be diagonal, and the metric of the variances of $F_i$ to be 1.0 (so that $FF'/N = I$),
$R - U^2 = AA' = SA' = AS'$
MATHEMATICAL MODEL
Now let $z_i = a_i F_i + s_i + e_i$
Let $\acute{R} = R - D^2$, where $D^2$ is a diagonal matrix of specificity and error variances: $s_i^2 + e_i^2$
Then $\acute{R} = AFF'A'/N = A\Phi A' = SA' = AS'$; if $\Phi = I$, $\acute{R} = AA'$
MATHEMATICAL MODEL
How do we estimate $s_i^2$?
Instead, estimate $[R - U^2]_{ii} = 1 - s_i^2 - e_i^2$
Consider for each $z_i$ that it is predictable from the rest:
$z_i = b_1 z_1 + b_2 z_2 + \dots + b_{i-1} z_{i-1} + \dots$
Then $R_i^2$ = variance common to all other variables (squared multiple correlation, or SMC)
$h_i^2$ = communality for item i
Due to Dwyer (1939)
MATHEMATICAL MODEL
SMC is estimable from the observed data, so that $\acute{R} = R - [1 - SMC_i]$
where $[SMC_i]$ = diagonal matrix with the SMCs for each variable on the diagonal and zeros off-diagonal
Theorem states: “SMCs guarantee that the number of factors ≥ # eigenvalues > 1.0”
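As a hedged illustration of the SMC step, the sketch below uses the standard identity $SMC_i = 1 - 1/[R^{-1}]_{ii}$, which the slides do not state, and then places the SMCs on the diagonal of R. The 3 x 3 matrix reuses the inter-item correlations from the single-factor example; the function names are illustrative.

```python
import numpy as np

def smc(R):
    """Squared multiple correlation of each variable with all the others."""
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))

def reduced_matrix(R):
    """Copy of R with the unit diagonal replaced by SMCs (communality estimates)."""
    R_reduced = R.copy()
    np.fill_diagonal(R_reduced, smc(R))
    return R_reduced

R = np.array([[1.00, 0.56, 0.42],
              [0.56, 1.00, 0.48],
              [0.42, 0.48, 1.00]])
R_reduced = reduced_matrix(R)
print(np.round(np.diag(R_reduced), 3))
```

Only the diagonal changes; the off-diagonal correlations are left as observed.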
MATHEMATICAL MODEL
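$\acute{R} = R - I + \begin{pmatrix} R^2_{1.234\ldots} & 0 & 0 & 0 & \cdots \\ 0 & R^2_{2.134\ldots} & 0 & 0 & \cdots \\ 0 & 0 & R^2_{3.124\ldots} & 0 & \cdots \\ 0 & 0 & 0 & R^2_{4.123\ldots} & \cdots \end{pmatrix}$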
MATHEMATICAL MODEL
SOLUTIONS: PRINCIPAL COMPONENTS ($R = \acute{R}$)
$Rq = \lambda q$
$RQ = Q\Lambda$, $\Lambda = \mathrm{diagonal}[\lambda_i]$
$Q^{-1}RQ = \Lambda$; $QQ' = I$, so $Q^{-1} = Q'$
$Q'RQ = \Lambda$ (Spectral Theorem)
MATHEMATICAL MODEL
SOLUTIONS: PRINCIPAL AXIS
$(\acute{R} - \lambda I)q = 0$
That is, solve for the first eigenvalue from $|\acute{R} - \lambda I| = 0$, solved by $R^m q = \lambda^m q$
Begin with m = 2: $R^2 q = \lambda^2 q$, then put the solution in $R(Rq_1) = \lambda^2 q_1$; iterate for m = 4
MATHEMATICAL MODEL
Now compute the residual correlation matrix: $R_1^2 = R^2 - \acute{R}$, iterate
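The repeated-multiplication idea ($R^m q = \lambda^m q$) can be sketched as a plain power iteration followed by one deflation step. This is an illustrative sketch, not the lecture's exact procedure: it works on a full correlation matrix with unities on the diagonal, and the residual matrix is formed as $R - \lambda_1 q_1 q_1'$, which is the usual deflation and an assumption here.

```python
import numpy as np

def power_iteration(M, n_iter=1000, tol=1e-13):
    """Largest eigenvalue and eigenvector of a symmetric matrix by repeated multiplication."""
    q = np.ones(M.shape[0]) / np.sqrt(M.shape[0])
    lam = 0.0
    for _ in range(n_iter):
        w = M @ q                    # apply one more power of M to q
        lam_new = float(q @ w)       # Rayleigh quotient estimate of the eigenvalue
        q = w / np.linalg.norm(w)
        if abs(lam_new - lam) < tol:
            break
        lam = lam_new
    return lam_new, q

R = np.array([[1.00, 0.56, 0.42],
              [0.56, 1.00, 0.48],
              [0.42, 0.48, 1.00]])

lam1, q1 = power_iteration(R)
R1 = R - lam1 * np.outer(q1, q1)     # residual matrix after removing the first axis
lam2, q2 = power_iteration(R1)
print(round(lam1, 3), round(lam2, 3))
```

To follow the principal axis method, the same function could be applied to the reduced matrix with communality estimates on the diagonal.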
EIGENVALUES
$\lambda_i$ = variance of ith factor
$\lambda_i / \sum_i \lambda_i$ = proportion of total variance accounted for by the ith factor
$\lambda_i < 1$: chance factor
Scree plot (eigenvalue by factor, ordered from greatest to lowest)
[Figure: SCREE PLOT; vertical axis eigenvalue from 0 to K with a reference line at 1.0; horizontal axis factor number 1, 2, 3, ..., K]
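A small sketch of the eigenvalue summaries above: ordered eigenvalues (the values a scree plot would show), proportion of total variance, and the count of eigenvalues above 1.0. The 4 x 4 correlation matrix is illustrative; its off-diagonal values are borrowed from the two-factor example later in the lecture, with unities placed on the diagonal as an assumption.

```python
import numpy as np

R = np.array([[1.00, 0.72, 0.57, 0.50],
              [0.72, 1.00, 0.51, 0.45],
              [0.57, 0.51, 1.00, 0.72],
              [0.50, 0.45, 0.72, 1.00]])

eigvals = np.linalg.eigvalsh(R)[::-1]      # ordered from greatest to lowest
proportion = eigvals / eigvals.sum()       # share of total variance per factor
n_retained = int(np.sum(eigvals > 1.0))    # eigenvalues above the 1.0 line

for i, (lam, prop) in enumerate(zip(eigvals, proportion), start=1):
    print(f"factor {i}: eigenvalue {lam:.3f}, proportion {prop:.3f}")
print("eigenvalues > 1.0:", n_retained)
```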
ROTATION
MEANING
CRITERION: SIMPLE STRUCTURE, POSITIVE MANIFOLD
B = AT
A = INITIAL FACTOR MATRIX
T = TRIANGULAR MATRIX
B = FINAL FACTOR MATRIX
TT' = $\Phi$ (FACTOR CORRELATION MATRIX)
VARIMAX ROTATION (uncorrelated Factors)
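ORTHOGONAL (RIGID) ROTATION
Maximize $V = \sum_p \left[ n \sum_j (b_{jp}/h_j)^4 - \left( \sum_j b_{jp}^2/h_j^2 \right)^2 \right]$
Geometric problem: $(X, Y) = (x, y) \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$
$u_j = x_j^2 - y_j^2$
$v_j = 2 x_j y_j$
$A = \sum_j u_j$
$B = \sum_j v_j$
$C = \sum_j (u_j^2 - v_j^2)$
$D = 2 \sum_j u_j v_j$
Solve $\tan 4\theta = [D - 2AB/n] / [C - (A^2 - B^2)/n]$, with $-45^\circ \le \theta \le 45^\circ$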
[Figure: Orthogonal (perpendicular) Rotation of Axes; axes: unrotated Factor 1 loading values, unrotated Factor 2 loading values]
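The following is a hedged sketch of varimax rotation. It does not use the pairwise angle formula from the slides; it swaps in the widely used SVD-based iteration, and it maximizes the raw criterion (no division of loadings by $h_j$). The loading matrix is made up for illustration: a simple structure rotated by 30 degrees, which the rotation should approximately undo.

```python
import numpy as np

def varimax_criterion(B):
    """Raw varimax criterion: n * sum(b^4) - (sum(b^2))^2, summed over factors."""
    n = B.shape[0]
    sq = B ** 2
    return float(np.sum(n * np.sum(sq ** 2, axis=0) - np.sum(sq, axis=0) ** 2))

def varimax(A, max_iter=200, tol=1e-10):
    """Orthogonal rotation B = A T that increases the raw varimax criterion."""
    n, k = A.shape
    T = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        B = A @ T
        # Gradient-like matrix of the criterion with respect to the rotation.
        G = A.T @ (B ** 3 - B @ np.diag(np.sum(B ** 2, axis=0)) / n)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                   # nearest orthogonal matrix to G
        if s.sum() < previous * (1 + tol):
            break
        previous = s.sum()
    return A @ T, T

# Hypothetical simple structure, then rotated 30 degrees to hide it.
A0 = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
               [0.0, 0.7], [0.0, 0.8], [0.0, 0.6]])
theta = np.deg2rad(30.0)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
A = A0 @ rot
B, T = varimax(A)
print(np.round(B, 2))
```

Because T is orthogonal, the communalities (row sums of squared loadings) are unchanged by the rotation; only their distribution across factors changes.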
OBLIQUE SOLUTION (correlated Factors)
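MINIMIZE S (OBLIMIN):
$S = \sum_{p<g} \left[ n \sum_j (v_{jp}^2/h_j^2)(v_{jg}^2/h_j^2) - \left( \sum_j v_{jp}^2/h_j^2 \right) \left( \sum_j v_{jg}^2/h_j^2 \right) \right]$
PROMAX:
• Start with VARIMAX, B = AT, transform with $v_{jp} = (b_{jp}^4)/b_{jp}$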
FACTOR CORRELATION
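$\Phi = TT'$
$T_{ij} = \begin{pmatrix} \cos\theta_{ij} & -\sin\theta_{ij} \\ \sin\theta_{ij} & \cos\theta_{ij} \end{pmatrix}$
$r_{ij} = [\cos\theta_{ij}(-\sin\theta_{ij})] + [\sin\theta_{ij}\cos\theta_{ij}] = T_{11}T_{12} + T_{21}T_{22}$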
FACTOR CORRELATION
$S = P\Phi$ (Structure matrix = Pattern matrix x factor correlation matrix)
$P = A(T')^{-1}$
$A = PT'$
[Figure: Oblique Rotation of Axes, with angle $\theta_{ij}$ between axes]
ALPHA FACTOR ANALYSIS
Estimates population $h_i^2$ for each variable
Little different from common factors
Canonical Factor Analysis
Uses canonical analysis to maximize R between factors and variables, iterative
Maximum Likelihood analysis
Image Analysis
$h_i^2 = R^2_{i.1,2,\ldots,K}$
$p_j = \sum_k w_{jk} z_k$ (standard regression)
$e_j = z_j - p_j$, called the anti-image
$Var(e_j) > Var(\epsilon_j)$, where $Var(\epsilon_j)$ = anti-image for the regression of $z_j$ on the factors $F_1, F_2, \ldots, F_K$
FACTOR CONGRUENCE
Alternative to Confirmatory Analysis for two groups hypothesized to have the same factor structure:
$S_{pq} = \sum_j a_{jp} b_{jq} / \sqrt{\sum_j a_{jp}^2 \sum_j b_{jq}^2}$
This is basically the correlation between factor loadings on the comparable factors for two groups
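A minimal sketch of the congruence coefficient for one pair of comparable factors. The two loading vectors are hypothetical values for illustration only, and the function name is not from the lecture.

```python
import math

def congruence(a, b):
    """Sum of cross-products of loadings over the root of the product of sums of squares."""
    numerator = sum(x * y for x, y in zip(a, b))
    denominator = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return numerator / denominator

# Hypothetical loadings of four variables on the same factor in two groups.
group_a = [0.9, 0.8, 0.1, 0.2]
group_b = [0.85, 0.75, 0.2, 0.1]
print(round(congruence(group_a, group_b), 3))
```

Unlike an ordinary correlation, the loadings are not centered, so proportional loading vectors give a coefficient of exactly 1.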
Example of 2 factor structure
Achievement (reading, math) and IQ (verbal, nonverbal)
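quasi-multitrait multimethod analysis:
• reading is verbal
• math is “nonverbal”
[Figure: path diagram of two correlated factors, Ach and Apt (correlation .7); Ach loads on Reading (.9) and Arithmetic (.8), Apt loads on Verbal (.9) and Nonverbal (.8); error (e) paths of about .43 and .6]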
Factor Structure
        F1    F2
R       .9    .63
A       .8    .56
V       .63   .9
NV      .56   .8
Reduced Correlation Matrix
        R     A     V     NV
R       .81   .72   .57   .50
A             .64   .51   .45
V                   .81   .72
NV                        .64
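The factor structure table can be reproduced from the diagram values with a short sketch. It assumes a pattern matrix with Reading and Arithmetic loading only on the first factor, Verbal and Nonverbal only on the second, and a factor correlation of .7, and applies structure = pattern x factor correlation. The products are unrounded, so some reduced-matrix entries differ from the printed table in the second decimal.

```python
import numpy as np

labels = ["R", "A", "V", "NV"]
pattern = np.array([[0.9, 0.0],
                    [0.8, 0.0],
                    [0.0, 0.9],
                    [0.0, 0.8]])
phi = np.array([[1.0, 0.7],
                [0.7, 1.0]])

structure = pattern @ phi             # correlations of variables with factors
reduced = pattern @ phi @ pattern.T   # communalities on the diagonal

for name, row in zip(labels, structure):
    print(name, np.round(row, 2))
print(np.round(reduced, 3))
```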
Revised Model with additional specificities
[Figure: the same two-factor path diagram (Ach, Apt; Reading, Arithmetic, Verbal, Nonverbal; loadings .9, .8, .9, .8; factor correlation .7; error paths .43, .6, .43, .6) with additional paths labeled .32, .40, .37]
Factor Structure
        F1    F2
R       .9    .63
A       .8    .56
V       .63   .9
NV      .56   .8
Reduced Correlation Matrix
        R     A     V     NV
R       .91   .72   .63   .51
A             .92   .65   .61
V                   .99   .72
NV                        .80
CONFIRMATORY FACTOR ANALYSIS
BASIC PRINCIPLES
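$\Sigma_{xx} = E(xx')$
$\Sigma_{xx} = \begin{pmatrix} \sigma^2_{x_1} & & \\ \sigma_{x_1 x_2} & \sigma^2_{x_2} & \\ \sigma_{x_1 x_3} & \sigma_{x_2 x_3} & \sigma^2_{x_3} \end{pmatrix}$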
BASIC PRINCIPLES
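$\sigma^2_{x_1} = \lambda_1^2 \phi_{11} + \sigma^2_{\delta_1}$
$\sigma^2_{x_k} = \lambda_k^2 \phi_{11} + \sigma^2_{\delta_k}$
$\sigma_{x_1 x_k} = \lambda_1 \phi_{11} \lambda_k$
[Figure: path diagram of factor $\xi_1$ with indicators $x_1$ and $x_k$, loadings $\lambda_1$ and $\lambda_k$, and errors $\delta_1$ and $\delta_k$]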
IDENTIFICATION RULES
t-rule: $t \le q(q+1)/2$, t = # free parameters, q = # manifest variables
• necessary but not sufficient
3-indicator rule: 1 factor, ≥ 3 indicators
• sufficient but not necessary
2-indicator rule: 2+ factors, ≥ 2 indicators each
local vs. global identification:
• local: sample estimates of parameters independent; necessary but not sufficient
• global: population parameters independent; necessary and sufficient
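The t-rule is a simple counting check, sketched below. The example counts assume a one-factor model with the factor variance fixed at 1, so the free parameters are one loading and one error variance per indicator; that scaling choice and the function name are assumptions for illustration.

```python
def t_rule_satisfied(n_free_parameters, q):
    """Necessary (not sufficient) condition: t <= q(q+1)/2 distinct variances and covariances."""
    return n_free_parameters <= q * (q + 1) // 2

# One factor, factor variance fixed at 1: q loadings plus q error variances.
for q in (2, 3, 4):
    t = 2 * q
    print(q, "indicators:", t, "parameters,", q * (q + 1) // 2, "moments,",
          "passes" if t_rule_satisfied(t, q) else "fails")
```

With three indicators the counts are equal, which is consistent with the 3-indicator rule above.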
ESTIMATION
MODEL EVALUATION
• FIT: $F_{ML}$ used to evaluate $\hat{\Sigma}$, S
• Residuals: $E = S - \hat{\Sigma}$
• RMR = $SD(s_{ij} - \hat{\sigma}_{ij})$
• RMSEA = $\sqrt{(\chi^2/df - 1)/(N - 1)}$
• note: factor analyze E; should be 0
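A minimal sketch of the RMSEA formula as written above. Truncating at zero when the chi-square is smaller than its degrees of freedom is a common convention and an assumption here; the input values are hypothetical.

```python
import math

def rmsea(chi_square, df, n):
    """Root mean square error of approximation: sqrt((chi2/df - 1) / (N - 1))."""
    # Assumed convention: report 0 when chi-square is below its degrees of freedom.
    excess = max(chi_square / df - 1.0, 0.0)
    return math.sqrt(excess / (n - 1))

# Hypothetical fit results: chi-square = 50 on 25 df with N = 201.
print(round(rmsea(50.0, 25, 201), 4))
```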
Hancock’s Formula- reliability for a given factor
$H_j = 1 / [1 + 1/\sum_i (l_{ij}^2/(1 - l_{ij}^2))]$
Ex. $l_1 = .7$, $l_2 = .8$, $l_3 = .6$
H = 1 / [1 + 1/(.49/.51 + .64/.36 + .36/.64)]
= 1 / [1 + 1/(.96 + 1.78 + .56)]
= 1 / (1 + 1/3.30)
= .77
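Hancock's formula is easy to evaluate directly; the sketch below applies it to the example loadings and also checks the equal-loadings case against the Spearman-Brown form. The function names are illustrative.

```python
def hancock_h(loadings):
    """H = 1 / (1 + 1 / sum(l^2 / (1 - l^2))) for the loadings on one factor."""
    ratio_sum = sum(l ** 2 / (1 - l ** 2) for l in loadings)
    return 1.0 / (1.0 + 1.0 / ratio_sum)

def spearman_brown(rho, k):
    """Reliability of k parallel items, each with reliability rho."""
    return k * rho / (1 + (k - 1) * rho)

h = hancock_h([0.7, 0.8, 0.6])
print(round(h, 3))

# With equal loadings, H reduces to the Spearman-Brown value.
print(round(hancock_h([0.7, 0.7, 0.7]), 6), round(spearman_brown(0.49, 3), 6))
```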
Hancock’s Formula Explained
$H_j = 1 / [1 + 1/\sum_i (l_{ij}^2/(1 - l_{ij}^2))]$
now assume strict parallelism: then $l_{ij}^2 = \rho_{xt}^2$
thus $H_j = 1 / [1 + 1/\sum_i (\rho_{xt}^2/(1 - \rho_{xt}^2))]$
$= k\rho_{xt}^2 / [1 + (k-1)\rho_{xt}^2]$
= Spearman-Brown formula
TEST
$(N-1)F_{ML} \sim \chi^2$ with $df = q(q+1)/2 - t$
used for nested model: model with one or more restrictions from original
restriction = known parameter, equality of two or more parameters
Proof: Bollen shows $-2\log(L_0/L_1) = (N-1)F_{ML}$
where $L_0$ is the restricted and $L_1$ the unrestricted model
INCREMENTAL FIT
Bentler and Bonett: $\Delta_1 = (F_b - F_m)/F_b = (\chi^2_b - \chi^2_m)/\chi^2_b$
can be used to compare improvements over original model or against a standard or baseline
Bentler & Bonett baseline conventions: $\Sigma_b = S$
Alternatives: $\Sigma_b = [.5]$ or $\Sigma_b$ = from a previous study
example: Willson & Rupley (1997) was used by Nichols (1997) dissertation
Bollen's fit index: $\Delta_2 = (F_b - F_m)/[F_b - df_m/(N-1)] = (\chi^2_b - \chi^2_m)/(\chi^2_b - df_m)$
Logic: the difference in the numerator has expected value equal to the denominator
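Both incremental fit indices reduce to arithmetic on the baseline and model chi-squares, as in this sketch. The chi-square and df values are hypothetical, and the function names are illustrative.

```python
def delta1(chi2_baseline, chi2_model):
    """Bentler-Bonett index: proportional reduction in chi-square from the baseline."""
    return (chi2_baseline - chi2_model) / chi2_baseline

def delta2(chi2_baseline, chi2_model, df_model):
    """Bollen's index: same numerator, denominator reduced by the model df."""
    return (chi2_baseline - chi2_model) / (chi2_baseline - df_model)

# Hypothetical values: baseline chi-square 500, model chi-square 50 on 25 df.
print(round(delta1(500.0, 50.0), 3))
print(round(delta2(500.0, 50.0, 25), 3))
```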
AMOS SEM PROGRAM
Uses SPSS to input data: select SPSS file
Draw factor model
• Circles for factors, boxes for observed variables
• Arrows from circles to boxes to indicate loadings
• Errors for each box (special drawing character)
• Label all circles and boxes with names- SPSS variable names for boxes, your own name for factors and circles
• Correlate factors with curved arrows as needed
[Figure: AMOS Drawing; factor F1 with observed variables ANX, DEP, SE and error terms e1, e2, e3]