Chapter 9 Factor Analysis
Introduction
The essential purpose of factor analysis is to describe, if possible, the covariance relationships among many variables in terms of a few underlying but unobservable random quantities called factors.
Suppose variables can be grouped by their correlations. Variables in the same group are highly correlated while variables in different groups have relatively small correlations.
It is conceivable that each group of variables represents a single underlying construct, or factor, that is responsible for the observed correlations.
For example, correlations from the group of test scores in classics, French, English, mathematics, and music collected by Spearman suggested an underlying “intelligence” factor.
Factor analysis can be considered as an extension of principal component analysis.
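7.1 The Orthogonal Factor Model (pp. 477-482)
The observable random vector $X$ has $p$ components, with mean vector and covariance matrix
$$E(X) = \mu \ (p \times 1), \qquad \operatorname{Cov}(X) = \Sigma = \{\sigma_{ij}\} \ (p \times p).$$
The factor model postulates that
$$\begin{aligned}
X_1 - \mu_1 &= l_{11}F_1 + \cdots + l_{1m}F_m + \varepsilon_1 \\
&\ \ \vdots \\
X_p - \mu_p &= l_{p1}F_1 + \cdots + l_{pm}F_m + \varepsilon_p
\end{aligned}$$
$F_1, \ldots, F_m$: common factors
$\varepsilon_1, \ldots, \varepsilon_p$: specific factors
$l_{ij}$: the loading of the $i$th variable on the $j$th factor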
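The model in matrix form
$$\underset{(p \times 1)}{X - \mu} = \underset{(p \times m)}{L}\ \underset{(m \times 1)}{F} + \underset{(p \times 1)}{\varepsilon}$$
$$E(F) = 0, \qquad \operatorname{Cov}(F) = E(FF') = I \ (m \times m)$$
$$E(\varepsilon) = 0, \qquad \operatorname{Cov}(\varepsilon) = E(\varepsilon\varepsilon') = \Psi, \ \text{a diagonal matrix}$$
$F$ and $\varepsilon$ are independent.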
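Note
$$(X - \mu)(X - \mu)' = (LF + \varepsilon)(LF + \varepsilon)' = (LF + \varepsilon)\big((LF)' + \varepsilon'\big) = LF(LF)' + \varepsilon(LF)' + LF\varepsilon' + \varepsilon\varepsilon'$$
so that
$$\Sigma = \operatorname{Cov}(X) = E(X - \mu)(X - \mu)' = L\,E(FF')\,L' + E(\varepsilon F')\,L' + L\,E(F\varepsilon') + E(\varepsilon\varepsilon') = LL' + \Psi.$$
The cross terms vanish because $F$ and $\varepsilon$ are independent with mean zero, so $E(\varepsilon F') = 0$.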
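Covariance structure for the orthogonal factor model
1. $\operatorname{Cov}(X) = LL' + \Psi$, or
$$\operatorname{Var}(X_i) = l_{i1}^2 + \cdots + l_{im}^2 + \psi_i$$
$$\operatorname{Cov}(X_i, X_k) = l_{i1}l_{k1} + \cdots + l_{im}l_{km}$$
2. $\operatorname{Cov}(X, F) = L$, or
$$\operatorname{Cov}(X_i, F_j) = l_{ij}$$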
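The two covariance relations can be checked numerically by simulating the model. The following is a minimal sketch, assuming normally distributed factors and hypothetical values for $L$, $\Psi$ and $\mu$ (none of these numbers come from the slides); with a large sample, the sample covariance of $X$ should be close to $LL' + \Psi$ and the sample cross-covariance of $X$ and $F$ close to $L$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model: p = 3 variables, m = 2 common factors (illustrative values only).
L = np.array([[0.9, 0.1],
              [0.7, 0.5],
              [0.2, 0.8]])
psi = np.array([0.3, 0.2, 0.4])      # specific variances, diagonal of Psi
mu = np.array([1.0, -2.0, 0.5])

n = 200_000
F = rng.standard_normal((n, 2))                    # E(F) = 0, Cov(F) = I
eps = rng.standard_normal((n, 3)) * np.sqrt(psi)   # E(eps) = 0, Cov(eps) = diag(psi)
X = mu + F @ L.T + eps                             # each row satisfies X - mu = L F + eps

Sigma_model = L @ L.T + np.diag(psi)               # Cov(X) implied by the model
Sigma_sample = np.cov(X, rowvar=False)             # Cov(X) estimated from the draws

# Sample cross-covariance between X and F, to compare with L.
Xc = X - X.mean(axis=0)
Fc = F - F.mean(axis=0)
cov_XF = Xc.T @ Fc / (n - 1)

print(np.round(Sigma_model, 2))
print(np.round(Sigma_sample, 2))
print(np.round(cov_XF, 2))
```

The normality assumption is only a convenience for drawing random numbers; the covariance structure itself uses only the moment conditions stated above.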
That portion of the variance of the $i$th variable contributed by the $m$ common factors is called the $i$th communality. That portion of $\operatorname{Var}(X_i) = \sigma_{ii}$ due to the specific factor is often called the uniqueness, or specific variance. Denoting the $i$th communality by $h_i^2$, we have
$$\underbrace{\sigma_{ii}}_{\operatorname{Var}(X_i)} = \underbrace{l_{i1}^2 + l_{i2}^2 + \cdots + l_{im}^2}_{\text{communality}} + \underbrace{\psi_i}_{\text{specific variance}}$$
or
$$h_i^2 = l_{i1}^2 + l_{i2}^2 + \cdots + l_{im}^2$$
and
$$\sigma_{ii} = h_i^2 + \psi_i, \qquad i = 1, 2, \ldots, p.$$
The $i$th communality is the sum of squares of the loadings of the $i$th variable on the $m$ common factors.
Example 9.1: Verifying the relation $\Sigma = LL' + \Psi$ for two factors (pp. 480-481)
Consider the covariance matrix
$$\Sigma = \begin{bmatrix} 19 & 30 & 2 & 12 \\ 30 & 57 & 5 & 23 \\ 2 & 5 & 38 & 47 \\ 12 & 23 & 47 & 68 \end{bmatrix}$$
By matrix algebra, we can verify the equality $\Sigma = LL' + \Psi$ as
$$\begin{bmatrix} 19 & 30 & 2 & 12 \\ 30 & 57 & 5 & 23 \\ 2 & 5 & 38 & 47 \\ 12 & 23 & 47 & 68 \end{bmatrix} = \begin{bmatrix} 4 & 1 \\ 7 & 2 \\ -1 & 6 \\ 1 & 8 \end{bmatrix} \begin{bmatrix} 4 & 7 & -1 & 1 \\ 1 & 2 & 6 & 8 \end{bmatrix} + \begin{bmatrix} 2 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix}$$
Therefore, $\Sigma$ has the structure produced by an $m = 2$ orthogonal factor model, with
$$L = \begin{bmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \\ l_{31} & l_{32} \\ l_{41} & l_{42} \end{bmatrix} = \begin{bmatrix} 4 & 1 \\ 7 & 2 \\ -1 & 6 \\ 1 & 8 \end{bmatrix}, \qquad \Psi = \begin{bmatrix} \psi_1 & 0 & 0 & 0 \\ 0 & \psi_2 & 0 & 0 \\ 0 & 0 & \psi_3 & 0 \\ 0 & 0 & 0 & \psi_4 \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix}$$
Therefore, the communality of $X_1$ is, from $h_i^2 = l_{i1}^2 + l_{i2}^2 + \cdots + l_{im}^2$,
$$h_1^2 = l_{11}^2 + l_{12}^2 = 4^2 + 1^2 = 17$$
and the variance of $X_1$ can be decomposed as
$$\sigma_{11} = (l_{11}^2 + l_{12}^2) + \psi_1 = h_1^2 + \psi_1, \qquad 19 = 4^2 + 1^2 + 2 = 17 + 2.$$
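The matrix identity in this example is easy to confirm by machine. A minimal NumPy sketch, using the covariance matrix, loadings and specific variances of the example (the variable names are illustrative):

```python
import numpy as np

# Covariance matrix, loadings and specific variances from the example.
Sigma = np.array([[19, 30,  2, 12],
                  [30, 57,  5, 23],
                  [ 2,  5, 38, 47],
                  [12, 23, 47, 68]])
L = np.array([[ 4, 1],
              [ 7, 2],
              [-1, 6],
              [ 1, 8]])
Psi = np.diag([2, 4, 1, 3])

reconstructed = L @ L.T + Psi            # LL' + Psi
communalities = (L ** 2).sum(axis=1)     # h_i^2 = sum over j of l_ij^2

print(np.array_equal(reconstructed, Sigma))   # True
print(communalities)                          # [17 53 37 65]
print(communalities + np.diag(Psi))           # [19 57 38 68], the diagonal of Sigma
```

The first communality, 17, and the decomposition 19 = 17 + 2 match the hand computation for $X_1$.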
9.2 Estimation
From the matrix of observations $X$, form the sample covariance matrix $S$.
We need to estimate $\mu$, $L$, $F$, and $\varepsilon$. Due to the complexity of the model, this is a much more difficult job than in PCA.
a. Principal Component Method (Sec. 9.3, textbook)
b. Maximum Likelihood Method (Sec. 9.3, textbook)
The solutions obtained by these methods may be different.
Principal Component Solution of the Factor Model
The principal component factor analysis of the sample covariance matrix $S$ is specified in terms of its eigenvalue-eigenvector pairs
$$(\hat\lambda_1, \hat e_1), (\hat\lambda_2, \hat e_2), \ldots, (\hat\lambda_p, \hat e_p),$$
where $\hat\lambda_1 \ge \hat\lambda_2 \ge \cdots \ge \hat\lambda_p$. Let $m < p$ be the number of common factors. Then the matrix of estimated factor loadings $\{\tilde l_{ij}\}$ is given by
$$\tilde L = \Big[\sqrt{\hat\lambda_1}\,\hat e_1 \ \Big|\ \sqrt{\hat\lambda_2}\,\hat e_2 \ \Big|\ \cdots \ \Big|\ \sqrt{\hat\lambda_m}\,\hat e_m\Big] \tag{9-15}$$
The estimated specific variances are provided by the diagonal elements of the matrix $S - \tilde L\tilde L'$, so
$$\tilde\Psi = \begin{bmatrix} \tilde\psi_1 & 0 & \cdots & 0 \\ 0 & \tilde\psi_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \tilde\psi_p \end{bmatrix} \quad \text{with} \quad \tilde\psi_i = s_{ii} - \sum_{j=1}^{m} \tilde l_{ij}^2 \tag{9-16}$$
Communalities are estimated as
$$\tilde h_i^2 = \tilde l_{i1}^2 + \tilde l_{i2}^2 + \cdots + \tilde l_{im}^2 \tag{9-17}$$
The principal component factor analysis of the sample correlation matrix is obtained by starting with $R$ in place of $S$.
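Equations (9-15) to (9-17) translate almost line by line into code. The sketch below is one possible NumPy rendering, not the SAS procedure used for the slides; the function name `pc_factor_solution` is hypothetical, and the input is the attribute correlation matrix of the consumer-preference example in this chapter. Eigenvectors are only determined up to sign, so a column of loadings may come out with all signs flipped relative to a printed table.

```python
import numpy as np

def pc_factor_solution(S, m):
    """Principal component solution of the factor model for a covariance
    (or correlation) matrix S with m common factors (hypothetical helper)."""
    S = np.asarray(S, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(S)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # reorder so lambda_1 >= ... >= lambda_p
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :m] * np.sqrt(eigvals[:m])   # (9-15): column j is sqrt(lambda_j) e_j
    communalities = (loadings ** 2).sum(axis=1)        # (9-17)
    specific = np.diag(S) - communalities              # (9-16)
    return loadings, specific, communalities, eigvals

# Attribute correlation matrix from the consumer-preference example.
R = np.array([[1.00, 0.02, 0.96, 0.42, 0.01],
              [0.02, 1.00, 0.13, 0.71, 0.85],
              [0.96, 0.13, 1.00, 0.50, 0.11],
              [0.42, 0.71, 0.50, 1.00, 0.79],
              [0.01, 0.85, 0.11, 0.79, 1.00]])

loadings, specific, communalities, eigvals = pc_factor_solution(R, m=2)
print(np.round(eigvals, 2))
print(np.round(loadings, 2))
print(np.round(communalities, 2))
print(np.round(specific, 2))
print(np.round(np.cumsum(eigvals[:2]) / R.shape[0], 3))   # cumulative proportion of variance
```

By construction the diagonal of $\tilde L\tilde L' + \tilde\Psi$ equals the diagonal of the input matrix; only the off-diagonal elements are approximated.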
9.3 Factor Rotation
Very often, the initial solution does not lend itself to a clear statistical interpretation of the loadings, so factor rotation is proposed.
When $m > 1$, there is always some inherent ambiguity associated with the factor model. To see this, let $T$ be any $m \times m$ orthogonal matrix, so that $TT' = T'T = I$. Then the expression in (9-2) can be written
$$X - \mu = LF + \varepsilon = LTT'F + \varepsilon = L^*F^* + \varepsilon$$
where
$$L^* = LT \quad \text{and} \quad F^* = T'F.$$
Since
$$E(F^*) = T'E(F) = 0$$
and
$$\operatorname{Cov}(F^*) = T'\operatorname{Cov}(F)\,T = T'T = I \ (m \times m),$$
it is impossible, on the basis of observations on $X$, to distinguish the loadings $L$ from the loadings $L^*$. That is, the factors $F$ and $F^* = T'F$ have the same statistical properties, and even though the loadings $L^*$ are, in general, different from the loadings $L$, they both generate the same covariance matrix $\Sigma$. That is,
$$\Sigma = LL' + \Psi = LTT'L' + \Psi = (L^*)(L^*)' + \Psi.$$
This ambiguity provides the rationale for "factor rotation", since orthogonal matrices correspond to rotations (and reflections) of the coordinate system for $X$.
Factor loadings $L$ are determined only up to an orthogonal matrix $T$. Thus, the loadings
$$L^* = LT \quad \text{and} \quad L$$
give the same representation. The communalities, given by the diagonal elements of $LL' = (L^*)(L^*)'$, are also unaffected by the choice of $T$.
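The invariance argument can be illustrated with a concrete rotation. This is a small sketch that reuses the loadings and specific variances of the two-factor verification example earlier in the chapter; the 30-degree angle is an arbitrary choice made for illustration.

```python
import numpy as np

# Loadings and specific variances of the two-factor verification example.
L = np.array([[ 4.0, 1.0],
              [ 7.0, 2.0],
              [-1.0, 6.0],
              [ 1.0, 8.0]])
Psi = np.diag([2.0, 4.0, 1.0, 3.0])

theta = np.deg2rad(30.0)                     # arbitrary rotation angle
T = np.array([[ np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

L_star = L @ T                               # rotated loadings L* = L T
Sigma = L @ L.T + Psi
Sigma_star = L_star @ L_star.T + Psi

print(np.allclose(T @ T.T, np.eye(2)))       # True: T is orthogonal
print(np.allclose(Sigma, Sigma_star))        # True: same covariance matrix
print(np.allclose((L ** 2).sum(axis=1),
                  (L_star ** 2).sum(axis=1)))  # True: same communalities
print(np.allclose(L, L_star))                # False: the loadings themselves differ
```

Any other orthogonal $T$ gives the same three `True` results, which is why the data alone cannot single out one set of loadings.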
Factor Analysis - Principal Component Solution (7.4 examples)
Example 9.1: Stock-price data (pp. 469, 493-495)
Factor analysis - principal component solution

                      One-factor solution         Two-factor solution
                      Estimated    Specific       Estimated factor     Specific
                      loadings     variances      loadings             variances
Variable              F1                          F1        F2
1. Allied Chemical    0.783        0.39           0.783    -0.217      0.34
2. Du Pont            0.773        0.40           0.773    -0.458      0.19
3. Union Carbide      0.794        0.37           0.794    -0.234      0.31
4. Exxon              0.713        0.49           0.713     0.472      0.27
5. Texaco             0.712        0.49           0.712     0.524      0.22
Cumulative proportion
of total (standardized)
sample variance
explained             0.571                       0.571     0.733

Specific variances are $\tilde\psi_i = 1 - \tilde h_i^2$.
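The specific variances and cumulative proportions in the stock-price table follow directly from the loadings, since the analysis is on standardized variables. A short sketch that recomputes them from the two-factor loadings (the array names are illustrative):

```python
import numpy as np

variables = ["Allied Chemical", "Du Pont", "Union Carbide", "Exxon", "Texaco"]

# Two-factor principal component loadings from the stock-price table.
loadings = np.array([[0.783, -0.217],
                     [0.773, -0.458],
                     [0.794, -0.234],
                     [0.713,  0.472],
                     [0.712,  0.524]])

# Specific variances psi_i = 1 - h_i^2 for the one- and two-factor solutions.
psi_one = 1.0 - loadings[:, 0] ** 2
psi_two = 1.0 - (loadings ** 2).sum(axis=1)

# Proportion of total standardized variance explained by each factor,
# i.e. the column sum of squared loadings divided by p.
proportion = (loadings ** 2).sum(axis=0) / loadings.shape[0]
cumulative = np.cumsum(proportion)

for name, p1, p2 in zip(variables, psi_one, psi_two):
    print(f"{name:16s} {p1:.2f} {p2:.2f}")
print(np.round(cumulative, 3))
```

The printed values agree with the table to the number of decimals shown there.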
[SAS output - one factor solution: eigenvalues; estimated factor loadings $\tilde l_{ij} = \sqrt{\hat\lambda_j}\,\hat e_{ij}$, where $\hat e_{ij}$ is the $i$th element of $\hat e_j$; communalities $\tilde h_i^2 = 1 - \tilde\psi_i$]
[SAS output - two factor solution]
Factor Analysis - Maximum Likelihood Method
Example 9.1: Stock-price data (pp. 469, 493-495)
Factor analysis - maximum likelihood method

                      Estimated factor     Specific
                      loadings             variances
Variable              F1        F2
1. Allied Chemical    0.684     0.189      0.50
2. Du Pont            0.694     0.517      0.25
3. Union Carbide      0.681     0.248      0.47
4. Exxon              0.621    -0.073      0.61
5. Texaco             0.792    -0.442      0.18
Cumulative proportion
of total (standardized)
sample variance
explained             0.485     0.598

Specific variances are $\tilde\psi_i = 1 - \tilde h_i^2$.
[SAS output - maximum likelihood method: eigenvalues, estimated factor loadings, communalities]
Factor Analysis - Principal Component Solution
Example 9.3: Consumer-preference (pp. 487-489)
In a consumer-preference study, a random sample of customers were asked to rate several attributes of a new product. The responses, on a 7-point semantic differential scale, were tabulated and the attribute correlation matrix constructed.
Factor analysis - PC solution without rotation

Data set (attribute correlation matrix $R$)

Attribute                      1       2       3       4       5
1. Taste                     1.00    0.02    0.96    0.42    0.01
2. Good buy for money        0.02    1.00    0.13    0.71    0.85
3. Flavor                    0.96    0.13    1.00    0.50    0.11
4. Suitable for snack        0.42    0.71    0.50    1.00    0.79
5. Provides lots of energy   0.01    0.85    0.11    0.79    1.00

                             Estimated factor                     Specific
                             loadings           Communalities     variances
Variable                     F1       F2
1. Taste                     0.56     0.82      0.98              0.02
2. Good buy for money        0.78    -0.53      0.88              0.12
3. Flavor                    0.65     0.75      0.98              0.02
4. Suitable for snack        0.94    -0.11      0.89              0.11
5. Provides lots of energy   0.80    -0.54      0.93              0.07
Eigenvalues                  2.85     1.81
Cumulative proportion of
total (standardized)
sample variance              0.571    0.932

Loadings are $\tilde l_{ij} = \sqrt{\hat\lambda_j}\,\hat e_{ij}$, communalities are $\tilde h_i^2$, and specific variances are $\tilde\psi_i = 1 - \tilde h_i^2$.
[SAS output for Example 9.3]
Factor analysis - PC solution with rotation by varimax

                             Estimated factor    Rotated estimated
                             loadings            factor loadings
Variable                     F1       F2         F1*      F2*        Communalities
1. Taste                     0.56     0.82       0.02     0.99       0.98
2. Good buy for money        0.78    -0.52       0.94    -0.01       0.88
3. Flavor                    0.65     0.75       0.13     0.98       0.98
4. Suitable for snack        0.94    -0.10       0.84     0.43       0.89
5. Provides lots of energy   0.80    -0.54       0.97    -0.02       0.93
Cumulative proportion of
total (standardized)
sample variance              0.571    0.932      0.507    0.932
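For readers who want to experiment with rotation, the following is a rough sketch of a raw varimax rotation using the common SVD-based iteration. It is an assumption-laden illustration, not the SAS routine behind the table: the function name `varimax` is hypothetical, no Kaiser (row) normalization is applied, and the rotated columns may differ from the printed ones in sign, order, or the second decimal.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Raw varimax rotation by the SVD-based iteration (no Kaiser normalization).
    Returns the rotated loadings A @ T and the orthogonal matrix T."""
    A = np.asarray(loadings, dtype=float)
    p, m = A.shape
    T = np.eye(m)
    crit_old = 0.0
    for _ in range(max_iter):
        B = A @ T
        # Gradient-like matrix of the varimax criterion at the current rotation.
        G = A.T @ (B ** 3 - B @ np.diag((B ** 2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                      # nearest orthogonal matrix to G
        crit = s.sum()
        if crit_old != 0.0 and crit / crit_old < 1.0 + tol:
            break
        crit_old = crit
    return A @ T, T

# Unrotated two-factor loadings from the consumer-preference table.
L_tilde = np.array([[0.56,  0.82],
                    [0.78, -0.52],
                    [0.65,  0.75],
                    [0.94, -0.10],
                    [0.80, -0.54]])

L_rot, T = varimax(L_tilde)
print(np.round(L_rot, 2))
print(np.round(T, 3))
print(np.round((L_rot ** 2).sum(axis=1), 2))   # communalities after rotation
```

Because `T` is orthogonal, the communalities printed on the last line equal those of the unrotated loadings, whatever rotation the iteration settles on.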
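Now,
$$\tilde L\tilde L' + \tilde\Psi = \begin{bmatrix} .56 & .82 \\ .78 & -.53 \\ .65 & .75 \\ .94 & -.10 \\ .80 & -.54 \end{bmatrix} \begin{bmatrix} .56 & .78 & .65 & .94 & .80 \\ .82 & -.53 & .75 & -.10 & -.54 \end{bmatrix} + \begin{bmatrix} .02 & 0 & 0 & 0 & 0 \\ 0 & .12 & 0 & 0 & 0 \\ 0 & 0 & .02 & 0 & 0 \\ 0 & 0 & 0 & .11 & 0 \\ 0 & 0 & 0 & 0 & .07 \end{bmatrix}$$
$$= \begin{bmatrix} 1.00 & .01 & .97 & .44 & .00 \\ & 1.00 & .11 & .79 & .91 \\ & & 1.00 & .53 & .11 \\ & & & 1.00 & .81 \\ & & & & 1.00 \end{bmatrix}$$
That nearly reproduces the correlation matrix $R$.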
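How near "nearly" is can be quantified by the residual matrix $R - (\tilde L\tilde L' + \tilde\Psi)$. A minimal sketch using the two-decimal loadings and specific variances shown above; because the inputs are rounded, the fitted entries can differ in the second decimal from the matrix printed on the slide.

```python
import numpy as np

# Attribute correlation matrix of the consumer-preference example.
R = np.array([[1.00, 0.02, 0.96, 0.42, 0.01],
              [0.02, 1.00, 0.13, 0.71, 0.85],
              [0.96, 0.13, 1.00, 0.50, 0.11],
              [0.42, 0.71, 0.50, 1.00, 0.79],
              [0.01, 0.85, 0.11, 0.79, 1.00]])

# Estimated loadings and specific variances (rounded to two decimals).
L_tilde = np.array([[0.56,  0.82],
                    [0.78, -0.53],
                    [0.65,  0.75],
                    [0.94, -0.10],
                    [0.80, -0.54]])
psi_tilde = np.array([0.02, 0.12, 0.02, 0.11, 0.07])

fitted = L_tilde @ L_tilde.T + np.diag(psi_tilde)   # L L' + Psi
residual = R - fitted

print(np.round(fitted, 2))
print(np.round(residual, 2))
print(round(float(np.abs(residual).max()), 3))      # largest absolute residual
```

The largest discrepancies sit among the off-diagonal entries for the "good buy", "snack" and "energy" attributes; the diagonal is reproduced up to rounding.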
[SAS output for Example 9.3]