Techniques for studying correlation and covariance structure
Principal Components Analysis (PCA)
Factor Analysis
Principal Component Analysis
Let x = (x_1, ..., x_p)' have a p-variate Normal distribution with mean vector μ and covariance matrix Σ.

Definition: The linear combination

C_1 = a_1 x_1 + ... + a_p x_p = a'x

is called the first principal component if a = (a_1, ..., a_p)' is chosen to maximize

Var(C_1) = Var(a'x) = a'Σa

subject to

a_1² + ... + a_p² = a'a = 1.
Consider maximizing V = Var(a'x) = a'Σa subject to a'a = 1, using the Lagrange multiplier technique. Let

g(a, λ) = V − λ(a'a − 1) = a'Σa − λ(a'a − 1).

Now

∂g(a, λ)/∂λ = −(a'a − 1) = 0 if a'a = 1

and

∂g(a, λ)/∂a = 2Σa − 2λa = 0 if Σa = λa.

Thus a is an eigenvector of Σ and λ is the eigenvalue associated with a.

Also Var(a'x) = a'Σa = λa'a = λ.

Hence Var(a'x) is maximized if λ is the largest eigenvalue of Σ.
Summary

C_1 = a_1 x_1 + ... + a_p x_p = a'x

is the first principal component if a is the eigenvector (length 1, i.e. a'a = a_1² + ... + a_p² = 1) of Σ associated with the largest eigenvalue λ_1 of Σ.
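As a quick numerical illustration of this summary, the sketch below scans unit vectors a = (cos t, sin t) for a hypothetical 2×2 covariance matrix (not one from these notes) and confirms that a'Σa peaks at the largest eigenvalue, attained at the corresponding eigenvector.

```python
import math

# Hypothetical 2x2 covariance matrix with known eigenvalues 3 and 1;
# the eigenvector for eigenvalue 3 is (1, 1)/sqrt(2), i.e. t = 45 degrees.
Sigma = [[2.0, 1.0], [1.0, 2.0]]

def quad_form(a):
    """a' Sigma a, the variance of the linear combination a'x."""
    return sum(a[i] * Sigma[i][j] * a[j] for i in range(2) for j in range(2))

# Scan unit vectors a = (cos t, sin t): every such a satisfies a'a = 1.
best_var, best_t = max(
    (quad_form((math.cos(t), math.sin(t))), t)
    for t in (k * math.pi / 1800 for k in range(1800))
)

print(round(best_var, 4))          # 3.0, the largest eigenvalue
print(round(best_t / math.pi, 4))  # 0.25, i.e. t = 45 degrees
```

The maximum of a'Σa over unit vectors is the largest eigenvalue of Σ, attained at its eigenvector, which is exactly what the derivation above establishes.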
Let x = (x_1, ..., x_p)' have a p-variate Normal distribution with mean vector μ and covariance matrix Σ.

Definition: The set of linear combinations

C_1 = a_11 x_1 + ... + a_1p x_p = a_1'x
...
C_p = a_p1 x_1 + ... + a_pp x_p = a_p'x

are called the principal components of x if the vectors a_i = (a_i1, ..., a_ip)' are chosen such that

a_i1² + ... + a_ip² = a_i'a_i = 1

and

1. Var(C_1) is maximized.
2. Var(C_i) is maximized subject to C_i being independent of C_1, ..., C_{i-1} (the previous i − 1 principal components).

Note: we have already shown that a_1 = (a_11, ..., a_1p)' is the eigenvector of Σ associated with the largest eigenvalue, λ_1, of the covariance matrix Σ, and

Var(C_1) = Var(a_1'x) = a_1'Σa_1 = λ_1.
We will now show that a_i = (a_i1, ..., a_ip)' is the eigenvector of Σ associated with the ith largest eigenvalue, λ_i, of the covariance matrix Σ, and

Var(C_i) = Var(a_i'x) = a_i'Σa_i = λ_i.
Proof (by induction: assume the result is true for 1, ..., i − 1, then prove it true for i).

Write

(C_1, ..., C_{i-1}, C_i)' = A_i x, where A_i = (a_1, ..., a_{i-1}, a_i)'.

Now A_i x has covariance matrix

A_i Σ A_i', whose (j, k) entry is a_j'Σa_k.

By the induction hypothesis Σa_j = λ_j a_j for j = 1, ..., i − 1, so for j < i

cov(C_i, C_j) = a_i'Σa_j = λ_j a_i'a_j.

Hence C_i is independent of C_1, ..., C_{i-1} if

a_i'a_1 = ... = a_i'a_{i-1} = 0.
We want to maximize Var(C_i) = a_i'Σa_i subject to

1. a_i'a_1 = ... = a_i'a_{i-1} = 0
2. a_i'a_i = 1.

Let

g(a_i, λ_i, γ_1, ..., γ_{i-1}) = a_i'Σa_i − λ_i(a_i'a_i − 1) − γ_1 a_i'a_1 − ... − γ_{i-1} a_i'a_{i-1}.

Now

∂g/∂γ_j = −a_i'a_j = 0 for j = 1, ..., i − 1

and

∂g/∂λ_i = −(a_i'a_i − 1) = 0.

Also

∂g/∂a_i = 2Σa_i − 2λ_i a_i − γ_1 a_1 − ... − γ_{i-1} a_{i-1} = 0.   (1)

Multiplying equation (1) on the left by a_j' for j < i gives

2 a_j'Σa_i − 2λ_i a_j'a_i − γ_j a_j'a_j = 0 − 0 − γ_j = 0,

since a_j'Σa_i = λ_j a_j'a_i = 0, a_j'a_i = 0 and a_j'a_j = 1.

Hence γ_j = 0 for j < i, and equation (1) becomes

Σa_i = λ_i a_i.
Thus the principal components

C_i = a_i'x, i = 1, ..., p   (the ith principal component),

where a_1, ..., a_p are the eigenvectors of Σ associated with the eigenvalues

λ_1 ≥ λ_2 ≥ ... ≥ λ_p ≥ 0,

satisfy:

1. Var(C_1) is maximized.
2. Var(C_i) is maximized subject to C_i being independent of C_1, ..., C_{i-1} (the previous i − 1 principal components).
Recall: any positive definite matrix Σ can be written

Σ = PDP'

where P = (a_1, ..., a_p) is an orthogonal matrix (P'P = PP' = I) whose columns a_1, ..., a_p are the eigenvectors of Σ of length 1, and

D = diag(λ_1, ..., λ_p)

is a diagonal matrix whose diagonal elements λ_1, ..., λ_p are the eigenvalues of Σ.
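The decomposition can be illustrated numerically; the sketch below (using a hypothetical 2×2 matrix, not one from these notes) rebuilds Σ from its eigenvectors and eigenvalues as PDP'.

```python
import math

# Hypothetical example: Sigma = [[2, 1], [1, 2]] has eigenvalues 3 and 1
# with unit eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
s = 1.0 / math.sqrt(2.0)
P = [[s, s], [s, -s]]          # columns are the eigenvectors a_1, a_2
D = [[3.0, 0.0], [0.0, 1.0]]   # diagonal matrix of eigenvalues

def matmul(A, B):
    """Plain 2x2 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Pt = [[P[j][i] for j in range(2)] for i in range(2)]  # transpose of P

Sigma = matmul(matmul(P, D), Pt)   # P D P' rebuilds Sigma
print([[round(v, 10) for v in row] for row in Sigma])  # [[2.0, 1.0], [1.0, 2.0]]
```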
Example
In this example wildlife (moose) population density was measured over time (once a year) in three areas.
Year  Area 1  Area 2  Area 3     Year  Area 1  Area 2  Area 3
  1    11.3    14.1     6.9       13     6.1     9.9     6.8
  2    10.4    14.0    11.2       14     9.7    13.2     6.6
  3     9.9    13.0     8.7       15     8.1     9.4     4.0
  4     8.2    11.4     3.3       16    11.3    11.8     4.9
  5    10.1    11.9     8.7       17     8.8    11.5     8.8
  6    10.7    13.8    12.5       18     9.4    11.6     5.7
  7    11.0    14.9     8.9       19     7.5    11.4     4.9
  8     7.1     8.5     3.7       20     8.8    10.7     7.2
  9    14.7    14.5    12.1       21     7.5    11.1     7.0
 10     5.4     9.0     4.1       22     9.1    13.2     8.9
 11     7.3     7.6     5.6       23     6.8     9.8     7.6
 12    10.2    10.9     7.3
(Figure: plot of the population densities for Area 1, Area 2 and Area 3)
The Sample Statistics

The mean vector:

x̄ = (9.10, 11.62, 7.19)'

The covariance matrix:

S = [ 4.297  3.307  3.295 ]
    [ 3.307  4.015  3.527 ]
    [ 3.295  3.527  6.566 ]

The correlation matrix:

R = [ 1      .796   .620 ]
    [ .796   1      .687 ]
    [ .620   .687   1    ]
Principal Component Analysis

The eigenvalues of S:

λ_1 = 11.85974, λ_2 = 2.204232, λ_3 = 0.814249

The eigenvectors of S:

a_1 = (.522, .523, .674)', a_2 = (.582, .359, −.730)', a_3 = (.624, −.773, .117)'

The principal components:

C_1 = .522 x_1 + .523 x_2 + .674 x_3
C_2 = .582 x_1 + .359 x_2 − .730 x_3
C_3 = .624 x_1 − .773 x_2 + .117 x_3
(Figures: each principal component plotted as a direction in the (Area 1, Area 2, Area 3) space:
C_1 = .522 x_1 + .523 x_2 + .674 x_3,
C_2 = .582 x_1 + .359 x_2 − .730 x_3,
C_3 = .624 x_1 − .773 x_2 + .117 x_3)
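The reported eigenpairs can be checked directly against the eigen-equation S a_i = λ_i a_i; the plain-Python sketch below does this. One caveat: the second diagonal entry of S, 4.015, is an assumption chosen so that tr(S) matches the eigenvalue total 14.8782 reported for this example.

```python
# Check S a_i ~ lambda_i a_i for the moose-density covariance matrix.
# Assumption: S[1][1] = 4.015 (so that tr(S) equals the eigenvalue
# total 14.8782 reported for this example).
S = [[4.297, 3.307, 3.295],
     [3.307, 4.015, 3.527],
     [3.295, 3.527, 6.566]]

eigenpairs = [
    (11.85974, [0.522, 0.523, 0.674]),
    (2.204232, [0.582, 0.359, -0.730]),
    (0.814249, [0.624, -0.773, 0.117]),
]

errs = []
for lam, a in eigenpairs:
    Sa = [sum(S[i][j] * a[j] for j in range(3)) for i in range(3)]
    # Each component of S a should match lambda * a to rounding accuracy.
    errs.append(max(abs(Sa[i] - lam * a[i]) for i in range(3)))
print([round(e, 3) for e in errs])  # small residuals from 3 dp rounding
```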
Graphical Picture of Principal Components

Multivariate Normal data falls in an ellipsoidal pattern. The shape and orientation of the ellipsoid is determined by the covariance matrix Σ. The eigenvectors of Σ give the directions of the axes of the ellipsoid; the eigenvalues give the lengths of these axes.

Recall that if Σ is a positive definite matrix, then

Σ = λ_1 a_1 a_1' + ... + λ_p a_p a_p' = PDP'

where P = (a_1, ..., a_p) is an orthogonal matrix (P'P = PP' = I) with columns equal to the eigenvectors of Σ, and D = diag(λ_1, ..., λ_p) is a diagonal matrix with diagonal elements equal to the eigenvalues of Σ.
The vector of principal components

C = (C_1, ..., C_p)' = P'x, where C_i = a_i'x,

has covariance matrix

Σ_C = P'ΣP = P'(PDP')P = (P'P)D(P'P) = D = diag(λ_1, ..., λ_p).

An orthogonal matrix rotates vectors; thus C = P'x rotates the vector x into the vector of principal components C.

Also

tr(Σ_C) = tr(P'ΣP) = tr(ΣPP') = tr(Σ)

so that

λ_1 + ... + λ_p = tr(D) = tr(Σ) = σ_11 + ... + σ_pp = var(x_1) + ... + var(x_p) = Total variance of x.

The ratio

λ_i / (λ_1 + ... + λ_p) = var(C_i) / Total variance of x

denotes the proportion of variance explained by the ith principal component C_i.
The Example

i      λ_i        % variance
1      11.8597    79.71%
2       2.20423   14.82%
3       0.81425    5.47%
Total  14.8782   100%
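These percentages follow directly from the eigenvalues; a minimal check:

```python
# Eigenvalues of S from the moose-density example.
eigenvalues = [11.85974, 2.204232, 0.814249]
total = sum(eigenvalues)                  # total variance, tr(S)
percents = [100 * l / total for l in eigenvalues]
print(round(total, 4))                    # 14.8782
print([round(p, 2) for p in percents])    # [79.71, 14.82, 5.47]
```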
Also

cov(C_i, x_j) = cov(a_i'x, e_j'x) = a_i'Σe_j

where

e_j = (0, 0, ..., 0, 1, 0, ..., 0)'   (1 in the jth position).

Then

a_i'Σe_j = a_i'(λ_1 a_1 a_1' + ... + λ_p a_p a_p')e_j = λ_i a_i'e_j = λ_i a_ij

since a_i'a_k = 0 for k ≠ i and a_i'a_i = 1. Hence

corr(C_i, x_j) = cov(C_i, x_j) / √(Var(C_i) Var(x_j)) = λ_i a_ij / (√λ_i √σ_jj) = √λ_i a_ij / √σ_jj.
Comment: If, instead of the covariance matrix Σ, the correlation matrix R is used to extract the principal components, then the principal components are defined in terms of the standard scores of the observations:

z_i = (x_i − μ_i) / √σ_ii

and corr(C_i*, z_j) = √λ_i* a_ij*, where λ_i* and a_i* are the eigenvalues and eigenvectors of R. The correlation matrix is the covariance matrix of the standard scores of the observations.
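The correlation formula corr(C_i, x_j) = √λ_i a_ij / √σ_jj can be evaluated for the moose-density example; a sketch, with the caveat that the second diagonal entry of S (4.015) is an assumption chosen so that tr(S) matches the eigenvalue total 14.8782:

```python
import math

# Correlation between each principal component and each original variable:
# corr(C_i, x_j) = sqrt(lambda_i) * a_ij / sqrt(sigma_jj).
# Assumption: diag_S[1] = 4.015 (inferred from the eigenvalue total 14.8782).
diag_S = [4.297, 4.015, 6.566]
eigenvalues = [11.85974, 2.204232, 0.814249]
A = [[0.522, 0.523, 0.674],     # rows are the eigenvectors a_1, a_2, a_3
     [0.582, 0.359, -0.730],
     [0.624, -0.773, 0.117]]

loadings = [[math.sqrt(eigenvalues[i]) * A[i][j] / math.sqrt(diag_S[j])
             for j in range(3)] for i in range(3)]
print([round(r, 3) for r in loadings[0]])  # C_1 correlates strongly with all areas
```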
More Examples
Example 2: Bone Lengths of White Leghorn Fowl

The correlation matrix of the complete set of six fowl bone measurements had the following form:

                Skull L  Skull B  Humerus  Ulna   Femur  Tibia
Skull Length     1.000    0.584    0.615   0.601  0.570  0.600
Skull Breadth             1.000    0.576   0.530  0.526  0.555
Humerus                            1.000   0.940  0.875  0.878
Ulna                                       1.000  0.877  0.886
Femur                                             1.000  0.924
Tibia                                                    1.000

Table: Principal Components

Dimension            1      2      3      4      5      6
Skull: Length       0.74   0.45   0.49  -0.02   0.01   0.00
       Breadth      0.70   0.59  -0.41   0.00   0.00  -0.01
Wing:  Humerus      0.95  -0.16  -0.03   0.22   0.05   0.16
       Ulna         0.94  -0.21   0.01   0.20  -0.04  -0.17
Leg:   Femur        0.93  -0.24  -0.04  -0.21   0.18  -0.03
       Tibia        0.94  -0.19  -0.03  -0.20  -0.19   0.04
Eigenvalue          4.568  0.714  0.412  0.173  0.076  0.057
% of Total Variance 76.1   11.9   6.9    2.9    1.3    0.9

Identification of Components:

Component  Description
1          General average of all bone dimensions (Size)
2          Comparison of skull size with wing and leg lengths
3          Comparison of skull length and breadth (Skull Shape)
4          Comparison of wing and leg lengths
5          Comparison of femur and tibia
6          Comparison of humerus and ulna
Example 3: Wechsler Adult Intelligence Scale Subtest Scores

Table: Principal Components

Component               1      2      3      4
WAIS subtest:
Information            0.83   0.33  -0.04  -0.01
Comprehension          0.75   0.31   0.07  -0.17
Arithmetic             0.72   0.25  -0.08   0.35
Similarities           0.78   0.14   0.00  -0.21
Digit Span             0.62   0.00  -0.38   0.58
Vocabulary             0.83   0.38  -0.03  -0.16
Digit Symbol           0.72  -0.36  -0.26  -0.01
Picture Completion     0.78  -0.10  -0.25  -0.01
Block Design           0.72  -0.26   0.36   0.18
Picture Arrangement    0.72  -0.23   0.04  -0.05
Object Assembly        0.65  -0.30   0.47   0.13
Age                   -0.34   0.80   0.26   0.18
Years of Education     0.75   0.01  -0.30  -0.23
Eigenvalue             6.69   1.42   0.80   0.71
% of Total Variance   51.47  10.90   6.15   5.48
Cum. % of Variance    51.47  62.37  68.52  74.01

Identification of Components:

Component  Description
1          General intellectual performance
2          Experiential or age factor: a bipolar dimension comparing verbal or informational skills known to increase with advancing age to subtests measuring spatial-perceptual qualities and other cognitive abilities known to decrease with age
3          Spatial imagery or perception dimension
4          Numerical facility
Computation of the eigenvalues and eigenvectors of Σ

Recall:

Σ = λ_1 a_1 a_1' + ... + λ_p a_p a_p'.

Then, since a_i'a_j = 0 for i ≠ j and a_i'a_i = 1,

Σ² = λ_1² a_1 a_1' + ... + λ_p² a_p a_p'.

Continuing, we see that:

Σⁿ = λ_1ⁿ a_1 a_1' + ... + λ_pⁿ a_p a_p'
   = λ_1ⁿ [a_1 a_1' + (λ_2/λ_1)ⁿ a_2 a_2' + ... + (λ_p/λ_1)ⁿ a_p a_p'].

For large values of n,

Σⁿ ≈ λ_1ⁿ a_1 a_1'.

The algorithm for computing the eigenvector a_1:

1. Compute Σ², Σ⁴, Σ⁸, Σ¹⁶, etc., rescaling so that the elements do not become too large in value, i.e. rescale so that the largest element is 1.
2. Compute a_1 using the fact that Σⁿ ≈ λ_1ⁿ a_1 a_1' for large n: each column of Σⁿ is proportional to a_1, so normalize a column to length 1 (a_1'a_1 = 1).
3. Compute λ_1 using λ_1 = a_1'Σa_1.
4. Repeat using the matrix Σ₂ = Σ − λ_1 a_1 a_1' = λ_2 a_2 a_2' + ... + λ_p a_p a_p'.
5. Continue with i = 2, ..., p − 1 using the matrix Σ_{i+1} = Σ_i − λ_i a_i a_i' = λ_{i+1} a_{i+1} a_{i+1}' + ... + λ_p a_p a_p'.
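The steps above can be sketched in plain Python. This variant applies Σ to a vector repeatedly (equivalent to using a column of Σⁿ), with the same rescale-then-normalize and deflation steps; the demo matrix is a hypothetical 2×2 example, not one from these notes.

```python
import math

def top_eigen(S, iters=200):
    """Power method: largest eigenvalue and unit eigenvector of symmetric S."""
    p = len(S)
    v = [1.0 / (i + 1) for i in range(p)]   # arbitrary starting vector
    for _ in range(iters):
        w = [sum(S[i][j] * v[j] for j in range(p)) for i in range(p)]
        scale = max(abs(x) for x in w)      # step 1: keep the largest element at 1
        v = [x / scale for x in w]
    norm = math.sqrt(sum(x * x for x in v))
    a = [x / norm for x in v]               # step 2: normalize to length 1
    lam = sum(a[i] * S[i][j] * a[j] for i in range(p) for j in range(p))  # step 3
    return lam, a

def deflate(S, lam, a):
    """Step 4: remove the found component, S - lam * a a'."""
    p = len(S)
    return [[S[i][j] - lam * a[i] * a[j] for j in range(p)] for i in range(p)]

# Hypothetical demo matrix with eigenvalues 3 and 1.
S = [[2.0, 1.0], [1.0, 2.0]]
lam1, a1 = top_eigen(S)
lam2, a2 = top_eigen(deflate(S, lam1, a1))  # step 5: repeat on the deflated matrix
print(round(lam1, 6), round(lam2, 6))       # 3.0 1.0
```

Convergence relies on the ratio (λ_2/λ_1)ⁿ vanishing, exactly as in the expansion of Σⁿ above; the closer λ_2 is to λ_1, the more iterations are needed.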
Example – Using Excel - Eigen