PROCEDURE PCA - Stanford University
PROCEDURE for PCA
p variables, n observations
$X$: $n \times p$ data matrix
$x_i$: $i$-th row of $X$
$X_j$: $j$-th column of $X$
EXAMPLE: scores of students on various exams
$x_i$: scores of student $i$
$X_j$: scores on a particular exam
[Scatter plot: the observations plotted as n points in p dimensions, with the variables (Var 1, ...) as axes]
CENTERING: $X_c = X - \mathbf{1}\bar{x}^T$, where $\bar{x}$ is the vector of column means of $X$
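A minimal sketch of the centering step, assuming NumPy and a small made-up score table (the numbers and the helper name `center_columns` are illustrative, not from the notes):

```python
import numpy as np

def center_columns(X):
    """Subtract each column's mean so every variable has mean zero."""
    X = np.asarray(X, dtype=float)
    return X - X.mean(axis=0)

# Hypothetical scores: 4 students (rows) on 3 exams (columns).
X = np.array([[70, 80, 90],
              [60, 85, 75],
              [90, 70, 80],
              [80, 65, 95]])

Xc = center_columns(X)

# After centering, every column mean is zero (up to rounding).
print(Xc.mean(axis=0))
```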
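FIRST PC: vector $w \in \mathbb{R}^p$, $\|w\| = 1$, s.t.
$z = w_1 (\text{Var } 1) + w_2 (\text{Var } 2) + \dots + w_p (\text{Var } p)$
has maximum variance, i.e.
$\operatorname{Var}(z_i) = \operatorname{Var}(w^T x_i)$ is max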
Linear combination of variables which captures as much as possible of the variation in the data
[Figure: scatter of points with the projection direction of maximum variance ("main street")]
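$\operatorname{Var}(z_i) = \frac{1}{n} \sum_{i=1}^{n} (z_i - \bar{z})^2$
If $\bar{z} = 0$ then $\operatorname{Var}(z_i) = \frac{1}{n} \sum_{i=1}^{n} z_i^2$
For centered data, $\operatorname{Var}(w^T x_i) = \frac{1}{n} \sum_{i=1}^{n} (w^T x_i)^2 = \frac{1}{n} \|X_c w\|^2$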
PROBLEM: $\max_w \|X_c w\|^2$
s.t. $\|w\| = 1$
SOLUTION: $X_c = U \Sigma V^T$ (SVD)
Max is achieved for $w = v_1$,
i.e. the first PC is the right singular vector of $X_c$ with the largest singular value
$C = \frac{1}{n} X_c^T X_c$: covariance matrix
First PC = eigenvector of $C$ with largest eigenvalue
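The two descriptions of the first PC (top right singular vector of the centered matrix, top eigenvector of the covariance matrix) can be checked numerically. The sketch below assumes NumPy, synthetic data, and the 1/n normalization of the covariance; the mixing matrix is made up purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 observations of 3 correlated variables (illustrative only).
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, 0.2, 0.3]])
X = rng.normal(size=(200, 3)) @ mixing
Xc = X - X.mean(axis=0)          # center the columns
n = Xc.shape[0]

# Route 1: first right singular vector of the centered data matrix.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
v1_svd = Vt[0]

# Route 2: eigenvector of the covariance matrix with the largest eigenvalue.
C = Xc.T @ Xc / n
eigvals, eigvecs = np.linalg.eigh(C)   # eigh returns eigenvalues in ascending order
v1_eig = eigvecs[:, -1]

# The two vectors agree up to sign, and the variance along v1 is the top eigenvalue.
var_v1 = np.mean((Xc @ v1_svd) ** 2)
print(abs(v1_svd @ v1_eig), var_v1, eigvals[-1])
```

The sign of a singular vector or eigenvector is arbitrary, which is why the comparison uses the absolute value of the inner product.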
SECOND PC: $w \in \mathbb{R}^p$, $\|w\| = 1$, $w \perp$ first PC,
s.t. $\operatorname{Var}(w^T x)$ is max
$\max_w \|X_c w\|^2$
s.t. $\|w\| = 1$ and $w \perp v_1$
Solution: $v_2$
THIRD PC: $w \in \mathbb{R}^p$, $\|w\| = 1$, $w \perp$ first 2 PCs, s.t. $\operatorname{Var}(w^T x)$ is max
Solution: $v_3$
and so on
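One way to see why the constrained problem returns the next singular vector is deflation: remove the part of the data lying along $v_1$ and solve the unconstrained problem again. This is a sketch assuming NumPy and synthetic data with distinct variances; deflation is used here only as an illustration and is not stated in the notes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: 300 observations of 4 variables with distinct scales (illustrative).
X = rng.normal(size=(300, 4)) * np.array([4.0, 2.0, 1.0, 0.5])
Xc = X - X.mean(axis=0)

# All PC directions at once: rows of Vt are v_1, v_2, ...
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
v1, v2 = Vt[0], Vt[1]

# Deflation: subtract each observation's component along v1,
# so only directions orthogonal to v1 carry any variance.
X_deflated = Xc - np.outer(Xc @ v1, v1)

# The best direction for the deflated data is the second PC (up to sign).
_, _, Vt_defl = np.linalg.svd(X_deflated, full_matrices=False)
w2 = Vt_defl[0]

print(abs(w2 @ v2))   # close to 1: same direction up to sign
print(abs(w2 @ v1))   # close to 0: orthogonal to the first PC
```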
NEW VARIABLES (PCs):
$Z_j = X_c v_j$, $j = 1, \dots, p$, i.e. $Z = X_c V = U \Sigma$
Col means of $X_c$ = 0 $\Rightarrow$ col means of $Z$ = 0
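$\operatorname{Var}(Z_j) = \frac{1}{n} \|Z_j\|^2 = \frac{1}{n} \|\sigma_j u_j\|^2 = \frac{\sigma_j^2}{n} = \lambda_j$, the $j$-th eigenvalue of $C$
EIGENVALUES ARE VARIANCES of PCs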
$\sum_j \operatorname{Var}(Z_j) = \frac{1}{n} \|Z\|_F^2 = \frac{1}{n} \|X_c V\|_F^2 = \frac{1}{n} \|X_c\|_F^2 = \sum_j \operatorname{Var}(X_j)$
Sum of variances of PCs = sum of variances of individual variables
= TOTAL VARIATION IN DATA
$\operatorname{Cov}(Z_j, Z_k) = \frac{1}{n} Z_j^T Z_k = \frac{1}{n} \sigma_j \sigma_k\, u_j^T u_k = 0$ if $j \neq k$
PC VARIABLES ARE uncorrelated
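The properties of the new variables (zero column means, variances equal to the eigenvalues, total variance preserved, zero covariances) can be verified on synthetic data. This sketch assumes NumPy and the 1/n convention for variances; the data are random and only illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic correlated data: 500 observations of 3 variables (illustrative).
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 3))
Xc = X - X.mean(axis=0)
n = Xc.shape[0]

# Eigen-decomposition of the covariance matrix, sorted by decreasing eigenvalue.
C = Xc.T @ Xc / n
lam, V = np.linalg.eigh(C)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]

# New variables: each column of Z is one principal component.
Z = Xc @ V
cov_Z = Z.T @ Z / n     # covariance of the PCs (Z already has zero column means)

print(np.round(cov_Z, 6))                 # diagonal matrix: PCs are uncorrelated
print(lam)                                # diagonal entries equal the eigenvalues
print(lam.sum(), Xc.var(axis=0).sum())    # total variance is preserved
```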
PROCEDURE for PCs
1. Compute covariance matrix $C = \frac{1}{n} X_c^T X_c$
2. Find eigenvalues and eigenvectors
$\lambda_1 \geq \dots \geq \lambda_p$, $v_1, \dots, v_p$
3. Discard any component that accounts for only a small proportion of variation in the data
e.g. 20 variables:
3 PCs contribute > 90% of variation; on this basis ignore the other 17 components
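The three steps can be put together in a short function. This is a sketch under stated assumptions: NumPy, the 1/n covariance normalization, and a hypothetical function name `pca` with a hypothetical `var_threshold` parameter for the "small proportion" rule. The test data are synthetic: 20 variables generated from 3 latent factors plus a little noise, chosen to mimic the 20-variable example.

```python
import numpy as np

def pca(X, var_threshold=0.90):
    """Return PC scores, directions, all eigenvalues and the number of PCs kept.

    Keeps the smallest number of leading components whose eigenvalues
    sum to at least `var_threshold` of the total variance.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # center the columns
    n = Xc.shape[0]

    C = Xc.T @ Xc / n                        # step 1: covariance matrix
    lam, V = np.linalg.eigh(C)               # step 2: eigenvalues / eigenvectors
    order = np.argsort(lam)[::-1]            # sort by decreasing eigenvalue
    lam, V = lam[order], V[:, order]

    explained = np.cumsum(lam) / lam.sum()   # cumulative proportion of variation
    k = int(np.searchsorted(explained, var_threshold)) + 1   # step 3: how many to keep
    k = min(k, len(lam))
    return Xc @ V[:, :k], V[:, :k], lam, k

# Synthetic example: 20 observed variables driven by 3 latent factors.
rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.normal(size=(20, 20)))
W = Q[:, :3].T                                         # 3 orthonormal loading vectors
factors = rng.normal(size=(400, 3)) * np.array([3.0, 2.5, 2.0])
X = factors @ W + 0.05 * rng.normal(size=(400, 20))    # small measurement noise

Z, V_k, lam, k = pca(X)
print(k, lam[:k].sum() / lam.sum())   # components kept and the share of variation they carry
```

The 90% cutoff is a convention rather than a rule; in practice the threshold is chosen by looking at how quickly the eigenvalues fall off.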