Principal Components
• Karl Pearson
Principal Components (PC)
• Objective: Given an n x p data matrix (n elements, p variables), represent the data using r variables (r < p) with minimum loss of information.
We want to find a new set of p variables, Z, which are linear combinations of the original X variables, such that:
• r of them contain all the information
• The remaining p-r variables are noise
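As a minimal sketch of the objective above, the new variables Z can be obtained from the eigenvectors of the sample covariance matrix. The function name `principal_components`, the synthetic data, and the choice r = 2 are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def principal_components(X, r):
    """Return the first r principal components of an n x p data matrix X."""
    Xc = X - X.mean(axis=0)                 # center each variable
    S = np.cov(Xc, rowvar=False)            # p x p sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    A_r = eigvecs[:, :r]                    # p x r matrix of directions
    Z = Xc @ A_r                            # n x r matrix of new variables
    explained = eigvals[:r].sum() / eigvals.sum()  # share of total variance kept
    return Z, A_r, explained

# Hypothetical data: p = 5 observed variables driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
W = rng.normal(size=(2, 5))
X = latent @ W + 0.05 * rng.normal(size=(200, 5))

Z, A_r, explained = principal_components(X, r=2)
```

In this synthetic setting the first two components carry nearly all the variance, and the remaining three behave as noise.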
First interpretation of principal components: Optimal Data Representation
[Figure: a point xi, the direction a, the projection zi of xi onto a, and the residual ri]
Projection of a point onto direction a: minimizing the squared distance to the line implies maximizing the variance of the projection (assuming zero-mean variables).
$x_i^T x_i = r_i^T r_i + z_i^T z_i$
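The decomposition can be checked numerically. The sketch below assumes a unit-length direction a and centered synthetic data (both are illustrative choices); since the left-hand side does not depend on a, minimizing the squared residuals is the same as maximizing the squared projections.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical zero-mean data with different spread along each axis.
X = rng.normal(size=(100, 3)) @ np.diag([3.0, 1.0, 0.5])
X = X - X.mean(axis=0)

a = np.array([1.0, 2.0, -1.0])
a = a / np.linalg.norm(a)        # the identity requires a unit-length direction

z = X @ a                        # scalar projection of each point onto a
R = X - np.outer(z, a)           # residual vector of each point, orthogonal to a

lhs = (X ** 2).sum(axis=1)               # x_i' x_i
rhs = (R ** 2).sum(axis=1) + z ** 2      # r_i' r_i + z_i' z_i
identity_holds = np.allclose(lhs, rhs)
```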
Optimal Prediction
Find a new variable zi = a’Xi which is optimal for predicting the value of Xi for each element.
In general, find r variables, zi = Ar Xi, which are optimal for forecasting all Xi under the least-squares criterion.
It is easy to see that the solution requires zi = a’Xi to have maximum variance.
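A small numerical illustration of this claim, under the assumption that Xi is predicted from zi as zi times a unit vector a: the direction of maximum variance (the leading eigenvector) gives a squared prediction error no larger than any other direction tried. The helper `reconstruction_error` and the synthetic data are hypothetical.

```python
import numpy as np

def reconstruction_error(Xc, a):
    """Total squared error when each row of Xc is predicted from z = Xc a."""
    a = a / np.linalg.norm(a)
    z = Xc @ a
    return ((Xc - np.outer(z, a)) ** 2).sum()

rng = np.random.default_rng(2)
p = 4
Xc = rng.normal(size=(300, p)) @ rng.normal(size=(p, p))
Xc = Xc - Xc.mean(axis=0)

# Leading eigenvector of X'X is the direction of maximum variance.
eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc)
a_pc = eigvecs[:, -1]

err_pc = reconstruction_error(Xc, a_pc)
errs_random = [reconstruction_error(Xc, rng.normal(size=p)) for _ in range(1000)]
```

The error at the maximum-variance direction equals the sum of the remaining eigenvalues, which is why maximizing the variance of zi and minimizing the prediction error are the same problem.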
Second interpretation of PC:
The line that minimizes the orthogonal distances to the points gives an axis of the ellipsoid.
Third interpretation of PC
Find the optimal direction to represent the data: the axis of the ellipsoid which contains the data.
This is the idea behind Pearson's orthogonal regression.
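To make the contrast with ordinary regression concrete, here is a sketch on hypothetical two-variable data with noise in both coordinates. The orthogonal-regression line is taken along the first principal direction, while ordinary least squares minimizes vertical distances only. The helper `orthogonal_sse` is an illustrative name.

```python
import numpy as np

def orthogonal_sse(D, direction):
    """Sum of squared orthogonal distances from centered points to a line through the origin."""
    d = direction / np.linalg.norm(direction)
    resid = D - np.outer(D @ d, d)
    return (resid ** 2).sum()

rng = np.random.default_rng(3)
t = rng.normal(size=500)
x = t + 0.3 * rng.normal(size=500)          # noise in x as well as in y
y = 2.0 * t + 0.3 * rng.normal(size=500)

D = np.column_stack([x, y])
D = D - D.mean(axis=0)                      # center so both lines pass through the origin

# Orthogonal regression: first principal direction of the centered data.
eigvals, eigvecs = np.linalg.eigh(np.cov(D, rowvar=False))
a = eigvecs[:, -1]
slope_orth = a[1] / a[0]

# Ordinary least squares of y on x, for comparison.
slope_ols = np.polyfit(D[:, 0], D[:, 1], 1)[0]

sse_orth = orthogonal_sse(D, a)
sse_ols = orthogonal_sse(D, np.array([1.0, slope_ols]))
```

The ordinary least-squares slope is pulled toward zero by the noise in x, while the principal direction treats both variables symmetrically.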
Second component
Properties of PC
Standardized PC
Example Inves
Example Medifis
Example Mundodes
Example for image analysis
The analysis was done with 16 images. With PC, instead of sending 16 matrices of N² pixels,
16 3 70,616
we send a 16x3 matrix with the values of the components and a 3xN² matrix with the values of the new variables. We save
If instead of 16 images we have 100 images, we save 95%.
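The saving can be sketched as a simple count of transmitted values. The function `transmission_saving` and the image side N = 256 are hypothetical. The number of components kept in the 100-image case is not stated in the slides, so it is left as a parameter.

```python
def transmission_saving(m, r, n_pixels):
    """Fraction of values saved by sending r components instead of m full images.

    m        -- number of images
    r        -- number of principal components kept
    n_pixels -- pixels per image (N squared)
    """
    original = m * n_pixels               # m matrices of N^2 pixels
    compressed = m * r + r * n_pixels     # m x r coefficients plus r x N^2 new variables
    return 1 - compressed / original

# Hypothetical image size: N = 256, so N^2 = 65536 pixels per image.
saving_16 = transmission_saving(m=16, r=3, n_pixels=256 * 256)
```

For large images the m x r term is negligible, so the saving is close to 1 - r/m and grows as more images share the same few components.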