Principal Components Analysis (PCA)


Transcript of Principal Components Analysis (PCA)

Page 1: Principal Components Analysis (PCA)

Principal Components Analysis (PCA)

Page 2: Principal Components Analysis (PCA)

Principal Components Analysis (PCA)

a technique for finding patterns in data of high dimension

Page 3: Principal Components Analysis (PCA)

Outline:

1. Eigenvectors and eigenvalues

2. PCA:

a) Getting the data

b) Centering the data

c) Obtaining the covariance matrix

d) Performing an eigenvalue decomposition of the covariance matrix

e) Choosing components and forming a feature vector

f) Deriving the new data set


Pages 5-12: Principal Components Analysis (PCA)

Eigenvectors & eigenvalues:

( 2  3 )   ( 1 )   ( 11 )
( 2  1 ) * ( 3 ) = (  5 )

not an integer multiple of the original vector

( 2  3 )   ( 3 )   ( 12 )        ( 3 )
( 2  1 ) * ( 2 ) = (  8 )  = 4 * ( 2 )

4 times the original vector
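A quick numerical check of the two products above (a minimal numpy sketch; the matrix and vectors are exactly those on the slides):

```python
import numpy as np

A = np.array([[2, 3],
              [2, 1]])     # the transformation matrix

v1 = np.array([1, 3])      # not an eigenvector of A
v2 = np.array([3, 2])      # an eigenvector of A

print(A @ v1)              # [11  5] - not a multiple of (1, 3)
print(A @ v2)              # [12  8] - exactly 4 * (3, 2)
```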

Pages 13-15: Principal Components Analysis (PCA)

Eigenvectors & eigenvalues:

In the second product above, ( 2 3 ; 2 1 ) is the transformation matrix, (3, 2) is an eigenvector (this vector and all multiples of it), and 4 is the eigenvalue.

[Plot: the eigenvector (3,2) and its image (12,8) lie on the same line through the origin - the transformation stretches the eigenvector but does not change its direction.]

Pages 16-22: Principal Components Analysis (PCA)

Eigenvectors & eigenvalues:

Some properties of eigenvectors:

- eigenvectors can only be found for square matrices

- not every square matrix has (real) eigenvectors

- a symmetric n x n matrix (such as a covariance matrix) always has n eigenvectors

- the eigenvectors of a symmetric matrix are orthogonal, i.e. at right angles to each other

- if an eigenvector is scaled by some amount before multiplying the matrix by it, the result is still the same multiple (because scaling a vector only changes its length, not its direction)
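A small check of the orthogonality property for a symmetric matrix (a sketch; the matrix below is an arbitrary symmetric example, not from the slides):

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 3.0]])       # symmetric, like a covariance matrix

vals, vecs = np.linalg.eigh(S)   # eigh handles symmetric matrices; one eigenvector per column
print(vecs[:, 0] @ vecs[:, 1])   # ~0: the two eigenvectors are orthogonal
print(np.allclose(vecs.T @ vecs, np.eye(2)))   # True: unit length and orthogonal
```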

Pages 23-26: Principal Components Analysis (PCA)

Eigenvectors & eigenvalues:

Scaling the eigenvector (3, 2) by 2 gives (6, 4); the multiple stays the same:

( 2  3 )   ( 6 )   ( 24 )        ( 6 )
( 2  1 ) * ( 4 ) = ( 16 )  = 4 * ( 4 )

still 4 times the (scaled) original vector

→ thus, eigenvalues and eigenvectors come in pairs
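The scaling property can be checked the same way (a sketch; any nonzero multiple of (3, 2) gives the same eigenvalue 4):

```python
import numpy as np

A = np.array([[2, 3],
              [2, 1]])

v = np.array([3, 2])           # eigenvector with eigenvalue 4

for c in (1, 2, 10, -0.5):     # arbitrary scalings of the eigenvector
    w = c * v
    print(A @ w, "is 4 *", w)  # A @ w always equals 4 * w
```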

Page 27: Principal Components Analysis (PCA)

Outline:

1. Eigenvectors and eigenvalues

2. PCA:

a) Getting the data

b) Centering the data

c) Obtaining the covariance matrix

d) Performing an eigenvalue decomposition of the covariance matrix

e) Choosing components and forming a feature vector

f) Deriving the new data set


Pages 29-32: Principal Components Analysis (PCA)

PCA

- a technique for identifying patterns in data and expressing the data in such a way as to highlight their similarities and differences

- once these patterns are found, the data can be compressed (i.e. the number of dimensions can be reduced) without much loss of information

- example: data of 2 dimensions:

original data:            centered data:

   x     y                   x     y
 -0.7   0.2               -1.7  -0.8
  2.1   2.7                1.1   1.7
  1.7   2.3                0.7   1.3
  1.4   0.8                0.4  -0.2
  1.9   2.0                0.9   1.0
  1.8   1.0                0.8   0.0
  0.4  -0.4               -0.6  -1.4
  1.1   0.3                0.1  -0.7
  0.9   0.4               -0.1  -0.6
 -0.6   0.7               -1.6  -0.3

(centering: subtract the mean of each dimension from every value - here both means are 1.0 - so the centered data have mean 0 in each dimension)

[Scatter plots of the original data (x vs. y) and the centered data (x centered vs. y centered)]
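Centering can be reproduced in a couple of lines (a sketch; `data` holds the ten points from the table above):

```python
import numpy as np

data = np.array([[-0.7, 0.2], [2.1, 2.7], [1.7, 2.3], [1.4, 0.8], [1.9, 2.0],
                 [1.8, 1.0], [0.4, -0.4], [1.1, 0.3], [0.9, 0.4], [-0.6, 0.7]])

centered = data - data.mean(axis=0)   # subtract each column's mean (both are 1.0 here)
print(centered)                       # matches the centered table above
```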

Pages 33-34: Principal Components Analysis (PCA)

Covariance matrix:

       x          y
x   1.015556   0.696667
y   0.696667   1.017778

Eigenvalues of the covariance matrix:

1.7133342
0.3199991

Eigenvectors of the covariance matrix (* unit eigenvectors, one per column):

( 0.706543   -0.70767  )
( 0.70767     0.706543 )

[Scatter plot of the centered data]
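The covariance matrix and its eigendecomposition can be reproduced as follows (a sketch continuing from the previous snippet; np.cov divides by n-1, which matches the values on the slide, and np.linalg.eigh returns eigenvalues in ascending order with the eigenvectors as columns, possibly with flipped signs):

```python
import numpy as np

C = np.cov(centered, rowvar=False)   # 2x2 covariance matrix of the centered data
vals, vecs = np.linalg.eigh(C)       # eigendecomposition of a symmetric matrix

print(C)       # [[1.015556 0.696667]
               #  [0.696667 1.017778]]
print(vals)    # [0.32     1.713334]  - ascending order
print(vecs)    # columns are the unit eigenvectors
```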

Page 35: Principal Components Analysis (PCA)

Compression and reduced dimensionality:

- the eigenvector associated with the highest eigenvalue is the principal component of the dataset; it captures the most significant relationship between the data dimensions

- ordering eigenvalues from highest to lowest gives the components in order of significance

- if we want, we can ignore the components of lesser significance – this results in a loss of information, but if the eigenvalues are small, the loss will not be great

- thus, in a dataset with n dimensions/variables, one may obtain the n eigenvectors & eigenvalues and decide to retain p of them → this results in a final dataset with only p dimensions

- feature vector – a matrix whose columns are the eigenvectors representing the dimensions we want to keep – in our example data, 2 choices:

( 0.706543   -0.70767  )        ( 0.706543 )
( 0.70767     0.706543 )   or   ( 0.70767  )

- the final dataset is obtained by multiplying the transpose of the feature vector (on the left) with the transposed centered dataset

- this will give us the original data solely in terms of the vectors we chose
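Both choices can be written down directly (a sketch; the names `feature_both` and `feature_pc1` are mine, and since eigh sorts eigenvalues in ascending order, the principal component is the last column of `vecs`):

```python
import numpy as np

data = np.array([[-0.7, 0.2], [2.1, 2.7], [1.7, 2.3], [1.4, 0.8], [1.9, 2.0],
                 [1.8, 1.0], [0.4, -0.4], [1.1, 0.3], [0.9, 0.4], [-0.6, 0.7]])
centered = data - data.mean(axis=0)
vals, vecs = np.linalg.eigh(np.cov(centered, rowvar=False))

feature_both = vecs[:, ::-1]    # both eigenvectors, principal component first (2x2)
feature_pc1 = vecs[:, [-1]]     # only the principal component (2x1)

new_data_2d = feature_both.T @ centered.T   # 2x2 * 2x10 = 2x10
new_data_1d = feature_pc1.T @ centered.T    # 1x2 * 2x10 = 1x10
```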

Pages 36-37: Principal Components Analysis (PCA)

Our example:

a) retain both eigenvectors:

( 0.706543    0.70767  )
( -0.70767    0.706543 )  *  data_t  =  new data

     2x2           2x10        2x10

[Plot of the new data on PC1/PC2 axes: the original data, rotated so that the eigenvectors are the axes - no loss of information.]

b) retain only the first eigenvector / principal component:

( 0.706543  0.70767 )  *  data_t  =  new data

     1x2         2x10        1x10

[Plot of the new data on the PC1 axis only: only 1 dimension left – we threw away the other axis.]

Thus: we transformed the data so that it is expressed in terms of the patterns, where the patterns are the lines that most closely describe the relationships between the data. The values of the new data points tell us exactly where (i.e., above/below) the trend lines each data point sits.
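Rotating back shows what is kept and what is lost (a sketch, same setup as the previous snippet; the round-trip variable names are mine):

```python
import numpy as np

data = np.array([[-0.7, 0.2], [2.1, 2.7], [1.7, 2.3], [1.4, 0.8], [1.9, 2.0],
                 [1.8, 1.0], [0.4, -0.4], [1.1, 0.3], [0.9, 0.4], [-0.6, 0.7]])
centered = data - data.mean(axis=0)
vals, vecs = np.linalg.eigh(np.cov(centered, rowvar=False))
feature_both = vecs[:, ::-1]
feature_pc1 = vecs[:, [-1]]

back_2d = (feature_both @ (feature_both.T @ centered.T)).T  # project, then rotate back
back_1d = (feature_pc1 @ (feature_pc1.T @ centered.T)).T

print(np.allclose(back_2d, centered))     # True - both components: no loss
print(np.abs(back_1d - centered).max())   # small but nonzero - one axis thrown away
```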

Page 38: Principal Components Analysis (PCA)

- similar to Cholesky decomposition, as one can use both to fully decompose the original covariance matrix

- in addition, both produce uncorrelated factors

Eigenvalue decomposition:

C = E V E^-1

For each eigenpair (eigenvalue v1, eigenvector e1):

( c11 c21 )   ( e11 )        ( e11 )
( c21 c22 ) * ( e21 ) = v1 * ( e21 )

and collecting all eigenpairs:

( c11 c21 )   ( e11 e12 )   ( v11   0  )   ( e11 e12 )^-1
( c21 c22 ) = ( e21 e22 ) * (  0   v22 ) * ( e21 e22 )

where the columns of E are the unit eigenvectors and V is the diagonal matrix of eigenvalues.

Cholesky decomposition:

C = Λ Ψ Λ^t

( c11 c21 )   ( l11   0  )   ( ψ11   0  )   ( l11 l21 )
( c21 c22 ) = ( l21  l22 ) * (  0   ψ22 ) * (  0  l22 )

where Λ is lower triangular and Ψ is diagonal.

[Path diagrams: Cholesky factors (ch1, ch2) and principal components (pc1, pc2) loading on the variables var1 and var2]
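The two decompositions can be compared numerically (a sketch; numpy's np.linalg.cholesky returns C = L L^t with no separate diagonal matrix, so rescaling it into the slide's Λ Ψ Λ^t form is my assumption about the intended notation):

```python
import numpy as np

C = np.array([[1.015556, 0.696667],
              [0.696667, 1.017778]])   # the covariance matrix from the example

# eigenvalue decomposition: C = E V E^-1 (E is orthogonal here, so E^-1 = E^t)
vals, E = np.linalg.eigh(C)
print(np.allclose(E @ np.diag(vals) @ E.T, C))   # True

# Cholesky decomposition: numpy gives C = L L^t with L lower triangular
L = np.linalg.cholesky(C)
print(np.allclose(L @ L.T, C))                   # True

# rescaling L's columns by its diagonal gives the C = Lam Psi Lam^t form
d = np.diag(L)
Lam, Psi = L / d, np.diag(d**2)                  # unit-diagonal Lam, diagonal Psi
print(np.allclose(Lam @ Psi @ Lam.T, C))         # True
```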

Page 39: Principal Components Analysis (PCA)

Thank you for your attention.