Introduction
Introduction
• Given a matrix of distances D (square and symmetric, with zeros on the main diagonal), find variables that are able, at least approximately, to generate these distances.
• The matrix can also be a similarity matrix: square and symmetric, but with ones on the main diagonal and values between zero and one elsewhere.
• Broadly: distance (0 ≤ d ≤ 1) = 1 − similarity
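The similarity-to-distance conversion above can be sketched in a few lines; the matrix S below is purely illustrative:

```python
import numpy as np

# Hypothetical 3x3 similarity matrix: symmetric, ones on the
# diagonal, values between 0 and 1 elsewhere.
S = np.array([[1.0, 0.8, 0.3],
              [0.8, 1.0, 0.5],
              [0.3, 0.5, 1.0]])

# Broad conversion rule from the text: distance = 1 - similarity.
# The result is symmetric with zeros on the main diagonal.
D = 1.0 - S
print(D)
```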
Principal Coordinates (Metric Multidimensional Scaling)
• Given the distance matrix D, can we find a set of variables able to generate it?
• Can we find a data matrix X able to generate D?
• Main idea of the procedure:
(1) understand how to obtain D when X is known and given;
(2) then work backwards to build the matrix X given D.
Procedure
Remember that given a data matrix X we obtain a zero-mean data matrix by the transformation X̃ = PX, where P = I − (1/n)11′ is the centering matrix.
With this matrix we can compute two square and symmetric matrices:
The first is the covariance matrix S = (1/n)X̃′X̃.
The second is the matrix Q = X̃X̃′ of scalar products among the observations.
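The centering step and the two matrices it produces can be sketched as follows; the data matrix X is an illustrative example, and S = X̃′X̃/n is assumed as the covariance definition:

```python
import numpy as np

# Illustrative data matrix X: n = 4 observations, p = 2 variables.
X = np.array([[1.0, 2.0],
              [3.0, 0.0],
              [4.0, 5.0],
              [0.0, 1.0]])
n = X.shape[0]

# Centering matrix P = I - (1/n) 1 1'; P X removes the column means.
P = np.eye(n) - np.ones((n, n)) / n
Xc = P @ X                  # zero-mean data matrix

# The two square, symmetric matrices built from the centered data:
S = Xc.T @ Xc / n           # p x p covariance matrix
Q = Xc @ Xc.T               # n x n scalar products among observations
```

Because the columns of Xc have zero mean, every row (and column) of Q sums to zero, a fact used later to recover Q from D.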
The matrix of products Q is closely related to the distance matrix D we are interested in. The relation between D and Q is as follows:
Main result: given the matrix Q we can obtain the matrix D.
Elements of Q: q_ij = x_i′x_j, the scalar product between observations i and j.
Elements of D: d_ij² = q_ii + q_jj − 2q_ij.
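A small numerical check of this elementwise relation (the centered matrix Xc below is illustrative):

```python
import numpy as np

# Small centered data matrix (zero column means), assumed for illustration.
Xc = np.array([[-1.0,  0.0],
               [ 1.0,  1.0],
               [ 0.0, -1.0]])
Q = Xc @ Xc.T                           # q_ij = x_i' x_j

# Squared distances from Q: d_ij^2 = q_ii + q_jj - 2 q_ij.
q = np.diag(Q)
D2 = q[:, None] + q[None, :] - 2.0 * Q

# Cross-check against direct pairwise squared Euclidean distances.
D2_direct = ((Xc[:, None, :] - Xc[None, :, :]) ** 2).sum(-1)
print(np.allclose(D2, D2_direct))
```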
How to recover Q given D?
Let t = trace(Q). Note that, since the variables have zero mean, the sum of any row of Q must be zero.
1. Method to recover Q given D
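The recovery can be sketched by double centering, assuming the entries of D are squared Euclidean distances, as in the usual principal-coordinates derivation; the configuration X below exists only so the result can be verified:

```python
import numpy as np

# Build a squared-distance matrix D2 from known points, so the
# recovery can be checked; in practice only D2 would be given.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)

n = D2.shape[0]
P = np.eye(n) - np.ones((n, n)) / n     # centering matrix

# Double centering recovers the scalar-product matrix: Q = -(1/2) P D2 P.
Q = -0.5 * P @ D2 @ P

# Q matches the scalar products of the centered configuration.
Xc = X - X.mean(axis=0)
print(np.allclose(Q, Xc @ Xc.T))
```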
2. Obtain X given Q
We cannot recover X exactly, because this problem has many solutions:
if Q = XX′, then also Q = XAA′X′ for any orthogonal matrix A (since AA′ = I). Thus Y = XA is also a solution.
The standard solution: take the spectral decomposition of the matrix Q,
Q = ABA′, where A and B contain the nonzero eigenvectors and eigenvalues of the matrix, and take as solution X = AB^(1/2).
Note that XX′ = AB^(1/2)B^(1/2)A′ = ABA′ = Q, so this X reproduces Q exactly.
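The spectral-decomposition step can be sketched numerically; the configuration X0 used to build Q is illustrative:

```python
import numpy as np

# A scalar-product matrix Q built from an assumed configuration.
X0 = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 3.0], [4.0, 1.0]])
Xc0 = X0 - X0.mean(axis=0)
Q = Xc0 @ Xc0.T

# Spectral decomposition Q = A B A': eigenvectors in A, eigenvalues in B.
evals, evecs = np.linalg.eigh(Q)
keep = evals > 1e-10                    # keep only nonzero eigenvalues
A = evecs[:, keep]
B = np.diag(evals[keep])

# Recovered coordinates X = A B^(1/2); they reproduce Q exactly.
Xrec = A @ np.sqrt(B)
print(np.allclose(Xrec @ Xrec.T, Q))
```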
Conclusion
• We say that D is compatible with a Euclidean metric if the matrix Q obtained as Q = −(1/2)PDP is nonnegative definite (all eigenvalues nonnegative).
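This compatibility test can be sketched as a small checker; it assumes the input holds squared distances, and the two example matrices (one Euclidean, one violating the triangle inequality) are illustrative:

```python
import numpy as np

def euclidean_compatible(D2, tol=1e-10):
    """True if the squared-distance matrix D2 is compatible with a
    Euclidean metric: Q = -(1/2) P D2 P has no negative eigenvalues."""
    n = D2.shape[0]
    P = np.eye(n) - np.ones((n, n)) / n
    Q = -0.5 * P @ D2 @ P
    return np.linalg.eigvalsh(Q).min() >= -tol

# Equilateral triangle with unit sides: Euclidean.
D_ok = np.array([[0.0, 1.0, 1.0],
                 [1.0, 0.0, 1.0],
                 [1.0, 1.0, 0.0]]) ** 2
# Distances violating the triangle inequality (1 + 1 < 3): not Euclidean.
D_bad = np.array([[0.0, 1.0, 1.0],
                  [1.0, 0.0, 3.0],
                  [1.0, 3.0, 0.0]]) ** 2

print(euclidean_compatible(D_ok), euclidean_compatible(D_bad))
```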
Summary of the procedure
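The whole procedure (double centering, eigendecomposition, coordinates from the largest eigenvalues) can be condensed into one sketch; the rectangle of points used for verification is illustrative:

```python
import numpy as np

def principal_coordinates(D2, k=2):
    """Metric MDS sketch: k-dimensional coordinates from squared
    distances D2, via Q = -(1/2) P D2 P and its eigenstructure."""
    n = D2.shape[0]
    P = np.eye(n) - np.ones((n, n)) / n
    Q = -0.5 * P @ D2 @ P
    evals, evecs = np.linalg.eigh(Q)
    order = np.argsort(evals)[::-1][:k]      # largest eigenvalues first
    lam = np.clip(evals[order], 0.0, None)   # guard tiny negatives
    return evecs[:, order] * np.sqrt(lam)

# Verify on points whose distances we know: a 3-by-4 rectangle.
X = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
Y = principal_coordinates(D2, k=2)
D2_rec = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
print(np.allclose(D2, D2_rec))
```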
Example 1. Cities
(Note that the rows and columns add up to zero. The matrix has been divided by 10000.)
Example 1. Eigenstructure of Q:
Final coordinates for the cities taking two dimensions:
Example 1. Plot
Similarities matrix
Example 2: similarity between products
Example 2
Relationship with PC
• PC (principal components): eigenvalues and eigenvectors of S.
• Principal coordinates: eigenvalues and eigenvectors of Q.
If the data are metric, both give identical results. Principal coordinates generalize PC to data that are not exactly metric.
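The equivalence for metric data can be checked numerically: the PCA scores (from S) and the principal coordinates (from Q) agree up to the sign of each axis. The data matrix below is illustrative:

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 3.0], [4.0, 1.0], [0.0, 0.0]])
Xc = X - X.mean(axis=0)
n = Xc.shape[0]

# PCA scores: project the centered data on the eigenvectors of S.
S = Xc.T @ Xc / n
_, V = np.linalg.eigh(S)
scores = Xc @ V[:, ::-1]                 # descending eigenvalue order

# Principal coordinates: eigendecompose Q = Xc Xc'.
Q = Xc @ Xc.T
evals, U = np.linalg.eigh(Q)
order = np.argsort(evals)[::-1][:2]
coords = U[:, order] * np.sqrt(np.clip(evals[order], 0.0, None))

# Identical configurations, up to the sign of each column.
print(np.allclose(np.abs(scores), np.abs(coords)))
```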
Biplots
Represent jointly the observations, by the rows of V2, and the variables, by the coordinates D2^(1/2)A2′.
They are called biplots because a two-dimensional approximation of the data matrix is made.
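A biplot of this kind can be sketched with the rank-2 SVD of the centered data matrix; the split of the singular values between row and column markers below is one common choice, and the data are illustrative:

```python
import numpy as np

X = np.array([[1.0, 2.0, 0.0],
              [3.0, 3.0, 1.0],
              [4.0, 1.0, 2.0],
              [0.0, 0.0, 3.0]])
Xc = X - X.mean(axis=0)

# Rank-2 SVD of the centered data matrix.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
U2, s2, Vt2 = U[:, :2], s[:2], Vt[:2, :]

# Markers for observations (rows) and variables (columns),
# splitting the singular values evenly between the two sets.
row_marks = U2 * np.sqrt(s2)
col_marks = (np.sqrt(s2)[:, None] * Vt2).T

# Their product is the best rank-2 approximation of the data matrix,
# which is why plotting both sets together approximates the data.
X2 = row_marks @ col_marks.T
```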
Biplot
Nonmetric MDS
A common method
• Idea: if there is a monotone relation between x and y, there must be an exact linear relationship between the ranks of the two variables.
• Use ordinal (monotone) regression, or assign ranks and run a regression between the ranks, iterating.
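The rank idea can be illustrated directly: a monotone but nonlinear transform leaves the ranks unchanged, so the rank-rank relation is exactly linear. The data below are simulated for illustration:

```python
import numpy as np

# y = exp(x) is monotone in x but far from linear.
rng = np.random.default_rng(0)
x = rng.normal(size=20)
y = np.exp(x)

# Rank of each element (0-based), via double argsort.
rank_x = np.argsort(np.argsort(x))
rank_y = np.argsort(np.argsort(y))

# A monotone increasing transform preserves the ordering, so the
# ranks coincide and their correlation (Spearman-type) equals 1.
r = np.corrcoef(rank_x, rank_y)[0, 1]
print(r)
```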