Introduction

• Given a distance matrix D (square and symmetric, with zeros on the main diagonal), find variables that could, approximately, generate these distances.

• The matrix can also be a similarity matrix: square and symmetric, but with ones on the main diagonal and values between zero and one elsewhere.

• Broadly: distance = 1 - similarity, so that $0 \le d \le 1$.
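
The conversion in the last bullet can be sketched in a few lines of NumPy. The function name and the similarity values below are hypothetical and purely illustrative; the only rule taken from the slide is d = 1 - s.

```python
import numpy as np

def similarity_to_distance(S):
    """Convert a similarity matrix (ones on the diagonal, values in [0, 1])
    into a distance matrix using d = 1 - s."""
    S = np.asarray(S, dtype=float)
    if S.shape[0] != S.shape[1] or not np.allclose(S, S.T):
        raise ValueError("S must be square and symmetric")
    return 1.0 - S

# Hypothetical similarities among three items (illustrative values only).
S = np.array([[1.0, 0.8, 0.3],
              [0.8, 1.0, 0.5],
              [0.3, 0.5, 1.0]])
D = similarity_to_distance(S)
```

The result has zeros on the diagonal and stays symmetric, so it has the form required of a distance matrix in the first bullet.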

Principal Coordinates (Metric Multidimensional Scaling)

• Given the distance matrix D, can we find a set of variables able to generate it?

• Can we find a data matrix X able to generate D?

• Main idea of the procedure:

(1) Understand how to obtain D when X is known.

(2) Then work backwards to build the matrix X given D.

Procedure

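Recall that, given an n × p data matrix X, we obtain a zero-mean data matrix by the transformation

$$\tilde{X} = \left(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}'\right)X = PX$$

With this matrix we can compute two square and symmetric matrices.

The first is the covariance matrix S:

$$S = \tilde{X}'\tilde{X}/n$$

The second is the matrix Q of scalar products among observations:

$$Q = \tilde{X}\tilde{X}'$$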
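
As a minimal sketch of this slide, the centring transformation and the two matrices can be computed as follows. The data values are hypothetical, and the divisor n in S is an assumption (some texts use n - 1).

```python
import numpy as np

# Hypothetical data: n = 5 observations, p = 2 variables (illustrative only).
X = np.array([[2.0, 1.0],
              [4.0, 3.0],
              [6.0, 2.0],
              [8.0, 7.0],
              [5.0, 2.0]])
n, p = X.shape

# Centring matrix P = I - (1/n) 1 1'; P @ X subtracts the column means.
P = np.eye(n) - np.ones((n, n)) / n
Xc = P @ X

S = Xc.T @ Xc / n   # p x p covariance matrix (divisor n is an assumption)
Q = Xc @ Xc.T       # n x n matrix of scalar products among observations
```

S is p × p and describes the variables, while Q is n × n and describes the observations; both are built from the same centred matrix.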

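The matrix of products Q is closely related to the distance matrix D we are interested in. The relation between D and Q is as follows.

Elements of Q, where $\tilde{x}_i'$ is the i-th row of the zero-mean data matrix:

$$q_{ij} = \tilde{x}_i'\tilde{x}_j$$

Elements of D (squared Euclidean distances):

$$d_{ij}^2 = (\tilde{x}_i - \tilde{x}_j)'(\tilde{x}_i - \tilde{x}_j) = q_{ii} + q_{jj} - 2q_{ij}$$

Main result: given the matrix Q, we can obtain the matrix D.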
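
The relation between Q and D can be checked numerically. The sketch below assumes that D holds squared Euclidean distances, so that d_ij^2 = q_ii + q_jj - 2 q_ij; the data are random and hypothetical, and the function name is illustrative.

```python
import numpy as np

def squared_distances_from_Q(Q):
    """Return D with entries d_ij^2 = q_ii + q_jj - 2 q_ij."""
    q = np.diag(Q)
    return q[:, None] + q[None, :] - 2.0 * Q

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))        # hypothetical data
Xc = X - X.mean(axis=0)            # zero-mean data matrix
Q = Xc @ Xc.T
D = squared_distances_from_Q(Q)

# Direct computation of squared Euclidean distances for comparison.
diff = X[:, None, :] - X[None, :, :]
D_direct = (diff ** 2).sum(axis=2)
```

Centring does not change the distances, which is why the comparison can use the uncentred X.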

How to recover Q given D?

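Note that, because the variables have zero mean, the sum of any row (or column) of Q must be zero: $\sum_j q_{ij} = 0$.

Summing $d_{ij}^2 = q_{ii} + q_{jj} - 2q_{ij}$ over $i$ therefore gives $\sum_i d_{ij}^2 = t + n\,q_{jj}$, where $t = \operatorname{trace}(Q)$.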

1. Method to recover Q given D
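
A minimal sketch of this step, assuming D holds squared distances: Q is obtained by double centring, either in matrix form, Q = -(1/2) P D P, or element by element from the row, column and grand means of D. Both function names are hypothetical.

```python
import numpy as np

def recover_Q(D):
    """Double-centre a matrix of squared distances: Q = -(1/2) P D P."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    P = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * P @ D @ P

def recover_Q_elementwise(D):
    """Same result from the row, column and grand means of D."""
    D = np.asarray(D, dtype=float)
    row = D.mean(axis=1, keepdims=True)
    col = D.mean(axis=0, keepdims=True)
    return -0.5 * (D - row - col + D.mean())

# Build a D from hypothetical data so that the true Q is known.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 2))
Xc = X - X.mean(axis=0)
Q_true = Xc @ Xc.T
q = np.diag(Q_true)
D = q[:, None] + q[None, :] - 2.0 * Q_true
```

Both functions return the same matrix; the element-wise form makes explicit that only means of D are needed.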

2. Obtain X given Q

We cannot recover X exactly, because this problem has many solutions.

If $Q = XX'$, then also $Q = XAA^{-1}X' = (XA)(XA)'$ for any orthogonal matrix A, since $A^{-1} = A'$. Thus $B = XA$ is also a solution.

The standard solution: compute the spectral decomposition of the matrix Q,

$$Q = ABA'$$

where A and B now contain the eigenvectors associated with the nonzero eigenvalues of Q and the diagonal matrix of those eigenvalues, and take as solution $X = AB^{1/2}$.

Note that $XX' = AB^{1/2}B^{1/2}A' = ABA' = Q$.
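
The non-uniqueness and the standard solution can be illustrated with a short sketch. It assumes Q is nonnegative definite; the tolerance used to decide which eigenvalues count as nonzero is an arbitrary choice, and the function name and data are hypothetical.

```python
import numpy as np

def coordinates_from_Q(Q, tol=1e-10):
    """Return X = A B^{1/2} using the eigenpairs of Q with positive eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(Q)          # ascending order
    order = np.argsort(eigvals)[::-1]             # largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > tol * max(eigvals[0], 1.0)   # drop numerically zero ones
    return eigvecs[:, keep] * np.sqrt(eigvals[keep])

rng = np.random.default_rng(2)
X0 = rng.normal(size=(6, 2))                      # hypothetical data
X0 -= X0.mean(axis=0)
Q = X0 @ X0.T

X = coordinates_from_Q(Q)

# Any orthogonal matrix R (here a rotation by 30 degrees) gives another solution.
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X_rot = X @ R
```

Both X and X_rot reproduce Q, so the spectral solution is one convenient representative of a whole family of rotated configurations.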

Conclusion

• We say that D is compatible with a Euclidean metric if the matrix $Q = -\tfrac{1}{2}PDP$ is nonnegative definite (all its eigenvalues are nonnegative).
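
This condition can be tested numerically, as in the sketch below. It assumes D holds squared distances; the function name, the tolerance and the two small example matrices are hypothetical.

```python
import numpy as np

def is_euclidean(D, tol=1e-8):
    """Check whether Q = -(1/2) P D P has no negative eigenvalues.

    D is assumed to hold squared distances; tol absorbs rounding error.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    P = np.eye(n) - np.ones((n, n)) / n
    Q = -0.5 * P @ D @ P
    eigvals = np.linalg.eigvalsh((Q + Q.T) / 2)   # symmetrise before eigvalsh
    return bool(eigvals.min() >= -tol * max(abs(eigvals).max(), 1.0))

# Squared distances among three collinear points at 0, 1 and 3: Euclidean.
D_ok = np.array([[0.0, 1.0, 9.0],
                 [1.0, 0.0, 4.0],
                 [9.0, 4.0, 0.0]])

# Distances 1, 1 and 3 violate the triangle inequality; squared: 1, 1, 9.
D_bad = np.array([[0.0, 1.0, 9.0],
                  [1.0, 0.0, 1.0],
                  [9.0, 1.0, 0.0]])
```

Here `is_euclidean(D_ok)` holds, while `is_euclidean(D_bad)` does not, because no configuration of points can have distances that break the triangle inequality.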

Summary of the procedure
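
The steps described so far (build Q from D, take its spectral decomposition, scale the leading eigenvectors) can be put together in one function. This is a sketch under stated assumptions rather than a definitive implementation: D is taken to hold squared distances, the name `principal_coordinates` is hypothetical, and clipping negative eigenvalues to zero is a choice made here for non-Euclidean input, not something stated in the slides.

```python
import numpy as np

def principal_coordinates(D, k=2):
    """Principal coordinates from a matrix D of squared distances.

    Returns (X, eigvals): the n x k coordinates and all eigenvalues of Q,
    sorted from largest to smallest.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Step 1: Q = -(1/2) P D P with P = I - (1/n) 1 1'.
    P = np.eye(n) - np.ones((n, n)) / n
    Q = -0.5 * P @ D @ P
    # Step 2: spectral decomposition of Q.
    eigvals, eigvecs = np.linalg.eigh((Q + Q.T) / 2)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 3: keep the k largest eigenvalues; negative ones are clipped to
    # zero (an assumption for non-Euclidean input, not from the slides).
    lam = np.clip(eigvals[:k], 0.0, None)
    X = eigvecs[:, :k] * np.sqrt(lam)
    return X, eigvals

# Hypothetical configuration: corners of a 3 x 4 rectangle.
pts = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0]])
D = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
X, eigvals = principal_coordinates(D, k=2)
D_hat = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
```

For this planar example two dimensions reproduce D exactly; the recovered coordinates are centred and may be a rotated or reflected copy of the original corners.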

Example 1. Cities

(Note that the rows and columns add up to zero. The matrix has been divided by 10000.)

Example 1. Eigenstructure of Q:

Final coordinates for the cities taking two dimensions:

Example 1. Plot

Similarities matrix

Example 2: similarity between products

Example 2

Relationship with PC

• Principal components: eigenvalues and eigenvectors of S

• Principal coordinates: eigenvalues and eigenvectors of Q

If the data are metric, both are identical. Principal coordinates generalize principal components to data that are not exactly metric.
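
The equivalence for metric data can be verified on a small example. The sketch assumes S is computed with divisor n, in which case the nonzero eigenvalues of Q equal n times those of S; the data are random and hypothetical, and the two sets of coordinates agree only up to the sign of each column.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(8, 3))            # hypothetical metric data
Xc = X - X.mean(axis=0)
n, p = Xc.shape

S = Xc.T @ Xc / n                      # covariance matrix (divisor n assumed)
Q = Xc @ Xc.T                          # scalar products among observations

# Principal components: project the centred data on the eigenvectors of S.
s_vals, s_vecs = np.linalg.eigh(S)
s_order = np.argsort(s_vals)[::-1]
s_vals, s_vecs = s_vals[s_order], s_vecs[:, s_order]
pc_scores = Xc @ s_vecs

# Principal coordinates: eigenvectors of Q scaled by root eigenvalues.
q_vals, q_vecs = np.linalg.eigh(Q)
q_order = np.argsort(q_vals)[::-1][:p]
q_vals, q_vecs = q_vals[q_order], q_vecs[:, q_order]
pcoords = q_vecs * np.sqrt(q_vals)
```

Comparing `np.abs(pc_scores)` with `np.abs(pcoords)` shows the same configuration of points from both routes.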

Biplots

Jointly represent the observations by the rows of $V_2$ and the variables by the coordinates $D_2^{1/2}A_2'$.

They are called biplots because a two-dimensional approximation to the data matrix is made.
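
A minimal sketch of the two sets of biplot markers, assuming they come from the singular value decomposition of the centred data matrix, written here as V D^{1/2} A' to match the symbols on the slide. The data and the variable names `row_markers` and `col_markers` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(10, 4))           # hypothetical data matrix
Xc = X - X.mean(axis=0)

# Singular value decomposition Xc = V D^{1/2} A'. NumPy returns the singular
# values (the diagonal of D^{1/2}) and A' directly.
V, sing, At = np.linalg.svd(Xc, full_matrices=False)

row_markers = V[:, :2]                          # observations: rows of V_2
col_markers = (np.diag(sing[:2]) @ At[:2]).T    # variables: columns of D_2^{1/2} A_2'

# The product of the markers is the rank-2 approximation of the data.
X_rank2 = row_markers @ col_markers.T
```

Plotting `row_markers` as points and `col_markers` as arrows on the same axes gives the joint display; the squared approximation error equals the sum of the squared singular values that were left out.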

Biplot

Non-metric Multidimensional Scaling

A common method

• Idea: if there is a monotone relation between x and y, there must be an exact linear relationship between the ranks of the two variables.

• Use ordered (monotone) regression, or assign ranks and regress ranks on ranks, iterating.
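
The idea in the first bullet can be illustrated on simulated data. This sketch only shows that a monotone relation becomes exactly linear on the ranks; it is not a full non-metric scaling algorithm. The helper `ranks` is hypothetical and assumes no ties, and the exponential relation is an arbitrary example.

```python
import numpy as np

def ranks(v):
    """Ranks 1..n of the entries of v (assumes no ties)."""
    order = np.argsort(v)
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(1, len(v) + 1)
    return r

rng = np.random.default_rng(5)
x = rng.uniform(0.0, 3.0, size=20)
y = np.exp(x)                          # monotone but non-linear in x

rx, ry = ranks(x), ranks(y)
slope, intercept = np.polyfit(rx, ry, 1)   # least-squares line on the ranks
corr_raw = np.corrcoef(x, y)[0, 1]
corr_rank = np.corrcoef(rx, ry)[0, 1]
```

The regression of ranks on ranks has slope one and intercept zero, whereas the correlation between the raw values stays below one because the relation is not linear.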