MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes...

37
MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements xi, and embed the elements in N dimensional space such that the inter distances Dij are preserved as much as possible by ||xi−xj|| in the embedded space. 1

Transcript of MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes...

Page 1: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

MDS EmbeddingMDS takes as input a distance matrix D, containing allN × N pair of distances between elements xi, and embedthe elements in N dimensional space such that the inter distances Dij are preserved as much as possible by ||xi−xj|| in the embedded space.

1

Page 2: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

2

Page 3: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

3

Page 4: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

4

Page 5: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

128 dim space visualized by t-SNE

Joint Embeddings of Shapes and Images

Page 6: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements
Page 7: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Image based Shape Retrieval

Page 8: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Shape based Image Retrieval

Page 9: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Cross-View Image Retrieval

Page 10: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

MDS Embedding

11

Page 11: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Common MDS do not handle outliers

input SMACOF Sammon

12

Page 12: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Two outlier distances lead to significant distortion in the embedding

In many real-world scenarios, input distances may be noisy or contain outliers, due to malicious acts, system faults, or erroneous measures.

13

Page 13: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Two outlier distances lead to significant distortion in the embedding

In many real-world scenarios, input distances may be noisy or contain outliers, due to malicious acts, system faults, or erroneous measures.

14

Page 14: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Least square fitting

15

Page 15: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Least square fitting

16

Page 16: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

RANSAC- Generate Lines using Pairs of Points.- Count number of points within ε of line.- Pick the best line.

17

Page 17: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

RANSACSadly can’t be applied to MDS – a lot of data is needed for generating an embedding. Almost every sample will still have outliers.

18

Page 18: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Forero and Giannakis method

Lasso regression parameter (when bigger there are less outliers)

The non-zero entries represent the outlier pairs

● Tuning the regularization parameter is not a simple task.● There are NxN unknowns instead of just dxN, thus it is significantly harder to solve

accuratly and thus very sensitive to the initial guess.

19

Page 19: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Different λ applied to the same dataset withthe same initial guess, leads to different embedding qualities.

20

Page 20: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Same λ applied to the same datasets with different initial guesses, yields different embedding qualities.

21

Page 21: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

This graph presents the number of non-zero elements in O (which represent outliers) as a function of λ.

The three plots were generated using different initial guesses that were uniformly sampled.

FG12 method is overly sensitive to the initial guess.

22

Page 22: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Embed and remove pairs which are overly stressed…

Sadly, the overly stressed edges are not necessarily outliers.(for example long edge that became a short one can cause a lot of short edges to deform in the embedding).

Also other stress weighting has their shortcomings – we tested that method for a while.

23

Page 23: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Geometric Reasoning

An outlier distance tends to break many triangles.

We detect those outliers and filter them.

24

Page 24: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Broken Triangles

d3

d1

d2

For triangle with edge length

If

d3

d1

d2

then the triangle is broken

25

Page 25: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Broken Triangles

An edge in a broken triangle is not necessarily an outlier

26

Not every outlier edge necessarily breaks a triangle

Page 26: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Histogram of Broken Triangles

27

Page 27: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Histogram of Broken Triangles

We set ф to be the smallest value that satisfies the following two requirements:

28

Page 28: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Shepard Diagram

Each point represents a distance. The X-axis represents the input distances and the Y-axis represents the distance in the embedding result.

29

Page 29: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

The Red dots are the distances classified as outliers. Some of the are on diagonal – those are the false positives.

30

Page 30: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Precision and Recall

31

Page 31: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Threshold Performance�The outlier detection rate as a function ofthe shrinkage enlargement of the outliers relative to theground-truth value. Edges that are strongly deformed (either squeezed or enlarged) are likely to be detected.

Note: the X-axis is logarithmic: log2(Dout /DGT ).

32

Page 32: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Qualitative ComparisonA comparison between SMACOF and our method as a function of outlier rate. Up to 22% our method has better performance.

ij

jiij

jiij D

XXSSScore

||||log,

−==∑

33

Page 33: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

The embedding of a ’PLUS’ shaped dataset with 10% outliers, and a ’SPIRAL’ shaped dataset with 15% outliers.

(a,c) SMACOF (b,d) Our technique.

35

Page 34: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

128 US CitiesTwo-dimensional embedding of SGB128 distances with 10% outliers. �The green dots are the ground-truth locations and the magenta dots represent the embedded points.(a) SMACOF (b) Our Filtering technique.

36

Page 35: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Protein DatasetAverage cluster index value of 10 executions.�The embedding dimension is set to 6, since for lower dimensionsSMACOF fails due to co-located points.

37

Page 36: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Outlier Detection for Robust Multi-dimensional Scaling

Thank You

40

Page 37: MDS Embedding - TAUdcor/Graphics/cg-slides/sgp-RMDS.pdf · 2017-06-19 · MDS Embedding MDS takes as input a distance matrix D, containing all N × N pair of distances between elements

Outlier Detection for Robust Multi-dimensional Scaling

Leonid Blouvshtein and Daniel Cohen-Or

41