Intermediate R - Multidimensional Scaling

Multidimensional Scaling

Violeta I. Bartolome
Senior Associate Scientist, PBGB-CRIL

[email protected]

Multidimensional scaling

• Similar to PCA, but takes a dissimilarity matrix as input.

• Provides a visual representation of the pattern of proximities among a set of objects in a lower-dimensional space.

• Typically displayed on a 2-D plot.
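
The link to PCA can be checked directly. The sketch below is not from the slides: it assumes the built-in USArrests data as a stand-in and compares classical MDS on Euclidean distances (cmdscale) with the first two principal component scores (prcomp). In this special case the two sets of coordinates should agree up to the sign of each axis.

```r
# Stand-in data (an assumption; the slides' own data set is not in this transcript)
dat <- scale(USArrests)

# PCA scores on the first two components
pca_scores <- prcomp(dat)$x[, 1:2]

# Classical MDS on Euclidean distances between the same rows
mds_points <- cmdscale(dist(dat), k = 2)

# Axis signs are arbitrary, so compare absolute values
same_config <- isTRUE(all.equal(abs(unname(pca_scores)),
                                abs(unname(mds_points)),
                                tolerance = 1e-6))
print(same_config)
```

With other dissimilarity measures the two methods generally give different configurations, which is where MDS becomes useful in its own right.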

Example

• The “points” that are represented in multidimensional space can be anything.

• These objects may be genotypes, in which case MDS can identify clusters of genotypes that are “close” versus “distant” in terms of, say, morphological characters.

Multidimensional Scaling Procedures

• As long as the “distance” between the objects can be assessed in some fashion, MDS can be used to find the lowest-dimensional space that still adequately captures the distances between objects.

• Once the number of dimensions is identified, a further challenge is identifying the meaning of those dimensions.

• The basic data representation in MDS is a dissimilarity matrix that shows the distance between every possible pair of objects.

• The goal of MDS is to represent these distances in a space of the lowest possible dimension.

Two types of MDS

• Classical or metric

• Non-metric

The difference between the two is the stress function that is being minimized in the scaling procedure. Stress is a statistic for measuring goodness of fit.
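
The slides do not say which stress formula is meant. One widely used definition, shown here as an assumption, is Kruskal's stress-1, where $d_{ij}$ is the distance between objects $i$ and $j$ in the low-dimensional configuration and $\hat{d}_{ij}$ is the corresponding fitted value (disparity) derived from the observed dissimilarities:

```latex
S = \sqrt{\frac{\sum_{i<j} \left(d_{ij} - \hat{d}_{ij}\right)^{2}}{\sum_{i<j} d_{ij}^{2}}}
```

Under this definition, S = 0 means the configuration reproduces the fitted values exactly, and larger S means a worse fit.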

MDS in R

Sample Data (used in PCA)

Read data to R

Distance matrix

Computes the Euclidean distances between points.
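
The code on this slide did not survive in the transcript. A minimal sketch of the two steps, assuming base R's dist() is the function meant and using a few rows of the built-in USArrests data as a stand-in for the sample data (the file name in the comment is hypothetical):

```r
# To read your own file instead (hypothetical file name):
# dat <- read.csv("mydata.csv", row.names = 1)

# Stand-in data: first six rows of USArrests, standardized so that
# no single variable dominates the distances
dat <- scale(USArrests[1:6, ])

# Pairwise Euclidean distances between rows (method = "euclidean" is the default)
d <- dist(dat)
print(round(d, 2))
```

The result is a "dist" object holding the lower triangle of the distance matrix; as.matrix(d) expands it to a full square matrix if needed.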

Classical (metric) MDS

k is the number of dimensions.

cmdscale does not give the stress statistic.
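
A minimal sketch of the classical MDS call, again assuming USArrests as stand-in data rather than the slides' own data set. Although cmdscale() does not report stress, calling it with eig = TRUE returns the eigenvalues and a GOF component that can serve as a rough indication of how much of the distance structure the k dimensions retain.

```r
# Stand-in data (an assumption), standardized
dat <- scale(USArrests)
d <- dist(dat)

# Classical MDS in k = 2 dimensions; eig = TRUE returns a list
# instead of a bare coordinate matrix
fit <- cmdscale(d, k = 2, eig = TRUE)

print(head(fit$points))  # coordinates of the first few objects
print(fit$GOF)           # two goodness-of-fit values based on the eigenvalues
```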

Plot solution
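
The plot itself is missing from the transcript. One way to draw a labelled 2-D configuration in base graphics, assuming the same USArrests stand-in data (axis labels and title are illustrative choices):

```r
# Stand-in data (an assumption), standardized
dat <- scale(USArrests)
fit <- cmdscale(dist(dat), k = 2)   # matrix with one row per object

x <- fit[, 1]
y <- fit[, 2]

# Empty plot first, then object names at their MDS coordinates;
# asp = 1 keeps distances comparable on both axes
plot(x, y, type = "n", asp = 1,
     xlab = "Dimension 1", ylab = "Dimension 2",
     main = "Classical MDS")
text(x, y, labels = rownames(fit), cex = 0.7)
```

Objects plotted close together have small dissimilarities; the orientation of the axes is arbitrary.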

Non-metric MDS

• Options for distance matrix: manhattan, euclidean, canberra, bray, kulczynski, jaccard, gower, altGower, morisita, horn, mountford, raup, binomial, chao

Default dimension (k) is 2.

Large stress values indicate poor fit.
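
The distance options listed above match those accepted by metaMDS() in the vegan package, so this sketch assumes that is the function used on the slide. It also assumes vegan is installed, and it uses vegan's bundled dune data set as a stand-in because the slides' data are not in the transcript.

```r
library(vegan)   # assumed to be installed

data(dune)       # stand-in community data shipped with vegan
set.seed(1)      # metaMDS uses random starts, so fix the seed

# Non-metric MDS with Bray-Curtis dissimilarities in k = 2 dimensions
nm <- metaMDS(dune, distance = "bray", k = 2, trace = FALSE)

print(nm$stress)         # stress of the best solution found
print(head(nm$points))   # coordinates of the first few objects
```

Changing the distance argument to any of the listed options changes how dissimilarities are computed before scaling.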

The non-metric fit is based on stress values.

The linear fit is the squared correlation between fitted values and ordination distances.
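
These two statistics appear on the diagram drawn by vegan's stressplot(). The sketch below assumes vegan is installed and uses its bundled dune data as a stand-in. It draws the diagram and recomputes the non-metric fit from the stress S as 1 - S^2, which is how vegan documents that statistic.

```r
library(vegan)   # assumed to be installed

data(dune)       # stand-in data shipped with vegan
set.seed(1)
nm <- metaMDS(dune, distance = "bray", k = 2, trace = FALSE)

# Shepard diagram: observed dissimilarities against ordination distances,
# annotated with the non-metric and linear fit statistics
stressplot(nm)

# Non-metric fit recomputed from the stress value
nonmetric_r2 <- 1 - nm$stress^2
print(nonmetric_r2)
```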

Thank you!