Multivariate Statistics: Individual Differences Scaling (INDSCAL)
Steffen Unkel
Department of Medical Statistics, University Medical Center Goettingen, Germany
Summer term 2017
Proximity matrices
Apart from the (raw or preprocessed) data matrix Y ∈ R^{n×p}, another frequently encountered type of data is the proximity matrix.
Proximity data arise either
1. directly, from experiments in which subjects are asked to assess the similarity of pairs of stimuli, or
2. indirectly, as a measure of closeness of a pair of stimuli derived from their raw profile data.
We are interested in uncovering any structure or pattern that may be present in the observed proximity matrix.
Multidimensional scaling (MDS)
MDS seeks a configuration X ∈ R^{n×R} in low-dimensional space such that the distances between points in the space match the given dissimilarities D ∈ R^{n×n} as closely as possible.
This involves determining both the value of R that provides a satisfactory fit, and the positions of the points in the resulting R-dimensional space.
Fit will be judged by some numerical index of the correspondence between the observed proximities and the inter-point distances.
Algorithm for classical MDS
1. Compute the matrix of squared dissimilarities, D ∘ D, where ∘ denotes the elementwise (Hadamard) matrix product.
2. Apply double centring to this matrix to obtain the matrix of scalar products B:

   B = -(1/2) J_n (D ∘ D) J_n ,

   where J_n = I_n - n^{-1} 1 1^T is the n × n centring matrix.
Algorithm for classical scaling
3. Compute the eigendecomposition of B:

   B = U Ω U^T ,

   where Ω = diag(ω_1, . . . , ω_n) is the diagonal matrix of eigenvalues and U is the corresponding n × n matrix of normalized eigenvectors.
4. Let Ω_+ be the diagonal matrix of the first R eigenvalues (> 0) and U_+ the matrix of the first R columns of U. Then the coordinate matrix of classical scaling is given by X = U_+ Ω_+^{1/2}, where Ω_+^{1/2} = diag(ω_1^{1/2}, . . . , ω_R^{1/2}).
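To make these four steps concrete, here is a minimal R sketch of classical scaling written directly from the formulas above (the function name classical_mds and the default R = 2 are illustrative, not part of the original slides):

```r
# Classical (Torgerson) scaling of a symmetric n x n dissimilarity matrix D
classical_mds <- function(D, R = 2) {
  n <- nrow(D)
  J <- diag(n) - matrix(1 / n, n, n)     # centring matrix J_n = I_n - n^{-1} 1 1^T
  B <- -0.5 * J %*% (D * D) %*% J        # double-centred matrix of scalar products
  eig <- eigen(B, symmetric = TRUE)      # eigendecomposition B = U Omega U^T
  keep <- seq_len(R)
  # coordinates X = U_+ Omega_+^{1/2} from the first R (positive) eigenvalues
  eig$vectors[, keep] %*% diag(sqrt(pmax(eig$values[keep], 0)), R)
}
```

Up to reflections of the axes, this agrees with what R's built-in cmdscale() returns.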
MDS example
Table: Dissimilarity data for all pairs of ten colas for one subject.
Cola number   1   2   3   4   5   6   7   8   9  10
     1        0
     2       16   0
     3       81  47   0
     4       56  32  71   0
     5       87  68  44  71   0
     6       60  35  21  98  34   0
     7       84  94  98  57  99  99   0
     8       50  87  79  73  19  92  45   0
     9       99  25  53  98  52  17  99  84   0
    10       16  92  90  83  79  44  24  18  98   0
MDS example
Figure: Two-dimensional classical MDS solution (Dimension 1 versus Dimension 2, points labelled 1 to 10) for the dissimilarity data for all pairs of ten colas of one subject.
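A configuration like the one in the figure can be obtained with R's built-in cmdscale(), which performs classical scaling; the following sketch enters the table above by hand and plots the two-dimensional solution:

```r
# Lower-triangle entries from the table, read row by row (cola 2 vs 1, cola 3 vs 1-2, ...)
vals <- c(16,
          81, 47,
          56, 32, 71,
          87, 68, 44, 71,
          60, 35, 21, 98, 34,
          84, 94, 98, 57, 99, 99,
          50, 87, 79, 73, 19, 92, 45,
          99, 25, 53, 98, 52, 17, 99, 84,
          16, 92, 90, 83, 79, 44, 24, 18, 98)
cola <- matrix(0, 10, 10)
cola[upper.tri(cola)] <- vals    # column-major filling of the upper triangle matches this order
cola <- cola + t(cola)           # symmetric dissimilarity matrix with zero diagonal

cfg <- cmdscale(as.dist(cola), k = 2)                  # classical MDS in two dimensions
plot(cfg, type = "n", xlab = "Dimension 1", ylab = "Dimension 2", asp = 1)
text(cfg, labels = 1:10)                               # points labelled by cola number
```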
Types of data sets
Figure: Data matrices Y_k (k = 1, . . . , K), with common variables; each Y_k is an n × p matrix, and the jth variable across the K matrices forms an n × K plane.
Figure: Configuration matrices X_k (k = 1, . . . , K) for K data sets.
Figure: Distance matrices D_k (k = 1, . . . , K); d_ijk denotes the distance between points i and j for the kth set.
Genesis of Individual Differences Scaling (INDSCAL)
Individual Differences Scaling (INDSCAL)
INDSCAL is an extension of the classical MDS setting in terms of separate dissimilarity matrices D_1, . . . , D_K.
It was originally developed to explain differences between subjects in their cognition of a common set of stimuli.
Suppose that there are K subjects and n stimuli, leading to K symmetric n × n matrices D_k of dissimilarities.
The aim is to represent the K matrices D_k in terms of a group average matrix G ∈ R^{n×R} and diagonal weight matrices (saliences) W_k ∈ R^{R×R} (k = 1, . . . , K).
Illustration of INDSCAL
INDSCAL - Carroll and Chang (1970)
Put the distances or dissimilarities into inner-product form by the double-centring operation:

   B_k = -(1/2) J_n (D_k ∘ D_k) J_n .

The INDSCAL model approximates each centred matrix B_k by B_k ≈ G W_k^2 G^T.
Specifically, the INDSCAL problem seeks (G, W_1^2, . . . , W_K^2) such that the model fits the data in a least-squares sense.
Optimization problem:

   min Σ_{k=1}^{K} || B_k - G W_k^2 G^T ||^2 .
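As a concrete reading of the model, the following R sketch double-centres a list of dissimilarity matrices and evaluates the INDSCAL least-squares loss for a candidate group space G and squared saliences; the function names are illustrative only:

```r
# Double centring: B_k = -1/2 * J_n (D_k o D_k) J_n
double_centre <- function(D) {
  n <- nrow(D)
  J <- diag(n) - matrix(1 / n, n, n)
  -0.5 * J %*% (D * D) %*% J
}

# INDSCAL loss: sum_k || B_k - G W_k^2 G^T ||^2, with w2[k, ] the squared saliences of subject k
indscal_loss <- function(Blist, G, w2) {
  sum(sapply(seq_along(Blist), function(k) {
    fit <- G %*% diag(w2[k, ], nrow = ncol(G)) %*% t(G)
    sum((Blist[[k]] - fit)^2)
  }))
}
```

Here Blist would be lapply(Dlist, double_centre) for a list Dlist of the K dissimilarity matrices.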
Algorithm
The standard numerical solution is given by an alternating least squares (ALS) algorithm, called CANDECOMP.
The two appearances of G in the loss function may be represented by different matrices, say G and H.
Optimization is carried out on G and H independently, along with W_1^2, . . . , W_K^2.
The belief is that after convergence of CANDECOMP we have G = H, as required by the model; this is known as the "symmetry requirement".
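A minimal sketch of such an ALS scheme is given below: it updates H, G and the squared saliences in turn by solving the corresponding linear least-squares problems. It is meant only to illustrate the structure of a CANDECOMP-style algorithm as described above (names, the fixed iteration count and the absence of convergence checks are simplifications), not to reproduce the original implementation.

```r
# ALS for B_k ~ G diag(w2[k, ]) H^T
# Blist: list of K double-centred n x n matrices; returns G, H (n x R) and w2 (K x R)
candecomp_als <- function(Blist, R, iters = 50) {
  n <- nrow(Blist[[1]]); K <- length(Blist)
  set.seed(1)
  G  <- matrix(rnorm(n * R), n, R)
  H  <- G
  w2 <- matrix(1, K, R)
  for (it in seq_len(iters)) {
    # Update H: H = (sum_k B_k^T G D_k) [(G^T G) * (w2^T w2)]^{-1}, D_k = diag(w2[k, ])
    num <- Reduce(`+`, lapply(seq_len(K), function(k) t(Blist[[k]]) %*% G %*% diag(w2[k, ], R)))
    H   <- num %*% solve(crossprod(G) * crossprod(w2))
    # Update G: G = (sum_k B_k H D_k) [(H^T H) * (w2^T w2)]^{-1}
    num <- Reduce(`+`, lapply(seq_len(K), function(k) Blist[[k]] %*% H %*% diag(w2[k, ], R)))
    G   <- num %*% solve(crossprod(H) * crossprod(w2))
    # Update squared saliences: w2[k, ] solves [(H^T H) * (G^T G)] w = diag(G^T B_k H)
    A <- crossprod(H) * crossprod(G)
    for (k in seq_len(K)) w2[k, ] <- solve(A, diag(t(G) %*% Blist[[k]] %*% H))
  }
  list(G = G, H = H, w2 = w2)   # after convergence one hopes G = H (symmetry requirement)
}
```

In practice one would add a convergence criterion on the loss and check whether G and H agree after convergence.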
Algorithm (2)
It has been shown that the CANDECOMP algorithm can produce asymmetric INDSCAL solutions (G ≠ H) even for positive semi-definite data B_k.
Also, one must hope that the solutions obtained for W_1^2, . . . , W_K^2 have non-negative diagonal elements throughout.
One can avoid negative saliences by imposing non-negativity constraints and preserve the symmetry of the solution by using an algorithm named SYMPRES.
Constraints
Scale indeterminacy: G W_k = (G L)(L^{-1} W_k) for any appropriate diagonal matrix L.
Assign one of the following identification constraints:

   Σ_{k=1}^{K} W_k^2 = K I_R   or   diag(G^T G) = I_R .

Orthonormal INDSCAL constrains the group stimulus space to be column-wise orthonormal:

   G^T G = I_R .
The column-wise orthonormal matrices G form a compact set, so the optimization problem stated above has an attainable solution.
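To illustrate the scale indeterminacy, the following R sketch rescales a fitted solution so that diag(G^T G) = I_R, absorbing the column scales of G into the saliences (the objects G and Wlist are assumed to come from a previous fit; names are illustrative):

```r
# Given G (n x R) and a list Wlist of diagonal salience matrices W_k (R x R):
col_norms <- sqrt(colSums(G^2))                 # L = diag(column norms of G)
G_std     <- sweep(G, 2, col_norms, "/")        # G L^{-1}: unit-length columns, diag(G^T G) = I_R
Wlist_std <- lapply(Wlist, function(W) diag(col_norms, length(col_norms)) %*% W)   # L W_k
# G_std %*% Wlist_std[[k]] equals G %*% Wlist[[k]] for every k, so the fit is unchanged
```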
Direct analysis of the dissimilarity data
The reasons for using the inner products B_k rather than the more fundamental data D_k were probably because
1. B_k has better algebraic properties than D_k, and
2. B_k provides a firm link with classical scaling.
However, direct analysis of D_k (k = 1, . . . , K) has become possible.
The rows of G W_k give the coordinates that generate the matrix ∆_k of dissimilarities for the kth individual (k = 1, . . . , K), which we shall denote by G W_k → ∆_k.
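In code, the matrix generated by G W_k is simply the matrix of Euclidean distances among the rows of G W_k (a one-line illustration, assuming objects G and W_k exist):

```r
Delta_k <- as.matrix(dist(G %*% W_k))   # n x n matrix of distances generated by the rows of G W_k
```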
Direct analysis of the dissimilarity data (2)
The distance version of the model G W_k seeks a solution to the following optimization problem:

   min Σ_{k=1}^{K} || D_k - ∆_k ||^2 .
The routine SMACOF for individual differences solves this optimization problem by using a majorization approach to find a group space G and dimension weights W_k associated with K dissimilarity matrices D_k (k = 1, . . . , K).
This routine is available in the function smacofIndDiff() in the R package smacof.
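A call might look as follows, assuming Dlist is a list of K symmetric dissimilarity matrices (a sketch; argument defaults should be checked against the package documentation):

```r
library(smacof)
fit <- smacofIndDiff(Dlist, ndim = 2, constraint = "indscal")   # group space plus dimension weights
```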
IDIOSCAL
INDSCAL assumes that the W_k (k = 1, . . . , K) are diagonal matrices.
If one allows the W_k to be positive semi-definite symmetric matrices, a more general model arises that allows for person-specific (idiosyncratic) rotations of the common space.
This is known as the IDIOSCAL (Individual Differences in Orientation Scaling) model.
The consequence of allowing for a rotation of the common space before stretching or compression is that the point grid in the earlier "Illustration of INDSCAL" figure will, in general, be sheared.
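One standard reading of IDIOSCAL is B_k ≈ G C_k G^T with C_k = T_k W_k^2 T_k^T, i.e. an idiosyncratic rotation T_k of the common space followed by stretching. A small R sketch of a subject-specific configuration, with an assumed two-dimensional group space G and illustrative rotation angle and weights:

```r
# Subject-specific configuration X_k = G T_k W_k for an assumed 2D group space G
theta <- pi / 6                                      # illustrative rotation angle
T_k <- matrix(c(cos(theta), sin(theta),
                -sin(theta), cos(theta)), 2, 2)      # idiosyncratic rotation of the common space
W_k <- diag(c(1.5, 0.5))                             # stretching/compression along the rotated axes
X_k <- G %*% T_k %*% W_k                             # a sheared version of the group configuration
# X_k %*% t(X_k) = G %*% (T_k %*% W_k^2 %*% t(T_k)) %*% t(G), i.e. B_k is approximated by G C_k G^T
```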
Description of the data
The dataset contains dissimilarity data arising from an experiment carried out by Helm (1959).
There were 14 subjects who rated the similarity of ten colours, 2 of whom replicated the experiment.
Ten subjects have normal colour vision (labelled N1 to N10); four are red-green deficient to varying degrees (labelled CD1 to CD4).
Helm's colour data are contained in the list helm in the R package smacof; the list consists of the dissimilarity matrices for each of the subjects, including the replications.
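A hedged sketch of how such an analysis might be run with the smacof package; the component names gspace and cweights are what the package documentation describes for individual-differences fits and should be checked against the installed version:

```r
library(smacof)
fit <- smacofIndDiff(helm, ndim = 2, constraint = "indscal")   # INDSCAL fit to Helm's colour data
fit$gspace     # group stimulus space: coordinates of the ten colours in two dimensions
fit$cweights   # per-subject dimension weights (saliences)
plot(fit)      # configuration plot of the group space
```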
Concluding remarks
Methods of two-way MDS are applicable to a single matrix of proximities.
In most studies involving multiple proximity matrices, it is precisely the individual differences that are likely to be of most interest.
INDSCAL and IDIOSCAL are designed specifically for scaling three-way data and operate by deriving an overall group stimulus space together with individual spaces for each subject.
The individual spaces are allowed to differ from the group space by a specified class of transformations.
A drawback of the models for scaling three-way data is that it is not possible to test them by statistical methods.
References
Carroll, J. D. and Chang, J. J. (1970): Analysis of individual differences in multidimensional scaling via an n-way generalization of 'Eckart-Young' decomposition, Psychometrika, 35, pp. 283-319.
ten Berge, J. M. F., Kiers, H. A. L. and Krijnen, W. P. (1993): Computational solutions for the problem of negative saliences and nonsymmetry in INDSCAL, Journal of Classification, 10, pp. 115-124.
De Leeuw, J. and Mair, P. (2009): Multidimensional Scaling Using Majorization: SMACOF in R, Journal of Statistical Software, 31(3), pp. 1-30. An updated version of this article is available at https://cran.r-project.org/web/packages/smacof/vignettes/smacof.pdf
Borg, I. and Groenen, P. J. F. (2005): Modern Multidimensional Scaling: Theory and Applications, 2nd edition, Springer.
Borg, I., Groenen, P. J. F. and Mair, P. (2013): Applied Multidimensional Scaling, Springer.