Multivariate Statistics: Individual Differences Scaling (INDSCAL)
Steffen Unkel
Department of Medical Statistics, University Medical Center Goettingen, Germany
Summer term 2017
Proximity matrices
Apart from the (raw or preprocessed) data matrix Y ∈ R^{n×p}, another frequently encountered type of data is the proximity matrix.
Proximity data arise either
1. directly, from experiments in which subjects are asked to assess the similarity of pairs of stimuli, or
2. indirectly, as a measure of closeness of a pair of stimuli derived from their raw profile data.
We are interested in uncovering any structure or pattern that may be present in the observed proximity matrix.
Multidimensional scaling (MDS)
MDS seeks a configuration X ∈ R^{n×R} in low-dimensional space such that the distances between points in the space match the given dissimilarities D ∈ R^{n×n} as closely as possible.
This involves determining both the value of R that provides a satisfactory fit, and the positions of the points in the resulting R-dimensional space.
Fit will be judged by some numerical index of the correspondence between the observed proximities and the inter-point distances.
Algorithm for classical MDS
1. Compute the matrix of squared dissimilarities, D ∘ D, where ∘ denotes the elementwise (Hadamard) matrix product.
2. Apply double centring to this matrix to obtain the matrix of scalar products B:

   B = -(1/2) J_n (D ∘ D) J_n ,

   where J_n = I_n - n^{-1} 1 1^T is the n × n centring matrix.
Algorithm for classical scaling
3. Compute the eigendecomposition of B:

   B = U Ω U^T ,

   where Ω = diag(ω_1, . . . , ω_n) is the diagonal matrix of eigenvalues and U is the corresponding n × n matrix of normalized eigenvectors.
4. Let Ω_+ be the diagonal matrix of the first R eigenvalues (> 0) and U_+ the matrix of the first R columns of U. Then the coordinate matrix of classical scaling is given by X = U_+ Ω_+^{1/2}, where Ω_+^{1/2} = diag(ω_1^{1/2}, . . . , ω_R^{1/2}).
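To make these four steps concrete, here is a minimal R sketch of classical scaling written directly from the formulas above (the function name classical_mds and the default R = 2 are illustrative, not part of the original slides):

```r
# Classical (Torgerson) scaling of a symmetric n x n dissimilarity matrix D
classical_mds <- function(D, R = 2) {
  n <- nrow(D)
  J <- diag(n) - matrix(1 / n, n, n)     # centring matrix J_n = I_n - n^{-1} 1 1^T
  B <- -0.5 * J %*% (D * D) %*% J        # double-centred matrix of scalar products
  eig <- eigen(B, symmetric = TRUE)      # eigendecomposition B = U Omega U^T
  keep <- seq_len(R)
  # coordinates X = U_+ Omega_+^{1/2} from the first R (positive) eigenvalues
  eig$vectors[, keep] %*% diag(sqrt(pmax(eig$values[keep], 0)), R)
}
```

Up to reflections of the axes, this agrees with what R's built-in cmdscale() returns.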
MDS example
Table: Dissimilarity data for all pairs of ten colas for one subject.
Cola number   1   2   3   4   5   6   7   8   9  10
     1        0
     2       16   0
     3       81  47   0
     4       56  32  71   0
     5       87  68  44  71   0
     6       60  35  21  98  34   0
     7       84  94  98  57  99  99   0
     8       50  87  79  73  19  92  45   0
     9       99  25  53  98  52  17  99  84   0
    10       16  92  90  83  79  44  24  18  98   0
MDS example
Figure: Two-dimensional classical MDS solution (Dimension 1 versus Dimension 2, points labelled 1 to 10) for the dissimilarity data for all pairs of ten colas of one subject.
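A configuration like the one in the figure can be obtained with R's built-in cmdscale(), which performs classical scaling; the following sketch enters the table above by hand and plots the two-dimensional solution:

```r
# Lower-triangle entries from the table, read row by row (cola 2 vs 1, cola 3 vs 1-2, ...)
vals <- c(16,
          81, 47,
          56, 32, 71,
          87, 68, 44, 71,
          60, 35, 21, 98, 34,
          84, 94, 98, 57, 99, 99,
          50, 87, 79, 73, 19, 92, 45,
          99, 25, 53, 98, 52, 17, 99, 84,
          16, 92, 90, 83, 79, 44, 24, 18, 98)
cola <- matrix(0, 10, 10)
cola[upper.tri(cola)] <- vals    # column-major filling of the upper triangle matches this order
cola <- cola + t(cola)           # symmetric dissimilarity matrix with zero diagonal

cfg <- cmdscale(as.dist(cola), k = 2)                  # classical MDS in two dimensions
plot(cfg, type = "n", xlab = "Dimension 1", ylab = "Dimension 2", asp = 1)
text(cfg, labels = 1:10)                               # points labelled by cola number
```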
Types of data sets
Figure: Data matrices Y_k (k = 1, . . . , K), with common variables; each Y_k is an n × p matrix, and the jth variable across the K matrices forms an n × K plane.
Figure: Configuration matrices X_k (k = 1, . . . , K) for K data sets.
Figure: Distance matrices D_k (k = 1, . . . , K); d_ijk denotes the distance between points i and j for the kth set.
Genesis of Individual Differences Scaling (INDSCAL)
Individual Differences Scaling (INDSCAL)
INDSCAL is an extension of the classical MDS setting in terms of separate dissimilarity matrices D_1, . . . , D_K.
It was originally developed to explain differences between subjects in their cognition of a common set of stimuli.
Suppose that there are K subjects and n stimuli, leading to K symmetric n × n matrices D_k of dissimilarities.
The aim is to represent the K matrices D_k in terms of a group average matrix G ∈ R^{n×R} and diagonal weight matrices (saliences) W_k ∈ R^{R×R} (k = 1, . . . , K).
Illustration of INDSCAL
INDSCAL - Carroll and Chang (1970)
Put the distances or dissimilarities into inner-product form by the double-centring operation:

   B_k = -(1/2) J_n (D_k ∘ D_k) J_n .

The INDSCAL model approximates each centred matrix B_k by B_k ≈ G W_k^2 G^T.
Specifically, the INDSCAL problem seeks (G, W_1^2, . . . , W_K^2) such that the model fits the data in a least-squares sense.
Optimization problem:

   min Σ_{k=1}^{K} || B_k - G W_k^2 G^T ||^2 .
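As a concrete reading of the model, the following R sketch double-centres a list of dissimilarity matrices and evaluates the INDSCAL least-squares loss for a candidate group space G and squared saliences; the function names are illustrative only:

```r
# Double centring: B_k = -1/2 * J_n (D_k o D_k) J_n
double_centre <- function(D) {
  n <- nrow(D)
  J <- diag(n) - matrix(1 / n, n, n)
  -0.5 * J %*% (D * D) %*% J
}

# INDSCAL loss: sum_k || B_k - G W_k^2 G^T ||^2, with w2[k, ] the squared saliences of subject k
indscal_loss <- function(Blist, G, w2) {
  sum(sapply(seq_along(Blist), function(k) {
    fit <- G %*% diag(w2[k, ], nrow = ncol(G)) %*% t(G)
    sum((Blist[[k]] - fit)^2)
  }))
}
```

Here Blist would be lapply(Dlist, double_centre) for a list Dlist of the K dissimilarity matrices.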
Algorithm
The standard numerical solution is given by an alternating least squares (ALS) algorithm, called CANDECOMP.
The two appearances of G in the loss function may be represented by different matrices, say G and H.
Optimization is carried out on G and H independently, along with W_1^2, . . . , W_K^2.
The belief is that after convergence of CANDECOMP we have G = H, as required by the model; this is known as the "symmetry requirement".
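A minimal sketch of such an ALS scheme is given below: it updates H, G and the squared saliences in turn by solving the corresponding linear least-squares problems. It is meant only to illustrate the structure of a CANDECOMP-style algorithm as described above (names, the fixed iteration count and the absence of convergence checks are simplifications), not to reproduce the original implementation.

```r
# ALS for B_k ~ G diag(w2[k, ]) H^T
# Blist: list of K double-centred n x n matrices; returns G, H (n x R) and w2 (K x R)
candecomp_als <- function(Blist, R, iters = 50) {
  n <- nrow(Blist[[1]]); K <- length(Blist)
  set.seed(1)
  G  <- matrix(rnorm(n * R), n, R)
  H  <- G
  w2 <- matrix(1, K, R)
  for (it in seq_len(iters)) {
    # Update H: H = (sum_k B_k^T G D_k) [(G^T G) * (w2^T w2)]^{-1}, D_k = diag(w2[k, ])
    num <- Reduce(`+`, lapply(seq_len(K), function(k) t(Blist[[k]]) %*% G %*% diag(w2[k, ], R)))
    H   <- num %*% solve(crossprod(G) * crossprod(w2))
    # Update G: G = (sum_k B_k H D_k) [(H^T H) * (w2^T w2)]^{-1}
    num <- Reduce(`+`, lapply(seq_len(K), function(k) Blist[[k]] %*% H %*% diag(w2[k, ], R)))
    G   <- num %*% solve(crossprod(H) * crossprod(w2))
    # Update squared saliences: w2[k, ] solves [(H^T H) * (G^T G)] w = diag(G^T B_k H)
    A <- crossprod(H) * crossprod(G)
    for (k in seq_len(K)) w2[k, ] <- solve(A, diag(t(G) %*% Blist[[k]] %*% H))
  }
  list(G = G, H = H, w2 = w2)   # after convergence one hopes G = H (symmetry requirement)
}
```

In practice one would add a convergence criterion on the loss and check whether G and H agree after convergence.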
Algorithm (2)
It has been shown that the CANDECOMP algorithm can produce asymmetric INDSCAL solutions (G ≠ H) even for positive semi-definite data B_k.
Also, one must hope that the solutions obtained for W_1^2, . . . , W_K^2 have non-negative diagonal elements throughout.
One can avoid negative saliences by imposing non-negativity constraints and preserve the symmetry of the solution by using an algorithm named SYMPRES.
Constraints
Scale indeterminacy: G W_k = (G L)(L^{-1} W_k) for any appropriate diagonal matrix L.
Assign one of the following identification constraints:

   Σ_{k=1}^{K} W_k^2 = K I_R   or   diag(G^T G) = I_R .

Orthonormal INDSCAL constrains the group stimulus space to be column-wise orthonormal:

   G^T G = I_R .
The column-wise orthonormal matrices G form a compact set, so the optimization problem stated above has an attainable solution.
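To illustrate the scale indeterminacy, the following R sketch rescales a fitted solution so that diag(G^T G) = I_R, absorbing the column scales of G into the saliences (the objects G and Wlist are assumed to come from a previous fit; names are illustrative):

```r
# Given G (n x R) and a list Wlist of diagonal salience matrices W_k (R x R):
col_norms <- sqrt(colSums(G^2))                 # L = diag(column norms of G)
G_std     <- sweep(G, 2, col_norms, "/")        # G L^{-1}: unit-length columns, diag(G^T G) = I_R
Wlist_std <- lapply(Wlist, function(W) diag(col_norms, length(col_norms)) %*% W)   # L W_k
# G_std %*% Wlist_std[[k]] equals G %*% Wlist[[k]] for every k, so the fit is unchanged
```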
Direct analysis of the dissimilarity data
The reasons for using the inner products B_k rather than the more fundamental data D_k were probably because
1. B_k has better algebraic properties than D_k, and
2. B_k provides a firm link with classical scaling.
However, direct analysis of D_k (k = 1, . . . , K) has become possible.
The rows of G W_k give the coordinates that generate the matrix ∆_k of dissimilarities for the kth individual (k = 1, . . . , K), which we shall denote by G W_k → ∆_k.
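In code, the matrix generated by G W_k is simply the matrix of Euclidean distances among the rows of G W_k (a one-line illustration, assuming objects G and W_k exist):

```r
Delta_k <- as.matrix(dist(G %*% W_k))   # n x n matrix of distances generated by the rows of G W_k
```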
Direct analysis of the dissimilarity data (2)
The distance version of the model G W_k seeks a solution to the following optimization problem:

   min Σ_{k=1}^{K} || D_k - ∆_k ||^2 .
The routine SMACOF for individual differences solves this optimization problem by using a majorization approach to find a group space G and dimension weights W_k associated with K dissimilarity matrices D_k (k = 1, . . . , K).
This routine is available in the function smacofIndDiff() in the R package smacof.
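A call might look as follows, assuming Dlist is a list of K symmetric dissimilarity matrices (a sketch; argument defaults should be checked against the package documentation):

```r
library(smacof)
fit <- smacofIndDiff(Dlist, ndim = 2, constraint = "indscal")   # group space plus dimension weights
```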
IDIOSCAL
INDSCAL assumes that the W_k (k = 1, . . . , K) are diagonal matrices.
If one allows the W_k to be positive semi-definite symmetric matrices, a more general model arises that allows for person-specific (idiosyncratic) rotations of the common space.
This is known as the IDIOSCAL (Individual Differences in Orientation Scaling) model.
The consequence of allowing for a rotation of the common space before stretching or compression is that the point grid in the earlier "Illustration of INDSCAL" figure will, in general, be sheared.
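One standard reading of IDIOSCAL is B_k ≈ G C_k G^T with C_k = T_k W_k^2 T_k^T, i.e. an idiosyncratic rotation T_k of the common space followed by stretching. A small R sketch of a subject-specific configuration, with an assumed two-dimensional group space G and illustrative rotation angle and weights:

```r
# Subject-specific configuration X_k = G T_k W_k for an assumed 2D group space G
theta <- pi / 6                                      # illustrative rotation angle
T_k <- matrix(c(cos(theta), sin(theta),
                -sin(theta), cos(theta)), 2, 2)      # idiosyncratic rotation of the common space
W_k <- diag(c(1.5, 0.5))                             # stretching/compression along the rotated axes
X_k <- G %*% T_k %*% W_k                             # a sheared version of the group configuration
# X_k %*% t(X_k) = G %*% (T_k %*% W_k^2 %*% t(T_k)) %*% t(G), i.e. B_k is approximated by G C_k G^T
```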
Description of the data
The dataset contains dissimilarity data arising from an experiment carried out by Helm (1959).
There were 14 subjects who rated the similarity of ten colours, 2 of whom replicated the experiment.
Ten subjects have normal colour vision (labelled N1 to N10); four are red-green deficient to varying degrees (labelled CD1 to CD4).
Helm's colour data are contained in the list helm in the R package smacof; the list consists of the dissimilarity matrices for each of the subjects, including the replications.
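A hedged sketch of how such an analysis might be run with the smacof package; the component names gspace and cweights are what the package documentation describes for individual-differences fits and should be checked against the installed version:

```r
library(smacof)
fit <- smacofIndDiff(helm, ndim = 2, constraint = "indscal")   # INDSCAL fit to Helm's colour data
fit$gspace     # group stimulus space: coordinates of the ten colours in two dimensions
fit$cweights   # per-subject dimension weights (saliences)
plot(fit)      # configuration plot of the group space
```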
Concluding remarks
Methods of two-way MDS are applicable to a single matrix of proximities.
In most studies involving multiple proximity matrices, it is precisely the individual differences that are likely to be of most interest.
INDSCAL and IDIOSCAL are designed specifically for scaling three-way data and operate by deriving an overall group stimulus space together with individual spaces for each subject.
The individual spaces are allowed to differ from the group space by a specified class of transformations.
A drawback of the models for scaling three-way data is that it is not possible to test them by statistical methods.
References
Carroll, J. D. and Chang, J. J. (1970): Analysis of individual differences in multidimensional scaling via an n-way generalization of 'Eckart-Young' decomposition, Psychometrika, 35, pp. 283-319.
ten Berge, J. M. F., Kiers, H. A. L. and Krijnen, W. P. (1993): Computational solutions for the problem of negative saliences and nonsymmetry in INDSCAL, Journal of Classification, 10, pp. 115-124.
De Leeuw, J. and Mair, P. (2009): Multidimensional Scaling Using Majorization: SMACOF in R, Journal of Statistical Software, 31(3), pp. 1-30. An updated version of this article is available at https://cran.r-project.org/web/packages/smacof/vignettes/smacof.pdf
Borg, I. and Groenen, P. J. F. (2005): Modern Multidimensional Scaling: Theory and Applications, 2nd edition, Springer.
Borg, I., Groenen, P. J. F. and Mair, P. (2013): Applied Multidimensional Scaling, Springer.