Canonical Correlation Analysis and Related Techniques

Canonical Correlation Analysisand Related Techniques

Simon Mason

International Research Institute for Climate Prediction

The Earth Institute of Columbia University

L i n k i n g S c i e n c e t o S o c i e t yL i n k i n g S c i e n c e t o S o c i e t y

L i n k i n g S c i e n c e t o S p o r t !L i n k i n g S c i e n c e t o S p o r t !

Principal components

A principal component is defined as a weighted sum of a set of original variables in which the variance of the weighted sum is maximized, subject to the constraint that the matrix of weights is orthogonal.

The weights could be defined to meet any set of criteria. For example, rotation is used to redefine the weights so that simple structure is achieved.

There are a number of related techniques that involve defining weighted sums for two sets of variables …


Maximum Covariance Analysis

Consider two sets of variables, X and Y, both with the same number of corresponding cases n. With principal components analysis, the aim is to define a set of weights, U, that generate new variables, Z, with maximum variance:

Consider alternative sets of weights for X and Y (UX and UY) that maximize the covariances, but with similar orthogonality constraints:

Z XU

and X X Y YZ XU Z YU



The covariance to be maximized between ZX and ZY is:

In terms of X and Y:

T X YC Z Z

TT T

X Y

X Y

C XU YU

U X YU

But XTY is the covariance matrix of X with Y, CXY.



T T

T

X Y

X XY Y

C U X YU

U C U

Rearranging:

T

T T T

T

X XY Y

X Y X X XY Y Y

X Y XY

C U C U

U CU U U C U U

U CU C



The covariance matrix is then expressed in terms of a diagonal matrix C, and two orthogonal matrices:

This is a singular value decomposition of the covariance matrix.

For this reason maximum covariance analysis (MCA) is sometimes called SVD analysis or canonical covariance analysis.

TXY X YC U CU


Canonical Correlation Analysis

Canonical correlation analysis is very similar to maximum covariance analysis except that the correlations rather than the covariances are maximized. Instead of:

The Zs are standardized to Ws to obtain the correlation matrix, R:

T X YR W W

T X YC Z Z



The standardized scores, W, are obtained by dividing the scores, Z, by their standard deviations, S (which are diagonal):

Substituting the Zs:

indicating that the matrices of weights are no longer orthogonal (cf. maximum covariance analysis).

1 1

T

T

X Y

X Y

Z X Y Z

R W W

S Z Z S

1 1

1 1

T T

T

X Y

X Y

Z X Y Z

Z X XY Y Z

R S U X YU S

S U C U S



Note that CCA is not the same as MCA using standardized original data since it is the variance of the weighted sums that is standardized in CCA.

Buell pattern and other loading interpretation problems apply to MCA and CCA just as they do in PCA. It is unusual to apply rotation, however, because the criteria of maximizing covariance / correlation would be lost.

Buell patterns


Redundancy Analysis

Redundancy analysis is another similar technique except that the explained variance in the Y using the X is maximized. The technique differs from MCA and CCA in that only the ZXs are standardized.

Note that different results will be obtained depending upon which dataset is X and which is Y.

Canonical Correlation Analysis and Related Techniques

Documents

Transcript of Canonical Correlation Analysis and Related Techniques