Correlation. The sample covariance matrix: where.


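The sample covariance matrix:

$$
S = \begin{bmatrix}
s_{11} & s_{12} & \cdots & s_{1p} \\
s_{12} & s_{22} & \cdots & s_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
s_{1p} & s_{2p} & \cdots & s_{pp}
\end{bmatrix}
$$

where

$$
s_{ik} = \frac{1}{n-1} \sum_{j=1}^{n} (x_{ij} - \bar{x}_i)(x_{kj} - \bar{x}_k)
$$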
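As a minimal sketch of the definition of $s_{ik}$, the function below computes the sample covariance matrix directly from centered data. The function name, the data layout (one row per variable, one column per case) and the small data set are assumptions made for illustration only.

```python
import numpy as np

def sample_covariance(X):
    """Sample covariance matrix; X has p rows (variables) and n columns (cases)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    centered = X - X.mean(axis=1, keepdims=True)  # x_ij - xbar_i
    return centered @ centered.T / (n - 1)        # matrix of s_ik

# Hypothetical data: p = 3 variables, n = 5 cases.
X_demo = np.array([[2.0, 4.0, 6.0, 8.0, 10.0],
                   [1.0, 3.0, 2.0, 5.0, 4.0],
                   [9.0, 7.0, 8.0, 4.0, 2.0]])
S_demo = sample_covariance(X_demo)
```

NumPy's `np.cov` uses the same $n-1$ divisor by default, so it can serve as a cross-check.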

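The sample correlation matrix:

$$
R = \begin{bmatrix}
1 & r_{12} & \cdots & r_{1p} \\
r_{12} & 1 & \cdots & r_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
r_{1p} & r_{2p} & \cdots & 1
\end{bmatrix}
$$

where

$$
r_{ik} = \frac{s_{ik}}{\sqrt{s_{ii}}\,\sqrt{s_{kk}}}
= \frac{\sum_{j=1}^{n} (x_{ij} - \bar{x}_i)(x_{kj} - \bar{x}_k)}
{\sqrt{\sum_{j=1}^{n} (x_{ij} - \bar{x}_i)^2}\,\sqrt{\sum_{j=1}^{n} (x_{kj} - \bar{x}_k)^2}}
$$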

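Note:

$$
R = D^{-1} S D^{-1}
\quad\text{where}\quad
D = \begin{bmatrix}
\sqrt{s_{11}} & 0 & \cdots & 0 \\
0 & \sqrt{s_{22}} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sqrt{s_{pp}}
\end{bmatrix}
$$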
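The identity $R = D^{-1} S D^{-1}$ can be sketched in a few lines. The function name and the covariance matrix below are hypothetical, chosen so the correlations are easy to check by hand.

```python
import numpy as np

def correlation_from_covariance(S):
    """Rescale a covariance matrix S to a correlation matrix R = D^{-1} S D^{-1}."""
    S = np.asarray(S, dtype=float)
    d_inv = np.diag(1.0 / np.sqrt(np.diag(S)))  # D^{-1}, D = diag(sqrt(s_ii))
    return d_inv @ S @ d_inv

# Hypothetical covariance matrix with standard deviations 2, 3 and 1.
S_demo2 = np.array([[4.0, 2.0, 0.6],
                    [2.0, 9.0, -1.5],
                    [0.6, -1.5, 1.0]])
R_demo2 = correlation_from_covariance(S_demo2)
```

Here $r_{12} = 2/(2 \cdot 3) = 1/3$ and $r_{23} = -1.5/(3 \cdot 1) = -0.5$, which is what the rescaling should return.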

Tests for Independence and Non-zero Correlation

Tests for Independence

Test for zero correlation (independence between two variables)

The test statistic

$$
t = r_{ij} \sqrt{\frac{n-2}{1 - r_{ij}^2}}
$$

If independence is true then the test statistic t will have a t-distribution with $\nu = n - 2$ degrees of freedom.

The test is to reject independence if:

$$
|t| > t_{\alpha/2}^{(n-2)}
$$
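A small sketch of the t statistic for zero correlation follows. The function name and the inputs (r = 0.5, n = 27) are hypothetical; the critical value is left to a t table, since the Python standard library has no t-distribution.

```python
import math

def zero_correlation_t(r, n):
    """Return (t, df) for testing H0: rho = 0 from a sample correlation r and n cases."""
    df = n - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    return t, df

# Hypothetical inputs, not taken from the slides.
t_stat, t_df = zero_correlation_t(r=0.5, n=27)
# Independence is rejected when |t_stat| exceeds the tabulated t_{alpha/2} value with t_df d.f.
```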

Test for non-zero correlation ($H_0: \rho = \rho_0$)

The test statistic

$$
z = \frac{\dfrac{1}{2}\ln\dfrac{1+r}{1-r} - \dfrac{1}{2}\ln\dfrac{1+\rho_0}{1-\rho_0}}{\sqrt{\dfrac{1}{n-3}}}
$$

If $H_0$ is true the test statistic z will have approximately a Standard Normal distribution.

We then reject $H_0$ if:

$$
|z| > z_{\alpha/2}
$$
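The z test for a non-zero correlation can be sketched with the standard library alone, using the fact that $\tfrac{1}{2}\ln\frac{1+r}{1-r} = \operatorname{atanh}(r)$. The function name and the example values (r = 0.6, $\rho_0$ = 0.3, n = 28) are hypothetical.

```python
import math
from statistics import NormalDist

def fisher_z_test(r, rho0, n, alpha=0.05):
    """Return (z, z_crit, reject) for H0: rho = rho0 using the log transform of r."""
    z = (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    return z, z_crit, abs(z) > z_crit

# Hypothetical inputs, not taken from the slides.
z_stat, z_crit, reject_h0 = fisher_z_test(r=0.6, rho0=0.3, n=28)
```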


Partial Correlation

Conditional Independence

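Recall

If

$$
\mathbf{x} = \begin{bmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{bmatrix}
\quad (\mathbf{x}_1 \text{ has } q \text{ components, } \mathbf{x}_2 \text{ has } p - q)
$$

has a p-variate Normal distribution with mean vector

$$
\boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}
$$

and covariance matrix

$$
\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22} \end{bmatrix}
$$

then the conditional distribution of $\mathbf{x}_i$ given $\mathbf{x}_j$ is a $q_i$-variate Normal distribution with mean vector

$$
\boldsymbol{\mu}_{i \cdot j} = \boldsymbol{\mu}_i + \Sigma_{ij} \Sigma_{jj}^{-1} (\mathbf{x}_j - \boldsymbol{\mu}_j)
$$

and covariance matrix

$$
\Sigma_{ii \cdot j} = \Sigma_{ii} - \Sigma_{ij} \Sigma_{jj}^{-1} \Sigma_{ij}'
$$

(with $\Sigma_{21} = \Sigma_{12}'$).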
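The conditional mean and covariance formulas can be sketched as below for the case of $\mathbf{x}_1$ given $\mathbf{x}_2$. The function name, argument layout and the bivariate example are assumptions for illustration.

```python
import numpy as np

def conditional_normal(mu, sigma, q, x2):
    """Mean and covariance of x1 | x2, where x1 is the first q components of x."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    mu1, mu2 = mu[:q], mu[q:]
    s11, s12, s22 = sigma[:q, :q], sigma[:q, q:], sigma[q:, q:]
    w = np.linalg.solve(s22, s12.T).T               # Sigma_12 Sigma_22^{-1}
    cond_mean = mu1 + w @ (np.asarray(x2, dtype=float) - mu2)
    cond_cov = s11 - w @ s12.T                      # Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_12'
    return cond_mean, cond_cov

# Hypothetical bivariate example: unit variances, correlation 0.8, observed x2 = 2.
cond_mean, cond_cov = conditional_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], 1, [2.0])
```

In this example the conditional mean is $0.8 \times 2 = 1.6$ and the conditional variance is $1 - 0.8^2 = 0.36$.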

The matrix

$$
\Sigma_{2 \cdot 1} = \Sigma_{22} - \Sigma_{12}' \Sigma_{11}^{-1} \Sigma_{12}
$$

is called the matrix of partial variances and covariances.

The $(i, j)$th element of the matrix $\Sigma_{2 \cdot 1}$,

$$
\sigma_{ij \cdot 1,2,\dots,q}
$$

is called the partial covariance (variance if i = j) between $x_i$ and $x_j$ given $x_1, \dots, x_q$.

$$
\rho_{ij \cdot 1,2,\dots,q} = \frac{\sigma_{ij \cdot 1,2,\dots,q}}{\sqrt{\sigma_{ii \cdot 1,2,\dots,q}\; \sigma_{jj \cdot 1,2,\dots,q}}}
$$

is called the partial correlation between $x_i$ and $x_j$ given $x_1, \dots, x_q$.

Let

$$
S = \begin{bmatrix} S_{11} & S_{12} \\ S_{12}' & S_{22} \end{bmatrix}
$$

denote the sample covariance matrix.

Let

$$
S_{2 \cdot 1} = S_{22} - S_{12}' S_{11}^{-1} S_{12}
$$

The $(i, j)$th element of the matrix $S_{2 \cdot 1}$,

$$
s_{ij \cdot 1,2,\dots,q}
$$

is called the sample partial covariance (variance if i = j) between $x_i$ and $x_j$ given $x_1, \dots, x_q$.

Also

$$
r_{ij \cdot 1,2,\dots,q} = \frac{s_{ij \cdot 1,2,\dots,q}}{\sqrt{s_{ii \cdot 1,2,\dots,q}\; s_{jj \cdot 1,2,\dots,q}}}
$$

is called the sample partial correlation between $x_i$ and $x_j$ given $x_1, \dots, x_q$.
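A minimal sketch of the sample partial covariance and partial correlation computation, assuming the q conditioning variables occupy the first q rows and columns of S. The function name and the 3 x 3 example matrix are hypothetical.

```python
import numpy as np

def sample_partial_correlation(S, q):
    """Return (S_{2.1}, partial correlation matrix) given the first q variables."""
    S = np.asarray(S, dtype=float)
    s11, s12, s22 = S[:q, :q], S[:q, q:], S[q:, q:]
    s2_1 = s22 - s12.T @ np.linalg.solve(s11, s12)  # S_22 - S_12' S_11^{-1} S_12
    d = np.sqrt(np.diag(s2_1))
    return s2_1, s2_1 / np.outer(d, d)

# Hypothetical matrix: variable 1 is conditioned on, variables 2 and 3 are of interest.
S_part = np.array([[1.0, 0.5, 0.4],
                   [0.5, 1.0, 0.6],
                   [0.4, 0.6, 1.0]])
S21_demo, R21_demo = sample_partial_correlation(S_part, q=1)
```

With a single conditioning variable the result can be checked against the familiar closed form $(r_{23} - r_{12} r_{13}) / \sqrt{(1 - r_{12}^2)(1 - r_{13}^2)}$.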

Test for zero partial correlation (conditional independence between two variables given a set of p independent variables)

$r_{ij \cdot x_1,\dots,x_p}$ = the partial correlation between $y_i$ and $y_j$ given $x_1, \dots, x_p$.

The test statistic

$$
t = r_{ij \cdot x_1,\dots,x_p} \sqrt{\frac{n - p - 2}{1 - r_{ij \cdot x_1,\dots,x_p}^2}}
$$

If independence is true then the test statistic t will have a t-distribution with $\nu = n - p - 2$ degrees of freedom.

The test is to reject independence if:

$$
|t| > t_{\alpha/2}^{(n-p-2)}
$$

Test for non-zero partial correlation

$$
H_0: \rho_{ij \cdot x_1,\dots,x_p} = \rho^0_{ij \cdot x_1,\dots,x_p}
$$

The test statistic

$$
z = \frac{\dfrac{1}{2}\ln\dfrac{1 + r_{ij \cdot x_1,\dots,x_p}}{1 - r_{ij \cdot x_1,\dots,x_p}} - \dfrac{1}{2}\ln\dfrac{1 + \rho^0_{ij \cdot x_1,\dots,x_p}}{1 - \rho^0_{ij \cdot x_1,\dots,x_p}}}{\sqrt{\dfrac{1}{n - p - 3}}}
$$

If $H_0$ is true the test statistic z will have approximately a Standard Normal distribution.

We then reject $H_0$ if:

$$
|z| > z_{\alpha/2}
$$
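The two partial-correlation tests differ from the ordinary ones only in their degrees of freedom, which the sketch below makes explicit. The function name and the inputs (r = 0.5, n = 30, p = 3 conditioning variables) are hypothetical.

```python
import math

def partial_correlation_tests(r, n, p, rho0=0.0):
    """Return (t, df, z): t tests H0: rho = 0, z tests H0: rho = rho0,
    for a sample partial correlation r given p conditioning variables."""
    df = n - p - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    z = (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - p - 3)
    return t, df, z

# Hypothetical inputs, not taken from the slides.
pt_t, pt_df, pt_z = partial_correlation_tests(r=0.5, n=30, p=3)
```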


The Multiple Correlation Coefficient

Testing independence between a single variable and a group of variables

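Suppose

$$
\begin{bmatrix} y \\ \mathbf{x}_1 \end{bmatrix}
\quad (\mathbf{x}_1 \text{ has } p \text{ components})
$$

has a (p + 1)-variate Normal distribution with mean vector

$$
\boldsymbol{\mu} = \begin{bmatrix} \mu_y \\ \boldsymbol{\mu}_1 \end{bmatrix}
$$

and covariance matrix

$$
\Sigma = \begin{bmatrix} \sigma_{yy} & \boldsymbol{\sigma}_{1y}' \\ \boldsymbol{\sigma}_{1y} & \Sigma_{11} \end{bmatrix}
$$

We are interested in whether the variable y is independent of the vector $\mathbf{x}_1$.

Definition

The multiple correlation coefficient is the maximum correlation between y and a linear combination of the components of $\mathbf{x}_1$.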

Derivation

Let

$$
\begin{bmatrix} u \\ v \end{bmatrix}
= \begin{bmatrix} y \\ \mathbf{a}'\mathbf{x}_1 \end{bmatrix}
= A \begin{bmatrix} y \\ \mathbf{x}_1 \end{bmatrix},
\qquad
A = \begin{bmatrix} 1 & \mathbf{0}' \\ 0 & \mathbf{a}' \end{bmatrix}
$$

This vector has a bivariate Normal distribution with mean vector

$$
A\boldsymbol{\mu} = \begin{bmatrix} \mu_y \\ \mathbf{a}'\boldsymbol{\mu}_1 \end{bmatrix}
$$

and covariance matrix

$$
A \Sigma A' = \begin{bmatrix} \sigma_{yy} & \boldsymbol{\sigma}_{1y}'\mathbf{a} \\ \mathbf{a}'\boldsymbol{\sigma}_{1y} & \mathbf{a}'\Sigma_{11}\mathbf{a} \end{bmatrix}
$$

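The multiple correlation coefficient is the maximum correlation between y and $\mathbf{a}'\mathbf{x}_1$.

The correlation between y and $\mathbf{a}'\mathbf{x}_1$ is

$$
\rho(\mathbf{a}) = \frac{\mathbf{a}'\boldsymbol{\sigma}_{1y}}{\sqrt{\sigma_{yy}\; \mathbf{a}'\Sigma_{11}\mathbf{a}}}
$$

Thus we want to choose $\mathbf{a}$ to maximize $\rho(\mathbf{a})$.

Equivalently, we maximize

$$
\rho^2(\mathbf{a}) = \frac{(\mathbf{a}'\boldsymbol{\sigma}_{1y})^2}{\sigma_{yy}\; \mathbf{a}'\Sigma_{11}\mathbf{a}}
= \frac{1}{\sigma_{yy}} \cdot \frac{\mathbf{a}'\boldsymbol{\sigma}_{1y}\boldsymbol{\sigma}_{1y}'\mathbf{a}}{\mathbf{a}'\Sigma_{11}\mathbf{a}}
$$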

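Note:

$$
\frac{d\rho^2(\mathbf{a})}{d\mathbf{a}}
= \frac{1}{\sigma_{yy}} \frac{d}{d\mathbf{a}} \left[ \frac{(\mathbf{a}'\boldsymbol{\sigma}_{1y})^2}{\mathbf{a}'\Sigma_{11}\mathbf{a}} \right]
= \frac{1}{\sigma_{yy}} \cdot \frac{(\mathbf{a}'\Sigma_{11}\mathbf{a})\, 2(\mathbf{a}'\boldsymbol{\sigma}_{1y})\boldsymbol{\sigma}_{1y} - (\mathbf{a}'\boldsymbol{\sigma}_{1y})^2\, 2\Sigma_{11}\mathbf{a}}{(\mathbf{a}'\Sigma_{11}\mathbf{a})^2}
= \mathbf{0}
$$

or

$$
(\mathbf{a}'\Sigma_{11}\mathbf{a})\, \boldsymbol{\sigma}_{1y} = (\mathbf{a}'\boldsymbol{\sigma}_{1y})\, \Sigma_{11}\mathbf{a}
$$

or

$$
\mathbf{a}_{opt} = \frac{\mathbf{a}'\Sigma_{11}\mathbf{a}}{\mathbf{a}'\boldsymbol{\sigma}_{1y}}\, \Sigma_{11}^{-1}\boldsymbol{\sigma}_{1y} = k\, \Sigma_{11}^{-1}\boldsymbol{\sigma}_{1y}
$$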

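The multiple correlation coefficient is therefore

$$
\rho_{y \cdot x_1,\dots,x_p} = \rho(\mathbf{a}_{opt})
= \frac{\mathbf{a}_{opt}'\boldsymbol{\sigma}_{1y}}{\sqrt{\sigma_{yy}\; \mathbf{a}_{opt}'\Sigma_{11}\mathbf{a}_{opt}}}
= \frac{k\, \boldsymbol{\sigma}_{1y}'\Sigma_{11}^{-1}\boldsymbol{\sigma}_{1y}}{\sqrt{\sigma_{yy}\; k^2\, \boldsymbol{\sigma}_{1y}'\Sigma_{11}^{-1}\Sigma_{11}\Sigma_{11}^{-1}\boldsymbol{\sigma}_{1y}}}
$$

$$
= \frac{\boldsymbol{\sigma}_{1y}'\Sigma_{11}^{-1}\boldsymbol{\sigma}_{1y}}{\sqrt{\sigma_{yy}}\, \sqrt{\boldsymbol{\sigma}_{1y}'\Sigma_{11}^{-1}\boldsymbol{\sigma}_{1y}}}
= \sqrt{\frac{\boldsymbol{\sigma}_{1y}'\Sigma_{11}^{-1}\boldsymbol{\sigma}_{1y}}{\sigma_{yy}}}
$$

The multiple correlation coefficient is independent of the value of k (for k > 0).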

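Thus

$$
\rho_{y \cdot x_1,\dots,x_p} = \sqrt{\frac{\boldsymbol{\sigma}_{1y}'\Sigma_{11}^{-1}\boldsymbol{\sigma}_{1y}}{\sigma_{yy}}}
$$

The variable y is independent of the vector $\mathbf{x}_1$ if $\boldsymbol{\sigma}_{1y} = \mathbf{0}$, and in that case $\rho_{y \cdot x_1,\dots,x_p} = 0$.

The sample Multiple correlation coefficient

Let

$$
S = \begin{bmatrix} s_{yy} & \mathbf{s}_{1y}' \\ \mathbf{s}_{1y} & S_{11} \end{bmatrix}
$$

denote the sample covariance matrix.

Then the sample Multiple correlation coefficient is

$$
r_{y \cdot x_1,\dots,x_p} = \sqrt{\frac{\mathbf{s}_{1y}' S_{11}^{-1} \mathbf{s}_{1y}}{s_{yy}}}
$$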
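The sketch below computes the sample multiple correlation coefficient and checks numerically that the weights $S_{11}^{-1}\mathbf{s}_{1y}$ attain it, whatever positive multiple k is used. The function names and the 3 x 3 matrix (y in the first row and column) are hypothetical.

```python
import numpy as np

def multiple_correlation(S):
    """S is the (p+1) x (p+1) covariance matrix with y in the first row/column.
    Returns (r, a_opt) with a_opt = S_11^{-1} s_1y."""
    S = np.asarray(S, dtype=float)
    s_yy, s_1y, S11 = S[0, 0], S[1:, 0], S[1:, 1:]
    a_opt = np.linalg.solve(S11, s_1y)
    return float(np.sqrt(s_1y @ a_opt / s_yy)), a_opt

def corr_with_combination(S, a):
    """Correlation between y and the linear combination a'x_1."""
    S = np.asarray(S, dtype=float)
    a = np.asarray(a, dtype=float)
    return float((a @ S[1:, 0]) / np.sqrt(S[0, 0] * (a @ S[1:, 1:] @ a)))

# Hypothetical covariance matrix for (y, x1, x2).
S_mult = np.array([[1.0, 0.5, 0.4],
                   [0.5, 1.0, 0.6],
                   [0.4, 0.6, 1.0]])
r_mult, a_opt = multiple_correlation(S_mult)
```

Any other choice of weights, for example using $x_1$ alone, should give a correlation no larger than `r_mult`.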

Testing for independence between y and $\mathbf{x}_1$

The test statistic

$$
F = \frac{n - p - 1}{p} \cdot \frac{r^2_{y \cdot x_1,\dots,x_p}}{1 - r^2_{y \cdot x_1,\dots,x_p}}
= \frac{n - p - 1}{p} \cdot \frac{\mathbf{s}_{1y}' S_{11}^{-1} \mathbf{s}_{1y}}{s_{yy} - \mathbf{s}_{1y}' S_{11}^{-1} \mathbf{s}_{1y}}
$$

If independence is true then the test statistic F will have an F-distribution with $\nu_1 = p$ degrees of freedom in the numerator and $\nu_2 = n - p - 1$ degrees of freedom in the denominator.

The test is to reject independence if:

$$
F > F_{\alpha}(p,\; n - p - 1)
$$
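A short sketch of the F statistic follows, assuming denominator degrees of freedom n - p - 1 to match the multiplier in the statistic. The function name and the inputs (r = 0.6, n = 30, p = 4) are hypothetical, and the critical value is left to an F table.

```python
def multiple_correlation_f(r, n, p):
    """Return (F, df1, df2) for testing independence of y and the p predictors."""
    df1, df2 = p, n - p - 1
    f_stat = (df2 / df1) * (r * r) / (1.0 - r * r)
    return f_stat, df1, df2

# Hypothetical inputs, not taken from the slides.
f_stat, f_df1, f_df2 = multiple_correlation_f(r=0.6, n=30, p=4)
```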


Canonical Correlation Analysis

The problem

Quite often one has collected data on several variables.

The variables are grouped into two (or more) sets of variables, and the researcher is interested in whether one set of variables is independent of the other set. In addition, if it is found that the two sets of variates are dependent, it is then important to describe and understand the nature of this dependence.

The appropriate statistical procedure in this case is called Canonical Correlation Analysis.

Canonical Correlation: An Example

In the following study the researcher was interested in whether specific instructions on how to relax when taking tests and how to increase motivation would affect performance on standardized achievement tests:

• Reading,
• Language and
• Mathematics

A group of 65 third- and fourth-grade students were rated after the instruction and immediately prior to taking the Scholastic Achievement tests on:

• how relaxed they were (X1) and
• how motivated they were (X2).

In addition, data were collected on the three achievement tests:

• Reading (Y1),
• Language (Y2) and
• Mathematics (Y3).

The data are tabulated on the next page.

X1 = Relaxation, X2 = Motivation, Y1 = Reading, Y2 = Language, Y3 = Math

Case  X1  X2    Y1   Y2    Y3  |  Case  X1  X2    Y1   Y2    Y3
   1   7  14   311  436   154  |    34  40  20   362  416   107
   2  43  25   501  455   765  |    35  40  18   596  592   622
   3  32  21   507  473   702  |    36  35  17   431  346   493
   4  17  12   453  392   401  |    37  33  17   361  414   404
   5  23  12   419  337   284  |    38  40  27   663  451   651
   6  10  16   545  538   414  |    39  31  15   569  462   398
   7  22  21   509  512   491  |    40  29  19   699  622   478
   8  13  19   320  308   517  |    41  37  16   187  223   221
   9  31  21   357  296   496  |    42  21  23  1132  839  1044
  10  24  26   485  372   685  |    43  24  15   457  410   400
  11  26  21   811  748   902  |    44  19  14   413  448   520
  12  35  20   367  436   393  |    45  33  22   569  605   615
  13  24  17   242  349   137  |    46  19  19   650  685   440
  14  20   8   237  140   331  |    47  26  22   424  427   482
  15  38  27   417  648   618  |    48  20  15   475  604   742
  16  32  19   429  446   458  |    49  22  21   519  612   446
  17  14  11   555  579   438  |    50  37  22   338  463   327
  18  24  12   599  497   414  |    51  41  28   674  613   534
  19  38  25   403  383   606  |    52  29  35   381  624   565
  20  30   8   550  324   674  |    53  25  12   199  171   316
  21  22  25   377  496   242  |    54  27  21   577  523   699
  22  36  28   671  585   710  |    55  22  20   425  466   402
  23   3  22   498  488   481  |    56   4  11   392  192   354
  24  44  28   477  583   260  |    57  27  22   401  520   558
  25  24  25   609  413   670  |    58  28  23   321  410   460
  26  33  18   521  522   716  |    59  33  20   682  433   743
  27  24  21   495  645   491  |    60  33  24   719  727  1052
  28  28  20   400  555   624  |    61  31  33   672  705   650
  29  34   7   258  175   276  |    62  20  11   366  309   537
  30  39  20   466  541   348  |    63  26  25   581  558   386
  31   7  19   709  757   589  |    64  23  10   681  530   581
  32  13  17   586  472   492  |    65  30  22  1019  917   880
  33  32  18   418  361   428  |

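Definition: (Canonical variates and Canonical correlations)

Let

$$
\mathbf{x} = \begin{bmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{bmatrix}
\quad (\mathbf{x}_1 \text{ has } q \text{ components, } \mathbf{x}_2 \text{ has } p - q)
$$

have a p-variate Normal distribution with

$$
\boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}
\quad\text{and}\quad
\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22} \end{bmatrix}
$$

Let

$$
U_1 = \mathbf{a}_1'\mathbf{x}_1 = a_1^{(1)} x_1 + \cdots + a_q^{(1)} x_q
$$

and

$$
V_1 = \mathbf{b}_1'\mathbf{x}_2 = b_1^{(1)} x_{q+1} + \cdots + b_{p-q}^{(1)} x_p
$$

be such that $U_1$ and $V_1$ have achieved the maximum correlation $\phi_1$.

Then $U_1$ and $V_1$ are called the first pair of canonical variates and $\phi_1$ is called the first canonical correlation coefficient.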

Page 29: Correlation. The sample covariance matrix: where.

derivation: ( 1st pair of Canonical variates and Canonical correlation)

1 111 12

12 221 1

0 0'

0 0

a aA A

b b

has covariance matrixThus

1 11 11 11

1 11 1 21 1

q q

q p q p

a xa x a xU

V b xb x b x

Now

1 1

21

0

0

a xAx

xb

1

1

U

V

1 11 1 1 12 1

1 12 1 1 22 1

a a a b

b a b b

Page 30: Correlation. The sample covariance matrix: where.

derivation: ( 1st pair of Canonical variates and Canonical correlation)

1 111 12 1 11 1 1 12 1

12 221 1 1 12 1 1 22 1

0 0'

0 0

a a a a a bA A

b b b a b b

has covariance matrixThus

1 11 11 11

1 11 1 21 1

q q

q p q p

a xa x a xU

V b xb x b x

Now

1 1

21

0

0

a xAx

xb

1

1

U

V

1 1

1 12 1

1 11 1 1 22 1

U V

a b

a a b b

hence

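Thus we want to choose $\mathbf{a}_1$ and $\mathbf{b}_1$ so that

$$
\rho_{U_1 V_1} = \frac{\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1}{\sqrt{\mathbf{a}_1'\Sigma_{11}\mathbf{a}_1}\, \sqrt{\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1}}
$$

is at a maximum, or equivalently so that

$$
\rho^2_{U_1 V_1} = \frac{(\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1)^2}{(\mathbf{a}_1'\Sigma_{11}\mathbf{a}_1)(\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1)}
$$

is at a maximum.

Let

$$
V = \frac{(\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1)^2}{(\mathbf{a}_1'\Sigma_{11}\mathbf{a}_1)(\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1)}
$$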

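Computing derivatives

$$
\frac{\partial V}{\partial \mathbf{a}_1}
= \frac{1}{\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1} \cdot
\frac{2(\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1)\,\Sigma_{12}\mathbf{b}_1\,(\mathbf{a}_1'\Sigma_{11}\mathbf{a}_1) - (\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1)^2\, 2\Sigma_{11}\mathbf{a}_1}{(\mathbf{a}_1'\Sigma_{11}\mathbf{a}_1)^2}
= \mathbf{0}
$$

so that

$$
\Sigma_{12}\mathbf{b}_1\,(\mathbf{a}_1'\Sigma_{11}\mathbf{a}_1) = (\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1)\,\Sigma_{11}\mathbf{a}_1
$$

and

$$
\frac{\partial V}{\partial \mathbf{b}_1}
= \frac{1}{\mathbf{a}_1'\Sigma_{11}\mathbf{a}_1} \cdot
\frac{2(\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1)\,\Sigma_{12}'\mathbf{a}_1\,(\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1) - (\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1)^2\, 2\Sigma_{22}\mathbf{b}_1}{(\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1)^2}
= \mathbf{0}
$$

so that

$$
\Sigma_{12}'\mathbf{a}_1\,(\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1) = (\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1)\,\Sigma_{22}\mathbf{b}_1
$$

or

$$
\mathbf{b}_1 = \frac{\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1}{\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1}\, \Sigma_{22}^{-1}\Sigma_{12}'\mathbf{a}_1
$$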

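Thus

$$
\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'\mathbf{a}_1
= \frac{(\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1)^2}{(\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1)(\mathbf{a}_1'\Sigma_{11}\mathbf{a}_1)}\, \Sigma_{11}\mathbf{a}_1
$$

$$
\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'\mathbf{a}_1
= \frac{(\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1)^2}{(\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1)(\mathbf{a}_1'\Sigma_{11}\mathbf{a}_1)}\, \mathbf{a}_1
= k\,\mathbf{a}_1
$$

This shows that $\mathbf{a}_1$ is an eigenvector of $\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'$ with eigenvalue

$$
k = \frac{(\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1)^2}{(\mathbf{a}_1'\Sigma_{11}\mathbf{a}_1)(\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1)} = \rho^2_{U_1 V_1}
$$

Thus $\rho^2_{U_1 V_1}$ is maximized when k is the largest eigenvalue of $\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'$ and $\mathbf{a}_1$ is the eigenvector associated with the largest eigenvalue.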

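Also

$$
\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'\mathbf{a}_1 = k\,\mathbf{a}_1,
\qquad
k = \frac{(\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1)^2}{(\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1)(\mathbf{a}_1'\Sigma_{11}\mathbf{a}_1)}
$$

and

$$
\mathbf{b}_1 = \frac{\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1}{\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1}\, \Sigma_{22}^{-1}\Sigma_{12}'\mathbf{a}_1
\quad\text{or}\quad
\Sigma_{12}'\mathbf{a}_1 = \frac{\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1}{\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1}\, \Sigma_{22}\mathbf{b}_1
$$

Premultiplying the eigenvector equation by $\Sigma_{12}'$ gives

$$
\Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'\mathbf{a}_1 = k\, \Sigma_{12}'\mathbf{a}_1
$$

and substituting for $\Sigma_{12}'\mathbf{a}_1$ on both sides,

$$
\Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\, \frac{\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1}{\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1}\, \Sigma_{22}\mathbf{b}_1
= k\, \frac{\mathbf{a}_1'\Sigma_{12}\mathbf{b}_1}{\mathbf{b}_1'\Sigma_{22}\mathbf{b}_1}\, \Sigma_{22}\mathbf{b}_1
$$

so that

$$
\Sigma_{22}^{-1}\Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12}\mathbf{b}_1 = k\,\mathbf{b}_1
$$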

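Summary:

The first pair of canonical variates

$$
U_1 = \mathbf{a}_1'\mathbf{x}_1 = a_1^{(1)} x_1 + \cdots + a_q^{(1)} x_q
$$

$$
V_1 = \mathbf{b}_1'\mathbf{x}_2 = b_1^{(1)} x_{q+1} + \cdots + b_{p-q}^{(1)} x_p
$$

are found by finding $\mathbf{a}_1$ and $\mathbf{b}_1$, eigenvectors of the matrices

$$
\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}' \quad\text{and}\quad \Sigma_{22}^{-1}\Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12}
$$

respectively, associated with the largest eigenvalue (same for both matrices).

The largest eigenvalue of the two matrices is the square of the first canonical correlation coefficient $\phi_1$:

$$
\phi_1^2 = \text{the largest eigenvalue of } \Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'
= \text{the largest eigenvalue of } \Sigma_{22}^{-1}\Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12}
$$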

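Note:

$$
\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}' \quad\text{and}\quad \Sigma_{22}^{-1}\Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12}
$$

have exactly the same non-zero eigenvalues.

Proof: Let $\lambda$ and $\mathbf{a}$ be an eigenvalue and eigenvector of $\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'$. Then

$$
\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'\mathbf{a} = \lambda\mathbf{a}
$$

and

$$
\Sigma_{22}^{-1}\Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'\mathbf{a} = \lambda\, \Sigma_{22}^{-1}\Sigma_{12}'\mathbf{a}
$$

that is,

$$
\Sigma_{22}^{-1}\Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12}\mathbf{b} = \lambda\mathbf{b}
\quad\text{where}\quad
\mathbf{b} = \Sigma_{22}^{-1}\Sigma_{12}'\mathbf{a}
$$

Thus $\lambda$ and $\mathbf{b} = \Sigma_{22}^{-1}\Sigma_{12}'\mathbf{a}$ are an eigenvalue and eigenvector of $\Sigma_{22}^{-1}\Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12}$ (note that $\mathbf{b} \neq \mathbf{0}$ whenever $\lambda \neq 0$).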
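The eigenvalue argument can be checked numerically. The sketch below builds a synthetic covariance matrix (random, with q = 2 and p = 5 chosen arbitrarily; all names are hypothetical) and compares the eigenvalues of the two matrices.

```python
import numpy as np

rng = np.random.default_rng(0)           # synthetic example, not data from the slides
q, p = 2, 5
G = rng.normal(size=(p, p + 3))
Sigma = G @ G.T                          # symmetric positive definite covariance matrix
S11, S12, S22 = Sigma[:q, :q], Sigma[:q, q:], Sigma[q:, q:]

# M1 = S11^{-1} S12 S22^{-1} S12'  (q x q),  M2 = S22^{-1} S12' S11^{-1} S12  ((p-q) x (p-q))
M1 = np.linalg.solve(S11, S12) @ np.linalg.solve(S22, S12.T)
M2 = np.linalg.solve(S22, S12.T) @ np.linalg.solve(S11, S12)

lam1, vecs1 = np.linalg.eig(M1)
lam1, vecs1 = lam1.real, vecs1.real
lam2 = np.sort(np.linalg.eigvals(M2).real)[::-1]

top = int(np.argmax(lam1))
a = vecs1[:, top]                        # eigenvector of M1 for its largest eigenvalue
b = np.linalg.solve(S22, S12.T @ a)      # b = S22^{-1} S12' a
```

Because M2 is 3 x 3 here but has rank at most q = 2, its extra eigenvalue should be numerically zero, while `b` should satisfy `M2 @ b = lam1[top] * b`.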

The remaining canonical variates and canonical correlation coefficients

The second pair of canonical variates

$$
U_2 = \mathbf{a}_2'\mathbf{x}_1 = a_1^{(2)} x_1 + \cdots + a_q^{(2)} x_q
$$

$$
V_2 = \mathbf{b}_2'\mathbf{x}_2 = b_1^{(2)} x_{q+1} + \cdots + b_{p-q}^{(2)} x_p
$$

are found by finding $\mathbf{a}_2$ and $\mathbf{b}_2$, so that

1. $(U_2, V_2)$ are independent of $(U_1, V_1)$.

2. The correlation between $U_2$ and $V_2$ is maximized.

The correlation, $\phi_2$, between $U_2$ and $V_2$ is called the second canonical correlation coefficient.

The ith pair of canonical variates

$$
U_i = \mathbf{a}_i'\mathbf{x}_1 = a_1^{(i)} x_1 + \cdots + a_q^{(i)} x_q
$$

$$
V_i = \mathbf{b}_i'\mathbf{x}_2 = b_1^{(i)} x_{q+1} + \cdots + b_{p-q}^{(i)} x_p
$$

are found by finding $\mathbf{a}_i$ and $\mathbf{b}_i$, so that

1. $(U_i, V_i)$ are independent of $(U_1, V_1), \dots, (U_{i-1}, V_{i-1})$.

2. The correlation between $U_i$ and $V_i$ is maximized.

The correlation, $\phi_i$, between $U_i$ and $V_i$ is called the ith canonical correlation coefficient.

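Derivation: (2nd pair of Canonical variates and Canonical correlation)

Now

$$
\begin{bmatrix} U_1 \\ V_1 \\ U_2 \\ V_2 \end{bmatrix}
= \begin{bmatrix} \mathbf{a}_1'\mathbf{x}_1 \\ \mathbf{b}_1'\mathbf{x}_2 \\ \mathbf{a}_2'\mathbf{x}_1 \\ \mathbf{b}_2'\mathbf{x}_2 \end{bmatrix}
= \begin{bmatrix} \mathbf{a}_1' & \mathbf{0}' \\ \mathbf{0}' & \mathbf{b}_1' \\ \mathbf{a}_2' & \mathbf{0}' \\ \mathbf{0}' & \mathbf{b}_2' \end{bmatrix}
\begin{bmatrix} \mathbf{x}_1 \\ \mathbf{x}_2 \end{bmatrix}
= A\mathbf{x}
$$

has covariance matrix

$$
A \Sigma A'
= \begin{bmatrix} \mathbf{a}_1' & \mathbf{0}' \\ \mathbf{0}' & \mathbf{b}_1' \\ \mathbf{a}_2' & \mathbf{0}' \\ \mathbf{0}' & \mathbf{b}_2' \end{bmatrix}
\begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{12}' & \Sigma_{22} \end{bmatrix}
\begin{bmatrix} \mathbf{a}_1 & \mathbf{0} & \mathbf{a}_2 & \mathbf{0} \\ \mathbf{0} & \mathbf{b}_1 & \mathbf{0} & \mathbf{b}_2 \end{bmatrix}
$$

$$
= \begin{bmatrix}
\mathbf{a}_1'\Sigma_{11}\mathbf{a}_1 & \mathbf{a}_1'\Sigma_{12}\mathbf{b}_1 & \mathbf{a}_1'\Sigma_{11}\mathbf{a}_2 & \mathbf{a}_1'\Sigma_{12}\mathbf{b}_2 \\
* & \mathbf{b}_1'\Sigma_{22}\mathbf{b}_1 & \mathbf{b}_1'\Sigma_{12}'\mathbf{a}_2 & \mathbf{b}_1'\Sigma_{22}\mathbf{b}_2 \\
* & * & \mathbf{a}_2'\Sigma_{11}\mathbf{a}_2 & \mathbf{a}_2'\Sigma_{12}\mathbf{b}_2 \\
* & * & * & \mathbf{b}_2'\Sigma_{22}\mathbf{b}_2
\end{bmatrix}
$$

(entries marked * follow by symmetry).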

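Now

$$
\rho_{U_2 V_2} = \frac{\mathbf{a}_2'\Sigma_{12}\mathbf{b}_2}{\sqrt{\mathbf{a}_2'\Sigma_{11}\mathbf{a}_2}\, \sqrt{\mathbf{b}_2'\Sigma_{22}\mathbf{b}_2}}
$$

and maximizing

$$
\rho^2_{U_2 V_2} = \frac{(\mathbf{a}_2'\Sigma_{12}\mathbf{b}_2)^2}{(\mathbf{a}_2'\Sigma_{11}\mathbf{a}_2)(\mathbf{b}_2'\Sigma_{22}\mathbf{b}_2)}
$$

is equivalent to maximizing $(\mathbf{a}_2'\Sigma_{12}\mathbf{b}_2)^2$ subject to

$$
\mathbf{a}_2'\Sigma_{11}\mathbf{a}_2 = 1,\quad
\mathbf{b}_2'\Sigma_{22}\mathbf{b}_2 = 1,\quad
\mathbf{a}_1'\Sigma_{11}\mathbf{a}_2 = 0,\quad
\mathbf{a}_1'\Sigma_{12}\mathbf{b}_2 = 0,\quad
\mathbf{b}_1'\Sigma_{12}'\mathbf{a}_2 = 0
\quad\text{and}\quad
\mathbf{b}_1'\Sigma_{22}\mathbf{b}_2 = 0
$$

Using the Lagrange multiplier technique, let

$$
V = (\mathbf{a}_2'\Sigma_{12}\mathbf{b}_2)^2
+ \lambda_1(\mathbf{a}_2'\Sigma_{11}\mathbf{a}_2 - 1)
+ \lambda_2(\mathbf{b}_2'\Sigma_{22}\mathbf{b}_2 - 1)
+ \lambda_3\, \mathbf{a}_1'\Sigma_{11}\mathbf{a}_2
+ \lambda_4\, \mathbf{a}_1'\Sigma_{12}\mathbf{b}_2
+ \lambda_5\, \mathbf{b}_1'\Sigma_{12}'\mathbf{a}_2
+ \lambda_6\, \mathbf{b}_1'\Sigma_{22}\mathbf{b}_2
$$

Now

$$
\frac{\partial V}{\partial \mathbf{a}_2}
= 2(\mathbf{a}_2'\Sigma_{12}\mathbf{b}_2)\,\Sigma_{12}\mathbf{b}_2 + 2\lambda_1\Sigma_{11}\mathbf{a}_2 + \lambda_3\Sigma_{11}\mathbf{a}_1 + \lambda_5\Sigma_{12}\mathbf{b}_1 = \mathbf{0}
$$

and

$$
\frac{\partial V}{\partial \mathbf{b}_2}
= 2(\mathbf{a}_2'\Sigma_{12}\mathbf{b}_2)\,\Sigma_{12}'\mathbf{a}_2 + 2\lambda_2\Sigma_{22}\mathbf{b}_2 + \lambda_4\Sigma_{12}'\mathbf{a}_1 + \lambda_6\Sigma_{22}\mathbf{b}_1 = \mathbf{0}
$$

Setting

$$
\frac{\partial V}{\partial \lambda_i} = 0, \quad i = 1, \dots, 6
$$

also gives the restrictions.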

These equations can be used to show that $\mathbf{a}_2$ and $\mathbf{b}_2$ are eigenvectors of the matrices

$$
\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}' \quad\text{and}\quad \Sigma_{22}^{-1}\Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12}
$$

respectively, associated with the 2nd largest eigenvalue (same for both matrices).

The 2nd largest eigenvalue of the two matrices is the square of the 2nd canonical correlation coefficient $\phi_2$:

$$
\phi_2^2 = \text{the 2nd largest eigenvalue of } \Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'
= \text{the 2nd largest eigenvalue of } \Sigma_{22}^{-1}\Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12}
$$

Continuing

Coefficients for the ith pair of canonical variates, $\mathbf{a}_i$ and $\mathbf{b}_i$, are eigenvectors of the matrices

$$
\Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}' \quad\text{and}\quad \Sigma_{22}^{-1}\Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12}
$$

respectively, associated with the ith largest eigenvalue (same for both matrices).

The ith largest eigenvalue of the two matrices is the square of the ith canonical correlation coefficient $\phi_i$:

$$
\phi_i^2 = \text{the } i\text{th largest eigenvalue of } \Sigma_{11}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{12}'
= \text{the } i\text{th largest eigenvalue of } \Sigma_{22}^{-1}\Sigma_{12}'\Sigma_{11}^{-1}\Sigma_{12}
$$

Example

Variables

• Relaxation score (X1)
• Motivation score (X2)
• Reading (Y1)
• Language (Y2)
• Mathematics (Y3)

Summary Statistics

UNIVARIATE SUMMARY STATISTICS
-----------------------------
                           STANDARD
 VARIABLE        MEAN     DEVIATION
 1 Relax      26.87692      9.50412
 2 Mot        19.41538      5.83066
 3 Read      499.03077    172.25508
 4 Lang      485.83077    156.08957
 5 Math      512.52308    195.18614

CORRELATIONS
------------
             Relax     Mot    Read    Lang    Math
                 1       2       3       4       5
 Relax  1    1.000
 Mot    2    0.391   1.000
 Read   3    0.002   0.280   1.000
 Lang   4    0.050   0.510   0.781   1.000
 Math   5    0.127   0.340   0.713   0.556   1.000
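As a hedged sketch, the eigenvalue recipe for canonical correlations can be applied to the correlation table above (canonical correlations are unchanged when the covariance matrix is replaced by the correlation matrix). The variable names are hypothetical, and because the printed correlations are rounded to three decimals, the results should be treated as approximate.

```python
import numpy as np

# Correlations as printed above (order: Relax, Mot, Read, Lang, Math).
R = np.array([
    [1.000, 0.391, 0.002, 0.050, 0.127],
    [0.391, 1.000, 0.280, 0.510, 0.340],
    [0.002, 0.280, 1.000, 0.781, 0.713],
    [0.050, 0.510, 0.781, 1.000, 0.556],
    [0.127, 0.340, 0.713, 0.556, 1.000],
])
q = 2                                    # first set: Relax, Mot
R11, R12, R22 = R[:q, :q], R[:q, q:], R[q:, q:]

# Eigenvalues of R11^{-1} R12 R22^{-1} R12' are the squared canonical correlations.
M = np.linalg.solve(R11, R12) @ np.linalg.solve(R22, R12.T)
cca_eigenvalues = np.sort(np.linalg.eigvals(M).real)[::-1]
canonical_correlations = np.sqrt(cca_eigenvalues)
```

Since the first set has only two variables, there are two canonical correlations.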

Canonical Correlation Statistics

                CANONICAL      NUMBER OF      BARTLETT'S TEST FOR
 EIGENVALUE     CORRELATION    EIGENVALUES    REMAINING EIGENVALUES

                                              CHI-SQUARE   D.F.   TAIL PROB.
                                                   27.86      6       0.0001
    0.35029       0.59186           1               1.56      2       0.4586
    0.02523       0.15885

BARTLETT'S TEST ABOVE INDICATES THE NUMBER OF CANONICAL VARIABLES NECESSARY TO EXPRESS THE DEPENDENCY BETWEEN THE TWO SETS OF VARIABLES. THE NECESSARY NUMBER OF CANONICAL VARIABLES IS THE SMALLEST NUMBER OF EIGENVALUES SUCH THAT THE TEST OF THE REMAINING EIGENVALUES IS NON-SIGNIFICANT. FOR EXAMPLE, IF A TEST AT THE .01 LEVEL WERE DESIRED, THEN 1 VARIABLES WOULD BE CONSIDERED NECESSARY. HOWEVER, THE NUMBER OF CANONICAL VARIABLES OF PRACTICAL VALUE IS LIKELY TO BE SMALLER.
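The printout does not state the formula behind Bartlett's test. The sketch below assumes the usual Bartlett approximation, $\chi^2 = -\left[n - 1 - \tfrac{1}{2}(p_1 + p_2 + 1)\right] \sum_{i>k} \ln(1 - \lambda_i)$ with $(p_1 - k)(p_2 - k)$ degrees of freedom, where $p_1 = 2$ and $p_2 = 3$ are the sizes of the two variable sets. The function name is hypothetical.

```python
import math

def bartlett_remaining(eigenvalues, n, p1, p2, k):
    """Chi-square and d.f. for testing that all eigenvalues after the first k are zero
    (Bartlett's approximation, assumed here rather than taken from the printout)."""
    factor = n - 1 - (p1 + p2 + 1) / 2.0
    chi_sq = -factor * sum(math.log(1.0 - lam) for lam in eigenvalues[k:])
    df = (p1 - k) * (p2 - k)
    return chi_sq, df

eigs = [0.35029, 0.02523]                # eigenvalues from the printout above
chi0, df0 = bartlett_remaining(eigs, n=65, p1=2, p2=3, k=0)
chi1, df1 = bartlett_remaining(eigs, n=65, p1=2, p2=3, k=1)
```

Under this assumption, `chi0` and `chi1` can be compared with the CHI-SQUARE column above, and `df0`, `df1` with the D.F. column.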

CANONICAL VARIABLE LOADINGS
---------------------------
(CORRELATIONS OF CANONICAL VARIABLES WITH ORIGINAL VARIABLES)
FOR FIRST SET OF VARIABLES

              CNVRF1   CNVRF2
                   1        2
 Relax  1      0.197    0.980
 Mot    2      0.979    0.203

CANONICAL VARIABLE LOADINGS
---------------------------
(CORRELATIONS OF CANONICAL VARIABLES WITH ORIGINAL VARIABLES)
FOR SECOND SET OF VARIABLES

              CNVRS1   CNVRS2
                   1        2
 Read   3      0.504   -0.361
 Lang   4      0.900   -0.354
 Math   5      0.565    0.391

Summary

U1 = 0.197 Relax + 0.979 Mot

V1 = 0.504 Read + 0.900 Lang + 0.565 Math

$\phi_1$ = 0.592

U2 = 0.980 Relax + 0.203 Mot

V2 = 0.391 Math - 0.361 Read - 0.354 Lang

$\phi_2$ = 0.159