Transcript of Multivariate Statistics: Matrix Algebra I

Page 1: Multivariate Statistics

Multivariate Statistics

Matrix Algebra I

W. M. van der Veld
University of Amsterdam

Page 2: Multivariate Statistics

Overview

• Introduction
• Definitions
• Special names
• Matrix transposition
• Matrix addition
• Matrix multiplication

Page 3: Multivariate Statistics

Introduction

• The mathematics in which multivariate analysis is cast is matrix algebra.

• We will present enough matrix algebra to describe the operations needed to understand the multivariate analyses discussed in this course. This basic understanding is also necessary for the more advanced courses of the Research Master.

• Basically, all we need is a few basic tricks, at least at first. Let us summarize them, so that you will have some idea of what is coming and, more importantly, of why these topics must be mastered.

Page 4: Multivariate Statistics

Introduction

• Our point of departure is always a multivariate data matrix with a certain number, n, of rows for the individual observation units, and a certain number, m, of columns for the variables.

• In most applications of multivariate analysis, we shall not be interested in variable means. They have their interest, of course, in each study, but multivariate analysis instead focuses on variances and covariances. Therefore, the data matrix will in general be transformed into a matrix where columns have zero means and where the numbers in the column represent deviations from the mean.

• Such an n by m matrix of deviation scores is the basis for the variance-covariance matrix, which has m rows and m columns. For a variable i, the variance is defined as Σxi²/n, whereas for two variables i and j the covariance is defined as Σxixj/n, xi and xj being taken as deviations from the mean. Variances and covariances are collected in the variance-covariance matrix: the number in row i, column i (on the diagonal) gives the variance of variable i, while the number in row i, column j (i ≠ j) gives the covariance between the pair of variables i and j, and is the same number as in row j, column i.

• An often useful transformation is to standardize the data matrix: we first take deviations from the mean for each column, then divide the deviation from the mean by the standard deviation for the same column. The result is that values in a column will have zero mean and unit variance.

• The standardized data matrix is then the basis for calculating a correlation matrix, which is nothing but a variance-covariance matrix for standardized variables. On the diagonal of this matrix we therefore find values equal to unity. In the other cells we find correlations: in row i, column j, we find the correlation coefficient rij = Σxixj/(nσiσj).
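The steps above (centering, covariances, standardization, correlations) can be sketched in plain Python. The small 4 by 2 data matrix is a hypothetical illustration, not taken from the slides:

```python
# Sketch: variance-covariance and correlation matrices from a small
# hypothetical data matrix with n = 4 rows (units) and m = 2 columns (variables).
X = [[68, 7], [23, 2], [25, 6], [49, 4]]
n, m = len(X), len(X[0])

means = [sum(row[j] for row in X) / n for j in range(m)]
D = [[X[i][j] - means[j] for j in range(m)] for i in range(n)]  # deviation scores

# Covariance of variables i and j: sum of cross-products of deviations, over n.
def cov(i, j):
    return sum(D[r][i] * D[r][j] for r in range(n)) / n

C = [[cov(i, j) for j in range(m)] for i in range(m)]  # m-by-m variance-covariance matrix

# Correlation r_ij = cov(i, j) / (sigma_i * sigma_j); diagonal becomes 1.
sd = [C[j][j] ** 0.5 for j in range(m)]
R = [[C[i][j] / (sd[i] * sd[j]) for j in range(m)] for i in range(m)]
```

Note that C is symmetric with variances on the diagonal, exactly as the slide describes.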

Page 5: Multivariate Statistics

Introduction

• Very often we shall need a variable that is a linear compound of the initial variables. The linear compound is simply a variable whose values are obtained by a weighted addition of values of the original variables. For example, with two initial variables x1 and x2, values of the compound are defined as y = w1x1 + w2x2, where w1 and w2 are weights. A linear compound could also be called a weighted sum.
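A linear compound y = w1x1 + w2x2 can be sketched directly; the data values and the weights w1, w2 here are illustrative assumptions, not from the text:

```python
# Sketch of a linear compound (weighted sum) of two variables.
x1 = [68, 23, 25, 49]
x2 = [7, 2, 6, 4]
w1, w2 = 0.5, 2.0  # hypothetical weights

# each value of y is a weighted sum of the corresponding original values
y = [w1 * a + w2 * b for a, b in zip(x1, x2)]
print(y)
```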

• For some techniques of multivariate analysis, we need to be able to solve simultaneous equations. Doing so usually requires a computational routine called matrix inversion.

• Multivariate analysis nearly always comes down to finding a minimum or a maximum of some sort. A typical example is to find a linear compound of some variables that has maximum correlation with some other variable (multiple correlation), or to find a linear compound of the observed scores that has maximum variance (factor analysis). Therefore, among our stock of basic tricks, we need to include procedures for finding extreme values of functions.

• In addition, we shall often need to find maxima (or minima) of functions where the procedure is limited by certain side-conditions. For instance, we are given two sets of variables, and are required to find a linear compound from the first set, and another from the second set, such that the value of the correlation between these two compounds is maximum. This task can be reformulated as follows: find the two compounds in such a way that the covariance between them is maximum, given that the compounds both have unit variance.

• Very often in multivariate analysis, a maximization procedure under certain side-conditions takes on a very specific and recognizable form, namely, finding eigenvectors and eigenvalues of a given matrix.

Page 6: Multivariate Statistics

Definitions

• For multivariate statistics the most important matrix is the data matrix.

• The data matrix has a certain number, n, of rows for the individual observation units, and a certain number, m, of columns for the variables.

Data file (variables: idresp, Age, Satlife, SocTrst, PolTrst):

idresp  Age  Satlife  SocTrst  PolTrst
00000    49        4        2        3
00001    25        6        7        5
00002    23        2        7        5
00007    68        7        1        7

Data matrix (the idresp column dropped):

[49 4 2 3]
[25 6 7 5]
[23 2 7 5]
[68 7 1 7]

Page 7: Multivariate Statistics

Definitions

• In general a matrix has an n by m dimension.
• The convention is to denote matrices by boldface uppercase letters.
• The first subscript in a matrix element (xij) refers to the row and the second subscript refers to the column.
• It is important to remember this convention when matrix algebra is performed.

X = [x11 x12 x13]
    [x21 x22 x23]
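The row-first subscript convention can be sketched in plain Python (Python lists are 0-based, so x_ij corresponds to X[i-1][j-1]):

```python
# Each entry below spells out its own (row, column) position,
# so the indexing convention checks itself.
X = [[11, 12, 13],
     [21, 22, 23]]

print(X[0][2])  # x_13: first row, third column -> 13
print(X[1][1])  # x_22: second row, second column -> 22
```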

Page 8: Multivariate Statistics

Definitions

• A vector is a special type of matrix that has only one row (called a row vector) or one column (called a column vector). Below, a is a column vector while b is a row vector.

• The convention is to denote vectors by boldface lowercase letters.

a = [x1]        b = [x1 x2 x3]
    [x2]

Page 9: Multivariate Statistics

Definitions

• A scalar is a matrix with only one row and one column.

• The convention is to denote scalars by italicized, lower case letters (e.g., x).

Page 10: Multivariate Statistics

Special names

• If n = m then the matrix is called a square matrix.
• The data matrix is normally not square, but the variance-covariance matrix is; and many others.
• Matrix A is square but matrix B is not square.

A = [3  4  5]      B = [3  4  5]
    [2 12  5]          [2 12  5]
    [1  7  0]

Page 11: Multivariate Statistics

Special names

• A symmetric matrix is a square matrix in which xij = xji , for all i and j.

• The data matrix is normally not symmetric, but the variance-covariance matrix is.

• Matrix A is symmetric; matrix B is not symmetric.

A = [1  2  1]      B = [ 1  2  1]
    [2 12 10]          [10 12  2]
    [1 10  0]          [ 1 10  0]

Page 12: Multivariate Statistics

Special names

• A diagonal matrix is a symmetric matrix in which all the off-diagonal elements are 0.
• The data matrix is normally not diagonal, and neither is the variance-covariance matrix. The variance matrix (variances on the diagonal, zeros elsewhere) is diagonal.
• These matrices are often denoted by D; matrix D is diagonal.

D = [1  0 0]
    [0 12 0]
    [0  0 7]

Page 13: Multivariate Statistics

Special names

• An identity matrix is a diagonal matrix with 1s and only 1s on the diagonal; it is also sometimes called the unity matrix.
• This is a useful matrix in matrix algebra.
• The convention is to denote the identity matrix by I.

I = [1 0 0]
    [0 1 0]
    [0 0 1]

Page 14: Multivariate Statistics

Special names

• A unit vector is a vector containing only 1s.
• This is a useful vector in matrix algebra.
• The convention is to denote the unit vector by u.

u = [1]
    [1]
    [1]

Page 15: Multivariate Statistics

Matrix transposition

• Matrix transposition is a useful transformation, with many purposes.

• The transpose of a matrix is denoted by a prime (A’) or a superscript t or T (At or AT).

• What does it do? The first row of a matrix becomes the first column of the transpose, the second row of the matrix becomes the second column of the transpose, etc.

A = [1 5 1]      A' = [1 0]
    [0 2 1]           [5 2]
                      [1 1]
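The row-becomes-column rule can be sketched as a small Python helper, here applied to a 2 by 3 matrix:

```python
# A minimal transpose: row k of A becomes column k of A'.
A = [[1, 5, 1],
     [0, 2, 1]]

def transpose(M):
    # element (j, i) of the result is element (i, j) of M
    return [[M[i][j] for i in range(len(M))] for j in range(len(M[0]))]

At = transpose(A)
print(At)  # a 3-by-2 matrix
```

Transposing twice returns the original matrix, which is a quick sanity check on the helper.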

Page 16: Multivariate Statistics

Matrix transposition

• What is the transpose of A? And what are the dimensions of A'?

A = [1 3]      A' = ?
    [0 4]

• The transpose of a square matrix is a square matrix.
• What type of special matrix is this matrix? What is its transpose?

A = [1 3 5]      A' = ?
    [3 1 7]
    [5 7 1]

• The transpose of a symmetric matrix is simply the original matrix.

Page 17: Multivariate Statistics

Matrix transposition

• The transpose of a row vector will be a column vector, and the transpose of a column vector will be a row vector.

a = [2]      a' = [2 3]
    [3]

b = [4 0 2]      b' = [4]
                      [0]
                      [2]

Page 18: Multivariate Statistics

Matrix addition

• To add two matrices:
– they both must have the same number of rows, and
– they both must have the same number of columns.

• The elements of the two matrices are simply added together, element by element, to produce the results.

• That is, for R = A + B, then rij = aij + bij.

A = [1  2  1]    B = [ 1  2  1]    R = A + B = [ 2  4  2]
    [2 12 10]        [10 12  2]                [12 24 12]
    [1 10  0]        [ 1 10  0]                [ 2 20  0]

For example, r22 = a22 + b22 = 12 + 12 = 24.

Page 19: Multivariate Statistics

Matrix addition

• Matrix subtraction works in the same way, except that elements are subtracted instead of added.

• What is the result of this addition?

[2  3]   [-1 -3]
[0 -4] + [ 0  5]

• And what is the result when the second matrix is subtracted from the first instead?

Page 20: Multivariate Statistics

Matrix addition

• Rules for matrix addition and subtraction:
– A + B = B + A                Commutative
– (A + B) + C = A + (B + C)    Associative
– (A + B)' = A' + B'
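Element-wise addition and the rules above can be checked in a few lines of plain Python; the two small matrices are illustrative:

```python
# Element-wise matrix addition, plus a check of the listed rules.
def add(A, B):
    # both matrices must have the same dimensions
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[2, 3], [0, -4]]
B = [[-1, -3], [0, 5]]

assert add(A, B) == add(B, A)                                   # commutative
assert transpose(add(A, B)) == add(transpose(A), transpose(B))  # (A + B)' = A' + B'
print(add(A, B))
```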

Page 21: Multivariate Statistics

Matrix multiplication

• Multiplication between a scalar and a vector.
• Each element in the product is simply the scalar multiplied by the corresponding element in the vector.
• That is, for p = xa, then pij = xaij for all i and j. Thus, with x = 4 and a = [2; 3]:

p = xa = 4 [2] = [4*2] = [ 8]
           [3]   [4*3]   [12]

• The following multiplication is also defined: p = ax. That is, scalar multiplication is commutative.

Page 22: Multivariate Statistics

Matrix multiplication

• Multiplication between two vectors.
• To perform this, the row vector must have as many columns as the column vector has rows.
• The product is simply the sum of the first row-vector element multiplied by the first column-vector element, plus the second row-vector element multiplied by the second column-vector element, plus the product of the third elements, etc.
• In algebra, if p = ab, then p = Σ(i=1 to n) aibi.

[0 1 2] [0] = 0*0 + 1*1 + 2*2 = 5
        [1]
        [2]
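The inner product p = Σ aibi can be sketched as a one-line helper:

```python
# Inner (dot) product of a row vector and a column vector:
# multiply matching elements and sum them.
def dot(a, b):
    assert len(a) == len(b)  # as many columns in a as rows in b
    return sum(x * y for x, y in zip(a, b))

print(dot([0, 1, 2], [0, 1, 2]))  # 0*0 + 1*1 + 2*2 = 5
```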

Page 23: Multivariate Statistics

Matrix multiplication

• Multiplication between two matrices.
• This is similar to the multiplication of two vectors.
• Specifically, in the expression P = AB, pij = ai•b•j, where ai• is the ith row vector in matrix A and b•j is the jth column vector in matrix B.
• Thus, if

A = [1 5 1]    B = [1 2]    P = AB = ?
    [0 2 1]        [0 4]
                   [7 1]

Page 24: Multivariate Statistics

Matrix multiplication

• Multiplication between two matrices.
• This is similar to the multiplication of two vectors.
• Specifically, in the expression P = AB, pij = ai•b•j, where ai• is the ith row vector in matrix A and b•j is the jth column vector in matrix B.
• Thus, if

A = [1 5 1]    B = [1 2]
    [0 2 1]        [0 4]
                   [7 1]

p11 = a1•b•1 = [1 5 1] [1] = 1*1 + 5*0 + 1*7 = 8      P = AB = [8 ?]
                       [0]                                     [? ?]
                       [7]

Page 25: Multivariate Statistics

Matrix multiplication

• Multiplication between two matrices.
• This is similar to the multiplication of two vectors.
• Specifically, in the expression P = AB, pij = ai•b•j, where ai• is the ith row vector in matrix A and b•j is the jth column vector in matrix B.
• Thus, if

A = [1 5 1]    B = [1 2]
    [0 2 1]        [0 4]
                   [7 1]

p12 = a1•b•2 = [1 5 1] [2] = 1*2 + 5*4 + 1*1 = 23     P = AB = [8 23]
                       [4]                                     [?  ?]
                       [1]

Page 26: Multivariate Statistics

Matrix multiplication

• Multiplication between two matrices.
• This is similar to the multiplication of two vectors.
• Specifically, in the expression P = AB, pij = ai•b•j, where ai• is the ith row vector in matrix A and b•j is the jth column vector in matrix B.
• Thus, if

A = [1 5 1]    B = [1 2]
    [0 2 1]        [0 4]
                   [7 1]

p21 = a2•b•1 = [0 2 1] [1] = 0*1 + 2*0 + 1*7 = 7      P = AB = [8 23]
                       [0]                                     [7  ?]
                       [7]

Page 27: Multivariate Statistics

Matrix multiplication

• Multiplication between two matrices.
• This is similar to the multiplication of two vectors.
• Specifically, in the expression P = AB, pij = ai•b•j, where ai• is the ith row vector in matrix A and b•j is the jth column vector in matrix B.
• Thus, if

A = [1 5 1]    B = [1 2]
    [0 2 1]        [0 4]
                   [7 1]

p22 = a2•b•2 = [0 2 1] [2] = 0*2 + 2*4 + 1*1 = 9      P = AB = [8 23]
                       [4]                                     [7  9]
                       [1]

Page 28: Multivariate Statistics

Matrix multiplication

• Summary of the multiplication procedure:

A = [a b c]    B = [g j]    AB = [ag+bh+ci  aj+bk+cl]
    [d e f]        [h k]         [dg+eh+fi  dj+ek+fl]
                   [i l]

Page 29: Multivariate Statistics

Matrix multiplication

• For matrix multiplication to be legal, the first matrix must have as many columns as the second matrix has rows. This, of course, is the requirement for multiplying a row vector by a column vector.
• The resulting matrix will have as many rows as the first matrix and as many columns as the second matrix.
• In the example A had 2 rows and 3 columns while B had 3 rows and 2 columns; the matrix multiplication was therefore defined, resulting in a matrix with 2 rows and 2 columns.
• Or in general:
– Dimension of A is na by ma, dimension of B is nb by mb.
– Then the product P = AB is defined if ma = nb.
– The dimension of P is na by mb.
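The full procedure — each pij as the inner product of row i of A with column j of B, with the conformability check — can be sketched as:

```python
# Matrix multiplication from the definition: p_ij is the inner product of
# row i of A with column j of B; defined only when A has as many columns
# as B has rows.
def matmul(A, B):
    n, k, m = len(A), len(B), len(B[0])
    assert len(A[0]) == k  # conformability requirement
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

A = [[1, 5, 1],
     [0, 2, 1]]   # 2 by 3
B = [[1, 2],
     [0, 4],
     [7, 1]]      # 3 by 2

P = matmul(A, B)  # 2 by 2
print(P)
```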

Page 30: Multivariate Statistics

Matrix multiplication

• Rules for matrix and vector multiplication:
– AB ≠ BA                      Not commutative
– A(BC) = (AB)C                Associative
– A(B + C) = AB + AC           Distributive
– (B + C)A = BA + CA
– (AB)' = B'A'
– (ABC)' = C'B'A'

• Rules for scalar multiplication:
– xA = Ax                      Commutative
– x(A + B) = xA + xB           Distributive
– x(AB) = (xA)B = A(xB)        Associative
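Non-commutativity is worth seeing concretely; with two illustrative 2 by 2 matrices, AB and BA come out different:

```python
# AB and BA generally differ, even when both products are defined and square.
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]  # swaps columns (AB) or rows (BA)

print(matmul(A, B))  # [[2, 1], [4, 3]]
print(matmul(B, A))  # [[3, 4], [1, 2]]
```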

Page 31: Multivariate Statistics

Matrix multiplication

• What is the product of:

[2]
[3] [2 4] = ?      Answer: [4  8]
[4]                        [6 12]
                           [8 16]

[2]
[3] [1 1] = ?      Answer: [2 2]
[4]                        [3 3]
                           [4 4]

[1 1] [2]
      [3] = ?      Not possible: [1x2][3x1]
      [4]
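The first two exercises are outer products: a column vector times a row vector gives a matrix whose (i, j) entry is simply ai*bj. A minimal sketch:

```python
# Outer product: column vector a times row vector b gives a matrix
# with entry (i, j) equal to a_i * b_j.
def outer(a, b):
    return [[ai * bj for bj in b] for ai in a]

print(outer([2, 3, 4], [2, 4]))  # a 3-by-2 matrix
print(outer([2, 3, 4], [1, 1]))  # a 3-by-2 matrix of repeated columns
```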

Page 32: Multivariate Statistics

Matrix multiplication

• What is the product of:

[4  8]
[6 12] [2 3 4] = ?       Not defined: [3x2] by [1x3]
[8 16]

[4  8] [1]   [4*1 + 8*0 ]   [4]
[6 12] [0] = [6*1 + 12*0] = [6]
[8 16]       [8*1 + 16*0]   [8]

[2 1 0] [3 7]   [2*3+1*5+0*7  2*7+1*5+0*3]   [11 19]
[0 1 0] [5 5] = [0*3+1*5+0*7  0*7+1*5+0*3] = [ 5  5]
[0 2 3] [7 3]   [0*3+2*5+3*7  0*7+2*5+3*3]   [31 19]

Page 33: Multivariate Statistics

Matrix multiplication

• Matrix division.
• For simple numbers, division can be reduced to multiplication by the reciprocal of the divisor:
– 32 divided by 4 is the same as
– 32 multiplied by ¼, or
– 32 multiplied by 4⁻¹,
– where 4⁻¹ is defined by the general equality a⁻¹a = 1.

• When working with matrices, we shall adopt the latter idea, and therefore not use the term division at all; instead we take multiplication by an inverse matrix as the equivalent of division.

• However, the computation of the inverse matrix is quite complex, and will be discussed next time.
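As a small preview of next time, the 2 by 2 case has a simple closed form: A⁻¹ = (1/det)[d -b; -c a] with det = ad - bc, and multiplying by A⁻¹ plays the role of division since A⁻¹A = I. A sketch, using an illustrative matrix:

```python
# Closed-form inverse of a 2-by-2 matrix; only a preview of the general case.
def inv2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    assert det != 0  # a singular matrix has no inverse
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 3],
     [0, 4]]
Ainv = inv2(A)

# "Dividing" by A means multiplying by its inverse: A^{-1} A = I.
print(matmul(Ainv, A))
```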