Lecture slides stats1.13.l11.air

Page 1: Lecture slides stats1.13.l11.air

Statistics One

Lecture 11 Multiple Regression

Page 2: Lecture slides stats1.13.l11.air

Three segments

•  Multiple Regression (MR)
•  Matrix algebra
•  Estimation of coefficients

Page 3: Lecture slides stats1.13.l11.air

Lecture 11 ~ Segment 1

Multiple Regression

Page 4: Lecture slides stats1.13.l11.air

Multiple Regression

•  Important concepts/topics
   – Multiple regression equation
   – Interpretation of regression coefficients

Page 5: Lecture slides stats1.13.l11.air

Simple vs. multiple regression

•  Simple regression
   – Just one predictor (X)

•  Multiple regression
   – Multiple predictors (X1, X2, X3, …)

Page 6: Lecture slides stats1.13.l11.air

Multiple regression

•  Multiple regression equation
   – Just add more predictors (multiple Xs)

$$\hat{Y} = B_0 + B_1 X_1 + B_2 X_2 + B_3 X_3 + \dots + B_k X_k$$

$$\hat{Y} = B_0 + \sum_{j=1}^{k} B_j X_j$$

Page 7: Lecture slides stats1.13.l11.air

Multiple regression

•  Multiple regression equation
   – Ŷ = predicted value on the outcome variable Y
   – B0 = predicted value on Y when all X = 0
   – Xk = predictor variables
   – Bk = unstandardized regression coefficients
   – Y - Ŷ = residual (prediction error)
   – k = the number of predictor variables
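The definitions above can be sketched in a few lines of Python. The coefficients, predictor values, and observed score below are hypothetical numbers chosen only to illustrate the arithmetic; they do not come from the lecture.

```python
def predict(b0, coefs, xs):
    """Return the predicted value: B0 plus the sum of each Bk times Xk."""
    return b0 + sum(b * x for b, x in zip(coefs, xs))

# Hypothetical values for one case with k = 3 predictors.
b0 = 10.0
coefs = [2.0, -1.0, 0.5]   # B1, B2, B3
xs = [3.0, 4.0, 2.0]       # X1, X2, X3
y_observed = 14.0

y_hat = predict(b0, coefs, xs)   # 10 + 6 - 4 + 1 = 13.0
residual = y_observed - y_hat    # Y minus predicted Y = 1.0
```

When every X is 0 the prediction reduces to B0, which is why B0 is read as the predicted value on Y when all X = 0.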

Page 8: Lecture slides stats1.13.l11.air

Model R and R²

•  R = multiple correlation coefficient
   – $R = r_{\hat{Y}Y}$
   – The correlation between the predicted scores and the observed scores

•  R²
   – The percentage of variance in Y explained by the model
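As a minimal sketch of these two definitions, the code below computes R as the correlation between predicted and observed scores and compares R² with the proportion of variance explained. The five observed scores are hypothetical, and the predicted scores are the least-squares predictions for them from a single hypothetical predictor X = 1, 2, 3, 4, 5; none of these numbers come from the lecture.

```python
from math import sqrt

def pearson_r(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sp = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return sp / sqrt(ss_a * ss_b)

# Hypothetical observed scores and their least-squares predictions.
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

R = pearson_r(y_hat, y)      # multiple correlation coefficient
R_squared = R ** 2

# The same quantity as a proportion of variance explained.
mean_y = sum(y) / len(y)
ss_total = sum((v - mean_y) ** 2 for v in y)
ss_residual = sum((obs - pred) ** 2 for obs, pred in zip(y, y_hat))
explained = 1 - ss_residual / ss_total
```

The two routes agree here because the predictions are least-squares predictions; for arbitrary predictions they need not match.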

Page 9: Lecture slides stats1.13.l11.air

Multiple regression: Example

•  Outcome measure (Y)
   – Faculty salary (Y)

•  Predictors (X1, X2, X3)
   – Time since PhD (X1)
   – Number of publications (X2)
   – Gender (X3)

Page 10: Lecture slides stats1.13.l11.air

Summary statistics

                M          SD
Salary          $64,115    $17,110
Time            8.09       5.24
Publications    15.49      7.51

Page 11: Lecture slides stats1.13.l11.air

Multiple regression: Example

•  Gender
   – Male = 0
   – Female = 1

Page 12: Lecture slides stats1.13.l11.air

Multiple regression: Example

•  lm(Salary ~ Time + Pubs + Gender)

•  Ŷ = 46,911 + 1,382(Time) + 502(Pubs) - 3,484(Gender)

Page 13: Lecture slides stats1.13.l11.air

Table of coefficients

          B         SE       t        β       p
B0        46,911
Time      1,382     236      5.86     .42     < .01
Pubs      502       164      3.05     .22     < .01
Gender    -3,484    2,439    -1.43    -.10    .16

Ŷ = 46,911 + 1,382(Time) + 502(Pubs) - 3,484(Gender)
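The t column can be checked against the B and SE columns. The slide does not say how t is obtained; the sketch below assumes the standard relationship t = B / SE and uses only values from the table.

```python
# B and SE copied from the table of coefficients.
coefficients = {
    "Time":   {"B": 1382.0,  "SE": 236.0},
    "Pubs":   {"B": 502.0,   "SE": 164.0},
    "Gender": {"B": -3484.0, "SE": 2439.0},
}

# Assumed relationship: each t value is the coefficient divided by its standard error.
t_values = {name: row["B"] / row["SE"] for name, row in coefficients.items()}

for name, t in t_values.items():
    print(f"{name}: t = {t:.2f}")
```

The results agree with the table to within rounding. The small gap for Pubs presumably reflects that the slide shows rounded B and SE values.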

Page 14: Lecture slides stats1.13.l11.air

Multiple regression: Example

•  What is $46,911?
•  What is $502?
•  Who makes more money, men or women?
•  According to this model, is the gender difference statistically significant?
•  What is the strongest predictor of salary?

Page 15: Lecture slides stats1.13.l11.air

Multiple regression: Example

•  $46,911 is the predicted salary for a male professor who just graduated and has no publications (predicted score when all X = 0)

•  $502 is the predicted change in salary associated with an increase of one publication, holding time since PhD and gender constant (for example, for professors who have been out of school for an average amount of time, averaged across men and women)
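These two interpretations can be checked by plugging values into the fitted equation. The function name and the example values of Time and Pubs below are illustrative choices, not part of the lecture.

```python
def predicted_salary(time, pubs, gender):
    """Fitted equation from the slides; gender is 0 for male, 1 for female."""
    return 46911 + 1382 * time + 502 * pubs - 3484 * gender

# All predictors at 0: a male professor who just graduated with no publications.
baseline = predicted_salary(0, 0, 0)

# One more publication with Time and Gender held fixed (illustrative values).
one_more_pub = predicted_salary(8, 16, 1) - predicted_salary(8, 15, 1)

# Female minus male with Time and Pubs held fixed (illustrative values).
gender_gap = predicted_salary(8, 15, 1) - predicted_salary(8, 15, 0)
```

Because the model has no interaction terms, the one-publication difference is 502 whatever values Time and Gender are fixed at.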

Page 16: Lecture slides stats1.13.l11.air

Multiple regression: Example

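•  Who makes more money, men or women?
   – Trick question: Based on the output we can’t answer this question. The Gender coefficient compares men and women with Time and Pubs held constant; it does not give their raw average salaries.

•  According to this model, is the gender difference statistically significant?
   – No (p = .16)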

Page 17: Lecture slides stats1.13.l11.air

Multiple regression: Example

•  What is the strongest predictor of salary?
   – Time (largest standardized coefficient, β = .42)
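The comparison rests on the standardized coefficients. The slides do not show how β is computed; the sketch below assumes the standard conversion β = B × SD(X) / SD(Y) and uses the summary statistics and B values reported earlier. Gender is left out because its SD is not reported.

```python
sd_salary = 17110.0   # SD of the outcome, from the summary statistics

# Unstandardized B and predictor SD, both taken from the slides.
predictors = {
    "Time": {"B": 1382.0, "SD": 5.24},
    "Pubs": {"B": 502.0,  "SD": 7.51},
}

# Assumed conversion: beta = B * SD(X) / SD(Y).
betas = {name: v["B"] * v["SD"] / sd_salary for name, v in predictors.items()}

# The strongest predictor has the largest absolute standardized coefficient.
strongest = max(betas, key=lambda name: abs(betas[name]))
```

Rounded to two decimals, the values match the β column of the table (.42 and .22).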

Page 18: Lecture slides stats1.13.l11.air

Segment summary

•  Important concepts/topics
   – Multiple regression equation
   – Interpretation of regression coefficients

Page 19: Lecture slides stats1.13.l11.air

END SEGMENT

Page 20: Lecture slides stats1.13.l11.air

Lecture 11 ~ Segment 2

Matrix algebra

Page 21: Lecture slides stats1.13.l11.air

Matrix algebra

•  Important concepts/topics
   – Matrix addition/subtraction/multiplication
   – Special types of matrices
      •  Correlation matrix
      •  Sum of squares / Sum of cross products matrix
      •  Variance / Covariance matrix

Page 22: Lecture slides stats1.13.l11.air

Matrix algebra

•  A matrix is a rectangular table of known or unknown numbers, for example,

$$M = \begin{bmatrix} 1 & 2 \\ 5 & 1 \\ 3 & 4 \\ 4 & 2 \end{bmatrix}$$

Page 23: Lecture slides stats1.13.l11.air

Matrix algebra

•  The size, or order, of a matrix is given by identifying the number of rows and columns. The order of matrix M is 4 x 2.

$$M = \begin{bmatrix} 1 & 2 \\ 5 & 1 \\ 3 & 4 \\ 4 & 2 \end{bmatrix}$$

Page 24: Lecture slides stats1.13.l11.air

Matrix algebra

•  The transpose of a matrix is formed by rewriting its rows as columns

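$$M = \begin{bmatrix} 1 & 2 \\ 5 & 1 \\ 3 & 4 \\ 4 & 2 \end{bmatrix}, \qquad M^T = \begin{bmatrix} 1 & 5 & 3 & 4 \\ 2 & 1 & 4 & 2 \end{bmatrix}$$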
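A minimal sketch of order and transpose, storing the matrix as a list of rows. The helper name `transpose` is an illustrative choice.

```python
# Matrix M from the slide, stored as a list of rows.
M = [[1, 2], [5, 1], [3, 4], [4, 2]]

def transpose(matrix):
    """Rewrite the rows of a matrix as columns."""
    return [list(column) for column in zip(*matrix)]

order = (len(M), len(M[0]))   # (rows, columns) = (4, 2)
M_T = transpose(M)            # order (2, 4)
```

Transposing twice returns the original matrix.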

Page 25: Lecture slides stats1.13.l11.air

Matrix algebra

•  Two matrices may be added or subtracted only if they are of the same order

$$N = \begin{bmatrix} 2 & 3 \\ 4 & 5 \\ 1 & 2 \\ 3 & 1 \end{bmatrix}$$

Page 26: Lecture slides stats1.13.l11.air

Matrix algebra

•  Two matrices may be added or subtracted only if they are of the same order

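$$M + N = \begin{bmatrix} 1 & 2 \\ 5 & 1 \\ 3 & 4 \\ 4 & 2 \end{bmatrix} + \begin{bmatrix} 2 & 3 \\ 4 & 5 \\ 1 & 2 \\ 3 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 5 \\ 9 & 6 \\ 4 & 6 \\ 7 & 3 \end{bmatrix}$$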
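Element-wise addition with the same-order check can be sketched as follows. The helper name `add` is an illustrative choice.

```python
M = [[1, 2], [5, 1], [3, 4], [4, 2]]
N = [[2, 3], [4, 5], [1, 2], [3, 1]]

def add(a, b):
    """Add two matrices element by element; both must have the same order."""
    same_rows = len(a) == len(b)
    same_cols = all(len(ra) == len(rb) for ra, rb in zip(a, b))
    if not (same_rows and same_cols):
        raise ValueError("matrices must be of the same order")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

total = add(M, N)
```

Subtraction works the same way with `x - y` in place of `x + y`.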

Page 27: Lecture slides stats1.13.l11.air

Matrix algebra

•  Two matrices may be multiplied when the number of columns in the first matrix is equal to the number of rows in the second matrix.

•  If so, then we say they are conformable for matrix multiplication.

Page 28: Lecture slides stats1.13.l11.air

Matrix algebra

•  Matrix multiplication:

$$R = M^T N, \qquad R_{ij} = \sum_{k} (M^T)_{ik} N_{kj}$$

Page 29: Lecture slides stats1.13.l11.air

Matrix algebra

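$$R = M^T N = \begin{bmatrix} 1 & 5 & 3 & 4 \\ 2 & 1 & 4 & 2 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ 4 & 5 \\ 1 & 2 \\ 3 & 1 \end{bmatrix} = \begin{bmatrix} 37 & 38 \\ 18 & 21 \end{bmatrix}$$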
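The product above can be reproduced with a direct implementation of the sum over k. The helper name `matmul` is an illustrative choice.

```python
M_T = [[1, 5, 3, 4], [2, 1, 4, 2]]      # 2 x 4
N = [[2, 3], [4, 5], [1, 2], [3, 1]]    # 4 x 2

def matmul(a, b):
    """Multiply conformable matrices: element (i, j) is the sum over k of a[i][k] * b[k][j]."""
    if len(a[0]) != len(b):
        raise ValueError("columns of the first must equal rows of the second")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

R = matmul(M_T, N)   # 2 x 2 result
```

A 2 x 4 matrix times a 4 x 2 matrix gives a 2 x 2 result: the inner dimensions must match, and the outer dimensions give the order of the product.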

Page 30: Lecture slides stats1.13.l11.air

Matrix algebra

•  In the next ten slides we will go from a raw dataframe to a correlation matrix!

Page 31: Lecture slides stats1.13.l11.air

Raw dataframe

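$$X_{np} = \begin{bmatrix} 3 & 2 & 3 \\ 3 & 2 & 3 \\ 2 & 4 & 4 \\ 4 & 3 & 4 \\ 4 & 4 & 3 \\ 5 & 4 & 3 \\ 2 & 5 & 4 \\ 3 & 3 & 2 \\ 5 & 3 & 4 \\ 3 & 5 & 4 \end{bmatrix}$$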

Page 32: Lecture slides stats1.13.l11.air

Row vector of sums

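$$T_{1p} = \mathbf{1}_{1n} X_{np} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 3 & 2 & 3 \\ 3 & 2 & 3 \\ 2 & 4 & 4 \\ 4 & 3 & 4 \\ 4 & 4 & 3 \\ 5 & 4 & 3 \\ 2 & 5 & 4 \\ 3 & 3 & 2 \\ 5 & 3 & 4 \\ 3 & 5 & 4 \end{bmatrix} = \begin{bmatrix} 34 & 35 & 34 \end{bmatrix}$$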

Page 33: Lecture slides stats1.13.l11.air

Row vector of means

$$M_{1p} = T_{1p} N^{-1} = \begin{bmatrix} 34 & 35 & 34 \end{bmatrix} \cdot 10^{-1} = \begin{bmatrix} 3.4 & 3.5 & 3.4 \end{bmatrix}$$
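The sums and means can be reproduced by pre-multiplying the raw data by a row vector of ones and then dividing by the number of cases. This is a plain-Python sketch; the variable and helper names are illustrative.

```python
# Raw data from the slides: n = 10 cases (rows), p = 3 variables (columns).
X = [
    [3, 2, 3], [3, 2, 3], [2, 4, 4], [4, 3, 4], [4, 4, 3],
    [5, 4, 3], [2, 5, 4], [3, 3, 2], [5, 3, 4], [3, 5, 4],
]

def matmul(a, b):
    """Multiply two conformable matrices stored as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("columns of the first must equal rows of the second")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

n = len(X)
ones_row = [[1] * n]               # 1 x n row vector of ones
T = matmul(ones_row, X)            # 1 x p row vector of sums
M_row = [[t / n for t in T[0]]]    # 1 x p row vector of means
```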

Page 34: Lecture slides stats1.13.l11.air

Matrix of means

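$$M_{np} = \mathbf{1}_{n1} M_{1p} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \begin{bmatrix} 3.4 & 3.5 & 3.4 \end{bmatrix} = \begin{bmatrix} 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \end{bmatrix}$$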

Page 35: Lecture slides stats1.13.l11.air

Matrix of deviation scores

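$$D_{np} = X_{np} - M_{np} = \begin{bmatrix} 3 & 2 & 3 \\ 3 & 2 & 3 \\ 2 & 4 & 4 \\ 4 & 3 & 4 \\ 4 & 4 & 3 \\ 5 & 4 & 3 \\ 2 & 5 & 4 \\ 3 & 3 & 2 \\ 5 & 3 & 4 \\ 3 & 5 & 4 \end{bmatrix} - \begin{bmatrix} 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \end{bmatrix} = \begin{bmatrix} -.4 & -1.5 & -.4 \\ -.4 & -1.5 & -.4 \\ -1.4 & .5 & .6 \\ .6 & -.5 & .6 \\ .6 & .5 & -.4 \\ 1.6 & .5 & -.4 \\ -1.4 & 1.5 & .6 \\ -.4 & -.5 & -1.4 \\ 1.6 & -.5 & .6 \\ -.4 & 1.5 & .6 \end{bmatrix}$$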
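A short sketch of the deviation-score step: subtract each column's mean from every score in that column. Variable names are illustrative.

```python
# Raw data from the slides: 10 cases (rows), 3 variables (columns).
X = [
    [3, 2, 3], [3, 2, 3], [2, 4, 4], [4, 3, 4], [4, 4, 3],
    [5, 4, 3], [2, 5, 4], [3, 3, 2], [5, 3, 4], [3, 5, 4],
]
n, p = len(X), len(X[0])

# Column means, then deviation scores (score minus its column mean).
means = [sum(row[j] for row in X) / n for j in range(p)]
D = [[row[j] - means[j] for j in range(p)] for row in X]

# Deviation scores in each column sum to zero (up to floating-point error).
column_sums = [sum(row[j] for row in D) for j in range(p)]
```

Checking that each column of D sums to zero is a quick way to catch an arithmetic slip in this step.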

Page 36: Lecture slides stats1.13.l11.air

SS / SP matrix

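$$S_{xx} = D_{np}^{T} D_{np} = \begin{bmatrix} -.4 & -.4 & -1.4 & .6 & .6 & 1.6 & -1.4 & -.4 & 1.6 & -.4 \\ -1.5 & -1.5 & .5 & -.5 & .5 & .5 & 1.5 & -.5 & -.5 & 1.5 \\ -.4 & -.4 & .6 & .6 & -.4 & -.4 & .6 & -1.4 & .6 & .6 \end{bmatrix} \begin{bmatrix} -.4 & -1.5 & -.4 \\ -.4 & -1.5 & -.4 \\ -1.4 & .5 & .6 \\ .6 & -.5 & .6 \\ .6 & .5 & -.4 \\ 1.6 & .5 & -.4 \\ -1.4 & 1.5 & .6 \\ -.4 & -.5 & -1.4 \\ 1.6 & -.5 & .6 \\ -.4 & 1.5 & .6 \end{bmatrix} = \begin{bmatrix} 10.4 & -2.0 & -.6 \\ -2.0 & 10.5 & 3.0 \\ -.6 & 3.0 & 4.4 \end{bmatrix}$$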
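The SS / SP matrix can be reproduced from the raw data. Element (a, b) of the product of the transposed deviation matrix with the deviation matrix is the sum over cases of the product of the deviation scores on variables a and b. Variable names are illustrative.

```python
# Raw data from the slides: 10 cases (rows), 3 variables (columns).
X = [
    [3, 2, 3], [3, 2, 3], [2, 4, 4], [4, 3, 4], [4, 4, 3],
    [5, 4, 3], [2, 5, 4], [3, 3, 2], [5, 3, 4], [3, 5, 4],
]
n, p = len(X), len(X[0])
means = [sum(row[j] for row in X) / n for j in range(p)]
D = [[row[j] - means[j] for j in range(p)] for row in X]

# Sums of squares on the diagonal, sums of cross products off the diagonal.
S = [[sum(row[a] * row[b] for row in D) for b in range(p)] for a in range(p)]

S_rounded = [[round(v, 1) for v in row] for row in S]
```

The diagonal holds the sums of squares and the off-diagonal holds the sums of cross products, so the matrix is symmetric.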

Page 37: Lecture slides stats1.13.l11.air

Variance / Covariance matrix

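$$C_{xx} = S_{xx} N^{-1} = \begin{bmatrix} 10.4 & -2.0 & -.6 \\ -2.0 & 10.5 & 3.0 \\ -.6 & 3.0 & 4.4 \end{bmatrix} \cdot 10^{-1} = \begin{bmatrix} 1.04 & -.20 & -.06 \\ -.20 & 1.05 & .30 \\ -.06 & .30 & .44 \end{bmatrix}$$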

Page 38: Lecture slides stats1.13.l11.air

SD matrix

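$$S_{xx} = (\mathrm{Diag}(C_{xx}))^{1/2} = \begin{bmatrix} 1.02 & 0 & 0 \\ 0 & 1.02 & 0 \\ 0 & 0 & .66 \end{bmatrix}$$

(Here $S_{xx}$ denotes the diagonal matrix of standard deviations, not the SS / SP matrix.)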

Page 39: Lecture slides stats1.13.l11.air

Correlation matrix

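$$R_{xx} = S_{xx}^{-1} C_{xx} S_{xx}^{-1} = \begin{bmatrix} 1.02^{-1} & 0 & 0 \\ 0 & 1.02^{-1} & 0 \\ 0 & 0 & .66^{-1} \end{bmatrix} \begin{bmatrix} 1.04 & -.20 & -.06 \\ -.20 & 1.05 & .30 \\ -.06 & .30 & .44 \end{bmatrix} \begin{bmatrix} 1.02^{-1} & 0 & 0 \\ 0 & 1.02^{-1} & 0 \\ 0 & 0 & .66^{-1} \end{bmatrix} = \begin{bmatrix} 1.00 & -.19 & -.09 \\ -.19 & 1.00 & .44 \\ -.09 & .44 & 1.00 \end{bmatrix}$$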
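The last three steps (variance / covariance matrix, SD matrix, correlation matrix) can be sketched starting from the SS / SP values on the slides. Note that the slides divide by N = 10 rather than N - 1. Variable names are illustrative.

```python
from math import sqrt

# SS / SP matrix from the slides and the number of cases.
S = [[10.4, -2.0, -0.6],
     [-2.0, 10.5, 3.0],
     [-0.6, 3.0, 4.4]]
n = 10
p = len(S)

# Variance / covariance matrix: divide every element by N, as on the slides.
C = [[s / n for s in row] for row in S]

# Standard deviations: square roots of the diagonal of C.
sd = [sqrt(C[j][j]) for j in range(p)]

# Pre- and post-multiplying C by the inverse of the diagonal SD matrix
# divides element (a, b) by sd[a] * sd[b].
R = [[C[a][b] / (sd[a] * sd[b]) for b in range(p)] for a in range(p)]
```

Because the SD matrix is diagonal, the two matrix multiplications reduce to dividing each covariance by the product of the two standard deviations.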

Page 40: Lecture slides stats1.13.l11.air

Matrix algebra

•  Important concepts/topics
   – Matrix addition/subtraction/multiplication
   – Special types of matrices
      •  Correlation matrix
      •  Sum of squares / Sum of cross products matrix
      •  Variance / Covariance matrix

Page 41: Lecture slides stats1.13.l11.air

END SEGMENT

Page 42: Lecture slides stats1.13.l11.air

Lecture 11 ~ Segment 3

Estimation of coefficients

Page 43: Lecture slides stats1.13.l11.air

Estimation of coefficients

•  The values of the coefficients (B) are estimated such that the model yields optimal predictions
   – Minimize the residuals!

Page 44: Lecture slides stats1.13.l11.air

Estimation of coefficients

•  The sum of the squared (SS) residuals is minimized
   – $SS_{RESIDUAL} = \sum (\hat{Y} - Y)^2$

Page 45: Lecture slides stats1.13.l11.air

Estimation of coefficients

•  Standardized and in matrix form, the regression equation is Ŷ = B(X), where
   – Ŷ is a [N x 1] vector
      •  N = number of cases
   – B is a [k x 1] vector
      •  k = number of predictors
   – X is a [N x k] matrix

Page 46: Lecture slides stats1.13.l11.air

Estimation of coefficients

•  Ŷ = B(X)
   – To solve for B
   – Replace Ŷ with Y
   – Conform for matrix multiplication:
      •  Y = X(B)

Page 47: Lecture slides stats1.13.l11.air

Estimation of coefficients

•  Y = X(B)
•  Now let’s make X square and symmetric
•  To do this, pre-multiply both sides of the equation by the transpose of X, $X^T$ (the product $X^T X$ is square and symmetric)

Page 48: Lecture slides stats1.13.l11.air

Estimation of coefficients

•  Y = X(B) becomes
•  $X^T Y = X^T X B$
•  Now, to solve for B, eliminate $X^T X$
•  To do this, pre-multiply by the inverse, $(X^T X)^{-1}$

Page 49: Lecture slides stats1.13.l11.air

Estimation of coefficients

•  $X^T Y = X^T X B$ becomes
•  $(X^T X)^{-1} X^T Y = (X^T X)^{-1} X^T X B$
   – Note that $(X^T X)^{-1} (X^T X) = I$
   – And $IB = B$

•  Therefore, $(X^T X)^{-1} X^T Y = B$
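The slides reach this result by replacing Ŷ with Y and pre-multiplying. As a complementary sketch (a standard least-squares argument, not shown on the slides), the same equation follows from minimizing the sum of squared residuals directly:

$$SS_{RESIDUAL}(B) = (Y - XB)^T (Y - XB)$$

$$\frac{\partial\, SS_{RESIDUAL}}{\partial B} = -2 X^T (Y - XB) = 0 \;\Rightarrow\; X^T X B = X^T Y \;\Rightarrow\; B = (X^T X)^{-1} X^T Y$$

The last step assumes $X^T X$ is invertible, which requires that no predictor is an exact linear combination of the others.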

Page 50: Lecture slides stats1.13.l11.air

Estimation of coefficients

•  $B = (X^T X)^{-1} X^T Y$

Page 51: Lecture slides stats1.13.l11.air

Estimation of coefficients

•  $B = (X^T X)^{-1} X^T Y$
•  Let’s use this formula to calculate B’s from the raw data matrix used in the previous segment

Page 52: Lecture slides stats1.13.l11.air

Raw data matrix

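$$X_{np} = \begin{bmatrix} 3 & 2 & 3 \\ 3 & 2 & 3 \\ 2 & 4 & 4 \\ 4 & 3 & 4 \\ 4 & 4 & 3 \\ 5 & 4 & 3 \\ 2 & 5 & 4 \\ 3 & 3 & 2 \\ 5 & 3 & 4 \\ 3 & 5 & 4 \end{bmatrix}$$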

Page 53: Lecture slides stats1.13.l11.air

Row vector of sums

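$$T_{1p} = \mathbf{1}_{1n} X_{np} = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 3 & 2 & 3 \\ 3 & 2 & 3 \\ 2 & 4 & 4 \\ 4 & 3 & 4 \\ 4 & 4 & 3 \\ 5 & 4 & 3 \\ 2 & 5 & 4 \\ 3 & 3 & 2 \\ 5 & 3 & 4 \\ 3 & 5 & 4 \end{bmatrix} = \begin{bmatrix} 34 & 35 & 34 \end{bmatrix}$$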

Page 54: Lecture slides stats1.13.l11.air

Row vector of means

$$M_{1p} = T_{1p} N^{-1} = \begin{bmatrix} 34 & 35 & 34 \end{bmatrix} \cdot 10^{-1} = \begin{bmatrix} 3.4 & 3.5 & 3.4 \end{bmatrix}$$

Page 55: Lecture slides stats1.13.l11.air

Matrix of means

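$$M_{np} = \mathbf{1}_{n1} M_{1p} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \begin{bmatrix} 3.4 & 3.5 & 3.4 \end{bmatrix} = \begin{bmatrix} 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \end{bmatrix}$$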

Page 56: Lecture slides stats1.13.l11.air

Matrix of deviation scores

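$$D_{np} = X_{np} - M_{np} = \begin{bmatrix} 3 & 2 & 3 \\ 3 & 2 & 3 \\ 2 & 4 & 4 \\ 4 & 3 & 4 \\ 4 & 4 & 3 \\ 5 & 4 & 3 \\ 2 & 5 & 4 \\ 3 & 3 & 2 \\ 5 & 3 & 4 \\ 3 & 5 & 4 \end{bmatrix} - \begin{bmatrix} 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \\ 3.4 & 3.5 & 3.4 \end{bmatrix} = \begin{bmatrix} -.4 & -1.5 & -.4 \\ -.4 & -1.5 & -.4 \\ -1.4 & .5 & .6 \\ .6 & -.5 & .6 \\ .6 & .5 & -.4 \\ 1.6 & .5 & -.4 \\ -1.4 & 1.5 & .6 \\ -.4 & -.5 & -1.4 \\ 1.6 & -.5 & .6 \\ -.4 & 1.5 & .6 \end{bmatrix}$$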

Page 57: Lecture slides stats1.13.l11.air

SS & SP matrix

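$$S_{xx} = D_{np}^{T} D_{np} = \begin{bmatrix} -.4 & -.4 & -1.4 & .6 & .6 & 1.6 & -1.4 & -.4 & 1.6 & -.4 \\ -1.5 & -1.5 & .5 & -.5 & .5 & .5 & 1.5 & -.5 & -.5 & 1.5 \\ -.4 & -.4 & .6 & .6 & -.4 & -.4 & .6 & -1.4 & .6 & .6 \end{bmatrix} \begin{bmatrix} -.4 & -1.5 & -.4 \\ -.4 & -1.5 & -.4 \\ -1.4 & .5 & .6 \\ .6 & -.5 & .6 \\ .6 & .5 & -.4 \\ 1.6 & .5 & -.4 \\ -1.4 & 1.5 & .6 \\ -.4 & -.5 & -1.4 \\ 1.6 & -.5 & .6 \\ -.4 & 1.5 & .6 \end{bmatrix} = \begin{bmatrix} 10.4 & -2.0 & -.6 \\ -2.0 & 10.5 & 3.0 \\ -.6 & 3.0 & 4.4 \end{bmatrix}$$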

Page 58: Lecture slides stats1.13.l11.air

SS & SP matrix

Since we used deviation scores:
   – Substitute $S_{xx}$ for $X^T X$
   – Substitute $S_{xy}$ for $X^T Y$

Therefore, $B = (S_{xx})^{-1} S_{xy}$

Page 59: Lecture slides stats1.13.l11.air

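Estimation of coefficients

$$B = (S_{xx})^{-1} S_{xy} = \begin{bmatrix} 10.5 & 3.0 \\ 3.0 & 4.4 \end{bmatrix}^{-1} \begin{bmatrix} -2.0 \\ -.6 \end{bmatrix} = \begin{bmatrix} -0.19 \\ -0.01 \end{bmatrix}$$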
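The numbers above can be checked with a small sketch. It assumes the reading the slide's values imply but do not state: the first column of the raw data is the outcome Y, and the second and third columns are the two predictors. Under that reading, the predictor block of the SS / SP matrix and the outcome-by-predictor cross products are as entered below. The closed-form 2 x 2 inverse is an illustrative choice.

```python
# Assumed partition of the 3 x 3 SS / SP matrix (column 1 = outcome):
S_xx = [[10.5, 3.0],
        [3.0, 4.4]]     # predictors with predictors
S_xy = [-2.0, -0.6]     # predictors with outcome

# Closed-form inverse of a 2 x 2 matrix.
det = S_xx[0][0] * S_xx[1][1] - S_xx[0][1] * S_xx[1][0]
inv = [[ S_xx[1][1] / det, -S_xx[0][1] / det],
       [-S_xx[1][0] / det,  S_xx[0][0] / det]]

# B is the inverse of S_xx times S_xy.
B = [inv[0][0] * S_xy[0] + inv[0][1] * S_xy[1],
     inv[1][0] * S_xy[0] + inv[1][1] * S_xy[1]]
```

Rounded to two decimals, B matches the values on the slide.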

Page 60: Lecture slides stats1.13.l11.air

Estimation of coefficients

Page 61: Lecture slides stats1.13.l11.air

END SEGMENT

Page 62: Lecture slides stats1.13.l11.air

END LECTURE 11
