Getting More out of Multiple Regression Darren Campbell, PhD.


Page 1: Getting More out of Multiple Regression Darren Campbell, PhD.

Getting More out of Multiple Regression

Darren Campbell, PhD

Page 2

Overview

View on Teaching Statistics
When to Apply
How to Use & How to Interpret

Page 3

Multiple Regression Techniques

1. Centring: removing group-difference confounds

2. Centring: interpreting continuous interactions

3. Spline functions: piecemeal polynomials

Estimate a separate slope for each segment (angle) of the regression polynomial

Page 4

Perks of Multiple Regression

1. Realistic: many influences on behaviour
2. Control over confounds
3. Test for relative importance
4. Identify interactions

Page 5

Why Not Use ANOVAs?

Not realistic: many behaviours / constructs are continuous, e.g., intelligence, personality

Loss of statistical power from categories: scores within a category are assumed to be the same + error, mixing systematic patterns into the error term

Page 6

What is Centring?

Simple re-scaling of raw scores: Raw Score minus Some Constant value

x1 – 5.1:
1 – 5.1 = -4.1
4 – 5.1 = -1.1

x2 – 29.4:
30 – 29.4 = 0.6
35 – 29.4 = 5.6
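The re-scaling above can be sketched in a few lines of Python. This is only an illustration: the helper name `centre` is made up here, and the two constants are the ones shown on the slide.

```python
def centre(scores, constant):
    """Return each raw score minus the chosen constant (rounded for display)."""
    return [round(x - constant, 2) for x in scores]

# Constants 5.1 and 29.4 are the ones used on the slide.
x1_centred = centre([1, 4], 5.1)
x2_centred = centre([30, 35], 29.4)

print(x1_centred)  # [-4.1, -1.1]
print(x2_centred)  # [0.6, 5.6]
```

Which constant is subtracted is the only thing that distinguishes one type of centring from another.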

Page 7

A Simple Case for Centring

Babies:
Cry & Fuss – parent-report diary measures
Flail about – limb movement

Are these 2 infant behaviours related? Emotional Responses & Emotion Regulation

Page 8

A Simple Case for Centring

Age             Moves / Hr    Crying Hrs/Day
6 week olds        5.1            4.7
6 month olds      29.4            3.5
Full Sample       17.2            4.1

Are these 2 infant behaviours related?

Page 9

6 Week-Olds

r = +.47

some infants cry more & move more

others cry less & move less

[Scatterplot: 6 week-old infants. X-axis: Activity - limb movements (0 to 10); Y-axis: Hours of Crying (0 to 9)]

Page 10

6 Month-Olds

r = +.38

some infants cry more & move more

others cry less & move less

What if we combine the two groups?

[Scatterplot: 6 month-old infants. X-axis: Activity - limb movements (25 to 40); Y-axis: Hours of Crying (0 to 7)]

Page 11

• Full sample r = -0.22

[Scatterplot: 6 week-olds & 6 month-old infants. X-axis: Activity - limb movements (0 to 40); Y-axis: Hours of Crying (0 to 9)]

• Do we get a significant corr? If so, what kind?

Page 12

What happened with the Correlations?

6 Week-Olds: r = +.47
6 Month-Olds: r = +.38
6 Week & 6 Month-Olds: r = -0.22

Page 13

Correlations = Grand Mean Centring

1) Mean Deviations for each variable: X & Y
2) Rank Order Mean Deviations
3) Correlate 2 rank orders of X & Y
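A minimal sketch of why a correlation amounts to grand mean centring. The scores are made up for illustration, and the function follows the standard Pearson formula, which multiplies the mean deviations directly rather than literally ranking them.

```python
import math

def pearson_r(x, y):
    """Pearson r built from grand-mean deviations of x and y."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [v - mean_x for v in x]          # step 1: mean deviations for X
    dev_y = [v - mean_y for v in y]          # step 1: mean deviations for Y
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    return cross / math.sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))

# Hypothetical scores, not the study's data.
activity = [1, 3, 5, 7, 9]
crying = [4, 5, 4, 6, 6]

r = pearson_r(activity, crying)
r_shifted = pearson_r([v + 100 for v in activity], crying)
print(round(r, 2), round(r_shifted, 2))  # 0.79 0.79
```

Adding a constant to every score leaves r unchanged, because the grand mean is subtracted before anything else happens.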

Page 14

The Disappearing Correlation Explained

Grand Mean Centring led to all the older infants being classified as high movers and all the young infants as low movers

Young high criers & high movers -> high criers & low movers

Large Group differences in movement altered the detection of within-group r’s
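The disappearing correlation can be reproduced with a small sketch. The scores below are invented to mimic the pattern (a large group difference in movement, a small one in crying); they are not the infant data.

```python
import math

def pearson_r(x, y):
    """Pearson r from grand-mean deviations."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical data: older infants move far more and cry a little less.
young_move, young_cry = [1, 3, 5, 7, 9], [4, 5, 4, 6, 6]
old_move, old_cry = [25, 27, 29, 31, 33], [2, 3, 2, 4, 4]

r_young = pearson_r(young_move, young_cry)
r_old = pearson_r(old_move, old_cry)
r_pooled = pearson_r(young_move + old_move, young_cry + old_cry)

print(round(r_young, 2), round(r_old, 2), round(r_pooled, 2))
```

Both within-group correlations are positive, yet the pooled correlation is negative, because relative to the grand mean every older infant counts as a high mover.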

What should we do?

Page 15

Solution: Create Group Mean Deviations

Re-scale raw scores: Raw – Group Mean
6 week-olds: xs – 5.1
6 month-olds: xs – 29.4

Page 16

Solution: Create Group Mean Deviations

Crying    Raw AL    Group Means    Group Centred AL
5.7         1         -5.11           -4.11
6           4         -5.11           -1.11
2           5         -5.11           -0.11
0.5        30         -29.4            0.63
2.5        35         -29.4            5.63
2          34         -29.4            4.63
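The table's last column can be reproduced with a short sketch. One assumption to flag: the slide shows the 6 month-old mean as 29.4, but the centred values (e.g. 30 giving 0.63) imply an unrounded mean of about 29.37, so that inferred value is used here.

```python
# (crying, raw activity level, group mean) for the six rows on the slide.
# 29.37 is inferred from the slide's centred values, not stated on it.
rows = [
    (5.7, 1, 5.11), (6.0, 4, 5.11), (2.0, 5, 5.11),        # 6 week-olds
    (0.5, 30, 29.37), (2.5, 35, 29.37), (2.0, 34, 29.37),  # 6 month-olds
]

# Group mean centring: each raw score minus its own group's mean.
group_centred = [round(raw - group_mean, 2) for _, raw, group_mean in rows]
print(group_centred)  # [-4.11, -1.11, -0.11, 0.63, 5.63, 4.63]
```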

Page 17

• Raw Scores

[Scatterplot: 6 week-olds & 6 month-old infants. X-axis: Activity - limb movements (0 to 40); Y-axis: Hours of Crying (0 to 9)]

Page 18

Group Centred Scores

Group mean data r = .41 - full sample

Multiple Regression could also work on uncentred variables: Crying = Group + Uncentred AL

Not a Group x AL interaction – the relation is the same for both groups

[Scatterplot of group-centred scores, with separate markers for 6 Weeks Old and 6 Months Old. X-axis: Limb Movements / 48 Hrs (-10 to 10); Y-axis: Hours of Crying / 48 Hrs (0 to 9)]
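A rough sketch of the "Crying = Group + Uncentred AL" alternative, again on invented scores rather than the study data. The small `ols` helper is written here only to keep the example self-contained; any regression routine would do.

```python
def ols(columns, y):
    """Least-squares coefficients via the normal equations (intercept first)."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                A[r] = [vr - A[r][c] * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

# Hypothetical data: group 0 = 6 week-olds, group 1 = 6 month-olds.
group = [0] * 5 + [1] * 5
activity = [1, 3, 5, 7, 9, 25, 27, 29, 31, 33]
crying = [4, 5, 4, 6, 6, 2, 3, 2, 4, 4]

_, slope_raw = ols([activity], crying)             # Crying = AL
_, b_group, b_al = ols([group, activity], crying)  # Crying = Group + AL

# Group-centred AL: subtract each group's own mean (5 and 29 in this toy data).
al_centred = [a - (5 if g == 0 else 29) for a, g in zip(activity, group)]
_, slope_centred = ols([al_centred], crying)       # Crying = Group-centred AL

print(round(slope_raw, 3), round(b_al, 3), round(slope_centred, 3))
```

In this toy data the uncentred slope is negative, while the Group + AL model and the group-centred model agree on the same positive within-group slope.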

Page 19

Centring so far

1. Centring is Magic
2. Different types of centring, depending on the number used to re-scale the data
Grand mean – Pearson Correlations
Group Means – Infant Limb Movements

Page 20

Regression Interactions Centring

Great for Interpreting Interactions
Regression interactions are trickier than for ANOVAs:
they do not have pre-defined levels or groups
they are based on 2+ continuous vars

Page 21

Multiple Regression - the Basics

The Basic Equation:
Y = a + b1*X1 + b2*X2 + b3*X3 + e
Outcome = Intercept + Beta1 * predictor1 + Beta2 * predictor2 + Beta3 * predictor3 + Error

a = expected mean response of Y when every X = 0
betas: for every 1-unit change in X you get a beta-sized change in Y

Page 22

Regression Interactions Centring: Reducing multicollinearity

interaction predictor = x1 * x2
x1 & x2 numbers near 0 stay near 0, and high x1 & x2 numbers get really high
the interaction term is highly correlated with the original x1 & x2 variables

Centring gives each predictor (x1 & x2) more moderate numbers above and below zero: positive and negative numbers

This reduces the multiplicative exaggeration between x1 & x2 and the interaction product x1*x2
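A small sketch of the effect, using made-up predictor values: the correlation between x1 and the x1*x2 product is computed once on the raw scores and once after mean-centring both predictors.

```python
import math

def pearson_r(x, y):
    """Pearson r from grand-mean deviations."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cross / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# Hypothetical predictor scores (all positive, as raw scores often are).
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [3, 1, 4, 1, 5, 9, 2, 6]

# Raw product term vs. product of the mean-centred predictors.
product_raw = [a * b for a, b in zip(x1, x2)]
m1, m2 = sum(x1) / len(x1), sum(x2) / len(x2)
x1c = [a - m1 for a in x1]
x2c = [b - m2 for b in x2]
product_centred = [a * b for a, b in zip(x1c, x2c)]

r_raw = pearson_r(x1, product_raw)
r_centred = pearson_r(x1c, product_centred)
print(round(r_raw, 2), round(r_centred, 2))
```

With these numbers the raw product correlates strongly with x1, while the centred product is close to uncorrelated with it.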

Page 23

Centring to reduce Multicollinearity

[Two scatterplots of x1 (X-axis) against the x1*x2 product (Y-axis). Panel 1: X1 with X1*X2 multicollinearity, Original Variables. Panel 2: X1 with X1*X2 multicollinearity, Centred Variables]

Page 24

Regression

Y = a + b1*X1 + b2*X2 + b3*X1*X2 + e

How does X2 relate to Y at different levels of X1?

How does predictor 2 (shyness) relate to the outcome (social interactions) at different stress levels (X1)?

Page 25

Means (SDs):
Uncentred Data: X1 = 26.2 (14.5), X2 = 24.8 (27.6)
Centred Data: X1 = 0.0 (14.5), X2 = 0.0 (27.6)

Correlation Matrix:

Uncentred    x1     x2        x12       y
x1           --     0.58**    0.65**    0.14**
x2                  --        0.96**    0.28**
x12                           --        0.34**

Centred      x1c    x2c       x12c      y
x1c          --     0.58**    0.11      0.14*
x2c                 --        0.66**    0.28**
x12c                          --        0.34**

** p = .01
* p = .05

Page 26

Regression Equation Results

No Interaction: Y = b0 + b1 * X1 + b2 * X2

Uncentred: Y = 1164.8 – 4 X1 + 20 X2 **

Centred: Y = 1550.8 – 4 X1 + 20 X2 **
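A quick arithmetic check of why only the intercept moves. It uses the coefficients on this slide and the predictor means reported with the correlation matrix; the slopes are rounded on the slide, so the result is only expected to land near the reported centred intercept.

```python
# Values as reported on the slides (the slopes appear to be rounded there).
b0_uncentred, b1, b2 = 1164.8, -4.0, 20.0
mean_x1, mean_x2 = 26.2, 24.8

# After centring, X = 0 means "at the mean", so the new intercept is the
# predicted Y at the predictor means. The slopes themselves are unchanged.
b0_centred = b0_uncentred + b1 * mean_x1 + b2 * mean_x2
print(round(b0_centred, 1))  # 1556.0
```

That is close to the slide's 1550.8; the gap is plausibly due to the rounded slopes, though the slides do not say.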

Page 27

Regression Equation Results

Interaction Term Included: Y = b0 + b1 * X1 + b2 * X2 + b3 * X1*X2

Uncentred: Y = 1733 – 19.1 X1 – 31.7 X2 ** + 1.26 X1*X2

Centred: Y = 1260 + 12.0 X1 + 1.1 X2 + 1.26 X1*X2
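With an interaction term, centring changes the lower-order coefficients as well as the intercept. Substituting X1 = x1c + M1 and X2 = x2c + M2 into the uncentred equation gives the conversions sketched below, using the slide's uncentred coefficients and the reported means. Treat this as a rough check under rounding, not the author's own computation.

```python
# Uncentred coefficients and predictor means as reported on the slides.
b0, b1, b2, b3 = 1733.0, -19.1, -31.7, 1.26
m1, m2 = 26.2, 24.8

b1_centred = b1 + b3 * m2   # slope of X1 when X2 is at its mean
b2_centred = b2 + b3 * m1   # slope of X2 when X1 is at its mean
b0_centred = b0 + b1 * m1 + b2 * m2 + b3 * m1 * m2  # predicted Y at both means
# b3 itself is unchanged by centring.

print(round(b0_centred, 1), round(b1_centred, 2), round(b2_centred, 2))
```

The results (about 1265, 12.1 and 1.3) sit close to the centred equation's 1260, 12.0 and 1.1, and the interaction coefficient stays at 1.26.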

Page 28

But what does it mean…

How does X2 relate to Y at different levels of X1?

How does predictor 2 (shyness) relate to the outcome (social interactions) at different stress levels (X1)?

Page 29

Post Hocs

Y = b0 + b1 * X1 + b2 * X2 + b3 * X1*X2

Y = ( b0 + b1 * X1 ) + ( b2 + b3 * X1 ) * X2

Re-centre X1 at -1 SD below the X1 Mean & at +1 SD above the X1 Mean:

-1 SD: X1 - (-14.547663) = X1 + 14.547663
+1 SD: X1 - 14.547663
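The rearranged equation can be evaluated directly. The sketch below plugs the centred coefficients from the slides into ( b0 + b1 * X1 ) + ( b2 + b3 * X1 ) * X2 at three levels of centred X1; the function name `simple_line` is made up for this example.

```python
# Centred-equation coefficients and the X1 standard deviation from the slides.
b0, b1, b2, b3 = 1260.0, 12.0, 1.1, 1.26
sd_x1 = 14.547663

def simple_line(x1_value):
    """Intercept and X2 slope of Y at a fixed (centred) value of X1."""
    return b0 + b1 * x1_value, b2 + b3 * x1_value

results = {}
for label, x1_value in [("-1 SD", -sd_x1), ("mean", 0.0), ("+1 SD", sd_x1)]:
    results[label] = simple_line(x1_value)
    intercept, slope = results[label]
    print(label, round(intercept, 1), round(slope, 2))
```

The X2 slope comes out at roughly -17.2 at -1 SD, 1.1 at the mean and 19.4 at +1 SD, in line with the post hoc equations on Page 31. Re-centring X1 and re-running the regression gives the same slopes and also supplies their significance tests, which this arithmetic alone does not.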

Page 30

[Scatterplots: Moving the Y Axis. Three panels, each with X-axis: Movement Hrs/Day and Y-axis: Crying Hrs/Day. Panel titles: AL Mean Centred; AL -1SD Below Mean; AL +1SD Above Mean]

Page 31

-1 SD Below X1 Mean: Y = 1085 - 19.1 X1 - 17.1 X2 + 1.26 X1*X2; t (1,196) = -1.40, p = .16

Centred: Y = 1260 + 12.0 X1 + 1.1 X2 + 1.26 X1*X2; t (1,196) = 0.12, p = .88

+1 SD Above X1 Mean: Y = 1435 - 19.1 X1 + 19.4 X2 ** + 1.26 X1*X2; t (1,196) = 3.66, p = .001

Page 32

Regression Interaction Example

Predicting inhibitory ability with motor activity & age
Simon Says-like games
4 to 6 yr-olds & physical movement

Move by Age interaction: F (1, 81) = 5.9, p < .02
Young (-1.5SD): move beta sig + Inhibition
Middle (Mean): move beta p = .10 ~ Inhibition
Older (+1.5SD): move beta n.s. Inhibition

Page 33

Polynomials, Centring, & Spline Functions

Polynomial relations: quadratic, cubic, etc

Y = a + b1*X1 - b2*X1*X1 + e

[Plot of the quadratic curve. X-axis: -10 to 15; Y-axis: -100 to 250]

Page 34

Curvilinear Pattern

Polynomials assume a symmetric pattern: X squared

But, it may not be ...

Perceived Control (Y) slowly increases & then declines rapidly in old age

[Two plots: one with X-axis 0 to 15 and Y-axis 0 to 500; one with X-axis -10 to 15 and Y-axis -100 to 250]

Page 35

This Brings us to Spline Functions

Split up predictor X into 2+ variables: XLow & XHigh

[Plot: X-axis from -10 to 20; Y-axis from 0 to 250]

XLow = X – (-5) & set values at the next change point to zero. Ditto for XHigh

Re-run Y = a + b1*XLow - b2*XHigh + e
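A minimal sketch of the idea with a single change point. The coding used here (XLow = min(X - knot, 0), XHigh = max(X - knot, 0)) is one common piecewise-linear scheme and is an assumption; the slide's exact re-zeroing may differ. The knot, the data and the `ols` helper are all made up for illustration.

```python
def ols(columns, y):
    """Least-squares coefficients via the normal equations (intercept first)."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                A[r] = [vr - A[r][c] * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

knot = 5.0                                   # hypothetical change point
x = [float(v) for v in range(-5, 16)]
x_low = [min(v - knot, 0.0) for v in x]      # varies below the knot, 0 above it
x_high = [max(v - knot, 0.0) for v in x]     # 0 below the knot, varies above it

# Hypothetical outcome: slow rise (+2 per unit) then rapid decline (-5 per unit).
y = [100.0 + 2.0 * lo - 5.0 * hi for lo, hi in zip(x_low, x_high)]

a, b_low, b_high = ols([x_low, x_high], y)
print(round(a, 2), round(b_low, 2), round(b_high, 2))
```

Each segment gets its own slope, so the gentle rise and the steep decline are estimated separately instead of being forced into one symmetric curve.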

Page 36

Perks of Spline Functions

Estimate slope anywhere along the range

Can be sig on one part - n.s. on another

Steeper or shallower

Page 37

Multiple Regression Techniques

1. Centring: removing group-difference confounds

2. Centring: interpreting continuous interactions

3. Spline functions: more precise understanding of polynomial patterns

Page 38

Questions

• Alpha control procedures for spline functions
– Could it be argued that you are describing the pattern already identified?
– Conservatively, you could apply an alpha control procedure. I like the False Discovery Rate procedures.
– Replication is preferred, but not always possible.
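For readers who have not used a False Discovery Rate procedure, here is a rough sketch of one widely used version, the Benjamini-Hochberg step-up rule. The slide does not say which FDR procedure is meant, so this choice is an assumption, and the p-values are purely illustrative.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one reject flag per p-value under the Benjamini-Hochberg rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Largest rank whose p-value is within its share of alpha.
        if p_values[i] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = set(order[:cutoff_rank])
    return [i in rejected for i in range(m)]

# Illustrative p-values for three segment slopes.
p_values = [0.16, 0.88, 0.001]
flags = benjamini_hochberg(p_values)
print(flags)  # [False, False, True]
```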

Page 39

Alpha Control Aside

• The source of Type 1 errors is typically poorly described.
• Typical: If enough probability tests are run, the probability will increase to the point where something becomes significant just by chance.
– But, probability is linked to the representativeness of your data, and Type 1 error is a proxy for the likelihood of the representativeness of your data.
• My View: The real source of Type 1 errors is that if you
– divide up the data into enough subgroupings
– eventually one of those subgroupings will differ because it is misrepresentative of reality.

Page 40

Standardized vs Centred

• Centred is x – xM

• Standardized is (x – xM) / SDx
– Makes variability for each predictor = 1
– Standardized Beta = raw b * SDx / SDy
– Similar to centring, but the different metric needs to be adjusted for interaction terms

• To get comparable results with an interaction term
– Standardization should be applied to X1 and X2 prior to the X1*X2 estimate; then use the “raw” coefficients
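The Standardized Beta formula can be checked on a toy example with one predictor. The data are invented; the sketch compares the formula with simply re-fitting the slope on z-scored variables.

```python
import statistics

# Hypothetical scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def slope(x, y):
    """Simple-regression slope of y on x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)

def zscores(values):
    """Standardize: (x - xM) / SDx."""
    m, sd = statistics.mean(values), statistics.stdev(values)
    return [(v - m) / sd for v in values]

raw_b = slope(x, y)
beta_formula = raw_b * statistics.stdev(x) / statistics.stdev(y)  # raw b * SDx / SDy
beta_refit = slope(zscores(x), zscores(y))                        # slope on z-scores

print(round(raw_b, 2), round(beta_formula, 4), round(beta_refit, 4))
```

The two routes agree for a single predictor. With an interaction term they only agree if X1 and X2 are standardized before the product is formed, which is the point of the last bullet.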

Page 41

Centring and Spline Functions

Relatively simple procedures

Old dogs in the Statistics World but new tricks for many

That’s All Folks!