Curvilinear Regression Analysis


Transcript of Curvilinear Regression Analysis

Page 1: Curvilinear Regression Analysis


Curvilinear Regression Analysis

Lecture 18

April 7, 2005

Applied Regression Analysis

Page 2: Curvilinear Regression Analysis


Today’s Lecture

ANOVA with a continuous independent variable.

Curvilinear regression analysis.

Interactions with continuous variables.

Page 3: Curvilinear Regression Analysis


An Example

From Pedhazur, p. 513-514:

“Assume that in an experiment on the learning of paired associates, the independent variable is the number of exposures to a list. Specifically, 15 subjects are randomly assigned, in equal numbers, to five levels of exposure to a list, so that one group is given one exposure, a second group is given two exposures, and so on to five exposures for the fifth group. The dependent variable measure is the number of correct responses on a subsequent test.”

Page 4: Curvilinear Regression Analysis


The Analysis

Running an ANOVA (from Analyze...General Linear Model...Univariate in SPSS) produces the output interpreted on the next slide.

Page 5: Curvilinear Regression Analysis


The Interpretation

From the example, we could test the hypothesis:

H0 : µ1 = µ2 = µ3 = µ4 = µ5

Here, F(4,10) = 2.10, which gives a p-value of 0.156.

Using any reasonable Type-I error rate (like 0.05), we would fail to reject the null hypothesis.

We would then conclude that there is no effect of number of exposures on learning (as measured by test score).

Note that representing the five exposure levels in this analysis requires four coded vectors (four degrees of freedom for the numerator).

Page 6: Curvilinear Regression Analysis


A New Analysis

Instead of running an ANOVA to test for differences between the means of the test scores at each level of X, couldn’t we run a linear regression?

In the words of Marv Albert: YES!

For the linear regression to be valid, the means of the levels of X must fall on the linear regression line.

The key point is that the means must follow a linear trend.

Using the difference between the ANOVA and the regression, I will show you how you can test for a linear trend in the analysis.

Page 7: Curvilinear Regression Analysis


Multiple Regression Results

Running a regression (from Analyze...Regression...Linear in SPSS) produces the output interpreted on the next slide.

Page 8: Curvilinear Regression Analysis


Multiple Regression Results

From the example, we could test the hypothesis:

H0 : b1 = 0

Here, F(1,13) = 8.95, which gives a p-value of 0.010.

Using any reasonable Type-I error rate (like 0.05), we would reject the null hypothesis.

We would then conclude that there is a significant relationship between number of exposures and learning (as measured by test score).

This conclusion is different from the conclusion we drew before.

What is different about our analysis?

Page 9: Curvilinear Regression Analysis


SS Differences

Notice from the ANOVA that SS_treatment = 8.40.

From the regression analysis, SS_regression = 7.50.

Note the difference between the two: SS_treatment is larger.

This difference is termed SS_deviation:

SS_deviation = SS_treatment − SS_regression = 0.90.

Take a look at how that difference comes about.

Page 10: Curvilinear Regression Analysis


SS Differences

The estimated regression line is:

Y′ = 2.7 + 0.5X

(Ȳ is the mean test score for the group at each level of X.)

X    N_X    Ȳ      Y′     Ȳ − Y′    (Ȳ − Y′)²    N_X(Ȳ − Y′)²
1    3      3.0    3.2    −0.2      0.04         0.12
2    3      4.0    3.7     0.3      0.09         0.27
3    3      4.0    4.2    −0.2      0.04         0.12
4    3      5.0    4.7     0.3      0.09         0.27
5    3      5.0    5.2    −0.2      0.04         0.12

Σ N_X(Ȳ − Y′)² = 0.90
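For readers working outside SPSS, a minimal sketch of this computation (assuming numpy is available; the group means and the regression line are taken from the table above):

```python
import numpy as np

# Exposure levels, group sizes, and group means of the test scores
x_levels = np.array([1, 2, 3, 4, 5])
n_per_group = 3
group_means = np.array([3.0, 4.0, 4.0, 5.0, 5.0])

# Predicted values from the estimated regression line Y' = 2.7 + 0.5X
y_hat = 2.7 + 0.5 * x_levels

# SS due to deviation from linearity: group-size-weighted squared gaps
# between the group means and the regression line
ss_deviation = np.sum(n_per_group * (group_means - y_hat) ** 2)
print(round(ss_deviation, 2))  # 0.9
```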

Page 11: Curvilinear Regression Analysis


Data Scatterplot

[Scatterplot shown on slide: number of exposures (x-axis) versus number correct (y-axis), with the individual observations and the group means marked.]

Page 12: Curvilinear Regression Analysis


SS Differences

The value obtained on the previous slide, 0.90, is equal to SS_deviation.

SS_deviation is literally a statistic that measures the deviation of the group means from linearity.

This value serves as the basis for the question:

“What is the difference between restricting the data to conform to a linear trend and placing no such restriction?” (Pedhazur, p. 517)

Page 13: Curvilinear Regression Analysis


SS Differences

When SS_treatment is calculated, there is no restriction on the means of the treatment groups.

If the means fall on a straight line, there will be no difference between SS_treatment and SS_regression, so SS_deviation = 0.

With departures from linearity, SS_treatment will be much larger than SS_regression.

Do you feel a statistical hypothesis test coming on?

Page 14: Curvilinear Regression Analysis


Hypothesis Test

SS_treatment can be partitioned into two components: SS_regression (also called the SS due to linearity) and the remainder, the SS due to deviation from linearity.

Source                        df      SS      MS      F
Between Treatments             4    8.40
  Linearity                    1    7.50    7.50    7.50
  Deviation From Linearity     3    0.90    0.30    0.30
Within Treatments             10   10.00    1.00
Total                         14   18.40

If the SS due to linearity leads to a significant F value, then one can conclude that a linear trend exists and that linear regression is appropriate.
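As a minimal sketch of this partition outside SPSS (assuming scipy is available; the sums of squares are plugged in from the table above):

```python
from scipy import stats

# Sums of squares from the example (15 subjects, 5 exposure levels)
ss_regression = 7.50   # SS due to linearity, from the simple regression
ss_treatment = 8.40    # between-treatments SS, from the one-way ANOVA
ss_within = 10.00      # within-treatments (error) SS
df_within = 10

ss_deviation = ss_treatment - ss_regression   # 0.90
ms_within = ss_within / df_within             # 1.00

# F tests: linearity has 1 df, deviation from linearity has k - 2 = 3 df
f_linearity = (ss_regression / 1) / ms_within   # 7.50
f_deviation = (ss_deviation / 3) / ms_within    # 0.30

print(f_linearity, stats.f.sf(f_linearity, 1, df_within))  # significant linear trend
print(f_deviation, stats.f.sf(f_deviation, 3, df_within))  # no significant deviation
```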

Page 15: Curvilinear Regression Analysis


Curvilinear Regression

The preceding example demonstrated how a linear trend could be detected using a statistical hypothesis test.

A linear trend is something we are very familiar with, having encountered linear regression for most of this course.

Curvilinear regression analysis can be used to determine if not-so-linear trends exist between Y and X.

Pedhazur distinguishes between two types of models:

Intrinsically linear.

Intrinsically nonlinear.

Page 16: Curvilinear Regression Analysis


Curvilinear Regression

An intrinsically linear model is one that is linear in its parameters but not linear in the variables.

By transformation, such a model may be reduced to a linear model. A polynomial in X, for example, is curved in X but still linear in its parameters, so it can be estimated by ordinary least squares.

Such models are the focus of the remainder of this lecture.

An intrinsically nonlinear model is one that cannot be coerced into linearity by transformation.

Such models often require more complicated estimation algorithms than those provided by least squares and the GLM.
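As a purely illustrative sketch of an intrinsically linear model that is not a polynomial (the exponential form and the data here are assumptions for the example, not from the lecture): the model Y = a·exp(bX) becomes a simple linear regression after taking logs.

```python
import numpy as np

# Made-up data for illustration only
rng = np.random.default_rng(0)
x = np.linspace(1, 10, 30)
y = 2.0 * np.exp(0.3 * x) * rng.lognormal(sigma=0.05, size=x.size)

# Y = a * exp(b X) is intrinsically linear: ln Y = ln a + b X
slope, intercept = np.polyfit(x, np.log(y), 1)
a_hat, b_hat = np.exp(intercept), slope
print(a_hat, b_hat)  # close to the generating values 2.0 and 0.3
```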

Page 17: Curvilinear Regression Analysis


The Polynomial Model

A simple regression model extension for curved relations is the polynomial model, such as the following second-degree polynomial:

Y′ = a + b1X + b2X²

One could also estimate a third-degree polynomial:

Y′ = a + b1X + b2X² + b3X³

Or a fourth-degree polynomial:

Y′ = a + b1X + b2X² + b3X³ + b4X⁴

And so on...
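In code, a second-degree polynomial is just ordinary least squares on X and X²; a minimal sketch (numpy assumed; the data are placeholders):

```python
import numpy as np

# Placeholder data for illustration only
x = np.array([1., 2., 3., 4., 5., 6.])
y = np.array([3.1, 5.0, 6.2, 6.9, 7.1, 7.0])

# Design matrix for Y' = a + b1*X + b2*X^2: a column of ones, X, and X^2
X = np.column_stack([np.ones_like(x), x, x ** 2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b1, b2 = coef
print(a, b1, b2)
```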

Page 18: Curvilinear Regression Analysis


The Polynomial Model: Estimation

The way to determine the extent to which a given model is applicable is similar to determining whether added variables significantly improve the predictive ability of a regression model.

Beginning with a linear model (a first-degree polynomial), estimate the model; its squared multiple correlation is denoted R²_y.x.

The tests of incremental variance accounted for are done at each level of the polynomial:

Linear: R²_y.x

Quadratic: R²_y.x,x² − R²_y.x

Cubic: R²_y.x,x²,x³ − R²_y.x,x²

Quartic: R²_y.x,x²,x³,x⁴ − R²_y.x,x²,x³
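A sketch of these incremental tests (numpy and scipy assumed; the x and y arrays are placeholders, and the helper r_squared is defined here for the example, not taken from the lecture):

```python
import numpy as np
from scipy import stats

def r_squared(y, predictors):
    """R-squared from an OLS fit of y on the given predictor columns (intercept added)."""
    X = np.column_stack([np.ones(len(y)), predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

# Placeholder data for illustration only
x = np.array([1., 1., 2., 2., 3., 3., 4., 4., 5., 5., 6., 6.])
y = np.array([3., 4., 6., 7., 9., 9., 10., 11., 11., 12., 12., 12.])
n = len(y)

# R-squared for the linear, quadratic, and cubic models
r2 = [r_squared(y, np.column_stack([x ** p for p in range(1, d + 1)]))
      for d in (1, 2, 3)]

# Test each one-term increment over the previous degree
for d in (2, 3):
    increase = r2[d - 1] - r2[d - 2]
    f = increase / ((1 - r2[d - 1]) / (n - d - 1))
    p = stats.f.sf(f, 1, n - d - 1)
    print(f"degree {d}: R2 increase = {increase:.3f}, F = {f:.2f}, p = {p:.3f}")
```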

Page 19: Curvilinear Regression Analysis


A New Example

From Pedhazur, p. 522:

“Suppose that we are interested in the effect of time spent in practice on the performance of a visual discrimination task. Subjects are randomly assigned to different levels of practice, following which a test of visual discrimination is administered, and the number of correct responses is recorded for each subject. As there are six levels, the highest-degree polynomial possible for these data is the fifth. Our aim, however, is to determine the lowest-degree polynomial that best fits the data.”

Page 20: Curvilinear Regression Analysis


Data Scatterplot

[Scatterplot shown on slide: Practice Time (x-axis) versus Task Score (y-axis).]

Page 21: Curvilinear Regression Analysis


Estimation In SPSS

To estimate the degree of a polynomial, one must first create new variables in SPSS, each representing X raised to a given power.

Then successive regression analyses must be run, each adding a level to the equation:

Model        R²      Increase Over Previous    F
X            0.883   0.940                     121.029 *
X, X²        0.943   0.060                     15.604 *
X, X², X³    0.946   0.003                     0.911

Because adding X³ did not significantly increase R², we stop with the quadratic model.
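For reference, the F values for these increments come from the standard test of a change in R² between nested models (this formula is not on the slide but is the usual one): for a larger model L with k_L terms, a smaller model S with k_S terms, and N cases,

$$F = \frac{(R^2_L - R^2_S)/(k_L - k_S)}{(1 - R^2_L)/(N - k_L - 1)},$$

with k_L − k_S and N − k_L − 1 degrees of freedom.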

Page 22: Curvilinear Regression Analysis


Estimation In SPSS

Of course, there is an easier way...

In SPSS go to Analyze...Regression...Curve Estimation

Page 23: Curvilinear Regression Analysis


Estimation In SPSS

MODEL: MOD_2.

Independent: x

Dependent  Mth  Rsq   d.f.   F       Sigf   b0        b1       b2       b3
y          LIN  .883  16     121.03  .000    3.2667   1.5571
y          QUA  .943  15     123.55  .000   -1.9000   3.4946   -.1384
y          CUB  .946  14      82.18  .000     .6667   1.8803    .1290   -.0127
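Reading the QUA row as coefficients, the fitted quadratic is Y′ = −1.9000 + 3.4946X − .1384X². A small sketch (numpy assumed; the practice-time values are just example inputs) of generating predictions from it:

```python
import numpy as np

practice_time = np.array([2.5, 5.0, 7.5, 10.0])  # example values of X
# b0, b1, b2 from the QUA row of the Curve Estimation output
task_score_hat = -1.9000 + 3.4946 * practice_time - 0.1384 * practice_time ** 2
print(task_score_hat)
```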

Page 24: Curvilinear Regression Analysis


Data Scatterplot

Page 25: Curvilinear Regression Analysis


Parameter Interpretation

The b parameters in a polynomial regression are nearly impossible to interpret.

An independent variable is represented by more than a single vector - what's held constant?

The relative magnitude of the b parameters for different degrees cannot be compared, because the variance of the higher-degree terms explodes:

X → s²_x

X² → (s²_x)²

X³ → (s²_x)³

...

Page 26: Curvilinear Regression Analysis


Variable Centering

Centering variables in a polynomial equation can avoid collinearity problems.

Centering does not change the R² of a model, only the regression parameters.
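A quick sketch of why this helps (numpy assumed; the predictor values are arbitrary): the correlation between X and X² is typically close to 1, but after centering X it drops dramatically.

```python
import numpy as np

x = np.linspace(1, 10, 50)   # arbitrary positive predictor values
xc = x - x.mean()            # centered predictor

r_raw = np.corrcoef(x, x ** 2)[0, 1]
r_centered = np.corrcoef(xc, xc ** 2)[0, 1]
print(r_raw, r_centered)     # near 1 before centering, near 0 after
```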

Page 27: Curvilinear Regression Analysis


Multiple Curvilinear Regression

Running a multiple curvilinear regression model is a straightforward extension of what was shown today:

Y′ = a + b1X + b2Z + b3XZ + b4X² + b5Z²

Note the cross-product XZ.

This cross-product term is tested above and beyond X and Z individually.
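A minimal sketch of fitting this model by ordinary least squares (numpy assumed; the X, Z, and Y arrays are made up for the example), building the squared and cross-product terms explicitly:

```python
import numpy as np

# Made-up data for illustration only
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, 40)
Z = rng.uniform(0, 10, 40)
Y = 2 + 1.5 * X + 0.8 * Z + 0.3 * X * Z - 0.1 * X ** 2 - 0.05 * Z ** 2 + rng.normal(0, 1, 40)

# Design matrix: intercept, X, Z, XZ, X^2, Z^2
D = np.column_stack([np.ones_like(X), X, Z, X * Z, X ** 2, Z ** 2])
coef, *_ = np.linalg.lstsq(D, Y, rcond=None)
print(coef)  # estimates of a, b1, b2, b3, b4, b5
```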

Page 28: Curvilinear Regression Analysis


Final Thought

Curvilinear regression can be accomplished using techniques we are familiar with.

Interpretation can be tricky...

We are all lucky to be students during this season...

Page 29: Curvilinear Regression Analysis


Next Time

No class next week (I’m in Montreal...if you are there, say hello).

Chapter 14: Continuous and categorical independent variables.

Comedy provided by this guy: