Modeling a Linear Relationship
Lecture 44, Secs. 13.1 – 13.3.1, Tue, Apr 24, 2007

Page 1:

Modeling a Linear Relationship

Lecture 44, Secs. 13.1 – 13.3.1, Tue, Apr 24, 2007

Page 2:

Bivariate Data

Data is called bivariate if each observation consists of a pair of values (x, y).

x is the explanatory variable. y is the response variable. x is also called the independent variable. y is also called the dependent variable.

Page 3:

Scatterplots

Scatterplot – A display in which each observation (x, y) is plotted as a point in the xy plane.

Page 4:

Example

Draw a scatterplot of the following data of calories vs. cholesterol in Subway sandwiches.

Calories (x) 350 290 330 290 320 370 280 290 310 230

Cholesterol (y) 50 20 45 15 35 50 20 25 20 0

Page 5:

Example

[Scatterplot of Cholesterol (y, 0–50) vs. Calories (x, 200–400) for the Subway data.]

Page 6:

Example

Does there appear to be a relationship? How can we tell?

Page 7:

TI-83 - Scatterplots

To set up a scatterplot:

Enter the x values in L1.
Enter the y values in L2.
Press 2nd STAT PLOT.
Select Plot1 and press ENTER.

Page 8:

TI-83 - Scatterplots

The Stat Plot display appears.

Select On and press ENTER.
Under Type, select the first icon (a small image of a scatterplot) and press ENTER.
For XList, enter L1.
For YList, enter L2.
For Mark, select the one you want and press ENTER.

Page 9:

TI-83 - Scatterplots

To draw the scatterplot:

Press ZOOM. The Zoom menu appears.
Select ZoomStat (#9) and press ENTER. The scatterplot appears.
Press TRACE and use the arrow keys to inspect the individual points.
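For those working in Python rather than on a TI-83, here is a minimal sketch of the same scatterplot, assuming the matplotlib library is available (it is not part of the lecture):

    # Sketch: scatterplot of the Subway data (assumes matplotlib is installed).
    import matplotlib.pyplot as plt

    calories = [350, 290, 330, 290, 320, 370, 280, 290, 310, 230]  # x values
    cholesterol = [50, 20, 45, 15, 35, 50, 20, 25, 20, 0]          # y values

    plt.scatter(calories, cholesterol)  # plot each (x, y) pair as a point
    plt.xlabel("Calories")
    plt.ylabel("Cholesterol")
    plt.show()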

Page 10:

Describing a Linear Relationship

How would we describe this relationship?

[The same scatterplot of Cholesterol vs. Calories.]

Page 11:

Linear Association

Draw (or imagine) an oval around the data set.

If the oval is tilted, then there is some linear association.
If the oval is tilted upwards from left to right, then there is positive association.
If the oval is tilted downwards from left to right, then there is negative association.
If the oval is not tilted at all, then there is no association.

Page 12:

Positive Linear Association

Page 13:

Positive Linear Association

Page 14:

Negative Linear Association

Page 15:

Negative Linear Association

Page 16:

No Linear Association

Page 17:

No Linear Association

Page 18:

Strong vs. Weak Association

The association is strong if the oval is narrow.

The association is weak if the oval is wide.

Page 19:

Strong Positive Linear Association

Page 20:

Strong Positive Linear Association

Page 21:

Weak Positive Linear Association

Page 22:

Weak Positive Linear Association

Page 23:

Example

[The same scatterplot of Cholesterol vs. Calories.]

Page 24:

Describing the Relationship

[The same scatterplot of Cholesterol vs. Calories.]

Page 25:

Describing the Relationship

There appears to be a strong positive linear association between calories and cholesterol in Subway sandwiches.

Page 26:

Example

Draw a scatterplot of the following data.

x y

2 3

3 5

5 9

6 12

9 16

Page 27:

Simple Linear Regression

To quantify the linear relationship between x and y, we wish to find the equation of the line that “best” fits the data.

Typically, there will be many lines that all look pretty good.

How do we measure how well a line fits the data?

Page 28:

Measuring the Goodness of Fit

Which line better fits the data?

Page 29:

Measuring the Goodness of Fit

Which line better fits the data?

Page 30:

Measuring the Goodness of Fit

Which line better fits the data?

Page 31:

Measuring the Goodness of Fit

Which line better fits the data?

Page 32:

Measuring the Goodness of Fit

Start with the scatterplot.

Page 33:

Measuring the Goodness of Fit

Draw any line through the scatterplot.

Page 34:

Measuring the Goodness of Fit

Measure the vertical distances from every point to the line.

Page 35:

Measuring the Goodness of Fit

Each of these represents a deviation, called a residual, from the line.

Page 36:

Residuals

The ith residual – The difference between the observed value of yi and the predicted, or expected, value of yi.

We write yi^ for the predicted value of yi.

The formula for the ith residual is

ei = yi – yi^.
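For example, if the observed value is yi = 12 and the line predicts yi^ = 11, then ei = 12 – 11 = 1.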

Page 37:

Residuals

Notice that the residual is positive if the data point is above the line and it is negative if the data point is below the line.

Page 38:

Measuring the Goodness of Fit

The ith residual.

[Figure: at x = xi, the residual ei is the vertical distance between the observed value yi and the predicted value yi^ on the line.]

Page 39:

Measuring the Goodness of Fit

Find the sum of the squared residuals.

Page 40:

Measuring the Goodness of Fit

The smaller the sum of squared residuals, the better the fit.

Page 41:

Example

Consider the data points

x y

2 3

3 5

5 9

6 12

9 16

Page 42:

Example

[Scatterplot of the five data points.]

Page 43:

Least Squares Line

Let’s see how good the fit is for the line

y^ = -1 + 2x,

where y^ represents the predicted value of y, not the observed value.

Page 44:

Sum of Squared Residuals

Begin with the data set.

x y

2 3

3 5

5 9

6 12

9 16

Page 45:

Sum of Squared Residuals

Compute the predicted y, using y^ = -1 + 2x.

x y y^

2 3 3

3 5 5

5 9 9

6 12 11

9 16 17

Page 46:

Sum of Squared Residuals

Compute the residuals, y – y^.

x y y^ y – y^

2 3 3 0

3 5 5 0

5 9 9 0

6 12 11 1

9 16 17 -1

Page 47:

Sum of Squared Residuals

Square the residuals.

x y y^ y – y^ (y – y^)²

2 3 3 0 0

3 5 5 0 0

5 9 9 0 0

6 12 11 1 1

9 16 17 -1 1

Page 48:

Sum of Squared Residuals

Find the sum of the squared residuals.

x y y^ y – y^ (y – y^)²

2 3 3 0 0

3 5 5 0 0

5 9 9 0 0

6 12 11 1 1

9 16 17 -1 1

SSE = Σ(y – y^)² = 2.00
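As a quick check, the same computation in Python (a sketch; the lecture itself works by hand and on the TI-83):

    # Sketch: verify that SSE = 2 for the line y^ = -1 + 2x.
    xs = [2, 3, 5, 6, 9]
    ys = [3, 5, 9, 12, 16]
    y_hat = [-1 + 2 * x for x in xs]                  # predicted values
    residuals = [y - yh for y, yh in zip(ys, y_hat)]  # e = y - y^
    print(sum(e ** 2 for e in residuals))             # 2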

Page 49:

Least Squares Line

Least squares line – The line for which the sum of the squares of the residuals is as small as possible.

The least squares line is also called the line of best fit or the regression line.

Page 50:

Regression Line

We will write the regression line as

y^ = a + bx,

where a is the y-intercept and b is the slope.

This is the usual slope-intercept form

y = mx + b

with the two terms rearranged and relabeled.

Page 51:

TI-83 – Computing Residuals

It is not hard to compute the residuals and the sum of their squares on the TI-83. (Later, we will see a faster method.)

Enter the x-values in list L1 and the y-values in list L2.
Compute a + b*L1 and store it in list L3 (the y^ values).
Compute (L2 – L3)². This is the list of squared residuals.
Compute sum(Ans). This is the sum of the squared residuals.
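The same recipe translates directly into Python; this sketch mirrors the TI-83 steps above (the function name sse is mine, for illustration):

    def sse(a, b, xs, ys):
        # Sum of squared residuals for the line y^ = a + b*x.
        y_hat = [a + b * x for x in xs]  # like storing a + b*L1 in L3
        return sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))  # like sum((L2 - L3)^2)

    xs = [2, 3, 5, 6, 9]
    ys = [3, 5, 9, 12, 16]
    print(sse(-1, 2, xs, ys))      # 2
    print(sse(-0.5, 1.9, xs, ys))  # 1.70, up to floating-point rounding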

Page 52:

Sum of Squared Residuals

Now let’s see how good the fit is for the line

y^ = -0.5 + 1.9x.

We will compute the sum of squared residuals, SSE.

Page 53:

Sum of Squared Residuals

Begin with the data set.

x y

2 3

3 5

5 9

6 12

9 16

Page 54:

Sum of Squared Residuals

Compute the predicted y, using y^ = -0.5 + 1.9x.

x y y^

2 3 3.3

3 5 5.2

5 9 9.0

6 12 10.9

9 16 16.6

Page 55:

Sum of Squared Residuals

Compute the residuals, y – y^.

x y y^ y – y^

2 3 3.3 -0.3

3 5 5.2 -0.2

5 9 9.0 0.0

6 12 10.9 1.1

9 16 16.6 -0.6

Page 56:

Sum of Squared Residuals

Compute the squared residuals.

x y y^ y – y^ (y – y^)²

2 3 3.3 -0.3 0.09

3 5 5.2 -0.2 0.04

5 9 9.0 0.0 0.00

6 12 10.9 1.1 1.21

9 16 16.6 -0.6 0.36

Page 57:

Sum of Squared Residuals

Find the sum of the squared residuals.

x y y^ y – y^ (y – y^)²

2 3 3.3 -0.3 0.09

3 5 5.2 -0.2 0.04

5 9 9.0 0.0 0.00

6 12 10.9 1.1 1.21

9 16 16.6 -0.6 0.36

SSE = Σ(y – y^)² = 1.70

Page 58:

Sum of Squared Residuals

We conclude that y^ = -0.5 + 1.9x is a better fit than y^ = -1 + 2x.

Is it the best fit?

Page 59:

Sum of Squared Residuals

[The scatterplot with the line y^ = -1 + 2x drawn through it.]

Page 60:

Sum of Squared Residuals

[The scatterplot with the line y^ = -0.5 + 1.9x drawn through it.]

Page 61:

Example

For all the lines that one could draw through this data set, it turns out that 1.70 is the smallest possible value for the sum of the squares of the residuals.

x y

2 3

3 5

5 9

6 12

9 16

Page 62:

Example

Therefore,

y^ = -0.5 + 1.9x

is the regression line for this data set.
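This can be confirmed with the standard least squares formulas for the slope and intercept (stated here without derivation, since this lecture does not derive them):

    # Sketch: least squares slope b = Sxy/Sxx and intercept a = ybar - b*xbar.
    xs = [2, 3, 5, 6, 9]
    ys = [3, 5, 9, 12, 16]
    n = len(xs)
    x_bar = sum(xs) / n  # 5.0
    y_bar = sum(ys) / n  # 9.0
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))  # 57.0
    sxx = sum((x - x_bar) ** 2 for x in xs)                       # 30.0
    b = sxy / sxx          # slope: 1.9
    a = y_bar - b * x_bar  # intercept: -0.5
    print(a, b)            # -0.5 1.9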

Page 63:

Prediction

Use the regression line to predict y when x = 4, x = 7, and x = 20.

Interpolation – Using an x value within the observed extremes of x values to predict y.

Extrapolation – Using an x value beyond the observed extremes of x values to predict y.
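A sketch of the three predictions (the helper predict is mine, for illustration):

    def predict(x):
        # Predicted y from the regression line y^ = -0.5 + 1.9x.
        return -0.5 + 1.9 * x

    print(predict(4))   # 7.1  (interpolation: 4 lies within the observed x range, 2 to 9)
    print(predict(7))   # 12.8 (interpolation; floating point may show 12.799999...)
    print(predict(20))  # 37.5 (extrapolation: 20 is far beyond 9, so less reliable)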

Page 64:

Interpolation vs. Extrapolation

Interpolated values are more reliable than extrapolated values.

The farther out the values are extrapolated, the less reliable they are.