Ch 8 Linear Regression

Ch 8 Linear Regression AP Statistics Mrs Johnson


Transcript of Ch 8 Linear Regression

Page 1: Ch  8  Linear Regression

Ch 8 Linear Regression
AP Statistics
Mrs Johnson

Page 2: Ch  8  Linear Regression

3 Ways to Write the Least Squares Regression Line
• From Data
  • Using the calculator, you input data into lists and run a linear regression through the data
• From statistics
  • The LSRL runs through the centroid
  • Using the statistics r, sx, sy, and the mean of x and y, we can write the equation of the LSRL from formulas GIVEN on the AP exam.
• From computer output
  • Many times you will be given computer output; the slope and y intercept are always in this given data

Page 3: Ch  8  Linear Regression

Interpreting SLOPE in a problem:
• When asked to interpret slope, remember that slope is the change in y over the change in x

• State the following: As the ________ (explanatory variable) increases by 1 _______ (insert unit) the __________ (response variable) is predicted to increase/decrease (use appropriate word given sign of slope) by _______ (insert slope here and units).

• As the caloric content of a burger increases by 1 calorie, the fat content of the burger is PREDICTED to increase by _____ grams.

Page 4: Ch  8  Linear Regression

Interpreting y-intercepts:

• The y intercept occurs when the explanatory variable is 0.
• Interpretation depends on the example; often there is no real application for the y-intercept.
• When the explanatory variable is 0, the response variable is predicted to be _____. (Substitute 0 into the equation and solve.)

Page 5: Ch  8  Linear Regression

Coefficient of Determination – R²

• R² is the square of the correlation coefficient r
• Gives the proportion (percentage) of the data’s variation accounted for by the model
• R² = 0 would mean NONE of the variation in the data is accounted for by the model, so the model is useless.
• R² = 1 would mean ALL of the variation in the data is accounted for by the model

Page 6: Ch  8  Linear Regression

Coefficient of Determination – R²

• Example:
  • A given data set has a correlation coefficient, r, of 0.8.
  • R² = 0.64. Interpretation: 64% of the variance in the data is accounted for by our model
  • A given data set has a correlation coefficient, r, of 0.4.
  • R² = 0.16. Interpretation: 16% of the variance in the data is accounted for by our model
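
The arithmetic in these two examples can be checked with a short Python sketch. The helper name `r_squared_percent` is hypothetical and used only for illustration; it simply squares r and converts the result to a percentage.

```python
def r_squared_percent(r):
    """Return R^2 as a percentage for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return (r ** 2) * 100


# The two correlation coefficients used on the slide.
for r in (0.8, 0.4):
    pct = r_squared_percent(r)
    print(f"r = {r}: about {pct:.0f}% of the variation is accounted for by the model")
```

Note that halving r from 0.8 to 0.4 cuts the explained variation to a quarter, because R² depends on the square of r.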

Page 7: Ch  8  Linear Regression

Coefficient of Determination – R²

• NOTE: When interpreting R², use this fill in the blank:
• According to the linear model, _______ (insert R² value as a percentage) of the variability in _______ (response variable) is accounted for by the variation in _______ (explanatory variable).

Page 8: Ch  8  Linear Regression

Predicting with LSRL
• Using the LSRL, we can predict y values given x values
• CAUTION: only use the LSRL to predict behavior within the bounds of your data
  • Do NOT extrapolate beyond data
  • Only interpolate within given data set
• Using the LSRL from the previous example, determine the fat content for a burger with 550 calories.

Page 9: Ch  8  Linear Regression

Example – Fat / Calorie Content

Finding the LSRL from given data – using calculator
1. Insert data into L1 (fat) and L2 (cal)
2. Go to Stat – Calc #8 – LinReg(a+bx)
3. Select appropriate lists and STORE regression line
4. Write regression line using WORDS as variables

Fat (g)    19   31   34   35   39   39   43
Calories   410  580  590  570  640  680  660
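
For readers without the calculator at hand, the same regression can be sketched in Python using the standard least squares formulas. This is a minimal sketch, not the calculator's own routine; the function name `least_squares_line` and the variable names are assumptions made for illustration. Fat is treated as the explanatory variable and calories as the response, as in the list setup above.

```python
from statistics import mean

fat = [19, 31, 34, 35, 39, 39, 43]              # explanatory variable (grams)
calories = [410, 580, 590, 570, 640, 680, 660]  # response variable


def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares regression line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    # The LSRL passes through the centroid (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = least_squares_line(fat, calories)
# Write the line using WORDS as variables, as in step 4.
print(f"predicted calories = {b0:.2f} + {b1:.2f} * fat")
```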

Page 10: Ch  8  Linear Regression

• Interpret Slope:
  • As the fat content in a burger increases by 1 gram, the caloric content is PREDICTED to increase by _____ calories.
• What is the y intercept in the burger example?
  • A burger with 0 grams of fat is predicted to have _____ calories.
• Interpret r
• Interpret r-squared
• Predict the calories for a burger with 35 grams of fat

Page 11: Ch  8  Linear Regression

Residual
• The difference between the actual value from a data point, y, and the predicted value, ŷ (residual = y - ŷ).
• Residual plots
  • Important tool for determining if a line is the best fit for data
  • A line is a good fit according to the residual plot IF:
    • No apparent pattern: no direction or shape
    • Scattered horizontally, with no major gaps or outliers

Page 12: Ch  8  Linear Regression

Residual Plots

• No pattern – indicates line is a good fit

• U-shaped pattern – indicates a non-linear model would be a better fit

• Upside-down U-shaped pattern – indicates a non-linear model would be a better fit

Page 13: Ch  8  Linear Regression

Residual Plot of Example:
• Once you run a regression in your calculator, the residuals are created automatically and ready for you to display
• From STAT PLOT, keep the x list as L1 and go to y list and find RESID in the list menu
• Zoom 9 will show you the residual plot
• Back to the burger data: what is the residual of your 35 grams of fat burger?
• Does our line OVER or UNDER predict?
  • Negative residuals mean our line OVER predicts
  • Positive residuals mean our line UNDER predicts
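
The residuals the calculator stores in RESID can be reproduced with a short Python sketch. It refits the line to the burger data with the standard least squares formulas and uses residual = actual - predicted; all variable names here are assumptions made for illustration.

```python
from statistics import mean

fat = [19, 31, 34, 35, 39, 39, 43]              # explanatory variable (grams)
calories = [410, 580, 590, 570, 640, 680, 660]  # response variable

# Fit the least squares line of calories on fat.
x_bar, y_bar = mean(fat), mean(calories)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(fat, calories))
         / sum((x - x_bar) ** 2 for x in fat))
intercept = y_bar - slope * x_bar

# residual = actual value - predicted value
residuals = [y - (intercept + slope * x) for x, y in zip(fat, calories)]

for x, y, e in zip(fat, calories, residuals):
    if e < 0:
        verdict = "line OVER predicts"
    elif e > 0:
        verdict = "line UNDER predicts"
    else:
        verdict = "exact"
    print(f"fat = {x:>2} g, actual = {y} cal, residual = {e:+.1f} ({verdict})")
```

For the 35 gram burger the residual comes out negative, so the line over predicts its calories. Plotting `residuals` against `fat` would give the same picture as the calculator's residual plot.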

Page 14: Ch  8  Linear Regression

Set 2: Writing the Line of Best Fit – from statistics given about data

• The line of best fit will be written in the form: ŷ = b0 + b1x
  • ŷ (y-hat) = predicted value
  • b0 = y intercept
  • b1 = slope
• Finding the slope of the best fit line: b1 = r(sy / sx)
  • sy = standard deviation of response variable
  • sx = standard deviation of explanatory variable
  • r = correlation coefficient

Page 15: Ch  8  Linear Regression

Finding the y intercept
• Finding the y intercept of the best fit line:
  • From the equation for predicted value of y: ŷ = b0 + b1x
  • Given the mean values for x and y
  • Given the value of b1 (slope), calculated from statistics r, sx, sy
  • Use the given point (x̄, ȳ) and solve for b0: b0 = ȳ - b1x̄

Page 16: Ch  8  Linear Regression

Example: #36 pg 193
• Given that a line is the form of best fit for a set of data which compares fat and calories on 11 brands of fast food chicken sandwiches, and given the summary statistics:

               Fat (g)   Calories
Mean           20.6      472
Standard Dev   9.8       144.2
Correlation    0.974

Page 17: Ch  8  Linear Regression

Example #36 pg 193 continued
• Write the equation for the line of best fit.

• Interpret the slope in the context of the problem

• Explain the meaning of the y intercept

• What does it mean if a sandwich has a negative residual?

• If a sandwich had 23 grams of fat, what is the predicted value for calories?
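
As a check on the first and last questions, here is a minimal Python sketch that builds the line from the summary statistics alone, using slope = r(sy / sx) and the fact that the LSRL passes through the point of means. The variable and function names are assumptions made for illustration; fat is taken as the explanatory variable and calories as the response.

```python
# Summary statistics from the chicken sandwich example.
mean_fat, sd_fat = 20.6, 9.8      # explanatory variable (grams)
mean_cal, sd_cal = 472, 144.2     # response variable (calories)
r = 0.974                         # correlation coefficient

# Slope: b1 = r * (sy / sx)
slope = r * sd_cal / sd_fat

# Intercept: the line passes through (mean_fat, mean_cal), so b0 = y_bar - b1 * x_bar
intercept = mean_cal - slope * mean_fat


def predict_calories(fat_grams):
    """Predicted calories for a sandwich with the given grams of fat."""
    return intercept + slope * fat_grams


print(f"predicted calories = {intercept:.2f} + {slope:.2f} * fat")
print(f"prediction for 23 g of fat: {predict_calories(23):.1f} calories")
```

A sandwich with a negative residual has fewer calories than `predict_calories` returns for its fat content.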

Page 18: Ch  8  Linear Regression

Method #3 – Writing the LSRL from Computer Output
• Given the following data set, comparing height (ft) and weight (lb) for 10 people in a weight loss program
• Describe and interpret the correlation

Page 19: Ch  8  Linear Regression

Reading Computer Output for Line of Best Fit

Dependent Variable is: Weight
R-Squared = 0.91

Variable   Coefficient   SE (Coeff)
Constant   -289.5        2.606
Height     86.1          .0013

Page 20: Ch  8  Linear Regression

Linear Regression Line

Page 21: Ch  8  Linear Regression

Residual Plot – Is Line a Good Fit?

Page 22: Ch  8  Linear Regression

Interpreting Slope / y intercepts

Slope interpretation:

As the height of the participant in the weight loss program increases by 1 foot, the predicted weight of the participant increases by approximately 86 lbs.

OR

As the height of the participant increases by 1 inch, the predicted weight of the participant increases by approximately 7 lbs.

Page 23: Ch  8  Linear Regression

Interpreting y-intercepts:

• The y intercept occurs when the explanatory variable is 0.

• What is the y intercept in this example?
  • -289.5
  • No real life interpretation, but the literal interpretation is that for a participant in the weight loss program who is 0 feet tall, the predicted weight would be -289.5 lbs.
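
The numbers in Method #3 can be tied together with a short Python sketch that reads the coefficients straight from the computer output. The variable and function names are assumptions made for illustration, and the 5.5 ft input is a hypothetical height chosen only to show a prediction; the slides do not give the range of heights in the data, so whether it counts as interpolation is unknown.

```python
import math

# Coefficients read from the computer output.
intercept = -289.5       # "Constant" row
slope_per_foot = 86.1    # "Height" row, in lb per foot
r_squared = 0.91


def predict_weight(height_ft):
    """Predicted weight (lb) for a participant of the given height (ft)."""
    return intercept + slope_per_foot * height_ft


# Converting the slope from per-foot to per-inch gives the "approximately 7 lbs" figure.
slope_per_inch = slope_per_foot / 12

# r is the square root of R-squared and takes the sign of the slope.
r = math.copysign(math.sqrt(r_squared), slope_per_foot)

print(f"predicted weight = {intercept} + {slope_per_foot} * height")
print(f"slope per inch: {slope_per_inch:.2f} lb")
print(f"correlation r: {r:.3f}")
print(f"prediction at 0 ft (the y intercept): {predict_weight(0)} lb")
print(f"prediction at a hypothetical 5.5 ft: {predict_weight(5.5):.1f} lb")
```

Because the slope is positive, r is positive, which is the starting point for describing the correlation asked about earlier in this method.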