Notebook File Chapter 8
8/10/2019 Notebook File Chapter 8
1/11
October 15, 20
The least squares regression line (LSRL)
Recall from yesterday:
Scatterplot:
Direction
Form
Strength
Unusual features
Correlation:
requires quantitative variables
does not imply causation
'r' applies to linear associations only
a 'bent' association may be made linear by re-expressing the data
After standardization
Linear Regression Line
The equation 'models' the relationship, allowing us to predict a y-value for any given x-value.
ŷ = 34x + 310, where ŷ = predicted rent with hotel and x = spaces from 'Go'
('hat' means predicted)
Data point: (39, 2000)
ŷ = 34(39) + 310
ŷ = 1636
Residual
residual = observed - predicted = y - ŷ
residual = 2000 - 1636 = 364
If the residual > 0, the observed y-value is greater than the predicted value, i.e. the actual data point is above the regression line.
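The prediction and residual for the rent example can be checked with a short Python sketch. The slope, intercept, and data point come from the notes; the function names predict_rent and residual are only illustrative.

```python
# Line from the notes: predicted rent = 34 * (spaces from Go) + 310
SLOPE = 34
INTERCEPT = 310


def predict_rent(spaces_from_go):
    """Predicted rent with a hotel (y-hat) for a given distance from Go."""
    return SLOPE * spaces_from_go + INTERCEPT


def residual(observed, predicted):
    """Residual = observed - predicted."""
    return observed - predicted


# Data point from the notes: 39 spaces from Go, actual rent 2000
x_obs, y_obs = 39, 2000
y_hat = predict_rent(x_obs)
res = residual(y_obs, y_hat)

print(y_hat)  # 1636
print(res)    # 364
print("above the line" if res > 0 else "on or below the line")
```

A positive residual means the line under-predicted this point.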
Calculating Slope of Least Squares Regression line
Recall: r = .83
for explanation, see
Math Box on page 175
Moving from the 'standardized' equation to the actual equation
The standardized equation, ẑy = r·zx, is nice, but using it means converting all of our data to z-scores, solving the equation, and then converting the z-scores back to the original units. This is too cumbersome!
slope = r(sy/sx)
y-intercept = ȳ - slope·x̄
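As a rough sketch, the slope and y-intercept can be computed from summary statistics alone, with no z-scores needed. Only r = 0.83 comes from the notes; the means and standard deviations below are hypothetical values chosen for illustration, and lsrl_from_summary is an assumed helper name.

```python
def lsrl_from_summary(r, mean_x, mean_y, s_x, s_y):
    """Return (slope, intercept) of the LSRL from summary statistics."""
    slope = r * s_y / s_x                 # slope = r(sy/sx)
    intercept = mean_y - slope * mean_x   # line passes through (x-bar, y-bar)
    return slope, intercept


r = 0.83                        # from the notes
mean_x, s_x = 20.0, 10.0        # hypothetical summary statistics for x
mean_y, s_y = 1000.0, 400.0     # hypothetical summary statistics for y

slope, intercept = lsrl_from_summary(r, mean_x, mean_y, s_x, s_y)
print(round(slope, 2), round(intercept, 2))
```

With these made-up inputs the slope works out to about 33.2 and the intercept to about 336, so each additional unit of x raises the predicted y by about 33.2 units on average.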
Revisiting yesterday's problem
a =
b =
Remember: the least squares regression line always passes through the center of the data, (x̄, ȳ).
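The claim that the line passes through the center of the data can be checked numerically. The sketch below fits a line to a small hypothetical data set (not from the notes) using slope = r(sy/sx), then evaluates the line at the mean of x. The helper names correlation and fit_lsrl are assumptions for illustration.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Correlation r as the average product of z-scores (dividing by n - 1)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    n = len(xs)
    return sum((x - mx) / sx * (y - my) / sy for x, y in zip(xs, ys)) / (n - 1)


def fit_lsrl(xs, ys):
    """Return (slope, intercept) using slope = r(sy/sx)."""
    r = correlation(xs, ys)
    slope = r * stdev(ys) / stdev(xs)
    intercept = mean(ys) - slope * mean(xs)
    return slope, intercept


# Hypothetical data, chosen only for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_lsrl(xs, ys)
predicted_at_mean_x = slope * mean(xs) + intercept

print(round(slope, 3), round(intercept, 3))
print(round(predicted_at_mean_x, 3), mean(ys))  # the two values should agree
```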
QUOTATION OF THE DAY
"Thank God the research didn't find that novels increased tooth decay or blocked up your arteries."
LOUISE ERDRICH, the novelist, on studies finding that after reading literary fiction people performed better on tests measuring empathy, social perception and emotional intelligence.
To find the equation of a regression line, use the slope, r(sy/sx), and the point (x̄, ȳ).
What conditions must a scatterplot satisfy before we can discuss correlation and a regression line?
no outliers
no discernible pattern besides being linear
both variables quantitative
Important notes about the LSRL
Before you use the LSRL, you must look at the scatterplot to make sure the relationship is linear.
Be sure to distinguish between actual data values and predicted values. Use words like 'predicted' and 'on average'.
The slope of the LSRL has the same sign as r, the correlation coefficient.
Be able to answer questions, such as:
What is the equation of the LSRL?
What is the interpretation of the slope and y-intercept?
How well does the line fit the data?
How appropriate is our line for making predictions?
The answer? Residuals
How? Create a scatterplot of the residuals vs. the x-values.
What is it about R²?
R²: the coefficient of determination
Shows how well x predicts y under the linear model. The closer R² is to 1, the larger the share of the variation in y that the line accounts for, and the better the predictions of y from x.
If R² is 0.80, then 80% of the variation in the values of y is explained by its linear relationship with the values of x.
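One way to see what the coefficient of determination measures is to compute it as 1 minus the ratio of leftover (residual) variation to total variation in y. The sketch below does this for a small hypothetical data set (not from the notes); fit_lsrl and r_squared are assumed helper names, and the slope formula used is algebraically the same as r(sy/sx).

```python
from statistics import mean


def fit_lsrl(xs, ys):
    """Least squares slope and intercept; the slope equals r(sy/sx)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys):
    """Fraction of the variation in y accounted for by the line."""
    slope, intercept = fit_lsrl(xs, ys)
    my = mean(ys)
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - my) ** 2 for y in ys)
    return 1 - sse / sst


# Hypothetical data, chosen only for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r2 = r_squared(xs, ys)
print(round(r2, 3))  # about 0.6: roughly 60% of the variation in y is explained
```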
[Residual plot: Residuals (vertical axis, ticks at -10, 0, 10) vs. Registrations (horizontal axis, 400 to 1000)]
Residuals (continued)
A good residual plot - one that confirms the appropriate use of the LSRL - should:
have similar scatter throughout
have no interesting features
have no direction or shape
Residuals (continued)
se = standard deviation of the residuals
A small se implies residuals that are closely clustered about their mean, so the data points lie close to the line.
The mean of the residuals = 0
The units of the residuals are the same as the units of the y-values.
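These facts about residuals can be checked on a small hypothetical data set (not from the notes). The sketch assumes the common textbook convention of dividing by n - 2 when computing se; other conventions exist.

```python
from math import sqrt
from statistics import mean

# Hypothetical data, chosen only for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Least squares fit (the slope is algebraically equal to r(sy/sx))
mx, my = mean(xs), mean(ys)
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

# residual = observed - predicted, in the same units as y
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

mean_residual = mean(residuals)
# Assumed convention: divide by n - 2 for the standard deviation of residuals
s_e = sqrt(sum(e ** 2 for e in residuals) / (len(xs) - 2))

print(round(mean_residual, 10))  # essentially 0, up to rounding error
print(round(s_e, 3))
```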
Before you find the best-fit line:
Step 1: Check that the two variables are quantitative
Check that the association is linear
Check for the presence of outliers
Step 2: Do the math
Find the least squares line
r
R²
State the conclusion
What does the y-intercept tell you? What does R² tell you?
se = standard deviation of the residuals
Finally: check the residual plot to make sure that the linear model is appropriate.
Note: you cannot work backwards. You cannot use the linear regression line (which predicts y from x) to predict x from y.
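A small numerical sketch illustrates why working backwards fails. On the hypothetical data below (not from the notes), algebraically inverting the y-on-x line gives a different answer from fitting a separate least squares line that predicts x from y. The helper name fit_line is an assumption for illustration.

```python
from statistics import mean


def fit_line(inputs, outputs):
    """Least squares line predicting outputs from inputs: (slope, intercept)."""
    mi, mo = mean(inputs), mean(outputs)
    s_io = sum((i - mi) * (o - mo) for i, o in zip(inputs, outputs))
    s_ii = sum((i - mi) ** 2 for i in inputs)
    slope = s_io / s_ii
    return slope, mo - slope * mi


# Hypothetical data, chosen only for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope_yx, intercept_yx = fit_line(xs, ys)  # line that predicts y from x
slope_xy, intercept_xy = fit_line(ys, xs)  # separate line that predicts x from y

y_value = 5
x_by_inverting = (y_value - intercept_yx) / slope_yx  # solving y-hat = y for x
x_by_refitting = slope_xy * y_value + intercept_xy    # using the x-on-y line

print(round(x_by_inverting, 3), round(x_by_refitting, 3))  # the two disagree
```

The two lines only coincide when the points fall exactly on a straight line (r = 1 or r = -1), so predicting x calls for a new regression with the roles of the variables swapped.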
slope = r(sy/sx)