9.2 Linear Regression Key Concepts: –Residuals –Least Squares Criterion –Regression Line...
Uploaded by: jasper-hunter
9.2 Linear Regression
• Key Concepts:
  – Residuals
  – Least Squares Criterion
  – Regression Line
  – Using a Regression Equation to Make Predictions
• In this section, we will assume a significant linear correlation exists between the two variables of interest.
  – We would like to find the equation of the line that best fits the data. How do we decide whether one line is a better fit than another?
• We start by measuring the residuals (the amounts of error). Each residual is the difference between the actual observed y-value and the y-value our proposed line predicts for the same value of x: residual = observed − predicted (see p. 486).
• A positive residual tells us the data point lies above the line, so our line under-estimated the observed value. A negative residual tells us the point lies below the line, so our line over-estimated the observed value. If the residual = 0, we have an exact match.
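As a small illustration of residuals, the sketch below computes them for one proposed line. The data points and the line y = 1.0x + 1.0 are made-up toy values, not taken from the textbook, and the code assumes the convention residual = observed − predicted (flip the sign if your text defines it the other way).

```python
# Hypothetical toy data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def residuals(xs, ys, m, b):
    """Return observed y minus predicted y (m*x + b) for each data point."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]


# A proposed line (not necessarily the best one): y = 1.0x + 1.0
m, b = 1.0, 1.0
res = residuals(xs, ys, m, b)

for x, y, d in zip(xs, ys, res):
    print(f"x={x}  observed={y}  predicted={m * x + b:.1f}  residual={d:+.1f}")
```

With this convention, a point above the proposed line shows up with a positive residual and a point below it with a negative one.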
• The next step is to combine the residuals into a single measure of fit.
  – It is tempting to say that the line that best fits the data should have the smallest sum of residuals.
• Unfortunately, positive and negative residuals cancel when added. In fact, for any line that passes through the point (x̄, ȳ), the residuals sum to exactly zero, so the plain sum cannot tell a good fit from a poor one. We can avoid this problem by squaring the residuals and then adding.
  – According to the Least Squares Criterion, the line of best fit (called the regression line) is the one with the smallest sum of squared residuals.
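To see why the plain sum of residuals is not a useful measure, the sketch below compares two candidate lines on made-up toy data (not from the textbook). Both lines pass through the point of means, so their residuals sum to zero (up to floating-point rounding), yet their sums of squared residuals differ. The function and variable names are illustrative only.

```python
# Hypothetical toy data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def residuals(xs, ys, m, b):
    """Observed minus predicted y for the line y = m*x + b."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]


def sum_of_squares(xs, ys, m, b):
    """Sum of squared residuals: the quantity least squares minimizes."""
    return sum(d ** 2 for d in residuals(xs, ys, m, b))


x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# Two candidate lines, both forced through (x_bar, y_bar).
candidates = {
    "flat": (0.0, y_bar),                   # y = y_bar
    "sloped": (0.6, y_bar - 0.6 * x_bar),   # slope 0.6
}

totals = {}
for name, (m, b) in candidates.items():
    plain = sum(residuals(xs, ys, m, b))
    squared = sum_of_squares(xs, ys, m, b)
    totals[name] = (plain, squared)
    print(f"{name:>6}: sum of residuals = {plain:.2f}, "
          f"sum of squared residuals = {squared:.2f}")
```

The plain sums are indistinguishable, while the squared sums single out the sloped line as the better fit of the two.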
• The equation of a regression line for an independent variable x and a dependent variable y is:
    $\hat{y} = mx + b$
  where $\hat{y}$ is the predicted y-value for a given x-value.
  – We use the Least Squares Criterion and calculus to derive the slope m and y-intercept b of the regression line.
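The calculus step can be sketched as follows. This is an outline of the standard derivation rather than the textbook's own presentation; the name S for the sum of squared residuals is introduced here only for illustration, and all sums run over the n data pairs.

```latex
\begin{align*}
S(m, b) &= \sum \bigl(y - (mx + b)\bigr)^2 \\[4pt]
\frac{\partial S}{\partial b} &= -2\sum \bigl(y - mx - b\bigr) = 0
  \;\Longrightarrow\; \sum y = m\sum x + nb \\[4pt]
\frac{\partial S}{\partial m} &= -2\sum x\bigl(y - mx - b\bigr) = 0
  \;\Longrightarrow\; \sum xy = m\sum x^2 + b\sum x
\end{align*}
```

Solving the first equation for b gives $b = \frac{\sum y}{n} - m\,\frac{\sum x}{n}$. Substituting that into the second equation and multiplying through by n gives $n\sum xy - \left(\sum x\right)\left(\sum y\right) = m\left(n\sum x^2 - \left(\sum x\right)^2\right)$, which can be solved for the slope m.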
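• The slope m and y-intercept b of a regression line are given by:
    $m = \dfrac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2}$
    $b = \bar{y} - m\bar{x} = \dfrac{\sum y}{n} - m\,\dfrac{\sum x}{n}$
  where n is the number of data pairs.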
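The slope and intercept formulas translate directly into a few sums. The sketch below is a minimal plain-Python version; the function name regression_line and the toy data are assumptions made for illustration, not part of the textbook.

```python
def regression_line(xs, ys):
    """Return (m, b) for the least-squares line y-hat = m*x + b.

    m = (n*Sum(xy) - Sum(x)*Sum(y)) / (n*Sum(x^2) - (Sum(x))^2)
    b = Sum(y)/n - m * Sum(x)/n
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        # All x-values are identical, so the slope is undefined.
        raise ValueError("x-values must not all be equal")

    m = (n * sum_xy - sum_x * sum_y) / denom
    b = sum_y / n - m * sum_x / n
    return m, b


# Hypothetical toy data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, b = regression_line(xs, ys)
print(f"y-hat = {m:.2f}x + {b:.2f}")  # prints: y-hat = 0.60x + 2.20
```

For this toy data the sums are n = 5, Σx = 15, Σy = 20, Σxy = 66 and Σx² = 55, so m = (330 − 300) / (275 − 225) = 0.6 and b = 4 − 0.6 · 3 = 2.2.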
• It is best to use a regression line to make predictions only for x-values within (or close to) the range of the original data. Extrapolation (using a regression line to make predictions for x-values well beyond the range of the original data) can lead to highly inaccurate results.
• Practice building a regression equation and using it to make predictions:
#20 p. 491 (Wins and Earned Run Averages)
#24 p. 492 (High-Fiber Cereals: Caloric and Sugar Content)
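When working the practice problems, the caution about extrapolation can be built into the prediction step. The sketch below is one possible approach: the function name predict, the margin parameter, and the example line ŷ = 0.6x + 2.2 with x-values from 1 to 5 are all hypothetical and are not taken from the exercises.

```python
def predict(x, m, b, x_min, x_max, margin=0.0):
    """Return (y_hat, is_extrapolation) for the line y-hat = m*x + b.

    x_min and x_max are the smallest and largest x-values in the original
    data. margin is an optional allowance for x-values "close to" that range.
    """
    y_hat = m * x + b
    is_extrapolation = x < x_min - margin or x > x_max + margin
    return y_hat, is_extrapolation


# Hypothetical regression line and data range, for illustration only.
m, b = 0.6, 2.2
x_min, x_max = 1, 5

for x in (3, 5.5, 20):
    y_hat, flagged = predict(x, m, b, x_min, x_max, margin=0.5)
    note = "  <- extrapolation, treat with caution" if flagged else ""
    print(f"x = {x}: predicted y = {y_hat:.2f}{note}")
```

How large a margin counts as "close to" the data range is a judgment call; the value 0.5 here is arbitrary.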