ESTIMATING THE REGRESSION COEFFICIENTS FOR SIMPLE LINEAR REGRESSION
Step 2: Estimating β1 and β0
• In simple linear regression, in Step 1, it was hypothesized that:
$y = \beta_0 + \beta_1 x + \varepsilon$
• In Step 2, the best estimates for β0 and β1 are determined.
• These “best estimates” are designated as b0 and b1, respectively.
Notation
$x_i$ = the given value of x at observation i
$y_i$ = the observed value of y at observation i, when x = $x_i$
$\bar{x}$ = the average of the n observed x values
$\bar{y}$ = the average of the n observed y values
$\hat{y}_i = b_0 + b_1 x_i$, i.e. the predicted value for y when x = $x_i$
$e_i = y_i - \hat{y}_i$ = the ERROR for observation i, made by using $\hat{y}_i$ to predict $y_i$
DETERMINING THE BEST STRAIGHT LINE
• The best straight line is the one that, in some sense, minimizes the overall errors.
• But the positive values for the errors will offset the negative values, giving an average error value of 0.
• To make sure all quantities are positive, the errors are squared.
• THE BEST STRAIGHT LINE MINIMIZES THE SUM OF THE SQUARED ERROR (SSE)
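As a rough illustration of why the raw errors cannot be used, the Python sketch below borrows the advertising data and the fitted coefficients (46,486.49 and 52.5676) from the example later in this transcript. The flat line at the mean of y is a hypothetical comparison added here, not something from the slides. Both lines have errors that sum to roughly zero, so only the squared errors tell them apart.

```python
# Advertising data from the example in this transcript (10 weeks).
x = [1200, 800, 1000, 1300, 700, 800, 1000, 600, 900, 1100]
y = [101000, 92000, 110000, 120000, 90000, 82000, 93000, 75000, 91000, 105000]


def errors(b0, b1):
    """Errors e_i = y_i - (b0 + b1 * x_i) for a candidate line."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


y_bar = sum(y) / len(y)

# Candidate 1 (hypothetical): a flat line at the mean of y.
flat = errors(y_bar, 0.0)
# Candidate 2: the line reported in this transcript's example.
fitted = errors(46486.49, 52.5676)

sum_flat = sum(flat)
sum_fitted = sum(fitted)
sse_flat = sum(e ** 2 for e in flat)
sse_fitted = sum(e ** 2 for e in fitted)

print(f"flat line:   sum of errors = {sum_flat:.2f}, SSE = {sse_flat:,.0f}")
print(f"fitted line: sum of errors = {sum_fitted:.2f}, SSE = {sse_fitted:,.0f}")
```

The sum for the fitted line is only approximately zero because the coefficients are rounded.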
MINIMIZING SSE
• We want to minimize SSE where:
$SSE = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - (b_0 + b_1 x_i))^2$
This is a function in two variables: b0 and b1.
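To make the "function in two variables" idea concrete, here is a minimal Python sketch of SSE as a function of (b0, b1), evaluated on the advertising data from the example later in this transcript. The perturbed candidate lines are arbitrary choices made for illustration only.

```python
# Advertising data from the example in this transcript (10 weeks).
x = [1200, 800, 1000, 1300, 700, 800, 1000, 600, 900, 1100]
y = [101000, 92000, 110000, 120000, 90000, 82000, 93000, 75000, 91000, 105000]


def sse(b0, b1):
    """Sum of squared errors for the candidate line y-hat = b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Coefficients reported in the transcript's example.
reported = sse(46486.49, 52.5676)

# Arbitrary nearby candidates (illustrative assumptions, not from the slides).
candidates = {
    "intercept + 1000": sse(47486.49, 52.5676),
    "slope + 5": sse(46486.49, 57.5676),
    "slope - 5": sse(46486.49, 47.5676),
}

print(f"{'reported line':20s} SSE = {reported:,.0f}")
for label, value in candidates.items():
    print(f"{label:20s} SSE = {value:,.0f}")
```

Each perturbed line gives a larger SSE than the reported one, which is what one would expect near the minimum.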
Method of Least Squares
• Because we are minimizing the sum of the squared errors, the approach for doing this is called the METHOD OF LEAST SQUARES.
• To find the minimum of a function of two variables (b0 and b1), take partial derivatives with respect to each of the variables and set them equal to 0.
• We then have two equations in the two unknowns and we can solve for the values of the two unknowns; these are known as the NORMAL EQUATIONS for regression.
THE PARTIAL DERIVATIVES
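The result from taking the partial derivatives of SSE and setting them equal to 0 is:
$\frac{\partial SSE}{\partial b_0} = -2 \sum (y_i - (b_0 + b_1 x_i)) = 0$
$\frac{\partial SSE}{\partial b_1} = -2 \sum x_i (y_i - (b_0 + b_1 x_i)) = 0$
Simplifying gives the two normal equations for b0 and b1:
$n b_0 + b_1 \sum x_i = \sum y_i$
$b_0 \sum x_i + b_1 \sum x_i^2 = \sum x_i y_i$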
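The two normal equations form a 2x2 linear system, so they can be solved directly. The Python sketch below is one possible way to do that, using Cramer's rule on the advertising data from the example later in this transcript; the variable names are chosen here for illustration.

```python
# Advertising data from the example in this transcript (10 weeks).
x = [1200, 800, 1000, 1300, 700, 800, 1000, 600, 900, 1100]
y = [101000, 92000, 110000, 120000, 90000, 82000, 93000, 75000, 91000, 105000]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xx = sum(xi * xi for xi in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Normal equations as a 2x2 linear system:
#   n     * b0 + sum_x  * b1 = sum_y
#   sum_x * b0 + sum_xx * b1 = sum_xy
det = n * sum_xx - sum_x * sum_x
if det == 0:
    raise ValueError("all x values are identical; the slope is undefined")

# Cramer's rule for the two unknowns.
b0 = (sum_y * sum_xx - sum_x * sum_xy) / det
b1 = (n * sum_xy - sum_x * sum_y) / det

print(f"b0 = {b0:.2f}, b1 = {b1:.4f}")
```

The determinant is zero only when every x value is the same, in which case no unique line exists.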
SOLVING FOR b1
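THE BEST ESTIMATE FOR β1
Solving the normal equations for b1 gives:
$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
Doing a little algebra gives these three alternate formulas:
$b_1 = \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}} = \frac{\sum x_i y_i - n \bar{x} \bar{y}}{\sum x_i^2 - n \bar{x}^2} = \frac{cov(x, y)}{s_x^2}$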
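A quick numerical check can confirm that the deviation form and the three alternate formulas agree. The Python sketch below does this on the advertising data from the example later in this transcript. It assumes the sample covariance and sample variance both use the n - 1 divisor, which cancels in the ratio.

```python
import math

# Advertising data from the example in this transcript (10 weeks).
x = [1200, 800, 1000, 1300, 700, 800, 1000, 600, 900, 1100]
y = [101000, 92000, 110000, 120000, 90000, 82000, 93000, 75000, 91000, 105000]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
x_bar, y_bar = sum_x / n, sum_y / n
sum_xx = sum(xi * xi for xi in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Deviation form: sum of cross-deviations over sum of squared x-deviations.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1_deviation = s_xy / s_xx

# Alternate 1: raw sums.
b1_sums = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)

# Alternate 2: raw sums with the means.
b1_means = (sum_xy - n * x_bar * y_bar) / (sum_xx - n * x_bar ** 2)

# Alternate 3: sample covariance over sample variance (n - 1 divisors assumed).
b1_cov = (s_xy / (n - 1)) / (s_xx / (n - 1))

for name, value in [("deviation", b1_deviation), ("sums", b1_sums),
                    ("means", b1_means), ("cov/var", b1_cov)]:
    print(f"{name:10s} b1 = {value:.6f}")
```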
SOLVING FOR b0
THE BEST ESTIMATE FOR β0
Regardless of how b1 is calculated, b0 is found by:
$b_0 = \bar{y} - b_1 \bar{x}$
And the regression equation is:
$\hat{y} = b_0 + b_1 x$
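The formula for b0 follows from the first normal equation by dividing through by n. A short worked version, using only symbols already defined in the Notation section, might read:

```latex
\begin{align*}
n b_0 + b_1 \sum x_i &= \sum y_i \\
b_0 + b_1 \frac{\sum x_i}{n} &= \frac{\sum y_i}{n} \\
b_0 + b_1 \bar{x} &= \bar{y} \\
b_0 &= \bar{y} - b_1 \bar{x}
\end{align*}
```

The third line also shows that the fitted line passes through the point $(\bar{x}, \bar{y})$.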
Example – The Data
Week i    Adv. $ x_i    Sales ($) y_i
   1         1200          101000
   2          800           92000
   3         1000          110000
   4         1300          120000
   5          700           90000
   6          800           82000
   7         1000           93000
   8          600           75000
   9          900           91000
  10         1100          105000
 SUM         9400          959000

$\bar{x} = 9400 / 10 = 940$
$\bar{y} = 959000 / 10 = 95900$
Example – Table Calculations
Week i   Adv. $ x_i   Sales ($) y_i   x_i - x̄   y_i - ȳ   (x_i - x̄)(y_i - ȳ)   (x_i - x̄)^2
   1        1200         101000         260       5100          1326000            67600
   2         800          92000        -140      -3900           546000            19600
   3        1000         110000          60      14100           846000             3600
   4        1300         120000         360      24100          8676000           129600
   5         700          90000        -240      -5900          1416000            57600
   6         800          82000        -140     -13900          1946000            19600
   7        1000          93000          60      -2900          -174000             3600
   8         600          75000        -340     -20900          7106000           115600
   9         900          91000         -40      -4900           196000             1600
  10        1100         105000         160       9100          1456000            25600
 SUM                                                          23,340,000          444,000
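The table calculations are mechanical, so they are easy to reproduce in code. The following Python sketch rebuilds each row and the two column sums from the data above; the tuple layout and column labels are choices made here for illustration.

```python
# Advertising data from the example (10 weeks).
x = [1200, 800, 1000, 1300, 700, 800, 1000, 600, 900, 1100]
y = [101000, 92000, 110000, 120000, 90000, 82000, 93000, 75000, 91000, 105000]

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# One tuple per week: (week, x, y, x - x_bar, y - y_bar, product, squared x-deviation).
rows = []
for week, (xi, yi) in enumerate(zip(x, y), start=1):
    dx = xi - x_bar
    dy = yi - y_bar
    rows.append((week, xi, yi, dx, dy, dx * dy, dx * dx))

s_xy = sum(r[5] for r in rows)  # sum of (x - x_bar)(y - y_bar)
s_xx = sum(r[6] for r in rows)  # sum of (x - x_bar)^2

print(f"{'i':>2} {'x':>5} {'y':>7} {'dx':>5} {'dy':>7} {'dx*dy':>10} {'dx^2':>8}")
for week, xi, yi, dx, dy, prod, sq in rows:
    print(f"{week:>2} {xi:>5} {yi:>7} {dx:>5.0f} {dy:>7.0f} {prod:>10.0f} {sq:>8.0f}")
print(f"SUM {s_xy:>37,.0f} {s_xx:>8,.0f}")
```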
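CALCULATING b1 AND b0
$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{23,340,000}{444,000} = 52.5676$
$b_0 = \bar{y} - b_1 \bar{x} = 95,900 - (52.5676)(940) = 46,486.49$
THE REGRESSION EQUATION
Thus the estimated regression equation is:
$\hat{y} = 46,486.49 + 52.5676x$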
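One detail worth checking by machine is rounding: the intercept 46,486.49 comes out only if the unrounded slope is carried through the calculation. The Python sketch below, which starts from the two column sums in the table, compares both routes; the variable names are illustrative.

```python
# Column sums and means taken from the example's tables.
s_xy = 23_340_000  # sum of (x - x_bar)(y - y_bar)
s_xx = 444_000     # sum of (x - x_bar)^2
x_bar = 940
y_bar = 95_900

# Full-precision slope and the intercept derived from it.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Same intercept formula, but using the slope rounded to four decimals.
b1_rounded = round(b1, 4)
b0_from_rounded = y_bar - b1_rounded * x_bar

print(f"b1 = {b1:.6f}  (rounded: {b1_rounded})")
print(f"b0 with full-precision b1: {b0:.2f}")
print(f"b0 with rounded b1:        {b0_from_rounded:.2f}")
```

The two intercepts differ by a few cents, which is harmless here but shows why intermediate values are best left unrounded.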
What does the model predict sales to be when $1150 is spent on advertising?
$\hat{y} = 46,486.49 + 52.5676(1150) = \$106,939.23$
What does the model predict sales to be when $5,000,000 is spent on advertising?
$\hat{y} = 46,486.49 + 52.5676(5000000) \approx \$262,884,486$
But $5,000,000 is way outside the observed values for x.
The model should not be used for such predictions.
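One simple way to guard against such predictions in code is to flag any input that falls outside the observed range of x. The Python sketch below does this with a hypothetical helper, `predict_sales`. Treating "outside the observed minimum and maximum" as the cutoff is an assumption made here, not a rule stated in the slides.

```python
# Observed advertising values from the example (10 weeks).
x = [1200, 800, 1000, 1300, 700, 800, 1000, 600, 900, 1100]

# Coefficients of the estimated regression equation from the example.
B0 = 46486.49
B1 = 52.5676


def predict_sales(adv):
    """Return (predicted sales, True if adv lies within the observed x range)."""
    within_range = min(x) <= adv <= max(x)
    return B0 + B1 * adv, within_range


for adv in (1150, 5_000_000):
    prediction, ok = predict_sales(adv)
    note = "within observed range" if ok else "OUTSIDE observed range, do not trust"
    print(f"x = {adv:>9,}: predicted sales = {prediction:,.2f} ({note})")
```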
By EXCEL
• Choose Regression from Data Analysis.
• Check Labels.
• Specify the output worksheet.
• Specify the location of the Y-values and the X-values.
• The output reports b0 and b1.
Regression Equation: Y = 46486.49 + 52.56757x
Review
• b0, the point estimate for β0, and b1, the point estimate for β1, are found from calculus by minimizing the total sum of the squared errors between the actual and predicted values of y.
• The regression equation coefficients can be found by Excel or by hand by:
$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
$b_0 = \bar{y} - b_1 \bar{x}$
• The regression equation should not be used for values of x that are “far away” from the observed x values.