Chapter 12 Fall2014
Transcript of Chapter 12 Fall2014 (8/10/2019)
ECON 2121 Methods of Economic Statistics
Chapter 12
Simple Linear Regression
Outline
Simple Linear Regression Model
Least Squares Method
Coefficient of Determination
Model Assumptions
Testing for Significance
Simple Linear Regression
Managerial decisions often are based on the relationship between two or more variables.
Regression analysis can be used to develop an equation showing how the variables are related.
The variable being predicted is called the dependent variable and is denoted by y.
The variables being used to predict the value of the dependent variable are called the independent variables and are denoted by x.
Simple Linear Regression
The relationship between the two variables is approximated by a straight line.
Simple linear regression involves one independent variable and one dependent variable.
Regression analysis involving two or more independent variables is called multiple regression.
Deterministic and Probabilistic Models
Simple Linear Regression Equation
The simple linear regression equation is:

E(y) = β0 + β1x

E(y) is the expected value of y for a given x value. β1 is the slope of the regression line.
β0 is the y-intercept of the regression line. The graph of the regression equation is a straight line.
Simple Linear Regression Equation

Positive Linear Relationship
(Graph of E(y) versus x: the regression line rises from intercept β0 with positive slope β1.)
Simple Linear Regression Equation

Negative Linear Relationship
(Graph of E(y) versus x: the regression line falls from intercept β0 with negative slope β1.)
Simple Linear Regression Equation

No Relationship
(Graph of E(y) versus x: the regression line is horizontal at intercept β0; slope β1 is 0.)
Estimated Simple Linear Regression Equation

The estimated simple linear regression equation is:

ŷ = b0 + b1x

ŷ is the estimated value of y for a given x value. b1 is the slope of the line.
b0 is the y-intercept of the line.
The graph is called the estimated regression line.
Estimation Process

Regression model: y = β0 + β1x + ε
Regression equation: E(y) = β0 + β1x
Unknown parameters: β0, β1

Sample data: (x1, y1), ..., (xn, yn)

Estimated regression equation: ŷ = b0 + b1x
Sample statistics: b0, b1

b0 and b1 provide estimates of β0 and β1.
Least Squares Method

Model:    yi = β0 + β1xi + εi
Estimate: ŷi = b0 + b1xi

Error: ei = yi − ŷi

Sum of squared errors (SSE): SSE = Σ(yi − ŷi)²
Least Squares Method

Least squares criterion: choose b0 and b1 to minimize

min Σ(yi − ŷi)²

where:
yi = observed value of the dependent variable for the ith observation
ŷi = estimated value of the dependent variable for the ith observation
Fitting the Model: The Least Squares Approach

The least squares line ŷi = b0 + b1xi is the line that has the following two properties:

1. The sum of the errors (SE) equals 0.
2. The sum of squared errors (SSE) is smaller than that for any other straight-line model.
Least Squares Method

Slope for the estimated regression equation:

b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²

where:
xi = value of the independent variable for the ith observation
yi = value of the dependent variable for the ith observation
x̄ = mean value of the independent variable
ȳ = mean value of the dependent variable
Least Squares Method

y-intercept for the estimated regression equation:

b0 = ȳ − b1x̄
Least Squares Approach
The sum of the errors (SE) equals 0.
Least Squares Approach
The sum of squared errors (SSE) is smaller than that for any other straight-line model.
Simple Linear Regression

Example: Reed Auto Sales

Reed Auto periodically has a special week-long sale. As part of the advertising campaign, Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown on the next slide.
Simple Linear Regression

Example: Reed Auto Sales

Number of TV Ads (x)    Number of Cars Sold (y)
        1                        14
        3                        24
        2                        18
        1                        17
        3                        27
     Σx = 10                  Σy = 100
Estimated Regression Equation

Slope for the estimated regression equation:
b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 20/4 = 5

y-intercept for the estimated regression equation:
b0 = ȳ − b1x̄ = 20 − 5(2) = 10

Estimated regression equation: ŷ = 10 + 5x
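The least squares formulas above can be checked with a short Python sketch (not part of the original slides), applied to the Reed Auto data:

```python
# Least squares fit for the Reed Auto example (x = TV ads, y = cars sold).
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
n = len(x)
x_bar = sum(x) / n  # mean of x = 2
y_bar = sum(y) / n  # mean of y = 20

# b1 = sum((xi - x_bar)(yi - y_bar)) / sum((xi - x_bar)^2)
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
# b0 = y_bar - b1 * x_bar
b0 = y_bar - b1 * x_bar

print(b1, b0)  # 5.0 10.0, i.e. y-hat = 10 + 5x
```

The loop-and-sum style mirrors the slide formulas term by term; in practice a library routine would be used instead.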
Using Excel's Chart Tools for a Scatter Diagram & Estimated Regression Equation

Reed Auto Sales Estimated Regression Line
Coefficient of Determination

Relationship among SST, SSR, SSE:

SST = SSR + SSE

where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error
Coefficient of Determination

The coefficient of determination is:

r² = SSR/SST

where:
SSR = sum of squares due to regression
SST = total sum of squares
Coefficient of Determination

r² = SSR/SST = 100/114 = .8772

The regression relationship is very strong; 87.72% of the variability in the number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.
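The sums of squares above can be verified directly from the data with a brief Python sketch (not part of the original slides):

```python
# Decompose the variation in the Reed Auto data around y-bar.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
y_bar = sum(y) / len(y)            # 20
y_hat = [10 + 5 * xi for xi in x]  # fitted values from y-hat = 10 + 5x

sst = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # due to regression
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # due to error
r2 = ssr / sst

print(sst, ssr, sse, round(r2, 4))  # 114.0 100.0 14.0 0.8772
```

Note that SST = SSR + SSE holds exactly (114 = 100 + 14), as the identity on the slide requires.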
Sample Correlation Coefficient

rxy = (sign of b1) √r²

where:
b1 = the slope of the estimated regression equation
Sample Correlation Coefficient

The sign of b1 in the estimated regression equation is +.

rxy = +√.8772 = +.9366
Assumptions About the Error Term ε

1. The error ε is a random variable with mean of zero.
2. The variance of ε, denoted by σ², is the same for all values of the independent variable.
3. The values of ε are independent.
4. The error ε is a normally distributed random variable.
Model Assumptions
Testing for Significance

To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of β1 is zero.

Two tests are commonly used: the t test and the F test.

Both the t test and the F test require an estimate of σ², the variance of ε in the regression model.
Testing for Significance

An Estimate of σ²

The mean square error (MSE) provides the estimate of σ², and the notation s² is also used:

s² = MSE = SSE/(n − 2)
Testing for Significance

An Estimate of σ

To estimate σ we take the square root of s². The resulting s is called the standard error of the estimate.

s = √(SSE/(n − 2)) = √(14/(5 − 2)) ≈ 2.16
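The standard error of the estimate for the Reed Auto data can be confirmed with a one-line Python calculation (not part of the original slides):

```python
import math

sse, n = 14, 5         # from the Reed Auto example
mse = sse / (n - 2)    # s^2 = MSE = 14/3 ≈ 4.667
s = math.sqrt(mse)     # standard error of the estimate

print(round(s, 2))  # 2.16
```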
Testing for Significance: t Test

Hypotheses:
H0: β1 = 0
Ha: β1 ≠ 0

Test statistic:
t = b1 / sb1

where sb1 = s / √Σ(xi − x̄)² is the estimated standard deviation of b1.
Testing for Significance: t Test

Rejection rule:
Reject H0 if p-value < α, or if t ≤ −tα/2 or t ≥ tα/2

where tα/2 is based on a t distribution with n − 2 degrees of freedom.
Testing for Significance: t Test

1. Determine the hypotheses: H0: β1 = 0, Ha: β1 ≠ 0.
2. Specify the level of significance: α = .05.
3. Select the test statistic: t = b1 / sb1.
4. State the rejection rule: Reject H0 if p-value < .05 or |t| > 3.182 (with n − 2 = 3 degrees of freedom).
Testing for Significance: t Test

5. Compute the value of the test statistic:

sb1 = s / √Σ(xi − x̄)² = 2.16 / √4 = 1.08
t = b1 / sb1 = 5 / 1.08 = 4.63

6. Determine whether to reject H0: t = 4.541 provides an area of .01 in the upper tail, so the p-value is less than .02. (Also, t = 4.63 > 3.182.) We can reject H0.
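The t statistic for the Reed Auto slope can be reproduced with a short Python sketch (not part of the original slides), using the rounded values the slides carry forward:

```python
import math

b1 = 5.0      # estimated slope from the Reed Auto fit
s = 2.16      # standard error of the estimate (rounded as on the slides)
ssx = 4.0     # sum((xi - x_bar)^2) for x = [1, 3, 2, 1, 3]

s_b1 = s / math.sqrt(ssx)  # estimated standard deviation of b1 = 1.08
t = b1 / s_b1              # test statistic ≈ 4.63

t_crit = 3.182             # t_{.025} with n - 2 = 3 degrees of freedom
reject = abs(t) > t_crit   # True: reject H0 that beta1 = 0

print(round(s_b1, 2), round(t, 2), reject)  # 1.08 4.63 True
```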
Confidence Interval for β1

We can use a 95% confidence interval for β1 to test the hypotheses just used in the t test.

H0 is rejected if the hypothesized value of β1 is not included in the confidence interval for β1.
Confidence Interval for β1

The form of a confidence interval for β1 is:

b1 ± tα/2 sb1

where b1 is the point estimator, tα/2 sb1 is the margin of error, and tα/2 is the t value providing an area of α/2 in the upper tail of a t distribution with n − 2 degrees of freedom.
Confidence Interval for β1

Rejection rule: Reject H0 if 0 is not included in the 95% confidence interval for β1.

95% confidence interval for β1:
b1 ± tα/2 sb1 = 5 ± 3.182(1.08) = 5 ± 3.44, or 1.56 to 8.44

Conclusion: 0 is not included in the confidence interval, so we reject H0.
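The interval arithmetic can be checked with a brief Python sketch (not part of the original slides):

```python
# 95% confidence interval for beta1 in the Reed Auto example.
b1 = 5.0       # point estimator of the slope
s_b1 = 1.08    # estimated standard deviation of b1
t_025 = 3.182  # t_{.025} with n - 2 = 3 degrees of freedom

margin = t_025 * s_b1                 # margin of error ≈ 3.44
ci = (b1 - margin, b1 + margin)       # ≈ (1.56, 8.44)
contains_zero = ci[0] <= 0 <= ci[1]   # False: 0 is outside, so reject H0

print(round(ci[0], 2), round(ci[1], 2), contains_zero)  # 1.56 8.44 False
```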
Testing for Significance: F Test

Hypotheses:
H0: β1 = 0
Ha: β1 ≠ 0

Test statistic:
F = MSR/MSE

where:
MSR = mean square regression = SSR / regression degrees of freedom
MSE = mean square error
Regression degrees of freedom = number of independent variables (excluding the constant)
Testing for Significance: F Test

Rejection rule:
Reject H0 if p-value < α or F ≥ Fα

where Fα is based on an F distribution; for simple regression, F has 1 degree of freedom in the numerator and n − 2 degrees of freedom in the denominator.
Testing for Significance: F Test

1. Determine the hypotheses: H0: β1 = 0, Ha: β1 ≠ 0.
2. Specify the level of significance: α = .05.
3. Select the test statistic: F = MSR/MSE.
4. State the rejection rule: Reject H0 if p-value < .05 or F > 10.13 (with 1 d.f. in the numerator and 3 d.f. in the denominator).
Testing for Significance: F Test

5. Compute the value of the test statistic:

F = MSR/MSE = (100/1)/(14/3) = 21.43

6. Determine whether to reject H0: F = 17.44 provides an area of .025 in the upper tail, so the p-value corresponding to F = 21.43 is less than .025, which is below α = .05. Hence, we reject H0.

The statistical evidence is sufficient to conclude that we have a significant relationship between the number of TV ads aired and the number of cars sold.
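The F statistic can be reproduced from the sums of squares with a short Python sketch (not part of the original slides):

```python
# F test for the Reed Auto simple regression.
ssr, sse = 100.0, 14.0  # sums of squares from the Reed Auto fit
n, k = 5, 1             # sample size; k = number of independent variables

msr = ssr / k            # mean square regression = 100
mse = sse / (n - k - 1)  # mean square error = 14/3 ≈ 4.667
f = msr / mse            # test statistic ≈ 21.43

f_crit = 10.13           # F_{.05} with (1, 3) degrees of freedom
reject = f > f_crit      # True: reject H0 that beta1 = 0

print(round(f, 2), reject)  # 21.43 True
```

For simple regression the F test and the t test are equivalent; indeed F = 21.43 ≈ t² = 4.63².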
Some Cautions about the Interpretation of Significance Tests

Just because we are able to reject H0: β1 = 0 and demonstrate statistical significance does not enable us to conclude that the relationship between x and y is linear.

Rejecting H0: β1 = 0 and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y.

Least squares regression only identifies a linear correlation between x and y.
Excel Example

Tools → Data Analysis → Regression