Econ 3780: Business and Economics Statisticsyuppal.people.ysu.edu › econ_3790 ›...
Transcript of Econ 3780: Business and Economics Statisticsyuppal.people.ysu.edu › econ_3790 ›...
Econ 3790: Statistics Business and Economics
Instructor: Yogesh UppalEmail: [email protected]
Chapter 14
Covariance and Simple Correlation Coefficient
Simple Linear Regression
Covariance
Covariance between x and y is a measure of relationship between x and y.
1))((
1),cov(
n
xxyynSS
yx xy
Reed Auto periodically hasa special week-long sale. As part of the advertisingcampaign Reed runs one ormore television commercialsduring the weekend preceding the sale. Data from asample of 5 previous sales are shown on the next slide.
CovarianceCovariance
Example: Reed Auto SalesExample: Reed Auto Sales
CovarianceCovariance
Example: Reed Auto SalesExample: Reed Auto Sales
Number ofNumber ofTV AdsTV Ads
Number ofNumber ofCars SoldCars Sold
1133221133
14142424181817172727
Covariancex y
1 14 -1 -6 6
3 24 1 4 4
2 18 0 -2 0
1 17 -1 -3 3
3 27 1 7 7
Total=10Total =
100 SSxy =20
xx yy ))(( yyxx
54
201
),cov(
nSS
yx xy
Simple Correlation Coefficient
11
),cov(
yx
yx• Simple Population Correlation Coefficient
• If
< 0, a negative relationship between x and y.
• If
> 0, a positive relationship between x and y.
Simple Correlation Coefficient
Since population standard deviations of x and y are not known, we use their sample estimates to compute an estimate of .
11
),cov(
r
ssyxr
yx
Simple Correlation Coefficient
x y SSx SSy
1 14 -1 -6 1 36
3 24 1 4 1 16
2 18 0 -2 0 4
1 17 -1 -3 1 9
3 27 1 7 1 49
Total=10 Total=98 Total=4 Total= 114
xx yy
Example: Reed Auto SalesExample: Reed Auto Sales
Simple Correlation Coefficient
936.034.5*1
5),cov(
34.54
1141)(
144
1)(
2
2
yxxy
y
x
ssyxr
nyys
nxxs
Chapter 14 Simple Linear Regression
Simple Linear Regression Model
Coefficient of Determination
Testing for Significance
Using the Estimated Regression Equationfor Estimation and Prediction
Residual Analysis
Simple Linear Regression Model
yy
= = 00
+ + 11
xx
++
where:where:00
and and 11
are called are called parameters of the modelparameters of the model,, is a random variable called theis a random variable called the
error termerror term..
The The simple linear regression modelsimple linear regression model
is:is:
The equation that describes how The equation that describes how yy
is related to is related to xx
andandan error term is called the an error term is called the regression modelregression model..
Simple Linear Regression Equation
Positive Linear RelationshipPositive Linear Relationship
yy
xxx
Slope Slope 11is positiveis positive
Regression lineRegression line
InterceptIntercept00
Simple Linear Regression EquationSimple Linear Regression Equation
Negative Linear RelationshipNegative Linear Relationship
yyy
xxx
Slope Slope 11is negativeis negative
Regression lineRegression lineInterceptIntercept00
Simple Linear Regression EquationSimple Linear Regression Equation
No RelationshipNo Relationship
yyy
xxx
Slope Slope 11is 0is 0
Regression lineRegression lineInterceptIntercept
00
Interpretation of 0
and 1
0 (intercept parameter): is the value of y when x = 0.
1 (slope parameter): is the change in y given x changes by 1 unit.
Estimated Simple Linear Regression EquationEstimated Simple Linear Regression Equation
The The estimated simple linear regression equationestimated simple linear regression equation
0 1y b b x 0 1y b b x
•• bb11
is the slope of the line.is the slope of the line.•• bb00
is the is the yy
intercept of the line.intercept of the line.•• The graph is called the estimated regression line.The graph is called the estimated regression line.
•• is the estimated value of y for a given value of x.y
Estimation Process
Regression ModelRegression Modelyy
= = 00
+ + 11
xx
++Regression EquationRegression Equation
EE((y|xy|x) = ) = 00
+ + 11
xxUnknown ParametersUnknown Parameters
00
, , 11
Sample Data:Sample Data:x yx yxx11
yy11. .. .. .. .
xxnn
yynn
EstimatedEstimatedRegression EquationRegression Equation
bb00
and band b11provide provide point estimatespoint estimates
ofof
00 and and
11
xbby 10ˆ
Least Squares Method
Slope for the Estimated Regression Equation
2
1
)(
))((
xxSSand
xxyySSWhere
SSSS
b
x
xy
y
xy
yy--Intercept for the Estimated Regression EquationIntercept for the Estimated Regression Equation
Least Squares MethodLeast Squares Method
0 1b y b x 0 1b y b x
__yy
= mean value for dependent variable= mean value for dependent variable
__xx
= mean value for independent variable= mean value for independent variable
Estimated Regression Equation
54
201
x
xy
SSSS
b
10)2(*52010 xbyb
Slope for the Estimated Regression Equation
y-Intercept for the Estimated Regression Equation
Estimated Regression Equation
Example: Reed Auto SalesExample: Reed Auto Sales
xy *510ˆ
Scatter Diagram and Regression Line
xy 510ˆ
number of ads
3.53.02.52.01.51.0.5
num
ber o
f car
s so
ld
26
24
22
20
18
16
14
12
Estimate of Residuals
x y1 14 15 -1.03 24 25 -1.02 18 20 -2.01 17 15 2.03 27 25 2.0
y yye ˆ
Decomposition of total sum of squares
Relationship Among SST, SSR, SSE
where:where:SST = total sum of squaresSST = total sum of squaresSSR = sum of squares due to regressionSSR = sum of squares due to regressionSSE = sum of squares due to errorSSE = sum of squares due to error
SST = SSR + SSESST = SSR + SSE
2( )iy y 2( )iy y 2ˆ( )iy y 2ˆ( )iy y 2ˆ( )i iy y 2ˆ( )i iy y
-1 1 15 -5 25
-1 1 25 5 25
-2 4 20 0 0
2 4 15 -5 25
2 4 25 5 25SSE=14 SSR=100
yye ˆ 2)ˆ( yySSE yy ˆ 2)ˆ( yySSR
Decomposition of total sum of squares
Check if SST= SSR + SSE
y
The The coefficient of determinationcoefficient of determination
is:is:
Coefficient of DeterminationCoefficient of Determination
rr22
= SSR/SST= SSR/SST
rr22
= SSR/SST = 100/114 = 0.8772= SSR/SST = 100/114 = 0.8772
•• The regression relationship is very strong; about 88%The regression relationship is very strong; about 88%of the variability in the number of cars sold can beof the variability in the number of cars sold can beexplained by the number of TV ads.explained by the number of TV ads.
•• The coefficient of determination (rThe coefficient of determination (r22) is also the square of ) is also the square of the correlation coefficient (r).the correlation coefficient (r).
Sample Correlation Coefficient
21 ) of(sign rbr 21 ) of(sign rbr
ionDeterminat oft Coefficien ) of(sign 1br ionDeterminat oft Coefficien ) of(sign 1br
936.08772.0 r 936.08772.0 r
Sampling Distribution of b1
Estimate of σ2
The mean square error (MSE) provides the The mean square error (MSE) provides the estimate of estimate of σσ22..
ss
22
= MSE = SSE/(= MSE = SSE/(n n
2)2)
where:where:2)ˆ(SSE ii yy 2)ˆ(SSE ii yy
Interval Estimate of 1
:
Example: Reed Auto SalesReed Auto Sales
44.3508.1182.35
Testing for Significance: t Test
Hypotheses
Test Statistic
Where b1 is the slope estimate and SE(b1 ) is the standard error of b1 .
0 1: 0H 0 1: 0H
1: 0aH 1: 0aH
)(0
1
1
bSEbt
Rejection RuleRejection Rule
Testing for Significance: Testing for Significance: tt
TestTest
where: where: tt
is based on a is based on a tt
distributiondistributionwith with nn
--
2 degrees of freedom2 degrees of freedom
Reject Reject HH00
if if pp--value value <<
or or tt
<<
--tt
or or tt
>>
tt
1. Determine the hypotheses.1. Determine the hypotheses.
2. Specify the level of significance.2. Specify the level of significance.
3. Select the test statistic.3. Select the test statistic.
= .05= .05
4. State the rejection rule.4. State the rejection rule. Reject Reject HH00
if if pp--value value <<
.05.05or t or t ≤≤
3.182 or t 3.182 or t ≥≥
3.1823.182
Testing for Significance: Testing for Significance: tt
TestTest
0 1: 0H 0 1: 0H
1: 0aH 1: 0aH
)( 1
1
bSEbt
Testing for Significance: Testing for Significance: tt
TestTest
5. Compute the value of the test statistic.5. Compute the value of the test statistic.
6. Determine whether to reject 6. Determine whether to reject HH00
..
tt
= 4.63 > t= 4.63 > t/2/2
= 3.182. We can reject = 3.182. We can reject HH00
..
63.408.15
)( 1
1 bSE
bt
Some Cautions about the Interpretation of Significance Tests
Just because we are able to reject Just because we are able to reject HH00
: : 11
= 0 and= 0 anddemonstrate statistical significance does not enabledemonstrate statistical significance does not enableus to conclude that there is a us to conclude that there is a linear relationshiplinear relationshipbetween between xx
and and yy..
Rejecting Rejecting HH00
: : 11
= 0 and concluding that the= 0 and concluding that therelationship between relationship between xx
and and yy
is significant does is significant does
not enable us to conclude that a not enable us to conclude that a causecause--andand--effecteffectrelationshiprelationship
is present between is present between xx
and and yy..