Econ 140 Lecture 14: Multiple Regression Models
Date posted: 20-Dec-2015
Today’s plan
• How to read the estimated coefficients
• Functional form
• Testing the explanatory power of the model
• Adjustment to $R^2$
Reading coefficients
• With a bivariate model we could easily determine how a change in $X$ affects $Y$:
$$\hat{Y} = \hat{a} + \hat{b}X$$
• With a multivariate model,
$$\hat{Y} = \hat{a} + \hat{b}_1 X_1 + \hat{b}_2 X_2$$
determining how a change in $X_2$ affects $Y$ is more complicated
• For a multivariate regression, you must hold $X_1$ constant to determine the effect of a change in $X_2$ on $Y$
– For this reason we call the slope coefficients in a multivariate regression the partial regression coefficients
Reading coefficients: Example
• Back to our earnings and education example from L11.xls
• For our estimated multivariate regression equation, the expectation of Y is:
E(Y) = 4.135 + 0.057 X1 + 0.023 X2
• If we hold age (X2) constant at 30, the expectation of Y becomes:
E(Y) = 4.135 + 0.057 X1 + 0.023 (30)
= 4.135 + 0.023 (30) + 0.057 X1
= 4.825 + 0.057 X1
– What we’re doing here is looking at the relationship between education and earnings for 30-year-olds
– This can also be done for any other age, e.g. 50-year-olds:
E(Y) = 4.135 + 0.057 X1 + 0.023 (50) = 5.285 + 0.057 X1
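A minimal Python sketch of what "holding age constant" means, using the coefficients reported on this slide. The function names are illustrative only. It shows that fixing age moves the intercept while the education slope stays at 0.057.

```python
# Coefficients from the slide: E(Y) = 4.135 + 0.057*X1 + 0.023*X2,
# where X1 is years of education and X2 is age.
A, B1, B2 = 4.135, 0.057, 0.023


def expected_y(education, age):
    """Expected value of Y for given years of education (X1) and age (X2)."""
    return A + B1 * education + B2 * age


def line_for_age(age):
    """Intercept and education slope of E(Y) once age is held fixed."""
    return A + B2 * age, B1


for age in (30, 50):
    intercept, slope = line_for_age(age)
    print(f"age {age}: E(Y) = {intercept:.3f} + {slope:.3f} X1")

# One more year of education changes E(Y) by the same amount at either age.
effect_at_30 = expected_y(13, 30) - expected_y(12, 30)
effect_at_50 = expected_y(13, 50) - expected_y(12, 50)
print(f"effect of one extra year: {effect_at_30:.3f} (age 30), {effect_at_50:.3f} (age 50)")
```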
Functional form
• Our example on earnings and years of education has some foundation in economic theory, but the specification is basically ‘ad hoc’: we know only that we want to test the relationship between earnings and years of schooling.
• Let’s look at another example that is based on economic theory: the Cobb-Douglas production function $Y = A L^{\alpha} K^{\beta}$
• If we want to test for constant returns to scale, the restriction is
$$\alpha + \beta = 1$$
Functional form (2)
• We can get the equation into a form we can estimate by taking logs:
$$\ln Y = \ln A + \alpha \ln L + \beta \ln K$$
– This is called log-linear form since all the variables are in logs
– The model is now linear in parameters so we can use least squares to estimate it
– The log-linear form gives us estimated coefficients that are elasticities: the estimates of $\alpha$ and $\beta$ give us the elasticities of output with respect to labor and capital
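A short derivation of why the log-linear coefficients are elasticities. The symbols $\alpha$ and $\beta$ for the exponents on labor and capital are assumed here, following the usual Cobb-Douglas notation. Differentiating the logged equation with respect to one input, holding the other fixed, gives the ratio of proportional changes:

```latex
\ln Y = \ln A + \alpha \ln L + \beta \ln K
\quad\Longrightarrow\quad
\frac{\partial \ln Y}{\partial \ln L}
  = \frac{\partial Y / Y}{\partial L / L}
  = \alpha ,
\qquad
\frac{\partial \ln Y}{\partial \ln K}
  = \frac{\partial Y / Y}{\partial K / K}
  = \beta
```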
Example with longitudinal data
• L14-1.xls is on the web. It contains information on companies in the UK private sector. The data come from DATASTREAM; the comparable source for US companies is COMPUSTAT
• Note that this is a longitudinal data set: we are analyzing the same agents (the companies) over time
• I have calculated the true output elasticity with respect to labor for a 100% change in labor and the true output elasticity with respect to labor for a 10% change in labor
– Note that the larger the increase in the independent variable, the further the true elasticity is from the estimated coefficient
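A rough Python sketch of the comparison described above, assuming a log-log relationship in which output is proportional to the input raised to the power b. The coefficient 0.67 is used for illustration (it is one of the estimates reported later in these slides), and the function names are hypothetical.

```python
def true_pct_change_in_y(b, pct_change_in_x):
    """Exact proportional change in Y when X rises by pct_change_in_x,
    given ln Y = const + b * ln X, i.e. Y proportional to X**b."""
    return (1.0 + pct_change_in_x) ** b - 1.0


def true_elasticity(b, pct_change_in_x):
    """Proportional change in Y divided by the proportional change in X."""
    return true_pct_change_in_y(b, pct_change_in_x) / pct_change_in_x


B = 0.67  # illustrative log-log coefficient (assumption, see lead-in)

for change in (0.10, 1.00):
    elasticity = true_elasticity(B, change)
    print(f"{change:.0%} change in X: true elasticity {elasticity:.3f}, "
          f"gap from coefficient {B - elasticity:.3f}")
```

The gap between the coefficient and the true elasticity is small for the 10% change and much larger for the 100% change, which is the point made in the last bullet.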
Example with longitudinal data (2)
• If we want to calculate the true change, we need to calculate:
$$\%\Delta Y = \exp\!\left[\hat{b}\,\ln(1 + \%\Delta X)\right] - 1$$
• If we want to estimate the Cobb-Douglas production function, we use the partial slope coefficients
• We can calculate the partial slope coefficients:
$$\hat{b}_1 = \frac{(80.064)(59.491) - (60.156)(67.165)}{(78.847)(59.491) - (60.156)^2} = \frac{722.71}{1071.94} = 0.67$$
$$\hat{b}_2 = 0.45$$
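The partial slope calculation can be sketched in Python as below. The five numbers are the ones that appear on the slide; which sum each one represents is an assumption, read off the standard two-regressor least squares formula, and the variable names are illustrative.

```python
# Sums of squares and cross-products in deviation form.
# ASSUMPTION: the labels below are inferred from the standard formula,
# not stated on the slide.
S_11 = 78.847   # sum of x1 squared
S_22 = 59.491   # sum of x2 squared
S_12 = 60.156   # sum of x1 * x2
S_1Y = 80.064   # sum of x1 * y
S_2Y = 67.165   # sum of x2 * y


def partial_slopes(s11, s12, s22, s1y, s2y):
    """Least squares slopes for a regression of y on x1 and x2."""
    denominator = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / denominator
    b2 = (s2y * s11 - s1y * s12) / denominator
    return b1, b2


b1, b2 = partial_slopes(S_11, S_12, S_22, S_1Y, S_2Y)
print(f"b1 = {b1:.2f}, b2 = {b2:.2f}, b1 + b2 = {b1 + b2:.2f}")
```

Under this reading of the numbers, the second slope comes out close to the 0.45 shown on the slide, which supports the assignment of the sums.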
Example with longitudinal data (3)
• Adding our estimates together we find:
$$\hat{\alpha} + \hat{\beta} = \hat{b}_1 + \hat{b}_2 = 1.12$$
• Later on we’ll test the constraint that $\alpha + \beta = 1$
Phillips Curve
• The Phillips Curve is an example of ad-hoc variable inclusion
[Figure: Phillips Curve, plotting wage inflation W against unemployment Un]
• The equation representing this relationship between unemployment and wage inflation is:
$$W = a + b\,\frac{1}{U_n}$$
Phillips Curve (2)
• With ad-hoc specification we don’t know what other variables are relevant
– we need to make informed guesses determined by what we know of economic theory
The story so far
• Functional form
• Omitted variable bias
• Types of data
– Cross section: earnings and education
– Panel/longitudinal: Cobb-Douglas
– Time-series: Phillips Curve
Variation in multivariate models
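• Let our model be
$$Y = a + b_1 X_1 + b_2 X_2 + e$$
• We still want to calculate:
$$\hat{b}_1,\quad \hat{b}_2,\quad \hat{\sigma}^2_{\hat{b}_1},\quad \hat{\sigma}^2_{\hat{b}_2}$$
– The question is how to calculate these values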
Variation in multivariate models (2)
• It still holds that the variance of the regression line is
$$\hat{\sigma}^2_{YX} = \frac{\sum e^2}{n - k}$$
• It also still holds that:
$$\hat{\sigma}_{\hat{b}_1} = \sqrt{\hat{\sigma}^2_{\hat{b}_1}}$$
Test statistics in multivariate models
• We will start with the sum of squares identity, where:
Total = Explained + Residual
or
$$\sum (Y - \bar{Y})^2 = \sum (\hat{Y} - \bar{Y})^2 + \sum (Y - \hat{Y})^2$$
• But the composition of the ESS will be different. Our sum of squares identity will look like this:
$$\sum y^2 = \hat{b}_1 \sum x_1 y + \hat{b}_2 \sum x_2 y + \sum e^2$$
• As you add more independent variables to the model, more terms get added to the ESS
Test statistics in multivariate models (2)
• Our $R^2$ is:
$$R^2 = \frac{\text{Explained sum of squares}}{\text{Total sum of squares}} = \frac{\hat{b}_1 \sum x_1 y + \hat{b}_2 \sum x_2 y}{\sum y^2}$$
• Now let’s look back to an example from an earlier lecture
– we looked at the returns to education (b1) and age (b2) in terms of earnings
– we will calculate the test statistics and consider problems with the model
Test statistics in multivariate models (3)
• On an exam you may be asked to estimate the regression line, given a matrix of products and cross-products like this:

|    | y | x1 | x2 |
|---|---|---|---|
| y | 25.05 | 15.75 | 164.37 |
| x1 | | 163.00 | 276.17 |
| x2 | | | 6394.97 |

• You will also be given these values:
$$\bar{Y} = 5.77,\quad \bar{X}_1 = 12.83,\quad \bar{X}_2 = 38.53,\quad n = 36$$
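A minimal sketch of how the regression line can be recovered from the matrix of products and cross-products, using the standard two-regressor least squares formulas. The pairing of the three means with Y, X1 and X2 (5.77, 12.83 and 38.53) is an assumption read from the slide layout, and the names are illustrative.

```python
# Sums of squares and cross-products (deviation form) from the matrix.
S_YY, S_1Y, S_2Y = 25.05, 15.75, 164.37
S_11, S_12, S_22 = 163.00, 276.17, 6394.97

# ASSUMPTION: sample means of Y, X1 (education) and X2 (age), in that order.
Y_BAR, X1_BAR, X2_BAR = 5.77, 12.83, 38.53


def two_regressor_ols(s11, s12, s22, s1y, s2y):
    """Partial slope coefficients for a regression of y on x1 and x2."""
    denominator = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / denominator
    b2 = (s2y * s11 - s1y * s12) / denominator
    return b1, b2


b1, b2 = two_regressor_ols(S_11, S_12, S_22, S_1Y, S_2Y)

# The fitted line passes through the point of means, which pins down the intercept.
a = Y_BAR - b1 * X1_BAR - b2 * X2_BAR

print(f"Y-hat = {a:.3f} + {b1:.3f} X1 + {b2:.3f} X2")
```

The slopes agree with the 0.057 and 0.023 quoted in the lecture. The intercept differs slightly from 4.135 in the third decimal place, presumably because the means are rounded.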
Test statistics in multivariate models (4)
• The regression line we calculated earlier is:
$$\hat{Y} = 4.135 + 0.057 X_1 + 0.023 X_2$$
• We can start our calculations with:
$$\hat{\sigma}^2_{YX} = \frac{25.05 - 0.057(15.75) - 0.023(164.37)}{36 - 3} = \frac{20.371}{33} = 0.617$$
• Taking the square root, we find the root mean square error:
$$\text{Root MSE} = \hat{\sigma}_{YX} = \sqrt{0.617} \approx 0.786$$
Test statistics in multivariate models (5)
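• We can then calculate:
$$\hat{\sigma}^2_{\hat{b}_1} = \frac{6394.97}{(163)(6394.97) - (276.17)^2}\,(0.617) = \frac{6394.97}{966110.24}\,(0.617) = 0.0041$$
• Taking the square root gives us
$$\hat{\sigma}_{\hat{b}_1} = \sqrt{0.0041} = 0.064$$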
Test statistics in multivariate models (6)
• We can then calculate:
$$\hat{\sigma}^2_{\hat{b}_2} = \frac{163}{966110.24}\,(0.617) = 0.000104$$
• Taking the square root gives
$$\hat{\sigma}_{\hat{b}_2} = \sqrt{0.000104} = 0.01$$
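The two standard error calculations can be sketched together in Python. The sketch assumes the usual two-regressor result that the variance of each slope is the residual variance times the other regressor's sum of squares, divided by the common denominator 966110.24. The variable names are illustrative.

```python
import math

# Sums of squares and cross-products from the lecture's matrix.
S_11, S_12, S_22 = 163.00, 276.17, 6394.97

# Estimated variance of the regression line, as computed in the lecture.
SIGMA2_YX = 0.617

# Common denominator: (sum x1^2)(sum x2^2) - (sum x1 x2)^2
denominator = S_11 * S_22 - S_12 ** 2

# The variance of b1 uses the sum of squares of x2, and vice versa.
var_b1 = SIGMA2_YX * S_22 / denominator
var_b2 = SIGMA2_YX * S_11 / denominator

se_b1 = math.sqrt(var_b1)
se_b2 = math.sqrt(var_b2)

print(f"denominator = {denominator:.2f}")
print(f"var(b1) = {var_b1:.4f}, se(b1) = {se_b1:.3f}")
print(f"var(b2) = {var_b2:.6f}, se(b2) = {se_b2:.3f}")
```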
Hypothesis test on education
• We can also form a null hypothesis $H_0: b_1 = 0$
• The t-ratio is calculated:
$$t = \frac{0.057 - 0}{0.064} = 0.891$$
• For a significance level of 5% we have a table t value of $t_{\alpha/2,\,33} = 2.035$
• Since $|t| < t_{\alpha/2}$, we accept the null hypothesis
• Recall that the purpose of the test was to examine whether or not education has an effect on earnings. Can we accept this result given what we know about economics?
Hypothesis test on age
• We construct another hypothesis test: $H_0: b_2 = 0$
• The t-ratio is calculated:
$$t = \frac{\hat{b}_2 - b_2}{\hat{\sigma}_{\hat{b}_2}} = \frac{0.023 - 0}{0.01} = 2.3$$
• For a significance level of 5% we have a table t value of $t_{\alpha/2,\,33} = 2.035$
• Since $|t| > t_{\alpha/2}$, we reject the null hypothesis
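Both t-tests follow the same recipe, which a short Python sketch makes explicit. The estimates, standard errors and the table value 2.035 are taken from the lecture; the helper names are illustrative.

```python
T_CRITICAL = 2.035  # table value for a 5% two-sided test with 33 df (from the lecture)


def t_ratio(estimate, std_error, hypothesized=0.0):
    """t statistic for H0: coefficient equals the hypothesized value."""
    return (estimate - hypothesized) / std_error


def rejects_null(t, critical=T_CRITICAL):
    """Two-sided decision rule: reject when |t| exceeds the table value."""
    return abs(t) > critical


t_education = t_ratio(0.057, 0.064)
t_age = t_ratio(0.023, 0.01)

for name, t in (("education", t_education), ("age", t_age)):
    decision = "reject H0" if rejects_null(t) else "do not reject H0"
    print(f"{name}: t = {t:.3f}, {decision}")
```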
Looking at $R^2$
• Let’s look at $R^2$:
$$R^2 = \frac{\hat{b}_1 \sum x_1 y + \hat{b}_2 \sum x_2 y}{\sum y^2} = \frac{0.057(15.75) + 0.023(164.37)}{25.05} = \frac{4.678}{25.05} = 0.187$$
• This is a rather low $R^2$
– This means that the regression equation doesn’t explain the variation well
– The regression equation only explains about 1/5 of the variation in Y
Looking at $R^2$ (2)
• What should we do about the form of our estimated equation when years of education are shown to be statistically insignificant at our chosen significance level?
• We chose a 5% significance level for our test, but we might have been able to reject the null at a different significance level
• Remember: with hypothesis tests we want to limit the number of Type I errors, where we reject a null that is actually true
Testing explanatory power
• What if we examined the regression equation as a whole?
• To do so, we look at this null hypothesis:
H0 : b1 = b2 = 0
– This says that neither of the independent variables has any explanatory power
– To test this, we will use an F test
Testing explanatory power (2)
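• The F statistic that we’re looking at can be found on the LINEST output
• The F test comes from the ANOVA table for the multivariate case, which looks like this:

| Source of variation | Sum of squares | Degrees of freedom | Mean squared deviation |
|---|---|---|---|
| Explained | $\hat{b}_1 \sum x_1 y + \hat{b}_2 \sum x_2 y$ | 2 | $(\hat{b}_1 \sum x_1 y + \hat{b}_2 \sum x_2 y)/2$ |
| Residual | $\sum e^2$ | n − 3 | $\sum e^2/(n-3)$ |
| Total | $\sum y^2$ | n − 1 | $\sum y^2/(n-1)$ |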
Testing explanatory power (3)
• The F statistic will look like:
$$F = \frac{(\hat{b}_1 \sum x_1 y + \hat{b}_2 \sum x_2 y)/2}{\sum e^2/(n-3)} = \frac{4.72/2}{20.33/33} = \frac{2.36}{0.62} = 3.81$$
• Using the F table, you choose a significance level and use the degrees of freedom in the numerator and denominator, here $F_{0.05,\,2,\,33}$
– The 1st row in the table is the df in the numerator
– The 1st column is the df in the denominator
– The 2nd column is the significance level
Testing explanatory power (4)
• If our calculated F statistic is greater than (to the right of) our F table value, we reject the null
• If our calculated F statistic is less than (to the left of) our F table value, we accept the null
[Figure: F distribution with the F table value marked. To the left of the table value we accept the null (H0); to the right we reject the null in favor of H1]
Testing explanatory power (5)
• Looking at the F table, we find that there is no value for exactly 33 df
– We have to approximate using 30 df instead
– Our approximated F value is $F_{0.05,\,2,\,33} \approx 3.29$
• We reject the null because $F > F_{0.05,\,2,\,33}$
• Had we picked a 1% significance level, our F table value would be $F_{0.01,\,2,\,33} \approx 5.27$
– and we would’ve accepted the null because $F < F_{0.01,\,2,\,33}$
Testing explanatory power (6)
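• In summary, we’re more likely to reject the null at a greater significance level
• In this case, we rejected at a 5% significance level and accepted at a 1% level
• Graphically:
[Figure: F distribution with the 5% table value (3.29), the calculated F* value (3.81), and the 1% table value (5.27) marked along the F axis]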
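The F test above can be sketched in a few lines of Python. The explained and residual sums of squares (4.72 and 20.33) and the two table values are the ones quoted in the lecture; the function and variable names are illustrative. The unrounded statistic comes out slightly above the 3.81 on the slide, which rounds the denominator to 0.62.

```python
ESS, RSS = 4.72, 20.33   # explained and residual sums of squares from the lecture
N, K = 36, 3             # sample size and number of estimated parameters

# Approximate table values quoted in the lecture for 2 and 33 df.
F_TABLE = {0.05: 3.29, 0.01: 5.27}


def f_statistic(ess, rss, n, k):
    """F statistic for H0: all slope coefficients are zero."""
    return (ess / (k - 1)) / (rss / (n - k))


f_stat = f_statistic(ESS, RSS, N, K)

# Reject the null at a given level when the statistic exceeds the table value.
decisions = {level: f_stat > critical for level, critical in F_TABLE.items()}

print(f"F = {f_stat:.2f}")
for level, reject in decisions.items():
    print(f"{level:.0%} level: {'reject' if reject else 'do not reject'} H0")
```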
Testing explanatory power (7)
• The t-test suggests that we should remove years of education from our regression
• An F-test on the joint hypothesis rejects the null, but the test is weak. At a lower significance level (1 percent), we would have accepted the null.
• In this instance, we want to keep the years of education variable in the equation because of what we know of economic theory
• What to do? Conclude that the economic theory is weak. Obtain more data and try again!
Adjustment to $R^2$
• Adding variables to a regression never lowers $R^2$, and in practice it almost always raises it
– $R^2$ is important, but it isn’t the sole criterion for judging a model’s explanatory power
• Adjusted $R^2$ adjusts for the loss in degrees of freedom associated with adding independent variables to the regression
Adjustment to $R^2$ (2)
• Adjusted R2 is written as
Adj R2 = 1 - (1 - R2)((n - 1)/(n - k))
n : sample size
k : number of parameters in the regression
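As a rough illustration, the adjustment can be applied to the earnings example from earlier in the lecture (R2 = 0.187, n = 36, k = 3). The function name is illustrative, and the second call uses a hypothetical fourth parameter that leaves R2 unchanged, purely to show the penalty.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - k),
    where n is the sample size and k the number of estimated parameters."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)


# Earnings regression from the lecture: R2 = 0.187 with n = 36 and k = 3.
adj = adjusted_r2(0.187, 36, 3)

# Hypothetical extra regressor that adds nothing to R2 (k rises to 4).
adj_with_useless_variable = adjusted_r2(0.187, 36, 4)

print(f"adjusted R2 with k = 3: {adj:.3f}")
print(f"adjusted R2 with k = 4 and the same R2: {adj_with_useless_variable:.3f}")
```

Because the correction factor (n - 1)/(n - k) grows with k, a variable that does not raise R2 enough will lower the adjusted value.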
What’s next
• Restricted least squares and the Cobb-Douglas production function
• Including qualitative indicators in the regression equation (e.g. race, gender, marital status).