Multiple Regression


Page 1: Multiple Regression


Multiple Regression

Dr. C. Ertuna

Page 2: Multiple Regression

Multiple Regression (CLR)

A multiple linear regression model will look like

Y_t = β₁X₁ₜ + β₂X₂ₜ + … + β_kX_kₜ + u_t        (5.2)

where X₁ₜ ≡ 1, so that β₁ is the intercept term.
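As a minimal illustration (not from the slides), such a model can be fitted in Python with statsmodels; the data and coefficient values below are synthetic:

```python
import numpy as np
import statsmodels.api as sm

# Synthetic data: 100 observations, two hypothetical regressors
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(size=100)

# add_constant supplies the X_1t = 1 column that carries the intercept
X = sm.add_constant(X)
results = sm.OLS(y, X).fit()
print(results.summary())  # coefficients, R-squared, F-statistic, AIC, ...
```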

Page 3: Multiple Regression

Assumptions of the CLR Model

• Linearity
• Variance in X: Var(X) ≠ 0
• Non-stochastic X: Cov(Xₛ, uₜ) = 0
• No multicollinearity

Residuals:
• Homoskedasticity: Var(uₜ) = σ²
• Zero expected value: E(uₜ) = 0
• Serial independence: Cov(uₛ, uₜ) = 0 for s ≠ t
• Normality of residuals

Page 4: Multiple Regression

Goodness of Fit Measures

R² cannot be used to compare models with different numbers of explanatory variables, even if the functional forms of the models are the same: additional regressors always decrease RSS and hence increase R². The adjusted R² (R̄²) takes the number of additional regressors into account, and

R̄² ≤ R²   (p. 65, Eq. 5.57)

Page 5: Multiple Regression

Goodness of Fit Measures (Cont.)

AIC (Akaike Information Criterion): the best criterion among the alternatives. The decision rule is that the model with the lowest AIC is the best.

AIC = (RSS/n) · exp(2k/n)

where n = number of observations, k = number of regressors + intercept, and RSS = residual sum of squares.

Page 6: Multiple Regression

Goodness of Fit Measures (Cont.)

The AIC is also written in log form; the decision rule is the same (lowest AIC wins):

AIC = n · ln(RSS/n) + 2k

where, as before, n = number of observations, k = number of regressors + intercept, and RSS = residual sum of squares.
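As a quick sketch (with assumed numbers, not from the slides), both AIC forms can be computed in Python; since the log form equals n·ln of the level form, the two always rank models identically:

```python
import math

def aic_level(rss, n, k):
    """Level form: AIC = (RSS/n) * exp(2k/n)."""
    return (rss / n) * math.exp(2 * k / n)

def aic_log(rss, n, k):
    """Log form: AIC = n * ln(RSS/n) + 2k."""
    return n * math.log(rss / n) + 2 * k

# Hypothetical comparison: model B adds one regressor and lowers RSS
n = 100
for name, rss, k in [("A", 520.0, 3), ("B", 505.0, 4)]:
    print(name, round(aic_level(rss, n, k), 4), round(aic_log(rss, n, k), 2))
```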

Page 7: Multiple Regression

Testing Model Parameters

For several reasons a researcher may wish to test whether certain regression parameters (or a certain set of them) are equal to one another or to a specific value. For example: is the impact of two regressors, Xᵢ and Xⱼ, the same? That is, is βᵢ = βⱼ? This amounts to placing restrictions on those parameters, which is why it is also called "Testing Linear Restrictions."

Page 8: Multiple Regression

Restricted and Unrestricted Models

Restrictions on a model can take different forms:

(a) A combination of coefficients assumes a certain value: for example, β₂ + β₃ = 1.

(b) A single coefficient assumes a certain value, for example β₂ = c; or, in the redundant-variable case, β₂ = 0 (meaning X₂ is a redundant regressor that does not contribute to the explanation of the model).

Page 9: Multiple Regression

Testing Linear Restrictions

There are several methods to test linear restrictions, such as:

a) Likelihood Ratio Test (based on estimation of both the restricted and the unrestricted model),

b) Wald Test (based on estimation of the unrestricted model), and

c) Lagrange Multiplier Test (based on estimation of the restricted model).

Page 10: Multiple Regression

Restricted and Unrestricted Models

An unrestricted model is the model prior to any restrictions. In general, unrestricted models have more regressors than their restricted versions. The fundamental idea behind any linear restriction test is that the RSS of the restricted model (with fewer explanatory variables) is greater than the RSS of the unrestricted model (which has more explanatory variables).

Page 11: Multiple Regression

H₀ of Linear Restriction Tests

In all three approaches to testing linear restrictions the null hypothesis is as follows:

H₀: There is no difference between the restricted and the unrestricted model in terms of goodness of fit; hence we do not need the extra regressor(s) (i.e., the unrestricted model). Put another way, the change in goodness of fit between the two specifications is statistically insignificant.

Page 12: Multiple Regression

Hₐ of Linear Restriction Tests

If, on the other hand, p-value < α, then the unrestricted model provides a better goodness of fit than the restricted model.

For example, if the restriction is β₂ = 0 and p-value < α, that means β₂ ≠ 0; in other words, regressor X₂ is not redundant, it does contribute to the explanation of the model.

Page 13: Multiple Regression

Linear Restriction Application

The t-test in SPSS's Coefficients table output is a special case of the Wald test. H₀: βᵢ = 0 (it tests whether Xᵢ is redundant or not).

The F-test in SPSS's ANOVA table output is a form of the Likelihood Ratio test. H₀: β₂ = β₃ = ⋯ = β_k = 0 (the joint significance of the X's is tested).

Page 14: Multiple Regression

Example: Omitted Variable Test

Page 78. Two models, Model-1 (Table 5.4) and Model-2 (Table 5.5), using the F-form of the Likelihood Ratio test:

F(k_U − k_R, N − k_U) = [(RSS_R − RSS_U) / (k_U − k_R)] / [RSS_U / (N − k_U)]

p-value = FDIST(F-value; k_U − k_R; N − k_U)

Page 15: Multiple Regression

Definition of Test Parameters

RSS_R = residual sum of squares of the restricted model (the model with fewer variables).

RSS_U = residual sum of squares of the unrestricted model (the model with more variables).

k_R = number of parameters (including the intercept) in the restricted model.

k_U = number of parameters (including the intercept) in the unrestricted model.

N = number of observations of the unrestricted model.

Page 16: Multiple Regression

Steps in Omitted Variable Test

1) Get RSS and k of the restricted model (the model with fewer variables).

2) Get RSS, k, and N of the unrestricted model (the model with more variables).

3) Apply the F-form of the Likelihood Ratio test (get organized and use Excel for the computation):

F(k_U − k_R, N − k_U) = [(RSS_R − RSS_U) / (k_U − k_R)] / [RSS_U / (N − k_U)]

p-value = FDIST(F-value; k_U − k_R; N − k_U)
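For readers working outside Excel, here is a minimal Python sketch of the same computation (the function name and layout are mine; scipy's f.sf plays the role of Excel's FDIST):

```python
from scipy.stats import f

def lr_f_test(rss_r, rss_u, k_r, k_u, n):
    """F-form of the Likelihood Ratio test for linear restrictions."""
    df1 = k_u - k_r                     # number of restrictions
    df2 = n - k_u                       # residual df of the unrestricted model
    f_value = ((rss_r - rss_u) / df1) / (rss_u / df2)
    p_value = f.sf(f_value, df1, df2)   # upper tail, same as Excel's FDIST
    return f_value, p_value
```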

Page 17: Multiple Regression

Omitted Variable Test: Step-1

Page 18: Multiple Regression

Omitted Variable Test: Step-2

Page 19: Multiple Regression

Omitted Variable Test: Step-3

Use the formula on page 70 to compute the F-value in Excel, where Model-1 is the restricted model. Why? Because it does not include the variable EDUC; in other words, the regression parameter for EDUC is set to zero in Model-1 (the restriction).

Then use =FDIST() to compute the p-value.

Page 20: Multiple Regression

F-form of the Likelihood Ratio test

            Restricted    Unrestricted
RSS         153.5327      135.211
k           3             4
N           900           900

F-value = 121.412039
p-value = 0.00000
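Plugging the slide's numbers into the lr_f_test sketch above reproduces the reported values:

```python
f_value, p_value = lr_f_test(rss_r=153.5327, rss_u=135.211, k_r=3, k_u=4, n=900)
print(f_value)  # ~121.412, matching the slide
print(p_value)  # ~0.0000, far below any conventional alpha
```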

Page 21: Multiple Regression

DECISION

• Since the p-value is smaller than alpha, we decide that the restriction does not hold.

• In other words, the unrestricted model is better than the restricted model.

• In particular, the variable EDUC should be part of the model's explanatory variables.

Page 22: Multiple Regression


END

Page 23: Multiple Regression

Test for marginal contribution of a new variable

• A very useful test in deciding whether a new variable should be retained in the model.
• E.g.: the mortality rate of a country is a function of its national income, literacy rate, and health indicators.
• The question is whether we should include per capita income (PCI) in the model.
• Estimate the model without PCI and get R²(old).
• Re-estimate including PCI and get R²(new).

H₀: Addition of the new variable does not improve the model.
H₁: Addition of the new variable improves the model.

If the estimated F is higher than the critical F table value, reject the null hypothesis; in the example above, that means PCI needs to be included.

F = [(R²_new − R²_old) / (no. of new parameters)] / [(1 − R²_new) / (n − k_new)]
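A minimal Python sketch of this F-test; the R² values, sample size, and parameter counts below are invented for illustration:

```python
from scipy.stats import f

# Hypothetical example: adding PCI raises R-squared from 0.62 to 0.66
r2_old, r2_new = 0.62, 0.66
n, k_new = 120, 5   # observations; parameters in the new model (incl. intercept)
m = 1               # number of newly added parameters (just PCI)

f_value = ((r2_new - r2_old) / m) / ((1 - r2_new) / (n - k_new))
p_value = f.sf(f_value, m, n - k_new)
print(f_value, p_value)  # reject H0 when f_value exceeds the critical F value
```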

Page 24: Multiple Regression

Testing equality of coefficients

• To test whether two slope coefficients are equal, use the t-test approach. H₀: β₃ = β₄, i.e., β₃ − β₄ = 0.

t = (β̂₃ − β̂₄) / SE(β̂₃ − β̂₄)

where SE(β̂₃ − β̂₄) = √[ Var(β̂₃) + Var(β̂₄) − 2·Cov(β̂₃, β̂₄) ]
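A sketch of this arithmetic in Python; the coefficient estimates, variances, and covariance are invented, since in practice they come from the fitted model's coefficient table and covariance matrix:

```python
import math
from scipy.stats import t as t_dist

b3, b4 = 0.42, 0.35                              # hypothetical slope estimates
var_b3, var_b4, cov_b34 = 0.0016, 0.0025, 0.0004
n, k = 900, 5                                    # observations, parameters incl. intercept

se_diff = math.sqrt(var_b3 + var_b4 - 2 * cov_b34)
t_value = (b3 - b4) / se_diff
p_value = 2 * t_dist.sf(abs(t_value), df=n - k)  # two-sided test of H0: b3 = b4
print(t_value, p_value)
```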

Page 25: Multiple Regression

Testing linear equality restriction

• Theory might lead you to impose certain "a priori" restrictions on your model.
• E.g.: constant returns to scale in the Cobb-Douglas model: β₂ + β₃ = 1. This is a linear restriction.
• How do you check whether it is valid?
• One way is the t-test. H₀: β₂ + β₃ = 1, i.e., β₂ + β₃ − 1 = 0.

t = (β̂₂ + β̂₃ − 1) / SE(β̂₂ + β̂₃)

where SE(β̂₂ + β̂₃) = √[ Var(β̂₂) + Var(β̂₃) + 2·Cov(β̂₂, β̂₃) ]
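And the same arithmetic for the constant-returns restriction, again with invented Cobb-Douglas estimates; note the +2·Cov term, since the restriction involves a sum of coefficients:

```python
import math
from scipy.stats import t as t_dist

b2, b3 = 0.65, 0.30                              # hypothetical output elasticities
var_b2, var_b3, cov_b23 = 0.0036, 0.0049, -0.0021
n, k = 200, 3

se_sum = math.sqrt(var_b2 + var_b3 + 2 * cov_b23)
t_value = (b2 + b3 - 1) / se_sum                 # H0: b2 + b3 = 1
p_value = 2 * t_dist.sf(abs(t_value), df=n - k)
print(t_value, p_value)  # fail to reject -> constant returns plausible
```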