4. Multiple Regression Analysis: Estimation -Most econometric regressions are motivated by a...

  • date post

    20-Dec-2015

4. Multiple Regression Analysis: Inference

-Most econometric regressions are motivated by a question

-e.g.: Do Canadian Heritage commercials have a positive impact on Canadian identity?

-Once a regression has been run, hypothesis tests work to both refine the regression and answer the question

-To do this, we assume that the error is normally distributed

-Hypothesis tests also assume no statistical issues in the regression

4. Multiple Regression Analysis: Inference

4.1 Sampling Distributions of the OLS Estimators

4.2 Testing Hypotheses about a Single Population Parameter: The t test

4.3 Confidence Intervals

4.4 Testing Hypotheses about a Single Linear Combination of the Parameters

4.5 Testing Multiple Linear Restrictions: The F test

4.6 Reporting Regression Results

4.1 Sampling Distributions of OLS

-In chapter 3, we formed assumptions that make OLS unbiased and covered the issue of omitted variable bias

-In chapter 3 we also obtained estimates for OLS variance and showed that, under the Gauss-Markov assumptions, it is the smallest among all linear unbiased estimators

-Expected value and variance are just the first two moments of Bjhat; its distribution can still have any shape

4.1 Sampling Distributions of OLS

-From our OLS estimate formulas, the sampling distributions of the OLS estimators depend on the underlying distribution of the errors

-In order to conduct hypothesis tests, we assume that the error is normally distributed in the population

-This is the NORMALITY ASSUMPTION:

Assumption MLR.6 (Normality)

The population error u is independent of the explanatory variables x1, x2,…,xk and is normally distributed with zero mean and variance σ²:

$u \sim \mathrm{Normal}(0, \sigma^2)$

Assumption MLR.6 Notes

MLR.6 is much stronger than any of our previous assumptions, as it implies:

MLR.4: E(u|X)=E(u)=0

MLR.5: Var(u|X)=Var(u)=σ²

Assumptions MLR.1 through MLR.6 are the CLASSICAL LINEAR MODEL (CLM) ASSUMPTIONS used to produce the CLASSICAL LINEAR MODEL

-CLM assumptions are all the Gauss-Markov assumptions plus a normally distributed error

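4.1 CLM Assumptions

Under the CLM assumptions, the OLS estimators have a stronger efficiency property than under the Gauss-Markov assumptions

-OLS estimators are now the MINIMUM VARIANCE UNBIASED ESTIMATORS

-the “linear” requirement has been dropped: OLS is now the best unbiased estimator (BUE), not only the best among linear estimators

-the population assumptions of CLM can be summarised as:

$y \mid X \sim \mathrm{Normal}(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k, \; \sigma^2)$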

-conditional on x, y has a normal distribution with mean linear in X and a constant variance

4.1 CLM Assumptions

The assumption of normally distributed errors is motivated by the following:

1) u is the sum of many unobserved factors that affect y

2) By the CENTRAL LIMIT THEOREM (CLT), u has an approximately normal distribution (see appendix C)
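The CLT argument can be illustrated with a small simulation. This is a minimal sketch and not part of the original slides: the factor distribution (a centred exponential), the number of factors (50), and the helper names `skewness` and `draw_error` are assumptions chosen for illustration. It compares the skewness of an error made of a single skewed factor with an error made of the sum of many such factors.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric distribution)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def draw_error(n_factors, rng):
    """Hypothetical error term: a sum of independent, centred exponential factors."""
    return sum(rng.expovariate(1.0) - 1.0 for _ in range(n_factors))


rng = random.Random(0)  # fixed seed so the run is reproducible
n_draws = 20_000

u_one_factor = [draw_error(1, rng) for _ in range(n_draws)]
u_many_factors = [draw_error(50, rng) for _ in range(n_draws)]

skew_one = skewness(u_one_factor)
skew_many = skewness(u_many_factors)

print(f"skewness with 1 factor:   {skew_one:.2f}")
print(f"skewness with 50 factors: {skew_many:.2f}")
```

For independent, identically distributed factors, the skewness of a sum of m factors is the single-factor skewness divided by sqrt(m), so it shrinks toward zero (the value for a normal distribution) as factors are added. The sketch relies on the factors being additive and independent, which is exactly the condition the next slide questions.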

4.1 Normality Assumption Problems

This normality assumption has difficulties:

1) Factors affecting u can have widely different distributions

-the normal approximation becomes worse depending on the number of factors in u and how different their distributions are

2) The argument assumes that the factors in u affect y in SEPARATE, additive fashions

-if the factors in u affect y in a more complicated fashion, the CLT argument doesn’t apply

4.1 Normality Assumption Problems

In general, the normality of u is an empirical (not theoretical) matter

-if empirical evidence shows that a distribution is NOT normal, we can practically ask if it is CLOSE to normal

-often applying a transformation (such as logs) yields a distribution that is closer to normal

-Consequences of nonnormality are covered in Chapter 5

4.1 Nonnormality

-In some cases, MLR.6 is clearly false

-Take the regression:

$\mathit{DentalVisits} = \beta_0 + \beta_1 \mathit{Brushing} + \beta_2 \mathit{Flossing} + u$

-since dental visits have only a few values for most people, our dependent variable is far from normal

-we will see that nonnormality is not a difficulty in large samples

-for now, we simply assume normality

-error normality extends to the OLS estimators:

Theorem 4.1 (Normal Sampling Distributions)

Under the CLM assumptions MLR.1 through MLR.6, conditional on the sample values of the independent variables,

(3.3) $\hat{\beta}_j \sim \mathrm{Normal}[\beta_j, \mathrm{Var}(\hat{\beta}_j)]$

where Var(Bjhat) was given in Chapter 3 [equation 3.51]. Therefore,

$(\hat{\beta}_j - \beta_j) / \mathrm{sd}(\hat{\beta}_j) \sim \mathrm{Normal}(0, 1)$

Theorem 4.1 Proof

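$\hat{\beta}_j = \beta_j + \sum_{i=1}^{n} w_{ij} u_i, \qquad w_{ij} = \hat{r}_{ij} / \mathrm{SSR}_j$

where rijhat is the ith residual and SSRj is the sum of squared residuals from the regression of xj on all other x’s

-Therefore, conditional on the sample values of the x’s, each wij is nonrandom

-Bjhat is therefore a linear combination of the error terms (u)

-MLR.6 and MLR.2 make the errors independent, normally distributed random variables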

Theorem 4.1 Proof

Any linear combination of independent normal random variables is itself normally distributed (Appendix B)

This proves the first equation in the theorem

-The second equation comes from the fact that standardizing a normal random variable (by subtracting its mean and dividing by its standard deviation) gives us a standard normal random variable (statistical theory)
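Theorem 4.1 can be checked by simulation in the simplest case of one regressor. This is a sketch under stated assumptions, not the slides' own example: the population values `beta0`, `beta1`, `sigma` and the x values are hypothetical. With a single x, the weights in the proof reduce to w_i = (x_i - x_bar) / SST_x, and the standardized slope should behave like a standard normal variable.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is reproducible

# Hypothetical population: y = beta0 + beta1 * x + u, with u ~ Normal(0, sigma^2)
beta0, beta1, sigma = 1.0, 0.5, 2.0
x = [float(i) for i in range(1, 21)]  # sample values of x, held fixed across replications

x_bar = statistics.fmean(x)
sst_x = sum((xi - x_bar) ** 2 for xi in x)

# With one regressor, w_i = (x_i - x_bar) / SST_x: nonrandom once x is fixed
w = [(xi - x_bar) / sst_x for xi in x]
sd_beta1_hat = sigma / math.sqrt(sst_x)  # true standard deviation of the slope estimator

z_values = []
for _ in range(5000):
    u = [rng.gauss(0.0, sigma) for _ in x]
    y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]
    # OLS slope as a weighted sum of y; algebraically equal to beta1 + sum(w_i * u_i)
    beta1_hat = sum(wi * yi for wi, yi in zip(w, y))
    z_values.append((beta1_hat - beta1) / sd_beta1_hat)

z_mean = statistics.fmean(z_values)
z_var = statistics.pvariance(z_values)
share_within_196 = sum(abs(z) < 1.96 for z in z_values) / len(z_values)

print(f"mean = {z_mean:.3f}, variance = {z_var:.3f}, share within 1.96 = {share_within_196:.3f}")
```

If the theorem holds, the mean should be close to 0, the variance close to 1, and roughly 95% of the standardized values should fall within 1.96 of zero. Note that the standardization uses the true sigma, which is unknown in practice.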

4.1 Normality

Theorem 4.1 allows us to do simple hypothesis tests by assigning a normal distribution to OLS estimators

Furthermore, since any linear combination of OLS estimators has a normal distribution, and any subset of OLS estimators has a joint normal distribution, more complicated tests can be done

4.2 Single Hypothesis Tests: t tests

-this section covers testing hypotheses about single parameters from the population regression function

-Consider the population model:

(4.2) $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$

-we assume this model satisfies CLM assumptions

-we know that OLS’s estimate of Bj is unbiased

-the true Bj is unknown, and in order to test hypotheses about Bj’s true value, we use statistical inference and the following theorem:

Theorem 4.2 (t Distribution for the Standardized Estimators)

Under the CLM assumptions MLR.1 through MLR.6,

(4.3) $(\hat{\beta}_j - \beta_j) / \mathrm{se}(\hat{\beta}_j) \sim t_{n-k-1}$

where k+1 is the number of unknown parameters in the population model

(4.2) $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$

(k slope parameters and the intercept B0).

Theorem 4.2 Notes

Theorem 4.2 differs from theorem 4.1:

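-4.1 deals with a normal distribution and the standard deviation sd(Bjhat), which depends on the unknown σ

-4.2 deals with a t distribution and the standard error se(Bjhat), which uses the estimate σhat

-replacing σ with σhat is what changes the distribution from standard normal to t with n-k-1 degrees of freedom

-see section B.5 for more details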
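The difference between the two theorems shows up clearly in a small sample. The following is a minimal sketch with hypothetical population values and n = 8 (so n-k-1 = 6); none of these numbers come from the slides. It standardizes the same slope estimate twice, once with the true σ and once with σhat, and compares how often each ratio exceeds the normal cutoff of 1.96 in absolute value.

```python
import math
import random
import statistics

rng = random.Random(2)  # fixed seed so the run is reproducible

beta0, beta1, sigma = 1.0, 0.5, 2.0  # hypothetical population values
x = [float(i) for i in range(1, 9)]  # n = 8, k = 1, so n - k - 1 = 6
n = len(x)
x_bar = statistics.fmean(x)
sst_x = sum((xi - x_bar) ** 2 for xi in x)

z_ratios, t_ratios = [], []
for _ in range(5000):
    y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
    y_bar = statistics.fmean(y)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(ssr / (n - 2))  # estimate of sigma using n - k - 1 df
    sd_b1 = sigma / math.sqrt(sst_x)      # uses the true sigma (Theorem 4.1)
    se_b1 = sigma_hat / math.sqrt(sst_x)  # uses sigma_hat (Theorem 4.2)
    z_ratios.append((b1 - beta1) / sd_b1)
    t_ratios.append((b1 - beta1) / se_b1)

tail_z = sum(abs(z) > 1.96 for z in z_ratios) / len(z_ratios)
tail_t = sum(abs(t) > 1.96 for t in t_ratios) / len(t_ratios)

print(f"share beyond 1.96 using sd (true sigma): {tail_z:.3f}")
print(f"share beyond 1.96 using se (sigma_hat):  {tail_t:.3f}")
```

The ratio built on the standard error should land beyond 1.96 noticeably more often than 5% of the time, because the extra randomness in σhat gives the t distribution heavier tails than the standard normal. The gap narrows as n-k-1 grows.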

4.2 Null Hypothesis

-In order to perform a hypothesis test, we first need a NULL HYPOTHESIS of the form:

(4.4) $H_0: \beta_j = 0$

-which examines the idea that xj has no partial effect on y

-For example, given the regression and null hypothesis:

$\mathit{FlowerEffect} = \beta_0 + \beta_1 \mathit{Roses} + \beta_2 \mathit{Carnations} + u$

$H_0: \beta_1 = 0$

-We examine the idea that roses make no impression (good or bad) in a bouquet

4.2 Hypothesis Tests

-Hypothesis tests are easy; the hard part is calculating the needed values in the regression

-our t STATISTIC or t RATIO is calculated as:

(4.5) $t_{\hat{\beta}_j} \equiv \hat{\beta}_j / \mathrm{se}(\hat{\beta}_j)$

-therefore, given an OLS estimate for Bj and its standard error, we can calculate a t statistic

-note that our t stat will have the same sign as Bjhat

-note also that, for a given standard error, a Bjhat larger in absolute value gives a t stat larger in absolute value
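The calculation in (4.5) can be sketched by hand for a one-regressor model. The five data points below are made up for illustration only, and the variable names are hypothetical; the formulas are the usual simple-regression versions of the slope, σhat² = SSR/(n-k-1), and se(B1hat) = sqrt(σhat² / SST_x).

```python
import math

# Hypothetical data, made up for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n, k = len(x), 1

x_bar = sum(x) / n
y_bar = sum(y) / n
sst_x = sum((xi - x_bar) ** 2 for xi in x)

# OLS slope and intercept for a single regressor
beta1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
beta0_hat = y_bar - beta1_hat * x_bar

# Error variance estimate with n - k - 1 degrees of freedom, then the standard error
residuals = [yi - beta0_hat - beta1_hat * xi for xi, yi in zip(x, y)]
sigma_hat_sq = sum(r ** 2 for r in residuals) / (n - k - 1)
se_beta1_hat = math.sqrt(sigma_hat_sq / sst_x)

# t statistic for H0: beta1 = 0, as in equation (4.5)
t_stat = beta1_hat / se_beta1_hat
print(f"beta1_hat = {beta1_hat:.3f}, se = {se_beta1_hat:.3f}, t = {t_stat:.3f}")
# prints: beta1_hat = 0.600, se = 0.283, t = 2.121
```

Because the standard error is always positive, the t statistic takes the sign of the estimate, and here the estimate sits about 2.1 estimated standard deviations above zero.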

4.2 Hypothesis Tests

-Note that Bjhat will never EXACTLY equal zero

-instead we ask: How far is Bjhat from zero?

-a Bjhat far from zero provides evidence that Bj isn’t zero

-but the sampling error in Bjhat (its estimated standard deviation, the standard error) must also be taken into account

-Hence the t statistic

-t measures how many estimated standard deviations Bjhat is from zero

4.2 Hypothesis Tests

-values of t sufficiently far from zero cause the null hypothesis to be rejected

-to determine how far t must be from zero, we select a SIGNIFICANCE LEVEL – a probability of rejecting H0 when it is true

-we know that, under H0, the sampling distribution of t is t with n-k-1 degrees of freedom, which is key

-note that we are testing the population parameters (Bj) not the estimates (Bjhat)
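The meaning of a significance level can be illustrated by simulating a world in which H0 is true. This is a sketch under assumptions that are not in the slides: a one-regressor model with n = 8, normal errors, a true slope of zero, and the two-sided 5% critical value of 2.447 for a t distribution with 6 degrees of freedom, taken from standard tables. The helper `t_statistic` is a hypothetical name.

```python
import math
import random
import statistics

rng = random.Random(3)  # fixed seed so the run is reproducible

x = [float(i) for i in range(1, 9)]  # n = 8, k = 1, so the t statistic has 6 df
n = len(x)
x_bar = statistics.fmean(x)
sst_x = sum((xi - x_bar) ** 2 for xi in x)

# Two-sided 5% critical value of the t distribution with 6 df, from standard tables
critical_value = 2.447


def t_statistic(y):
    """t statistic for H0: beta1 = 0 in a regression of y on x with an intercept."""
    y_bar = statistics.fmean(y)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sst_x
    b0 = y_bar - b1 * x_bar
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(ssr / (n - 2) / sst_x)
    return b1 / se_b1


n_reps = 5000
rejections = 0
for _ in range(n_reps):
    # H0 is true here: the population slope is 0, so y = 1 + u
    y = [1.0 + rng.gauss(0.0, 2.0) for _ in x]
    if abs(t_statistic(y)) > critical_value:
        rejections += 1

rejection_rate = rejections / n_reps
print(f"share of samples rejecting a true H0: {rejection_rate:.3f}")
```

The rejection rate should come out near 0.05, which is what a 5% significance level means: the probability of rejecting H0 when it is in fact true. Each simulated Bjhat differs from zero by chance even though the population Bj is exactly zero, which is why the test is a statement about Bj and not about the estimate.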