The regression model with one stochastic regressor (part II)
3150/4150 Lecture 7

Ragnar Nymoen
Department of Economics, University of Oslo
6 Feb 2012

Outline: Introduction; y and x normal; Direction of regression, causality; Exogeneity and pre-determinedness; Time series data and pre-determinedness


- We will finish Lecture topic 4: the regression model with a stochastic regressor.

- We will first look at an application: the Norwegian Phillips curve over a long historical period (separate note). It is a reminder of the importance of variable transformation for obtaining a conditional expectations function that is linear in parameters.

- Then we look at a special case of the theory: regression with variables that are jointly normally distributed. It is useful as a reference and for introducing two issues that did not arise in RM1:

  - Why regress y on x and not x on y?
  - What is the relationship between regression and correlation, and between regression and causality?

- Finally, we define exogeneity as an econometric concept, and extend the regression model to time series data.

- References: see Lecture 6 and the more detailed references that we give below.


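Binormal variables I

- Before we begin: remember that regression does not “require normally distributed variables”. In fact, RM1 already showed that!

Assume that we have stochastic variables $(y_i, x_i)$, $i = 1, 2, \ldots, n$, that are generated by the following system of linear equations:

\[ y_i = \mu_y + \varepsilon_{y,i} \tag{1} \]

\[ x_i = \mu_x + \varepsilon_{x,i} \tag{2} \]

where $\mu_y$ and $\mu_x$ are parameters, and $\varepsilon_{y,i}$ and $\varepsilon_{x,i}$ have a joint normal probability distribution:

\[ \begin{pmatrix} \varepsilon_{x,i} \\ \varepsilon_{y,i} \end{pmatrix} \sim N\left( \mathbf{0}, \begin{pmatrix} \sigma_x^2 & \omega_{xy} \\ \omega_{xy} & \sigma_y^2 \end{pmatrix} \right) \tag{3} \]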


Binormal variables II

$\varepsilon_{x,i}$ and $\varepsilon_{y,i}$ are therefore bivariate normal with expectation zero and covariance matrix

\[ \begin{pmatrix} \sigma_x^2 & \omega_{xy} \\ \omega_{xy} & \sigma_y^2 \end{pmatrix}. \]

The correlation coefficient between $\varepsilon_{x,i}$ and $\varepsilon_{y,i}$ is:

\[ \rho_{xy} = \frac{\omega_{xy}}{\sigma_x \sigma_y}. \]

- It is the population correlation coefficient.
- Since linear combinations of normally distributed variables are also normally distributed, it follows that $y_i$ and $x_i$ given by (1) and (2) are also normally distributed.


Binormal variables III

- From the properties of the normal distribution: the distribution of $y_i$ conditional on $x_i$ is also normal, with expectation

\[ E[y_i \mid x_i] = \underbrace{\mu_y - \rho_{xy}\frac{\sigma_y}{\sigma_x}\mu_x}_{\beta_1} + \underbrace{\rho_{xy}\frac{\sigma_y}{\sigma_x}}_{\beta_2}\, x_i = \beta_1 + \beta_2 x_i \tag{4} \]

- We will not derive this, but if you are interested, see e.g. BN ch. 4.5.6 and 5.7.
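The expressions for $\beta_1$ and $\beta_2$ in (4) can be checked numerically. The following is a minimal sketch, not part of the lecture: the parameter values, the sample size and the seed are hypothetical choices made only for illustration. It draws from the bivariate normal in (3), builds $x_i$ and $y_i$ from (1) and (2), and compares the OLS coefficients with the population expressions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population parameters (illustrative values, not from the lecture)
mu_x, mu_y = 1.0, 2.0
sigma_x, sigma_y, rho = 2.0, 3.0, 0.6
omega_xy = rho * sigma_x * sigma_y          # covariance implied by rho

# Draw (eps_x, eps_y) from the bivariate normal in (3), then build x and y via (1)-(2)
cov = np.array([[sigma_x**2, omega_xy],
                [omega_xy, sigma_y**2]])
eps = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=200_000)
x = mu_x + eps[:, 0]
y = mu_y + eps[:, 1]

# Population coefficients from (4)
beta2 = rho * sigma_y / sigma_x
beta1 = mu_y - beta2 * mu_x

# OLS estimates from the simulated sample
b2 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b1 = y.mean() - b2 * x.mean()

print(f"beta1 = {beta1:.3f}, OLS b1 = {b1:.3f}")
print(f"beta2 = {beta2:.3f}, OLS b2 = {b2:.3f}")
```

With a sample this large the OLS coefficients should lie close to the population values, which is what (4) leads us to expect.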


Binormal variables IV

- If we define the stochastic variables $e_i$, $i = 1, 2, \ldots, n$, as

\[ e_i = y_i - E(y_i \mid x_i) \tag{5} \]

we see that the regression model

\[ y_i = \beta_1 + \beta_2 x_i + e_i \tag{6} \]

gives $y_i$ as the sum of the conditional expectations function (4) and the disturbance $e_i$.

- This is of course a general characterization of RM2. What we have gained by assuming a bivariate normal distribution is that $\beta_1$ and $\beta_2$ have been expressed as functions of the underlying population parameters $\mu_x$, $\mu_y$, $\sigma_x^2$, $\sigma_y^2$ and $\rho_{xy}$.


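Binormal variables V

Note that $e_i$ can be written as

\[ e_i = \mu_y + \varepsilon_{y,i} - \beta_1 - \beta_2(\mu_x + \varepsilon_{x,i}) = \varepsilon_{y,i} - \frac{\omega_{xy}}{\sigma_x^2}\varepsilon_{x,i} \]

which can be used to show:

\[ E(e_i) = 0, \qquad E(e_i \varepsilon_{x,i}) = 0 \]

\[ \operatorname{Var}(e_i) \equiv \sigma^2 = \sigma_y^2(1 - \rho_{xy}^2) \tag{7} \]

\[ E(x_i e_i) = 0 \quad \text{for all } i \]

In particular, (7) shows that the unexplained variance $\sigma^2$ is smaller than the total variance $\sigma_y^2$ of $y_i$ by the factor $(1 - \rho_{xy}^2)$: the reduction is due to correlation.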
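The slide states these results without proof. A short sketch of the two less obvious ones, using only the expression for $e_i$ above, the covariance matrix in (3) and the definition $\omega_{xy} = \rho_{xy}\sigma_x\sigma_y$, could run as follows:

```latex
\begin{align*}
E(e_i \varepsilon_{x,i})
  &= E(\varepsilon_{y,i}\varepsilon_{x,i}) - \frac{\omega_{xy}}{\sigma_x^2} E(\varepsilon_{x,i}^2)
   = \omega_{xy} - \frac{\omega_{xy}}{\sigma_x^2}\sigma_x^2 = 0, \\
\operatorname{Var}(e_i)
  &= \sigma_y^2 - 2\frac{\omega_{xy}}{\sigma_x^2}\omega_{xy} + \frac{\omega_{xy}^2}{\sigma_x^4}\sigma_x^2
   = \sigma_y^2 - \frac{\omega_{xy}^2}{\sigma_x^2} \\
  &= \sigma_y^2\left(1 - \frac{\omega_{xy}^2}{\sigma_x^2\sigma_y^2}\right)
   = \sigma_y^2 (1 - \rho_{xy}^2).
\end{align*}
```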


Regression, correlation and causality I

The statistical system given by (1), (2) and (3) is mapped into model form:

\[ y_i = \beta_1 + \beta_2 x_i + e_i \tag{8} \]

\[ x_i = \mu_x + \varepsilon_{x,i} \tag{9} \]

where (8) is the conditional model of $y_i$ given $x_i$ (what we wish to explain) and (9) is the marginal model of $x_i$ (what we do not try to explain).

Note: this does not mean that (8) and (9) prove that $x_i$ is causing $y_i$!


Regression, correlation and causality II

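An equally valid model of the statistical system is

\[ x_i = \gamma_1 + \gamma_2 y_i + \epsilon_i \tag{10} \]

\[ y_i = \mu_y + \varepsilon_{y,i} \tag{11} \]

where $\epsilon_i$ has similar properties to $e_i$, but for the case where we model $x_i$ conditionally on $y_i$.

$\gamma_2$ can be shown to be

\[ \gamma_2 = \rho_{xy}\frac{\sigma_x}{\sigma_y} \]

which in general is not $\beta_2$, and not $1/\beta_2$ either ($\gamma_2 = 1/\beta_2$ only if $\rho_{xy}^2 = 1$).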
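The relation between the two slopes can be illustrated with empirical moments. The sketch below is an illustration only: the simulated data and all parameter values are hypothetical. It computes the OLS slope of y on x and of x on y, and shows that their product equals the squared empirical correlation coefficient, so the reverse slope is not the reciprocal of the direct slope unless the correlation is perfect.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical correlated sample (illustrative values only)
n = 500
x = rng.normal(loc=1.0, scale=2.0, size=n)
y = 2.0 + 0.9 * x + rng.normal(scale=2.4, size=n)

s_xy = np.cov(x, y, ddof=1)[0, 1]
s_xx = np.var(x, ddof=1)
s_yy = np.var(y, ddof=1)

b2 = s_xy / s_xx              # OLS slope from regressing y on x
g2 = s_xy / s_yy              # OLS slope from regressing x on y
r = np.corrcoef(x, y)[0, 1]   # empirical correlation coefficient

print(f"b2 = {b2:.4f}, g2 = {g2:.4f}, 1/b2 = {1 / b2:.4f}")
print(f"b2 * g2 = {b2 * g2:.4f}, r^2 = {r**2:.4f}")
```

The identity `b2 * g2 == r**2` holds exactly in any sample (up to floating-point error), because both slopes share the same numerator `s_xy`.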


Regression, correlation and causality III

Note that you have shown in Seminar exercise 1 that the same results hold in the data, i.e., when the population parameters are replaced by empirical moments!

So we have two conditional models, one representing

\[ x \longrightarrow y \]

and the other

\[ y \longrightarrow x \]

- How can we tell which of them represents causation?
- The general answer is that we cannot assert causality from regression alone: that can only be done with reference to (subject matter) theory!


Regression, correlation and causality IV

- Recall the picture of econometrics as a combined discipline that we began with!

- Looking ahead to intermediate and advanced courses: with both cross-section data and time series data we often have access to natural experiments, which can make it possible to substantiate a causal interpretation.


Exogeneity defined I

Part of the specification of RM2 was that

\[ E(e_i \mid x_h) = 0, \quad \forall\, i \text{ and } h \tag{12} \]

which implies that the disturbance $e_i$ is uncorrelated with all the $x_h$ variables:

\[ \operatorname{cov}(e_i, x_h) = 0, \quad \forall\, i \text{ and } h \tag{13} \]

We showed that, because of conditioning, the case $h = i$,

\[ \operatorname{cov}(e_i, x_i) = 0 \tag{14} \]

is an inherent property of the model. It always holds.

- However, (13) is a more general statement than (14).


Exogeneity defined II

- It is therefore customary to include (13) as an assumption in the model specification.

- This assumption is called the assumption of an exogenous explanatory variable, cf. HGL p. 402 and BN.

- We will now look at two examples where exogeneity fails, but with different consequences for the OLS estimators.


The measurement error model

Measurement error in the regressor I

HGL Ch 10.2, BN ch. 6.3

- Assume that the parameters of interest are those of the relationship between an observable variable $y_i$ and an unobservable variable $x_i^*$ (permanent income is the example in HGL):

\[ y_i = \beta_1 + \beta_2 x_i^* + v_i \]

- By the same assumptions as for RM2, but using the symbols $x_i^*$ and $v_i$ in place of $x_i$ and $e_i$, this can be formulated as a regression model.

- However, that model would be irrelevant in practice, since $x_i^*$ is unobservable.


Measurement error in the regressor II

- To formulate a model in observables, we extend the list of assumptions with

\[ x_i = x_i^* + u_i \]

where $u_i$ is a random measurement error that is uncorrelated with both $v_i$ and $x_i^*$.

- It is tempting to say that

\[ y_i = \beta_1 + \beta_2 x_i + e_i \tag{15} \]

is a valid regression model.


Measurement error in the regressor III

However, since $e_i$ in this case must be

\[ e_i = v_i - \beta_2 u_i \]

then

\[ \operatorname{cov}(e_i, x_i) = -\beta_2 \operatorname{var}(u_i) \neq 0 \tag{16} \]

showing that $x_i$ cannot be regarded as exogenous in (15).

- If we estimate (15) by OLS, what properties does the estimator have?

- We will only motivate an answer, since a precise answer uses probability limits, which will be explained under Topic 6.
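The step leading to (16) can be spelled out. This is a sketch that uses the assumptions stated for the measurement error model ($u_i$ uncorrelated with $v_i$ and $x_i^*$) together with the RM2-type assumption that $v_i$ is uncorrelated with $x_i^*$:

```latex
\begin{align*}
e_i &= y_i - \beta_1 - \beta_2 x_i
     = (\beta_1 + \beta_2 x_i^* + v_i) - \beta_1 - \beta_2 (x_i^* + u_i)
     = v_i - \beta_2 u_i, \\
\operatorname{cov}(e_i, x_i)
    &= \operatorname{cov}(v_i - \beta_2 u_i,\; x_i^* + u_i) \\
    &= \operatorname{cov}(v_i, x_i^*) + \operatorname{cov}(v_i, u_i)
       - \beta_2 \operatorname{cov}(u_i, x_i^*) - \beta_2 \operatorname{var}(u_i) \\
    &= -\beta_2 \operatorname{var}(u_i).
\end{align*}
```

The covariance is non-zero whenever $\beta_2 \neq 0$ and the measurement error has positive variance.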


Measurement error in the regressor IV

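As always, the OLS estimator for $\beta_2$ can be written as

\[ \hat{\beta}_2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})y_i}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \beta_2 + \frac{\sum_{i=1}^{n}(x_i - \bar{x})e_i}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \]

Unlike in RM2, we cannot show

\[ E\left( \frac{\sum_{i=1}^{n}(x_i - \bar{x})e_i}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \right) = 0 \]

with the use of conditional expectation, because $x_i$ and $e_i$ contain common stochastic variables.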


Measurement error in the regressor V

- Intuitively, however, we can guess that there is going to be a bias, since $\sum_{i=1}^{n}(x_i - \bar{x})e_i$ is an empirical counterpart to $\operatorname{cov}(e_i, x_i)$, which is non-zero from the specification of the model.

- This turns out to be true. In fact, failure of exogeneity of $x$ implies that $\hat{\beta}_2$ becomes inconsistent:

  - We do not “get” the exactly true $\beta_2$ even in infinitely large samples.

Looking ahead: the method of moments (Topic 10) can be used instead of OLS to obtain a consistent estimator.
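The inconsistency can be made visible in a small simulation. The sketch below is an illustration under assumed values: the parameters, the normal distributions and the sample sizes are hypothetical and not taken from the lecture. It also uses the textbook large-sample expression $\beta_2 \operatorname{var}(x^*)/(\operatorname{var}(x^*) + \operatorname{var}(u))$ for the OLS slope, which the slides do not state; it follows from dividing (16) by $\operatorname{var}(x_i)$.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical parameter values (illustrative, not from the lecture)
beta1, beta2 = 1.0, 0.8
var_xstar, var_u, var_v = 4.0, 1.0, 1.0

def ols_slope(x, y):
    """OLS slope from regressing y on x (with intercept)."""
    xd = x - x.mean()
    return np.sum(xd * y) / np.sum(xd**2)

slopes = {}
for n in (100, 10_000, 1_000_000):
    x_star = rng.normal(scale=np.sqrt(var_xstar), size=n)   # unobservable regressor
    v = rng.normal(scale=np.sqrt(var_v), size=n)
    u = rng.normal(scale=np.sqrt(var_u), size=n)            # measurement error
    y = beta1 + beta2 * x_star + v
    x = x_star + u                                          # observed regressor
    slopes[n] = ols_slope(x, y)
    print(f"n = {n:>9,d}: OLS slope = {slopes[n]:.4f}")

# Textbook large-sample value of the OLS slope under these assumptions
attenuated = beta2 * var_xstar / (var_xstar + var_u)
print(f"true beta2 = {beta2}, large-sample OLS value = {attenuated:.4f}")
```

As the sample grows, the OLS slope settles near the attenuated value rather than near the true $\beta_2$: more data does not remove the problem.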


Measurement error in y

If the only departure from RM2 is that we have

\[ y_i^* = \beta_1 + \beta_2 x_i + v_i \]

where $y_i^*$ is unobservable, the consequences are different.

- As long as the measurement error in $y$ is uncorrelated with $x$, the model in terms of the observables has the same properties as before.

- In particular: no bias in the OLS estimator for $\beta_2$! Show this as a DIY exercise.
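One possible route through the DIY exercise is sketched below. The symbol $w_i$ for the measurement error in $y$, and the observed variable $y_i = y_i^* + w_i$, are notation introduced here for illustration only. The sketch assumes $E(v_i \mid x_1, \ldots, x_n) = 0$ as in RM2 and the slightly stronger condition $E(w_i \mid x_1, \ldots, x_n) = 0$ in place of mere uncorrelatedness:

```latex
\begin{align*}
y_i &= y_i^* + w_i = \beta_1 + \beta_2 x_i + \underbrace{(v_i + w_i)}_{e_i}, \\
E(e_i \mid x_1, \ldots, x_n) &= E(v_i \mid x_1, \ldots, x_n) + E(w_i \mid x_1, \ldots, x_n) = 0, \\
E(\hat{\beta}_2 \mid x_1, \ldots, x_n)
  &= \beta_2 + \frac{\sum_{i=1}^{n}(x_i - \bar{x})\,E(e_i \mid x_1, \ldots, x_n)}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \beta_2 .
\end{align*}
```

Under these assumptions the measurement error only adds to the disturbance, so the estimator is less precise but not biased.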


The Lucas critique

Rational expectations and the Lucas critique I

- The measurement error model can be used to explain the famous Lucas critique in macroeconomics.

- Let $x_t^*$ represent the expected value of $x_t$.

- Under the hypothesis of adaptive expectations, the OLS estimator of $\beta_2$ remains consistent.

- But under the assumption of rational expectations, $u_i$ in

\[ x_i = x_i^* + u_i \]

represents a random expectations error.

- The result is that OLS gives an inconsistent estimator of the structural parameter $\beta_2$.


Rational expectations and the Lucas critique II

- It is inconsistent because the OLS estimator is contaminated by parameters of the expectations formation process.

- Moreover: since expectations change when policy changes, the OLS estimator $\hat{\beta}_2$ is subject to structural breaks. It will change when policy changes, and will be an unreliable guide for judging the effects of economic policies.

- Looking ahead: later courses discuss both the theory and the relevance of the Lucas critique (it can in fact be tested!). If interested: BN 5.12 is relatively detailed compared to other introductory books.


Models for time series data I

For time series data we use $t$ as a subscript for the stochastic variables/observations. It is also customary to replace $n$ by $T$.

If we formulate a static model

\[ y_t = \beta_1 + \beta_2 x_t + e_t \tag{17} \]

for time series data, the specification of RM2 will in essence be unchanged, with e.g. assumption d. written as

\[ \operatorname{cov}(e_t, e_{t \pm s} \mid x_t) = 0, \quad \forall\, s \neq 0 \]

which is called the assumption of no autocorrelation in the disturbances.


Models for time series data II

- For the static model (17), the hypothesis of no autocorrelation often fails. This regularly shows up in the OLS residuals $\hat{e}_t$ from (17), which are usually highly correlated with $\hat{e}_{t-1}$ (and often with “older” residuals as well).

- The explanation is that time series variables are typically serially correlated: $y_t$ is usually highly correlated with $y_{t-1}$, and $x_t$ is correlated with $x_{t-1}$.

- Therefore the independent sampling assumption of RM2 cannot be maintained for time series data.
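The pattern described above is easy to reproduce with artificial data. In the sketch below, the data-generating process (an AR(1) for $x_t$ and a dynamic equation for $y_t$), its coefficients and the sample length are all hypothetical choices made for illustration. A static regression like (17) is fitted by OLS and the first-order autocorrelation of its residuals is computed.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data-generating process (illustrative values only):
# x_t is an AR(1), and y_t depends on x_t and on its own past.
T, burn = 500, 100
n = T + burn
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.7 * x[t - 1] + rng.normal()
    y[t] = 1.0 + 0.8 * y[t - 1] + 0.5 * x[t] + rng.normal()
x, y = x[burn:], y[burn:]          # discard the start-up period

# Static regression as in (17): y_t on a constant and x_t
X = np.column_stack([np.ones(T), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef

# First-order autocorrelation of the OLS residuals
r1 = np.corrcoef(resid[1:], resid[:-1])[0, 1]
print(f"static OLS slope = {coef[1]:.3f}")
print(f"corr(resid_t, resid_t-1) = {r1:.3f}")
```

Because the static equation leaves out the dependence of $y_t$ on its own past, that persistence ends up in the residuals, which is why their autocorrelation is far from zero.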


A simple dynamic model I

- The solution to the problem of autocorrelation is either to

  - correct the OLS estimators, or to
  - represent the serial correlation of $y_t$ and $x_t$ in the conditional expectation (dynamic econometric models).

The simplest example of a dynamic model is

\[ y_t = \beta_1 + \beta_2 y_{t-1} + e_t, \quad \text{with } -1 < \beta_2 < 1 \tag{18} \]

where the explanatory variable replacing $x_t$ is the history of the $y$ variable.

- This type of equation is called an autoregressive model of order one (AR(1)). It is a linear stochastic difference equation.


A simple dynamic model II

- In terms of properties of estimators: how close does this model come to RM2?

- The answer is: so close that it can be seen as a variant of RM2.

To complete the specification of the dynamic regression model, we can define the conditional expectation function

\[ E(y_t \mid y_{t-1}) = \beta_1 + \beta_2 y_{t-1} \]

and the disturbance properties

\[ E(e_t \mid y_{t-1}) = 0 \]

\[ \operatorname{var}(e_t \mid y_{t-1}) = \sigma^2 \]

\[ \operatorname{cov}(e_t, e_{t \pm s} \mid y_{t-1}) = 0, \quad s \neq 0 \]


A simple dynamic model III

What can we say about

\[ \operatorname{cov}(e_{t \pm s}, y_{t-1}) \]

in this model? For $s = 0$, we have from $E(e_t \mid y_{t-1}) = 0$ that

\[ \operatorname{cov}(e_t, y_{t-1}) = 0 \]

but, as we know, exogeneity requires that $y_{t-1}$ is uncorrelated with all disturbances, both past and future.

The mathematical solution for $y_t$ in (18) is found by repeated substitution of $y_{t-1}$, $y_{t-2}$ and so on, back to infinity:

\[ y_t = \beta_1 \sum_{i=0}^{\infty} \beta_2^i + \sum_{i=0}^{\infty} \beta_2^i e_{t-i} \tag{19} \]

A simple dynamic model IV

Equation (19) shows that

- $y_{t-1}$ is uncorrelated with $e_t$ and all future disturbances, but
- $y_{t-1}$ is correlated with $e_{t-1}$ and all other past disturbances.

Hence $y_{t-1}$ is not exogenous in (18), but $y_{t-1}$ is not completely endogenous either.

- We have an intermediate case between exogeneity and endogeneity, and we say that $y_{t-1}$ is a pre-determined variable in (18).

- In the case of a pre-determined explanatory variable, the OLS estimators $\hat{\beta}_2$ and $\hat{\beta}_1$ are consistent but have finite-sample biases, which are due to the correlation between $y_{t-1}$ and past disturbances.
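The repeated substitution behind (19), and the covariances it implies, can be written out in a few lines. The sketch assumes $-1 < \beta_2 < 1$ as in (18), and disturbances that are mutually uncorrelated with common variance $\sigma^2$:

```latex
\begin{align*}
y_t &= \beta_1 + \beta_2 y_{t-1} + e_t \\
    &= \beta_1 (1 + \beta_2) + \beta_2^2 y_{t-2} + e_t + \beta_2 e_{t-1} \\
    &= \beta_1 \sum_{i=0}^{k-1} \beta_2^i + \beta_2^k y_{t-k} + \sum_{i=0}^{k-1} \beta_2^i e_{t-i}
     \;\xrightarrow{\;k \to \infty\;}\;
     \frac{\beta_1}{1 - \beta_2} + \sum_{i=0}^{\infty} \beta_2^i e_{t-i}, \\[4pt]
y_{t-1} &= \frac{\beta_1}{1 - \beta_2} + \sum_{i=0}^{\infty} \beta_2^i e_{t-1-i}, \\
\operatorname{cov}(y_{t-1}, e_{t-1-j}) &= \beta_2^{\,j} \sigma^2, \quad j = 0, 1, 2, \ldots, \\
\operatorname{cov}(y_{t-1}, e_{t+s}) &= 0, \quad s = 0, 1, 2, \ldots
\end{align*}
```

Since $y_{t-1}$ is built from $e_{t-1}, e_{t-2}, \ldots$ only, it is correlated with past disturbances (unless $\beta_2 = 0$ for $j \geq 1$) but not with current or future ones.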


A simple dynamic model V

As an example, we have that

\[ E(\hat{\beta}_2 - \beta_2) \approx \frac{-2\beta_2}{T} \]

for the simplest case with $\beta_1 = 0$ (no drift).


[Figure: plot of the bias formula for $\beta_2 = 0.5$, with $T = 1, 2, \ldots, 100$ on the horizontal axis.]
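The bias approximation can be compared with a Monte Carlo experiment. The sketch below is an illustration under assumed settings: standard normal disturbances, a start from the stationary distribution, no intercept in either the process or the regression, and 20,000 replications are all choices made here, not taken from the lecture. The function name `ar1_ols_bias` is likewise hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)

def ar1_ols_bias(beta2, T, reps=20_000):
    """Monte Carlo estimate of E(b2 - beta2) for an AR(1) without intercept."""
    # Start each replication from the stationary distribution of y
    y_prev = rng.normal(scale=1.0 / np.sqrt(1.0 - beta2**2), size=reps)
    num = np.zeros(reps)   # running sum of y_{t-1} * y_t
    den = np.zeros(reps)   # running sum of y_{t-1}^2
    for _ in range(T):
        y_now = beta2 * y_prev + rng.normal(size=reps)
        num += y_prev * y_now
        den += y_prev**2
        y_prev = y_now
    # OLS slope (no intercept) in each replication, averaged over replications
    return np.mean(num / den) - beta2

beta2 = 0.5
for T in (25, 50, 100):
    simulated = ar1_ols_bias(beta2, T)
    formula = -2.0 * beta2 / T
    print(f"T = {T:3d}: simulated bias = {simulated:+.4f}, formula = {formula:+.4f}")
```

The simulated bias should be negative and shrink roughly in proportion to $1/T$, in line with the approximation, although the two need not agree exactly since the formula is only a first-order approximation.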


Summary of the regression model I

- As long as the regressor is deterministic or exogenous, and the classical assumptions about the disturbance properties hold, the regression model gives OLS estimators that are BLUE.

- In the case of stochastic $x$, the proof is in terms of conditional and iterated expectations.

- With normally distributed disturbances, hypothesis tests and confidence intervals can be based on percentiles from the t-distribution.

- Consistency of estimators also holds. We have only proved that for the case of a deterministic regressor: the theory of probability limits is needed for the case of stochastic $x$.


Summary of the regression model II

- Without normally distributed disturbances, the t-test is approximately valid, and the approximation becomes better with larger $n$.

- If $x$ is a pre-determined stochastic regressor, there is a (small) bias in the OLS estimator.

- That bias is decreasing in the sample size.

- Hence, for typical sample sizes (more than 30 observations), the case of pre-determinedness can be regarded as a variant of RM2: the properties are very similar.
