
Outline: Introduction; y and x normal; Direction of regression, causality; Exogeneity and pre-determinedness; Time series data and pre-determinedness

The regression model with one stochastic regressor (part II)
3150/4150 Lecture 7

Ragnar Nymoen

Department of Economics, University of Oslo

6 Feb 2012



- We will finish Lecture topic 4: The regression model with stochastic regressor.

- We will first look at an application: the Norwegian Phillips curve over a long historical period (separate note). It is a reminder of the importance of variable transformation for obtaining a conditional expectations function that is linear in parameters.

- Then we look at a special case of the theory: regression with variables that are jointly normally distributed. It is useful as a reference and for introducing two issues that did not arise in RM1:

  - Why regress y on x and not x on y?
  - What is the relationship between regression and correlation, and between regression and causality?

- Finally, we define exogeneity as an econometric concept, and extend the regression model to time series data.

- References: See Lecture 6 and the more detailed references that we give below.



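Binormal variables I

- Before we begin: Remember that regression does not "require normally distributed variables". In fact, RM1 already showed that!

Assume that we have stochastic variables $(y_i, x_i)$, $i = 1, 2, \ldots, n$, that are generated by the following system of linear equations:

$$ y_i = \mu_y + \varepsilon_{y,i} \tag{1} $$

$$ x_i = \mu_x + \varepsilon_{x,i} \tag{2} $$

where $\mu_y$ and $\mu_x$ are parameters and $\varepsilon_{y,i}$ and $\varepsilon_{x,i}$ have a joint normal probability distribution

$$ \begin{pmatrix} \varepsilon_{x,i} \\ \varepsilon_{y,i} \end{pmatrix} \sim N\left( 0, \begin{pmatrix} \sigma_x^2 & \omega_{xy} \\ \omega_{xy} & \sigma_y^2 \end{pmatrix} \right) \tag{3} $$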



Binormal variables II

$\varepsilon_{x,i}$ and $\varepsilon_{y,i}$ are therefore bivariate normal with expectation zero and covariance matrix

$$ \begin{pmatrix} \sigma_x^2 & \omega_{xy} \\ \omega_{xy} & \sigma_y^2 \end{pmatrix}. $$

The correlation coefficient between $\varepsilon_{x,i}$ and $\varepsilon_{y,i}$ is:

$$ \rho_{xy} = \frac{\omega_{xy}}{\sigma_x \sigma_y}. $$

- It is the population correlation coefficient.
- Since linear combinations of normally distributed variables are also normally distributed, it follows that $y_i$ and $x_i$ given by (1), (2) are also normally distributed.



Binormal variables III

- From the properties of the normal distribution: the distribution of $y_i$ conditional on $x_i$ is also normal, with expectation

$$ E[y_i \mid x_i] = \underbrace{\mu_y - \rho_{xy}\frac{\sigma_y}{\sigma_x}\mu_x}_{\beta_1} + \underbrace{\rho_{xy}\frac{\sigma_y}{\sigma_x}}_{\beta_2}\, x_i = \beta_1 + \beta_2 x_i \tag{4} $$

- We will not derive this, but if you are interested, see e.g. BN Ch 4.5.6 and 5.7.
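A small simulation can make (4) concrete. The sketch below is only an illustration under assumed values: the numbers chosen for $\mu_x$, $\mu_y$, $\sigma_x$, $\sigma_y$ and $\rho_{xy}$, and the helper names `simulate_binormal` and `ols`, are not from the lecture. It draws from the system (1)-(3) and compares the OLS estimates from regressing y on x with $\beta_1$ and $\beta_2$ computed from the population parameters.

```python
import math
import random


def simulate_binormal(n, mu_x, mu_y, sigma_x, sigma_y, rho, seed=0):
    """Draw n pairs (x_i, y_i) from the system (1)-(3)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        eps_x = sigma_x * z1
        # Construct eps_y so that corr(eps_x, eps_y) = rho
        eps_y = sigma_y * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        xs.append(mu_x + eps_x)
        ys.append(mu_y + eps_y)
    return xs, ys


def ols(xs, ys):
    """OLS intercept and slope from regressing ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b2 = sxy / sxx
    return y_bar - b2 * x_bar, b2


# Assumed (illustrative) population parameters
mu_x, mu_y, sigma_x, sigma_y, rho = 2.0, 1.0, 1.5, 3.0, 0.6

# Population regression coefficients implied by (4)
beta2 = rho * sigma_y / sigma_x
beta1 = mu_y - beta2 * mu_x

xs, ys = simulate_binormal(50_000, mu_x, mu_y, sigma_x, sigma_y, rho)
b1_hat, b2_hat = ols(xs, ys)

print(f"beta1 = {beta1:.3f}, OLS estimate = {b1_hat:.3f}")
print(f"beta2 = {beta2:.3f}, OLS estimate = {b2_hat:.3f}")
```

With a large sample the two pairs of numbers should be close, which is what (4) leads us to expect.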



Binormal variables IV

- If we define the stochastic variables $e_i$, $i = 1, 2, \ldots, n$,

$$ e_i = y_i - E(y_i \mid x_i) \tag{5} $$

we see that the regression model

$$ y_i = \beta_1 + \beta_2 x_i + e_i \tag{6} $$

gives $y_i$ as the sum of the conditional expectations function (4) and the disturbance $e_i$.

- This is of course a general characterization of RM2. What we have gained by assuming a bivariate normal distribution is that $\beta_1$ and $\beta_2$ have been expressed as functions of the underlying population parameters $\mu_x$, $\mu_y$, $\sigma_x^2$, $\sigma_y^2$ and $\rho_{xy}$.



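Binormal variables V

Note that $e_i$ can be written as

$$ e_i = \mu_y + \varepsilon_{y,i} - \beta_1 - \beta_2(\mu_x + \varepsilon_{x,i}) = \varepsilon_{y,i} - \frac{\omega_{xy}}{\sigma_x^2}\,\varepsilon_{x,i} $$

which can be used to show:

$$ E(e_i) = 0, \qquad E(e_i \varepsilon_{x,i}) = 0 $$

$$ Var(e_i) \equiv \sigma^2 = \sigma_y^2 (1 - \rho_{xy}^2) \tag{7} $$

$$ E(x_i e_i) = 0 \text{ for all } i $$

In particular, (7) shows that the unexplained variance $\sigma^2$ is smaller than the total variance $\sigma_y^2$ of $y_i$ by the factor $(1 - \rho_{xy}^2)$: the reduction is due to correlation.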
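For those who want to check (7), here is one way to fill in the steps. It is a sketch that only uses the expression for $e_i$ in terms of $\varepsilon_{y,i}$ and $\varepsilon_{x,i}$, the covariance matrix in (3), and $\omega_{xy} = \rho_{xy}\sigma_x\sigma_y$:

$$
\begin{aligned}
Var(e_i) &= Var\left(\varepsilon_{y,i} - \frac{\omega_{xy}}{\sigma_x^2}\varepsilon_{x,i}\right) \\
&= \sigma_y^2 - 2\frac{\omega_{xy}}{\sigma_x^2}\omega_{xy} + \frac{\omega_{xy}^2}{\sigma_x^4}\sigma_x^2 \\
&= \sigma_y^2 - \frac{\omega_{xy}^2}{\sigma_x^2} \\
&= \sigma_y^2 - \rho_{xy}^2\sigma_y^2 = \sigma_y^2(1 - \rho_{xy}^2).
\end{aligned}
$$

In the same way, $E(e_i\varepsilon_{x,i}) = \omega_{xy} - \frac{\omega_{xy}}{\sigma_x^2}\sigma_x^2 = 0$.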



Regression, correlation and causality I

The statistical system given by (1), (2) and (3) is mapped into model form:

$$ y_i = \beta_1 + \beta_2 x_i + e_i \tag{8} $$

$$ x_i = \mu_x + \varepsilon_{x,i} \tag{9} $$

where (8) is the conditional model of $y_i$ given $x_i$ (what we wish to explain) and (9) is the marginal model of $x_i$ (what we do not try to explain).

Note: this does not mean that (8) and (9) prove that $x_i$ is causing $y_i$!



Regression, correlation and causality II

An equally valid model of the statistical system is

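$$ x_i = \gamma_1 + \gamma_2 y_i + \epsilon_i \tag{10} $$

$$ y_i = \mu_y + \varepsilon_{y,i} \tag{11} $$

where $\epsilon_i$ has similar properties as $e_i$, but for the case where we model $x_i$ conditionally on $y_i$.

$\gamma_2$ can be shown to be

$$ \gamma_2 = \rho_{xy}\frac{\sigma_x}{\sigma_y} $$

which is not $\beta_2$ and not $1/\beta_2$ either.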
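A one-line calculation shows how the two slopes are related. It is a sketch that only combines the slope of $y_i$ on $x_i$, $\beta_2 = \rho_{xy}\sigma_y/\sigma_x$ from (4), with the slope $\gamma_2$ of $x_i$ on $y_i$:

$$ \beta_2\gamma_2 = \rho_{xy}\frac{\sigma_y}{\sigma_x}\cdot\rho_{xy}\frac{\sigma_x}{\sigma_y} = \rho_{xy}^2, \qquad \text{so} \qquad \gamma_2 = \frac{\rho_{xy}^2}{\beta_2}. $$

Hence $\gamma_2 = 1/\beta_2$ would require $\rho_{xy}^2 = 1$, i.e., perfect correlation between the two variables.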



Regression, correlation and causality III

Note that you have shown in Seminar exercise 1 that the same results hold in the data, i.e., when the population parameters are replaced by empirical moments!

So we have two conditional models, one representing

x −→ y

and the other

y −→ x

- How can we tell which of them represents causation?
- The general answer is that we cannot assert causality from regression alone: that can only be done with reference to (subject matter) theory!
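The point about empirical moments can be checked numerically. The sketch below uses made-up data (the data generating numbers and the names `moments`, `b2_hat`, `g2_hat` and `r_xy` are assumptions for illustration only). It runs the regression in both directions and shows that the product of the two OLS slopes equals the squared empirical correlation coefficient, so that neither slope is the inverse of the other unless the fit is perfect.

```python
import random


def moments(xs, ys):
    """Sums of squares and cross products around the sample means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxx, syy, sxy


# Artificial data, for illustration only
rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(200)]
ys = [0.5 + 0.8 * x + rng.gauss(0.0, 1.0) for x in xs]

sxx, syy, sxy = moments(xs, ys)
b2_hat = sxy / sxx                 # OLS slope, y regressed on x
g2_hat = sxy / syy                 # OLS slope, x regressed on y
r_xy = sxy / (sxx * syy) ** 0.5    # empirical correlation coefficient

print(f"b2_hat * g2_hat = {b2_hat * g2_hat:.6f}")
print(f"r_xy squared    = {r_xy ** 2:.6f}")
print(f"g2_hat = {g2_hat:.4f}, 1 / b2_hat = {1.0 / b2_hat:.4f}")
```

Both regressions fit the same data equally well in terms of correlation, which is why the data alone cannot tell us the direction of causation.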



Regression, correlation and causality IV

- Recall the picture of econometrics as a combined discipline that we began with!

- Looking ahead to intermediate and advanced courses: in both cross section data and time series data we often have access to natural experiments, which can make it possible to substantiate a causal interpretation.



Exogeneity defined I

Part of the specification of RM2 was that

$$ E(e_i \mid x_h) = 0, \quad \forall\, i \text{ and } h \tag{12} $$

which implies that the disturbance $e_i$ is uncorrelated with all $x_h$ variables:

$$ cov(e_i, x_h) = 0, \quad \forall\, i \text{ and } h \tag{13} $$

We showed that, because of conditioning, the case $h = i$,

$$ cov(e_i, x_i) = 0 \tag{14} $$

is an inherent property of the model. It always holds.

- However: (13) is a more general statement than (14).



Exogeneity defined II

- It is therefore customary to include (13) as an assumption in the model specification.

- This assumption is called the assumption of exogenous explanatory variable, cf. HGL p 402 and BN.

- We will now look at two examples where exogeneity fails, but with different consequences for the OLS estimators.



The measurement error model

Measurement error in the regressor I

HGL Ch 10.2. BN Ch 6.3

- Assume that the relationship of interest is between an observable variable $y_i$ and an unobservable variable $x_i^*$ (permanent income is the example in HGL):

$$ y_i = \beta_1 + \beta_2 x_i^* + v_i $$

- By the same assumptions as for RM2, but using the symbols $x_i^*$ and $v_i$ in place of $x_i$ and $e_i$, this can be formulated as a regression model.

- However, that model would be irrelevant for practice since $x_i^*$ is unobservable.




Measurement error in the regressor II

- To formulate a model in observables we extend the list of assumptions with

$$ x_i = x_i^* + u_i $$

where $u_i$ is a random measurement error that is uncorrelated with both $v_i$ and $x_i^*$.

- It is tempting to say that

$$ y_i = \beta_1 + \beta_2 x_i + e_i \tag{15} $$

is a valid regression model.




Measurement error in the regressor III

However, since ei in this case must be

$$ e_i = v_i - \beta_2 u_i $$

then

$$ cov(e_i, x_i) = -\beta_2\, var(u_i) \neq 0 \tag{16} $$

showing that $x_i$ cannot be regarded as exogenous in (15).

- If we estimate (15) by OLS, what do we get in terms of properties?

- We will only motivate an answer, since a precise answer will use probability limits, which will be explained under Topic 6.
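The step behind (16) can be written out as follows. This is a sketch that uses only the assumptions already listed: $x_i = x_i^* + u_i$, $u_i$ uncorrelated with $v_i$ and $x_i^*$, and $v_i$ uncorrelated with $x_i^*$ (as in RM2 applied to the model in $x_i^*$). First substitute $x_i^* = x_i - u_i$ into the model for $y_i$:

$$ y_i = \beta_1 + \beta_2(x_i - u_i) + v_i = \beta_1 + \beta_2 x_i + \underbrace{(v_i - \beta_2 u_i)}_{e_i}. $$

Then

$$
\begin{aligned}
cov(e_i, x_i) &= cov(v_i - \beta_2 u_i,\; x_i^* + u_i) \\
&= cov(v_i, x_i^*) + cov(v_i, u_i) - \beta_2\, cov(u_i, x_i^*) - \beta_2\, var(u_i) \\
&= -\beta_2\, var(u_i).
\end{aligned}
$$

The covariance is zero only if $\beta_2 = 0$ or if there is no measurement error.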




Measurement error in the regressor IV

As always, the OLS estimator for β2 can be written as

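$$ \hat{\beta}_2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})y_i}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \beta_2 + \frac{\sum_{i=1}^{n}(x_i - \bar{x})e_i}{\sum_{i=1}^{n}(x_i - \bar{x})^2} $$

Unlike in RM2, we cannot show

$$ E\left( \frac{\sum_{i=1}^{n}(x_i - \bar{x})e_i}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \right) = 0 $$

with the use of conditional expectation because $x_i$ and $e_i$ contain common stochastic variables.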




Measurement error in the regressor V

- Intuitively, however, we can guess that there is going to be a bias, since $\sum_{i=1}^{n}(x_i - \bar{x})e_i$ is an empirical counterpart to $cov(e_i, x_i)$, which is non-zero from the specification of the model.

- This turns out to be true: in fact, failure of exogeneity of $x$ implies that $\hat{\beta}_2$ becomes inconsistent:

  - We do not "get" the exactly true $\beta_2$ even in infinitely large samples.

Looking ahead: The method of moments (Topic 10) can be used instead of OLS to obtain a consistent estimator.
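A simulation can illustrate the inconsistency before probability limits are introduced. The sketch below rests on assumptions that are not part of the lecture: the parameter values, normally distributed $x_i^*$, $u_i$ and $v_i$, and the helper name `ols_slope`. It also uses the standard large-sample result for this textbook case, that the OLS slope on the mismeasured regressor tends to $\beta_2\, var(x_i^*)/(var(x_i^*) + var(u_i))$ rather than to $\beta_2$.

```python
import random


def ols_slope(xs, ys):
    """OLS slope from regressing ys on xs (with intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


# Assumed (illustrative) parameter values
rng = random.Random(42)
n = 50_000
beta1, beta2 = 1.0, 2.0
sd_xstar, sd_u, sd_v = 1.0, 1.0, 1.0

x_star = [rng.gauss(0.0, sd_xstar) for _ in range(n)]  # unobservable regressor
u = [rng.gauss(0.0, sd_u) for _ in range(n)]           # measurement error
v = [rng.gauss(0.0, sd_v) for _ in range(n)]           # disturbance

y = [beta1 + beta2 * xs_i + v_i for xs_i, v_i in zip(x_star, v)]
x = [xs_i + u_i for xs_i, u_i in zip(x_star, u)]       # observed regressor

b2_infeasible = ols_slope(x_star, y)  # regression on the true regressor
b2_observed = ols_slope(x, y)         # regression (15) on the observed regressor
attenuated = beta2 * sd_xstar ** 2 / (sd_xstar ** 2 + sd_u ** 2)

print(f"true beta2                 = {beta2:.3f}")
print(f"OLS slope using x*         = {b2_infeasible:.3f}")
print(f"OLS slope using observed x = {b2_observed:.3f}")
print(f"large-sample value         = {attenuated:.3f}")
```

With these numbers the slope from (15) settles near half of the true $\beta_2$, and increasing $n$ does not remove the gap.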




Measurement error in y

If the only departure from RM2 is that we have

$$ y_i^* = \beta_1 + \beta_2 x_i + v_i $$

where $y^*$ is unobservable, the consequences are different.

- As long as the measurement error in $y$ is uncorrelated with $x$, the model in terms of the observables has the same properties as before.

- In particular: no bias of the OLS estimator for $\beta_2$! Show as a DIY!
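Before doing the DIY proof, the claim can be made plausible with a simulation. The sketch below is an illustration under assumed values (the parameters, the symbol `w` for the measurement error in y, and the helper name `ols_slope` are not from the lecture). The observed dependent variable is $y_i = y_i^* + w_i$, with $w_i$ drawn independently of $x_i$.

```python
import random


def ols_slope(xs, ys):
    """OLS slope from regressing ys on xs (with intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


# Assumed (illustrative) parameter values
rng = random.Random(2012)
n = 50_000
beta1, beta2 = 1.0, 2.0

x = [rng.gauss(0.0, 1.0) for _ in range(n)]
v = [rng.gauss(0.0, 1.0) for _ in range(n)]  # disturbance
w = [rng.gauss(0.0, 1.0) for _ in range(n)]  # measurement error in y, independent of x

y_star = [beta1 + beta2 * x_i + v_i for x_i, v_i in zip(x, v)]
y_obs = [ys_i + w_i for ys_i, w_i in zip(y_star, w)]

b2_obs = ols_slope(x, y_obs)
print(f"true beta2 = {beta2:.3f}, OLS slope using observed y = {b2_obs:.3f}")
```

The measurement error ends up in the disturbance and makes it more variable, but it does not create correlation between the disturbance and $x_i$.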



The Lucas critique

Rational expectations and the Lucas critique I

- The measurement error model can be used to explain the famous Lucas critique in macroeconomics.

- Let $x_t^*$ represent the expected value of $x_t$.
- Under the hypothesis of adaptive expectations the OLS estimator of $\beta_2$ remains consistent.

- But under the assumption of rational expectations we have that $u_i$ in

$$ x_i = x_i^* + u_i $$

represents a random expectations error.
- The result is that OLS gives an inconsistent estimator of the structural parameter $\beta_2$.




Rational expectations and the Lucas critique II

- Inconsistent because the OLS estimator is contaminated by parameters of the expectations formation process.

- Moreover: since expectations change when policy changes, the OLS estimator $\hat{\beta}_2$ is subject to structural breaks. It will change when policy changes and will be an unreliable guide for judging the effects of economic policies.

- Looking ahead: later courses discuss both the theory and the relevance of the Lucas critique (it can in fact be tested!). If interested: BN 5.12 is relatively detailed compared to other introductory books.



Models for time series data I

For time series data we use $t$ as a subscript for the stochastic variables/observations. It is also customary to replace $n$ by $T$.

If we formulate a static model

$$ y_t = \beta_1 + \beta_2 x_t + e_t \tag{17} $$

for time series data, the specification of RM2 will in essence be unchanged, with e.g. assumption d. written as

$$ cov(e_t, e_{t \pm s} \mid x_t) = 0, \quad \forall\, s \neq 0 $$

which is called the assumption of no autocorrelation in the disturbances.



Models for time series data II

- For the static model (17) the hypothesis of no autocorrelation often fails. This regularly shows up in the OLS residuals $\hat{e}_t$ from (17), which are usually highly correlated with $\hat{e}_{t-1}$ (and often with older residuals as well).

- The explanation is that time series variables are typically serially correlated: $y_t$ is usually highly correlated with $y_{t-1}$, and $x_t$ is correlated with $x_{t-1}$.

- Therefore the independent sampling assumption of RM2 is irrelevant for the case of time series data.
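To see what autocorrelated residuals from a static model look like, consider the following sketch. Everything in it is an assumption made for illustration: the data are artificial, both $x_t$ and the disturbance are generated as first-order autoregressions with the coefficients `phi_x` and `phi_e`, and the helper names `ols` and `first_order_autocorr` are hypothetical.

```python
import random


def ols(xs, ys):
    """OLS intercept and slope from regressing ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b2 = sxy / sxx
    return y_bar - b2 * x_bar, b2


def first_order_autocorr(res):
    """Empirical correlation between res[t] and res[t-1]."""
    m = sum(res) / len(res)
    num = sum((res[t] - m) * (res[t - 1] - m) for t in range(1, len(res)))
    den = sum((r - m) ** 2 for r in res)
    return num / den


# Artificial, serially correlated data (assumed design)
rng = random.Random(7)
T = 5000
phi_x, phi_e = 0.7, 0.8
x_prev, e_prev = 0.0, 0.0
xs, ys = [], []
for _ in range(T):
    x_t = phi_x * x_prev + rng.gauss(0.0, 1.0)
    e_t = phi_e * e_prev + rng.gauss(0.0, 1.0)
    xs.append(x_t)
    ys.append(1.0 + 2.0 * x_t + e_t)
    x_prev, e_prev = x_t, e_t

# Estimate the static model (17) and inspect the residuals
b1_hat, b2_hat = ols(xs, ys)
residuals = [y - b1_hat - b2_hat * x for x, y in zip(xs, ys)]
r1 = first_order_autocorr(residuals)
print(f"first-order residual autocorrelation = {r1:.3f}")
```

In this design the residual autocorrelation comes out close to the coefficient used to generate the disturbances, far from the zero implied by the assumption of no autocorrelation.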



A simple dynamic model I

- The solution to the problem with autocorrelation is either to
  - correct the OLS estimators, or to
  - represent the serial correlation of $y_t$ and $x_t$ in the conditional expectation (dynamic econometric models).

The simplest example of a dynamic model is

$$ y_t = \beta_1 + \beta_2 y_{t-1} + e_t, \quad \text{with } -1 < \beta_2 < 1 \tag{18} $$

where the explanatory variable replacing $x_t$ is the history of the $y$ variable.

- This type of equation is called an autoregressive model of order one (AR(1)). It is a linear stochastic difference equation.



A simple dynamic model II

- In terms of properties of estimators: how close does this model come to RM2?

- The answer is: so close that it can be seen as a variant of RM2.

To complete the specification of the dynamic regression model, we can define the conditional expectation function

$$ E(y_t \mid y_{t-1}) = \beta_1 + \beta_2 y_{t-1} $$

and the disturbance properties

$$ E(e_t \mid y_{t-1}) = 0 $$

$$ var(e_t \mid y_{t-1}) = \sigma^2 $$

$$ cov(e_t, e_{t \pm s} \mid y_{t-1}) = 0, \quad s \neq 0 $$


A simple dynamic model III

What can we say about

$$ cov(e_{t \pm s}, y_{t-1}) $$

in this model?

For $s = 0$, we have from $E(e_t \mid y_{t-1}) = 0$ that

$$ cov(e_t, y_{t-1}) = 0 $$

but as we know, exogeneity requires that $y_{t-1}$ is uncorrelated with all disturbances, both past and future.

The mathematical solution for $y_t$ in (18) is found by repeated substitution of $y_{t-1}$, $y_{t-2}$ and so on back to infinity:

$$ y_t = \beta_1 \sum_{i=0}^{\infty} \beta_2^i + \sum_{i=0}^{\infty} \beta_2^i e_{t-i} \tag{19} $$

(19) shows that



A simple dynamic model IV

- $y_{t-1}$ is uncorrelated with $e_t$ and all future disturbances, but
- $y_{t-1}$ is correlated with $e_{t-1}$ and all other past disturbances.

Hence $y_{t-1}$ is not exogenous in (18), but $y_{t-1}$ is not completely endogenous either.

- We have an intermediary case between exogeneity and endogeneity, and we say that $y_{t-1}$ is a pre-determined variable in (18).

- In the case of a pre-determined explanatory variable, the OLS estimators $\hat{\beta}_2$ and $\hat{\beta}_1$ are consistent but have finite sample biases, which are due to the correlation between $y_{t-1}$ and past disturbances.
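The two correlation statements can be read off from the solution of (18) lagged one period. The following is a sketch that assumes, as in the model specification, that the disturbances are mutually uncorrelated with common variance $\sigma^2$. Since $-1 < \beta_2 < 1$, we have $\sum_{i=0}^{\infty}\beta_2^i = 1/(1-\beta_2)$, so

$$ y_{t-1} = \frac{\beta_1}{1-\beta_2} + \sum_{i=0}^{\infty}\beta_2^i e_{t-1-i}. $$

Only $e_{t-1}, e_{t-2}, \ldots$ appear on the right hand side. Therefore

$$ cov(y_{t-1}, e_{t+s}) = 0 \quad \text{for } s = 0, 1, 2, \ldots $$

while for the past disturbances

$$ cov(y_{t-1}, e_{t-1-j}) = \beta_2^j\sigma^2 \quad \text{for } j = 0, 1, 2, \ldots $$

which is non-zero whenever $\beta_2 \neq 0$ (for $j = 0$ it equals $\sigma^2$ regardless of $\beta_2$).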



A simple dynamic model V

As an example, we have that

$$ E(\hat{\beta}_2 - \beta_2) \approx \frac{-2\beta_2}{T} $$

for the simplest case with $\beta_1 = 0$ (no drift).



[Figure: plot of the bias formula for $\beta_2 = 0.5$ against $T = 1, 2, \ldots, 100$.]
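The bias approximation can be compared with a small Monte Carlo experiment. The sketch below is an illustration under assumptions that go beyond the slides: standard normal disturbances, a starting value drawn from the stationary distribution, estimation without an intercept (in line with $\beta_1 = 0$), and the choices $T = 25$ and 10 000 replications. The function names `simulate_ar1` and `ols_ar1_no_intercept` are hypothetical.

```python
import math
import random


def simulate_ar1(T, beta2, rng):
    """Simulate y_0, ..., y_T from y_t = beta2 * y_{t-1} + e_t with e_t ~ N(0, 1)."""
    # Start from the stationary distribution, which has variance 1 / (1 - beta2^2)
    y = [rng.gauss(0.0, 1.0 / math.sqrt(1.0 - beta2 ** 2))]
    for _ in range(T):
        y.append(beta2 * y[-1] + rng.gauss(0.0, 1.0))
    return y


def ols_ar1_no_intercept(y):
    """OLS estimate of beta2 from regressing y_t on y_{t-1} without intercept."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den


rng = random.Random(123)
beta2, T, reps = 0.5, 25, 10_000

estimates = [ols_ar1_no_intercept(simulate_ar1(T, beta2, rng)) for _ in range(reps)]
mc_bias = sum(estimates) / reps - beta2   # Monte Carlo estimate of the bias
approx_bias = -2.0 * beta2 / T            # approximation from the slide

print(f"Monte Carlo bias    = {mc_bias:.4f}")
print(f"approximation -2b/T = {approx_bias:.4f}")
```

Repeating the experiment for larger $T$ should show the bias shrinking towards zero, in line with consistency of the OLS estimator under pre-determinedness.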



Summary of the regression model I

- As long as the regressor is deterministic or exogenous, and the classical assumptions about the disturbance properties hold, the regression model gives OLS estimators that are BLUE.

- In the case of stochastic $x$, the proof is in terms of conditional and iterated expectations.

- With normally distributed disturbances, hypothesis tests and confidence intervals can be based on percentiles from the t-distribution.

- Consistency of estimators also holds. We have only proved that for the case of a deterministic regressor: the theory of probability limits is needed for the case of stochastic $x$.



Summary of the regression model II

- Without normally distributed disturbances, the t-test is approximately valid, and the degree of approximation becomes better with larger $n$.

- If $x$ is a pre-determined stochastic regressor, there is a (small) bias in the OLS estimator.

- That bias is decreasing in the sample size.
- Hence, for typical sample sizes (more than 30 observations) the case of pre-determinedness can be regarded as a variant of RM2: the properties are very similar.

