-
“Econometrics may not have the everlasting charm of Holmesian characters and adventures, or even a famous resident of Baker Street, but there is much in his methodological approach to the solving of criminal cases that is of relevance to applied econometric modeling. Holmesian detection may be interpreted as accommodating the relationship between data and theory, modeling procedures, deductions and inferences, analysis of biases, testing of theories, re-evaluation and reformulation of theories, and finally reaching a solution to the problem at hand. With this in mind, can applied econometricians learn anything from the master of detection?”

Michael McAleer, “Sherlock Holmes and the Search for Truth: A Diagnostic Tale”, Journal of Economic Surveys 8(4) (1994): 317-370.
-
Serial Correlation (Autocorrelation)
Heteroscedasticity
Collinearity Diagnostics
Influence Diagnostics
Structural Change
-
Definition
Tests:
Durbin-Watson Test
Nonparametric Runs Test
Durbin h Test
Lagrange Multiplier (LM) Test
Box-Pierce Q Statistic
Ljung-Box Q* Statistic (small-sample modification of the Box-Pierce Q statistic)
Generalized Least Squares
-
Disturbance terms are not independent.

Y_i = β_0 + β_1 X_1i + β_2 X_2i + ... + β_k X_ki + ε_i, i = 1, 2, ..., n
E(ε_i ε_j) ≠ 0, i ≠ j

The correlation between ε_t and ε_{t−k} is called an autocorrelation of order k.
-
Autocorrelation or serial correlation refers to the lack of independence of error (or disturbance) terms. Autocorrelation and serial correlation refer to the same phenomenon. Simply put, a systematic pattern exists in the residuals of the econometric model. Ideally, the residuals, which represent a composite of all factors not embedded in the model, should exhibit no pattern. That is to say, the residuals should follow a white-noise (or random) pattern.
-
With the use of time-series data in econometric applications, serial correlation is “public enemy number one”. Systematic patterns in the error terms commonly arise due to the (inadvertent) omission of explanatory variables in econometric models. These variables may come from disciplines other than economics, finance, or business, for example, psychology and sociology. Or, these variables may represent factors that simply are difficult to quantify, such as tastes and preferences of consumers or technological innovation on the part of producers.
-
• Bishop (1981)
• Errors “contaminated” with autocorrelation or serial correlation
• Potential of discovering “spurious” relationships due to problems with autocorrelated errors (Granger and Newbold, 1974)
• Difficulties with structural analysis and forecasting
• If the error structure is autoregressive, then OLS estimates of the regression parameters are: (1) unbiased, (2) consistent, but (3) inefficient in small and in large samples.
-
• The estimates of the standard errors of the coefficients in any econometric model are biased downward if the residuals are positively autocorrelated. They are biased upward if the residuals are negatively autocorrelated.
• Therefore, the calculated t-statistic is biased upward or downward in the opposite direction of the bias in the estimated standard error of that coefficient.
• Granger and Newbold (1974) further suggest that the econometric results can be defined as “nonsense” if R² > DW(d).
-
Positive autocorrelation of the errors generally tends to make the estimate of the error variance too small, so confidence intervals are too narrow and null hypotheses are rejected with a higher probability than the stated significance level. Negative autocorrelation of the errors generally tends to make the estimate of the error variance too large, so confidence intervals are too wide; as well, the power of significance tests is reduced. With either positive or negative autocorrelation, least-squares parameter estimates usually are not as efficient as generalized least-squares parameter estimates.
-
• Ordinary regression analysis is based on several statistical assumptions. One key assumption is that the errors are independent of each other. However, with time-series data, the ordinary regression residuals usually are correlated over time.
• Violation of the independent errors assumption has three important consequences for ordinary regression.
• First, statistical tests of the significance of the parameters and the confidence limits for the predicted values are not correct.
• Second, the estimates of the regression coefficients are not as efficient as they would be if the autocorrelation were taken into account.
• Third, since the ordinary regression residuals are not independent, they contain information that can be used to improve the prediction of future values.
-
• The AUTOREG procedure solves this problem by augmenting the regression model with an autoregressive model for the random error, thereby accounting for the systematic pattern of the errors. Instead of the usual regression model y_t = x_t'β + ε_t, the following autoregressive error model is used:

y_t = x_t'β + ε_t
ε_t = −φ_1 ε_{t−1} − φ_2 ε_{t−2} − ... − φ_m ε_{t−m} + v_t
v_t ~ IN(0, σ²)

• The notation v_t ~ IN(0, σ²) indicates that each v_t is normally and independently distributed with mean 0 and variance σ².
-
By simultaneously estimating the regression coefficients β and the autoregressive error model parameters φ_i, the AUTOREG procedure corrects the regression estimates for autocorrelation. Thus, this kind of regression analysis is often called autoregressive error correction or serial correlation correction. This technique also is called the use of generalized least squares (GLS).
-
The AUTOREG procedure can produce two kinds of predicted values and corresponding residuals and confidence limits. The first kind of predicted value is obtained from only the structural part of the model; this predicted value is an estimate of the unconditional mean of the dependent variable at time t. The second kind of predicted value includes both the structural part of the model and the predicted value of the autoregressive error process. Both the structural part and autoregressive error process of the model (termed the “full” model) are used to forecast future values.
-
The Durbin-Watson Test

H_0: ρ = 0
H_1: ρ ≠ 0

DW = d = Σ_{t=2}^{n} (ê_t − ê_{t−1})² / Σ_{t=1}^{n} ê_t², 0 ≤ d ≤ 4

DW = d ≈ 2(1 − ρ̂) (approximation good only for large samples)

If ρ = 0, then d = 2; if ρ = 1, then d = 0; if ρ = −1, then d = 4.
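The d statistic is straightforward to compute directly; the following is a minimal sketch in Python/NumPy (the slides work in SAS, so this is purely illustrative), using simulated AR(1) residuals to show that positive autocorrelation pushes d below 2 and that d ≈ 2(1 − ρ̂):

```python
import numpy as np

def durbin_watson(resid):
    """DW = d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2."""
    e = np.asarray(resid, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Simulated AR(1) residuals with rho = 0.7: positive autocorrelation
rng = np.random.default_rng(0)
e = np.zeros(500)
for t in range(1, 500):
    e[t] = 0.7 * e[t - 1] + rng.normal()

d = durbin_watson(e)
# sample first-order autocorrelation of the residuals
rho_hat = np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)
```

With ρ near 0.7 the statistic lands well below 2, and the gap between d and 2(1 − ρ̂) is only the end-point terms (ê_1² + ê_n²)/Σê², which vanishes in large samples.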
-
• d_L, d_U depend on α, k, n
• DW invalid with models that contain no intercept and models that contain lagged dependent variables
-
• The sampling distribution of d depends on the values of the exogenous variables, and hence Durbin and Watson derived upper (d_U) and lower (d_L) limits for the significance levels for d.
• Tables of the distribution are found in most econometric textbooks.
• The Durbin-Watson test perhaps is the most used procedure in econometric applications.
-
-
Appendix G, Statistical Table, cont.
-
Although the DW test is the most commonly used test for serial correlation, there are limitations:
(1) It tests for only first-order serial correlation.
(2) The test may be inconclusive.
(3) The test cannot be applied in models with lagged dependent variables.
(4) The test cannot be applied in models without intercepts.
-
There are other tables for the DW test that have been prepared to take care of special situations. Some of these are:
(1) R.W. Farebrother (1980) provides tables for regression models with no intercept term.
(2) Savin and White (1977) present tables for the DW test for samples with 6 to 200 observations and for as many as 20 regressors.
-
(3) Wallis (1972) gives tables for regression models with quarterly data. Here one would like to test for fourth-order autocorrelation rather than first-order autocorrelation. In this case, the DW statistic is:

d_4 = Σ_{t=5}^{n} (û_t − û_{t−4})² / Σ_{t=1}^{n} û_t²

Wallis provides 5% critical values d_L and d_U for two situations: one where the k regressors include an intercept (but not a full set of seasonal dummy variables) and another where the regressors include four quarterly seasonal dummy variables. In each case the critical values are for testing H_0: ρ = 0 against H_1: ρ > 0. For the hypothesis H_1: ρ < 0, Wallis suggests that the appropriate critical values are (4 − d_U) and (4 − d_L). King and Giles (1978) give further significance points for these tests.
-
(4) King (1981) gives the 5% points for d_L and d_U for quarterly time-series data with trend and/or seasonal dummy variables. These tables are for testing first-order autocorrelation.
(5) King (1983) gives tables for the DW test for monthly data. In the case of monthly data, we may wish to test for twelfth-order autocorrelation.
-
• More general than the DW test. Interest in H_0: ρ = 0.
• Test of AR(1) process in the error terms
N+ = number of positive residuals
N− = number of negative residuals
N = number of observations
N_r = number of runs
• Example:
• Test Statistic:

E(N_r) = 2N+N−/N
VAR(N_r) = [2N+N−(2N+N− − N)] / [N²(N − 1)]
Z = (N_r − E(N_r)) / √VAR(N_r) ≈ N(0, 1)

• Reject H_0 (non-autocorrelation) if the test statistic is too large in absolute value.
-
-
In this example, sample evidence exists to suggest the presence of positive serial correlation, the more common form of pattern in the residuals in regard to the use of economic or financial data.
-
-
In the Greene problem for gasoline, DW = 0.786 and ρ̂ = 0.601.

• Use of Nonparametric Runs Test: N = 36, N+ = 19, N− = 17, N_r = 11

E(N_r) = 2N+N−/N = 17.94
VAR(N_r) = [2N+N−(2N+N− − N)] / [N²(N − 1)] = 8.69
Z = (N_r − E(N_r)) / √VAR(N_r) = (11 − 17.94)/2.95 = −6.94/2.95 = −2.35
|Z| = 2.35 > Z_crit = 1.96 at α = .05; reject H_0: ρ = 0 at α = .05.
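The runs-test arithmetic for the Greene gasoline example can be checked with a short sketch (Python here rather than the slides' SAS, purely as an illustration):

```python
import math

def runs_test(n_pos, n_neg, n_runs):
    """Nonparametric runs test: E(Nr), VAR(Nr), and the normal-approximation Z."""
    n = n_pos + n_neg
    e_nr = 2 * n_pos * n_neg / n
    var_nr = (2 * n_pos * n_neg * (2 * n_pos * n_neg - n)) / (n ** 2 * (n - 1))
    z = (n_runs - e_nr) / math.sqrt(var_nr)
    return e_nr, var_nr, z

# Greene gasoline example: N+ = 19, N- = 17, Nr = 11
e_nr, var_nr, z = runs_test(19, 17, 11)
# e_nr ≈ 17.94, var_nr ≈ 8.69, z ≈ -2.35, matching the slide
```

Since |Z| = 2.35 exceeds 1.96, the 5% two-sided critical value, the few long runs of same-signed residuals are evidence of positive serial correlation.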
-
• Analysts must recognize that a “good” Durbin-Watson statistic is insufficient evidence upon which to conclude that the error structure is “contamination free” in terms of autocorrelation. The Durbin-Watson test is only applicable for the presence of first-order autocorrelation.
• There is little reason to suppose that the correct model for residuals is AR(1); a mixed, autoregressive, moving-average (ARMA) structure is much more likely to be correct, especially with quarterly, monthly, and weekly frequencies of time-series data. Modeling of the residuals can be employed following the methodology of Box and Jenkins (1976).
• Owing to higher frequencies of time-series data used in applied econometrics in recent years, the pattern of the error structure generally is more complex than the common AR(1) pattern.
-
A large-sample test for autocorrelation when lagged dependent variables are present.

ρ̂ ≈ 1 − (1/2)d, where d is the DW statistic

h = ρ̂ √( n / (1 − n V(β̂)) ) ~ N(0, 1) approximately

where β̂ is the coefficient associated with Y_{t−1}.

The test breaks down if n V(β̂) ≥ 1.

If the Durbin h test breaks down, compute the OLS residuals û_t. Then regress û_t on û_{t−1}, y_{t−1}, and the set of exogenous variables. The test for ρ = 0 is carried out by testing the significance of the coefficient of û_{t−1}.
-
OLS estimates
Presence of a lagged
dependent variable
-
• In the Greene problem for gasoline demand:

ρ̂ = 1 − (1.639/2) = 0.1805
n = 35
V(β̂) = (0.12456)² = 0.0155
h = 1.5788
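The h computation for the Greene gasoline-demand numbers can be sketched as follows (Python/NumPy-free illustration; the slides work in SAS):

```python
import math

def durbin_h(d, n, var_beta_lag):
    """Durbin h = rho_hat * sqrt(n / (1 - n*V(beta_hat))), with rho_hat ≈ 1 - d/2,
    where beta_hat is the coefficient on the lagged dependent variable."""
    rho_hat = 1 - d / 2
    denom = 1 - n * var_beta_lag
    if denom <= 0:
        # the slide's caveat: the test breaks down if n*V(beta_hat) >= 1
        raise ValueError("Durbin h breaks down: n*V(beta_hat) >= 1")
    return rho_hat * math.sqrt(n / denom)

# Greene gasoline demand: DW = 1.639, n = 35, V(beta_hat) = 0.12456^2
h = durbin_h(1.639, 35, 0.12456 ** 2)
# h ≈ 1.58 < 1.96: fail to reject H0 of no autocorrelation at the 5% level
```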
-
LM = Lagrange multiplier

y_t = β_0 + β_1 X_1t + ... + β_k X_kt + u_t, t = 1, 2, ..., n
u_t = ρ_1 u_{t−1} + ρ_2 u_{t−2} + ... + ρ_p u_{t−p} + e_t, e_t ~ IN(0, σ²)
H_0: ρ_1 = ρ_2 = ... = ρ_p = 0

The X’s may or may not include lagged dependent variables.

First: estimate by OLS and obtain the least squares residuals û_t.
Second: estimate û_t = γ_0 + γ_1 X_1t + ... + γ_k X_kt + Σ_{i=1}^{p} ρ_i û_{t−i} + v_t.
Third: test whether the coefficients of û_{t−i} are all zero. Use the conventional F-statistic.
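The three steps of the LM (Godfrey) test can be sketched numerically; this is an illustrative Python/NumPy version on simulated data (the slides work in SAS), using the asymptotically equivalent (n − p)·R² form of the statistic rather than the conventional F form:

```python
import numpy as np

def ols_resid(y, X):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Simulated model with AR(1) errors (illustrative data only)
rng = np.random.default_rng(1)
n, p = 200, 2                       # sample size; AR order tested
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

# First: OLS on the original model, retrieve residuals u_hat
X = np.column_stack([np.ones(n), x])
uhat = ols_resid(y, X)

# Second: auxiliary regression of u_hat_t on the X's and p lagged residuals
lags = np.column_stack([uhat[p - i: n - i] for i in range(1, p + 1)])
Z = np.column_stack([X[p:], lags])
v = ols_resid(uhat[p:], Z)

# Third: test the lagged-residual coefficients jointly; the LM statistic
# (n - p) * R^2 is asymptotically chi-square(p) under H0
dep = uhat[p:] - uhat[p:].mean()
r2 = 1 - (v @ v) / (dep @ dep)
lm = (n - p) * r2
```

With ρ_1 = 0.6 in the simulated errors, the statistic lands far beyond the chi-square(2) 5% critical value of 5.99.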
-
Check the serial correlation pattern of the residuals; need to be sure that there is no serial correlation (desire white noise). Box and Pierce (1970) suggest looking at not just the first-order autocorrelation but autocorrelations of all orders of residuals.

Calculate Q = N Σ_{k=1}^{m} r_k², where r_k is the autocorrelation of lag k, and N is the number of observations in the series. If the model fitted is appropriate, Q ~ χ²_{m−p}, where p is the number of estimated parameters.

Ljung and Box (1978) suggest a modification of the Q-statistic for moderate sample sizes:

Q* = N(N + 2) Σ_{k=1}^{m} r_k²/(N − k)
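Both Q statistics are simple sums over residual autocorrelations; a hedged sketch in Python/NumPy (illustrative, not the SAS routines the slides use) on simulated white-noise versus AR(1) residuals:

```python
import numpy as np

def q_statistics(resid, m):
    """Box-Pierce Q = N * sum_{k=1..m} r_k^2 and
    Ljung-Box Q* = N(N+2) * sum_{k=1..m} r_k^2 / (N-k)."""
    e = np.asarray(resid, dtype=float)
    e = e - e.mean()
    N = len(e)
    denom = e @ e
    r = np.array([(e[k:] @ e[:-k]) / denom for k in range(1, m + 1)])
    q = N * np.sum(r ** 2)
    q_star = N * (N + 2) * np.sum(r ** 2 / (N - np.arange(1, m + 1)))
    return q, q_star

rng = np.random.default_rng(2)
white = rng.normal(size=300)            # white noise: Q near its chi-square mean
ar = np.zeros(300)
for t in range(1, 300):
    ar[t] = 0.7 * ar[t - 1] + rng.normal()

q_w, qs_w = q_statistics(white, m=10)
q_a, qs_a = q_statistics(ar, m=10)
```

Because (N + 2)/(N − k) > 1 for every lag, Q* always exceeds Q; the correction matters most in moderate samples.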
-
• We use the correlations and partial correlations of the residuals over time. The idea is to determine the appropriate pattern in the error structure from the autocorrelation and partial autocorrelation functions associated with the residuals.
• Autocorrelation functions tell us about moving average (MA) patterns.
• Partial autocorrelation functions tell us about autoregressive (AR) patterns.
• Anticipate ARMA error structures, particularly higher-order AR patterns in residuals of econometric models.
-
-
The test can be used for different specifications of the error process:

For example: u_t = ρ_4 u_{t−4} + e_t.

Estimate û_t = γ_0 + γ_1 X_1t + ... + γ_k X_kt + ρ_4 û_{t−4} + v_t.

Test H_0: ρ_4 = 0.
-
-
• With time-series data, in most cases serial correlation problems will surface
• Analysts must examine the error structure carefully
• Minimally:
  • Graph the residuals over time
  • Consider the significance of the Durbin-Watson statistic
  • Consider higher-order autocorrelation structure via PROC ARIMA
  • Consider the Godfrey LM Test
  • Consider the Box-Pierce or Ljung-Box Tests (Q-Statistics)
• Re-estimate econometric models with AR(p) error structures
-
• The regression model is specified as y_i = x_i'β + ε_i, where the ε_i’s are identically and independently distributed: E(ε) = 0 and E(εε') = σ²I. If the ε_i’s are not independent or their variances are not constant, the parameter estimates are unbiased, but the estimate of the covariance matrix is inconsistent.
• One of the key assumptions of regression is that the variance of the errors is constant across observations. If the errors have constant variance, the errors are called homoscedastic. Standard estimation methods are inefficient when the errors are heteroscedastic or have non-constant variance. As well, this issue leads to problems in tests of hypotheses.
• Null hypothesis: the disturbance terms are homoscedastic, H_0: σ_i² = σ² for all i.
• See McCulloch (1985)
-
Y_i = β_0 + β_1 X_1i + β_2 X_2i + ... + β_k X_ki + ε_i, i = 1, 2, ..., n
E(ε_i²) = σ_i²

Example: Y_i = β_0 + β_1 X_i + ε_i

where Y_i represents savings
X_i represents income
n represents the number of observations
-
• Assumption of homoscedasticity: Var(ε_i) = σ² for all i
• Since E(ε_i) = 0 by assumption, we may write the homoscedasticity condition as E(ε_i²) = σ².
• The variance of the error term or disturbance term is constant for all observations.
• This issue is problematic with microeconomic data or, more generally, cross-sectional data; take the case of income and expenditure of households for example.
• The assumption of homoscedasticity is not very plausible on a priori grounds; we expect less variation in consumption (or saving) for low-income households and more variation in consumption (or saving) for high-income households.
• Hence, the need to consider heteroscedastic disturbance terms in applied econometrics.
-
[Figure: estimated squared residuals e_i² plotted against explanatory variables x_j, j = 1, 2, ..., k; panels (a)-(e) illustrate different heteroscedastic patterns.]
-
1. OLS parameter estimates are unbiased and consistent. But they are not efficient.
2. The estimated variances of the parameters of the model are biased estimators.
-
• When the disturbance term is heteroscedastic, OLS parameter estimates are unbiased and consistent, but they are NOT BLUE.
• The estimated variances of OLS parameters are in general biased. Hence the conventionally calculated confidence intervals and tests of significance are invalid.
• Use of weighted least squares to overcome these consequences of heteroscedasticity:
w_i = weight = 1/σ_i², to penalize (reward) observations with relatively high (low) variance, i = 1, 2, ..., n
• Alternatively, use maximum likelihood (ML) estimation
-
• Transform the model

Y_i = β_0 + β_1 X_1i + β_2 X_2i + ... + β_k X_ki + ε_i, i = 1, 2, ..., N, with E(ε_i²) = σ_i²

• Let w_i = 1/σ_i
• So,

w_i Y_i = β_0 w_i + β_1 w_i X_1i + β_2 w_i X_2i + ... + β_k w_i X_ki + w_i ε_i

that is,

Y_i/σ_i = β_0(1/σ_i) + β_1(X_1i/σ_i) + β_2(X_2i/σ_i) + ... + β_k(X_ki/σ_i) + ε_i/σ_i

or

Y_i* = β_0* + β_1 X_1i* + β_2 X_2i* + ... + β_k X_ki* + ε_i*

Var(ε_i*) = Var(ε_i/σ_i) = (1/σ_i²) Var(ε_i) = 1
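The transformation can be demonstrated end to end in a few lines; this is an illustrative Python/NumPy sketch with simulated data and assumed-known σ_i (the slides implement the same idea through SAS weight statements):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
x = rng.uniform(1.0, 10.0, size=n)
sigma_i = 0.5 * x                         # assumed-known error s.d., rising with x
y = 2.0 + 0.3 * x + sigma_i * rng.normal(size=n)

# OLS on the raw (heteroscedastic) data
X = np.column_stack([np.ones(n), x])
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# WLS: divide every variable, including the intercept column, by sigma_i
w = 1.0 / sigma_i
beta_wls, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)

# the transformed errors eps_i / sigma_i should have unit variance
wres = (y - X @ beta_wls) * w
```

Both estimators are unbiased for (2.0, 0.3); the point of the transformation is that the weighted residuals are homoscedastic with variance 1, so the usual inference applies to the transformed model.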
-
• With SAS (or EViews), one can use the WEIGHT statement together with PROC MODEL to correct for heteroscedasticity.
• The WEIGHT statement follows the FIT statement.
-
• Model Specification: Saving_i = a + b·INCOME_i + ε_i
• 491 Households, 1989 Data
• b = marginal propensity to save

Assumption: σ_i² = f(INCOME_i)
σ_i² = INCOME_i²
Let w_i = 1/σ_i² = 1/INCOME_i²
-
-
OLS Estimates
Dependent variable: squared residuals, e_i²
σ̂_i² = intercept + 0.02910 inc_i²
-
Suggests the presence of heteroscedasticity.

Use of weighted least squares:

weight_i = 1/σ̂_i = 1/[intercept + 0.02910 inc_i²]^{1/2}
-
WLS Estimates
MPS = 0.27697 with WLS
MPS = 0.35954 with OLS
-
• When the variance of the errors of a classical linear model Y = Xβ + ε is not constant across observations (heteroscedastic), so that σ_i² ≠ σ_j² for some i, j, the OLS estimator

β̂_OLS = (X'X)^{-1} X'Y

• is unbiased but it is inefficient. Models that take into account the changing variance can make more efficient use of the data. When the variances σ_i² are known, generalized least squares (GLS) can be used, and the estimator

β̂_GLS = (X'Ω^{-1}X)^{-1} X'Ω^{-1}Y

• where

Ω = diag(σ_1², σ_2², ..., σ_n²)

• is unbiased and efficient. However, GLS is unavailable when the variances σ_i² are unknown. n refers to the number of observations.
-
Ω̂ = diag(e_1², e_2², ..., e_n²), where e_i = Y_i − x_i'β̂_OLS.

• Note that

Ω̂^{-1} = diag(1/e_1², 1/e_2², ..., 1/e_n²)
-
• Assumptions Concerning σ_i²

σ_i² = f(z_1i, z_2i, ..., z_pi)

• In the case of the micro consumption (or saving) function, the variance of the disturbance terms often is assumed to be positively associated with the level of household income.
• To operationalize such an assumption, we need to specify not only z_1i, z_2i, ..., z_pi, but also the functional form of the association.
• Common forms of association represent both multiplicative heteroscedasticity and additive heteroscedasticity.
-
σ_i² = σ² z_1i^{δ_1} z_2i^{δ_2} ... z_pi^{δ_p}
ln σ_i² = ln σ² + δ_1 ln z_1i + δ_2 ln z_2i + ... + δ_p ln z_pi

• Note that this specification applies only when all z’s are positive.

H_0: δ_1 = δ_2 = ... = δ_p = 0

Under H_0 we have homoscedastic disturbances; if we reject H_0, then evidence suggests heteroscedastic disturbance terms.

• Appropriate weight:

w_i = 1/(z_1i^{δ_1/2} z_2i^{δ_2/2} ... z_pi^{δ_p/2})
-
σ_i² = σ²(a_0 + a_1 z_1i + a_2 z_2i + ... + a_p z_pi)

H_0: a_1 = a_2 = ... = a_p = 0

• Under H_0 we have homoscedastic disturbances; if we reject H_0, then evidence suggests heteroscedastic disturbance terms.

• Appropriate weight:

w_i = 1/(a_0 + a_1 z_1i + a_2 z_2i + ... + a_p z_pi)^{1/2}
-
Practically speaking, apart from the functional form of the heteroscedastic issue, how do analysts select the z variables?

• Usually it is not likely that analysts would know of variables related to the variance of the disturbance term that have not already been included in the econometric specification. Thus, the usual choices of the z variables are likely to be the explanatory variables (the X variables).
• Prime candidates for the z variables are the non-discrete explanatory variables X.
-
(1) Park-Glejser
(2) Breusch-Pagan-Godfrey
(3) Harvey
(4) White
-
• Model: Y_i = β_0 + β_1 X_1i + ... + β_k X_ki + u_i, with H_0: σ_i² = σ² for all i

ln σ_i² = ln σ² + δ_1 ln X_1i + δ_2 ln X_2i + ... + δ_k ln X_ki + u_i

• Replace ln σ_i² ⇒ ln e_i²
• Perform F Test on the δ’s
• If F non-significant, then the disturbance terms are homoscedastic
• If F significant, then disturbance terms are heteroscedastic
• Correction:

weight_i = 1/∏_{j=1}^{k} X_ji^{δ̂_j/2}
-
OLS Estimates
-
-
Auxiliary regression:

ln σ_i² = ln σ² + δ_1 ln PCAID_i + δ_2 ln PCINC_i + u_i

F test indicates the presence of heteroscedasticity.
-
WLS

weight_i = 1/(PCAID_i^{1.90627/2} PCINC_i^{4.36463/2})
-
Test H0: coefficient of nez, mwz, and wez are jointly
equal to 0.
Test H0: coefficient of nez = coefficient of mwz.
Test H0: coefficient of nez = coefficient of wez.
Test H0: coefficient of mwz = coefficient of wez.
-
Use of Proc Model to handle the heteroscedasticity problem (use of weight statement).
Same WLS estimates as before.
-
• Regress

σ_i² = γ_0 + γ_1 X_1i + γ_2 X_2i + ... + γ_k X_ki
e_i² = γ_0 + γ_1 X_1i + γ_2 X_2i + ... + γ_k X_ki + u_i

H_0: γ_1 = γ_2 = ... = γ_k = 0

• Perform F test on the γ’s
• Correction:

WEIGHT_i = 1/(γ̂_0 + Σ_{j=1}^{k} γ̂_j X_ji)^{1/2}

Breusch and Pagan (1979); Godfrey (1978)
-
OLS estimates
-
Auxiliary regression:

e_i² = γ_0 + γ_1 pcaid_i + γ_2 pcinc_i + u_i

F test associated with the auxiliary regression
H_0: coefficient of pcaid = coefficient of pcinc = 0
-
weight_i = 1/[γ̂_0 + 5.92871 pcaid_i + 3.13583 pcinc_i]^{1/2}

WLS estimates
-
Test H0: coefficient of nez, mwz, and wez are jointly
equal to 0.
Test H0: coefficient of nez = coefficient of mwz.
Test H0: coefficient of nez = coefficient of wez.
Test H0: coefficient of mwz = coefficient of wez.
-
Use of Proc Model with weight statement.
-
• To operationalize this hypothesis, formulate the following regression:

σ_i² = exp(a_0 + a_1 X_1i + a_2 X_2i + ... + a_K X_Ki + u_i)
ln e_i² = a_0 + a_1 X_1i + a_2 X_2i + ... + a_K X_Ki + u_i

• Perform F test on the a’s
• Correction:

WEIGHT_i = 1/[exp(â_0 + Σ_{j=1}^{K} â_j X_ji)]^{1/2}
-
OLS estimates
-
Auxiliary regression:

ln e_i² = a_0 + a_1 pcaid_i + a_2 pcinc_i + u_i

Test of H_0: a_1 = a_2 = 0 indicates the presence of heteroscedasticity.
-
weight_i = 1/[exp(1.96476 + 0.00824 pcaid_i + 0.0009203 pcinc_i)]^{1/2}

WLS estimates
-
Test H0: coefficient of nez, mwz, and wez are jointly
equal to 0.
Test H0: coefficient of nez = coefficient of mwz.
Test H0: coefficient of nez = coefficient of wez.
Test H0: coefficient of mwz = coefficient of wez.
-
WLS estimates via Proc Model
-
• Tests whether σ² is influenced by any of the regressors, squares of regressors, or cross-products of regressors, working through e_i²
• Step 1 = Use OLS and obtain OLS residuals
• Step 2 = Square residuals (form e_i²)
• Step 3 = Form squares of the right-hand-side variables (k terms) & cross products of regressors (k(k−1)/2 terms)
• Step 4 = Regress e_i² against the original regressors and the terms in step 3:

e_i² = α_0 + α_1 X_1i + ... + α_k X_ki + α_{k+1} X_1i² + ... + α_{2k} X_ki² + α_{2k+1} X_1i X_2i + ... + α_{k(k+3)/2} X_{k−1,i} X_ki

• Correction:

WEIGHT_i = 1/[α̂_0 + α̂_1 X_1i + ... + α̂_k X_ki + α̂_{k+1} X_1i² + ... + α̂_{2k} X_ki² + α̂_{2k+1} X_1i X_2i + ... + α̂_{k(k+3)/2} X_{k−1,i} X_ki]^{1/2}
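The four steps can be sketched as one function; an illustrative Python/NumPy version on simulated data (the slides run this in SAS), using the asymptotically equivalent n·R² form of White's statistic:

```python
import numpy as np

def white_test(y, regressors):
    """White's procedure: square the OLS residuals, regress them on the
    regressors, their squares, and their cross products; n*R^2 from that
    auxiliary regression is asymptotically chi-square under homoscedasticity
    (df = number of auxiliary slope terms)."""
    n = len(y)
    X = np.column_stack([np.ones(n)] + regressors)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ b) ** 2                                    # Steps 1-2
    cross = [r * s for i, r in enumerate(regressors) for s in regressors[i:]]
    Z = np.column_stack([np.ones(n)] + regressors + cross)   # Step 3
    g, *_ = np.linalg.lstsq(Z, e2, rcond=None)               # Step 4
    u = e2 - Z @ g
    r2 = 1 - (u @ u) / ((e2 - e2.mean()) @ (e2 - e2.mean()))
    return n * r2

rng = np.random.default_rng(4)
n = 500
x1 = rng.uniform(1, 5, size=n)
x2 = rng.uniform(1, 5, size=n)
y_het = 1.0 + 0.5 * x1 - 0.2 * x2 + x1 * rng.normal(size=n)  # Var grows with x1^2
y_hom = 1.0 + 0.5 * x1 - 0.2 * x2 + rng.normal(size=n)       # constant variance

stat_het = white_test(y_het, [x1, x2])
stat_hom = white_test(y_hom, [x1, x2])
```

With two regressors there are five auxiliary slope terms, so the heteroscedastic series should produce a statistic far beyond the chi-square(5) 5% critical value of 11.07, while the homoscedastic series stays near its mean of 5.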
-
OLS estimates
-
Auxiliary regression
F test indicates the presence of heteroscedasticity
-
WLS estimates with

weight_i = 1/[α̂_0 + 3.88919 pcaid_i + 51.24914 pcinc_i + 0.25458 pcaid_i² − 0.00364 pcinc_i² − 0.08819 pcaid_i·pcinc_i]^{1/2}
-
Test H0: coefficient of nez, mwz, and wez are jointly
equal to 0.
Test H0: coefficient of nez = coefficient of mwz.
Test H0: coefficient of nez = coefficient of wez.
Test H0: coefficient of mwz = coefficient of wez.
-
WLS estimates with weight statement from Proc Model.
-
Test | p-value of F-statistic
Park-Glejser | 0.0406
Breusch-Pagan-Godfrey | 0.0008
Harvey | 0.0311
White | 0.0018

• All tests indicate the presence of heteroscedasticity.
-
Correction for Heteroscedasticity—WLS
Summary of 1970 State Data Problem with Heteroscedasticity

Variable | OLS | Park-Glejser | Breusch-Pagan-Godfrey | Harvey | White
Intercept | -665.51586 (103.70268) | -303.89856 (124.50782) | -429.75385 (115.99855) | -291.33079 (123.71054) | -583.81742 (106.63934)
PCAID | 2.55446 (0.18707) | 1.69045 (0.27151) | 2.05626 (0.25728) | 1.64611 (0.27646) | 2.26077 (0.25838)
PCINC | 0.22216 (0.02495) | 0.16693 (0.02536) | 0.18451 (0.02511) | 0.16576 (0.02482) | 0.21301 (0.02300)
NE | 28.32663 (41.50517) | 67.66182 (32.06849) | 51.97487 (35.43834) | 69.01092 (31.49605) | 51.30945 (33.40153)
MW | 61.90315 (37.66921) | 79.60480 (27.08596) | 78.65195 (29.50126) | 77.16368 (27.03204) | 76.51217 (30.10805)
WE | 33.69220 (38.88263) | 78.98334 (31.65617) | 69.98310 (35.81770) | 79.60937 (30.56592) | 66.40836 (31.60469)

• Standard errors in parentheses.
-
Correction for Heteroscedasticity—WLS
Summary of 1970 State Data Problem with Heteroscedasticity, cont.

Variable | OLS | Park-Glejser | Breusch-Pagan-Godfrey | Harvey | White
F-Test on Region Dummy Variables | 0.95 | 3.75 | 2.73 | 3.71 | 2.72
P-value of F-test | 0.4235 | 0.0176 | 0.0549 | 0.0183 | 0.0560
R² | 0.8951 | 0.7463 | 0.7972 | 0.7506 | 0.8606
Adjusted R² | 0.8832 | 0.7175 | 0.7741 | 0.7223 | 0.8444

• The R² and adjusted R² statistics come from the PROC MODEL procedure.
-
• The heteroscedastic regression model:

y_i = x_i'β + ε_i
ε_i ~ N(0, σ_i²)
σ_i² = σ² h_i
h_i = l(z_i'η)

• The heteroscedastic regression model is estimated using the following log-likelihood function:

ℓ = −(N/2) ln(2π) − (1/2) Σ_{i=1}^{N} ln σ_i² − (1/2) Σ_{i=1}^{N} e_i²/σ_i²

where e_i = y_i − x_i'β.

• Use non-linear estimation procedures to maximize the likelihood function. Parameters are σ², η, and β. Typically, z_i is a subset of the x_i explanatory variables.
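The log-likelihood can be coded directly for a numerical optimizer; a sketch in Python/NumPy assuming the common exponential link h_i = exp(z_i'η) (an assumption on our part, since the slides leave l(·) generic):

```python
import numpy as np

def neg_loglik(params, y, X, Z):
    """Negative log-likelihood of the heteroscedastic regression model,
    with sigma_i^2 = sigma^2 * exp(z_i'eta); the exp link is an assumption.
    params = [beta (kb), eta (kz), ln(sigma^2)]."""
    kb, kz = X.shape[1], Z.shape[1]
    beta = params[:kb]
    eta = params[kb:kb + kz]
    log_s2 = params[-1]                      # ln(sigma^2), unconstrained
    s2_i = np.exp(log_s2 + Z @ eta)          # sigma_i^2 = sigma^2 * h_i > 0
    e = y - X @ beta                         # e_i = y_i - x_i'beta
    return 0.5 * np.sum(np.log(2 * np.pi * s2_i) + e ** 2 / s2_i)
```

With η = 0 this collapses to the ordinary homoscedastic normal likelihood; in practice the function would be handed to a numerical minimizer such as scipy.optimize.minimize, mirroring the non-linear estimation the slide describes.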
-
(1) Retrieve OLS residuals
(2) Square the OLS residuals
(3) Graph the square of the OLS residuals vs. non-discrete explanatory variables
(4) Apply Park-Glejser, BPG, Harvey, and/or White tests
(5) If any of the F-tests from (4) are statistically significant, then heteroscedasticity is present
(6) To alleviate the heteroscedasticity problem, use WLS or ML estimation
(7) Report WLS (GLS) or ML estimates, standard errors, p-values, etc.
(8) Retrieve appropriate goodness-of-fit statistics
-
Nature of Problem
Consequences
Introduction
Belsley, Kuh, Welsch Diagnostics
Variance inflation factors
Condition indices
Variance-decomposition proportions
Circumvention of Problem: Ridge Regression
-
Multiple regression of Y on X and Z.
OLS estimators: use of blue area to estimate β_X and green area to estimate β_Z.
Discard information in red area.
-
Near linear dependency among regressor variables:

Σ_{j=0}^{k} a_j X_j ≈ 0

Departure from orthogonality of the columns of X
(X'X) → singularity
Elements of (X'X)^{-1} “explode”
-
Orthogonal Variables vs. Non-orthogonal Variables

Case 1: X'X = [1 0; 0 1], (X'X)^{-1} = [1 0; 0 1]
Case 2: X'X = [1 .9; .9 1], (X'X)^{-1} = [5.26 −4.74; −4.74 5.26]
Case 3: X'X = [1 .99; .99 1], (X'X)^{-1} = [50.25 −49.75; −49.75 50.25]

Key Points:
(1) Sampling variances of estimated OLS coefficients increase sharply
(2) Greater sampling covariances for the OLS coefficients
-
• Deals with specific characteristics of the data matrix X: a data problem, not a statistical problem
• Speak in terms of severity rather than of its existence or nonexistence
• Effects on structural integrity of econometric models
• Opposite of collinear: orthogonal
-
• Constitutes a threat to the proper specification and effective estimation of a structural relationship
• Covariances among parameter estimates are often large and of the wrong sign
• Larger variances (standard errors) of regression coefficients, VAR(β̂) = σ²(X'X)^{-1}; these are indistinguishable from the consequences of inadequate variability in the regressors
• Difficulties in interpretation
• Confidence regions for parameters are wide
• Increase in type II error (accept H_0 when H_0 false)
• Decrease in power of tests
-
Multicollinearity refers to the presence of highly intercorrelated exogenous variables in regression models. It is not surprising that it is considered “one of the most ubiquitous, significant, and difficult problems in applied econometrics…often referred to by modelers as the familiar curse.” Collinearity diagnostics measure how much regressors are related to other regressors and how these relationships affect the stability and variance of the regression estimates.
-
Signs of multicollinearity in a regression analysis include:
(1) Large standard errors on the regression coefficients, so that estimates of the true model parameters become unstable and low t-values prevail.
(2) The parameter estimates vary considerably from sample to sample.
(3) Often there will be drastic changes in the regression estimates after only minor data revision.
(4) Conflicting conclusions will be reached from the usual tests of significance (such as the wrong sign for a parameter).
-
(5) Extreme correlations between pairs of variables.
(6) Omitting a variable from the equation results in smaller regression standard errors.
(7) A good fit not providing good forecasts.
-
(1) Produce a set of condition indices that signal the presence of one or more near dependencies among the variables. (Linear dependency, an extreme form of multicollinearity, occurs when there is an exact linear relationship among the variables.)
(2) Uncover those variables that are involved in particular near dependencies, and assess the degree to which the estimated regression coefficients are being degraded by the presence of the near dependencies.

In practice, if one exogenous variable has a high squared multiple correlation (R-squared) with the other independent variables, it is extremely unlikely that the exogenous variable in question contributes significantly to the prediction equation. When the R-squared is too high, the variables are, in essence, redundant.
-
The variance inflation factor (VIF_i) for variable i is defined as follows:

VIF_i = 1/(1 − R_i²)

As the squared multiple correlation of the exogenous variable with the other exogenous variables approaches unity, the corresponding VIF becomes infinite. If exogenous variables are orthogonal to each other (no correlation), the variance inflation factor is 1.0. VIF_i thus provides us with a measure of how many times larger the variance of the ith regression coefficient will be for multicollinear data than for orthogonal data (where each VIF is 1.0). If the VIFs are not too much larger than 1.0, multicollinearity is not a problem. An advantage of knowing the VIF for each variable is that it gives the user a tangible idea of how much the variances of the estimated coefficients are degraded by the multicollinearity.
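The definition translates directly into a few auxiliary regressions; an illustrative Python/NumPy sketch on simulated data (SAS reports the same quantity via the VIF option on PROC REG's MODEL statement):

```python
import numpy as np

def vif(X):
    """VIF_i = 1/(1 - R_i^2), where R_i^2 comes from regressing column i
    of X on the remaining columns (with an intercept in each auxiliary fit)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = []
    for i in range(k):
        yi = X[:, i]
        Zi = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        b, *_ = np.linalg.lstsq(Zi, yi, rcond=None)
        resid = yi - Zi @ b
        r2 = 1 - (resid @ resid) / ((yi - yi.mean()) @ (yi - yi.mean()))
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(5)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)   # nearly collinear with x1
x3 = rng.normal(size=200)               # roughly orthogonal to both
v = vif(np.column_stack([x1, x2, x3]))
```

The two near-duplicate columns get very large VIFs while the orthogonal one stays near 1.0, matching the interpretation in the text.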
-
Small determinant: some (or many) of the eigenvalues are small

|X'X| = λ_1 λ_2 ... λ_p, where the λ’s are eigenvalues of X'X

Belsley, Kuh, Welsch diagnostic tools:

Condition number of the X matrix: k(X) = √(λ_MAX/λ_MIN) = μ_MAX/μ_MIN
Condition index: η_s = μ_MAX/μ_s, s = 1, ..., p

μ_s is the sth singular value.
-
η_s = μ_MAX/μ_s, s = 1, ..., p, is the sth condition index of the n×p data matrix X.

Key Point: there are as many near dependencies among the columns of a data matrix X as there are high condition indices.

Weak dependencies are associated with condition indices around 5 or 10.
Moderate to strong relations are associated with condition indices > 30.
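Condition indices fall straight out of a singular value decomposition; a hedged sketch in Python/NumPy, including the column scaling that Belsley, Kuh, and Welsch apply before computing the indices:

```python
import numpy as np

def condition_indices(X, scale=True):
    """eta_s = mu_max / mu_s for the singular values mu_s of X.
    Belsley, Kuh, and Welsch scale each column to unit length first."""
    X = np.asarray(X, dtype=float)
    if scale:
        X = X / np.sqrt((X ** 2).sum(axis=0))
    mu = np.linalg.svd(X, compute_uv=False)
    return mu.max() / mu

rng = np.random.default_rng(6)
z = rng.normal(size=100)
# third column is a near duplicate of the second: one near dependency
X = np.column_stack([np.ones(100), z, z + 1e-3 * rng.normal(size=100)])
eta = condition_indices(X)
```

One index is identically 1 (the largest singular value against itself), and the near dependency produces a single index far beyond the > 30 threshold, consistent with the one-high-index-per-dependency rule above.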
-
Diagnostic Procedure

Singular Value | VAR(β̂_0) | VAR(β̂_1) | ... | VAR(β̂_p)
μ_0 | Π_00 | Π_01 | ... | Π_0p
μ_1 | Π_10 | Π_11 | ... | Π_1p
:   | :    | :    |     | :
μ_p | Π_p0 | Π_p1 | ... | Π_pp

(Each column of variance-decomposition proportions sums to 1.)

Flag a harmful near dependency when (1) the condition index η_s ≥ 30 and (2) two or more variance-decomposition proportions Π ≥ .5.

Source: Belsley, Kuh, Welsch. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity (1980), John Wiley & Sons.
-
Note that variables 1, 2, 5, and 6 are highly correlated and the VIFs for all variables (except variable 3) are greater than 10, with one of them being greater than 1,000.
Examination of the condition index column reveals a dominating dependency situation with high numbers for several indices.
-
Hoerl and Kennard (1970)

Minimize L = (Y − Xβ)'(Y − Xβ) − k(β'β − d²)
Set dL/dβ = 0
⇒ β̂_R = (X'X + kI)^{-1} X'Y

E[β̂_R] = (X'X + kI)^{-1} X'X β
VAR[β̂_R] = σ²(X'X + kI)^{-1} X'X (X'X + kI)^{-1}
β̂_R = (X'X + kI)^{-1} X'X β̂_OLS

There exists a number k such that

MSE(β̂_R) ≤ MSE(β̂_OLS), where MSE = variance + (bias)².
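The ridge estimator is a one-line modification of OLS; an illustrative Python/NumPy sketch on nearly collinear simulated data:

```python
import numpy as np

def ridge(X, y, k):
    """Hoerl-Kennard ridge estimator: beta_R = (X'X + kI)^(-1) X'y."""
    XtX = X.T @ X
    return np.linalg.solve(XtX + k * np.eye(X.shape[1]), X.T @ y)

rng = np.random.default_rng(7)
n = 100
z = rng.normal(size=n)
# two nearly collinear regressors, true coefficients (1, 1)
X = np.column_stack([z, z + 0.01 * rng.normal(size=n)])
y = X @ np.array([1.0, 1.0]) + rng.normal(size=n)

b_ols = ridge(X, y, 0.0)     # k = 0 reproduces OLS
b_ridge = ridge(X, y, 1.0)   # k > 0 shrinks and stabilizes the estimates
```

The individual OLS coefficients are wildly unstable under near collinearity, while their well-identified sum stays near 2; the ridge estimates have a strictly smaller norm, illustrating the bias-for-variance trade that makes MSE(β̂_R) ≤ MSE(β̂_OLS) possible for some k.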
-