-
“Econometrics may not have the everlasting charm of Holmesian characters and adventures, or even a famous resident of Baker Street, but there is much in his methodological approach to the solving of criminal cases that is of relevance to applied econometric modeling. Holmesian detection may be interpreted as accommodating the relationship between data and theory, modeling procedures, deductions and inferences, analysis of biases, testing of theories, re-evaluation and reformulation of theories, and finally reaching a solution to the problem at hand. With this in mind, can applied econometricians learn anything from the master of detection?”

Michael McAleer, “Sherlock Holmes and the Search for Truth: A Diagnostic Tale”, Journal of Economic Surveys 8(4) (1994): 317-370.
-
Serial Correlation (Autocorrelation)
Heteroscedasticity
Collinearity Diagnostics
Influence Diagnostics
Structural Change
-
Definition
Tests:
Durbin-Watson Test
Nonparametric Runs Test
Durbin h Test
Lagrange Multiplier (LM) Test
Box-Pierce Q Statistic
Ljung-Box Q* Statistic (small-sample modification of the Box-Pierce Q statistic)
Generalized Least Squares
-
Disturbance terms are not independent.

Y_i = β_0 + β_1 X_1i + β_2 X_2i + ... + β_k X_ki + ε_i, i = 1, 2, ..., n
E(ε_i ε_j) ≠ 0, i ≠ j

The correlation between ε_t and ε_{t−k} is called an autocorrelation of order k.
-
Autocorrelation or serial correlation refers to the lack of independence of error (or disturbance) terms. Autocorrelation and serial correlation refer to the same phenomenon. Simply put, a systematic pattern exists in the residuals of the econometric model. Ideally, the residuals, which represent a composite of all factors not embedded in the model, should exhibit no pattern. That is to say, the residuals should follow a white-noise (or random) pattern.
-
With the use of time-series data in econometric applications, serial correlation is “public enemy number one”. Systematic patterns in the error terms commonly arise due to the (inadvertent) omission of explanatory variables in econometric models. These variables may come from disciplines other than economics, finance, or business, for example, psychology and sociology. Or, these variables may represent factors that simply are difficult to quantify, such as tastes and preferences of consumers or technological innovation on the part of producers.
-
• Bishop (1981)
• Errors “contaminated” with autocorrelation or serial correlation
• Potential of discovering “spurious” relationships due to problems with autocorrelated errors (Granger and Newbold, 1974)
• Difficulties with structural analysis and forecasting
• If the error structure is autoregressive, then OLS estimates of the regression parameters are: (1) unbiased, (2) consistent, but (3) inefficient in small and in large samples.
-
• The estimates of the standard errors of the coefficients in any econometric model are biased downward if the residuals are positively autocorrelated. They are biased upward if the residuals are negatively autocorrelated.
• Therefore, the calculated t-statistic is biased upward or downward in the opposite direction of the bias in the estimated standard error of that coefficient.
• Granger and Newbold (1974) further suggest that the econometric results can be defined as “nonsense” if R² > DW(d).
-
Positive autocorrelation of the errors generally tends to make the estimate of the error variance too small, so confidence intervals are too narrow and null hypotheses are rejected with a higher probability than the stated significance level. Negative autocorrelation of the errors generally tends to make the estimate of the error variance too large, so confidence intervals are too wide; as well, the power of significance tests is reduced. With either positive or negative autocorrelation, least-squares parameter estimates usually are not as efficient as generalized least-squares parameter estimates.
-
• Ordinary regression analysis is based on several statistical assumptions. One key assumption is that the errors are independent of each other. However, with time-series data, the ordinary regression residuals usually are correlated over time.
• Violation of the independent errors assumption has three important consequences for ordinary regression.
• First, statistical tests of the significance of the parameters and the confidence limits for the predicted values are not correct.
• Second, the estimates of the regression coefficients are not as efficient as they would be if the autocorrelation were taken into account.
• Third, since the ordinary regression residuals are not independent, they contain information that can be used to improve the prediction of future values.
-
• The AUTOREG procedure solves this problem by augmenting the regression model with an autoregressive model for the random error, thereby accounting for the systematic pattern of the errors. Instead of the usual regression model y_t = x_t'β + ε_t, the following autoregressive error model is used:

y_t = x_t'β + ε_t
ε_t = −φ_1 ε_{t−1} − φ_2 ε_{t−2} − ... − φ_m ε_{t−m} + v_t
v_t ~ IN(0, σ²)

• The notation v_t ~ IN(0, σ²) indicates that each v_t is normally and independently distributed with mean 0 and variance σ².
-
By simultaneously estimating the regression coefficients β and the autoregressive error model parameters φ_i, the AUTOREG procedure corrects the regression estimates for autocorrelation. Thus, this kind of regression analysis is often called autoregressive error correction or serial correlation correction. This technique also is called the use of generalized least squares (GLS).
-
The AUTOREG procedure can produce two kinds of predicted values and corresponding residuals and confidence limits. The first kind of predicted value is obtained from only the structural part of the model; this predicted value is an estimate of the unconditional mean of the dependent variable at time t. The second kind of predicted value includes both the structural part of the model and the predicted value of the autoregressive error process. Both the structural part and autoregressive error process of the model (termed the “full” model) are used to forecast future values.
-
The Durbin-Watson Test

H_0: ρ = 0
H_1: ρ ≠ 0

DW = d = Σ_{t=2}^{n} (ê_t − ê_{t−1})² / Σ_{t=1}^{n} ê_t², 0 ≤ d ≤ 4

DW = d ≈ 2(1 − ρ̂) (approximation good only for large samples)

If ρ = 0, then d = 2; if ρ = 1, then d = 0; if ρ = −1, then d = 4.
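The d statistic is straightforward to compute directly; the following is a minimal sketch in Python/NumPy (the slides work in SAS, so this is purely illustrative), using simulated AR(1) residuals to show that positive autocorrelation pushes d below 2 and that d ≈ 2(1 − ρ̂):

```python
import numpy as np

def durbin_watson(resid):
    """DW = d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2."""
    e = np.asarray(resid, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Simulated AR(1) residuals with rho = 0.7: positive autocorrelation
rng = np.random.default_rng(0)
e = np.zeros(500)
for t in range(1, 500):
    e[t] = 0.7 * e[t - 1] + rng.normal()

d = durbin_watson(e)
# sample first-order autocorrelation of the residuals
rho_hat = np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)
```

With ρ near 0.7 the statistic lands well below 2, and the gap between d and 2(1 − ρ̂) is only the end-point terms (ê_1² + ê_n²)/Σê², which vanishes in large samples.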
-
• d_L, d_U depend on α, k, n
• DW invalid with models that contain no intercept and models that contain lagged dependent variables
-
• The sampling distribution of d depends on the values of the exogenous variables, and hence Durbin and Watson derived upper (d_U) and lower (d_L) limits for the significance levels for d.
• Tables of the distribution are found in most econometric textbooks.
• The Durbin-Watson test perhaps is the most used procedure in econometric applications.
-
-
Appendix G, Statistical Table, cont.
-
Although the DW test is the most commonly used test for serial correlation, there are limitations:
(1) It tests for only first-order serial correlation.
(2) The test may be inconclusive.
(3) The test cannot be applied in models with lagged dependent variables.
(4) The test cannot be applied in models without intercepts.
-
There are other tables for the DW test that have been prepared to take care of special situations. Some of these are:
(1) R.W. Farebrother (1980) provides tables for regression models with no intercept term.
(2) Savin and White (1977) present tables for the DW test for samples with 6 to 200 observations and for as many as 20 regressors.
-
(3) Wallis (1972) gives tables for regression models with quarterly data. Here one would like to test for fourth-order autocorrelation rather than first-order autocorrelation. In this case, the DW statistic is:

d_4 = Σ_{t=5}^{n} (û_t − û_{t−4})² / Σ_{t=1}^{n} û_t²

Wallis provides 5% critical values d_L and d_U for two situations: one where the k regressors include an intercept (but not a full set of seasonal dummy variables) and another where the regressors include four quarterly seasonal dummy variables. In each case the critical values are for testing H_0: ρ = 0 against H_1: ρ > 0. For the hypothesis H_1: ρ < 0, Wallis suggests that the appropriate critical values are (4 − d_U) and (4 − d_L). King and Giles (1978) give further significance points for these tests.
-
(4) King (1981) gives the 5% points for d_L and d_U for quarterly time-series data with trend and/or seasonal dummy variables. These tables are for testing first-order autocorrelation.
(5) King (1983) gives tables for the DW test for monthly data. In the case of monthly data, we may wish to test for twelfth-order autocorrelation.
-
• More general than the DW test. Interest in H_0: ρ = 0.
• Test of AR(1) process in the error terms
N+ = number of positive residuals
N− = number of negative residuals
N = number of observations
N_r = number of runs
• Example:
• Test Statistic:

E(N_r) = 2N+N−/N
VAR(N_r) = [2N+N−(2N+N− − N)] / [N²(N − 1)]
Z = (N_r − E(N_r)) / √VAR(N_r) ≈ N(0, 1)

• Reject H_0 (non-autocorrelation) if the test statistic is too large in absolute value.
-
-
In this example, sample evidence exists to suggest the presence of positive serial correlation, the more common form of pattern in the residuals in regard to the use of economic or financial data.
-
-
In the Greene problem for gasoline, DW = 0.786 and ρ̂ = 0.601.

• Use of Nonparametric Runs Test: N = 36, N+ = 19, N− = 17, N_r = 11

E(N_r) = 2N+N−/N = 17.94
VAR(N_r) = [2N+N−(2N+N− − N)] / [N²(N − 1)] = 8.69
Z = (N_r − E(N_r)) / √VAR(N_r) = (11 − 17.94)/2.95 = −6.94/2.95 = −2.35
|Z| = 2.35 > Z_crit = 1.96 at α = .05; reject H_0: ρ = 0 at α = .05.
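The runs-test arithmetic for the Greene gasoline example can be checked with a short sketch (Python here rather than the slides' SAS, purely as an illustration):

```python
import math

def runs_test(n_pos, n_neg, n_runs):
    """Nonparametric runs test: E(Nr), VAR(Nr), and the normal-approximation Z."""
    n = n_pos + n_neg
    e_nr = 2 * n_pos * n_neg / n
    var_nr = (2 * n_pos * n_neg * (2 * n_pos * n_neg - n)) / (n ** 2 * (n - 1))
    z = (n_runs - e_nr) / math.sqrt(var_nr)
    return e_nr, var_nr, z

# Greene gasoline example: N+ = 19, N- = 17, Nr = 11
e_nr, var_nr, z = runs_test(19, 17, 11)
# e_nr ≈ 17.94, var_nr ≈ 8.69, z ≈ -2.35, matching the slide
```

Since |Z| = 2.35 exceeds 1.96, the 5% two-sided critical value, the few long runs of same-signed residuals are evidence of positive serial correlation.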
-
• Analysts must recognize that a “good” Durbin-Watson statistic is insufficient evidence upon which to conclude that the error structure is “contamination free” in terms of autocorrelation. The Durbin-Watson test is only applicable for the presence of first-order autocorrelation.
• There is little reason to suppose that the correct model for residuals is AR(1); a mixed, autoregressive, moving-average (ARMA) structure is much more likely to be correct, especially with quarterly, monthly, and weekly frequencies of time-series data. Modeling of the residuals can be employed following the methodology of Box and Jenkins (1976).
• Owing to higher frequencies of time-series data used in applied econometrics in recent years, the pattern of the error structure generally is more complex than the common AR(1) pattern.
-
A large-sample test for autocorrelation when lagged dependent variables are present.

ρ̂ ≈ 1 − (1/2)d, where d is the DW statistic

h = ρ̂ √( n / (1 − n V(β̂)) ) ~ N(0, 1) approximately

where β̂ is the coefficient associated with Y_{t−1}.

The test breaks down if n V(β̂) ≥ 1.

If the Durbin h test breaks down, compute the OLS residuals û_t. Then regress û_t on û_{t−1}, y_{t−1}, and the set of exogenous variables. The test for ρ = 0 is carried out by testing the significance of the coefficient of û_{t−1}.
-
OLS estimates
Presence of a lagged
dependent variable
-
• In the Greene problem for gasoline demand:

ρ̂ = 1 − (1.639/2) = 0.1805
n = 35
V(β̂) = (0.12456)² = 0.0155
h = 1.5788
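The h computation for the Greene gasoline-demand numbers can be sketched as follows (Python/NumPy-free illustration; the slides work in SAS):

```python
import math

def durbin_h(d, n, var_beta_lag):
    """Durbin h = rho_hat * sqrt(n / (1 - n*V(beta_hat))), with rho_hat ≈ 1 - d/2,
    where beta_hat is the coefficient on the lagged dependent variable."""
    rho_hat = 1 - d / 2
    denom = 1 - n * var_beta_lag
    if denom <= 0:
        # the slide's caveat: the test breaks down if n*V(beta_hat) >= 1
        raise ValueError("Durbin h breaks down: n*V(beta_hat) >= 1")
    return rho_hat * math.sqrt(n / denom)

# Greene gasoline demand: DW = 1.639, n = 35, V(beta_hat) = 0.12456^2
h = durbin_h(1.639, 35, 0.12456 ** 2)
# h ≈ 1.58 < 1.96: fail to reject H0 of no autocorrelation at the 5% level
```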
-
LM = Lagrange multiplier

y_t = β_0 + β_1 X_1t + ... + β_k X_kt + u_t, t = 1, 2, ..., n
u_t = ρ_1 u_{t−1} + ρ_2 u_{t−2} + ... + ρ_p u_{t−p} + e_t, e_t ~ IN(0, σ²)
H_0: ρ_1 = ρ_2 = ... = ρ_p = 0

The X’s may or may not include lagged dependent variables.

First: estimate by OLS and obtain the least squares residuals û_t.
Second: estimate û_t = γ_0 + γ_1 X_1t + ... + γ_k X_kt + Σ_{i=1}^{p} ρ_i û_{t−i} + v_t.
Third: test whether the coefficients of û_{t−i} are all zero. Use the conventional F-statistic.
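The three steps of the LM (Godfrey) test can be sketched numerically; this is an illustrative Python/NumPy version on simulated data (the slides work in SAS), using the asymptotically equivalent (n − p)·R² form of the statistic rather than the conventional F form:

```python
import numpy as np

def ols_resid(y, X):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Simulated model with AR(1) errors (illustrative data only)
rng = np.random.default_rng(1)
n, p = 200, 2                       # sample size; AR order tested
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

# First: OLS on the original model, retrieve residuals u_hat
X = np.column_stack([np.ones(n), x])
uhat = ols_resid(y, X)

# Second: auxiliary regression of u_hat_t on the X's and p lagged residuals
lags = np.column_stack([uhat[p - i: n - i] for i in range(1, p + 1)])
Z = np.column_stack([X[p:], lags])
v = ols_resid(uhat[p:], Z)

# Third: test the lagged-residual coefficients jointly; the LM statistic
# (n - p) * R^2 is asymptotically chi-square(p) under H0
dep = uhat[p:] - uhat[p:].mean()
r2 = 1 - (v @ v) / (dep @ dep)
lm = (n - p) * r2
```

With ρ_1 = 0.6 in the simulated errors, the statistic lands far beyond the chi-square(2) 5% critical value of 5.99.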
-
Check the serial correlation pattern of the residuals; need to be sure that there is no serial correlation (desire white noise). Box and Pierce (1970) suggest looking at not just the first-order autocorrelation but autocorrelations of all orders of residuals.

Calculate Q = N Σ_{k=1}^{m} r_k², where r_k is the autocorrelation of lag k, and N is the number of observations in the series. If the model fitted is appropriate, Q ~ χ²_{m−p}, where p is the number of estimated parameters.

Ljung and Box (1978) suggest a modification of the Q-statistic for moderate sample sizes:

Q* = N(N + 2) Σ_{k=1}^{m} r_k²/(N − k)
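Both Q statistics are simple sums over residual autocorrelations; a hedged sketch in Python/NumPy (illustrative, not the SAS routines the slides use) on simulated white-noise versus AR(1) residuals:

```python
import numpy as np

def q_statistics(resid, m):
    """Box-Pierce Q = N * sum_{k=1..m} r_k^2 and
    Ljung-Box Q* = N(N+2) * sum_{k=1..m} r_k^2 / (N-k)."""
    e = np.asarray(resid, dtype=float)
    e = e - e.mean()
    N = len(e)
    denom = e @ e
    r = np.array([(e[k:] @ e[:-k]) / denom for k in range(1, m + 1)])
    q = N * np.sum(r ** 2)
    q_star = N * (N + 2) * np.sum(r ** 2 / (N - np.arange(1, m + 1)))
    return q, q_star

rng = np.random.default_rng(2)
white = rng.normal(size=300)            # white noise: Q near its chi-square mean
ar = np.zeros(300)
for t in range(1, 300):
    ar[t] = 0.7 * ar[t - 1] + rng.normal()

q_w, qs_w = q_statistics(white, m=10)
q_a, qs_a = q_statistics(ar, m=10)
```

Because (N + 2)/(N − k) > 1 for every lag, Q* always exceeds Q; the correction matters most in moderate samples.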
-
• We use the correlations and partial correlations of the residuals over time. The idea is to determine the appropriate pattern in the error structure from the autocorrelation and partial autocorrelation functions associated with the residuals.
• Autocorrelation functions tell us about moving average (MA) patterns.
• Partial autocorrelation functions tell us about autoregressive (AR) patterns.
• Anticipate ARMA error structures, particularly higher-order AR patterns in residuals of econometric models.
-
-
The test can be used for different specifications of the error process:

For example: u_t = ρ_4 u_{t−4} + e_t.

Estimate û_t = γ_0 + γ_1 X_1t + ... + γ_k X_kt + ρ_4 û_{t−4} + v_t.

Test H_0: ρ_4 = 0.
-
-
• With time-series data, in most cases serial correlation problems will surface
• Analysts must examine the error structure carefully
• Minimally:
  • Graph the residuals over time
  • Consider the significance of the Durbin-Watson statistic
  • Consider higher-order autocorrelation structure via PROC ARIMA
  • Consider the Godfrey LM Test
  • Consider the Box-Pierce or Ljung-Box Tests (Q-Statistics)
• Re-estimate econometric models with AR(p) error structures
-
• The regression model is specified as y_i = x_i'β + ε_i, where the ε_i’s are identically and independently distributed: E(ε) = 0 and E(εε') = σ²I. If the ε_i’s are not independent or their variances are not constant, the parameter estimates are unbiased, but the estimate of the covariance matrix is inconsistent.
• One of the key assumptions of regression is that the variance of the errors is constant across observations. If the errors have constant variance, the errors are called homoscedastic. Standard estimation methods are inefficient when the errors are heteroscedastic or have non-constant variance. As well, this issue leads to problems in tests of hypotheses.
• Null hypothesis: the disturbance terms are homoscedastic, H_0: σ_i² = σ² for all i.
• See McCulloch (1985)
-
Y_i = β_0 + β_1 X_1i + β_2 X_2i + ... + β_k X_ki + ε_i, i = 1, 2, ..., n
E(ε_i²) = σ_i²

Example: Y_i = β_0 + β_1 X_i + ε_i

where Y_i represents savings
X_i represents income
n represents the number of observations
-
• Assumption of homoscedasticity: Var(ε_i) = σ² for all i
• Since E(ε_i) = 0 by assumption, we may write the homoscedasticity condition as E(ε_i²) = σ².
• The variance of the error term or disturbance term is constant for all observations.
• This issue is problematic with microeconomic data or, more generally, cross-sectional data; take the case of income and expenditure of households for example.
• The assumption of homoscedasticity is not very plausible on a priori grounds; we expect less variation in consumption (or saving) for low-income households and more variation in consumption (or saving) for high-income households.
• Hence, the need to consider heteroscedastic disturbance terms in applied econometrics.
-
[Figure: estimated squared residuals e_i² plotted against explanatory variables x_j, j = 1, 2, ..., k; panels (a)-(e) illustrate different heteroscedastic patterns.]
-
1. OLS parameter estimates are unbiased and consistent. But they are not efficient.
2. The estimated variances of the parameters of the model are biased estimators.
-
• When the disturbance term is heteroscedastic, OLS parameter estimates are unbiased and consistent, but they are NOT BLUE.
• The estimated variances of OLS parameters are in general biased. Hence the conventionally calculated confidence intervals and tests of significance are invalid.
• Use of weighted least squares to overcome these consequences of heteroscedasticity:
w_i = weight = 1/σ_i², to penalize (reward) observations with relatively high (low) variance, i = 1, 2, ..., n
• Alternatively, use maximum likelihood (ML) estimation
-
• Transform the model

Y_i = β_0 + β_1 X_1i + β_2 X_2i + ... + β_k X_ki + ε_i, i = 1, 2, ..., N, with E(ε_i²) = σ_i²

• Let w_i = 1/σ_i
• So,

w_i Y_i = β_0 w_i + β_1 w_i X_1i + β_2 w_i X_2i + ... + β_k w_i X_ki + w_i ε_i

that is,

Y_i/σ_i = β_0(1/σ_i) + β_1(X_1i/σ_i) + β_2(X_2i/σ_i) + ... + β_k(X_ki/σ_i) + ε_i/σ_i

or

Y_i* = β_0* + β_1 X_1i* + β_2 X_2i* + ... + β_k X_ki* + ε_i*

Var(ε_i*) = Var(ε_i/σ_i) = (1/σ_i²) Var(ε_i) = 1
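The transformation can be demonstrated end to end in a few lines; this is an illustrative Python/NumPy sketch with simulated data and assumed-known σ_i (the slides implement the same idea through SAS weight statements):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
x = rng.uniform(1.0, 10.0, size=n)
sigma_i = 0.5 * x                         # assumed-known error s.d., rising with x
y = 2.0 + 0.3 * x + sigma_i * rng.normal(size=n)

# OLS on the raw (heteroscedastic) data
X = np.column_stack([np.ones(n), x])
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# WLS: divide every variable, including the intercept column, by sigma_i
w = 1.0 / sigma_i
beta_wls, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)

# the transformed errors eps_i / sigma_i should have unit variance
wres = (y - X @ beta_wls) * w
```

Both estimators are unbiased for (2.0, 0.3); the point of the transformation is that the weighted residuals are homoscedastic with variance 1, so the usual inference applies to the transformed model.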
-
• With SAS (or EViews), one can use the WEIGHT statement together with PROC MODEL to correct for heteroscedasticity.
• The WEIGHT statement follows the FIT statement.
-
• Model Specification: Saving_i = a + b·INCOME_i + ε_i
• 491 Households, 1989 Data
• b = marginal propensity to save

Assumption: σ_i² = f(INCOME_i)
σ_i² = INCOME_i²
Let w_i = 1/σ_i² = 1/INCOME_i²
-
-
OLS Estimates
Dependent variable: squared residuals, e_i²
σ̂_i² = intercept + 0.02910 inc_i²
-
Suggests the presence of heteroscedasticity.

Use of weighted least squares:

weight_i = 1/σ̂_i = 1/[intercept + 0.02910 inc_i²]^{1/2}
-
WLS Estimates
MPS = 0.27697 with WLS
MPS = 0.35954 with OLS
-
• When the variance of the errors of a classical linear model Y = Xβ + ε is not constant across observations (heteroscedastic), so that σ_i² ≠ σ_j² for some i, j, the OLS estimator

β̂_OLS = (X'X)^{-1} X'Y

• is unbiased but it is inefficient. Models that take into account the changing variance can make more efficient use of the data. When the variances σ_i² are known, generalized least squares (GLS) can be used, and the estimator

β̂_GLS = (X'Ω^{-1}X)^{-1} X'Ω^{-1}Y

• where

Ω = diag(σ_1², σ_2², ..., σ_n²)

• is unbiased and efficient. However, GLS is unavailable when the variances σ_i² are unknown. n refers to the number of observations.
-
Ω̂ = diag(e_1², e_2², ..., e_n²), where e_i = Y_i − x_i'β̂_OLS.

• Note that

Ω̂^{-1} = diag(1/e_1², 1/e_2², ..., 1/e_n²)
-
• Assumptions Concerning σ_i²

σ_i² = f(z_1i, z_2i, ..., z_pi)

• In the case of the micro consumption (or saving) function, the variance of the disturbance terms often is assumed to be positively associated with the level of household income.
• To operationalize such an assumption, we need to specify not only z_1i, z_2i, ..., z_pi, but also the functional form of the association.
• Common forms of association represent both multiplicative heteroscedasticity and additive heteroscedasticity.
-
σ_i² = σ² z_1i^{δ_1} z_2i^{δ_2} ... z_pi^{δ_p}
ln σ_i² = ln σ² + δ_1 ln z_1i + δ_2 ln z_2i + ... + δ_p ln z_pi

• Note that this specification applies only when all z’s are positive.

H_0: δ_1 = δ_2 = ... = δ_p = 0

Under H_0 we have homoscedastic disturbances; if we reject H_0, then evidence suggests heteroscedastic disturbance terms.

• Appropriate weight:

w_i = 1/(z_1i^{δ_1/2} z_2i^{δ_2/2} ... z_pi^{δ_p/2})
-
σ_i² = σ²(a_0 + a_1 z_1i + a_2 z_2i + ... + a_p z_pi)

H_0: a_1 = a_2 = ... = a_p = 0

• Under H_0 we have homoscedastic disturbances; if we reject H_0, then evidence suggests heteroscedastic disturbance terms.

• Appropriate weight:

w_i = 1/(a_0 + a_1 z_1i + a_2 z_2i + ... + a_p z_pi)^{1/2}
-
Practically speaking, apart from the functional form of the heteroscedastic issue, how do analysts select the z variables?

• Usually it is not likely that analysts would know of variables related to the variance of the disturbance term that have not already been included in the econometric specification. Thus, the usual choices of the z variables are likely to be the explanatory variables (the X variables).
• Prime candidates for the z variables are the non-discrete explanatory variables X.
-
(1) Park-Glejser
(2) Breusch-Pagan-Godfrey
(3) Harvey
(4) White
-
• Model: Y_i = β_0 + β_1 X_1i + ... + β_k X_ki + u_i, with H_0: σ_i² = σ² for all i

ln σ_i² = ln σ² + δ_1 ln X_1i + δ_2 ln X_2i + ... + δ_k ln X_ki + u_i

• Replace ln σ_i² ⇒ ln e_i²
• Perform F Test on the δ’s
• If F non-significant, then the disturbance terms are homoscedastic
• If F significant, then disturbance terms are heteroscedastic
• Correction:

weight_i = 1/∏_{j=1}^{k} X_ji^{δ̂_j/2}
-
OLS Estimates
-
-
Auxiliary regression:

ln σ_i² = ln σ² + δ_1 ln PCAID_i + δ_2 ln PCINC_i + u_i

F test indicates the presence of heteroscedasticity.
-
WLS

weight_i = 1/(PCAID_i^{1.90627/2} PCINC_i^{4.36463/2})
-
Test H0: coefficient of nez, mwz, and wez are jointly
equal to 0.
Test H0: coefficient of nez = coefficient of mwz.
Test H0: coefficient of nez = coefficient of wez.
Test H0: coefficient of mwz = coefficient of wez.
-
Use of Proc Model to handle the heteroscedasticity problem (use of weight statement).
Same WLS estimates as before.
-
• Regress

σ_i² = γ_0 + γ_1 X_1i + γ_2 X_2i + ... + γ_k X_ki
e_i² = γ_0 + γ_1 X_1i + γ_2 X_2i + ... + γ_k X_ki + u_i

H_0: γ_1 = γ_2 = ... = γ_k = 0

• Perform F test on the γ’s
• Correction:

WEIGHT_i = 1/(γ̂_0 + Σ_{j=1}^{k} γ̂_j X_ji)^{1/2}

Breusch and Pagan (1979); Godfrey (1978)
-
OLS estimates
-
Auxiliary regression:

e_i² = γ_0 + γ_1 pcaid_i + γ_2 pcinc_i + u_i

F test associated with the auxiliary regression
H_0: coefficient of pcaid = coefficient of pcinc = 0
-
weight_i = 1/[γ̂_0 + 5.92871 pcaid_i + 3.13583 pcinc_i]^{1/2}

WLS estimates
-
Test H0: coefficient of nez, mwz, and wez are jointly
equal to 0.
Test H0: coefficient of nez = coefficient of mwz.
Test H0: coefficient of nez = coefficient of wez.
Test H0: coefficient of mwz = coefficient of wez.
-
Use of Proc Model with weight statement.
-
• To operationalize this hypothesis, formulate the following regression:

σ_i² = exp(a_0 + a_1 X_1i + a_2 X_2i + ... + a_K X_Ki + u_i)
ln e_i² = a_0 + a_1 X_1i + a_2 X_2i + ... + a_K X_Ki + u_i

• Perform F test on the a’s
• Correction:

WEIGHT_i = 1/[exp(â_0 + Σ_{j=1}^{K} â_j X_ji)]^{1/2}
-
OLS estimates
-
Auxiliary regression:

ln e_i² = a_0 + a_1 pcaid_i + a_2 pcinc_i + u_i

Test of H_0: a_1 = a_2 = 0 indicates the presence of heteroscedasticity.
-
weight_i = 1/[exp(1.96476 + 0.00824 pcaid_i + 0.0009203 pcinc_i)]^{1/2}

WLS estimates
-
Test H0: coefficient of nez, mwz, and wez are jointly
equal to 0.
Test H0: coefficient of nez = coefficient of mwz.
Test H0: coefficient of nez = coefficient of wez.
Test H0: coefficient of mwz = coefficient of wez.
-
WLS estimates via Proc Model
-
• Tests whether σ² is influenced by any of the regressors, squares of regressors, or cross-products of regressors, working through e_i²
• Step 1 = Use OLS and obtain OLS residuals
• Step 2 = Square residuals (form e_i²)
• Step 3 = Form squares of the right-hand-side variables (k terms) & cross products of regressors (k(k−1)/2 terms)
• Step 4 = Regress e_i² against the original regressors and the terms in step 3:

e_i² = α_0 + α_1 X_1i + ... + α_k X_ki + α_{k+1} X_1i² + ... + α_{2k} X_ki² + α_{2k+1} X_1i X_2i + ... + α_{k(k+3)/2} X_{k−1,i} X_ki

• Correction:

WEIGHT_i = 1/[α̂_0 + α̂_1 X_1i + ... + α̂_k X_ki + α̂_{k+1} X_1i² + ... + α̂_{2k} X_ki² + α̂_{2k+1} X_1i X_2i + ... + α̂_{k(k+3)/2} X_{k−1,i} X_ki]^{1/2}
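The four steps can be sketched as one function; an illustrative Python/NumPy version on simulated data (the slides run this in SAS), using the asymptotically equivalent n·R² form of White's statistic:

```python
import numpy as np

def white_test(y, regressors):
    """White's procedure: square the OLS residuals, regress them on the
    regressors, their squares, and their cross products; n*R^2 from that
    auxiliary regression is asymptotically chi-square under homoscedasticity
    (df = number of auxiliary slope terms)."""
    n = len(y)
    X = np.column_stack([np.ones(n)] + regressors)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ b) ** 2                                    # Steps 1-2
    cross = [r * s for i, r in enumerate(regressors) for s in regressors[i:]]
    Z = np.column_stack([np.ones(n)] + regressors + cross)   # Step 3
    g, *_ = np.linalg.lstsq(Z, e2, rcond=None)               # Step 4
    u = e2 - Z @ g
    r2 = 1 - (u @ u) / ((e2 - e2.mean()) @ (e2 - e2.mean()))
    return n * r2

rng = np.random.default_rng(4)
n = 500
x1 = rng.uniform(1, 5, size=n)
x2 = rng.uniform(1, 5, size=n)
y_het = 1.0 + 0.5 * x1 - 0.2 * x2 + x1 * rng.normal(size=n)  # Var grows with x1^2
y_hom = 1.0 + 0.5 * x1 - 0.2 * x2 + rng.normal(size=n)       # constant variance

stat_het = white_test(y_het, [x1, x2])
stat_hom = white_test(y_hom, [x1, x2])
```

With two regressors there are five auxiliary slope terms, so the heteroscedastic series should produce a statistic far beyond the chi-square(5) 5% critical value of 11.07, while the homoscedastic series stays near its mean of 5.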
-
OLS estimates
-
Auxiliary regression
F test indicates the presence of heteroscedasticity
-
WLS estimates with

weight_i = 1/[α̂_0 + 3.88919 pcaid_i + 51.24914 pcinc_i + 0.25458 pcaid_i² − 0.00364 pcinc_i² − 0.08819 pcaid_i·pcinc_i]^{1/2}
-
Test H0: coefficient of nez, mwz, and wez are jointly
equal to 0.
Test H0: coefficient of nez = coefficient of mwz.
Test H0: coefficient of nez = coefficient of wez.
Test H0: coefficient of mwz = coefficient of wez.
-
WLS estimates with weight statement from Proc Model.
-
Test | p-value of F-statistic
Park-Glejser | 0.0406
Breusch-Pagan-Godfrey | 0.0008
Harvey | 0.0311
White | 0.0018

• All tests indicate the presence of heteroscedasticity.
-
Correction for Heteroscedasticity—WLS
Summary of 1970 State Data Problem with Heteroscedasticity

Variable | OLS | Park-Glejser | Breusch-Pagan-Godfrey | Harvey | White
Intercept | -665.51586 (103.70268) | -303.89856 (124.50782) | -429.75385 (115.99855) | -291.33079 (123.71054) | -583.81742 (106.63934)
PCAID | 2.55446 (0.18707) | 1.69045 (0.27151) | 2.05626 (0.25728) | 1.64611 (0.27646) | 2.26077 (0.25838)
PCINC | 0.22216 (0.02495) | 0.16693 (0.02536) | 0.18451 (0.02511) | 0.16576 (0.02482) | 0.21301 (0.02300)
NE | 28.32663 (41.50517) | 67.66182 (32.06849) | 51.97487 (35.43834) | 69.01092 (31.49605) | 51.30945 (33.40153)
MW | 61.90315 (37.66921) | 79.60480 (27.08596) | 78.65195 (29.50126) | 77.16368 (27.03204) | 76.51217 (30.10805)
WE | 33.69220 (38.88263) | 78.98334 (31.65617) | 69.98310 (35.81770) | 79.60937 (30.56592) | 66.40836 (31.60469)

• Standard errors in parentheses.
-
Correction for Heteroscedasticity—WLS
Summary of 1970 State Data Problem with Heteroscedasticity, cont.

Variable | OLS | Park-Glejser | Breusch-Pagan-Godfrey | Harvey | White
F-Test on Region Dummy Variables | 0.95 | 3.75 | 2.73 | 3.71 | 2.72
P-value of F-test | 0.4235 | 0.0176 | 0.0549 | 0.0183 | 0.0560
R² | 0.8951 | 0.7463 | 0.7972 | 0.7506 | 0.8606
Adjusted R² | 0.8832 | 0.7175 | 0.7741 | 0.7223 | 0.8444

• The R² and adjusted R² statistics come from the PROC MODEL procedure.
-
• The heteroscedastic regression model:

y_i = x_i'β + ε_i
ε_i ~ N(0, σ_i²)
σ_i² = σ² h_i
h_i = l(z_i'η)

• The heteroscedastic regression model is estimated using the following log-likelihood function:

ℓ = −(N/2) ln(2π) − (1/2) Σ_{i=1}^{N} ln σ_i² − (1/2) Σ_{i=1}^{N} e_i²/σ_i²

where e_i = y_i − x_i'β.

• Use non-linear estimation procedures to maximize the likelihood function. Parameters are σ², η, and β. Typically, z_i is a subset of the x_i explanatory variables.
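The log-likelihood can be coded directly for a numerical optimizer; a sketch in Python/NumPy assuming the common exponential link h_i = exp(z_i'η) (an assumption on our part, since the slides leave l(·) generic):

```python
import numpy as np

def neg_loglik(params, y, X, Z):
    """Negative log-likelihood of the heteroscedastic regression model,
    with sigma_i^2 = sigma^2 * exp(z_i'eta); the exp link is an assumption.
    params = [beta (kb), eta (kz), ln(sigma^2)]."""
    kb, kz = X.shape[1], Z.shape[1]
    beta = params[:kb]
    eta = params[kb:kb + kz]
    log_s2 = params[-1]                      # ln(sigma^2), unconstrained
    s2_i = np.exp(log_s2 + Z @ eta)          # sigma_i^2 = sigma^2 * h_i > 0
    e = y - X @ beta                         # e_i = y_i - x_i'beta
    return 0.5 * np.sum(np.log(2 * np.pi * s2_i) + e ** 2 / s2_i)
```

With η = 0 this collapses to the ordinary homoscedastic normal likelihood; in practice the function would be handed to a numerical minimizer such as scipy.optimize.minimize, mirroring the non-linear estimation the slide describes.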
-
(1) Retrieve OLS residuals
(2) Square the OLS residuals
(3) Graph the square of the OLS residuals vs. non-discrete explanatory variables
(4) Apply Park-Glejser, BPG, Harvey, and/or White tests
(5) If any of the F-tests from (4) are statistically significant, then heteroscedasticity is present
(6) To alleviate the heteroscedasticity problem, use WLS or ML estimation
(7) Report WLS (GLS) or ML estimates, standard errors, p-values, etc.
(8) Retrieve appropriate goodness-of-fit statistics
-
Nature of Problem
Consequences
Introduction
Belsley, Kuh, Welsch Diagnostics
Variance inflation factors
Condition indices
Variance-decomposition proportions
Circumvention of Problem: Ridge Regression
-
Multiple regression of Y on X and Z.
OLS estimators: use of blue area to estimate β_X and green area to estimate β_Z.
Discard information in red area.
-
Near linear dependency among regressor variables:

Σ_{j=0}^{k} a_j X_j ≈ 0

Departure from orthogonality of the columns of X
(X'X) → singularity
Elements of (X'X)^{-1} “explode”
-
Orthogonal Variables vs. Non-orthogonal Variables

Case 1: X'X = [1 0; 0 1], (X'X)^{-1} = [1 0; 0 1]
Case 2: X'X = [1 .9; .9 1], (X'X)^{-1} = [5.26 −4.74; −4.74 5.26]
Case 3: X'X = [1 .99; .99 1], (X'X)^{-1} = [50.25 −49.75; −49.75 50.25]

Key Points:
(1) Sampling variances of estimated OLS coefficients increase sharply
(2) Greater sampling covariances for the OLS coefficients
-
• Deals with specific characteristics of the data matrix X: a data problem, not a statistical problem
• Speak in terms of severity rather than of its existence or nonexistence
• Effects on structural integrity of econometric models
• Opposite of collinear: orthogonal
-
• Constitutes a threat to the proper specification and effective estimation of a structural relationship
• Covariances among parameter estimates are often large and of the wrong sign
• Larger variances (standard errors) of regression coefficients, VAR(β̂) = σ²(X'X)^{-1}; these are indistinguishable from the consequences of inadequate variability in the regressors
• Difficulties in interpretation
• Confidence regions for parameters are wide
• Increase in type II error (accept H_0 when H_0 false)
• Decrease in power of tests
-
Multicollinearity refers to the presence of highly intercorrelated exogenous variables in regression models. It is not surprising that it is considered “one of the most ubiquitous, significant, and difficult problems in applied econometrics…often referred to by modelers as the familiar curse.” Collinearity diagnostics measure how much regressors are related to other regressors and how these relationships affect the stability and variance of the regression estimates.
-
Signs of multicollinearity in a regression analysis include:
(1) Large standard errors on the regression coefficients, so that estimates of the true model parameters become unstable and low t-values prevail.
(2) The parameter estimates vary considerably from sample to sample.
(3) Often there will be drastic changes in the regression estimates after only minor data revision.
(4) Conflicting conclusions will be reached from the usual tests of significance (such as the wrong sign for a parameter).
-
(5) Extreme correlations between pairs of variables.
(6) Omitting a variable from the equation results in smaller regression standard errors.
(7) A good fit not providing good forecasts.
-
(1) Produce a set of condition indices that signal the presence of one or more near dependencies among the variables. (Linear dependency, an extreme form of multicollinearity, occurs when there is an exact linear relationship among the variables.)
(2) Uncover those variables that are involved in particular near dependencies, and assess the degree to which the estimated regression coefficients are being degraded by the presence of the near dependencies.

In practice, if one exogenous variable has a high squared multiple correlation (R-squared) with the other independent variables, it is extremely unlikely that the exogenous variable in question contributes significantly to the prediction equation. When the R-squared is too high, the variables are, in essence, redundant.
-
The variance inflation factor (VIF_i) for variable i is defined as follows:

VIF_i = 1/(1 − R_i²)

As the squared multiple correlation of the exogenous variable with the other exogenous variables approaches unity, the corresponding VIF becomes infinite. If exogenous variables are orthogonal to each other (no correlation), the variance inflation factor is 1.0. VIF_i thus provides us with a measure of how many times larger the variance of the ith regression coefficient will be for multicollinear data than for orthogonal data (where each VIF is 1.0). If the VIFs are not too much larger than 1.0, multicollinearity is not a problem. An advantage of knowing the VIF for each variable is that it gives the user a tangible idea of how much the variances of the estimated coefficients are degraded by the multicollinearity.
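The definition translates directly into a few auxiliary regressions; an illustrative Python/NumPy sketch on simulated data (SAS reports the same quantity via the VIF option on PROC REG's MODEL statement):

```python
import numpy as np

def vif(X):
    """VIF_i = 1/(1 - R_i^2), where R_i^2 comes from regressing column i
    of X on the remaining columns (with an intercept in each auxiliary fit)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = []
    for i in range(k):
        yi = X[:, i]
        Zi = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        b, *_ = np.linalg.lstsq(Zi, yi, rcond=None)
        resid = yi - Zi @ b
        r2 = 1 - (resid @ resid) / ((yi - yi.mean()) @ (yi - yi.mean()))
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(5)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)   # nearly collinear with x1
x3 = rng.normal(size=200)               # roughly orthogonal to both
v = vif(np.column_stack([x1, x2, x3]))
```

The two near-duplicate columns get very large VIFs while the orthogonal one stays near 1.0, matching the interpretation in the text.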
-
Small determinant: some (or many) of the eigenvalues are small

|X'X| = λ_1 λ_2 ... λ_p, where the λ’s are eigenvalues of X'X

Belsley, Kuh, Welsch diagnostic tools:

Condition number of the X matrix: k(X) = √(λ_MAX/λ_MIN) = μ_MAX/μ_MIN
Condition index: η_s = μ_MAX/μ_s, s = 1, ..., p

μ_s is the sth singular value.
-
η_s = μ_MAX/μ_s, s = 1, ..., p, is the sth condition index of the n×p data matrix X.

Key Point: there are as many near dependencies among the columns of a data matrix X as there are high condition indices.

Weak dependencies are associated with condition indices around 5 or 10.
Moderate to strong relations are associated with condition indices > 30.
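Condition indices fall straight out of a singular value decomposition; a hedged sketch in Python/NumPy, including the column scaling that Belsley, Kuh, and Welsch apply before computing the indices:

```python
import numpy as np

def condition_indices(X, scale=True):
    """eta_s = mu_max / mu_s for the singular values mu_s of X.
    Belsley, Kuh, and Welsch scale each column to unit length first."""
    X = np.asarray(X, dtype=float)
    if scale:
        X = X / np.sqrt((X ** 2).sum(axis=0))
    mu = np.linalg.svd(X, compute_uv=False)
    return mu.max() / mu

rng = np.random.default_rng(6)
z = rng.normal(size=100)
# third column is a near duplicate of the second: one near dependency
X = np.column_stack([np.ones(100), z, z + 1e-3 * rng.normal(size=100)])
eta = condition_indices(X)
```

One index is identically 1 (the largest singular value against itself), and the near dependency produces a single index far beyond the > 30 threshold, consistent with the one-high-index-per-dependency rule above.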
-
Diagnostic Procedure

Singular Value | VAR(β̂_0) | VAR(β̂_1) | ... | VAR(β̂_p)
μ_0 | Π_00 | Π_01 | ... | Π_0p
μ_1 | Π_10 | Π_11 | ... | Π_1p
:   | :    | :    |     | :
μ_p | Π_p0 | Π_p1 | ... | Π_pp

(Each column of variance-decomposition proportions sums to 1.)

Flag a harmful near dependency when (1) the condition index η_s ≥ 30 and (2) two or more variance-decomposition proportions Π ≥ .5.

Source: Belsley, Kuh, Welsch. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity (1980), John Wiley & Sons.
-
Note that variables 1, 2, 5, and 6 are highly correlated and the VIFs for all variables (except variable 3) are greater than 10, with one of them being greater than 1,000.
Examination of the condition index column reveals a dominating dependency situation with high numbers for several indices.
-
Hoerl and Kennard (1970)

Minimize L = (Y − Xβ)'(Y − Xβ) − k(β'β − d²)
Set dL/dβ = 0
⇒ β̂_R = (X'X + kI)^{-1} X'Y

E[β̂_R] = (X'X + kI)^{-1} X'X β
VAR[β̂_R] = σ²(X'X + kI)^{-1} X'X (X'X + kI)^{-1}
β̂_R = (X'X + kI)^{-1} X'X β̂_OLS

There exists a number k such that

MSE(β̂_R) ≤ MSE(β̂_OLS), where MSE = variance + (bias)².
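The ridge estimator is a one-line modification of OLS; an illustrative Python/NumPy sketch on nearly collinear simulated data:

```python
import numpy as np

def ridge(X, y, k):
    """Hoerl-Kennard ridge estimator: beta_R = (X'X + kI)^(-1) X'y."""
    XtX = X.T @ X
    return np.linalg.solve(XtX + k * np.eye(X.shape[1]), X.T @ y)

rng = np.random.default_rng(7)
n = 100
z = rng.normal(size=n)
# two nearly collinear regressors, true coefficients (1, 1)
X = np.column_stack([z, z + 0.01 * rng.normal(size=n)])
y = X @ np.array([1.0, 1.0]) + rng.normal(size=n)

b_ols = ridge(X, y, 0.0)     # k = 0 reproduces OLS
b_ridge = ridge(X, y, 1.0)   # k > 0 shrinks and stabilizes the estimates
```

The individual OLS coefficients are wildly unstable under near collinearity, while their well-identified sum stays near 2; the ridge estimates have a strictly smaller norm, illustrating the bias-for-variance trade that makes MSE(β̂_R) ≤ MSE(β̂_OLS) possible for some k.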
-