
The Hausman test in dynamic panel model

Author: Mengque Liu

Supervisor: Johan Lyhagen

Master thesis in Statistics

Faculty of Statistics

Uppsala University, Sweden

May, 2010

Abstract

I propose a Hausman test for the dynamic panel model. The aim of the test is to detect whether fixed effects are present in the dynamic model. It is based on a comparison of the PGMM estimator with an instrumental variables estimator, which is consistent only under the null hypothesis. A Monte Carlo simulation is presented to compare the finite-sample properties of the Hausman test in the dynamic model with those in the individual-specific effects model.

Key words: dynamic model, Hausman test, PGMM estimator, instrumental variables estimator


1. Introduction

In the study of panel data, unobserved individual heterogeneity that is correlated with the regressors is an important issue. Hausman [1978] proposed a specification test that can be used to detect fixed effects in the individual-specific effects model. My interest is the Hausman test in the dynamic model. This essay compares the Hausman test in the individual-specific effects model with that in the dynamic model. The focus is on panels with a large number of individuals but a short time dimension.

For the dynamic model, which includes a lagged dependent variable as a regressor, the standard panel estimators are all inconsistent.

Anderson and Hsiao [1981] suggested a simple instrumental variable method to estimate the first-differenced dynamic panel data model, in which lagged values of the dependent variable become valid instruments. The instrumental variable estimator is consistent, although it is not fully efficient. Arellano and Bond [1991] proposed the generalized method of moments (GMM)¹ to estimate the first-differenced dynamic panel data model; this estimator is called difference GMM. The generalized method of moments is based on the sample analog of the population moment conditions. It allows the use of larger, unbalanced instrument sets, which leads to a more efficient estimator. Holtz-Eakin, Newey and Rosen [1988] studied a similar estimator. All the estimators that they proposed are consistent when fixed effects are present.

In this thesis I suggest an instrumental variable estimator that is consistent only when the individual effects are random. The Hausman test for fixed effects in the dynamic model is then based on the comparison between this instrumental variable estimator and the panel generalized method of moments estimator proposed by Arellano and Bond [1991].

In addition, I carry out a Monte Carlo study to see whether the Hausman test is less efficient in the dynamic model than in the individual-specific effects model.

The essay is organized as follows. Section 2 introduces the panel models: the random effects model, the fixed effects model and the dynamic model. Section 3 derives the Hausman tests in the different models. Section 4 presents a Monte Carlo study of the small-sample properties of the tests. Section 5 shows an empirical example. Section 6 concludes.

¹ The generalized method of moments was developed by Lars Peter Hansen.

2. Individual-specific effects model and dynamic model

2.1 Individual-specific effects model

Panel data offer information not only across individuals but also for a given individual over time. The individual-specific effects model, in which the intercept varies across individuals while the slopes are common, is a classical regression model that captures this feature of panel data:

$$y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}, \qquad i = 1,\dots,N,\quad t = 1,\dots,T \tag{2.1}$$

where $\varepsilon_{it}$ is iid over $i$ and $t$. The random variables $\alpha_i$ are unobserved individual effects. Throughout the discussion of the individual-specific effects model, strong exogeneity is assumed:

$$E[\varepsilon_{it} \mid \alpha_i, x_{i1}, \dots, x_{iT}] = 0, \qquad t = 1,\dots,T$$

2.1.1 Random effects model and random effects estimator

One variant of model (2.1) is the random effects model. ‘… the unobservable individual effects $\alpha_i$ are random variables that are distributed independently of the regressors. This model is called random effects model, which usually makes the additional assumptions that

$$\alpha_i \sim [\alpha, \sigma_\alpha^2], \qquad \varepsilon_{it} \sim [0, \sigma_\varepsilon^2] \tag{2.2}$$

So that both the random effects and the error term in (2.1) are assumed to be iid.’ (Cameron and Trivedi, 2005, p.700)

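Although both the random effects and the error term are iid, $y_{it}$ is serially correlated in the random effects model. To see this, set $\beta = 0$, so that $y_{it} = \alpha_i + \varepsilon_{it}$. Then

$$\mathrm{Cor}[y_{it}, y_{i,t-1}] = \mathrm{Cor}[\alpha_i + \varepsilon_{it},\ \alpha_i + \varepsilon_{i,t-1}] = \frac{\mathrm{cov}(\alpha_i + \varepsilon_{it},\ \alpha_i + \varepsilon_{i,t-1})}{\sqrt{\mathrm{var}(\alpha_i + \varepsilon_{it})\,\mathrm{var}(\alpha_i + \varepsilon_{i,t-1})}} = \frac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma_\varepsilon^2}$$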
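The correlation $\sigma_\alpha^2 / (\sigma_\alpha^2 + \sigma_\varepsilon^2)$ can be checked numerically. The following is a minimal sketch in base R, assuming normal draws with $\sigma_\alpha^2 = \sigma_\varepsilon^2 = 1$, so that the implied correlation is 0.5. The sample size, the seed and all variable names are illustrative choices and not part of the thesis.

```r
# Simulated check of Cor[y_it, y_i,t-1] = sigma_alpha^2 / (sigma_alpha^2 + sigma_eps^2)
# Assumptions (illustrative): beta = 0, normal draws, both standard deviations equal to 1.
set.seed(1)
N <- 20000
sigma_alpha <- 1
sigma_eps <- 1

alpha  <- rnorm(N, 0, sigma_alpha)            # individual effects
y_prev <- alpha + rnorm(N, 0, sigma_eps)      # y_i,t-1
y_curr <- alpha + rnorm(N, 0, sigma_eps)      # y_it

rho_theory <- sigma_alpha^2 / (sigma_alpha^2 + sigma_eps^2)
rho_sim <- cor(y_curr, y_prev)
print(c(theory = rho_theory, simulated = rho_sim))
```

The two periods share only the individual effect, so the simulated correlation should be close to the theoretical value.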

Because the errors of the random effects model are correlated over time for a given individual, the feasible GLS estimator is consistent and efficient under the random effects assumption. The random effects estimator is the feasible GLS estimator of the RE model, and it is fully efficient under the assumptions (2.2).

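The random effects estimator can be obtained by OLS estimation of the transformed model

‘… $$y_{it} - \hat\lambda \bar y_i = (1 - \hat\lambda)\mu + (x_{it} - \hat\lambda \bar x_i)'\beta + v_{it} \tag{2.3}$$ ²

Where $v_{it} = (1 - \hat\lambda)\alpha_i + (\varepsilon_{it} - \hat\lambda \bar\varepsilon_i)$ is asymptotically iid, and $\hat\lambda$ is consistent for

$$\lambda = 1 - \frac{\sigma_\varepsilon}{\sqrt{T\sigma_\alpha^2 + \sigma_\varepsilon^2}}$$

Note that $\hat\lambda = 0$ corresponds to pooled OLS, $\hat\lambda = 1$ corresponds to within estimation, and $\hat\lambda \to 1$ as $T \to \infty$. This is a two-step estimator of $\beta$.’ (Cameron and Trivedi, 2005, p.700)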
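To illustrate how the weight behaves, the sketch below evaluates $\lambda = 1 - \sigma_\varepsilon / \sqrt{T\sigma_\alpha^2 + \sigma_\varepsilon^2}$, the form given by Cameron and Trivedi (2005), for a few settings. The function name `re_lambda` is hypothetical, and the variance values are chosen only for illustration.

```r
# Weight of the random effects (feasible GLS) transformation.
# re_lambda is an illustrative helper, not a function from any package.
re_lambda <- function(T, sigma_alpha2, sigma_eps2) {
  1 - sqrt(sigma_eps2) / sqrt(T * sigma_alpha2 + sigma_eps2)
}

lambda_T7     <- re_lambda(7, 1, 1)     # T = 7 with unit variances
lambda_pooled <- re_lambda(7, 0, 1)     # no individual effect: pooled OLS
lambda_large  <- re_lambda(1e6, 1, 1)   # very long panel: close to within estimation
print(c(lambda_T7, lambda_pooled, lambda_large))
```

With $T = 7$ and unit variances the weight is $1 - 1/\sqrt{8}$, roughly 0.65, which lies between the pooled OLS case (0) and the within case (1).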

When fixed effects are present, the unobservable individual effects are correlated with the observed regressors, so the error in the transformed model is still correlated with $(x_{it} - \hat\lambda \bar x_i)$ and the random effects estimator is inconsistent.

2.1.2 Fixed effects model and fixed effects estimator

If the individual-specific effects are correlated with the observed regressors, model (2.1) is called the fixed effects model, which is another variant of model (2.1).³

The random effects estimator fails to yield consistent estimates of $\beta$ under the fixed effects model. The within (or fixed effects) estimator uses individual-specific deviations from the time averages to obtain a consistent estimator in the fixed effects model, because the individual effects term is eliminated when these deviations are formed.

The within estimator can be obtained by OLS estimation of the transformed model:⁴

$$(y_{it} - \bar y_i) = (x_{it} - \bar x_i)'\beta + (\varepsilon_{it} - \bar\varepsilon_i), \qquad i = 1,\dots,N,\quad t = 1,\dots,T \tag{2.4}$$
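The within transformation can be sketched in a few lines of base R. The example below assumes a scalar regressor, a true slope of 1, and individual effects that are correlated with the regressor through a common component; the data-generating choices and variable names are illustrative assumptions. It compares the within estimator with pooled OLS and with the equivalent dummy-variable regression.

```r
# Within estimator via individual-specific deviations from time averages.
# Assumptions (illustrative): scalar x, beta = 1, alpha_i correlated with x_it.
set.seed(2)
N <- 200
T <- 7
id <- rep(1:N, each = T)
mu <- rnorm(N)
x <- rnorm(N * T, mean = mu[id])
alpha <- (mu + rnorm(N))[id]          # individual effect depends on mu, hence on x
y <- alpha + 1 * x + rnorm(N * T)

y_dev <- y - ave(y, id)               # y_it minus its individual time average
x_dev <- x - ave(x, id)
beta_within <- sum(x_dev * y_dev) / sum(x_dev^2)

beta_pooled <- unname(coef(lm(y ~ x))["x"])               # ignores alpha_i
beta_lsdv   <- unname(coef(lm(y ~ x + factor(id)))["x"])  # dummy-variable form
print(c(within = beta_within, lsdv = beta_lsdv, pooled = beta_pooled))
```

In this design the within estimate should be close to 1 and numerically identical to the dummy-variable estimate, while pooled OLS is pulled away from 1 by the correlation between the individual effects and the regressor.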

2.2 Dynamic model

If a lagged value of the dependent variable is included in the usual individual-specific effects panel data model, the model is a dynamic model:

$$y_{it} = \gamma y_{i,t-1} + x_{it}'\beta + \alpha_i + \varepsilon_{it}, \qquad i = 1,\dots,N,\quad t = 1,\dots,T \tag{2.5}$$

² See Cameron and Trivedi, 2005, p.734~736 for the derivation of (2.3) and ways to estimate $\sigma_\alpha^2$, $\sigma_\varepsilon^2$ and $\lambda$.

³ Cameron and Trivedi, 2005, p.700

⁴ Cameron and Trivedi, 2005, p.704

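The serial correlation in $y_{it}$ now differs from that in the individual-specific effects model. Besides the indirect effect through $\alpha_i$, the past value of the dependent variable, $y_{i,t-1}$, directly induces serial correlation.

‘… let $\beta = 0$, so that $y_{it} = \gamma y_{i,t-1} + \alpha_i + \varepsilon_{it}$.

$$\begin{aligned}
\mathrm{Cor}[y_{it}, y_{i,t-1}] &= \mathrm{Cor}[\gamma y_{i,t-1} + \alpha_i + \varepsilon_{it},\ y_{i,t-1}] \\
&= \gamma + \mathrm{Cor}[\alpha_i, y_{i,t-1}] \\
&= \gamma + \frac{1 - \gamma}{1 + (1 - \gamma)\sigma_\varepsilon^2 / \big((1 + \gamma)\sigma_\alpha^2\big)}
\end{aligned}$$

The result makes it clear that there are two possible reasons for correlation between $y_{it}$ and $y_{i,t-1}$.’ (Cameron and Trivedi, 2005, p.763)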
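The closed-form correlation $\gamma + (1-\gamma) / \{1 + (1-\gamma)\sigma_\varepsilon^2 / ((1+\gamma)\sigma_\alpha^2)\}$ can be compared with a simulation. The sketch below assumes $\gamma = 0.5$, unit variances, normal draws and a burn-in long enough for the process to be close to stationary; the helper name `dyn_cor` and all settings are illustrative assumptions.

```r
# First-order autocorrelation of y_it = gamma * y_i,t-1 + alpha_i + eps_it.
# dyn_cor is an illustrative helper implementing the closed-form expression.
dyn_cor <- function(gamma, sigma_alpha2, sigma_eps2) {
  gamma + (1 - gamma) /
    (1 + (1 - gamma) * sigma_eps2 / ((1 + gamma) * sigma_alpha2))
}
rho_theory <- dyn_cor(0.5, 1, 1)

set.seed(3)
N <- 5000
burn <- 200
gamma <- 0.5
alpha <- rnorm(N)
y_curr <- alpha / (1 - gamma)          # start at the individual long-run mean
y_prev <- y_curr
for (s in 1:burn) {
  y_prev <- y_curr
  y_curr <- gamma * y_prev + alpha + rnorm(N)
}
rho_sim <- cor(y_curr, y_prev)
print(c(theory = rho_theory, simulated = rho_sim))
```

For these settings the formula gives 0.875, well above $\gamma = 0.5$; the gap is the part of the correlation that comes from the individual effect.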

2.2.1 Instrumental variables estimator

If $\alpha_i$ is a random effect, the dynamic model can be viewed as a random effects model with one lag of the dependent variable added.

OLS estimation of (2.5) leads to inconsistent estimates of $\gamma$ and $\beta$, because the regressor $y_{i,t-1}$ is correlated with $\alpha_i$ and hence with the composite error term $(\alpha_i + \varepsilon_{it})$.

To obtain a consistent estimator in this situation, we use an instrumental variables estimator with $(y_{i,t-1} - y_{i,t-2})$ as an instrument for $y_{i,t-1}$. This is a valid instrument: because $\alpha_i$ cancels in $(y_{i,t-1} - y_{i,t-2})$, the instrument is not correlated with $(\alpha_i + \varepsilon_{it})$. Furthermore, $(y_{i,t-1} - y_{i,t-2})$ is correlated with $y_{i,t-1}$.

Model (2.5) is just-identified with $(y_{i,t-1} - y_{i,t-2})$ as an instrument for $y_{i,t-1}$. The instrumental variables estimator is

$$\hat\beta_{IV} = (Z'X)^{-1} Z'y$$

In this model $Z$ is an $N(T-1) \times K$ matrix with rows $z_{it}' = [\,y_{i,t-1} - y_{i,t-2},\ x_{it}'\,]$.

This instrumental variables estimator is consistent in the dynamic model with random individual effects, but it is inconsistent when fixed individual effects are present.
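The just-identified estimator $(Z'X)^{-1}Z'y$ can be sketched in base R. The example assumes a scalar regressor, random effects that are independent of $x_{it}$, true values $\gamma = 0.5$ and $\beta = 1$, and a burn-in period so that the simulated process is close to stationary. All names and settings are illustrative assumptions, not the thesis implementation, which uses `tsls` from the `sem` package.

```r
# IV estimation of y_it = gamma * y_i,t-1 + beta * x_it + alpha_i + eps_it,
# with (y_i,t-1 - y_i,t-2) as the instrument for y_i,t-1.
# Assumptions (illustrative): scalar x, random effects, gamma = 0.5, beta = 1.
set.seed(4)
N <- 2000
T <- 7
burn <- 50
gamma <- 0.5
beta <- 1

alpha <- rnorm(N)                                  # independent of x
x <- matrix(rnorm(N * (burn + T)), N, burn + T)
y <- matrix(0, N, burn + T)
y[, 1] <- alpha / (1 - gamma)
for (s in 2:(burn + T)) {
  y[, s] <- gamma * y[, s - 1] + beta * x[, s] + alpha + rnorm(N)
}
keep <- (burn + 1):(burn + T)                      # discard the burn-in periods
y <- y[, keep]
x <- x[, keep]

tt <- 3:T                                          # periods with two lags available
yv <- as.vector(y[, tt])
X <- cbind(as.vector(y[, tt - 1]), as.vector(x[, tt]))
Z <- cbind(as.vector(y[, tt - 1] - y[, tt - 2]), as.vector(x[, tt]))

beta_iv  <- drop(solve(t(Z) %*% X, t(Z) %*% yv))   # (Z'X)^{-1} Z'y
beta_ols <- drop(solve(t(X) %*% X, t(X) %*% yv))   # inconsistent benchmark
print(rbind(iv = beta_iv, ols = beta_ols))
```

Under these assumptions the IV estimates should be near (0.5, 1), while the OLS estimate of $\gamma$ is pushed upward by the correlation between $y_{i,t-1}$ and $\alpha_i$.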

2.2.2 Panel GMM estimator


If $\alpha_i$ is a fixed effect, the dynamic model can be viewed as a fixed effects model with one lag of the dependent variable added. Model (2.5) is first-differenced to remove the individual effect:

$$(y_{it} - y_{i,t-1}) = \gamma (y_{i,t-1} - y_{i,t-2}) + (x_{it} - x_{i,t-1})'\beta + (\varepsilon_{it} - \varepsilon_{i,t-1}), \qquad t = 2,\dots,T \tag{2.6}$$

In this thesis the focus is on short panels, with few time periods and a large number of individuals. In a long panel with large T, the impact of one year's shock on the individual's fixed effect declines with time (see Roodman, 2006). In short panels, by contrast, the correlation of the lagged dependent variable with the error term is substantial: $(y_{i,t-1} - y_{i,t-2})$ is correlated with $(\varepsilon_{it} - \varepsilon_{i,t-1})$ in (2.6), so OLS estimation of (2.6) fails to provide consistent estimators.

Anderson and Hsiao (1981) proposed using $y_{i,t-2}$ or $(y_{i,t-2} - y_{i,t-3})$ as an instrument for $(y_{i,t-1} - y_{i,t-2})$. If successive observations are positively correlated, the instrumental variable estimator with $y_{i,t-2}$ as instrument is less efficient than that with $(y_{i,t-2} - y_{i,t-3})$ as instrument. This instrumental variable estimator is consistent, although it is not fully efficient. Holtz-Eakin et al. (1988) and Arellano and Bond (1991) proposed the generalized method of moments (GMM) to estimate model (2.6).

The basic idea of the GMM⁵ estimator is that population moment conditions can be replaced by sample moment conditions. Validity of the instruments implies that the moments of the errors with the instruments are equal to zero. By the analogy principle, the corresponding sample moment conditions are used to estimate the parameters, with one moment condition for each instrument. In the dynamic panel model, lags of the dependent variable are valid instruments. Because there are then more moment conditions than regressors, the model is over-identified. In the generalized method of moments, the parameters are estimated by minimizing the quadratic form

$$Q_N(\beta) = \Big[\sum_{i=1}^{N} Z_i'u_i\Big]' W_N \Big[\sum_{i=1}^{N} Z_i'u_i\Big]$$ ⁶

where $Z_i$ denotes a matrix of instruments, $W_N$ denotes a weighting matrix, and $u_i = y_i - X_i\beta$.
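The quadratic form and its closed-form minimiser in the linear case can be sketched as follows. The example uses stacked cross-section data, for which $\sum_i Z_i'u_i = Z'u$, with one endogenous regressor and three instruments. The weighting matrix $W_N = (Z'Z)^{-1}$ is an assumption made here for simplicity (it reproduces two-stage least squares) and is not necessarily the weighting used in the thesis. The function names are hypothetical.

```r
# GMM objective Q_N(b) = [Z'u]' W [Z'u] with u = y - X b, and its linear minimiser.
# gmm_objective and gmm_linear are illustrative helpers, not package functions.
gmm_objective <- function(b, y, X, Z, W) {
  g <- t(Z) %*% (y - X %*% b)          # stacked form of sum_i Z_i' u_i
  drop(t(g) %*% W %*% g)
}
gmm_linear <- function(y, X, Z, W) {
  A <- t(X) %*% Z %*% W
  drop(solve(A %*% t(Z) %*% X, A %*% t(Z) %*% y))
}

set.seed(5)
n <- 500
Z <- matrix(rnorm(n * 3), n, 3)                  # three instruments
v <- rnorm(n)                                    # source of endogeneity
X <- matrix(Z %*% c(1, 0.5, 0.5) + v, n, 1)      # one endogenous regressor
y <- drop(2 * X + v + rnorm(n))                  # true coefficient is 2
W <- solve(t(Z) %*% Z)                           # assumption: 2SLS-type weighting

b_hat <- gmm_linear(y, X, Z, W)
q_hat <- gmm_objective(b_hat, y, X, Z, W)
q_off <- gmm_objective(b_hat + 0.1, y, X, Z, W)  # objective away from the minimiser
print(c(estimate = b_hat, Q_at_estimate = q_hat, Q_perturbed = q_off))
```

With three moment conditions and one parameter the model is over-identified, so the objective is generally positive at the estimate but smaller there than at any perturbed value.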

Because regressors from other periods are used as instruments for the current-period regressors, panel generalized method of moments (PGMM) estimation can be more efficient.

Following Holtz-Eakin et al. (1988) and Arellano and Bond (1991), we estimate model (2.6) by PGMM, using $y_{i,t-2}$ and $y_{i,t-3}$ as instruments for $(y_{i,t-1} - y_{i,t-2})$ and $(x_{i,t} - x_{i,t-1})$ as an instrument for itself. This PGMM estimator is consistent in the dynamic model with fixed individual effects.

In the dynamic model (2.6), the PGMM estimator is

$$\hat\beta_{PGMM} = \Big[\Big(\sum_{i=1}^{N} \tilde X_i'Z_i\Big) W_N \Big(\sum_{i=1}^{N} Z_i'\tilde X_i\Big)\Big]^{-1} \Big(\sum_{i=1}^{N} \tilde X_i'Z_i\Big) W_N \Big(\sum_{i=1}^{N} Z_i'\tilde y_i\Big)$$ ⁷

where $\tilde X_i$ is a $(T-2)\times(K+1)$ matrix with $t$-th row $(\Delta y_{i,t-1}, \Delta x_{it}')$, $t = 3,\dots,T$, $\tilde y_i$ is a $(T-2)\times 1$ vector with $t$-th row $\Delta y_{it}$, and $Z_i$ is a $(T-2)\times r$ matrix of instruments

$$Z_i = \begin{bmatrix} z_{i3}' & 0 & \cdots & 0 \\ 0 & z_{i4}' & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & z_{iT}' \end{bmatrix}$$

where $z_{it}' = [\,y_{i,t-2},\ y_{i,t-3},\ \Delta x_{it}'\,]$ and $\Delta$ denotes the first difference, for example $\Delta y_{it} = y_{it} - y_{i,t-1}$.

⁵ The generalized method of moments was developed by Lars Peter Hansen. For an introduction to GMM, see Lars Peter Hansen (1982): Large Sample Properties of Generalized Method of Moments Estimators, Econometrica 50, 1029-1054.

⁶ See Cameron and Trivedi, 2005, p.745
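The estimator can be sketched directly from its matrix formula. The code below is a simplified illustration, not the `pgmm` routine of the `plm` package used in the appendix. It assumes a scalar regressor, a balanced panel stored as $N \times T$ matrices with periods $1,\dots,T$ (so the usable rows are $t = 4,\dots,T$), individual effects correlated with the regressor, true values $\gamma = 0.5$ and $\beta = 1$, and the simple weighting matrix $W_N = (\sum_i Z_i'Z_i)^{-1}$. The function name `pgmm_sketch` is hypothetical.

```r
# Linear panel GMM for the first-differenced model with block-diagonal instruments
# z_it' = [y_i,t-2, y_i,t-3, Delta x_it]. Assumed weighting: W = (sum_i Z_i'Z_i)^{-1}.
pgmm_sketch <- function(y, x) {
  N <- nrow(y)
  periods <- 4:ncol(y)
  r <- 3 * length(periods)                 # three instruments per period
  ZX <- matrix(0, r, 2)
  Zy <- matrix(0, r, 1)
  ZZ <- matrix(0, r, r)
  for (i in 1:N) {
    dy  <- y[i, periods] - y[i, periods - 1]        # Delta y_it
    dy1 <- y[i, periods - 1] - y[i, periods - 2]    # Delta y_i,t-1
    dx  <- x[i, periods] - x[i, periods - 1]        # Delta x_it
    Xi <- cbind(dy1, dx)
    Zi <- matrix(0, length(periods), r)
    for (s in seq_along(periods)) {
      tt <- periods[s]
      Zi[s, (3 * s - 2):(3 * s)] <- c(y[i, tt - 2], y[i, tt - 3], dx[s])
    }
    ZX <- ZX + t(Zi) %*% Xi
    Zy <- Zy + t(Zi) %*% dy
    ZZ <- ZZ + t(Zi) %*% Zi
  }
  W <- solve(ZZ)
  A <- t(ZX) %*% W
  est <- drop(solve(A %*% ZX, A %*% Zy))
  names(est) <- c("gamma", "beta")
  est
}

# Simulated panel with fixed effects (alpha_i correlated with x_it through mu_i)
set.seed(6)
N <- 2000; T <- 7; burn <- 50
gamma <- 0.5; beta <- 1
mu <- rnorm(N)
x <- matrix(rnorm(N * (burn + T), mean = mu), N, burn + T)   # row i has mean mu[i]
alpha <- mu + rnorm(N)
y <- matrix(0, N, burn + T)
y[, 1] <- (alpha + beta * mu) / (1 - gamma)
for (s in 2:(burn + T)) {
  y[, s] <- gamma * y[, s - 1] + beta * x[, s] + alpha + rnorm(N)
}
keep <- (burn + 1):(burn + T)
est <- pgmm_sketch(y[, keep], x[, keep])
print(est)
```

Because the model is estimated in first differences, the individual effects drop out, and the estimates should be near the true values even though the effects are correlated with the regressor.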

3. Hausman test

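The Hausman tests are based on comparisons between two different estimators.

‘… Consider two estimators $\hat\theta$ and $\tilde\theta$. We consider the testing situation where

$$H_0: \operatorname{plim}(\hat\theta - \tilde\theta) = 0, \qquad H_1: \operatorname{plim}(\hat\theta - \tilde\theta) \neq 0$$

Assume the difference between the two root-N consistent estimators is also root-N consistent under $H_0$ with mean 0 and a limit normal distribution, so that

$$\sqrt{N}(\hat\theta - \tilde\theta) \xrightarrow{d} \mathcal{N}[0, V_H]$$

Where $V_H$ denotes the variance matrix in the limiting distribution. Then the Hausman test statistic

$$H = (\hat\theta - \tilde\theta)'(N^{-1}\hat V_H)^{-1}(\hat\theta - \tilde\theta)$$

is asymptotically $\chi^2(q)$ distributed under $H_0$. We reject $H_0$ at level $\alpha$ if $H > \chi^2_\alpha(q)$.’ (Cameron and Trivedi, 2005, p.271~272)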
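As a small illustration of the general statistic, the sketch below computes the quadratic form in the difference of two estimates and the corresponding chi-square p-value. The function name `hausman_stat` is hypothetical, the argument `V_diff` stands for an estimate of the variance matrix of the difference between the two estimators, and the numbers are made up purely to show the arithmetic.

```r
# Hausman statistic H = d' V^{-1} d with d = theta_hat - theta_tilde,
# compared with a chi-square distribution with q = length(d) degrees of freedom.
# hausman_stat is an illustrative helper; the inputs below are invented numbers.
hausman_stat <- function(theta_hat, theta_tilde, V_diff) {
  d <- theta_hat - theta_tilde
  H <- drop(t(d) %*% solve(V_diff) %*% d)
  q <- length(d)
  list(statistic = H, df = q,
       p.value = pchisq(H, df = q, lower.tail = FALSE))
}

res <- hausman_stat(theta_hat   = c(0.70, 1.00),
                    theta_tilde = c(0.60, 1.05),
                    V_diff      = diag(c(0.01, 0.0025)))
print(res)
```

In this made-up case each component of the difference equals one standard deviation, so the statistic is 2 with 2 degrees of freedom and the null hypothesis would not be rejected at conventional levels.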

The Hausman test is a classical test of whether model (2.1) has fixed effects. In the individual-specific effects model the test is based on the comparison between the random effects estimator and the within estimator. If the difference between these two estimators is statistically significant, we conclude that model (2.1) has fixed effects.

⁷ See Cameron and Trivedi, 2005, p.765

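The situation is similar in dynamic models, but the comparison is between the panel GMM estimator and the instrumental variables estimator.

In the individual-specific effects model described in Section 2, the random effects estimator is fully efficient under the null hypothesis. The Hausman test statistic then simplifies to

$$H = (\tilde\beta_{RE} - \hat\beta_{W})'\big[\hat V[\hat\beta_{W}] - \hat V[\tilde\beta_{RE}]\big]^{-1}(\tilde\beta_{RE} - \hat\beta_{W})$$ ⁸

where $\tilde\beta_{RE}$ is the random effects estimator and $\hat\beta_{W}$ is the within estimator. This statistic is asymptotically $\chi^2(q)$ distributed under the null hypothesis.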

In the dynamic model, neither the panel GMM estimator nor the instrumental variables estimator is fully efficient. The simplified form of $V_H$ cannot be used and must be replaced by the general form, so a consistent estimate of $V_H$ is needed. Under the assumption that the observations are independent over $i$, this variance matrix can be consistently estimated by the bootstrap method.
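A panel bootstrap resamples whole individuals, with all of their time periods, so that the dependence within an individual is preserved. The sketch below is a generic base R version under stated assumptions: the function name `panel_boot_var` and its arguments are hypothetical, and the example uses the difference between a pooled OLS slope and a within slope as a simple stand-in for the difference between the IV and PGMM estimators of the thesis.

```r
# Panel bootstrap estimate of the variance of a difference between two estimators.
# panel_boot_var is an illustrative helper: it resamples individuals with replacement,
# relabels them so that repeated draws count as distinct individuals, and applies
# estimator_diff (a user-supplied function returning the difference) to each sample.
panel_boot_var <- function(data, id, estimator_diff, R = 200) {
  blocks <- split(data, data[[id]])
  draws <- replicate(R, {
    picked <- sample(names(blocks), length(blocks), replace = TRUE)
    boot <- do.call(rbind, lapply(seq_along(picked), function(j) {
      block <- blocks[[picked[j]]]
      block[[id]] <- j
      block
    }))
    estimator_diff(boot)
  })
  draws <- matrix(draws, ncol = R)     # one column per bootstrap replicate
  var(t(draws))
}

# Stand-in example: pooled OLS slope minus within slope
diff_pooled_within <- function(d) {
  b_pooled <- coef(lm(y ~ x, data = d))[["x"]]
  xd <- d$x - ave(d$x, d$id)
  yd <- d$y - ave(d$y, d$id)
  b_pooled - sum(xd * yd) / sum(xd^2)
}

set.seed(7)
N <- 100; T <- 7
panel <- data.frame(id = rep(1:N, each = T), x = rnorm(N * T))
panel$y <- rnorm(N)[panel$id] + panel$x + rnorm(N * T)
V_boot <- panel_boot_var(panel, "id", diff_pooled_within, R = 200)
print(V_boot)
```

The returned matrix has one row and column per element of the difference, so it can be inverted directly in the quadratic form of the test statistic.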

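A panel-robust Hausman test statistic is

$$H = (\hat\beta_{PGMM} - \hat\beta_{IV})'\big[\hat V_{Boot}[\hat\beta_{PGMM} - \hat\beta_{IV}]\big]^{-1}(\hat\beta_{PGMM} - \hat\beta_{IV})$$ ⁹

where $\hat V_{Boot}[\cdot]$ denotes the panel bootstrap estimate of the variance matrix of the difference between the two estimators.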

Similarly, if the difference between the PGMM estimator and the instrumental variables estimator is statistically significant, we conclude that fixed effects are present in the dynamic model.

The Hausman tests in the individual-specific effects model and in the dynamic model are evaluated and compared through their probabilities of making errors.¹⁰ A test can make two types of error. If fixed effects are not present but the Hausman test incorrectly rejects the null hypothesis, a Type I error occurs. If fixed effects are present but the Hausman test fails to reject the null hypothesis, a Type II error occurs. The Hausman test with the larger probability of error is regarded as less efficient than the other.

4. Monte Carlo Study

Small-sample properties of the Hausman test statistic can be obtained by a Monte Carlo study. We first study the Hausman test in the individual-specific effects model and then in the dynamic model. Finally, we compare the small-sample properties in the static and dynamic models.

The hypotheses of the Hausman test are

$H_0$: the individual-specific effects are uncorrelated with the regressors

$H_1$: fixed effects are present

⁸ See Cameron and Trivedi, 2005, p.718

⁹ See Cameron and Trivedi, 2005, p.718 for the panel-robust Hausman test statistic

¹⁰ George Casella, Roger L. Berger, Statistical Inference, 2nd Edition, p.382, 383

4.1 The Hausman test in the individual-specific effects model

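Under the null hypothesis the data $(y_{it}, x_{it})$ are generated according to the random effects model:

$$y_{it} = \alpha_i + x_{it}\beta + \varepsilon_{it}, \qquad i = 1,\dots,N,\quad t = 1,\dots,T$$

where $\alpha_i \sim N(\alpha, \sigma_\alpha^2)$, $\varepsilon_{it} \sim N(0, \sigma_\varepsilon^2)$, $x_{it} \sim N(\mu_i, \sigma_x^2)$ and $\mu_i \sim N(\mu, \sigma_\mu^2)$. We set $\beta = 1$, $\sigma_\alpha^2 = 1$, $\sigma_\varepsilon^2 = 1$, $\sigma_x^2 = 1$, $\sigma_\mu^2 = 1$, $\alpha = 0$, $\mu = 0$, $N = 100$ and $T = 7$.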

Figure 1 shows that there is time-series correlation in $y_{it}$, while Figure 2 shows that there is no time-series correlation in the individual-specific deviations of $y_{it}$. Thus the time-series correlation in $y_{it}$ comes from the individual effect.

Figure 1: Autocorrelation function of $y_{it}$ in the individual-specific effects model

Figure 2: Autocorrelation function of the individual-specific deviations of $y_{it}$ from its time-averaged values in the individual-specific effects model

We fit the individual-specific effects model to the data and obtain the random effects estimator and the fixed effects estimator of $\beta$. We then compute the Hausman test statistic under the null hypothesis.

Figure 3 gives the density of 500 computed values of the Hausman test statistic. The statistic is expected to follow a chi-square distribution with 1 degree of freedom.

The true size of the test is the proportion of the 500 replications in which $H > \chi^2_\alpha(1)$. Size-corrected critical values are also obtained in the Monte Carlo study. For example, the upper 5th percentile of the 500 simulated values of the Hausman test statistic is the size-corrected critical value for the 0.05 significance level.
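The two summary quantities described above, the actual size and the size-corrected critical value, can be computed from any vector of simulated statistics. In the sketch below the vector `H_sim` is a stand-in drawn directly from a chi-square distribution with 1 degree of freedom, an assumption made only so that the example is self-contained; in the study it would hold the simulated Hausman statistics.

```r
# Actual size and size-corrected critical values from simulated test statistics.
# H_sim is a stand-in: 500 draws from chi-square(1) instead of simulated Hausman statistics.
set.seed(8)
H_sim <- rchisq(500, df = 1)

nominal <- c(0.01, 0.025, 0.05, 0.1)
asymptotic_cv <- qchisq(1 - nominal, df = 1)     # chi-square critical values

# Actual size: share of simulated statistics above the asymptotic critical value
actual_size <- sapply(asymptotic_cv, function(cv) mean(H_sim > cv))

# Size-corrected critical value: upper percentile of the simulated statistics
size_corrected_cv <- quantile(H_sim, probs = 1 - nominal, names = FALSE)

result <- data.frame(nominal, asymptotic_cv, actual_size, size_corrected_cv)
print(result)
```

The power under the alternative is obtained in the same way as the actual size, with the statistics simulated from the alternative data-generating process instead.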

To study the power of the Hausman test under the alternative hypothesis, we assume that the individual effects depend on the mean of the regressor $x_{it}$, which is one particular specification among the possible models under the alternative. In the data-generating process the individual effects are generated as

$$\alpha_i = \frac{\sum_t x_{it}}{T} + \eta_i, \qquad \eta_i \sim N(0, \sigma_\eta^2)$$

A new set of 500 replications is then produced. The power is the proportion of these 500 replications in which $H > \chi^2_\alpha(1)$.

The simulation results are presented in Table 1.

Figure 3: Density of 500 computed values of the Hausman test statistic in the individual-specific effects model

Table 1. Hausman test size and power in the individual-specific effects model for 500 replications.

Nominal size    Actual size    Actual power    Size-corrected critical value
0.01            0.018          1               8.617997
0.025           0.044          1               5.876021
0.05            0.072          1               4.905718
0.1             0.118          1               3.10286

When the number of replications is increased to 1000, the simulation results are as presented in Table 2.

Table 2. Hausman test size and power in the individual-specific effects model for 1000 replications.

Nominal size    Actual size    Actual power    Size-corrected critical value
0.01            0.01           1               6.615710
0.025           0.028          1               5.080407
0.05            0.052          1               3.885145
0.1             0.092          1               2.616923

The asymptotic approximation works quite well even though the number of replications is not very large. The actual size of the test in the individual-specific effects model is close to the nominal size and in most cases slightly larger. The power equals 1 at every nominal size, which means that the Hausman test is powerful in detecting the correlation between the mean of the regressor and the dependent variable in the individual-specific effects model.

4.2 The Hausman test in dynamic model

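Under the null hypothesis the data $(y_{it}, x_{it})$ are generated according to the dynamic model with random effects:

$$y_{it} = \gamma y_{i,t-1} + x_{it}\beta + \alpha_i + \varepsilon_{it}, \qquad i = 1,\dots,N,\quad t = 1,\dots,T$$

where $\alpha_i \sim N(\alpha, \sigma_\alpha^2)$, $\varepsilon_{it} \sim N(0, \sigma_\varepsilon^2)$, $x_{it} \sim N(\mu_i, \sigma_x^2)$, $\mu_i \sim N(\mu, \sigma_\mu^2)$, $\gamma = 0.7$ and $y_{i1} = \mu_i$. We set $\beta = 1$, $\sigma_\alpha^2 = 1$, $\sigma_\varepsilon^2 = 1$, $\sigma_x^2 = 1$, $\sigma_\mu^2 = 1$, $\alpha = 0$, $\mu = 0$, $N = 100$ and $T = 7$.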

Figure 4 shows that there is time-series correlation in $y_{it}$, and Figure 5 shows that time-series correlation remains in the individual-specific deviations of $y_{it}$. Thus the time-series correlation in $y_{it}$ is induced by both the past value and the individual effect.

We fit the dynamic model to the data and obtain the instrumental variables estimators and the panel GMM estimators of $\gamma$ and $\beta$. Because neither of these two estimators is fully efficient in dynamic models, we cannot use the simplified form of the covariance matrix. We generate 500 bootstrap replicates to estimate the variance matrix consistently, and then compute the panel-robust Hausman test statistic under the null hypothesis.

Figure 6 gives the density of 500 computed values of the Hausman test statistic. The statistic is expected to follow a chi-square distribution with 2 degrees of freedom.

Figure 4: Autocorrelation function of $y_{it}$ in the dynamic model

Figure 5: Autocorrelation function of the individual-specific deviations of $y_{it}$ from its time-averaged values in the dynamic model

The true size of the test is the proportion of the 500 replications in which $H > \chi^2_\alpha(2)$. Size-corrected critical values are also obtained in the Monte Carlo study. For example, the upper 5th percentile of the 500 simulated values of the Hausman test statistic is the size-corrected critical value for the 0.05 significance level.

To study the power of the Hausman test under the alternative hypothesis, we assume that the individual effects depend on the mean of the regressor $x_{it}$, which is one particular specification among the possible models under the alternative. In the data-generating process the individual effects are generated as

$$\alpha_i = \frac{\sum_t x_{it}}{T} + \eta_i, \qquad \eta_i \sim N(0, \sigma_\eta^2)$$

and $y_{i1} = 2\mu_i$. A new set of 500 replications is then produced. The power is the proportion of these 500 replications in which $H > \chi^2_\alpha(2)$.

Figure 6: Density of 500 computed values of the Hausman test statistic in the dynamic model

The simulation results are presented in Table 3.

Table 3. Hausman test size and power in the dynamic model for 500 replications.

Nominal size    Actual size    Actual power    Size-corrected critical value
0.01            0.008          1               8.677593
0.025           0.024          1               7.297520
0.05            0.064          1               6.337962

When the number of replications is increased to 1000, the simulation results are as presented in Table 4.

Table 4. Hausman test size and power in the dynamic model for 1000 replications.

Nominal size    Actual size    Actual power    Size-corrected critical value
0.01            0.129          1               23.11916
0.025           0.195          1               17.69622
0.05            0.254          1               14.64198

The asymptotic approximation does not work well in the dynamic model. With 1000 replications the actual size of the test is far larger than the nominal size (0.254 at the nominal 0.05 level), although with 500 replications it is close to the nominal size. The power equals 1 at every nominal size, which means that the Hausman test is powerful in detecting the correlation between the mean of the regressor and the dependent variable in the dynamic model.

Compared with the Hausman test in the individual-specific effects model, the Type I error is larger in the dynamic model. The Hausman test in the dynamic model has a larger probability of making errors; thus the test in the dynamic model is less efficient.

5. Empirical example

The following example is from Arellano and Bond (1991). The dataset “Employment and Wage in England” is a panel of 140 firms in the United Kingdom observed from 1976 to 1984, with 1031 observations in total. It is an unbalanced panel with n = 140 and t = 7 to 9.

Figures 7 and 8 show that time-series correlation remains in the individual-specific deviations of employment from its time-averaged values. We conclude that the time-series correlation in employment is due not only to the individual-specific tendency but also to past employment. We can therefore fit a dynamic model to the data.

The dynamic model is

$$\log(emp_{it}) = \gamma \log(emp_{i,t-1}) + \beta_1 \log(wage_{it}) + \beta_2 \log(capital_{it}) + \beta_3 \log(output_{it}) + \alpha_i + \varepsilon_{it}, \qquad i = 1,\dots,N,\quad t = 1,\dots,T \tag{5.1}$$

Our aim is to determine whether fixed effects are present in the dynamic model. We apply the Hausman test for the dynamic model described in Section 3.

The Hausman test statistic is 13.23045. The critical value of the chi-square distribution with 4 degrees of freedom at the 0.05 significance level is 9.49. We therefore reject the null hypothesis and conclude that fixed effects are present in the dynamic model (5.1).
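The comparison with the critical value can be reproduced with the chi-square functions in base R. The sketch below takes the reported statistic of 13.23045 and 4 degrees of freedom as given; the variable names are illustrative.

```r
# Decision for the empirical Hausman test: reported statistic 13.23045, df = 4
H_obs <- 13.23045
df <- 4
crit_05 <- qchisq(0.95, df)                        # 5% critical value
p_value <- pchisq(H_obs, df, lower.tail = FALSE)   # asymptotic p-value
reject <- H_obs > crit_05
print(c(critical_value = crit_05, p_value = p_value, reject = reject))
```

The asymptotic p-value is roughly 0.01, so the rejection at the 0.05 level is not a borderline case, subject to the size distortion of the test in the dynamic model found in Section 4.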

Figure 7: Autocorrelation function of employment

Figure 8: Autocorrelation function of the individual-specific deviations of employment from its time-averaged values

6. Conclusion

I propose a Hausman test for the dynamic model. It is based on a comparison of the PGMM estimator with an instrumental variables estimator, which is consistent only under the null hypothesis. A Monte Carlo simulation is presented to compare the finite-sample properties of the Hausman test in the dynamic model with those in the individual-specific effects model. I find that the Hausman tests in both the individual-specific effects model and the dynamic model are powerful in detecting the correlation between the mean of the regressor and the dependent variable. The Hausman test is less efficient in the dynamic model than in the individual-specific effects model.

References

[1] Cameron, A. C. and Trivedi, P. K. (2005). Microeconometrics: Methods and Applications. Cambridge University Press, New York.

[2] Arellano, M. and Bond, S. (1991). "Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations", The Review of Economic Studies, 58(2), 277–297.

[3] Anderson, T. W. and Hsiao, C. (1981). "Estimation of Dynamic Models with Error Components", Journal of the American Statistical Association, 76, 598–606.

[4] Holtz-Eakin, D., Newey, W. and Rosen, H. S. (1988). "Estimating Vector Autoregressions with Panel Data", Econometrica, 56(6), 1371–1395.

[5] Casella, G. and Berger, R. L. Statistical Inference, 2nd Edition, pp. 382–383.

[6] Roodman, D. (2006). "How to Do xtabond2: An Introduction to "Difference" and "System" GMM in Stata", Working Paper 103, Center for Global Development.

[7] Hausman, J. A. (1978). "Specification Tests in Econometrics", Econometrica, 46(6), 1251–1271.

[8] Hansen, L. P. (1982). "Large Sample Properties of Generalized Method of Moments Estimators", Econometrica, 50, 1029–1054.

Appendix

R code for the Monte Carlo study of the dynamic model:

library(plm)
library(sem)
library(boot)

# Simulate one panel from the dynamic model with gamma = p and beta = 1.
# fixed = FALSE: random individual effects (null hypothesis).
# fixed = TRUE: individual effects depend on the mean of x (alternative hypothesis).
simulate_panel <- function(N = 100, t = 7, p = 0.7, fixed = FALSE) {
  mu <- rnorm(N, 5, 1)
  T <- c(1:t)
  paneldata <- c()
  for (i in 1:N) {
    x <- rnorm(t, mu[i], 1)
    Epsilon <- rnorm(t, 0, 1)
    Eta <- rnorm(1, 0, 1)
    ideffect <- if (fixed) mean(x) + Eta else Eta
    y <- numeric(t)
    y[1] <- mu[i]
    for (j in 2:t) y[j] <- p * y[j - 1] + ideffect + x[j] + Epsilon[j]
    ID <- rep(i, t)
    paneldata <- rbind(paneldata, data.frame(cbind(ID, T, y, x)))
  }
  paneldata
}

# IV and two-step PGMM estimates of (gamma, beta) for one data set
estimate_both <- function(paneldata, index) {
  # plm is used here only to build the model frame that contains the lags of y
  frame <- plm(y ~ lag(y, 1) + lag(y, 2) + x - 1, data = paneldata,
               effect = "individual", index = index)$model
  ytry  <- frame[, 1]
  y_1tr <- frame[, 2]
  xtry  <- frame[, 4]
  ivtry <- y_1tr - frame[, 3]          # instrument: y(t-1) - y(t-2)
  iv <- tsls(ytry ~ y_1tr + xtry - 1, ~ xtry + ivtry)
  gmm <- pgmm(y ~ lag(y, 1) + x | lag(y, 2:3), data = paneldata,
              effect = "twoways", model = "twosteps")
  list(coe = as.numeric(iv$coefficients),
       coeff = as.numeric(gmm$coefficients[[2]][1:2]))
}

# Panel bootstrap: resample individuals with replacement and return the
# variance matrix of the difference between the IV and PGMM estimates
boot_variance <- function(paneldata, r, N = 100, t = 7) {
  d <- matrix(NA, r, 2)
  for (b in 1:r) {
    draw <- sample(1:N, replace = TRUE)
    databoot <- data.frame()
    for (j in 1:length(draw)) {
      rows <- (1 + (draw[j] - 1) * t):(t + (draw[j] - 1) * t)
      id <- rep(j, t)
      databoot <- rbind(databoot, as.data.frame(cbind(id, paneldata[rows, 2:4])))
    }
    est <- estimate_both(databoot, c("id", "T"))
    d[b, ] <- est$coe - est$coeff
  }
  var(d)
}

# Panel-robust Hausman test statistic for one simulated data set.
# The estimates from the original sample are kept separate from the bootstrap estimates.
hausman_dynamic <- function(paneldata, r) {
  est <- estimate_both(paneldata, c("ID", "T"))
  V <- boot_variance(paneldata, r)
  diff <- est$coeff - est$coe
  as.numeric(t(diff) %*% solve(V) %*% diff)
}

# Number of statistics that exceed a critical value
count <- function(x, value) sum(abs(x) > value)

############### actual size: data generated under the null hypothesis
set.seed(127)
H <- numeric(500)
for (k in 1:500) {
  autodatarandom <- simulate_panel(fixed = FALSE)
  H[k] <- hausman_dynamic(autodatarandom, r = 500)
}
hist(H, freq = FALSE, breaks = 500, xlim = c(0, 30),
     main = "Monte Carlo Simulations of Hausman Test in dynamic models (autocorrelated data, 500 simulations)",
     xlab = "Hausman Test Statistic", ylab = "Density")
curve(dchisq(x, df = 2), col = 2, lty = 2, lwd = 2, add = TRUE)
size1 <- count(H, 5.99) / 500    # nominal size 0.05
size2 <- count(H, 7.38) / 500    # nominal size 0.025
size3 <- count(H, 9.21) / 500    # nominal size 0.01

############### power: data generated under the alternative hypothesis
set.seed(127)
H <- numeric(500)
for (k in 1:500) {
  autodatafixed <- simulate_panel(fixed = TRUE)
  H[k] <- hausman_dynamic(autodatafixed, r = 250)
}
hist(H, freq = FALSE, breaks = 10000, xlim = c(0, 50),
     main = "Monte Carlo Simulations of Hausman Test in dynamic models (autocorrelated data, 500 simulations)",
     xlab = "Hausman Test Statistic", ylab = "Density")
power1 <- count(H, 5.99) / 500   # nominal size 0.05
power2 <- count(H, 7.38) / 500   # nominal size 0.025
power3 <- count(H, 9.21) / 500   # nominal size 0.01

R code for the Monte Carlo study of the individual-specific effects model

library(plm)

# Simulate one panel from the individual-specific effects model with beta = 1.
# fixed = FALSE: random effects; fixed = TRUE: effects depend on the mean of x.
simulate_static <- function(N = 100, t = 7, fixed = FALSE) {
  mu <- rnorm(N, 0, 1)
  T <- c(1:t)
  paneldata <- c()
  for (i in 1:N) {
    x <- rnorm(t, mu[i], 1)
    Epsilon <- rnorm(t, 0, 1)
    Eta <- rnorm(1, 0, 1)
    ideffect <- if (fixed) mean(x) + Eta else Eta
    y <- ideffect + x + Epsilon
    ID <- rep(i, t)
    paneldata <- rbind(paneldata, data.frame(cbind(ID, T, y, x)))
  }
  paneldata
}

# Hausman test statistic comparing the random effects and within estimators
hausman_static <- function(paneldata) {
  randomra <- plm(y ~ x - 1, data = paneldata, effect = "individual",
                  model = "random", index = c("ID", "T"))
  randomfd <- plm(y ~ x - 1, data = paneldata, effect = "individual",
                  model = "within", index = c("ID", "T"))
  phtest(randomra, randomfd)$statistic
}

# Number of statistics that exceed a critical value
count <- function(x, value) sum(abs(x) > value)

############### data generated under the null hypothesis (random effects)
hausman <- numeric(500)
for (k in 1:500) {
  datarandom <- simulate_static(fixed = FALSE)
  hausman[k] <- hausman_static(datarandom)
}
hist(hausman, freq = FALSE, breaks = 200, xlim = c(0, 10),
     main = "Monte Carlo Simulations of Hausman Test (500 simulations)",
     xlab = "Hausman Test Statistic", ylab = "Density")
curve(dchisq(x, df = 1), col = 2, lty = 2, lwd = 2, add = TRUE)

############### power: data generated under the alternative hypothesis (fixed effects)
hausman <- numeric(500)
for (k in 1:500) {
  datafd <- simulate_static(fixed = TRUE)
  hausman[k] <- hausman_static(datafd)
}
power1 <- count(hausman, 3.84) / 500     # nominal size 0.05
power2 <- count(hausman, 5.02) / 500     # nominal size 0.025
power3 <- count(hausman, 6.63) / 500     # nominal size 0.01
power4 <- count(hausman, 2.7055) / 500   # nominal size 0.1

R code for the empirical example

library(plm)
library(sem)
data("EmplUK", package = "plm")

# IV and two-step PGMM estimates of the four coefficients of model (5.1)
estimate_both <- function(paneldata) {
  # plm is used here only to build the model frame that contains the lags of log(emp)
  frame <- plm(log(emp) ~ lag(log(emp), 1) + lag(log(emp), 2) + log(wage) +
                 log(capital) + log(output) - 1,
               data = paneldata, effect = "individual", index = c("firm", "year"))$model
  Nemp     <- frame[, 1]
  Nemp_1   <- frame[, 2]
  Nwage    <- frame[, 4]
  Ncapital <- frame[, 5]
  Noutput  <- frame[, 6]
  iv <- Nemp_1 - frame[, 3]            # instrument: log(emp)(t-1) - log(emp)(t-2)
  fit <- tsls(Nemp ~ Nemp_1 + Nwage + Ncapital + Noutput - 1,
              ~ Nwage + Ncapital + Noutput + iv)
  gmm <- pgmm(log(emp) ~ lag(log(emp), 1) + log(wage) + log(capital) + log(output) |
                lag(log(emp), 2:3),
              data = paneldata, effect = "twoways", model = "twosteps",
              transformation = c("d"))
  list(coe = as.numeric(fit$coefficients),
       coeff = as.numeric(gmm$coefficients[[2]][1:4]))
}

# Estimates from the original sample
est <- estimate_both(EmplUK)

############### panel bootstrap: resample the 140 firms with replacement
set.seed(127)
r <- 500
d <- matrix(NA, r, 4)
for (b in 1:r) {
  draw <- sample(1:140, replace = TRUE)
  databoot <- data.frame()
  for (j in 1:length(draw)) {
    # all years of the sampled firm, relabelled with the new firm id j
    databoot1 <- EmplUK[EmplUK$firm == draw[j], 2:7]
    firm <- rep(j, nrow(databoot1))
    databoot1 <- as.data.frame(cbind(firm, databoot1))
    databoot <- rbind(databoot1, databoot)
  }
  bootest <- estimate_both(databoot)
  d[b, ] <- bootest$coe - bootest$coeff
}
V <- var(d)   # panel bootstrap estimate of the variance matrix

# The panel-robust Hausman test statistic
diff <- est$coeff - est$coe
H <- t(diff) %*% solve(V) %*% diff
