Panel Vector Autoregression under Cross-Sectional Dependence*
Xiao Huang
October 2004
Abstract

This paper studies the fully modified (FM) estimation of panel vector autoregression (VAR) under cross-sectional dependence when the time dimension of the panel is large. The time series properties of the model variables are allowed to be an unknown mixture of stationary and unit root processes with possible cointegrating relations. The common shocks are modelled with a factor structure. We extend the factor analysis of Bai and Ng (2002) and Bai (2003) to vector processes and give the asymptotic distribution of the estimated factors and factor loadings. FM estimation is used to obtain the estimates of the parameters in the panel VAR. We use simulation to study the performance of the factor analysis and compare the instrumental variable (IV) estimator, the FM-VAR estimator, the continuously updated estimator of Bai and Kao (2004), and the factor-augmented estimator. We find that the factor-augmented method gives better finite sample properties than the other three methods when the signal from the common shock is strong.
Key Words: cross-sectional dependence, factor analysis, nonstationary panel data, VAR
JEL Classification: C13, C23, C33

*The author thanks Aman Ullah for stimulating discussions and gratefully acknowledges financial support from the Department of Economics, U.C. Riverside. All errors are mine. Correspondence address: Department of Economics, University of California, Riverside, Riverside, CA 92521-0427. Phone: (951) 907-8434. Fax: (951) 827-5685. Email: [email protected]
1 Introduction
There has been growing interest in studying cross-sectional dependence in panel data analysis. Cross-sectional dependence is common in both macro- and microeconomic data. Oil shocks and global financial crises are two typical sources of cross-sectional dependence in a macroeconomic data set containing time series from different countries. The political stability of particular countries can also have a global effect on other countries' economies. Even within a country, the performance of different firms in different industries is likely to be affected by macroeconomic factors such as tax policy, monetary policy, and business cycles. For data at the microeconomic level within a region, cross-sectional dependence is also not rare: not only does the general state of the economy affect household behavior, but preferences and fashions can have an effect on every household as well. Spatial dependence across different regions is another example.
The interest in relaxing the assumption of cross-sectional independence has already brought several important papers into the literature. Roughly speaking, there are two strands of interest. The first focuses on parameter estimation in the conditional mean under cross-sectional dependence. Pesaran (2002) suggested an estimation method based on augmented regressions for both static and dynamic panels. Phillips and Sul (2003a) proposed a median unbiased estimation procedure for estimation, testing, and confidence interval construction in dynamic panels; factor loadings are also estimated in Phillips and Sul (2003a). The analytic bias in dynamic panel estimation that ignores cross-sectional dependence is studied in Phillips and Sul (2003b). Andrews (2003) studied the instrumental variable (IV) and least squares (LS) estimators for cross-section data under cross-sectional dependence of any form. Bai and Kao (2004) recently studied panel cointegration with a factor structure in the errors, where they obtained the limiting distribution of the fully modified (FM) estimators and the loading coefficients. They also proposed a continuously-updated fully modified (CUP-FM) estimator, which has better small sample properties than the two-step FM and OLS estimators. Panel unit root tests under cross-sectional dependence are studied by Moon and Perron (2004), Pesaran (2003), Im and Pesaran (2003), and others. For panel VAR, Mutl (2002) studied the maximum likelihood estimator under spatial correlation in short panels; Pesaran et al. (2004) build a global VAR (GVAR) model where the focus is on estimation in the conditional mean. Pesaran (2004) also develops a test for cross-sectional dependence in panel data.
The other strand of interest relates to factor analysis with panel data, which assumes that the variation in a large number of economic variables can be modeled by a small number of reference variables, even though those reference variables may not have any economic interpretation. Stock and Watson (1998, 2002a,b) provided asymptotic results in the context of diffusion index forecasting. Forni et al. (2000) and Forni and Lippi (2001) gave general results for dynamic factor models. Bai and Ng (2002) gave criteria to select the number of factors under heteroskedasticity and some weak dependence between the factors and the errors. Bai (2003a) developed an inferential theory for factor models of large dimensions. Bai and Ng (2003) developed tests that can distinguish whether nonstationarity in the data comes from the common components or from the idiosyncratic source. Bai (2003b) studied large-dimension factor models with nonstationary dynamic factors. Ng (2004) proposed a cross-sectional dependence test that can find the number of units that are correlated, detect heterogeneity in the correlations, and evaluate the magnitude of the correlations. For panel VAR, Canova and Ciccarelli (2002) used a Bayesian method to integrate panel VAR and index models for forecasting purposes.
In this paper, we study cross-sectional dependence in nonstationary panel VAR via factor analysis. Our interest is the estimation of both the regression coefficients and the cross-sectional shocks when the time dimension of the panel is large. We allow parameter heterogeneity across different units. The basic procedure is to obtain a first-stage consistent estimator for each cross-section unit i, ignoring the cross-sectional dependence. Combining the first-stage residuals from different units, we apply factor analysis to determine the number of shocks and to estimate those shocks. We then reestimate the regression coefficients in a second stage, based on the estimated factors and residuals, using the fully modified estimation of Phillips (1995) and factor-augmented regression. It is noted that the theory of factor analysis for both N and T large developed so far is for scalar processes; see Bai and Ng (2002) and Bai (2003). We thus extend the method to vector processes in this paper.

The rest of the paper is organized as follows. Section 2 extends the method of factor analysis to vector processes and gives the asymptotic results. Section 3 gives the fully modified estimator of the VAR with cross-sectional dependence. Section 4 provides the simulation results. Section 5 concludes.
2 Factor Model
Consider the following qth order panel VAR model,

    y_it = J_i(L) y_{i,t-1} + x_it,                                  (2.1)
    x_it = Λ_i f_t + e_it,   i = 1, ..., N,  t = 1, ..., T,          (2.2)

where y_it, x_it and e_it are m×1 vectors, Λ_i is an m×r matrix of factor loadings, f_t is an r×1 vector of cross-sectional shocks, and J_i(L) = Σ_{h=1}^{q} J_ih L^{h-1}. Define J*_i(L) = Σ_{h=1}^{q-1} J*_ih L^{h-1} with J*_ih = −Σ_{g=h+1}^{q} J_ig, and A_i = J_i(1); then (2.1) can be rewritten as

    y_it = J*_i(L) Δy_{i,t-1} + A_i y_{i,t-1} + Λ_i f_t + e_it.      (2.3)
If f_t and e_it are assumed to be i.i.d. or to have a limited time dependence structure, say MA(1), first-stage consistent estimators of J*_i(L) and A_i can be obtained via the instrumental variable method, provided T is sufficiently large. Let x̂_it = y_it − Ĵ*_i(L) Δy_{i,t-1} − Â_i y_{i,t-1}, which can be treated as observed data to recover the unobservable factor structure Λ_i f_t using factor analysis. It is well known that if x_it is a scalar, the estimated factors are √T times the eigenvectors corresponding to the k largest eigenvalues of the outer product of the observed data matrix, where k is the estimated number of factors, and the factor loadings Λ_i can then be estimated as well. We show in the following that for a vector process x_it, the solution to the factor analysis with large T and N is the same as in the scalar case, up to some modifications with respect to the dimension of x_it.
Consider the factor model in (2.2). Define x_t' = (x_1t, ..., x_Nt), an m×N matrix, and x_i = (x_i1, ..., x_iT)', a T×m matrix, where x_it is an m×1 vector whose pth element is x_itp. Let Λ' = (Λ_1', ..., Λ_N') be an r×mN matrix with Λ_i' = (λ_i1, ..., λ_im), an r×m matrix, where r is the true number of factors and λ_ip is an r×1 vector. Similarly to x_t and x_i, we define e_t and e_i, whose elements are those of e_it. Let e = (e_1, ..., e_N), f = (f_1, ..., f_T)', a T×r matrix, and x = (x_1, ..., x_N); then we have

    x_i = f Λ_i' + e_i,    (T×m) = (T×r)(r×m) + (T×m),        (2.4)
    x   = f Λ'   + e,      (T×mN) = (T×r)(r×mN) + (T×mN).     (2.5)
The objective function is

    min_{λ_ip, f_t} V(k) = (mNT)^{-1} Σ_{i=1}^{N} Σ_{t=1}^{T} Σ_{p=1}^{m} (x_itp − λ_ip' f_t)².   (2.6)
Let (f̃, Λ̃) be the solution to (2.6); the first-order conditions give

    λ̃_ip = (Σ_t f_t f_t')^{-1} (Σ_t f_t x_itp),                        (2.7)
    f̃_t = (Σ_i Σ_p λ_ip λ_ip')^{-1} (Σ_i Σ_p x_itp λ_ip).              (2.8)
Plugging (2.7) into (2.6), we have

    V(k) = (mNT)^{-1} Σ_i vec(x_i' − x_i' f (f'f)^{-1} f')' vec(x_i' − x_i' f (f'f)^{-1} f')
         = (mNT)^{-1} Σ_i tr[(x_i − f (f'f)^{-1} f' x_i)' (x_i − f (f'f)^{-1} f' x_i)]
         = (mNT)^{-1} tr(Σ_i x_i' x_i) − (mNT²)^{-1} tr(f' x x' f),        (2.9)

where the last line of (2.9) follows from the normalization condition f'f/T = I_r and Σ_i x_i x_i' = xx'. Minimizing (2.6) is therefore equivalent to maximizing the second term in (2.9), the solution of which is √T times the eigenvectors corresponding to the k largest eigenvalues of the matrix xx'. The estimated factor loadings are then given by Λ̃' = f̃'x/T. Alternatively, we can plug (2.8) into the objective function to obtain

    V(k) = (mNT)^{-1} Σ_t [vec(x_t') − Λ (Λ'Λ)^{-1} Λ' vec(x_t')]' [vec(x_t') − Λ (Λ'Λ)^{-1} Λ' vec(x_t')]
         = (mNT)^{-1} Σ_t vec(x_t')' vec(x_t') − ((mN)²T)^{-1} tr(Λ' x' x Λ),   (2.10)

where the last line of (2.10) follows from the normalization Λ'Λ/(mN) = I_r, and Λ̃ is chosen as √(mN) times the eigenvectors corresponding to the k largest eigenvalues of x'x. The factors are then given by f̃ = xΛ̃/(mN). The two approaches are both valid and differ only in computation time for different values of T and N.
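As an illustration of the first route, the following sketch (our own code, not the paper's) extracts f̃ as √T times the leading eigenvectors of xx' and sets Λ̃' = f̃'x/T:

```python
import numpy as np

def estimate_factors(x, k):
    """Principal-components estimate of the factor model x = f @ lam.T + e.

    x : (T, mN) stacked data matrix as in (2.5)
    k : number of factors to extract
    Returns (f_hat, lam_hat) under the normalization f'f/T = I_k:
    f_hat is sqrt(T) times the eigenvectors of x x' belonging to the
    k largest eigenvalues, and lam_hat' = f_hat' x / T.
    """
    T = x.shape[0]
    # eigh returns eigenvalues in ascending order for symmetric matrices
    w, v = np.linalg.eigh(x @ x.T)
    f_hat = np.sqrt(T) * v[:, -k:][:, ::-1]   # k leading eigenvectors
    lam_hat = (x.T @ f_hat) / T               # loadings from (2.7)
    return f_hat, lam_hat
```

The second route (eigenvectors of the mN×mN matrix x'x) gives the same fitted common component and is cheaper when mN < T.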
The above solution is the same as in the scalar case of Stock and Watson (1998), and it is natural to ask whether the factor number selection criteria developed in Bai and Ng (2002) apply in this case. The answer is yes, but we need to modify the criterion functions with respect to m, the number of variables included in the vector regression process. Under the assumption that the number of cross-sectional shocks affecting each unit is the same, the correct selection of the number of true underlying factors can be done using the criterion functions in Bai and Ng (2002). For example, if we have three variables in the vector process, GDP, the interest rate, and the equity price, then we can select the GDP series from each country or region, forming a dynamic panel, and apply Bai and Ng's method. The number of cross-sectional shocks, r, can then be consistently estimated with sufficient data. However, in some cases we are also interested in estimating the factors and factor loadings, and possibly using them for forecasting purposes; a scalar process that ignores the information in the interest rate and the equity price is then certainly not suitable. Hence, if the interest is not only the determination of the number of factors but also the estimation of the factors, we need to consider the vector process. In the following, we extend Bai and Ng's (2002) method to vector processes and also give the asymptotic results for the estimated factors and factor loadings as in Bai (2003).
The following assumptions are needed for all i, j = 1, ..., N; s, t = 1, ..., T; p, q = 1, ..., m.

Assumption A. E‖f_t‖⁴ < ∞ and T^{-1} Σ_t f_t f_t' → Σ_f as T → ∞, where Σ_f is a positive definite matrix.

Assumption B. ‖λ_ip‖ ≤ λ̄ < ∞, and ‖(mN)^{-1} Λ'Λ − Σ_Λ‖ → 0 as N → ∞ for some positive definite matrix Σ_Λ.

Assumption C. For a finite positive constant M:

1. E(e_itp) = 0, E|e_itp|⁸ ≤ M;

2. E(m^{-1} N^{-1} Σ_i Σ_p e_isp e_itp) = γ_mN(s,t), |γ_mN(s,s)| ≤ M for all s, and T^{-1} Σ_s Σ_t |γ_mN(s,t)| ≤ M;

3. E(e_itp e_jsq) = τ_ij,ts,pq with |τ_ij,ts,pq| ≤ |τ_ij,pq| for some τ_ij,pq and all t, s; also (mN)^{-1} Σ_i Σ_j Σ_p Σ_q |τ_ij,pq| ≤ M;

4. E|(mN)^{-1/2} Σ_i Σ_p (e_itp e_isp − E(e_itp e_isp))|⁴ ≤ M for all (s,t).
Assumptions A and B are the same as in Bai and Ng (2002). Assumption C2 allows a general correlation structure among the variables within a particular unit i: for unit i, its variables are allowed to have both contemporaneous dependence and time dependence. Assumption C3 allows cross-sectional dependence in the error terms. We do not make the assumption of weak dependence between the factors and the idiosyncratic errors here. This assumption can be imposed without any effect on the asymptotic results, but it is not appropriate for the factor-augmented regression model developed in Section 4.
In order to consistently estimate the number of factors, we need a penalty function g(mN, T) in the following criterion function:

    PC(k) = V(k, f̂^k) + k g(mN, T),

where V(k, f̂^k) is obtained by plugging the estimated factors into (2.6). Minimizing PC(k) with respect to k gives a consistent estimator of r.

Theorem 2.1. Suppose Assumptions A-C hold and k factors are obtained by the principal component method. Let k̂ = argmin PC(k). Then lim_{N,T→∞} Prob(k̂ = r) = 1 if g(mN, T) → 0 and C²_mNT g(mN, T) → ∞ as N, T → ∞, where C_mNT = min{√(mN), √T}.
See Appendix for the proof. This theorem is similar to the one in Bai and Ng (2002) except for the inclusion of m in the penalty function. When N and T are large, the role of m in the criterion function is negligible; however, it does make some difference in finite samples. Explicit forms of g(mN, T) are given in Section 4. The distribution theory of the estimated factors, factor loadings, and common components can also be obtained with some additional assumptions.
Assumption D. Σ_s |γ_mN(s,t)| ≤ M and Σ_i Σ_p Σ_q |τ_ij,pq| ≤ M.

Assumption E.

1. E‖(mNT)^{-1/2} Σ_s Σ_i Σ_p f_s (e_isp e_itp − E(e_isp e_itp))‖² ≤ M;

2. E‖(mNT)^{-1/2} Σ_t f_t Λ' vec(e_t')‖² ≤ M;

3. For each t, as N → ∞, (mN)^{-1/2} Σ_i Σ_p λ_ip e_itp →_d N(0, Γ_t), where Γ_t = lim_{N→∞} (mN)^{-1} Σ_i Σ_j Σ_p Σ_q λ_ip λ_jq' E(e_itp e_jtq);

4. T^{-1/2} Σ_t f_t e_itp →_d N(0, Φ_ip), where Φ_ip = plim_{T→∞} T^{-1} Σ_s Σ_t E(f_t f_s' e_itp e_isp).
Assumption F. The eigenvalues of the r×r matrix Σ_Λ Σ_f are distinct.

Assumption F is the same as Assumption G in Bai (2003). Under Assumptions A-F, we can derive the distribution theory for the estimated factors, factor loadings, and common components. Since the results are quite similar to the scalar case in Bai (2003), we state the distribution theory in a single Theorem 2.2; a sketch of the proof is given in Appendix A.
Theorem 2.2. Under Assumptions A-F and √N/T → ∞ as N, T → ∞, we have the following distribution theory for the estimated factor model:

    √(mN) (f̃_t − H' f_t) →_d N(0, V^{-1} Q Γ_t Q' V^{-1}),
    √T (λ̃_ip − H^{-1} λ_ip) →_d N(0, Q'^{-1} Φ_ip Q^{-1}),
    ((mN)^{-1} V_itp + T^{-1} W_itp)^{-1/2} (c̃_itp − c_itp) →_d N(0, 1),

where H = (Λ'Λ/(mN))(f'f̃/T) V^{-1}_mNT, V_mNT is the diagonal matrix of the largest r eigenvalues of (mNT)^{-1} xx' in decreasing order, V and Q are defined in Proposition 1 of Bai (2003), V_itp = λ_ip' Σ_Λ^{-1} Γ_t Σ_Λ^{-1} λ_ip, W_itp = f_t' Σ_f^{-1} Φ_ip Σ_f^{-1} f_t, and c̃_itp = λ̃_ip' f̃_t.
See Appendix for the proof. The above asymptotic theory can be used to construct confidence intervals in factor analysis. It may also be of use for testing cross-sectional dependence in panel VAR.
Theorem 2.1 and Theorem 2.2 are very similar to the corresponding theorems in Bai and Ng (2002) and Bai (2003), which is not a surprise if we reshape the data matrix. Consider a simple case with N = 3 and m = 2. Suppose we have data for 3 countries, and for each country our VAR includes GDP and the interest rate. Then we can always reorder our data so that within each country there are 2×T observations: the first T observations are the GDP series and the second T observations are the interest rate series. Thus the data for each country can be split into two groups, where each group can be treated as a new individual unit, and we will then have N × m = 3 × 2 = 6 units. Let N* = m × N; then it is exactly the same model as in Bai and Ng (2003). In matrix notation, this is a reordering of the column vectors of x in (2.5), which gives the same solution for (2.6) after the singular value decomposition. This motivates the question: what is the advantage of considering factor analysis for a vector process? The first advantage is its simplicity. Applying factor analysis to (2.2) does not require reshaping the data matrix, and the estimation procedure is much simpler when there is an iteration procedure as in Bai and Kao (2004). The second advantage is related to efficient estimation. Reshaping the data matrix treats each time series in each cross-sectional unit as a new individual unit, which is not useful when we implement efficient estimation of the factors and factor loadings. Table 2 shows that the factor analysis gives misleading results when there is a strong presence of group-wise heteroskedasticity. If the variance-covariance matrix of GDP and the interest rate is very different for each country, then it is possible that the factor selection criteria and the factor estimation method will give the wrong number of factors and inefficient estimates of the factors and factor loadings. In this case, a more appealing procedure is to use the information in the covariance matrix and change the objective function to
    min_{Λ_i, f_t} V_mNT = (mNT)^{-1} Σ_{i=1}^{N} Σ_{t=1}^{T} (x_it − Λ_i f_t)' Σ_i^{-1} (x_it − Λ_i f_t),   (2.11)

where Σ_i in practice can be a first-stage estimate of the covariance matrix of the data in group i and x_it is an m×1 vector. (2.11) may give more efficient estimation results in the presence of heteroskedasticity and serial correlation.
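A feasible version of (2.11) can be implemented by pre-whitening: for a fixed set of first-stage covariance estimates Σ_i, transforming each unit's data by Σ_i^{-1/2} reduces (2.11) to the unweighted problem, which principal components solve. The sketch below (our own illustration with hypothetical function names, not code from the paper) performs one such pass:

```python
import numpy as np

def weighted_factor_step(x_blocks, sigma_blocks, k):
    """One feasible-GLS pass at the weighted objective (2.11).

    x_blocks     : list of N arrays, each (T, m), the data for unit i
    sigma_blocks : list of N (m, m) covariance estimates, one per unit
    k            : number of factors
    Pre-whitening each unit by Sigma_i^{-1/2} turns (2.11) back into
    the unweighted least-squares problem of (2.6).
    """
    T = x_blocks[0].shape[0]
    whitened = []
    for xi, si in zip(x_blocks, sigma_blocks):
        w, v = np.linalg.eigh(si)                    # symmetric inverse square root
        si_inv_half = v @ np.diag(w ** -0.5) @ v.T
        whitened.append(xi @ si_inv_half)            # (T, m) whitened block
    xw = np.hstack(whitened)                         # (T, mN)
    w, v = np.linalg.eigh(xw @ xw.T)
    f_hat = np.sqrt(T) * v[:, -k:][:, ::-1]          # leading eigenvectors
    return f_hat
```

In practice one would iterate: re-estimate Σ_i from the residuals and repeat the whitened principal-components step until the factors stabilize.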
3 Fully Modified Estimation
The common shocks studied in Section 2 are assumed to follow a stationary process, but we allow possible nonstationarity and cointegration in the process y_it. Since T is large, we can estimate the model for each individual unit and allow for possible parameter heterogeneity, and the model becomes a VAR with cross-sectional dependence. By assuming normality of the error term e_it in (2.2) and exogeneity of the cross-sectional shocks, one may extend Johansen's (1988) method to processes with exogenous variables as in Pesaran et al. (2000). In that case, we could further relax the assumption of stationary cross-sectional shocks; however, the asymptotic results of factor analysis for nonstationary vector processes would then have to be derived along the lines of Bai (2004). Without imposing any such assumptions on the error term and the cross-sectional shocks, we use Phillips' (1995) fully modified method for estimation in this section.
The fully modified method was developed by Phillips (1995) as an alternative way to estimate a possibly cointegrated VAR process without pretesting for the cointegration rank and the location of unit roots. It corrects for possible serial correlation in the error term and for the endogeneity of the regressors resulting from cointegration. (2.1) differs from the model in Phillips (1995) by the extra term Λ_i f_t, which, combined with the error term e_it, can be treated as a composite error; we therefore expect the asymptotic theory for the estimated coefficients in (2.1) to give a somewhat different variance-covariance structure. It is noted that Bai and Kao (2004) recently also used the fully modified method for estimation in panels with cross-sectional dependence. However, the model in this paper differs from theirs in several respects: we study panel VAR instead of a static panel with a scalar process; the time dimension T is large in this paper, which allows for parameter heterogeneity; and we also consider the finite sample performance of the factor-augmented regression through the simulations in Section 4.
Rewrite (2.1) as

    y_it = J*_i(L) Δy_{i,t-1} + A_i y_{i,t-1} + Λ_i f_t + e_it       (3.1)
         = J_i z_t + A_i y_{i,t-1} + e*_it,                          (3.2)

where J_i = (J*_i1, ..., J*_{i,q-1}), z_t = (Δy_{i,t-1}', ..., Δy_{i,t-q+1}')' and e*_it = Λ_i f_t + e_it. Assumption C in Section 2 allows a very general form of serial correlation for f_t and e_it; however, in order to get a first-stage consistent estimator from (2.1), we have to impose a limited serial correlation structure on both the factors and the errors. For simplicity, an MA(1) structure for both f_t and e_it is assumed in the simulations. We can certainly allow for higher order moving average processes at the cost of truncating more data. Define the following notation for the fully modified estimator (these succinct notations are from Kauppi, 2004): let a_t and b_t be two covariance stationary processes. The long-run and one-sided long-run covariance matrices between a and b are given by Ω_ab = Σ_{j=-∞}^{∞} E(a_{t+j} b_t') and Δ_ab = Σ_{j=0}^{∞} E(a_{t+j} b_t'), whose kernel estimators are Ω̂_ab = Σ_{j=-T+1}^{T-1} w(j/K) Γ̂_ab(j) and Δ̂_ab = Σ_{j=0}^{T-1} w(j/K) Γ̂_ab(j), where w(·) is a kernel function with lag truncation or bandwidth parameter K and Γ̂_ab(j) = T^{-1} Σ_{1≤t,t+j≤T} a_{t+j} b_t'. The following assumptions, except for H1 and H2, are from Phillips (1995, p. 1044).
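The kernel estimators Ω̂_ab and Δ̂_ab can be computed directly from these definitions. A minimal sketch with the Parzen kernel (our own illustration, not the COINT procedure used later in Section 4):

```python
import numpy as np

def parzen(u):
    """Parzen kernel weight w(u); zero for |u| >= 1."""
    u = abs(u)
    if u <= 0.5:
        return 1 - 6 * u**2 + 6 * u**3
    if u <= 1.0:
        return 2 * (1 - u)**3
    return 0.0

def longrun_cov(a, b, K):
    """Kernel estimates of Omega_ab (two-sided) and Delta_ab (one-sided).

    a, b : (T, p) and (T, q) mean-zero covariance stationary series
    K    : lag truncation / bandwidth parameter
    Gamma_ab(j) = T^{-1} sum_t a_{t+j} b_t'
    """
    T = a.shape[0]
    def gamma(j):
        if j >= 0:
            return a[j:].T @ b[:T - j] / T
        return a[:T + j].T @ b[-j:] / T
    # with the Parzen kernel, weights vanish for |j| >= K
    omega = sum(parzen(j / K) * gamma(j) for j in range(-K, K + 1))
    delta = sum(parzen(j / K) * gamma(j) for j in range(0, K + 1))
    return omega, delta
```

For white noise the estimates collapse to the contemporaneous covariance, since Γ̂(j) ≈ 0 for j ≠ 0.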
Assumption H.

1. f_t is an identically distributed MA(1) process with zero mean, covariance matrix Σ_f as in Assumption A, and finite fourth order cumulants.

2. e_it is an identically distributed MA(1) process with zero mean, covariance matrix Σ_e, and finite fourth order cumulants.

3. |I_m − J(L)L| = 0 has roots on or outside the unit circle.

4. A_i = I_m + αβ', where α and β are m×r₀ matrices of full column rank r₀, 0 ≤ r₀ ≤ m.

5. α_⊥'(J*(1) − I_m)β_⊥ is nonsingular, where α_⊥ and β_⊥ are m×(m−r₀) matrices of full column rank such that α_⊥'α = β_⊥'β = 0.
Define the orthogonal transformation matrix H_T = [β, β_⊥]. Multiplying both sides of (3.2) by H_T', we obtain the set of equations analogous to (31a) and (31b) in Phillips (1995):

    y_1it = J_1 z_t + A_i11 y_1i,t-1 + e*_1it,                                    (3.3)
    y_2it = y_2i,t-1 + u_2t,   u_2t = e*_2it + J_2 z_t + A_i21 y_1i,t-1,          (3.4)

where the coefficients J_1, J_2, A_i11, A_i21 are elements of the properly partitioned matrices H_T' J_i H_T and H_T' A_i H_T as in (28) of Phillips (1995, p. 1045), and the transformed variables are left-multiplications of the original variables by H_T'. Multiplying the variables by H_T' separates the stationary and nonstationary parts of the data: y_1it is the stationary part and y_2it is the nonstationary part. Define w_it = (f_t', e_it', u_2t')' to be a vector stationary process. The functional central limit theorem for w_it gives T^{-1/2} Σ_t^{[T·]} w_it →_d B(·) ≡ BM(Ω). Ω_jp and Δ_jp denote the corresponding blocks of Ω and Δ, partitioned according to the dimensions of f_t, e_it and u_2t, for j, p = f, e, 2.
In matrix notation, we write (3.2) as

    Y_i' = J_i Z_i' + A_i Y_{i,-1}' + E*_i' = F_i X_i' + E*_i'       (3.5)
         = F_1i X_i1' + F_2i X_i2' + E*_i',                          (3.6)

where X_i1' and X_i2' contain the stationary and nonstationary regressors, respectively. It is immediately seen from (3.6) that asymptotic results similar to those in Phillips (1995) can be obtained for F_i. The result is summarized in the following theorem.
Theorem 3.1. Under Assumption H, we have

    √T (F̂⁺_1i − F_1i) →_d N(0, (I_m ⊗ Σ_11^{-1}) Φ_ee (I_m ⊗ Σ_11^{-1}) + (I_m ⊗ Σ_11^{-1}) Φ_ff (I_m ⊗ Σ_11^{-1})),

    T (F̂⁺_2i − F_2i) →_d Λ_i (∫₀¹ dB_f·2 B_2')(∫₀¹ B_2 B_2')^{-1} + (∫₀¹ dB_e·2 B_2')(∫₀¹ B_2 B_2')^{-1},

where Σ_11 = E(X_i1t X_i1t'), Φ_ff = Σ_{j=-∞}^{∞} E(Λ_i f_{t+j} f_t' Λ_i' ⊗ X_i1t+j X_i1t'), Φ_ee = Σ_{j=-∞}^{∞} E(e_it+j e_it' ⊗ X_i1t+j X_i1t'), B_f·2 = B_f − Ω_f2 Ω_22^{-1} B_2, B_e·2 = B_e − Ω_e2 Ω_22^{-1} B_2, and B_2 ≡ BM(Ω_22). B_f and B_e are the corresponding vector Brownian motions in w_it (= H_T' w_it).
The proof of Theorem 3.1 is a straightforward extension of Theorem 5.1 in Phillips (1995) to the composite error e*_it = Λ_i f_t + e_it and is thus omitted. In practice, we need not know the nonstationarity properties of the data, and the estimator F̂⁺_i takes the following form:

    F̂⁺_i = [Y_i'Z_i ⋮ Y_i'Y_{i,-1} − T Λ̂_i (Δ̂_{fΔy₋₁} − Ω̂_{fΔy₋₁} Ω̂_{Δy₋₁Δy₋₁}^{-1} Δ̂_{Δy₋₁Δy₋₁})
             − T (Δ̂_{eΔy₋₁} − Ω̂_{eΔy₋₁} Ω̂_{Δy₋₁Δy₋₁}^{-1} Δ̂_{Δy₋₁Δy₋₁})] (X_i'X_i)^{-1},   (3.7)

where the serial correlation and endogeneity corrections are applied separately to the factor and error components of the composite error. If the IV estimator is used in the first stage, the T in (3.7) should be (T − 2). The distribution of the FM estimator in the original coordinates is given in the following corollary.
Corollary 3.1. Under the conditions of Theorem 3.1, we have

    √T (F̂⁺_1i − F_1i) G →_d N(0, (I_m ⊗ Σ_11^{-1}) Φ_ee (I_m ⊗ Σ_11^{-1}) + (I_m ⊗ Σ_11^{-1}) Φ_ff (I_m ⊗ Σ_11^{-1})),

    T (F̂⁺_2i − F_2i) G_⊥ →_d Λ_i (∫₀¹ dB_f·2 B_2')(∫₀¹ B_2 B_2')^{-1} + (∫₀¹ dB_e·2 B_2')(∫₀¹ B_2 B_2')^{-1},

where

    G = [ I_{q-1} ⊗ H_T   0
          0               β ],    G_⊥ = [0 ⋮ β_⊥],

and B_f·2 and B_e·2 are defined as in Theorem 3.1.
(3.7) gives the estimates of a nonstationary VAR with an unknown mixture of I(0) and I(1) variables. We may also have I(2) variables in applications, and the above method does not work in that case. However, it is possible to use the residual-based fully modified VAR procedure of Chang (2000) for estimation purposes, where an unknown mixture of I(0), I(1), and I(2) variables is allowed.

The endogeneity and serial correlation corrections in (3.7) are taken with respect to both the factors and the error terms, which is equivalent to making the corrections with respect to the composite error term e*_it = Λ_i f_t + e_it. In other words, FM-VAR ignoring the cross-sectional dependence yields the same solution as (3.7). But that argument rests on a procedure that makes no use of the estimated factors. There are several ways in which FM-VAR with and without cross-sectional dependence can differ. One is to consider the dynamic behavior of the factors: limited dynamics in f_t and e_it are assumed, but can be tested. For forecasting purposes, one can use the method in Galbraith et al. (2002) to recover the moving average coefficients and improve forecasting performance using the information in the factors and errors. The other is to use the factor-augmented regression to improve the estimation even if there is no dynamic behavior in either the shocks or the errors. Once the common shocks are estimated, we can consider the following factor-augmented regression:

    y_it = a_i + J*_i(L) Δy_{i,t-1} + A_i y_{i,t-1} + Λ_i f_t + e_it.   (3.8)

The FM-VAR for (3.8) needs to correct for endogeneity and serial correlation only with respect to e_it, and the estimated factor is added as a stationary regressor. Generally, we expect the factor-augmented regression to give better estimation results. The simulation in Section 4 shows that this argument holds conditional on the nature of the data generating process and on how precise the factor analysis is.
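To make the augmented step concrete, the following sketch (our own illustration; it uses plain OLS rather than the FM correction, and it feeds in the factors directly, where in practice the estimated factors f̃_t would be used) fits (3.8) for one unit:

```python
import numpy as np

def factor_augmented_ols(y, f_hat, q):
    """OLS for the factor-augmented VAR (3.8) for one unit:
    y_t = a + sum_{h=1}^{q-1} J*_h dy_{t-h} + A y_{t-1} + Lam f_t + e_t.

    y     : (T, m) series for unit i
    f_hat : (T, r) factors (estimated in practice; true here for illustration)
    q     : VAR order
    Returns the stacked coefficient matrix with rows (const, J*'s, A', Lam').
    """
    dy = np.diff(y, axis=0)                  # dy[s] = y[s+1] - y[s]
    T = y.shape[0]
    X = []
    for t in range(q, T):
        regs = [np.ones(1)]
        for h in range(1, q):                # lagged differences
            regs.append(dy[t - h - 1])       # dy_{t-h}
        regs.append(y[t - 1])                # level lag
        regs.append(f_hat[t])                # augmenting factor
        X.append(np.concatenate(regs))
    X = np.array(X)
    Y = y[q:]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B
```

With q = 1 the lagged-difference block is empty and the regressors reduce to an intercept, y_{t-1}, and f_t, as in the simulation DGP (4.1) below.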
4 Simulation Results
In this section we study the finite sample properties of both the vector factor analysis and the FM-VAR with cross-sectional dependence.

Three cases for the estimation of the number of factors are considered in the simulation; the estimated number of factors, k, is simulated 1000 times and its average is reported in Tables 1, 2, and 3, respectively. The maximum number of factors during estimation is set to 10. In the first case, we consider the following data generating process:

    x_itp = λ_ip' f_t + e_itp,   i = 1, ..., N;  t = 1, ..., T;  p = 1, ..., m,

where each element of f_t, λ_ip and e_it is generated as i.i.d. N(0,1) for all i, t, p. m is set to 5, and N and T take different values. The number of cross-sectional shocks, r, is set to 5. The factor number selection criteria are
modified from those in Bai and Ng (2002):

    PC_p1(k) = V(k, f̂^k) + k σ̂² ((mN + T)/(mNT)) log(mNT/(mN + T))
    PC_p2(k) = V(k, f̂^k) + k σ̂² ((mN + T)/(mNT)) log(C²_mNT)
    PC_p3(k) = V(k, f̂^k) + k σ̂² log(C²_mNT)/C²_mNT
    IC_p1(k) = log(V(k, f̂^k)) + k ((mN + T)/(mNT)) log(mNT/(mN + T))
    IC_p2(k) = log(V(k, f̂^k)) + k ((mN + T)/(mNT)) log(C²_mNT)
    IC_p3(k) = log(V(k, f̂^k)) + k log(C²_mNT)/C²_mNT
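As an illustration of how the modified criteria are applied, the sketch below selects k with the IC_p2 rule for a T × mN panel (our own code, assuming the data matrix has already been stacked as in (2.5)):

```python
import numpy as np

def select_num_factors(x, kmax):
    """Select the number of factors with IC_p2, with mN (the number of
    series times the cross-section size) playing the role of N.

    x    : (T, mN) stacked data matrix
    kmax : maximum number of factors considered
    """
    T, mN = x.shape
    w, v = np.linalg.eigh(x @ x.T)           # eigenvectors, ascending order
    C2 = min(mN, T)                          # C_mNT squared
    best_k, best_ic = 1, np.inf
    for k in range(1, kmax + 1):
        f = np.sqrt(T) * v[:, -k:]           # k leading principal components
        lam = x.T @ f / T
        resid = x - f @ lam.T
        V = (resid ** 2).sum() / (mN * T)    # V(k, f^k) from (2.6)
        ic = np.log(V) + k * ((mN + T) / (mN * T)) * np.log(C2)
        if ic < best_ic:
            best_ic, best_k = ic, k
    return best_k
```

The PC_p criteria differ only in multiplying the penalty by a variance estimate σ̂² (e.g. V(kmax, f̂^kmax)) instead of taking logs.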
Table 1 gives the results for the homoskedastic case. All six criteria consistently select the number of factors except when T is very small. In Table 2, we consider the case of group-specific heteroskedasticity, where each group's variance is generated from a chi-square distribution with 2 degrees of freedom. The generated chi-square variable is then multiplied by the standard normal random variable to generate the error term. Elements of f_t and λ_ip are still drawn from N(0,1). We observe that, for the heteroskedasticity case considered here, the six criteria still consistently select the number of factors when the number of cross-sectional units is large and T is moderately large. When the number of cross-sectional units is small, they all select the wrong number even when T is large. This suggests that when there is strong heteroskedasticity across groups, the number of cross-sectional units has to be large enough for consistent selection of the number of factors. The performance of the six criteria is worse than in Table IV of Bai and Ng (2002); the reason is that the heteroskedasticity in the data generating process (DGP) of Table 2 is stronger. In Table 3, the factor f_t follows an MA(1) process whose MA coefficient matrix is generated from N(0,1); λ_ip and e_it are drawn from N(0,1). We see from the table that in most cases all six criteria give the correct answer. As in Table 1, if T is too small, the criterion functions fail.
Next, we study the finite sample properties of the FM-VAR with cross-sectional shocks. For simplicity, we consider a bivariate VARMA(1,1) model with the number of factors set to 1; (3.1) reduces to

    y_it = A_i y_{i,t-1} + λ_i f_t + e_it,    (4.1)
    f_t = θ_f u_{f,t-1} + u_ft,               (4.2)
    e_it = θ_e u_{e,t-1} + u_et.              (4.3)

The coefficient matrix is chosen from Binder et al. (2003):

    A_i = [  0.5  0.1
            -0.5  1.1 ],

with α = (-0.5, -0.5)' and β = (1, -0.2)', and we assume parameter homogeneity across different groups to simplify the DGP. Each element of λ_i follows N(0.1, 1) or N(0.5, 1). (u_ft, u_et')' follows

    ( u_ft )           ( 1    0    0
    ( u_et ) ~ N( 0,     0    1    0.8
                         0    0.8  1   ),

where we allow for correlation between the elements of u_et, but correlation between u_ft and u_et is not allowed. A value of 5 for the variance of u_ft is also tried in the simulations. θ_f = 0.2 or 0.8 in the simulations. θ_e takes the following value:

    θ_e = [ 0.3    0.4
            θ_e21  0.6 ],

where θ_e21 = -0.8, 0, or 0.8 in the simulations. This value of θ_e is taken from Phillips and Loretan (1991). Finally, we let N = 50, 100 and T = 50, 100, 200 and use different combinations of (N, T) in the simulations. Nine simulation tables for FM-VAR are reported in this section. We number these nine DGPs from DGP4 to DGP12, corresponding to the numbering of the tables.

            N     σ²_f   θ_f   θ_e21   λ_i
    DGP4    50    1      0.2   -0.8    N(0.1, 1)
    DGP5    50    5      0.2   -0.8    N(0.1, 1)
    DGP6    50    5      0.6   -0.8    N(0.1, 1)
    DGP7    50    5      0.2    0.8    N(0.1, 1)
    DGP8    50    5      0.2    0.8    N(0.1, 1)
    DGP9    50    5      0.8    0      N(0.5, 1)
    DGP10   100   5      0.8    0      N(0.5, 1)
    DGP11   50    1      0.8    0      N(0.5, 1)
    DGP12   100   1      0.8    0      N(0.5, 1)
The long-run and one-sided long-run variance-covariance matrices in (3.7) are calculated with the KERNEL procedure in COINT 2.0, where we use the Parzen kernel and an arbitrary lag truncation number of 5. Simulations using other kernels give similar results and are not reported in this paper.

In each table, we report the bias and standard deviation (s.d.) of each element of A_i over 1000 replications. Besides the FM estimator and the factor-augmented (AUG) FM estimator, the first-stage instrumental variable (IV) estimator and the continuously-updated estimator (UFM) of Bai and Kao (2004) are also considered. The continuously-updated estimator is obtained through an iteration procedure: the FM estimator is used as the initial estimator, and by reestimating the factors we obtain another FM estimator; the procedure is repeated until convergence is reached. The maximum number of iteration steps is set to 30, and the convergence criterion is 0.001.
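The iteration just described has a simple fixed-point structure. The skeleton below is a generic sketch of that loop (our own code; the two callables stand in for the FM coefficient step and the factor re-estimation step):

```python
import numpy as np

def continuously_updated(estimate_coefs, reestimate_factors, f0,
                         max_iter=30, tol=1e-3):
    """Skeleton of the continuously-updated (CUP) loop: alternate the
    coefficient step and the factor step until the maximum absolute
    coefficient change falls below tol or max_iter is reached.

    estimate_coefs(factors)   -> coefficient array (stand-in for FM step)
    reestimate_factors(coefs) -> factor array (stand-in for factor analysis)
    f0 : initial factors, e.g. from the first-stage residuals
    """
    factors = f0
    coefs = estimate_coefs(factors)          # initial (first-stage) estimate
    for _ in range(max_iter):
        factors = reestimate_factors(coefs)  # update factors given coefs
        new_coefs = estimate_coefs(factors)  # update coefs given factors
        converged = np.max(np.abs(new_coefs - coefs)) < tol
        coefs = new_coefs
        if converged:
            break
    return coefs, factors
```

In the paper's setting each step is itself an FM estimation, so the cost per iteration is nontrivial; the cap of 30 iterations bounds the total work.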
In Table 4, when T is small, 3 out of 4 parameter estimates from the augmented regression are better than those from the other methods. When T is large (T = 200), the FM estimator catches up in two parameter estimates, A_{i,11} and A_{i,12}. If the variance σ²_f is increased, as in Table 5, the augmented estimator performs worse than the FM estimator and the updated FM estimator. Only when T is large does the augmented regression give 1 out of 4 parameter estimates that is better than the FM estimator. The reason is that when the variance of the cross sectional shock is increased, it becomes harder for factor analysis to extract the covariability across groups. If we increase the magnitude of θ_f from 0.2 to 0.6, as in Table 6, there is some improvement in the performance of the augmented regression, and it occurs only when T is large. Overall, no estimator dominates the others in Table 6. If we change σ_{ε22} to 0.8, the simulation results in Tables 7 and 8 still give mixed results for the performance of the augmented regression relative to the other three estimators. The simulation tables reported so far do not support a positive conclusion on the effectiveness of the factor augmented regression in terms of bias, though its MSE is smaller. A further look at the DGPs gives the answer. Notice that in DGPs 4 to 8 we have a very noisy error process, and the variance in Tables 5 to 8 is also large. As argued previously, a large variance decreases the efficiency of factor analysis, though the number of factors can still be estimated; in addition, a noisy error term will weaken the signal from the cross sectional shock if the magnitude of the error, ε_it, relative to the cross sectional shock, f_t, is large.
These findings suggest investigating the performance of the factor augmented regression when the signal from f_t is strong compared to that from ε_it and the variance of f_t is small. It is expected that in these cases the factor analysis can efficiently extract the cross sectional shocks and the augmented regression gives better results. This is indeed what Tables 9-12 show. In Tables 9 and 10 we have the same DGP but different sample sizes. It is clear that as T gets larger, the augmented regression performs better than the other methods in 3 out of 4 parameter estimates. Even for the parameter A_{22}, the bias of AUG is not large, 0.0071. We also try the case when the variance of f_t is small; the results are reported in Tables 11 and 12, where AUG gives comparable performance to the other three methods for some estimated parameters and substantial improvement for the others.
5 Conclusion

This paper studies cross sectional dependence in panel VAR using the method of factor analysis and makes the following contributions to the literature: 1) we extend factor analysis to the vector process; 2) we give the limiting distribution of the FM-VAR estimator with cross sectional dependence; 3) the finite sample properties of IV, FM, UFM and AUG are investigated through simulation. Our findings are: 1) The vector factor analysis proposed in this paper gives a simpler approach than that in Bai and Ng (2002) for analyzing data with a panel VAR structure. Moreover, the approach is suitable for efficient factor analysis. The simulation results for both factor analysis, especially those in Table 2, and FM estimation indicate unsatisfactory performance of factor analysis under certain DGPs, and it is hoped that the efficient method can take both heteroskedasticity and serial correlation into consideration and give better estimates. 2) As in Phillips (1995), we give the asymptotic results of the estimators for both stationary and nonstationary variables in panel VAR. Again, they are found to have a normal mixture distribution in the limit. 3) We find that the six modified factor number selection criteria generally select the true number of factors, but give wrong answers for some DGPs with the heteroskedasticity and serial correlation considered in this paper. Tables 9 to 12 show that the augmented regression performs reasonably well when the cross sectional signal is strong relative to the magnitude of the heteroskedasticity and serial correlation in the error term. However, if there is a strong presence of heteroskedasticity and serial correlation, the factor analysis cannot extract the factors efficiently and the augmented regression performs worse than the FM-VAR estimators.
There are several important questions that remain unanswered in the literature on panel VAR with cross sectional dependence. One is testing for the existence of cross sectional shocks and assessing the magnitude of those shocks, which is important before we apply the augmented regression. This can possibly be done using the approach in Ng (2004). Phillips and Sul (2003) give the analytic bias for dynamic panels ignoring cross sectional dependence, and an extension of their work to panel VAR is also interesting. A comparison of the finite sample properties of the bias-corrected estimator and the four estimators considered in this paper is necessary. Another interesting direction is the extension of the results in this paper to nonstationary cross sectional dependence, where we may obtain results similar to those in Bai and Ng (2004). The properties of the more efficient factor analysis as in (2.11) also need to be investigated. It is important to note that for many microeconomic data sets the time dimension is short, and an extension of the work in Binder et al. (2003) to cross-section dependence is necessary. We hope to study these issues in subsequent work.
Appendix A
Lemma A.1 Under Assumptions A-C, we have
\begin{align*}
&(i)\;\; T^{-1}\sum_s\sum_t \gamma_{mN}(s,t)^2 \le M_1,\\
&(ii)\; E\Big(T^{-1}\sum_t \Big\| (mN)^{-1/2}\sum_i\sum_p e_{itp}\lambda_{ip}' \Big\|^2\Big) \le M_1,\\
&(iii)\, E\Big(T^{-2}\sum_t\sum_s \Big((mN)^{-1}\sum_i\sum_p x_{itp}x_{isp}\Big)^2\Big) \le M_1.
\end{align*}
Proof of Lemma A.1. The proof follows the same lines as that of Lemma A.1 in Bai and Ng (2002). □

Lemma A.2 For any fixed $k \ge 1$, there exists an $r \times k$ matrix $H^k$ with $\operatorname{rank} = \min(k,r)$ and $C_{mNT} = \min\big(\sqrt{mN}, \sqrt{T}\big)$, such that
\[
C_{mNT}^2\Big(T^{-1}\sum_t \big\|\tilde f_t^k - H^{k\prime} f_t\big\|^2\Big) = O_p(1). \tag{A.1}
\]

Proof of Lemma A.2. This lemma differs from Theorem 1 in Bai and Ng (2002) by the extra $m$, the dimension of the VAR process. The proof is analogous to that of Bai and Ng (2002), except that there is one more dimension. Let $H^{k\prime} = (\tilde f^{k\prime} f/T)(\Lambda'\Lambda/mN)$; then it can be verified that the following identity holds:
\[
\tilde f_t^k - H^{k\prime} f_t = T^{-1}\sum_{s=1}^T \tilde f_s^k\,\gamma_{mN}(s,t) + T^{-1}\sum_{s=1}^T \tilde f_s^k\,\zeta_{st} + T^{-1}\sum_{s=1}^T \tilde f_s^k\,\eta_{st} + T^{-1}\sum_{s=1}^T \tilde f_s^k\,\xi_{st}, \tag{A.2}
\]
where
\begin{align}
\zeta_{st} &= \operatorname{vec}(e_s')'\operatorname{vec}(e_t')/mN - \gamma_{mN}(s,t), \tag{A.3}\\
\eta_{st} &= (mN)^{-1} f_s'\Lambda'\operatorname{vec}(e_t'), \nonumber\\
\xi_{st} &= (mN)^{-1} f_t'\Lambda'\operatorname{vec}(e_s'). \nonumber
\end{align}
Using the inequality $(w+x+y+z)^2 \le 4(w^2+x^2+y^2+z^2)$, it is easy to see that $\|\tilde f_t^k - H^{k\prime} f_t\|^2 \le 4(a_t+b_t+c_t+d_t)$, where
\[
a_t = T^{-2}\Big\|\sum_s \tilde f_s^k\,\gamma_{mN}(s,t)\Big\|^2,\quad
b_t = T^{-2}\Big\|\sum_s \tilde f_s^k\,\zeta_{st}\Big\|^2,\quad
c_t = T^{-2}\Big\|\sum_s \tilde f_s^k\,\eta_{st}\Big\|^2,\quad
d_t = T^{-2}\Big\|\sum_s \tilde f_s^k\,\xi_{st}\Big\|^2.
\]
Then $T^{-1}\sum_t \|\tilde f_t^k - H^{k\prime} f_t\|^2 \le 4\,T^{-1}\sum_t (a_t+b_t+c_t+d_t)$.

From Lemma A.1(i) and the Cauchy-Schwarz inequality, $T^{-1}\sum_t a_t = O_p(T^{-1})$. For $b_t$,
\[
T^{-1}\sum_t b_t = T^{-3}\sum_t\sum_s\sum_u \tilde f_s^{k\prime}\tilde f_u^k\,\zeta_{st}\zeta_{ut}
\le \Big(T^{-1}\sum_s \|\tilde f_s^k\|^2\Big)\Big(T^{-4}\sum_s\sum_u \Big(\sum_t \zeta_{st}\zeta_{ut}\Big)^2\Big)^{1/2}.
\]
But
\[
E\Big(\sum_t \zeta_{st}\zeta_{ut}\Big)^2 \le T^2\max_{s,t} E|\zeta_{st}|^4
\le T^2\max_{s,t}\,(mN)^{-2}\, E\Big|(mN)^{-1/2}\sum_i\sum_p \big(e_{itp}e_{isp} - E(e_{itp}e_{isp})\big)\Big|^4
\le T^2(mN)^{-2}M.
\]
Thus $T^{-1}\sum_t b_t = O_p\big((mN)^{-1}\big)$. Consider $c_t$:
\begin{align*}
c_t &= T^{-2}\Big\|(mN)^{-1}\Lambda'\operatorname{vec}(e_t')\sum_s \tilde f_s^k f_s'\Big\|^2\\
&\le (mN)^{-2}\big\|\Lambda'\operatorname{vec}(e_t')\big\|^2\Big(T^{-1}\sum_s\|\tilde f_s^k\|^2\Big)\Big(T^{-1}\sum_s\|f_s\|^2\Big)
= (mN)^{-2}\big\|\Lambda'\operatorname{vec}(e_t')\big\|^2\cdot O_p(1).
\end{align*}
Then
\[
T^{-1}\sum_t c_t = O_p(1)\cdot(mN)^{-1}\,T^{-1}\sum_t \big\|(mN)^{-1/2}\Lambda'\operatorname{vec}(e_t')\big\|^2 = O_p\big((mN)^{-1}\big)
\]
by Lemma A.1(ii), and the result for $d_t$ can be proved in a similar way. Combining the results for $a_t$ to $d_t$ and plugging them into (A.2), we have the result in Lemma A.2. □

Lemma A.3 With $1 \le k \le r$,
\[
V\big(k, \tilde f^k\big) - V\big(k, fH^k\big) = O_p\big(C_{mNT}^{-1}\big). \tag{A.4}
\]
Proof. Similar to Lemma 2 in Bai and Ng (2002), we have
\begin{align*}
V\big(k,\tilde f^k\big) &= (mNT)^{-1}\sum_i \operatorname{vec}(x_i)'\big(I_m \otimes M_{\tilde f}^k\big)\operatorname{vec}(x_i),\\
V\big(k, fH^k\big) &= (mNT)^{-1}\sum_i \operatorname{vec}(x_i)'\big(I_m \otimes M_{fH}\big)\operatorname{vec}(x_i),\\
V\big(k,\tilde f^k\big) - V\big(k,fH^k\big) &= (mNT)^{-1}\sum_i \operatorname{vec}(x_i)'\big(I_m \otimes (P_{fH} - P_{\tilde f}^k)\big)\operatorname{vec}(x_i),
\end{align*}
where $M_{fH} = I_T - P_{fH}$ and $M_{\tilde f}^k = I_T - \tilde f^k(\tilde f^{k\prime}\tilde f^k)^{-1}\tilde f^{k\prime} = I_T - P_{\tilde f}^k$. Let $D_k = \tilde f^{k\prime}\tilde f^k/T$ and $D = H^{k\prime}f'fH^k/T$; following the same decomposition as in Bai and Ng (2002), we have
\begin{align}
P_{\tilde f}^k - P_{fH} = T^{-1}\big[&\big(\tilde f^k - fH^k\big)D_k^{-1}\big(\tilde f^k - fH^k\big)' + \big(\tilde f^k - fH^k\big)D_k^{-1}H^{k\prime}f' \nonumber\\
&+ fH^kD_k^{-1}\big(\tilde f^k - fH^k\big)' + fH^k\big(D_k^{-1} - D^{-1}\big)H^{k\prime}f'\big] \tag{A.5}
\end{align}
and $(mNT)^{-1}\sum_i \operatorname{vec}(x_i)'\big(I_m \otimes (P_{fH} - P_{\tilde f}^k)\big)\operatorname{vec}(x_i) = I + II + III + IV$. Define the $T\times 1$ vector $x_{i\cdot p} = (x_{i1p}, x_{i2p}, \ldots, x_{iTp})'$. Then
\begin{align*}
I &= (mNT)^{-1}\sum_i\sum_p x_{i\cdot p}'\big(\tilde f^k - fH^k\big)D_k^{-1}\big(\tilde f^k - fH^k\big)'x_{i\cdot p}\\
&\le \Big(T^{-2}\sum_t\sum_s \|\tilde f_t^k - H^{k\prime}f_t\|^2\|\tilde f_s^k - H^{k\prime}f_s\|^2\|D_k^{-1}\|^2\Big)^{1/2}
\Big(T^{-2}\sum_t\sum_s\Big((mN)^{-1}\sum_i\sum_p x_{itp}x_{isp}\Big)^2\Big)^{1/2}\\
&= O_p\big(C_{mNT}^{-2}\big)
\end{align*}
by Lemma A.1(iii) and Lemma A.2. Next,
\begin{align*}
II &= (mNT)^{-1}\sum_i\sum_p\sum_t\sum_s \big(\tilde f_t^k - H^{k\prime}f_t\big)'D_k^{-1}H^{k\prime}f_s\,x_{itp}x_{isp}\\
&\le \Big(T^{-1}\sum_t \|\tilde f_t^k - H^{k\prime}f_t\|^2\Big)^{1/2}\|D_k^{-1}\|\Big(T^{-1}\sum_s\|H^{k\prime}f_s\|^2\Big)^{1/2}
\Big(T^{-2}\sum_t\sum_s\Big((mN)^{-1}\sum_i\sum_p x_{itp}x_{isp}\Big)^2\Big)^{1/2}\\
&= O_p\big(C_{mNT}^{-1}\big) = III,
\end{align*}
and
\begin{align*}
IV &= (mNT)^{-1}\sum_i\sum_p\sum_t\sum_s f_t'H^k\big(D_k^{-1} - D^{-1}\big)H^{k\prime}f_s\,x_{itp}x_{isp}\\
&\le \|D_k^{-1} - D^{-1}\|\sum_p\Big((mN)^{-1}\sum_i\Big(T^{-1}\sum_t \|f_t'H^k\|\,|x_{itp}|\Big)^2\Big)
= O_p\big(C_{mNT}^{-1}\big),
\end{align*}
where the property $\|D_k - D\| = O_p\big(C_{mNT}^{-1}\big)$ is proved in Bai and Ng (2002). The result in Lemma A.3 follows from combining $I$, $II$, $III$ and $IV$. □

Lemma A.4 For each $k < r$, there exists a $\tau_k > 0$ such that
\[
\operatorname*{plim\,inf}_{N,T\to\infty}\; V\big(k, fH^k\big) - V(r, f) = \tau_k. \tag{A.6}
\]
Proof. Again, this lemma is identical to Lemma 3 in Bai and Ng (2002); we prove a vector version of it. Define the $T\times 1$ vector $e_{i\cdot p} = (e_{i1p}, \ldots, e_{iTp})'$, and write $x_{i\cdot p} = f\lambda_{ip} + e_{i\cdot p}$. Then
\begin{align*}
V\big(k, fH^k\big) - V(r, f) &= (mNT)^{-1}\sum_i \operatorname{vec}(x_i)'\big(I_m \otimes (P_f - P_{fH})\big)\operatorname{vec}(x_i)\\
&= (mNT)^{-1}\sum_i\sum_p \big(\lambda_{ip}'f' + e_{i\cdot p}'\big)(P_f - P_{fH})\big(f\lambda_{ip} + e_{i\cdot p}\big)\\
&= (mNT)^{-1}\sum_i\sum_p \lambda_{ip}'f'(P_f - P_{fH})f\lambda_{ip}\\
&\quad + 2(mNT)^{-1}\sum_i\sum_p e_{i\cdot p}'(P_f - P_{fH})f\lambda_{ip}\\
&\quad + (mNT)^{-1}\sum_i\sum_p e_{i\cdot p}'(P_f - P_{fH})e_{i\cdot p}\\
&= I + II + III.
\end{align*}
$I > 0$ and $III > 0$; the proof follows Bai and Ng (2002). Consider the second term $II$:
\[
II = \sum_p\Big(2(mNT)^{-1}\sum_i e_{i\cdot p}'P_f f\lambda_{ip} - 2(mNT)^{-1}\sum_i e_{i\cdot p}'P_{fH}f\lambda_{ip}\Big),
\]
and the first term in the brackets satisfies
\begin{align*}
\Big|(mNT)^{-1}\sum_i e_{i\cdot p}'P_f f\lambda_{ip}\Big|
&= \Big|T^{-1}\sum_t f_t'\Big((mN)^{-1}\sum_i e_{itp}\lambda_{ip}\Big)\Big|\\
&\le \Big(T^{-1}\sum_t\|f_t\|^2\Big)^{1/2}(mN)^{-1/2}\Big(T^{-1}\sum_t\Big\|(mN)^{-1/2}\sum_i e_{itp}\lambda_{ip}\Big\|^2\Big)^{1/2}
= O_p\big((mN)^{-1/2}\big)
\end{align*}
by Lemma A.1(ii). The second term is of the same order. Thus $II = O_p\big((mN)^{-1/2}\big)$. □
Lemma A.5 For any $k \ge r$, $V\big(k, \tilde f^k\big) - V\big(r, \tilde f^r\big) = O_p\big(C_{mNT}^{-2}\big)$.

Proof. See Bai and Ng (2002). □

Proof of Theorem 2.1. Follows from the proof of Bai and Ng (2002) and the results in Lemmas A.1 to A.5. □

Lemma A.6 Under Assumptions A-E, we have
\begin{align*}
&(i)\;\; T^{-1}\sum_s \tilde f_s\,\gamma_{mN}(s,t) = O_p\big(T^{-1/2}C_{mNT}^{-1}\big),\\
&(ii)\; T^{-1}\sum_s \tilde f_s\,\zeta_{st} = O_p\big((mN)^{-1/2}C_{mNT}^{-1}\big),\\
&(iii)\, T^{-1}\sum_s \tilde f_s\,\eta_{st} = O_p\big((mN)^{-1/2}\big),\\
&(iv)\, T^{-1}\sum_s \tilde f_s\,\xi_{st} = O_p\big((mN)^{-1/2}C_{mNT}^{-1}\big).
\end{align*}
Proof. (i) follows from the decomposition
\[
T^{-1}\sum_s \tilde f_s\,\gamma_{mN}(s,t) = T^{-1}\sum_s \big(\tilde f_s - H'f_s\big)\gamma_{mN}(s,t) + T^{-1}H'\sum_s f_s\,\gamma_{mN}(s,t)
\]
and applying the Cauchy-Schwarz inequality, Lemma A.2, and Assumptions A and D. (ii)-(iv) can be proved in a similar way. □

Let $V_{mNT}$ be the $r\times r$ diagonal matrix of the first $r$ eigenvalues of the matrix $(mNT)^{-1}xx'$ and define $H = (\Lambda'\Lambda/mN)\big(f'\tilde f/T\big)V_{mNT}^{-1}$. Assuming at this stage that the number of factors, $r$, can be consistently estimated using the result in Theorem 2.1, we can drop the superscript $k$ in (A.2) and write it as
\[
\tilde f_t - H'f_t = V_{mNT}^{-1}\Big(T^{-1}\sum_s \tilde f_s\,\gamma_{mN}(s,t) + T^{-1}\sum_s \tilde f_s\,\zeta_{st} + T^{-1}\sum_s \tilde f_s\,\eta_{st} + T^{-1}\sum_s \tilde f_s\,\xi_{st}\Big). \tag{A.7}
\]
The proofs of the distributions of the factor loadings and the common components require additional lemmas.

Lemma A.7 Under Assumptions A-E, we have
\[
T^{-1}\big(\tilde f - fH\big)'e_{i\cdot p} = O_p\big(C_{mNT}^{-2}\big).
\]
Proof. From (A.7), we have
\begin{align*}
T^{-1}\sum_t \big(\tilde f_t - H'f_t\big)e_{itp}
= V_{mNT}^{-1}\Big[& T^{-2}\sum_t\sum_s \tilde f_s\,\gamma_{mN}(s,t)\,e_{itp} + T^{-2}\sum_t\sum_s \tilde f_s\,\zeta_{st}\,e_{itp}\\
&+ T^{-2}\sum_t\sum_s \tilde f_s\,\eta_{st}\,e_{itp} + T^{-2}\sum_t\sum_s \tilde f_s\,\xi_{st}\,e_{itp}\Big]
= V_{mNT}^{-1}(I + II + III + IV).
\end{align*}
Consider the term $I$:
\begin{align*}
I &= T^{-2}\sum_t\sum_s \big(\tilde f_s - H'f_s\big)\gamma_{mN}(s,t)e_{itp} + T^{-2}\sum_t\sum_s H'f_s\,\gamma_{mN}(s,t)e_{itp}\\
&\le T^{-1/2}\Big(T^{-1}\sum_s \|\tilde f_s - H'f_s\|^2\Big)^{1/2}\Big(T^{-2}\sum_s\Big(\sum_t \gamma_{mN}(s,t)e_{itp}\Big)^2\Big)^{1/2}\\
&\quad\; + T^{-2}\sum_t\sum_s |\gamma_{mN}(s,t)|\big(E\|f_s\|^2\big)^{1/2}\big(E(e_{itp}^2)\big)^{1/2}\\
&= T^{-1/2}O_p\big(C_{mNT}^{-1}\big) + O\big(T^{-1}\big) = T^{-1/2}O_p\big(C_{mNT}^{-1}\big).
\end{align*}
The second term $II$:
\begin{align*}
II &= T^{-2}\sum_t\sum_s \big(\tilde f_s - H'f_s\big)\zeta_{st}e_{itp} + T^{-2}\sum_t\sum_s H'f_s\,\zeta_{st}e_{itp}\\
&\le \Big(T^{-1}\sum_s \|\tilde f_s - H'f_s\|^2\Big)^{1/2}\Big(T^{-1}\sum_s\Big(T^{-1}\sum_t \zeta_{st}e_{itp}\Big)^2\Big)^{1/2} + O_p\big((mNT)^{-1/2}\big)\\
&= O_p\big((mN)^{-1/2}C_{mNT}^{-1}\big).
\end{align*}
For $III$,
\[
III = T^{-2}\sum_t\sum_s \big(\tilde f_s - H'f_s\big)\eta_{st}e_{itp} + T^{-2}\sum_t\sum_s f_s\,\eta_{st}e_{itp} = III_1 + III_2,
\]
and
\begin{align*}
III_1 &\le \Big(T^{-1}\sum_s \|\tilde f_s - H'f_s\|^2\Big)^{1/2}\Big(T^{-1}\sum_s\Big(T^{-1}\sum_t \eta_{st}e_{itp}\Big)^2\Big)^{1/2}\\
&= O_p\big(C_{mNT}^{-1}\big)\Big(T^{-1}\sum_s\Big(T^{-1}\sum_t \Big\|(mN)^{-1}\sum_j\sum_q \lambda_{jq}e_{jtq}\Big\|\,|e_{itp}|\Big)^2\Big)^{1/2}
= O_p\big((mN)^{-1/2}C_{mNT}^{-1}\big),\\
III_2 &= H'\Big(T^{-1}\sum_s f_sf_s'\Big)\Big((mNT)^{-1}\sum_t\sum_j\sum_q \lambda_{jq}e_{jtq}e_{itp}\Big)\\
&\le H'\Big(T^{-1}\sum_s f_sf_s'\Big)(mN)^{-1}\sum_j\sum_q |\tau_{ij,pq}| = O\big((mN)^{-1}\big).
\end{align*}
Hence $III = O_p\big((mN)^{-1/2}C_{mNT}^{-1}\big) + O\big((mN)^{-1}\big)$. The order of $IV$ can be obtained similarly to that of $III$. Thus $I + II + III + IV = O_p\big(C_{mNT}^{-2}\big)$. □
Lemma A.8 Under Assumptions A-E, we have
\[
T^{-1}\big(\tilde f - fH\big)'f = O_p\big(C_{mNT}^{-2}\big).
\]
Proof. Rewrite it as in (A.2):
\begin{align*}
T^{-1}\sum_t \big(\tilde f_t - H'f_t\big)f_t'
= V_{mNT}^{-1}\Big(&T^{-2}\sum_t\sum_s \tilde f_sf_t'\,\gamma_{mN}(s,t) + T^{-2}\sum_t\sum_s \tilde f_sf_t'\,\zeta_{st}\\
&+ T^{-2}\sum_t\sum_s \tilde f_sf_t'\,\eta_{st} + T^{-2}\sum_t\sum_s \tilde f_sf_t'\,\xi_{st}\Big)
= V_{mNT}^{-1}(I + II + III + IV).
\end{align*}
The proof is then similar to that of Lemma A.7 and is omitted here. □

Proof of Theorem 2.2. Using the result in Lemma A.6, (A.7) can be written as
\[
\sqrt{mN}\,\big(\tilde f_t - H'f_t\big) = V_{mNT}^{-1}\Big(T^{-1}\sum_s \tilde f_sf_s'\Big)(mN)^{-1/2}\sum_i\sum_p \lambda_{ip}e_{itp} + o_p(1). \tag{A.8}
\]
Under Assumption E3 and using Lemma A.3 in Bai (2003), the first term on the r.h.s. of (A.8) is distributed as $N\big(0,\; V^{-1}Q\,\Gamma_t\,Q'V^{-1}\big)$.
Also note that for $\tilde\lambda_{ip}$ we have the following identity:
\[
\tilde\lambda_{ip} - H^{-1}\lambda_{ip} = T^{-1}H'f'e_{i\cdot p} + T^{-1}\tilde f'\big(f - \tilde fH^{-1}\big)\lambda_{ip} + T^{-1}\big(\tilde f - fH\big)'e_{i\cdot p}. \tag{A.9}
\]
Applying Lemma B.3 in Bai (2003) and Lemma A.7, together with Assumption E4, it is easy to see that $\sqrt T\,\big(\tilde\lambda_{ip} - H^{-1}\lambda_{ip}\big) \xrightarrow{d} N\big(0,\; Q'^{-1}\Sigma_{ip}Q^{-1}\big)$. Similarly, a small modification of the proof in Appendix C of Bai (2003) gives the distribution of the common factor in Theorem 2.2. □
References

Andrews, D.W.K. (2003), "Cross-Section Regression with Common Shocks", Cowles Foundation Discussion Paper.

Bai, J. (2003a), "Inferential Theory for Factor Models of Large Dimensions", Econometrica 71, 135-171.

Bai, J. (2003b), "Estimating Cross-Section Common Stochastic Trends in Nonstationary Panel Data", Journal of Econometrics, in press.

Bai, J. and S. Ng (2002), "Determining the Number of Factors in Approximate Factor Models", Econometrica 70, 191-221.

Bai, J. and S. Ng (2004), "A PANIC Attack on Unit Roots and Cointegration", Econometrica, forthcoming.

Bai, J. and C. Kao (2004), "On the Estimation and Inference of a Cointegrated Regression in Panel Data with Cross-Sectional Dependence", manuscript, Department of Economics, Syracuse University.

Binder, M., C. Hsiao and M.H. Pesaran (2003), "Estimation and Inference in Short Panel Vector Autoregressions with Unit Roots and Cointegration", manuscript, Cambridge University.

Chang, Y. (2000), "Vector Autoregressions with Unknown Mixtures of I(0), I(1), and I(2) Components", Econometric Theory 16, 905-926.

Forni, M., M. Hallin, M. Lippi and L. Reichlin (2000), "The Generalized Dynamic-Factor Model: Identification and Estimation", Review of Economics and Statistics 82(4), 540-554.

Forni, M. and M. Lippi (2001), "The Generalized Dynamic Factor Model: Representation Theory", Econometric Theory 17, 1113-1141.

Galbraith, J.W., A. Ullah and V. Zinde-Walsh (2002), "Estimation of the Vector Moving Average Model by Vector Autoregression", Econometric Reviews 21(2), 205-219.

Im, K.S. and M.H. Pesaran (2003), "On the Panel Unit Root Tests Using Nonlinear Instrumental Variables", working paper, Trinity College.

Johansen, S. (1988), "Statistical Analysis of Cointegration Vectors", Journal of Economic Dynamics and Control 12, 231-254.

Kauppi, H. (2004), "On the Robustness of Hypothesis Testing Based on Fully Modified Vector Autoregression when Some Roots Are Almost One", Econometric Theory 20, 341-359.

Moon, H.R. and B. Perron (2004), "Testing for a Unit Root in Panels with Dynamic Factors", Journal of Econometrics, in press.

Mutl, J. (2002), "Panel VAR with Spatial Dependence", manuscript, http://econpapers.hhs.se/cpd/2002/113_Mutl.pdf.

Ng, S. (2004), "Testing Cross-Section Correlation in Panel Data Using Spacings", manuscript, Department of Economics, University of Michigan.

Pesaran, M.H., Y. Shin and R.J. Smith (2002), "Structural Analysis of Vector Error Correction Models with Exogenous I(1) Variables", Journal of Econometrics 97(2), 293-343.

Pesaran, M.H. (2002), "Estimation and Inference in Large Heterogeneous Panels with Cross Section Dependence", working paper, Trinity College.

Pesaran, M.H. (2003), "A Simple Panel Unit Root Test in the Presence of Cross Section Dependence", working paper, Trinity College.

Pesaran, M.H. (2004), "General Diagnostic Tests for Cross Section Dependence in Panels", working paper, Trinity College.

Pesaran, M.H., T. Schuermann and S.M. Weiner (2004), "Modeling Regional Interdependencies Using a Global Error-Correcting Macroeconometric Model", Journal of Business and Economic Statistics 22(2), 129-162.

Phillips, P.C.B. and M. Loretan (1991), "Estimating Long-run Economic Equilibria", Review of Economic Studies 58, 407-436.

Phillips, P.C.B. (1995), "Fully Modified Least Squares and Vector Autoregression", Econometrica 63(5), 1023-1078.

Phillips, P.C.B. and D. Sul (2003a), "Dynamic Panel Estimation and Homogeneity Testing under Cross Section Dependence", Econometrics Journal 6(1), 217-259.

Phillips, P.C.B. and D. Sul (2003b), "Bias in Dynamic Panel Estimation with Fixed Effects, Incidental Trends and Cross Section Dependence", Cowles Foundation Discussion Paper 1438.

Stock, J. and M. Watson (1998), "Diffusion Indexes", NBER Working Paper 6702.

Stock, J. and M. Watson (2002a), "Forecasting Using Principal Components From a Large Number of Predictors", Journal of the American Statistical Association 97, 1167-1179.

Stock, J. and M. Watson (2002b), "Macroeconomic Forecasting Using Diffusion Indexes", Journal of Business and Economic Statistics 20, 147-162.
 N     T     PCp1    PCp2    PCp3    ICp1    ICp2    ICp3
 100   40    5       5       5       5       5       5
 100   60    5       5       5       5       5       5
 200   60    5       5       5       5       5       5
 500   60    5       5       5       5       5       5
 100   100   5       5       5       5       5       5
 200   100   5       5       5       5       5       5
 500   100   5       5       5       5       5       5
 40    100   5       5       5       5       5       5
 60    100   5       5       5       5       5       5
 60    200   5       5       5       5       5       5
 10    50    5.355   5.002   5.002   5       5       9.862
 10    100   5       5       5       5       5       5
 20    100   5       5       5       5       5       5
 100   10    10      10      10      10      10      10
 100   20    5       5       5       5       5       5

Table 1: r = 5
 N     T     PCp1    PCp2    PCp3    ICp1    ICp2    ICp3
 100   40    5.053   5.037   5.037   4.873   4.853   4.925
 100   60    5       5       5       5       5       5
 200   60    5.04    5.024   5.024   4.995   4.991   4.999
 500   60    5       5       5       4.997   4.997   4.997
 100   100   5.88    5.557   5.557   5.002   5       5.343
 200   100   5       5       5       5       5       5
 500   100   5       5       5       5       5       5
 40    100   10      10      10      10      10      10
 60    100   9.426   8.773   8.773   8.23    6.606   10
 60    200   9.541   8.558   8.558   9.152   7.08    10
 10    50    10      9.996   9.996   10      9.491   10
 10    100   10      10      10      10      10      10
 20    100   8.32    6.717   6.717   5.46    5.005   10
 100   10    10      10      10      10      10      10
 100   20    8.77    8.616   8.616   4.229   4.173   4.358

Table 2: r = 5, heteroskedasticity
 N     T     PCp1    PCp2    PCp3    ICp1    ICp2    ICp3
 100   40    3       3       3       3       3       3
 100   60    3       3       3       3       3       3
 200   60    3       3       3       3       3       3
 500   60    3       3       3       3       3       3
 100   100   3       3       3       3       3       3
 200   100   3       3       3       3       3       3
 500   100   3       3       3       3       3       3
 40    100   3       3       3       3       3       3
 60    100   3       3       3       3       3       3
 60    200   3       3       3       3       3       3
 10    50    4.246   3.054   3.054   3       3       9.052
 10    100   3       3       3       3       3       3
 20    100   3       3       3       3       3       3.001
 100   10    10      10      10      10      10      10
 100   20    3.792   3.58    3.58    3       3       3

Table 3: r = 3, MA(1)
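The selection criteria reported in Tables 1-3 are of the PC_p/IC_p type of Bai and Ng (2002). As a point of reference, the following is a scalar-case sketch of the three IC_p variants for a T x N panel X, estimated by principal components; the paper's tables use the vector-process extension and modified penalties, so this illustrates only the basic scalar recipe.

```python
import numpy as np

def ic_criteria(X, kmax=8):
    """Select the number of factors with IC_p1/IC_p2/IC_p3 (Bai-Ng 2002),
    scalar case, via principal components of a T x N panel X."""
    T, N = X.shape
    # eigen-decomposition of the T x T moment matrix, largest first
    eigval, eigvec = np.linalg.eigh(X @ X.T / (N * T))
    eigvec = eigvec[:, np.argsort(eigval)[::-1]]
    C2 = min(N, T)
    ics = np.zeros((3, kmax + 1))
    for k in range(kmax + 1):
        if k == 0:
            resid = X
        else:
            F = np.sqrt(T) * eigvec[:, :k]     # estimated factors, F'F/T = I
            resid = X - F @ (F.T @ X) / T      # project factors out
        V = np.sum(resid**2) / (N * T)         # average squared residual
        pen = k * (N + T) / (N * T)
        ics[0, k] = np.log(V) + pen * np.log(N * T / (N + T))  # IC_p1
        ics[1, k] = np.log(V) + pen * np.log(C2)               # IC_p2
        ics[2, k] = np.log(V) + k * np.log(C2) / C2            # IC_p3
    return ics.argmin(axis=1)  # selected k for each criterion
```

The PC_p variants replace log(V) by V and scale the penalty by an estimate of the idiosyncratic variance at a maximal k; the behavior in Tables 2 and 3 (over-selection under strong heteroskedasticity and serial correlation) is consistent with what such criteria do when the idiosyncratic component is far from i.i.d.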
                  A11 = 0.5                          A12 = 0.1
 Estimator   T = 50    T = 100   T = 200    T = 50    T = 100   T = 200
 IV          -0.3394   -0.2785   -0.2598    0.0872    0.0691    0.0595
             (0.1477)  (0.1017)  (0.0713)   (0.0456)  (0.0289)  (0.0187)
 FM          -0.0551   -0.0169   -0.0058    0.0072    0.0002    -0.0008
             (0.1165)  (0.0785)  (0.0546)   (0.036)   (0.0223)  (0.0144)
 UFM         -0.0026   0.0361    0.0464     -0.011    -0.0153   -0.0142
             (0.1167)  (0.0783)  (0.0543)   (0.0362)  (0.0223)  (0.0143)
 AUG         0.0299    0.069     0.0798     0.004     -0.0079   -0.0116
             (0.1)     (0.0693)  (0.0471)   (0.0299)  (0.0188)  (0.0121)

                  A21 = -0.5                         A22 = 1.1
 Estimator   T = 50    T = 100   T = 200    T = 50    T = 100   T = 200
 IV          -0.499    -0.4582   -0.453     0.09      0.0854    0.0871
             (0.1472)  (0.1001)  (0.0704)   (0.0454)  (0.0286)  (0.0185)
 FM          0.1769    0.2015    0.2048     -0.0466   -0.0471   -0.0453
             (0.1189)  (0.0804)  (0.056)    (0.0366)  (0.0228)  (0.0147)
 UFM         0.3679    0.3718    0.3637     -0.1027   -0.0932   -0.0849
             (0.1372)  (0.0908)  (0.0626)   (0.0419)  (0.0256)  (0.0163)
 AUG         -0.1593   -0.1324   -0.1225    0.0346    0.0288    0.0266
             (0.0924)  (0.0629)  (0.0434)   (0.0276)  (0.0174)  (0.0111)

Table 4 (DGP 4)
                  A11 = 0.5                          A12 = 0.1
 Estimator   T = 50    T = 100   T = 200    T = 50    T = 100   T = 200
 IV          -0.3516   -0.2834   -0.2624    0.0912    0.0708    0.0601
             (0.0619)  (0.1099)  (0.0765)   (0.0617)  (0.0381)  (0.0241)
 FM          -0.1033   -0.063    -0.0535    0.0151    0.0089    0.0085
             (0.1311)  (0.0873)  (0.0603)   (0.0501)  (0.0303)  (0.0191)
 UFM         -0.0796   -0.0399   -0.0298    0.0046    0.0001    0.0015
             (0.1315)  (0.0873)  (0.0602)   (0.0504)  (0.0304)  (0.0191)
 AUG         -0.0824   -0.0429   -0.02938   0.0277    0.016     0.0113
             (0.0928)  (0.0628)  (0.0431)   (0.0323)  (0.012)   (0.0126)

                  A21 = -0.5                         A22 = 1.1
 Estimator   T = 50    T = 100   T = 200    T = 50    T = 100   T = 200
 IV          -0.451    -0.3971   -0.3894    0.085     0.0755    0.0754
             (0.1623)  (0.1095)  (0.0761)   (0.0617)  (0.0379)  (0.024)
 FM          0.0811    0.1087    0.1096     -0.035    -0.0337   -0.0296
             (0.1305)  (0.0869)  (0.0602)   (0.0497)  (0.0301)  (0.019)
 UFM         0.1805    0.1921    0.1883     -0.0682   -0.0591   -0.051
             (0.1404)  (0.0923)  (0.0637)   (0.0526)  (0.0316)  (0.0199)
 AUG         -0.1932   -0.1635   -0.154     0.0409    0.0353    0.0331
             (0.0851)  (0.0574)  (0.0395)   (0.0297)  (0.0184)  (0.0115)

Table 5 (DGP 5)
                  A11 = 0.5                          A12 = 0.1
 Estimator   T = 50*   T = 100   T = 200    T = 50*   T = 100   T = 200
 IV          -0.3766   -0.2505   -0.2286    0.1014    0.0655    0.0539
             (0.1699)  (0.1121)  (0.0776)   (0.0675)  (0.0425)  (0.0267)
 FM          -0.0862   -0.0084   0.00004    0.0218    -0.0039   -0.0035
             (0.1309)  (0.084)   (0.0577)   (0.052)   (0.0318)  (0.0199)
 UFM         -0.0761   0.0132    0.0218     0.0142    -0.0122   -0.0103
             (0.1315)  (0.0841)  (0.0577)   (0.0524)  (0.0319)  (0.0199)
 AUG         -0.1087   -0.0193   -0.0054    0.0401    0.0112    0.0064
             (0.0869)  (0.0574)  (0.0394)   (0.0314)  (0.0193)  (0.0121)

                  A21 = -0.5                         A22 = 1.1
 Estimator   T = 50*   T = 100   T = 200    T = 50*   T = 100   T = 200
 IV          -0.4522   -0.3435   -0.3353    0.0937    0.0665    0.0654
             (0.17)    (0.1121)  (0.0775)   (0.0677)  (0.0425)  (0.0267)
 FM          0.0942    0.1403    0.1411     -0.0298   -0.0429   -0.0379
             (0.1288)  (0.084)   (0.0578)   (0.0512)  (0.0316)  (0.0198)
 UFM         0.1489    0.2147    0.2113     -0.0513   -0.0662   -0.0573
             (0.1342)  (0.0888)  (0.0609)   (0.0529)  (0.033)   (0.0206)
 AUG         -0.1718   -0.1206   -0.1113    0.00425   0.0267    0.0245
             (0.0802)  (0.0531)  (0.0365)   (0.0292)  (0.018)   (0.0112)

Table 6 (DGP 6)
                  A11 = 0.5                          A12 = 0.1
 Estimator   T = 50    T = 100   T = 200    T = 50    T = 100   T = 200
 IV          -0.2458   -0.1986   -0.1940    0.0209    0.0188    0.0280
             (0.2055)  (0.1342)  (0.0904)   (0.1003)  (0.0580)  (0.0344)
 FM          -0.2101   -0.1715   -0.1568    0.0415    0.0376    0.0323
             (0.1696)  (0.1088)  (0.0728)   (0.0825)  (0.0470)  (0.0278)
 UFM         -0.0091   -0.1832   -0.1653    0.0461    0.0404    0.0344
             (0.1708)  (0.1093)  (0.0730)   (0.0832)  (0.0472)  (0.0279)
 AUG         -0.0419   -0.0153   -0.0070    -0.0121   -0.0072   -0.0047
             (0.1165)  (0.0763)  (0.0511)   (0.052)   (0.0303)  (0.0179)

                  A21 = -0.5                         A22 = 1.1
 Estimator   T = 50    T = 100   T = 200    T = 50    T = 100   T = 200
 IV          -0.0625   -0.0449   -0.0464    -0.0523   -0.0309   -0.0142
             (0.2540)  (0.1654)  (0.1113)   (0.1195)  (0.0694)  (0.0411)
 FM          -0.1917   -0.1689   -0.1668    0.0059    0.0178    0.0228
             (0.2027)  (0.1296)  (0.0868)   (0.0955)  (0.0545)  (0.0322)
 UFM         -0.2078   -0.1740   -0.1680    0.0056    0.0165    0.0220
             (0.2045)  (0.1300)  (0.0869)   (0.0963)  (0.0547)  (0.0323)
 AUG         0.0873    0.0939    0.0941     -0.0533   -0.0374   -0.0294
             (0.1487)  (0.0972)  (0.0649)   (0.0652)  (0.0381)  (0.0245)

Table 7 (DGP 7)
                  A11 = 0.5                          A12 = 0.1
 Estimator   T = 50    T = 100   T = 200    T = 50    T = 100   T = 200
 IV          -0.2192   -0.1663   -0.1626    0.0161    0.0117    0.0215
             (0.2128)  (0.1368)  (0.0914)   (0.1135)  (0.0649)  (0.0383)
 FM          -0.1552   -0.1135   -0.0983    0.0340    0.0288    0.0218
             (0.1664)  (0.1049)  (0.0696)   (0.0879)  (0.0494)  (0.0290)
 UFM         -0.1761   -0.1266   -0.1080    0.0399    0.0323    0.0243
             (0.1679)  (0.1055)  (0.0699)   (0.0888)  (0.0497)  (0.0292)
 AUG         -0.0263   0.0011    0.0097     -0.0151   -0.0096   -0.0079
             (0.1079)  (0.0694)  (0.0463)   (0.0509)  (0.0293)  (0.0171)

                  A21 = -0.5                         A22 = 1.1
 Estimator   T = 50    T = 100   T = 200    T = 50    T = 100   T = 200
 IV          -0.0570   -0.0335   -0.0360    -0.0525   -0.0338   -0.0164
             (0.2552)  (0.1636)  (0.1095)   (0.1310)  (0.0752)  (0.0443)
 FM          -0.1314   -0.1053   -0.1035    -0.0033   0.0073    0.0110
             (0.1955)  (0.1228)  (0.0818)   (0.0996)  (0.0562)  (0.0330)
 UFM         -0.1515   -0.1143   -0.1083    -0.0015   0.0076    0.0113
             (0.1975)  (0.1234)  (0.0820)   (0.1007)  (0.0565)  (0.0331)
 AUG         0.0841    0.0927    0.0940     -0.0522   -0.0362   -0.0292
             (0.1380)  (0.0887)  (0.0591)   (0.0639)  (0.0369)  (0.0216)

Table 8 (DGP 8)
                  A11 = 0.5                          A12 = 0.1
 Estimator   T = 50    T = 100   T = 200    T = 50    T = 100   T = 200
 IV          -0.2979   -0.2216   -0.2043    0.0779    0.0542    0.0475
             (0.2071)  (0.1370)  (0.0914)   (0.122)   (0.0759)  (0.0452)
 FM          -0.0812   -0.0316   -0.0211    0.0122    0.0020    0.0004
             (0.1573)  (0.1016)  (0.0674)   (0.0919)  (0.0560)  (0.0333)
 UFM         -0.0929   -0.0424   -0.0324    0.011     0.0024    0.0015
             (0.1585)  (0.1022)  (0.0678)   (0.0926)  (0.0563)  (0.0335)
 AUG         -0.0873   -0.0326   -0.0176    0.0263    0.0114    0.0069
             (0.097)   (0.0613)  (0.0418)   (0.0503)  (0.0304)  (0.0178)

                  A21 = -0.5                         A22 = 1.1
 Estimator   T = 50    T = 100   T = 200    T = 50    T = 100   T = 200
 IV          -0.2855   -0.2196   -0.2109    0.0530    0.0383    0.0396
             (0.2117)  (0.1393)  (0.0928)   (0.1238)  (0.0768)  (0.0457)
 FM          0.0027    0.0440    0.0467     -0.0264   -0.0286   -0.0229
             (0.1556)  (0.0999)  (0.0662)   (0.0911)  (0.0553)  (0.0328)
 UFM         -0.0039   0.0331    0.0330     -0.0313   -0.0295   -0.0220
             (0.1571)  (0.1006)  (0.0666)   (0.0919)  (0.0556)  (0.0330)
 AUG         -0.0737   -0.0296   -0.0181    0.0137    0.0036    0.0028
             (0.0954)  (0.0036)  (0.0411)   (0.0495)  (0.0299)  (0.0176)

Table 9 (DGP 9)
                  A11 = 0.5                          A12 = 0.1
 Estimator   T = 50*   T = 100   T = 200    T = 50*   T = 100   T = 200
 IV          -0.3659   -0.2295   -0.2055    0.1135    0.0605    0.0487
             (0.2120)  (0.1365)  (0.0912)   (0.1222)  (0.0745)  (0.0450)
 FM          -0.0957   -0.0355   -0.0220    0.0268    0.0028    0.0012
             (0.1611)  (0.1011)  (0.0672)   (0.0922)  (0.0551)  (0.0331)
 UFM         -0.1100   -0.0470   -0.0334    0.0285    0.0035    0.0024
             (0.1622)  (0.1017)  (0.0676)   (0.0929)  (0.0554)  (0.0333)
 AUG         -0.0965   -0.0368   -0.0204    0.0443    0.0114    0.0071
             (0.1040)  (0.0625)  (0.0414)   (0.0518)  (0.0299)  (0.0177)

                  A21 = -0.5                         A22 = 1.1
 Estimator   T = 50*   T = 100   T = 200    T = 50*   T = 100   T = 200
 IV          -0.3751   -0.2311   -0.2128    0.0959    0.0458    0.0406
             (0.2162)  (0.1385)  (0.0924)   (0.1238)  (0.0754)  (0.0454)
 FM          0.0062    0.0384    0.0432     -0.0182   -0.0275   -0.0213
             (0.1581)  (0.0995)  (0.0660)   (0.0908)  (0.0544)  (0.0326)
 UFM         -0.0033   0.0267    0.0297     -0.0197   -0.0280   -0.0205
             (0.1594)  (0.1002)  (0.0664)   (0.0916)  (0.0547)  (0.0328)
 AUG         -0.0734   -0.0280   -0.0209    0.0273    0.0032    0.0030
             (0.1027)  (0.0547)  (0.0407)   (0.0512)  (0.0294)  (0.0174)

Table 10 (DGP 10)
                  A11 = 0.5                          A12 = 0.1
 Estimator   T = 50    T = 100   T = 200    T = 50    T = 100   T = 200
 IV          -0.2991   -0.2276   -0.2100    0.0762    0.0548    0.0484
             (0.1758)  (0.1185)  (0.0813)   (0.0779)  (0.0487)  (0.0295)
 FM          -0.0717   -0.0252   -0.0143    0.0099    -0.0002   -0.0002
             (0.1340)  (0.0882)  (0.0601)   (0.0590)  (0.0361)  (0.0218)
 UFM         -0.0815   -0.0345   -0.0245    0.0083    -0.0003   0.0006
             (0.1350)  (0.0886)  (0.0604)   (0.0594)  (0.0363)  (0.0219)
 AUG         -0.0140   0.0293    0.0435     0.0091    -0.0033   -0.0061
             (0.1067)  (0.0707)  (0.0479)   (0.0441)  (0.0270)  (0.0161)

                  A21 = -0.5                         A22 = 1.1
 Estimator   T = 50    T = 100   T = 200    T = 50    T = 100   T = 200
 IV          -0.2969   -0.2364   -0.2289    0.0527    0.0408    0.0429
             (0.1827)  (0.1221)  (0.0837)   (0.0804)  (0.0500)  (0.0302)
 FM          0.0478    0.0856    0.0904     -0.0349   -0.0368   -0.0304
             (0.1316)  (0.0858)  (0.0584)   (0.0579)  (0.0352)  (0.0212)
 UFM         0.0458    0.0768    0.0770     -0.0413   -0.0388   -0.0299
             (0.1332)  (0.0864)  (0.0587)   (0.0586)  (0.0354)  (0.0213)
 AUG         -0.0071   0.0276    0.0384     -0.0026   -0.0100   -0.0092
             (0.1048)  (0.0694)  (0.0470)   (0.0433)  (0.0265)  (0.0158)

Table 11 (DGP 11)
                  A11 = 0.5                          A12 = 0.1
 Estimator   T = 50    T = 100   T = 200    T = 50    T = 100   T = 200
 IV          -0.2962   -0.2303   -0.2100    0.0736    0.0581    0.0490
             (0.1783)  (0.1182)  (0.0811)   (0.0832)  (0.0478)  (0.0293)
 FM          -0.0647   -0.0288   -0.0124    0.0037    0.0009    -0.0007
             (0.1361)  (0.0881)  (0.0599)   (0.0631)  (0.0355)  (0.0216)
 UFM         -0.0733   -0.0388   -0.0224    0.0006    0.0011    0.0001
             (0.1370)  (0.0886)  (0.0602)   (0.0636)  (0.0357)  (0.0217)
 AUG         -0.0142   0.0254    0.0445     0.0066    -0.0023   -0.0069
             (0.1077)  (0.0706)  (0.0476)   (0.0465)  (0.0267)  (0.0161)

                  A21 = -0.5                         A22 = 1.1
 Estimator   T = 50    T = 100   T = 200    T = 50    T = 100   T = 200
 IV          -0.2908   -0.2440   -0.2297    0.0485    0.0454    0.0433
             (0.1842)  (0.1219)  (0.0832)   (0.0854)  (0.0491)  (0.0300)
 FM          0.0581    0.0814    0.0902     -0.0449   -0.0357   -0.0301
             (0.1332)  (0.0858)  (0.0581)   (0.0618)  (0.0346)  (0.0210)
 UFM         0.0594    0.0717    0.0769     -0.0534   -0.0373   -0.0296
             (0.1349)  (0.0863)  (0.0584)   (0.0626)  (0.0348)  (0.0211)
 AUG         -0.0084   0.0233    0.0395     -0.0069   -0.0093   -0.0100
             (0.1057)  (0.0693)  (0.0467)   (0.0456)  (0.0262)  (0.0157)

Table 12 (DGP 12)