GMM Estimation from Incomplete and Rotating Panels


Pedro Albarran∗ and Manuel Arellano†

September 2008

1 Introduction

We consider the general problem of estimation and testing from a sequence of overlapping moment conditions generated by incomplete or rotating panel data. The crucial idea of our suggested method is to separate the problem of moment choice from that of estimation of optimal instruments. In this way, we are able to form optimal combinations of all the moment conditions generated by incomplete or rotating panels without experiencing an uncontrolled increase in the number of first-stage coefficients. Our estimators are “GMM estimators” only in the Sargan–Hansen sense of setting to zero linear combinations of orthogonality conditions, but not in the sense of minimizing a quadratic form in all the available moments. Rather, we form direct estimates of individual-specific optimal instruments, pooling all the information available in the sample.

2 Model and Estimator

Assumptions and Notation Consider a vector stochastic process $\{w_t\}_{t=-\infty}^{\infty}$ such that the joint distribution of a given time series $w^j = (w_{(t_0+1)}, \ldots, w_{(t_0+T)})$ satisfies $r_j$ moment conditions

$$E\,\psi^j(w^j, \theta) = 0, \qquad (1)$$

∗ Universidad Carlos III, Madrid
† CEMFI, Madrid


where $j$ is an index for the pair $(t_0, T)$ and $\theta$ is a vector of unknown coefficients of order $k$.¹ Moreover, let $V^j = E[\psi^j(w^j,\theta)\,\psi^j(w^j,\theta)']$, $D^j = E[\Upsilon^j(w^j,\theta)]$, $\Upsilon^j(w^j,\theta) = \partial\psi^j(w^j,\theta)/\partial\theta'$, and $\Pi^j = D^{j\prime}(V^j)^{-1}$.

The data consist of independent observations on $N$ cross-sectional units $\{w_1^{j(1)}, \ldots, w_N^{j(N)}\}$, where $j(i)$ is the value of $j$ for the $i$-th unit, which is independent of $w_i^{j(i)}$. Thus, two units with the same value of $j$ have identical initial periods and time series length. The index $j$ takes on values in the set $\{1, 2, \ldots, J\}$.

Let $(t_{0i}, T_i)$ be the pair that corresponds to $j(i)$. The observed variables for individual $i$ are therefore $\{w_{i(t_{0i}+1)}, \ldots, w_{i(t_{0i}+T_i)}\}$. Any $w_{it}$ with $t \le t_{0i}$ or $t > t_{0i} + T_i$ is well defined but regarded as a missing or latent variable.

Let $\psi^j_\ell$ be the $\ell$-th component of $\psi^j(w^j,\theta)$ and let $\iota^j_{\ell i}$ be an indicator of whether $\psi^j_\ell$ is observed for individual $i$ (for given $\theta$). Moreover, let $I^j_i$ be a diagonal matrix of order $r_j$ whose $\ell$-th element is given by $\iota^j_{\ell i}$. Note that $\psi^j(w^j,\theta)$ is observed for individual $i$ when $j = j(i)$ (i.e. $I^{j(i)}_i$ is an identity matrix), but some of its elements may still be observable even if $j \ne j(i)$. If $\iota^j_{\ell i} = 1$ for $j \ne j(i)$, then $E\,\psi^j_\ell(w^j,\theta) = 0$ is a redundant moment given those in $E\,\psi^{j(i)}(w^{j(i)},\theta) = 0$. For example, the entire vector $\psi^j(w^j,\theta)$ could just be a subset of $\psi^{j(i)}(w^{j(i)},\theta)$. This assumption is just a coherency requirement, because in its absence the distribution of $w^j$ would satisfy more moment conditions than those stated in (1).

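Estimation We consider cross-sample (or multisample) estimators $\hat\theta$ that solve

$$\sum_{i=1}^{N} \hat\Pi^{j(i)}\,\psi^{j(i)}\big(w_i^{j(i)}, \hat\theta\big) = 0 \qquad (2)$$

where $\hat\Pi^j$ is an estimator of $\Pi^j$ based on a preliminary consistent estimate $\tilde\theta$ as follows:

$$\hat\Pi^j = \hat D^{j\prime}\big(\hat V^j\big)^{-1}$$

$$\hat D^j = \Big(\sum_{i=1}^{N} I_i^j\Big)^{-1} \sum_{i=1}^{N} I_i^j\,\tilde\Upsilon_i^j$$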

¹ The vector of moments for a given $j$ may effectively depend on only some of the components of $\theta$. We regard $\theta$ as the full parameter vector for all relevant $j$.


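$$\mathrm{vec}\big(\hat V^j\big) = \Big(\sum_{i=1}^{N} I_i^j \otimes I_i^j\Big)^{-1} \mathrm{vec}\Big(\sum_{i=1}^{N} I_i^j\,\tilde\psi_i^j\,\tilde\psi_i^{j\prime}\,I_i^j\Big)$$

and $\tilde\psi_i^j = \psi^j(w_i^j, \tilde\theta)$ and $\tilde\Upsilon_i^j = \Upsilon^j(w_i^j, \tilde\theta)$. Note that in these expressions $j$ need not coincide with $j(i)$, so that some or all of the components in $\tilde\psi_i^j$ or $\tilde\Upsilon_i^j$ may be latent variables.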
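As a rough illustration of the element-by-element averaging in $\hat V^j$: because $I_i^j \otimes I_i^j$ is diagonal, entry $(\ell, k)$ of $\hat V^j$ is the average of $\tilde\psi^j_{\ell i}\tilde\psi^j_{k i}$ over the units for which both components are observed. The sketch below assumes the moment contributions are stored in an N × r array with a matching boolean observability array; the function name `pairwise_second_moments`, the array layout and the toy data are illustrative assumptions, not part of the paper.

```python
import numpy as np

def pairwise_second_moments(psi, observed):
    """Element-wise estimate of V: entry (l, k) averages psi_l * psi_k
    over the units for which both components are observed.

    psi      : (N, r) array of moment contributions; entries for unobserved
               components may hold any value, including NaN
    observed : (N, r) boolean array of observability indicators
    """
    iota = observed.astype(float)
    psi0 = np.where(observed, psi, 0.0)   # zero out latent components
    counts = iota.T @ iota                # counts[l, k] = units observing both
    if np.any(counts == 0):
        raise ValueError("some pair of moments is never jointly observed")
    return (psi0.T @ psi0) / counts

# Hypothetical toy data: the second unit does not observe the second moment.
psi = np.array([[1.0, 2.0], [3.0, np.nan], [5.0, 6.0]])
observed = np.array([[True, True], [True, False], [True, True]])
V_hat = pairwise_second_moments(psi, observed)
```

The same pattern (sum over available units, divide by the count of available units) applies row by row to $\hat D^j$.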

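A computationally convenient form of extremum estimator for this problem is

$$\hat\theta = \arg\min_{c\in\Theta}\; \sum_{i=1}^{N} \psi^{j(i)}\big(w_i^{j(i)}, c\big)'\,\hat\Pi^{j(i)\prime} \Big[\sum_{i=1}^{N} \hat\Pi^{j(i)}\,\tilde\psi_i^{j(i)}\,\tilde\psi_i^{j(i)\prime}\,\hat\Pi^{j(i)\prime}\Big]^{-1} \sum_{i=1}^{N} \hat\Pi^{j(i)}\,\psi^{j(i)}\big(w_i^{j(i)}, c\big).$$

Another possibility is given by

$$\hat\theta = \arg\min_{c\in\Theta}\; \sum_{i=1}^{N} \psi^{j(i)}\big(w_i^{j(i)}, c\big)'\,\hat\Pi^{j(i)\prime} \Big[\sum_{i=1}^{N} \hat D^{j(i)\prime}\big(\hat V^{j(i)}\big)^{-1}\hat D^{j(i)}\Big]^{-1} \sum_{i=1}^{N} \hat\Pi^{j(i)}\,\psi^{j(i)}\big(w_i^{j(i)}, c\big).$$

Note that the weight matrices in these two expressions are different, but nevertheless the estimators coincide because the number of effective moments is the same as the number of parameters.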

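Asymptotic Normality Taking a first-order expansion of (2) scaled by $N^{-1/2}$ around the true value we have

$$0 = \frac{1}{\sqrt N}\sum_{i=1}^{N} \hat\Pi^{j(i)}\,\psi^{j(i)}\big(w_i^{j(i)}, \theta\big) + \frac{1}{N}\sum_{i=1}^{N} \hat\Pi^{j(i)}\,\frac{\partial\psi^{j(i)}\big(w_i^{j(i)}, \theta\big)}{\partial c'}\,\sqrt N\,(\hat\theta - \theta) + o_p(1).$$

Moreover, under the assumption that for all $j$ $\hat\Pi^j \xrightarrow{p} \Pi^j$ as $N \to \infty$,

$$-E\big(\Pi^{j(i)} D^{j(i)}\big)\,\sqrt N\,(\hat\theta - \theta) = \frac{1}{\sqrt N}\sum_{i=1}^{N} \Pi^{j(i)}\,\psi^{j(i)}\big(w_i^{j(i)}, \theta\big) + o_p(1) \xrightarrow{d} N\big[0,\; E\big(\Pi^{j(i)} V^{j(i)} \Pi^{j(i)\prime}\big)\big].$$

Finally, since $\Pi^j = D^{j\prime}(V^j)^{-1}$, we have

$$\sqrt N\,(\hat\theta - \theta) \xrightarrow{d} N(0, W) \qquad (3)$$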


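where

$$W = \Big[E\Big(D^{j(i)\prime}\big(V^{j(i)}\big)^{-1} D^{j(i)}\Big)\Big]^{-1}, \qquad (4)$$

which can be consistently estimated as

$$\hat W = \Big[\frac{1}{N}\sum_{i=1}^{N} \hat D^{j(i)\prime}\big(\hat V^{j(i)}\big)^{-1}\hat D^{j(i)}\Big]^{-1}. \qquad (5)$$

Note that an alternative, equivalent expression for $W$ is

$$W = \Big[\sum_{j=1}^{J} D^{j\prime}\big(V^j\big)^{-1} D^j\,\Pr(j)\Big]^{-1}. \qquad (6)$$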
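The equivalence between averaging over units, as in (5), and weighting each type $j$ by its frequency, as in (6), can be checked numerically. The sketch below is only an illustration: it assumes the matrices $\hat D^j$ and $\hat V^j$ are stored in dictionaries keyed by $j$, uses sample shares in place of $\Pr(j)$, and fills the dictionaries with made-up numbers. The function names are hypothetical.

```python
import numpy as np

def w_hat_by_units(D, V, j_of_i):
    """Equation (5): average D' V^{-1} D over units, then invert."""
    k = D[j_of_i[0]].shape[1]
    info = np.zeros((k, k))
    for j in j_of_i:
        info += D[j].T @ np.linalg.solve(V[j], D[j])
    return np.linalg.inv(info / len(j_of_i))

def w_hat_by_types(D, V, j_of_i):
    """Sample analogue of (6): weight each type j by its sample share."""
    types, counts = np.unique(j_of_i, return_counts=True)
    k = D[int(types[0])].shape[1]
    info = np.zeros((k, k))
    for j, n in zip(types, counts):
        j = int(j)
        info += (n / len(j_of_i)) * D[j].T @ np.linalg.solve(V[j], D[j])
    return np.linalg.inv(info)

# Made-up inputs: two types with 3 and 2 moments, k = 2 parameters.
rng = np.random.default_rng(0)
D = {0: rng.normal(size=(3, 2)), 1: rng.normal(size=(2, 2))}
V = {0: 2.0 * np.eye(3), 1: 3.0 * np.eye(2)}
j_of_i = [0, 0, 1, 0, 1]

W_units = w_hat_by_units(D, V, j_of_i)
W_types = w_hat_by_types(D, V, j_of_i)
```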

3 Linear Models with Fixed Effects and Predetermined Variables

A leading situation in the panel context is one in which moments are obtained as orthogonality conditions between a transformed disturbance and lagged values of a vector of conditioning variables. In a linear model, we have

$$y_{it} = x_{it}'\theta + \eta_i + v_{it}, \qquad E(z_{is} v_{it}) = 0 \quad (s \le t)$$

where $\eta_i$ is a fixed effect and $z_{is}$ is a vector of predetermined instruments. Letting $w_t = (y_t, x_t', z_t')'$, the time series $w^j = (w_{(t_0+1)}, \ldots, w_{(t_0+T)})$ implies the moment conditions

$$E\left[\begin{pmatrix} z_{(t_0+1)} \\ \vdots \\ z_{(t_0+t)} \end{pmatrix}\big(y^*_{(t_0+t)} - x^{*\prime}_{(t_0+t)}\theta\big)\right] = 0 \qquad (t = 1, \ldots, T-1)$$

where $v^*_{(t_0+t)}$ denotes forward orthogonal deviations (Arellano and Bover, 1995):

$$v^*_{(t_0+t)} = \Big(\frac{T-t}{T-t+1}\Big)^{1/2}\Big[v_{(t_0+t)} - \frac{1}{T-t}\big(v_{(t_0+t+1)} + \cdots + v_{(t_0+T)}\big)\Big].$$
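The forward orthogonal deviations transformation can be written in a few lines. The sketch below is a minimal illustration for a single unit's complete series of length $T$; the function name is an assumption. It returns the $T-1$ transformed values, each of which subtracts the mean of the remaining future observations and rescales.

```python
import numpy as np

def forward_orthogonal_deviations(v):
    """Return the T-1 forward orthogonal deviations of a length-T series."""
    v = np.asarray(v, dtype=float)
    T = v.shape[0]
    out = np.empty(T - 1)
    for t in range(1, T):                      # t = 1, ..., T-1
        future_mean = v[t:].mean()             # mean of v_{t+1}, ..., v_T
        scale = np.sqrt((T - t) / (T - t + 1))
        out[t - 1] = scale * (v[t - 1] - future_mean)
    return out

v = np.array([1.0, 2.0, 3.0])
v_star = forward_orthogonal_deviations(v)
# Adding a constant (such as a fixed effect) leaves the result unchanged.
v_star_shifted = forward_orthogonal_deviations(v + 10.0)
```

Two properties motivate the transformation: any time-invariant effect is removed, and the rows of the implied $(T-1) \times T$ transformation matrix are orthonormal, so serially uncorrelated homoskedastic errors remain so after transformation.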


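In a more compact notation, we can write

$$\psi^j(w^j, \theta) = Z^{j\prime}\big(y^{j*} - X^{j*}\theta\big) \equiv Z^{j\prime} v^{j*}$$

$$D^j = -E\big(Z^{j\prime} X^{j*}\big)$$

$$V^j = E\big(Z^{j\prime} v^{j*} v^{j*\prime} Z^j\big)$$

$$\Pi^j = -E\big(Z^{j\prime} X^{j*}\big)'\,\big[E\big(Z^{j\prime} v^{j*} v^{j*\prime} Z^j\big)\big]^{-1}$$

where $y^{j*} = \big(y^*_{(t_0+1)}, \ldots, y^*_{(t_0+T-1)}\big)'$, etc.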

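Since $\psi^j(w^j, \theta)$ is linear in $\theta$, the cross-sample estimator has a closed-form expression given by

$$\hat\theta = \Big(\sum_{i=1}^{N} \hat\Pi^{j(i)} Z_i^{j(i)\prime} X_i^{j(i)*}\Big)^{-1} \sum_{i=1}^{N} \hat\Pi^{j(i)} Z_i^{j(i)\prime} y_i^{j(i)*}$$

where $\hat\Pi^j = \hat D^{j\prime}\big(\hat V^j\big)^{-1}$ and

$$\hat D^j = -\Big(\sum_{i=1}^{N} I_i^j\Big)^{-1} \sum_{i=1}^{N} I_i^j\,Z_i^{j\prime} X_i^{j*}.$$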

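A one-step choice of $\hat V^j$ is

$$\mathrm{vec}\big(\hat V^j_{I}\big) = \Big(\sum_{i=1}^{N} I_i^j \otimes I_i^j\Big)^{-1} \mathrm{vec}\Big(\sum_{i=1}^{N} I_i^j\,Z_i^{j\prime} Z_i^{j}\,I_i^j\Big),$$

and a two-step choice

$$\mathrm{vec}\big(\hat V^j_{II}\big) = \Big(\sum_{i=1}^{N} I_i^j \otimes I_i^j\Big)^{-1} \mathrm{vec}\Big(\sum_{i=1}^{N} I_i^j\,Z_i^{j\prime}\,\tilde v_i^{j*}\,\tilde v_i^{j*\prime}\,Z_i^{j}\,I_i^j\Big)$$

where $\tilde v_i^{j*}$ denotes one-step residuals.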

4 Comparisons with Alternative Estimators

In this section we compare the previous cross-sample GMM estimator $\hat\theta$ with two alternative estimators. The first one is a pooled GMM estimator based on the union of the available sample moments. The second is an expanded GMM estimator that minimizes the sum of GMM criteria for each balanced subpanel. We find that pooled (or stacked) GMM is generally inefficient relative to $\hat\theta$, and that expanded GMM, while asymptotically equivalent to $\hat\theta$, is based on a much larger number of first-stage coefficients than $\hat\theta$. The implication is that expanded GMM is less robust than $\hat\theta$ to alternative asymptotic plans, and is likely to exhibit poor finite sample properties.

4.1 Nonredundant Moments

Let $\psi(w, \theta)$ be a vector of dimension $r$ containing the total number of nonredundant moments spanned by the $J$ different time series available:

$$\psi(w, \theta) = \bigcup_{j=1}^{J} \psi^j(w^j, \theta).$$

Note that $\psi(w, \theta)$ need not correspond to the moment implications from the distribution of any single time series (e.g. the moment implications from a rotating panel of overlapping time series of four periods each, covering twenty periods in total, will differ from those of a complete twenty-period panel).

The construction of $\psi(w, \theta)$ can be approached as follows. Let $j_1$ be an index for $(t_0^1, T^1)$ corresponding to the longest time series among those with the earliest start, so that

$$t_0^1 = \min(t_{0i}), \qquad T^1 = \max\big(T_i \mid t_{0i} = t_0^1\big),$$

and let $\psi^{j_1}(w^{j_1}, \theta)$ be the moments associated with such time series. Next, let $t_0^2$ be the earliest start for a time series going beyond $t_0^1 + T^1$:

$$t_0^2 = \min\big(t_{0i} \mid t_{0i} + T_i > t_0^1 + T^1\big)$$

and

$$T^2 = \max\big(T_i \mid t_{0i} = t_0^2\big).$$

Form $j_2 \equiv (t_0^2, T^2)$ and $\psi^{j_2}(w^{j_2}, \theta)$, and consider the partition

$$\psi^{j_2}(w^{j_2}, \theta) = \begin{pmatrix} \psi^{j_2 a}(w^{j_2}, \theta) \\ \psi^{j_2 b}(w^{j_2}, \theta) \end{pmatrix},$$

such that $\psi^{j_2 a}(w^{j_2}, \theta)$ is observable to $j_1$ individuals but $\psi^{j_2 b}(w^{j_2}, \theta)$ is not. Then form

$$\psi^{[2]}(w, \theta) = \begin{pmatrix} \psi^{j_1}(w^{j_1}, \theta) \\ \psi^{j_2 b}(w^{j_2}, \theta) \end{pmatrix}.$$

Next, consider

$$t_0^3 = \min\big(t_{0i} \mid t_{0i} + T_i > t_0^2 + T^2\big), \qquad T^3 = \max\big(T_i \mid t_{0i} = t_0^3\big),$$

get $j_3 \equiv (t_0^3, T^3)$ and form

$$\psi^{[3]}(w, \theta) = \begin{pmatrix} \psi^{j_1}(w^{j_1}, \theta) \\ \psi^{j_2 b}(w^{j_2}, \theta) \\ \psi^{j_3 b}(w^{j_3}, \theta) \end{pmatrix}$$

where $\psi^{j_3 b}(w^{j_3}, \theta)$ is the subset of $\psi^{j_3}(w^{j_3}, \theta)$ that is not observed by the $j_1$ or $j_2$ individuals. Moments are accumulated in this way until we get a $\psi^{[\ell]}(w, \theta)$ such that $t_0^\ell + T^\ell = \max(t_{0i} + T_i)$, which then coincides with the full vector of nonredundant moments $\psi(w, \theta)$.
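The selection of the sequence $j_1, j_2, \ldots$ described above can be sketched as a short loop over the observed $(t_0, T)$ pairs. This is only an illustration under the assumption that each subpanel is summarized by its pair $(t_0, T)$; the function name `nonredundant_chain` is hypothetical, and the sketch does not carry out the split of each $\psi^{j}$ into its already-observed and new parts.

```python
def nonredundant_chain(panels):
    """Select (t0^1, T^1), (t0^2, T^2), ... from the observed (t0, T) pairs."""
    panels = sorted(set(panels))
    last_end = max(t0 + T for t0, T in panels)

    # Longest series among those with the earliest start.
    t0 = min(p[0] for p in panels)
    T = max(p[1] for p in panels if p[0] == t0)
    chain = [(t0, T)]

    # Keep adding the earliest-starting series that reaches beyond the
    # current end, taking the longest one with that start.
    while t0 + T < last_end:
        reach = t0 + T
        t0 = min(p[0] for p in panels if p[0] + p[1] > reach)
        T = max(p[1] for p in panels if p[0] == t0)
        chain.append((t0, T))
    return chain

# Hypothetical rotating design: four-period series starting every two periods,
# plus two shorter series that add no new periods.
chain = nonredundant_chain([(0, 4), (2, 4), (4, 4), (0, 2), (2, 2)])
```

Each step strictly extends the last covered period, so the loop stops once the final period in the sample is reached.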

4.2 Pooled GMM

We can form $\psi_i(c) = \psi(w_i, c)$ for each $i$, despite the fact that there could be no single individual in the sample for whom the entire vector $\psi_i(c)$ is observable. Define an $r \times r$ diagonal matrix $I_i$ of indicators of observability of the components of $\psi(w, \theta)$ for individual $i$. A pooled GMM estimator is given by

$$\hat\theta_p = \arg\min_{c\in\Theta}\; \Big[\sum_{i=1}^{N} I_i\,\psi(w_i, c)\Big]' \Big[\sum_{i=1}^{N} I_i\,\psi(w_i, \tilde\theta)\,\psi(w_i, \tilde\theta)'\,I_i\Big]^{-1} \Big[\sum_{i=1}^{N} I_i\,\psi(w_i, c)\Big].$$

An example of this method is the unbalanced panel estimator for dynamic linear models proposed in Arellano and Bond (1991).

Following standard GMM theory, the asymptotic variance matrix of the estimation error $\sqrt N\,(\hat\theta_p - \theta)$ is

$$\mathrm{Var}(\hat\theta_p) = \big(D' V^{-1} D\big)^{-1}$$

where

$$D = E\Big[I_i\,\frac{\partial\psi(w_i, \theta)}{\partial c'}\Big] = E(I_i)\,E\Big[\frac{\partial\psi(w_i, \theta)}{\partial c'}\Big]$$

$$V = E\big[I_i\,\psi(w_i, \theta)\,\psi(w_i, \theta)'\,I_i\big].$$

4.3 Expanded GMM: Minimizing the sum of GMM criteria for each balanced subpanel

On the other hand, letting $d_{ki} = 1[j(i) = k]$, we can consider GMM estimation based on the list of moments:

$$\psi^\dagger(w_i, c) = \begin{pmatrix} d_{1i}\,\psi^1(w_i^1, c) \\ \vdots \\ d_{Ji}\,\psi^J(w_i^J, c) \end{pmatrix}$$

which leads to the estimator

$$\hat\theta_s = \arg\min_{c\in\Theta}\; \sum_{j=1}^{J} \Big[\sum_{i=1}^{N} d_{ji}\,\psi^j(w_i^j, c)\Big]' \Big[\sum_{i=1}^{N} d_{ji}\,\psi^j(w_i^j, \tilde\theta)\,\psi^j(w_i^j, \tilde\theta)'\Big]^{-1} \Big[\sum_{i=1}^{N} d_{ji}\,\psi^j(w_i^j, c)\Big]$$

with first-order conditions

$$\sum_{j=1}^{J}\sum_{i=1}^{N} d_{ji}\,\tilde\Pi(c)^j\,\psi^j(w_i^j, c) = 0$$

where

$$\tilde\Pi(c)^j = \Big[\sum_{i=1}^{N} d_{ji}\,\frac{\partial\psi^j(w_i^j, c)}{\partial c'}\Big]' \Big[\sum_{i=1}^{N} d_{ji}\,\psi^j(w_i^j, \tilde\theta)\,\psi^j(w_i^j, \tilde\theta)'\Big]^{-1}$$

or

$$\sum_{i=1}^{N} \tilde\Pi(c)^{j(i)}\,\psi^{j(i)}\big(w_i^{j(i)}, c\big) = 0 \qquad (7)$$

Note that (7) differs in two ways from (2). Firstly, the estimate of $\Pi$ in (2) is kept fixed, but more importantly, $\tilde\Pi(c)^j$ is estimated using only observations with $d_{ji} = 1$, whereas the component matrices of $\hat\Pi^j$ are estimated element-by-element using all the observations available in each case.

As long as $\mathrm{plim}_{N\to\infty} N^{-1}\sum_{i=1}^{N} d_{ji} > 0$ for all $j$, $\hat\theta_s$ and $\hat\theta$ are asymptotically equivalent, although their finite sample properties may be very different, especially if $J$ is large, some $N^{-1}\sum_{i=1}^{N} d_{ji}$ are small, but there is considerable overlap among individual time series for different values of $j$.

Let $N^j = \sum_{i=1}^{N} d_{ji}$ be the number of individuals for which we observe a time series with the length and origin specified by $j$. Let $N^j_{\ell k} = \sum_{i=1}^{N} \iota^j_{\ell i}\,\iota^j_{k i}$ be the number of individuals for which moments $\psi^j_\ell$ and $\psi^j_k$ are observable. Note that $N^j_{\ell k} \ge N^j$. Standard asymptotic analysis for (7) requires that for all $j$ $\mathrm{plim}_{N\to\infty} N^j/N > 0$, whereas for (2) the requirement is the milder condition $\mathrm{plim}_{N\to\infty} N^j_{\ell k}/N > 0$.

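Example 1 As a simple example, suppose that for $j = 1, 2$ we observe $w_i^1 = \{w_{i1}, w_{i2}, w_{i3}\}$ and $w_i^2 = \{w_{i2}, w_{i3}\}$, respectively, with associated moments

$$\psi^1(w_i^1, \theta) = \begin{pmatrix} z_{i1} v_{i1} \\ z_{i1} v_{i2} \\ z_{i2} v_{i2} \\ z_{i1} v_{i3} \\ z_{i2} v_{i3} \\ z_{i3} v_{i3} \end{pmatrix}, \qquad \psi^2(w_i^2, \theta) = \begin{pmatrix} z_{i2} v_{i2} \\ z_{i2} v_{i3} \\ z_{i3} v_{i3} \end{pmatrix}.$$

Moreover, suppose that $\mathrm{plim}_{N\to\infty} N^1/N > 0$ but $N^2/N \to 0$, so that the condition for (7) does not hold. However, since $\psi^2(w_i^2, \theta)$ is also observed for individuals with $j = 1$, the requirement for (2) is still satisfied.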


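Asymptotic Efficiency Let us write the asymptotic variance of $\hat\theta_s$ and $\hat\theta$ as

$$\mathrm{Var}(\hat\theta_s) = \big(D^{\dagger\prime} V^{\dagger -1} D^\dagger\big)^{-1} \qquad (8)$$

where $D^\dagger = E\big[\partial\psi^\dagger(w_i, c)/\partial c'\big]$ and $V^\dagger = E\big[\psi^\dagger(w_i, c)\,\psi^\dagger(w_i, c)'\big]$. Equation (8) is just an alternative expression for (4) or (6). Let the dimension of $\psi^\dagger(w_i, c)$ be $r^\dagger = \sum_{j=1}^{J} r_j$. We can write

$$I_i\,\psi(w_i, c) = H\,\psi^\dagger(w_i, c)$$

where $H$ is an $r \times r^\dagger$ selection matrix ($r \le r^\dagger$). Therefore, $D = H D^\dagger$, $V = H V^\dagger H'$, and

$$\big[\mathrm{Var}(\hat\theta_s)\big]^{-1} - \big[\mathrm{Var}(\hat\theta_p)\big]^{-1} = D^{\dagger\prime} V^{\dagger -1} D^\dagger - D^{\dagger\prime} H' \big(H V^\dagger H'\big)^{-1} H D^\dagger = G'\big[I - \bar H'\big(\bar H \bar H'\big)^{-1}\bar H\big] G \ge 0$$

where $G = V^{\dagger -1/2} D^\dagger$ and $\bar H = H V^{\dagger 1/2}$. This shows that $\hat\theta_p$ is dominated by $\hat\theta_s$ in terms of asymptotic efficiency.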
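The inequality can be checked numerically for any positive definite $V^\dagger$ and any full-row-rank $H$. The sketch below uses made-up matrices purely for illustration: $V^\dagger$ and $D^\dagger$ are random draws (not derived from any model), and $H$ merges two of six expanded moments into one pooled moment, as happens when the same moment appears in two subpanels.

```python
import numpy as np

rng = np.random.default_rng(0)
r_dag, k = 6, 2

# Made-up positive definite V-dagger and arbitrary D-dagger (assumptions).
A = rng.normal(size=(r_dag, r_dag))
V_dag = A @ A.T + r_dag * np.eye(r_dag)
D_dag = rng.normal(size=(r_dag, k))

# H pools expanded moments 3 and 4 into a single moment (5 x 6).
H = np.array([
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
], dtype=float)

# Inverse asymptotic variances of the expanded and pooled estimators.
info_expanded = D_dag.T @ np.linalg.solve(V_dag, D_dag)
HD = H @ D_dag
info_pooled = HD.T @ np.linalg.solve(H @ V_dag @ H.T, HD)

# The difference should be positive semidefinite.
gap = info_expanded - info_pooled
min_eig = np.linalg.eigvalsh((gap + gap.T) / 2).min()
```

The smallest eigenvalue of the difference is nonnegative up to rounding error, reflecting that the bracketed term in the last expression is a projection matrix.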

Example 2 Suppose that for $j = 1, 2$ we observe $w_i^1 = \{w_{i1}, w_{i2}\}$ and $w_i^2 = \{w_{i2}, w_{i3}\}$, respectively, with associated moments

$$\psi^1(w_i^1, \theta) = \begin{pmatrix} z_{i1} v_{i1} \\ z_{i1} v_{i2} \\ z_{i2} v_{i2} \end{pmatrix}, \qquad \psi^2(w_i^2, \theta) = \begin{pmatrix} z_{i2} v_{i2} \\ z_{i2} v_{i3} \\ z_{i3} v_{i3} \end{pmatrix}$$

where $v_{it} = y_{it} - x_{it}'\theta$, $x_{it}$ is $k \times 1$, $z_{it}$ is $q \times 1$, and $w_{it} = (y_{it}, x_{it}', z_{it}')'$. Thus, pooled GMM is based on

$$\sum_{i=1}^{N} I_i\,\psi(w_i, \theta) = \sum_{i=1}^{N} \begin{pmatrix} d_{1i} z_{i1} v_{i1} \\ d_{1i} z_{i1} v_{i2} \\ z_{i2} v_{i2} \\ d_{2i} z_{i2} v_{i3} \\ d_{2i} z_{i3} v_{i3} \end{pmatrix}$$

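with²

$$D_N' = \Big[\sum_{i=1}^{N} I_i\,\frac{\partial\psi(w_i, \theta)}{\partial\theta'}\Big]' = -\sum_{i=1}^{N} \begin{pmatrix} d_{1i} x_{i1} z_{i1}' & d_{1i} x_{i2} z_{i1}' & x_{i2} z_{i2}' & d_{2i} x_{i3} z_{i2}' & d_{2i} x_{i3} z_{i3}' \end{pmatrix}.$$

Let us consider a one-step pooled GMM estimator with weight matrix

$$A_N = \left[\sum_{i=1}^{N} \begin{pmatrix} d_{1i} z_{i1} z_{i1}' & 0 & 0 & 0 & 0 \\ 0 & d_{1i} z_{i1} z_{i1}' & d_{1i} z_{i1} z_{i2}' & 0 & 0 \\ 0 & d_{1i} z_{i2} z_{i1}' & z_{i2} z_{i2}' & 0 & 0 \\ 0 & 0 & 0 & d_{2i} z_{i2} z_{i2}' & d_{2i} z_{i2} z_{i3}' \\ 0 & 0 & 0 & d_{2i} z_{i3} z_{i2}' & d_{2i} z_{i3} z_{i3}' \end{pmatrix}\right]^{-1}$$

so that

$$-D_N' A_N = \big(\hat\Pi_1 \;\vdots\; \hat\Pi_2^\dagger \;\vdots\; \hat\Pi_3\big)$$

where

$$\hat\Pi_1 = \sum_{i=1}^{N} d_{1i} x_{i1} z_{i1}' \Big(\sum_{i=1}^{N} d_{1i} z_{i1} z_{i1}'\Big)^{-1}$$

$$\hat\Pi_2^\dagger = \big(\hat\Pi_{21}^\dagger \;\vdots\; \hat\Pi_{22}^\dagger\big) = \Big(\sum_{i=1}^{N} d_{1i} x_{i2} z_{i1}' \quad \sum_{i=1}^{N} x_{i2} z_{i2}'\Big) \left[\sum_{i=1}^{N} \begin{pmatrix} d_{1i} z_{i1} z_{i1}' & d_{1i} z_{i1} z_{i2}' \\ d_{1i} z_{i2} z_{i1}' & z_{i2} z_{i2}' \end{pmatrix}\right]^{-1}$$

$$\hat\Pi_3 = \big(\hat\Pi_{31} \;\vdots\; \hat\Pi_{32}\big) = \Big(\sum_{i=1}^{N} d_{2i} x_{i3} z_{i2}' \quad \sum_{i=1}^{N} d_{2i} x_{i3} z_{i3}'\Big) \left[\sum_{i=1}^{N} \begin{pmatrix} d_{2i} z_{i2} z_{i2}' & d_{2i} z_{i2} z_{i3}' \\ d_{2i} z_{i3} z_{i2}' & d_{2i} z_{i3} z_{i3}' \end{pmatrix}\right]^{-1}.$$

Notice that $\hat\Pi_1$ is the regression coefficient of $x_{i1}$ on $z_{i1}$ in the $d_{1i} = 1$ subsample. As long as $\mathrm{plim}\, N^{-1}\sum_{i=1}^{N} d_{1i} > 0$, it is a consistent estimate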

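² In terms of the notation used in Arellano and Bond (1991), we have

$$\sum_{i=1}^{N} \begin{pmatrix} d_{1i} z_{i1} v_{i1} \\ d_{1i} z_{i1} v_{i2} \\ z_{i2} v_{i2} \\ d_{2i} z_{i2} v_{i3} \\ d_{2i} z_{i3} v_{i3} \end{pmatrix} = \sum_{i=1}^{N} \left[ d_{1i} \begin{pmatrix} z_{i1} & 0 \\ 0 & z_{i1} \\ 0 & z_{i2} \\ 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} v_{i1} \\ v_{i2} \end{pmatrix} + d_{2i} \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ z_{i2} & 0 \\ 0 & z_{i2} \\ 0 & z_{i3} \end{pmatrix} \begin{pmatrix} v_{i2} \\ v_{i3} \end{pmatrix} \right]$$

where

$$\begin{pmatrix} v_{i1} \\ v_{i2} \end{pmatrix} = \begin{pmatrix} y_{i1} \\ y_{i2} \end{pmatrix} - \begin{pmatrix} x_{i1}' \\ x_{i2}' \end{pmatrix}\theta.$$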


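of $\Pi_1 = E(x_{i1} z_{i1}')\,[E(z_{i1} z_{i1}')]^{-1}$. $\hat\Pi_2^\dagger$ is the regression coefficient of $x_{i2}$ on $(d_{1i} z_{i1}', z_{i2}')$ in the full sample. It is therefore a consistent estimate of

$$\Pi_2^\dagger = \big(\Pi_{21}^\dagger \;\vdots\; \Pi_{22}^\dagger\big) = \big(p_1 E(x_{i2} z_{i1}') \quad E(x_{i2} z_{i2}')\big) \begin{pmatrix} p_1 E(z_{i1} z_{i1}') & p_1 E(z_{i1} z_{i2}') \\ p_1 E(z_{i2} z_{i1}') & E(z_{i2} z_{i2}') \end{pmatrix}^{-1}$$

where $p_1 = E(d_{1i})$. Finally, $\hat\Pi_3$ is the regression coefficient of $x_{i3}$ on $(z_{i2}', z_{i3}')$ in the $d_{2i} = 1$ subsample.

First-order conditions are

$$D_N' A_N \sum_{i=1}^{N} \left[ d_{1i} \begin{pmatrix} z_{i1} & 0 \\ 0 & z_{i1} \\ 0 & z_{i2} \\ 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} v_{i1} \\ v_{i2} \end{pmatrix} + d_{2i} \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ z_{i2} & 0 \\ 0 & z_{i2} \\ 0 & z_{i3} \end{pmatrix} \begin{pmatrix} v_{i2} \\ v_{i3} \end{pmatrix} \right] = 0$$

or

$$\sum_{i=1}^{N} \left[ d_{1i} \big(\hat x_{i1} \;\vdots\; \hat x_{i2}\big) \begin{pmatrix} v_{i1} \\ v_{i2} \end{pmatrix} + d_{2i} \big(\hat x_{i2} \;\vdots\; \hat x_{i3}\big) \begin{pmatrix} v_{i2} \\ v_{i3} \end{pmatrix} \right] = 0$$

where

$$\hat x_{i1} = \hat\Pi_1 z_{i1}$$
$$\hat x_{i2} = \hat\Pi_{21}^\dagger d_{1i} z_{i1} + \hat\Pi_{22}^\dagger z_{i2}$$
$$\hat x_{i3} = \hat\Pi_{31} z_{i2} + \hat\Pi_{32} z_{i3}.$$

The estimator can be written in the general form

$$\hat\theta = \Big\{\sum_{i=1}^{N} \big[d_{1i}\big(\hat x_{i1} x_{i1}' + \hat x_{i2} x_{i2}'\big) + d_{2i}\big(\hat x_{i2} x_{i2}' + \hat x_{i3} x_{i3}'\big)\big]\Big\}^{-1} \sum_{i=1}^{N} \big[d_{1i}\big(\hat x_{i1} y_{i1} + \hat x_{i2} y_{i2}\big) + d_{2i}\big(\hat x_{i2} y_{i2} + \hat x_{i3} y_{i3}\big)\big] \qquad (9)$$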

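Expanded GMM is based on

$$\sum_{i=1}^{N} \psi^\dagger(w_i, c) = \sum_{i=1}^{N} \begin{pmatrix} d_{1i} z_{i1} v_{i1} \\ d_{1i} z_{i1} v_{i2} \\ d_{1i} z_{i2} v_{i2} \\ d_{2i} z_{i2} v_{i2} \\ d_{2i} z_{i2} v_{i3} \\ d_{2i} z_{i3} v_{i3} \end{pmatrix}$$

leading to an estimator of the same form as (9) but which uses:

$$\hat x_{i1} = \hat\Pi_1 z_{i1}$$
$$\hat x_{i2} = \tilde\Pi_{21} d_{1i} z_{i1} + \tilde\Pi_{22} d_{1i} z_{i2} + \tilde\Pi_2^* d_{2i} z_{i2}$$
$$\hat x_{i3} = \hat\Pi_{31} z_{i2} + \hat\Pi_{32} z_{i3}$$

where

$$\tilde\Pi_2 = \big(\tilde\Pi_{21} \;\vdots\; \tilde\Pi_{22}\big) = \Big(\sum_{i=1}^{N} d_{1i} x_{i2} z_{i1}' \quad \sum_{i=1}^{N} d_{1i} x_{i2} z_{i2}'\Big) \left[\sum_{i=1}^{N} \begin{pmatrix} d_{1i} z_{i1} z_{i1}' & d_{1i} z_{i1} z_{i2}' \\ d_{1i} z_{i2} z_{i1}' & d_{1i} z_{i2} z_{i2}' \end{pmatrix}\right]^{-1}$$

$$\tilde\Pi_2^* = \sum_{i=1}^{N} d_{2i} x_{i2} z_{i2}' \Big(\sum_{i=1}^{N} d_{2i} z_{i2} z_{i2}'\Big)^{-1}.$$

$\tilde\Pi_2$ and $\tilde\Pi_2^*$ are, respectively, consistent estimators of

$$\Pi_2 = \big(\Pi_{21} \;\vdots\; \Pi_{22}\big) = \big(E(x_{i2} z_{i1}') \quad E(x_{i2} z_{i2}')\big) \begin{pmatrix} E(z_{i1} z_{i1}') & E(z_{i1} z_{i2}') \\ E(z_{i2} z_{i1}') & E(z_{i2} z_{i2}') \end{pmatrix}^{-1}$$

and

$$\Pi_2^* = E(x_{i2} z_{i2}')\,[E(z_{i2} z_{i2}')]^{-1}.$$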

Cross-sample GMM uses the same form of instruments as expanded GMM, but different estimates of the first-stage coefficients:

$$\hat x_{i1} = \hat\Pi_1 z_{i1}$$
$$\hat x_{i2} = \hat\Pi_{21} d_{1i} z_{i1} + \hat\Pi_{22} d_{1i} z_{i2} + \hat\Pi_2^* d_{2i} z_{i2}$$
$$\hat x_{i3} = \hat\Pi_{31} z_{i2} + \hat\Pi_{32} z_{i3}$$

where

$$\hat\Pi_2 = \big(\hat\Pi_{21} \;\vdots\; \hat\Pi_{22}\big) = \left(\frac{\sum_i d_{1i} x_{i2} z_{i1}'}{\sum_i d_{1i}} \quad \frac{\sum_i x_{i2} z_{i2}'}{N}\right) \begin{pmatrix} \dfrac{\sum_i d_{1i} z_{i1} z_{i1}'}{\sum_i d_{1i}} & \dfrac{\sum_i d_{1i} z_{i1} z_{i2}'}{\sum_i d_{1i}} \\ \dfrac{\sum_i d_{1i} z_{i2} z_{i1}'}{\sum_i d_{1i}} & \dfrac{\sum_i z_{i2} z_{i2}'}{N} \end{pmatrix}^{-1}$$

$$= \sum_{i=1}^{N} \big(d_{1i} x_{i2} z_{i1}' \quad \bar d_1 x_{i2} z_{i2}'\big) \left[\sum_{i=1}^{N} \begin{pmatrix} d_{1i} z_{i1} z_{i1}' & d_{1i} z_{i1} z_{i2}' \\ d_{1i} z_{i2} z_{i1}' & \bar d_1 z_{i2} z_{i2}' \end{pmatrix}\right]^{-1},$$

$$\hat\Pi_2^* = \sum_{i=1}^{N} x_{i2} z_{i2}' \Big(\sum_{i=1}^{N} z_{i2} z_{i2}'\Big)^{-1},$$

and $\bar d_1 = N^{-1}\sum_{i=1}^{N} d_{1i}$.

Note that $\hat\Pi_2^*$ and $\tilde\Pi_2^*$ are both consistent for $\Pi_2^*$, but $\hat\Pi_2^*$ is obtained from the whole sample whereas $\tilde\Pi_2^*$ is based only on the $d_{2i} = 1$ subsample. Similarly, $\tilde\Pi_2$ and $\hat\Pi_2$ are both consistent for $\Pi_2$, but $\tilde\Pi_2$ only uses the $d_{1i} = 1$ subsample, whereas $\hat\Pi_2$ also uses the information from the $d_{2i} = 1$ observations when available. Thus, contrary to expanded GMM, cross-sample GMM imposes the cross-subsample restrictions on first-stage coefficients implied by the model.

Pooled GMM can be regarded as imposing the restriction

$$\Pi_{22} = \Pi_2^*$$

in its specification of the instruments. That is, it imposes the constraint that the simple regression coefficient of $x_{i2}$ on $z_{i2}$ (in the $d_{2i} = 1$ sample) coincides with the $z_{i2}$ coefficient in the multiple regression of $x_{i2}$ on $z_{i1}$ and $z_{i2}$ (in the $d_{1i} = 1$ sample). Since this restriction will only hold in special cases (if $\Pi_{21} = 0$ or if $E(z_{i1} z_{i2}') = 0$), in general pooled GMM will be asymptotically less efficient than expanded GMM or cross-sample GMM.

5 Monte Carlo Experiments

In this section, we present some experimental evidence on the finite sample performance of our proposed estimator, cross-sample GMM, and of the two competing alternatives, pooled GMM and expanded GMM.

5.1 Minimum Distance Estimation

If the moment conditions are linear, the estimation problem can be formulated as one of enforcing restrictions on a covariance matrix. Suppose that we have

$$E\big[z_s\,(y_t - x_t'\beta)\big] = 0 \qquad s \le t.$$

Let us define $\omega_{st} = E(z_s y_t)$ and $\Omega_{st} = E(z_s x_t')$, let $d_{it}$ be an indicator of whether period $t$ variables are observed for individual $i$, and for $\sum_{i=1}^{N} d_{is} d_{it} > 0$ let

$$\hat\omega_{st} = \frac{1}{\sum_{i=1}^{N} d_{is} d_{it}} \sum_{i=1}^{N} d_{is} d_{it}\,z_{is}\,y_{it}$$

$$\hat\Omega_{st} = \frac{1}{\sum_{i=1}^{N} d_{is} d_{it}} \sum_{i=1}^{N} d_{is} d_{it}\,z_{is}\,x_{it}'.$$
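The pairwise-available sample moments $\hat\omega_{st}$ and $\hat\Omega_{st}$ average over exactly those units that are observed in both periods $s$ and $t$. A minimal sketch follows, assuming one array per period with one row per unit; the function name `pairwise_cross_moment` and the toy data are illustrative assumptions.

```python
import numpy as np

def pairwise_cross_moment(z_s, a_t, d_s, d_t):
    """Average of z_is * a_it' over units observed in both periods s and t.

    z_s      : (N, q) instruments dated s
    a_t      : (N,) outcome y_t, or (N, k) regressors x_t
    d_s, d_t : (N,) 0/1 observability indicators for periods s and t
    """
    both = (np.asarray(d_s) * np.asarray(d_t)).astype(bool)
    if not both.any():
        raise ValueError("no unit is observed in both periods")
    z = np.asarray(z_s, dtype=float)[both]
    a = np.asarray(a_t, dtype=float)[both]
    if a.ndim == 1:
        return (z * a[:, None]).mean(axis=0)   # omega_hat_st, shape (q,)
    return z.T @ a / both.sum()                # Omega_hat_st, shape (q, k)

# Hypothetical data: the second unit is not observed in period t.
z_s = np.array([[1.0], [2.0], [3.0]])
y_t = np.array([10.0, np.nan, 30.0])
x_t = np.array([[1.0, 0.0], [np.nan, np.nan], [0.0, 1.0]])
d_s = np.array([1, 1, 1])
d_t = np.array([1, 0, 1])

omega_hat = pairwise_cross_moment(z_s, y_t, d_s, d_t)
Omega_hat = pairwise_cross_moment(z_s, x_t, d_s, d_t)
```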

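Next, form

$$b_{stN} = \begin{pmatrix} \hat\omega_{st} - \Omega_{st}\beta \\ \mathrm{vec}\,\hat\Omega_{st} - \mathrm{vec}\,\Omega_{st} \end{pmatrix} \equiv \begin{pmatrix} \hat\omega_{st} - (I \otimes \beta')\,\mathrm{vec}\,\Omega_{st} \\ \mathrm{vec}\,\hat\Omega_{st} - \mathrm{vec}\,\Omega_{st} \end{pmatrix}$$

and let $b_N$ be a vector containing the $b_{stN}$ for all $s, t$ such that $\sum_{i=1}^{N} d_{is} d_{it} > 0$, and let $\theta$ contain $\beta$ and the corresponding $\mathrm{vec}\,\Omega_{st}$. A pooled minimum distance estimator of $\theta$ is

$$\hat\theta_{PMD} = \arg\min\; b_N'\,\hat V^{-1}\,b_N$$

where $\hat V$ is a consistent estimator of the variance of $b_N$. Moreover, under the transformation

$$\begin{pmatrix} I & -(I \otimes \beta') \\ 0 & I \end{pmatrix} b_{stN} = \begin{pmatrix} \hat\omega_{st} - \hat\Omega_{st}\beta \\ \mathrm{vec}\,\hat\Omega_{st} - \mathrm{vec}\,\Omega_{st} \end{pmatrix},$$

the second block is seen to consist of unrestricted moments. Thus, letting $b_N^*$ be a vector containing all the available $\hat\omega_{st} - \hat\Omega_{st}\beta$, from standard properties of minimum distance estimation it turns out that $\hat\beta_{PMD}$ (which is part of the $\hat\theta_{PMD}$ vector) is asymptotically equivalent to

$$\tilde\beta = \arg\min\; b_N^{*\prime}\,\hat V^{*-1}\,b_N^*$$

where $\hat V^*$ is a consistent estimator of the variance of $b_N^*$. Since

$$\hat\omega_{st} - \hat\Omega_{st}\beta = \frac{1}{\sum_{i=1}^{N} d_{is} d_{it}} \sum_{i=1}^{N} d_{is} d_{it}\,z_{is}\,(y_{it} - x_{it}'\beta),$$

it should be clear that $\tilde\beta$ coincides with the pooled GMM estimator.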


Similarly, an extended minimum distance estimator can be constructed as follows. Let $(s, t)$ be an observable pair for the $j$-th subpanel. Form

$$b_N^{st[j]} = \begin{pmatrix} \hat\omega_{st}^j - \Omega_{st}\beta \\ \mathrm{vec}\,\hat\Omega_{st}^j - \mathrm{vec}\,\Omega_{st} \end{pmatrix}$$

where $\hat\omega_{st}^j$ and $\hat\Omega_{st}^j$ are $j$-th subpanel sample averages. Form a vector $b_N^{[j]}$ for all $(s, t)$ that are observable for the $j$-th subpanel. Thus, letting $b_N^\dagger = \big(b_N^{[1]\prime}, \ldots, b_N^{[J]\prime}\big)'$, an extended minimum distance estimator is

$$\hat\theta_{EMD} = \arg\min\; b_N^{\dagger\prime}\,\big(\hat V^\dagger\big)^{-1}\,b_N^\dagger$$

where $\hat V^\dagger$ is a consistent estimator of the variance of $b_N^\dagger$. Using a similar argument as before, $\hat\beta_{EMD}$ can be seen to be asymptotically equivalent to the expanded GMM estimator of $\beta$.

Consider an $(s, t)$ pair that is observable in subpanels $j$ and $j'$. Pooled MD merges $b_N^{st[j]}$ and $b_N^{st[j']}$ into a single average, whereas extended MD treats them as separate moments. Now consider another $(s', t')$ pair that is observable in $j$ but not in $j'$, so that $b_N^{s't'[j]}$ is correlated with $b_N^{st[j]}$ but not with $b_N^{st[j']}$. The efficiency of EMD relative to PMD comes from the fact that extended MD takes into account these patterns of correlations across subpanels in imposing the constraints. In contrast, pooled MD cannot allow for these differences in correlations because subpanel-specific moments have been pooled into a single aggregate moment.


References

[1] Arellano, M. and S. R. Bond (1991): “Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations”, Review of Economic Studies, 58, 277-297.

[2] Arellano, M. and O. Bover (1995): “Another Look at the Instrumental-Variable Estimation of Error-Components Models”, Journal of Econometrics, 68, 29-51.

[3] Hansen, L. P. (1982): “Large Sample Properties of Generalized Method of Moments Estimators”, Econometrica, 50, 1029-1054.

[4] Sargan, J. D. (1958): “The Estimation of Economic Relationships Using Instrumental Variables”, Econometrica, 26, 393-415.


Appendix


Table 1: Monte Carlo Simulation Results. Parameter value = 0.2

T = 6
                       n = 100                     n = 250
J      Statistic   Pool.   Expd.   CSmp.       Pool.   Expd.   CSmp.
0(∗)   Median      0.1789  0.1789  0.1789      0.1911  0.1911  0.1911
       IQR         0.1077  0.1077  0.1077      0.0715  0.0715  0.0715
       MAE         0.0548  0.0548  0.0548      0.0369  0.0369  0.0369
2      Median      0.1632  0.1549  0.1706      0.1861  0.1814  0.1895
       IQR         0.1334  0.1292  0.1395      0.0875  0.0848  0.0894
       MAE         0.0695  0.0725  0.0707      0.0446  0.0441  0.0465
4      Median      0.1577  0.1199  0.1748      0.1792  0.1619  0.1843
       IQR         0.1513  0.1517  0.1546      0.1041  0.0958  0.1037
       MAE         0.0836  0.0947  0.0801      0.0525  0.0570  0.0530
6      Median      0.1430  0.0879  0.1669      0.1679  0.1404  0.1751
       IQR         0.1729  0.1658  0.1755      0.1140  0.1072  0.1159
       MAE         0.0955  0.1177  0.0908      0.0652  0.0725  0.0612

T = 8
                       n = 100                     n = 250
J      Statistic   Pool.   Expd.   CSmp.       Pool.   Expd.   CSmp.
0(∗)   Median      0.1834  0.1834  0.1835      0.1929  0.1929  0.1929
       IQR         0.0828  0.0828  0.0825      0.0531  0.0533  0.0531
       MAE         0.0423  0.0423  0.0422      0.0263  0.0264  0.0264
2      Median      0.1752  0.1654  0.1814      0.1897  0.1854  0.1925
       IQR         0.0915  0.0892  0.0977      0.0629  0.0615  0.0614
       MAE         0.0490  0.0503  0.0490      0.0332  0.0332  0.0336
4      Median      0.1657  0.1308  0.1747      0.1865  0.1693  0.1901
       IQR         0.1005  0.0962  0.1046      0.0682  0.0670  0.0697
       MAE         0.0565  0.0732  0.0538      0.0362  0.0392  0.0359
6      Median      0.1587  0.0973  0.1718      0.1826  0.1540  0.1878
       IQR         0.1086  0.0980  0.1108      0.0750  0.0702  0.0717
       MAE         0.0629  0.1030  0.0592      0.0394  0.0492  0.0381
8      Median      0.1513  0.0737  0.1716      0.1785  0.1386  0.1893
       IQR         0.1170  0.1030  0.1172      0.0828  0.0721  0.0837
       MAE         0.0695  0.1263  0.0612      0.0447  0.0627  0.0427
10     Median      0.1375  0.0473  0.1689      0.1756  0.1275  0.1859
       IQR         0.1356  0.1141  0.1374      0.0835  0.0776  0.0878
       MAE         0.0796  0.1527  0.0709      0.0457  0.0737  0.0451

T = 10
                       n = 100                     n = 250
J      Statistic   Pool.   Expd.   CSmp.       Pool.   Expd.   CSmp.
0(∗)   Median      0.1830  0.1830  0.1830      0.1916  0.1916  0.1916
       IQR         0.0700  0.0700  0.0700      0.0430  0.0430  0.0430
       MAE         0.0374  0.0374  0.0374      0.0220  0.0220  0.0220
2      Median      0.1723  0.1645  0.1783      0.1904  0.1865  0.1925
       IQR         0.0769  0.0736  0.0778      0.0474  0.0456  0.0464
       MAE         0.0415  0.0446  0.0414      0.0235  0.0241  0.0239
4      Median      0.1686  0.1326  0.1793      0.1870  0.1705  0.1914
       IQR         0.0753  0.0705  0.0779      0.0479  0.0457  0.0490
       MAE         0.0445  0.0677  0.0414      0.0258  0.0327  0.0255
6      Median      0.1633  0.1075  0.1773      0.1848  0.1573  0.1890
       IQR         0.0821  0.0704  0.0815      0.0523  0.0486  0.0527
       MAE         0.0476  0.0925  0.0464      0.0272  0.0438  0.0273
8      Median      0.1588  0.0794  0.1726      0.1811  0.1440  0.1870
       IQR         0.0867  0.0729  0.0925      0.0560  0.0534  0.0557
       MAE         0.0522  0.1206  0.0479      0.0316  0.0560  0.0304
10     Median      0.1480  0.0569  0.1720      0.1779  0.1294  0.1858
       IQR         0.0903  0.0745  0.0964      0.0599  0.0537  0.0597
       MAE         0.0597  0.1431  0.0525      0.0354  0.0706  0.0324
12     Median      0.1397  0.0357  0.1658      0.1716  0.1166  0.1835
       IQR         0.0923  0.0766  0.1002      0.0611  0.0545  0.0626
       MAE         0.0672  0.1643  0.0560      0.0378  0.0834  0.0346
14     Median      0.1342  0.0180  0.1639      0.1671  0.1060  0.1801
       IQR         0.1038  0.0860  0.1065      0.0659  0.0588  0.0675
       MAE         0.0717  0.1820  0.0594      0.0418  0.0940  0.0383

Note: We run 1,000 replications for each sample size, unbalancedness pattern and estimator. (∗) The three estimators are equivalent for balanced panels (save for minor numerical approximation and rounding issues).


Tab

le2:

Mon

teC

arlo

Sim

ula

tion

Res

ult

s.P

aram

eter

valu

=0.

5

T=

6T

=8

T=

10n=

100

n=

250

n=

100

n=

250

n=

100

n=

250

Pool

.Expd.

CSm

p.

Pool

.Expd.

CSm

p.

Pool

.Expd.

CSm

p.

Pool

.Expd.

CSm

p.

Pool

.Expd.

CSm

p.

Pool

.Expd.

CSm

p.

J=0

Med

ian

0.46

990.

4699

0.46

990.

4871

0.48

710.

4871

0.47

210.

4721

0.47

220.

4890

0.48

900.

4890

0.47

310.

4731

0.47

310.

4885

0.48

850.

4885

       IQR    0.1297 0.1297 0.1297 0.0851 0.0851 0.0851 0.0950 0.0950 0.0949 0.0575 0.0575 0.0575 0.0740 0.0740 0.0740 0.0462 0.0462 0.0462
       MAE    0.0678 0.0678 0.0678 0.0434 0.0434 0.0434 0.0502 0.0502 0.0501 0.0303 0.0303 0.0303 0.0398 0.0398 0.0398 0.0244 0.0244 0.0244
J=2    Median 0.4466 0.4299 0.4489 0.4781 0.4735 0.4817 0.4642 0.4475 0.4703 0.4825 0.4760 0.4860 0.4607 0.4449 0.4709 0.4843 0.4773 0.4867
       IQR    0.1693 0.1648 0.1764 0.1080 0.1059 0.1135 0.1093 0.1023 0.1092 0.0721 0.0698 0.0745 0.0804 0.0777 0.0831 0.0507 0.0492 0.0524
       MAE    0.0900 0.0964 0.0901 0.0560 0.0562 0.0565 0.0564 0.0630 0.0584 0.0388 0.0396 0.0383 0.0480 0.0570 0.0459 0.0277 0.0292 0.0265
J=4    Median 0.4349 0.3823 0.4582 0.4663 0.4367 0.4749 0.4484 0.3903 0.4643 0.4768 0.4515 0.4833 0.4549 0.4037 0.4672 0.4809 0.4554 0.4854
       IQR    0.1938 0.1841 0.2016 0.1248 0.1195 0.1265 0.1203 0.1071 0.1244 0.0776 0.0722 0.0797 0.0831 0.0769 0.0854 0.0539 0.0489 0.0542
       MAE    0.1052 0.1269 0.0989 0.0655 0.0771 0.0648 0.0695 0.1105 0.0662 0.0434 0.0547 0.0411 0.0533 0.0963 0.0474 0.0299 0.0452 0.0292
J=6    Median 0.4019 0.3240 0.4392 0.4468 0.4064 0.4656 0.4329 0.3473 0.4544 0.4705 0.4282 0.4808 0.4454 0.3650 0.4611 0.4771 0.4347 0.4820
       IQR    0.2057 0.1946 0.2213 0.1446 0.1310 0.1433 0.1303 0.1063 0.1261 0.0840 0.0789 0.0826 0.0894 0.0767 0.0940 0.0597 0.0528 0.0589
       MAE    0.1264 0.1769 0.1205 0.0822 0.1058 0.0786 0.0805 0.1527 0.0730 0.0465 0.0727 0.0446 0.0609 0.1350 0.0534 0.0342 0.0654 0.0330
J=8    Median                                           0.4194 0.3137 0.4534 0.4656 0.4065 0.4785 0.4358 0.3300 0.4570 0.4711 0.4153 0.4807
       IQR                                              0.1391 0.1256 0.1353 0.0933 0.0805 0.0929 0.0983 0.0808 0.1048 0.0625 0.0563 0.0634
       MAE                                              0.0912 0.1863 0.0785 0.0528 0.0941 0.0472 0.0691 0.1700 0.0568 0.0376 0.0847 0.0348
J=10   Median                                           0.3994 0.2779 0.4453 0.4568 0.3847 0.4754 0.4226 0.2968 0.4531 0.4657 0.3956 0.4759
       IQR                                              0.1572 0.1292 0.1617 0.0984 0.0858 0.0954 0.1056 0.0872 0.1047 0.0676 0.0573 0.0696
       MAE                                              0.1090 0.2221 0.0868 0.0596 0.1155 0.0509 0.0810 0.2032 0.0626 0.0441 0.1044 0.0384
J=12   Median                                                                                     0.4109 0.2745 0.4449 0.4577 0.3776 0.4722
       IQR                                                                                        0.1139 0.0880 0.1083 0.0704 0.0589 0.0710
       MAE                                                                                        0.0926 0.2255 0.0697 0.0484 0.1224 0.0424
J=14   Median                                                                                     0.3985 0.2549 0.4378 0.4504 0.3627 0.4698
       IQR                                                                                        0.1177 0.0945 0.1220 0.0780 0.0658 0.0760
       MAE                                                                                        0.1044 0.2451 0.0757 0.0551 0.1373 0.0469

Note: We run 1,000 replications for each sample size, unbalancedness pattern and estimator. (∗) The three estimators are equivalent for balanced panels (save for minor numerical approximation and rounding issues).

Table 3: Monte Carlo Simulation Results. Parameter value = 0.8

              T=6                                       T=8                                       T=10
              n=100                n=250                n=100                n=250                n=100                n=250
              Pool.  Expd.  CSmp.  Pool.  Expd.  CSmp.  Pool.  Expd.  CSmp.  Pool.  Expd.  CSmp.  Pool.  Expd.  CSmp.  Pool.  Expd.  CSmp.
J=0(∗) Median 0.7421 0.7421 0.7421 0.7721 0.7721 0.7721 0.7471 0.7471 0.7471 0.7780 0.7780 0.7780 0.7538 0.7538 0.7538 0.7803 0.7802 0.7802
       IQR    0.1638 0.1638 0.1638 0.1042 0.1042 0.1042 0.1152 0.1152 0.1148 0.0758 0.0758 0.0758 0.0758 0.0758 0.0758 0.0509 0.0510 0.0509
       MAE    0.0908 0.0908 0.0908 0.0558 0.0558 0.0558 0.0660 0.0660 0.0660 0.0381 0.0381 0.0381 0.0528 0.0528 0.0528 0.0301 0.0303 0.0303
J=2    Median 0.6900 0.6684 0.7170 0.7519 0.7415 0.7584 0.7275 0.6985 0.7396 0.7684 0.7549 0.7772 0.7369 0.7071 0.7478 0.7730 0.7609 0.7760
       IQR    0.2150 0.2042 0.2284 0.1517 0.1449 0.1558 0.1233 0.1147 0.1304 0.0881 0.0820 0.0868 0.0888 0.0880 0.0931 0.0580 0.0580 0.0610
       MAE    0.1337 0.1461 0.1297 0.0835 0.0855 0.0846 0.0834 0.1046 0.0766 0.0488 0.0538 0.0477 0.0670 0.0930 0.0609 0.0358 0.0419 0.0354
J=4    Median 0.6687 0.5779 0.7037 0.7331 0.6823 0.7522 0.7067 0.6204 0.7333 0.7607 0.7099 0.7725 0.7243 0.6416 0.7448 0.7661 0.7222 0.7737
       IQR    0.2476 0.2182 0.2570 0.1640 0.1584 0.1746 0.1403 0.1262 0.1463 0.0905 0.0865 0.0958 0.0949 0.0858 0.1009 0.0633 0.0584 0.0677
       MAE    0.1592 0.2235 0.1488 0.0997 0.1226 0.0963 0.1011 0.1796 0.0904 0.0543 0.0904 0.0529 0.0775 0.1584 0.0655 0.0422 0.0778 0.0392
J=6    Median 0.6121 0.4914 0.6729 0.7040 0.6308 0.7380 0.6806 0.5569 0.7147 0.7442 0.6692 0.7601 0.7082 0.5934 0.7383 0.7586 0.6917 0.7709
       IQR    0.2817 0.2469 0.3105 0.2006 0.1685 0.1982 0.1559 0.1330 0.1596 0.1041 0.0882 0.1020 0.1066 0.0889 0.1122 0.0676 0.0597 0.0699
       MAE    0.2037 0.3086 0.1834 0.1244 0.1711 0.1149 0.1249 0.2431 0.1075 0.0672 0.1308 0.0597 0.0927 0.2066 0.0733 0.0485 0.1083 0.0421
J=8    Median                                           0.6560 0.5048 0.7129 0.7346 0.6356 0.7577 0.6891 0.5500 0.7292 0.7521 0.6605 0.7675
       IQR                                              0.1678 0.1477 0.1767 0.1172 0.0953 0.1094 0.1147 0.0946 0.1224 0.0708 0.0625 0.0752
       MAE                                              0.1468 0.2952 0.1098 0.0771 0.1644 0.0641 0.1113 0.2500 0.0826 0.0524 0.1395 0.0439
J=10   Median                                           0.6209 0.4706 0.7042 0.7154 0.6006 0.7494 0.6687 0.5114 0.7150 0.7399 0.6352 0.7621
       IQR                                              0.1808 0.1560 0.2039 0.1298 0.1056 0.1275 0.1251 0.0979 0.1324 0.0837 0.0691 0.0837
       MAE                                              0.1800 0.3294 0.1205 0.0938 0.1994 0.0743 0.1313 0.2886 0.0931 0.0642 0.1648 0.0485
J=12   Median                                                                                     0.6517 0.4853 0.7024 0.7284 0.6072 0.7571
       IQR                                                                                        0.1432 0.1000 0.1404 0.0854 0.0763 0.0853
       MAE                                                                                        0.1490 0.3147 0.1047 0.0740 0.1928 0.0549
J=14   Median                                                                                     0.6325 0.4651 0.6945 0.7164 0.5864 0.7474
       IQR                                                                                        0.1426 0.1018 0.1550 0.0935 0.0796 0.0945
       MAE                                                                                        0.1675 0.3349 0.1134 0.0845 0.2136 0.0622

Note: We run 1,000 replications for each sample size, unbalancedness pattern and estimator. (∗) The three estimators are equivalent for balanced panels (save for minor numerical approximation and rounding issues).
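
As an illustration of how the summary rows in Tables 2 and 3 can be computed from a set of Monte Carlo replications, the following Python sketch takes a list of estimates and the true parameter value and returns the median, the interquartile range and the median absolute error. It assumes that MAE stands for median absolute error, which the tables do not spell out. The function name `summarize`, the quartile interpolation rule and the toy estimates are hypothetical choices made for this example and are not taken from the paper.

```python
import statistics


def summarize(estimates, true_value):
    """Summarize Monte Carlo estimates of a scalar parameter.

    Assumption: "MAE" is read as the median absolute error
    around the true parameter value.
    """
    # Quartiles with the "inclusive" interpolation rule; other rules
    # give slightly different IQRs in small samples.
    q1, _, q3 = statistics.quantiles(estimates, n=4, method="inclusive")
    return {
        "median": statistics.median(estimates),
        "iqr": q3 - q1,
        "mae": statistics.median(abs(e - true_value) for e in estimates),
    }


# Toy replications (hypothetical numbers) for a true value of 0.8.
toy_estimates = [0.70, 0.74, 0.78, 0.80, 0.82]
summary = summarize(toy_estimates, true_value=0.8)
print(summary)
```

In an actual experiment the list would hold the 1,000 replication estimates for one combination of sample size, unbalancedness pattern and estimator, so each cell of the tables corresponds to one call.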
