Cluster Robust Se

download Cluster Robust Se

of 3

Transcript of Cluster Robust Se

  • 7/28/2019 Cluster Robust Se

    1/3

    Cluster-robust standard errorsClass Notes

    Manuel Arellano

    October 7, 2011

    Let be a parameter vector identified from the moment condition

    E (w, ) = 0

    such that dim() = dim (). Consider a sample W= {w1,...,wN} and a consistent estimator b thatsolves the sample moment conditions (up to a small order term).

    The function (w, ) may denote orthogonality conditions in a just-identified GMM estimation

    problem or the first-order conditions in an m-estimation problem. Alternatively (w, ) may be re-

    garded as a linear combination of a larger moment vector (w, ) in an overidentified GMM estimation

    problem, so that (w, ) = B0 (w, ).

    Assume that conditions for the following asymptotically linear representation of the scaled estima-

    tion error and asymptotic normality hold:

    Nb = D1

    0

    1N

    NXi=1

    (wi, ) + op (1)d N

    0, D1

    0V0D

    10

    0

    where D0 = E (w, ) /c0 is the Jacobian of the population moments and V0 is the limiting variance:

    V0 = limN

    V ar

    1N

    NXi=1

    (wi, )

    != lim

    N

    1

    N

    NXi=1

    NXj=1

    E (wi, ) (wj, )

    0

    .

    To get standard errors ofb we need estimates ofD0 and V0. A natural estimate ofD0 is bD = 1NPNi=1bDiwhere bdki = 12N h wi,b + Nk wi,b Nki is the kth column of bDi, for some small N > 0,and k is the kth unit vector. As for V0 the situation is different under independent or cluster sampling.

    Independent sampling Denote i = (wi, ) and bi = wi,b. If the observations areindependent E

    i

    0

    j

    = 0 for i 6= j so that

    V0 = limN

    1

    N

    NXi=1

    Ei

    0

    i

    .

    If they are also identically distributed V0 = Ei0i. Regardless of the latter a consistent estimateof V0 under independence is

    bV = 1N

    NXi=1

    bib0i.A consistent estimate of the asymptotic variance of b under independent sampling is:dV ar b = 1

    NbD1 1

    N

    NXi=1

    bib0i! bD1. (1)

    1

  • 7/28/2019 Cluster Robust Se

    2/3

    Cluster sampling We now obtain a consistent estimate of the asymptotic variance ofb whenthe sample consists ofH groups (or clusters) of Mh observations each (N = M1 + ... + MH) such that

    observations are independent across groups but dependent within groups, H and Mh is fixedfor all h. For convenience we order observations by groups and use the double-index notation whm so

    that W= {w11,...,w1M1 | ... | wH1,...,wHMH}.Under cluster sampling, letting h =

    PMhm=1 hm and

    eh = PMhm=1bhm we haveV0 = lim

    NV ar

    1N

    HXh=1

    h

    != lim

    N

    1

    N

    HXh=1

    Eh

    0

    h

    ,

    so that Ei

    0

    j

    = 0 only if i and j belong to different clusters, or E

    hm

    0

    h0m0

    = 0 for h 6= h0.

    Thus, a consistent estimate of V0 is

    eV =

    1

    N

    H

    Xh=1 ehe

    0

    h. (2)

    The estimated asymptotic variance ofb allowing unrestricted within-cluster correlation is thereforegV ar b = 1

    NbD1 1

    N

    HXh=1

    ehe0h! bD1. (3)

    Note that (3) is of order 1/H whereas (1) is of order 1/N.

    Examples A panel data example is

    (whm, ) = xhm

    yhm x0hm

    where h denotes units and m time periods. If the variables are in deviations from means b is thewithin-group estimator. In this case bDhm = xhmx0hm and eh = PMhm=1 xhmbuhm where buhm arewithin-group residuals. The result is the (large H, fixed Mh) formula for within-group standard errors

    that are robust to heteroskedasticity and serial correlation of arbitrary form in Arellano (1987):

    gV ar b = HXh=1

    MhXm=1

    xhmx0

    hm

    !1HXh=1

    MhXm=1

    MhXs=1

    buhmbuhsxhmx0hs

    HXh=1

    MhXm=1

    xhmx0

    hm

    !1. (4)

    Another example is a -quantile regression with moments

    (whm, ) = xhm 1 yhm x0hm where bDhm = bhmxhmx0hm, bhm = 1yhm x0hmb N / (2N),1 eh = PMhm=1 xhmbuhm and buhm =1

    yhm x0hmb . Cluster-robust standard errors for QR coefficients are obtained from:gV ar b = HX

    h=1

    MhXm=1

    bhmxhmx0hm!1

    HXh=1

    MhXm=1

    MhXs=1

    buhmbuhsxhmx0hs

    HXh=1

    MhXm=1

    bhmxhmx0hm!1

    . (5)

    1 This choice of eD corresponds to selecting an (i, k)-specific scaled N given by N/xik.

    2

  • 7/28/2019 Cluster Robust Se

    3/3

    Equicorrelated moments within-cluster The variance formula (3) is valid for arbitrary cor-

    relation patterns among cluster members. Here we explore the consequences for variance estimation

    of assuming that the dependence between any pair of members of a cluster is the same. Under the

    error-component structure

    Ehm0hs = ( +v if m = s if m 6= s

    we have Eh

    0

    h

    = Mhv + M

    2

    h. Letting M =1

    N

    PHh=1 M

    h, the limiting variance V0 becomes

    V0 = limH

    v +

    M2

    M1

    .

    The natural estimator ofv is the within-group variance bv. When all clusters are of the same size(Mh = M for all h), the natural estimator of is the between-group variance minus the within

    variance times 1/M:2

    b = 1H

    HXh=1

    1

    M2ehe0h 1Mbv.

    Noting that M2/M1 = M, it turns out that the plug-in estimate of V0 is identical to eV in (2):bv + Mb = 1

    HM

    HXh=1

    ehe0h.When cluster sizes differ, a method-of-moments estimator of is a weighted between-group variance

    minus the within variance times a weighted average of 1/Mh:

    b = 1H

    HXh=1

    wh 1M2h

    ehe0h 1Mhbvfor some weights wh such that

    PHh=1 wh = H.

    3 Once again, using wh = M2

    h/M2 as weights, the

    plug-in estimate of V0 is identical to eV:bv + M2

    M1b = 1

    HM1

    HXh=1

    ehe0hThe conclusion is that imposing within-cluster equicorrelation is essentially innocuous for the

    purpose of calculating cluster-robust standard errors.

    References

    Arellano, M. (1987): Computing Robust Standard Errors for Within-Group Estimators, Oxford

    Bulletin of Economics and Statistics, 49, 431-434.

    Arellano, M. (2003): Panel Data Econometrics, Oxford University Press.

    2 The within-group sample variance is ev = 1H(M11)

    SH

    h=1

    SMh

    m=1

    hm

    hMh

    hm

    hMh

    0

    . As for the between-

    group variance note that overall means are equal to zero ate (Arellano, 2003, section 3.1 on error-component estimation).3 For example, wh = 1, wh = Mh/M1 or wh = M

    2

    h/M2.

    3