Implementing robust estimation instruments for panel data … · 2019. 7. 31. · 1 Cahier de...


Cahier de recherche 2019-02

Implementing robust estimation instruments for panel data regression models with errors-in-variables

Francois-Eric Racicot Telfer School of Management, University of Ottawa,Ottawa, ON K1N 6N5, Canada

Affiliate Research Fellow, IPAG Business School, Paris, France; Corporate Reporting Chair, ESG UQAM

E-mail: [email protected]



Abstract¹

Econometricians have long recognized the need to account in some way for measurement

errors, specification errors and endogeneity to ensure that the ordinary least squares (OLS)

estimator is consistent. This paper introduces a new Generalized Method of Moments

(GMM) estimator that relies on robust instruments to estimate panel data regression models

containing errors in variables. We show how this GMM approach can be generalized for

the panel data framework using higher moments and cumulants as instruments. The new

instruments, engineered for greater robustness, are proposed to tackle the pervasive

problem of weak instruments.

Keywords: EViews code; fixed effects; GMM; panel data; random effects; robust

instruments

¹ This article is a revised version of Racicot (2015).


1. Introduction

A large proportion of the data used in empirical economics and finance contains errors in

variables. This phenomenon is not new and has been recognized for at least the last 40

years. The literature maps out areas in economics where these errors are most likely to

appear. Researchers argue that the problem of measurement errors is relatively more

important in macroeconomic data, but also present in micro-economic studies

(Morgenstern, 1972; Langanskens and Van Rickeghem, 1974). As noted by Morgenstern

(1972), empirical economics and finance have not tackled this phenomenon systematically.

While the seminal Generalized Method of Moments (GMM) that was developed by Hansen

(1982) provided a potent solution to the problem of errors in variables, the problem of weak

instruments has more recently put in doubt the applicability of the method. A line of

research going back to the early 1990s has shown that when instrumental variables are

weak, the two-stage least squares (2SLS) estimator is inconsistent (Nelson and Startz,
1990a,b; Bound, Jaeger, and Baker, 1995; Hahn and Hausman, 2003). In response, Racicot
(1993, 2014)² and Racicot and Théoret (2014)³ have developed a method that can generate
more robust instruments. They show how robust instruments can be engineered⁴ using a
Bayesian averaging process (Theil and Goldberger, 1961) of two generalized versions of
the Durbin (1954) and Pal (1980) higher moment estimators. Furthermore, the robustness of
these instruments holds up in the context of a time-series application and is confirmed in a
test analogous to the Staiger and Stock (1997)⁵ test.

The purpose of this article is to propose an improved GMM estimator for estimating

panel data regression models. To engineer this estimator, we extend our new robust

instruments relying on existing empirical and theoretical results about their performance in

the presence of measurement errors (Racicot, 1993, 2013; Racicot and Théoret, 2012,

2014). We begin with an explanation of the general framework of panel data estimation
with a focus on the two most commonly used models, namely, the fixed effects model and
the random effects model. We show that in this framework, which involves the problem of
unobserved heterogeneity, measurement errors may be exacerbated when not tackled
properly. First-differencing to remove unobserved heterogeneity may be one cause.
Moreover, any attenuation of measurement errors obtained by using the panel data
framework can happen only by chance (Arellano, 2003). Subsequently, we show how the
problem of errors in variables can be tackled using a two-step approach based on two-stage
least squares (2SLS) to implement the standard GMM estimator in the panel data regression
model. Finally, we show how to implement our new robust instruments, initially applied in
a simple regression framework and then extended to the GMM estimator for the panel data
framework. We refer to this new estimator, which uses the distance instruments (i.e., the d
instruments), as the GMMd.

² See also Dagenais and Dagenais (1994).

³ See also Racicot and Rentz (2015) and Racicot et al. (2018) for applications of this method to the Pástor and Stambaugh model.

⁴ This terminology comes from the field of financial engineering, where the term “technology” is also used in an analogous context (e.g., Neftci, 2008). We define econometric engineering as the process of blending existing methods (i.e., technology) guided by theoretical results (i.e., theorems and lemmas). Podivinsky (1990) uses this terminology in another context.

⁵ See also Stock and Watson (2011).

The remainder of the paper is organized as follows. Section 2 provides a review of the

basic panel data framework in the context of errors in variables. Section 3 discusses the

standard GMM approach for the panel data framework and how to implement our new

distance instruments in that context. Section 4 concludes.

2. Errors in variables and the panel data framework

Two main approaches are used in a panel data setting: the fixed effects model and the

random effects model. These models are used to tackle the problem of heterogeneity,

reflecting differences in behavior across individuals. This “individual effect” refers to the

specific characteristics of firms over time that cannot be modeled by the same explanatory

variables. This effect can be captured by the constant term.

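The basic framework is a regression that can be expressed as⁶:

\[ y_{it} = \mathbf{x}_{it}'\boldsymbol{\beta} + \mathbf{z}_i'\boldsymbol{\alpha} + \varepsilon_{it} = \mathbf{x}_{it}'\boldsymbol{\beta} + \alpha_i + \varepsilon_{it} \tag{1} \]

where $\alpha_i = \mathbf{z}_i'\boldsymbol{\alpha}$ includes all the observable effects, $\mathbf{x}_{it}' = (x_{it1}, x_{it2}, \ldots, x_{itk})$ and $\boldsymbol{\beta}' = (\beta_1, \beta_2, \ldots, \beta_k)$. In model (1), the constant term $\alpha_i$ corresponds to the fixed effects
approach. Note that if $\mathbf{z}_i$ is unobserved and correlated with $\mathbf{x}_{it}$, the OLS estimator is biased
and inconsistent. This is the omitted-variable bias. One basic feature of the fixed effects
model is that only the constant term varies across equations while $\boldsymbol{\beta}$ is considered to be
fixed⁷. This model (eq. 1) can be written in the dummy variable representation as follows:

\[ y_{it} = \mathbf{x}_{it}'\boldsymbol{\beta} + \sum_{j=1}^{N} \alpha_j d_{ijt} + \varepsilon_{it} \tag{2} \]

where $d_{ijt} = 1$ for $i = j$, and 0 otherwise, is a dummy variable. This representation can be further reduced to
its matrix form. Assume that $\mathbf{y}_i$ and $\mathbf{X}_i$ contain the T observations for the ith unit (or the ith firm)
and $\boldsymbol{\varepsilon}_i$ is a T × 1 vector of disturbances:

\[ \mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta} + \mathbf{d}_i\alpha_i + \boldsymbol{\varepsilon}_i = \mathbf{X}_i\boldsymbol{\beta} + \mathbf{i}\alpha_i + \boldsymbol{\varepsilon}_i \tag{3} \]

where $\mathbf{i}$ is a T × 1 vector of ones and $\mathbf{X}_i = (\mathbf{x}_{i1}, \mathbf{x}_{i2}, \ldots, \mathbf{x}_{iT})'$. If we stack the vectors (matrices)
of variables as $\mathbf{y}' = (\mathbf{y}_1', \mathbf{y}_2', \ldots, \mathbf{y}_N')$, $\mathbf{X}' = (\mathbf{X}_1', \mathbf{X}_2', \ldots, \mathbf{X}_N')$, $\boldsymbol{\alpha}' = (\alpha_1, \alpha_2, \ldots, \alpha_N)$, and $\boldsymbol{\varepsilon}' = (\boldsymbol{\varepsilon}_1', \boldsymbol{\varepsilon}_2', \ldots, \boldsymbol{\varepsilon}_N')$, then the basic fixed effects model (eq. 3) can be
written in its stacked form:

\[ \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{D}\boldsymbol{\alpha} + \boldsymbol{\varepsilon} \tag{4} \]

where $\mathbf{D} = (\mathbf{d}_1, \mathbf{d}_2, \ldots, \mathbf{d}_N)$. More precisely, D may be represented as

\[ \mathbf{D} = \begin{pmatrix} \mathbf{d}_1 & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{d}_2 & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{d}_N \end{pmatrix} \]

where, in this block representation, each $\mathbf{d}_i$ is the T × 1 vector of ones $\mathbf{i}$ of equation (3).

⁶ Here we follow Greene (2014, 2018). For an introduction to panel data regression models with applications, see Gujarati and Porter (2009), chap. 16.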
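As an illustration of the dummy variable representation, the following minimal Python sketch builds the NT × N matrix D of equation (4) for a balanced panel. The function name `dummy_matrix` and the choice of three units and two periods are illustrative assumptions, not part of the original paper; observations are assumed to be stacked unit by unit.

```python
def dummy_matrix(n_units, n_periods):
    """Return the NT x N dummy matrix D as a list of rows.

    Rows are stacked unit by unit (all T periods of unit 1, then unit 2, ...),
    so column j holds ones exactly on the T rows that belong to unit j.
    """
    rows = []
    for i in range(n_units):
        for _ in range(n_periods):
            rows.append([1 if j == i else 0 for j in range(n_units)])
    return rows


# Hypothetical balanced panel: N = 3 units observed over T = 2 periods.
D = dummy_matrix(n_units=3, n_periods=2)
for row in D:
    print(row)
```

Each column of D sums to T, which is why D'D = T·I in a balanced panel.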

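Equation (4) refers to the least squares dummy variable (LSDV) model. This is a standard
regression model, so the basic estimation approach should apply. By regrouping the error
term and the matrix of dummy variables, (4) can be rewritten as

\[ \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}^{*} \tag{5} \]

where $\boldsymbol{\varepsilon}^{*} = \mathbf{D}\boldsymbol{\alpha} + \boldsymbol{\varepsilon}$. In this form, equation (5) has some degree of endogeneity because $\mathrm{plim}\,\frac{1}{NT}\mathbf{X}'\boldsymbol{\varepsilon}^{*} \neq 0$. This
implies that X is endogenous, given that it is correlated with $\boldsymbol{\varepsilon}^{*}$, due to D. Applying OLS
to (5) will therefore not make it possible to estimate the parameters consistently and without bias. The
usual solution to this problem is to transform (5) in deviation from the mean so that

\[ \mathrm{plim}\,\frac{1}{NT}\mathbf{X}^{*\prime}\boldsymbol{\varepsilon}^{*} = \mathrm{plim}\,\frac{1}{NT}\mathbf{X}'\mathbf{M}_D\boldsymbol{\varepsilon} = 0 \]

where $\mathbf{M}_D = \mathbf{I} - \mathbf{D}(\mathbf{D}'\mathbf{D})^{-1}\mathbf{D}'$, which can be represented as:

\[ \mathbf{M}_D = \begin{pmatrix} \mathbf{M}^0 & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{M}^0 & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{M}^0 \end{pmatrix} \]

where $\mathbf{M}^0 = \mathbf{I}_T - \frac{1}{T}\mathbf{i}\mathbf{i}'$. So, transforming (4) in deviation from the mean using $\mathbf{M}_D$
removes the constant term, and applying OLS to the transformed model yields (Greene,
2018)

\[ \hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{M}_D\mathbf{X})^{-1}(\mathbf{X}'\mathbf{M}_D\mathbf{y}) = \boldsymbol{\beta}^{within} \tag{6} \]

This result is referred to as the within estimator. It is an unbiased and consistent estimator.
We can thus conclude that using $\mathbf{X}^{*} = \mathbf{M}_D\mathbf{X}$ as an instrument is a valid solution to this
particular endogeneity problem. Another approach to remove the latent heterogeneity is to
transform the model into first-difference form:

\[ \Delta y_{it} = (\Delta\mathbf{x}_{it})'\boldsymbol{\beta} + \Delta\alpha_i + \Delta\varepsilon_{it} = (\Delta\mathbf{x}_{it})'\boldsymbol{\beta} + \varepsilon_{it} - \varepsilon_{it-1} = (\Delta\mathbf{x}_{it})'\boldsymbol{\beta} + u_{it}. \]

But when X contains measurement errors,
these solutions may no longer be valid, as one basic assumption of this setting is:

\[ E(\varepsilon_{it} \mid \mathbf{x}_{it}, \alpha_i) = 0 \quad (i = 1, \ldots, N;\ t = 1, \ldots, T) \]

⁷ A general framework that permits $\boldsymbol{\beta}$ to vary across equations is the SUR model. This approach is left for further research.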
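To make the role of the within transformation concrete, here is a small Python sketch for a single regressor. It assumes a hypothetical noiseless panel in which the individual effects are correlated with the regressor; the data, the true slope of 2 and the function names are illustrative only. Demeaning each unit over time plays the role of premultiplying by M_D in equation (6).

```python
def within_estimator(y, x):
    """Slope of the within (fixed effects) estimator for one regressor.

    y and x are lists of per-unit lists: y[i][t] and x[i][t].
    Each unit is demeaned over time before the OLS slope is computed.
    """
    sxy = 0.0
    sxx = 0.0
    for y_i, x_i in zip(y, x):
        y_bar = sum(y_i) / len(y_i)
        x_bar = sum(x_i) / len(x_i)
        for y_it, x_it in zip(y_i, x_i):
            sxy += (x_it - x_bar) * (y_it - y_bar)
            sxx += (x_it - x_bar) ** 2
    return sxy / sxx


def pooled_ols_slope(y, x):
    """OLS slope (with intercept) on the pooled data, ignoring the unit effects."""
    flat_y = [v for y_i in y for v in y_i]
    flat_x = [v for x_i in x for v in x_i]
    y_bar = sum(flat_y) / len(flat_y)
    x_bar = sum(flat_x) / len(flat_x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(flat_x, flat_y))
    sxx = sum((a - x_bar) ** 2 for a in flat_x)
    return sxy / sxx


# Hypothetical data: y_it = 2 * x_it + alpha_i, with alpha_i rising with x.
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
alpha = [0.0, 10.0, 20.0]
y = [[2.0 * v + a for v in x_i] for x_i, a in zip(x, alpha)]

beta_within = within_estimator(y, x)
beta_pooled = pooled_ols_slope(y, x)
print(beta_within, beta_pooled)
```

In this constructed example the pooled slope absorbs the correlation between the unit effects and the regressor, while the within slope recovers the value used to generate the data.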

In fact, it can be shown that the measurement error⁸ bias may be exacerbated when first-
differencing is used to remove the unobserved heterogeneity. In such a case, for a two-
period panel⁹ (T = 2), the OLS estimator in first differences converges to (Arellano, 2003)

\[ \frac{Cov(\Delta y_{i2}, \Delta x_{i2})}{V(\Delta x_{i2})} = \frac{\beta}{1 + \lambda_{\Delta}} \tag{7} \]

where, to simplify the discussion, $x_{i2}$ is assumed to be the explanatory variable of a single-regressor model
and $\lambda_{\Delta} = V(\Delta v_{i2})/V(\Delta\tilde{x}_{i2})$, where $\tilde{x}_{it}$ is the unobservable
explanatory variable (i.e., from $x_{it} = \tilde{x}_{it} + v_{it}$, the classical errors-in-variables problem in a panel
data setting¹⁰).

To see how first-differencing may exacerbate the measurement error bias, consider the
following. If we assume that $v_{it}$ is an iid error, then $V(\Delta v_{i2}) = 2\sigma_v^2$. Now assume also that
$\tilde{x}_{it}$ is iid; then $V(\Delta\tilde{x}_{i2}) = 2\sigma_{\tilde{x}}^2$ and $\lambda_{\Delta} = \lambda = \sigma_v^2/\sigma_{\tilde{x}}^2$, implying that the measurement error bias in differences and in levels
will be equivalent. But when $\tilde{x}_{it}$ is a stationary time series with positive autocorrelation, it
can be shown that (Arellano, 2003)

\[ V(\Delta\tilde{x}_{i2}) = 2\left(\sigma_{\tilde{x}}^2 - Cov(\tilde{x}_{i1}, \tilde{x}_{i2})\right) < 2\sigma_{\tilde{x}}^2 \tag{8} \]

implying that $\lambda_{\Delta} > \lambda$.

⁸ See Baltagi (2001) and Arellano (2003) for a discussion of the impact of errors in variables on the estimation process of panel data regression models.

⁹ Two-period panels involve, for instance, a before and after treatment. The treatment effect, that is, the change in an outcome variable, may be studied in that context (Greene, 2018).

¹⁰ See Racicot (2014) or Racicot and Théoret (2014) for the case of a single time-series estimation.
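The comparison between the bias in levels and in first differences can be checked with a few lines of arithmetic. The sketch below assumes iid measurement errors with variance `sigma2_v` and a stationary latent regressor with variance `sigma2_x` and first-order autocorrelation `rho`, so that Cov(x̃_i1, x̃_i2) = rho · sigma2_x in equation (8). The function names and the numerical values are illustrative assumptions.

```python
def attenuation_factor(lam):
    """Factor 1 / (1 + lambda) that multiplies beta in eq. (7)."""
    return 1.0 / (1.0 + lam)


def lambda_in_differences(sigma2_v, sigma2_x, rho):
    """Noise-to-signal ratio after first-differencing.

    V(delta v) = 2 * sigma2_v for iid measurement errors, and
    V(delta x_tilde) = 2 * sigma2_x * (1 - rho) for a stationary latent
    regressor with autocorrelation rho, as in eq. (8).
    """
    return (2.0 * sigma2_v) / (2.0 * sigma2_x * (1.0 - rho))


sigma2_v, sigma2_x = 0.25, 1.0          # hypothetical variances
lam_level = sigma2_v / sigma2_x         # noise-to-signal ratio in levels

for rho in (0.0, 0.5, 0.9):
    lam_diff = lambda_in_differences(sigma2_v, sigma2_x, rho)
    print(rho, attenuation_factor(lam_level), attenuation_factor(lam_diff))
```

With rho = 0 the two factors coincide; as rho increases, the factor in differences shrinks, so the first-difference estimator is pulled further toward zero than the estimator in levels.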

Now, for the case of the standard random effects model, a simple reformulation of
(1) yields

\[ y_{it} = \mathbf{x}_{it}'\boldsymbol{\beta} + (\alpha + u_i) + \varepsilon_{it} \tag{9} \]

where $u_i = \mathbf{z}_i'\boldsymbol{\alpha} - E(\mathbf{z}_i'\boldsymbol{\alpha})$ is the random heterogeneity of the ith firm added to $\alpha$, the
constant term. $u_i$ can be viewed as a set of factors for the ith firm, $\mathbf{z}_i'\boldsymbol{\alpha}$, that are not in the
regression and are specific to that firm. In its basic form, model (9) assumes strict
exogeneity. Note that one way to remove this heterogeneity is by transforming the model
in deviation from the group mean (Baltagi, 2001; Arellano, 2003; Greene, 2018), that
is

\[ y_{it} - \bar{y}_i = (\alpha + u_i) - (\alpha + u_i) + (\mathbf{x}_{it} - \bar{\mathbf{x}}_i)'\boldsymbol{\beta} + \varepsilon_{it} - \bar{\varepsilon}_i = (\mathbf{x}_{it} - \bar{\mathbf{x}}_i)'\boldsymbol{\beta} + \varepsilon_{it} - \bar{\varepsilon}_i \tag{10} \]

where $\bar{y}_i = (\alpha + u_i) + \bar{\mathbf{x}}_i'\boldsymbol{\beta} + \bar{\varepsilon}_i$, i = 1, …, N. In this setting, the LSDV estimator is a
consistent estimator of $\boldsymbol{\beta}$¹¹. This estimator has the virtue of being robust to specification
errors¹². However, the approach, while instructive, is like the OLS estimator: not efficient.

An efficient GLS estimator exists, and it is the preferred method to estimate (9). The GLS estimator
is as follows (Greene, 2018)

\[ \hat{\boldsymbol{\beta}} = (\mathbf{X}'\boldsymbol{\Omega}^{-1}\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\Omega}^{-1}\mathbf{y} = \left(\sum_{i=1}^{N}\mathbf{X}_i'\boldsymbol{\Sigma}^{-1}\mathbf{X}_i\right)^{-1}\left(\sum_{i=1}^{N}\mathbf{X}_i'\boldsymbol{\Sigma}^{-1}\mathbf{y}_i\right) \tag{11} \]

where $\boldsymbol{\Omega} = \mathbf{I}_N \otimes \boldsymbol{\Sigma}$, $\boldsymbol{\Sigma}^{-1/2} = \frac{1}{\sigma_\varepsilon}\left(\mathbf{I}_T - \frac{\theta}{T}\mathbf{i}_T\mathbf{i}_T'\right)$ and $\theta = 1 - \frac{\sigma_\varepsilon}{\sqrt{\sigma_\varepsilon^2 + T\sigma_u^2}}$. The transformation of
the dependent and explanatory variables used for GLS is obtained by premultiplying these
variables by $\boldsymbol{\Sigma}^{-1/2}$. The GLS estimator can be shown to be, like pooled OLS, a matrix-weighted
average of the within and between-units estimators (Greene, 2018):

\[ \hat{\boldsymbol{\beta}} = \hat{\mathbf{F}}^{within}\mathbf{b}^{within} + \hat{\mathbf{F}}^{between}\mathbf{b}^{between} \tag{12} \]

where $\hat{\mathbf{F}}^{between} = \mathbf{I} - \hat{\mathbf{F}}^{within}$, $\hat{\mathbf{F}}^{within} = (\mathbf{S}_{xx}^{within} + \lambda\mathbf{S}_{xx}^{between})^{-1}\mathbf{S}_{xx}^{within}$, and $\lambda = (1 - \theta)^2$.
From this, it can be seen that the inefficiency of ordinary least squares, that is, the case
$\lambda = 1$, comes from the fact that it puts too much weight on the between-unit variation
compared to the GLS estimator. In practice, with $\boldsymbol{\Sigma}$ rarely known, one remedy is to rely on
an estimated version of (10), which removes the heterogeneity, to obtain an estimator of $\sigma_\varepsilon^2$ (Greene,
2018)

\[ \hat{\sigma}_{\varepsilon}^2 = \hat{\sigma}_{LSDV}^2 = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T}(\hat{\varepsilon}_{it} - \bar{\hat{\varepsilon}}_i)^2}{NT - N - K} \tag{13} \]

where $\hat{\varepsilon}_{it}$ and $\bar{\hat{\varepsilon}}_i$ are, respectively, the estimated residuals and the mean over the time period
of the estimated residuals, both obtained from (10). An estimate of $\sigma_u^2$ can be obtained
by applying OLS to (9), which is the pooled model, and then computing the estimated
residuals¹³. The estimator in equation (11) is unbiased and consistent, provided matrix X is
uncorrelated with the errors. Otherwise, as in the case of the fixed effects, there is no
guarantee that the errors in the explanatory variables and the unobserved heterogeneity
would cancel out. For that matter, Arellano (2003) uses a simple cross-sectional setting to
show how this attenuation of the bias¹⁴ may happen by chance.

¹¹ The group mean deviations constitute the LSDV approach.

¹² That is, if we wrongly choose the random or the fixed effects model, the LSDV estimator remains consistent.
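The GLS weights discussed above depend only on the two variance components and on T. The following Python sketch computes θ, the between weight λ = (1 − θ)², and the partial demeaning implied by Σ^(−1/2) (up to the scale factor 1/σ_ε). The function names and the variance values are hypothetical and serve only to illustrate the two limiting cases.

```python
import math


def gls_theta(sigma2_eps, sigma2_u, n_periods):
    """theta = 1 - sigma_eps / sqrt(sigma_eps^2 + T * sigma_u^2)."""
    return 1.0 - math.sqrt(sigma2_eps / (sigma2_eps + n_periods * sigma2_u))


def between_weight(theta):
    """lambda = (1 - theta)^2, the weight on the between variation in eq. (12)."""
    return (1.0 - theta) ** 2


def quasi_demean(series, theta):
    """Subtract theta times the unit mean from each observation of one unit."""
    mean = sum(series) / len(series)
    return [v - theta * mean for v in series]


# Hypothetical variance components with T = 4 periods.
theta = gls_theta(sigma2_eps=1.0, sigma2_u=2.0, n_periods=4)
print(theta, between_weight(theta))

# Limiting cases: no unit heterogeneity gives pooled OLS (theta = 0, lambda = 1);
# theta = 1 gives the full within transformation.
print(gls_theta(1.0, 0.0, 4), between_weight(gls_theta(1.0, 0.0, 4)))
print(quasi_demean([1.0, 2.0, 3.0], 1.0))
```

The design point is that GLS interpolates between pooled OLS and the within estimator, with θ growing toward 1 as T or the variance of the unit effect increases.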

Our purpose here is to propose a parsimonious approach to tackle measurement

errors or the endogeneity of the explanatory X based on the generalized method of

moments. This method has the virtue of freeing the analyst from having to choose between

one instrument and another. As the literature has well established (e.g., Greene, 2018),

weak instruments present a perverse problem. Choosing the wrong instruments may result

in increasing the problem one hoped to confront in the first place. That is, it may transform

the estimator into a biased and inconsistent one. For example, it may bias the two-stage

least squares estimator toward the OLS¹⁵. Also, it will render the basic framework for

statistical inference inappropriate (Nelson and Startz, 1990a,b; Hahn and Hausman, 2003).

¹³ An alternative approach is provided by computing $\hat{\sigma}_u^2 = \hat{\sigma}_{Pooled}^2 - \hat{\sigma}_{LSDV}^2$.

¹⁴ Not to be confused with the well-known bias due to errors in variables called attenuation, which results in an estimator that tends to 0.

¹⁵ The OLS estimator is biased and inconsistent in that context.

3. The GMMd approach in the panel data framework

Assume that we have the following equation to estimate using panel data

\[ y_{it} = \mathbf{x}_{it}'\boldsymbol{\beta} + \varepsilon_{it} \tag{14} \]

where no fixed effects are shown explicitly, as in equation (5) where no transformation has
been applied. We also assume there are errors in the explanatory variables, which take the
following form:

\[ x_{it} = \tilde{x}_{it} + v_{it} \tag{15} \]

where $\tilde{x}_{it}$ is the unobserved explanatory variable and $x_{it}$ is its observed counterpart, measured with error $v_{it}$. Taking the
first difference of equation (14) yields¹⁶:

\[ y_{it} - y_{it-1} = (\mathbf{x}_{it} - \mathbf{x}_{it-1})'\boldsymbol{\beta} + (\varepsilon_{it} - \varepsilon_{it-1}) = (\mathbf{x}_{it} - \mathbf{x}_{it-1})'\boldsymbol{\beta} + \xi_{it} \tag{16} \]

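The first step in that framework is to apply 2SLS to (16) assuming a matrix of
instrumental variables Z, resulting in (Greene, 2018)¹⁷:

\[ \hat{\boldsymbol{\beta}}_{2SLS} = \left[\left(\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{Z}_i\right)\left(\sum_{i=1}^{N}\mathbf{Z}_i'\mathbf{Z}_i\right)^{-1}\left(\sum_{i=1}^{N}\mathbf{Z}_i'\mathbf{X}_i\right)\right]^{-1}\left[\left(\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{Z}_i\right)\left(\sum_{i=1}^{N}\mathbf{Z}_i'\mathbf{Z}_i\right)^{-1}\left(\sum_{i=1}^{N}\mathbf{Z}_i'\mathbf{y}_i\right)\right] \tag{17} \]

The second step consists of forming the weighting matrix ($\hat{\mathbf{W}}$) for the GMM estimator based
on the estimated residuals of (16):

\[ \hat{\mathbf{W}} = \frac{1}{N^2}\sum_{i=1}^{N}\mathbf{Z}_i'\hat{\boldsymbol{\xi}}_i\hat{\boldsymbol{\xi}}_i'\mathbf{Z}_i \tag{18} \]

Substituting (18) in the criterion for GMM estimation gives

\[ q = \left(\frac{1}{N}\sum_{i=1}^{N}\hat{\boldsymbol{\xi}}_i'\mathbf{Z}_i\right)\hat{\mathbf{W}}^{-1}\left(\frac{1}{N}\sum_{i=1}^{N}\mathbf{Z}_i'\hat{\boldsymbol{\xi}}_i\right) \tag{19} \]

Finally, minimizing (19) with respect to the parameters $\boldsymbol{\beta}$ yields

\[ \hat{\boldsymbol{\beta}}_{GMM} = \left[\left(\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{Z}_i\right)\hat{\mathbf{W}}^{-1}\left(\sum_{i=1}^{N}\mathbf{Z}_i'\mathbf{X}_i\right)\right]^{-1}\left[\left(\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{Z}_i\right)\hat{\mathbf{W}}^{-1}\left(\sum_{i=1}^{N}\mathbf{Z}_i'\mathbf{y}_i\right)\right] \tag{20} \]

where the estimated asymptotic variance-covariance matrix of the estimator $\hat{\boldsymbol{\beta}}_{GMM}$ is given by:

\[ Est.\,Asy.\,V(\hat{\boldsymbol{\beta}}_{GMM}) = \left[\left(\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{Z}_i\right)\hat{\mathbf{W}}^{-1}\left(\sum_{i=1}^{N}\mathbf{Z}_i'\mathbf{X}_i\right)\right]^{-1} \tag{21} \]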
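The two-step procedure of equations (17) to (20) can be sketched in plain Python for the simplest over-identified case: one regressor and two instruments. Everything below is an illustrative assumption rather than the paper's implementation: the simulated data (true slope 2, measurement error in x, two external instruments), the helper names, and the treatment of the data as already transformed so that no unit effect remains. The variance formula (21) is not computed here.

```python
import random


def inv2(m):
    """Inverse of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def quad(u, m, v):
    """Return u' m v for 2-vectors u, v and a 2x2 matrix m."""
    return sum(u[r] * m[r][c] * v[c] for r in range(2) for c in range(2))


def zt_vec(z_i, v_i):
    """Return Z_i' v_i, where z_i is a list of T rows with 2 instruments each."""
    return [sum(z_t[k] * v_t for z_t, v_t in zip(z_i, v_i)) for k in range(2)]


def two_step_gmm(y, x, z):
    """Two-step GMM for one regressor and two instruments."""
    n = len(y)
    szx, szy = [0.0, 0.0], [0.0, 0.0]
    szz = [[0.0, 0.0], [0.0, 0.0]]
    for y_i, x_i, z_i in zip(y, x, z):
        zx, zy = zt_vec(z_i, x_i), zt_vec(z_i, y_i)
        for k in range(2):
            szx[k] += zx[k]
            szy[k] += zy[k]
            for j in range(2):
                szz[k][j] += sum(z_t[k] * z_t[j] for z_t in z_i)
    # Step 1: 2SLS, eq. (17).
    a = inv2(szz)
    beta_2sls = quad(szx, a, szy) / quad(szx, a, szx)
    # Step 2: weighting matrix built from the 2SLS residuals, eq. (18).
    w = [[0.0, 0.0], [0.0, 0.0]]
    for y_i, x_i, z_i in zip(y, x, z):
        resid = [y_t - beta_2sls * x_t for y_t, x_t in zip(y_i, x_i)]
        g = zt_vec(z_i, resid)
        for k in range(2):
            for j in range(2):
                w[k][j] += g[k] * g[j] / n ** 2
    # GMM estimator, eq. (20).
    w_inv = inv2(w)
    beta_gmm = quad(szx, w_inv, szy) / quad(szx, w_inv, szx)
    return beta_2sls, beta_gmm


def simulate(n_units=300, n_periods=3, beta=2.0, seed=0):
    """Hypothetical panel with a mismeasured regressor and two valid instruments."""
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n_units):
        y_i, x_i, z_i = [], [], []
        for _ in range(n_periods):
            z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
            x_true = z1 + 0.5 * z2 + rng.gauss(0, 0.5)
            x_i.append(x_true + rng.gauss(0, 0.5))      # observed x with error v_it
            y_i.append(beta * x_true + rng.gauss(0, 0.5))
            z_i.append([z1, z2])
        y.append(y_i)
        x.append(x_i)
        z.append(z_i)
    return y, x, z


y, x, z = simulate()
beta_2sls, beta_gmm = two_step_gmm(y, x, z)
# OLS slope without intercept, for comparison (attenuated by the measurement error).
beta_ols = (sum(a * b for x_i, y_i in zip(x, y) for a, b in zip(x_i, y_i))
            / sum(a * a for x_i in x for a in x_i))
print(beta_ols, beta_2sls, beta_gmm)
```

In this simulated design the OLS slope is pulled toward zero, while both instrumental estimators stay close to the slope used to generate the data.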

In this context, our estimator (Racicot, 1993, 2013; Racicot and Théoret, 2012,
2014) is obtained by replacing the Z instruments by our new robust
instruments, which can be qualified as strong instruments. These new instruments, the d
“distance” instruments, can be computed using a sort of matrix-weighted average by
applying Generalized Least Squares (GLS) to a combination of two robust estimators,
namely the Durbin (1954) and Pal (1980) estimators. These estimators are respectively
defined, in their multivariate representation, by (Racicot, 1993, 2014):

\[ \boldsymbol{\beta}_D = (\mathbf{z}_1'\mathbf{x})^{-1}(\mathbf{z}_1'y) \quad \text{(Durbin)} \tag{22} \]

\[ \boldsymbol{\beta}_P = (\mathbf{z}_2'\mathbf{x})^{-1}(\mathbf{z}_2'y) \quad \text{(Pal)} \tag{23} \]

where $\mathbf{z}_1 = [x_{ij}^2]$, $\mathbf{z}_2 = \mathbf{z}_3 - 3\mathbf{x}D(\mathbf{x}'\mathbf{x}/N)$, $\mathbf{z}_3 = [x_{ij}^3]$, and $D(\mathbf{x}'\mathbf{x}/N) = (\mathbf{x}'\mathbf{x}/N) \circ \mathbf{I}_k$. In
short, the instruments are obtained by taking the matrix of explanatory variables (X) in
deviation from its mean (x). The second and third powers of the de-meaned variables (x) are
then computed. This is akin to computing the second and third moments of the explanatory
variables. Next, we obtain the weighted estimator ($\boldsymbol{\beta}_H$) by an application of GLS to the
following combination (Racicot, 1993, 2014):

\[ \boldsymbol{\beta}_H = \mathbf{W}\begin{pmatrix}\boldsymbol{\beta}_D \\ \boldsymbol{\beta}_P\end{pmatrix} \tag{24} \]

where $\mathbf{W} = (\mathbf{C}'\mathbf{S}^{-1}\mathbf{C})^{-1}\mathbf{C}'\mathbf{S}^{-1}$ is the GLS weighting matrix, S is the covariance matrix of
$(\boldsymbol{\beta}_D', \boldsymbol{\beta}_P')'$ under the null hypothesis (i.e., no measurement errors), and $\mathbf{C} = \begin{pmatrix}\mathbf{I}_k \\ \mathbf{I}_k\end{pmatrix}$ is a matrix of
two stacked identity matrices of dimension k. Note that this weighting approach, which
relies on GLS as the weighting matrix, is optimal in the Aitken (1935) sense¹⁸. However,
we opt for the GMM method to weight the Durbin and Pal estimators. We consider this
to be a more efficient procedure than the one used by Dagenais and Dagenais (1997)¹⁹ in
that we rely on the asymptotic properties of the GMM estimator with respect to the
correction of heteroskedasticity and autocorrelation to weight the instruments obtained
with GLS. Note that when using GMM, we give up some efficiency gain in order to avoid
completely specifying the nature of the autocorrelation or heteroskedasticity of the
innovation and the DGP of the measurement errors (Hansen, 1982). Again, we consider this
a great advantage over the GLS estimator.

¹⁶ See the appendix for an EViews implementation of our robust instruments for the fixed effects model.

¹⁷ See also Arellano (2003) for a presentation of the GMM in a panel data context. Racicot and Théoret (2001, 2008) present the basics of the GMM approach with applications to finance.

¹⁸ Note that we use W as a weighting matrix in the GLS estimator (equation 24). As is well known, this matrix can be replaced by the White (1980) or the Newey-West (1987) HAC asymptotically consistent variance-covariance matrix. Racicot (1993, 2014) discusses the properties of the estimator in the context of the White (1980) matrix. This estimator is named βE. In this article, we rely on the HAC matrix. For the problem of cross-sectional correlation (or spatial correlation) see Driscoll and Kraay (1998).

¹⁹ For an application of Dagenais and Dagenais’ higher moments estimator to the Fama and French model, see Racicot and Coën (2007).

Now, the computation of the d instruments is undertaken with the following
equation (Racicot and Théoret, 2012, 2014):

\[ d_{it} = x_{it} - \hat{x}_{it} \tag{25} \]

Equation (25) may be considered as a filtered version of the endogenous variables. It removes some
of the nonlinearities embedded in the $x_{it}$. Equation (25) is thus a smoothed version of the $x_{it}$, which
might be regarded as a proxy for its long-term expected value, the relevant variables in the
asset pricing models being theoretically defined on the explanatory variables’ expected
values. To compute the $\hat{x}_{it}$ in (25), we perform the following regression using the z
(cumulant) instruments:

\[ x_{it} = \gamma_0 + \mathbf{z}\hat{\boldsymbol{\phi}} + \xi_t \tag{26} \]

which amounts to running a polynomial adjustment on each explanatory variable. The z
instruments are another version of those underlying the Durbin and Pal estimators, which are given by
(Dagenais and Dagenais, 1994):

\[ \mathbf{x} \circ \mathbf{x} \tag{27} \]

and

\[ \mathbf{x} \circ \mathbf{x} \circ \mathbf{x} - 3\mathbf{x}[\mathbf{D} \circ \mathbf{I}_k] \tag{28} \]

where x is the matrix of the explanatory variables expressed in deviation from their mean,
$\mathbf{D} = \mathbf{x}'\mathbf{x}/N$ so that $\mathbf{D} \circ \mathbf{I}_k = diag(\mathbf{x}'\mathbf{x}/N)$ is a diagonal matrix, the symbol $\circ$ stands for the Hadamard
product, an element-by-element matrix multiplication operator, and $\mathbf{I}_k$ is an identity matrix
of dimension k.
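The construction of the z and d instruments in equations (25) to (28) can be sketched for a single explanatory variable (k = 1) in plain Python. This is a minimal illustration under stated assumptions, not the EViews implementation referred to in the paper: the function names, the small linear solver and the sample values are hypothetical, and the regression (26) is run on the de-meaned variable with a constant.

```python
def solve(a, b):
    """Solve the small linear system a * coef = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[r][n] / m[r][r] for r in range(n)]


def distance_instrument(x_raw):
    """Return (z1, z2, d) for one explanatory variable."""
    n = len(x_raw)
    mean = sum(x_raw) / n
    x = [v - mean for v in x_raw]                # deviations from the mean
    m2 = sum(v * v for v in x) / n               # diag(x'x/N) when k = 1
    z1 = [v * v for v in x]                      # eq. (27): x o x
    z2 = [v ** 3 - 3.0 * v * m2 for v in x]      # eq. (28): x o x o x - 3x[D o I_k]
    # OLS of x on a constant, z1 and z2 (eq. 26) through the normal equations.
    regs = [[1.0, a, b] for a, b in zip(z1, z2)]
    xtx = [[sum(r[i] * r[j] for r in regs) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * v for r, v in zip(regs, x)) for i in range(3)]
    coef = solve(xtx, xty)
    fitted = [sum(c * r_k for c, r_k in zip(coef, r)) for r in regs]
    d = [v - f for v, f in zip(x, fitted)]       # eq. (25): d = x - x_hat
    return z1, z2, d


# Hypothetical skewed sample for one explanatory variable.
x_raw = [0.5, 1.2, 0.3, 2.9, 1.1, 0.7, 4.0, 0.2, 1.6, 0.9]
z1, z2, d = distance_instrument(x_raw)
print([round(v, 4) for v in d])
```

By construction, d is the OLS residual of x on the z instruments, so it is orthogonal to the constant and to both z columns in the sample.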

The GMMd estimator for panel data regression models is obtained by
replacing Z in (19) by our robust instruments d:

\[ \hat{\boldsymbol{\beta}}_{GMMd} = \left[\left(\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{d}_i\right)\hat{\mathbf{W}}^{-1}\left(\sum_{i=1}^{N}\mathbf{d}_i'\mathbf{X}_i\right)\right]^{-1}\left[\left(\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{d}_i\right)\hat{\mathbf{W}}^{-1}\left(\sum_{i=1}^{N}\mathbf{d}_i'\mathbf{y}_i\right)\right] \tag{29} \]

Equation (29) represents a generalization of Racicot and Théoret’s (2012, 2014) approach. It should
be noted, however, that in some applications, simplifications may be obtained. For
instance, consider the case of the SUR²⁰ models with repeated X, where the GLS estimator
reduces to applying OLS to each equation. Note also that our approach generates the same
number of instruments (d) as there are regressors. Further simplifications of (29) might
thus be obtained analogously to the case of the estimation of an equation that is exactly
identified. In that case, it can be shown that the 2SLS estimator reduces to the indirect least
squares estimator²¹.

²⁰ SUR stands for seemingly unrelated regression (Zellner, 1962).

Since the impetus for this article is theoretical, we are interested in the applicability
of equation (29) in other contexts. We considered, for instance, corporate governance
studies, a field of research that relies frequently on the panel data
framework. Ammann et al. (2013) report several studies where Tobin’s Q, a measure
of firm value, is used as the dependent variable and regressed on explanatory variables
such as the log of total assets, the ratios PPE/sales and EBIT/sales, a measure of
leverage, etc. Most of these studies report using the fixed effects model (e.g., Ammann et
al., 2011; Cremers and Ferrell, 2011). Moreover, Chhaochharia and Laeven (2009) use the
GMM estimation method with the same framework, while Aggarwal et al. (2009) rely on
the OLS and IV estimation methods. So it would seem that our approach given by equation
(29) could be quite useful not only for particular fields of finance, but also for any field of
research using panel data regression models.

4. Conclusion

We have demonstrated how our new approach based on higher moments and cumulants

(Racicot, 1993, 2014; Racicot and Théoret, 2012, 2014) can not only be used to deal with

the problem of errors in variables, but also generalized to tackle the estimation of panel

data regression models. We discussed the problem of weak instruments in the context of

the panel data framework and showed how our new robust instruments can be used in

conjunction with the GMM method to obtain a more reliable estimation method. We also

provided an example of where this method can be useful. We believe our estimation

approach has the potential to benefit the field of corporate finance, where commonly used

variables may contain significant measurement errors. We also believe that the field of

accounting could benefit from our method since some of the models used in that field
involve the use of the 2SLS method (e.g., Dobler, 2014). For instance, accrual models are
based on explanatory variables that contain significant measurement errors (e.g., Calmès,
Cormier, Racicot and Théoret, 2013a,b). More broadly, any field that employs econometric
methods like the panel data regression models could benefit from the bias reduction of the
approach presented in this article.

²¹ See Fomby et al. (1984), Theorem 21.2.1.

Further research is needed to investigate how our proposed approach behaves in
different panels or SUR frameworks²². For instance, we have deliberately omitted the case
of spatial correlation to simplify the generalization of our approach. There are also some
simplifications that arise when using repeated explanatory variables. For example, in the
case of the SUR model with repeated explanatory variables, the SUR model reduces to the
application of OLS to each equation separately.

²² In a simultaneous equations context, da Graça and Masson (2014) use a three-stage least squares estimator for investigating a structural event study of M&As effects. Our instruments may be useful in that context or a similar one to improve the robustness of the estimation process.

Acknowledgements

I thank William F. Rentz for useful comments. Financial support from the IPAG Business
School and the Social Sciences and Humanities Research Council (SSHRC) of Canada is
gratefully acknowledged.

References

Aggarwal, R., Erel, I., Stulz, R., Williamson, R. (2009) Differences in governance
practice between U.S. and foreign firms: measurement, causes, and consequences,
Review of Financial Studies, 22(8), 3131-3169.

Aitken, A.C. (1935) On least squares and linear combinations of observations,
Proceedings of the Royal Society of Edinburgh, 55, 42-48.

Ammann, M., Oesch, D., Schmid, M. (2013) The construction and valuation effect of
corporate governance indices, In Bell, A.R., Brooks, C., Prokopczuk, M., Eds., Handbook
of Research Methods and Applications in Empirical Finance, Edward Elgar, 314-340.

Ammann, M., Oesch, D., Schmid, M. (2011) Corporate governance and firm value:

international evidence, Journal of Empirical Finance, 18(1), 36-55.

Arellano, M. (2003) Panel Data Econometrics. Advanced Texts in Econometrics, Oxford
University Press.

Baltagi, B. H. (2001) Econometric Analysis of Panel Data, 2nd ed., Wiley.

Bound, J., Jaeger, D., Baker, R. (1995) Problems with instrumental variables estimation

when the correlation between the instruments and the endogenous explanatory variables

is weak, Journal of the American Statistical Association, 90, 443-450.

Calmès, C., Cormier, D., Racicot, F.E., Théoret, R. (2013a) Accruals, errors-in-variables,

and Tobin's q, Atlantic Economic Journal, 41(2), 193-195.

Calmès, C., Cormier, D., Racicot, F.E., Théoret, R. (2013b) Firms' Accruals and Tobin's

q. Aestimatio, the IEB International Journal of Finance , 6, 20-49.

Carhart, M.M. (1997) On persistence in mutual fund performance, Journal of Finance, 52

(1), 57-82.

Chhaochharia, V., Laeven, L. (2009) Corporate governance norms and practices, Journal

of Financial Intermediation, 18(3), 405-431.

Cremers, M., Ferrell, A. (2011) Thirty years of governance: firm valuation and stock

returns, Working Paper, Yale School of Management.

Dagenais, M.G., Dagenais, D.L. (1994) GMM estimators for linear regression models

with errors in the variables, Working Paper No. 94-04, CRDE, University of Montreal.

Dagenais, M.G., Dagenais, D.L. (1997) Higher moments estimators for linear regression

models with errors in the variables, Journal of Econometrics, 76(1-2), 193-221.

da Graça, T., Masson, R. (2014) A structural event study for M&As: an application in

corporate governance, Working Paper, Department of business administration, University

of Quebec – Outaouais (UQO).

Dobler, M. (2014) Auditor-provided non-audit services in listed and private family firms,

Managerial Auditing Journal, 29(5), 427-454.

Driscoll, J.C., Kraay, A.C. (1998) Consistent covariance matrix estimation with spatially

dependent panel data, Review of Economics and Statistics, 80(4), 549-560.

Durbin, J. (1954) Errors in variables, International Statistical Review, 22, 23-32.


Fomby, T. B., Hill, R.C., Johnson, S.R. (1984) Advanced Econometric Methods,

Springer-Verlag.

Fama, E.F., French, K.R. (1993) Common risk factors in the returns on stocks and bonds,

Journal of Financial Economics, 33, 3-56.

Greene, W.H. (2014) Class notes on the econometric analysis of panel data, Class 9,

Stern School of Business, New York University, Winter 2014.

Available at http://people.stern.nyu.edu/wgreene/Econometrics/PanelDataNotes.htm

Greene, W.H. (2018) Econometric Analysis, 8th ed., Pearson.

Gujarati, D., Porter, D. (2009) Basic Econometrics, 5th ed., McGraw-Hill.

Hahn, J., Hausman, J. (2003) Weak instruments: diagnosis and cures in empirical

economics, American Economic Review, 93, 118-125.

Hansen, L. (1982) Large sample properties of the generalized method of moments

estimators, Econometrica, 50, 1029-1054.

Langasken, Y., Van Rickeghem, M. (1974) A new method to estimate measurement errors

in national income account statistics: the Belgian case, International Statistical Review, 42

(3), 283-290.

Morgenstern, O. (1972) L'illusion statistique: précision et incertitude des données

économiques, Paris : Dunod.

Neftci, S. (2008) Principles of Financial Engineering, 2nd ed., Academic Press.

Nelson, C., Startz, R. (1990a) Some further results on the exact small sample properties of

the instrumental variables estimator, Econometrica, 58(4), 967-976.

Nelson, C., Startz, R. (1990b) The distribution of the instrumental variables estimator and

its t-ratio when the instrument is a poor one, Journal of Business, 63(1), S125-S140.

Newey, W., West, K. (1987) A simple positive semi-definite, heteroskedasticity and

autocorrelation consistent covariance matrix, Econometrica, 55, 703-708.

Pal, M. (1980) Consistent moment estimators of regression coefficients in the presence of

errors in variables, Journal of Econometrics, 14, 349-364.

Podivinsky, J.M. (1990) Econometric engineering : an update for PC Give, Journal of

Economic Surveys, 4(1), 109-113.

Racicot, F.E., Rentz, W.F., Théoret, R. (2018) Testing the new Fama and French factors

with illiquidity: A panel data investigation, Finance, 39(3), 45-102.


Racicot, F.E. (2015) Engineering robust instruments for panel data regression models

with errors in variables : a note, Applied Economics, 47(10), 981-989.

Racicot, F.E., Rentz, W.F. (2015) The Pástor-Stambaugh empirical model revisited:

Evidence from robust instruments, Journal of Asset Management, 16(5), 329-341.

Racicot, F.E., Théoret, R. (2014) Cumulant instrument estimators for hedge fund return

models with errors in variables, Applied Economics, 46(10), 1134-1149.

Racicot, F.E. (2014) Erreurs de mesure sur les variables économiques et financières, La

Revue des Sciences de Gestion, 267-268 (3/4), 79-103.

Racicot, F.E., Théoret, R. (2012) Optimally weighting higher-moment instruments to deal

with measurement errors in financial return models, Applied Financial Economics,

22(14), 1135-1146.

Racicot, F.E., Théoret, R. (2008) The Econometric Analysis of Hedge Fund Returns: An

Errors-in-Variables Perspective, Santa Cristina (Spain): Netbiblo.

Racicot, F.E., Coën, A. (2007) Capital asset pricing models revisited: evidence from

errors in variables, Economics Letters, 95(3), 443-450.

Racicot, F.E., Théoret, R. (2001) Traité d’économétrie financière : modélisation

financière, Québec : Presses de l’Université du Québec (PUQ).

Racicot, F.E. (1993) Techniques alternatives d’estimations et tests en présence d’erreurs

de mesure sur les variables explicatives, M.Sc. thesis, Department of Economics,

University of Montreal.

Available at https://papyrus.bib.umontreal.ca/xmlui/handle/1866/1076;jsessionid=0A6E7013F78B1BBCF5B06D538D1BD0E5

Stock, J., Watson, M. (2011) Introduction to Econometrics, 2nd ed., Pearson.

Theil, H., Goldberger, A.S. (1961) On pure and mixed statistical estimation in economics,

International Economic Review, 2, 65-78.

White, H. (1980) A heteroskedasticity-consistent covariance matrix estimator and a direct
test for heteroskedasticity, Econometrica, 48, 817-838.

Zellner, A. (1962) An efficient method of estimating seemingly unrelated regressions

and tests of aggregation bias, Journal of the American Statistical Association, 57, 500-

509.

Appendix

http://www.ruor.uottawa.ca/handle/10393/31745


This appendix shows the implementation process of our robust instrumental variables (IV) approach for estimating panel data regression models with errors in variables, relying on the robust instruments described in Racicot (2015). To illustrate our methodology, we use the Fama-French 12 industry portfolios and the four-factor model of Fama-French-Carhart (1993, 1997).

The implementation of our robust IV method for the fixed effects model can be

explained as follows.

The LSDV model calls for the fixed effects dummy variables, i.e., the sector dummies (sector1, sector2, …, sector12), to be generated using the EViews code displayed in Table 1.


Table 1

EViews code for generating the dummy variables for the LSDV model
_______________________________________________________________________

' Generate dummy variables for the LSDV model
smpl 1 7260

genr sector1=0
genr sector2=0
genr sector3=0
genr sector4=0
genr sector5=0
genr sector6=0
genr sector7=0
genr sector8=0
genr sector9=0
genr sector10=0
genr sector11=0
genr sector12=0

' Start of the for loop
for !i=0 to 7259

if indicator(!i+1)=1 then
genr sector1(!i)=1
else
genr sector1(!i)=0
endif

if indicator(!i+1)=2 then
genr sector2(!i)=1
else
genr sector2(!i)=0
endif

' ...
else
' ...
endif

next

________________________________________________________________________
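For readers working outside EViews, the construction in Table 1 can be sketched in Python with NumPy. This is only an illustrative sketch under stated assumptions: the function name make_sector_dummies and the toy indicator array are hypothetical and do not appear in the paper. The only idea taken from Table 1 is that an indicator series with sector codes 1, 2, ... drives one 0/1 column per sector.

```python
import numpy as np

def make_sector_dummies(indicator, n_sectors=12):
    """Return an (n_obs, n_sectors) 0/1 matrix; column j-1 flags sector j.

    Hypothetical helper: mirrors the if/else loop of Table 1 in vectorized form.
    """
    indicator = np.asarray(indicator)
    sectors = np.arange(1, n_sectors + 1)
    # Broadcasting compares every observation with every sector code at once.
    return (indicator[:, None] == sectors[None, :]).astype(float)

# Toy stacked panel (assumed data): 3 sectors with 2 observations each.
indicator = np.array([1, 1, 2, 2, 3, 3])
dummies = make_sector_dummies(indicator, n_sectors=3)
```

Each row of dummies contains exactly one 1, so the columns sum to a vector of ones, which is the property used later in the appendix when the constant term is replaced by the sector dummies.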

Once the dummy variables are computed, we use the EViews code displayed in Table 2 to compute the

robust instruments (this code is an extension of the code proposed in Racicot, 2014).


Table 2

EViews code for generating the robust instruments for the fixed effects model

_________________________________________________________________
' Errors on the 12 Fama-French sectors. A new method for panel data.
' Developed by F.E. Racicot (2005/05) and modified on 2013/05/30 and 2015/02/25 (see
' Racicot 1993, 2014, 2015).
' Instruments extracted from Beta H (Beta E), an improved GMM method
' for errors in variables in a multifactor model (model of Pastor and Stambaugh (2003), JPE)
' in the context of a panel regression with fixed effects.

smpl @all

' find size of workfile
series _temp = 1
!length = @obs(_temp)
delete _temp

' set fixed sample size
!ssize = 7260 ' # of observations per sector
!ssize1 = 1 ' # of sectors (12 sectors)
!size = 4

' Results matrices
matrix(!length-!ssize+1,20) testsstats
matrix(!length-!ssize+1,20) testsstats1
matrix(!length-!ssize+1,20) testsstats2
matrix(!length-!ssize+1,20) testsstats3
matrix(!length-!ssize+1,25) testsstats4
matrix(!length-!ssize+1,25) testsstats5

genr r=returns
genr x1=rm_rf
genr x2=smb
genr x3=hml
genr x4=umd

for !i=1 to !ssize1
smpl @first+(!i-1)*!ssize @first+!i*!ssize-1

genr x1barre=@mean(x1)
genr x2barre=@mean(x2)
genr x3barre=@mean(x3)
genr x4barre=@mean(x4)

genr px1=x1-x1barre
genr px2=x2-x2barre
genr px3=x3-x3barre
genr px4=x4-x4barre

vector pxi1=px1
vector pxi2=px2
vector pxi3=px3
vector pxi4=px4
genr z11=px1^2
genr z21=px2^2
genr z31=px3^2
genr z41=px4^2
vector zi1=z11
vector zi2=z21
vector zi3=z31
vector zi4=z41

' Create the matrix z1 (or zD)
matrix(!ssize,!size) z1
colplace(z1,zi1,1)
colplace(z1,zi2,2)
colplace(z1,zi3,3)
colplace(z1,zi4,4)

genr ybarre=@mean(r)
genr py1=r-ybarre
vector py=py1

' Compute z2 (or zP of Pal)
' vector w=px^3
genr w1=px1^3
genr w2=px2^3
genr w3=px3^3
genr w4=px4^3
vector wi1=w1
vector wi2=w2
vector wi3=w3
vector wi4=w4
' vector w=w1

matrix(!ssize,!size) w
colplace(w,wi1,1)
colplace(w,wi2,2)
colplace(w,wi3,3)
colplace(w,wi4,4)

' vector wt=@transpose(w)
matrix wt=@transpose(w)

matrix(!ssize,!size) px
colplace(px,pxi1,1)
colplace(px,pxi2,2)
colplace(px,pxi3,3)
colplace(px,pxi4,4)

matrix pxt=@transpose(px)
matrix pxtt=pxt*px/@obs(px1)

vector diagon1=@getmaindiagonal(pxtt)
matrix diagon=@makediagonal(diagon1)

matrix z21i=wt-(3*diagon*pxt)
matrix z2=@transpose(z21i)

matrix(!ssize,8) zz
matplace(zz,z1,1,1)
matplace(zz,z2,1,5)

' Artificial regression of px on zz to obtain xchap, or hchap (wchap)
' ls px zz
' With binary variables, the Moore-Penrose generalized inverse will probably be
' needed (i.e., if the number of binary explanatory variables exceeds the number of
' continuous variables)
matrix theta = (@inverse(@transpose(zz)*zz))*@transpose(zz)*px
' vector xchap=zz*theta
' vector wchap= px-xchap
matrix xchap=zz*theta
matrix wchap=px-xchap

matrix(!ssize,8) ww
matplace(ww,px,1,1)
matplace(ww,wchap,1,5)

mtos(wchap,wchap1)
mtos(py,py0)

equation temp2.ls x1 c ser01 ser02 ser03 ser04
genr residser01=resid
genr x1hat=x1-resid




' New GMM-d estimator of Racicot (2015) based on the Durbin-Pal-Dagenais cumulants for a panel
' regression model with fixed effects.
'equation temp1.ls r c x1 x2 x3 x4

' b=a: Andrews; b=nw, Newey West; q=quadratic: Bartlett default; i=iterated (or s=sequentially)
equation temp1.gmm(b=nw,i=500) r sector1 sector2 sector3 sector4 sector5 sector6 sector7 sector8 sector9 sector10 sector11 sector12 x1 x2 x3 x4 @ sector1 sector2 sector3 sector4 sector5 sector6 sector7 sector8 sector9 sector10 sector11 sector12 residser01 residser02 residser03 residser04

' Coef.
testsstats2(!i,1)=temp1.@coef(1)
testsstats2(!i,2)=temp1.@coef(2)
testsstats2(!i,3)=temp1.@coef(3)
testsstats2(!i,4)=temp1.@coef(4)
testsstats2(!i,5)=temp1.@coef(5)
testsstats2(!i,6)=temp1.@coef(6)
testsstats2(!i,7)=temp1.@coef(7)
testsstats2(!i,8)=temp1.@coef(8)
testsstats2(!i,9)=temp1.@coef(9)
testsstats2(!i,10)=temp1.@coef(10)
testsstats2(!i,11)=temp1.@coef(11)
testsstats2(!i,12)=temp1.@coef(12)
testsstats2(!i,13)=temp1.@coef(13)
testsstats2(!i,14)=temp1.@coef(14)
testsstats2(!i,15)=temp1.@coef(15)
testsstats2(!i,16)=temp1.@coef(16)
testsstats2(!i,17)=temp1.@rbar2
testsstats2(!i,18)=temp1.@dw
testsstats2(!i,19)=temp1.@jstat

' t-tests
testsstats3(!i,1)=temp1.@tstat(1)
testsstats3(!i,2)=temp1.@tstat(2)
testsstats3(!i,3)=temp1.@tstat(3)
testsstats3(!i,4)=temp1.@tstat(4)
testsstats3(!i,5)=temp1.@tstat(5)
testsstats3(!i,6)=temp1.@tstat(6)
testsstats3(!i,7)=temp1.@tstat(7)
testsstats3(!i,8)=temp1.@tstat(8)
testsstats3(!i,9)=temp1.@tstat(9)
testsstats3(!i,10)=temp1.@tstat(10)
testsstats3(!i,11)=temp1.@tstat(11)
testsstats3(!i,12)=temp1.@tstat(12)
testsstats3(!i,13)=temp1.@tstat(13)
testsstats3(!i,14)=temp1.@tstat(14)
testsstats3(!i,15)=temp1.@tstat(15)
testsstats3(!i,16)=temp1.@tstat(16)

' Results of the simple linear regression with fixed effects
equation temp2.ls r sector1 sector2 sector3 sector4 sector5 sector6 sector7 sector8 sector9 sector10 sector11 sector12 x1 x2 x3 x4 x1hat x2hat x3hat x4hat

' Coef
testsstats4(!i,1)=temp2.@coef(1)
testsstats4(!i,2)=temp2.@coef(2)
testsstats4(!i,3)=temp2.@coef(3)
testsstats4(!i,4)=temp2.@coef(4)
testsstats4(!i,5)=temp2.@coef(5)
testsstats4(!i,6)=temp2.@coef(6)
testsstats4(!i,7)=temp2.@coef(7)
testsstats4(!i,8)=temp2.@coef(8)
testsstats4(!i,9)=temp2.@coef(9)
testsstats4(!i,10)=temp2.@coef(10)
testsstats4(!i,11)=temp2.@coef(11)
testsstats4(!i,12)=temp2.@coef(12)
testsstats4(!i,13)=temp2.@coef(13)
testsstats4(!i,14)=temp2.@coef(14)
testsstats4(!i,15)=temp2.@coef(15)
testsstats4(!i,16)=temp2.@coef(16)
testsstats4(!i,17)=temp2.@coef(17)
testsstats4(!i,18)=temp2.@coef(18)
testsstats4(!i,19)=temp2.@coef(19)
testsstats4(!i,20)=temp2.@coef(20)
testsstats4(!i,21)=temp2.@rbar2
testsstats4(!i,22)=temp2.@dw
'testsstats(!i,12)=temp.@jstat

' t-tests
testsstats5(!i,1)=temp2.@tstat(1)
testsstats5(!i,2)=temp2.@tstat(2)
testsstats5(!i,3)=temp2.@tstat(3)
testsstats5(!i,4)=temp2.@tstat(4)
testsstats5(!i,5)=temp2.@tstat(5)
testsstats5(!i,6)=temp2.@tstat(6)
testsstats5(!i,7)=temp2.@tstat(7)
testsstats5(!i,8)=temp2.@tstat(8)
testsstats5(!i,9)=temp2.@tstat(9)
testsstats5(!i,10)=temp2.@tstat(10)
testsstats5(!i,11)=temp2.@tstat(11)
testsstats5(!i,12)=temp2.@tstat(12)
testsstats5(!i,13)=temp2.@tstat(13)
testsstats5(!i,14)=temp2.@tstat(14)
testsstats5(!i,15)=temp2.@tstat(15)
testsstats5(!i,16)=temp2.@tstat(16)
testsstats5(!i,17)=temp2.@tstat(17)
testsstats5(!i,18)=temp2.@tstat(18)
testsstats5(!i,19)=temp2.@tstat(19)
testsstats5(!i,20)=temp2.@tstat(20)

next

___________________________________________________________________________________________
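The matrix steps of Table 2 can also be sketched outside EViews. The following NumPy illustration is a minimal sketch under stated assumptions, not the author's implementation: the function name cumulant_instruments and the simulated regressors are hypothetical, and only the construction of z1, z2, zz, theta, xchap and wchap is mirrored. The subsequent per-regressor regressions and the GMM step are not reproduced.

```python
import numpy as np

def cumulant_instruments(x):
    """Mirror the z1/z2/zz/theta/xchap/wchap steps of Table 2 (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    px = x - x.mean(axis=0)              # regressors in deviation from their means
    z1 = px ** 2                         # squared deviations (z1, or zD, in Table 2)
    var = (px ** 2).sum(axis=0) / n      # main diagonal of px'px / n
    z2 = px ** 3 - 3.0 * px * var        # cubed deviations minus 3*variance*px (z2, or zP)
    zz = np.hstack([z1, z2])             # n x 2k instrument matrix
    # Artificial regression of px on zz (no intercept, as in the listing).
    theta, *_ = np.linalg.lstsq(zz, px, rcond=None)
    xchap = zz @ theta                   # fitted values
    wchap = px - xchap                   # residuals of the artificial regression
    return zz, xchap, wchap

# Simulated, skewed regressors (assumed data, for illustration only).
rng = np.random.default_rng(0)
x = rng.exponential(size=(500, 4))
zz, xchap, wchap = cumulant_instruments(x)
```

By construction, the residuals wchap are orthogonal to the columns of zz in the sample, which is the usual property of least squares residuals. The sketch uses a least squares solver rather than an explicit inverse, a numerical design choice that differs from the @inverse call in the listing.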

Note that in Table 2 we have used the sector dummies (i.e., the Fama-French 12 industry portfolios) as instruments in this example. This is appropriate because they are not stochastic. Furthermore, the constant term c is usually used as an instrument. In this case, c is decomposed into dummies (e.g., sector dummies) whose sum is equal to 1, or c. More precisely, the EViews code applied for estimating a simple linear regression, either in a cross-sectional or time-series setting, can be written as follows:

equation temp1.gmm(b=nw,i=500) r c x1 x2 x3 x4 @ c residser01 residser02 residser03 residser04


where there are four explanatory variables. In this case, which is the model of Fama-French-Carhart (1993, 1997), x1 = Rm-rf, x2 = SMB, x3 = HML and x4 = UMD. Note that c is used as an instrument because it is non-stochastic. For a panel data regression model with fixed effects, c can be decomposed into the Fama-French sector dummies for our application, and these can be used as instruments because the components are non-stochastic. The EViews code can thus be written as follows:

equation temp1.gmm(b=nw,i=500) r sector1 sector2 sector3 sector4 sector5 sector6 sector7 sector8 sector9 sector10 sector11 sector12 x1 x2 x3 x4 @ sector1 sector2 sector3 sector4 sector5 sector6 sector7 sector8 sector9 sector10 sector11 sector12 residser01 residser02 residser03 residser04

Note that c is a vector of ones and can be represented as follows

$$c = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}$$

For the fixed effects model, for example, c can be decomposed as

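$$c = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \end{pmatrix}$$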

where each column vector of the decomposition represents one sector's dummy variable. In this simple example, there are three sectors, and each sector is represented by a dummy vector defined over the six stacked observations (two observations per sector). The fact that the sector dummies sum to the constant term c justifies their use as instruments.
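The decomposition can be checked numerically. The short NumPy sketch below is illustrative only; the variable names are assumptions chosen to match the three-sector example. It also shows one practical consequence: because the dummies already sum to c, including c together with all sector dummies yields a rank-deficient matrix.

```python
import numpy as np

# Three sector dummies over six stacked observations, as in the example above.
sector1 = np.array([1, 1, 0, 0, 0, 0])
sector2 = np.array([0, 0, 1, 1, 0, 0])
sector3 = np.array([0, 0, 0, 0, 1, 1])
c = np.ones(6, dtype=int)

# The dummies add up to the constant term.
dummies_sum = sector1 + sector2 + sector3

# Stacking c next to all three dummies gives 4 columns but only rank 3.
design = np.column_stack([c, sector1, sector2, sector3])
rank = np.linalg.matrix_rank(design)
```

This is why the GMM specification lists sector1 to sector12 in place of c rather than in addition to it.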

It should be noted that when there are several random dummy variables in a particular model, the matrix to be inverted (the cross-product matrix @transpose(zz)*zz in Table 2) may not be easily invertible. One solution is to use the Moore-Penrose generalized inverse (in EViews: @pinverse()) in place of @inverse() (see Racicot, 1993, 2014 or Theil, 1971). When this algorithm is used, the parameter estimates will be similar to the OLS ones, but the t-statistics will differ because the GMM estimator uses the HAC matrix.
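To illustrate the role of the generalized inverse, the sketch below builds a deliberately collinear regressor matrix (a constant plus a full set of dummies) and solves the least squares problem with numpy.linalg.pinv, used here as the NumPy analogue of a Moore-Penrose inverse. The data and variable names are assumptions made for the example and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 12
c = np.ones(n)
d = np.kron(np.eye(3), np.ones((4, 1)))   # 3 sector dummies, 4 observations each
z = np.column_stack([c, d])               # c equals the sum of the dummies
y = rng.standard_normal(n)                # simulated dependent variable

ztz = z.T @ z
rank = np.linalg.matrix_rank(ztz)         # 3 although ztz is 4 x 4, so it is singular

# The ordinary inverse of ztz does not exist; the Moore-Penrose inverse still
# delivers the minimum-norm least squares solution.
theta = np.linalg.pinv(ztz) @ z.T @ y
fitted = z @ theta                        # projection of y on the columns of z
```

The individual coefficients in theta are not uniquely identified in this collinear case, but the fitted values are: they coincide with the sector means of y.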

Finally, a concluding remark on the proposed method is in order. As mentioned previously, in our approach the number of instruments used in the GMM algorithm is equal to the number of parameters to be estimated. This implies that 2SLS (GMM) reduces to indirect least squares (ILS), which makes the proposed method a parsimonious one. Simply put, the difference between the GMM estimator and 2SLS is that GMM uses the HAC matrix, while 2SLS is not robust to autocorrelation or heteroskedasticity. Thus, our approach has the advantage of tackling these problems simultaneously while reducing to a simple ILS method, one that closely resembles the betaE estimator shown in Racicot (1993, 2014).
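The exactly identified case can be illustrated with a small simulation. The NumPy sketch below uses simulated data and hypothetical variable names; it is not the paper's application. It checks that, when the number of instruments equals the number of parameters, the 2SLS point estimates coincide with the simple IV formula (Z'X)^{-1}Z'y. In this case the GMM weighting matrix does not affect the point estimates, so a HAC weighting matrix changes only the standard errors.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 400, 3
z = rng.standard_normal((n, k))                    # k instruments
x = z + 0.5 * rng.standard_normal((n, k))          # k regressors correlated with z
beta_true = np.array([1.0, -0.5, 2.0])             # assumed coefficients
y = x @ beta_true + rng.standard_normal(n)

# 2SLS: project x on z, then use the projection in the normal equations.
x_proj = z @ np.linalg.solve(z.T @ z, z.T @ x)
beta_2sls = np.linalg.solve(x_proj.T @ x, x_proj.T @ y)

# Simple IV formula, valid when z and x have the same number of columns.
beta_iv = np.linalg.solve(z.T @ x, z.T @ y)
```

The two estimates agree up to floating-point error, which is the algebraic identity behind the statement that 2SLS (GMM) reduces to a simpler estimator under exact identification.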
