Introduction to Panel Data Analysis

Francesca Di Iorio

Dipartimento di Scienze Politiche, Università di Napoli Federico II

Summer School "Advanced Econometric Tools for Economic Analysis" (AETEA)

9-10 September 2013

F. Di Iorio (UniNa FII) Panel data AETEA 2013 1 / 75

Summary

1 Panel Data: definitions

2 Advantages and Problems

3 Models and estimation methods

4 Introduction to dynamic panels

5 Limited dependent variables

References

Arellano, M. (2003) Panel Data Econometrics, Oxford University Press, Oxford

Baltagi, B. (2001) Econometric Analysis of Panel Data, 2nd edn, Wiley, New York

Hsiao, C. (2003) Analysis of Panel Data, 2nd edn, Cambridge University Press, Cambridge

Hsiao, C. (2007) Panel data analysis - advantages and challenges, DOI 10.1007/s11749-007-0046-x

Wooldridge, J. (2010) Econometric Analysis of Cross Section and Panel Data, 2nd edn, MIT Press

Bontempi, Golinelli (2008) Econometria dei dati panel: teoria e pratica, http://www2.dse.unibo.it/golinelli/teaching/corsi/corsi e 2/materiali/ Lezioni panel EII

http://web.pdx.edu/~crkl/ec510/examples.htm

Definitions

Panel data, or longitudinal data, contain time-series observations on a number of individuals.

Each observation carries two subscripts: $x_{it}$, with $i = 1, \dots, n$ and $t = 1, \dots, T$, is the value observed for unit $i$ at time $t$.

Examples:

a random sample of households or firms observed over several months or years

a sample of air pollution stations in a number of cities, for several countries, over a given time period

a given number of variables observed in different countries over a given time span

Definitions

The data can be collected by sample survey (e.g. the consumption survey run by the National Statistical Office)

or by assembling a set of homogeneously defined time series for different units (e.g. the GDP of the EU Member States from 2000Q1 to 2012Q4).

A panel can be balanced (all individuals are observed in the same time periods) or unbalanced.
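As a minimal sketch (not part of the original slides), the balanced/unbalanced distinction can be checked on long-format data by comparing the set of periods observed for each unit. The function name `is_balanced` and the toy unit labels are hypothetical.

```python
from collections import defaultdict

def is_balanced(rows):
    """rows: iterable of (unit, period) pairs, one per observation (long format)."""
    periods_by_unit = defaultdict(set)
    for unit, period in rows:
        periods_by_unit[unit].add(period)
    # Balanced: every unit is observed in exactly the same set of periods
    all_periods = set().union(*periods_by_unit.values())
    return all(p == all_periods for p in periods_by_unit.values())

# Hypothetical panel: 3 units observed over 3 years
balanced = [(i, t) for i in ("A", "B", "C") for t in (2000, 2001, 2002)]
# Drop one observation to make the panel unbalanced
unbalanced = [r for r in balanced if r != ("B", 2001)]

print(is_balanced(balanced), is_balanced(unbalanced))
```

In Stata the same information is reported by `xtset`, which describes the panel as balanced or unbalanced.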

Definitions

Aim: analyze a phenomenon as a function of explanatory variables, taking individual effects into account.

The first relevant studies date back to the 1960s (the US National Longitudinal Surveys of Labor Market Experience, NLS).

Early analyses and methods were developed for large $n$ and small $T$.

Three factors contributed to the geometric growth of panel data analysis:

data availability

challenging methodology

IT developments that made complex estimation methods feasible

Advantages of panel data

More accurate inference on model parameters (more data ⇒ more degrees of freedom ⇒ more efficient estimation)

Greater capacity to model the complexity of the phenomenon than a single cross-section or a single time series

Estimation of dynamic relationships even with a small number of time periods

Simpler computation and statistical inference under given conditions

Phenomenon complexity

Observing a sample of units over time makes it possible to study how the phenomenon evolves in response to the dynamics of the explanatory variables (e.g. treatment effects)

Controlling the impact of omitted variables

Uncovering dynamic relationships

Generating more accurate predictions for individual outcomes

Providing micro foundations for aggregate data analysis, especially when the micro-data are heterogeneous and the representative agent assumption is difficult to sustain

Simplifying statistical inference

For panels with non-stationary time series, inference procedures can be simpler than in standard time series analysis: under the assumption of independence between units, the Central Limit Theorem can be applied across units.

Measurement errors can lead to under-identification of an econometric model. Having more observations for each individual allows transformations of the data that help identification.

Useful with limited dependent variable models (probit, tobit), especially with dynamic specifications

Problems

Attrition: individuals leave the panel for many different reasons (e.g. death, bankruptcy, merger, other causes of non-response).

If attrition is random and does not depend on individual behavior, the results remain unbiased.

If attrition is not random, there are selectivity problems and the results are affected in turn (a possible solution is a rotating panel).

Models and methods

The Model

The main model for panel data analysis is derived from the multiple regression model

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + \varepsilon_i = \beta_0 + \beta' x_i + \varepsilon_i$$

Panel data analysis focuses on individual outcomes.

The factors affecting individual outcomes are numerous (explanatory variables as well as unobservable individual effects), and summarizing them in a model is very complex.

Individual effects go by different names (unobservable components, latent variables, unobservable heterogeneity) and are generally assumed to be time-invariant.

Modeling choices therefore have to be made.

Pooled Model

The pooled model assumes homogeneity of all coefficients, i.e. the absence of individual effects.

$$y_{it} = \alpha + \beta_1 x_{1it} + \beta_2 x_{2it} + \dots + \beta_k x_{kit} + \varepsilon_{it}, \quad i = 1, \dots, n, \quad t = 1, \dots, T$$

with $\varepsilon_{it} \sim \text{iid}$ for all $i$ and $t$, $E(\varepsilon_{it}) = 0$ and $E(\varepsilon\varepsilon') = \sigma^2 I_{nT}$

The error variance is the same for all individuals in each period, there is no autocorrelation, and the error terms of different individuals are uncorrelated within a given time period (no contemporaneous correlation).

The pooled estimator is OLS without further adjustments, i.e. $k + 2$ parameters (the intercept, $k$ slopes and $\sigma^2$) are estimated from $nT$ observations.

Models and methods: traditional view

A classic model: the Fixed Effects Model

One of the easiest ways to take individual factors into account is to consider the model

$$y_{it} = \alpha_i + \beta_1 x_{1it} + \beta_2 x_{2it} + \dots + \beta_k x_{kit} + \varepsilon_{it} = \alpha_i + \beta' x_{it} + \varepsilon_{it}$$

with $t = 1, \dots, T$ and $i = 1, \dots, n$, where

$\beta_j$, $j = 1, \dots, k$, is the same for all units, i.e. each variable affects all individuals in the same way (homogeneity)

unobservable individual factors are represented by the individual-specific constant $\alpha_i$ (unobservable individual heterogeneity).

Figs. 1-3 show several examples and a comparison with the pooled model.

Fig. 1

Fig. 2

Fig. 3

Fixed Effects Model

In the traditional view of panel data this model is known as the Fixed Effects Model. The model for a single unit $i$ is

$$\begin{pmatrix} y_{i1} \\ \vdots \\ y_{iT} \end{pmatrix} = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} \alpha_i + \begin{pmatrix} x_{1i1} & \dots & x_{ki1} \\ \vdots & \ddots & \vdots \\ x_{1iT} & \dots & x_{kiT} \end{pmatrix} \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_k \end{pmatrix} + \begin{pmatrix} \varepsilon_{i1} \\ \vdots \\ \varepsilon_{iT} \end{pmatrix}$$

or, in matrix form,

$$y_i = \iota_T \alpha_i + X_i \beta + \varepsilon_i$$

How can this model be estimated?

Fixed Effects Model

The model for all observations is

$$\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} \iota_T & 0 & \dots & 0 \\ 0 & \iota_T & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \iota_T \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{pmatrix} + \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{pmatrix} \beta + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}$$

or, in matrix form,

$$y = D_n \alpha + X \beta + \varepsilon$$

where

1 $D_n$ is an $nT \times n$ matrix of individual dummies

2 $y$ and $\varepsilon$ are $nT \times 1$ vectors

3 $\alpha$ is an $n \times 1$ vector and $\beta$ is a $k \times 1$ vector

4 $X$ is an $nT \times k$ matrix

LSDV

The estimator is LSDV (Least Squares Dummy Variables), i.e. OLS on the model for all observations.

1 $\hat\beta$ is BLUE

2 computationally burdensome if $n$ is large

3 for large $k$ the estimates may be imprecise

4 great loss of degrees of freedom

5 often we are not interested in each parameter $\alpha_i$ when $n$ is large

6 Solution: the Within transformation

Deviance decomposition

The deviance can be split into two parts: temporal (within individuals) and between individuals. Define the individual time means as

$$\bar y_{i.} = \frac{1}{T} \sum_{t=1}^{T} y_{it}, \qquad \bar x_{i.} = \frac{1}{T} \sum_{t=1}^{T} x_{it}, \quad \text{etc.}$$

and the overall means as

$$\bar y_{..} = \frac{1}{nT} \sum_{i=1}^{n} \sum_{t=1}^{T} y_{it}, \qquad \bar x_{..} = \frac{1}{nT} \sum_{i=1}^{n} \sum_{t=1}^{T} x_{it}, \quad \text{etc.}$$

The decomposition is

$$(y_{it} - \bar y_{..}) = \underbrace{(y_{it} - \bar y_{i.})}_{\text{within}} + \underbrace{(\bar y_{i.} - \bar y_{..})}_{\text{between}}, \quad \text{etc.}$$

Deviance decomposition

Consider the sums of squares

$$W_{yy} = \sum_{i=1}^{n} \sum_{t=1}^{T} (y_{it} - \bar y_{i.})^2$$

$$B_{yy} = T \sum_{i=1}^{n} (\bar y_{i.} - \bar y_{..})^2$$

$$T_{yy} = \sum_{i=1}^{n} \sum_{t=1}^{T} (y_{it} - \bar y_{..})^2$$

Then

$$T_{yy} = W_{yy} + B_{yy}$$

and similarly for $x$ and for the codeviance between $x$ and $y$.
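The decomposition of the total deviance into within and between parts can be checked numerically. The following is a small sketch on hypothetical data for a balanced panel with two units; the function name `deviance_parts` is an assumption of this example, not something defined in the slides.

```python
def deviance_parts(panel):
    """panel: dict unit -> list of T observations (balanced panel)."""
    all_values = [v for series in panel.values() for v in series]
    grand_mean = sum(all_values) / len(all_values)
    within = between = 0.0
    for series in panel.values():
        unit_mean = sum(series) / len(series)
        # Within: deviations from the unit's own time mean
        within += sum((v - unit_mean) ** 2 for v in series)
        # Between: T times the squared deviation of the unit mean from the grand mean
        between += len(series) * (unit_mean - grand_mean) ** 2
    total = sum((v - grand_mean) ** 2 for v in all_values)
    return within, between, total

# Hypothetical data: two units, three periods each
y = {"A": [2.0, 4.0, 6.0], "B": [18.0, 20.0, 22.0]}
w, b, t = deviance_parts(y)
print(w, b, t)
```

With these numbers most of the total deviance is between units, which is the situation in which ignoring individual effects is most misleading.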

Within Transformation

The Within transformation of the model is

$$(y_{it} - \bar y_{i.}) = (x_{it} - \bar x_{i.})\beta + (\varepsilon_{it} - \bar\varepsilon_{i.})$$

or, writing the transformed variables with a tilde,

$$\tilde y_{it} = \tilde x_{it}\beta + \tilde\varepsilon_{it}$$

The estimator is

$$\hat\beta_W = (\tilde X'\tilde X)^{-1}\tilde X'\tilde y$$

i.e. OLS on the Within-transformed model.

Within Transformation

The $\alpha_i$ drop out of the equation and are treated as nuisance parameters

$\hat\beta_W$ coincides with the LSDV estimator of $\beta$

$\hat\beta_W$ is BLUE and asymptotically normal

$\hat\beta_W$ is consistent for large $n$ and fixed $T$, for fixed $n$ and large $T$, and for both $n$ and $T$ large

$\hat\alpha_{i,W} = \bar y_{i.} - \bar x_{i.}\hat\beta_W$, with the identification constraint $\sum_{i=1}^{n} \alpha_i = 0$

$\hat\alpha_{i,W}$ is not consistent for large $n$ (the number of parameters grows with $n$: incidental parameters problem)
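A minimal sketch of the Within estimator for a single regressor, compared with pooled OLS. The data are hypothetical and generated without noise as $y_{it} = \alpha_i + 2x_{it}$ with $\alpha_A = 0$ and $\alpha_B = 10$, so that the individual effect is correlated with $x$. The helper names `slope` and `demean` are assumptions of this example.

```python
def slope(xs, ys):
    """OLS slope of ys on xs (regression with intercept, single regressor)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def demean(series):
    """Within transformation: subtract the unit's time mean."""
    m = sum(series) / len(series)
    return [v - m for v in series]

# Hypothetical data: y_it = alpha_i + 2 * x_it, alpha_A = 0, alpha_B = 10
x = {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]}
y = {"A": [2.0, 4.0, 6.0], "B": [18.0, 20.0, 22.0]}

# Pooled OLS ignores the individual effects
b_pooled = slope([v for u in x for v in x[u]], [v for u in x for v in y[u]])

# Within estimator: OLS on unit-demeaned data
b_within = slope([v for u in x for v in demean(x[u])],
                 [v for u in x for v in demean(y[u])])

# Recover the individual effects as ybar_i - xbar_i * b_within
alpha_hat = {u: sum(y[u]) / len(y[u]) - b_within * sum(x[u]) / len(x[u]) for u in x}

print(b_pooled, b_within, alpha_hat)
```

The Within slope recovers the true value 2 and the two individual constants, while the pooled slope is pushed upward because units with larger $x$ also have larger $\alpha_i$.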

First Difference Transformation

Like the Within transformation, the first-difference transformation eliminates the individual effects. It is obtained by subtracting from $y_{it} = \alpha_i + x_{it}\beta + \varepsilon_{it}$ the same equation at time $t-1$. Let

$$\Delta y_{it} = y_{it} - y_{i,t-1}, \qquad \Delta x_{it} = x_{it} - x_{i,t-1}, \qquad \Delta\varepsilon_{it} = \varepsilon_{it} - \varepsilon_{i,t-1}$$

The model becomes

$$\Delta y_{it} = \Delta x_{it}\beta + \Delta\varepsilon_{it}$$

and is estimated by OLS.

$\hat\beta_{FD}$ coincides with the LSDV estimator of $\beta$ when $T = 2$

$\hat\beta_{FD}$ is unbiased and consistent

If $\hat\beta_{FD}$ and $\hat\beta_W$ (i.e. LSDV) give different results for $\beta$, then some assumptions of the FE model are not supported by the data.
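With exactly two periods the first-difference and Within estimators give the same slope, which can be verified on a small example. This is a sketch on hypothetical data with a single regressor; `fd_slope` and `within_slope` are names chosen for this illustration.

```python
def fd_slope(x, y):
    """OLS without intercept on first differences, pooled over units."""
    num = den = 0.0
    for u in x:
        dx = [b - a for a, b in zip(x[u], x[u][1:])]
        dy = [b - a for a, b in zip(y[u], y[u][1:])]
        num += sum(a * b for a, b in zip(dx, dy))
        den += sum(a * a for a in dx)
    return num / den

def within_slope(x, y):
    """OLS on deviations from unit means, pooled over units."""
    num = den = 0.0
    for u in x:
        mx = sum(x[u]) / len(x[u])
        my = sum(y[u]) / len(y[u])
        num += sum((a - mx) * (b - my) for a, b in zip(x[u], y[u]))
        den += sum((a - mx) ** 2 for a in x[u])
    return num / den

# Hypothetical panel with T = 2
x = {"A": [1.0, 3.0], "B": [2.0, 5.0], "C": [4.0, 4.5]}
y = {"A": [2.0, 7.0], "B": [1.0, 6.0], "C": [3.0, 5.0]}

b_fd = fd_slope(x, y)
b_w = within_slope(x, y)
print(b_fd, b_w)
```

For $T > 2$ the two estimators generally differ in finite samples, which is what makes their comparison informative about the FE assumptions.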

Random Effects Model

Another way of dealing with the unobservable effects is the Random Effects Model: the individual heterogeneity $\alpha_i$ is treated as a random sample from a given distribution.

In matrix form the Random Effects Model is

$$y_i = \iota_T\alpha_i + X_i\beta + \varepsilon_i$$

Assumption: $E(\alpha_i \mid X_i) = 0$, $i = 1, \dots, n$, i.e. the individual effects are uncorrelated with the explanatory variables

Assumption: $Var(\alpha_i) = \sigma^2_\alpha$, $i = 1, \dots, n$

Given the structure of the error covariance matrix, the appropriate estimator is GLS.

Modern view

In practice interest is concentrated on $\beta$ (the structural parameters), neglecting the other coefficients (the incidental parameters).

It then makes more sense to ask whether or not the individual factors are correlated with the explanatory variables, especially with large $n$.

In the modern view of panel data the $\alpha_i$ are always random, and "random effects" means that the unobservable individual factors are uncorrelated with the explanatory variables: $Cov(\alpha_i, x_{it}) = 0$, $t = 1, \dots, T$.

"Fixed effects" means $Cov(\alpha_i, x_{it}) \neq 0$, $t = 1, \dots, T$.

Treating the $x_{it}$ as random variables requires discussing the effects of $y_{it}$ on $x_{is}$ for $s > t$ (exogeneity).

Modern view

Consider $y_{it} = \alpha_i + \beta' x_{it} + \varepsilon_{it}$.

Strict exogeneity of $x_{it}$, $t = 1, \dots, T$, conditional on the unobserved effect:

$$E(y_{it} \mid x_{i1}, \dots, x_{iT}, \alpha_i) = E(y_{it} \mid x_{it}, \alpha_i) = x_{it}\beta + \alpha_i, \quad t = 1, \dots, T$$

The first equality says that, conditional on $x_{it}$ and $\alpha_i$, $x_{is}$ has no partial effect on $y_{it}$ when $s \neq t$.

The condition can be rewritten in terms of the error term:

$$E(\varepsilon_{it} \mid x_{i1}, \dots, x_{iT}, \alpha_i) = 0, \quad t = 1, \dots, T$$

which implies that the errors and the explanatory variables are uncorrelated across all periods:

$$E(x_{is}'\varepsilon_{it}) = 0, \quad s, t = 1, \dots, T$$

Modern view

The model can be rewritten as $y_{it} = \mu + \beta' x_{it} + u_{it}$, where $u_{it} = \varepsilon_{it} + \alpha_i$ (error component model).

For this model, pooled OLS is consistent for $\beta$ under $E(x_{it}' u_{it}) = 0$, i.e. if $E(x_{it}'\varepsilon_{it}) = 0$ and $E(x_{it}'\alpha_i) = 0$.

The condition $E(x_{it}'\alpha_i) = 0$ is quite restrictive, and $\alpha_i$ induces serial correlation in $u_{it}$; moreover, the correlation between $u_{it}$ and $u_{is}$ does not decrease as $|t - s|$ grows ($u_{it}$ is not weakly dependent).

Modern view: random effects analysis

The model is once again

$$y_{it} = \mu + \beta' x_{it} + u_{it}$$

where $u_{it} = \varepsilon_{it} + \alpha_i$, with $x_i = (x_{i1}, \dots, x_{iT})$. The assumptions are:

1 strict exogeneity: $E(\varepsilon_{it} \mid x_i, \alpha_i) = 0$, $t = 1, \dots, T$

2 the effects and the explanatory variables are orthogonal: $E(\alpha_i \mid x_i) = E(\alpha_i) = 0$

3 $E(\varepsilon_{it}) = 0$; $E(\varepsilon_{it}\varepsilon_{js}) = \sigma^2_\varepsilon$ for $t = s$ and $j = i$; $E(\varepsilon_{it}\varepsilon_{js}) = 0$ for $t \neq s$ or $j \neq i$

4 $E(\alpha_i) = 0$; $E(\alpha_i\alpha_j) = \sigma^2_\alpha$ for $j = i$; $E(\alpha_i\alpha_j) = 0$ for $j \neq i$

5 $E(\alpha_i\varepsilon_{it}) = 0$

Modern view: random effects analysis

Under the previous assumptions $E(u_{it}^2) = \sigma^2_\alpha + \sigma^2_\varepsilon$, and

$$\Omega = E(u_i u_i') = \sigma^2_\varepsilon I_T + \sigma^2_\alpha \iota_T\iota_T'$$

which can be rewritten as

$$\Omega = \sigma^2_\varepsilon\left[Q + \frac{1}{\psi}(I_T - Q)\right]$$

where $\psi = \dfrac{\sigma^2_\varepsilon}{\sigma^2_\varepsilon + T\sigma^2_\alpha}$ and $Q = I_T - \frac{1}{T}\iota_T\iota_T'$ (random effects structure). Then

$$\Omega^{-1} = \frac{1}{\sigma^2_\varepsilon}\left[Q + \psi(I_T - Q)\right]$$

for the GLS estimator.

Modern view: random effects analysis

$$\hat\beta_{GLS} = (X'\Omega^{-1}X)^{-1}(X'\Omega^{-1}Y)$$

In general $\sigma^2_\varepsilon$ and $\sigma^2_\alpha$ are unknown and must be estimated ⇒ FGLS

A preliminary estimate of $\sigma^2_\varepsilon$ is obtained from the fixed effects regression

Between estimator: OLS on $\bar y_{i.} = \mu + \alpha_i + \bar x_{i.}\beta + \bar\varepsilon_{i.}$

A preliminary estimate of $\sigma^2_\alpha$ is obtained from the Between regression

The Between estimator is consistent but not efficient if $\alpha_i$ is uncorrelated with $x_{it}$

GLS can be obtained by OLS on $y_{it} - \theta\bar y_{i.} = (1 - \theta)\mu + (x_{it} - \theta\bar x_{i.})\beta + (u_{it} - \theta\bar u_{i.})$, where $\theta = 1 - \psi^{1/2}$
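The quasi-demeaning weight $\theta = 1 - \psi^{1/2}$ ties the RE estimator to the two cases already seen: $\theta = 0$ gives pooled OLS and $\theta \to 1$ gives the Within estimator. A short sketch, with hypothetical variance values and function names (`re_theta`, `quasi_demean`) chosen for this example:

```python
import math

def re_theta(sigma2_eps, sigma2_alpha, T):
    """theta = 1 - sqrt(psi), with psi = sigma2_eps / (sigma2_eps + T * sigma2_alpha)."""
    psi = sigma2_eps / (sigma2_eps + T * sigma2_alpha)
    return 1.0 - math.sqrt(psi)

def quasi_demean(series, theta):
    """Subtract theta times the unit's time mean from each observation."""
    m = sum(series) / len(series)
    return [v - theta * m for v in series]

# Hypothetical variance components
theta_no_effects = re_theta(1.0, 0.0, 5)   # sigma2_alpha = 0: no individual effects
theta_mid = re_theta(1.0, 1.0, 3)          # psi = 1 / (1 + 3) = 0.25
theta_long = re_theta(1.0, 1.0, 10_000)    # very long panel

print(theta_no_effects, theta_mid, theta_long)
print(quasi_demean([2.0, 4.0, 6.0], theta_mid))
```

In Stata, the `theta` option of `xtreg, re` reports the estimated value of this weight.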

Modern view: fixed effects analysis

From a practical point of view the estimator of $\beta$ is the same as in the traditional view

The fixed effects estimator is OLS on the Within-transformed model

$\hat\beta_W$ therefore measures the relation between $y_{it}$ and $x_{it}$ in deviations from their time averages $\bar y_{i.}$ and $\bar x_{i.}$

Instead of reporting the estimates of the individual effects, we consider their variance

Test for individual effects

Consider fixed effects in the traditional view.

$H_0$: no individual effects (poolability test)

equivalent to

$$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_n = 0 \qquad H_1: \text{at least one } \alpha_i \neq 0$$

The usual F-test is

$$F = \frac{(RRSS - URSS)/(n - 1)}{URSS/(nT - n - k)}$$

where $RRSS$ is the residual sum of squares of the restricted (pooled) model and $URSS$ that of the unrestricted (LSDV) model; under $H_0$, $F \sim F(n - 1,\ nT - n - k)$.
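The arithmetic of the poolability F-test can be sketched as follows. The residual sums of squares below are hypothetical numbers chosen only to illustrate the formula (they are not results for any dataset), and `poolability_f` is a name introduced for this example.

```python
def poolability_f(rrss, urss, n, T, k):
    """F statistic for H0: no individual effects (pooled OLS vs LSDV)."""
    df_num = n - 1            # number of restrictions
    df_den = n * T - n - k    # residual degrees of freedom of the LSDV model
    f = ((rrss - urss) / df_num) / (urss / df_den)
    return f, df_num, df_den

# Hypothetical residual sums of squares for n = 10 units, T = 20 periods, k = 2 regressors
f, df1, df2 = poolability_f(rrss=200.0, urss=100.0, n=10, T=20, k=2)
print(f, df1, df2)
```

The statistic is then compared with the critical value of the $F(n - 1,\ nT - n - k)$ distribution.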

Test for random effects

Consider random effects.

$H_0$: no random effects

equivalent to

$$H_0: \sigma^2_\alpha = 0 \qquad H_1: \sigma^2_\alpha > 0$$

Breusch-Pagan test:

$$LM = \frac{nT}{2(T - 1)}\left(1 - \frac{\hat u'(I_n \otimes J_T)\hat u}{\hat u'\hat u}\right)^2 \sim \chi^2(1)$$

where $\hat u$ are the pooled model residuals and $J_T = \iota_T\iota_T'$ is the $T \times T$ matrix of ones.
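Since $\hat u'(I_n \otimes J_T)\hat u$ is the sum over units of the squared within-unit residual totals, the LM statistic needs only two sums. The sketch below uses two hypothetical residual patterns (toy numbers, not residuals from an actual regression); the function name is an assumption of this example.

```python
def breusch_pagan_lm(residuals):
    """residuals: dict unit -> list of T pooled-OLS residuals (balanced panel)."""
    n = len(residuals)
    T = len(next(iter(residuals.values())))
    # u'(I kron J_T)u: sum over units of (sum over time of the residuals) squared
    sum_sq_unit_totals = sum(sum(series) ** 2 for series in residuals.values())
    # u'u: plain sum of squared residuals
    sum_sq = sum(v ** 2 for series in residuals.values() for v in series)
    return n * T / (2 * (T - 1)) * (1 - sum_sq_unit_totals / sum_sq) ** 2

# Toy pattern 1: residuals show no common component within units
lm_none = breusch_pagan_lm({"A": [1.0, 0.0], "B": [0.0, -1.0]})
# Toy pattern 2: residuals are constant within each unit (strong individual effect)
lm_strong = breusch_pagan_lm({"A": [1.0, 1.0], "B": [-1.0, -1.0]})
print(lm_none, lm_strong)
```

The statistic is zero when the within-unit cross-products of the residuals cancel out and grows as residuals of the same unit move together.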

Random or fixed effects model?

The choice between an FE and an RE model depends on several factors:

If the individual effects have many unobservable causes, use RE

If $n$ is large and $T$ is small, FE leaves few degrees of freedom: use RE

If we are interested only in $\beta$, use RE

It depends on the sample: e.g. with countries or sectors, FE can be interesting; with a random sample from a population, use RE

Random or fixed effects model?

Further considerations:

FE: easy to estimate

LSDV is robust to the omission of time-invariant variables and is still consistent even when RE is valid

RE: saves many degrees of freedom (especially with large $n$)

RE: takes both the between and the within variance into account

RE: allows time-invariant explanatory variables (e.g. gender, sector)

Obviously there are tests ...

Test for random vs fixed effects

Idea: compare an estimator that is consistent and efficient under $H_0$ but inconsistent under the alternative with an estimator that is consistent under both hypotheses, for example with the Hausman test. Let $\hat\Sigma = Cov(\hat\beta_{FE}) - Cov(\hat\beta_{RE})$; then

$$(\hat\beta_{FE} - \hat\beta_{RE})'\,\hat\Sigma^{-1}(\hat\beta_{FE} - \hat\beta_{RE}) \sim \chi^2(rank(\hat\Sigma))$$

$H_0: E(x_{it}\alpha_i) = 0$: RE is consistent ⇒ the RE and FE estimates should be similar

$H_1: E(x_{it}\alpha_i) \neq 0$: FE is consistent and RE is inconsistent ⇒ the RE and FE estimates should differ

The test also favors the RE model when the difference between the FE and RE estimates is non-significant only because the estimates are imprecise ⇒ look at the point estimates as well.
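For a single coefficient the Hausman statistic reduces to a squared difference divided by a difference of variances. The sketch below uses hypothetical estimates and variances (not results from any dataset); `hausman_scalar` is a name introduced for this illustration.

```python
def hausman_scalar(b_fe, b_re, var_fe, var_re):
    """Hausman statistic for one coefficient: (b_FE - b_RE)^2 / (Var_FE - Var_RE)."""
    diff_var = var_fe - var_re
    if diff_var <= 0:
        # Can happen in finite samples; the scalar statistic is then not defined
        raise ValueError("Var(FE) - Var(RE) must be positive")
    return (b_fe - b_re) ** 2 / diff_var

# Hypothetical FE and RE estimates of the same slope, with their estimated variances
h = hausman_scalar(b_fe=0.5, b_re=0.4, var_fe=0.02, var_re=0.01)
print(h)
```

The value is compared with a $\chi^2(1)$ critical value (3.84 at the 5% level); a statistic below it does not reject $H_0$, so RE would be retained in this hypothetical case.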

Time effects

In addition to individual effects, time effects can also be considered, i.e. variables that are constant across individuals but vary over time. They capture the impact of unobserved variables that affect all units in a given time period but can vary over time:

$$y_{it} = \mu + \beta' x_{it} + \delta z_t + u_{it}$$

Here too $\delta$ can be treated as a nuisance parameter and removed with the transformation

$$y_{it} - \bar y_{.t} = \beta'(x_{it} - \bar x_{.t}) + (u_{it} - \bar u_{.t})$$

where $\bar y_{.t} = (1/n)\sum_{i=1}^{n} y_{it}$, etc. Often specific time dummies are used to account for particular events.

Example 1

Model of investment demand (Greene (2008), pp. 250-252; Grunfeld and Griliches (1960))

$$I_{it} = \alpha_i + \beta_1 F_{it} + \beta_2 C_{it} + u_{it}, \quad i = 1, \dots, 10, \quad t = 1935, \dots, 1954$$

where

10 US firms: GM, CH, GE, WE, US, AF, DM, GY, UN, IBM

$I_{it}$ = gross investment

$F_{it}$ = market value

$C_{it}$ = capital

Example 1: economic theory

Investment spending (I) is affected by many factors; studying its dynamics helps to understand fluctuations in aggregate demand, the future value of the capital stock and hence future labor productivity. Several theories argue that expected profits and the desired capital stock affect investment demand. Expected profits are measured by the market value of the shares (F); the capital stock (C) depends on expectations. The model is assumed to be linear and static.

Example 1: the data

imprese = firm identifier, anno = year

imprese  anno        I        F        C
1        1935    317.6   3078.5      2.8
1        1936    391.8   4661.7     52.6
...      ...       ...      ...      ...
1        1953   1304.4   6241.7   1777.3
1        1954   1486.7   5593.6   2226.3
2        1935    40.29    417.5     10.5
2        1936    72.76    837.8     10.2
...      ...       ...      ...      ...
2        1953   174.93   1001.5    346.1
2        1954   172.49    703.2    414.9
...      ...       ...      ...      ...
10       1935    20.36    197        6.5
10       1936    25.98    210.3     15.8
...      ...       ...      ...      ...
10       1953   127.52    793.5    211.5
10       1954   135.72    927.3    238.7

Example 1: look at the data with Stata

Declare the unit and time variables with xtset imprese anno

Descriptive statistics: xtsum

Total variance = Within variance + Between variance

Example 1: look at the heterogeneity

Example 1: Pooled OLS in Stata

regress I F C

or, with robust standard errors,

regress I F C, vce(robust)

Example 1: LSDV in Stata

xi: regress I F C i.imprese

The $\alpha_i$ estimates are the coefficients on the dummies.

Example 1: FE in Stata

xtreg I F C, fe

In this case we are not interested in the estimates of the individual $\alpha_i$.

Example 1: Within estimator in Stata

* unit-specific means
bysort imprese: egen igm = mean(I)
bysort imprese: egen fgm = mean(F)
bysort imprese: egen cgm = mean(C)

* deviations from the unit means
gen idgm = I - igm
gen fdgm = F - fgm
gen cdgm = C - cgm

* OLS on the Within-transformed variables
regress idgm fdgm cdgm

The slope estimates match those of xtreg, fe; the standard errors do not, because regress does not account for the degrees of freedom used by the $n$ estimated means.

Example 1: Between estimator in Stata

xtreg I F C, be

In this case too we are not interested in the estimates of the individual $\alpha_i$.

Example 1: RE estimator in Stata

xtreg I F C, re theta

Page 53: Introduction to Panel Data Analysis - unina.itwpage.unina.it/fdiiorio/portici/procida2013_en.pdfFactors affecting individual outcomes are numerous (explanatory variables as well as

Example

Example 1: R2 in STATA

The variability due to the individual effects must be considered in thevariability explained by the model? If it is the case, the traditional R2 isnot an suitable indicator.There is no agreement on that R2 to use the panel. STATA followsWooldridge (2002) and offers three versions of R 2. For FE we have:

Within: R2w = corr(xit − xi)βw , (yit − yi) Ordinary R2 from Within

model

Between: R2b = corrxi βw , yi) that is R2 for the between

transformation but using βw coefficients

Overall: R20 = corrxit − βw , yit that is R2 for the original model

but using βw coeff.There are similar measures for BE and RE with βbe e βre in place of βw

How to interpret these values? E.g. if Within and Overall are close, then theindividual effects are not very important.


Example 1: specification test

xttest0: Breusch-Pagan LM test of H0 : var(α) = 0, to be run after RE estimation.

hausman sFE sRE: Hausman test, where sFE and sRE are the names of the estimates saved after FE and RE.
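
The two tests fit together in the following sequence. A minimal sketch, assuming the Example 1 data are in memory and already declared with xtset; the names sFE and sRE are the ones used on the slide.

```stata
* FE fit, stored under the name sFE
xtreg I F C, fe
estimates store sFE

* RE fit, followed by the Breusch-Pagan LM test, then stored as sRE
xtreg I F C, re
xttest0
estimates store sRE

* Hausman test: the always-consistent estimator (FE) first,
* the efficient-under-H0 estimator (RE) second
hausman sFE sRE
```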


Example 1: specification test

Serial correlation affects the standard errors and produces inefficiency. The test of Wooldridge (2002) checks for first-order serial correlation in the error terms. The test uses the residuals of the first-difference regression

$$(y_{it} - y_{it-1}) = (x_{it} - x_{it-1})\beta + (\varepsilon_{it} - \varepsilon_{it-1})$$

$$\Delta y_{it} = \Delta x_{it}\beta + \Delta\varepsilon_{it}$$

If $\varepsilon_{it}$ is not serially correlated, then $\mathrm{Corr}(\Delta\varepsilon_{it}, \Delta\varepsilon_{it-1}) = -0.5$, so the test is of $H_0: \rho = -0.5$ in the regression of the first-difference residuals on their own lag, $\Delta\hat\varepsilon_{it} = \rho\,\Delta\hat\varepsilon_{it-1} + \eta_{it}$.

xtserial I F C

H0: no first-order autocorrelation

F( 1, 9) = 722.885, Prob > F = 0.0000
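
The value −0.5 used in the null hypothesis follows directly from differencing a serially uncorrelated error. A short derivation, assuming $\varepsilon_{it}$ has constant variance $\sigma^2_\varepsilon$ and no serial correlation:

```latex
\begin{align*}
\mathrm{Var}(\Delta\varepsilon_{it})
  &= \mathrm{Var}(\varepsilon_{it}) + \mathrm{Var}(\varepsilon_{it-1}) = 2\sigma^2_\varepsilon \\
\mathrm{Cov}(\Delta\varepsilon_{it}, \Delta\varepsilon_{it-1})
  &= \mathrm{Cov}(\varepsilon_{it} - \varepsilon_{it-1},\; \varepsilon_{it-1} - \varepsilon_{it-2})
   = -\mathrm{Var}(\varepsilon_{it-1}) = -\sigma^2_\varepsilon \\
\mathrm{Corr}(\Delta\varepsilon_{it}, \Delta\varepsilon_{it-1})
  &= \frac{-\sigma^2_\varepsilon}{2\sigma^2_\varepsilon} = -0.5
\end{align*}
```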


A brief introduction to dynamic panel

Instrumental variables

Correlation between the error term and the explanatory variables in the regression model makes OLS and GLS inconsistent. Causes:

explanatory variables measured with error

endogenous explanatory variables

dynamic models with correlated errors

Possible solution for consistency: instrumental-variable estimators, such as IV and 2SLS

$$\hat\beta_{IV} = (Z'X)^{-1}Z'Y$$

where Z are the instruments, i.e. variables asymptotically correlated with the endogenous regressors and asymptotically uncorrelated with the regression error.
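
The two requirements on the instruments are exactly what makes the IV estimator consistent. A short sketch of the argument, assuming the model $Y = X\beta + \varepsilon$ with n observations and as many instruments as regressors (the just-identified case, where $Z'X$ is square):

```latex
\begin{align*}
\hat\beta_{IV} &= (Z'X)^{-1}Z'(X\beta + \varepsilon)
                = \beta + \left(\tfrac{1}{n}Z'X\right)^{-1}\left(\tfrac{1}{n}Z'\varepsilon\right) \\
\operatorname{plim}\hat\beta_{IV} &= \beta
  \quad\text{if}\quad \operatorname{plim}\tfrac{1}{n}Z'\varepsilon = 0
  \ \text{ and }\ \operatorname{plim}\tfrac{1}{n}Z'X \ \text{is nonsingular}
\end{align*}
```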


A brief introduction to dynamic panel

Dynamic panels are characterized by the presence of the lagged dependent variable among the regressors. There are two different sources of correlation:

autocorrelation of the dependent variable,

correlation due to unobserved heterogeneity

$$y_{it} = \phi y_{it-1} + x_{it}\beta + u_{it}, \quad \text{where } u_{it} = \alpha_i + \varepsilon_{it}$$

In these models $\mathrm{Corr}(y_{it-1}, u_{it}) \neq 0$ by construction, because of the individual effects $\alpha_i$. The within transformation does not solve the problem, because it creates a correlation between the transformed $y_{it-1}$ and the transformed $\varepsilon_{it}$ that disappears only asymptotically (as T grows). So OLS and GLS are inconsistent for both FE and RE.
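
To see why the within transformation fails here, it helps to write out the transformed regressor and error. A sketch, where $\bar{y}_{i,-1}$ and $\bar\varepsilon_i$ denote the individual time averages (notation introduced here for illustration):

```latex
\[
\underbrace{y_{it-1} - \bar{y}_{i,-1}}_{\text{contains } y_{it-1}}
\quad\text{and}\quad
\underbrace{\varepsilon_{it} - \bar\varepsilon_i}_{\text{contains } \varepsilon_{it-1} \text{ through } \bar\varepsilon_i}
\qquad\Longrightarrow\qquad
\mathrm{Cov}\left(y_{it-1} - \bar{y}_{i,-1},\; \varepsilon_{it} - \bar\varepsilon_i\right) \neq 0
\]
```

Since $y_{it-1}$ depends on $\varepsilon_{it-1}$, the two transformed terms are correlated. The resulting bias is of order 1/T (often called the Nickell bias), so it does not vanish when only the number of individuals grows.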


Dynamic panel: Anderson-Hsiao

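Starting from a simple case

$$y_{it} = \phi y_{it-1} + x_{it}\beta + \alpha_i + \varepsilon_{it}$$

consider the transformation in first differences, which eliminates $\alpha_i$:

$$\Delta y_{it} = \phi\,\Delta y_{it-1} + \Delta x_{it}\beta + \Delta\varepsilon_{it}$$

Explanatory variables and errors are still correlated, through $y_{it-1}$ and $\varepsilon_{it-1}$, which appear in $\Delta y_{it-1}$ and $\Delta\varepsilon_{it}$ respectively.

Anderson-Hsiao: use $\Delta y_{it-2}$ as an instrumental variable for $\Delta y_{it-1}$ with 2SLS to obtain consistent (but not efficient) estimates. $y_{it-2}$ is also a suitable instrument.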
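
For the Example 1 variables, the Anderson-Hsiao estimator can be sketched with ivregress and Stata's time-series operators. This is an illustration under stated assumptions: the name of the time variable (anno) is hypothetical, and the level instrument $y_{it-2}$ is used.

```stata
* ASSUMPTION: anno is a hypothetical name for the year variable
xtset imprese anno

* First-differenced equation: the lagged difference LD.I is endogenous
* and is instrumented by the level of I lagged twice (L2.I)
ivregress 2sls D.I D.F D.C (LD.I = L2.I), noconstant
```

Replacing L2.I with L2D.I uses the lagged difference $\Delta y_{it-2}$ as the instrument instead, at the cost of one more time period.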


Dynamic panel: Arellano-Bond

The GMM estimator is based on moment conditions, from which the orthogonality conditions are obtained. In general, if $E(f(x_t, \beta_0)) = 0$, then

$$\hat\beta_{GMM} = \arg\min_\beta \left(\sum_t f(x_t, \beta)\right)' W \left(\sum_t f(x_t, \beta)\right)$$

At least as many conditions as parameters to be estimated are needed, but more conditions than parameters can be used (in which case the choice of W becomes important). Example: the OLS condition $E(X'\varepsilon) = 0$ is also a moment condition, so OLS is a GMM estimator.

The idea is to find a set of instrumental variables by exploiting the orthogonality conditions between the lagged dependent variable and $\varepsilon_{it}$.
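
The OLS example can be written out in one line. A sketch for the linear model $y = X\beta + \varepsilon$ with n observations: the sample analogue of $E(X'\varepsilon) = 0$ has as many equations as parameters, so it can be solved exactly and W plays no role.

```latex
\[
\frac{1}{n}X'(y - X\hat\beta) = 0
\quad\Longrightarrow\quad
\hat\beta = (X'X)^{-1}X'y
\]
```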


Dynamic panel: Arellano-Bond

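Arellano and Bond (1991), starting from the model in first differences, build a GMM estimator for $\gamma = (\phi, \beta)$ using conditions on lagged $y_{it}$ and $\Delta\varepsilon_{it}$:

$$E(y_{it-s}\,\Delta\varepsilon_{it}) = 0, \qquad E(x_{it-s}\,\Delta\varepsilon_{it}) = 0, \qquad s \geq 2,\; t = 3, \dots, T$$

Let $g_i(\gamma) = y_{it-s}\,\Delta\varepsilon_{it}$; the sample moments are

$$g(\gamma) = \frac{1}{n}\sum_{i=1}^{n} g_i(\gamma) = \frac{1}{n}\sum_{i=1}^{n} y_{it-s}\,\Delta\varepsilon_{it}$$

The GMM estimator makes $g(\gamma)$ as close to zero as possible:

$$\hat\gamma_{GMM} = \arg\min_\gamma\; g(\gamma)' W_n\, g(\gamma)$$

where the optimal choice for $W_n$ is the inverse of the covariance matrix of the moments.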


Dynamic panel: Arellano-Bond

So what are the instruments? The x are typically taken as exogenous, so each provides one instrument. For the lagged dependent variable:

at time 3, $y_{i1}$ is an instrument in the equation for $\Delta y_{i3}$

at time 4, $y_{i1}$ and $y_{i2}$ are instruments in the equation for $\Delta y_{i4}$

at time 5, $y_{i1}$, $y_{i2}$ and $y_{i3}$ are instruments in the equation for $\Delta y_{i5}$

and so on

So in a model with one lag of y, k exogenous variables, and p = T − 2 periods from which to form moment equations, there are k + p(p + 1)/2 conditions.
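
A quick numerical check of the count, using illustrative values that are not taken from the slides (T = 5 periods and k = 2 exogenous variables, so p = 3):

```latex
\[
\underbrace{1}_{t=3} + \underbrace{2}_{t=4} + \underbrace{3}_{t=5}
  = \frac{p(p+1)}{2} = \frac{3 \cdot 4}{2} = 6,
\qquad
k + \frac{p(p+1)}{2} = 2 + 6 = 8
\]
```

The number of conditions grows quadratically in T, which is why the instrument count quickly becomes large in longer panels.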


Dynamic panel: Arellano-Bond

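Of course, this formulation needs a preliminary estimate to evaluate $\Delta\varepsilon_{it}$.

Alternative one-step estimator (asymptotically equivalent if $\varepsilon_{it} \sim iid(0, \sigma^2_\varepsilon)$):

$$\hat\gamma = (X'ZWZ'X)^{-1}X'ZWZ'y$$

where X stacks the blocks $X_i$ of differenced regressors (lagged y and x), y is the corresponding vector of the differenced dependent variable, and $Z = (Z_1', \dots, Z_n')'$, with

$$X_i = \begin{pmatrix} y_{i2} - y_{i1} & x_{i3} - x_{i2} \\ \vdots & \vdots \\ y_{iT-1} - y_{iT-2} & x_{iT} - x_{iT-1} \end{pmatrix}$$

$$Z_i = \begin{pmatrix} [\,y_{i1},\, x_{i1}',\, x_{i2}'\,] & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & [\,y_{i1}, \dots, y_{iT-2},\, x_{i1}', \dots, x_{iT-1}'\,] \end{pmatrix}$$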


Example 1: dynamic specification

The rejection of the null in the Wooldridge test suggests a dynamic specification

xtabond I F C

or

xtabond I F C, vce(robust)


Example 1: dynamic specification

Sargan test of the null hypothesis that the model and the overidentifying conditions are correctly specified

Arellano-Bond test of the null hypothesis that there is no serial correlation in the first-differenced disturbances

estat sargan

estat abond

Attention: with some options Stata does not perform these tests: e.g. with vce(robust) Stata does not perform the Sargan test, and with vce(gmm) the Arellano-Bond test.
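
Given this restriction, one way to obtain both tests is to fit the model twice. A minimal sketch, assuming the Example 1 data are in memory and already declared with xtset using both the panel and the time variable:

```stata
* One-step fit with the default (non-robust) VCE, then the Sargan test
xtabond I F C
estat sargan

* One-step fit with robust VCE, then the Arellano-Bond test for
* serial correlation in the first-differenced residuals
xtabond I F C, vce(robust)
estat abond
```

When reading the estat abond output, recall that first-order correlation in the differenced residuals is expected by construction, since $\Delta\varepsilon_{it}$ and $\Delta\varepsilon_{it-1}$ share $\varepsilon_{it-1}$; the second-order result is the informative one.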


Example 1: two-step GMM

xtabond I F C, twostep vce(robust)


Predetermined variables

$x_{it}$ strictly exogenous implies $E(x_{it}\varepsilon_{is}) = 0$ for all s and t.

If $E(x_{it}\varepsilon_{is}) \neq 0$ for t > s (and zero otherwise), the x variables are predetermined.

When the variables are predetermined, we cannot include the whole vector of $\Delta x_{it}$ in the instrument matrix.

We include only the levels of $x_{it}$ for those time periods that are assumed to be uncorrelated with $\Delta\varepsilon_{is}$.

There are other extensions of the Arellano-Bond estimator (e.g. to account for strongly autocorrelated residuals).

Read the Stata manual.


Example 1: Predetermined variables

Suppose the variable C is predetermined

xtabond I F, twostep vce(robust) pre(C)


Qualitative Dependent

Qualitative Dependent Variables

Very often in panel data the dependent variable is qualitative, most often dichotomous

e.g. labor force participation

Let $\pi(x_{it}) = \Pr(y_{it} = 1 \mid x_{it})$; then a probit or logit model can be applied.


Qualitative Dependent Variables

Homogeneous models:

The individual heterogeneity is explained only by observable covariates, so unobserved effects are ruled out.

The response variables are independent given the covariates.

Probit: $\pi(x_{it}) = \Phi(x_{it}'\beta)$

Logit: $\pi(x_{it}) = \dfrac{\exp(x_{it}'\beta)}{1 + \exp(x_{it}'\beta)}$

ML estimation:

$$\log L(\beta) = \sum_i \sum_t \left[\, y_{it}\log \pi(x_{it}) + (1 - y_{it})\log\left(1 - \pi(x_{it})\right) \right]$$

Do not panic: Stata writes it for you!!

xtprobit variables, options

You have to know what you are doing... study the Wooldridge textbook.


Qualitative Dependent Variables

Heterogeneous static logit and probit models: $\pi(\alpha_i, x_{it}) = \Pr(y_{it} = 1 \mid \alpha_i, x_{it})$

Probit: $\pi(\alpha_i, x_{it}) = \Phi(\alpha_i + x_{it}'\beta)$

Logit: $\pi(\alpha_i, x_{it}) = \dfrac{\exp(\alpha_i + x_{it}'\beta)}{1 + \exp(\alpha_i + x_{it}'\beta)}$

$\alpha_i$ fixed or random:

fixed: $y_{it}$ are still assumed independent across individuals

random: $y_{it}$ are assumed conditionally independent given $\alpha_i$

Estimation is complicated. Some options are: joint MLE, conditional MLE (only for the logit model), marginal MLE.

There are extensions for dynamic models.

Stata works for you, but...

you have to know what you are doing... study the books, not the lecture notes!!
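
Two of the estimation options listed above map onto standard Stata commands. A minimal sketch with hypothetical variable names (id, year, y, x1, x2 are placeholders, not variables from the course data sets):

```stata
* ASSUMPTION: id (unit), year (time), y (0/1 outcome), x1 and x2
* (covariates) are hypothetical variable names
xtset id year

* Conditional MLE: fixed-effects logit, with alpha_i conditioned out
xtlogit y x1 x2, fe

* Marginal MLE: random-effects probit, with alpha_i integrated out
* under a normality assumption
xtprobit y x1 x2, re
```

In the fixed-effects logit, individuals whose outcome never changes over time do not contribute to the conditional likelihood, so the effective sample can be much smaller than the full panel.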


Example 2

Example 10.4, ch. 10 in J. M. Wooldridge (2010), Econometric Analysis of Cross Section and Panel Data, MIT Press.

The aim is to measure the effect of job training grants on the scrap rate (the rate of incorrectly assembled parts) in 54 companies in Michigan that provided data on this rate for the years 1987, 1988 and 1989. Grants were awarded only in 1988 and 1989, and a company could receive funding only once. The study is of the treatment-effect type and lets us see whether the effects of the training last more than one year. If the funding is effective, the scrap rate should be lower for the firms that received it. The data set can be downloaded from the Stata website using the command:

use http://www.stata.com/data/jwooldridge/eacsap/jtrain1.dta


Example 2

The grant is not assigned randomly among firms, so other unobserved factors may have contributed to a lower scrap rate.

The dataset has 12 variables (year and unitid excluded):

1 number of employees
2 annual salary in dollars
3 average wage per employee in dollars
4 scrap rate in %
5 % of pieces reworked
6 dummy if any unions
7 dummy if the firm received a grant in year t
8 total hours of training
9 2 dummies, one for 1988 and one for 1989
10 no. of employees who attended the training


Example 2

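Pooled estimation for

$$\log(scrap)_{it} = \beta_0 + \beta_1 D88_{it} + \beta_2 D89_{it} + \beta_3 Dunion_{it} + \beta_4 DF_{it} + \beta_5 DF_{it-1} + u_{it}$$

where $Dunion$ is the union dummy and $DF$ the grant dummy; the covariates are all dummies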


Example 2

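RE estimation for

$$\log(scrap)_{it} = \beta_0 + \beta_1 D88_{it} + \beta_2 D89_{it} + \beta_3 Dunion_{it} + \beta_4 DF_{it} + \beta_5 DF_{it-1} + u_{it}$$

where $Dunion$ is the union dummy and $DF$ the grant dummy; the covariates are all dummies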


Example 2

FE estimation for

$$\log(scrap)_{it} = \beta_0 + \beta_1 D88_{it} + \beta_2 D89_{it} + \beta_3 DF_{it} + \beta_4 DF_{it-1} + u_{it}$$

the union dummy is time-invariant, so it is dropped
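
The three specifications of Example 2 can be run along the following lines. This is a sketch only: the URL is the one given earlier in the example, but all variable names (fcode, year, lscrap, d88, d89, union, grant, grant_1) are assumptions about how the file is coded and should be checked with describe first.

```stata
use http://www.stata.com/data/jwooldridge/eacsap/jtrain1.dta, clear

* Check the actual variable names before running the rest
describe

* ASSUMED names: fcode (firm id), year, lscrap (log scrap rate),
* d88, d89 (year dummies), union, grant, grant_1 (lagged grant)
xtset fcode year

* Pooled OLS
regress lscrap d88 d89 union grant grant_1

* Random effects
xtreg lscrap d88 d89 union grant grant_1, re

* Fixed effects: the time-invariant union dummy is left out
xtreg lscrap d88 d89 grant grant_1, fe
```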
