Panel Data Notes

  • 8/10/2019 Panel Data Notes

    PANEL DATA

    FIXED EFFECTS MODEL

    RHO = FRACTION OF TOTAL ERROR VARIANCE DUE TO Ui

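    WE CAN USE LEAST SQUARES WITH TIME DATA ONLY IF THE F TEST THAT ALL Ui = 0 IS NOT REJECTED. If there is no Ui, we no longer have to do the within transformation. Here the test is rejected (F(63, 60) = 6.95, Prob > F = 0.0000), so Ui is present and the FE transformation is needed.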

    A 1% increase in enrollment is associated with a 0.26% increase in rent (log-log model, FE coefficient on lenroll = 0.262).

    . xtset city year
           panel variable:  city (strongly balanced)
            time variable:  year, 80 to 90, but with gaps
                    delta:  1 unit

    . xtreg lrent lpop lenroll lavginc y90,fe

    Fixed-effects (within) regression               Number of obs      =       128
    Group variable: city                            Number of groups   =        64

    R-sq:  within  = 0.9762                         Obs per group: min =         2
           between = 0.2692                                        avg =       2.0
           overall = 0.7886                                        max =         2

                                                    F(4,60)            =    615.18
    corr(u_i, Xb)  = -0.0452                        Prob > F           =    0.0000

           lrent        Coef.   Std. Err.        t   P>|t|    [95% Conf.    Interval]
            lpop    -.1937295    .1256064    -1.54   0.128     -.4449796     .0575206
         lenroll     .2621794    .1035767     2.53   0.014      .0549952     .4693636
         lavginc     .3196525    .0672163     4.76   0.000      .1851999     .4541051
             y90     .3773375    .0378882     9.96   0.000      .3015498     .4531252
           _cons     2.037717    1.139795     1.79   0.079     -.2422116     4.317646

         sigma_u    .14719727
         sigma_e    .06418028
             rho    .84025883   (fraction of variance due to u_i)

    F test that all u_i=0:     F(63, 60) =     6.95              Prob > F = 0.0000
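    The rho reported in the fixed-effects output can be checked by hand from sigma_u and sigma_e. The following is a minimal Python sketch, assuming the usual definition rho = sigma_u^2 / (sigma_u^2 + sigma_e^2); the function name `variance_share` is hypothetical and the inputs are copied from the rent model output.

```python
def variance_share(sigma_u: float, sigma_e: float) -> float:
    """Fraction of the total error variance that is due to u_i."""
    return sigma_u**2 / (sigma_u**2 + sigma_e**2)


# Values copied from the fixed-effects output for the rent model.
rho_fe = variance_share(0.14719727, 0.06418028)
print(round(rho_fe, 4))
```

    The result agrees with the reported rho of .84025883 up to rounding, which confirms that rho is a variance share rather than a coefficient of variation.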

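    The LS estimate of the enrollment effect (0.13) is below the FE estimate (0.26), because Ui is absent from the LS model.

    If x is actually correlated with the error (in particular with Ui), the RE estimate will be biased, since RE assumes corr(u_i, X) = 0.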

    . regress lrent lpop lenroll lavginc y90

          Source           SS      df           MS        Number of obs =     128
                                                          F(4, 123)     =  180.11
           Model   12.0079713       4   3.00199282        Prob > F      =  0.0000
        Residual   2.05016337     123   .016667995        R-squared     =  0.8542
                                                          Adj R-squared =  0.8494
           Total   14.0581346     127   .110693974        Root MSE      =   .1291

           lrent        Coef.   Std. Err.        t   P>|t|    [95% Conf.    Interval]
            lpop    -.0880099    .0256218    -3.43   0.001     -.1387267    -.0372932
         lenroll     .1316076    .0316488     4.16   0.000      .0689607     .1942545
         lavginc      .554786    .0540609    10.26   0.000      .4477758     .6617962
             y90     .2679622    .0355981     7.53   0.000       .197498     .3384264
           _cons    -.1170694    .5202727    -0.23   0.822     -1.146917     .9127785

    . xtreg lrent lpop lenroll lavginc y90,re

    Random-effects GLS regression                   Number of obs      =       128
    Group variable: city                            Number of groups   =        64

    R-sq:  within  = 0.9743                         Obs per group: min =         2
           between = 0.5134                                        avg =       2.0
           overall = 0.8486                                        max =         2

    Random effects u_i ~ Gaussian                   Wald chi2(4)       =   2389.99
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

           lrent        Coef.   Std. Err.        z   P>|z|    [95% Conf.    Interval]
            lpop    -.0793277    .0327066    -2.43   0.015     -.1434314     -.015224
         lenroll     .1404553     .039585     3.55   0.000      .0628702     .2180405
         lavginc     .4377074    .0521273     8.40   0.000      .3355398     .5398749
             y90     .3247298    .0290265    11.19   0.000       .267839     .3816206
           _cons     .8181798     .542852     1.51   0.132     -.2457905      1.88215

         sigma_u    .11122648
         sigma_e    .06418028
             rho    .75021227   (fraction of variance due to u_i)

    . xtreg lrent lpop lenroll lavginc y90,re
      (output identical to the random-effects results above)

    . estimates store RE

    . xtreg lrent lpop lenroll lavginc y90,fe
      (output identical to the fixed-effects results above)

    . estimates store FE

    . hausman FE RE

                         Coefficients
                        (b)           (B)           (b-B)     sqrt(diag(V_b-V_B))
                        FE            RE         Difference          S.E.
            lpop    -.1937295     -.0793277      -.1144018        .1212734
         lenroll     .2621794      .1404553       .1217241        .0957139
         lavginc     .3196525      .4377074      -.1180548        .0424356
             y90     .3773375      .3247298       .0526077        .0243512

                    b = consistent under Ho and Ha; obtained from xtreg
                    B = inconsistent under Ha, efficient under Ho; obtained from xtreg

        Test:  Ho:  difference in coefficients not systematic

                  chi2(4) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       10.23
                Prob>chi2 =      0.0368
                (V_b-V_B is not positive definite)
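    The Prob>chi2 line of the Hausman test can be checked from the chi2(4) statistic alone. The sketch below is a minimal Python illustration, assuming the closed-form upper tail of a chi-squared distribution with an even number of degrees of freedom; the function name `chi2_sf_even_df` is hypothetical, and 10.23 is the rounded statistic from the output, so the last digit may differ slightly from what Stata prints.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-squared variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    # For df = 2m: P(X > x) = exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!
    terms = (half**j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * sum(terms)


# chi2(4) statistic copied from the hausman output for the rent model.
p_value = chi2_sf_even_df(10.23, 4)
print(round(p_value, 3))
```

    At the 5% level this p-value rejects Ho, which points toward FE, although the "(V_b-V_B is not positive definite)" message suggests reading the test with some caution.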

    WAGE_PANEL 2 DATA

    WE CAN DETECT TIME-INVARIANT VARIABLES THROUGH A WITHIN STANDARD DEVIATION OF ZERO IN XTSUM (E.G. EDUC).

    Log-linear model. By the LS estimate, the average wage of union members is about 17% higher than that of non-union members (coefficient on union = 0.169). This may be too high: other factors that affect wages may also be correlated with union membership. These unobserved factors are possible components of Ui.

    . xtset crossid year
           panel variable:  crossid (strongly balanced)
            time variable:  year, 1980 to 1987
                    delta:  1 unit

    . xtsum lwage educ exper expersq married

    Variable               Mean   Std. Dev.        Min         Max    Observations
    lwage    overall   1.649147   .5326094   -3.579079     4.05186    N =    4360
             between              .3907468    .3333435    3.174173    n =     545
             within               .3622636   -2.467201    3.204687    T =       8

    educ     overall   11.76697   1.746181           3          16    N =    4360
             between              1.747585           3          16    n =     545
             within                      0    11.76697    11.76697    T =       8

    exper    overall   6.514679   2.825873           0          18    N =    4360
             between              1.654918         3.5        14.5    n =     545
             within               2.291551    3.014679    10.01468    T =       8

    expersq  overall   50.42477   40.78199           0         324    N =    4360
             between              26.35134        17.5       215.5    n =     545
             within                31.1431   -44.07523    158.9248    T =       8

    married  overall   .4389908   .4963208           0           1    N =    4360
             between              .3766116           0           1    n =     545
             within               .3236137   -.4360092    1.313991    T =       8
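    The zero within standard deviation of educ in the xtsum output is what marks it as time-invariant. The following minimal Python sketch shows the idea on a hypothetical toy panel (the data and the helper name `within_sd` are made up for illustration). It uses the population standard deviation for simplicity, so the divisor may differ from the one xtsum uses, but the zero versus non-zero conclusion does not depend on that.

```python
from statistics import mean, pstdev


def within_sd(panel: dict[str, list[float]]) -> float:
    """Standard deviation of x_it minus the unit mean, pooled over all units."""
    deviations = []
    for values in panel.values():
        unit_mean = mean(values)
        deviations.extend(v - unit_mean for v in values)
    return pstdev(deviations)


# Hypothetical toy panel: three workers observed for three years each.
educ = {"a": [12, 12, 12], "b": [16, 16, 16], "c": [9, 9, 9]}
exper = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [2, 3, 4]}

print(within_sd(educ))   # time-invariant variable: no within variation
print(within_sd(exper))  # time-varying variable: positive within variation
```

    Because the within (FE) transformation subtracts unit means, a variable with zero within variation is wiped out, which is why educ is later omitted from the FE regression.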

    . regress lwage educ exper expersq married union

          Source           SS      df           MS        Number of obs =    4360
                                                          F(5, 4354)    =  189.86
           Model   221.338512       5   44.2677025        Prob > F      =  0.0000
        Residual   1015.19113    4354   .233162869        R-squared     =  0.1790
                                                          Adj R-squared =  0.1781
           Total   1236.52965    4359   .283672779        Root MSE      =  .48287

           lwage        Coef.   Std. Err.        t   P>|t|    [95% Conf.    Interval]
            educ     .0989945    .0046227    21.41   0.000      .0899316     .1080574
           exper     .0861696    .0101415     8.50   0.000      .0662871     .1060521
         expersq    -.0027349    .0007099    -3.85   0.000     -.0041267    -.0013432
         married     .1230112    .0155714     7.90   0.000      .0924833     .1535392
           union     .1685243    .0170652     9.88   0.000      .1350679     .2019808
           _cons    -.0343057    .0632559    -0.54   0.588     -.1583195     .0897081
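    Reading a dummy coefficient in a log-linear model as a percentage is an approximation. The sketch below is a minimal Python illustration, assuming the standard exact conversion 100 * (exp(b) - 1); the function name `pct_effect` is hypothetical, and the two coefficients are the LS and FE union estimates reported in these notes.

```python
import math


def pct_effect(coef: float) -> float:
    """Exact percentage change in the level of y when a dummy switches from 0 to 1."""
    return 100.0 * (math.exp(coef) - 1.0)


ls_union = 0.1685243  # union coefficient from the LS regression
fe_union = 0.0820871  # union coefficient from the FE regression

print(round(pct_effect(ls_union), 1), round(pct_effect(fe_union), 1))
```

    The exact effects are slightly larger than the coefficients themselves suggest, and the gap between the approximation and the exact value grows with the size of the coefficient.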

    Note that educ is omitted from the FE model because, as xtsum already showed, it is time-invariant. Notice that the union estimate is now smaller in the FE model. The F test also rejects that all Ui are zero.

    Union FE = 8%, Union LS = 17%

    Educ is included in the RE model. Check whether the gap in the union estimate between the RE and FE models is significant. Note that educ is significant in the RE model. Since educ is no longer absorbed into Ui, sigma_u is smaller in RE (0.327) than in FE (0.400).

    . xtreg lwage educ exper expersq married union, fe
    note: educ omitted because of collinearity

    Fixed-effects (within) regression               Number of obs      =      4360
    Group variable: crossid                         Number of groups   =       545

    R-sq:  within  = 0.1780                         Obs per group: min =         8
           between = 0.0005                                        avg =       8.0
           overall = 0.0638                                        max =         8

                                                    F(4,3811)          =    206.38
    corr(u_i, Xb)  = -0.1139                        Prob > F           =    0.0000

           lwage        Coef.   Std. Err.        t   P>|t|    [95% Conf.    Interval]
            educ    (omitted)
           exper     .1168467    .0084197    13.88   0.000      .1003392     .1333542
         expersq    -.0043009    .0006053    -7.11   0.000     -.0054876    -.0031142
         married     .0453033    .0183097     2.47   0.013      .0094056      .081201
           union     .0820871    .0192907     4.26   0.000       .044266     .1199083
           _cons      1.06488    .0266607    39.94   0.000      1.012609      1.11715

         sigma_u     .4000539
         sigma_e    .35125535
             rho    .56467849   (fraction of variance due to u_i)

    F test that all u_i=0:     F(544, 3811) =     8.12           Prob > F = 0.0000

    . xtreg lwage educ exper expersq married union, re

    Random-effects GLS regression                   Number of obs      =      4360
    Group variable: crossid                         Number of groups   =       545

    R-sq:  within  = 0.1774                         Obs per group: min =         8
           between = 0.1690                                        avg =       8.0
           overall = 0.1729                                        max =         8

    Random effects u_i ~ Gaussian                   Wald chi2(5)       =    932.13
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

           lwage        Coef.   Std. Err.        z   P>|z|    [95% Conf.    Interval]
            educ      .101201    .0087763    11.53   0.000      .0839998     .1184022
           exper     .1114758    .0082611    13.49   0.000      .0952843     .1276672
         expersq    -.0040453     .000592    -6.83   0.000     -.0052056     -.002885
         married     .0668301    .0167367     3.99   0.000      .0340268     .0996335
           union     .1041501    .0178144     5.85   0.000      .0692346     .1390657
           _cons    -.1186803    .1071673    -1.11   0.268     -.3287243     .0913637

         sigma_u    .32678141
         sigma_e    .35125535
             rho    .46395167   (fraction of variance due to u_i)

    Educ is dropped from both models in the Hausman comparison, because it is not estimated in the FE model.

    Reject the null hypothesis (chi2(4) = 33.71, Prob>chi2 = 0.0000): there is a systematic difference between the FE and RE estimates. Therefore we select the FE estimate.

    What if we have reverse causality? One option is to use lagged variables. Here we use the first lag of union [L.union]. Note that the first lag of union is not significant (p = 0.251).

    Another option is to treat union as an endogenous variable and use panel-data instrumental variable estimation. Lagged variables are commonly used as instruments. We can easily argue that lagged variables are exogenous, but we still have to meet the relevance condition.

    . hausman FE RE

                         Coefficients
                        (b)           (B)           (b-B)     sqrt(diag(V_b-V_B))
                        FE            RE         Difference          S.E.
           exper     .1168467      .1114758       .0053709        .0016265
         expersq    -.0043009     -.0040453      -.0002556        .0001261
         married     .0453033      .0668301      -.0215268        .0074247
           union     .0820871      .1041501       -.022063        .0074013

                    b = consistent under Ho and Ha; obtained from xtreg
                    B = inconsistent under Ha, efficient under Ho; obtained from xtreg

        Test:  Ho:  difference in coefficients not systematic

                  chi2(4) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       33.71
                Prob>chi2 =      0.0000
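    The S.E. column of the Hausman table, sqrt(diag(V_b-V_B)), can be reproduced one coefficient at a time from the FE and RE standard errors. The sketch below is a minimal Python illustration with a hypothetical helper name `hausman_diff_se`. It only covers the diagonal: the full chi2 statistic needs the complete covariance matrices, which are not shown in these notes.

```python
import math


def hausman_diff_se(se_fe: float, se_re: float) -> float:
    """sqrt(Var(b_FE) - Var(b_RE)) for a single coefficient."""
    diff = se_fe**2 - se_re**2
    if diff < 0:
        raise ValueError("variance difference is negative for this coefficient")
    return math.sqrt(diff)


# Standard errors of union copied from the FE and RE wage regressions.
se_diff = hausman_diff_se(0.0192907, 0.0178144)
# Difference in the union coefficient (FE minus RE) relative to that S.E.
ratio = (0.0820871 - 0.1041501) / se_diff
print(round(se_diff, 6), round(ratio, 2))
```

    The computed value agrees with the .0074013 reported for union up to rounding, and the FE minus RE gap is roughly three of these standard errors, which is consistent with the test rejecting Ho.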

    .

    . xtreg lwage educ exper expersq married l.union, fe
    note: educ omitted because of collinearity

    Fixed-effects (within) regression               Number of obs      =      3815
    Group variable: crossid                         Number of groups   =       545

    R-sq:  within  = 0.1379                         Obs per group: min =         7
           between = 0.0016                                        avg =       7.0
           overall = 0.0295                                        max =         7

                                                    F(4,3266)          =    130.65
    corr(u_i, Xb)  = -0.1428                        Prob > F           =    0.0000

           lwage        Coef.   Std. Err.        t   P>|t|    [95% Conf.    Interval]
            educ    (omitted)
           exper     .1010353    .0105081     9.61   0.000      .0804321     .1216384
         expersq    -.0032345    .0007133    -4.53   0.000     -.0046331    -.0018359
         married     .0547899    .0190767     2.87   0.004      .0173864     .0921933
           union
             L1.     .0227485    .0198253     1.15   0.251     -.0161228     .0616198
           _cons     1.126349    .0360629    31.23   0.000      1.055641     1.197057

         sigma_u    .41677819
         sigma_e    .32665813
             rho    .61946546   (fraction of variance due to u_i)

    F test that all u_i=0:     F(544, 3266) =     8.96           Prob > F = 0.0000

    In the first IV model (xtivreg with l.union), we instrumented the endogenous variable union, with union lag 1 as the instrument.

    The second IV model adds married lag 1 as an instrument. Although married lag 1 is relevant in the first stage (p = 0.005), notice that the union standard error is still far larger than in the FE model (0.216 versus 0.019). Notice too that the union estimate (0.297) is too large; it is most likely very biased.

    . xtivreg lwage educ exper expersq married (union = l.union), fe first

    First-stage within regression

    Fixed-effects (within) regression               Number of obs      =      3815
    Group variable: crossid                         Number of groups   =       545

    R-sq:  within  = 0.0076                         Obs per group: min =         7
           between = 0.8580                                        avg =       7.0
           overall = 0.3687                                        max =         7

                                                    F(4,3266)          =      6.28
    corr(u_i, Xb)  = 0.7123                         Prob > F           =    0.0000

           union        Coef.   Std. Err.        t   P>|t|    [95% Conf.    Interval]
            educ    (omitted)
           exper    -.0074118    .0090267    -0.82   0.412     -.0251104     .0102867
         expersq     .0002877    .0006127     0.47   0.639     -.0009137     .0014891
         married     .0222075    .0163873     1.36   0.175     -.0099229     .0543379
           union
             L1.     .0801182    .0170304     4.70   0.000      .0467269     .1135094
           _cons      .248989    .0309789     8.04   0.000      .1882491      .309729

         sigma_u    .31477424
         sigma_e    .28060659
             rho    .55719951   (fraction of variance due to u_i)

    F test that all u_i=0:     F(544, 3266) =     3.71           Prob > F = 0.0000

    Fixed-effects (within) IV regression            Number of obs      =      3815
    Group variable: crossid                         Number of groups   =       545

    R-sq:  within  = 0.1123                         Obs per group: min =         7
           between = 0.0100                                        avg =       7.0
           overall = 0.0447                                        max =         7

                                                    Wald chi2(4)       =  99169.08
    corr(u_i, Xb)  = -0.1561                        Prob > chi2        =    0.0000

           lwage        Coef.   Std. Err.        z   P>|z|    [95% Conf.    Interval]
           union     .2839369    .2510969     1.13   0.258      -.208204     .7760777
            educ    (omitted)
           exper     .1031398    .0107769     9.57   0.000      .0820174     .1242622
         expersq    -.0033162    .0007247    -4.58   0.000     -.0047367    -.0018958
         married     .0484844    .0200249     2.42   0.015      .0092363     .0877324
           _cons     1.055652    .0763501    13.83   0.000      .9060084     1.205296

         sigma_u    .41305859
         sigma_e    .33147153
             rho    .60828168   (fraction of variance due to u_i)

    F test that all u_i=0:     F(544,3266) =     8.61            Prob > F = 0.0000

    Instrumented:   union
    Instruments:    educ exper expersq married L.union
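    The relevance condition can be checked from the first-stage output. With a single excluded instrument, the first-stage F statistic for that instrument is the square of its t statistic. The sketch below is a minimal Python illustration with a hypothetical helper name `first_stage_f`; the threshold of 10 is only the commonly cited rule of thumb for weak instruments, not a formal test.

```python
def first_stage_f(coef: float, std_err: float) -> float:
    """F statistic for one excluded instrument: the square of its t statistic."""
    return (coef / std_err) ** 2


# First-stage coefficient and standard error of L1.union, copied from the output.
f_stat = first_stage_f(0.0801182, 0.0170304)
print(round(f_stat, 1), f_stat > 10)
```

    By this rough check, lagged union passes the relevance condition, even though the second-stage union estimate remains very imprecise.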

    BE Option

    What if the dates of your values do not coincide? You interpolate the data for the time gaps, e.g. with cross-country data you project or interpolate. But this introduces measurement error. In this case, use the BE (between) estimator, which is a regression on group means. It is an option when a lot of the data are interpolated.

    . xtivreg lwage educ exper expersq married (union = l.union l.married), fe first

    First-stage within regression

    Fixed-effects (within) regression               Number of obs      =      3815
    Group variable: crossid                         Number of groups   =       545

    R-sq:  within  = 0.0100                         Obs per group: min =         7
           between = 0.6227                                        avg =       7.0
           overall = 0.2841                                        max =         7

                                                    F(5,3265)          =      6.61
    corr(u_i, Xb)  = 0.5909                         Prob > F           =    0.0000

           union        Coef.   Std. Err.        t   P>|t|    [95% Conf.    Interval]
            educ    (omitted)
           exper    -.0109412    .0091044    -1.20   0.230     -.0287921     .0069097
         expersq     .0003844    .0006131     0.63   0.531     -.0008176     .0015865
         married     .0018189    .0179088     0.10   0.919     -.0332947     .0369324
           union
             L1.      .078896     .017018     4.64   0.000      .0455289     .1122631
         married
             L1.     .0501667    .0178699     2.81   0.005      .0151295      .085204
           _cons     .2575573    .0310964     8.28   0.000      .1965869     .3185278

         sigma_u    .31495182
         sigma_e    .28031145
             rho    .55799696   (fraction of variance due to u_i)

    F test that all u_i=0:     F(544, 3265) =     3.73           Prob > F = 0.0000

    Fixed-effects (within) IV regression            Number of obs      =      3815
    Group variable: crossid                         Number of groups   =       545

    R-sq:  within  = 0.1086                         Obs per group: min =         7
           between = 0.0108                                        avg =       7.0
           overall = 0.0447                                        max =         7

                                                    Wald chi2(4)       =  98757.27
    corr(u_i, Xb)  = -0.1605                        Prob > chi2        =    0.0000

           lwage        Coef.   Std. Err.        z   P>|z|    [95% Conf.    Interval]
           union     .2972112    .2161324     1.38   0.169     -.1264006      .720823
            educ    (omitted)
           exper     .1032237    .0107687     9.59   0.000      .0821175     .1243298
         expersq    -.0033186    .0007259    -4.57   0.000     -.0047413    -.0018959
         married     .0482124    .0198923     2.42   0.015      .0092242     .0872007
           _cons     1.052101    .0683068    15.40   0.000      .9182222      1.18598

         sigma_u    .41338247
         sigma_e    .33216294
             rho    .60766203   (fraction of variance due to u_i)

    F test that all u_i=0:     F(544,3266) =     8.57            Prob > F = 0.0000

    Instrumented:   union
    Instruments:    educ exper expersq married L.union L.married

    HAUSMAN TAYLOR ESTIMATOR

    What it essentially does is separate the time-invariant from the time-varying variables (and the exogenous from the endogenous ones) for cases where you do not want to go into full RE. Hausman-Taylor is a go-between model. We cannot use RE because union is correlated with u. We cannot use FE because educ is time-invariant (that is, if you really want to show educ in the first place). So we use Hausman-Taylor as a go-between.

    . xtreg lwage educ exper expersq married union, be

    Between regression (regression on group means)  Number of obs      =      4360
    Group variable: crossid                         Number of groups   =       545

    R-sq:  within  = 0.0391                         Obs per group: min =         8
           between = 0.2069                                        avg =       8.0
           overall = 0.1259                                        max =         8

                                                    F(5,539)           =     28.13
    sd(u_i + avg(e_i.))= .3495835                   Prob > F           =    0.0000

           lwage        Coef.   Std. Err.        t   P>|t|    [95% Conf.    Interval]
            educ     .0937637    .0108543     8.64   0.000      .0724418     .1150855
           exper    -.0608253    .0503829    -1.21   0.228     -.1597962     .0381456
         expersq     .0055953    .0032176     1.74   0.083     -.0007253     .0119159
         married     .1674925    .0405987     4.13   0.000      .0877415     .2472436
           union      .252298    .0462559     5.45   0.000      .1614341     .3431619
           _cons     .5248524    .2190544     2.40   0.017      .0945474     .9551574

    .

    . xthtaylor lwage exper expersq married union educ, endog(union) constant(educ)

    Hausman-Taylor estimation                       Number of obs      =      4360
    Group variable: crossid                         Number of groups   =       545

                                                    Obs per group: min =         8
                                                                   avg =         8
                                                                   max =         8

    Random effects u_i ~ i.i.d.                     Wald chi2(5)       =    915.18
                                                    Prob > chi2        =    0.0000

           lwage        Coef.   Std. Err.        z   P>|z|    [95% Conf.    Interval]
    TVexogenous
           exper     .1118329    .0082459    13.56   0.000      .0956713     .1279946
         expersq    -.0040726     .000591    -6.89   0.000      -.005231    -.0029142
         married     .0666511    .0167574     3.98   0.000      .0338072      .099495
    TVendogenous
           union     .0786858     .019242     4.09   0.000      .0409722     .1163994
    TIexogenous
            educ     .1011122    .0089567    11.29   0.000      .0835575      .118667

           _cons    -.1122956    .1092657    -1.03   0.304     -.3264525     .1018613

         sigma_u    .33568041
         sigma_e    .35107115
             rho    .47760032   (fraction of variance due to u_i)

    Note: TV refers to time varying; TI refers to time invariant.