Panel Data Assignment

Assignment on Panel Data

Econometrics II, ACT 672

Submitted by: Madiha Khan (B05102016), Mirza Muhammad Ali (B05102017), S. Fatima Zehra (B05102040)

Submitted to: Sir Mudassir Uddin


TABLE OF CONTENTS

OBJECT

1. Introduction to Panel Data
   1.1 What is Panel Data
   1.2 Reason for Using Panel Data

2. Importance of Fixed and Random Effects
   2.1 The Fixed Effects Model
   2.2 The Random Effects Model
   2.3 Assessing the Appropriateness of Fixed Effects and Random Effects Estimation
   2.4 When to Use Fixed Effects, Random Effects or Pooled OLS Model

3. Panel Data Analysis Using STATA 9.1
   3.1 Panel Data
   3.2 Preference of STATA over Other Packages
   3.3 Panel Data in STATA 9.1
       a. Input the Data in STATA
       b. Declare the Cross-Section and Time Variables
       c. Carry out the Descriptive Statistic
       d. Outline the Main Variables in the Basic Sample Statistics
       e. Examine the Relationship between the Dependent and Independent Variables
       f. Conclusion

4. Panel Data Models
   4.1 Different Panel Data Models
   4.2 Fixed Effect and Random Effect Model in STATA
   4.3 Fixed Effect Models
       a. One-way Fixed Effect Models: Group Effects
          i. The Pooled OLS Regression Model
          ii. Least Squares Dummy Variable Models
          iii. Testing Fixed Group Effects (F-test)
       b. One-way Fixed Effect Models: Fixed Time Effect
          i. The Pooled OLS Regression Model
          ii. Least Squares Dummy Variable Models
          iii. Testing Fixed Time Effects (F-test)
   4.4 Random Effect Models
       a. One-way Random Group Effect Model
       b. One-way Random Time Effect Model
       c. Testing Random Effect Models
          i. Testing Random Group Effects
          ii. Testing Random Time Effect

5. Hausman Test
   5.1 Fixed Effects VS. Random Effects
   5.2 Hausman Test
       a. Estimate the fixed-effect model
       b. Estimate the random effects model
       c. Apply the Hausman test

6. Conclusion
   6.1 Statistical Package
   6.2 Data
   6.3 Panel Data Models
       a. Pooled OLS Models
       b. Fixed Effect Models
       c. Random Effect Models
       d. Tests for Fixed Effect and Random Effect Models
          i. Testing Fixed Group Effects
          ii. Testing Fixed Time Effects
          iii. Testing Random Group Effects
          iv. Testing Random Time Effects
          v. Hausman Test

OBJECT

1. Describe what panel data is and the reasons for using it in this format

2. Assess the importance of fixed and random effects

3. Examine the Hausman test, which determines whether fixed or random effects should be used.

4. Evaluate some panel data models

1. INTRODUCTION TO PANEL DATA

1.1 WHAT IS PANEL DATA

A data set containing observations on multiple phenomena observed over multiple time periods is called panel data. Panel data pool observations on all the individuals and follow them over a period of time. Alternatively, the second dimension of the data may be some entity other than time. Whereas time series and cross-sectional data are both one-dimensional, panel data sets are two-dimensional. A longitudinal, or panel, data set follows a given sample of individuals over time, and thus provides multiple observations on each individual in the sample. Panel data have become widely available in both developed and developing countries. Developing countries may not have a long tradition of statistical collection, so it is of special importance to obtain original survey data to answer many significant questions.

1.2 REASON FOR USING PANEL DATA

Panel data sets for economic research possess several major advantages over conventional cross-sectional or time-series data sets. Some of the many reasons for using panel data include:

More accurate inference of model parameters: Panel data usually contain more sample variability than cross-sectional data, which may be viewed as a panel with T = 1, or time series data, which is a panel with N = 1, hence improving the efficiency of econometric estimates.

Reduced collinearity: Panel data usually give the researcher a large number of data points, increasing the degrees of freedom and reducing the collinearity among explanatory variables.

Heterogeneity: Since panel data relate to individuals, firms, states, countries, etc., over time, there is bound to be heterogeneity in these units. The techniques of panel data estimation can take such heterogeneity explicitly into account by allowing for individual-specific variables.

More receptive: Panel data have a greater capacity for capturing the complexity of human behavior than a single cross-section or time series. This includes constructing and testing more complicated behavioral hypotheses.

Minimized bias: By making data available for several thousand units, panel data can minimize the bias that might result if we aggregate individuals or firms into broad aggregates.


Controlling the impact of omitted variables: Because panel data contain information on both the intertemporal dynamics and the individuality of the entities, they may allow one to control for the effects of missing or unobserved variables.

Uncovering dynamic relationships: With panel data, we can rely on the inter-individual differences to reduce the collinearity between current and lagged variables and so estimate unrestricted time-adjustment patterns.

Complicated behavioral models: Panel data enable us to study more complicated behavioral models. For example, phenomena such as economies of scale and technological change can be better handled by panel data than by pure cross-section or pure time series data.

Generating more accurate predictions: Panel data can help to produce more accurate predictions for individual outcomes by pooling the data rather than generating predictions of individual outcomes using only the data on the individual in question. If individual behaviors are similar conditional on certain variables, panel data provide the possibility of learning an individual's behavior by observing the behavior of others.

Simplifying computation and statistical inference: Panel data involve at least two dimensions, a cross-sectional dimension and a time series dimension. Under normal circumstances one would expect the computation of panel data estimators or inference to be more complicated than for cross-sectional or time series data. However, in certain cases, the availability of panel data actually simplifies computation and inference.

Analysis of nonstationary time series: If panel data are available, and observations among cross-sectional units are independent, then one can invoke the central limit theorem across cross-sectional units to show that the limiting distributions of many estimators remain asymptotically normal.

Measurement error: Measurement errors can lead to under-identification of an econometric model. The availability of multiple observations for a given individual or at a given time may allow a researcher to make different transformations that induce different and deducible changes in the estimators, and hence to identify an otherwise unidentified model.

In short, panel data can enrich empirical analysis in ways that may not be possible if we use only cross-section or time series data.

2. IMPORTANCE OF FIXED AND RANDOM EFFECTS

2.1 THE FIXED EFFECTS MODEL

A fixed effects model has constant slopes but intercepts that differ according to the cross-sectional (group) unit, for example the country. Although there are no significant temporal effects, there are significant differences among countries in this type of model. While the intercept is cross-section (group) specific, and in this case differs from country to country, it may or may not differ over time. Because n - 1 dummy variables are used to designate the particular country (one for each of the n countries except a reference country), this model is sometimes called the Least Squares Dummy Variable (LSDV) model.

2.2 THE RANDOM EFFECTS MODEL

The random effects model is a regression with a random constant term. One way to handle ignorance about the sources of variation across units is to assume that the intercept is a random outcome variable, equal to a mean value plus a random error. This cross-sectional error term v_i, which measures the deviation of the cross-sectional unit from the common constant, must be uncorrelated with the explanatory variables and with the observation-level errors if it is to be modeled this way. The time series cross-sectional regression model is then one whose intercept is a random effect.

Under these circumstances, the random error v_i captures heterogeneity specific to a cross-sectional unit and is constant over time, whereas the random error e_it is specific to a particular observation. For v_i to be properly specified, it must be orthogonal to the explanatory variables. Because of the separate cross-sectional error term, these models are sometimes called one-way random effects models. Because the cross-sectional differences are treated as part of the error rather than removed, the random effects model has the distinct advantage of allowing time-invariant variables to be included among the regressors.
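As a compact summary of the two specifications described in sections 2.1 and 2.2, the models can be sketched as follows. The symbols y_it, x_it, alpha_i and beta are assumptions introduced here to match the text's v_i and e_it; these are the standard one-way textbook forms, not formulas taken from the assignment.

```latex
% Fixed effects: a separate intercept alpha_i for each cross-sectional unit i
y_{it} = \alpha_i + \beta x_{it} + e_{it}

% Random effects: a common intercept plus a unit-specific random error v_i
y_{it} = \alpha + \beta x_{it} + v_i + e_{it},
\qquad \mathrm{E}(v_i) = 0, \quad \mathrm{Var}(v_i) = \sigma_v^2, \quad \mathrm{Cov}(v_i, x_{it}) = 0
```

In the first form the country differences are parameters to be estimated; in the second they are part of a composite error v_i + e_it, which is why the zero-covariance condition is needed.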

2.3 ASSESSING THE APPROPRIATENESS OF FIXED EFFECTS AND RANDOM EFFECTS ESTIMATION

In principle, random effects estimation is more attractive for the following reasons:


Observed characteristics that remain constant for each individual are retained in the regression model. In fixed effects estimation, they have to be dropped.

With random effects estimation we do not lose n degrees of freedom, as is the case with fixed effects.

However, if either of the preconditions for using random effects is violated, we should use fixed effects instead. These preconditions are:

The observations can be described as being drawn randomly from a given population.

The unobserved effect is distributed independently of the Xj variables.

2.4 WHEN TO USE FIXED EFFECTS, RANDOM EFFECTS OR POOLED OLS MODEL

The following flowchart summarizes when to use fixed or random effects:

3. PANEL DATA ANALYSIS USING STATA 9.1

3.1 PANEL DATA

Panel (or longitudinal) data combine cross-sectional and time-series dimensions: there are multiple entities, each of which has repeated measurements at different time periods. Panel data may have group effects, time effects, or both, which are analyzed by fixed effect and random effect models. A panel data set contains n entities or subjects (e.g., firms and states), each of which includes T observations measured at time periods 1 through T. Thus, the total number of observations is nT. Ideally, panel data are measured at regular time intervals (e.g., year, quarter, and month). Otherwise, panel data should be analyzed with caution. A short panel data set has many entities but few time periods (small T), while a long panel has many time periods (large T) but few entities.

3.2 PREFERENCE OF STATA OVER OTHER PACKAGES

We will use the statistical package STATA, which has a rich variety of panel analytic procedures. It has fixed and random effects models, can handle balanced or unbalanced panels, and has one- or two-way random and fixed effects models. It also has both Hausman and Sargan tests for specification. STATA has the Arellano, Bond and Bover estimator for dynamic panel models, and can also handle groupwise heteroskedasticity in the random effects model. STATA mainly provides the following estimation methods:

xtreg Fixed-, between- and random-effects, and population-averaged linear models

xtregar Fixed- and random-effects linear models with an AR(1) disturbance

xtgls Panel-data models using GLS

xtpcse OLS or Prais-Winsten models with panel-corrected standard errors

xtrchh Hildreth-Houck random coefficients models

xtivreg Instrumental variables and two-stage least squares for panel-data models

xtabond Arellano-Bond linear, dynamic panel data estimator

xtabond2 Arellano-Bond system dynamic panel data estimator

xttobit Random-effects tobit models

xtintreg Random-effects interval data regression models

xtlogit Fixed-effects, random-effects, & population-averaged logit models

xtprobit Random-effects and population-averaged probit models

xtcloglog Random-effects and population-averaged cloglog models

xtpoisson Fixed-effects, random-effects, & population-averaged Poisson models

xtnbreg Fixed-effects, random-effects, & population-averaged negative binomial models


3.3 PANEL DATA IN STATA 9.1

We are interested in testing how the physicians' emigration rate to the UK (dependent variable Y) relates to the physicians' emigration rate to Canada (independent variable X) for the Middle Eastern countries of Egypt, Iran and Turkey over the years 1991 to 2004. These countries have populations of approximately the same size, and thus make a suitable data set for comparison. Consider the following dataset:

YEAR   EGYPT_UK   EGYPT_CAN   IRAN_UK   IRAN_CAN   TUR_UK   TUR_CAN
1991   0.0035     0.0043      0.0004    0.0044     0.0003   0.0011
1992   0.0030     0.0035      0.0005    0.0044     0.0002   0.0011
1993   0.0024     0.0030      0.0004    0.0043     0.0002   0.0011
1994   0.0014     0.0022      0.0001    0.0028     0.0001   0.0009
1995   0.0019     0.0021      0.0002    0.0020     0.0001   0.0009
1996   0.0009     0.0019      0.0001    0.0018     0.0002   0.0013
1997   0.0009     0.0023      0.0001    0.0016     0.0001   0.0007
1998   0.0005     0.0019      0.0001    0.0013     0.0000   0.0007
1999   0.0007     0.0016      0.0002    0.0014     0.0001   0.0005
2000   0.0008     0.0015      0.0004    0.0014     0.0000   0.0005
2001   0.0008     0.0013      0.0004    0.0014     0.0001   0.0004
2002   0.0007     0.0013      0.0006    0.0013     0.0002   0.0004
2003   0.0007     0.0013      0.0006    0.0013     0.0002   0.0004
2004   0.0008     0.0012      0.0011    0.0014     0.0002   0.0004
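The table above is in wide format (one column pair per country), whereas the later commands work on one row per country and year. As a rough illustration outside Stata, the following Python sketch reshapes the table to long format and computes a few means. The function name `wide_to_long` and the country coding (1 = Egypt, 2 = Iran, 3 = Turkey) are assumptions made for this example; the report only says the panel variable runs from 1 to 3.

```python
# Wide-format table copied from the text: one row per year and one
# UK/Canada column pair per country.
WIDE = """
YEAR EGYPT_UK EGYPT_CAN IRAN_UK IRAN_CAN TUR_UK TUR_CAN
1991 0.0035 0.0043 0.0004 0.0044 0.0003 0.0011
1992 0.0030 0.0035 0.0005 0.0044 0.0002 0.0011
1993 0.0024 0.0030 0.0004 0.0043 0.0002 0.0011
1994 0.0014 0.0022 0.0001 0.0028 0.0001 0.0009
1995 0.0019 0.0021 0.0002 0.0020 0.0001 0.0009
1996 0.0009 0.0019 0.0001 0.0018 0.0002 0.0013
1997 0.0009 0.0023 0.0001 0.0016 0.0001 0.0007
1998 0.0005 0.0019 0.0001 0.0013 0.0000 0.0007
1999 0.0007 0.0016 0.0002 0.0014 0.0001 0.0005
2000 0.0008 0.0015 0.0004 0.0014 0.0000 0.0005
2001 0.0008 0.0013 0.0004 0.0014 0.0001 0.0004
2002 0.0007 0.0013 0.0006 0.0013 0.0002 0.0004
2003 0.0007 0.0013 0.0006 0.0013 0.0002 0.0004
2004 0.0008 0.0012 0.0011 0.0014 0.0002 0.0004
"""

# Assumed coding of the panel variable (the text only says "1 to 3").
COUNTRY_CODES = {"EGYPT": 1, "IRAN": 2, "TUR": 3}


def wide_to_long(text):
    """Return one dict per (country, year) with the uk and canada rates."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        cells = line.split()
        year = int(cells[0])
        values = dict(zip(header[1:], (float(c) for c in cells[1:])))
        for name, code in COUNTRY_CODES.items():
            rows.append({
                "country": code,
                "year": year,
                "uk": values[name + "_UK"],
                "canada": values[name + "_CAN"],
            })
    rows.sort(key=lambda r: (r["country"], r["year"]))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


long_rows = wide_to_long(WIDE)
overall_uk = mean(r["uk"] for r in long_rows)
overall_canada = mean(r["canada"] for r in long_rows)
country_means_uk = {
    code: mean(r["uk"] for r in long_rows if r["country"] == code)
    for code in COUNTRY_CODES.values()
}

print(len(long_rows), "rows in long format")
print(f"overall mean: uk = {overall_uk:.6f}, canada = {overall_canada:.6f}")
print({code: round(m, 6) for code, m in country_means_uk.items()})
```

The result has 3 x 14 = 42 rows, and the overall means can be compared with the summary statistics reported in step d.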

We will follow these steps for panel data analysis in STATA

a. Input the Data in STATA

Command: . insheet using "C:\Documents and Settings\Micro\Desktop\Assignment\panel data.txt"

b. Declare the Cross-Section and Time Variables

Command: Statistics > Time Series > Setup and Utilities > Declare Data to be Time Series

Time variable: year

Panel ID variable: country

Display format for the time variable: Yearly

Output:

tsset country year, yearly
       panel variable:  country, 1 to 3
        time variable:  year, 1991 to 2004

c. Carry out the Descriptive Statistic

Command: xtdes

Output:

 country:  1, 2, ..., 3                                  n =          3
    year:  1991, 1992, ..., 2004                         T =         14
           Delta(year) = 1; (2004-1991)+1 = 14
           (country*year uniquely identifies each observation)

Distribution of T_i:   min    5%   25%   50%   75%   95%   max
                        14    14    14    14    14    14    14

     Freq.  Percent    Cum. |  Pattern
 ---------------------------+----------------
        3    100.00  100.00 |  11111111111111
 ---------------------------+----------------
        3    100.00         |  XXXXXXXXXXXXXX

d. Outline the Main Variables in the Basic Sample Statistics

Command: xtsum uk canada

Output:

Variable            Mean       Std. Dev.   Min        Max        Observations
uk       overall    0.000624   0.000778    0          0.0035     N = 42
         between               0.000645    0.000143   0.001357   n = 3
         within                0.000567    -0.00023   0.002767   T = 14
canada   overall    0.001681   0.001115    0.0004     0.0044     N = 42
         between               0.000814    0.000743   0.0022     n = 3
         within                0.000889    0.000781   0.003881   T = 14


e. Examine the Relationship between the Dependent and Independent Variables

Command: xtline uk canada

Output:

[Figure: line plots of UK and CANADA (vertical axis 0 to .004) against YEAR (1990 to 2005), one panel for each of countries 1, 2 and 3. Graphs by Country.]

f. Conclusion

It is observed that a relationship may exist between the two variables (physicians' emigration rate to the UK and physicians' emigration rate to Canada). Therefore, we will continue with an in-depth panel data analysis, testing for fixed and random effects using the Hausman test, as well as testing for autocorrelation and heteroskedasticity.

4. PANEL DATA MODELS

4.1 DIFFERENT PANEL DATA MODELS

Panel data models examine fixed and/or random effects of entity (individual or subject) or time. The core difference between fixed and random effect models lies in the role of dummy variables. If dummies are considered as a part of the intercept, this is a fixed effect model. In a random effect model, the dummies act as an error term. Fixed effects are tested by the (incremental) F test, while random effects are examined by the Lagrange Multiplier (LM) test (Breusch and Pagan 1980). If the null hypothesis is not rejected, the pooled OLS regression is favored. If one cross-sectional or time-series variable is considered (e.g., country, firm, and race), this is called a one-way fixed or random effect model. Two-way effect models have two sets of dummy variables for group and/or time variables (e.g., state and year).

4.2 FIXED EFFECT AND RANDOM EFFECT MODEL IN STATA

We use the xtreg command for estimating the following four basic linear panel data models:

Fixed effects model (fixed-effect)

Random effects model (random-effect)

Between groups effects model (between-effect)

Population-averaged model (population-average)

Command: xtreg depvar [varlist] [if exp], model_type [level(#)]

where the level(#) option sets the confidence level for the reported confidence intervals (the default value is 95%). The model_type option selects one of the following estimators:

model_type   Model
be           between-effects estimator
fe           fixed-effects estimator
re           GLS random-effects estimator
pa           GEE population-averaged estimator
mle          maximum-likelihood random-effects estimator


Stata also provides the following procedures and commands for estimating panel data models:

Procedure                        STATA Command
Regression (OLS)                 .regress
One-way fixed effect (within)    .xtreg, fe
                                 .areg, abs
Two-way fixed effect (within)    N/A
Between effect                   .xtreg, be
One-way random effect            .xtreg, re
                                 .xtgls
                                 .xtmixed
Two-way random effect            .xtmixed
Random coefficient model         .xtmixed
                                 .xtrc

4.3 FIXED EFFECT MODELS

There are several strategies for estimating fixed effect models. The least squares dummy variable model (LSDV) uses dummy variables, whereas the within effect model does not. These strategies, of course, produce identical parameter estimates for the non-dummy independent variables. The between effect model fits the model using group and/or time means of dependent and independent variables without dummies.

a. One-way Fixed Effect Models: Group Effects

A one-way fixed group model examines group differences in intercepts. The LSDV for this fixed model needs to create as many dummy variables as the number of entities or subjects. When many dummies are needed, the within effect model is useful since it transforms variables using group means to avoid dummies. The between effect model uses group means of variables.
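To make the within transformation described above concrete, here is a small Python sketch on a hypothetical toy panel (not the assignment's data). Two groups share the slope 2 but have different intercepts; subtracting each group's mean recovers that slope without any dummy variables, while pooled OLS is misled by the intercept difference. All names in the block are illustrative assumptions.

```python
# Hypothetical toy panel (not the assignment's data): both groups follow
# y = intercept + 2 * x, with intercepts 10 (group A) and -5 (group B).
panel = {
    "A": {"x": [1.0, 2.0, 3.0], "y": [12.0, 14.0, 16.0]},
    "B": {"x": [4.0, 5.0, 6.0], "y": [3.0, 5.0, 7.0]},
}


def mean(values):
    return sum(values) / len(values)


def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Pooled OLS: stack all observations and ignore group membership.
x_all = [xi for g in panel.values() for xi in g["x"]]
y_all = [yi for g in panel.values() for yi in g["y"]]
pooled_slope = ols_slope(x_all, y_all)

# Within transformation: subtract each group's own mean from x and y,
# which removes the group intercepts without creating dummies.
x_within, y_within = [], []
for g in panel.values():
    x_bar, y_bar = mean(g["x"]), mean(g["y"])
    x_within += [xi - x_bar for xi in g["x"]]
    y_within += [yi - y_bar for yi in g["y"]]
within_slope = ols_slope(x_within, y_within)

# Group intercepts recovered afterwards from the group means.
intercepts = {
    name: mean(g["y"]) - within_slope * mean(g["x"])
    for name, g in panel.items()
}

print(f"pooled slope = {pooled_slope:.3f}")
print(f"within slope = {within_slope:.3f}")
print("group intercepts:", intercepts)
```

In this toy case the pooled slope is negative even though every group has a positive slope, which is the kind of distortion the fixed group effect model is meant to remove.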

i. The Pooled OLS Regression Model

First, fit the pooled regression model without any dummy variable Command: regress mgrt_uk mgrt_can


Output:

Source      SS            df   MS              Number of obs =     42
                                               F(1, 40)      =  17.42
Model       7.5293e-06     1   7.5293e-06      Prob > F      = 0.0002
Residual    .000017287    40   4.3217e-07      R-squared     = 0.3034
                                               Adj R-squared = 0.2860
Total       .000024816    41   6.0527e-07      Root MSE      = .00066

uk          Coef.       Std. Err.   t       P>|t|   [95% Conf. Interval]
canada      .3843646    .0920859    4.17    0.000   .1982521    .570477
_cons       -.0000223   .0001851    -0.12   0.905   -.0003963   .0003517

Comments:

The regression equation is mgrt_uk = -.0000223 + .3843646 * mgrt_can

The slope is statistically significant (F = 17.42, p = 0.0002), but the model explains only about 30% of the variation in the dependent variable (R2 = 0.3034), so it does not fit the data well.

We may, however, suspect that there is a fixed group effect producing different intercepts across groups.

ii. Least Squares Dummy Variable Models

The Least Squares Dummy Variable regression model (LSDV1) drops one dummy variable to get the model identified. LSDV1 produces correct ANOVA information, goodness of fit, parameter estimates, and standard errors. As a consequence, this approach is commonly used in practice.

Command: regress mgrt_uk g1-g2 mgrt_can

Output:

Source      SS            df   MS              Number of obs =     42
                                               F(3, 38)      =  20.23
Model       .00001526      3   5.0868e-06      Prob > F      = 0.0000
Residual    9.5559e-06    38   2.5147e-07      R-squared     = 0.6149
                                               Adj R-squared = 0.5845
Total       .000024816    41   6.0527e-07      Root MSE      = .0005

mgrt_uk     Coef.       Std. Err.   t       P>|t|   [95% Conf. Interval]
g1          .0007619    .0002241    3.40    0.002   .0003083    .0012156
g2          -.0002571   .0002289    -1.12   0.268   -.0007205   .0002063
mgrt_can    .3333187    .0880796    3.78    0.001   .1550109    .5116264
_cons       -.0001048   .0001491    -0.70   0.487   -.0004067   .0001972

Comments:

LSDV1 fits the data better than does the pooled OLS.

SSE decreases from .000017287 to 9.5559e-06.

R2 increases from 0.3034 to 0.6149.

Due to the dummies included, this model loses two degrees of freedom (from 40 to 38).

iii. Testing Fixed Group Effects (F-test)

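In the LSDV regression, the null hypothesis is that all dummy parameters except the one for the dropped group are zero. This hypothesis is tested by the F-test, which is based on the loss of goodness-of-fit. The robust model in the following formula is LSDV (or the within effect model) and the efficient model is the pooled regression.

$$F = \frac{(R^2_{LSDV} - R^2_{pooled})/(n - 1)}{(1 - R^2_{LSDV})/(nT - n - k)}$$

With n = 3 groups, T = 14 years and k = 1 regressor, the denominator has 42 - 3 - 1 = 38 degrees of freedom, which matches the residual degrees of freedom in the LSDV1 output. The F statistic is computed as

$$F_{cal} = \frac{(0.6149 - 0.3034)/(3 - 1)}{(1 - 0.6149)/(42 - 3 - 1)} \approx 15.37$$

The upper 5% critical value is Ftab = F[.95, 2, 38], approximately 3.24.

Since Fcal > Ftab, we reject the null hypothesis and conclude that the fixed group effect model is better than the pooled OLS model.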
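The arithmetic of the group-effect F-test can be checked with a short Python sketch. It takes the two R2 values from the regression outputs and uses the residual degrees of freedom reported for the LSDV1 model (38). Because the numerator has exactly 2 degrees of freedom, the F distribution's upper tail has a simple closed form, so no statistics library is needed. The function names are assumptions made for this illustration.

```python
def incremental_f(r2_unrestricted, r2_restricted, df_num, df_den):
    """F statistic for the loss of fit when df_num dummies are dropped."""
    numerator = (r2_unrestricted - r2_restricted) / df_num
    denominator = (1.0 - r2_unrestricted) / df_den
    return numerator / denominator


def f_upper_tail_2(x, df_den):
    """P(F > x) for F(2, df_den); closed form valid only for 2 numerator df."""
    return (1.0 + 2.0 * x / df_den) ** (-df_den / 2.0)


def f_critical_2(alpha, df_den):
    """Upper-tail critical value of F(2, df_den) at significance level alpha."""
    return (df_den / 2.0) * (alpha ** (-2.0 / df_den) - 1.0)


n_groups, n_obs, n_slopes = 3, 42, 1
df_num = n_groups - 1                  # two group dummies in LSDV1
df_den = n_obs - n_groups - n_slopes   # 38, the LSDV1 residual df

f_stat = incremental_f(0.6149, 0.3034, df_num, df_den)
f_crit = f_critical_2(0.05, df_den)
p_value = f_upper_tail_2(f_stat, df_den)

print(f"F({df_num}, {df_den}) = {f_stat:.2f}")
print(f"5% critical value = {f_crit:.2f}")
print(f"p-value = {p_value:.6f}")
print("reject H0" if f_stat > f_crit else "fail to reject H0")
```

The closed form applies only when the numerator has 2 degrees of freedom; for other cases a statistical table or package would be needed for the critical value.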


b. One-way Fixed Effect Models: Fixed Time Effect

A fixed time effect model investigates how time affects the intercept using time dummy variables. The logic and method are the same as those of the fixed group effect model.

i. The Pooled OLS Regression Model

Similar to our previous section, we will fit a pooled regression model without any dummy variable and obtain the same results.

ii. Least Squares Dummy Variable Models

In this model we will drop one time dummy variable.

Command: regress mgrt_uk t1-t13 mgrt_can

Output:

Source      SS            df   MS              Number of obs =     42
                                               F(14, 27)     =   1.06
Model       8.8007e-06    14   6.2862e-07      Prob > F      = 0.4313
Residual    .000016015    27   5.9316e-07      R-squared     = 0.3546
                                               Adj R-squared = 0.0200
Total       .000024816    41   6.0527e-07      Root MSE      = .00077

mgrt_uk     Coef.       Std. Err.   t       P>|t|   [95% Conf. Interval]
t1          -.0001666   .0007142    -0.23   0.817   -.0016321   .0012989
t2          -.0002313   .0006962    -0.33   0.742   -.0016599   .0011972
t3          -.0003882   .0006839    -0.57   0.575   -.0017915   .0010151
t4          -.0005363   .0006452    -0.83   0.413   -.0018601   .0007876
t5          -.0002216   .0006367    -0.35   0.731   -.0015279   .0010848
t6          -.0005549   .0006367    -0.87   0.391   -.0018613   .0007515
t7          -.0005372   .0006339    -0.85   0.404   -.0018378   .0007633
t8          -.0006147   .0006304    -0.98   0.338   -.0019083   .0006789
t9          -.0004304   .0006293    -0.68   0.500   -.0017217   .0008609
t10         -.000351    .0006292    -0.56   0.582   -.0016419   .0009399
t11         -.0002794   .0006289    -0.44   0.660   -.0015697   .0010109
t12         -.0002      .0006288    -0.32   0.753   -.0014903   .0010903
t13         -.0002      .0006288    -0.32   0.753   -.0014903   .0010903
mgrt_can    .3823382    .1494048    2.56    0.016   .0757848    .6888916
_cons       .0003177    .0004691    0.68    0.504   -.0006448   .0012802


iii. Testing Fixed Time Effects (F-test)

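Again we use the null hypothesis that all time dummy parameters except the one for the dropped period are zero. With T = 14 periods, the numerator has 14 - 1 = 13 degrees of freedom and the denominator has 42 - 14 - 1 = 27, the residual degrees of freedom of the LSDV model. The F statistic is computed as

$$F_{cal} = \frac{(0.3546 - 0.3034)/(14 - 1)}{(1 - 0.3546)/(42 - 14 - 1)} \approx 0.165$$

The upper 5% critical value is Ftab = F[.95, 13, 27], approximately 2.1.

Since Fcal < Ftab, we fail to reject the null hypothesis and conclude that there is no fixed time effect.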
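The same test can be computed from the residual sums of squares instead of the R2 values, which gives a useful cross-check. The Python sketch below uses the residual SS figures from the pooled and time-dummy regression outputs, together with the 13 time dummies and the 27 residual degrees of freedom shown there. The function name is an assumption for this example, and the sketch stops at the statistic because the Python standard library has no general F distribution.

```python
def f_from_sse(sse_restricted, sse_unrestricted, df_num, df_den):
    """F statistic from restricted and unrestricted residual sums of squares."""
    numerator = (sse_restricted - sse_unrestricted) / df_num
    denominator = sse_unrestricted / df_den
    return numerator / denominator


sse_pooled = 0.000017287      # residual SS, pooled OLS output
sse_time_lsdv = 0.000016015   # residual SS, LSDV output with time dummies
df_num = 13                   # time dummies t1 to t13
df_den = 27                   # residual df of the time LSDV model

f_time = f_from_sse(sse_pooled, sse_time_lsdv, df_num, df_den)

# Cross-check using the R-squared form of the same statistic.
f_time_r2 = ((0.3546 - 0.3034) / df_num) / ((1.0 - 0.3546) / df_den)

print(f"F({df_num}, {df_den}) from SSE = {f_time:.3f}")
print(f"F({df_num}, {df_den}) from R2  = {f_time_r2:.3f}")
```

Both forms agree up to rounding of the reported inputs, and the value is well below 1, so the time dummies add very little explanatory power.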

4.4 RANDOM EFFECT MODELS

A random effect model examines how group and/or time affect error variances. This model is appropriate for n individuals who were drawn randomly from a large population. Here we will focus on the feasible generalized least squares (FGLS) with variance component estimation methods.

a. One-way Random Group Effect Model

In STATA, the .xtreg command has the re option to fit the one-way random effect model. This produces the FGLS estimates.

Command: iis g

xtreg mgrt_uk mgrt_can, re theta

The theta option reports an estimated theta.

Output:

Random-effects GLS regression           Number of obs      =     42
Group variable (i): g                   Number of groups   =      3
R-sq:  within  = 0.2737                 Obs per group: min =     14
       between = 0.3568                                avg =   14.0
       overall = 0.3034                                max =     14
Random effects u_i ~ Gaussian           Wald chi2(1)       =  15.19
corr(u_i, X)       = 0 (assumed)        Prob > chi2        = 0.0001
theta              = .81687672

mgrt_uk     Coef.       Std. Err.   z       P>|z|   [95% Conf. Interval]
mgrt_can    .3359594    .0862016    3.90    0.000   .1670074    .5049114
_cons       .0000591    .0004419    0.13    0.894   -.0008071   .0009252

sigma_u     .0007195
sigma_e     .00050147
rho         .67305272   (fraction of variance due to u_i)
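The reported theta and rho can be reproduced from sigma_u and sigma_e. The following Python sketch assumes the usual balanced-panel expressions, theta = 1 - sqrt(sigma_e^2 / (T * sigma_u^2 + sigma_e^2)) and rho = sigma_u^2 / (sigma_u^2 + sigma_e^2), with T = 14 periods per group. The function names are illustrative, and small differences from the printed values come from rounding in the reported sigmas.

```python
import math


def re_theta(sigma_u, sigma_e, periods):
    """Quasi-demeaning weight of the random-effects (FGLS) estimator."""
    return 1.0 - math.sqrt(sigma_e ** 2 / (periods * sigma_u ** 2 + sigma_e ** 2))


def re_rho(sigma_u, sigma_e):
    """Fraction of the total error variance due to the group effect u_i."""
    return sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)


sigma_u = 0.0007195    # from the xtreg, re output
sigma_e = 0.00050147   # from the xtreg, re output
periods = 14           # observations per group

theta = re_theta(sigma_u, sigma_e, periods)
rho = re_rho(sigma_u, sigma_e)

print(f"theta = {theta:.5f}")
print(f"rho   = {rho:.5f}")
```

A theta near 1 means the random-effects estimator behaves much like the within (fixed effects) estimator, while a theta of 0 reduces it to pooled OLS.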

b. One-way Random Time Effect Model

In Stata, we have to switch group and time variables using the .tsset command.

Command: tsset year g

Output:

panel variable: year, 1991 to 2004

time variable: g, 1 to 3

Command: xtreg mgrt_uk mgrt_can, re i(year) theta

Output:

Random-effects GLS regression           Number of obs      =     42
Group variable (i): year                Number of groups   =     14
R-sq:  within  = 0.1952                 Obs per group: min =      3
       between = 0.7414                                avg =    3.0
       overall = 0.3034                                max =      3
Random effects u_i ~ Gaussian           Wald chi2(1)       =  17.42
corr(u_i, X)       = 0 (assumed)        Prob > chi2        = 0.0000
theta              = 0

mgrt_uk     Coef.       Std. Err.   z       P>|z|   [95% Conf. Interval]
mgrt_can    .3843646    .0920859    4.17    0.000   .2038796    .5648495
_cons       -.0000223   .0001851    -0.12   0.904   -.000385    .0003404

sigma_u     0
sigma_e     .00077017
rho         0   (fraction of variance due to u_i)

c. Testing Random Effect Models

The Breusch-Pagan Lagrange multiplier (LM) test is designed to test random effects. The null hypothesis of the one-way random group effect model is that the individual-specific or time-series error variances are zero, that is, Var(u) = 0. If the null hypothesis is not rejected, the pooled regression model is appropriate.
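For reference, the statistic is usually written as below for a balanced panel with n groups and T periods, where e_it denotes the pooled OLS residuals. This is the textbook Breusch-Pagan form, given here as an assumption about what is being computed; the exact expression used by the software is not shown in the assignment.

```latex
LM = \frac{nT}{2(T-1)}
     \left[ \frac{\sum_{i=1}^{n} \left( \sum_{t=1}^{T} e_{it} \right)^{2}}
                 {\sum_{i=1}^{n} \sum_{t=1}^{T} e_{it}^{2}} - 1 \right]^{2}
\;\sim\; \chi^{2}(1) \quad \text{under } H_0 : \sigma_u^{2} = 0
```

Large values of LM indicate that residuals within the same group move together, which is evidence against pooling.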

i. Testing Random Group Effects

In Stata, run the .xttest0 command right after estimating the one-way random group effect model. Command: quietly xtreg mgrt_uk mgrt_can, re i(g)

xttest0

Output:

Breusch and Pagan Lagrangian multiplier test for random effects:

mgrt_uk[g,t] = Xb + u[g] + e[g,t]

Estimated results:

Var sd = sqrt(Var)

-----------------+-----------------------------

mgrt_uk 6.05e-07 .000778

e 2.51e-07 .0005015

u 5.18e-07 .0007195

Test: Var(u) = 0

chi2(1) = 43.56

Prob > chi2 = 0.0000

Comments:

Since the null hypothesis is rejected, it is concluded that the pooled regression is inappropriate. The test results indicate the existence of random effects.


ii. Testing Random Time Effect

The null hypothesis of the one-way random time effect is that the variance components for time are zero. In Stata, run the .xttest0 command right after estimating the one-way random time effect model.

Command: tsset year g

Output:

panel variable: year, 1991 to 2004

time variable: g, 1 to 3

Command: quietly xtreg mgrt_uk mgrt_can, re i(year)

xttest0

Output: Breusch and Pagan Lagrangian multiplier test for random effects:

mgrt_uk[year,t] = Xb + u[year] + e[year,t]

Estimated results:

| Var sd = sqrt(Var)

---------+-----------------------------

mgrt_uk | 6.05e-07 .000778

e | 5.93e-07 .0007702

u | 0 0

Test: Var(u) = 0

chi2(1) = 6.38

Prob > chi2 = 0.0116

Comments:

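Since the null hypothesis is not rejected at the .01 level of significance (p = 0.0116), it is concluded that the variance components for time are zero. The test would reject at the .05 level, but the estimated sigma_u of 0 for the random time effect model is consistent with no random time effect.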

5. HAUSMAN TEST

5.1 FIXED EFFECTS VS. RANDOM EFFECTS

How do we compare a fixed effect model and its random effect counterpart? From a purely practical point of view, the fixed effects model often consumes many degrees of freedom, especially for panel data with a large number of cross-sectional units. In such a case a random effects model would seem more appropriate. On the other hand, the fixed effects model has a unique advantage: it does not require the assumption that the individual effects are uncorrelated with the other explanatory variables, whereas the random effects model does.

5.2 HAUSMAN TEST

Whether the individual effects u_i are correlated with the other explanatory variables can be tested, and the result serves as the basis for choosing between the fixed effects and random effects models. The Hausman specification test does this by examining whether the individual effects are uncorrelated with the other regressors in the model. Following are the steps involved in the Hausman test:

a. Estimate the fixed-effect model

Command: tsset g year

Output:

panel variable: g, 1 to 3

time variable: year, 1991 to 2004

Command: quietly xtreg mgrt_uk mgrt_can, fe

estimates store fixed_group

b. Estimate the random effects model

Command: quietly xtreg mgrt_uk mgrt_can, re

c. Apply the Hausman test

Command: hausman fixed_group


Output: ---- Coefficients ----

(b) (B) (b-B) sqrt(diag(V_b-V_B))

fixed_group . Difference S.E.

mgrt_can .3333187 .3359594 -.0026407 .0180914

b = consistent under Ho and Ha; obtained from xtreg

B = inconsistent under Ha, efficient under Ho; obtained from xtreg

Test: Ho: difference in coefficients not systematic

chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)

= 0.02

Prob>chi2 = 0.8839

Comments:

The test does not reject the null hypothesis, in favor of the random effect model. Therefore we conclude that a random effect model is more suitable.
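With a single slope coefficient, the Hausman statistic reduces to the squared ratio of the coefficient difference to its standard error, referred to a chi-squared distribution with 1 degree of freedom. The Python sketch below reproduces the reported figures from the two coefficient estimates and the S.E. column of the output. The function name is an assumption for this illustration, and the chi2(1) p-value uses the identity P(chi2(1) > x) = erfc(sqrt(x / 2)).

```python
import math


def hausman_single(b_fixed, b_random, se_difference):
    """Hausman chi2(1) statistic and p-value for one coefficient."""
    chi2 = ((b_fixed - b_random) / se_difference) ** 2
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


b_fixed = 0.3333187        # (b) fixed-effects slope on mgrt_can
b_random = 0.3359594       # (B) random-effects slope on mgrt_can
se_difference = 0.0180914  # sqrt(diag(V_b - V_B))

chi2, p_value = hausman_single(b_fixed, b_random, se_difference)

print(f"chi2(1) = {chi2:.2f}")
print(f"Prob > chi2 = {p_value:.4f}")
```

The two slope estimates differ by far less than one standard error, which is why the statistic is so small.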

6. CONCLUSION

Panel data are analyzed to investigate group and time effects using fixed effect and random effect models. The fixed effect model asks how group and/or time affect the intercept, while the random effect model analyzes error variance structures affected by group and/or time. Slopes are assumed unchanged in both fixed effect and random effect models.

6.1 STATISTICAL PACKAGE

Four statistical packages can be used for panel data analysis; namely SAS, LIMDEP, SPSS and Stata. We have preferred to use Stata as the software of choice in this report, as it is very handy to manipulate panel data as well as being user-friendly.

6.2 DATA

A panel data set needs to be arranged in the long format in order to be used in STATA. If the number of groups (subjects) or time periods is extremely large, panel data models may be less useful because the null hypothesis of F test is too strong. Then, we may consider categorizing subjects to reduce the number of groups. If data are severely unbalanced, read output with caution and consider dropping subjects with many missing data points. This document assumes that data are balanced without missing values.

6.3 PANEL DATA MODELS

a. Pooled OLS Models

The Ordinary Least Squares regression model for the pooled data did not prove to be a good fit, with a low R2 value (0.3034). This implies that there may be a fixed group effect producing different intercepts across groups.

b. Fixed Effect Models

Fixed effect models are estimated by the least squares dummy variable (LSDV) regression and within effect model. LSDV has three approaches to avoid perfect multicollinearity. LSDV1 drops a dummy, LSDV2 suppresses the intercept, and LSDV3 includes all dummies and imposes restrictions instead. LSDV1 is commonly used since it produces correct statistics. That is why we only applied LSDV1 on this data. LSDV1 fits the data better than does the pooled OLS with a much higher R2 value.

c. Random Effect Models


Random effect models are estimated by generalized least squares (GLS) and feasible generalized least squares (FGLS). When the variance structure is known, GLS is used. If it is unknown, FGLS estimates theta. Parameter estimates vary depending on the estimation method. Since the variance structure is unknown here, we used the FGLS model.

d. Tests for Fixed Effect and Random Effect Models

Fixed effects are tested by the F-test and random effects by the Breusch-Pagan Lagrange multiplier test.

i. Testing Fixed Group Effects

The result is significant; therefore we reject the null hypothesis and conclude that the fixed group effect model is better than the pooled OLS model.

ii. Testing Fixed Time Effects

The result is insignificant; therefore we fail to reject the null hypothesis and conclude that there is no fixed time effect.

iii. Testing Random Group Effects

The result is significant; therefore it is concluded that the pooled regression is inappropriate and there is existence of random effects.

iv. Testing Random Time Effects

The result is not significant at the .01 level (p = 0.0116); therefore we fail to reject the null hypothesis at that level and conclude that there is no random time effect.

v. Hausman Test

The Hausman specification test compares a fixed effect model and a random effect model. If the null hypothesis that the individual effects are uncorrelated with the regressors is rejected, the fixed effect model is preferred. In this case the null hypothesis is not rejected, which supports the adequacy of the random effects model for this data.