1st Stata Group Meeting Mexico
Discussion of user-written Stata programs

Predicting counterfactual densities with the DFL ado-file: A pertinent constructive critique

Luis Huesca Reynoso
Centro de Investigación en Alimentación y Desarrollo, A.C., Department of Economics
Email: [email protected]

April 23, 2009, Universidad Iberoamericana, Campus Mexico



Dealing with distributions (and therefore with densities) is not an easy task!

Problems to face:
A. Scale: log or numeric.
B. Comparison: unit of measurement (in economics and the social sciences: constant prices, among others).
C. Selection of the right window width (by eye or the optimal one); see for instance bandw by Salgado-Ugarte, Shimizu and Taniuchi.
D. Joint estimation: compute the densities together (see for instance nbins, or the number of grid points in akdensity).

Stata makes it easier!

Goal: the estimation of well-dimensioned kernel density functions and counterfactuals with a semiparametric technique, that is, densities that recover the real shape not only of the total distribution but also of each of the subgroups that belong to it.


Probability density function (PDF)

Any function $f(y)$ can serve as a density function as long as:

$$f(y) \ge 0 \quad \forall y$$

and

$$\int f(y)\,dy = 1$$

By definition, a PDF must integrate to one, and the same holds for the Gaussian or any other well-behaved kernel function (Duclos, 2001; Silverman, 1986), for instance the Epanechnikov, biweight, triangular, or cosine kernels.

A general kernel function $K(u)$ used to weight the density must then satisfy

$$\int K(u)\,du = 1,$$

since then

$$\int \hat{f}(y)\,dy = 1.$$
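The step from the kernel integrating to one to the estimate integrating to one is a change of variables. The sketch below assumes the usual kernel estimator with $n$ observations $y_i$ and a fixed window width $h > 0$; this notation is an assumption made for the illustration.

```latex
\begin{aligned}
\hat{f}(y) &= \frac{1}{n}\sum_{i=1}^{n}\frac{1}{h}\,K\!\left(\frac{y-y_i}{h}\right),\\
\int \hat{f}(y)\,dy
  &= \frac{1}{n}\sum_{i=1}^{n}\int \frac{1}{h}\,K\!\left(\frac{y-y_i}{h}\right)dy
   = \frac{1}{n}\sum_{i=1}^{n}\int K(u)\,du
   = 1,
   \qquad u = \frac{y-y_i}{h}.
\end{aligned}
```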


Kernel density estimation: letting the data speak for themselves, as follows:

$$\hat{f}(y) = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{h}\,K\!\left(\frac{y-y_i}{h}\right)$$

with $(y_1, \dots, y_n)$ the vector of earnings, $h$ the optimal window width and $K$ a Gaussian kernel function.
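A minimal Python sketch of this estimator follows (Python is used only for illustration; it is not the Stata code behind the talk). The earnings values are made up, and the rule-of-thumb window width from Silverman (1986) is an assumption standing in for "the optimal window width".

```python
import math
import statistics


def gaussian_kernel(u):
    """Standard Gaussian kernel K(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def rule_of_thumb_bandwidth(data):
    """Silverman-style rule of thumb: 0.9 * min(sd, IQR / 1.349) * n^(-1/5)."""
    n = len(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    return 0.9 * min(sd, (q3 - q1) / 1.349) * n ** (-0.2)


def kde(y, data, h):
    """Fixed-bandwidth kernel density estimate evaluated at the point y."""
    return sum(gaussian_kernel((y - yi) / h) for yi in data) / (len(data) * h)


def integrate(func, lo, hi, steps=4000):
    """Trapezoidal rule on an equally spaced grid."""
    width = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    total += sum(func(lo + i * width) for i in range(1, steps))
    return total * width


# Hypothetical log monthly earnings (illustrative values only).
log_earnings = [6.1, 6.8, 7.2, 7.4, 7.5, 7.9, 8.0, 8.3, 8.8, 9.6]

h = rule_of_thumb_bandwidth(log_earnings)
area = integrate(lambda y: kde(y, log_earnings, h), 0.0, 16.0)
print("window width:", h)
print("area under the estimated density:", area)
```

The area should come out numerically indistinguishable from one, which is the property stated on the previous slide.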

Following Jenkins and Van Kerm (2005) for decompositions:

$$f(y) = \sum_{k=1}^{K} v_k\, f_k(y)$$

that is, a weighted sum of the PDFs of each subgroup $k$, where $v_k$ stands for the population share of group $k$ and $f_k$ for the PDF of group $k$.
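The decomposition can be checked numerically. The Python sketch below uses made-up data for two subgroups and assumes the same window width `h` for every group; under that assumption the pooled kernel estimate equals the share-weighted sum of the subgroup estimates exactly.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel K(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(y, data, h):
    """Fixed-bandwidth kernel density estimate evaluated at the point y."""
    return sum(gaussian_kernel((y - yi) / h) for yi in data) / (len(data) * h)


# Hypothetical log earnings for two subgroups (illustrative values only).
groups = {
    "self-employed": [6.0, 6.9, 7.4, 8.1],
    "wage-earners": [7.0, 7.3, 7.6, 7.8, 8.0, 8.2, 8.5, 8.9, 9.3, 9.8, 10.1, 10.6],
}
pooled = [y for values in groups.values() for y in values]

# Population share of each subgroup.
shares = {name: len(values) / len(pooled) for name, values in groups.items()}

h = 0.5  # common window width, assumed identical for every group


def decomposed_density(y):
    """Share-weighted sum of the subgroup densities at the point y."""
    return sum(shares[name] * kde(y, groups[name], h) for name in groups)


grid = [4.0 + 0.25 * i for i in range(33)]  # 4.0, 4.25, ..., 12.0
max_gap = max(abs(kde(y, pooled, h) - decomposed_density(y)) for y in grid)
print("shares:", shares)
print("largest gap between pooled and decomposed density:", max_gap)
```

Each subgroup density integrates to one on its own, so plotting it unscaled next to the total overstates the subgroup; multiplying by the population share puts it on the same scale as the total.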

- In the empirical example an adaptive kernel estimator is used (Van Kerm, 2003).
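As a rough illustration of what "adaptive" means here, the Python sketch below follows the textbook scheme in Silverman (1986): a fixed-bandwidth pilot estimate sets a local factor for each observation, so sparse regions get wider kernels. It is not the code of akdensity; the data, the window width `h` and the sensitivity exponent of 0.5 are assumptions made for the example.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel K(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def fixed_kde(y, data, h):
    """Fixed-bandwidth pilot estimate evaluated at the point y."""
    return sum(gaussian_kernel((y - yi) / h) for yi in data) / (len(data) * h)


def local_factors(data, h, sensitivity=0.5):
    """Local bandwidth factors (g / pilot_i) ** sensitivity, g = geometric mean."""
    pilot = [fixed_kde(yi, data, h) for yi in data]
    geo_mean = math.exp(sum(math.log(p) for p in pilot) / len(pilot))
    return [(geo_mean / p) ** sensitivity for p in pilot]


def adaptive_kde(y, data, h, factors):
    """Kernel estimate in which observation i uses the window width h * factor_i."""
    total = 0.0
    for yi, lam in zip(data, factors):
        width = h * lam
        total += gaussian_kernel((y - yi) / width) / width
    return total / len(data)


# Hypothetical bimodal log earnings with one isolated high value.
log_earnings = [6.0, 6.2, 6.3, 6.5, 6.6, 8.9, 9.0, 9.1, 9.2, 9.4, 11.5]
h = 0.4
factors = local_factors(log_earnings, h)

# Trapezoidal check that the adaptive estimate still integrates to one.
step = 0.01
grid = [step * i for i in range(1801)]  # 0.00, 0.01, ..., 18.00
values = [adaptive_kde(y, log_earnings, h, factors) for y in grid]
area = step * (sum(values) - 0.5 * (values[0] + values[-1]))
print("largest local factor:", max(factors))
print("area under the adaptive density:", area)
```

The isolated observation receives the largest factor, so the tail is smoothed more while the two modes keep a narrower window.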


DiNardo, Fortin and Lemieux (1996)

Counterfactual estimation compares the actual distribution of the outcome variable with the distribution that would have prevailed if one group had been paid like the comparison group (the counterpart).

Actual wage distributions for A and B:

$$f_A(y) = \int f_A(y|x)\,h(x|s=A)\,dx \qquad f_B(y) = \int f_B(y|x)\,h(x|s=B)\,dx$$

Actual:

$$f_B^B(y) = \int f_B(y|x)\,h(x|s=B)\,dx$$

Counterfactual:

$$f_B^A(y) = \int f_B(y|x)\,h(x|s=A)\,dx$$

DFL (1996) rewrite and reweight the density for B as follows:

$$f_B^A(y) = \int w\, f_B(y|x)\,h(x|s=B)\,dx, \qquad w = \frac{h(x|s=A)}{h(x|s=B)}$$

which can be computed using Bayes' theorem:

$$w = \frac{P(A|x)}{1-P(A|x)}\cdot\frac{1-P(A)}{P(A)}$$

The conditional treatment probability (the propensity score) is estimated by the program with a logistic regression specification (the DFL command can switch to probit as well). For comparison I use the pscore ado-file written by Becker & Ichino (2002), which follows the nearest-neighbour technique.

In Stata: w = 1-Prob(Depvar=1)/Prob(Depvar=0)
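The Bayes step can be verified on a toy example. In the Python sketch below the data are made up, there is a single binary covariate `x`, and cell frequencies stand in for the logit or probit fit that the ado-file uses; it only illustrates that the odds-based weight reproduces the ratio of covariate densities h(x|s=A) / h(x|s=B).

```python
from collections import Counter


def dfl_weight(p_a_given_x, p_a):
    """w(x) = [P(A|x) / (1 - P(A|x))] * [(1 - P(A)) / P(A)]."""
    return (p_a_given_x / (1.0 - p_a_given_x)) * ((1.0 - p_a) / p_a)


# Hypothetical sample: (group, x) pairs with one binary covariate x.
data = [("A", 1)] * 30 + [("A", 0)] * 10 + [("B", 1)] * 15 + [("B", 0)] * 45

n = len(data)
n_a = sum(1 for s, _ in data if s == "A")
n_b = n - n_a
p_a = n_a / n  # unconditional P(A)

count_x = Counter(x for _, x in data)
count_ax = Counter(x for s, x in data if s == "A")
count_bx = Counter(x for s, x in data if s == "B")

# "Propensity score" P(A|x), here simply a cell frequency.
p_a_given_x = {x: count_ax[x] / count_x[x] for x in count_x}

# Weight from Bayes' theorem, one value per covariate cell.
weights = {x: dfl_weight(p_a_given_x[x], p_a) for x in count_x}

# Direct ratio of covariate distributions h(x|s=A) / h(x|s=B).
ratio = {x: (count_ax[x] / n_a) / (count_bx[x] / n_b) for x in count_x}

# Average weight over the observations of group B.
mean_weight_b = sum(weights[x] * count_bx[x] for x in count_bx) / n_b

print("weights:", weights)
print("density ratio:", ratio)
print("mean weight in group B:", mean_weight_b)
```

The weights average to one over group B, so the reweighted density of B still integrates to one.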


Empirical case (a semi-parametric approach)

Estimation of the Mexican earnings distribution and its decomposition by sub-populations of workers in the formal and informal sectors (defined by compliance with social security coverage).

(Let us assume that self-selection bias does not affect workers' individual location decisions.) Models are estimated separately for each category.

Logit has a practical advantage over probit: the sum of its predicted values equals the sum of the empirically observed values (Butcher and DiNardo, 1998).

ENEU: Encuesta Nacional de Empleo Urbano (National Survey of Urban Employment).

Males aged 16 to 65. Occupations = (1, …, 4):
1: Formal self-employed
2: Informal self-employed
3: Formal wage-earners
4: Informal wage-earners

Model 1 pooled

Model 2 pooled

Each model is a logit for the sector probability:

$$P(S) = \frac{\exp(x_s\beta_s)}{1+\exp(x_s\beta_s)}$$
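The practical advantage of the logit mentioned on this slide (fitted probabilities add up to the observed number of ones) follows from the first-order condition for the intercept. The Python sketch below shows it on made-up data with one covariate, fitted by Newton-Raphson; the variable names `schooling` and `formal` are hypothetical and the routine is only an illustration, not Stata's estimator.

```python
import math


def fit_logit(x, y, iterations=25):
    """Newton-Raphson for a logit with an intercept b0 and one slope b1."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix [[h00, h01], [h01, h11]].
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: years of schooling and a formal-sector indicator.
schooling = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
formal = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

b0, b1 = fit_logit(schooling, formal)
fitted = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in schooling]

print("sum of fitted probabilities:", sum(fitted))
print("observed number of ones:", sum(formal))
```

The equality holds for any logit that includes a constant; a probit fit does not satisfy it in general.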


1. Compute the earnings distribution using the DFL command.

Syntax

    dfl depvar indepvars [if exp] [in range], outcome(varname) [nbins(integer) w(bandwidth) adaptive gauss quietly probit [logit default] graph(cfactual) graph_combine axis_selection_options axis_scale_options title_options]

    dfl informal esc eda eda2 jefe dmiembros dwmenor drama1 drama3 ///
        drama4 dregion1 dregion2 dregion3 dregion4 dregion6 ///
        if sex==1 & logitp>=1 & logitp<=2, outcome(logwm) nbins(50) ///
        adaptive gauss graph(cfactual)

2. Compute the earnings distribution using a do-file.

Example with my do-file

    * Propensity score estimated by logit with pscore (Becker & Ichino)
    pscore informalb esc eda eda2 jefe dmiembros dwmenor drama1 drama3 drama4 ///
        dregion1 dregion2 dregion3 dregion4 dregion6 if sex==1 & logitp>=1 & logitp<=2, ///
        pscore(mypscore) logit level(0.001)
    * Adaptive kernel density of log earnings, weighted by the estimated score
    akdensity logwm if sex==1 & logitp==4 [aw = mypscore], gau s(i) ///
        gen(hai92c dhai92c)
    lab var dhai92c "Informal wage-earner"
    * Rescale the estimated density by 0.24
    replace dhai92c = dhai92c*.24

Figure 1. Decomposition of density functions for self-employed and wage-earners, Mexico 1992. [Plot: density against log of monthly earnings (pesos, 2000=100); series: Total, Self-employed, Wage-earners.]

Figure 2. Wage-earners in Mexico working in a formal world, 1992. [Panels: "DFL command" and "Do-file rescaled"; each shows factual and counterfactual densities for males and the difference in densities, against log of monthly earnings (pesos, 2000=100).]

Figure 2a. Wage-earners in Mexico working in a formal world, 1992. [Panel: "Do-file rescaled, adjusting ranges"; factual and counterfactual densities for males and the difference in densities, against log of monthly earnings (pesos, 2000=100).]

Figure 3. Self-employed in Mexico working in a formal world, 1992. [Panels: "DFL command" and "Do-file rescaled"; each shows factual and counterfactual densities for males and the difference in densities, against log of monthly earnings (pesos, 2000=100).]

Figure 3a. Self-employed in Mexico working in a formal world, 1992. [Panel: "Do-file rescaled, adjusting ranges"; factual and counterfactual densities for males and the difference in densities, against log of monthly earnings (pesos, 2000=100).]

Figure 4. Informal self-employed males in a formal world, 2002. [Panels: "DFL command" and "Do-file rescaled"; each shows factual and counterfactual densities and the difference in densities, against log of monthly earnings (pesos, 2000=100).]

Figure 5. Informal wage-earner males in a formal world, 2002. [Panels: "DFL command" and "Do-file rescaled"; each shows factual and counterfactual densities and the difference in densities, against log of monthly earnings (pesos, 2000=100).]


Conclusions:

- The user-written DFL command is useful; just watch out when using subgroups or log scales.

- DFL (1996) use the subgroup decomposability property of the aggregate PDF.

- A suggestion: when computing densities, weight them by population shares (if necessary).

- The problem of over-dimensioned densities is most severe when the data are on a logarithmic scale.

- For kernel densities, estimation with the adaptive technique is more time-consuming but also seems to be more accurate (it avoids smoothing more than needed).

- Adaptive kernel estimation depicts bimodal or multimodal distributions better.


References

Huesca, Luis and Mario Camberos (2009), "El mercado laboral mexicano 1992 y 2002: Un análisis contrafactual de los cambios en la informalidad", Economía Mexicana, Vol. XVIII, Núm. 1, primer semestre, pp. 5-43.

DiNardo, John, Nicole Fortin, and Thomas Lemieux (1996), "Labor Market Institutions and the Distribution of Wages, 1973-1992: A semi-parametric approach", Econometrica, 64(5), 1001-44.

Azevedo, Joao Pedro (2005), "DiNardo, Fortin and Lemieux Counterfactual Kernel Density", DFL user-written command.

INEGI (2006), Encuesta Nacional de Empleo Urbano, 1992 and 2002, ENEU, INEGI, Ags., México, Bases de datos.

Jenkins, Stephen and Philippe Van Kerm (2005), "Accounting for income distribution trends: A density function decomposition approach", Journal of Economic Inequality, 3, pp. 43-62.

Silverman, B. W. (1986), Density estimation for statistics and data analysis, Chapman and Hall, London.

Van Kerm, Philippe (2003), "Adaptive kernel density estimation" (akdensity), The Stata Journal, 3(2), 148-56.

Duclos, Jean-Yves (2001), "Non-parametric estimation for distributive analysis", Poverty and Equity: theory and estimation, Departament d'Economia Aplicada, Universitat Autònoma de Barcelona, mimeo, March, 37-44.

Heckman, James, H. Ichimura and P. E. Todd (1998), "Matching as an Econometric Evaluation Estimator", Review of Economic Studies, 65, 261-294.

Becker, Sascha O., and Andrea Ichino (2002), "Estimation of average treatment effects based on propensity scores", The Stata Journal, 2(4), 358-377.

Butcher, K. F. and John DiNardo (1998), "The immigrant and native-born wage distributions: Evidence from United States Census", NBER Working Paper No. 6630.