Detailed Wage Decompositions: Revisiting the Identification Problem*

ChangHwan Kim

August 1, 2012

Department of Sociology
University of Kansas
1415 Jayhawk Blvd., Room 716
Lawrence, KS 66045
Tel: (785) 864-9426
Fax: (785) 864-5280
[email protected]

* Comments from the editor of Sociological Methodology, five anonymous reviewers, Arthur Sakamoto, Daniel Powers, Yu Xie, and Myeong-Su Yun have greatly improved previous drafts. All remaining errors are the author’s sole responsibility.

Source: people.ku.edu/~chkim/paper/NewDetailDcp_17_whole.pdf



Abstract

Yun (2005) suggested an averaging method to resolve the identification problem in detailed decompositions. Since then, detailed decomposition techniques have been widely discussed. This paper shows that the averaging method may not answer the identification problem. The method is built on unrealistic distributional constraints on a set of dummy variables and is sensitive to both the number of groups and the method of grouping. As an alternative, a weighted averaging method is suggested which is in line with Haisken-DeNew and Schmidt’s (1997) two-step re-normalization. This method gives a distinct meaning to the intercept term and makes detailed decomposition feasible using a reasonable assumption. However, this paper underscores that there are multiple solutions to the identification problem, and that other detailed decomposition methods may be acceptable depending on theoretical or practical considerations.

Keywords: Blinder-Oaxaca Decomposition, Detailed Decomposition, Averaging Method, Grand-Mean Weighting Method

1 Introduction

Blinder-Oaxaca (BO) wage decomposition methods have been extensively applied in sociology and economics (e.g., Dodoo 1991; Farkas and Vicknair 1996; Sakamoto et al. 2000; DeLeire 2001; Sayer 2004; Van Hook et al. 2004; Phillips and Sweeney 2006; Stearns et al. 2007; Berends and Penaloza 2008). Although several variants of wage decompositions have been developed since the initial work of Blinder (1973) and Oaxaca (1973),¹ all these techniques share the basic idea that the mean wage gap between two groups at a given time can be broken down into two components (Jones and Kelley 1984; Cotton 1985). The first component is the coefficient effect and the second component is the endowment effect. On top of the two-component decomposition, researchers have often reported the contributions of individual variables or of combined sets of dummy variables after implementing a BO decomposition (e.g., Fields and Wolff 1995; DeLeire 2001; Bobbitt-Zeher 2007; Chang and England 2010). However, these detailed decompositions can be erroneous, because the contribution of each individual variable or set of dummy variables changes with the choice of the reference group. This is the identification problem, a much-discussed (Jones 1983; Jones and Kelley 1984; Clogg and Eliason 1986; Oaxaca and Ransom 1999; Horrace and Oaxaca 2001; Gardeazabal and Ugidos 2004; Yun 2005) yet often ignored pitfall of detailed decomposition using BO techniques.

To resolve the identification problem, Gardeazabal and Ugidos (2004) suggested normalizing the estimated coefficients, and Yun (2005) proposed an averaging method as a simple alternative to the unwieldy normalization. Since then, detailed BO-type decomposition techniques have been widely discussed in sociology and economics (e.g., Bhaumik et al. 2006; Powers and Yun 2009; Kim 2010; Fortin et al. 2010; Powers et al. forthcoming). The averaging method is, however, based on unrealistic distributional constraints on a set of dummy variables. In this paper, I suggest the application of an alternative method and consider its limitations.

¹ Decomposition techniques were originally developed by sociologists and demographers (e.g., Kitagawa 1955). See Powers and Yun (2009:234-235) for a summary history.

2 Identification Problem and Averaging Method

Using ordinary least squares (OLS) regression, we can estimate the log wage, $y$, as follows:

$$y = a + \sum_{j=1}^{J} \sum_{k=1}^{K_j} b_{jk} x_{jk} + e \tag{1}$$

where $j = 1, 2, 3, \ldots, J$ and $k = 1, 2, 3, \ldots, K_j$; $j$ refers to the $j$th factor, and $k$ represents the $k$th level of each of these $J$ factors. In equation 1, each factor includes a complete set of dummies, including the typically left-out reference group. For the reference group, $E[b_{jk} x_{jk} \mid x_{j1} = 1]$ takes a value of zero, which implies the identification constraint $b_{j1} = 0$. For simplicity, I restrict $J = 1$ in this paper, so that equation 1 becomes $y = a + \sum_{k=1}^{K} b_k x_k + e$.

Consider a comparison of white workers (group W) and black workers (group B). Estimating equation 1 separately for the two groups, the mean wage gap between groups W and B ($\bar{y}^W - \bar{y}^B$) can be partitioned into several components as follows:

$$\bar{y}^W - \bar{y}^B = \underbrace{\underbrace{\left(a^W - a^B\right)}_{\text{D1A}} + \underbrace{\sum_{k=1}^{K} \left(b_k^W - b_k^B\right) \bar{x}_k^B}_{\text{D1B}}}_{\text{D1}} + \underbrace{\sum_{k=1}^{K} \left(\bar{x}_k^W - \bar{x}_k^B\right) b_k^W}_{\text{D2}} \tag{2}$$

where D1 is the sum of D1A, which denotes the intercept component, and D1B, which denotes the coefficient component. D1 represents the total coefficient effect. D2 refers to the total endowment effect. By definition in OLS, the mean residual difference between $e^W$ and $e^B$ is zero.

In a regression model, the estimated intercept varies with the choice of reference group, and so does the intercept component (D1A) in equation 2. As the reference group changes, the estimated coefficients $b_k$ change accordingly. Therefore, estimates of the extent to which individual factors and factor levels contribute to the mean wage gap between groups W and B also vary with the choice of reference group. This is the identification problem in detailed decompositions.
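To make the reference-group dependence concrete, the following is a minimal Python sketch, not the author's code. It simulates two hypothetical groups with a single three-level factor (sample sizes, shares, and wage levels are all invented for illustration), applies equation 2 with each level in turn as the omitted category, and prints D1A, the level-specific D1B terms, and the level-specific D2 terms. The helper names `fit_dummy_ols` and `bo_decompose` are assumptions introduced here.

```python
import numpy as np

def fit_dummy_ols(y, levels, K, ref):
    """OLS of y on an intercept and K-1 dummies; returns (a, b) with b[ref] = 0."""
    keep = [k for k in range(K) if k != ref]
    X = np.column_stack([np.ones(len(y))] + [(levels == k).astype(float) for k in keep])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    b = np.zeros(K)
    b[keep] = coef[1:]
    return coef[0], b

def bo_decompose(yW, lW, yB, lB, K, ref):
    """Equation 2 with level `ref` omitted: returns (D1A, D1B per level, D2 per level)."""
    aW, bW = fit_dummy_ols(yW, lW, K, ref)
    aB, bB = fit_dummy_ols(yB, lB, K, ref)
    xW = np.bincount(lW, minlength=K) / len(lW)  # group W shares of each level
    xB = np.bincount(lB, minlength=K) / len(lB)  # group B shares of each level
    return aW - aB, (bW - bB) * xB, (xW - xB) * bW

# Simulated, purely hypothetical data.
rng = np.random.default_rng(0)
K = 3
lW = rng.choice(K, size=2000, p=[0.2, 0.3, 0.5])
lB = rng.choice(K, size=2000, p=[0.4, 0.4, 0.2])
yW = 2.8 + np.array([0.0, 0.2, 0.5])[lW] + rng.normal(0, 0.3, 2000)
yB = 2.6 + np.array([0.0, 0.1, 0.3])[lB] + rng.normal(0, 0.3, 2000)

results = {ref: bo_decompose(yW, lW, yB, lB, K, ref) for ref in range(K)}
for ref, (d1a, d1b, d2) in results.items():
    print("ref", ref, "D1A", round(d1a, 3), "D1B", np.round(d1b, 3), "D2", np.round(d2, 3))
```

Under these assumptions the totals D1 = D1A + ΣD1B and ΣD2 agree across the three runs, while D1A and the individual D1B and D2 entries shift with the omitted category.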


To resolve the identification problem, Gardeazabal and Ugidos (2004) suggest normalizing the coefficients of the dummy variables by imposing the restriction $\sum b_k = 0$. As a simple way to impose such a restriction, Yun (2005) proposes the following averaging method:

$$\begin{aligned} y &= \left(a + \bar{b}\right) + \sum_{k=1}^{K} \left(b_k - \bar{b}\right) x_k + e \\ &= a' + \sum_{k=1}^{K} b'_k x_k + e \end{aligned} \tag{3}$$

where $a'$ refers to $a + \bar{b}$ and $b'_k$ refers to $(b_k - \bar{b})$. Here $\bar{b}$ is computed as the sum of the $b_k$ divided by the number of factor levels; that is, $\bar{b} = \sum_{k=1}^{K} b_k / K$. For the reference group of equation 1, $b_1$ is zero by definition, thus the transformed coefficient $b_1 - \bar{b}$ is equal to $-\bar{b}$. This transformation of $b_k$ causes $\sum_{k=1}^{K} b'_k$ to become zero. In ANOVA, the restriction $\sum b'_k = 0$ is referred to as a sigma constraint. It is one of a number of possible identifying normalizations (Fox 2008:145). Under this constraint, the constant or intercept term is a generalized grand mean (i.e., the mean of the means), and the effects $b'_k$ are deviations from it.

As a result of this normalization, neither the new coefficients for the independent variables, $(b_k - \bar{b})$, nor the new intercept, $a + \bar{b}$, depends on the choice of reference group. As the coefficient $b_1$ for the reference group becomes $-\bar{b}$, no group is omitted. Because equation 3 has exactly the same structure as equation 1, the BO decomposition technique for equation 2 can also be applied to equation 3.
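The transformation in equation 3 is short enough to sketch directly. The Python snippet below is an illustration under stated assumptions: the intercept and coefficients are made-up numbers, and `averaging_normalize` is a hypothetical helper name. It normalizes the same fitted model expressed under two different reference groups and prints both results so they can be compared.

```python
import numpy as np

def averaging_normalize(a, b):
    """Averaging method of equation 3: a' = a + mean(b), b'_k = b_k - mean(b).

    `b` holds one coefficient per factor level, with 0.0 for the omitted level.
    """
    b = np.asarray(b, dtype=float)
    b_bar = b.mean()
    return a + b_bar, b - b_bar

# Hypothetical estimates with level 0 as the reference group (b[0] = 0).
a_ref0, b_ref0 = 2.50, np.array([0.0, 0.20, 0.50])

# The same fitted model re-expressed with level 2 as the reference group.
a_ref2, b_ref2 = a_ref0 + b_ref0[2], b_ref0 - b_ref0[2]

a0, b0 = averaging_normalize(a_ref0, b_ref0)
a2, b2 = averaging_normalize(a_ref2, b_ref2)
print(a0, b0)
print(a2, b2)
```

Both calls return the same normalized intercept and coefficients, and the normalized coefficients sum to zero, which is the sigma constraint described above.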

On the surface, the averaging method appears to solve the identification problem. In fact, the averaging method does not resolve it but merely conceals it. In equation 3, the intercept $a'$ is the expected wage given $x_k = 1/K$ for $k = 1, \ldots, K$. That is, $E[y \mid x_k = 1/K] = a'$. The difference between the intercepts of the two groups, $a'^{W} - a'^{B}$, is the expected wage difference between hypothetical groups W and B, assuming the means of all $x$’s equal $1/K$. The normalized coefficients $b'_k$ represent the expected difference in $y$ between a group whose $x_k = 1$ for a specific $k$ and the hypothetical reference point (or group) for which the mean of $x_k$ is equal to $1/K$ for all $k$. That is, $b'_k = E[y \mid x_k = 1] - a'$.

The identification problem occurs because the estimates of the intercept and of all other coefficients change with an arbitrary change of reference group. If the intercept of one possible reference group (e.g., $E[y \mid x_1 = 1, x_2 = x_3 = \cdots = x_K = 0]$ when $x_1$ is the reference group) and that of another possible reference group (e.g., $E[y \mid x_2 = 1, x_1 = x_3 = \cdots = x_K = 0]$ when $x_2$ is the reference group) are arbitrary, then the intercept of the averaging method (i.e., $E[y \mid x_1 = x_2 = \cdots = x_K = 1/K]$, where the reference is a hypothetical point at which the means of all $x$’s equal $1/K$) is also arbitrary in essence.

Some will argue that the averaging method nonetheless offers a normalized, standardized, and detailed decomposition. They will say that it is better to use $1/K$ than to cherry-pick the reference group. However, the averaging method is sensitive to the number of factor levels, $K$. This sensitivity leads to another kind of identification problem. Suppose a researcher uses four factor levels for education. One possible four-level grouping would be LTHS (less than high school), HSG (high school graduate), SC (some college), and BA+ (bachelor’s degree or higher), in which case the intercept calculated using the averaging method is $E(y \mid x_{\mathrm{LTHS}} = x_{\mathrm{HSG}} = x_{\mathrm{SC}} = x_{\mathrm{BA+}} = .25)$. If the researcher changes her education categories into five, dividing BA+ into BA and Grad (graduate degrees), the intercept of the averaging method changes to $E(y \mid x_{\mathrm{LTHS}} = x_{\mathrm{HSG}} = x_{\mathrm{SC}} = x_{\mathrm{BA}} = x_{\mathrm{Grad}} = .20)$. In the former case, the hypothetical reference point assigns workers with a bachelor’s degree or higher a 25% share, whereas in the second grouping it assigns them a 40% share. As $K$ changes, so does the intercept estimate, as well as all the other coefficients.

The averaging method is sensitive not only to the number of groups, but also to the method of grouping. Suppose a researcher uses another four-level grouping, which combines LTHS and HSG into <HSG and divides BA+ into BA and Grad, so that the four groups are <HSG, SC, BA, and Grad. In the previous four-level grouping, BA+ would include a 25% share of workers, whereas in the new grouping BA and Grad together would include a 50% share. As the hypothetical distribution of $x$ for the intercept differs as a function of the grouping method, almost all estimated effects differ as well, including the intercept effect (D1A), the sum of coefficient effects (sum of D1B), the coefficient effects (D1B) of each factor level, and the endowment effects (D2) of each factor level. Only the sum of D1 and the sum of D2 are consistently estimated regardless of model specification. Note that the original BO decomposition also yields the same results regarding the sums of D1 and D2.
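The 25%, 40%, and 50% shares quoted above follow from simple counting, which the short Python sketch below reproduces. The function name `implied_share` is an assumption made for this illustration; the observed share is the sum of the grand means for BA and Grad reported in Table 1.

```python
from fractions import Fraction

def implied_share(grouping, target_levels):
    """Share the averaging method implicitly assigns to `target_levels`.

    Every category in `grouping` gets weight 1/K, so a set of categories
    receives (number of matching categories) / K.
    """
    K = len(grouping)
    return Fraction(sum(level in target_levels for level in grouping), K)

college = {"BA", "Grad", "BA+"}
groupings = {
    "four levels": ["LTHS", "HSG", "SC", "BA+"],
    "five levels": ["LTHS", "HSG", "SC", "BA", "Grad"],
    "regrouped four": ["<HSG", "SC", "BA", "Grad"],
}
observed = 0.2564 + 0.1214  # grand-mean share of BA plus Grad in Table 1

shares = {name: implied_share(levels, college) for name, levels in groupings.items()}
for name, share in shares.items():
    print(name, float(share), "observed:", round(observed, 4))
```

None of the three implied shares matches the observed share of roughly 38%, and the implied share moves whenever the categories are redrawn.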

There is also an identification problem with dichotomous variables (Yun 2005). Which values should be coded as 1 and 0 is an arbitrary decision made by the researcher. With the averaging method, the intercept for the dichotomous variables is the value expected when the proportions of $x = 1$ and $x = 0$ are equal (i.e., .50). This simply is not the case for many variables of interest, such as union membership.

In sum, despite contentions to the contrary, the averaging method suffers from the same identification problem as the original BO decomposition. Different numbers of groups and different methods of grouping can lead to substantially divergent results.

3 A Suggested Alternative Approach: The Grand-Mean Weighting Method

If neither the current BO decomposition nor the averaging method can solve the identification problem, are detailed decompositions feasible at all? They are said to be infeasible because the intercept term in regression models depends on the choice of reference group. In fact, any normalization with the linear restriction of the parameters $\sum_{k=1}^{K} b^{\dagger}_k w_k = 0$ will yield model-free decomposition estimates. There are an infinite number of possible restrictions. The averaging method (where the weighting factors $w_k$ are equal to $1/K$) is a special case of these restrictions. Even the original BO decomposition can be considered a special case of the restriction, with $w_1 = 1$ for $k = 1$ (factor level 1 is the reference group) and $w_k = 0$ for all other $k$. Given the restriction $\sum_{k=1}^{K} b^{\dagger}_k w_k = 0$, no detailed decomposition is mathematically wrong. The feasibility of detailed decomposition does not hinge on mathematical tweaks, but rather on acceptable restrictions involving the weighting factors $w_k$.
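The family of restrictions described here can be sketched as one function with interchangeable weights. The Python snippet below is illustrative only: the coefficients and the third weight vector are hypothetical, `weighted_normalize` is a name introduced for this example, and the requirement that the weights sum to one is an assumption that holds in both special cases named in the text.

```python
import numpy as np

def weighted_normalize(a, b, w):
    """Impose sum_k w_k * b_k = 0 by shifting the intercept and the coefficients."""
    b = np.asarray(b, dtype=float)
    w = np.asarray(w, dtype=float)
    if not np.isclose(w.sum(), 1.0):
        raise ValueError("weights are assumed to sum to 1")
    shift = float(w @ b)
    return a + shift, b - shift

a, b = 2.50, np.array([0.0, 0.20, 0.50])  # hypothetical, level 0 as reference
K = len(b)
cases = {
    "original BO (reference = level 0)": np.array([1.0, 0.0, 0.0]),
    "averaging (w_k = 1/K)": np.full(K, 1.0 / K),
    "observed shares (hypothetical)": np.array([0.3, 0.5, 0.2]),
}
normalized = {label: weighted_normalize(a, b, w) for label, w in cases.items()}
for label, (a_new, b_new) in normalized.items():
    print(label, round(a_new, 4), np.round(b_new, 4))
```

Each choice of weights satisfies its own restriction and leaves the fitted value of every factor level unchanged; only the split between intercept and coefficients differs, which is why the choice has to be argued on substantive grounds.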

As an alternative approach, I suggest applying a weighted averaging method in which the weighting factors are grand means. The weighted normalization is in line with the two-step re-normalization proposed by Haisken-DeNew and Schmidt (1997) and the restricted least squares method discussed by Greene and Seaks (1991). I will call this the grand-mean (GM) weighting method. In what follows, I first discuss the mathematical advantage of the GM weighting method and then turn to its theoretical implications. To carry out a detailed decomposition using the GM weighting method, it is necessary to transform the estimated regression coefficients of equation 1, $b_k$, as follows:

$$\begin{aligned} y &= \left(a + \bar{b}^{*}\right) + \sum_{k=1}^{K} \left(b_k - \bar{b}^{*}\right) x_k + e \\ &= a^{*} + \sum_{k=1}^{K} b^{*}_k x_k + e, \qquad \text{where } \bar{b}^{*} = \sum_{k=1}^{K} b_k \bar{x}_k. \end{aligned} \tag{4}$$

Yun (2005) used a simple arithmetic mean to compute $\bar{b}$ in equation 3. The normalization of the averaging method by $\sum_{k=1}^{K} b'_k = 0$ is equivalent to the normalization based on $\sum_{k=1}^{K} b'_k (1/K) = 0$. The GM weighting method, instead, normalizes the estimated coefficients by imposing $\sum_{k=1}^{K} b^{*}_k \bar{x}_k = 0$, where $\bar{x}_k$ refers to the grand mean of $x_k$ over both group W and group B. That is, $\bar{b}^{*}$ is the grand-mean-weighted sum of the $b_k$ in equation 4. After these transformations, the usual BO decomposition techniques can be applied.
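As a rough illustration of applying the BO decomposition after the GM transformation, the Python sketch below pools two simulated groups to obtain grand means, normalizes each group's coefficients as in equation 4, and then forms D1A, D1B, and D2. Everything about the data is hypothetical, and `fit` and `gm_decompose` are names assumed for this example rather than the author's implementation.

```python
import numpy as np

def fit(y, levels, K, ref):
    """OLS of y on an intercept and K-1 dummies; returns (a, b) with b[ref] = 0."""
    keep = [k for k in range(K) if k != ref]
    X = np.column_stack([np.ones(len(y))] + [(levels == k).astype(float) for k in keep])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    b = np.zeros(K)
    b[keep] = coef[1:]
    return coef[0], b

def gm_decompose(yW, lW, yB, lB, K, ref):
    """BO decomposition after GM weighting; `ref` is the level omitted in estimation."""
    grand = np.bincount(np.concatenate([lW, lB]), minlength=K) / (len(lW) + len(lB))
    xW = np.bincount(lW, minlength=K) / len(lW)
    xB = np.bincount(lB, minlength=K) / len(lB)
    normalized = []
    for y, levels in ((yW, lW), (yB, lB)):
        a, b = fit(y, levels, K, ref)
        shift = b @ grand                  # grand-mean-weighted sum of b_k
        normalized.append((a + shift, b - shift))
    (aW, bW), (aB, bB) = normalized
    return aW - aB, (bW - bB) * xB, (xW - xB) * bW

# Simulated, purely hypothetical data.
rng = np.random.default_rng(1)
K = 3
lW = rng.choice(K, size=2000, p=[0.2, 0.3, 0.5])
lB = rng.choice(K, size=2000, p=[0.4, 0.4, 0.2])
yW = 2.8 + np.array([0.0, 0.2, 0.5])[lW] + rng.normal(0, 0.3, 2000)
yB = 2.6 + np.array([0.0, 0.1, 0.3])[lB] + rng.normal(0, 0.3, 2000)

gm_results = {ref: gm_decompose(yW, lW, yB, lB, K, ref) for ref in range(K)}
for ref, (d1a, d1b, d2) in gm_results.items():
    print("ref", ref, "D1A", round(d1a, 4), "D1B", np.round(d1b, 4), "D2", np.round(d2, 4))
```

In this sketch every component, not just the totals, is the same whichever level is omitted during estimation.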

The insensitivity of $a^{*}$ to the choice of reference group can easily be proved. For simplicity, assume there is only one dummy variable on the right side of the equation. When $x_0$ is the reference group, the regression model looks like equation 5a, and as we change the reference group to $x_1$, the estimated regression model becomes equation 5b.

$$y = a + b x_1 + e \tag{5a}$$

$$y = (a + b) + (-b) x_0 + e \tag{5b}$$

Applying equation 4, equations 5a and 5b are transformed to equations 6a and 6b, respectively:

$$y = (a + b\bar{x}_1) + (0 - b\bar{x}_1) x_0 + (b - b\bar{x}_1) x_1 + e \tag{6a}$$

$$y = (a + b + [-b\bar{x}_0]) + (-b - [-b\bar{x}_0]) x_0 + (0 - [-b\bar{x}_0]) x_1 + e \tag{6b}$$

The intercept of equation 6a is $(a + b\bar{x}_1)$ and that of equation 6b is $(a + b + [-b\bar{x}_0])$. Because $x_1$ is a dichotomous variable, $x_0 = 1 - x_1$ and $\bar{x}_0 = 1 - \bar{x}_1$. If we replace $\bar{x}_0$ with $1 - \bar{x}_1$ in the intercept of 6b, it reduces to $(a + b\bar{x}_1)$, which is identical to the intercept of 6a. This proves that the intercept of the GM weighting method is insensitive to the choice of reference group.
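The substitution can be written out term by term. The lines below are a worked check that uses only the symbols already defined in equations 6a and 6b, together with $\bar{x}_0 = 1 - \bar{x}_1$:

$$\begin{aligned}
\text{intercept of 6b:}\quad & a + b + [-b\bar{x}_0] = a + b - b(1 - \bar{x}_1) = a + b\bar{x}_1 \\
\text{coefficient on } x_0 \text{ in 6b:}\quad & -b - [-b\bar{x}_0] = -b(1 - \bar{x}_0) = -b\bar{x}_1 = 0 - b\bar{x}_1 \\
\text{coefficient on } x_1 \text{ in 6b:}\quad & 0 - [-b\bar{x}_0] = b(1 - \bar{x}_1) = b - b\bar{x}_1
\end{aligned}$$

So the two slope terms of 6b, as well as its intercept, coincide with those of 6a.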

The intercept effect (D1A) of the GM weighting method quantifies the extent to which members of the disadvantaged group are, on average, treated differently than members of the advantaged group in a society. Thus, this value can also be interpreted as the average extent of discrimination (assuming no unobserved heterogeneity).

The estimated contribution of the individual factor level (D1B) of the GM weighting method indicates a deviation from the mean discrimination level. The sum of D1B of the GM weighting method is close to zero.² When there is group-based discrimination, all members of the disadvantaged group suffer to a similar degree. If the extent of discrimination varies greatly within a given minority group, it would be hard to label the discrimination as group-based. Therefore, small D1B effects, along with a large intercept effect, are what one would expect if there is group-based discrimination.

The sums of D2 are identical between the averaging method and the GM weighting method. Unlike the averaging method, however, the GM weighting method yields consistent estimates of D2 for individual factor levels regardless of model specification.

² The sum of D1B of the GM weighting method will be meaningfully different from zero only if the value of $x$ for each group differs from the grand mean. A similar limitation is inevitable in all detailed decomposition methods. For the averaging method, the sum of D1B can be substantively large only if the distribution of $x$ differs from $1/K$, and the sum of D1B becomes larger as a researcher arbitrarily applies a model specification that increases $\sum |x_k - 1/K|$. For the original BO method, the sum of D1B will be near zero if the proportion of the reference group approaches 1, and conversely it becomes larger as a researcher arbitrarily chooses a reference group whose proportion (i.e., $x_k$) is smaller than that of other groups.

Table 1: Descriptive Statistics

                                Total     White     Black     Gap
                                x̄         xW        xB        xW − xB
Log Wage, y                     3.0572    3.0811    2.7968    .2843

Less Than High School (LTHS)    .0428     .0406     .0667     -.0261
High School Graduate (HSG)      .2958     .2895     .3643     -.0748
Some College (SC)               .2836     .2810     .3118     -.0308
Bachelor Degree (BA)            .2564     .2633     .1822     .0811
Graduate Degree (Grad)          .1214     .1256     .0750     .0506

25-34                           .3273     .3277     .3231     .0046
35-44                           .3388     .3380     .3481     -.0101
45-54                           .3339     .3343     .3289     .0054

Never Married                   .2298     .2192     .3451     -.1259
Currently Married               .6433     .6581     .4827     .1754
Widow/Divorce/Separated         .1268     .1227     .1722     -.0495

4 An Illustrative Example

Using the 2009 Current Population Survey–Monthly Outgoing Rotation Group (CPS-MORG), I decompose the log wage gap (.284 log dollars) between white male workers and black male workers. To examine the sensitivity of decomposition results to model specifications, I estimate three models, applying both the averaging method and the GM weighting method.

Table 1 shows the grand means and the two group means. Table 2 presents the decomposition results.³ Model 1 decomposes the racial gap into five educational levels, three age groups, and three marital statuses. In Model 2, LTHS and HSG are collapsed into <HSG and marital status is divided into two categories. Model 3 has the same specification as Model 2 except for the educational categories: BA and Grad are collapsed into BA+, and LTHS and HSG are separately identified.

³ The variance-covariance matrix of the averaging method is discussed in detail by Yun (2008). The same technique can be applied to the GM weighting method with slight modification. The variance-covariance matrix of the normalized regression coefficients of the averaging method is computed as $\Sigma_{b'} = W \Sigma_{B0} W'$, where $W$ is a weight matrix and $\Sigma_{B0}$ is a reformatted variance-covariance matrix of the original regression coefficients ($\Sigma_B$). For the GM weighting method, everything except the weighting matrix $W^{*}$ is the same as in the averaging method. The weighting matrix needs to be rebuilt by replacing the set of matrices of $1/K_j$ for each factor with a set of matrices of weighting values using grand means. The new coefficient for the GM weighting method is obtained by taking the diagonal of $W^{*}B$. The variance-covariance matrix of the new coefficients is computed as $\Sigma^{*}_{b} = W^{*} \Sigma_{B*} W^{*\prime}$.
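The general idea behind the covariance formula in footnote 3 is that a linear transformation $T$ of the estimates has covariance $T \Sigma T'$. The Python sketch below illustrates that idea for a single factor. It is not Yun's (2008) construction of $W$: the matrix `T`, the helper name `normalization_matrix`, the covariance values, and the weights are all assumptions made here, and the weights are treated as fixed constants rather than estimates.

```python
import numpy as np

def normalization_matrix(w):
    """Linear map T with (a', b') = T (a, b) for fixed weights w that sum to 1."""
    w = np.asarray(w, dtype=float)
    K = len(w)
    T = np.zeros((K + 1, K + 1))
    T[0, 0] = 1.0
    T[0, 1:] = w                                     # a' = a + sum_k w_k b_k
    T[1:, 1:] = np.eye(K) - np.outer(np.ones(K), w)  # b'_k = b_k - sum_k w_k b_k
    return T

# Hypothetical covariance of the estimated (a, b_2, b_3) with level 1 omitted.
cov_est = np.array([
    [ 0.0004, -0.0004, -0.0004],
    [-0.0004,  0.0009,  0.0004],
    [-0.0004,  0.0004,  0.0012],
])

# Pad with a zero row and column for the omitted coefficient b_1 = 0.
cov_full = np.zeros((4, 4))
idx = [0, 2, 3]
cov_full[np.ix_(idx, idx)] = cov_est

w = np.array([0.3, 0.5, 0.2])  # hypothetical weights, e.g. grand means
T = normalization_matrix(w)
cov_new = T @ cov_full @ T.T   # covariance of (a', b'_1, b'_2, b'_3)

print(np.round(np.sqrt(np.diag(cov_new)), 4))  # standard errors after normalization
```

Because the normalized coefficients satisfy the weighted restriction exactly, the weighted sum of the normalized coefficients has zero variance in this sketch, which is a convenient sanity check on the transformation matrix.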


In all three models, the sum of D1 (D1A + D1B) and the sum of D2 are identical between the two decomposition methods. The sum of D2 for each factor (e.g., $\sum$ D2 of Edu Effect) is also identical across methods. All other estimated decomposition components, however, differ by decomposition method.

Importantly, the decomposition results of the GM weighting method are consistent across model specifications, while those of the averaging method vary substantially across models. The intercept component (D1A) of the averaging method is .175 in Model 1, but it becomes .167 in Model 2 and .200 in Model 3. In contrast, the intercept components using the GM weighting method barely change across the models.

The coefficient effects (D1) of individual factors and factor levels are not consistently estimated with the averaging method. For example, in Model 1, which uses five educational categories, the sum of D1 of education is .005. When the education factor levels are reduced to four categories in Model 2, the sum of D1 of education becomes .023. When I modify the classification of education factors again in Model 3, the effect turns out to be -.012. These results imply that, with the averaging method, we cannot determine whether the coefficient effect of education contributes to reducing or to increasing the racial gap. As another example, the sum of D1 for two educational factor levels, LTHS and HSG, of the averaging method is .014 in Model 1, but it is .024 in Model 2 and .007 in Model 3. Unlike the averaging method, the GM weighting method yields consistent estimates of the effects of D1 for individual factors and factor levels under the different classifications of factor levels.

The averaging method does not produce consistent estimates of the endowment effects (D2) for individual factor levels either. The estimated effect of D2 for BA and Grad combined in the averaging method is either .028, .039, or .049 depending on the model specification, while that in the GM weighting method is almost identical across models. When a dichotomous variable is used, the averaging method reports equal effects for the two dichotomous categories by design. As a result, the D2’s of married and not-married are both .016 in Model 2. However, when three marital status factor levels are used in Model 1, the D2 of not married (i.e., the combination of never-married and widowed/divorced/separated) is smaller than that of married. Unlike the averaging method, the GM weighting method again produces consistent amounts of D2 for married regardless of model.

Table 2: Decomposition of the Mean Log Wage Gap between White and Black Men Using Averaging and GM Weighting Methods

                          Averaging (Equation 3)  GM weighting (Equation 4)
                          D1          D2          D1          D2

A. Model 1
LTHS                      .005        .010        .005        .010
HSG                       .009        .015**      .010*       .016**
SC                        -.004       .002        -.003       .002
BA                        .002        .019**      .003        .018**
Grad                      -.007       .020**      -.007       .019*
[Σ Edu Effect]            [.005]      [.066]**    [.007]**    [.066]**
25-34                     -.007       -.001       -.007       -.001
35-44                     .004        .000        .004        .000
45-54                     .003        .001        .003        .001
[Σ Age Effect]            [.000]      [-.001]**   [.001]*     [-.001]**
Never Married             -.011*      .012*       -.015**     .019**
Currently Married         .016**      .020**      .009**      .012**
Wid/Div/Sep               .000        .001        -.003       .003
[Σ Marriage Effect]       [.005]*     [.034]**    [-.009]**   [.034]**
Intercept                 .175**                  .187**
[Total]                   [.185]**    [.099]**    [.185]**    [.099]**

B. Model 2
<HSG                      .024**      .031*       .017**      .024
SC                        .001        .005        -.004       .002
BA                        .005        .012        .002        .018
Grad                      -.006       .016        -.008       .019
[Σ Edu Effect]            [.023]**    [.063]**    [.007]**    [.063]**
25-34                     -.007       -.001       -.007       -.001
35-44                     .004        .000        .004        .000
45-54                     .004        .001        .004        .001
[Σ Age Effect]            [.000]      [-.001]**   [.001]      [-.001]**
Currently Not-married     -.013**     .016        -.017**     .021
Currently Married         .012**      .016        .009**      .012
[Σ Marriage Effect]       [-.001]**   [.032]**    [-.008]**   [.032]**
Intercept                 .167**                  .190**
[Total]                   [.189]**    [.095]**    [.189]**    [.095]**

C. Model 3
LTHS                      .004        .008        .005        .010
HSG                       .003        .008        .009*       .016
SC                        -.009       -.001       -.004       .002
BA+                       -.009       .049**      -.005       .035**
[Σ Edu Effect]            [-.012]     [.064]**    [.005]      [.064]**
25-34                     -.007       -.001       -.007       -.001
35-44                     .005        .000        .005        .000
45-54                     .002        .001        .002        .001
[Σ Age Effect]            [.000]      [-.001]**   [.000]      [-.001]**
Currently Not-married     -.014**     .016        -.018**     .021
Currently Married         .013**      .016        .009**      .012
[Σ Marriage Effect]       [-.001]**   [.033]**    [-.009]**   [.033]**
Intercept                 .200**                  .191**
[Total]                   [.188]**    [.096]**    [.188]**    [.096]**

Another noteworthy point concerns the effects of age. Unlike other variables, the estimated effects D1 and D2 are near zero in both methods. This is simply because the proportion of each age group happens to be close to 1/K (i.e., 1/3) for both racial groups. In short, this illustration clearly shows that the detailed decompositions of the GM weighting method are model-free and consistent, while those of the averaging method are sensitive to model specifications.
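The claim about age can be checked against Table 1 by measuring how far each factor's distribution lies from the uniform distribution that the averaging method assumes. The Python sketch below computes $\sum |x_k - 1/K|$, the quantity mentioned in footnote 2, for each factor and each column of Table 1. The shares are copied from Table 1; the function name `uniform_deviation` is an assumption made for this example.

```python
# Shares copied from Table 1 (columns: total, white, black).
table1 = {
    "education": {
        "total": [0.0428, 0.2958, 0.2836, 0.2564, 0.1214],
        "white": [0.0406, 0.2895, 0.2810, 0.2633, 0.1256],
        "black": [0.0667, 0.3643, 0.3118, 0.1822, 0.0750],
    },
    "age": {
        "total": [0.3273, 0.3388, 0.3339],
        "white": [0.3277, 0.3380, 0.3343],
        "black": [0.3231, 0.3481, 0.3289],
    },
    "marital status": {
        "total": [0.2298, 0.6433, 0.1268],
        "white": [0.2192, 0.6581, 0.1227],
        "black": [0.3451, 0.4827, 0.1722],
    },
}

def uniform_deviation(shares):
    """Sum of absolute gaps between observed shares and the uniform share 1/K."""
    K = len(shares)
    return sum(abs(s - 1.0 / K) for s in shares)

deviation = {
    factor: {col: uniform_deviation(shares) for col, shares in cols.items()}
    for factor, cols in table1.items()
}
for factor, cols in deviation.items():
    print(factor, {col: round(value, 4) for col, value in cols.items()})
```

Age is nearly uniform in every column, so equal weights and grand-mean weights almost coincide for that factor, whereas education and marital status are far from uniform.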

5 The Best Practice

Even though I suggest applying the GM weighting method for detailed decomposition, I hasten to add that no method, including the GM weighting method, can ultimately solve the identification problem and thus be universally applied to all situations. Recall that given the restriction $\sum_{k=1}^{K} b^{\dagger}_k w_k = 0$, all detailed decompositions are mathematically correct. Different restrictions lead to different interpretations of the estimates. Thus, the most essential question is which method is best and what principles should be applied when choosing a specific decomposition method.

If the task is to compare groups within a nation on economic performance, I argue that the GM weighting method is generally preferable. According to this method, the currently observed distribution of $x$ per se should be accepted as given. The intercept terms in equation 4 measure the expected wage when $x$ is distributed as $\bar{x}$. The reason why $w_k$ should be the grand mean rather than a group-specific mean (or other weighting factor) is that the wage is determined by supply and demand in the whole labor force of a society, not in a specific group. For example, the supply of highly educated workers is best measured by $\bar{x}_{\mathrm{Grad}}$, not by $\bar{x}^{W}_{\mathrm{Grad}}$ or $\bar{x}^{B}_{\mathrm{Grad}}$. If the currently observed labor market reflects an equilibrium in employment, which in turn affects wages, the most reasonable and practical assumption about the current status of the labor market is $\bar{x}$.⁴ The grand means need not be those of the two groups (W and B); their estimation can include other groups not of interest in the current study. Which distribution of $x$ yields the most realistic estimate of actual wages depends upon the researcher’s judgment.

When the sample in a given dataset is representative, the grand means are unbiased and consistent estimates of $E(x)$ for a given population. As far as $E(x)$ is considered a reflection of the current social conditions in a given society, the GM weighting method accurately estimates the extent to which each factor and factor level contributes to the group differences under these social conditions.

If there are theoretical or practical reasons to define a specific group as the reference group, the original BO decomposition method can be applied. For example, suppose a researcher conducts a detailed decomposition of a wage difference, with a particular focus on college premiums. College premiums are defined as the net difference in outcomes between HSG and BA. Therefore, HSG may be the natural choice for the reference group. Even in cases such as this, however, I recommend combining the original BO method with the GM weighting method. The dummy coding used in the original BO method need be applied only to the education factor, with the GM weighting method applied to all the others. This is because, for the other factors, there are no strong theoretical or practical reasons to pick a certain factor level (e.g., the Pacific region) as the reference for computing college premiums. One caveat researchers should bear in mind in interpreting the college premium effect of the BO decomposition is that the college premiums are computed relative to within-group counterparts. By setting HSG as the reference point, we implicitly assume that HSG does not contribute to the wage gap between the two populations we are interested in. A positive college premium effect in accounting for the wage gap between group B (black workers) and group W (white workers) does not necessarily indicate that the wage of the college-educated workers of group B exceeds that of group W. The higher college premium of group B can be a reflection of excessive discrimination against the less-educated workers of group B.
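The mixed strategy described above, keeping a substantively chosen reference level for one factor while applying grand-mean weights to the rest, can be sketched with factor-specific weights. In the Python example below all coefficients, region labels, and shares are hypothetical, and `normalize_factors` is a name assumed for this illustration.

```python
import numpy as np

def normalize_factors(a, factors):
    """Normalize each factor with its own weights.

    `factors` maps a factor name to (b, w): dummy coefficients with 0.0 for the
    omitted level, and weights that sum to 1. Returns the new intercept and a
    dict of normalized coefficients.
    """
    new_b = {}
    for name, (b, w) in factors.items():
        b = np.asarray(b, dtype=float)
        w = np.asarray(w, dtype=float)
        shift = float(w @ b)
        a += shift
        new_b[name] = b - shift
    return a, new_b

# Hypothetical coefficients; these are not estimates from the paper.
education_b = [-0.25, 0.0, 0.10, 0.40]   # LTHS, HSG (reference), SC, BA
region_b = [0.0, -0.05, 0.03, 0.08]      # four hypothetical regions
region_share = [0.20, 0.30, 0.35, 0.15]  # hypothetical grand means

factors = {
    "education": (education_b, [0.0, 1.0, 0.0, 0.0]),  # keep HSG as the reference
    "region": (region_b, region_share),                # GM weighting
}
a_new, b_new = normalize_factors(2.60, factors)
print(round(a_new, 4), b_new)
```

With all weight placed on HSG, the education coefficients keep their dummy-coded meaning, so the BA entry still reads as the college premium, while the region coefficients become deviations from the grand-mean-weighted level.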

⁴ A similar logic was applied in a study of inter-industry wage differentials by Krueger and Summers (1988).

The example in the previous section illustrates, more generally, how analytic and substantive grounds may be invoked when choosing a specific decomposition method. In this spirit, I would recommend the averaging method only when it is reasonable to assume a uniform distribution across factor levels (although in the case of studies of labor market outcomes, such an assumption seems dubious). Other constraints of the form $\sum_{k=1}^{K} b^{\dagger}_k w_k = 0$ might be possible, but there need to be compelling reasons to bypass the grand means.

The choice of weighting values becomes more subtle when detailed decomposition tech-

niques are applied beyond the labor market. For group comparisons within a nation, such

as racial differences in voting rates or gender differences in subjective well-being, the GM

weighting method is still preferable in most cases. However, the GM weighting method is

not always the best choice. In particular, grand means would not be appropriate weighting

factors for international comparisons. Suppose a researcher is interested in the difference

in mortality rates between the US and China. As the Chinese economy develops, the dis-

tributions of age, education-level, and other covariates in China would approach those in

the US. If a researcher wants to perform a decomposition under the assumption that the

distributions of the $x$s in China are equal to those in the US, the weighting factors used to

normalize the coefficients should be $\bar{x}_{US}$, not the (weighted) means of the group means, $\bar{x}_{US}$

and $\bar{x}_{CN}$.

There are other studies for which the averaging method is most appropriate. An ex-

ample is the total fertility rate (TFR), which is the hypothetical fertility rate that would

result if a woman experienced the current age-specific population fertility rates throughout

her life. A uniform distribution of age groups should be assumed for computing the TFR.5 If researchers

want to obtain the intercept component (D1A) representing the TFR after controlling for the

other covariates, they can apply the averaging method to the age variable and the modified

GM weighting method to the covariates.6
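The link between the total fertility rate and uniform weights can be checked with a short calculation. The age-specific rates and the age distribution below are hypothetical, as are the variable names; the sketch only assumes the conventional definition of the rate as the sum of age-specific rates multiplied by the width of each age group.

```python
import math

# Hypothetical age-specific fertility rates (births per woman per year)
# for seven 5-year age groups, 15-19 through 45-49.
asfr = [0.02, 0.09, 0.11, 0.10, 0.05, 0.01, 0.001]
width = 5  # years in each age group
k = len(asfr)

# Conventional calculation: every age group counts for the same number of years.
tfr = width * sum(asfr)

# Equivalent form: reproductive span times the mean rate under uniform
# weights 1/K across the K age groups.
uniform = [1 / k] * k
tfr_uniform = width * k * sum(w * r for w, r in zip(uniform, asfr))

# Weighting by a (hypothetical) observed age distribution gives a different
# figure, which depends on the age structure and is not the same quantity.
observed = [0.16, 0.15, 0.15, 0.14, 0.14, 0.13, 0.13]
rate_observed = width * k * sum(w * r for w, r in zip(observed, asfr))

print("conventional:", round(tfr, 3))
print("uniform weights:", round(tfr_uniform, 3))
print("observed weights:", round(rate_observed, 3))
print("uniform matches conventional:", math.isclose(tfr, tfr_uniform))
```

Only the uniform weights reproduce the conventional figure, which is consistent with applying the averaging method to the age variable in this setting.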

5 Yu Xie pointed this out in his comments on a previous draft of this paper at the quantitative methodology session of the 2011 American Sociological Association annual meeting.

6 Note that the identification problem and its solutions as discussed in this paper are relevant to the methods used for rate standardization and the decomposition of rate differences, matters that have long been discussed in the demography literature (see, for example, Kitagawa (1955); Clogg and Eliason (1988); Liao (1989); Clogg et al. (1990)).

In short, “there is no single ‘best’ method of obtaining components” for decomposition

(Clogg et al. 1990:191). In principle, any detailed decomposition method is acceptable as

long as there are theoretical or practical reasons to believe that the researcher’s choice of

reference group (or weighting factors) produces a meaningful decomposition result. This is

why I refer to the GM weighting method as “A Suggested Alternative,” not “The Solution.”

6 Conclusion

Since the development of the BO decomposition techniques, detailed decompositions have

often been reported despite the caveats about the identification problem that have been

raised by many scholars. To address this concern, Yun (2005) suggested the averaging method.

Although that method is a notable advance and is undoubtedly applicable in some cases, it

does not resolve the identification problem entirely. It is based on unrealistic distributional

constraints on a set of dummy variables, and it is sensitive to the number of groups and the

method of grouping.

The legitimacy of any detailed decomposition depends on the acceptability of the as-

sumption of how the independent variables are distributed for the purpose of computing

the intercept. There are multiple solutions to the identification problem. Different model

specifications for detailed decompositions are appropriate, depending on various theoretical

and/or practical considerations.

Unless such considerations are compelling, however, the GM weighting method is likely

to be more generally preferred for studies of labor market outcomes. This conclusion is

based on the following reasons: (1) a state of equilibrium (or current social conditions) is

the most reasonable assumption to make for labor market phenomena; (2) estimates of the

contributions of individual factors and factor levels from the GM weighting method are the

least sensitive to model specifications, such as the choice of coding scheme; and (3) as a

result, the GM weighting method provides clear substantive interpretations of the intercept

and coefficients components for individual factor levels.

Detailed decomposition can be applied to various sociological issues. Given the rise in

the number of highly educated workers, estimating the detailed contributions of the differ-

ences in levels of education, fields of study, occupation, and other covariates in accounting

for gender/race earnings gaps is especially promising. Wealth inequality is another area of

interest, where a key issue is how to compute the extent to which each factor contributes

to wealth accumulation (Spilerman 2000); detailed decomposition is an essential tool for

such calculations (Scholz and Levine 2004).

Application of the GM weighting method to non-linear models is warranted. Although

there is no general agreement on how to decompose the results of quantile regressions, the

GM weighting method can be easily applied to quantile regressions as long as the use of the

Blinder-Oaxaca type decomposition implemented by García et al. (2001) is acceptable.7

7 See Gardeazabal and Ugidos (2005) for further discussion.


References

Berends, Mark, Samuel R. Lucas, and Roberto V. Peñaloza. 2008. “How Changes in

Families and Schools Are Related to Trends in Black-White Test Scores.” Sociology of

Education 81:313–44.

Bhaumik, Sumon Kumar, Ira N. Gang, and Myeong-Su Yun. 2006. “Ethnic conflict and

economic disparity: Serbians and Albanians in Kosovo.” Journal of Comparative Economics

34:754–773.

Blinder, Alan S. 1973. “Wage Discrimination: Reduced Form and Structural Estimates.”

Journal of Human Resources 8:436–55.

Bobbitt-Zeher, Donna. 2007. “The Gender Income Gap and the Role of Education.” Sociology

of Education 80:1–22.

Clogg, Clifford C. and Scott R. Eliason. 1986. “On Regression Standardization for Mo-

ments.” Sociological Methods and Research 14:423–46.

Clogg, Clifford C. and Scott R. Eliason. 1988. “A Flexible Procedure for Adjusting Rates and

Proportions, Including Statistical Methods for Group Comparisons.” American Sociological

Review 53:267–83.

Clogg, Clifford C., James W. Shockey, and Scott R. Eliason. 1990. “A General Statistical

Framework for Adjustment of Rates.” Sociological Methods and Research 19:156–95.

Cotton, Jeremiah. 1985. “Decomposing Income, Earnings, and Wage Differentials.” Socio-

logical Methods and Research 14:201–16.

DeLeire, Thomas. 2001. “Changes in Wage Discrimination against People with Disabilities:

1984-93.” Journal of Human Resources 36:144–58.

Dodoo, F. Nii-Amoo. 1991. “Earnings differences among Blacks in America.” Social Science

Research 20:93–108.


Farkas, George and Keven Vicknair. 1996. “Appropriate Tests of Racial Wage Discrimination

Require Controls for Cognitive Skill: Comment on Cancio, Evans, and Maume.” American

Sociological Review 61:557–60.

Fields, Judith and Edward N. Wolff. 1995. “Interindustry Wage Differentials and the Gender

Wage Gap.” Industrial and Labor Relations Review 49:105–20.

Fortin, Nicole, Thomas Lemieux, and Sergio Firpo. 2010. “Decomposition Methods in Eco-

nomics.” NBER Working Papers 16045.

Fox, John. 2008. Applied Regression Analysis and Generalized Linear Models. Thousand Oaks,

CA: Sage Publications, Inc.

García, Jaume, Pedro J. Hernández, and Angel López-Nicolás. 2001. “How Wide is the

Gap? An Investigation of Gender Wage Differences Using Quantile Regression.” Empirical

Economics 26:149–67.

Gardeazabal, Javier and Arantza Ugidos. 2004. “More on Identification in Detailed Wage

Decompositions.” The Review of Economics and Statistics 86:1034–1036.

Gardeazabal, Javier and Arantza Ugidos. 2005. “Gender Wage Discrimination at Quantiles.”

Journal of Population Economics 18:165–79.

Greene, William H. and Terry G. Seaks. 1991. “The Restricted Least Squares Estimator: A

Pedagogical Note.” The Review of Economics and Statistics 73:563–67.

Haisken-DeNew, J.P. and C. M. Schmidt. 1997. “Inter-Industry and Inter-Regional Differen-

tials: Mechanics and Interpretation.” The Review of Economics and Statistics 79:516–21.

Horrace, William C. and Ronald L. Oaxaca. 2001. “Inter-Industry Wage Differentials and

the Gender Wage Gap: An Identification Problem.” Industrial and Labor Relations Review

54:611–18.

Jones, F. L. 1983. “On Decomposing the Wage Gap: A Critical Comment on Blinder’s

Method.” Journal of Human Resources 18:126–30.


Jones, F. L. and Jonathan Kelley. 1984. “Decomposing Differences Between Groups: A Cau-

tionary Note on Measuring Discrimination.” Sociological Methods and Research 12:323–

43.

Kim, ChangHwan. 2010. “Decomposing the Change in the Wage Gap Between White and

Black Men Over Time, 1980-2005: An Extension of the Blinder-Oaxaca Decomposition

Method.” Sociological Methods and Research 38:619–51.

Kitagawa, E.M. 1955. “Components of a Difference between Two Rates.” Journal of the

American Statistical Association 50:1168–94.

Krueger, Alan B. and Lawrence H. Summers. 1988. “Efficiency Wages and the Inter-Industry

Wage Structure.” Econometrica 56:259–293.

Liao, Tim Futing. 1989. “A Flexible Approach for the Decomposition of Rate Differences.”

Demography 26:717–26.

Oaxaca, Ronald L. 1973. “Male-female Wage Differentials in Urban Labor Markets.” Inter-

national Economic Review 14:693–709.

Oaxaca, Ronald L. and Michael R. Ransom. 1999. “Identification in Detailed Wage Decom-

positions.” The Review of Economics and Statistics 81:154–57.

Phillips, Julie A. and Megan M. Sweeney. 2006. “Can Differential Exposure to Risk Factors

Explain Recent Racial and Ethnic Variation in Marital Disruption?” Social Science Research

35:409–34.

Powers, Daniel A., Hirotoshi Yoshioka, and Myeong-Su Yun. forthcoming. “mvdcmp: Multi-

variate Decomposition for Nonlinear Response Models.” The Stata Journal.

Powers, Daniel A. and Myeong-Su Yun. 2009. “Multivariate Decomposition for Hazard Rate

Models.” Sociological Methodology 39:233–63.


Sakamoto, Arthur, Huei-Hsia Wu, and Jessie M. Tzeng. 2000. “The Declining Significance of

Race among American Men During the Latter Half of the Twentieth Century.” Demography

37:41–51.

Sayer, Liana C. 2004. “Are Parents Investing Less in Children? Trends in Mothers’ and

Fathers’ Time with Children.” American Journal of Sociology 110:1–43.

Scholz, John Karl and Kara Levine. 2004. “U.S. Black-White Wealth Inequality.” In So-

cial Inequality, edited by Kathryn M. Neckerman, pp. 895–930. New York: Russell Sage

Foundation.

Spilerman, Seymour. 2000. “Wealth and Stratification Processes.” Annual Review of Sociol-

ogy 26:497–524.

Stearns, Elizabeth, Stephanie Moller, Judith Blau, and Stephanie Potochnick. 2007. “Stay-

ing Back and Dropping Out: The Relationship Between Grade Retention and School

Dropout.” Sociology of Education 80:210–40.

Van Hook, Jennifer, Susan L. Brown, and Maxwell Ndigume Kwenda. 2004. “A Decomposi-

tion of Trends in Poverty among Children of Immigrants.” Demography 41:649–70.

Yun, Myeong-Su. 2005. “A Simple Solution to the Identification Problem in Detailed Wage

Decompositions.” Economic Inquiry 43:766–72.

Yun, Myeong-Su. 2008. “Identification Problem and Detailed Oaxaca Decomposition: A

General Solution and Inference.” Journal of Economic and Social Measurement 33:27–38.
