Detailed Wage Decompositions: Revisiting the Identification Problem∗
ChangHwan Kim
August 1, 2012
Department of Sociology
University of Kansas
1415 Jayhawk Blvd., Room 716
Lawrence, KS 66045
Tel: (785) 864-9426
Fax: (785) 864-5280
∗Comments from the editor of Sociological Methodology, five anonymous reviewers, Arthur Sakamoto, Daniel Powers, Yu Xie, and Myeong-Su Yun have greatly improved previous drafts. All remaining errors are the author’s sole responsibility.
Abstract
Yun (2005) suggested an averaging method to resolve the identification problem
in detailed decompositions. Since then, detailed decomposition techniques have been
widely discussed. This paper shows that the averaging method may not answer the
identification problem. The method is built on unrealistic distributional constraints on
a set of dummy variables and is sensitive to both the number of groups and the method
of grouping. As an alternative, a weighted averaging method is suggested which is in
line with Haisken-DeNew and Schmidt’s (1997) two step re-normalization. This method
gives a distinct meaning to the intercept term and makes detailed decomposition fea-
sible using a reasonable assumption. However, this paper underscores that there are
multiple solutions to the identification problem, and that other detailed decomposition
methods may be acceptable depending on theoretical or practical considerations.
Keywords: Blinder-Oaxaca Decomposition, Detailed Decomposition, Averaging Method,
Grand-Mean Weighting Method
1 Introduction
Blinder-Oaxaca (BO) wage decomposition methods have been extensively applied in soci-
ology and economics (e.g., Dodoo 1991; Farkas and Vicknair 1996; Sakamoto et al. 2000;
DeLeire 2001; Sayer 2004; Van Hook et al. 2004; Phillips and Sweeney 2006; Stearns et al.
2007; Berends and Penaloza 2008). Although several variants of wage decompositions
have been developed since the initial work of Blinder (1973) and Oaxaca (1973),1 all these
techniques share the basic idea that the mean wage gap between two groups at a given
time can be broken down into two components (Jones and Kelley 1984; Cotton 1985). The
first component is the coefficient effect and the second component is the endowment ef-
fect. On top of the two components decomposition, researchers have often reported the
contributions of individual variables or of combined sets of dummy variables after imple-
menting a BO decomposition (e.g., Fields and Wolff 1995; DeLeire 2001; Bobbitt-Zeher
2007; Chang and England 2010). However, these detailed decompositions can be erroneous,
because the contribution of each individual variable or set of dummy variables
changes with the choice of the reference group. This is the identification problem which
has been a much discussed (Jones 1983; Jones and Kelley 1984; Clogg and Eliason 1986;
Oaxaca and Ransom 1999; Horrace and Oaxaca 2001; Gardeazabal and Ugidos 2004; Yun
2005) yet often ignored pitfall of detailed decomposition using BO techniques.
To resolve the identification problem, Gardeazabal and Ugidos (2004) suggested nor-
malizing the estimated coefficients, and Yun (2005) proposed an averaging method as a
simple alternative to the unwieldy normalization. Since then, detailed BO-type decompo-
sition techniques have been widely discussed in sociology and economics (e.g., Bhaumik
et al. 2006; Powers and Yun 2009; Kim 2010; Fortin et al. 2010; Powers et al. forthcom-
ing). The averaging method is, however, based on unrealistic distributional constraints on
a set of dummy variables. In this paper, I suggest the application of an alternative method
and consider its limitations.
1 Decomposition techniques were originally developed by sociologists and demographers (e.g., Kitagawa 1955). See Powers and Yun (2009:234-235) for a summary history.
2 Identification Problem and Averaging Method
Using ordinary least squares (OLS) regression, we can estimate the log wage, y, as follows:
\[
y = a + \sum_{j=1}^{J} \sum_{k=1}^{K_j} b_{jk} x_{jk} + e \tag{1}
\]
where $j = 1, 2, 3, \ldots, J$ and $k = 1, 2, 3, \ldots, K_j$; j refers to the jth factor, and k
represents the kth level of each of these J factors. In equation 1, each factor includes a
complete set of dummies, including the typically left out reference group. The reference group
$E[b_{jk} x_{jk} \mid x_{j1} = 1]$ takes a value of zero, which implies the identification
constraint of $b_{j1} = 0$. For simplicity, I restrict J = 1 in this paper, so that equation 1
becomes $y = a + \sum_{k=1}^{K} b_k x_k + e$.
Consider a comparison of white workers (group W) and black workers (group B). Estimating
equation 1 separately for the two groups, the mean wage gap between groups W and B
($y^W - y^B$, with superscripts denoting group means) can be partitioned into several
components as follows:
\[
y^W - y^B =
\underbrace{\underbrace{\left(a^W - a^B\right)}_{D_{1A}}
+ \underbrace{\sum_{k=1}^{K} \left(b_k^W - b_k^B\right) x_k^B}_{D_{1B}}}_{D_1}
+ \underbrace{\sum_{k=1}^{K} \left(x_k^W - x_k^B\right) b_k^W}_{D_2}
\tag{2}
\]
where D1 is the sum of D1A, which denotes the intercept component, and D1B, which
denotes the coefficient component. D1 represents the total coefficient effect. D2 refers to
the total endowment effect. By definition in OLS, the mean residual difference between eW
and eB is zero.
In a regression model, an estimated intercept varies with the choice of reference group,
and that is the case with the intercept component (D1A) in equation 2. As the reference
group changes in a regression model, the estimated coefficient bk changes accordingly.
Therefore, estimates of the extent to which individual factors and factor levels contribute to
the mean wage gap between groups W and B also vary with the choice of reference group.
This illustrates the identification problem in detailed decompositions.
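The dependence on the reference group can be checked numerically. The following is a minimal sketch, not the author's code. It assumes a single factor with a full set of dummies, in which case the OLS intercept equals the mean log wage of the reference level and each dummy coefficient is a difference in level means. All numbers and the function names `dummy_coefficients` and `bo_decomposition` are hypothetical.

```python
def dummy_coefficients(level_means, ref):
    """Intercept and dummy coefficients of a saturated one-factor model.

    With level `ref` omitted, the intercept is the reference-level mean and
    each coefficient is a difference from it (so b[ref] is zero).
    """
    a = level_means[ref]
    return a, [m - a for m in level_means]


def bo_decomposition(means_w, means_b, shares_w, shares_b, ref):
    """Return (D1A, D1B by level, D2 by level) following equation 2."""
    a_w, b_w = dummy_coefficients(means_w, ref)
    a_b, b_b = dummy_coefficients(means_b, ref)
    d1a = a_w - a_b
    d1b = [(bw - bb) * xb for bw, bb, xb in zip(b_w, b_b, shares_b)]
    d2 = [(xw - xb) * bw for xw, xb, bw in zip(shares_w, shares_b, b_w)]
    return d1a, d1b, d2


# Hypothetical mean log wages and population shares for three factor levels.
means_w, means_b = [2.6, 3.0, 3.4], [2.5, 2.8, 3.1]
shares_w, shares_b = [0.2, 0.5, 0.3], [0.4, 0.4, 0.2]

for ref in range(3):
    d1a, d1b, d2 = bo_decomposition(means_w, means_b, shares_w, shares_b, ref)
    print("reference level", ref,
          "| D1A", round(d1a, 3),
          "| D1B", [round(v, 3) for v in d1b],
          "| D2", [round(v, 3) for v in d2],
          "| total D1", round(d1a + sum(d1b), 3),
          "| total D2", round(sum(d2), 3))
```

With these made-up inputs, the totals D1 and D2 are the same for every reference level, while D1A and the level-specific terms change with the reference.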
To resolve the identification problem, Gardeazabal and Ugidos (2004) suggest normalizing
the coefficients of the dummy variables by imposing a restriction of $\sum_k b_k = 0$. As a
simple way to impose such a restriction, Yun (2005) proposes the following averaging method:
\[
\begin{aligned}
y &= \left(a + \bar{b}\right) + \sum_{k=1}^{K} \left(b_k - \bar{b}\right) x_k + e \\
  &= a' + \sum_{k=1}^{K} b'_k x_k + e
\end{aligned}
\tag{3}
\]
where $a'$ refers to $a + \bar{b}$ and $b'_k$ refers to $(b_k - \bar{b})$. Here $\bar{b}$ is
computed as the sum of $b_k$ divided by the number of factor levels; that is,
$\bar{b} = \sum_{k=1}^{K} b_k / K$. For the reference group of equation 1, $b_1$ is zero by
definition, thus the transformed coefficient $b_1 - \bar{b}$ is equal to $-\bar{b}$. This
transformation of $b_k$ causes $\sum_{k=1}^{K} b'_k$ to become zero. In ANOVA, the restriction
of $\sum b'_k = 0$ is referred to as a sigma constraint. It is one of a number of possible
identifying normalizations (Fox 2008:145). Under this constraint, the constant or intercept
term is a generalized grand mean (i.e., the mean of the means), and the effects $b'_k$ are
deviations from this.
As a result of this normalization, neither the new coefficients for the independent variables,
$(b_k - \bar{b})$, nor the new intercept, $a + \bar{b}$, depend on the choice of reference group.
As the coefficient $b_1$ for the reference group becomes $-\bar{b}$, no group is omitted. Because
equation 3 has exactly the same structure as equation 1, the BO decomposition technique
for equation 2 can also be applied to equation 3.
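The invariance of equation 3 to the reference group can be sketched in a few lines. This is an illustration under simplifying assumptions, not the author's implementation: one factor, a saturated dummy specification (so the intercept is the reference-level mean), and hypothetical level means. The helper names are invented for the example.

```python
def dummy_coefficients(level_means, ref):
    """Saturated one-factor model: the intercept is the reference-level mean."""
    a = level_means[ref]
    return a, [m - a for m in level_means]


def averaging_normalize(a, b):
    """Equation 3: a' = a + b_bar and b'_k = b_k - b_bar."""
    b_bar = sum(b) / len(b)
    return a + b_bar, [bk - b_bar for bk in b]


level_means = [2.6, 3.0, 3.4]  # hypothetical mean log wages by factor level

# Normalize the estimates obtained under each possible reference level.
normalized = [averaging_normalize(*dummy_coefficients(level_means, ref))
              for ref in range(len(level_means))]
for a_prime, b_prime in normalized:
    print(round(a_prime, 6), [round(v, 6) for v in b_prime])
```

Every choice of reference level leads to the same normalized intercept and coefficients, and the normalized coefficients sum to zero.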
On the surface, the averaging method appears to solve the identification problem. In
fact, the averaging method does not resolve it but merely conceals it. In equation 3, the
intercept a′ is the expected wage given xk = 1/K for k = 1, . . . ,K. That is, E[y|(xk =
1/K)] = a′. The difference between the intercepts of the two groups, a′W − a′B, is the
expected wage difference between hypothetical groups W and B, assuming the means of
all x’s equal 1/K. The normalized coefficients b′k represent the expected differences of y
for a group whose xk = 1 for a specific k compared to the hypothetical reference point (or
group), of which the mean of xk is equal to 1/K for all k’s. That is, b′k = E[y|(xk = 1)]− a′.
The identification problem occurs because the estimates of the intercept and those of
all other coefficients change with the arbitrary change of reference group. If the intercept
of one possible reference group (e.g., $E[y \mid x_1 = 1, x_2 = x_3 = \cdots = x_K = 0]$ when
$x_1$ is the reference group) and that of another possible reference group (e.g.,
$E[y \mid x_2 = 1, x_1 = x_3 = \cdots = x_K = 0]$ when $x_2$ is the reference group) are
arbitrary, then the intercept of the averaging method (i.e.,
$E[y \mid x_1 = x_2 = \cdots = x_K = 1/K]$, where the reference is a hypothetical point at
which the means of all x's equal 1/K) is also arbitrary in essence.
Some will argue that the averaging method nonetheless offers a normalized, standardized,
and detailed decomposition. They will say that it is better to use 1/K than to cherry-pick
the reference group. However, the averaging method is sensitive to the number of factor
levels, K. This sensitivity leads to another kind of identification problem. Suppose a
researcher uses four factor levels for education. One possible four-level grouping would be
LTHS (less than high school), HSG (high school graduate), SC (some college), and BA+
(bachelor's degree or higher), in which case the intercept calculated using the averaging
method is $E(y \mid x_{LTHS} = x_{HSG} = x_{SC} = x_{BA+} = .25)$. If the researcher changes
her education categories into five, dividing BA+ into BA and Grad (graduate degrees), the
intercept of the averaging method is changed to
$E(y \mid x_{LTHS} = x_{HSG} = x_{SC} = x_{BA} = x_{Grad} = .20)$. In the former case, BA+ is
assumed to include a 25% share of workers, whereas in the second grouping BA and Grad together
are assumed to include a 40% share. As the size of K changes, so does the intercept estimate,
as well as all the other coefficients.
The averaging method is sensitive not only to the number of groups, but also to the
method of grouping. Suppose a researcher uses another four-level grouping, which
combines LTHS and HSG into <HSG and divides BA+ into BA and Grad, so that the four groups
are <HSG, SC, BA, and Grad. In the previous four-level grouping, BA+ is assumed to include a
25% share of workers, whereas in the new grouping BA and Grad together are assumed to include
a 50% share. As the hypothetical distribution of x for the intercept differs as a function of
grouping method, almost all estimated effects differ as well, including the intercept effect
(D1A), the sum of coefficient effects (sum of D1B), and the coefficient effects (D1B) and
endowment effects (D2) of individual factor levels. Only the sum of D1 and the sum of D2 are
consistently estimated free from the variation of model specifications. Note that the original
BO decomposition also yields the same results regarding the sums of D1 and D2.
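The sensitivity to grouping can be illustrated with a small sketch. It assumes a single factor with a saturated dummy specification, in which case the averaging intercept $a + \bar{b}$ reduces to the unweighted mean of the level means. The shares are the Total column of Table 1; the mean log wages by level are hypothetical, as are the function names.

```python
def collapse(level_means, shares, groups):
    """Merge factor levels; each merged mean is a share-weighted average."""
    merged_means = []
    for members in groups:
        total = sum(shares[i] for i in members)
        merged_means.append(
            sum(shares[i] * level_means[i] for i in members) / total)
    return merged_means


def averaging_intercept(level_means):
    """a' = a + b_bar reduces to the unweighted mean of the level means."""
    return sum(level_means) / len(level_means)


# Order: LTHS, HSG, SC, BA, Grad. Shares come from the Total column of
# Table 1; the mean log wages are hypothetical.
shares = [0.0428, 0.2958, 0.2836, 0.2564, 0.1214]
level_means = [2.5, 2.8, 3.0, 3.3, 3.5]

groupings = {
    "five levels": [[0], [1], [2], [3], [4]],
    "LTHS, HSG, SC, BA+": [[0], [1], [2], [3, 4]],
    "<HSG, SC, BA, Grad": [[0, 1], [2], [3], [4]],
}
intercepts = {name: averaging_intercept(collapse(level_means, shares, groups))
              for name, groups in groupings.items()}
for name, value in intercepts.items():
    print(f"{name}: {value:.4f}")
```

The same underlying data yield a different averaging intercept under each grouping, because each grouping implies a different hypothetical distribution of x.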
There is also an identification problem with dichotomous variables (Yun 2005). Which
values should be coded as 1 and 0 is an arbitrary decision made by the researcher. With the
averaging method, the intercept for the dichotomous variables is the value expected when
the proportions of x = 1 and x = 0 are equal (i.e., .50). This simply is not the case for many
variables of interest, such as union membership.
In sum, despite contentions to the contrary, the averaging method suffers from the same
identification problem as the original BO decomposition. Different numbers of groups and
different methods of grouping can lead to substantially divergent results.
3 A Suggested Alternative Approach: The Grand-Mean Weighting Method
If neither the current BO decomposition nor the averaging method can solve the identification
problem, are detailed decompositions feasible at all? They are said to be infeasible
because the intercept term in regression models depends on the choice of reference group.
In fact, any normalization with the linear restriction of the parameters
$\sum_{k=1}^{K} b^{\dagger}_k w_k = 0$ will yield model-free decomposition estimates. There are
an infinite number of possible restrictions. The averaging method (where the weighting factors
$w_k$ are equal to 1/K) is a special case of these restrictions. Even the original BO
decomposition can be considered a special case of the restriction, with $w_1 = 1$ for k = 1
(factor level 1 is the reference group) and $w_k = 0$ for all other k. Given the restriction of
$\sum_{k=1}^{K} b^{\dagger}_k w_k = 0$, no detailed decomposition is mathematically wrong. The
feasibility of detailed decomposition does not hinge on mathematical tweaks, but rather on
acceptable restrictions involving the weighting factors $w_k$.
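The family of restrictions described above can be written as a single transformation. The sketch below is illustrative only: `normalize` is a hypothetical helper, the estimates are made up, and the weights are assumed to sum to one.

```python
def normalize(a, b, w):
    """Impose sum_k w_k * b_dagger_k = 0; the weights w must sum to one."""
    shift = sum(wk * bk for wk, bk in zip(w, b))
    return a + shift, [bk - shift for bk in b]


a, b = 2.6, [0.0, 0.4, 0.8]  # hypothetical estimates, level 1 as reference

schemes = {
    "original BO (w = 1, 0, 0)": [1.0, 0.0, 0.0],
    "averaging (w = 1/K)": [1 / 3, 1 / 3, 1 / 3],
    "other weights": [0.2, 0.5, 0.3],
}
normalized = {name: normalize(a, b, w) for name, w in schemes.items()}
for name, (a_dag, b_dag) in normalized.items():
    # The weighted sum of the transformed coefficients should be zero.
    constraint = sum(wk * bk for wk, bk in zip(schemes[name], b_dag))
    print(name, round(a_dag, 4), [round(v, 4) for v in b_dag],
          abs(constraint) < 1e-12)
```

Each scheme changes the intercept and coefficients but leaves the fitted value for every factor level, the intercept plus that level's coefficient, unchanged. This is why the choice among them is a matter of interpretation rather than arithmetic.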
As an alternative approach, I suggest applying a weighted averaging method in which the
weighting factors are grand means. The weighted normalization is in line with the two-step
re-normalization proposed by Haisken-DeNew and Schmidt (1997) and the restricted least
squares method discussed by Greene and Seaks (1991). I will call this the grand-mean (GM)
weighting method. In the following section, I first discuss the mathematical advantage of
the GM weighting method and then turn to its theoretical implications. To carry out a
detailed decomposition using the GM weighting method, it is necessary to transform the
estimated regression coefficients of equation 1, $b_k$, as follows:
\[
\begin{aligned}
y &= \left(a + b^*\right) + \sum_{k=1}^{K} \left(b_k - b^*\right) x_k + e \\
  &= a^* + \sum_{k=1}^{K} b^*_k x_k + e,
\end{aligned}
\qquad \text{where } b^* = \sum_{k=1}^{K} b_k \bar{x}_k.
\tag{4}
\]
Yun (2005) used a simple arithmetic mean to compute $\bar{b}$ in equation 3. The normalization
of the averaging method by $\sum_{k=1}^{K} b'_k = 0$ is equivalent to the normalization based on
$\sum_{k=1}^{K} b'_k (1/K) = 0$. The GM weighting method, instead, normalizes the estimated
coefficients by imposing $\sum_{k=1}^{K} b^*_k \bar{x}_k = 0$, where $\bar{x}$ refers to the
grand mean for both group W and group B. That is, $b^*$ is treated as the grand-mean-weighted
sum of $b_k$ in equation 4. After these transformations, the usual BO decomposition techniques
can be applied.
The insensitivity of $a^*$ to the choice of reference group can easily be proved. For
simplicity, let's assume there is only one dummy variable on the right side of the equation.
When $x_0$ is the reference group, the regression model looks like equation 5a, and as we
change the reference group to $x_1$, the estimated regression model becomes equation 5b.
\[
y = a + b x_1 + e \tag{5a}
\]
\[
y = (a + b) + (-b) x_0 + e \tag{5b}
\]
Applying equation 4, equations 5a and 5b are transformed to equations 6a and 6b respectively:
\[
y = (a + b\bar{x}_1) + (0 - b\bar{x}_1) x_0 + (b - b\bar{x}_1) x_1 + e \tag{6a}
\]
\[
y = (a + b + [-b\bar{x}_0]) + (-b - [-b\bar{x}_0]) x_0 + (0 - [-b\bar{x}_0]) x_1 + e \tag{6b}
\]
The intercept of equation 6a is $(a + b\bar{x}_1)$ and that of equation 6b is
$(a + b + [-b\bar{x}_0])$. Because $x_1$ is a dichotomous variable, $x_0 = 1 - x_1$ and
$\bar{x}_0 = 1 - \bar{x}_1$. If we replace $\bar{x}_0$ with $1 - \bar{x}_1$ in the intercept of
6b, it reduces to $(a + b\bar{x}_1)$, which is identical to the intercept of 6a. This proves
that the intercept of the GM weighting method is insensitive to the choice of reference group.
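The argument of equations 5a through 6b can be checked numerically. The sketch below is only an illustration: the values of a, b, and the grand mean of x1 are hypothetical, and `gm_normalize` is an invented helper that applies the transformation of equation 4.

```python
def gm_normalize(a, b, grand_means):
    """Equation 4: b_star is the grand-mean-weighted sum of the coefficients."""
    b_star = sum(bk * xk for bk, xk in zip(b, grand_means))
    return a + b_star, [bk - b_star for bk in b]


a, b = 2.8, 0.3                      # hypothetical estimates from equation 5a
x1_bar = 0.64                        # hypothetical grand mean of x1
grand_means = [1 - x1_bar, x1_bar]   # ordered as (x0_bar, x1_bar)

# Coefficients are ordered (x0, x1); the omitted category gets a zero.
eq_6a = gm_normalize(a, [0.0, b], grand_means)       # x0 is the reference (5a)
eq_6b = gm_normalize(a + b, [-b, 0.0], grand_means)  # x1 is the reference (5b)
print(eq_6a)
print(eq_6b)
```

Both codings lead to the same intercept, a + b times the grand mean of x1, and to the same pair of transformed coefficients, up to floating-point rounding.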
The intercept effect (D1A) of the GM weighting method quantifies the extent to which
members of the disadvantaged group are, on average, treated differently than members of
the advantaged group in a society. Thus, this value can also be interpreted as the average
extent of discrimination (assuming no unobserved heterogeneity).
The estimated contribution of the individual factor level (D1B) of the GM weighting
method indicates a deviation from the mean discrimination level. The sum of D1B of the GM
weighting method is close to zero.2 When there is group-based discrimination, all members
of the disadvantaged group suffer to a similar degree. If the extent of discrimination varies
greatly within a given minority group, it would be hard to label the discrimination as group-
based. Therefore, the small effects of D1B, along with the large intercept effect are what
one would expect if there is group-based discrimination.
The sums of D2 are identical between the averaging method and the GM weighting method.
Unlike the averaging method, however, the GM weighting method yields consistent estimates
of the D2 for individual factor levels regardless of model specifications.
2 The sum of D1B of the GM weighting method will be meaningfully different from zero only if the value of x for each group differs from the grand mean. A similar limitation is inevitable for all detailed decomposition methods. For the averaging method, the sum of D1B can be substantively large only if the distribution of x differs from 1/K, and the sum of D1B becomes larger as a researcher arbitrarily applies a model specification that increases $\sum |x_k - 1/K|$. For the original BO method, the sum of D1B will be near zero if the proportion of the reference group approaches 1, and conversely it becomes larger as a researcher arbitrarily chooses a reference group of which the proportion (i.e., $x_k$) is smaller than other groups.
Table 1: Descriptive Statistics

                                  Total      White      Black      Gap
                                  x̄          xW         xB         xW − xB
Log Wage, y                       3.0572     3.0811     2.7968      .2843

Less Than High School (LTHS)       .0428      .0406      .0667     -.0261
High School Graduate (HSG)         .2958      .2895      .3643     -.0748
Some College (SC)                  .2836      .2810      .3118     -.0308
Bachelor Degree (BA)               .2564      .2633      .1822      .0811
Graduate Degree (Grad)             .1214      .1256      .0750      .0506

25-34                              .3273      .3277      .3231      .0046
35-44                              .3388      .3380      .3481     -.0101
45-54                              .3339      .3343      .3289      .0054

Never Married                      .2298      .2192      .3451     -.1259
Currently Married                  .6433      .6581      .4827      .1754
Widow/Divorce/Separated            .1268      .1227      .1722     -.0495
4 An Illustrative Example
Using the 2009 Current Population Survey–Monthly Outgoing Rotation Group (CPS-MORG),
I decompose the log wage gap (.284 log dollars) between white male workers and black
male workers. To examine the sensitivity of decomposition results to model specifications, I
estimate three models, applying both the averaging method and the GM weighting method.
Table 1 shows the grand means and two group means. Table 2 presents the decomposition
results.3 Model 1 decomposes the racial gap using five educational levels, three age
groups, and three marital statuses. In Model 2, LTHS and HSG are collapsed into <HSG and
marital status is divided into two categories. Model 3 has the same specification as
Model 2 except for the educational categories: BA and Grad are collapsed into BA+, and LTHS
and HSG are separately identified.
3 The variance-covariance matrix of the averaging method is discussed in detail by Yun (2008). The same technique can be applied for the modified GMC method with slight modification. The variance-covariance matrix of the normalized regression coefficients of the averaging method is computed as $\Sigma_{b'} = W \Sigma_{B0} W'$, where W is a weight matrix and $\Sigma_{B0}$ is a reformatted variance-covariance matrix of the original regression coefficients ($\Sigma_B$). For the GM weighting method, everything except the weighting matrix $W^*$ is the same as the averaging method. The weighting matrix needs to be rebuilt by replacing the set of matrices of $1/K_j$ for each factor with a set of matrices of weighting values using grand means. The new coefficient for the GM weighting method is obtained by taking the diagonal of $W^* B$. The variance-covariance matrix of the new coefficients is computed as $\Sigma^*_b = W^* \Sigma_{B^*} W^{*\prime}$.
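The sandwich form described in footnote 3 can be sketched for a single factor. This is a simplified illustration and not necessarily the construction used by Yun (2008) or by the author. It assumes the grand means are treated as fixed constants, writes the weight matrix as $W^* = I - \mathbf{1}\bar{x}'$ (an assumption made for this example), and uses a hypothetical covariance matrix padded with zeros for the reference level.

```python
def matmul(left, right):
    """Multiply two matrices given as lists of rows."""
    return [[sum(l * r for l, r in zip(row, col)) for col in zip(*right)]
            for row in left]


def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


def gm_weight_matrix(grand_means):
    """W* = I - 1 x_bar', so that W* b equals b_k - sum_j b_j x_bar_j."""
    k = len(grand_means)
    return [[(1.0 if i == j else 0.0) - grand_means[j] for j in range(k)]
            for i in range(k)]


# Hypothetical inputs for one factor with three levels. Level 1 is the
# reference, so its coefficient and its covariance row and column are zero.
b = [0.0, 0.4, 0.8]
sigma_b = [[0.0, 0.0, 0.0],
           [0.0, 0.0004, 0.0001],
           [0.0, 0.0001, 0.0009]]
grand_means = [0.3, 0.5, 0.2]

w_star = gm_weight_matrix(grand_means)
b_star = [row[0] for row in matmul(w_star, [[v] for v in b])]
sigma_star = matmul(matmul(w_star, sigma_b), transpose(w_star))
std_errors = [sigma_star[i][i] ** 0.5 for i in range(len(b))]
print([round(v, 4) for v in b_star], [round(v, 4) for v in std_errors])
```

Under this construction the former reference level receives a nonzero coefficient with its own standard error, and the grand-mean-weighted sum of the transformed coefficients has zero variance, as the restriction requires.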
In all three models, the sum of D1 (D1A + D1B) and the sum of D2 are identical between
the two decomposition methods. The sum of D2 for each factor (e.g., Σ D2 of Edu Effect) is
also identical across methods. All other estimated decomposition components, however,
differ by decomposition method.
Importantly, the decomposition results of the GM weighting method are consistent
across model specifications, while those of the averaging method are substantially altered
by models. The intercept component (D1A) of the averaging method is .175 in Model 1, but
it becomes .167 in Model 2 and .200 in Model 3. In contrast to the averaging method, the
intercept components using the GM weighting method barely change across the models.
The coefficient effects (D1) of individual factors and factor levels are not consistently
estimated with the averaging method. For example, in Model 1 which uses five educational
categories, the sum of D1 of education is .005. When the education factor levels are reduced
to four categories in Model 2, the sum of D1 of education becomes .023. When I modify the
classification of education factors again in Model 3, the effect now turns out to be -.012.
These results imply that, with the averaging method, we cannot determine whether the
coefficient effect of education contributes to a reduction or to an increase of the racial
gap. For another example, the sum of D1 for two educational factor levels, LTHS and
HSG, of the averaging method is .014 in Model 1, but it is .024 in Model 2 and .007 in Model
3. Unlike the averaging method, the GM weighting method yields consistent estimates of
the effects of D1 for individual factors and factor levels under the different classifications of
factor levels.
The averaging method does not produce consistent estimates of the endowment ef-
fects (D2) for individual factor levels either. The estimated effect of D2 for BA and Grad
combined in the averaging method is either .028, .039, or .049 depending on the model
specifications, while that in the GM weighting method is almost identical across models.
When a dichotomous variable is used, the averaging method reports equal effects for the two
dichotomous categories by design. As a result, the D2's of married and not-married are both
.016 in Model 2. However, when three marital status factor levels are used in Model 1, the D2
of not married (i.e., the combination of never-married and widowed/divorced/separated)
is smaller than that of married. Unlike the averaging method, the GM weighting method
again produces consistent amounts of D2 for married regardless of models.

Table 2: Decomposition of the Mean Log Wage Gap between White and Black Men Using Averaging and GM Weighting Methods

                           Averaging (Equation 3)     GM weighting (Equation 4)
                           D1           D2            D1           D2
A. Model 1
LTHS                        .005         .010          .005         .010
HSG                         .009         .015**        .010*        .016**
SC                         -.004         .002         -.003         .002
BA                          .002         .019**        .003         .018**
Grad                       -.007         .020**       -.007         .019*
[Σ Edu Effect]             [.005]       [.066]**      [.007]**     [.066]**
25-34                      -.007        -.001         -.007        -.001
35-44                       .004         .000          .004         .000
45-54                       .003         .001          .003         .001
[Σ Age Effect]             [.000]       [-.001]**     [.001]*      [-.001]**
Never Married              -.011*        .012*        -.015**       .019**
Currently Married           .016**       .020**        .009**       .012**
Wid/Div/Sep                 .000         .001         -.003         .003
[Σ Marriage Effect]        [.005]*      [.034]**      [-.009]**    [.034]**
Intercept                   .175**                     .187**
[Total]                    [.185]**     [.099]**      [.185]**     [.099]**

B. Model 2
<HSG                        .024**       .031*         .017**       .024
SC                          .001         .005         -.004         .002
BA                          .005         .012          .002         .018
Grad                       -.006         .016         -.008         .019
[Σ Edu Effect]             [.023]**     [.063]**      [.007]**     [.063]**
25-34                      -.007        -.001         -.007        -.001
35-44                       .004         .000          .004         .000
45-54                       .004         .001          .004         .001
[Σ Age Effect]             [.000]       [-.001]**     [.001]       [-.001]**
Currently Not-married      -.013**       .016         -.017**       .021
Currently Married           .012**       .016          .009**       .012
[Σ Marriage Effect]        [-.001]**    [.032]**      [-.008]**    [.032]**
Intercept                   .167**                     .190**
[Total]                    [.189]**     [.095]**      [.189]**     [.095]**

C. Model 3
LTHS                        .004         .008          .005         .010
HSG                         .003         .008          .009*        .016
SC                         -.009        -.001         -.004         .002
BA+                        -.009         .049**       -.005         .035**
[Σ Edu Effect]             [-.012]      [.064]**      [.005]       [.064]**
25-34                      -.007        -.001         -.007        -.001
35-44                       .005         .000          .005         .000
45-54                       .002         .001          .002         .001
[Σ Age Effect]             [.000]       [-.001]**     [.000]       [-.001]**
Currently Not-married      -.014**       .016         -.018**       .021
Currently Married           .013**       .016          .009**       .012
[Σ Marriage Effect]        [-.001]**    [.033]**      [-.009]**    [.033]**
Intercept                   .200**                     .191**
[Total]                    [.188]**     [.096]**      [.188]**     [.096]**
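The combined endowment effects for BA and Grad quoted in the text can be recomputed from the entries of Table 2. The short sketch below only adds up values copied from the table; the dictionary layout and variable names are choices made for this illustration.

```python
# D2 entries for BA and Grad (or BA+ in Model 3), copied from Table 2.
d2 = {
    "averaging": {
        "Model 1": [0.019, 0.020],
        "Model 2": [0.012, 0.016],
        "Model 3": [0.049],
    },
    "GM weighting": {
        "Model 1": [0.018, 0.019],
        "Model 2": [0.018, 0.019],
        "Model 3": [0.035],
    },
}

# Combined D2 of BA and Grad for each method and model.
combined = {method: {model: round(sum(values), 3)
                     for model, values in models.items()}
            for method, models in d2.items()}

# Range of the combined effect across the three model specifications.
spread = {method: round(max(by_model.values()) - min(by_model.values()), 3)
          for method, by_model in combined.items()}

for method in combined:
    print(method, combined[method], "range across models:", spread[method])
```

The range across the three specifications is much wider for the averaging method than for the GM weighting method, which is the pattern the text describes.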
Another noteworthy point concerns the effects of age. Unlike other variables, the estimated
effects D1 and D2 are near zero in both methods. This is simply because the proportion of
each age group happens to be close to 1/K (i.e., 1/3) for both racial groups. In short, this
illustration clearly shows that the detailed decompositions of the GM weighting method are
model-free and consistent, while those of the averaging method are sensitive to model
specifications.
5 The Best Practice
Even though I have suggested applying the GM weighting method for detailed decomposition, I
hasten to add that there are no methods, including even the GM weighting method, that can
ultimately solve the identification problem and thus be universally applied to all situations.
Recall that, given the restriction $\sum_{k=1}^{K} b^{\dagger}_k w_k = 0$, all detailed
decompositions are mathematically correct. Different restrictions lead to different
interpretations of the estimates. Thus, the most essential question is what method is best
and what principles should be applied when choosing a specific decomposition method.
If the task is to compare groups within a nation on economic performance, I argue that
the GM weighting method is generally preferable. According to this method, the currently
observed distribution of x per se should be accepted as given. The intercept terms in
equation 4 measure the expected wage when x is distributed as $\bar{x}$. The reason why $w_k$
should be the grand mean rather than a group-specific mean (or other weighting factor) is that
the wage is determined by supply and demand in the whole labor force of a society, not a
specific group. For example, the supply of highly educated workers can be measured best by
$\bar{x}_{Grad}$, not by $x^W_{Grad}$ or $x^B_{Grad}$. If the currently observed labor market
reflects an equilibrium in employment, which in turn affects wages, the most reasonable and
practical assumption on the current status of the labor market is $\bar{x}$.4 The grand means
need not be of the two groups (W and B); their estimation can include other groups not of
interest in the current study. Which distribution of x yields the most realistic estimate of
actual wages depends upon the researcher’s judgment.
When the sample in a given dataset is representative, the grand means are unbiased
and consistent estimates of E(x) for a given population. As far as E(x) is considered a
reflection of the current social conditions in a given society, the GM weighting method
accurately estimates the extent to which each factor and factor level contributes to the
group differences under these social conditions.
If there are theoretical or practical reasons to define a specific group as the reference
group, the original BO decomposition method can be applied. For example, suppose a re-
searcher conducts a detailed decomposition of a wage difference, with a particular focus on
college premiums. College premiums are defined as the net difference in the results of HSG
and BA. Therefore, HSG may be the natural choice for the reference group. Even in cases
such as this, however, I recommend combining the original BO method with the GM weight-
ing method. The dummy coding used in the original BO method need be applied only for
the education factor, with the GM weighting method applied to all the others. This is be-
cause there are no strong theoretical or practical reasons to pick a certain factor level (e.g.,
the Pacific region) as the reference for computing college premiums. One caveat researchers
should bear in mind in interpreting the college premium effect of the BO decomposition is
that the college premiums are computed relative to within-group counterparts. By setting
HSG as a reference point, we implicitly assume that HSG does not contribute to the wage
gap between the two populations we are interested in. A positive college premium effect in
accounting for the wage gap between group B (black workers) and group W (white workers)
does not necessarily indicate that the wage of the college-educated workers of group B
exceeds that of group W. The higher college premium of group B can be a reflection of
excessive discrimination against the less-educated workers of group B.
4 A similar logic was applied in a study of inter-industry wage differentials by Krueger and Summers (1988).
The example in the previous section illustrates how more generic analytic and substantive
grounds may be invoked when choosing a specific decomposition method. In this
spirit, I would recommend the averaging method only when it is reasonable to assume a
uniform distribution across factor levels (although in the case of studies of labor market
outcomes, such an assumption seems dubious). Other constraints of the form
$\sum_{k=1}^{K} b^{\dagger}_k w_k = 0$ might be possible, but there need to be compelling
reasons to bypass the grand means.
The choice of weighting values becomes more subtle when detailed decomposition tech-
niques are applied beyond the labor market. For group comparisons within a nation, such
as racial differences in voting rates or gender differences in subjective well-being, the GM
weighting method is still preferable in most cases. However, the GM weighting method is
not always the best choice. In particular, grand means would not be appropriate weighting
factors for international comparisons. Suppose a researcher is interested in the difference
in mortality rates between the US and China. As the Chinese economy develops, the dis-
tributions of age, education-level, and other covariates in China would approach those in
the US. If a researcher wants to perform a decomposition under the assumption that the
distributions of the x's in China are equal to those in the US, the weighting factors used to
normalize the coefficients should be $x^{US}$, not the (weighted) means of the group means,
$x^{US}$ and $x^{CN}$.
There are other studies for which the averaging method is most appropriate. An example
is the total fertility rate (TFR), which is a hypothetical fertility rate when a woman
experiences the current, age-specific population fertility rates throughout her life. A
uniform distribution of age groups should be assumed for computing the TFR.5 If researchers
want to obtain an intercept component (D1A) representing the TFR after controlling for the
other covariates, they can apply the averaging method to the age variable and the modified
GM weighting method to the covariates.6
5Yu Xie pointed out this in his comments on the previous draft of this paper at the quantitative methodologysession of the 2011 American Sociological Association annual meeting.
6Note that the identification problem and its solutions as discussed in this paper are relevant to the methodsused for rate standardization and the decomposition of rate differences, matters that have long been discussedin the demography literature ( see, for example, Kitagawa (1955); Clogg and Eliason (1988); Liao (1989);
14
In short, “there is no single ‘best’ method of obtaining components” for decomposition
(Clogg et al. 1990:191). In principal, any detailed decomposition method is acceptable as
long as there are theoretical or practical reasons to believe that the researcher’s choice of
reference group (or weighting factors) produces a meaningful decomposition result. This is
why I refer to the GM weighting method “A Suggested Alternative,” not “The Solution. ”
6 Conclusion
Since the development of the BO decomposition techniques, detailed decompositions have
often been reported despite the caveats about the identification problem that has been
raised many scholars. To address this concern, Yun (2005) suggested the averaging method.
Although that method is a notable advance and is undoubtedly applicable in some cases, it
does not resolve the identification problem entirely. It is based on unrealistic distributional
constraints on a set of dummy variables, and it is sensitive to the number of groups and the
method of grouping.
The legitimacy of any detailed decomposition depends on the acceptability of the as-
sumption of how the independent variables are distributed for the purpose of computing
the intercept. There are multiple solutions to the identification problem. Different model
specifications for detailed decompositions are appropriate, depending on various theoretical
and/or practical considerations.
Unless such considerations are compelling, however, the GM weighting method is likely
to be more generally preferred for studies of labor market outcomes. This conclusion is
based on the following reasons: (1) a state of equilibrium (or current social conditions) is
the most reasonable assumption to make for labor market phenomena; (2) estimates of the
contributions of individual factors and factor levels from the GM weighting method are the
least sensitive to model specifications, such as the choice of coding scheme; and (3) as a
result, the GM weighting method provides clear substantive interpretations of the intercept
and coefficients components for individual factor levels.
Detailed decomposition can be applied to various sociological issues. Given the rise in
the number of highly educated workers, estimating the detailed contributions of the
differences in levels of education, fields of study, occupation, and other covariates in accounting
for gender/race earnings gaps is especially promising. Wealth inequality is another area of
interest, where a key issue is how to compute the extent to which each factor contributes
to wealth accumulation (Spilerman 2000); detailed decomposition is an essential tool for
such calculations (Scholz and Levine 2004).
Application of the GM weighting method to non-linear models is warranted. Although
there is no general agreement on how to decompose the results of quantile regressions, the
GM weighting method can be easily applied to quantile regressions as long as the use of the
Blinder-Oaxaca type decomposition implemented by García et al. (2001) is acceptable.7
7See Gardeazabal and Ugidos (2005) for further discussion.
References
Berends, Mark, Samuel R. Lucas, and Roberto V. Peñaloza. 2008. “How Changes in
Families and Schools Are Related to Trends in Black-White Test Scores.” Sociology of
Education 81:313–44.
Bhaumik, Sumon Kumar, Ira N. Gang, and Myeong-Su Yun. 2006. “Ethnic conflict and
economic disparity: Serbians and Albanians in Kosovo.” Journal of Comparative Economics
34:754–773.
Blinder, Alan S. 1973. “Wage Discrimination: Reduced Form and Structural Estimates.”
Journal of Human Resources 8:436–55.
Bobbitt-Zeher, Donna. 2007. “The Gender Income Gap and the Role of Education.” Sociology
of Education 80:1–22.
Clogg, Clifford C. and Scott R. Eliason. 1986. “On Regression Standardization for Mo-
ments.” Sociological Methods and Research 14:423–46.
Clogg, Clifford C. and Scott R. Eliason. 1988. “A Flexible Procedure for Adjusting Rates and
Proportions, Including Statistical Methods for Group Comparisons.” American Sociological
Review 53:267–83.
Clogg, Clifford C., James W. Shockey, and Scott R. Eliason. 1990. “A General Statistical
Framework for Adjustment of Rates.” Sociological Methods and Research 19:156–95.
Cotton, Jeremiah. 1985. “Decomposing Income, Earnings, and Wage Differentials.” Socio-
logical Methods and Research 14:201–16.
DeLeire, Thomas. 2001. “Changes in Wage Discrimination against People with Disabilities:
1984-93.” Journal of Human Resources 36:144–58.
Dodoo, F. Nii-Amoo. 1991. “Earnings differences among Blacks in America.” Social Science
Research 20:93–108.
Farkas, George and Keven Vicknair. 1996. “Appropriate Tests of Racial Wage Discrimination
Require Controls for Cognitive Skill: Comment on Cancio, Evans, and Maume.” American
Sociological Review 61:557–60.
Fields, Judith and Edward N. Wolff. 1995. “Interindustry Wage Differentials and the Gender
Wage Gap.” Industrial and Labor Relations Review 49:105–20.
Fortin, Nicole, Thomas Lemieux, and Sergio Firpo. 2010. “Decomposition Methods in Eco-
nomics.” NBER Working Papers 16045.
Fox, John. 2008. Applied Regression Analysis and Generalized Linear Models. Thousand Oaks,
CA: Sage Publications, Inc.
García, Jaume, Pedro J. Hernández, and Angel López-Nicolás. 2001. “How Wide is the
Gap? An Investigation of Gender Wage Differences Using Quantile Regression.” Empirical
Economics 26:149–67.
Gardeazabal, Javier and Arantza Ugidos. 2004. “More on Identification in Detailed Wage
Decompositions.” The Review of Economics and Statistics 86:1034–1036.
Gardeazabal, Javier and Arantza Ugidos. 2005. “Gender Wage Discrimination at Quantiles.”
Journal of Population Economics 18:165–79.
Greene, William H. and Terry G. Seaks. 1991. “The Restricted Least Squares Estimator: A
Pedagogical Note.” The Review of Economics and Statistics 73:563–67.
Haisken-DeNew, J.P. and C. M. Schmidt. 1997. “Inter-Industry and Inter-Regional Differen-
tials: Mechanics and Interpretation.” The Review of Economics and Statistics 79:516–21.
Horrace, William C. and Ronald L. Oaxaca. 2001. “Inter-Industry Wage Differentials and
the Gender Wage Gap: An Identification Problem.” Industrial and Labor Relations Review
54:611–18.
Jones, F. L. 1983. “On Decomposing the Wage Gap: A Critical Comment on Blinder’s
Method.” Journal of Human Resources 18:126–30.
Jones, F. L. and Jonathan Kelley. 1984. “Decomposing Differences Between Groups: A Cau-
tionary Note on Measuring Discrimination.” Sociological Methods and Research 12:323–
43.
Kim, ChangHwan. 2010. “Decomposing the Change in the Wage Gap Between White and
Black Men Over Time, 1980-2005: An Extension of the Blinder-Oaxaca Decomposition
Method.” Sociological Methods and Research 38:619–51.
Kitagawa, E.M. 1955. “Components of a Difference between Two Rates.” Journal of the
American Statistical Association 50:1168–94.
Krueger, Alan B. and Lawrence H. Summers. 1988. “Efficiency Wages and the Inter-Industry
Wage Structure.” Econometrica 56:259–293.
Liao, Tim Futing. 1989. “A Flexible Approach for the Decomposition of Rate Differences.”
Demography 26:717–26.
Oaxaca, Ronald L. 1973. “Male-female Wage Differentials in Urban Labor Markets.” Inter-
national Economic Review 14:693–709.
Oaxaca, Ronald L. and Michael R. Ransom. 1999. “Identification in Detailed Wage Decom-
positions.” The Review of Economics and Statistics 81:154–57.
Phillips, Julie A. and Megan M. Sweeney. 2006. “Can Differential Exposure to Risk Factors
Explain Recent Racial and Ethnic Variation in Marital Disruption?” Social Science Research
35:409–34.
Powers, Daniel A., Hirotoshi Yoshioka, and Myeong-Su Yun. forthcoming. “mvdcmp:
Multivariate Decomposition for Nonlinear Response Models.” The Stata Journal.
Powers, Daniel A. and Myeong-Su Yun. 2009. “Multivariate Decomposition for Hazard Rate
Models.” Sociological Methodology 39:233–63.
Sakamoto, Arthur, Huei-Hsia Wu, and Jessie M. Tzeng. 2000. “The Declining Significance of
Race among American Men During the Latter Half of the Twentieth Century.” Demography
37:41–51.
Sayer, Liana C. 2004. “Are Parents Investing Less in Children? Trends in Mothers’ and
Fathers’ Time with Children.” American Journal of Sociology 110:1–43.
Scholz, John Karl and Kara Levine. 2004. “U.S. Black-White Wealth Inequality.” In
Social Inequality, edited by Kathryn M. Neckerman, pp. 895–930. New York: Russell Sage
Foundation.
Spilerman, Seymour. 2000. “Wealth and Stratification Processes.” Annual Review of Sociol-
ogy 26:497–524.
Stearns, Elizabeth, Stephanie Moller, Judith Blau, and Stephanie Potochnick. 2007. “Stay-
ing Back and Dropping Out: The Relationship Between Grade Retention and School
Dropout.” Sociology of Education 80:210–40.
Van Hook, Jennifer, Susan L. Brown, and Maxwell Ndigume Kwenda. 2004. “A Decomposi-
tion of Trends in Poverty among Children of Immigrants.” Demography 41:649–70.
Yun, Myeong-Su. 2005. “A Simple Solution to the Identification Problem in Detailed Wage
Decompositions.” Economic Inquiry 43:766–72.
Yun, Myeong-Su. 2008. “Identification Problem and Detailed Oaxaca Decomposition: A
General Solution and Inference.” Journal of Economic and Social Measurement 33:27–38.