
On the Identification of Production Functions: How Heterogeneous is Productivity?

Amit Gandhi, Salvador Navarro, David Rivers∗

May 30, 2016

Abstract

We show that the “proxy variable” approaches for estimating production functions generically suffer from a fundamental identification problem in the presence of a flexible input that satisfies the “proxy variable” assumptions for invertibility, such as intermediate inputs. We provide a formal proof of nonparametric non-identification, and illustrate that the source of under-identification is the flexible input elasticity. Using a transformation of the firm’s first order condition, we develop a new nonparametric identification strategy that addresses this problem, as well as a simple corresponding estimator for the production function. We show that the alternative of approximating the effects of intermediate inputs using a value-added production function cannot generally be used to identify features of interest from the gross output production function. Applying our approach to plant-level data from Colombia and Chile, we find that a gross output production function implies fundamentally different patterns of productivity heterogeneity than a value-added specification.

∗ We would like to thank Dan Ackerberg, Richard Blundell, Juan Esteban Carranza, Allan Collard-Wexler, Ulrich Doraszelski, Steven Durlauf, Jeremy Fox, Silvia Goncalves, Phil Haile, Joel Horowitz, Jean-Francois Houde, Aureo de Paula, Amil Petrin, Mark Roberts, Nicolas Roys, Chad Syverson, Chris Taber, Quang Vuong, and especially Tim Conley for helpful discussions. This paper has also benefited from detailed comments by the editor and three anonymous referees. We would also like to thank Amil Petrin and David Greenstreet for helping us to obtain the Colombian and Chilean data respectively. Navarro and Rivers acknowledge support from the Social Sciences and Humanities Research Council of Canada. This paper previously circulated under the name “Identification of Production Functions using Restrictions from Economic Theory.” First draft: May 2006. Amit Gandhi is at the University of Wisconsin-Madison, E-mail: [email protected]. Salvador Navarro is at the University of Western Ontario, E-mail: [email protected]. David Rivers is at the University of Western Ontario, E-mail: [email protected].


1 Introduction

The identification and estimation of production functions using data on firm inputs and output is among the oldest empirical problems in economics. A key challenge for identification arises because firms optimally choose their inputs as a function of their productivity, but productivity is unobserved by the econometrician in the data. This gives rise to a classic simultaneity problem that was first articulated by Marschak and Andrews (1944) and has come to be known in the production function literature as “transmission bias”. In their influential review of the state of the literature nearly 50 years later, Griliches and Mairesse (1998) (henceforth GM) concluded that the “search for identification” for the production function remained a fundamentally open problem.¹ Resolving this identification problem is critical to measuring productivity with plant level production data, which has become increasingly available for many countries, and which motivates a variety of industry equilibrium models based on patterns of productivity heterogeneity found in this data.²

¹ In particular, the standard econometric solutions to correct the transmission bias, i.e., using firm fixed effects or instrumental variables, have proven to be both theoretically problematic and unsatisfactory in practice (see e.g., GM and Ackerberg et al., 2007 for a review and Section 7 of this paper for a discussion).

² Among these patterns are the general understanding that even narrowly defined industries exhibit “massive” unexplained productivity dispersion (Dhrymes, 1991; Bartelsman and Doms, 2000; Syverson, 2004; Collard-Wexler, 2010; Fox and Smeets, 2011), and that productivity is closely related to other dimensions of firm-level heterogeneity, such as importing (Kasahara and Rodrigue, 2008), exporting (Bernard and Jensen, 1995, Bernard and Jensen, 1999, Bernard et al., 2003), wages (Baily, Hulten, and Campbell, 1992), etc. See Syverson (2011) for a review of this literature.

In this paper we examine the identification foundations of the “proxy variable” approach for estimating production functions pioneered by Olley and Pakes (1996) and further developed by Levinsohn and Petrin (2003), Ackerberg, Caves, and Frazer (2015), and Wooldridge (2009) (henceforth OP, LP, ACF and Wooldridge, respectively). Despite the immense popularity of the method in the applied productivity literature, a fundamental question has remained unexplored: does the proxy variable technique solve the identification problem of transmission bias? That is, is the production function in fact identified from panel data on inputs and output under the structural assumptions that the proxy variable technique places on the data? To date there has been no formal demonstration of identification in this literature. On the other hand, there have been various red flags concerning the identification foundations for this class of estimators (see Bond and Söderbom, 2005 and ACF). However, no clear conclusion regarding non-identification of the model as a whole has been reached either. Without an answer to this basic question, it is unclear how to interpret the vast body of empirical results regarding patterns of productivity which rely on the proxy approach for the measurement of productivity.

Our first main result is a negative one. We show that the model structure underlying the proxy variable technique does not identify the production function (and hence productivity) in the presence of a flexible input that satisfies the conditions for invertibility invoked in this literature. We establish this result under the LP/Wooldridge approach that uses intermediate inputs as a flexible input that satisfies the proxy variable assumption of being monotone (and hence invertible) in productivity. Under this structure we show how to construct a continuum of observationally equivalent production functions that cannot be distinguished from the true production function in the data. We also show that similar non-identification problems arise under the original OP strategy of using investment as the proxy variable.³

Our second contribution is to show that the exact source of this non-identification is the flexible input elasticity for inputs satisfying the proxy variable assumption. The flexible input elasticity defines a partial differential equation on the production function. As we show, if this elasticity were known, then it could be integrated up to nonparametrically identify the part of the production function that depends on the flexible input. We then show that the proxy variable structure is sufficient to nonparametrically identify the remainder of the production function. These results taken together formalize the empirical content of the proxy variable structure that is widely used in applied work: the model is identified up to the flexible input elasticity, but fails to identify the flexible input elasticity itself.

³ We do not emphasize OP as the leading case because the applied literature has largely adopted the use of intermediate inputs as the proxy variable due to the prevalence of zeroes in investment, which was the original motivation for LP. Nevertheless we show that resorting to OP does not solve the non-identification problem we pose for LP.

Our third contribution is that we present a new empirical strategy that nonparametrically identifies the flexible input elasticity, and hence solves for the missing source of identification for the production function within the proxy variable structure. Our approach builds on a key economic assumption of the proxy variable structure: that the firm optimally chooses intermediate inputs in response to its realized productivity. Whereas the proxy variable literature has used this assumption to “invert” the input demand and substitute for productivity using intermediate inputs, we go further and exploit the nonparametric structural link between the production function and the firm’s first order condition for intermediate inputs. This link allows for a nonparametric regression of the intermediate input’s revenue share on all inputs (labor, capital, and intermediate inputs) to identify the flexible input elasticity. This is a nonparametric analogue of the familiar parametric insight that revenue shares directly identify the intermediate input coefficient in a Cobb-Douglas setting (e.g., Klein, 1953 and Solow, 1957). Our key innovation is that we show that the information in the first order condition can be used in a completely nonparametric way, i.e., without making functional form assumptions on the production function. Our results thus show how this “share regression” can be combined with the remaining content of the proxy variable structure to nonparametrically identify the production function as a whole.
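The parametric insight mentioned above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the estimator developed in this paper: it assumes a Cobb-Douglas technology, a price-taking firm that chooses intermediate inputs to maximize expected profit, a common relative input price, and a normally distributed ex-post shock. All parameter values and names (`alpha_m`, `log_rel_price`, and so on) are hypothetical. Under these assumptions the log revenue share equals the log of the intermediate input coefficient, plus the log of the mean of the exponentiated ex-post shock, minus the ex-post shock itself.

```python
import numpy as np

def simulate_share_identification(n=200_000, alpha_m=0.5, sigma_eps=0.2, seed=0):
    """Hypothetical Cobb-Douglas example: recover the intermediate input
    elasticity alpha_m from revenue shares alone."""
    rng = np.random.default_rng(seed)
    alpha_k, alpha_l = 0.25, 0.25            # assumed elasticities of capital and labor
    log_rel_price = 0.0                      # log(input price / output price), assumed common
    k = rng.normal(size=n)                   # log capital (predetermined)
    l = rng.normal(size=n)                   # log labor (predetermined)
    omega = 0.3 * k + rng.normal(scale=0.5, size=n)   # persistent productivity
    eps = rng.normal(scale=sigma_eps, size=n)         # ex-post shock with mean zero
    big_e = np.exp(sigma_eps**2 / 2)         # mean of exp(eps) for a normal shock

    # First order condition for m: the firm knows omega but not eps.
    m = (np.log(alpha_m) + np.log(big_e) + alpha_k * k + alpha_l * l
         + omega - log_rel_price) / (1.0 - alpha_m)
    y = alpha_k * k + alpha_l * l + alpha_m * m + omega + eps

    # Log revenue share of intermediate inputs: log(alpha_m) + log(big_e) - eps.
    log_share = log_rel_price + m - y
    eps_hat = log_share.mean() - log_share   # recovers eps up to sampling error
    e_hat = np.exp(eps_hat).mean()           # sample analogue of the mean of exp(eps)
    alpha_m_hat = np.exp(log_share.mean()) / e_hat
    naive = np.exp(log_share.mean())         # ignores the correction, so it is too large
    return alpha_m_hat, naive

alpha_m_hat, naive = simulate_share_identification()
print(alpha_m_hat, naive)
```

The corrected estimate should be close to the assumed value of 0.5, while the uncorrected geometric-mean share overstates it. Note that no regression of output on inputs is needed in this parametric case, which is why transmission bias does not affect the share-based estimate.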

This identification strategy (regressing revenue shares on inputs to identify the flexible input elasticity, solving the partial differential equation, and integrating this into the proxy variable structure to identify the remainder of the production function) gives rise to a natural two-step estimator in which a flexible parametric approximation to different components of the production function is estimated in each stage. We present a computationally straightforward implementation of this estimator and show that the properties of the estimator are equivalent to a standard GMM estimator, which gives us a straightforward approach to inference with the computed parameter estimates.

We further relate our solution to the common empirical practice in the literature of estimating value-added production functions that subtract out a flexible input from the model (typically intermediate inputs). While value-added specifications may be of direct interest themselves, a common justification for value added is that it derives directly from an underlying gross output technology. To the extent that one is interested in features of the gross output production function, value-added production functions may appear immune from the non-identification problem we raise, as they explicitly exclude intermediate inputs. We show that, unless the production function is a very specific version of Leontief in value added and intermediate inputs, value added cannot be used to identify features of interest (including productivity) from the underlying gross output production function.

Finally, we apply our identification strategy to plant-level data from Colombia and Chile to study the underlying patterns of productivity under gross output compared to value-added specifications. We find that productivity differences become orders of magnitude smaller and sometimes even change sign when we analyze the data via gross output rather than value added. For example, the standard 90/10 productivity ratio taken among all manufacturing firms in Chile is roughly 9 under value added (meaning that the 90th percentile firm is 9 times more productive than the 10th percentile firm), whereas under our gross output estimates this ratio falls to 2. Moreover, these dispersion ratios exhibit a remarkable degree of stability across industries and across the two countries when measured via gross output, but exhibit much larger cross-industry and cross-country variance when measured via value added. We further show that, as compared to gross output, value added estimates generate economically significant differences in the productivity premium of firms that export, firms that import, firms that advertise, and higher wage firms.

In contrast to the view expressed in Syverson (2011), that empirical findings related to productivity are quite robust to measurement choices, our findings illustrate the empirical importance of the distinction between gross output and value-added estimates of productivity. Our results highlight the empirical relevance of our identification strategy for gross output production functions. The results suggest that the distinction between gross output and value added is at least as important, if not more so, than the transmission bias that has been the main focus of the production function estimation literature to date.

The rest of the paper is organized as follows. In Section 2 we describe the model and provide a nonparametric characterization of transmission bias. In Section 3 we formally prove that the production function is nonparametrically non-identified under the proxy variable approach. Section 4 shows that the source of the non-identification is the flexible input elasticity. In Section 5 we present our nonparametric identification strategy. Section 6 describes our estimation strategy. Section 7 compares our approach to the related literature. Section 8 discusses the use of value added. In Section 9 we describe the Colombian and Chilean data and show the results comparing gross output to value added for productivity measurement. In particular, we show evidence of large differences in unobserved productivity heterogeneity suggested by value added relative to gross output. Section 10 concludes with an example of the policy relevance of our results.

2 The Model and Identification Problem

We first describe the economic model of production underlying the “proxy variable” approach (OP/LP/ACF/Wooldridge), which has become widely used for estimating production functions and productivity in applied work. We then define the identification problem associated with the model and data. Our first main result in this paper demonstrates that, despite the ubiquity of these approaches in empirical work, the production function and productivity are not identified by the restrictions imposed by these methods.⁴,⁵

2.1 Data and Definitions

We observe a panel consisting of firms $j = 1, \ldots, J$ over periods $t = 1, \ldots, T$.⁶ A generic firm’s output, capital, labor, and intermediate inputs will be denoted by $(Y_t, K_t, L_t, M_t)$ respectively, and their log values will be denoted in lowercase by $(y_t, k_t, l_t, m_t)$. Firms are sampled from an underlying population and the asymptotic dimension of the data is to let the number of firms $J \to \infty$ for a fixed $T$, i.e., the data takes a short panel form. The data directly identifies the joint distribution of the history of inputs and output of a firm, i.e., the data identifies the joint distribution of the collection of random variables $\{(y_t, k_t, l_t, m_t)\}_{t=1}^{T}$.

⁴ As we discuss in Sections 3 and 8, Ackerberg, Caves, and Frazer (2015) avoid the issues we discuss below by carefully considering data-generating processes under which their procedure can be employed for restricted profit / Leontief specifications of the production function.

⁵ The identification problem we isolate also applies to the dynamic panel approach to production function estimation following Arellano and Bond (1991); Blundell and Bond (1998, 2000). We draw a comparison to the dynamic panel literature in Section 7.3.

⁶ Throughout this section we assume a balanced panel for notational simplicity. We also omit the firm subscript $j$, except when the context requires it.

We let $\mathcal{I}_t$ denote the information set of the firm in period $t$. The information set $\mathcal{I}_t$ consists of all information the firm can use to solve its period $t$ decision problem, which potentially includes its input choices. By definition (and irrespective of the economic details underlying input decisions) we have that the capital, labor, and intermediate input choices in each period can be expressed as functions of the information set $\mathcal{I}_t$, i.e.,

$$k_t = \mathbb{K}(\mathcal{I}_t); \quad l_t = \mathbb{L}(\mathcal{I}_t); \quad m_t = \mathbb{M}(\mathcal{I}_t) \tag{1}$$

Let $x_t \in \{k_t, l_t, m_t\}$ denote a generic input. If an input $x_t$ is such that $x_t \in \mathcal{I}_t$, i.e., the amount of the input employed in period $t$ is in the firm’s information set for that period, then we say the input is predetermined in period $t$. Thus a predetermined input is a function of the information set of a prior period,

$$x_t = \mathbb{X}(\mathcal{I}_{t-1}) \in \mathcal{I}_t.$$

If an input is not predetermined, and thus $x_t \notin \mathcal{I}_t$, then we say the input is variable in period $t$. If an input is variable and

$$\frac{\partial}{\partial x_{t-\tau}} \mathbb{X}(\mathcal{I}_t) \neq 0$$

for $\tau > 0$, i.e., lagged values of the input affect the optimal period $t$ choice of the input, then we say the input is dynamic. Finally, if a variable input is not dynamic, then we say it is flexible.

2.2 The Production Function and Productivity

We assume that the relationship between output and inputs is determined by an underlying production function $F$ and a Hicks-neutral productivity shock $\nu_t$.

Assumption 1. The relationship between output and the inputs takes the form

$$Y_t = F(k_t, l_t, m_t)\, e^{\nu_t} \iff y_t = f(k_t, l_t, m_t) + \nu_t \tag{2}$$

The Hicks-neutral productivity shock $\nu_t$ is decomposed as $\nu_t = \omega_t + \varepsilon_t$. The distinction between $\omega_t$ and $\varepsilon_t$ is that $\omega_t$ is known to the firm before making its period $t$ decisions, whereas $\varepsilon_t$ is an ex-post productivity shock realized only after the period decisions are made. The stochastic behavior of both of these components is explained next.


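Assumption 2. $\omega_t \in \mathcal{I}_t$ is known to the firm at the time of making its period $t$ decisions, whereas $\varepsilon_t \notin \mathcal{I}_t$ is not. Furthermore $\omega_t$ is Markovian so that its distribution can be written as $P_\omega(\omega_t \mid \mathcal{I}_{t-1}) = P_\omega(\omega_t \mid \omega_{t-1})$. The function $h(\omega_{t-1}) = E[\omega_t \mid \omega_{t-1}]$ is continuous. The shock $\varepsilon_t$ on the other hand is independent of the within period variation in information sets, $P_\varepsilon(\varepsilon_t \mid \mathcal{I}_t) = P_\varepsilon(\varepsilon_t)$.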

Given that $\omega_t \in \mathcal{I}_t$, but $\varepsilon_t$ is completely unanticipated on the basis of $\mathcal{I}_t$, we will refer to $\omega_t$ as persistent productivity, $\varepsilon_t$ as ex-post productivity, and $\nu_t = \omega_t + \varepsilon_t$ as total productivity. Observe that we can express $\omega_t = h(\omega_{t-1}) + \eta_t$, where $\eta_t$ satisfies $E[\eta_t \mid \mathcal{I}_{t-1}] = 0$. $\eta_t$ can be interpreted as the “innovation” to the firm’s persistent productivity $\omega_t$ in period $t$, which is unanticipated at period $t-1$.⁷

Without loss of generality, we can normalize $E[\varepsilon_t \mid \mathcal{I}_t] = E[\varepsilon_t] = 0$, which is in units of log output. However, the expectation of the ex-post shock, in units of the level of output, becomes a free parameter which we denote as $\mathcal{E} \equiv E[e^{\varepsilon_t} \mid \mathcal{I}_t] = E[e^{\varepsilon_t}]$.⁸ Note that the general form of input demand (1) implies $E[\varepsilon_t \mid \mathcal{I}_t, k_t, l_t, m_t] = E[\varepsilon_t \mid \mathcal{I}_t] = 0$ and hence $E[\varepsilon_t \mid k_t, l_t, m_t] = 0$ by the law of iterated expectations.
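A short worked example may help show why the level expectation of the ex-post shock cannot also be normalized. The first line uses only Jensen's inequality; the second line assumes, purely for illustration, that the ex-post shock is normally distributed with variance $\sigma^2$, which is not an assumption of the model.

$$E\left[e^{\varepsilon_t}\right] \;\geq\; e^{E[\varepsilon_t]} \;=\; e^{0} \;=\; 1, \quad \text{with equality only if } \varepsilon_t \text{ is degenerate at } 0.$$

$$\varepsilon_t \sim N(0, \sigma^2) \;\Longrightarrow\; E\left[e^{\varepsilon_t}\right] = e^{\sigma^2/2}, \qquad \text{e.g. } \sigma = 0.3 \text{ gives } e^{0.045} \approx 1.046.$$

So once the mean of the shock in logs is set to zero, the mean in levels is pinned down by the shape of the shock distribution and generally exceeds one.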

Finally, we have the scalar invertibility assumption that allows an input to be used to proxy for productivity. Levinsohn and Petrin (2003) proposed using the structure of a flexible input demand as the basis for such a proxy variable. For simplicity, we focus on the case of a single flexible input in the model, namely intermediate inputs $m_t$, and treat capital $k_t$ and labor $l_t$ as predetermined in the model. The non-identification problem we demonstrate can be easily adapted to the case where $l_t$ is also flexible.⁹

⁷ It is straightforward to allow the distribution of $P_\omega(\omega_t \mid \mathcal{I}_{t-1})$ to depend upon other elements of $\mathcal{I}_{t-1}$, such as firm export or import status, R&D, etc. In these cases $\omega_t$ becomes a controlled Markov process from the firm’s point of view. See Kasahara and Rodrigue (2008) and Doraszelski and Jaumandreu (2013) for examples.

⁸ See Goldberger (1968) for an early discussion of the implicit reinterpretation of results that arises from ignoring $\mathcal{E}$ (i.e., setting $\mathcal{E} \equiv E[e^{\varepsilon_t}] = 1$ while simultaneously setting $E[\varepsilon_t] = 0$) in the context of Cobb-Douglas production functions.

⁹ We focus on the case of a single flexible input because allowing $l_t$ to be flexible, in addition to intermediate inputs $m_t$, is associated with a set of distinct problems related to the identification of the labor elasticity that were raised by ACF. The key to avoiding the ACF critique is letting labor have sources of variation beyond $\omega_t$ and yet preserve the scalar unobservability restriction on the intermediate input demand. By treating $l_t$ as predetermined we avoid the ACF critique and can focus attention on the flexible input elasticity problem (which is the focus of our paper).

Assumption 3. The scalar unobservability assumption of LP/Wooldridge places the following assumption on the flexible input demand

$$m_t = \mathbb{M}(\mathcal{I}_t) = \mathbb{M}(k_t, l_t, \omega_t). \tag{3}$$

The intermediate input demand $\mathbb{M}$ is assumed strictly monotone in $\omega_t$.¹⁰

Observe that a key implication of Assumption 3 is that $\mathbb{M}$ can be inverted in $\omega_t$, allowing productivity to be expressed as a deterministic function of the inputs

$$\omega_t = \mathbb{M}^{-1}(k_t, l_t, m_t).$$
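To make the inversion concrete, consider a hypothetical special case that is not imposed by Assumption 3: a Cobb-Douglas technology with coefficients $\alpha_k, \alpha_l, \alpha_m$ (where $0 < \alpha_m < 1$) and a price-taking firm facing an output price $P$ and an intermediate input price $\rho$, which chooses $M_t$ to maximize expected profit given $\omega_t$. The symbols $\alpha_k, \alpha_l, \alpha_m, P, \rho$ are assumptions introduced only for this sketch. The first order condition in logs and its solution are

$$\ln \alpha_m + \alpha_k k_t + \alpha_l l_t + (\alpha_m - 1)\, m_t + \omega_t + \ln E\left[e^{\varepsilon_t}\right] + \ln P = \ln \rho,$$

$$m_t = \frac{1}{1 - \alpha_m}\left[\ln\!\left(\alpha_m E\left[e^{\varepsilon_t}\right] \tfrac{P}{\rho}\right) + \alpha_k k_t + \alpha_l l_t + \omega_t\right],$$

which is strictly increasing in $\omega_t$. Inverting gives productivity as a deterministic function of the inputs:

$$\omega_t = (1 - \alpha_m)\, m_t - \alpha_k k_t - \alpha_l l_t - \ln\!\left(\alpha_m E\left[e^{\varepsilon_t}\right] \tfrac{P}{\rho}\right).$$

In this special case the inverse demand is linear in $(k_t, l_t, m_t)$; in general it is an unknown nonparametric function.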

We restrict our attention to the use of intermediate inputs as a proxy rather than the original proxy variable strategy of OP that uses investment. As LP argued, the fact that investment is often zero in plant level data leads to practical challenges in using the OP approach, and as a result using intermediate inputs as a proxy has become the preferred strategy in applied work. Investment as a proxy raises similar identification challenges, which we discuss in Appendix A.

We could have generalized the model to allow the primitives $f_t$, $P_{t,\omega}$, and $P_{t,\varepsilon}$ to all vary by time $t$. We do not use this more general form of the model in the analysis to follow because the added notational burden distracts from the main ideas of the paper. However, it is straightforward to generalize the analysis that follows to the time-varying case by simply repeating the steps of our analysis separately for each time period $t \in \{2, \ldots, T\}$.

2.3 Transmission Bias

Given the structure of the production function we can formally state the problem of transmission bias in the nonparametric setting. Transmission bias classically refers to the bias in OLS estimates of Cobb-Douglas production function parameters obtained by regressing output on inputs. In the nonparametric setting we can see transmission bias more generally as the empirical problem of regressing output $y_t$ on inputs $(k_t, l_t, m_t)$ which yields

$$E[y_t \mid k_t, l_t, m_t] = f(k_t, l_t, m_t) + E[\omega_t \mid k_t, l_t, m_t]$$

and hence the elasticity of the regression in the data with respect to an input $x_t \in \{k_t, l_t, m_t\}$

$$\frac{\partial}{\partial x_t} E[y_t \mid k_t, l_t, m_t] = \frac{\partial}{\partial x_t} f(k_t, l_t, m_t) + \frac{\partial}{\partial x_t} E[\omega_t \mid k_t, l_t, m_t]$$

is a biased estimate of the true production elasticity $\frac{\partial}{\partial x_t} f(k_t, l_t, m_t)$.

¹⁰ This approach can be generalized to allow the input demand $\mathbb{M}$ to vary by time period $t$. All of our results can be readily extended to this more general case at the cost of introducing additional notational complexity, which we avoid here in order not to distract from the core conceptual issues we raise.

Under the proxy variable structure, transmission bias takes a very specific form. This can be seen as follows:

$$E[y_t \mid k_t, l_t, m_t] = f(k_t, l_t, m_t) + \mathbb{M}^{-1}(k_t, l_t, m_t) \equiv \phi(k_t, l_t, m_t). \tag{4}$$

Clearly no structural elasticities can be identified from this regression (the “first stage”), in particular the flexible input elasticity, $\frac{\partial}{\partial m_t} f(k_t, l_t, m_t)$. Instead, all the information from the first stage is summarized by the identification of the random variable $\phi_t \equiv \phi(k_t, l_t, m_t)$, and as a consequence the ex-post productivity shock $\varepsilon_t = y_t - E[y_t \mid k_t, l_t, m_t]$.
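The form that transmission bias takes under scalar invertibility can be seen in a small simulation. This is a minimal sketch under hypothetical assumptions: a Cobb-Douglas technology with made-up coefficients and an intermediate input demand that is linear and strictly increasing in persistent productivity (the form implied, up to a constant normalized to zero here, by a price-taking firm's first order condition). Under these assumptions the first stage regression function is simply the log intermediate input plus a constant, so OLS recovers coefficients of roughly (0, 0, 1) regardless of the true elasticities.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
alpha_k, alpha_l, alpha_m = 0.25, 0.25, 0.5   # assumed true elasticities

k = rng.normal(size=n)                        # log capital (predetermined)
l = rng.normal(size=n)                        # log labor (predetermined)
omega = rng.normal(scale=0.5, size=n)         # persistent productivity, known to the firm
eps = rng.normal(scale=0.2, size=n)           # ex-post shock, unknown to the firm

# Assumed flexible input demand: strictly increasing in omega (constant set to zero).
m = (alpha_k * k + alpha_l * l + omega) / (1.0 - alpha_m)
y = alpha_k * k + alpha_l * l + alpha_m * m + omega + eps

# OLS of output on a constant and all three inputs (the "first stage" regression).
X = np.column_stack([np.ones(n), k, l, m])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
ols_k, ols_l, ols_m = coef[1:]

print("true elasticities:", (alpha_k, alpha_l, alpha_m))
print("OLS coefficients: ", (ols_k, ols_l, ols_m))
```

Because productivity is an exact function of the inputs in this design, the regression identifies only the sum of the production function and the inverted demand, and none of the three true elasticities can be read off the estimated coefficients.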

The question then becomes whether the part of $\phi_t$ that is due to $f(k_t, l_t, m_t)$ versus the part due to $\omega_t$ can be separately identified using the second stage restrictions of the model. This second stage is formed by recognizing that

$$\begin{aligned} y_t &= f(k_t, l_t, m_t) + \omega_t + \varepsilon_t \\ &= f(k_t, l_t, m_t) + h\left(\phi_{t-1} - f(k_{t-1}, l_{t-1}, m_{t-1})\right) + \eta_t + \varepsilon_t. \end{aligned} \tag{5}$$

The challenge in using this equation for identification is the presence of an endogenous variable $m_t$ in the model that is correlated with $\eta_t$.

LP/Wooldridge propose to use instrumental variables for this endogeneity problem by exploiting orthogonality restrictions implicit in the model with respect to $\eta_t + \varepsilon_t$. In particular Assumption 2 implies that for any transformation $\Gamma_t = \Gamma(\mathcal{I}_{t-1})$ of the lagged period information set $\mathcal{I}_{t-1}$ we have the orthogonality $E[\eta_t + \varepsilon_t \mid \Gamma_t] = 0$. We focus on transformations that are observable by the econometrician, in which case $\Gamma_t$ will serve as the instrumental variables for the problem. The full vector of potential instrumental variables given the data (as described in Section 2.1) consists of all lagged output/input values which, by construction, are transformations of $\mathcal{I}_{t-1}$, as well as the current values of the predetermined inputs $k_t$ and $l_t$ (which by assumption are a transformation of $\mathcal{I}_{t-1}$), i.e., $\Gamma_t = (k_t, l_t, y_{t-1}, k_{t-1}, l_{t-1}, m_{t-1}, \ldots, y_0, k_0, l_0, m_0)$.

3 Non-Identification

In this section we show that the proxy variable structure of Assumptions 1-3 does not suffice to identify the production function. In Theorem 1, we first show that the application of instrumental variables (via the orthogonality restriction $E[\eta_t + \varepsilon_t \mid \Gamma_t] = 0$) to the structural equation (5) is insufficient to identify the production function $f$ (and the Markovian process $h$). However, the orthogonality restriction underlying the instrumental variables approach does not summarize the full structure of the model. In Theorem 2, we show that, even if we treat the model as a simultaneous system of equations for the determination of the endogenous variables $(y_t, m_t)$ in (5), the production function $f$ cannot be identified.

Identification of the production function $f$ by instrumental variables is based on projecting output $y_t$ onto the exogenous variables $\Gamma_t$ (see e.g., Newey and Powell, 2003). This generates a restriction between $(f, h)$ and the distribution of the data $G_{y_t, m_t \mid \Gamma_t}$ that takes the form

$$\begin{aligned} E[y_t \mid \Gamma_t] &= E[f(k_t, l_t, m_t) \mid \Gamma_t] + E[\omega_t \mid \Gamma_t] \\ &= E[f(k_t, l_t, m_t) \mid \Gamma_t] + h\left(\phi_{t-1} - f(k_{t-1}, l_{t-1}, m_{t-1})\right), \end{aligned} \tag{6}$$

where recall that $\phi_{t-1} \equiv \phi(k_{t-1}, l_{t-1}, m_{t-1})$ is known from the first stage equation (4). The structural primitives underlying equation (6) are given by $(f, h)$. The true $(f^0, h^0)$ are identified if no other $(\tilde{f}, \tilde{h})$ among all possible alternatives also satisfy the functional restriction (6) given the distribution of the observables $G_{y_t, m_t \mid \Gamma_t}$.¹¹

We first establish the following useful Lemma. For notational simplicity, we define the random variable $f_t \equiv f(k_t, l_t, m_t)$.

Lemma 1. If $(f, h)$ solve the functional restriction (6), then it must be the case that

$$E[\phi_t - f_t \mid \Gamma_t] = h(\phi_{t-1} - f_{t-1}).$$

Proof. Observe that

$$E[y_t \mid \Gamma_t] = E\left[E[y_t \mid k_t, l_t, m_t] \mid \Gamma_t\right] = E[\phi_t \mid \Gamma_t]$$

by construction of $\phi_t$. From the definition of $y_t$ it follows that

$$E[\phi_t \mid \Gamma_t] = E[f_t \mid \Gamma_t] + h(\phi_{t-1} - f_{t-1}).$$

Re-arranging terms gives us the Lemma.

Theorem 1. Under the model defined by Assumptions 1-3, and given $\phi_t \equiv \phi(k_t, l_t, m_t)$ identified from the first stage equation (4), there exists a continuum of alternative $(\tilde{f}, \tilde{h})$ defined by

$$\tilde{f} \equiv (1 - a) f^0 + a \phi_t$$

$$\tilde{h}(x) \equiv (1 - a)\, h^0\!\left(\frac{1}{1 - a}\, x\right)$$

for any $a \in (0, 1)$, that satisfy the same functional restriction (6) as the true $(f^0, h^0)$.

Proof. The proof of the Theorem follows almost immediately from Lemma 1. Given the definition of $(\tilde f, \tilde h)$ we have

$$
\begin{aligned}
\tilde f_t + \tilde h\left(\phi_{t-1} - \tilde f_{t-1}\right) &= f^0_t + a\left(\phi_t - f^0_t\right) + \tilde h\left((1-a)\left(\phi_{t-1} - f^0_{t-1}\right)\right) \\
&= f^0_t + a\left(\phi_t - f^0_t\right) + (1-a)\, h^0\left(\phi_{t-1} - f^0_{t-1}\right).
\end{aligned}
$$

Now, take the conditional expectation of the above (with respect to $\Gamma_t$) and apply the Lemma:

$$
\begin{aligned}
E[\tilde f_t \mid \Gamma_t] + \tilde h\left(\phi_{t-1} - \tilde f_{t-1}\right) &= E[f^0_t \mid \Gamma_t] + a\, h^0\left(\phi_{t-1} - f^0_{t-1}\right) + (1-a)\, h^0\left(\phi_{t-1} - f^0_{t-1}\right) \\
&= E[f^0_t \mid \Gamma_t] + h^0\left(\phi_{t-1} - f^0_{t-1}\right).
\end{aligned}
$$

Thus $(f^0, h^0)$ and $(\tilde f, \tilde h)$ satisfy the functional restriction and cannot be distinguished via instrumental variables.

11 Less formally, the intuitive idea is that $(f^0, h^0)$ are the unique primitives that explain the reduced form $E[y_t \mid \Gamma_t]$ given the model.
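The algebra behind Theorem 1 can be checked numerically. The sketch below is only an illustration under assumed primitives: the functional forms of `h0`, `f0` and `input_demand` are hypothetical choices that do not come from the paper. It verifies that the second-stage residual implied by the alternative $(\tilde f, \tilde h)$ equals $(1-a)$ times the true innovation, so it is mean zero given $\Gamma_t$ whenever the true one is.

```python
import math
import random

def h0(x):
    # Hypothetical nonlinear Markov transition for productivity (an assumption).
    return 0.8 * x + 0.1 * math.tanh(x)

def f0(k, l, m):
    # Hypothetical "true" production function in logs (an assumption).
    return 0.3 * k + 0.4 * l + 0.3 * m

def input_demand(k, l, omega):
    # Hypothetical flexible input demand M, strictly increasing in omega.
    return 0.5 * k + 0.3 * l + omega

def second_stage_residuals(a, n=1000, seed=0):
    """Return pairs (true residual, alternative residual) for restriction (6)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        k_lag, l_lag, w_lag = (rng.gauss(0, 1) for _ in range(3))
        k, l = rng.gauss(0, 1), rng.gauss(0, 1)
        w = h0(w_lag) + rng.gauss(0, 0.2)  # omega_t = h0(omega_{t-1}) + eta_t
        m_lag = input_demand(k_lag, l_lag, w_lag)
        m = input_demand(k, l, w)
        # First-stage object phi = f0 + omega, i.e. E[y | k, l, m].
        phi_lag = f0(k_lag, l_lag, m_lag) + w_lag
        phi = f0(k, l, m) + w
        # Alternative structure from Theorem 1.
        f_alt_lag = (1 - a) * f0(k_lag, l_lag, m_lag) + a * phi_lag
        f_alt = (1 - a) * f0(k, l, m) + a * phi
        h_alt = (1 - a) * h0((phi_lag - f_alt_lag) / (1 - a))
        true_res = phi - f0(k, l, m) - h0(phi_lag - f0(k_lag, l_lag, m_lag))
        alt_res = phi - f_alt - h_alt
        pairs.append((true_res, alt_res))
    return pairs
```

For any `a` in (0, 1), each alternative residual returned by `second_stage_residuals(a)` coincides with `(1 - a)` times the corresponding true residual up to floating-point error, which is the sense in which the two structures cannot be told apart by the second-stage moments.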

The intuition for the identification failure established in Theorem 1 can be seen by looking at

equation (5) above. Notice that, by substituting for $\omega_t$ in the intermediate input demand equation (3), we get

$$m_t = M\left(k_t, l_t, h\left(M^{-1}(k_{t-1},l_{t-1},m_{t-1})\right) + \eta_t\right).$$

This implies that the only source of variation left in mt after conditioning on

(kt, lt, kt−1, lt−1,mt−1) ∈ Γt (which are used as instruments for themselves) is the unobservable ηt.

Therefore, despite the apparent abundance of instruments in Γt, all of the remaining elements in Γt

are orthogonal to this remaining source of variation, ηt, and hence have no power as instruments.

Theorem 1 calls into question existing applied work on productivity that employs the proxy

variable technique to recover productivity. A standard application in the literature will employ a

“flexible parametric approximation” fβ parametrized by a finite dimensional parameter β to the

production function f . The standard estimator applies the restrictions of the first stage and second

stage in a GMM formulation where the moments are defined by $E[\varepsilon_t x_t] = 0$ for $x_t \in \{k_t, l_t, m_t\}$

and $E[(\eta_t + \varepsilon_t) x_t] = 0$ for $x_t \in \Gamma_t$. So long as the number of moment restrictions (determined by

the number of elements in Γt the estimator exploits) exceeds the dimensionality of β, then such

an approach would appear identified. However, our Theorem 1 shows that this simple process of

“counting moment equations” and unknown parameters is deceiving.

In so far as such estimators are consistent for β, it is only because of the parametric structure

being employed. Theorem 1 establishes that the production function is nonparametrically non-

identified by the moment restrictions underlying these techniques. However, the researcher will

typically have little basis for imposing parametric restrictions, and, if the parametric restrictions are

not correct, this can generate misleading inferences about the production function and productivity

(see Manski, 2003; Roehrig, 1988; and Matzkin, 2007 for more detail). Furthermore, as we show

in Appendix B, for the case of the commonly-employed Cobb-Douglas parametric form, even

imposing structural parametric assumptions is not necessarily sufficient to solve the identification

problem.

The result in Theorem 1 is a useful benchmark, as it relates directly to the approach used in the

proxy variable literature. However, this instrumental variables approach does not necessarily ex-

haust the sources of identification inherent in the proxy variable structure. First, since instrumental

variables is based only on conditional expectations, it does not employ the entire distribution of the

data (yt,mt,Γt). Second, it does not directly account for the fact that Assumption 3 also imposes

restrictions (scalar unobservability and monotonicity) on the determination of the endogenous vari-

able mt via M (·). Therefore, the proxy variable structure imposes restrictions on a simultaneous

system of equations because, in addition to the model for output, yt, via the production function,

there is a model for the proxy variable, in this case intermediate inputs, mt.

We now extend our non-identification result to show that the complete structure Θ = (f, h,M)

cannot be identified from the full joint distribution Gyt,mt|Γt of the data. For a structure Θ, let

$$\varepsilon^{\Theta}_t = y_t - f(k_t,l_t,m_t) - M^{-1}(k_t,l_t,m_t),$$

and

$$\eta^{\Theta}_t = M^{-1}(k_t,l_t,m_t) - h\left(M^{-1}(k_{t-1},l_{t-1},m_{t-1})\right).$$

In order to relate the structure $\Theta$ to the joint distribution of the data $G_{y_t,m_t|\Gamma_t}$ through the model, a joint distribution $G_{\eta^{\Theta}_t,\varepsilon^{\Theta}_t|\Gamma_t}$ needs to be specified. Let $E_G(\cdot)$ denote the expectation operator taken with respect to distribution $G$. We say that a structure $\Theta$ rationalizes the data if: (i) there exists a joint distribution $G_{\eta^{\Theta}_t,\varepsilon^{\Theta}_t|\Gamma_t} = G_{\eta^{\Theta}_t|\Gamma_t} \times G_{\varepsilon^{\Theta}_t}$ that generates the joint distribution $G_{y_t,m_t|\Gamma_t}$; (ii) it satisfies the first stage moment restriction $E_{G_{\varepsilon^{\Theta}_t}}[\varepsilon^{\Theta}_t \mid k_t,l_t,m_t] = 0$; (iii) it satisfies the IV orthogonality restriction $E_{G_{\eta^{\Theta}_t,\varepsilon^{\Theta}_t|\Gamma_t}}[\eta^{\Theta}_t + \varepsilon^{\Theta}_t \mid \Gamma_t] = 0$; and (iv) it satisfies Assumption 3 (i.e., scalar unobservability and monotonicity of $M$). Following Matzkin (2007), we say that, if there exists an alternative structure $\tilde\Theta \neq \Theta^0$ that rationalizes the data, then the structure $\Theta^0$ is not identified from the joint distribution $G_{y_t,m_t|\Gamma_t}$ of the data.

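Theorem 2. Given the true structure $\Theta^0 = (f^0, h^0, M^0)$ and Assumptions 1 - 3, there always exists a continuum of alternative structures $\tilde\Theta \neq \Theta^0$, defined by

$$
\begin{aligned}
\tilde f &\equiv f^0 + a\,(M^0)^{-1}, \\
\tilde h(x) &\equiv (1-a)\, h^0\!\left(\frac{1}{1-a}\, x\right), \\
\tilde M^{-1} &\equiv (1-a)\,(M^0)^{-1},
\end{aligned}
$$

for any $a \in (0,1)$, that exactly rationalize the data $G_{y_t,m_t|\Gamma_t}$.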

Proof. Let $\mathring{x}$ denote a particular value of the random variable $x$ in its support. We first observe that, for any hypothetical structure $\Theta = (f, h, M)$, there always exists a distribution $G_{\eta^{\Theta}_t,\varepsilon^{\Theta}_t|\Gamma_t}$ defined by

$$
G_{\eta^{\Theta}_t,\varepsilon^{\Theta}_t|\Gamma_t}\left(\mathring\eta_t, \mathring\varepsilon_t \mid \Gamma_t\right) =
G_{y_t,m_t|\Gamma_t}\left(
\begin{array}{l}
\mathring\varepsilon_t + f\left(k_t,l_t,M\left(k_t,l_t,h(M^{-1}(k_{t-1},l_{t-1},m_{t-1})) + \mathring\eta_t\right)\right) \\
\quad + M^{-1}\left(k_t,l_t,M\left(k_t,l_t,h(M^{-1}(k_{t-1},l_{t-1},m_{t-1})) + \mathring\eta_t\right)\right), \\
M\left(k_t,l_t,h(M^{-1}(k_{t-1},l_{t-1},m_{t-1})) + \mathring\eta_t\right)
\end{array}
\,\middle|\, \Gamma_t \right),
$$

that generates the conditional distribution of the data $G_{y_t,m_t|\Gamma_t}$ through the model, hence (i) is satisfied.

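Second, since the true model rationalizes the data, it follows that $E_{G_{\varepsilon^{\Theta^0}_t}}[\varepsilon^{\Theta^0}_t \mid k_t,l_t,m_t] = 0$. The $\varepsilon^{\tilde\Theta}_t$ implied by our alternative structure is given by

$$
\begin{aligned}
\varepsilon^{\tilde\Theta}_t &= y_t - \tilde f(k_t,l_t,m_t) - \tilde M^{-1}(k_t,l_t,m_t) \\
&= y_t - f^0(k_t,l_t,m_t) - a\,(M^0)^{-1}(k_t,l_t,m_t) - (1-a)\,(M^0)^{-1}(k_t,l_t,m_t) \\
&= y_t - f^0(k_t,l_t,m_t) - (M^0)^{-1}(k_t,l_t,m_t) \\
&= \varepsilon^{\Theta^0}_t,
\end{aligned}
$$

so it trivially satisfies the moment restriction in (ii).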

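Third, it follows that

$$
\begin{aligned}
\eta^{\tilde\Theta}_t + \varepsilon^{\tilde\Theta}_t &= y_t - \tilde f(k_t,l_t,m_t) - \tilde h\left(\tilde M^{-1}(k_{t-1},l_{t-1},m_{t-1})\right) \\
&= y_t - f^0(k_t,l_t,m_t) - a\,(M^0)^{-1}(k_t,l_t,m_t) - (1-a)\, h^0\left((M^0)^{-1}(k_{t-1},l_{t-1},m_{t-1})\right) \\
&= \underbrace{y_t - f^0(k_t,l_t,m_t) - h^0\left((M^0)^{-1}(k_{t-1},l_{t-1},m_{t-1})\right)}_{\eta^{\Theta^0}_t + \varepsilon^{\Theta^0}_t}
 - a \underbrace{\left[(M^0)^{-1}(k_t,l_t,m_t) - h^0\left((M^0)^{-1}(k_{t-1},l_{t-1},m_{t-1})\right)\right]}_{\eta^{\Theta^0}_t} \\
&= (1-a)\,\eta^{\Theta^0}_t + \varepsilon^{\Theta^0}_t.
\end{aligned}
$$

Since $\varepsilon^{\tilde\Theta}_t = \varepsilon^{\Theta^0}_t$, it immediately follows that $E_{G_{\eta^{\tilde\Theta}_t,\varepsilon^{\tilde\Theta}_t|\Gamma_t}}\left(\varepsilon^{\tilde\Theta}_t \mid \Gamma_t\right) = 0$. It also follows that $\eta^{\tilde\Theta}_t = (1-a)\,\eta^{\Theta^0}_t$. By a simple change of variables we have that

$$
E_{G_{\eta^{\tilde\Theta}_t,\varepsilon^{\tilde\Theta}_t|\Gamma_t}}\left(\eta^{\tilde\Theta}_t \mid \Gamma_t\right)
= E_{G_{\eta^{\tilde\Theta}_t|\Gamma_t}}\left(\eta^{\tilde\Theta}_t \mid \Gamma_t\right)
= E_{G_{\eta^{\Theta^0}_t|\Gamma_t}}\left(\frac{\eta^{\tilde\Theta}_t}{1-a} \,\middle|\, \Gamma_t\right)
= E_{G_{\eta^{\Theta^0}_t|\Gamma_t}}\left(\eta^{\Theta^0}_t \mid \Gamma_t\right)
= 0.
$$

Hence, our alternative structure satisfies the moment restriction in (iii).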

Finally we notice that, since $(M^0)^{-1}$ is invertible given Assumption 3, $\tilde M^{-1} \equiv (1-a)\,(M^0)^{-1}$ is therefore also invertible and hence satisfies Assumption 3 (i.e., (iv)) as well. Since both $\tilde\Theta$ and $\Theta^0$ satisfy requirements (i)-(iv), i.e., both rationalize the data, we conclude that $\Theta^0$ is not identified.

While we have focused our discussion on gross output production functions, a similar non-

identification result generally arises in a value-added specification in which either capital or labor

is flexible and satisfies Assumption 3. Notice that it is not necessary that this flexible input be

used as the proxy, just that it satisfies scalar unobservability and monotonicity. Hence, in the

value-added case with intermediate inputs used as the proxy, if labor is also flexible and satisfies

Assumption 3, it will be subject to the same identification problems described in this section. A

notable exception is Ackerberg, Caves, and Frazer (2015), which carefully lays out a description

of the data-generating-processes under which their procedure can be employed for value-added

specifications.12 However, based in part on our non-identification results above, ACF suggest not

applying their procedure to gross output production functions that are not Leontief in intermediate

inputs.

12 In particular, in one of the cases they present, they show that when labor is a flexible input that does not satisfy Assumption 3 due to persistent and unobserved wage shocks, its elasticity is identified.

4 The Empirical Content of the Model

Our theorems in the previous section show that the LP/Wooldridge approach is nonparametrically

under-identified. Our next result makes precise the exact source of under-identification. In particular, if the flexible input elasticity $\frac{\partial}{\partial m_t} f(k_t,l_t,m_t)$ were nonparametrically known, then the

structure of the second stage of the model is informative enough to identify the remainder of the

production function nonparametrically. We present this result for a single flexible input mt, and

show how it extends to multiple flexible input elasticities in Appendix C.

The idea is that the flexible input elasticity defines a partial differential equation that can be integrated up to identify the part of the production function $f$ related to the intermediate input $m_t$.13 By the fundamental theorem of calculus we have

$$\int \frac{\partial}{\partial m_t} f(k_t,l_t,m_t)\, dm_t = f(k_t,l_t,m_t) + C(k_t,l_t). \qquad (7)$$

Subtracting equation (7) from the production function, and re-arranging terms we have

$$\mathcal{Y}_t \equiv y_t - \varepsilon_t - \int \frac{\partial}{\partial m_t} f(k_t,l_t,m_t)\, dm_t = -C(k_t,l_t) + \omega_t. \qquad (8)$$

Notice that $\mathcal{Y}_t$ is an “observable” random variable as it is a function of data and the flexible input elasticity which is assumed to be known. It also depends on the ex-post shock $\varepsilon_t$, which can be recovered, for example, from the first stages of the OP/LP/ACF procedures.

Applying the Markov structure on productivity that is the basis for the “second stage” moments of the OP/LP/ACF procedure gives

$$\mathcal{Y}_t = -C(k_t,l_t) + h\left(\mathcal{Y}_{t-1} + C(k_{t-1},l_{t-1})\right) + \eta_t. \qquad (9)$$

Since $(k_t, l_t, k_{t-1}, l_{t-1}, m_{t-1}, y_{t-1} - \varepsilon_{t-1})$ is a transformation of the information set $I_{t-1}$, and $\mathcal{Y}_{t-1}$ is a function of these variables, we have the orthogonality $E[\eta_t \mid k_t, l_t, \mathcal{Y}_{t-1}, k_{t-1}, l_{t-1}] = 0$, which implies

$$E[\mathcal{Y}_t \mid k_t, l_t, \mathcal{Y}_{t-1}, k_{t-1}, l_{t-1}] = -C(k_t,l_t) + h\left(\mathcal{Y}_{t-1} + C(k_{t-1},l_{t-1})\right). \qquad (10)$$

This regression identified in the data will allow us to identify the last component of the production function $C$ up to an additive constant.14,15

13 See Houthakker (1950) for a similar solution to the related problem of how to recover the utility function from the demand functions.

We now establish this result formally based on the observations of the above discussion. We

will use the following regularity condition on the support of the regressors

(kt, lt,Yt−1, kt−1, lt−1) (adapted from Newey, Powell, and Vella, 1999).

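Assumption 4. For each point $(\bar{\mathcal{Y}}_{t-1}, \bar k_{t-1}, \bar l_{t-1})$ in the support of $(\mathcal{Y}_{t-1}, k_{t-1}, l_{t-1})$, the boundary of the support of $(k_t, l_t)$ conditional on $(\bar{\mathcal{Y}}_{t-1}, \bar k_{t-1}, \bar l_{t-1})$ has a probability measure zero.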

Assumption 4 is a condition that states that we can independently vary the predetermined inputs

(kt, lt) conditional on (Yt−1, kt−1, lt−1) within the support. This implicitly assumes the existence

of enough variation in the input demand functions for the predetermined inputs to induce open

set variation in them conditional on the lagged output and input values (Yt−1, kt−1, lt−1). This

condition makes explicit the variation that allows for nonparametric identification of the remainder

of the production function under the second stage moments above. A version of this assumption is

thus implicit in the LP/ACF/Wooldridge procedures.

Theorem 3. Under Assumptions 1 - 4, if $\frac{\partial}{\partial m_t} f(k_t,l_t,m_t)$ is nonparametrically known, then the production function $f$ is nonparametrically identified up to an additive constant.

Proof. The theorem assumes that $\frac{\partial}{\partial m_t} f(k_t,l_t,m_t)$ is known almost everywhere in $(k_t,l_t,m_t)$. Assumptions 2, 3, and 4 ensure that with probability 1 for any $(k_t,l_t,m_t)$ in the support of the data there is a set

$$\{(k,l,m) \mid k = k_t,\ l = l_t,\ m \in [\underline{m}(k_t,l_t), m_t]\}$$

also contained in the support for some $\underline{m}(k_t,l_t)$. Hence with probability 1 the integral

$$\int_{\underline{m}(k_t,l_t)}^{m_t} \frac{\partial}{\partial m_t} f(k_t,l_t,m)\, dm = f(k_t,l_t,m_t) + C(k_t,l_t)$$

is identified, where the equality follows from the fundamental theorem of calculus. Therefore, if two production functions $f$ and $\tilde f$ give rise to the same input elasticity $\frac{\partial}{\partial m_t} f(k_t,l_t,m_t)$ over the support of the data, then they can only differ by an additive function $C(k_t,l_t)$. To identify this additive function, observe that we can identify the joint distribution of $(\mathcal{Y}_t, k_t, l_t, \mathcal{Y}_{t-1}, k_{t-1}, l_{t-1})$ for $\mathcal{Y}_t$ defined by (8). Thus the regression function

$$E[\mathcal{Y}_t \mid k_t, l_t, \mathcal{Y}_{t-1}, k_{t-1}, l_{t-1}] = r(k_t, l_t, \mathcal{Y}_{t-1}, k_{t-1}, l_{t-1}) \qquad (11)$$

can be identified for almost all $x_t = (k_t, l_t, \mathcal{Y}_{t-1}, k_{t-1}, l_{t-1})$, where recall that $r = -C(k_t,l_t) + h(\mathcal{Y}_{t-1} + C(k_{t-1},l_{t-1}))$. Let $(\tilde C, \tilde h)$ be a candidate alternative structure. The two structures $(C, h)$ and $(\tilde C, \tilde h)$ are observationally equivalent if and only if

$$-C(k_t,l_t) + h\left(\mathcal{Y}_{t-1} + C(k_{t-1},l_{t-1})\right) = -\tilde C(k_t,l_t) + \tilde h\left(\mathcal{Y}_{t-1} + \tilde C(k_{t-1},l_{t-1})\right), \qquad (12)$$

for almost all points in the support of $x_t$. Our support assumption (Assumption 4) on $(k_t,l_t)$ allows us to take partial derivatives of both sides of (12) with respect to $k_t$ and $l_t$:

$$\frac{\partial}{\partial z} C(k_t,l_t) = \frac{\partial}{\partial z} \tilde C(k_t,l_t)$$

for $z \in \{k_t, l_t\}$ and for all $x_t$ in its support, which implies $C(k_t,l_t) - \tilde C(k_t,l_t) = c$ for a constant $c$ for almost all $x_t$. Thus we have shown the production function is identified up to an additive constant.

14 Notice that in the OP/LP/ACF procedures, $\mathcal{Y}_t$ can only be formed for observations in which the proxy variable is strictly positive. Observations that violate the strict monotonicity of the proxy equation need to be dropped from the first stage, which implies that $\varepsilon_t$ cannot be recovered. This introduces a selection bias since $E[\eta_t \mid k_t, l_t, \mathcal{Y}_{t-1}, k_{t-1}, l_{t-1}, \iota_t > 0] \neq E[\eta_t \mid k_t, l_t, \mathcal{Y}_{t-1}, k_{t-1}, l_{t-1}]$, where $\iota_t$ is the proxy variable, and equation (10) does not hold. The reason is that firms that receive lower draws of $\eta_t$ are more likely to choose non-positive values of the proxy, and this probability is a function of the other state variables of the firm. An alternative is to form $\eta_t + \varepsilon_t = y_t - \int \frac{\partial}{\partial m_t} f(k_t,l_t,m_t)\, dm_t + C(k_t,l_t) - h(\mathcal{Y}_{t-1} + C(k_{t-1},l_{t-1}))$, which follows from (8) and (9), does not require one to recover $\varepsilon_t$, and hence can be formed for all observations. One can then use moments of the form $E[\eta_t + \varepsilon_t \mid k_t, l_t, \mathcal{Y}_{t-1}, k_{t-1}, l_{t-1}] = 0$ to recover the rest of the production function: $-C(k_t,l_t) + h(\mathcal{Y}_{t-1} + C(k_{t-1},l_{t-1}))$.

15 As is well known, it is not possible to separately identify a constant in the production function from mean productivity, $E[\omega_t]$.

The key insight this result provides is that it makes precise the true empirical content that the proxy variable structure places on the economic environment surrounding the production function. In particular, it shows the empirical content of the structure of the model is such that it can nonparametrically identify the predetermined input elasticities given the flexible input elasticities. However, from our results in Section 3, there is not enough variation in the model to identify the flexible input elasticities themselves. Hence, our Theorems 1 - 3 establish that the precise source of under-identification of the proxy variable model is the elasticity $\frac{\partial}{\partial m_t} f(k_t,l_t,m_t)$ of the flexible input $m_t$ that is used as the proxy variable. The conclusion is that the proxy variable structure in LP/Wooldridge is too weak to identify this flexible input elasticity.

The simple intuition for this fact can be seen by juxtaposing the production function yt =

f (kt, lt,mt) + ωt + εt against the structure on intermediate input demand mt = M (kt, lt, ωt).

Observe first that mt is an endogenous variable in the model as it is also a function of the same

productivity shock ωt that determines output yt. However this endogenous variable does not admit

any source of variation from outside the production function - the only input demand shifter aside

from the other inputs in the production function is productivity ωt. In particular, the elasticity is

identified with how output varies with mt holding fixed (kt, lt), but the only source of variation in

mt (namely ωt) also simultaneously shifts output yt. Thus it would seem impossible to identify an

elasticity of the production function with respect to intermediate inputs mt.

One way out of this problem is to allow for observed shifters that enter the flexible input

demand M, but are excluded from the production function.16 Flexible input and output prices that

vary by firm are one natural source of variation and it has been considered recently by Doraszelski

and Jaumandreu (2013, 2015). In Section 7 we provide a more detailed discussion of the use of

prices as exclusion restrictions, and the circumstances under which they can be used. In the next

section, however, we present an alternative approach to identify the flexible input elasticity which

is widely applicable and does not rely on exogenously varying observable price differences among

firms within an industry.

16 It may be possible to achieve identification in the absence of exclusion restrictions by imposing additional restrictions. One example is using heteroskedasticity restrictions (see e.g., Rigobon, 2003; Klein and Vella, 2010; and Lewbel, 2012), although these approaches require explicit restrictions on the form of the error structure. We thank an anonymous referee for pointing this out. We are not aware of any applications of these ideas in the production function setting.

5 Nonparametric Identification of the Flexible Input Elasticity

The source of the identification problem underlying our Theorems 1 and 2 is that the data cannot

disentangle f from M−1 in either the first or second stages of the proxy variable technique. Our

solution is to use the restrictions implied by optimal firm behavior which underlie the input demand

function M. The key idea is to recognize that f and M are not independent functions for an

optimizing firm, but instead the input demand M is implicitly defined by f through the firm’s first

order condition. We show that this functional relationship can be exploited in a fully nonparametric

fashion (i.e., without imposing parametric structure on f ) to solve the non-identification problem.17

The key reason why we are able to use the first order condition with such generality is that the key

assumption of the model - Assumption 3 - already presumes intermediate inputs are a flexible

input, thus making the economics of this input choice especially tractable.

We focus attention in the main body on the classic case of perfect competition in the interme-

diate input and output markets. The perfect competition case makes our proposed solution to the

identification problem caused by intermediate inputs particularly transparent. In Appendix C4, we

show that our approach can be extended to the case of monopolistic competition with unobserved

output prices.

Assumption 5. Firms are price takers in the output and intermediate input market, with $\rho_t$ denoting the common intermediate input price and $P_t$ denoting the common output price facing all firms in period $t$. The production function $f$ is differentiable at all $(k, l, m) \in \mathbb{R}^3_{++}$. Firms maximize expected discounted profits.

We can view Assumption 5 as a natural strengthening of Assumption 3. Indeed the monotonic-

ity of M in Assumption 3 is typically justified following from the market structure, Assumption

5, under suitable shape restrictions on the production function f (see Appendix A in Levinsohn

and Petrin, 2003). However, an important distinction is that our approach can be generalized to

allow for additional unobservables in the firm’s problem that scalar unobservability, Assumption

3, cannot accommodate (see Appendix C).

17 Please see Appendix C3 for the extension to the case of multiple flexible inputs.

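Assumption 5 allows us to establish the link between $M$ and the production function $f$. The firm’s profit maximization problem with respect to intermediate inputs is

$$M(k_t, l_t, \omega_t) = \arg\max_{M_t}\; P_t\, E\left[F(k_t,l_t,m_t)\, e^{\omega_t + \varepsilon_t} \mid I_t\right] - \rho_t M_t, \qquad (13)$$

which follows because $M_t$ does not have any dynamic implications and thus only affects current period profits. The first order condition of the problem (13) is

$$P_t\, \frac{\partial}{\partial M_t} F(k_t,l_t,m_t)\, e^{\omega_t}\, \mathcal{E} = \rho_t, \qquad (14)$$

where recall $\mathcal{E} = E[e^{\varepsilon_t}]$. Taking logs of (14) and differencing with the production function (2) gives

$$
\begin{aligned}
s_t &= \ln \mathcal{E} + \ln\left(\frac{\partial}{\partial m_t} f(k_t,l_t,m_t)\right) - \varepsilon_t \qquad (15) \\
&\equiv \ln D^{\mathcal{E}}(k_t,l_t,m_t) - \varepsilon_t,
\end{aligned}
$$

where $s_t \equiv \ln\left(\frac{\rho_t M_t}{P_t Y_t}\right)$ is the (log) intermediate input share of output.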

Theorem 4. Under Assumptions 1, 2, 4, 5, and that $\rho_t$, $P_t$ (or price-deflators) are observed, the share regression in equation (15) nonparametrically identifies the flexible input elasticity $\frac{\partial}{\partial m_t} f(k_t,l_t,m_t)$ of the production function almost everywhere in $(k_t,l_t,m_t)$.

Proof. Because $E[\varepsilon_t \mid k_t,l_t,m_t] = 0$ we have that the conditional expectation

$$E[s_t \mid k_t,l_t,m_t] = \ln D^{\mathcal{E}}(k_t,l_t,m_t) \qquad (16)$$

identifies the function $D^{\mathcal{E}}$. We refer to this regression in the data as the share regression. As we now show, the share regression is the key to closing the non-identification gap left by our Theorems 1 and 2. Observe that $\varepsilon_t = \ln D^{\mathcal{E}}(k_t,l_t,m_t) - s_t$ and thus the constant

$$\mathcal{E} = E\left[\exp\left(\ln D^{\mathcal{E}}(k_t,l_t,m_t) - s_t\right)\right] \qquad (17)$$

can be identified.18 This allows us to identify the flexible input elasticity as

$$D(k_t,l_t,m_t) \equiv \frac{\partial}{\partial m_t} f(k_t,l_t,m_t) = \frac{D^{\mathcal{E}}(k_t,l_t,m_t)}{\mathcal{E}}. \qquad (18)$$

Theorem 4 shows that, by taking full advantage of the economic content of the model, we can

identify the flexible input elasticity. Combined with Theorem 3 this allows us to identify the whole

production function f nonparametrically.
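To make the share regression concrete, consider the special case of a Cobb-Douglas technology, where the flexible input elasticity is a constant $\beta_m$ and equation (15) reduces to $s_t = \ln \mathcal{E} + \ln \beta_m - \varepsilon_t$. The sketch below is an illustration only: the Cobb-Douglas form, the normal shocks, the parameter values and the function names are all assumptions made for this example. It simulates firms that satisfy the first order condition (14), then recovers $\beta_m$ from the mean log share after correcting for $\mathcal{E}$ as in (17) and (18).

```python
import math
import random

def simulate_shares(n, beta_k=0.25, beta_l=0.35, beta_m=0.4,
                    sigma_eps=0.3, log_price_ratio=0.1, seed=0):
    """Simulate log intermediate input shares s = ln(rho*M / (P*Y)).

    Assumes Cobb-Douglas production in logs and normal ex-post shocks;
    log_price_ratio stands for ln(rho / P).
    """
    rng = random.Random(seed)
    script_e = math.exp(0.5 * sigma_eps ** 2)  # E[exp(eps)] for normal eps
    shares = []
    for _ in range(n):
        k, l = rng.gauss(0, 1), rng.gauss(0, 1)
        omega = rng.gauss(0, 0.5)
        eps = rng.gauss(0, sigma_eps)
        # Log of the first order condition (14), solved for m.
        m = (math.log(beta_m * script_e) - log_price_ratio
             + beta_k * k + beta_l * l + omega) / (1 - beta_m)
        y = beta_k * k + beta_l * l + beta_m * m + omega + eps
        shares.append(log_price_ratio + m - y)
    return shares

def estimate_elasticity(shares):
    """Share regression on a constant, followed by the correction for script-E."""
    mean_s = sum(shares) / len(shares)           # estimates ln(E * beta_m)
    eps_hat = [mean_s - s for s in shares]       # residuals, as in (15)
    e_hat = sum(math.exp(e) for e in eps_hat) / len(eps_hat)  # as in (17)
    return math.exp(mean_s) / e_hat              # as in (18)
```

With a large simulated sample, `estimate_elasticity(simulate_shares(20000))` should land close to the assumed value of 0.4, whereas the uncorrected `exp(mean_s)` is biased upward by the factor $\mathcal{E}$.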

6 A Computationally Simple Estimator

In this section we show how to obtain a simple nonparametric estimator of the production function

using standard sieve series estimators as analyzed by Chen (2007). Our estimation procedure

consists of two steps. We first show how to estimate the share regression, and then proceed to

estimation of the constant of integration C and the Markov process h.

We propose a finite-dimensional truncated linear series given by a complete polynomial of

degree r for the share regression. In what follows, we add back in the firm subscripts j for clarity.

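Given the observations $\{(y_{jt}, k_{jt}, l_{jt}, m_{jt})\}_{t=1}^{T}$ for the firms $j = 1, \ldots, J$ sampled in the data, we propose to use a complete polynomial of degree $r$ in $k_{jt}, l_{jt}, m_{jt}$ and to use the sum of squared residuals, $\sum_{jt} \varepsilon_{jt}^2$, as our objective function. For example, for a complete polynomial of degree two, our estimator would solve:

$$
\min_{\gamma'} \sum_{j,t} \left[ s_{jt} - \ln\left(
\begin{array}{l}
\gamma'_0 + \gamma'_k k_{jt} + \gamma'_l l_{jt} + \gamma'_m m_{jt} + \gamma'_{kk} k_{jt}^2 + \gamma'_{ll} l_{jt}^2 \\
\quad + \gamma'_{mm} m_{jt}^2 + \gamma'_{kl} k_{jt} l_{jt} + \gamma'_{km} k_{jt} m_{jt} + \gamma'_{lm} l_{jt} m_{jt}
\end{array}
\right) \right]^2.
$$

The solution to this problem is an estimator

$$D^{\mathcal{E}}_r(k_{jt}, l_{jt}, m_{jt}) = \sum_{r_k + r_l + r_m \le r} \gamma'_{r_k,r_l,r_m}\, k_{jt}^{r_k}\, l_{jt}^{r_l}\, m_{jt}^{r_m}, \quad \text{with } r_k, r_l, r_m \ge 0, \qquad (19)$$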

of the elasticity up to the constant $\mathcal{E}$, as well as the residual $\varepsilon_{jt}$ corresponding to the ex-post shocks to production.19 Since we can estimate $\mathcal{E} = \frac{1}{JT} \sum_{j,t} e^{\varepsilon_{jt}}$, we can recover $\gamma \equiv \frac{\gamma'}{\mathcal{E}}$, and thus estimate $\frac{\partial}{\partial m_{jt}} f(k_{jt}, l_{jt}, m_{jt})$ from equation (19), free of the constant.

18 Doraszelski and Jaumandreu (2013) use a similar idea to recover this constant.

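Given our estimator for the intermediate input elasticity, we can calculate the integral in (7). One advantage of the polynomial sieve estimator we have selected is that this integral will have a closed-form solution:

$$\mathcal{D}_r(k_{jt}, l_{jt}, m_{jt}) \equiv \int D_r(k_{jt}, l_{jt}, m_{jt})\, dm_{jt} = \sum_{r_k + r_l + r_m \le r} \frac{\gamma_{r_k,r_l,r_m}}{r_m + 1}\, k_{jt}^{r_k}\, l_{jt}^{r_l}\, m_{jt}^{r_m + 1}.$$

For a degree two estimator ($r = 2$) we would have

$$
\mathcal{D}_2(k_{jt}, l_{jt}, m_{jt}) \equiv \left(
\begin{array}{l}
\gamma_0 + \gamma_k k_{jt} + \gamma_l l_{jt} + \frac{\gamma_m}{2} m_{jt} + \gamma_{kk} k_{jt}^2 + \gamma_{ll} l_{jt}^2 \\
\quad + \frac{\gamma_{mm}}{3} m_{jt}^2 + \gamma_{kl} k_{jt} l_{jt} + \frac{\gamma_{km}}{2} k_{jt} m_{jt} + \frac{\gamma_{lm}}{2} l_{jt} m_{jt}
\end{array}
\right) m_{jt}.
$$

With an estimate of $\varepsilon_{jt}$ and of $\mathcal{D}_r(k_{jt}, l_{jt}, m_{jt})$ in hand, we can form a sample analogue of $\mathcal{Y}_{jt}$ in equation (8): $\mathcal{Y}_{jt} \equiv \ln\left(\frac{Y_{jt}}{e^{\varepsilon_{jt}}\, e^{\mathcal{D}_r(k_{jt}, l_{jt}, m_{jt})}}\right)$.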
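The closed-form integral is simple to compute term by term. The following sketch is an illustration only: it assumes the polynomial coefficients are stored in a dictionary keyed by the exponent triple $(r_k, r_l, r_m)$, and the function names and example coefficients are hypothetical, not estimates from the paper.

```python
def elasticity(gamma, k, l, m):
    """Evaluate D_r(k, l, m) = sum of gamma[rk, rl, rm] * k^rk * l^rl * m^rm."""
    return sum(g * k**rk * l**rl * m**rm
               for (rk, rl, rm), g in gamma.items())

def integrated_elasticity(gamma, k, l, m):
    """Closed-form integral of D_r over m.

    Each term gamma * k^rk * l^rl * m^rm integrates to
    gamma / (rm + 1) * k^rk * l^rl * m^(rm + 1).
    """
    return sum(g / (rm + 1) * k**rk * l**rl * m**(rm + 1)
               for (rk, rl, rm), g in gamma.items())

# Made-up degree-two coefficients, used only to exercise the functions.
GAMMA_EXAMPLE = {
    (0, 0, 0): 0.50, (1, 0, 0): 0.02, (0, 1, 0): -0.03, (0, 0, 1): 0.04,
    (2, 0, 0): 0.001, (0, 2, 0): 0.002, (0, 0, 2): -0.003,
    (1, 1, 0): 0.004, (1, 0, 1): -0.005, (0, 1, 1): 0.006,
}
```

Differentiating `integrated_elasticity` numerically with respect to `m` should return `elasticity`, and for degree two the result matches the bracketed expression above multiplied by $m_{jt}$.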

In the second step, in order to recover the constant of integration C in (9) and the Markovian

process h, we use similar complete polynomial series estimators. Since a constant in the production

function cannot be separately identified from mean productivity, E [ωjt], we normalize C (kjt, ljt)

to contain no constant. That is, we use

$$C_\tau(k_{jt}, l_{jt}) = \sum_{0 < \tau_k + \tau_l \le \tau} \alpha_{\tau_k,\tau_l}\, k_{jt}^{\tau_k}\, l_{jt}^{\tau_l}, \quad \text{with } \tau_k, \tau_l \ge 0, \qquad (20)$$

and

$$h_A(\omega_{jt-1}) = \sum_{0 \le a \le A} \delta_a\, \omega_{jt-1}^{a}, \quad \text{with } a \le A, \qquad (21)$$

for some degrees $\tau$ and $A$ (that increase with the sample size). Combining these gives us

$$\mathcal{Y}_{jt} = -\sum_{0 < \tau_k + \tau_l \le \tau} \alpha_{\tau_k,\tau_l}\, k_{jt}^{\tau_k}\, l_{jt}^{\tau_l} + \sum_{0 \le a \le A} \delta_a\, \omega_{jt-1}^{a} + \eta_{jt}. \qquad (22)$$

Replacing for $\omega_{jt-1}$ we have the estimating equation:

$$\mathcal{Y}_{jt} = -\sum_{0 < \tau_k + \tau_l \le \tau} \alpha_{\tau_k,\tau_l}\, k_{jt}^{\tau_k}\, l_{jt}^{\tau_l} + \sum_{0 \le a \le A} \delta_a \left(\mathcal{Y}_{jt-1} + \sum_{0 < \tau_k + \tau_l \le \tau} \alpha_{\tau_k,\tau_l}\, k_{jt-1}^{\tau_k}\, l_{jt-1}^{\tau_l}\right)^{a} + \eta_{jt}. \qquad (23)$$

19 As with all nonparametric sieve estimators, the number of terms in the series increases with the number of observations. Under mild regularity conditions these estimators will be consistent and asymptotically normal for sieve M-estimators like the one we propose. See Chen (2007).

We can then use moments of the form $E[\eta_{jt}\, k_{jt}^{\tau_k}\, l_{jt}^{\tau_l}] = 0$ and $E[\eta_{jt}\, \mathcal{Y}_{jt-1}^{a}] = 0$ to form a standard sieve GMM criterion function to estimate $(\alpha, \delta)$.20 Notice that the estimator described

above is just-identified. One could also use higher-order moments, as well as lags of inputs, to

estimate an over-identified version of the model.
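The second-step moments can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `alpha` is assumed to be a dictionary keyed by the exponent pair $(\tau_k, \tau_l)$, `delta` a list of polynomial coefficients $\delta_0, \ldots, \delta_A$, and each data row a tuple $(\mathcal{Y}_{jt}, k_{jt}, l_{jt}, \mathcal{Y}_{jt-1}, k_{jt-1}, l_{jt-1})$. The helper `noiseless_rows` is hypothetical and exists only to generate data with $\eta_{jt} = 0$ for checking.

```python
import random

def c_poly(alpha, k, l):
    """Constant of integration C(k, l) as in (20); alpha has no constant term."""
    return sum(c * k**tk * l**tl for (tk, tl), c in alpha.items())

def h_poly(delta, omega):
    """Markov process h(omega) as in (21); delta[a] multiplies omega**a."""
    return sum(d * omega**a for a, d in enumerate(delta))

def innovation(alpha, delta, row):
    """Solve the estimating equation (23) for eta given a data row."""
    y_cal, k, l, y_cal_lag, k_lag, l_lag = row
    omega_lag = y_cal_lag + c_poly(alpha, k_lag, l_lag)
    return y_cal + c_poly(alpha, k, l) - h_poly(delta, omega_lag)

def sample_moments(alpha, delta, rows):
    """Sample analogues of E[eta * k^tk * l^tl] and E[eta * Ycal_lag^a]."""
    n = len(rows)
    etas = [innovation(alpha, delta, row) for row in rows]
    moments = [sum(e * r[1]**tk * r[2]**tl for e, r in zip(etas, rows)) / n
               for (tk, tl) in alpha]
    moments += [sum(e * r[3]**a for e, r in zip(etas, rows)) / n
                for a in range(len(delta))]
    return moments

def noiseless_rows(alpha, delta, n=200, seed=1):
    """Hypothetical data with eta = 0, so the moments vanish at (alpha, delta)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        k, l, k_lag, l_lag, w_lag = (rng.gauss(0, 1) for _ in range(5))
        w = h_poly(delta, w_lag)
        rows.append((w - c_poly(alpha, k, l), k, l,
                     w_lag - c_poly(alpha, k_lag, l_lag), k_lag, l_lag))
    return rows
```

The number of moments equals the number of entries in `alpha` plus the length of `delta`, which is the just-identified case described above. A GMM estimate would minimize a quadratic form in `sample_moments` over `(alpha, delta)`; the optimizer is deliberately left out of this sketch.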

Even though the estimator we introduce is a straightforward application of sieves, to our knowl-

edge, there is no asymptotic distributional theory for multi-step nonparametric sieve estimators.

Hence, for the purpose of inference, we interpret our estimator as a flexible parametric approxima-

tion to the production function. Our estimator then becomes a standard GMM problem that uses

the following moments

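$$
\begin{aligned}
E\left[\varepsilon_{jt}\, \frac{\partial \ln D_r(k_{jt}, l_{jt}, m_{jt})}{\partial \gamma}\right] &= 0, \\
E\left[\eta_{jt}\, k_{jt}^{\tau_k}\, l_{jt}^{\tau_l}\right] &= 0, \\
E\left[\eta_{jt}\, \mathcal{Y}_{jt-1}^{a}\right] &= 0,
\end{aligned}
$$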

where the first set of moments are the NLLS moments corresponding to the share equation. We

can then apply standard GMM theory to do inference.21 Since our main empirical focus is not on particular parameters themselves, but on functions of the parameters (e.g., elasticities, productivity), we employ a nonparametric block bootstrap to compute standard errors (see e.g., Horowitz, 2001).22

20 Alternatively, for a guess of $\alpha$, one can form $\omega_{jt-1}(\alpha) = \mathcal{Y}_{jt-1} + C(k_{jt-1}, l_{jt-1}) = \mathcal{Y}_{jt-1} + \sum_{0 < \tau_k + \tau_l \le \tau} \alpha_{\tau_k,\tau_l}\, k_{jt-1}^{\tau_k}\, l_{jt-1}^{\tau_l}$, and based on equation (22) use moments of the form $E[\eta_{jt}\, \omega_{jt-1}(\alpha)] = 0$ to estimate $\delta$. Notice that since $\omega_{jt}(\alpha) = \mathcal{Y}_{jt} + \sum_{0 < \tau_k + \tau_l \le \tau} \alpha_{\tau_k,\tau_l}\, k_{jt}^{\tau_k}\, l_{jt}^{\tau_l}$, this is equivalent to regressing $\omega_{jt}$ on a sieve in $\omega_{jt-1}$. Then the moments $E[\eta_{jt}\, k_{jt}^{\tau_k}\, l_{jt}^{\tau_l}] = 0$ can be used to estimate $\alpha$.

21 See e.g., Hansen (1982) for the one-step version and Newey and McFadden (1994) for the two-step version.
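One common way to implement a block bootstrap in a firm-level panel is to resample whole firms with replacement, keeping each firm's time series intact. The sketch below follows that reading, which is our assumption about the blocking scheme rather than something the text specifies. The panel layout (a dictionary from firm identifier to a list of observations) and the function names are hypothetical.

```python
import random
import statistics

def resample_firms(panel, rng):
    """Draw firms with replacement; each draw keeps the firm's full time series."""
    firm_ids = list(panel)
    return [panel[rng.choice(firm_ids)] for _ in firm_ids]

def block_bootstrap_se(panel, statistic, n_boot=200, seed=0):
    """Standard error of `statistic` across bootstrap replications.

    `statistic` takes a list of firm blocks and returns a number, for example
    an average elasticity computed from re-estimated parameters.
    """
    rng = random.Random(seed)
    draws = [statistic(resample_firms(panel, rng)) for _ in range(n_boot)]
    return statistics.stdev(draws)

def mean_share(blocks):
    """Toy statistic: the average of all observations in the resampled panel."""
    values = [s for block in blocks for s in block]
    return sum(values) / len(values)
```

In an application, `statistic` would rerun both estimation steps on each resampled panel before computing the function of interest, so that the sampling variation from the share regression carries through to the reported standard errors.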

7 Relationship to Literature

7.1 Price Variation as an Instrument

Recall that the identification problem with respect to intermediate inputs stems from insufficient

variation in mjt to identify their influence in the production function independently of the other in-

puts. However, by looking at the intermediate input demand equation,mjt = M(kjt, ljt, ωjt, ρt, Pt),

it can be seen that, if prices (Pt, ρt) were firm specific, they could potentially serve as a source of

variation to address the identification problem.23 In fact, given enough sources of price variation,

prices could potentially be used directly as instruments to estimate the entire production function

while controlling for the endogeneity of input decisions.

There are several challenges to using prices as instruments (see GM and Ackerberg et al.,

2007). First, in many firm-level production datasets, firm-specific prices are not observed. Second,

even if price variation is observed, in order to be useful as an instrument, the variation employed

must not be correlated with the innovation to productivity, ηjt (as is likely to be the case if firms

choose prices optimally due to market power); and it cannot purely reflect differences in the quality

of either inputs or output. To the extent that input and output prices capture quality differences,

22Since we estimate a just-identified model, the multi-step and single-step estimators are equivalent. In principle,one could employ the asymptotic results in Chen and Pouzo (2015) and Tao (2015) to do inference for the nonpara-metric estimator in this case. In practice, researchers may want to use an over-identified version of our estimator, inwhich case these results no longer apply. Hence we focus on the flexible parametric interpretation for the purposes ofinference. That being said, in Online Appendix O1, we present Monte Carlo simulations which show that our bootstrapprocedure has the correct coverage for the nonparametric estimates.

23The intermediate input demand equation also suggests that time-varying prices, even if they were not firm-specific,could serve as potential instruments. While industry-time specific price indices (deflators) are commonly available inmost datasets, this approach is problematic for a few reasons. First, these instruments are likely to have little identifyingvariation given that they do not vary across firm. Second, given that most firm-level datasets are short panels, withasymptotics taken in the number of firms as opposed to the number of periods, time-series variation alone will not leadto consistent estimates. In addition, when the production function is allowed to vary over time, time-series variationin prices is no longer sufficient for identification, as variation within each period is necessary.


prices should be included in the measure of the inputs used in production.24

This is not to say that if one can isolate exogenous price variation, it cannot be used to aid

in identification. The point is that just observing price variation is not enough. The case must

be made that the price variation that is used is indeed exogenous. For example, if prices are

observed and serially correlated, one way to deal with the endogeneity concern, as suggested by

Doraszelski and Jaumandreu (2013), is to use lagged prices as instruments. This diminishes the

endogeneity concerns, since lagged prices only need to be uncorrelated with the innovation to

productivity, ηjt. In an effort to show that wage variation does not reflect differences in worker

quality, Doraszelski and Jaumandreu (2015) demonstrate empirically that the majority of wage

variation in the Spanish manufacturing dataset they use is not due to variation in the skill mix of

workers, and therefore likely due to geographic and temporal differences in labor markets. This

work demonstrates that prices (specifically lagged prices), when carefully employed, can be a

useful source of variation for identification of the production function. However, as also noted

in Doraszelski and Jaumandreu (2013), this information is not available in most datasets. Our

approach offers an alternative identification strategy that can be employed even when external

instruments are not available.

7.2 Exploiting First-Order Conditions

The use of first-order conditions for the estimation of production functions dates back to at least the

work by Klein (1953) and Solow (1957),25 who recognized that, for a Cobb-Douglas production

function, there is an explicit relationship between the parameters representing input elasticities and

input cost or revenue shares. This observation forms the basis for index number methods (see e.g.,

Caves, Christensen, and Diewert, 1982) that are used to nonparametrically recover input elasticities

and productivity.26

24 Recent work has suggested that quality differences may be a key driver of price differences (see GM and Fox and Smeets, 2011).

25 Other examples of using first-order conditions to obtain identification include Stone (1954) on consumer demand, Heckman (1974) on labor supply, Hansen and Singleton (1982) on Euler equations and consumption, Paarsch (1992) and Laffont and Vuong (1996) on auctions, and Heckman, Matzkin, and Nesheim (2010) on hedonics.

26 Index number methods are grounded in three important economic assumptions. First, all inputs are flexible and competitively chosen. Second, the production technology exhibits constant returns to scale, which while not strictly


More recently, Doraszelski and Jaumandreu (2013, 2015) and Grieco, Li, and Zhang (2014)

exploit the first-order conditions for labor and intermediate inputs under the assumption that they

are flexibly chosen. Instead of using shares to recover input elasticities, these papers recognize

that given a particular parametric form of the production function, the first-order condition for a

flexible input (the proxy equation in LP/ACF) implies cross-equation parameter restrictions that

can be used to aid in identification. Using a Cobb-Douglas production function, Doraszelski and

Jaumandreu (2013) show that the first-order condition for a flexible input can be re-written to

replace for productivity in the production function. Combined with observed variation in the prices

of labor and intermediate inputs, they are able to estimate the parameters of the production function

and productivity.

Doraszelski and Jaumandreu (2015) extend the methodology developed in Doraszelski and

Jaumandreu (2013) to estimate productivity when it is non-Hicks neutral, for a CES production

function. By exploiting the first order conditions for both labor and intermediate inputs they are

able to estimate a standard Hicks-neutral and a labor-augmenting component to productivity.

Grieco, Li, and Zhang (2014) also use first order conditions for both labor and intermediate

inputs to recover multiple unobservables. In the presence of unobserved heterogeneity in interme-

diate input prices, they show that the parametric cross-equation restrictions between the production

function and the two first-order conditions, combined with observed wages, can be exploited to es-

timate the production function and recover the unobserved intermediate input prices. They also

show that their approach can be extended to account for the composition of intermediate inputs

and the associated (unobserved) component prices.

The paper most closely related to ours is Griliches and Ringstad (1971), which exploits the

relationship between the first order condition for a flexible input and the production function in

a Cobb-Douglas parametric setting. They use the average revenue share of the flexible input to

measure the output elasticity of flexible inputs. This combined with the log-linear form of the

Cobb-Douglas production function, allows them to then subtract out the term involving flexible in-

necessary is typically assumed in order to avoid imputing a rental price of capital. Third, and most importantly for our comparison, there are no ex-post shocks to output. Ex-post shocks can be accommodated in the index number framework only by assuming that elasticities are constant across firms, i.e., by imposing the parametric structure of Cobb-Douglas.


puts. Finally, under the assumption that the non-flexible inputs are predetermined and uncorrelated

with productivity (not just the innovation), they estimate the coefficients for the predetermined

inputs.
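The first two steps of the Griliches and Ringstad (1971) strategy described above can be sketched as follows. This is a minimal illustration under a Cobb-Douglas assumption, not their original implementation; the function names and the toy numbers are hypothetical.

```python
def average_revenue_share(expenditures, revenues):
    """Mean of firm-level revenue shares of the flexible input (expenditure / revenue)."""
    shares = [e / r for e, r in zip(expenditures, revenues)]
    return sum(shares) / len(shares)

def net_of_flexible_input(log_output, log_flexible, elasticity):
    """Subtract the flexible-input term from log output (log-linear Cobb-Douglas form)."""
    return [y - elasticity * m for y, m in zip(log_output, log_flexible)]

# Hypothetical data for two firms.
beta_m = average_revenue_share([5.0, 7.0], [10.0, 10.0])
residual_output = net_of_flexible_input([2.0, 3.0], [1.0, 2.0], beta_m)
```

The remaining step, regressing the residual output on the predetermined inputs, is omitted here.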

Our identification solution can be seen as a nonparametric generalization of the Griliches and

Ringstad (1971) empirical strategy. Instead of using the Cobb-Douglas restriction, our share equa-

tion (15) uses revenue shares to recover input elasticities in a fully nonparametric setting. In

addition, rather than subtract out the effect of intermediate inputs from the production function,

we instead integrate up the intermediate input elasticity and take advantage of the nonparametric

cross-equation restrictions between the share equation and the production function. Furthermore,

we allow for predetermined inputs to be correlated with productivity, but uncorrelated with just the

innovation to productivity.

7.3 Dynamic Panel

An alternative approach employed in the empirical literature is to use the dynamic panel estima-

tors of Arellano and Bond (1991) and Blundell and Bond (1998, 2000). Under a linear parametric

restriction on the evolution of ωt, these methods take advantage of the conditional moment re-

strictions implied by Assumption (2), which allows for the use of appropriately lagged inputs as

instruments.

If one is willing to step outside the model described in Section 2 and assume that all inputs are

not flexible (i.e., rule out the existence of flexible inputs) or that no flexible input satisfies the proxy

variable assumption,27 and in addition assume a version of Assumption 4 that includes all inputs,

then it may be possible to show that these dynamic panel methods can identify the production

function and productivity. However, the bulk of empirical work based on production function

estimation has focused on environments in which some inputs are non-flexible and some inputs

are flexible. It is this setting that motivates our problem and distinguishes our approach from the

27 Note that not satisfying the proxy variable assumption does not guarantee identification in the presence of a flexible input. For example, unobserved serially correlated intermediate input price shocks violate the proxy variable assumption. However, this variation generates a measurement problem, since intermediate inputs are typically measured in expenditures.


dynamic panel literature.

8 Value Added

A common empirical approach is to employ a value-added production function by relating a mea-

sure of the output of a firm to a function of capital and labor only. Typically output is measured

empirically as the “value added” by the firm (i.e., the value of gross output minus expenditures

on intermediate inputs).28 One potential advantage of this approach is that, by excluding inter-

mediate inputs from the production function, it avoids the identification problem associated with

intermediate inputs that we describe in Section 3.

The use of value added is typically motivated in one of two ways. First, a researcher may feel

that a value-added function is a better model of the production process. For example, suppose

there is a lot of heterogeneity in the degree of vertical integration within an industry, with firms

outsourcing varying degrees of the production process, as a result of the production function being

heterogeneous in how intermediate inputs enter. In this case, a researcher may feel that focusing

on just the contributions of capital and labor (to the value added by the firm) is preferred to a

gross output specification including intermediate inputs. Under this setup, if either capital or labor

are flexibly chosen, then our non-identification arguments in Section 3 still apply. Similarly, our

identification arguments also apply, and one can use our proposed identification/estimation strategy

for this model as well.

The second motivation is based on the idea that a value-added function can be constructed from

an underlying gross output production function. This value-added function can then be used to

recover objects of interest from the underlying gross output production function, such as firm-level

productivity $e^{\omega_t+\varepsilon_t}$ and certain features of the production technology (e.g., output elasticities of

inputs) with respect to the “primary inputs”, capital and labor. This approach is typically justified

either via the restricted profit function or by using structural production functions. As we discuss in

more detail below, under the model described in Section 2, in general, neither justification allows

28One exception is Ackerberg, Caves, and Frazer (2015) which uses gross output as the output measure.


for a value added production function to be isolated from the gross output production function.

Regardless of the motivation for value added, the objects from a value-added specification,

particularly productivity, will be fundamentally different than those from gross output. Under the

first motivation, this is because productivity from a primitively specified value-added setup mea-

sures differences in value added holding capital and labor fixed, as opposed to differences in gross

output holding all inputs fixed. The results in this section show that under the second motivation,

the value-added objects cannot generally be mapped into their gross output counterparts if all one

has are the value-added objects. A key exception is the linear in intermediate inputs Leontief spec-

ification that we discuss below, a version of which is employed by Ackerberg, Caves, and Frazer

(2015).

8.1 Restricted Profit Value-Added

The first approach to relating gross output to value added is based on the duality results in Bruno

(1978) and Diewert (1978). We first briefly discuss their original results, which were derived under

the assumption that intermediate inputs are flexibly chosen, but excluding the ex-post shocks. In

this case, they show that by replacing intermediate inputs with their optimized value in the profit

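function, the empirical measure of value added, $VA^E_t \equiv Y_t - M_t$, can be expressed as:

$$VA^E_t = F\left(k_t, l_t, M\left(k_t, l_t, \omega_t\right)\right) e^{\omega_t} - M\left(k_t, l_t, \omega_t\right) \equiv V\left(k_t, l_t, \omega_t\right), \tag{24}$$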

where we use V (·) to denote value-added in this setup.29 This formulation is sometimes referred

to as the restricted profit function (see Lau, 1976; Bruno, 1978; McFadden, 1978).

In an index number framework, Bruno (1978) shows that elasticities of gross output with re-

spect to capital, labor, and productivity can be locally approximated by multiplying estimates of the

value-added counterparts by the firm-level ratio of value added to gross output, $\frac{VA^E_t}{GO_t} = (1 - S_t)$,

29 Technically, $VA^E_t \equiv \frac{P_t Y_t}{\bar{P}_t} - \frac{\rho_t M_t}{\bar{\rho}_t}$, where $\bar{P}_t$ and $\bar{\rho}_t$ are the price deflators for output and intermediate inputs, respectively. The ratio $\frac{P_t}{\bar{P}_t}$ is equal to the output price in the base year, $P_{BASE}$, and similarly for the price of intermediate inputs. Since $P_{BASE}$ and $\rho_{BASE}$ are constants, they get subsumed in the constants in the $F$ and $M$ functions. For ease of notation, we normalize these constants to 1.


where GO stands for gross output.30 For productivity, the result is as follows:

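$$\left(\mathrm{elas}^{GO_t}_{e^{\omega_t}}\right) = \left(\mathrm{elas}^{VA^E_t}_{e^{\omega_t}}\right)\left(\frac{VA^E_t}{GO_t}\right) = \left(\mathrm{elas}^{VA^E_t}_{e^{\omega_t}}\right)\left(1 - S_t\right) \tag{25}$$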

See Online Appendix O2 for the details of this derivation. Analogous results hold for the elasticities

with respect to capital and labor by replacing $e^{\omega_t}$ with $K_t$ or $L_t$.

While this derivation suggests that estimates from the restricted-profit value-added function can

be simply multiplied by $(1 - S_t)$ to recover estimates from the underlying gross output production

function, there are several important problems with the relationship in equation (25). First, this

approach is based on a local approximation. While this may work well for small changes in

productivity, for example looking at productivity growth rates (the original context under which

these results were derived), it may not work well for large differences in productivity, such as

analyzing cross-sectional productivity differences.

Second, this approximation does not account for ex-post shocks to output. As we show in

Online Appendix O2, when ex-post shocks are accounted for, the relationship in equation (25)

becomes:

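$$\underbrace{\left(\frac{\partial GO_t}{\partial e^{\omega_t}} \frac{e^{\omega_t}}{GO_t}\right)}_{\mathrm{elas}^{GO_t}_{e^{\omega_t}}} = \underbrace{\left(\frac{\partial VA^E_t}{\partial e^{\omega_t}} \frac{e^{\omega_t}}{VA^E_t}\right)}_{\mathrm{elas}^{VA^E_t}_{e^{\omega_t}}} \left(1 - S_t\right) + \left[\frac{\partial M_t}{\partial e^{\omega_t}} \frac{e^{\omega_t}}{GO_t} \left(\frac{e^{\varepsilon_t}}{\mathcal{E}} - 1\right)\right] \tag{26}$$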

The term in brackets is the bias introduced due to the ex-post shock. Ex-post shocks drive a wedge

between the local equivalence of value added and gross output objects. Analogous results for the

output elasticities of capital and labor can be similarly derived.
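The rescaling in equation (25) and the wedge in equation (26) can be illustrated numerically. The sketch below simply evaluates the two formulas; the function and argument names are hypothetical, `A` stands for $e^{\omega_t}$, `E` for the constant $\mathcal{E}$, and the inputs would have to come from estimates that are not shown here.

```python
import math

def rescaled_elasticity(elas_va, intermediate_share):
    """Equation (25): value-added elasticity times (1 - S_t)."""
    return elas_va * (1.0 - intermediate_share)

def ex_post_wedge(dM_dA, A, gross_output, eps, E):
    """Bracketed term in equation (26): (dM/dA) * (A / GO) * (exp(eps) / E - 1)."""
    return dM_dA * (A / gross_output) * (math.exp(eps) / E - 1.0)

def gross_output_elasticity(elas_va, intermediate_share, dM_dA, A, gross_output, eps, E):
    """Equation (26): rescaled value-added elasticity plus the ex-post wedge."""
    return (rescaled_elasticity(elas_va, intermediate_share)
            + ex_post_wedge(dM_dA, A, gross_output, eps, E))

# Hypothetical numbers: the wedge vanishes only when exp(eps) equals E.
no_shock = gross_output_elasticity(2.0, 0.6, 0.5, 1.0, 2.0, eps=0.0, E=1.0)
with_shock = gross_output_elasticity(2.0, 0.6, 0.5, 1.0, 2.0, eps=0.1, E=1.0)
```

With these inputs the two results differ only through the bracketed term, which is the point of equation (26).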

As a result of the points discussed above, estimates from the restricted profit value-added func-

tion cannot simply be “transformed” by re-scaling with the firm-specific share of intermediate

inputs to obtain estimates of the underlying production function and productivity. How much of a

difference this makes is ultimately an empirical question, which we address in Section 9. Preview-

30 These results were originally derived under a general form of technical change. We have augmented the results here to correspond to the standard setup with Hicks-neutral technical change as discussed in Section 2.1.


ing our results, we find that re-scaling using the shares, as suggested by equation (25), performs

poorly.

8.2 “Structural” Value-Added

The second approach to connecting gross output to value added is based on specific parametric as-

sumptions on the production function, such that a value-added production function of only capital,

labor, and productivity can be both isolated and measured (see Sims, 1969 and Arrow, 1972). We

refer to this version of value added as the “structural value-added production function”.

The empirical literature on value-added production functions often appeals to the extreme case

of perfect complements (i.e., Leontief). A standard representation is:

$$Y_t = \min\left[H\left(K_t, L_t\right), C\left(M_t\right)\right] e^{\omega_t+\varepsilon_t}, \tag{27}$$

where C (·) is a monotone increasing and concave function. The main idea underlying the Leontief

justification is that, under the assumption that

$$H\left(K_t, L_t\right) = C\left(M_t\right), \tag{28}$$

the right hand side of equation (27) can be written as $H(K_t, L_t) e^{\omega_t+\varepsilon_t}$, a function that does not

depend on intermediate inputs M . The key problem with this approach is that, given the assump-

tions of the model, the relation in equation (28) will not generally hold. Unless capital or labor

is assumed to be flexible, firms either cannot adjust them in period t or can only do so with some

positive adjustment cost. The key consequence is that firms may optimally choose to not equate

H(Kt, Lt) and C (Mt), i.e., it may be optimal for the firm to hold onto a larger stock of K and L

than can be combined with M , if K and L are both costly (or impossible) to downwardly adjust.31
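A small numerical sketch of this point, under assumptions that are ours for illustration: $C(M_t) = M_t^{0.5}$, capital and labor fixed within the period, revenue equal to price times `scale` times $\min[H, M^{0.5}]$, and cost equal to the input price times $M$. The name `scale` is a hypothetical stand-in for the multiplicative productivity term, treated as known to the firm for simplicity.

```python
import math

def optimal_intermediate_input(H, price, input_price, scale=1.0):
    """Profit-maximizing M when revenue is price*scale*min(H, sqrt(M)) and cost is input_price*M.

    K and L (and hence H) are treated as fixed within the period.
    """
    # Interior first-order condition: price * scale * 0.5 / sqrt(M) = input_price.
    target = 0.5 * scale * price / input_price  # sqrt(M) at the interior solution
    # Past sqrt(M) = H, extra M adds cost but no output, so sqrt(M) is capped at H.
    root_m = min(target, H)
    return root_m ** 2

H = 2.0
m_low = optimal_intermediate_input(H, price=1.0, input_price=1.0)    # interior: sqrt(M) = 0.5
m_high = optimal_intermediate_input(H, price=10.0, input_price=1.0)  # capped: sqrt(M) = H
```

In the first case the firm chooses $C(M_t) < H(K_t, L_t)$, so the equality in equation (28) fails even though the firm is optimizing.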

31 For example, suppose $C(M_t) = M_t^{0.5}$. For simplicity, also suppose that capital and labor are fixed one period ahead, and therefore cannot be adjusted in the short-run. When $M_t^{0.5} \leq H(K_t, L_t)$, marginal revenue with respect to intermediate inputs equals $\frac{\partial C(M_t)}{\partial M_t} a P_t$. When $M_t^{0.5} > H(K_t, L_t)$, increasing $M_t$ does not increase output due to the Leontief structure, so marginal revenue is zero. Marginal cost in both cases equals the price of intermediate inputs $\rho_t$. The firm’s optimal choice of $M$ is therefore given by $M_t = \left(\frac{P_t}{\rho_t} 0.5a\right)^2$, if $\left(\frac{P_t}{\rho_t} 0.5a\right) < H(K_t, L_t)$. But when


An exception to this, as discussed in Ackerberg, Caves, and Frazer (2015), is when C (·) is

linear (i.e., C (Mt) = aMt). In this case the relation in equation (28) will hold, the right hand side

of equation (27) can be written as a function of only capital, labor, and productivity, and we have

that

$$Y_t = H\left(K_t, L_t\right) e^{\omega_t+\varepsilon_t}. \tag{29}$$

This does not imply though that $VA^E$ can be used to measure the structural value-added production

function, as $VA^E_t \equiv Y_t - M_t$ will not be proportional to the value-added production function

$H(K_t, L_t) e^{\omega_t+\varepsilon_t}$.32 Equation (29), however, suggests that gross output could be used on the left

hand side to measure the structural value-added production function. This is in fact a version of

what Ackerberg, Caves, and Frazer (2015) suggest in estimating a structural value-added produc-

tion function based on an underlying Leontief gross output production function.

9 Data and Application

In the previous section we showed that value-added production functions capture fundamentally

different objects compared to gross output. A natural question that arises is whether these differ-

ences are relevant empirically. A recent survey paper by Syverson (2011) states that many results

in the productivity literature are quite robust to alternative measurement approaches. He attributes

this to the idea that the underlying variation at the firm level is so large that it dominates any dif-

ferences due to measurement. This suggests that whether a researcher uses a value-added or gross

output specification should not change any substantive conclusions related to productivity. In this

section we show that, not only do the two approaches of gross output and value added produce fun-

damentally different patterns of productivity empirically, in many cases the differences are quite

large and lead to very different conclusions regarding the relationship between productivity and

other dimensions of firm heterogeneity.

$\left(\frac{P_t}{\rho_t} 0.5a\right) > H(K_t, L_t)$, the firm no longer finds it optimal to set $H(K_t, L_t) = C(M_t)$, and prefers to hold onto excess capital and labor.

32 In Online Appendix O2 we also show that moving $\omega_t$ inside of the min function and/or $\varepsilon_t$ inside of the min function presents a similar set of issues.


We quantify the effect of using a value-added rather than gross output specification using two

commonly employed plant-level manufacturing datasets. The first dataset comes from the Colom-

bian manufacturing census covering all manufacturing plants with more than 10 employees from

1981-1991. This dataset has been used in several studies, including Roberts and Tybout (1997),

Clerides, Lach, and Tybout (1998), and Das, Roberts, and Tybout (2007). The second dataset

comes from the census of Chilean manufacturing plants conducted by Chile’s Instituto Nacional

de Estadística (INE). It covers all firms from 1979-1996 with more than 10 employees. This dataset

has also been used extensively in previous studies, both in the production function estimation liter-

ature (LP) and in the international trade literature (Pavcnik, 2002 and Alvarez and López, 2005).33

We estimate separate production functions for the five largest 3-digit manufacturing industries

in both Colombia and Chile, which are Food Products (311), Textiles (321), Apparel (322), Wood

Products (331), and Fabricated Metal Products (381). We also estimate an aggregate specification

grouping all manufacturing together. We estimate the production function in two ways.34 First, us-

ing our approach from Section 6, we estimate a gross output production function using a complete

polynomial series of degree 2 for both the elasticity and the integration constant in the production

function.35 That is, adding back in the firm subscripts j for clarity, we use

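$$D^{\mathcal{E}}_2\left(k_{jt}, l_{jt}, m_{jt}\right) = \gamma'_0 + \gamma'_k k_{jt} + \gamma'_l l_{jt} + \gamma'_m m_{jt} + \gamma'_{kk} k_{jt}^2 + \gamma'_{ll} l_{jt}^2 + \gamma'_{mm} m_{jt}^2 + \gamma'_{kl} k_{jt} l_{jt} + \gamma'_{km} k_{jt} m_{jt} + \gamma'_{lm} l_{jt} m_{jt}$$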

33 We construct the variables adopting the convention used by Greenstreet (2007) with the Chilean dataset, and employ the same approach with the Colombian dataset. In particular, real gross output is measured as deflated revenues. Intermediate inputs are formed as the sum of expenditures on raw materials, energy (fuels plus electricity), and services. Real value added is the difference between real gross output and real intermediate inputs, i.e., double deflated value added. Labor input is measured as a weighted sum of blue collar and white collar workers, where blue collar workers are weighted by the ratio of the average blue collar wage to the average white collar wage. Capital is constructed using the perpetual inventory method where investment in new capital is combined with deflated capital from period t − 1 to form capital in period t. Deflators for Colombia are obtained from Pombo (1999) and deflators for Chile are obtained from Bergoeing, Hernando, and Repetto (2003).

34 For all of the estimates we present, we obtain standard errors by using the nonparametric block bootstrap with 200 replications.

35 We also experimented with higher-order polynomials, and the results were very similar. In a few industries (specifically those with the smallest number of observations) the results are slightly more heterogeneous, as expected.


to estimate the intermediate input elasticity and

$$\mathscr{C}_2\left(k_{jt}, l_{jt}\right) = \alpha_k k_{jt} + \alpha_l l_{jt} + \alpha_{kk} k_{jt}^2 + \alpha_{ll} l_{jt}^2 + \alpha_{kl} k_{jt} l_{jt}$$

for the constant of integration. Putting all the elements together, the gross output production func-

tion we estimate is given by:

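$$y_{jt} = \left(\begin{array}{l} \gamma_0 + \gamma_k k_{jt} + \gamma_l l_{jt} + \frac{\gamma_m}{2} m_{jt} + \gamma_{kk} k_{jt}^2 + \gamma_{ll} l_{jt}^2 \\ + \frac{\gamma_{mm}}{3} m_{jt}^2 + \gamma_{kl} k_{jt} l_{jt} + \frac{\gamma_{km}}{2} k_{jt} m_{jt} + \frac{\gamma_{lm}}{2} l_{jt} m_{jt} \end{array}\right) m_{jt} - \alpha_k k_{jt} - \alpha_l l_{jt} - \alpha_{kk} k_{jt}^2 - \alpha_{ll} l_{jt}^2 - \alpha_{kl} k_{jt} l_{jt} + \omega_{jt} + \varepsilon_{jt}, \tag{30}$$

since $y_{jt} = \int \frac{D^{\mathcal{E}}\left(l_{jt}, k_{jt}, m_{jt}\right)}{\mathcal{E}} dm_{jt} - \mathscr{C}\left(k_{jt}, l_{jt}\right) + \omega_{jt} + \varepsilon_{jt}$.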
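The integration step behind equation (30) can be checked numerically. The sketch below assumes, as our reading of the two displays, that the unprimed γ coefficients are the primed share-equation coefficients divided by the constant $\mathcal{E}$; the coefficient values and the evaluation point are hypothetical. It verifies that differentiating the bracketed term times $m_{jt}$ with respect to $m_{jt}$ returns the degree-2 elasticity polynomial.

```python
def elasticity(k, l, m, g):
    """Degree-2 polynomial for the intermediate-input elasticity (coefficients already divided by E)."""
    return (g["0"] + g["k"] * k + g["l"] * l + g["m"] * m
            + g["kk"] * k**2 + g["ll"] * l**2 + g["mm"] * m**2
            + g["kl"] * k * l + g["km"] * k * m + g["lm"] * l * m)

def integrated_elasticity(k, l, m, g):
    """Integral of the polynomial over m: the bracketed term in equation (30) times m."""
    return (g["0"] + g["k"] * k + g["l"] * l + g["m"] / 2 * m
            + g["kk"] * k**2 + g["ll"] * l**2 + g["mm"] / 3 * m**2
            + g["kl"] * k * l + g["km"] / 2 * k * m + g["lm"] / 2 * l * m) * m

# Hypothetical coefficients and evaluation point.
g = {"0": 0.5, "k": 0.01, "l": 0.02, "m": -0.03, "kk": 0.001,
     "ll": 0.002, "mm": 0.003, "kl": -0.001, "km": 0.004, "lm": -0.002}
k, l, m, h = 1.2, 0.7, 2.0, 1e-5

# Central finite difference of the integral with respect to m.
numerical_derivative = (integrated_elasticity(k, l, m + h, g)
                        - integrated_elasticity(k, l, m - h, g)) / (2 * h)
```

The divisors 2 and 3 on the terms involving $m_{jt}$ are what make the derivative match; the terms in $k_{jt}$ and $l_{jt}$ alone carry through unchanged.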

Second, we estimate a value-added specification using the commonly-applied method devel-

oped by ACF, also using a complete polynomial series of degree 2:

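$$va_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + \beta_{kk} k_{jt}^2 + \beta_{ll} l_{jt}^2 + \beta_{kl} k_{jt} l_{jt} + \upsilon_{jt} + \varepsilon_{jt}, \tag{31}$$

where $\upsilon_{jt} + \varepsilon_{jt}$ represents productivity in the value-added model.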

In Table 1 we report estimates of the average output elasticities for each input, as well as

the sum, for both the value-added and gross output models.36 In every case but one, the value-

added model generates a sum of elasticities that is larger relative to gross output, with an average

difference of 2% in Colombia and 6% in Chile.

We also report the ratio of the mean capital and labor elasticities, which measures the capital in-

tensity (relative to labor) of the production technology in each industry. In general, the value-added

estimates of the capital intensity of the technology are larger relative to gross output, although the

differences are small. According to both measures, the Food Products (311) and Textiles (321)

industries are the most capital intensive in Colombia, and in Chile the most capital intensive are

36 The distributions of the estimated firm-specific output elasticities are quite reasonable. For each industry, less than 2% are outside of the range (0,1) for labor and intermediate inputs. For capital, the elasticities are closer to zero, but even in the worst case, less than 9.4% have values below zero. Not surprisingly, these percentages are highest among the industries with the smallest number of observations.


Food Products (311), Textiles (321), and Fabricated Metals (381). In both countries, Apparel (322)

and Wood Products (331) are the least capital intensive industries, even compared to the aggregate

specification denoted “All” in the tables.

Value added also recovers dramatically different patterns of productivity as compared to gross

output. Following OP, we define productivity (in levels) as the sum of the persistent and unanticipated

components: $e^{\omega+\varepsilon}$.37 In Table 2 we report estimates of several frequently analyzed statistics

of the resulting productivity distributions. In the first three rows of each panel we report ratios

of percentiles of the productivity distribution, a commonly used measure of productivity disper-

sion. There are two important implications of these results. First, value added suggests a much

larger amount of heterogeneity in productivity across plants within an industry, as the various per-

centile ratios are much smaller under gross output. For Colombia, the average 75/25, 90/10, and

95/5 ratios are 1.88, 3.69, and 6.41 under value added, and 1.33, 1.78, and 2.23 under gross out-

put. For Chile, the average 75/25, 90/10, and 95/5 ratios are 2.76, 8.02, and 17.93 under value

added, and 1.48, 2.20, and 2.95 under gross output. The value-added estimates imply that, with the

same amount of inputs, the 95th percentile plant would produce more than 6 times more output in

Colombia, and almost 18 times more output in Chile, than the 5th percentile plant. In stark con-

trast, we find that under gross output, the 95th percentile plant would produce only 2 times more

output in Colombia, and 3 times more output in Chile, than the 5th percentile plant with the same

inputs.
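The dispersion measures discussed above can be computed along the following lines. This is a sketch, not the code used for Table 2: the function names are hypothetical, and the linear-interpolation percentile convention is an assumption, since the text does not state which convention is used.

```python
import math

def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

def dispersion_ratios(log_productivity):
    """75/25, 90/10 and 95/5 ratios of productivity in levels, exp(omega + eps)."""
    levels = [math.exp(v) for v in log_productivity]
    return {f"{hi}/{lo}": percentile(levels, hi) / percentile(levels, lo)
            for hi, lo in [(75, 25), (90, 10), (95, 5)]}

# Hypothetical log productivities chosen so that the levels are 1, 2, 3, 4, 5.
ratios = dispersion_ratios([math.log(x) for x in [1, 2, 3, 4, 5]])
```

A ratio of, say, 4 for the 95/5 measure means the 95th percentile plant produces four times the output of the 5th percentile plant with the same inputs.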

In addition, the ranking of industries according to the degree of productivity dispersion is not

preserved moving from the value added to gross output estimates. For example, in Chile, the

Fabricated Metals industry (381) is found to have the smallest amount of productivity dispersion

under value added, but the largest amount of dispersion under gross output, for all three dispersion

measures.

The second important result is that value added also implies much more heterogeneity across

37 Since our interest is in analyzing productivity heterogeneity we conduct our analysis using productivity in levels. An alternative would be to measure productivity in logs. However, the log transformation is only a good approximation for measuring percentage differences in productivity across groups when these differences are small, which they are not in our data. We have also computed results based on log productivity. As expected, the magnitude of our results changes; however, our qualitative results comparing gross output and value added still hold. We have also computed results using just the persistent component of productivity, $e^{\omega}$. The results are qualitatively similar.


industries, which is captured by the finding that the range of the percentile ratios across industries

is much tighter using the gross output measure of productivity. For example, for the 95/5 ratio, the

value-added estimates indicate a range from 4.36 to 11.01 in Colombia and from 12.52 to 25.08

in Chile, whereas the gross output estimates indicate a range from 2.02 to 2.38 and from 2.48 to

3.31. The surprising aspect of these results is that the dispersion in productivity appears far more

stable both across industries and across countries when measured via gross output as opposed to

value added. In the conclusion we sketch some important policy implications of this finding for

empirical work on the misallocation of resources.

In addition to showing much larger overall productivity dispersion, results based on value

added also suggest a substantially different relationship between productivity and other dimen-

sions of plant-level heterogeneity. We examine several commonly-studied relationships between

productivity and other plant characteristics. In the last four rows of each panel in Table 2 we report

percentage differences in productivity based on whether plants export some of their output, import

intermediate inputs, have positive advertising expenditures, and pay above the median (industry)

level of wages.

Using the value-added estimates, for most industries exporters are found to be more productive

than non-exporters, with exporters appearing to be 83% more productive in Colombia and 14%

more productive in Chile across all industries. Using the gross output specification, these estimates

of productivity differences fall to 9% in Colombia and 3% in Chile, and actually turn negative

(although not statistically different from zero) in some cases.

A similar pattern exists when looking at importers of intermediate inputs. The average pro-

ductivity difference is 14% in Colombia and 41% in Chile using value added. However, under

gross output, these numbers fall to 8% and 13% respectively. The same story holds for differences

in productivity based on advertising expenditures. Moving from value added to gross output, the

estimated difference in productivity drops for most industries in Colombia, and for all industries

in Chile. In several cases it becomes statistically indistinguishable from zero.

Another striking contrast arises when we compare productivity between plants that pay wages

above versus below the industry median. Using the productivity estimates from a value-added

specification, firms that pay wages above the median industry wage are found to be substantially

more productive, with the estimated differences ranging from 34%-63% in Colombia and from

47%-123% in Chile. In every case the estimates are statistically significant. Using the gross output

specification, these estimates fall to 9%-22% in Colombia and 19%-30% in Chile, representing a

fall by a factor of 3, on average, in both countries.

Since intermediate input usage is likely to be positively correlated with productivity, we would

expect that including (excluding) intermediate inputs in the production function will lead to smaller

(larger) differences in productivity heterogeneity. Therefore, we would expect to see the largest

discrepancies between the value-added and gross output productivity heterogeneity estimates in

industries which are intensive in intermediate input usage. By looking at Tables 1 and 2 we can

confirm that, for the most part, this is the case. When comparing the value-added and gross output

productivity estimates, the largest differences tend to occur in the most intermediate input intensive

industries, which are Food Products (311) in Colombia and Food Products (311) and Wood

Products (331) in Chile. However, this is not always the case. For example, in Chile, the difference

between the gross output and value-added estimates of the average productivity gap between

advertisers and non-advertisers is actually the smallest in the Wood Products (331) industry.

In order to isolate the importance of the value-added/gross output distinction separately from

the effect of transmission bias, in Table 3 we repeat the above analysis without correcting for the

endogeneity of inputs. We examine the raw effects in the data by estimating productivity using

simple linear regression (OLS) to estimate both gross output and value-added specifications, using

a complete polynomial of degree 2. As can be seen from Table 3, the general pattern of results,

that value added leads to larger productivity differences across many dimensions, is similar to our

previous results both qualitatively and quantitatively.
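
To make the OLS specification concrete, the sketch below builds a complete polynomial of degree 2 in the log inputs, fits it by least squares, and averages the implied input elasticities across plants. It is a minimal illustration under stated assumptions: the helper names `poly2_design` and `average_elasticities` are hypothetical, and the data are simulated with exogenous inputs (so OLS is consistent by construction, which need not hold in plant-level data).

```python
from itertools import combinations_with_replacement

import numpy as np


def poly2_design(X):
    """Complete degree-2 polynomial: constant, linear terms, squares and cross products."""
    n, p = X.shape
    pairs = list(combinations_with_replacement(range(p), 2))
    cols = [np.ones(n)] + [X[:, j] for j in range(p)]
    cols += [X[:, i] * X[:, j] for i, j in pairs]
    return np.column_stack(cols), pairs


def average_elasticities(X, beta, pairs):
    """Average over plants of the derivative of fitted log output w.r.t. each log input."""
    n, p = X.shape
    grad = np.tile(beta[1:1 + p], (n, 1))          # linear coefficients
    for coef, (i, j) in zip(beta[1 + p:], pairs):  # second-order terms coef * x_i * x_j
        grad[:, i] += coef * X[:, j]
        grad[:, j] += coef * X[:, i]               # when i == j this yields 2 * coef * x_i
    return grad.mean(axis=0)


rng = np.random.default_rng(0)
n = 2_000
X = rng.normal(size=(n, 3))                        # hypothetical log capital, labor, intermediates
true_elasticities = np.array([0.1, 0.3, 0.6])      # hypothetical Cobb-Douglas technology
y = X @ true_elasticities + rng.normal(scale=0.1, size=n)

design, pairs = poly2_design(X)
beta = np.linalg.lstsq(design, y, rcond=None)[0]
avg_elast = average_elasticities(X, beta, pairs)
print(avg_elast)
```

Because the elasticities vary across plants once second-order terms are included, it is the plant-level average that corresponds to the entries reported in tables of average input elasticities.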

While the results in Table 3 may suggest that transmission bias is not empirically important,

in Table 4 we show evidence to the contrary. In particular, we report the average input elasticities

based on estimates for the gross output model using OLS and using our method to correct for

transmission bias. A well-known result is that failing to control for transmission bias leads to

overestimates of the coefficients on more flexible inputs. The intuition is that the more flexible

the input is, the more it responds to productivity shocks and the higher the degree of correlation

between that input and unobserved productivity. The estimates in Table 4 show that the OLS results

substantially overestimate the output elasticity of intermediate inputs in every case. The average

difference is 34%, which illustrates the importance of controlling for the endogeneity generated by

the correlation between input decisions and productivity.
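
The direction of this bias can be illustrated with a small simulation. The sketch below is a stylized toy design and not the Monte Carlo experiment reported in the Online Appendix: it assumes a Cobb-Douglas technology, capital and labor drawn independently of productivity, intermediate inputs chosen from the static first-order condition, and no variation in relative prices. All parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
a_k, a_l, a_m = 0.1, 0.3, 0.6           # hypothetical true output elasticities
sigma_eps = 0.1

k = rng.normal(size=n)                  # log capital, independent of productivity here
l = rng.normal(size=n)                  # log labor, independent of productivity here
omega = rng.normal(scale=0.3, size=n)   # productivity observed by the firm
eps = rng.normal(scale=sigma_eps, size=n)  # ex-post shock, unknown when inputs are chosen

# Static first-order condition for intermediates (in logs), with a common relative price:
#   log a_m + a_k*k + a_l*l + (a_m - 1)*m + omega + log E[exp(eps)] = log_rel_price
log_rel_price = 0.0
log_E = 0.5 * sigma_eps**2              # log E[exp(eps)] for normal eps
m = (np.log(a_m) + a_k * k + a_l * l + omega + log_E - log_rel_price) / (1.0 - a_m)

y = a_k * k + a_l * l + a_m * m + omega + eps

X = np.column_stack([np.ones(n), k, l, m])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
print(dict(zip(["const", "k", "l", "m"], beta.round(3))))
```

In this design the flexible input absorbs all of the variation in productivity, so the OLS coefficient on intermediates is pushed far above its true value of 0.6 while the coefficients on capital and labor are pushed toward zero. The magnitude is extreme because of the stylized assumptions; only the sign of the bias on the flexible input is the point of the illustration.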

An important implication of our results is that, while controlling for transmission bias certainly

has an effect, the use of value added versus gross output has a much larger effect on the productivity

estimates than transmission bias. This suggests that the use of gross output versus value added may

be more important from a policy perspective than controlling for the transmission bias that has

been the primary focus in the production function literature. Our approach enables the estimation

of gross output production functions by solving the identification problem associated with flexible

inputs while simultaneously correcting for transmission bias.

9.1 Robustness Checks

Adjusting the Value Added Estimates. As discussed in Section 8.1, in the absence of ex-post

shocks, the derivation provided in equation (25) suggests that the differences between gross output

and value added can be eliminated by re-scaling the value-added estimates by a factor equal to the

plant-level ratio of value added to gross output, i.e., one minus the share of intermediate inputs

in total output. While this idea has been known in the literature for a while, this re-scaling is

very rarely applied in practice.38 As shown in Section 8.1, there are several reasons why this

re-scaling may not work. In order to investigate how well the re-scaling of value added estimates

performs, we apply the transformation implied by equation (25) using the firm-specific ratio of

value added to gross output, $VA^E_t / GO_t$, a quantity readily available in the data. We find that this

re-scaling performs quite poorly in recovering the underlying gross output estimates of the production

function and productivity, leading to estimates that are in some cases even further from the gross

output estimates than the value-added estimates themselves.

38 See Petrin and Sivadasan (2013) for an example in which a version of this is implemented.

In Tables 5 and 6 we report the re-scaled estimates as well as the value-added estimates using

ACF and the gross output estimates using our method for comparison. At first glance, the re-scaling

appears to be working as many of the re-scaled value-added estimates move towards the

gross output estimates. However, in some cases, the estimates of dispersion and the relationship

between productivity and other dimensions of firm heterogeneity move only slightly towards the

gross output estimates, and remain very close to the original value-added estimates. Moreover,

in many cases the estimates overshoot the gross output estimates. Even worse, in some cases the

re-scaling moves them in the opposite direction and leads to estimates that are even further from

the gross output estimates than the original value-added estimates. Finally, in several cases, the

re-scaled estimates actually lead to a sign-reversal compared to both the value-added and gross

output estimates. Overall, while in some cases the re-scaling applied to the value-added estimates

moves them closer to the gross output estimates, it does a poor job of replicating the gross output

estimates, and in many cases moves them even further away.
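
The mechanics of the re-scaling itself are simple, as the sketch below shows: each value-added estimate is multiplied by the plant-level ratio of value added to gross output, which equals one minus the intermediate input share. The function name `rescale_value_added` and all numbers are hypothetical and chosen only to illustrate the arithmetic; the sketch does not reproduce equation (25) or the estimates in Tables 5 and 6.

```python
import numpy as np


def rescale_value_added(va_estimate, value_added, gross_output):
    """Multiply a value-added estimate by the plant-level ratio VA / GO."""
    factor = np.asarray(value_added, dtype=float) / np.asarray(gross_output, dtype=float)
    return va_estimate * factor, factor


# Hypothetical plant-level data for two plants.
gross_output = np.array([100.0, 200.0])
intermediates = np.array([60.0, 100.0])
value_added = gross_output - intermediates

va_elasticity = 0.8  # hypothetical value-added elasticity
rescaled, factor = rescale_value_added(va_elasticity, value_added, gross_output)
print(factor)    # equals one minus the intermediate input share of each plant
print(rescaled)  # re-scaled elasticity, plant by plant
```

Because the factor varies across plants with the intermediate input share, the re-scaled estimates inherit any noise or ex-post shocks contained in observed shares, which is one reason the adjustment need not recover the gross output estimates.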

Alternative Assumptions on Flexible Inputs. Our new identification and estimation strategy

takes advantage of the first-order condition with respect to a flexible input. We have used

intermediate inputs (the sum of raw materials, energy, and services) as the flexible input, as intermediate

inputs have been commonly assumed to be flexible in the literature. We believe that this is a

reasonable assumption because a) the model period is typically a year and b) what is required is that

they can be adjusted flexibly at the margin. To the extent that spot markets for commodities exist,

including energy and certain raw materials, this enables firms to make such adjustments. However,

it may be the case that in some applications, researchers do not want to assume that all intermediate

inputs are flexible, or they may want to test the sensitivity of their estimates to this assumption.

First, in order to investigate the robustness of our estimator, in Online Appendix O1 we present

results from a Monte Carlo experiment in which we introduce adjustment frictions for the flexible

input. We then estimate the production function using our estimator, assuming the first-order

condition holds. We design the simulation so that dynamic panel methods should work well in the

presence of adjustment costs, and compare the performance of our method to a version of dynamic

panel.

We find that our method performs remarkably well overall, even for large values of adjustment

costs. As expected, when there are no adjustment costs, our method recovers the true elasticities

of the production function, and dynamic panel breaks down. As we increase adjustment costs, our

method continues to outperform dynamic panel for small values, performs similarly for

intermediate values, and does only marginally worse for large values.

Second, as a robustness check on our results, we estimate two different specifications of our

model in which we allow some of the components of intermediate inputs to be non-flexible. In

particular, the production function we estimate is of the form $F(k_t, l_t, rm_t, ns_t)e^{\omega_t+\varepsilon_t}$, where rm

denotes raw materials and ns denotes energy plus services. In one specification we assume rm

to be non-flexible and ns to be flexible, and in the other specification we assume the opposite.

See Appendix D for these results. Overall the results are sensible and qualitatively similar to our

main results. In addition, the relationship between productivity estimates based on gross output

and value added is very similar to the one in the main set of results.

Fixed Effects. As we detail in Appendix C, our identification and estimation strategy can be

easily extended to incorporate fixed effects in the production function. The production function

allowing for fixed effects, a, can be written as $Y_t = F(k_t, l_t, m_t)e^{a+\omega_t+\varepsilon_t}$.39 A common drawback

of models with fixed effects is that the differencing of the data needed to subtract out the fixed

effects can remove a large portion of the identifying information in the data. In the context of

production functions, this often leads to estimates of the capital coefficient and returns to scale that

are unrealistically low, as well as large standard errors (see GM).

In Appendix D, we report estimates corresponding to those in Tables 1 and 2, using our method

to estimate the gross output production function allowing for fixed effects. The elasticity estimates

for intermediate inputs are exactly the same as in the specification without fixed effects, as the first

stage of our approach does not depend on the presence of fixed effects. We do find some evidence

in Colombia of the problems mentioned above as the sample sizes are smaller than those for Chile.

Despite this, the estimates are very similar to those from the main specification for both countries,

and the larger differences are associated with larger standard errors.

39 See Kasahara, Schrimpf, and Suzuki (2015) for an important extension of our approach to the general case of firm-specific production functions.

Extra Unobservables. As we show in Appendix C, our approach can also be extended to

incorporate additional unobservables driving the intermediate input demand. Specifically, we allow for

an additional unobservable in the share equation for the flexible input (e.g., optimization error).

This introduces some small changes to the identification and estimation procedure, but the core

ideas are unchanged. In Appendix D we report estimates from this alternative specification. Our

results are remarkably robust. The standard errors increase slightly, which is not surprising given

that we have introduced an additional unobservable into the model. The point estimates, however,

are very similar.

10 Conclusion

In this paper we show that the nonparametric identification of production functions in the presence

of both flexible and non-flexible inputs has remained an unresolved issue. We offer a new

identification strategy that closes this loop. The key to our approach is exploiting the nonparametric

cross-equation restrictions between the first-order condition for the flexible inputs and the production

function.

Our empirical analysis demonstrates that value added can generate substantially different

patterns of productivity heterogeneity as compared to gross output, which suggests that empirical

studies of productivity based on value added may lead to fundamentally different policy

implications compared to those based on gross output. To illustrate this possibility, consider the recent

literature that uses productivity dispersion to explain cross-country differences in output per worker

through resource misallocation. As an example, the recent influential paper by Hsieh and Klenow

(2009) finds substantial heterogeneity in productivity dispersion (defined as the variance of log

productivity) across countries as measured using value added. In particular, when they compare

the United States with China and India, the variance of log productivity ranges from 0.40-0.55

for China and 0.45-0.48 for India, but only from 0.17-0.24 for the United States. They then use

this estimated dispersion to measure the degree of misallocation of resources in the respective

economies. In their main counterfactual they find that, by reducing the degree of misallocation in

China and India to that of the United States, aggregate TFP would increase by 30%-50% in China

and 40%-60% in India. In our datasets for Colombia and Chile the corresponding estimates of

the variance in log productivity using a value-added specification are 0.43 and 0.94, respectively.

Thus their analysis applied to our data would suggest that there is similar room for improvement

in aggregate TFP in Colombia, and much more in Chile.

However, when productivity is measured using our gross output framework, our empirical

findings suggest a much different result. The variance of log productivity using gross output is

0.08 in Colombia and 0.15 in Chile. These significantly smaller dispersion measures could imply

that there is much less room for improvement in aggregate productivity for Colombia and Chile.

Since the 90/10 ratios we obtain for Colombia and Chile using gross output are quantitatively very

similar to the estimates obtained by Syverson (2004) for the United States (who also employed

gross output but in an index number framework), this also suggests that differences in the degree

of misallocation of resources between developed and developing countries may not be as large as

the analysis of Hsieh and Klenow (2009) implies.40

Exploring the role of gross output production functions for policy problems such as the one

above could be a fruitful direction for future research. A key message of this paper is that insights

derived under value added, compared to gross output, could lead to significantly different policy

conclusions. Our identification strategy provides researchers with a stronger foundation for using

gross output production functions in practice.

40 Hsieh and Klenow note that their estimate of log productivity dispersion for the United States is larger than previous estimates by Foster, Haltiwanger, and Syverson (2008) by a factor of almost 4. They attribute this to the fact that Foster, Haltiwanger, and Syverson use a selected set of homogeneous industries. However, another important difference is that Foster, Haltiwanger, and Syverson use gross output measures of productivity rather than value-added measures. Given our results in Section 9, it is likely that a large part of this difference is due to Hsieh and Klenow’s use of value added, rather than their selection of industries.

References

Ackerberg, Daniel, C. Lanier Benkard, Steven Berry, and Ariel Pakes. 2007. “Econometric Tools For Analyzing Market Outcomes.” In Handbook of Econometrics, vol. 6, edited by James J. Heckman and Edward E. Leamer. Amsterdam: Elsevier, 4171–4276.

Ackerberg, Daniel A., Kevin Caves, and Garth Frazer. 2015. “Identification Properties of Recent Production Function Estimators.” Econometrica 83 (6):2411–2451.

Alvarez, Roberto and Ricardo A. López. 2005. “Exporting and Performance: Evidence from Chilean Plants.” Canadian Journal of Economics 38 (4):1384–1400.

Arellano, Manuel and Stephen Bond. 1991. “Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations.” Review of Economic Studies 58 (2):277–297.

Arrow, Kenneth J. 1972. The Measurement of Real Value Added. Stanford: Institute for Mathematical Studies in the Social Sciences.

Baily, Martin N., Charles Hulten, and David Campbell. 1992. “Productivity Dynamics in Manufacturing Plants.” Brookings Papers on Economic Activity. Microeconomics :187–267.

Bartelsman, Eric J. and Mark Doms. 2000. “Understanding Productivity: Lessons from Longitudinal Microdata.” Finance and Economics Discussion Series 2000-2019, Board of Governors of the Federal Reserve System (U.S.).

Bergoeing, Raphael, Andrés Hernando, and Andrea Repetto. 2003. “Idiosyncratic Productivity Shocks and Plant-Level Heterogeneity.” Documentos de Trabajo 173, Centro de Economía Aplicada, Universidad de Chile.

Bernard, Andrew B., Jonathan Eaton, J. Bradford Jensen, and Samuel Kortum. 2003. “Plants and Productivity in International Trade.” American Economic Review 93 (4):1268–1290.

Bernard, Andrew B. and J. Bradford Jensen. 1995. “Exporters, Jobs, and Wages in U.S. Manufacturing: 1976-1987.” Brookings Papers on Economic Activity. Microeconomics :67–119.

———. 1999. “Exceptional Exporter Performance: Cause, Effect or Both?” Journal of International Economics 47 (1):1–25.

Blundell, Richard and Stephen Bond. 1998. “Initial Conditions and Moment Restrictions in Dynamic Panel Data Models.” Journal of Econometrics 87 (1):115–143.

———. 2000. “GMM Estimation with Persistent Panel Data: An Application to Production Functions.” Econometric Reviews 19 (3):321–340.

Bond, Stephen and Måns Söderbom. 2005. “Adjustment Costs and the Identification of Cobb Douglas Production Functions.” Unpublished Manuscript, The Institute for Fiscal Studies, Working Paper Series No. 05/4.

Bruno, Michael. 1978. “Duality, Intermediate Inputs, and Value-Added.” In Production Economics: A Dual Approach to Theory and Applications, vol. 2, edited by Melvyn Fuss and Daniel McFadden, chap. 1. Amsterdam: North-Holland.

Caves, Douglas W., Laurits R. Christensen, and W. Erwin Diewert. 1982. “The Economic Theory of Index Numbers and the Measurement of Input, Output, and Productivity.” Econometrica 50 (6):1393–1414.

Chen, Xiaohong. 2007. “Large Sample Sieve Estimation of Semi-Nonparametric Models.” In Handbook of Econometrics, vol. 6, edited by James J. Heckman and Edward E. Leamer. Amsterdam: Elsevier, 5549–5632.

Chen, Xiaohong and Demian Pouzo. 2015. “Sieve Quasi Likelihood Ratio Inference on Semi/Nonparametric Conditional Moment Models.” Econometrica 83 (3):1013–1079.

Clerides, Sofronis K., Saul Lach, and James R. Tybout. 1998. “Is Learning by Exporting Important? Micro-Dynamic Evidence from Colombia, Mexico, and Morocco.” The Quarterly Journal of Economics 113 (3):903–947.

Collard-Wexler, Allan. 2010. “Productivity Dispersion and Plant Selection in the Ready-Mix Concrete Industry.” NYU Stern working paper.

Cunha, Flavio, James J. Heckman, and Susanne M. Schennach. 2010. “Estimating the Technology of Cognitive and Noncognitive Skill Formation.” Econometrica 78 (3):883–931.

Das, Sanghamitra, Mark J. Roberts, and James R. Tybout. 2007. “Market Entry Costs, Producer Heterogeneity, and Export Dynamics.” Econometrica 75 (3):837–873.

De Loecker, Jan. 2011. “Product Differentiation, Multiproduct Firms, and Estimating the Impact of Trade Liberalization on Productivity.” Econometrica 79 (5):1407–1451.

Dhrymes, Phoebus J. 1991. The Structure of Production Technology Productivity and Aggregation Effects. US Department of Commerce, Bureau of the Census.

Diewert, W. Erwin. 1978. “Hicks’ Aggregation Theorem and the Existence of a Real Value Added Function.” In Production Economics: A Dual Approach to Theory and Applications, vol. 2, edited by Melvyn Fuss and Daniel McFadden, chap. 2. Amsterdam: North-Holland.

Doraszelski, Ulrich and Jordi Jaumandreu. 2013. “R&D and Productivity: Estimating Endogenous Productivity.” Review of Economic Studies 80 (4):1338–1383.

———. 2015. “Measuring the Bias of Technological Change.” Working Paper.

Foster, Lucia, John Haltiwanger, and Chad Syverson. 2008. “Reallocation, Firm Turnover, and Efficiency: Selection on Productivity or Profitability?” American Economic Review 98 (1):394–425.

Fox, Jeremy and Valérie Smeets. 2011. “Does Input Quality Drive Measured Differences in Firm Productivity?” International Economic Review 52 (4):961–989.

Goldberger, Arthur S. 1968. “The Interpretation and Estimation of Cobb-Douglas Functions.” Econometrica 35 (3-4):464–472.

Greenstreet, David. 2007. “Exploiting Sequential Learning to Estimate Establishment-Level Productivity Dynamics and Decision Rules.” Economics Series Working Papers 345, University of Oxford, Department of Economics.

Grieco, Paul, Shengyu Li, and Hongsong Zhang. 2014. “Production Function Estimation with Unobserved Input Price Dispersion.” Forthcoming, International Economic Review.

Griliches, Zvi and Jacques Mairesse. 1998. “Production Functions: The Search for Identification.” In Econometrics and Economic Theory in the Twentieth Century: The Ragnar Frisch Centennial Symposium. New York: Cambridge University Press, 169–203.

Griliches, Zvi and Vidar Ringstad. 1971. Economies of Scale and the Form of the Production Function: An Econometric Study of Norwegian Manufacturing Establishment Data. North-Holland Pub. Co. (Amsterdam).

Hansen, Lars Peter. 1982. “Large Sample Properties of Generalized Method of Moments Estimators.” Econometrica 50 (4):1029–1054.

Hansen, Lars Peter and Kenneth J. Singleton. 1982. “Generalized Instrumental Variables Estimation of Nonlinear Rational Expectations Models.” Econometrica 50 (5):1269–1286.

Heckman, James J. 1974. “Shadow Prices, Market Wages, and Labor Supply.” Econometrica 42 (4):679–694.

Heckman, James J., Rosa L. Matzkin, and Lars Nesheim. 2010. “Nonparametric Identification and Estimation of Nonadditive Hedonic Models.” Econometrica 78 (5):1569–1591.

Horowitz, Joel L. 2001. “The Bootstrap.” In Handbook of Econometrics, vol. 5, edited by James J. Heckman and Edward E. Leamer. Amsterdam: Elsevier, 3159–3228.

Houthakker, Hendrik S. 1950. “Revealed Preference and the Utility Function.” Economica 17 (66):159–174.

Hsieh, Chang-Tai and Peter J. Klenow. 2009. “Misallocation and Manufacturing TFP in China and India.” Quarterly Journal of Economics 124 (4):1403–1448.

Hu, Yingyao and Susanne M. Schennach. 2008. “Instrumental Variable Treatment of Nonclassical Measurement Error Models.” Econometrica 76 (1):195–216.

Kasahara, Hiroyuki and Joel Rodrigue. 2008. “Does the Use of Imported Intermediates Increase Productivity? Plant-level Evidence.” The Journal of Development Economics 87 (1):106–118.

Kasahara, Hiroyuki, Paul Schrimpf, and Michio Suzuki. 2015. “Identification and Estimation of Production Function with Unobserved Heterogeneity.” Working Paper.

Klein, Lawrence R. 1953. A Textbook of Econometrics. Evanston: Row, Peterson and Co.

Klein, Roger and Francis Vella. 2010. “Estimating a Class of Triangular Simultaneous Equations Models Without Exclusion Restrictions.” Journal of Econometrics 154 (2):154–164.

Klette, Tor Jacob and Zvi Griliches. 1996. “The Inconsistency of Common Scale Estimators When Output Prices are Unobserved and Endogenous.” Journal of Applied Econometrics 11 (4):343–361.

Laffont, Jean-Jacques and Quang Vuong. 1996. “Structural Analysis of Auction Data.” American Economic Review, P&P 86 (2):414–420.

Lau, Lawrence J. 1976. “A Characterization of the Normalized Restricted Profit Function.” Journal of Economic Theory 12 (1):131–163.

Levinsohn, James and Amil Petrin. 2003. “Estimating Production Functions Using Inputs to Control for Unobservables.” Review of Economic Studies 70 (2):317–342.

Lewbel, Arthur. 2012. “Using Heteroscedasticity to Identify and Estimate Mismeasured and Endogenous Regressor Models.” Journal of Business & Economic Statistics 30 (1):67–80.

Manski, Charles F. 2003. Partial Identification of Probability Distributions. New York: Springer-Verlag.

Marschak, Jacob and William H. Andrews. 1944. “Random Simultaneous Equations and the Theory of Production.” Econometrica 12 (3-4):143–205.

Matzkin, Rosa L. 2007. “Nonparametric Identification.” In Handbook of Econometrics, vol. 6, edited by James J. Heckman and Edward E. Leamer. Amsterdam: Elsevier, 5307–5368.

McFadden, Daniel. 1978. “Cost, Revenue, and Profit Functions.” In Production Economics: A Dual Approach to Theory and Applications, vol. 1, edited by Melvyn Fuss and Daniel McFadden, chap. 1. Amsterdam: North-Holland.

Newey, Whitney K. and Daniel McFadden. 1994. “Large Sample Estimation and Hypothesis Testing.” In Handbook of Econometrics, vol. 4, edited by Robert F. Engle and Daniel L. McFadden. Amsterdam: North-Holland, 2111–2245.

Newey, Whitney K. and James L. Powell. 2003. “Instrumental Variable Estimation of Nonparametric Models.” Econometrica 71 (5):1565–1578.

Newey, Whitney K., James L. Powell, and F. Vella. 1999. “Nonparametric Estimation of Triangular Simultaneous Equations Models.” Econometrica 67 (3):565–603.

Olley, G. Steven and Ariel Pakes. 1996. “The Dynamics of Productivity in the Telecommunications Equipment Industry.” Econometrica 64 (6):1263–1297.

Paarsch, Harry J. 1992. “Deciding Between the Common and Private Value Paradigms in Empirical Models of Auctions.” Journal of Econometrics 51 (1-2):191–215.

Pavcnik, Nina. 2002. “Trade Liberalization, Exit, and Productivity Improvements: Evidence from Chilean Plants.” Review of Economic Studies 69 (1):245–276.

Petrin, Amil and Jagadeesh Sivadasan. 2013. “Estimating Lost Output from Allocative Inefficiency, with an Application to Chile and Firing Costs.” The Review of Economics and Statistics 95 (1):286–301.

Pombo, Carlos. 1999. “Productividad Industrial en Colombia: Una Aplicacion de Numeros Indicey.” Revista de Economia del Rosario 2 (3):107–139.

Rigobon, Roberto. 2003. “Identification Through Heteroskedasticity.” Review of Economics and Statistics 85 (4):777–792.

Roberts, Mark J. and James R. Tybout. 1997. “The Decision to Export in Colombia: An Empirical Model of Entry with Sunk Costs.” American Economic Review 87 (4):545–564.

Robinson, Peter M. 1988. “Root-N-Consistent Semiparametric Regression.” Econometrica 56 (4):931–954.

Roehrig, Charles S. 1988. “Conditions for Identification in Nonparametric and Parametric Models.” Econometrica 56 (2):433–447.

Sims, Christopher A. 1969. “Theoretical Basis for a Double Deflated Index of Real Value Added.” The Review of Economics and Statistics 51 (4):470–471.

Solow, Robert M. 1957. “Technical Change and the Aggregate Production Function.” Review of Economics and Statistics 39 (3):312–320.

Stone, Richard. 1954. “Linear Expenditure Systems and Demand Analysis: An Application to the Pattern of British Demand.” The Economic Journal 64 (255):511–527.

Syverson, Chad. 2004. “Product Substitutability and Productivity Dispersion.” The Review of Economics and Statistics 86 (2):534–550.

———. 2011. “What Determines Productivity?” Journal of Economic Literature 49 (2):326–365.

Tao, Jing. 2015. “Inference for Point and Partially Identified Semi-Nonparametric Conditional Moment Models.” Working Paper.

Varian, Hal R. 1992. Microeconomic Analysis. New York: WW Norton.

Wooldridge, Jeffrey M. 2009. “On Estimating Firm-Level Production Functions Using Proxy Variables to Control for Unobservables.” Economics Letters 104 (3):112–114.

Table 1: Average Input Elasticities of Output
(Structural Estimates: Value Added vs. Gross Output)

Industry (ISIC Code)           Food Products          Textiles           Apparel     Wood Products Fabricated Metals               All
                                       (311)             (321)             (322)             (331)             (381)
                              Value    Gross    Value    Gross    Value    Gross    Value    Gross    Value    Gross    Value    Gross
                              Added   Output    Added   Output    Added   Output    Added   Output    Added   Output    Added   Output
                              (ACF)    (GNR)    (ACF)    (GNR)    (ACF)    (GNR)    (ACF)    (GNR)    (ACF)    (GNR)    (ACF)    (GNR)

Colombia
Labor                          0.70     0.22     0.65     0.32     0.83     0.42     0.86     0.44     0.89     0.43     0.78     0.35
                             (0.04)   (0.02)   (0.06)   (0.03)   (0.03)   (0.02)   (0.06)   (0.05)   (0.04)   (0.02)   (0.01)   (0.01)
Capital                        0.33     0.12     0.36     0.16     0.16     0.05     0.12     0.04     0.25     0.10     0.31     0.14
                             (0.02)   (0.01)   (0.04)   (0.02)   (0.02)   (0.01)   (0.04)   (0.02)   (0.03)   (0.01)   (0.01)   (0.01)
Intermediates                    --     0.67       --     0.54       --     0.52       --     0.51       --     0.53       --     0.54
                                      (0.01)            (0.01)            (0.01)            (0.01)            (0.01)            (0.00)
Sum                            1.03     1.01     1.01     1.01     0.99     0.99     0.98     0.99     1.14     1.06     1.09     1.04
                             (0.03)   (0.01)   (0.04)   (0.02)   (0.02)   (0.01)   (0.07)   (0.04)   (0.02)   (0.01)   (0.01)   (0.00)
Mean(Capital)/Mean(Labor)      0.47     0.55     0.55     0.49     0.19     0.12     0.14     0.08     0.28     0.23     0.39     0.40
                             (0.06)   (0.08)   (0.10)   (0.09)   (0.03)   (0.04)   (0.05)   (0.05)   (0.04)   (0.04)   (0.02)   (0.03)

Chile
Labor                          0.77     0.28     0.93     0.45     0.95     0.45     0.92     0.40     0.96     0.52     0.77     0.38
                             (0.02)   (0.01)   (0.04)   (0.03)   (0.04)   (0.02)   (0.04)   (0.02)   (0.04)   (0.03)   (0.01)   (0.01)
Capital                        0.33     0.11     0.24     0.11     0.20     0.06     0.19     0.07     0.25     0.13     0.37     0.16
                             (0.01)   (0.01)   (0.02)   (0.01)   (0.03)   (0.01)   (0.02)   (0.01)   (0.02)   (0.01)   (0.01)   (0.00)
Intermediates                    --     0.67       --     0.54       --     0.56       --     0.59       --     0.50       --     0.55
                                      (0.00)            (0.01)            (0.01)            (0.01)            (0.01)            (0.00)
Sum                            1.10     1.05     1.17     1.10     1.14     1.08     1.11     1.06     1.22     1.15     1.13     1.09
                             (0.02)   (0.01)   (0.03)   (0.02)   (0.03)   (0.02)   (0.03)   (0.01)   (0.03)   (0.02)   (0.01)   (0.01)
Mean(Capital)/Mean(Labor)      0.43     0.39     0.26     0.24     0.21     0.14     0.21     0.18     0.26     0.25     0.48     0.43
                             (0.03)   (0.03)   (0.03)   (0.04)   (0.04)   (0.03)   (0.03)   (0.03)   (0.03)   (0.03)   (0.02)   (0.02)

Notes:

a. Standard errors are estimated using the bootstrap with 200 replications and are reported in parentheses below the point estimates.

b. For each industry, the numbers in the first column are based on a value-added specification and are estimated using a complete polynomial series of degree 2 with the method from Ackerberg, Caves, and Frazer (2006). The numbers in the second column are based on a gross output specification and are estimated using a complete polynomial series of degree 2 for each of the two nonparametric functions (G and C) of our approach.

c. Since the input elasticities are heterogeneous across firms, we report the average input elasticities within each given industry.

d. The row titled "Sum" reports the sum of the average labor, capital, and intermediate input elasticities, and the row titled "Mean(Capital)/Mean(Labor)" reports the ratio of the average capital elasticity to the average labor elasticity.


Table 2: Heterogeneity in Productivity
(Structural Estimates)

Industry (ISIC Code). VA (ACF) = Value Added (ACF); GO (GNR) = Gross Output (GNR).

                Food Products     Textiles          Apparel           Wood Products     Fabricated Metals All
                (311)             (321)             (322)             (331)             (381)
                VA (ACF) GO (GNR) VA (ACF) GO (GNR) VA (ACF) GO (GNR) VA (ACF) GO (GNR) VA (ACF) GO (GNR) VA (ACF) GO (GNR)
Colombia
75/25 ratio     2.20     1.33     1.97     1.35     1.66     1.29     1.73     1.30     1.78     1.31     1.95     1.37
                (0.07)   (0.02)   (0.09)   (0.03)   (0.03)   (0.01)   (0.08)   (0.04)   (0.04)   (0.02)   (0.17)   (0.01)
90/10 ratio     5.17     1.77     3.71     1.83     2.87     1.66     3.08     1.80     3.33     1.74     4.01     1.86
                (0.27)   (0.05)   (0.30)   (0.07)   (0.09)   (0.03)   (0.38)   (0.12)   (0.13)   (0.03)   (0.07)   (0.02)
95/5 ratio      11.01    2.24     6.36     2.38     4.36     2.02     4.58     2.24     5.31     2.16     6.86     2.36
                (1.11)   (0.08)   (0.76)   (0.14)   (0.22)   (0.05)   (1.01)   (0.22)   (0.34)   (0.06)   (0.02)   (0.03)
Exporter        3.62     0.14     0.20     0.02     0.16     0.05     0.26     0.15     0.20     0.08     0.51     0.06
                (0.99)   (0.05)   (0.10)   (0.03)   (0.07)   (0.03)   (0.63)   (0.14)   (0.05)   (0.03)   (0.12)   (0.01)
Importer        -0.25    0.04     0.27     0.05     0.29     0.12     0.06     0.05     0.26     0.10     0.20     0.11
                (0.08)   (0.02)   (0.10)   (0.04)   (0.08)   (0.03)   (0.53)   (0.08)   (0.06)   (0.02)   (0.05)   (0.01)
Advertiser      -0.46    -0.03    0.20     0.08     0.13     0.05     0.02     0.04     0.15     0.05     -0.13    0.03
                (0.10)   (0.02)   (0.07)   (0.03)   (0.04)   (0.02)   (0.09)   (0.04)   (0.04)   (0.02)   (0.06)   (0.01)
Wages > Median  0.59     0.09     0.60     0.18     0.41     0.18     0.34     0.15     0.55     0.22     0.63     0.20
                (0.19)   (0.02)   (0.09)   (0.03)   (0.03)   (0.02)   (0.17)   (0.04)   (0.06)   (0.02)   (0.05)   (0.01)
Chile
75/25 ratio     2.92     1.37     2.56     1.48     2.58     1.43     3.06     1.50     2.45     1.53     3.00     1.55
                (0.05)   (0.01)   (0.07)   (0.02)   (0.07)   (0.02)   (0.08)   (0.02)   (0.06)   (0.02)   (0.03)   (0.01)
90/10 ratio     9.02     1.90     6.77     2.16     6.76     2.11     10.12    2.32     6.27     2.33     9.19     2.39
                (0.30)   (0.02)   (0.30)   (0.05)   (0.33)   (0.05)   (0.60)   (0.05)   (0.27)   (0.05)   (0.15)   (0.02)
95/5 ratio      21.29    2.48     13.56    2.91     14.21    2.77     25.08    3.11     12.52    3.13     20.90    3.31
                (0.99)   (0.05)   (0.84)   (0.09)   (0.77)   (0.09)   (2.05)   (0.11)   (0.78)   (0.10)   (0.47)   (0.04)
Exporter        0.27     0.02     0.07     0.02     0.18     0.09     0.12     0.00     0.03     -0.01    0.20     0.03
                (0.10)   (0.02)   (0.07)   (0.03)   (0.08)   (0.03)   (0.12)   (0.03)   (0.06)   (0.03)   (0.04)   (0.01)
Importer        0.71     0.14     0.22     0.10     0.31     0.14     0.44     0.15     0.30     0.11     0.46     0.15
                (0.11)   (0.02)   (0.05)   (0.02)   (0.05)   (0.02)   (0.10)   (0.03)   (0.05)   (0.02)   (0.03)   (0.01)
Advertiser      0.18     0.04     0.09     0.04     0.15     0.06     0.04     0.03     0.07     0.01     0.14     0.06
                (0.05)   (0.01)   (0.04)   (0.02)   (0.04)   (0.02)   (0.04)   (0.01)   (0.04)   (0.02)   (0.02)   (0.01)
Wages > Median  1.23     0.21     0.47     0.19     0.62     0.22     0.68     0.21     0.56     0.22     0.99     0.30
                (0.09)   (0.01)   (0.06)   (0.02)   (0.06)   (0.02)   (0.08)   (0.02)   (0.06)   (0.02)   (0.04)   (0.01)

Notes:

a. Standard errors are estimated using the bootstrap with 200 replications and are reported in parentheses below the point estimates.

b. For each industry, the numbers in the first column are based on a value-added specification and are estimated using a complete polynomial series of degree 2 with the method from Ackerberg, Caves, and Frazer (2006). The numbers in the second column are based on a gross output specification and are estimated using a complete polynomial series of degree 2 for each of the nonparametric functions (G and C) of our approach.

c. In the first three rows we report ratios of productivity for plants at various percentiles of the productivity distribution. In the remaining four rows we report estimates of the productivity differences between plants (as a fraction) based on whether they have exported some of their output, imported intermediate inputs, spent money on advertising, and paid wages above the industry median. For example, in industry 311 for Chile value added implies that a firm that advertises is, on average, 18% more productive than a firm that does not advertise.
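The percentile ratios reported in the first three rows of Table 2 can be illustrated with a short sketch. This is not the authors' code: the simulated log-productivity draws, the function name `percentile_ratio`, and the choice to compute ratios on productivity in levels are assumptions made purely for illustration.

```python
import numpy as np

def percentile_ratio(productivity_levels, upper, lower):
    """Ratio of the `upper` percentile to the `lower` percentile of productivity (in levels)."""
    hi, lo = np.percentile(productivity_levels, [upper, lower])
    return hi / lo

# Hypothetical log-productivity draws (an assumption, not the paper's data);
# exponentiate to move from logs to levels before taking ratios.
rng = np.random.default_rng(0)
log_productivity = rng.normal(loc=0.0, scale=0.3, size=5000)
levels = np.exp(log_productivity)

ratios = {
    f"{u}/{l}": percentile_ratio(levels, u, l)
    for u, l in [(75, 25), (90, 10), (95, 5)]
}
```

By construction the ratios widen as the percentiles move further into the tails, which is the pattern visible across the three ratio rows of the table.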


Table 3: Heterogeneity in Productivity
(Uncorrected OLS Estimates)

Industry (ISIC Code). VA (OLS) = Value Added (OLS); GO (OLS) = Gross Output (OLS).

                Food Products     Textiles          Apparel           Wood Products     Fabricated Metals All
                (311)             (321)             (322)             (331)             (381)
                VA (OLS) GO (OLS) VA (OLS) GO (OLS) VA (OLS) GO (OLS) VA (OLS) GO (OLS) VA (OLS) GO (OLS) VA (OLS) GO (OLS)
Colombia
75/25 ratio     2.17     1.16     1.86     1.21     1.65     1.17     1.72     1.23     1.78     1.23     1.93     1.24
                (0.06)   (0.01)   (0.06)   (0.01)   (0.03)   (0.01)   (0.06)   (0.02)   (0.04)   (0.01)   (0.02)   (0.00)
90/10 ratio     5.15     1.42     3.50     1.51     2.81     1.44     3.05     1.57     3.30     1.53     3.96     1.58
                (0.27)   (0.02)   (0.18)   (0.04)   (0.08)   (0.02)   (0.22)   (0.06)   (0.12)   (0.02)   (0.06)   (0.01)
95/5 ratio      10.86    1.74     5.77     1.82     4.23     1.74     4.67     2.01     5.22     1.82     6.81     1.94
                (0.94)   (0.05)   (0.55)   (0.08)   (0.20)   (0.04)   (0.72)   (0.15)   (0.31)   (0.04)   (0.15)   (0.02)
Exporter        3.42     0.09     -0.03    -0.01    0.10     0.00     0.21     0.10     0.12     0.03     0.45     0.01
                (0.99)   (0.04)   (0.04)   (0.01)   (0.05)   (0.01)   (0.19)   (0.09)   (0.04)   (0.02)   (0.12)   (0.01)
Importer        -0.23    -0.02    0.09     0.00     0.21     0.02     0.02     -0.03    0.20     0.05     0.14     0.04
                (0.07)   (0.01)   (0.06)   (0.01)   (0.06)   (0.01)   (0.06)   (0.02)   (0.05)   (0.01)   (0.04)   (0.01)
Advertiser      -0.46    -0.07    0.11     -0.04    0.10     -0.03    0.01     -0.02    0.08     0.00     -0.16    -0.02
                (0.11)   (0.02)   (0.05)   (0.02)   (0.03)   (0.01)   (0.07)   (0.03)   (0.04)   (0.01)   (0.06)   (0.01)
Wages > Median  0.51     0.06     0.49     0.10     0.39     0.13     0.33     0.11     0.50     0.13     0.56     0.13
                (0.15)   (0.02)   (0.07)   (0.02)   (0.03)   (0.01)   (0.08)   (0.03)   (0.04)   (0.01)   (0.05)   (0.01)
Chile
75/25 ratio     2.91     1.30     2.57     1.40     2.56     1.36     3.07     1.39     2.47     1.46     3.01     1.45
                (0.05)   (0.00)   (0.07)   (0.01)   (0.07)   (0.01)   (0.08)   (0.01)   (0.06)   (0.01)   (0.03)   (0.00)
90/10 ratio     9.00     1.72     6.63     1.97     6.64     1.91     10.21    2.03     6.27     2.14     9.13     2.14
                (0.29)   (0.01)   (0.31)   (0.04)   (0.29)   (0.03)   (0.57)   (0.04)   (0.26)   (0.04)   (0.15)   (0.01)
95/5 ratio      20.93    2.15     13.49    2.57     14.20    2.45     25.26    2.77     12.18    2.80     20.64    2.86
                (0.96)   (0.02)   (0.83)   (0.07)   (0.80)   (0.05)   (2.05)   (0.07)   (0.77)   (0.06)   (0.47)   (0.03)
Exporter        0.17     -0.01    0.04     -0.02    0.12     0.01     0.12     -0.02    0.00     0.00     0.15     -0.01
                (0.09)   (0.02)   (0.06)   (0.02)   (0.08)   (0.02)   (0.09)   (0.02)   (0.06)   (0.02)   (0.04)   (0.01)
Importer        0.57     0.03     0.20     0.04     0.26     0.06     0.41     0.07     0.27     0.06     0.41     0.09
                (0.09)   (0.01)   (0.04)   (0.02)   (0.05)   (0.01)   (0.09)   (0.03)   (0.05)   (0.02)   (0.03)   (0.01)
Advertiser      0.12     0.00     0.07     0.01     0.11     0.02     0.02     0.01     0.05     0.01     0.10     0.04
                (0.04)   (0.01)   (0.04)   (0.01)   (0.04)   (0.01)   (0.04)   (0.01)   (0.04)   (0.02)   (0.02)   (0.01)
Wages > Median  1.11     0.12     0.45     0.15     0.58     0.16     0.66     0.13     0.53     0.16     0.94     0.24
                (0.07)   (0.01)   (0.05)   (0.02)   (0.06)   (0.02)   (0.07)   (0.02)   (0.06)   (0.02)   (0.03)   (0.01)

Notes:

a. Standard errors are estimated using the bootstrap with 200 replications and are reported in parentheses below the point estimates.

b. For each industry, the numbers in the first column are based on a value-added specification and are estimated using a complete polynomial series of degree 2 with OLS. The numbers in the second column are based on a gross output specification estimated using a complete polynomial series of degree 2 with OLS.

c. In the first three rows we report ratios of productivity for plants at various percentiles of the productivity distribution. In the remaining four rows we report estimates of the productivity differences between plants (as a fraction) based on whether they have exported some of their output, imported intermediate inputs, spent money on advertising, and paid wages above the industry median. For example, in industry 311 for Chile value added implies that a firm that advertises is, on average, 12% more productive than a firm that does not advertise.


Table 4: Average Input Elasticities of Output
(Gross Output: Structural vs. Uncorrected OLS Estimates)

Industry (ISIC Code). GO (OLS) = Gross Output (OLS); GO (GNR) = Gross Output (GNR).

                           Food Products     Textiles          Apparel           Wood Products     Fabricated Metals All
                           (311)             (321)             (322)             (331)             (381)
                           GO (OLS) GO (GNR) GO (OLS) GO (GNR) GO (OLS) GO (GNR) GO (OLS) GO (GNR) GO (OLS) GO (GNR) GO (OLS) GO (GNR)
Colombia
Labor                      0.15     0.22     0.21     0.32     0.32     0.42     0.32     0.44     0.29     0.43     0.26     0.35
                           (0.01)   (0.02)   (0.02)   (0.03)   (0.01)   (0.02)   (0.03)   (0.05)   (0.02)   (0.02)   (0.01)   (0.01)
Capital                    0.04     0.12     0.06     0.16     0.01     0.05     0.03     0.04     0.03     0.10     0.06     0.14
                           (0.01)   (0.01)   (0.01)   (0.02)   (0.01)   (0.01)   (0.01)   (0.02)   (0.01)   (0.01)   (0.00)   (0.01)
Intermediates              0.82     0.67     0.76     0.54     0.68     0.52     0.65     0.51     0.73     0.53     0.72     0.54
                           (0.01)   (0.01)   (0.01)   (0.01)   (0.01)   (0.01)   (0.02)   (0.01)   (0.01)   (0.01)   (0.00)   (0.00)
Sum                        1.01     1.01     1.03     1.01     1.01     0.99     1.00     0.99     1.05     1.06     1.04     1.04
                           (0.01)   (0.01)   (0.01)   (0.02)   (0.01)   (0.01)   (0.02)   (0.04)   (0.01)   (0.01)   (0.00)   (0.00)
Mean(Capital)/Mean(Labor)  0.27     0.55     0.27     0.49     0.04     0.12     0.08     0.08     0.11     0.23     0.23     0.40
                           (0.07)   (0.08)   (0.06)   (0.09)   (0.02)   (0.04)   (0.05)   (0.05)   (0.04)   (0.04)   (0.01)   (0.03)
Chile
Labor                      0.17     0.28     0.26     0.45     0.29     0.45     0.20     0.40     0.32     0.52     0.20     0.38
                           (0.01)   (0.01)   (0.02)   (0.03)   (0.02)   (0.02)   (0.01)   (0.02)   (0.02)   (0.03)   (0.01)   (0.01)
Capital                    0.05     0.11     0.06     0.11     0.03     0.06     0.02     0.07     0.07     0.13     0.09     0.16
                           (0.00)   (0.01)   (0.01)   (0.01)   (0.01)   (0.01)   (0.01)   (0.01)   (0.01)   (0.01)   (0.00)   (0.00)
Intermediates              0.83     0.67     0.75     0.54     0.74     0.56     0.81     0.59     0.71     0.50     0.77     0.55
                           (0.01)   (0.00)   (0.01)   (0.01)   (0.01)   (0.01)   (0.01)   (0.01)   (0.01)   (0.01)   (0.00)   (0.00)
Sum                        1.05     1.05     1.06     1.10     1.06     1.08     1.04     1.06     1.10     1.15     1.06     1.09
                           (0.00)   (0.01)   (0.01)   (0.02)   (0.01)   (0.02)   (0.01)   (0.01)   (0.01)   (0.02)   (0.00)   (0.01)
Mean(Capital)/Mean(Labor)  0.28     0.39     0.22     0.24     0.12     0.14     0.12     0.18     0.21     0.25     0.42     0.43
                           (0.03)   (0.03)   (0.04)   (0.04)   (0.03)   (0.03)   (0.05)   (0.03)   (0.04)   (0.03)   (0.02)   (0.02)

Notes:

a. Standard errors are estimated using the bootstrap with 200 replications and are reported in parentheses below the point estimates.

b. For each industry, the numbers in the first column are based on a gross output specification and are estimated using a complete polynomial series of degree 2 with OLS. The numbers in the second column are also based on a gross output specification using a complete polynomial series of degree 2 for each of the two nonparametric functions (G and C) of our approach.

c. Since the input elasticities are heterogeneous across firms, we report the average input elasticities within each given industry.

d. The row titled "Sum" reports the sum of the average labor, capital, and intermediate input elasticities, and the row titled "Mean(Capital)/Mean(Labor)" reports the ratio of the average capital elasticity to the average labor elasticity.


Table 5: Average Input Elasticities of Output--Rescaled Value Added
(Structural Estimates: Rescaled Value Added vs. Gross Output)

Industry (ISIC Code). VA = Value Added (ACF); VA-R = Value Added (ACF--Rescaled); GO = Gross Output (GNR).

                           Food Products              Textiles                   Apparel                    Wood Products              Fabricated Metals          All
                           (311)                      (321)                      (322)                      (331)                      (381)
                           VA       VA-R     GO       VA       VA-R     GO       VA       VA-R     GO       VA       VA-R     GO       VA       VA-R     GO       VA       VA-R     GO
Colombia
Labor                      0.70     0.20     0.22     0.65     0.28     0.32     0.83     0.38     0.42     0.86     0.40     0.44     0.89     0.40     0.43     0.78     0.33     0.35
                           (0.04)   (0.01)   (0.02)   (0.06)   (0.03)   (0.03)   (0.03)   (0.02)   (0.02)   (0.06)   (0.03)   (0.05)   (0.04)   (0.02)   (0.02)   (0.01)   (0.01)   (0.01)
Capital                    0.33     0.08     0.12     0.36     0.15     0.16     0.16     0.07     0.05     0.12     0.05     0.04     0.25     0.11     0.10     0.31     0.13     0.14
                           (0.02)   (0.01)   (0.01)   (0.04)   (0.02)   (0.02)   (0.02)   (0.01)   (0.01)   (0.04)   (0.02)   (0.02)   (0.03)   (0.01)   (0.01)   (0.01)   (0.00)   (0.01)
Intermediates              --       --       0.67     --       --       0.54     --       --       0.52     --       --       0.51     --       --       0.53     --       --       0.54
                                             (0.01)                     (0.01)                     (0.01)                     (0.01)                     (0.01)                     (0.00)
Sum                        1.03     1.01     1.01     1.01     1.00     1.01     0.99     0.99     0.99     0.98     0.99     0.99     1.14     1.06     1.06     1.09     1.04     1.04
                           (0.03)   (0.01)   (0.01)   (0.04)   (0.02)   (0.02)   (0.02)   (0.01)   (0.01)   (0.07)   (0.03)   (0.04)   (0.02)   (0.01)   (0.01)   (0.01)   (0.00)   (0.00)
Mean(Capital)/Mean(Labor)  0.47     0.42     0.55     0.55     0.55     0.49     0.19     0.19     0.12     0.14     0.14     0.08     0.28     0.27     0.23     0.39     0.38     0.40
                           (0.06)   (0.05)   (0.08)   (0.10)   (0.10)   (0.09)   (0.03)   (0.03)   (0.04)   (0.05)   (0.05)   (0.05)   (0.04)   (0.04)   (0.04)   (0.02)   (0.02)   (0.03)
Chile
Labor                      0.77     0.21     0.28     0.93     0.37     0.45     0.95     0.37     0.45     0.92     0.32     0.40     0.96     0.43     0.52     0.77     0.29     0.38
                           (0.02)   (0.01)   (0.01)   (0.04)   (0.02)   (0.03)   (0.04)   (0.02)   (0.02)   (0.04)   (0.01)   (0.02)   (0.04)   (0.02)   (0.03)   (0.01)   (0.01)   (0.01)
Capital                    0.33     0.10     0.11     0.24     0.10     0.11     0.20     0.08     0.06     0.19     0.07     0.07     0.25     0.11     0.13     0.37     0.14     0.16
                           (0.01)   (0.00)   (0.01)   (0.02)   (0.01)   (0.01)   (0.03)   (0.01)   (0.01)   (0.02)   (0.01)   (0.01)   (0.02)   (0.01)   (0.01)   (0.01)   (0.00)   (0.00)
Intermediates              --       --       0.67     --       --       0.54     --       --       0.56     --       --       0.59     --       --       0.50     --       --       0.55
                                             (0.00)                     (0.01)                     (0.01)                     (0.01)                     (0.01)                     (0.00)
Sum                        1.10     1.03     1.05     1.17     1.07     1.10     1.14     1.06     1.08     1.11     1.04     1.06     1.22     1.09     1.15     1.13     1.05     1.09
                           (0.02)   (0.00)   (0.01)   (0.03)   (0.01)   (0.02)   (0.03)   (0.01)   (0.02)   (0.03)   (0.01)   (0.01)   (0.03)   (0.01)   (0.02)   (0.01)   (0.00)   (0.01)
Mean(Capital)/Mean(Labor)  0.43     0.46     0.39     0.26     0.26     0.24     0.21     0.21     0.14     0.21     0.22     0.18     0.26     0.27     0.25     0.48     0.49     0.43
                           (0.03)   (0.03)   (0.03)   (0.03)   (0.03)   (0.04)   (0.04)   (0.04)   (0.03)   (0.03)   (0.03)   (0.03)   (0.03)   (0.03)   (0.03)   (0.02)   (0.02)   (0.02)

Notes:

a. Standard errors are estimated using the bootstrap with 200 replications and are reported in parentheses below the point estimates.

b. For each industry, the numbers in the first column are based on a value-added specification and are estimated using a complete polynomial series of degree 2 with the method from Ackerberg, Caves, and Frazer (2006). The numbers in the second column are obtained by raising the value-added estimates to the power of one minus the firm's share of intermediate inputs in total output. The numbers in the third column are based on a gross output specification and are estimated using a complete polynomial series of degree 2 for each of the two nonparametric functions (G and C) of our approach.

c. Since the input elasticities are heterogeneous across firms, we report the average input elasticities within each given industry.

d. The row titled "Sum" reports the sum of the average labor, capital, and intermediate input elasticities, and the row titled "Mean(Capital)/Mean(Labor)" reports the ratio of the average capital elasticity to the average labor elasticity.


Table 6: Heterogeneity in Productivity--Rescaled Value Added
(Structural Estimates: Rescaled Value Added vs. Gross Output)

Industry (ISIC Code). VA = Value Added (ACF); VA-R = Value Added (ACF--Rescaled); GO = Gross Output (GNR).

                Food Products              Textiles                   Apparel                    Wood Products              Fabricated Metals          All
                (311)                      (321)                      (322)                      (331)                      (381)
                VA       VA-R     GO       VA       VA-R     GO       VA       VA-R     GO       VA       VA-R     GO       VA       VA-R     GO       VA       VA-R     GO
Colombia
75/25 ratio     2.20     2.29     1.33     1.97     1.85     1.35     1.66     1.88     1.29     1.73     2.74     1.30     1.78     2.00     1.31     1.95     2.15     1.37
                (0.07)   (0.12)   (0.02)   (0.09)   (0.12)   (0.03)   (0.03)   (0.09)   (0.01)   (0.08)   (0.44)   (0.04)   (0.04)   (0.10)   (0.02)   (0.17)   (0.06)   (0.01)
90/10 ratio     5.17     4.99     1.77     3.71     3.28     1.83     2.87     3.60     1.66     3.08     6.27     1.80     3.33     3.81     1.74     4.01     4.68     1.86
                (0.27)   (0.51)   (0.05)   (0.30)   (0.39)   (0.07)   (0.09)   (0.34)   (0.03)   (0.38)   (1.51)   (0.12)   (0.13)   (0.35)   (0.03)   (0.07)   (0.24)   (0.02)
95/5 ratio      11.01    8.79     2.24     6.36     5.00     2.38     4.36     5.41     2.02     4.58     10.53    2.24     5.31     5.85     2.16     6.86     7.91     2.36
                (1.11)   (1.20)   (0.08)   (0.76)   (0.87)   (0.14)   (0.22)   (0.67)   (0.05)   (1.01)   (3.19)   (0.22)   (0.34)   (0.69)   (0.06)   (0.02)   (0.52)   (0.03)
Exporter        3.62     0.84     0.14     0.20     0.07     0.02     0.16     -0.04    0.05     0.26     0.55     0.15     0.20     0.06     0.08     0.51     0.09     0.06
                (0.99)   (0.30)   (0.05)   (0.10)   (0.08)   (0.03)   (0.07)   (0.07)   (0.03)   (0.63)   (0.60)   (0.14)   (0.05)   (0.07)   (0.03)   (0.12)   (0.06)   (0.01)
Importer        -0.25    -0.40    0.04     0.27     0.08     0.05     0.29     -0.09    0.12     0.06     -0.20    0.05     0.26     0.15     0.10     0.20     0.05     0.11
                (0.08)   (0.08)   (0.02)   (0.10)   (0.08)   (0.04)   (0.08)   (0.05)   (0.03)   (0.53)   (0.20)   (0.08)   (0.06)   (0.04)   (0.02)   (0.05)   (0.04)   (0.01)
Advertiser      -0.46    -0.42    -0.03    0.20     -0.16    0.08     0.13     -0.28    0.05     0.02     -0.24    0.04     0.15     -0.05    0.05     -0.13    -0.18    0.03
                (0.10)   (0.10)   (0.02)   (0.07)   (0.07)   (0.03)   (0.04)   (0.06)   (0.02)   (0.09)   (0.13)   (0.04)   (0.04)   (0.04)   (0.02)   (0.06)   (0.03)   (0.01)
Wages > Median  0.59     0.15     0.09     0.60     0.35     0.18     0.41     0.31     0.18     0.34     0.19     0.15     0.55     0.27     0.22     0.63     0.29     0.20
                (0.19)   (0.18)   (0.02)   (0.09)   (0.08)   (0.03)   (0.03)   (0.05)   (0.02)   (0.17)   (0.18)   (0.04)   (0.06)   (0.04)   (0.02)   (0.05)   (0.04)   (0.01)
Chile
75/25 ratio     2.92     2.36     1.37     2.56     2.06     1.48     2.58     2.94     1.43     3.06     3.31     1.50     2.45     2.47     1.53     3.00     2.92     1.55
                (0.05)   (0.11)   (0.01)   (0.07)   (0.17)   (0.02)   (0.07)   (0.27)   (0.02)   (0.08)   (0.36)   (0.02)   (0.06)   (0.19)   (0.02)   (0.03)   (0.10)   (0.01)
90/10 ratio     9.02     5.24     1.90     6.77     3.80     2.16     6.76     8.08     2.11     10.12    10.47    2.32     6.27     5.72     2.33     9.19     7.24     2.39
                (0.30)   (0.44)   (0.02)   (0.30)   (0.57)   (0.05)   (0.33)   (1.44)   (0.05)   (0.60)   (2.23)   (0.05)   (0.27)   (0.85)   (0.05)   (0.15)   (0.44)   (0.02)
95/5 ratio      21.29    8.91     2.48     13.56    5.54     2.91     14.21    13.89    2.77     25.08    19.83    3.11     12.52    9.40     3.13     20.90    12.23    3.31
                (0.99)   (0.96)   (0.05)   (0.84)   (1.05)   (0.09)   (0.77)   (2.94)   (0.09)   (2.05)   (5.41)   (0.11)   (0.78)   (1.84)   (0.10)   (0.47)   (0.91)   (0.04)
Exporter        0.27     0.34     0.02     0.07     -0.03    0.02     0.18     0.14     0.09     0.12     0.09     0.00     0.03     0.01     -0.01    0.20     0.12     0.03
                (0.10)   (0.12)   (0.02)   (0.07)   (0.06)   (0.03)   (0.08)   (0.11)   (0.03)   (0.12)   (0.11)   (0.03)   (0.06)   (0.07)   (0.03)   (0.04)   (0.05)   (0.01)
Importer        0.71     0.44     0.14     0.22     0.10     0.10     0.31     0.18     0.14     0.44     0.21     0.15     0.30     0.19     0.11     0.46     0.37     0.15
                (0.11)   (0.15)   (0.02)   (0.05)   (0.05)   (0.02)   (0.05)   (0.07)   (0.02)   (0.10)   (0.12)   (0.03)   (0.05)   (0.07)   (0.02)   (0.03)   (0.04)   (0.01)
Advertiser      0.18     0.05     0.04     0.09     0.00     0.04     0.15     0.06     0.06     0.04     -0.04    0.03     0.07     0.03     0.01     0.14     0.14     0.06
                (0.05)   (0.05)   (0.01)   (0.04)   (0.04)   (0.02)   (0.04)   (0.06)   (0.02)   (0.04)   (0.06)   (0.01)   (0.04)   (0.06)   (0.02)   (0.02)   (0.02)   (0.01)
Wages > Median  1.23     0.66     0.21     0.47     0.34     0.19     0.62     0.57     0.22     0.68     0.53     0.21     0.56     0.47     0.22     0.99     0.89     0.30
                (0.09)   (0.07)   (0.01)   (0.06)   (0.06)   (0.02)   (0.06)   (0.09)   (0.02)   (0.08)   (0.11)   (0.02)   (0.06)   (0.09)   (0.02)   (0.04)   (0.05)   (0.01)

Notes:

a. Standard errors are estimated using the bootstrap with 200 replications and are reported in parentheses below the point estimates.

b. For each industry, the numbers in the first column are based on a value-added specification and are estimated using a complete polynomial series of degree 2 with the method from Ackerberg, Caves, and Frazer (2006). The numbers in the second column are obtained by raising the value-added estimates to the power of one minus the firm's share of intermediate inputs in total output. The numbers in the third column are based on a gross output specification and are estimated using a complete polynomial series of degree 2 for each of the two nonparametric functions (G and C) of our approach.

c. In the first three rows we report ratios of productivity for plants at various percentiles of the productivity distribution. In the remaining four rows we report estimates of the productivity differences between plants (as a fraction) based on whether they have exported some of their output, imported intermediate inputs, spent money on advertising, and paid wages above the industry median. For example, in industry 311 for Chile value added implies that a firm that advertises is, on average, 18% more productive than a firm that does not advertise.
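The rescaling described in note b of Table 6 can be sketched as follows. This is only an illustration of the stated rule (raise the value-added estimate to the power of one minus the firm's intermediate input share), not the authors' implementation; the function name `rescale_value_added` and the example numbers are hypothetical.

```python
import numpy as np

def rescale_value_added(va_estimates, intermediate_share):
    """Raise each value-added estimate to the power (1 - intermediate input share).

    Both arguments are firm-level arrays; the share is intermediate inputs
    divided by total output and is assumed to lie in [0, 1).
    """
    va = np.asarray(va_estimates, dtype=float)
    share = np.asarray(intermediate_share, dtype=float)
    return va ** (1.0 - share)

# Hypothetical example: two firms with value-added productivity levels 1 and 4,
# each with an intermediate input share of one half.
va_levels = np.array([1.0, 4.0])
shares = np.array([0.5, 0.5])
rescaled = rescale_value_added(va_levels, shares)
```

In this toy example the productivity ratio between the two firms falls from 4 to 2. When shares differ across firms the effect on dispersion is less clear-cut, which is consistent with the rescaled columns of the table sometimes exceeding the unrescaled ones.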


Appendix A: The Olley-Pakes Estimator

The original Olley and Pakes (1996) approach differs from LP in that it aims to exploit the structure of a predetermined input demand as an alternative to the flexible intermediate input as the proxy variable. In particular they treat $k_t$ as the predetermined input in the model whereas the remaining inputs ($l_t, m_t$ in this case) are treated as flexible. By using demand for future capital (investment) as a proxy,$^{38}$ rather than intermediate inputs, it may appear that the identification problems we raise above are less severe for the OP empirical strategy, as $k_{t+1}$ is not a direct input into the production function at $t$. However, we now show that capital as a proxy faces the same fundamental identification problems we raise above.$^{39}$

The key idea behind the OP approach is to assume that the structure of the predetermined input demand satisfies the following scalar unobservability assumption.

Assumption 6. The predetermined input demand is given by

\[
k_t = K(\mathcal{I}_{t-1}) = K(k_{t-1}, \omega_{t-1}),
\]

where $K$ is strictly monotone increasing in the second argument.

This assumption allows them to exploit $k_{t+1}$ as a proxy variable for productivity $\omega_t$. This can be seen by examining the “first stage” of their procedure in which they regress output $y_t$ on the inputs $(k_t, l_t, m_t)$ as well as the proxy variable $k_{t+1}$, which yields

\[
\begin{aligned}
E[y_t \mid k_t, l_t, m_t, k_{t+1}] &= f(k_t, l_t, m_t) + E[\omega_t \mid k_t, l_t, m_t, k_{t+1}] \\
&= f(k_t, l_t, m_t) + K^{-1}(k_{t+1}, k_t),
\end{aligned}
\tag{32}
\]

where the second equality follows from Assumption 6.

As Ackerberg, Caves, and Frazer (2015) have shown, since $m_t = M(k_t, \omega_t) = M(k_t, k_{t+1})$, and similarly for labor, in general the OP procedure faces the same identification problems regarding the flexible input elasticities. However, in contrast to LP, ACF show that the economic structure on the input decisions in OP admits some model-consistent sources of variation that would permit the elasticity $\frac{\partial}{\partial x_t} E[y_t \mid k_t, l_t, m_t, k_{t+1}]$ to be identified in the data for $x_t \in \{l_t, m_t\}$. If such sources of variation exist, then for $x_t \in \{l_t, m_t\}$ the elasticity

\[
\frac{\partial}{\partial x_t} E[y_t \mid k_t, l_t, m_t, k_{t+1}] = \frac{\partial}{\partial x_t} f(k_t, l_t, m_t)
\tag{33}
\]

is identified in the data. Notice also that since $K^{-1}(k_{t+1}, k_t) = \omega_t$ in equation (32), it follows that the ex-post productivity shock, $\varepsilon_t = y_t - E[y_t \mid k_t, l_t, m_t, k_{t+1}]$, is also identified.

$^{38}$ Technically OP used investment $i_t$ as the proxy, but given the process they assume for capital, $K_{t+1} = (1 - \text{depreciation}) K_t + I_t$, and conditional on $k_t$, using $k_{t+1}$ is equivalent to using $i_t$.

$^{39}$ This leaves aside altogether the original LP motivation that investment is often zero in the data and hence those observations cannot be used for the estimation.

The capital elasticity clearly cannot be identified from the first stage (33). The empirical strategy proposed by OP for the capital elasticity makes use of the parametric structure of the production function, which they assumed took a Cobb-Douglas form (C-D for short):

\[
f(k_t, l_t, m_t) = \alpha_k k_t + \alpha_l l_t + \alpha_m m_t.
\]

The first stage of the model in this case is

\[
\begin{aligned}
E[y_t \mid k_t, l_t, m_t, k_{t+1}] &= \alpha_k k_t + \alpha_l l_t + \alpha_m m_t + K^{-1}(k_{t+1}, k_t) \\
&= \alpha_l l_t + \alpha_m m_t + \phi(k_{t+1}, k_t),
\end{aligned}
\tag{34}
\]

which takes the form of a partially linear model (Robinson, 1988). The random variable $\phi_t \equiv \phi(k_{t+1}, k_t)$ can be recovered in the data from the partially linear regression in (34), as can its derivatives. The “second stage” of the OP empirical strategy to identify the capital elasticity is based on regressing the new “dependent variable”

\[
\tilde{y}_t \equiv y_t - \alpha_l l_t - \alpha_m m_t
\]

on $(k_t, \phi_{t-1}, k_{t-1})$. This second stage thus yields

\[
\begin{aligned}
E[\tilde{y}_t \mid k_t, k_{t-1}, \phi_{t-1}] &= \alpha_k k_t + E[\omega_t \mid k_t, k_{t-1}, \phi_{t-1}] \\
&= \alpha_k k_t + h(\phi_{t-1} - \alpha_k k_{t-1}).
\end{aligned}
\tag{35}
\]
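The partially linear first stage in (34) can be sketched numerically. The example below is a simplified illustration on simulated data, not OP's or Robinson's (1988) estimator: it swaps in a degree-2 polynomial series for the nonparametric component, and all parameter values, the data-generating process, and the helper `poly2` are assumptions. In particular, labor and intermediates are given independent variation by construction, which is exactly the kind of variation whose existence the text questions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
alpha_l, alpha_m = 0.3, 0.5  # hypothetical true flexible-input coefficients

k = rng.normal(size=n)                  # log capital at t
k_next = 0.8 * k + rng.normal(size=n)   # log capital at t+1 (the proxy)

# Hypothetical phi(k_{t+1}, k_t), chosen so that a degree-2 series spans it exactly.
phi = 0.4 * k + 0.6 * k_next + 0.1 * k * k_next

# ASSUMPTION: l and m vary independently of (k, k_next); without this the
# linear coefficients could not be separated from phi.
l = 0.5 * k + rng.normal(size=n)
m = 0.5 * k_next + rng.normal(size=n)
eps = 0.1 * rng.normal(size=n)
y = alpha_l * l + alpha_m * m + phi + eps

def poly2(a, b):
    """Complete polynomial of degree 2 in (a, b), including a constant."""
    return np.column_stack([np.ones_like(a), a, b, a * a, a * b, b * b])

# Series version of the partially linear regression: linear in (l, m),
# flexible in (k_next, k).
X = np.column_stack([l, m, poly2(k_next, k)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha_l_hat, alpha_m_hat = coef[0], coef[1]
phi_hat = X[:, 2:] @ coef[2:]   # fitted nonparametric component
```

With this design the estimates of the two linear coefficients land close to their assumed true values, and `phi_hat` tracks `phi`; the fitted `phi_hat` is what would be carried into the second stage.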

As we now show, however, under the model’s restrictions, the joint variation in $(k_t, k_{t-1}, \phi_{t-1})$ required to identify the capital elasticity $\alpha_k$ is unlikely to exist in the data. The reason is based on the following observation. Because $\omega_{t-1} = \phi_{t-1} - \alpha_k k_{t-1}$, we have

\[
k_t = K(k_{t-1}, \omega_{t-1}) = K(k_{t-1}, \phi_{t-1}),
\tag{36}
\]

so there is no variation in $k_t$ conditional on $(k_{t-1}, \phi_{t-1})$.$^{40}$ Therefore, the partial derivative

\[
\frac{\partial}{\partial k_t} E[\tilde{y}_t \mid k_t, k_{t-1}, \phi_{t-1}] = \alpha_k
\tag{37}
\]

cannot be recovered directly in the data.

This observation gives rise to two fundamental problems for the OP solution to transmission bias. The first problem is that any real data application is unlikely to actually exhibit the knife-edge variation (36) implied by the model, i.e., real data will typically exhibit variation in $k_t$ conditional on $(k_{t-1}, \phi_{t-1})$, in which case one could recover the LHS of (37). Applying their estimator to such data would estimate $\alpha_k$ using variation that the model predicts should not exist. The estimates of “$\alpha_k$” in this case will have no structural meaning (i.e., will not correspond to the capital elasticity) as the data rejects the model.
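The knife-edge restriction in (36) can be illustrated with a small simulation. This is a sketch under strong simplifying assumptions: the policy function $K$ is taken to be linear, and the coefficient values are hypothetical. Under the model, $k_t$ is an exact function of $(k_{t-1}, \phi_{t-1})$, so a second-stage design that includes all three variables is rank deficient.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
alpha_k = 0.3  # hypothetical capital elasticity

k_lag = rng.normal(size=n)       # k_{t-1}
omega_lag = rng.normal(size=n)   # omega_{t-1}

# phi_{t-1} = alpha_k * k_{t-1} + omega_{t-1}, as implied by the first stage.
phi_lag = alpha_k * k_lag + omega_lag

# ASSUMPTION: a linear policy function k_t = K(k_{t-1}, omega_{t-1}).
k = 0.8 * k_lag + 0.5 * omega_lag

# Second-stage regressors: constant, k_t, k_{t-1}, phi_{t-1}.
X = np.column_stack([np.ones(n), k, k_lag, phi_lag])
rank = np.linalg.matrix_rank(X)

# Residual variation in k_t after conditioning on (k_{t-1}, phi_{t-1}).
Z = X[:, [0, 2, 3]]
resid = k - Z @ np.linalg.lstsq(Z, k, rcond=None)[0]
max_resid = np.max(np.abs(resid))
```

The design matrix has four columns but rank three, and the residual variation in $k_t$ is zero up to floating-point error, so the coefficient on $k_t$ cannot be separated from the function of $(k_{t-1}, \phi_{t-1})$.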

This identification problem regarding the OP approach can seemingly be alleviated if we try to impose a version of Assumption 4 to generate an additional source of variation in $k_t$ conditional on $(k_{t-1}, \phi_{t-1})$. If the model allowed for another source of variation $\chi_t$ that enters the predetermined input demand function, we would have

\[
k_t = K(\mathcal{I}_{t-1}) = K(k_{t-1}, \omega_{t-1}, \chi_{t-1}).
\]

The second problem is that such variation $\chi_t$ would represent additional dimensions of firm heterogeneity in addition to productivity $\omega_t$, and would thus violate the scalar unobservability assumption that drives the OP approach.

$^{40}$ This is in essence what Assumption 4 rules out in the nonparametric case.

If $\chi_t$ were observed and augmented to the data, then using it to resolve the identification problem would require that $\chi_t$ accounts for all the variation in $k_t$ conditional on $k_{t-1}$ and $\phi_{t-1}$. Any unexplained variation would again violate the scalar unobservability assumption on the predetermined input. Even if this requirement is met, $\chi_t$ will likely be a higher dimensional random variable that must now also be included in the nonparametric estimation of $\phi(k_t, k_{t+1}, \chi_t)$. This places a major burden on the data to estimate a high-dimensional nonparametric function. This is especially problematic in applications because, as pointed out by LP, real data sets often contain a large percentage of observations with zero investment which must be excluded from the empirical estimation of $\phi_t$, thus creating a severe small sample problem that would undermine this strategy.

Appendix B: A Parametric Example

In order to illustrate our non-identification result, we consider a parametric example. Suppose that the true production function is Cobb-Douglas, $F(k_t, l_t, m_t) = K_t^{\alpha_k} L_t^{\alpha_l} M_t^{\alpha_m}$, and productivity follows an AR(1) process $\omega_t = \delta_0 + \delta \omega_{t-1} + \eta_t$.$^{41}$ The parametric version of the instrumental variables restriction in equation (6) is the following:

\[
E[y_t \mid \Gamma_t] = \text{constant} + \alpha_k k_t + \alpha_l l_t + \alpha_m E[m_t \mid \Gamma_t] + \delta_0 + \delta \left( \phi_{t-1} - \alpha_k k_{t-1} - \alpha_l l_{t-1} - \alpha_m m_{t-1} \right),
\]

where $\phi_{t-1} = f(k_{t-1}, l_{t-1}, m_{t-1}) + M^{-1}(k_{t-1}, l_{t-1}, m_{t-1})$. If we plug in for $m_t$ using the first-order condition and combine constants we have

\[
\begin{aligned}
E[y_t \mid \Gamma_t] &= \widetilde{\text{constant}} + \alpha_k k_t + \alpha_l l_t + \alpha_m \left( \frac{\alpha_k k_t + \alpha_l l_t + \delta \left( \phi_{t-1} - f(k_{t-1}, l_{t-1}, m_{t-1}) \right)}{1 - \alpha_m} \right) \\
&\quad + \delta \left( \phi_{t-1} - \alpha_k k_{t-1} - \alpha_l l_{t-1} - \alpha_m m_{t-1} \right) \\
&= \widetilde{\text{constant}} + \left( \frac{\alpha_k}{1 - \alpha_m} \right) k_t + \left( \frac{\alpha_l}{1 - \alpha_m} \right) l_t + \frac{\delta}{1 - \alpha_m} \left( \phi_{t-1} - \alpha_k k_{t-1} - \alpha_l l_{t-1} - \alpha_m m_{t-1} \right).
\end{aligned}
\]

$^{41}$ For simplicity we assume that the prices of output and intermediate inputs are non-time-varying. While time-series variation in relative prices would provide a source of identifying variation here, as we discuss in footnote 21, relying on only time-series variation in prices is problematic for several reasons.

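Plugging in for the Cobb-Douglas parametric form of $M^{-1}$, it can be shown that $\phi_{t-1} = m_{t-1}$, which implies

$$E[y_t \mid \Gamma_t] = \widetilde{\text{constant}} + \left(\frac{\alpha_k}{1-\alpha_m}\right)k_t + \left(\frac{\alpha_l}{1-\alpha_m}\right)l_t - \delta\left(\frac{\alpha_k}{1-\alpha_m}\right)k_{t-1} - \delta\left(\frac{\alpha_l}{1-\alpha_m}\right)l_{t-1} + \delta m_{t-1}.$$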

Notice that, although there are five sources of variation $(k_t, l_t, k_{t-1}, l_{t-1}, m_{t-1})$, the model is not identified. Variation in $m_{t-1}$ identifies $\delta$, but the coefficient on $k_{t-1}$ is equal to the coefficient on $k_t$ multiplied by $-\delta$, and the same is true for $l$. In other words, variation in $k_{t-1}$ and $l_{t-1}$ does not provide any additional information about the parameters of the production function. As a result, all we can identify is $\alpha_k\left(\frac{1}{1-\alpha_m}\right)$ and $\alpha_l\left(\frac{1}{1-\alpha_m}\right)$. To put it another way, the rank condition necessary for identification of this model is not satisfied.

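In terms of our proposed alternative structure in Theorem 1, we would have

$$\tilde{\alpha}_k = (1-a)\alpha_k; \quad \tilde{\alpha}_l = (1-a)\alpha_l; \quad \tilde{\alpha}_m = (1-a)\alpha_m + a; \quad \tilde{\delta} = \delta.$$

It immediately follows that $\left(\frac{\tilde{\alpha}_k}{1-\tilde{\alpha}_m}\right) = \left(\frac{\alpha_k}{1-\alpha_m}\right)$ and $\left(\frac{\tilde{\alpha}_l}{1-\tilde{\alpha}_m}\right) = \left(\frac{\alpha_l}{1-\alpha_m}\right)$, and thus our continuum of alternative structures indexed by $a \in (0, 1)$ satisfy the instrumental variables restriction.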
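The invariance of the identified combinations across the alternative structures can be checked numerically. The following is a minimal sketch, not part of the original analysis: the parameter values and the function names `reduced_form` and `alternative_structure` are hypothetical and chosen only for illustration.

```python
def reduced_form(alpha_k, alpha_l, alpha_m, delta):
    """Return the combinations that the IV restriction pins down in this example:
    alpha_k / (1 - alpha_m), alpha_l / (1 - alpha_m), and delta."""
    scale = 1.0 / (1.0 - alpha_m)
    return (alpha_k * scale, alpha_l * scale, delta)


def alternative_structure(alpha_k, alpha_l, alpha_m, delta, a):
    """Map the true parameters into the alternative structure indexed by a in (0, 1)."""
    return ((1 - a) * alpha_k, (1 - a) * alpha_l, (1 - a) * alpha_m + a, delta)


# Hypothetical "true" parameter values (alpha_k, alpha_l, alpha_m, delta).
true_params = (0.2, 0.3, 0.5, 0.8)
baseline = reduced_form(*true_params)

max_gap = 0.0
for a in (0.1, 0.5, 0.9):
    alt_params = alternative_structure(*true_params, a)
    alt = reduced_form(*alt_params)
    # Largest discrepancy between the baseline and alternative reduced forms.
    gap = max(abs(x - y) for x, y in zip(baseline, alt))
    max_gap = max(max_gap, gap)
    print(a, alt_params, alt)
```

Each value of `a` changes the structural parameters but leaves the three identified combinations unchanged up to floating-point error, which is the sense in which the restriction cannot distinguish among these structures.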

Doraszelski and Jaumandreu (2013) avoid this problem by exploiting both parametric restric-

tions and observed price variation as an instrument for identification. This illustrates an important

difference with our approach. In addition to not requiring parametric restrictions (or price varia-

tion), we are not using the first-order condition to find a replacement function for ω in the produc-

tion function. Instead, we use it to form the share regression equation, which gives us a second

structural equation that we use in identification and estimation. In terms of our Cobb-Douglas

example, the second equation would be given by the following share equation st = αm− εt. Since

this equation identifies αm (given that E [εt] = 0), this is enough to allow for identification of the

whole production function and productivity.

Appendix C: Extensions

In this section we discuss four extensions to our baseline model: allowing for fixed effects, in-

corporating additional unobservables in the flexible input demand, allowing for multiple flexible

inputs, and revenue production functions.

C1. Fixed Effects

One benefit of our identification strategy is that it can easily incorporate fixed effects in the pro-

duction function. With fixed effects, the production function can be written as

yt = f (kt, lt,mt) + a+ ωt + εt, (38)

where a is a firm-level fixed effect.42 From the firm’s perspective, the optimal decision problem for

intermediate inputs is the same as before, as is the derivation of the nonparametric share regression

(equation 15), with $\bar{\omega}_t \equiv a + \omega_t$ replacing $\omega_t$.

The other half of our approach can be easily augmented to allow for the fixed effects. We

follow the dynamic panel data literature and impose that persistent productivity ω follows a first-

order linear Markov process to difference out the fixed effects: ωt = δωt−1 + ηt.43 The equivalent

42 Kasahara, Schrimpf, and Suzuki (2015) generalize our approach to allow for the entire production function to be firm-specific.

43 For simplicity we use an AR(1) here, but higher order auto-regressive models can be incorporated as well. We omit the constant from the Markov process since it is not separately identified from the mean of the fixed effects.

of equation (9) is given by:

$$Y_t = a - C(k_t, l_t) + \delta\left(Y_{t-1} + C(k_{t-1}, l_{t-1})\right) + \eta_t.$$

Subtracting the counterpart for period $t-1$ eliminates the fixed effect. Re-arranging terms leads to:

$$Y_t - Y_{t-1} = -\left(C(k_t, l_t) - C(k_{t-1}, l_{t-1})\right) + \delta\left(Y_{t-1} - Y_{t-2}\right) + \delta\left(C(k_{t-1}, l_{t-1}) - C(k_{t-2}, l_{t-2})\right) + \left(\eta_t - \eta_{t-1}\right).$$

Recall that E [ηt | Γt] = 0. Since Γt−1 ⊂ Γt, this implies that E [ηt − ηt−1 | Γt−1] = 0, where Γt−1

includes (kt−1, lt−1,Yt−2, kt−2, lt−2,Yt−3, ...).

Let

$$r\left(k_t, l_t, k_{t-1}, l_{t-1}, (Y_{t-1} - Y_{t-2}), k_{t-2}, l_{t-2}\right) = -\left(C(k_t, l_t) - C(k_{t-1}, l_{t-1})\right) + \delta\left(Y_{t-1} - Y_{t-2}\right) + \delta\left(C(k_{t-1}, l_{t-1}) - C(k_{t-2}, l_{t-2})\right). \tag{39}$$

From this we have the following nonparametric IV equation

$$E\left[Y_t - Y_{t-1} \mid k_{t-1}, l_{t-1}, Y_{t-2}, k_{t-2}, l_{t-2}, k_{t-3}, l_{t-3}\right] = E\left[r\left(k_t, l_t, k_{t-1}, l_{t-1}, (Y_{t-1} - Y_{t-2}), k_{t-2}, l_{t-2}\right) \mid k_{t-1}, l_{t-1}, Y_{t-2}, k_{t-2}, l_{t-2}, k_{t-3}, l_{t-3}\right],$$

which is an analogue to equation (11), in the case without fixed effects.

Theorem 5. Under Assumptions 2 - 5, plus the additional assumption that the distribution of the endogenous variables conditional on the exogenous variables (i.e., instruments), $G\left(l_t, k_t, l_{t-1}, k_{t-1}, (Y_{t-1} - Y_{t-2}), l_{t-2}, k_{t-2} \mid l_{t-3}, k_{t-3}, l_{t-1}, k_{t-1}, Y_{t-2}, l_{t-2}, k_{t-2}\right)$, is complete (as defined in Newey and Powell, 2003), the production function $f$ is nonparametrically identified up to an additive constant if $\frac{\partial}{\partial m_t} f(l_t, k_t, m_t)$ is nonparametrically known.

Following the first part of the proof of Theorem 3, we know that the production function is

identified up to an additive function C (kt, lt). Following directly from Newey and Powell (2003),

we know that, if the distribution G is complete, then the function r defined in equation (39) is

identified.

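Let $(\tilde{C}, \tilde{\delta})$ be a candidate alternative structure. The two structures $(C, \delta)$ and $(\tilde{C}, \tilde{\delta})$ are observationally equivalent if and only if

$$\begin{aligned}
&-\left(C(k_t, l_t) - C(k_{t-1}, l_{t-1})\right) + \delta\left(Y_{t-1} - Y_{t-2}\right) + \delta\left(C(k_{t-1}, l_{t-1}) - C(k_{t-2}, l_{t-2})\right) \\
&\quad = -\left(\tilde{C}(k_t, l_t) - \tilde{C}(k_{t-1}, l_{t-1})\right) + \tilde{\delta}\left(Y_{t-1} - Y_{t-2}\right) + \tilde{\delta}\left(\tilde{C}(k_{t-1}, l_{t-1}) - \tilde{C}(k_{t-2}, l_{t-2})\right).
\end{aligned} \tag{40}$$

By taking partial derivatives of both sides of (40) with respect to $k_t$ and $l_t$ we obtain

$$\frac{\partial}{\partial z} C(k_t, l_t) = \frac{\partial}{\partial z} \tilde{C}(k_t, l_t)$$

for $z \in \{k_t, l_t\}$, which implies $C(k_t, l_t) - \tilde{C}(k_t, l_t) = c$ for a constant $c$. Thus we have shown the production function is identified up to an additive constant.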

The estimation strategy for the model with fixed effects is almost exactly the same as without

fixed effects. The first stage, estimating Dr (kt, lt,mt), is the same. We then form Yt in the same

way. We also use the same series estimator for C (kt, lt). This generates analogues to equations

(22) and (23):

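$$Y_t - Y_{t-1} = -\sum_{0<\tau_k+\tau_l\le\tau} \alpha_{\tau_k,\tau_l} k_t^{\tau_k} l_t^{\tau_l} + \sum_{0<\tau_k+\tau_l\le\tau} \alpha_{\tau_k,\tau_l} k_{t-1}^{\tau_k} l_{t-1}^{\tau_l} + \delta\left(\omega_{t-1} - \omega_{t-2}\right) + \left(\eta_t - \eta_{t-1}\right) \tag{41}$$

and

$$\begin{aligned}
Y_t - Y_{t-1} = &-\sum_{0<\tau_k+\tau_l\le\tau} \alpha_{\tau_k,\tau_l} k_t^{\tau_k} l_t^{\tau_l} + \delta\left(Y_{t-1} - Y_{t-2}\right) \\
&+ (\delta+1)\left(\sum_{0<\tau_k+\tau_l\le\tau} \alpha_{\tau_k,\tau_l} k_{t-1}^{\tau_k} l_{t-1}^{\tau_l}\right) - \delta\left(\sum_{0<\tau_k+\tau_l\le\tau} \alpha_{\tau_k,\tau_l} k_{t-2}^{\tau_k} l_{t-2}^{\tau_l}\right) + \left(\eta_t - \eta_{t-1}\right).
\end{aligned} \tag{42}$$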

We can use similar moments as for the model without fixed effects, except that now we need to lag

the instruments one period given the differencing involved. Therefore the following moments can

be used to form a standard sieve GMM criterion function to estimate $(\alpha, \delta)$: $E\left[(\eta_t - \eta_{t-1})\, k_{t-\iota}^{\tau_k} l_{t-\iota}^{\tau_l}\right]$, for $\iota \ge 1$.
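The algebra behind equation (42) can be verified on simulated data. The sketch below is illustrative only and rests on stated assumptions: the polynomial `C`, the parameter values, and the data-generating process ($Y_t = a - C(k_t, l_t) + \omega_t$ with an AR(1) for $\omega$ without a constant) are hypothetical. It checks that, at the true parameters, the residual of the differenced equation equals $\eta_t - \eta_{t-1}$ and that the fixed effect drops out.

```python
import random

random.seed(0)
DELTA = 0.7   # hypothetical AR(1) coefficient
T = 6         # periods per simulated firm


def C(k, l):
    # Hypothetical low-order polynomial standing in for the series approximation.
    return 0.3 * k + 0.2 * l + 0.05 * k * l


def simulate_firm():
    a = random.gauss(0, 1)                       # firm fixed effect
    k = [random.gauss(0, 1) for _ in range(T)]
    l = [random.gauss(0, 1) for _ in range(T)]
    eta = [random.gauss(0, 0.1) for _ in range(T)]
    omega = [eta[0]]
    for t in range(1, T):
        omega.append(DELTA * omega[t - 1] + eta[t])
    Y = [a - C(k[t], l[t]) + omega[t] for t in range(T)]
    return k, l, eta, Y


def differenced_residual(k, l, Y, t, delta):
    """Residual of the differenced equation at period t (requires t >= 2)."""
    d_y = Y[t] - Y[t - 1]
    d_y_lag = Y[t - 1] - Y[t - 2]
    return (d_y + C(k[t], l[t]) - delta * d_y_lag
            - (delta + 1) * C(k[t - 1], l[t - 1])
            + delta * C(k[t - 2], l[t - 2]))


max_gap = 0.0
for _ in range(200):
    k, l, eta, Y = simulate_firm()
    for t in range(2, T):
        gap = abs(differenced_residual(k, l, Y, t, DELTA) - (eta[t] - eta[t - 1]))
        max_gap = max(max_gap, gap)
print(max_gap)
```

In an estimation routine this residual would be interacted with the lagged instruments to form sample moments; the sketch only confirms that the residual is free of the fixed effect.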

C2. Extra Unobservables

Our identification and estimation approach can also be extended to incorporate additional unob-

servables driving the intermediate input demand.

In our baseline model, our system of equations consists of the share equation and the production

function given by

st = lnD (kt, lt,mt) + ln E − εt

yt = f (kt, lt,mt) + ωt + εt.

We now show that our model can be extended to include an additional structural unobservable to

the share equation for intermediate inputs, which we denote by ψt:

st = lnD (kt, lt,mt) + ln E − εt − ψt (43)

yt = f (kt, lt,mt) + ωt + εt,

where $\mathcal{E} \equiv E\left[e^{\psi_t+\varepsilon_t}\right]$.

Assumption 7. ψt ∈ It is known to the firm at the time of making its period t decisions and is not

persistent: Pψ (ψt | It−1) = Pψ (ψt).

C2.1. Interpretations for the extra unobservable

We now discuss some possible interpretations for the extra unobservable ψ.

Shocks to prices of output and/or intermediate inputs Suppose that the prices of output and

intermediate inputs, Pt and ρt, are not fully known when firm j decides its level of intermediate

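inputs, but that the firm has private signals about the prices, denoted $P^*_{jt}$ and $\rho^*_{jt}$, where

$$\ln P^*_{jt} = \ln P_t - \nu_{jt}, \qquad \ln \rho^*_{jt} = \ln \rho_t - \nu^M_{jt},$$

where we add the firm subscripts $j$ for clarity. Firms maximize expected profits conditional on their signals:

$$\begin{aligned}
M(k_{jt}, l_{jt}, \omega_{jt}) &= \arg\max_{M_{jt}} E_{\varepsilon,\nu,\nu^M}\left[P_t F(k_{jt}, l_{jt}, m_{jt}) e^{\omega_{jt}+\varepsilon_{jt}} - \rho_t M_{jt} \mid P^*_{jt}, \rho^*_{jt}\right] \\
&= \arg\max_{M_{jt}} E_{\varepsilon,\nu,\nu^M}\left[\left(P^*_{jt} e^{\nu_{jt}}\right) F(k_{jt}, l_{jt}, m_{jt}) e^{\omega_{jt}+\varepsilon_{jt}} - \left(\rho^*_{jt} e^{\nu^M_{jt}}\right) M_{jt} \mid P^*_{jt}, \rho^*_{jt}\right] \\
&= \arg\max_{M_{jt}} E\left(e^{\nu_{jt}}\right) E\left(e^{\varepsilon_{jt}}\right) P^*_{jt} F(k_{jt}, l_{jt}, m_{jt}) e^{\omega_{jt}} - E\left(e^{\nu^M_{jt}}\right) \rho^*_{jt} M_{jt}.
\end{aligned}$$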

This implies that the firm’s first order condition for intermediate inputs is given by

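$$E\left(e^{\nu_{jt}}\right) E\left(e^{\varepsilon_{jt}}\right) P^*_{jt} \frac{\partial}{\partial M_{jt}} F(k_{jt}, l_{jt}, m_{jt}) e^{\omega_{jt}} - E\left(e^{\nu^M_{jt}}\right) \rho^*_{jt} = 0,$$

which can be rewritten as

$$\left(\frac{\rho_t M_{jt}}{P_t Y_{jt}}\right) = \frac{E\left(e^{\nu_{jt}}\right) E\left(e^{\varepsilon_{jt}}\right)}{E\left(e^{\nu^M_{jt}}\right)} \frac{\partial}{\partial m_{jt}} f(k_{jt}, l_{jt}, m_{jt}) \frac{e^{\nu^M_{jt}}}{e^{\nu_{jt}} e^{\varepsilon_{jt}}}.$$

Let $\psi_{jt} \equiv \nu_{jt} - \nu^M_{jt}$, and we have

$$\ln\frac{\rho_t M_{jt}}{P_t Y_{jt}} = s_{jt} = \ln D(k_{jt}, l_{jt}, m_{jt}) + \ln \mathcal{E} - \varepsilon_{jt} - \psi_{jt}.$$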

Optimization error Suppose that firms do not exactly know their productivity, ωt, when they

make their intermediate input decision. Instead, they observe a signal about productivity ω∗t =

ωt − ψt, where ψt denotes the noise in the signal. The firm’s profit maximization problem with

respect to intermediate inputs is

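$$M(k_t, l_t, \omega_t) = \arg\max_{M_t} P_t E_{\varepsilon,\psi}\left[F(k_t, l_t, m_t) e^{\omega^*_t+\psi_t+\varepsilon_t}\right] - \rho_t M_t.$$

This implies the following first order condition

$$P_t \frac{\partial}{\partial M_t} F(k_t, l_t, m_t) e^{\omega^*_t} E_{\varepsilon,\psi}\left[e^{\psi_t+\varepsilon_t}\right] = \rho_t.$$

Re-arranging to solve for the share of intermediate inputs gives us the share equation

$$\ln\frac{\rho_t M_t}{P_t Y_t} = s_t = \ln D(k_t, l_t, m_t) + \ln \mathcal{E} - \varepsilon_t - \psi_t.$$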

Notice that for both interpretations of $\psi$, the firm will take into account the value of $E\left[e^{\varepsilon_t+\psi_t}\right]$ when deciding on the level of intermediate inputs, which means we want to correct the share estimates by this term. As in the baseline model, we can recover this term by estimating the share equation, forming the residuals, $\varepsilon_t + \psi_t$, and computing the expectation of $e^{\varepsilon_t+\psi_t}$.
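The correction can be illustrated in the simplest case. The sketch below is a hypothetical example, not the estimator used in the paper: it assumes a Cobb-Douglas technology so that $\ln D$ is the constant $\ln \alpha_m$ (which sidesteps the endogeneity of $m_t$), and it assumes independent normal shocks with made-up standard deviations.

```python
import math
import random

random.seed(1)
alpha_m = 0.6            # hypothetical true elasticity (Cobb-Douglas, so D = alpha_m)
n = 50000
sd_eps, sd_psi = 0.2, 0.1

# Composite shock eps_t + psi_t, mean zero by construction.
shocks = [random.gauss(0, sd_eps) + random.gauss(0, sd_psi) for _ in range(n)]
# Population value of ln E[exp(eps + psi)] for independent normal shocks.
log_E_true = 0.5 * (sd_eps ** 2 + sd_psi ** 2)

# Log shares generated from s_t = ln D + ln E - eps_t - psi_t.
s = [math.log(alpha_m) + log_E_true - u for u in shocks]

# With a constant elasticity the share regression reduces to a sample mean.
intercept = sum(s) / n                          # estimates ln(alpha_m) + ln E
residuals = [intercept - s_i for s_i in s]      # estimates eps_t + psi_t
E_hat = sum(math.exp(r) for r in residuals) / n

alpha_m_naive = math.exp(intercept)             # ignores the correction
alpha_m_hat = alpha_m_naive / E_hat             # corrected estimate
print(alpha_m_naive, alpha_m_hat)
```

The uncorrected estimate overstates the elasticity because the sample mean of the exponentiated residuals exceeds one; dividing by `E_hat` removes that scaling.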

C2.2. Identification

The identification of the share equation is similar to our main specification, but with two differ-

ences. The first is that, since ψt drives intermediate input decisions and is in the residual of the

modified share equation (43), intermediate inputs are now endogenous in the share equation. As

a result, we need to instrument for mt in the share regression. We can use mt−1 as an instrument

for mt, since it is correlated with mt and independent of the error (εt + ψt). Since in the share re-

gression we condition only on kt and lt (and no lags), mt−1 generates variation in mt (conditional

on kt and lt), due to Assumptions 3 and 4. Identification follows from standard nonparametric IV

arguments as in Newey and Powell (2003).

The second difference is that the error in the share equation is εt + ψt instead of εt. We can

form an alternative version of $Y_t$, which we denote $\tilde{Y}_t$:

$$\tilde{Y}_t \equiv y_t - \int D(k_t, l_t, m_t)\, dm_t - (\varepsilon_t + \psi_t) = Y_t - \psi_t. \tag{44}$$

This generates an analogous equation to equation (8) in the paper:

$$\tilde{Y}_t = -C(k_t, l_t) + \omega_t - \psi_t \;\Rightarrow\; \omega_t = \tilde{Y}_t + C(k_t, l_t) + \psi_t.$$

Re-arranging terms and plugging in the Markovian structure of $\omega$ gives us:

$$\tilde{Y}_t = -C(k_t, l_t) + h\Big(\underbrace{\tilde{Y}_{t-1} + C(k_{t-1}, l_{t-1}) + \psi_{t-1}}_{\omega_{t-1}}\Big) + \eta_t - \psi_t, \tag{45}$$

which is an analogue of equation (9).

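The challenge is that we cannot form $\omega_{t-1}$, the argument of $h$ in equation (45), because $\psi_{t-1}$ is not observed. We can, however, construct two noisy measures of $\omega_{t-1}$: $(\omega_{t-1} + \varepsilon_{t-1})$ and $(\omega_{t-1} - \psi_{t-1})$ where

$$\begin{aligned}
\omega_{t-1} + \varepsilon_{t-1} &= y_{t-1} - f(k_{t-1}, l_{t-1}, m_{t-1}) \\
&= y_{t-1} - \int D(k_{t-1}, l_{t-1}, m_{t-1})\, dm_{t-1} + C(k_{t-1}, l_{t-1}) \\
\omega_{t-1} - \psi_{t-1} &= (\omega_{t-1} + \varepsilon_{t-1}) - (\varepsilon_{t-1} + \psi_{t-1}) \\
&= \left(y_{t-1} - \int D(k_{t-1}, l_{t-1}, m_{t-1})\, dm_{t-1} + C(k_{t-1}, l_{t-1})\right) - \left(s_{t-1} - \ln D(k_{t-1}, l_{t-1}, m_{t-1})\right).
\end{aligned}$$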

We could proceed to identify h and C from equation (45) by adopting methods from the mea-

surement error literature (Hu and Schennach (2008) and Cunha, Heckman, and Schennach (2010))

using one of the noisy measures as our measure of ωt−1 and using the other as an instrument.

However, such an exercise is beyond the scope of the current paper.

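Instead, we illustrate our approach using an AR(1) process for the evolution of $\omega$: $\omega_t = \delta_0 + \delta\omega_{t-1} + \eta_t$, i.e., $h(\omega_{t-1}) = \delta_0 + \delta\omega_{t-1}$. We can then re-write equation (45) as

$$\tilde{Y}_t = -C(k_t, l_t) + \delta_0 + \delta\left(\tilde{Y}_{t-1} + C(k_{t-1}, l_{t-1})\right) + \eta_t - \psi_t + \delta\psi_{t-1}, \tag{46}$$

where now the residual is given by $\eta_t - \psi_t + \delta\psi_{t-1}$. Given Assumptions 2 and 7, we have that $E[\eta_t - \psi_t + \delta\psi_{t-1} \mid \Gamma_{t-1}] = 0$, where recall that $\Gamma_{t-1} = \Gamma(I_{t-2})$, i.e., a transformation of the period $t-2$ information set. If we let

$$\tilde{r}\left(k_t, l_t, \tilde{Y}_{t-1}, k_{t-1}, l_{t-1}\right) = -C(k_t, l_t) + \delta_0 + \delta\left(\tilde{Y}_{t-1} + C(k_{t-1}, l_{t-1})\right),$$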

then identification of equation (46) follows from a parallel argument to that in Theorem 5 (i.e., in-

cluding the completeness assumption and following the nonparametric IV identification arguments

in Newey and Powell (2003)). Therefore we can identify the entire production function, up to an

additive constant. We can also identify δ0 and δ, as well as productivity: ω + ε.
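The role of instruments dated one period earlier can be seen in a stripped-down simulation. The sketch below is a hypothetical illustration under simplifying assumptions that are not in the text: it sets $C = 0$ and $\delta_0 = 0$, so that $\tilde{Y}_t = \omega_t - \psi_t$, and uses $\tilde{Y}_{t-2}$ as the single instrument. Least squares is attenuated because $\tilde{Y}_{t-1}$ is correlated with $\psi_{t-1}$ in the residual, while the instrumented estimate is not.

```python
import random

random.seed(2)
DELTA = 0.8                                   # hypothetical AR(1) coefficient
N = 100000                                    # number of simulated firms
SD_OMEGA = (1.0 / (1.0 - DELTA ** 2)) ** 0.5  # stationary sd of omega when sd(eta) = 1

y0, y1, y2 = [], [], []
for _ in range(N):
    w0 = random.gauss(0, SD_OMEGA)
    w1 = DELTA * w0 + random.gauss(0, 1)
    w2 = DELTA * w1 + random.gauss(0, 1)
    # Observed measure is omega_t - psi_t, with psi_t i.i.d. (not persistent).
    y0.append(w0 - random.gauss(0, 1))
    y1.append(w1 - random.gauss(0, 1))
    y2.append(w2 - random.gauss(0, 1))


def cov(x, y):
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


delta_ols = cov(y2, y1) / cov(y1, y1)   # regress Y_t on Y_{t-1}
delta_iv = cov(y2, y0) / cov(y1, y0)    # instrument Y_{t-1} with Y_{t-2}
print(delta_ols, delta_iv)
```

The same logic motivates conditioning on $\Gamma_{t-1}$ in the moment restriction: variables dated $t-2$ and earlier are uncorrelated with both $\psi_t$ and $\psi_{t-1}$.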

C3. Multiple Flexible Inputs

Suppose that, in addition to intermediate inputs being flexible, the researcher believes that one

or more additional inputs are also flexible.44 Our approach can also be extended to handle this

case. In what follows we assume that labor is the additional flexible input, but the approach can be

extended to allow for more than two flexible inputs.

When labor and intermediate inputs are both assumed to be flexible, we have two share equa-

tions. We use superscripts M and L to distinguish them. Given the extra equation, we allow for

additional structural errors in the model, ψ, as described in the preceding sub-section. Our system

44 See, for example, Doraszelski and Jaumandreu (2013).

of equations is thus given by:

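$$\begin{aligned}
s^M_t &= \ln D^M(k_t, l_t, m_t) + \ln \mathcal{E}^M - \varepsilon_t - \psi^M_t \\
s^L_t &= \ln D^L(k_t, l_t, m_t) + \ln \mathcal{E}^L - \varepsilon_t - \psi^L_t \\
y_t &= f(k_t, l_t, m_t) + \omega_t + \varepsilon_t.
\end{aligned}$$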

Nonparametric identification of the flexible input elasticities of L and M proceeds as above in

Appendix C2.

These two input elasticities define a system of partial differential equations of the production function. By the fundamental theorem of calculus we have

$$\int_{m_0}^{m_t} \frac{\partial}{\partial m_t} f(k_t, l_t, m_t)\, dm_t = f(k_t, l_t, m_t) + C^M(k_t, l_t)$$

and

$$\int_{l_0}^{l_t} \frac{\partial}{\partial l_t} f(k_t, l_t, m_t)\, dl_t = f(k_t, l_t, m_t) + C^L(k_t, m_t)$$

where now we have two constants of integration, one for each integrated share equation, $C^M(k_t, l_t)$ and $C^L(k_t, m_t)$. Following directly from Varian (1992), these partial differential equations can be combined to construct the production function as follows:

$$f(k_t, l_t, m_t) = \int_{m_0}^{m_t} \frac{\partial}{\partial m_t} f(k_t, l_0, s)\, ds + \int_{l_0}^{l_t} \frac{\partial}{\partial l_t} f(k_t, \tau, m_t)\, d\tau - C(k_t). \tag{47}$$

That is, by integrating the (log) elasticities of intermediate inputs and labor, we can construct the production function up to a constant that is a function of capital only.45
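The path integration in equation (47) can be checked numerically. The sketch below is illustrative only: the function `f`, its derivatives, and the base point `(L0, M0)` are hypothetical. It integrates the materials elasticity along $m$ at $l = l_0$, then the labor elasticity along $l$ at the target $m$, and confirms that the remaining gap relative to $f$ depends on capital alone.

```python
def f(k, l, m):
    # Hypothetical log production function used only for this check.
    return 0.1 * k + 0.3 * l + 0.5 * m + 0.05 * l * m + 0.02 * k * l - 0.03 * m * m


def f_m(k, l, m):
    # Partial derivative of f with respect to m.
    return 0.5 + 0.05 * l - 0.06 * m


def f_l(k, l, m):
    # Partial derivative of f with respect to l.
    return 0.3 + 0.05 * m + 0.02 * k


def integrate(g, lo, hi, steps=2000):
    """Midpoint-rule approximation of the integral of g from lo to hi."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))


L0, M0 = 1.0, 0.5   # hypothetical lower limits of integration


def integrated_part(k, l, m):
    part_m = integrate(lambda s: f_m(k, L0, s), M0, m)
    part_l = integrate(lambda tau: f_l(k, tau, m), L0, l)
    return part_m + part_l


def gap(k, l, m):
    """What remains of f after removing the two integrals."""
    return f(k, l, m) - integrated_part(k, l, m)


print(gap(2.0, 1.5, 3.0), gap(2.0, 0.7, 1.2), f(2.0, L0, M0))
```

For a fixed `k`, the gap is the same at every `(l, m)` and equals `f(k, L0, M0)`, which is the capital-only term left to be identified in the second stage.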

45 To see why this is the case, evaluate the integrals on the RHS of equation (47); we have the following

$$\begin{aligned}
f(k_t, l_t, m_t) &= \left(f(k_t, l_0, m_t) - C^M(k_t, l_0)\right) - \left(f(k_t, l_0, m_0) - C^M(k_t, l_0)\right) \\
&\quad + \left(f(k_t, l_t, m_t) - C^L(k_t, m_t)\right) - \left(f(k_t, l_0, m_t) - C^L(k_t, m_t)\right) + f(k_t, l_0, m_0) \\
&= f(k_t, l_t, m_t),
\end{aligned}$$

We can now construct an analogue to equation (44) above using the residual from either share

equation. Using the intermediate input share equation, we have

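$$\tilde{Y}_t \equiv y_t - \int_{m_0}^{m_t} \frac{\partial}{\partial m_t} f(k_t, l_0, s)\, ds - \int_{l_0}^{l_t} \frac{\partial}{\partial l_t} f(k_t, \tau, m_t)\, d\tau - \left(\varepsilon_t + \psi^M_t\right) \tag{48}$$

By subtracting equation (48) from the production function and re-arranging terms we have

$$\tilde{Y}_t = -C(k_t) + \omega_t - \psi^M_t.$$

Plugging in the Markovian structure of $\omega$ gives us

$$\tilde{Y}_t = -C(k_t) + h\Big(\underbrace{\tilde{Y}_{t-1} + C(k_{t-1}) + \psi^M_{t-1}}_{\omega_{t-1}}\Big) + \eta_t - \psi^M_t, \tag{49}$$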

an analogue to equation (45) above. Identification of C and h can be achieved in the same way as

described in C2 for equation (45), with the difference that in this case C only depends on capital.

C4. Revenue Production Functions

We now show that our empirical strategy can be extended to the setting with imperfect competition

and revenue production functions such that 1) we solve the identification problem with flexible

inputs and 2) we can recover time-varying industry markups.46 We specify a generalized version

of the demand system in Klette and Griliches (1996) and De Loecker (2011),

$$\frac{P_{jt}}{\Pi_t} = \left(\frac{Y_{jt}}{Y_t}\right)^{\frac{1}{\sigma_t}} e^{\chi_{jt}}, \tag{50}$$

where $f(k_t, l_0, m_0) \equiv C(k_t)$ is a constant of integration that is a function of capital $k_t$.

46 This stands in contrast to the Klette and Griliches (1996) approach that can only allow for a markup that is time-invariant.

where Pjt is the output price of firm j, Πt is the industry price index, Yt is a quantity index that

plays the role of an aggregate demand shifter,47 χjt is an observable (to the firm) demand shock,

and σt is the elasticity of demand that is allowed to vary over time.

Substituting for price using equation (50), the firm’s first order condition with respect to Mjt

in the (expected) profit maximization problem is

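$$\left(\frac{1}{\sigma_t} + 1\right) \Pi_t \frac{Y_{jt}^{\frac{1}{\sigma_t}}}{Y_t^{\frac{1}{\sigma_t}}} \frac{1}{e^{\frac{1}{\sigma_t}\varepsilon_{jt}}} \frac{\partial}{\partial M_{jt}} F(k_{jt}, l_{jt}, m_{jt})\, e^{\chi_{jt}} E\left[e^{\varepsilon_{jt}\left(\frac{1}{\sigma_t}+1\right)}\right] = \rho_t.$$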

Following the same strategy as before, we can rewrite this expression in terms of the observed log

revenue share, which becomes

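$$s_{jt} = \ln\left(\frac{1}{\sigma_t} + 1\right) + \ln\left(D(k_{jt}, l_{jt}, m_{jt})\, E\left[e^{\varepsilon_{jt}\left(\frac{1}{\sigma_t}+1\right)}\right]\right) - \left(\frac{1}{\sigma_t} + 1\right)\varepsilon_{jt}, \tag{51}$$

where $s_{jt} \equiv \ln\left(\frac{\rho_t M_{jt}}{P_{jt} Y_{jt}}\right)$, $\frac{1}{\left(\frac{1}{\sigma_t}+1\right)}$ is the expected markup, $D(\cdot)$ is the output elasticity of intermediate inputs, and $\varepsilon_{jt}$ is the ex-post shock. Equation (51) nests the one obtained for the perfectly competitive case in (15), the only difference being the addition of the expected markup, which is equal to 1 under perfect competition.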

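We now show how to use the share regression (51) to identify production functions among imperfectly competitive firms. Letting $\tilde{\varepsilon}_{jt} = \left(\frac{1}{\sigma_t} + 1\right)\varepsilon_{jt}$, equation (51) becomes

$$s_{jt} = \Upsilon_t + \ln D(k_{jt}, l_{jt}, m_{jt}) + \ln \tilde{\mathcal{E}} - \tilde{\varepsilon}_{jt}, \tag{52}$$

where $\tilde{\mathcal{E}} = E\left[e^{\tilde{\varepsilon}_{jt}}\right]$ and $\Upsilon_t = \ln\left(\frac{1}{\sigma_t} + 1\right)$. The intermediate input elasticity can be rewritten so that we can break it into two parts: a component that varies with inputs and a constant $\mu$, i.e.,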

47 As noted by Klette and Griliches (1996) and De Loecker (2011), Yt can be calculated using a market-share weighted average of deflated revenues.

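$\ln D(k_{jt}, l_{jt}, m_{jt}) = \ln D^{\mu}(k_{jt}, l_{jt}, m_{jt}) + \mu$. Then, equation (52) becomes

$$\begin{aligned}
s_{jt} &= (\Upsilon_t + \mu) + \ln \tilde{\mathcal{E}} + \ln D^{\mu}(k_{jt}, l_{jt}, m_{jt}) - \tilde{\varepsilon}_{jt} \\
&= \varphi_t + \ln \tilde{\mathcal{E}} + \ln D^{\mu}(k_{jt}, l_{jt}, m_{jt}) - \tilde{\varepsilon}_{jt}.
\end{aligned} \tag{53}$$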

As equation (53) makes clear, without observing prices, we can nonparametrically recover the

scaled ex-post shock εjt (and hence E), the output elasticity of intermediate inputs up to a constant

lnDµ(kjt, ljt,mjt) = lnD(kjt, ljt,mjt)−µ, and the time-varying markups up to the same constant,

ϕt = Υt + µ, using time dummies for ϕt. Recovering the growth pattern of markups over time is

useful as an independent result as it can, for example, be used to check whether market power has

increased over time, or to analyze the behavior of market power with respect to the business cycle.
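As a rough illustration of how time dummies in the share regression trace out markup growth, consider the following sketch. It is a hypothetical example under simplifying assumptions that are not in the text: the elasticity is a constant `alpha_m`, the scaled shock is drawn directly with the same distribution in every period, and the demand elasticities in `sigma` are made up. Period means of the log share then identify $\varphi_t$ up to a common constant, so their differences recover the change in the log expected markup.

```python
import math
import random

random.seed(3)
alpha_m = 0.5                            # hypothetical constant elasticity
sigma = {1: -6.0, 2: -5.0, 3: -4.0}      # hypothetical demand elasticities by period
sd = 0.2                                 # sd of the scaled ex-post shock
n = 20000                                # firms per period
log_E = 0.5 * sd ** 2                    # ln E[exp(shock)] for a normal shock


def upsilon(sig):
    """Upsilon_t = ln(1/sigma_t + 1); the log expected markup is -Upsilon_t."""
    return math.log(1.0 / sig + 1.0)


period_mean = {}
for t, sig in sigma.items():
    shares = [upsilon(sig) + math.log(alpha_m) + log_E - random.gauss(0, sd)
              for _ in range(n)]
    period_mean[t] = sum(shares) / n     # plays the role of the time dummy

# Growth of the log expected markup relative to period 1.
markup_growth_hat = {t: -(period_mean[t] - period_mean[1]) for t in sigma}
markup_growth_true = {t: -(upsilon(sigma[t]) - upsilon(sigma[1])) for t in sigma}
print(markup_growth_hat)
print(markup_growth_true)
```

The level of the markup is not recovered at this stage because the unknown constant $\mu$ is absorbed in every period mean; only differences across periods are informative here.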

As before, we can correct our estimates for $\tilde{\mathcal{E}}$ and solve the differential equation that arises from equation (53). However, because we can still only identify the elasticity up to the constant $\mu$, we have to be careful about keeping track of it as we can only calculate $\int D^{\mu}(k_{jt}, l_{jt}, m_{jt})\, dm_{jt} = e^{-\mu}\int D(k_{jt}, l_{jt}, m_{jt})\, dm_{jt}$. It follows that

$$f(k_{jt}, l_{jt}, m_{jt})\, e^{-\mu} + C(k_{jt}, l_{jt})\, e^{-\mu} = \int D^{\mu}(k_{jt}, l_{jt}, m_{jt})\, dm_{jt}.$$

From this equation it is immediately apparent that, without further information, we will not be able

to separate the integration constant C (kjt, ljt) from the unknown constant µ.

To see how both the constant µ and the constant of integration can be recovered, notice that

what we observe in the data is the firm’s real revenue, which in logs is given by rjt = (pjt − πt) +

yjt. Recalling equation (2), and replacing for pjt−πt using (50), the observed log-revenue produc-

tion function is

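$$r_{jt} = \left(\frac{1}{\sigma_t} + 1\right) f(k_{jt}, l_{jt}, m_{jt}) - \frac{1}{\sigma_t} y_t + \chi_{jt} + \left(\frac{1}{\sigma_t} + 1\right)\omega_{jt} + \tilde{\varepsilon}_{jt}. \tag{54}$$

However, we can write $\left(1 + \frac{1}{\sigma_t}\right) = e^{\varphi_t} e^{-\mu}$. We know $\varphi_t$ from our analysis above, so only $\mu$ is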

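unknown. Replacing back into (54) we get

$$r_{jt} = e^{\varphi_t} e^{-\mu} f(k_{jt}, l_{jt}, m_{jt}) - \left(e^{\varphi_t} e^{-\mu} - 1\right) y_t + \left[\left(e^{\varphi_t} e^{-\mu}\right)\omega_{jt} + \chi_{jt}\right] + \tilde{\varepsilon}_{jt}. \tag{55}$$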

We then follow a similar strategy as before. As in equation (8), we first form an observable

variable

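$$R_{jt} \equiv \ln\left(\frac{\left(\frac{P_{jt} Y_{jt}}{\Pi_t}\right)}{e^{\tilde{\varepsilon}_{jt}}\, e^{\,e^{\varphi_t}\int D^{\mu}(k_{jt}, l_{jt}, m_{jt})\, dm_{jt}}}\right),$$

where we now use revenues (the measure of output we observe), include $e^{\varphi_t}$, as well as using $D^{\mu}$ instead of the, for now, unobservable $D$. Replacing into (55) we obtain

$$R_{jt} = -e^{\varphi_t - \mu} C(k_{jt}, l_{jt}) - \left(e^{\varphi_t} e^{-\mu} - 1\right) y_t + \left[\left(e^{\varphi_t} e^{-\mu}\right)\omega_{jt} + \chi_{jt}\right].$$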

From this equation it is clear that the constant µ will be identified from variation in the observed

demand shifter yt. Without having recovered ϕt from the share regression first, it would not be

possible to identify time-varying markups. Note that in equation (54) both σt and yt change with

time, and hence yt cannot be used to identify σt unless we restrict σt = σ as in Klette and Griliches

(1996) and De Loecker (2011).

Finally, we can only recover a linear combination of productivity and the demand shock, $\left(1 + \frac{1}{\sigma_t}\right)\omega_{jt} + \chi_{jt}$. The reason is clear: since we do not observe prices, we have no way of disentangling whether, after controlling for inputs, a firm has higher revenues because it is more productive ($\omega_{jt}$) or because it can sell at a higher price ($\chi_{jt}$). We can write $\omega^{\mu}_{jt} = \left(1 + \frac{1}{\sigma_t}\right)\omega_{jt} + \chi_{jt}$ as a function of the parts that remain to be recovered

$$\omega^{\mu}_{jt} = R_{jt} + e^{\varphi_t - \mu} C(k_{jt}, l_{jt}) + \left(e^{\varphi_t} e^{-\mu} - 1\right) y_t,$$

and impose the Markovian assumption on this combination:48 $\omega^{\mu}_{jt} = h\left(\omega^{\mu}_{jt-1}\right) + \eta^{\mu}_{jt}$. We can use similar moment restrictions as before, $E\left(\eta^{\mu}_{jt} \mid k_{jt}, l_{jt}\right) = 0$, to identify the constant of integration $C(k_{jt}, l_{jt})$ as well as $\mu$ (and hence the level of the markups).

48 In this case, one would need to replace Assumption 2 with the assumption that the weighted sum of productivity ωjt and the demand shock χjt is Markovian. Note that this assumption does not necessarily imply that the two components will be Markovian individually. See De Loecker (2011) for an example that imposes this assumption.

Appendix D: Additional Results

Table D1: Average Input Elasticities of Output -- Energy+Services Flexible
(Structural Estimates: Gross Output)

All columns report Gross Output (GNR) estimates.

                            Food       Textiles   Apparel    Wood       Fabricated All
                            Products                         Products   Metals
                            (311)      (321)      (322)      (331)      (381)

Colombia
Labor                       0.15       0.21       0.37       0.28       0.29       0.22
                            (0.02)     (0.03)     (0.03)     (0.06)     (0.03)     (0.01)
Capital                     0.06       0.05       0.05       0.01       0.04       0.08
                            (0.01)     (0.02)     (0.01)     (0.03)     (0.02)     (0.01)
Raw Materials               0.71       0.69       0.49       0.63       0.58       0.63
                            (0.02)     (0.03)     (0.04)     (0.07)     (0.03)     (0.01)
Energy+Services             0.08       0.11       0.09       0.10       0.11       0.11
                            (0.00)     (0.00)     (0.00)     (0.00)     (0.00)     (0.00)
Sum                         1.01       1.05       1.00       1.02       1.02       1.04
                            (0.01)     (0.02)     (0.01)     (0.03)     (0.01)     (0.01)
Mean(Capital)/Mean(Labor)   0.43       0.23       0.12       0.04       0.14       0.37
                            (0.08)     (0.14)     (0.04)     (0.08)     (0.06)     (0.04)

Chile
Labor                       0.18       0.28       0.31       0.29       0.31       0.22
                            (0.02)     (0.03)     (0.03)     (0.04)     (0.02)     (0.01)
Capital                     0.06       0.08       0.04       0.06       0.08       0.11
                            (0.01)     (0.01)     (0.01)     (0.02)     (0.01)     (0.01)
Raw Materials               0.77       0.65       0.65       0.59       0.63       0.67
                            (0.02)     (0.02)     (0.02)     (0.05)     (0.02)     (0.01)
Energy+Services             0.07       0.07       0.06       0.11       0.07       0.07
                            (0.00)     (0.00)     (0.00)     (0.00)     (0.00)     (0.00)
Sum                         1.08       1.08       1.06       1.05       1.10       1.08
                            (0.01)     (0.01)     (0.01)     (0.02)     (0.01)     (0.00)
Mean(Capital)/Mean(Labor)   0.36       0.28       0.13       0.20       0.26       0.49
                            (0.04)     (0.05)     (0.04)     (0.06)     (0.04)     (0.03)

Notes:

a. Standard errors are estimated using the bootstrap with 200 replications and are reported in parentheses below the point estimates.

b. For each industry, the numbers are based on a gross output specification in which energy+services is flexible and raw materials is quasi-fixed. The results are estimated using a complete polynomial series of degree 2 for each of the two nonparametric functions (G and C ) of our approach.

c. Since the input elasticities are heterogeneous across firms, we report the average input elasticities within each given industry.

d. The row titled "Sum" reports the sum of the average labor, capital, raw materials, and energy+services elasticities, and the row titled "Mean(Capital)/Mean(Labor)" reports the ratio of the average capital elasticity to the average labor elasticity.

Table D2: Heterogeneity in Productivity -- Energy+Services Flexible
(Structural Estimates: Gross Output)

All columns report Gross Output (GNR) estimates.

                            Food       Textiles   Apparel    Wood       Fabricated All
                            Products                         Products   Metals
                            (311)      (321)      (322)      (331)      (381)

Colombia
75/25 ratio                 1.20       1.25       1.24       1.30       1.28       1.33
                            (0.02)     (0.03)     (0.03)     (0.06)     (0.02)     (0.01)
90/10 ratio                 1.50       1.62       1.60       1.75       1.68       1.83
                            (0.05)     (0.10)     (0.07)     (0.16)     (0.06)     (0.03)
95/5 ratio                  1.87       2.09       2.00       2.26       2.04       2.43
                            (0.09)     (0.22)     (0.11)     (0.24)     (0.11)     (0.08)
Exporter                    0.14       -0.04      0.02       0.14       0.08       0.01
                            (0.04)     (0.06)     (0.03)     (0.12)     (0.02)     (0.03)
Importer                    0.00       -0.03      -0.03      -0.04      0.10       -0.02
                            (0.02)     (0.06)     (0.03)     (0.05)     (0.02)     (0.05)
Advertiser                  -0.12      -0.13      -0.10      -0.07      0.05       -0.16
                            (0.03)     (0.11)     (0.04)     (0.06)     (0.02)     (0.05)
Wages > Median              0.06       0.13       0.14       0.09       0.19       0.10
                            (0.02)     (0.05)     (0.02)     (0.05)     (0.02)     (0.04)

Chile
75/25 ratio                 1.31       1.42       1.41       1.44       1.48       1.49
                            (0.01)     (0.02)     (0.02)     (0.04)     (0.02)     (0.01)
90/10 ratio                 1.76       2.04       2.01       2.13       2.23       2.26
                            (0.03)     (0.05)     (0.04)     (0.12)     (0.05)     (0.02)
95/5 ratio                  2.22       2.69       2.65       2.94       2.94       3.08
                            (0.05)     (0.12)     (0.07)     (0.21)     (0.08)     (0.04)
Exporter                    -0.01      -0.06      0.01       0.01       -0.05      -0.03
                            (0.03)     (0.03)     (0.03)     (0.05)     (0.03)     (0.01)
Importer                    0.02       0.04       0.07       0.09       0.05       0.07
                            (0.02)     (0.02)     (0.02)     (0.04)     (0.02)     (0.01)
Advertiser                  -0.01      -0.01      0.01       0.01       0.00       0.02
                            (0.01)     (0.02)     (0.01)     (0.01)     (0.02)     (0.01)
Wages > Median              0.12       0.15       0.19       0.17       0.16       0.25
                            (0.01)     (0.02)     (0.02)     (0.03)     (0.03)     (0.01)

Notes:

a. Standard errors are estimated using the bootstrap with 200 replications and are reported in parentheses below the point estimates.

b. For each industry, the numbers are based on a gross output specification in which energy+services is flexible and raw materials is quasi-fixed. The results are estimated using a complete polynomial series of degree 2 for each of the two nonparametric functions (G and C ) of our approach.

c. In the first three rows we report ratios of productivity for plants at various percentiles of the productivity distribution. In the remaining four rows we report estimates of the productivity differences between plants (as a fraction) based on whether they have exported some of their output, imported intermediate inputs, spent money on advertising, and paid wages above the industry median. For example, in industry 311 for Chile a firm that advertises is, on average, 1% less productive than a firm that does not advertise.

Table D3: Average Input Elasticities of Output -- Raw Materials Flexible
(Structural Estimates: Gross Output)

All columns report Gross Output (GNR) estimates.

                            Food       Textiles   Apparel    Wood       Fabricated All
                            Products                         Products   Metals
                            (311)      (321)      (322)      (331)      (381)

Colombia
Labor                       0.13       0.21       0.33       0.28       0.30       0.24
                            (0.02)     (0.04)     (0.03)     (0.06)     (0.02)     (0.01)
Capital                     0.05       0.07       0.02       0.01       0.05       0.06
                            (0.01)     (0.03)     (0.02)     (0.04)     (0.02)     (0.02)
Raw Materials               0.60       0.44       0.41       0.42       0.42       0.43
                            (0.01)     (0.01)     (0.01)     (0.02)     (0.01)     (0.01)
Energy+Services             0.22       0.30       0.23       0.26       0.28       0.29
                            (0.02)     (0.05)     (0.03)     (0.07)     (0.04)     (0.02)
Sum                         1.00       1.01       1.00       0.97       1.05       1.02
                            (0.01)     (0.04)     (0.03)     (0.05)     (0.04)     (0.01)
Mean(Capital)/Mean(Labor)   0.39       0.33       0.07       0.04       0.16       0.26
                            (0.12)     (0.26)     (0.06)     (0.15)     (0.07)     (0.09)

Chile
Labor                       0.19       0.34       0.35       0.38       0.42       0.26
                            (0.02)     (0.03)     (0.04)     (0.04)     (0.05)     (0.02)
Capital                     0.06       0.07       0.06       0.06       0.12       0.11
                            (0.01)     (0.02)     (0.02)     (0.02)     (0.03)     (0.01)
Raw Materials               0.59       0.47       0.49       0.46       0.43       0.47
                            (0.00)     (0.01)     (0.01)     (0.01)     (0.01)     (0.00)
Energy+Services             0.21       0.18       0.15       0.16       0.18       0.21
                            (0.02)     (0.03)     (0.04)     (0.03)     (0.06)     (0.02)
Sum                         1.05       1.06       1.05       1.07       1.14       1.05
                            (0.02)     (0.02)     (0.02)     (0.02)     (0.03)     (0.01)
Mean(Capital)/Mean(Labor)   0.32       0.20       0.16       0.16       0.28       0.41
                            (0.04)     (0.06)     (0.04)     (0.06)     (0.07)     (0.04)

Notes:

a. Standard errors are estimated using the bootstrap with 200 replications and are reported in parentheses below the point estimates.

b. For each industry, the numbers are based on a gross output specification in which raw materials is flexible and energy+services is quasi-fixed. The results are estimated using a complete polynomial series of degree 2 for each of the two nonparametric functions (G and C ) of our approach.

c. Since the input elasticities are heterogeneous across firms, we report the average input elasticities within each given industry.

d. The row titled "Sum" reports the sum of the average labor, capital, raw materials, and energy+services elasticities, and the row titled "Mean(Capital)/Mean(Labor)" reports the ratio of the average capital elasticity to the average labor elasticity.


Table D4: Heterogeneity in Productivity--Raw Materials Flexible (Structural Estimates: Gross Output)

Specification in all columns: Gross Output (GNR)

                            Food        Textiles    Apparel     Wood        Fabricated  All
                            Products                            Products    Metals
                            (311)       (321)       (322)       (331)       (381)

Colombia
75/25 ratio                 1.24        1.27        1.25        1.24        1.24        1.31
                            (0.02)      (0.07)      (0.03)      (0.07)      (0.07)      (0.02)
90/10 ratio                 1.58        1.64        1.59        1.62        1.57        1.72
                            (0.03)      (0.16)      (0.08)      (0.19)      (0.19)      (0.05)
95/5 ratio                  1.92        2.10        1.90        2.01        1.89        2.13
                            (0.06)      (0.24)      (0.13)      (0.41)      (0.30)      (0.07)
Exporter                    0.09        -0.01       0.04        0.05        0.01        0.04
                            (0.04)      (0.09)      (0.07)      (0.18)      (0.11)      (0.01)
Importer                    0.02        0.03        0.08        -0.03       0.05        0.07
                            (0.02)      (0.09)      (0.10)      (0.15)      (0.08)      (0.01)
Advertiser                  -0.06       0.01        -0.01       -0.02       -0.02       -0.03
                            (0.02)      (0.05)      (0.04)      (0.05)      (0.05)      (0.01)
Wages > Median              0.04        0.11        0.14        0.09        0.11        0.13
                            (0.02)      (0.07)      (0.03)      (0.06)      (0.07)      (0.01)

Chile
75/25 ratio                 1.34        1.44        1.40        1.50        1.52        1.51
                            (0.02)      (0.02)      (0.02)      (0.02)      (0.04)      (0.01)
90/10 ratio                 1.82        2.13        2.05        2.31        2.26        2.32
                            (0.07)      (0.06)      (0.06)      (0.06)      (0.13)      (0.02)
95/5 ratio                  2.32        2.80        2.70        3.09        2.98        3.19
                            (0.12)      (0.10)      (0.10)      (0.11)      (0.25)      (0.04)
Exporter                    -0.02       0.01        0.05        -0.01       -0.03       0.01
                            (0.06)      (0.03)      (0.03)      (0.03)      (0.03)      (0.01)
Importer                    0.06        0.06        0.09        0.12        0.06        0.13
                            (0.06)      (0.03)      (0.03)      (0.04)      (0.03)      (0.01)
Advertiser                  0.00        0.02        0.02        0.03        -0.02       0.04
                            (0.02)      (0.02)      (0.03)      (0.01)      (0.02)      (0.01)
Wages > Median              0.13        0.15        0.18        0.19        0.17        0.25
                            (0.04)      (0.03)      (0.03)      (0.02)      (0.04)      (0.01)

Notes:

a. Standard errors are estimated using the bootstrap with 200 replications and are reported in parentheses below the point estimates.

b. For each industry, the numbers are based on a gross output specification in which raw materials is flexible and energy+services is quasi-fixed. The results are estimated using a complete polynomial series of degree 2 for each of the two nonparametric functions (G and C) of our approach.

c. In the first three rows we report ratios of productivity for plants at various percentiles of the productivity distribution. In the remaining four rows we report estimates of the productivity differences between plants (as a fraction) based on whether they have exported some of their output, imported intermediate inputs, spent money on advertising, and paid wages above the industry median. For example, in industry 311 for Chile a firm that advertises is, on average, 0% less productive than a firm that does not advertise.


Table D5: Average Input Elasticities of Output--Fixed Effects (Structural Estimates: Gross Output)

Specification in all columns: Gross Output (GNR)

                            Food        Textiles    Apparel     Wood        Fabricated  All
                            Products                            Products    Metals
                            (311)       (321)       (322)       (331)       (381)

Colombia
Labor                       0.18        0.26        0.39        0.46        0.29        0.33
                            (0.05)      (0.07)      (0.04)      (0.12)      (0.09)      (0.02)
Capital                     0.09        0.04        0.06        0.25        0.09        0.07
                            (0.07)      (0.06)      (0.04)      (0.16)      (0.11)      (0.02)
Intermediates               0.67        0.54        0.52        0.51        0.53        0.54
                            (0.01)      (0.01)      (0.01)      (0.02)      (0.01)      (0.00)
Sum                         0.95        0.84        0.96        1.22        0.90        0.95
                            (0.12)      (0.11)      (0.07)      (0.26)      (0.18)      (0.04)
Mean(Capital)/Mean(Labor)   0.52        0.16        0.15        0.55        0.30        0.21
                            (1.42)      (0.66)      (0.09)      (0.34)      (0.35)      (0.07)

Chile
Labor                       0.20        0.33        0.50        0.37        0.60        0.30
                            (0.03)      (0.07)      (0.05)      (0.03)      (0.15)      (0.02)
Capital                     0.02        0.08        0.17        0.10        0.32        0.15
                            (0.06)      (0.09)      (0.07)      (0.06)      (0.15)      (0.05)
Intermediates               0.67        0.54        0.56        0.59        0.50        0.55
                            (0.00)      (0.01)      (0.01)      (0.01)      (0.01)      (0.00)
Sum                         0.89        0.95        1.24        1.05        1.42        1.01
                            (0.08)      (0.13)      (0.11)      (0.07)      (0.29)      (0.07)
Mean(Capital)/Mean(Labor)   0.09        0.23        0.35        0.27        0.53        0.50
                            (0.25)      (0.23)      (0.11)      (0.14)      (0.35)      (0.15)

Notes:

a. Standard errors are estimated using the bootstrap with 200 replications and are reported in parentheses below the point estimates.

b. For each industry, the numbers are based on a gross output specification with fixed effects and are estimated using a complete polynomial series of degree 2 for each of the two nonparametric functions (G and C) of our approach.

c. Since the input elasticities are heterogeneous across firms, we report the average input elasticities within each given industry.

d. The row titled "Sum" reports the sum of the average labor, capital, and intermediate input elasticities, and the row titled "Mean(Capital)/Mean(Labor)" reports the ratio of the average capital elasticity to the average labor elasticity.


Table D6: Heterogeneity in Productivity--Fixed Effects (Structural Estimates: Gross Output)

Specification in all columns: Gross Output (GNR)

                            Food        Textiles    Apparel     Wood        Fabricated  All
                            Products                            Products    Metals
                            (311)       (321)       (322)       (331)       (381)

Colombia
75/25 ratio                 1.36        1.67        1.30        1.58        1.52        1.52
                            (0.32)      (0.40)      (0.06)      (0.46)      (0.32)      (0.08)
90/10 ratio                 1.82        2.82        1.70        2.71        2.20        2.21
                            (1.25)      (1.71)      (0.18)      (2.82)      (1.04)      (0.25)
95/5 ratio                  2.30        4.14        2.09        4.04        2.78        2.84
                            (2.66)      (4.01)      (0.35)      (13.90)     (1.75)      (0.41)
Exporter                    0.26        0.25        0.10        -0.04       0.41        0.22
                            (0.35)      (0.95)      (0.19)      (2.64)      (0.50)      (0.11)
Importer                    0.13        0.35        0.19        -0.08       0.32        0.27
                            (0.29)      (0.76)      (0.25)      (2.25)      (0.37)      (0.10)
Advertiser                  0.01        0.32        0.07        -0.17       0.19        0.14
                            (0.09)      (0.30)      (0.08)      (0.41)      (0.24)      (0.04)
Wages > Median              0.17        0.45        0.20        0.02        0.40        0.37
                            (0.26)      (0.53)      (0.06)      (0.51)      (0.33)      (0.09)

Chile
75/25 ratio                 1.57        1.60        1.52        1.52        2.06        1.57
                            (0.15)      (0.17)      (0.12)      (0.13)      (0.36)      (0.15)
90/10 ratio                 2.41        2.55        2.40        2.34        4.48        2.45
                            (0.40)      (0.59)      (0.45)      (0.48)      (1.27)      (0.45)
95/5 ratio                  3.14        3.38        3.38        3.20        7.30        3.41
                            (0.61)      (1.20)      (0.98)      (0.99)      (2.55)      (0.77)
Exporter                    0.34        0.07        -0.07       0.07        -0.42       0.14
                            (0.23)      (0.21)      (0.12)      (0.42)      (0.38)      (0.24)
Importer                    0.51        0.17        -0.02       0.22        -0.25       0.25
                            (0.26)      (0.18)      (0.11)      (0.31)      (0.39)      (0.21)
Advertiser                  0.22        0.13        -0.06       0.04        -0.20       0.12
                            (0.11)      (0.14)      (0.09)      (0.07)      (0.23)      (0.10)
Wages > Median              0.50        0.28        0.10        0.23        -0.12       0.39
                            (0.20)      (0.17)      (0.08)      (0.15)      (0.44)      (0.22)

Notes:

a. Standard errors are estimated using the bootstrap with 200 replications and are reported in parentheses below the point estimates.

b. For each industry, the numbers are based on a gross output specification with fixed effects and are estimated using a complete polynomial series of degree 2 for each of the two nonparametric functions (G and C) of our approach.

c. In the first three rows we report ratios of productivity for plants at various percentiles of the productivity distribution. In the remaining four rows we report estimates of the productivity differences between plants (as a fraction) based on whether they have exported some of their output, imported intermediate inputs, spent money on advertising, and paid wages above the industry median. For example, in industry 311 for Chile a firm that advertises is, on average, 22% more productive than a firm that does not advertise.


Table D7: Average Input Elasticities of Output--Extra Unobservable (Structural Estimates: Gross Output)

Specification in all columns: Gross Output (GNR)

                            Food        Textiles    Apparel     Wood        Fabricated  All
                            Products                            Products    Metals
                            (311)       (321)       (322)       (331)       (381)

Colombia
Labor                       0.18        0.32        0.39        0.45        0.40        0.36
                            (0.04)      (0.04)      (0.03)      (0.07)      (0.03)      (0.01)
Capital                     0.13        0.18        0.08        -0.01       0.11        0.15
                            (0.03)      (0.02)      (0.02)      (0.04)      (0.02)      (0.01)
Intermediates               0.67        0.54        0.52        0.51        0.53        0.54
                            (0.01)      (0.01)      (0.01)      (0.01)      (0.01)      (0.00)
Sum                         0.98        1.03        0.99        0.95        1.03        1.05
                            (0.02)      (0.03)      (0.02)      (0.08)      (0.02)      (0.01)
Mean(Capital)/Mean(Labor)   0.72        0.55        0.21        -0.02       0.27        0.41

Chile
Labor                       0.24        0.44        0.45        0.37        0.52        0.36
                            (0.01)      (0.03)      (0.02)      (0.03)      (0.03)      (0.01)
Capital                     0.12        0.11        0.07        0.09        0.14        0.17
                            (0.01)      (0.02)      (0.01)      (0.02)      (0.01)      (0.01)
Intermediates               0.66        0.54        0.55        0.59        0.50        0.55
                            (0.00)      (0.01)      (0.01)      (0.01)      (0.01)      (0.00)
Sum                         1.02        1.09        1.08        1.04        1.16        1.08
                            (0.01)      (0.02)      (0.02)      (0.02)      (0.02)      (0.01)
Mean(Capital)/Mean(Labor)   0.50        0.26        0.15        0.25        0.28        0.48
                            (0.05)      (0.05)      (0.04)      (0.05)      (0.04)      (0.02)

Notes:

a. Standard errors are estimated using the bootstrap with 200 replications and are reported in parentheses below the point estimates.

b. For each industry, the numbers are based on a gross output specification with fixed effects and are estimated using a complete polynomial series of degree 2 for each of the two nonparametric functions (G and C) of our approach.

c. Since the input elasticities are heterogeneous across firms, we report the average input elasticities within each given industry.

d. The row titled "Sum" reports the sum of the average labor, capital, and intermediate input elasticities, and the row titled "Mean(Capital)/Mean(Labor)" reports the ratio of the average capital elasticity to the average labor elasticity.


Table D8: Heterogeneity in Productivity--Extra Unobservable (Structural Estimates: Gross Output)

Specification in all columns: Gross Output (GNR)

                            Food        Textiles    Apparel     Wood        Fabricated  All
                            Products                            Products    Metals
                            (311)       (321)       (322)       (331)       (381)

Colombia
75/25 ratio                 1.35        1.35        1.29        1.35        1.32        1.37
                            (0.04)      (0.03)      (0.02)      (0.08)      (0.03)      (0.01)
90/10 ratio                 1.82        1.83        1.68        1.94        1.76        1.88
                            (0.13)      (0.08)      (0.06)      (0.30)      (0.05)      (0.02)
95/5 ratio                  2.36        2.34        2.03        2.57        2.18        2.37
                            (0.26)      (0.17)      (0.11)      (0.70)      (0.09)      (0.03)
Exporter                    0.15        0.03        0.04        0.32        0.11        0.06
                            (0.07)      (0.04)      (0.04)      (0.25)      (0.04)      (0.01)
Importer                    0.03        0.05        0.11        0.14        0.12        0.11
                            (0.04)      (0.04)      (0.04)      (0.15)      (0.03)      (0.01)
Advertiser                  -0.03       0.05        0.03        0.07        0.07        0.02
                            (0.03)      (0.04)      (0.03)      (0.11)      (0.03)      (0.01)
Wages > Median              0.09        0.18        0.18        0.21        0.23        0.19
                            (0.04)      (0.04)      (0.02)      (0.10)      (0.03)      (0.01)

Chile
75/25 ratio                 1.37        1.49        1.43        1.51        1.54        1.55
                            (0.01)      (0.03)      (0.02)      (0.02)      (0.02)      (0.01)
90/10 ratio                 1.92        2.18        2.12        2.35        2.35        2.40
                            (0.03)      (0.09)      (0.04)      (0.05)      (0.06)      (0.02)
95/5 ratio                  2.51        2.93        2.77        3.15        3.12        3.33
                            (0.06)      (0.18)      (0.08)      (0.11)      (0.12)      (0.04)
Exporter                    0.00        0.03        0.09        0.00        -0.02       0.03
                            (0.04)      (0.05)      (0.03)      (0.04)      (0.03)      (0.01)
Importer                    0.12        0.10        0.13        0.15        0.09        0.15
                            (0.04)      (0.04)      (0.02)      (0.04)      (0.03)      (0.01)
Advertiser                  0.04        0.04        0.06        0.04        0.01        0.06
                            (0.02)      (0.03)      (0.02)      (0.02)      (0.02)      (0.01)
Wages > Median              0.20        0.19        0.22        0.22        0.20        0.30
                            (0.03)      (0.04)      (0.02)      (0.03)      (0.03)      (0.01)

Notes:

a. Standard errors are estimated using the bootstrap with 200 replications and are reported in parentheses below the point estimates.

b. For each industry, the numbers are based on a gross output specification with fixed effects and are estimated using a complete polynomial series of degree 2 for each of the two nonparametric functions (G and C) of our approach.

c. In the first three rows we report ratios of productivity for plants at various percentiles of the productivity distribution. In the remaining four rows we report estimates of the productivity differences between plants (as a fraction) based on whether they have exported some of their output, imported intermediate inputs, spent money on advertising, and paid wages above the industry median. For example, in industry 311 for Chile a firm that advertises is, on average, 4% more productive than a firm that does not advertise.


Online Appendix O1: Monte Carlo Simulations

We consider a panel of 500 firms over 30 periods. To simplify the problem we abstract away from

labor and consider the following Cobb-Douglas production function

$$Y_{jt} = K_{jt}^{\alpha_k} M_{jt}^{\alpha_m} e^{\omega_{jt}+\varepsilon_{jt}},$$

where αk = 0.25, αm = 0.65, and εjt is measurement error that is distributed N (0, 0.07). ωjt

follows an AR(1) process

ωjt = δ0 + δωjt−1 + ηjt,

where δ0 = 0.2, δ = 0.8, and ηjt ∼ N (0, 0.04). We select the variances of the errors and the AR(1)

parameters to roughly correspond to the estimates from our Chilean and Colombian datasets.
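The productivity process above can be sketched in a few lines. This is a minimal illustration, not the authors' simulation code: it assumes that 0.04 is the variance of ηjt (the text refers to "the variances of the errors"), and it borrows the U(1, 3) initial draw and the 200-period burn-in described later in this appendix. The function name and its arguments are hypothetical.

```python
import numpy as np

def simulate_productivity(rng, n_firms=500, n_keep=30, n_total=200,
                          delta0=0.2, delta=0.8, var_eta=0.04):
    """Simulate omega_jt = delta0 + delta * omega_{jt-1} + eta_jt for each firm."""
    omega = rng.uniform(1.0, 3.0, size=n_firms)       # initial condition, U(1, 3)
    path = np.empty((n_firms, n_total))
    path[:, 0] = omega
    for t in range(1, n_total):
        # Assumption: var_eta is a variance, so the standard deviation is its square root.
        eta = rng.normal(0.0, np.sqrt(var_eta), size=n_firms)
        omega = delta0 + delta * omega + eta
        path[:, t] = omega
    return path[:, -n_keep:]                           # keep only the last n_keep periods

rng = np.random.default_rng(0)
omega = simulate_productivity(rng)
print(omega.shape)
print(omega.mean(), omega.var())
```

Under these assumptions the stationary mean of ωjt is δ0/(1 − δ) = 1 and its stationary variance is 0.04/(1 − 0.8²) ≈ 0.11, so the printed sample moments should be close to those values.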

The environment facing the firms is the following. At the beginning of each period, firms

choose investment Ijt and intermediate inputs Mjt. Investment determines the next period’s capital

stock via the law of motion for capital

Kjt+1 = (1− dj)Kjt + Ijt,

where dj ∈ {0.05, 0.075, 0.10, 0.125, 0.15} is the depreciation rate which is distributed uniformly

across firms. Intermediate inputs are subject to quadratic adjustment costs of the form

$$C^M_{jt} = 0.5\,b\,\frac{(M_{jt} - M_{jt-1})^2}{M_{jt}},$$

where b is a parameter that indexes the level of adjustment costs, which we vary in our simulations.
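As a small worked illustration of these two ingredients, the law of motion for capital and the quadratic adjustment cost can be written as plain functions. The function names and the numerical inputs below are hypothetical and chosen only to make the formulas concrete.

```python
def next_capital(capital, investment, depreciation):
    """Law of motion: K_{jt+1} = (1 - d_j) * K_jt + I_jt."""
    return (1.0 - depreciation) * capital + investment

def adjustment_cost(m_now, m_prev, b):
    """Quadratic adjustment cost: 0.5 * b * (M_jt - M_{jt-1})^2 / M_jt."""
    return 0.5 * b * (m_now - m_prev) ** 2 / m_now

# Depreciation rates used in the simulations, one of which is assigned to each firm.
DEPRECIATION_RATES = (0.05, 0.075, 0.10, 0.125, 0.15)

for d in DEPRECIATION_RATES:
    print(d, next_capital(100.0, 10.0, d))

# With b = 0 the cost vanishes; with b > 0 it grows with the squared change in M.
print(adjustment_cost(12.0, 10.0, b=0.0), adjustment_cost(12.0, 10.0, b=1.0))
```

Dividing by Mjt makes the cost depend on the size of the change relative to the current input level, so the same absolute change is cheaper for a firm that uses more intermediate inputs.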

Firms choose investment and intermediate inputs to maximize expected discounted profits. The

problem of the firm, written in recursive form, is thus given by

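$$
\begin{aligned}
V(K_{jt}, M_{jt-1}, \omega_{jt}) = \max_{I_{jt}, M_{jt}} \; & P_t K_{jt}^{\alpha_k} M_{jt}^{\alpha_m} e^{\omega_{jt}} - P^I_t I_{jt} - \rho_t M_{jt} \\
& - 0.5\,b\,\frac{(M_{jt} - M_{jt-1})^2}{M_{jt}} + \beta E_t V(K_{jt+1}, M_{jt}, \omega_{jt+1}) \\
\text{s.t.} \quad & K_{jt+1} = (1 - d_j) K_{jt} + I_{jt}, \\
& I_{jt} \geq 0, \; M_{jt} \geq 0, \\
& \omega_{jt+1} = \delta_0 + \delta \omega_{jt} + \eta_{jt+1}.
\end{aligned}
$$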

The price of output $P_t$ and the price of intermediate inputs $\rho_t$ are set to 1. The price of investment $P^I_t$ is set to 8, and there are no other costs to investment. The discount factor is set to 0.985.

In order for our Monte Carlo simulations not to depend on the initial distributions of (k,m, ω),

we simulate each firm for a total of 200 periods, saving only the last 30 periods. The initial

conditions, k1, m0, and ω1 are drawn from the following distributions: U (11, 400) , U (11, 400) ,

and U (1, 3). Since the firm’s problem does not have an analytical solution, we solve the problem

numerically by value function iteration with an intermediate modified policy iteration with 100

steps, using a multi-linear interpolant for both the value and policy functions.49

O1.1. Inference

In the first set of Monte Carlo simulations, we provide evidence that our bootstrap procedure has the correct coverage for our estimator. For this set of simulations, we set the adjustment cost parameter for intermediate inputs, b, to zero to correspond with our DGP. We begin by simulating 500 samples, each consisting of 500 firms over 30 periods. For each sample we nonparametrically bootstrap the data 199 times.50 For each bootstrap replication we estimate the output elasticities of capital and intermediate inputs using our procedure as described in Section 6. We then compute the 95% bootstrap confidence interval using the 199 bootstrap replications. This generates 500 bootstrap confidence intervals, one for each sample. We then count how many times (out of 500) the true values of the output elasticities (i.e., 0.25 and 0.65) lie within the bootstrap confidence interval. The results are presented graphically in Figures O1.1A and O1.1B. The true value of the elasticity is contained inside the 95% confidence interval 95.4% (capital) and 94.2% (intermediate inputs) of the time. Hence, for both the capital and intermediate elasticities the coverage is close to the nominal 95% level, suggesting that we can use our bootstrap procedure to do inference even in the nonparametric case.

49 See Judd (1998) for details.

50 See Davidson and MacKinnon (2004).
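The coverage exercise can be illustrated with a deliberately simplified sketch. It is not the nonparametric estimator of Section 6. It assumes the Cobb-Douglas case with no adjustment costs and prices normalized to one, in which the first-order condition implies that the log intermediate share equals ln(αm E[e^ε]) − εjt, so the share regression reduces to a constant. It also assumes that 0.07 is the variance of εjt and that the bootstrap resamples whole firms. All function names and the reduced sample sizes are hypothetical.

```python
import numpy as np

ALPHA_M = 0.65   # true intermediate input elasticity from the text
VAR_EPS = 0.07   # assumption: 0.07 is the variance of epsilon

def simulate_log_shares(rng, n_firms, n_periods):
    """Log intermediate share implied by the static first-order condition."""
    eps = rng.normal(0.0, np.sqrt(VAR_EPS), size=(n_firms, n_periods))
    mean_exp_eps = np.exp(VAR_EPS / 2.0)              # E[exp(eps)] for normal eps
    return np.log(ALPHA_M * mean_exp_eps) - eps

def estimate_alpha_m(log_share):
    """Constant-only share regression, corrected by the estimate of E[exp(eps)]."""
    gamma = log_share.mean()                          # estimates ln(alpha_m * E[exp(eps)])
    eps_hat = gamma - log_share                       # implied ex-post shocks
    return np.exp(gamma) / np.exp(eps_hat).mean()

def bootstrap_ci(rng, log_share, n_boot=199, level=0.95):
    """Percentile confidence interval from resampling firms with replacement."""
    n_firms = log_share.shape[0]
    draws = np.empty(n_boot)
    for b in range(n_boot):
        firms = rng.integers(0, n_firms, size=n_firms)
        draws[b] = estimate_alpha_m(log_share[firms])
    tail = 100.0 * (1.0 - level) / 2.0
    return np.percentile(draws, [tail, 100.0 - tail])

def coverage(rng, n_samples=100, n_firms=200, n_periods=10, n_boot=199):
    """Share of Monte Carlo samples whose interval contains the true elasticity."""
    hits = 0
    for _ in range(n_samples):
        data = simulate_log_shares(rng, n_firms, n_periods)
        lower, upper = bootstrap_ci(rng, data, n_boot)
        hits += int(lower <= ALPHA_M <= upper)
    return hits / n_samples

print(coverage(np.random.default_rng(1)))
```

The sample sizes here are smaller than the 500 samples of 500 firms over 30 periods used in the text, purely to keep the sketch fast; the printed coverage rate should be in the neighborhood of the nominal 95% level.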

O1.2. Estimator Performance

For our second set of Monte Carlo simulations, we evaluate how well our estimator performs

when the first-order condition for intermediate inputs does not hold exactly. We first generate 100

Monte Carlo samples for each of 9 values of the adjustment cost parameter b, ranging from zero

adjustment costs to very large adjustment costs. For the largest value, b = 1, this would imply

that firms in our Chilean and Colombian datasets, on average, pay substantial adjustment costs

for intermediate inputs of almost 10% of the value of total gross output. For each sample we

estimate the average capital and intermediate input elasticities in two ways. As a benchmark, we

first obtain estimates using a simple version of dynamic panel with no fixed effects. The reason we

use dynamic panel is that, in light of our non-identification arguments in Section 3, this procedure

provides consistent estimates under the presence of adjustment costs. We compare these estimates

to ones obtained via our nonparametric procedure, which assumes adjustment costs of zero.

We impose the (true) Cobb-Douglas parametric form in the estimation of the dynamic panel (but not in our nonparametric procedure) to give dynamic panel the best possible chance of recovering the true parameters and to minimize the associated standard errors. Given the Cobb-Douglas structure and the AR(1) process for productivity, we have

yjt − αkkjt − αmmjt − δ0 − δ (yjt−1 − αkkjt−1 − αmmjt−1) = ηjt − δεjt−1 + εjt.

The dynamic panel procedure estimates the parameter vector (αk, αm, δ0, δ) by forming moments in the composite error on the right-hand side (RHS) of the equation above. Specifically, we use a constant and kjt, kjt−1, and mjt−1 as the instruments.
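The moment conditions just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the inputs are balanced (firms × periods) arrays of logs, the weighting matrix is the identity, and the function names are hypothetical.

```python
import numpy as np

def dynamic_panel_moments(theta, y, k, m):
    """Sample analogue of E[u_jt * z_jt], with z_jt = (1, k_jt, k_{jt-1}, m_{jt-1}).

    y, k, m are (firms x periods) arrays of log output, capital, and intermediates.
    """
    alpha_k, alpha_m, delta0, delta = theta
    resid = y - alpha_k * k - alpha_m * m              # omega_jt + eps_jt at the true parameters
    # Quasi-differenced residual: eta_jt - delta * eps_{jt-1} + eps_jt at the truth.
    u = resid[:, 1:] - delta0 - delta * resid[:, :-1]
    instruments = (np.ones_like(u), k[:, 1:], k[:, :-1], m[:, :-1])
    return np.array([(u * z).mean() for z in instruments])

def gmm_objective(theta, y, k, m):
    """Quadratic form in the moments with an identity weighting matrix (an assumption)."""
    g = dynamic_panel_moments(theta, y, k, m)
    return float(g @ g)

# Tiny placeholder arrays, only to show the call signature and output shape.
rng = np.random.default_rng(0)
y_demo, k_demo, m_demo = rng.normal(size=(3, 50, 6))
print(dynamic_panel_moments((0.25, 0.65, 0.2, 0.8), y_demo, k_demo, m_demo).shape)
```

With four parameters and four instruments the system is exactly identified, so minimizing `gmm_objective` with any standard numerical optimizer amounts to setting the four sample moments to zero whenever a solution exists.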

Since the novel part of our procedure relates to the intermediate input elasticity via the first

stage, we focus on the intermediate input elasticity estimates. The comparison for the capital

elasticities is very similar. The results are presented graphically in Figures O1.2A and O1.2B. Not

surprisingly, the dynamic panel data method breaks down and becomes very unstable for small

values of adjustment costs, as these costs are insufficient to provide identifying variation via the

lags. This is reflected both in the large percentile ranges and in the fact that the average estimates

bounce around the truth. Our method on the other hand performs very well, as expected. This is the

case even though for dynamic panel we impose and exploit the constraint that the true technology

is Cobb-Douglas, whereas for our procedure we do not.

As we increase the level of adjustment costs, our nonparametric method experiences a small

upward bias relative to the truth and relative to dynamic panel, although in some cases our estimates

are quite close to those of dynamic panel. The percentile range for dynamic panel is much larger,

however. So while on average dynamic panel performs slightly better for large values of adjustment

costs, the uncertainty in the estimates is larger. Overall our procedure performs remarkably well,

even for very large values of adjustment costs. At the largest value, our average estimated elasticity of 0.689 is less than 4 percentage points larger than the truth of 0.65.


Figure O1.1A: Monte Carlo: Inference--Capital Elasticity

Distribution of 95% Bootstrap Confidence Intervals

Notes: This figure presents the results from applying our estimator to Monte Carlo data generated as described in Online Appendix O1.1 in the absence of adjustment costs. For each simulation we nonparametrically bootstrap the data 199 times. For each bootstrap replication we estimate the output elasticity of capital using our procedure as described in Section 6. We then compute the 95% bootstrap confidence intervals using these replications. This generates a confidence interval for each of the 500 Monte Carlo samples. In the figure we plot the lower and upper boundaries of the confidence intervals for each Monte Carlo sample. The simulations are sorted by the mid-point of these intervals. The true value of the elasticity is 0.25. 95.4% of the constructed confidence intervals cover the true value.


Figure O1.1B: Monte Carlo: Inference--Intermediate Elasticity

Distribution of 95% Bootstrap Confidence Intervals

Notes: This figure presents the results from applying our estimator to Monte Carlo data generated as described in Online Appendix O1.1 in the absence of adjustment costs. For each simulation we nonparametrically bootstrap the data 199 times. For each bootstrap replication we estimate the output elasticity of intermediate inputs using our procedure as described in Section 6. We then compute the 95% bootstrap confidence intervals using these replications. This generates a confidence interval for each of the 500 Monte Carlo samples. In the figure we plot the lower and upper boundaries of the confidence intervals for each Monte Carlo sample. The simulations are sorted by the mid-point of these intervals. The true value of the elasticity is 0.65. 94.2% of the constructed confidence intervals cover the true value.


Figure O1.2A: Monte Carlo: Estimator Performance--Intermediate Input Elasticity

GNR and Dynamic Panel: Averages

Notes: This figure presents the results from applying both our estimator and a dynamic panel data estimator to Monte Carlo data generated as described in Online Appendix O1.2. The data are generated such that the first order condition no longer holds because of quadratic adjustment costs. The parameter b indexes the degree of adjustment costs in intermediate inputs facing the firm, with higher values representing larger adjustment costs. The y-axis measures the average estimated intermediate input elasticity for both estimators across 100 Monte Carlo simulations. The true value of the average elasticity is 0.65.

[Figure: y-axis, Average Intermediate Input Elasticity; x-axis, Value of Adjustment Cost Parameter (b); series, Dynamic Panel and GNR.]


Figure O1.2B: Monte Carlo: Estimator Performance--Intermediate Input Elasticity

GNR and Dynamic Panel: Percentile Ranges

Notes: This figure presents the results from applying both our estimator and a dynamic panel data estimator to Monte Carlo data generated as described in Online Appendix O1.2. The data are generated such that the first order condition no longer holds because of quadratic adjustment costs. The parameter b indexes the degree of adjustment costs in intermediate inputs facing the firm, with higher values representing larger adjustment costs. The y-axis measures the 2.5 and 97.5 percentiles of the estimated intermediate input elasticity for both estimators across 100 Monte Carlo simulations. The true value of the average elasticity is 0.65.

[Figure: y-axis, Average Intermediate Input Elasticity; x-axis, Value of Adjustment Cost Parameter (b); series, Dynamic Panel (97.5 percentile), GNR (97.5 percentile), Dynamic Panel (2.5 percentile), and GNR (2.5 percentile).]


Online Appendix O2: Value Added

In this appendix we provide additional details regarding value added.

Restricted Profit Functions

Recall equation (24) in the main body:

$$VA^E_t = F(k_t, l_t, m_t)\,e^{\omega_t} - M_t \equiv V(k_t, l_t, e^{\omega_t}).$$

It can be shown that the total derivative of value added with respect to one of its inputs is equal to

the partial derivative of gross output with respect to that input. For example, the total derivative of

value added with respect to productivity is given by:

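$$
\begin{aligned}
\frac{dVA^E_t}{de^{\omega_t}} &= \frac{dV(k_t, l_t, e^{\omega_t})}{de^{\omega_t}} \\
&= \left[\frac{\partial F(k_t, l_t, m_t)\,e^{\omega_t}}{\partial e^{\omega_t}} - \left(\frac{\partial F(k_t, l_t, m_t)\,e^{\omega_t}}{\partial M_t} - 1\right)\frac{\partial M_t}{\partial e^{\omega_t}}\right] = \frac{\partial GO_t}{\partial e^{\omega_t}}.
\end{aligned}
$$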

Due to the first order condition in equation (14) in the main text, the term inside the parentheses

on the second line is equal to zero, where the relative price of output to intermediate inputs has

been normalized to one via deflation. This implies that:

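$$
\underbrace{\left(\frac{\partial GO_t}{\partial e^{\omega_t}}\frac{e^{\omega_t}}{GO_t}\right)}_{\mathrm{elas}^{GO_t}_{e^{\omega_t}}}
= \underbrace{\left(\frac{dVA^E_t}{de^{\omega_t}}\frac{e^{\omega_t}}{VA^E_t}\right)}_{\mathrm{elas}^{VA^E_t}_{e^{\omega_t}}}\frac{VA^E_t}{GO^E_t}
= \underbrace{\left(\frac{dVA^E_t}{de^{\omega_t}}\frac{e^{\omega_t}}{VA^E_t}\right)}_{\mathrm{elas}^{VA^E_t}_{e^{\omega_t}}}\left(1 - S_t\right).
$$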
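The rescaling result can be checked numerically in a simple special case. The sketch below assumes a Cobb-Douglas gross output function with no ex-post shock and a relative price of one; the elasticity values, input levels, and function names are hypothetical and serve only to illustrate the identity.

```python
import numpy as np

# Hypothetical Cobb-Douglas elasticities, chosen only for illustration.
ALPHA_K, ALPHA_L, ALPHA_M = 0.25, 0.10, 0.65

def gross_output(k, l, m, prod):
    """GO = K^ak * L^al * M^am * exp(omega), with prod standing in for exp(omega)."""
    return k**ALPHA_K * l**ALPHA_L * m**ALPHA_M * prod

def optimal_intermediates(k, l, prod):
    """M that solves the first-order condition dGO/dM = 1."""
    scale = k**ALPHA_K * l**ALPHA_L
    return (ALPHA_M * scale * prod) ** (1.0 / (1.0 - ALPHA_M))

def value_added(k, l, prod):
    """Value added with M re-optimized as productivity changes."""
    m = optimal_intermediates(k, l, prod)
    return gross_output(k, l, m, prod) - m

def elasticity(func, x, step=1e-6):
    """Central log-difference approximation to d ln func / d ln x."""
    up, down = x * (1.0 + step), x * (1.0 - step)
    return (np.log(func(up)) - np.log(func(down))) / (np.log(up) - np.log(down))

k, l, prod = 2.0, 3.0, 1.5
m_star = optimal_intermediates(k, l, prod)
share = m_star / gross_output(k, l, m_star, prod)                     # S_t

elas_va = elasticity(lambda p: value_added(k, l, p), prod)            # total derivative
elas_go = elasticity(lambda p: gross_output(k, l, m_star, p), prod)   # partial, M held fixed

print(elas_go, elas_va * (1.0 - share))
```

In this special case the intermediate share equals αm and the value-added elasticity equals 1/(1 − αm), so multiplying by (1 − St) recovers the gross output elasticity of one.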

However, once we add back in the ex-post shocks we have the following:

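$$
\frac{dVA^E_t}{de^{\omega_t}} = \frac{dV(k_t, l_t, e^{\omega_t}, e^{\varepsilon_t})}{de^{\omega_t}}
= \left[\frac{\partial F(k_t, l_t, m_t)\,e^{\omega_t+\varepsilon_t}}{\partial e^{\omega_t}} - \left(\frac{\partial F(k_t, l_t, m_t)\,e^{\omega_t+\varepsilon_t}}{\partial M_t} - 1\right)\frac{\partial M_t}{\partial e^{\omega_t}}\right].
$$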


Notice now that the term inside the parentheses is no longer equal to zero, due to the presence of

the ex-post shock, εt. The reason is that the first order condition, which previously made that term

equal to zero, is an ex-ante object, whereas what is inside the parentheses is ex-post. Therefore,

we cannot simply transform the value added elasticities into their gross output counterparts by

re-scaling via the ratio of value added to gross output.

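The first order condition implies that $\frac{\partial F(k_t, l_t, m_t)\,e^{\omega_t+\varepsilon_t}}{\partial M_t} = \frac{e^{\varepsilon_t}}{\mathcal{E}}$. In turn, this implies that

$$
\frac{dVA^E_t}{de^{\omega_t}} = \left[\frac{\partial F(k_t, l_t, m_t)\,e^{\omega_t+\varepsilon_t}}{\partial e^{\omega_t}} - \left(\frac{e^{\varepsilon_t}}{\mathcal{E}} - 1\right)\frac{\partial M_t}{\partial e^{\omega_t}}\right]
\;\Rightarrow\;
\mathrm{elas}^{VA^E_t}_{e^{\omega_t}} = \mathrm{elas}^{GO_t}_{e^{\omega_t}}\frac{GO_t}{VA^E_t} - \frac{\partial M_t}{\partial e^{\omega_t}}\frac{e^{\omega_t}}{VA^E_t}\left(\frac{e^{\varepsilon_t}}{\mathcal{E}} - 1\right).
$$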

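The equation above can then be rearranged to form the relationship between the elasticities as:

$$
\underbrace{\left(\frac{\partial GO_t}{\partial e^{\omega_t}}\frac{e^{\omega_t}}{GO_t}\right)}_{\mathrm{elas}^{GO_t}_{e^{\omega_t}}}
= \underbrace{\left(\frac{\partial VA^E_t}{\partial e^{\omega_t}}\frac{e^{\omega_t}}{VA^E_t}\right)}_{\mathrm{elas}^{VA^E_t}_{e^{\omega_t}}}\left(1 - S_t\right) + \frac{\partial M_t}{\partial e^{\omega_t}}\frac{e^{\omega_t}}{GO_t}\left(\frac{e^{\varepsilon_t}}{\mathcal{E}} - 1\right).
$$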

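A similar result holds when we analyze the elasticities with respect to the entire productivity shock, $e^{\omega_t+\varepsilon_t}$, instead of just the persistent component, $e^{\omega_t}$. In this case we have the following relationship:

$$
\underbrace{\left(\frac{\partial GO_t}{\partial e^{\omega_t+\varepsilon_t}}\frac{e^{\omega_t+\varepsilon_t}}{GO_t}\right)}_{\mathrm{elas}^{GO_t}_{e^{\omega_t+\varepsilon_t}}}
= \underbrace{\left(\frac{\partial VA^E_t}{\partial e^{\omega_t+\varepsilon_t}}\frac{e^{\omega_t+\varepsilon_t}}{VA^E_t}\right)}_{\mathrm{elas}^{VA^E_t}_{e^{\omega_t+\varepsilon_t}}}\left(1 - S_t\right) + \frac{\partial M_t}{\partial e^{\omega_t+\varepsilon_t}}\frac{e^{\omega_t+\varepsilon_t}}{GO_t}\left(\frac{e^{\varepsilon_t}}{\mathcal{E}} - 1\right).
$$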

“Structural” Value Added

As discussed in Section 8.2, for the Leontief case we have

$$Y_t = \min\left[H(k_t, l_t),\, C(m_t)\right] e^{\omega_t+\varepsilon_t}. \tag{56}$$

The standard Leontief condition, $H(k_t, l_t) = C(m_t)$, will not generally hold unless $C(m_t) = aM_t$. Even in this linear case, the value added production function, $H(k_t, l_t)\,e^{\omega_t+\varepsilon_t}$, does not relate cleanly to the empirical measure of value added $VA^E_t \equiv Y_t - M_t$, since $VA^E_t = H(k_t, l_t)\left(e^{\omega_t+\varepsilon_t} - \frac{1}{a}\right)$.


However, it does correspond directly to gross output since

$$Y_t = H(k_t, l_t)\,e^{\omega_t+\varepsilon_t}.$$
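A short numerical check makes the linear Leontief case concrete. The values of H, a, ω, and ε below are hypothetical, and the function name is an illustrative assumption; the sketch simply imposes the condition H(kt, lt) = aMt and compares the resulting quantities.

```python
import math

def leontief_outcomes(h, a, omega, eps):
    """Return (Y, M, VA^E) when the condition H(k, l) = a * M holds, so M = h / a."""
    m = h / a
    y = min(h, a * m) * math.exp(omega + eps)
    return y, m, y - m

h, a, omega, eps = 4.0, 2.0, 0.3, -0.1
y, m, va = leontief_outcomes(h, a, omega, eps)

# Empirical value added equals H * (exp(omega + eps) - 1 / a) ...
print(va, h * (math.exp(omega + eps) - 1.0 / a))
# ... whereas gross output, not value added, equals H * exp(omega + eps).
print(y, h * math.exp(omega + eps))
```

The gap between the two printed lines is exactly Mt = H/a, which is why the structural value-added function lines up with gross output rather than with measured value added in this case.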

Neither of these issues is resolved by moving $\omega_t$ inside the min function in equation (56). Suppose that instead of equation (56), one wrote the production function as: $Y_t = \min\left[H(k_t, l_t)\,e^{\omega_t},\, C(m_t)\right] e^{\varepsilon_t}$. For similar reasons, the condition, $H(k_t, l_t)\,e^{\omega_t} = C(m_t)$, only holds when $C(m_t) = aM_t$. Even when this is the case, the value added production function will again not correspond to the empirical measure of value added as $VA^E_t = H(k_t, l_t)\,e^{\omega_t}\left(e^{\varepsilon_t} - \frac{1}{a}\right)$. As was the case above, however, it directly corresponds to gross output: $Y_t = H(k_t, l_t)\,e^{\omega_t+\varepsilon_t}$.

It is also the case that moving $\varepsilon_t$ inside the min function does not help. The problem is that the key condition, $H(k_t, l_t)\,e^{\varepsilon_t} = aM_t$, will not hold when $\omega_t$ is outside the min because of the presence of the ex-post shock $\varepsilon_t$. Since $\varepsilon_t$ is realized after input decisions are made, the key condition will generally not hold. Thus neither $VA^E_t$ nor gross output $Y_t$ corresponds to the structural value-added production function $H(k_t, l_t)\,e^{\omega_t+\varepsilon_t}$. An analogous argument holds when $\omega_t$ is inside the min function.

References Online Appendix

Davidson, Russell and James G. MacKinnon. 2004. Econometric Theory and Methods, vol. 5. New York: Oxford University Press.

Judd, Kenneth L. 1998. Numerical Methods in Economics. Cambridge: MIT Press.