Stochastic DEA: Myths and misconceptions
Timo Kuosmanen (HSE & MTT)
Andrew Johnson (Texas A&M University)
Mika Kortelainen (University of Manchester)
XI EWEPA 2009, Pisa, Italy
What is stochastic DEA?
“DEA is truly a stochastic frontier estimation method, and it is incorrect to classify it as a deterministic method.”
Banker & Natarajan (2008) Operations Research, p. 49
What is stochastic DEA?
• The term stochastic (from Greek “στόχος”, meaning “aim” or “guess”) generally refers to statistical random variation
Elements of random variation in DEA
• Random sampling of observations from the production possibility set (sampling error)
• Random sampling of observations outside the production possibility set (outliers)
• Random outcome of production process (stochastic technology)
• Random measurement errors, omitted variables, and other disturbances (stochastic noise)
Common myths and misconceptions
• Confusing stochastic noise with sampling variation, outliers, or stochastic technology
• Statistical inference on sampling error is believed to improve robustness to noise
• Robustness to outliers is seen as the same as robustness to noise (or at least closely related)
Sampling error
[Figure sequence (axes: input x, output y): the true frontier; a random sample of observations (DMUs, firms); the DEA frontier fitted to the sample.]
Statistical foundation of DEA
– Banker (1993) Management Science
– Korostelev, Simar & Tsybakov (1995) Annals Stat.
– Kneip, Park & Simar (1998) Econometric Theory
– Simar & Wilson (2000) JPA
• Deterministic technology
• No outliers or noise
• Data randomly sampled from the production possibility set (PPS)
• DEA frontier converges to the true frontier as the sample size approaches infinity
• In a finite sample, DEA frontier is downward biased
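The downward bias and the convergence can be illustrated with a small simulation. The sketch below is a hypothetical setup that does not come from the slides: one input, one output, an assumed frontier f(x) = sqrt(x), uniformly distributed inputs, and multiplicative inefficiency with no noise. The function names are illustrative only.

```python
import random

def true_frontier(x):
    # Assumed frontier for illustration: concave and increasing.
    return x ** 0.5

def dea_frontier(xs, ys, x0):
    """VRS DEA frontier estimate at input level x0 (one input, one output).

    In this case the DEA linear program has an optimal solution that mixes
    at most two observations, so the optimum can be found by enumeration.
    Returns -inf if no observation uses input <= x0.
    """
    best = float("-inf")
    low = [(x, y) for x, y in zip(xs, ys) if x <= x0]
    high = [(x, y) for x, y in zip(xs, ys) if x > x0]
    for xi, yi in low:
        best = max(best, yi)              # observation i on its own
        for xj, yj in high:
            if yj > yi:
                # Convex combination of i and j that uses exactly x0 input.
                w = (x0 - xi) / (xj - xi)
                best = max(best, yi + w * (yj - yi))
    return best

def simulate(n, seed=0):
    """Noise-free sample: every observation lies on or below the frontier."""
    rng = random.Random(seed)
    xs = [rng.uniform(1.0, 10.0) for _ in range(n)]
    ys = [true_frontier(x) * rng.uniform(0.5, 1.0) for x in xs]
    return xs, ys

xs, ys = simulate(400)
x0 = 5.5
gaps = {}
for n in (25, 100, 400):
    # Nested samples: the first n observations of the same draw.
    est = dea_frontier(xs[:n], ys[:n], x0)
    gaps[n] = true_frontier(x0) - est
    print(f"n={n:4d}  DEA={est:.4f}  true={true_frontier(x0):.4f}  gap={gaps[n]:.4f}")
```

Because the samples are nested, the gap between the true frontier and the DEA estimate cannot grow as n increases, and it stays non-negative because all observations are feasible and the assumed technology is convex.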
Statistical foundation of DEA
• Statistical inference on sampling error is possible by using
– Asymptotic sampling distribution (Banker 1993)
– Bootstrapping (Simar & Wilson 1998)
• Such inferences have nothing to do with
– outliers
– stochastic technology
– stochastic noise
Bootstrapping
• The purpose of the smooth consistent bootstrap (Simar & Wilson 1998, 2000) is to mimic the original random sampling in order to estimate the sampling bias
• The bias-corrected DEA frontier will always lie above the original DEA frontier
• In noisy data, DEA tends to overestimate the frontier
• Assuming away noise and “correcting” for the small-sample bias by bootstrapping shifts the frontier upward
=> If noise is a problem, then bias correction will only make it worse
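The argument can be checked numerically. The sketch below is a hypothetical experiment, not the simulation behind the slides: it assumes a frontier f(x) = sqrt(x), normal noise, and half-normal inefficiency. The bias estimate is a crude half-sample stand-in and is NOT the Simar & Wilson smooth bootstrap. It shares only the relevant property that the correction is non-negative.

```python
import math
import random

def dea_frontier(xs, ys, x0):
    """VRS DEA frontier at x0 for one input and one output (enumeration)."""
    best = float("-inf")
    low = [(x, y) for x, y in zip(xs, ys) if x <= x0]
    high = [(x, y) for x, y in zip(xs, ys) if x > x0]
    for xi, yi in low:
        best = max(best, yi)
        for xj, yj in high:
            if yj > yi:
                w = (x0 - xi) / (xj - xi)
                best = max(best, yi + w * (yj - yi))
    return best

rng = random.Random(1)
n, x0 = 600, 5.5
xs = [rng.uniform(1.0, 10.0) for _ in range(n)]
# Assumed data-generating process: y = sqrt(x) * exp(v - u),
# with noise v ~ N(0, 0.2^2) and inefficiency u = |N(0, 0.1^2)|.
ys = [math.sqrt(x) * math.exp(rng.gauss(0.0, 0.2) - abs(rng.gauss(0.0, 0.1)))
      for x in xs]

truth = math.sqrt(x0)
dea = dea_frontier(xs, ys, x0)

# Crude stand-in for a bootstrap bias estimate: the average shortfall of the
# frontier fitted to random half-samples relative to the full-sample frontier.
reps, shortfall = 20, 0.0
indices = list(range(n))
for _ in range(reps):
    half = rng.sample(indices, n // 2)
    shortfall += dea - dea_frontier([xs[i] for i in half],
                                    [ys[i] for i in half], x0)
bias_estimate = shortfall / reps      # non-negative: a subsample never envelops more
corrected = dea + bias_estimate

print(f"true={truth:.3f}  DEA={dea:.3f}  corrected={corrected:.3f}")
```

Under these assumptions the DEA frontier already lies above the true frontier at x0, so adding any non-negative correction moves the estimate further away from the truth.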
Simulated example
[Figure sequence (axes: input x, output y): simulated data points with the true frontier; then the DEA frontier added; then the bias-corrected frontier added.]
Critique of Löthgren & Tambour (LT)
“LT bootstrap involves measuring the distance from a different, random (as opposed to fixed) point to the [frontier] on each replication of the bootstrap Monte Carlo exercise. It seems entirely unclear what this procedure estimates. Certainly, it does not estimate anything of interest.”
… “LT method assumes not only that [the frontier] is unknown, but also (implicitly) that the point from which one wishes to measure distance to the frontier is unknown. This is absurd.”
Simar & Wilson (2000), JPA, pp. 67-68.
Outliers
[Figure sequence (axes: input x, output y): the true frontier with outliers marked; then the DEA frontier.]
Outliers
– Super-efficiency approach (Wilson 1995 JPA)
– Peeling the onion; context-dependent DEA (Seiford & Zhu 1999 Management Science)
– Robust efficiency measures / efficiency depth (Kuosmanen & Post 1999 DP; Cherchye, Kuosmanen & Post 2000 DP)
– Conditional order-m and order-α quantile frontiers (Aragon, Daouia & Thomas-Agnan 2002 DP; Cazals, Florens & Simar 2002 J Econometrics; Daouia & Simar 2007 J Econometrics; Daraio & Simar 2007 book)
• Deterministic technology
• Improve robustness to outliers by not enveloping the most extreme observations
• Outliers are different from noise
– Noise affects all observations
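The idea of not enveloping the most extreme observations can be sketched with an output-oriented order-m frontier, in the spirit of the order-m literature cited above. Instead of the largest output among all units using at most x0 input (the FDH frontier), it takes the expected largest output among m units drawn at random, with replacement, from that set. The function names and toy data are hypothetical, and this unconditional one-input, one-output version is a simplification.

```python
def fdh_frontier(xs, ys, x0):
    """FDH frontier at x0: largest output among units with input <= x0."""
    return max(y for x, y in zip(xs, ys) if x <= x0)

def order_m_frontier(xs, ys, x0, m):
    """Expected maximum output of m units drawn with replacement from the
    units with input <= x0 (exact formula, no Monte Carlo)."""
    dominated = sorted(y for x, y in zip(xs, ys) if x <= x0)
    n = len(dominated)
    # For distinct outputs, the maximum of m draws equals the k-th smallest
    # value with probability (k/n)^m - ((k-1)/n)^m.
    return sum(y * ((k / n) ** m - ((k - 1) / n) ** m)
               for k, y in enumerate(dominated, start=1))

xs = [1.0, 2.0, 3.0, 4.0, 9.0]
ys = [1.0, 2.0, 3.0, 10.0, 4.0]   # the unit (4.0, 10.0) plays the outlier
x0 = 5.0

print(fdh_frontier(xs, ys, x0))             # 10.0
print(order_m_frontier(xs, ys, x0, m=2))    # 5.75
```

A small m damps the influence of the single extreme unit, while a large m brings the order-m frontier back toward the FDH frontier. Nothing in this construction models noise in the remaining observations.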
Stochastic technology
[Figure (axes: input x, output y): quantile frontiers of a stochastic technology, labelled Pr[f(x) ≤ f] = 0.05, 0.50, and 0.95.]
Chance constrained DEA
– Land, Lovell & Thore (1993) Managerial & Decision Econ.
– Olesen & Petersen (1995) Management Science
– Cooper, Huang & Li (1996) Annals of OR
– Huang & Li (2001) JPA
• Stochastic technology, stochastic noise, both?
Chance constrained stochastic DEA
• Huang & Li (2001) JPA
• Assume inputs and outputs are multivariate normal random variables, with known expected values and covariance matrix
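The role of the normality assumption can be seen in how a single chance constraint is evaluated. The sketch below is a generic illustration, not the Huang & Li model itself: for a normal vector z with known mean and covariance, Pr(a'z ≤ 0) ≥ 1 − α reduces to a deterministic inequality. The function name and the numbers are hypothetical.

```python
from statistics import NormalDist

def chance_constraint_holds(a, mean, cov, alpha):
    """Check Pr(a'z <= 0) >= 1 - alpha for z ~ N(mean, cov).

    For normal z, a'z is normal with mean a'mean and variance a'cov a, so the
    probabilistic constraint is equivalent to
        a'mean + z_(1-alpha) * sqrt(a'cov a) <= 0.
    The covariance matrix is assumed to be known and positive semidefinite.
    """
    k = len(a)
    mu = sum(a[i] * mean[i] for i in range(k))
    var = sum(a[i] * cov[i][j] * a[j] for i in range(k) for j in range(k))
    z = NormalDist().inv_cdf(1.0 - alpha)
    return mu + z * var ** 0.5 <= 0.0

# Hypothetical example: z = (reference output, evaluated unit's output).
# The constraint "evaluated output <= reference output" is a'z <= 0 with:
a = [-1.0, 1.0]
mean = [10.0, 8.0]

independent = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.5], [0.5, 1.0]]

print(chance_constraint_holds(a, mean, independent, alpha=0.05))  # False
print(chance_constraint_holds(a, mean, correlated, alpha=0.05))   # True
```

The answer flips with the assumed covariance alone, which shows how much the result depends on parameters that are treated as known.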
Chance constrained stochastic DEA
• How do we get “knowledge” about the expected values of inputs and outputs?
– Cannot be estimated from cross-sectional data
– Panel data estimation would require that the true inputs and outputs do not change over time
• How do we get “knowledge” about the variances and covariances of the error terms???
• Uncertainty of the parameter estimates not taken into account in the model
Stochastic noise
[Figure sequence (axes: input x, output y): the true frontier.]
Stochastic DEA models to deal with noise
• DEA+
– Gstach (1998) JPA
– Banker & Natarajan (2008) Operations Research
• “Stochastic DEA”
– Banker, Datar & Kemerer (1991) Management Science
• Stochastic FDH/DEA estimators
– Simar & Zelenyuk (2008) DP
• Stochastic Nonparametric Envelopment of Data (StoNED)
– Kuosmanen (2006) DP; Kuosmanen & Kortelainen (2007) DP
Stochastic DEA models to deal with noise
• Estimation of a fully deterministic frontier based on data perturbed by noise
– The shape of frontier can be estimated without parametric assumptions
• Estimation of inefficiency (efficiency scores) is very challenging in a cross-sectional setting
– Observed output contains the noise term
– Only conditional expected value can be estimated
– Even the SFA efficiency estimator is not consistent!
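A concrete instance of such a conditional expected value is the conditional mean of Jondrow et al. (1982), widely used in SFA. The slides do not name a specific estimator, so this choice is an assumption. The sketch assumes a production frontier with composite error ε = v − u, normal noise v, and half-normal inefficiency u, with both standard deviations treated as known.

```python
from statistics import NormalDist

def conditional_mean_inefficiency(eps, sigma_u, sigma_v):
    """E[u | eps] for eps = v - u, v ~ N(0, sigma_v^2), u ~ |N(0, sigma_u^2)|.

    Given eps, u follows a normal distribution truncated at zero with
    location mu_star and scale sigma_star. For very large positive eps the
    normal cdf below can underflow to zero; the demo values avoid that.
    """
    s2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / s2
    sigma_star = sigma_u * sigma_v / s2 ** 0.5
    std = NormalDist()
    t = mu_star / sigma_star
    return mu_star + sigma_star * std.pdf(t) / std.cdf(t)

for eps in (-1.0, 0.0, 1.0):
    print(eps, round(conditional_mean_inefficiency(eps, 1.0, 1.0), 4))
```

The scale sigma_star depends only on the two variance parameters, not on the number of firms, so the uncertainty about an individual firm's u does not shrink as the cross-section grows. The conditional mean is also monotonically decreasing in ε under these assumptions.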
Stochastic DEA models to deal with noise
• In a cross-sectional setting, separating inefficiency from noise requires some strong assumption
– Assuming away noise completely is a strong assumption, too
• Distributional assumptions do not influence the efficiency rankings
– Ondrich & Ruggiero 2001, EJOR
Conclusions
• Stochastic noise should not be confused with sampling error, outliers, or stochastic technology
• Correcting for small sample bias by bootstrapping does not improve robustness to noise; it can even make things worse
• Improving robustness to outliers is different from dealing with stochastic noise, which perturbs all observations