Journal of Statistical Planning and Inference 138 (2008) 2017–2028
www.elsevier.com/locate/jspi
Constrained Bayes and empirical Bayes estimation under random effects normal ANOVA model with balanced loss function☆
Malay Ghosh^a,∗, Myung Joon Kim^b, Dal Ho Kim^c
^a Department of Statistics, University of Florida, 103 Griffin-Floyd Hall, P.O. Box 118545, Gainesville, FL 32611-8545, USA
^b Samsung Fire & Marine Insurance, Seoul, South Korea
^c Kyungpook National University, Taegu, South Korea
Received 27 March 2006; received in revised form 11 April 2007; accepted 16 August 2007
Available online 12 October 2007
Abstract
The paper develops constrained Bayes and empirical Bayes estimators in the random effects ANOVA model under balanced loss functions. In the balanced normal–normal model, estimators of the Bayes risks of the constrained Bayes and constrained empirical Bayes estimators are provided which are correct asymptotically up to $O(m^{-1})$, that is, the remainder term is $o(m^{-1})$, with $m$ denoting the number of strata.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Constrained Bayes; Constrained empirical Bayes; Mean squared error; Asymptotic; Balanced loss
1. Introduction
Simultaneous estimation of multiple parameters has received considerable attention in recent years. Small area estimation is one among its many applications. It is well known that under the usual quadratic loss, the Bayes estimator is the vector of posterior means. However, it is pointed out in Louis (1984) and Ghosh (1992) that the empirical histogram of the posterior means is underdispersed as an estimate of the histogram of the corresponding population parameters. Thus, adjustment of Bayes estimators is needed in order to meet the twin objectives of simultaneous estimation and closeness of the histogram of the estimates to the posterior estimate of the parameter histogram.
Viewed in a decision-theoretic framework (Louis, 1984), one has to keep in mind two loss functions to tackle such problems: the squared error loss ($L_2$) for measuring the divergence between the parameters and their estimates, and a second loss ($L_d$) designed to measure the distance between the empirical distribution of the estimates of the parameters and the parameters themselves. However, squared error loss is primarily designed to reflect precision of estimation. If, in addition, one also takes into account the goodness of fit, then it is more appropriate to replace squared error loss by a weighted average of two losses, one the squared distance between the parameters and their estimates, and the other the squared distance between the estimates and the data. The latter has been referred to as a balanced loss by Zellner (1988, 1992). Denoting this loss by $L_B$, ideally Bayes estimates of the parameters are obtained by minimizing $L_d + \varepsilon L_B$ with
☆ This research was partially supported by NSF Grants SES-9911485 and SES-0317589.
∗ Corresponding author. Tel.: +1 352 392 1941x232; fax: +1 352 392 5175.
E-mail address: [email protected] (M. Ghosh).
0378-3758/$ - see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.jspi.2007.08.004
$\varepsilon \to 0$. This, however, leads to an intractable solution. Accordingly, following Louis (1984) and Ghosh (1992), we find constrained Bayes (CB) and constrained empirical Bayes (CEB) estimates of the parameters of interest by matching the first two empirical moments of the set of estimates with the corresponding posterior moments of the population parameters.
The CB and CEB estimators meet the dual need of ensemble estimation and detection of extremes. For example, in subgroup analysis, the problem is not just to estimate the different components of a parameter vector, but also to identify the parameters which are above, and the others which are below, some specified threshold point. In particular, as shown in Ghosh (1992), if the problem is not just to estimate the average wages and salaries of workers in a certain industry, but also to identify units with very high or very low average wages, then the CB and CEB estimators meet the objective better than the regular Bayes or empirical Bayes (EB) estimators.
The setup considered in this paper is a balanced normal–normal model which can be viewed alternatively as a balanced random effects normal ANOVA model. The CB estimators are derived in Section 2, and a second order asymptotic expansion of the Bayes risk of such estimators is also provided in this section. CEB estimators are considered in Section 3, and once again a second order asymptotic expansion of the Bayes risks of such estimators is given. Section 4 provides approximations of the Bayes risks of both the CB and CEB estimators. Section 5 contains some numerical simulation results demonstrating the accuracy of these approximations. This section also contains a real data example. Finally, some concluding remarks are made in Section 6.
Before concluding this section, we may point out that the present paper extends the work of Ghosh et al. (2004) in two directions. The latter found the second order correct Bayes risk of constrained James–Stein estimators under squared error loss rather than the balanced loss. Also, constrained James–Stein estimators can be viewed as constrained EB estimators under the random effects ANOVA model, but only under the assumption of known error variance. The present article does not make that assumption. Thus, the present paper considers a more general loss, and also two unknown variance components instead of one.
2. Constrained Bayes estimators and their Bayes risks in the balanced ANOVA model
Consider the balanced normal ANOVA model with $Y_{ij} = \theta_i + e_{ij}$ and $\theta_i = \mu + u_i$ ($j = 1, \ldots, k$; $i = 1, \ldots, m$). Here the $u_i$ and the $e_{ij}$ are mutually independent with $u_i \overset{iid}{\sim} N(0, \tau^2)$ and $e_{ij} \overset{iid}{\sim} N(0, \sigma^2)$. Alternatively, in a Bayesian framework, this amounts to saying that $Y_{ij} \mid \theta_i \overset{iid}{\sim} N(\theta_i, \sigma^2)$, $i = 1, \ldots, m$, and $\theta_i \overset{iid}{\sim} N(\mu, \tau^2)$.
Minimal sufficiency considerations allow us to restrict attention to $(X_1, \ldots, X_m, \mathrm{SSW})$, where $X_i = k^{-1}\sum_{j=1}^k Y_{ij} = \bar{Y}_i$ and $\mathrm{SSW} = \sum_{i=1}^m \sum_{j=1}^k (Y_{ij} - \bar{Y}_i)^2$. We may note that marginally $X_1, \ldots, X_m$ and SSW are mutually independent with $X_i \overset{iid}{\sim} N(\mu, \tau^2 + \sigma^2/k)$, i.e., $N(\mu, \sigma^2/(kB))$, where
$$B = \frac{\sigma^2/k}{\sigma^2/k + \tau^2} = \frac{\sigma^2}{\sigma^2 + k\tau^2},$$
and $\mathrm{SSW} \sim \sigma^2 \chi^2_{m(k-1)}$. Following conventional notation, we write $\mathrm{MSW} = \mathrm{SSW}/(m(k-1))$, $\mathrm{SSB} = k\sum_{i=1}^m (X_i - \bar{X})^2$ (with $\bar{X} = m^{-1}\sum_{i=1}^m X_i$) and $\mathrm{MSB} = \mathrm{SSB}/(m-1)$. Also, let $X = (X_1, \ldots, X_m)^T$ and $\theta = (\theta_1, \ldots, \theta_m)^T$.
For an estimator $e(X) \equiv e = (e_1, \ldots, e_m)^T$ of $\theta$, the balanced loss as introduced by Zellner (1988, 1992) is given by
$$L(\theta, e) = m^{-1}[w\|X - e\|^2 + (1 - w)\|e - \theta\|^2], \qquad (2.1)$$
where $\|\cdot\|$ is the Euclidean norm and $w \in [0, 1]$ is a known weight. The choice of $w$ reflects the relative weight which the experimenter wants to assign to goodness of fit and precision of estimation. The extreme cases $w = 0$ and $w = 1$ refer solely to precision of an estimate and goodness of fit, respectively.
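As an illustration, the balanced loss (2.1) is straightforward to compute. The sketch below uses hypothetical numbers (chosen only for illustration) and shows how $w$ interpolates between the goodness-of-fit and precision terms:

```python
import numpy as np

def balanced_loss(theta, e, X, w):
    """Balanced loss (2.1): m^{-1}[ w*||X - e||^2 + (1 - w)*||e - theta||^2 ]."""
    m = len(theta)
    return (w * np.sum((X - e) ** 2) + (1 - w) * np.sum((e - theta) ** 2)) / m

# Hypothetical values for illustration only.
theta = np.array([0.0, 0.0])   # parameters
e     = np.array([1.0, 1.0])   # estimates
X     = np.array([3.0, 3.0])   # data

# w = 0 measures precision of estimation only, w = 1 goodness of fit only.
print(balanced_loss(theta, e, X, 0.0))   # ||e - theta||^2 / m = 1.0
print(balanced_loss(theta, e, X, 1.0))   # ||X - e||^2 / m = 4.0
print(balanced_loss(theta, e, X, 0.5))   # equal-weight average, 2.5
```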
Following Louis (1984) and Ghosh (1992), we seek compromise estimators $t = (t_1, \ldots, t_m)^T$ of $\theta$ which satisfy
(i) $\bar{t} = m^{-1}\sum_{i=1}^m t_i = m^{-1}\sum_{i=1}^m E(\theta_i \mid X) = (1 - B)\bar{X} + B\mu$ and
(ii) $\sum_{i=1}^m (t_i - \bar{t})^2 = \sum_{i=1}^m E[(\theta_i - \bar{\theta})^2 \mid X] = H_1(X) + H_2(X)$,
where $H_1(X) = \sum_{i=1}^m V(\theta_i - \bar{\theta} \mid X) = (m-1)(\sigma^2/k)(1 - B)$ and $H_2(X) = \sum_{i=1}^m [E(\theta_i \mid X) - E(\bar{\theta} \mid X)]^2 = (1 - B)^2 \sum_{i=1}^m (X_i - \bar{X})^2$, and, subject to (i) and (ii), minimize the loss given in (2.1). The resulting Bayes estimators are referred to as "constrained Bayes" (CB) estimators.
From a result of Kim (2004), it follows that the constrained Bayes estimator of $\theta$ is given by
$$\hat{\theta}^{CB} = a(X)(1 - B)(X - \bar{X}1_m) + \{(1 - B)\bar{X} + B\mu\}1_m$$
for all $w \in [0, 1]$, where $1_m$ is an $m$-component column vector with each element equal to 1, and $a^2(X) = 1 + H_1(X)/H_2(X) = 1 + \sigma^2/\{(1 - B)\mathrm{MSB}\}$. This is the same as the estimator obtained by Ghosh (1992) for $w = 0$, and clearly signifies the robustness
of the latter. However, as we will see later in Section 3, the choice of $w$ does matter in the Bayes risk calculations. In practice, the choice of $w$ reflects the trade-off that one is willing to accept between the goodness of fit and the precision of an estimator.
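The defining moment-matching constraints can be verified numerically. The sketch below (with arbitrary illustrative settings for $m$, $k$, $\mu$, $\sigma^2$, $\tau^2$) simulates the model, forms the CB estimate $\hat{\theta}^{CB} = a(X)(1-B)(X - \bar{X}1_m) + \{(1-B)\bar{X} + B\mu\}1_m$ with $a^2(X) = 1 + \sigma^2/\{(1-B)\mathrm{MSB}\}$, and checks constraints (i) and (ii):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical values chosen for illustration only.
m, k, mu, sigma2, tau2 = 20, 4, 5.0, 2.0, 1.0
B = sigma2 / (sigma2 + k * tau2)                    # B = sigma^2/(sigma^2 + k tau^2)

theta = mu + rng.normal(0.0, np.sqrt(tau2), m)      # theta_i = mu + u_i
X = theta + rng.normal(0.0, np.sqrt(sigma2 / k), m) # X_i | theta_i ~ N(theta_i, sigma^2/k)

Xbar = X.mean()
MSB = k * np.sum((X - Xbar) ** 2) / (m - 1)
a = np.sqrt(1.0 + sigma2 / ((1.0 - B) * MSB))       # a^2(X) = 1 + H_1(X)/H_2(X)

theta_cb = a * (1 - B) * (X - Xbar) + (1 - B) * Xbar + B * mu

# Constraint (i): the average CB estimate equals (1-B)*Xbar + B*mu.
assert np.isclose(theta_cb.mean(), (1 - B) * Xbar + B * mu)
# Constraint (ii): the spread of the CB estimates equals H_1(X) + H_2(X).
H1 = (m - 1) * (sigma2 / k) * (1 - B)
H2 = (1 - B) ** 2 * np.sum((X - Xbar) ** 2)
assert np.isclose(np.sum((theta_cb - theta_cb.mean()) ** 2), H1 + H2)
```

Both checks hold exactly (up to rounding) because $a^2(X)(1-B)^2\sum(X_i - \bar{X})^2 = H_2(X) + H_1(X)$ by construction.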
For the balanced loss as given in (2.1), the Bayes risk of $\hat{\theta}^{CB}$ is given by
$$r(\hat{\theta}^{CB}) = m^{-1}\{w E\|X - \hat{\theta}^{CB}\|^2 + (1 - w)E\|\hat{\theta}^{CB} - \theta\|^2\}.$$
The following theorem provides an asymptotic expansion of $r(\hat{\theta}^{CB})$ correct up to $O(m^{-1})$. Its proof is long and technical, and is deferred to Appendix A.
Theorem 1. Under the loss given in (2.1), the Bayes risk of $\hat{\theta}^{CB}$ is given by
$$r(\hat{\theta}^{CB}) = a_1(B) + m^{-1}a_2(B) + o(m^{-1}),$$
where
$$a_1(B) = \frac{\sigma^2}{k}\left[(1 - w)(1 - B) + (2 - B)B^{-1}\{1 - (1 - w)B\} - 2(1 - B)^{1/2}B^{-1}\{1 - (1 - w)B\}\right]$$
and
$$a_2(B) = \frac{\sigma^2}{k}\left[wB - (2 - B)B^{-1}\{1 - (1 - w)B\} + (1 - B)^{1/2}(B/2 + 2/B)\{1 - (1 - w)B\}\right].$$
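The coefficients $a_1$ and $a_2$ of Theorem 1 are easy to evaluate numerically. The sketch below transcribes them and checks the resulting two-term risk approximation against the simulation study of Section 5 (Table 1 reports 0.198 as the asymptotic CB risk for $k = 3$, $B = 1/4$, $w = 0.3$, $\sigma^2 = 1$, $m = 10$):

```python
def a1(B, w, sigma2, k):
    """Leading term a_1(B) of the Bayes risk of the CB estimator (Theorem 1)."""
    c = 1 - (1 - w) * B                 # the recurring factor {1 - (1-w)B}
    return (sigma2 / k) * ((1 - w) * (1 - B)
                           + (2 - B) / B * c
                           - 2 * (1 - B) ** 0.5 / B * c)

def a2(B, w, sigma2, k):
    """O(1/m) coefficient a_2(B) of the Bayes risk of the CB estimator (Theorem 1)."""
    c = 1 - (1 - w) * B
    return (sigma2 / k) * (w * B
                           - (2 - B) / B * c
                           + (1 - B) ** 0.5 * (B / 2 + 2 / B) * c)

# Two-term approximation a_1(B) + a_2(B)/m at k = 3, B = 1/4, w = 0.3, m = 10:
risk_m10 = a1(0.25, 0.3, 1.0, 3) + a2(0.25, 0.3, 1.0, 3) / 10
print(round(risk_m10, 3))   # 0.198, matching the asymptotic CB entry in Table 1
```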
3. Constrained empirical Bayes estimators and their Bayes risks in the balanced ANOVA model
In an EB scenario, both $\mu$ and $B$ are unknown, and need to be estimated from the marginal distribution of the $X_i$'s. We estimate $\mu$ by $\bar{X}$. Also, by the law of large numbers, $\mathrm{MSW} \overset{a.s.}{\to} \sigma^2$ and $\mathrm{MSB} \overset{a.s.}{\to} \sigma^2/B$ as $m \to \infty$. Thus $\mathrm{MSW}/\mathrm{MSB} \overset{a.s.}{\to} B$ as $m \to \infty$. Further, we keep the estimator of $B$ bounded slightly away from 1, so that we do not face any problem estimating $(1 - \hat{B})^{-1}$ as needed in connection with $a(X)$.
The CEB estimator of $\theta$ is given by
$$\hat{\theta}^{CEB} = a_{EB}(X)(1 - \hat{B})(X - \bar{X}1_m) + \bar{X}1_m,$$
after substitution of $\mu$ by $\bar{X}$, $B$ by $\hat{B}$ and $\sigma^2$ by MSW. Here $\hat{B} = \min\{(m-3)/(m-1),\ \mathrm{MSW}/\mathrm{MSB}\}$ and $a^2_{EB}(X) = 1 + \mathrm{MSW}/\{(1 - \hat{B})\mathrm{MSB}\}$. We now find the Bayes risk of $\hat{\theta}^{CEB}$ in the following theorem.
Theorem 2. Under the loss given in (2.1), the Bayes risk of $\hat{\theta}^{CEB}$ in the balanced ANOVA model is given by
$$a_1(B) + m^{-1}a_3(B) + o(m^{-1}),$$
where $a_1(B)$ is defined in Theorem 1, and
$$a_3(B) = \frac{\sigma^2}{k}\left[(1 - w)B - \{2 - B - 2(1 - B)^{1/2}\}B^{-1}\{1 - (1 - w)B\} + \frac{k}{k-1}\,\frac{B}{2}(1 - B)^{-3/2}\{1 - (1 - w)B\}\right].$$
The proof of this theorem is also technical, and is deferred to Appendix A.
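A transcription of $a_3$ in the same style as $a_1$ and $a_2$ (a sketch, with $\sigma^2$ and $k$ as explicit arguments):

```python
def a3(B, w, sigma2, k):
    """O(1/m) coefficient a_3(B) of the Bayes risk of the CEB estimator (Theorem 2)."""
    c = 1 - (1 - w) * B                 # the recurring factor {1 - (1-w)B}
    return (sigma2 / k) * ((1 - w) * B
                           - (2 - B - 2 * (1 - B) ** 0.5) / B * c
                           + (k / (k - 1)) * (B / 2) * (1 - B) ** -1.5 * c)
```

As a cross-check against Section 5: for $k = 3$, $B = 1/2$, $w = 0.3$, $\sigma^2 = 1$ and $m = 10$, the two-term approximation $a_1(B) + a_3(B)/10 \approx 0.185$, which is the asymptotic CEB entry reported in Table 1.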
Remark 1. Since the leading term in the expansion of the Bayes risk of $\hat{\theta}^{CEB}$ agrees with that of $\hat{\theta}^{CB}$, it follows that $\hat{\theta}^{CEB}$ is asymptotically optimal in the sense of Robbins (1956).
4. Estimation of Bayes risks
In this section, we find approximations of the Bayes risks of both the constrained Bayes (CB) and constrained empirical Bayes (CEB) estimators of $\theta$ which are correct up to $O(m^{-1})$. The Bayes risk of the CB estimator of $\theta$ is of the form $a_1(B) + a_2(B)/m + o(m^{-1})$, where $a_1(B)$ and $a_2(B)$ are defined in Theorem 1. Since $\hat{B} \overset{P}{\to} B$ as $m \to \infty$, by the dominated convergence theorem $E[a_2(\hat{B})] = a_2(B) + o(1)$. We need, however, to calculate $E[a_1(\hat{B})]$, which as we will show is of the form $a_1(B) + a_4(B)/m + o(m^{-1})$. ($a_4(B)$ will be defined later in this section.) Hence, the approximation of the Bayes risk of the CB estimator of $\theta$ correct up to $O(m^{-1})$ is given by $a_1(\hat{B}) + m^{-1}[a_2(\hat{B}) - a_4(\hat{B})]$. In a similar vein, since the Bayes risk of the CEB estimator of $\theta$ is $a_1(B) + a_3(B)/m + o(m^{-1})$, where $a_3(B)$ is defined
in Theorem 2, the approximation to the Bayes risk of the CEB estimator of $\theta$ correct up to $O(m^{-1})$ is given by $a_1(\hat{B}) + m^{-1}[a_3(\hat{B}) - a_4(\hat{B})]$.
First note that
$$a_1(\hat{B}) = \frac{\mathrm{MSW}}{k}\left[(1 - w)(1 - \hat{B}) + (2 - \hat{B})\hat{B}^{-1}\{1 - (1 - w)\hat{B}\} - 2(1 - \hat{B})^{1/2}\hat{B}^{-1}\{1 - (1 - w)\hat{B}\}\right]. \qquad (4.1)$$
Thus we need to find $E[a_1(\hat{B})]$. First we calculate
$$E[\mathrm{MSW}(1 - \hat{B})] = E\left[\mathrm{MSW} - \frac{(\mathrm{MSW})^2}{\mathrm{MSB}}\right] + O(m^{-r}) = \sigma^2(1 - B) - \frac{2B\sigma^2}{m}\,\frac{k}{k-1} + O(m^{-2}). \qquad (4.2)$$
Next we calculate
$$E[\mathrm{MSW}(2 - \hat{B})\hat{B}^{-1}\{1 - (1 - w)\hat{B}\}] = 2E[\mathrm{MSW}\,\hat{B}^{-1}] - E[\mathrm{MSW}] - (1 - w)E[\mathrm{MSW}(2 - \hat{B})]$$
$$= \sigma^2(2 - B)B^{-1}\{1 - (1 - w)B\} + \frac{2\sigma^2(1 - w)B}{m}\,\frac{k}{k-1} + O(m^{-2}). \qquad (4.3)$$
Also,
$$E[\{(1 - \hat{B})^{1/2}\hat{B}^{-1}\mathrm{MSW}\}\,I_{[\hat{B} = \tilde{B}]}] = E[\{(1 - \hat{B})^{1/2}\mathrm{MSB}\}\,I_{[\hat{B} = \tilde{B}]}] = E[\{\mathrm{MSB}(\mathrm{MSB} - \mathrm{MSW})\}^{1/2}\,I_{[\hat{B} = \tilde{B}]}] = E[h(\mathrm{MSB}, \mathrm{MSW})\,I_{[\hat{B} = \tilde{B}]}],$$
where $\tilde{B} = \mathrm{MSW}/\mathrm{MSB}$ and $h(x, y) = \{(x - y)x\}^{1/2}$ for $y < x$, as in Appendix A. Hence, after some simplifications,
$$E[h(\mathrm{MSB}, \mathrm{MSW})] = \sigma^2(1 - B)^{1/2}B^{-1} - \frac{kB\sigma^2}{4m(k-1)(1 - B)^{3/2}} + O(m^{-3/2}).$$
Finally,
$$E[(1 - \hat{B})^{1/2}\mathrm{MSW}] = E[(1 - \hat{B})^{1/2}\mathrm{MSW}\,I_{[\hat{B} = \tilde{B}]}] + O(m^{-r}) = E\left[\left\{(\mathrm{MSW})^2 - \frac{(\mathrm{MSW})^3}{\mathrm{MSB}}\right\}^{1/2} I_{[\hat{B} = \tilde{B}]}\right] + O(m^{-r}) = E[u(\mathrm{MSB}, \mathrm{MSW})\,I_{[\hat{B} = \tilde{B}]}] + O(m^{-r}), \qquad (4.4)$$
where $u(x, y) = (y^2 - y^3/x)^{1/2}$ for $y < x$. Note again that
$$\frac{\partial u}{\partial x} = \frac{1}{2u}\,\frac{y^3}{x^2}, \qquad \frac{\partial u}{\partial y} = \frac{1}{2u}\left(2y - \frac{3y^2}{x}\right),$$
$$\frac{\partial^2 u}{\partial x^2} = -\frac{y^5(4x - 3y)}{4u^3 x^4}, \qquad \frac{\partial^2 u}{\partial y^2} = -\frac{y^3(4x - 3y)}{4u^3 x^2}.$$
Hence, after much simplification,
$$E[u(\mathrm{MSB}, \mathrm{MSW})\,I_{[\hat{B} = \tilde{B}]}] = \sigma^2(1 - B)^{1/2} - \frac{\sigma^2 B k(4 - 3B)}{4m(k-1)(1 - B)^{3/2}} + O(m^{-2}). \qquad (4.5)$$
Combining (4.1) with (4.2)–(4.5), we have
$$E[a_1(\hat{B})] = a_1(B) + a_4(B)/m + o(m^{-1}),$$
where
$$a_4(B) = \frac{\sigma^2 B}{2(k-1)(1 - B)^{3/2}}\{1 - (1 - w)(4 - 3B)\}.$$
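The bias-correction coefficient $a_4$ can be transcribed the same way (a sketch; $\sigma^2$ and $k$ as arguments). The estimated Bayes risks reported in Tables 1 and 2 are then $a_1(\hat{B}) + [a_2(\hat{B}) - a_4(\hat{B})]/m$ for CB and $a_1(\hat{B}) + [a_3(\hat{B}) - a_4(\hat{B})]/m$ for CEB, with $a_1$, $a_2$, $a_3$ as in Theorems 1 and 2:

```python
def a4(B, w, sigma2, k):
    """Coefficient a_4(B) arising from E[a_1(Bhat)] = a_1(B) + a_4(B)/m + o(1/m)."""
    return (sigma2 * B * (1 - (1 - w) * (4 - 3 * B))
            / (2 * (k - 1) * (1 - B) ** 1.5))

# For example, at B = 1/4, w = 0.3, sigma^2 = 1, k = 3 this is about -0.123,
# so the correction raises the estimated CB risk relative to a_1 + a_2/m alone.
print(round(a4(0.25, 0.3, 1.0, 3), 3))
```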
5. Numerical studies
In this section, we first report the results of a simulation study to demonstrate the accuracy of the Bayes risk approximations described in the previous sections. For illustration, we consider a simple normal–normal model with $\mu = 0$. We investigate the performance of the simulated Bayes risks of the CB and CEB estimators as well as their asymptotic Bayes risks and the estimated Bayes risks after retaining terms up to $O(m^{-1})$ for several choices of $w$, $m$, $\sigma^2$, and $\tau^2$.
Details of the simulation are described below.
(a) First we generate $\theta_i$ ($i = 1, \ldots, m$) from the $N(0, \tau^2)$ distribution with a fixed $\tau^2$ value.
(b) For given $\theta_i$ ($i = 1, \ldots, m$), we generate the data $y_{ij}$, $i = 1, \ldots, m$, $j = 1, \ldots, k$, from the $N(\theta_i, \sigma^2)$ distribution with a fixed $\sigma^2$ value. We repeat steps (a) and (b) $R = 10{,}000$ times with $k = 3$ and 8. Then we calculate the CB and CEB estimates for each simulated dataset.
(c) Finally we compute the simulated Bayes risk
$$(mR)^{-1}w\sum_{i=1}^m \sum_{r=1}^R (x_{ir} - \hat{\theta}_{ir})^2 + (mR)^{-1}(1 - w)\sum_{i=1}^m \sum_{r=1}^R (\hat{\theta}_{ir} - \theta_{ir})^2$$
under the balanced loss for selected $w$ values and different values of $m$ after $R = 10{,}000$ repetitions of the experiment. In addition, we calculate the asymptotic approximate Bayes risks as well as the corresponding estimated Bayes risks of the CB and CEB estimates for the same $w$ and $m$ values. We also compute the relative risk to examine how close these approximations are to the simulated Bayes risks. The relative risk is calculated as the absolute difference between the simulated Bayes risk and either the asymptotic Bayes risk or the estimated asymptotic Bayes risk, divided by the simulated Bayes risk.
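Steps (a)–(c) can be sketched as follows. This is a scaled-down illustration, not the study itself: $R$ is far smaller than the paper's 10,000 repetitions, and the seed and settings are arbitrary (they reproduce the $k = 3$, $B = 1/4$, $w = 0.3$, $m = 50$ cell of Table 1 approximately):

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative settings: k = 3 and tau^2 = 1 with sigma^2 = 1 give B = 1/4.
m, k, w, R = 50, 3, 0.3, 2000
mu, sigma2, tau2 = 0.0, 1.0, 1.0
B = sigma2 / (sigma2 + k * tau2)

loss_cb = 0.0
loss_ceb = 0.0
for _ in range(R):
    theta = mu + rng.normal(0.0, np.sqrt(tau2), m)                 # step (a)
    Y = theta[:, None] + rng.normal(0.0, np.sqrt(sigma2), (m, k))  # step (b)
    X = Y.mean(axis=1)
    Xbar = X.mean()
    MSW = np.sum((Y - X[:, None]) ** 2) / (m * (k - 1))
    MSB = k * np.sum((X - Xbar) ** 2) / (m - 1)

    # CB estimate (B, mu, sigma^2 known)
    a = np.sqrt(1.0 + sigma2 / ((1.0 - B) * MSB))
    t_cb = a * (1 - B) * (X - Xbar) + (1 - B) * Xbar + B * mu
    # CEB estimate (mu, B, sigma^2 replaced by Xbar, Bhat, MSW)
    Bhat = min((m - 3) / (m - 1), MSW / MSB)
    a_eb = np.sqrt(1.0 + MSW / ((1.0 - Bhat) * MSB))
    t_ceb = a_eb * (1 - Bhat) * (X - Xbar) + Xbar

    # step (c): accumulate the balanced loss
    loss_cb += w * np.sum((X - t_cb) ** 2) + (1 - w) * np.sum((t_cb - theta) ** 2)
    loss_ceb += w * np.sum((X - t_ceb) ** 2) + (1 - w) * np.sum((t_ceb - theta) ** 2)

risk_cb, risk_ceb = loss_cb / (m * R), loss_ceb / (m * R)
print(risk_cb, risk_ceb)   # Table 1 reports 0.195 and 0.197 for this cell
```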
Tables 1 and 2 report the values of the simulated, asymptotic and asymptotically estimated Bayes risks of the CB and CEB estimates for $m = 10, 30, 50, 100, 300$ and $k = 3, 8$ and selected values of $B = \sigma^2/(\sigma^2 + k\tau^2)$. Here we consider only $\sigma^2 = 1$ since, for $\sigma^2 = c$, the approximate Bayes risks and their approximate estimates are $c$ times those for $\sigma^2 = 1$. Also, we consider different values of $w$, which reflect the emphasis on the relative importance of the goodness of fit and precision of estimation. Table 1 considers $w = 0.3$, that is, a loss function weighted more heavily towards precision of estimation, while Table 2 considers $w = 0.7$, which is weighted more towards goodness of fit. Not surprisingly, these tables show that the asymptotic Bayes risks and asymptotically estimated Bayes risks for the CB and CEB estimates are fairly close to the simulated Bayes risks for large $m$, at least for $m = 50$ and beyond in all situations. In addition, we provide the relative risks within parentheses for the CB and CEB estimates.
Next we consider an example where the main objective is to illustrate how the proposed CEB estimator can be used in real data analysis. This example is taken from Box et al. (1978, Table 17.3). The data originate from a chemical process, where each batch of pigment was routinely tested for moisture content by performing a single test on a single sample. It was convenient to take a unit of measured moisture content to be one-tenth of 1%. In terms of these units, the moisture content thus determined varied about a mean of approximately 25 with a standard deviation of about 6.
In order to implement the procedure, each of 15 batches of product was independently sampled twice, yielding 30 samples. Each sample was then thoroughly mixed and split into two parts, yielding 60 subsamples. These subsamples were numbered and randomly introduced into the stream of routine analysis going to the analytical laboratory.
Table 1. Numerical Bayes risks of CB and CEB estimates for selected values of $k$, $B$ and $m$ with $w = 0.3$ and $\sigma^2 = 1$ (relative risks in parentheses).

| $k$ | $B$ | $m$ | Simulated CB | Simulated CEB | Asymptotic CB | Asymptotic CEB | Estimated CB | Estimated CEB |
|---|---|---|---|---|---|---|---|---|
| 3 | 1/4 | 10 | 0.198 | 0.206 | 0.198 (.002) | 0.207 (.001) | 0.200 (.011) | 0.219 (.059) |
| 3 | 1/4 | 30 | 0.196 | 0.199 | 0.196 (.000) | 0.199 (.002) | 0.196 (.001) | 0.199 (.001) |
| 3 | 1/4 | 50 | 0.195 | 0.197 | 0.195 (.001) | 0.197 (.001) | 0.196 (.001) | 0.197 (.003) |
| 3 | 1/4 | 100 | 0.195 | 0.196 | 0.195 (.000) | 0.196 (.000) | 0.195 (.000) | 0.196 (.000) |
| 3 | 1/4 | 300 | 0.195 | 0.195 | 0.195 (.001) | 0.195 (.001) | 0.195 (.001) | 0.195 (.001) |
| 3 | 1/2 | 10 | 0.159 | 0.175 | 0.159 (.002) | 0.185 (.057) | 0.173 (.087) | 0.221 (.265) |
| 3 | 1/2 | 30 | 0.155 | 0.166 | 0.156 (.001) | 0.164 (.011) | 0.152 (.022) | 0.172 (.036) |
| 3 | 1/2 | 50 | 0.155 | 0.162 | 0.155 (.001) | 0.160 (.010) | 0.153 (.012) | 0.162 (.004) |
| 3 | 1/2 | 100 | 0.154 | 0.157 | 0.154 (.000) | 0.157 (.002) | 0.154 (.000) | 0.157 (.001) |
| 3 | 1/2 | 300 | 0.154 | 0.155 | 0.154 (.001) | 0.155 (.000) | 0.154 (.001) | 0.155 (.001) |
| 3 | 3/4 | 10 | 0.117 | 0.143 | 0.116 (.006) | 0.195 (.361) | 0.158 (.351) | 0.226 (.581) |
| 3 | 3/4 | 30 | 0.113 | 0.127 | 0.113 (.000) | 0.139 (.093) | 0.096 (.148) | 0.163 (.282) |
| 3 | 3/4 | 50 | 0.112 | 0.125 | 0.112 (.001) | 0.128 (.024) | 0.088 (.216) | 0.143 (.146) |
| 3 | 3/4 | 100 | 0.111 | 0.120 | 0.112 (.002) | 0.119 (.007) | 0.092 (.171) | 0.126 (.045) |
| 3 | 3/4 | 300 | 0.111 | 0.114 | 0.111 (.001) | 0.114 (.005) | 0.109 (.018) | 0.115 (.002) |
| 8 | 1/4 | 10 | 0.074 | 0.077 | 0.074 (.000) | 0.077 (.007) | 0.074 (.000) | 0.079 (.023) |
| 8 | 1/4 | 30 | 0.073 | 0.074 | 0.073 (.002) | 0.074 (.001) | 0.073 (.002) | 0.074 (.003) |
| 8 | 1/4 | 50 | 0.073 | 0.074 | 0.073 (.001) | 0.074 (.002) | 0.073 (.001) | 0.074 (.002) |
| 8 | 1/4 | 100 | 0.073 | 0.073 | 0.073 (.001) | 0.073 (.001) | 0.073 (.001) | 0.073 (.002) |
| 8 | 1/4 | 300 | 0.073 | 0.073 | 0.073 (.000) | 0.073 (.000) | 0.073 (.000) | 0.073 (.000) |
| 8 | 1/2 | 10 | 0.060 | 0.065 | 0.060 (.000) | 0.067 (.042) | 0.063 (.065) | 0.077 (.191) |
| 8 | 1/2 | 30 | 0.058 | 0.061 | 0.058 (.001) | 0.061 (.008) | 0.058 (.013) | 0.062 (.016) |
| 8 | 1/2 | 50 | 0.058 | 0.060 | 0.058 (.000) | 0.060 (.005) | 0.058 (.004) | 0.060 (.003) |
| 8 | 1/2 | 100 | 0.058 | 0.059 | 0.058 (.000) | 0.059 (.001) | 0.058 (.001) | 0.059 (.000) |
| 8 | 1/2 | 300 | 0.058 | 0.058 | 0.058 (.000) | 0.058 (.000) | 0.058 (.000) | 0.058 (.000) |
| 8 | 3/4 | 10 | 0.044 | 0.052 | 0.044 (.002) | 0.067 (.292) | 0.057 (.301) | 0.077 (.488) |
| 8 | 3/4 | 30 | 0.042 | 0.047 | 0.042 (.002) | 0.050 (.061) | 0.038 (.094) | 0.056 (.198) |
| 8 | 3/4 | 50 | 0.042 | 0.046 | 0.042 (.001) | 0.047 (.012) | 0.036 (.135) | 0.051 (.097) |
| 8 | 3/4 | 100 | 0.042 | 0.045 | 0.042 (.001) | 0.044 (.012) | 0.038 (.096) | 0.046 (.022) |
| 8 | 3/4 | 300 | 0.042 | 0.043 | 0.042 (.001) | 0.043 (.004) | 0.042 (.005) | 0.043 (.000) |
We take averages of the two subsamples in the original data for our illustrative purpose, yielding two sample averages for each batch, which are the $Y_{ij}$'s in our setup. So in our case we have $m = 15$ and $k = 2$. Consequently, $X_i = (Y_{i1} + Y_{i2})/2$. With the notation of Section 2, $\hat{B} = 0.67$, $\mathrm{MSB} = 43.25$ and $\mathrm{MSW} = 28.99$ for our dataset. We calculate the means, CEB estimates and usual EB estimates for each batch, which are given as follows. The usual EB estimators are given by
$$\hat{\theta}^{EB}_i = (1 - \hat{B})X_i + \hat{B}\bar{X}.$$

| batch | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| $X_i$ | 34.75 | 26.25 | 21.50 | 27.25 | 18.25 | 28.75 | 28.00 | 31.50 |
| $\hat{\theta}^{CEB}_i$ | 31.36 | 26.48 | 23.75 | 27.05 | 21.88 | 27.91 | 27.48 | 29.49 |
| $\hat{\theta}^{EB}_i$ | 29.41 | 26.61 | 25.04 | 26.94 | 23.97 | 27.43 | 27.18 | 28.34 |

| batch | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|---|---|---|---|---|---|---|---|
| $X_i$ | 29.00 | 20.00 | 25.00 | 30.25 | 24.50 | 24.25 | 32.50 |
| $\hat{\theta}^{CEB}_i$ | 28.06 | 22.89 | 25.76 | 28.77 | 25.47 | 25.33 | 30.07 |
| $\hat{\theta}^{EB}_i$ | 27.51 | 24.55 | 26.20 | 27.93 | 26.03 | 25.95 | 28.67 |
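The entries above can be reproduced directly from the reported batch means and mean squares; a minimal sketch:

```python
import numpy as np

# Batch means X_i from the Box, Hunter and Hunter (1978) pigment data, as listed above.
X = np.array([34.75, 26.25, 21.50, 27.25, 18.25, 28.75, 28.00, 31.50,
              29.00, 20.00, 25.00, 30.25, 24.50, 24.25, 32.50])
m, k = 15, 2
MSW, MSB = 28.99, 43.25                       # values reported in the text

Bhat = min((m - 3) / (m - 1), MSW / MSB)      # = MSW/MSB = 0.67 here
Xbar = X.mean()
a_eb = np.sqrt(1.0 + MSW / ((1.0 - Bhat) * MSB))

theta_ceb = a_eb * (1.0 - Bhat) * (X - Xbar) + Xbar   # CEB estimates
theta_eb = (1.0 - Bhat) * X + Bhat * Xbar             # usual EB estimates

print(np.round(theta_ceb, 2))   # first entry 31.36, as in the table above
print(np.round(theta_eb, 2))    # first entry 29.41
```

Note how the CEB estimates shrink the extreme batch means less than the EB estimates do, which is exactly the underdispersion correction discussed in the Introduction.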
Table 2. Numerical Bayes risks of CB and CEB estimates for selected values of $k$, $B$ and $m$ with $w = 0.7$ and $\sigma^2 = 1$ (relative risks in parentheses).

| $k$ | $B$ | $m$ | Simulated CB | Simulated CEB | Asymptotic CB | Asymptotic CEB | Estimated CB | Estimated CEB |
|---|---|---|---|---|---|---|---|---|
| 3 | 1/4 | 10 | 0.104 | 0.107 | 0.104 (.002) | 0.106 (.003) | 0.098 (.062) | 0.113 (.060) |
| 3 | 1/4 | 30 | 0.099 | 0.101 | 0.099 (.000) | 0.100 (.004) | 0.099 (.003) | 0.101 (.001) |
| 3 | 1/4 | 50 | 0.098 | 0.099 | 0.099 (.001) | 0.099 (.000) | 0.099 (.001) | 0.099 (.002) |
| 3 | 1/4 | 100 | 0.098 | 0.098 | 0.098 (.000) | 0.098 (.000) | 0.098 (.000) | 0.098 (.000) |
| 3 | 1/4 | 300 | 0.097 | 0.097 | 0.097 (.001) | 0.097 (.000) | 0.097 (.001) | 0.097 (.001) |
| 3 | 1/2 | 10 | 0.110 | 0.105 | 0.110 (.000) | 0.129 (.224) | 0.077 (.301) | 0.134 (.270) |
| 3 | 1/2 | 30 | 0.102 | 0.111 | 0.103 (.001) | 0.109 (.019) | 0.086 (.164) | 0.114 (.030) |
| 3 | 1/2 | 50 | 0.101 | 0.106 | 0.101 (.000) | 0.105 (.017) | 0.095 (.054) | 0.107 (.003) |
| 3 | 1/2 | 100 | 0.100 | 0.102 | 0.100 (.000) | 0.102 (.005) | 0.100 (.002) | 0.102 (.000) |
| 3 | 1/2 | 300 | 0.099 | 0.100 | 0.099 (.001) | 0.100 (.000) | 0.099 (.001) | 0.100 (.001) |
| 3 | 3/4 | 10 | 0.126 | 0.092 | 0.125 (.010) | 0.226 (1.45) | 0.059 (.530) | 0.147 (.590) |
| 3 | 3/4 | 30 | 0.116 | 0.118 | 0.116 (.001) | 0.149 (.263) | 0.019 (.833) | 0.141 (.194) |
| 3 | 3/4 | 50 | 0.114 | 0.124 | 0.114 (.002) | 0.134 (.086) | 0.028 (.759) | 0.135 (.093) |
| 3 | 3/4 | 100 | 0.112 | 0.123 | 0.112 (.003) | 0.123 (.005) | 0.059 (.474) | 0.126 (.026) |
| 3 | 3/4 | 300 | 0.112 | 0.116 | 0.112 (.002) | 0.115 (.006) | 0.106 (.051) | 0.116 (.001) |
| 8 | 1/4 | 10 | 0.039 | 0.040 | 0.039 (.003) | 0.039 (.012) | 0.038 (.035) | 0.041 (.027) |
| 8 | 1/4 | 30 | 0.037 | 0.037 | 0.037 (.002) | 0.037 (.001) | 0.037 (.002) | 0.037 (.003) |
| 8 | 1/4 | 50 | 0.037 | 0.037 | 0.037 (.001) | 0.037 (.001) | 0.037 (.001) | 0.037 (.002) |
| 8 | 1/4 | 100 | 0.037 | 0.037 | 0.037 (.001) | 0.037 (.001) | 0.037 (.001) | 0.037 (.002) |
| 8 | 1/4 | 300 | 0.037 | 0.037 | 0.037 (.000) | 0.037 (.000) | 0.037 (.000) | 0.037 (.000) |
| 8 | 1/2 | 10 | 0.041 | 0.039 | 0.041 (.003) | 0.046 (.161) | 0.033 (.194) | 0.047 (.200) |
| 8 | 1/2 | 30 | 0.038 | 0.040 | 0.038 (.001) | 0.040 (.014) | 0.035 (.079) | 0.041 (.014) |
| 8 | 1/2 | 50 | 0.038 | 0.039 | 0.038 (.000) | 0.039 (.011) | 0.037 (.019) | 0.039 (.002) |
| 8 | 1/2 | 100 | 0.037 | 0.038 | 0.037 (.000) | 0.038 (.001) | 0.037 (.002) | 0.038 (.000) |
| 8 | 1/2 | 300 | 0.037 | 0.037 | 0.037 (.000) | 0.037 (.000) | 0.037 (.000) | 0.037 (.000) |
| 8 | 3/4 | 10 | 0.047 | 0.035 | 0.047 (.002) | 0.074 (1.15) | 0.028 (.402) | 0.052 (.494) |
| 8 | 3/4 | 30 | 0.043 | 0.045 | 0.043 (.002) | 0.053 (.183) | 0.019 (.563) | 0.051 (.139) |
| 8 | 3/4 | 50 | 0.043 | 0.046 | 0.043 (.000) | 0.048 (.050) | 0.022 (.485) | 0.049 (.062) |
| 8 | 3/4 | 100 | 0.042 | 0.046 | 0.042 (.000) | 0.045 (.013) | 0.031 (.264) | 0.046 (.014) |
| 8 | 3/4 | 300 | 0.042 | 0.043 | 0.042 (.001) | 0.043 (.005) | 0.041 (.015) | 0.043 (.000) |
6. Summary and conclusion
The paper derives constrained Bayes and constrained empirical Bayes estimators in the random effects ANOVA model under balanced loss functions. The constrained Bayes and constrained empirical Bayes estimators have their virtues, especially when one is interested in seeking a compromise between different losses. The Bayes risks of the constrained Bayes and constrained empirical Bayes estimators are very close in the simulation study. Extension and application of the proposed methods in the small area estimation context is a worthwhile project for further work.
In practice, the choice of $w$ will depend on what an experimenter wants as a trade-off between goodness of fit and precision. Because of the linearity of the Bayes risk in $w$, minimizing it with respect to $w$ will only yield $w = 0$ or 1. It seems pointless to take this course in order to find an optimizing $w$.
Acknowledgements
The paper has benefited much from the very helpful comments of two reviewers.
Appendix A.
Proof of Theorem 1. The posterior mean of $\theta$ is given by $e^{PM} = (1 - B)X + B\mu 1_m$. Then
$$m^{-1}E\|\hat{\theta}^{CB} - \theta\|^2 = m^{-1}E\|\hat{\theta}^{CB} - e^{PM} + e^{PM} - \theta\|^2 = \frac{\sigma^2}{k}(1 - B) + m^{-1}E\|e^{PM} - \hat{\theta}^{CB}\|^2. \qquad (A.1)$$
But
$$e^{PM} - \hat{\theta}^{CB} = (1 - B)(X - \bar{X}1_m) - a(X)(1 - B)(X - \bar{X}1_m) = (1 - a(X))(1 - B)(X - \bar{X}1_m). \qquad (A.2)$$
So by (A.2), one gets
$$m^{-1}E\|e^{PM} - \hat{\theta}^{CB}\|^2 = (1 - B)^2\,\frac{m - 1}{km}\,E[(1 - a(X))^2\,\mathrm{MSB}]. \qquad (A.3)$$
Next,
$$X - \hat{\theta}^{CB} = (1 - a(X)(1 - B))(X - \bar{X}1_m) + B(\bar{X} - \mu)1_m. \qquad (A.4)$$
Hence, by (A.4), one gets
$$m^{-1}E\|X - \hat{\theta}^{CB}\|^2 = \frac{m - 1}{km}\,E[\{1 - a(X)(1 - B)\}^2\,\mathrm{MSB}] + \frac{B\sigma^2}{km}. \qquad (A.5)$$
Combining the results from (A.1), (A.3) and (A.5),
$$E[L(\theta, \hat{\theta}^{CB})] = \frac{\sigma^2}{k}(1 - w)(1 - B) + \frac{wB\sigma^2}{km} + \frac{m - 1}{km}\,E[w\{1 - a(X)(1 - B)\}^2\,\mathrm{MSB}] + \frac{m - 1}{km}\,E[(1 - w)(1 - B)^2(1 - a(X))^2\,\mathrm{MSB}]. \qquad (A.6)$$
On simplification,
$$w\{1 - (1 - B)a(X)\}^2 + (1 - w)(1 - B)^2(1 - a(X))^2 = (1 - B)^2 a^2(X) - 2(1 - B)\{1 - (1 - w)B\}a(X) + \{w + (1 - w)(1 - B)^2\}. \qquad (A.7)$$
We now calculate
$$E[a^2(X)\,\mathrm{MSB}] = E\left[\left(1 + \frac{\sigma^2}{(1 - B)\mathrm{MSB}}\right)\mathrm{MSB}\right] = \frac{\sigma^2}{B(1 - B)}. \qquad (A.8)$$
Hence, from (A.7) and (A.8),
$$E[(w\{1 - (1 - B)a(X)\}^2 + (1 - w)(1 - B)^2(1 - a(X))^2)\,\mathrm{MSB}] = B^{-1}(1 - B)\sigma^2 + \{w + (1 - w)(1 - B)^2\}B^{-1}\sigma^2 - 2(1 - B)\{1 - (1 - w)B\}E[a(X)\,\mathrm{MSB}]$$
$$= \sigma^2(2 - B)B^{-1}\{1 - (1 - w)B\} - 2(1 - B)\{1 - (1 - w)B\}E[a(X)\,\mathrm{MSB}]. \qquad (A.9)$$
Next we find
$$E[a(X)\,\mathrm{MSB}] = E\left[\left(1 + \frac{\sigma^2}{(1 - B)\mathrm{MSB}}\right)^{1/2}\mathrm{MSB}\right] = (1 - B)^{-1/2}E[\{(1 - B)(\mathrm{MSB})^2 + \sigma^2\,\mathrm{MSB}\}^{1/2}] = (1 - B)^{-1/2}E[g(\mathrm{MSB})] \quad \text{(say)},$$
where $g(x) = \{(1 - B)x^2 + \sigma^2 x\}^{1/2}$. By Taylor expansion, noting that $E(\mathrm{MSB}) = \sigma^2/B$,
$$g(\mathrm{MSB}) = g(\sigma^2/B) + (\mathrm{MSB} - \sigma^2/B)g'(\sigma^2/B) + \tfrac{1}{2}(\mathrm{MSB} - \sigma^2/B)^2 g''(\sigma^2/B) + \tfrac{1}{2}(\mathrm{MSB} - \sigma^2/B)^3 \int_0^1 (1 - \lambda)^2 g'''[\lambda\,\mathrm{MSB} + (1 - \lambda)\sigma^2/B]\,d\lambda. \qquad (A.10)$$
By some standard algebra,
$$g(\sigma^2/B) = [(1 - B)(B^{-1}\sigma^2)^2 + B^{-1}\sigma^4]^{1/2} = [B^{-1}\sigma^4\{(1 - B)B^{-1} + 1\}]^{1/2} = B^{-1}\sigma^2; \qquad (A.11)$$
$$g''(\sigma^2/B) = -\frac{B^3}{4\sigma^2}. \qquad (A.12)$$
Finally, for $x > 0$,
$$|g'''(x)| \le \frac{3\sigma^4(\sigma^2 + 2x)}{8x^{5/2}\sigma^5} = \frac{3(\sigma^2 + 2x)}{8\sigma x^{5/2}}.$$
Thus,
$$|g'''[\lambda\,\mathrm{MSB} + (1 - \lambda)\sigma^2/B]| \le \frac{3}{8\sigma^4}(1 - \lambda)^{-5/2}B^{5/2} + \frac{3}{4\sigma^4}(1 - \lambda)^{-3/2}B^{3/2}.$$
Hence, $\int_0^1 (1 - \lambda)^2 |g'''[\lambda\,\mathrm{MSB} + (1 - \lambda)E(\mathrm{MSB})]|\,d\lambda < \infty$. Also, by the general result that for iid random variables, say $Z_1, \ldots, Z_m$, with mean, say, $\xi$ and finite $r$th ($r \ge 2$) moment, $E|\bar{Z} - \xi|^r = O(m^{-r/2})$ (Brillinger, 1962), one gets $E|\mathrm{MSB} - \sigma^2/B|^3 = O(m^{-3/2})$. In order to use Brillinger's result, one needs to apply the Helmert orthogonal transformation to the $X_i$ to reexpress $\sum_{i=1}^m (X_i - \bar{X})^2 = \sum_{i=1}^{m-1} Z_i^2$, where the $Z_i$ are iid $N(0, \tau^2 + \sigma^2/k)$. Also,
$$E(\mathrm{MSB} - \sigma^2/B)^2 = \frac{2B^{-2}\sigma^4}{m - 1}. \qquad (A.13)$$
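The Helmert step invoked above can be checked numerically. The sketch below (with arbitrary illustrative numbers) forms the $m - 1$ orthonormal Helmert contrasts of $X_1, \ldots, X_m$ and verifies that their squares sum to $\sum_{i=1}^m (X_i - \bar{X})^2$:

```python
import numpy as np

rng = np.random.default_rng(1)
m = 6
X = rng.normal(2.0, 1.5, m)   # any location works; the contrasts remove the mean

# Helmert contrasts: Z_i = (X_1 + ... + X_i - i*X_{i+1}) / sqrt(i(i+1)), i = 1..m-1.
# Each has unit-norm coefficients orthogonal to the vector of ones.
Z = np.array([(X[:i].sum() - i * X[i]) / np.sqrt(i * (i + 1)) for i in range(1, m)])

# The m-1 contrasts carry the entire within-vector spread:
assert np.isclose(np.sum(Z ** 2), np.sum((X - X.mean()) ** 2))
```

When the $X_i$ are iid normal, the $Z_i$ are iid normal with mean 0 and the same variance, which is what allows Brillinger's result to be applied to MSB.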
Combining the results from (A.10) to (A.13),
$$E[g(\mathrm{MSB})] = B^{-1}\sigma^2 - \frac{B^{-2}\sigma^4}{m - 1}\left(\frac{B^3}{4\sigma^2}\right) + O(m^{-3/2}) = B^{-1}\sigma^2 - \frac{B\sigma^2}{4m} + O(m^{-3/2}). \qquad (A.14)$$
Combining (A.6), (A.9) and (A.14), the result follows after some simplification. □
Proof of Theorem 2. First we find
$$m^{-1}E\|\theta - \hat{\theta}^{CEB}\|^2 = \frac{\sigma^2}{k}(1 - B) + m^{-1}E\|e^{PM} - \hat{\theta}^{CEB}\|^2.$$
But
$$e^{PM} - \hat{\theta}^{CEB} = [(1 - B) - a_{EB}(X)(1 - \hat{B})](X - \bar{X}1_m) - B(\bar{X} - \mu)1_m.$$
Hence,
$$m^{-1}E\|e^{PM} - \hat{\theta}^{CEB}\|^2 = \frac{B\sigma^2}{km} + E\left[\{(1 - B) - a_{EB}(X)(1 - \hat{B})\}^2\,\frac{(m - 1)\mathrm{MSB}}{km}\right].$$
Again,
$$m^{-1}E\|X - \hat{\theta}^{CEB}\|^2 = E\left[\{1 - a_{EB}(X)(1 - \hat{B})\}^2\,\frac{(m - 1)\mathrm{MSB}}{km}\right].$$
Hence,
$$E[L(\theta, \hat{\theta}^{CEB})] = \frac{\sigma^2}{k}(1 - w)(1 - B) + \frac{(1 - w)B\sigma^2}{km} + \frac{m - 1}{km}\,E[\{(1 - w)((1 - B) - a_{EB}(X)(1 - \hat{B}))^2 + w(1 - a_{EB}(X)(1 - \hat{B}))^2\}\,\mathrm{MSB}]. \qquad (A.15)$$
On simplification,
$$(1 - w)[(1 - B) - a_{EB}(X)(1 - \hat{B})]^2 + w[1 - a_{EB}(X)(1 - \hat{B})]^2 = w + (1 - w)(1 - B)^2 + a^2_{EB}(X)(1 - \hat{B})^2 - 2(1 - \hat{B})a_{EB}(X)\{1 - (1 - w)B\}.$$
But $a^2_{EB}(X)(1 - \hat{B})^2\,\mathrm{MSB} = (1 - \hat{B})^2\,\mathrm{MSB} + (1 - \hat{B})\mathrm{MSW}$. Let $\tilde{B} = \mathrm{MSW}/\mathrm{MSB}$. Then
$$P(\hat{B} \ne \tilde{B}) = P\left(\frac{\mathrm{MSW}}{\mathrm{MSB}} > \frac{m - 3}{m - 1}\right)$$
$$= P\left[\mathrm{MSB} - \frac{m - 1}{m - 3}\mathrm{MSW} - E\left(\mathrm{MSB} - \frac{m - 1}{m - 3}\mathrm{MSW}\right) < -E\left(\mathrm{MSB} - \frac{m - 1}{m - 3}\mathrm{MSW}\right)\right]$$
$$= P\left[\mathrm{MSB} - \frac{m - 1}{m - 3}\mathrm{MSW} - E\left(\mathrm{MSB} - \frac{m - 1}{m - 3}\mathrm{MSW}\right) < -\frac{\sigma^2(1 - B)}{B} + 2\sigma^2(m - 3)^{-1}\right]$$
$$\le E\left|\mathrm{MSB} - \frac{m - 1}{m - 3}\mathrm{MSW} - E\left(\mathrm{MSB} - \frac{m - 1}{m - 3}\mathrm{MSW}\right)\right|^{2r}\bigg/\left[\frac{\sigma^2(1 - B)}{B} - 2\sigma^2(m - 3)^{-1}\right]^{2r}$$
$$\le 2^{2r-1}\left[E(\mathrm{MSB} - E\,\mathrm{MSB})^{2r} + \left(\frac{m - 1}{m - 3}\right)^{2r}E(\mathrm{MSW} - E\,\mathrm{MSW})^{2r}\right]\bigg/\left[\frac{\sigma^2(1 - B)}{B} - 2\sigma^2(m - 3)^{-1}\right]^{2r}$$
$$= O(m^{-r}) \quad \text{for arbitrarily large } r > 0.$$
Hence,
$$E|\hat{B} - \tilde{B}|^j = E[|\hat{B} - \tilde{B}|^j\,I_{[\hat{B} \ne \tilde{B}]}] = E\left[\left|\frac{m - 3}{m - 1} - \frac{\mathrm{MSW}}{\mathrm{MSB}}\right|^j I_{[\hat{B} \ne \tilde{B}]}\right] \le E^{1/2}\left|\frac{m - 3}{m - 1} - \frac{\mathrm{MSW}}{\mathrm{MSB}}\right|^{2j} P^{1/2}(\hat{B} \ne \tilde{B}) = O(1)O(m^{-r}) = O(m^{-r}), \qquad (A.16)$$
for arbitrarily large $r > 0$ and $j = 1, 2, 3, \ldots$. Accordingly,
$$E[a^2_{EB}(X)(1 - \hat{B})^2\,\mathrm{MSB}] = E[(1 - \hat{B})^2\,\mathrm{MSB} + (1 - \hat{B})\mathrm{MSW}] = E[(1 - \tilde{B})^2\,\mathrm{MSB} + (1 - \tilde{B})\mathrm{MSW}] + O(m^{-r}),$$
for arbitrarily large $r > 0$. Also, on simplification,
$$E[(1 - \tilde{B})^2\,\mathrm{MSB} + (1 - \tilde{B})\mathrm{MSW}] = \sigma^2(1 - B)/B.$$
Hence, similar to (A.9),
$$E[\{(1 - w)(1 - B - (1 - \hat{B})a_{EB}(X))^2 + w(1 - a_{EB}(X)(1 - \hat{B}))^2\}\,\mathrm{MSB}] = \sigma^2(2 - B)B^{-1}\{1 - (1 - w)B\} - 2\{1 - (1 - w)B\}E[a_{EB}(X)(1 - \hat{B})\mathrm{MSB}]. \qquad (A.17)$$
Next we calculate
$$E[a_{EB}(X)(1 - \hat{B})\mathrm{MSB}] = E[(1 - \hat{B})^2(\mathrm{MSB})^2 + (1 - \hat{B})(\mathrm{MSB})(\mathrm{MSW})]^{1/2}$$
$$= E\{[(1 - \tilde{B})^2(\mathrm{MSB})^2 + (1 - \tilde{B})(\mathrm{MSB})(\mathrm{MSW})]^{1/2}\,I_{[\hat{B} = \tilde{B}]}\} + E\left\{\left[\left(1 - \frac{m - 3}{m - 1}\right)^2(\mathrm{MSB})^2 + \left(1 - \frac{m - 3}{m - 1}\right)(\mathrm{MSB})(\mathrm{MSW})\right]^{1/2} I_{[\hat{B} \ne \tilde{B}]}\right\}. \qquad (A.18)$$
Applying the Cauchy–Schwarz inequality, and arguing as in (A.16), it follows that the second term on the right-hand side of (A.18) is $O(m^{-r})$ for arbitrarily large $r > 0$. Now, on simplification,
$$(1 - \tilde{B})^2(\mathrm{MSB})^2 + (1 - \tilde{B})(\mathrm{MSB})(\mathrm{MSW}) = (\mathrm{MSB} - \mathrm{MSW})\mathrm{MSB}.$$
Hence,
$$E\{[(1 - \tilde{B})^2(\mathrm{MSB})^2 + (1 - \tilde{B})(\mathrm{MSB})(\mathrm{MSW})]^{1/2}\,I_{[\hat{B} = \tilde{B}]}\} = E[h(\mathrm{MSB}, \mathrm{MSW})\,I_{[\mathrm{MSW}/\mathrm{MSB} < (m-3)/(m-1)]}],$$
where $h(x, y) = [(x - y)x]^{1/2}$ for $y < x$. Now by Taylor expansion, for $y < x$,
$$h(x, y) = h(\sigma^2/B, \sigma^2) + \left\{\left(x - \frac{\sigma^2}{B}\right)\frac{\partial h}{\partial x} + (y - \sigma^2)\frac{\partial h}{\partial y}\right\}\bigg|_{x = \sigma^2/B,\ y = \sigma^2} + \left\{\frac{1}{2}\left(x - \frac{\sigma^2}{B}\right)^2\frac{\partial^2 h}{\partial x^2} + \frac{1}{2}(y - \sigma^2)^2\frac{\partial^2 h}{\partial y^2} + \left(x - \frac{\sigma^2}{B}\right)(y - \sigma^2)\frac{\partial^2 h}{\partial x\,\partial y}\right\}\bigg|_{x = \sigma^2/B,\ y = \sigma^2} + R(x, y) \quad \text{(say)}.$$
Note now that
$$\frac{\partial h}{\partial x} = \frac{1}{h}\left(x - \frac{1}{2}y\right), \qquad \frac{\partial h}{\partial y} = -\frac{x}{2h},$$
$$\frac{\partial^2 h}{\partial x^2} = -\frac{y^2}{4h^3}, \qquad \frac{\partial^2 h}{\partial y^2} = -\frac{x^2}{4h^3}, \qquad \frac{\partial^2 h}{\partial x\,\partial y} = \frac{xy}{4h^3}.$$
As before, one can show that
$$E[R(\mathrm{MSB}, \mathrm{MSW})\,I_{[\mathrm{MSW}/\mathrm{MSB} < (m-3)/(m-1)]}] = O(m^{-r}),$$
for arbitrarily large $r > 0$. Also,
$$E[h(\mathrm{MSB}, \mathrm{MSW})\,I_{[\mathrm{MSW}/\mathrm{MSB} < (m-3)/(m-1)]}]$$
$$= \sigma^2(1 - B)^{1/2}B^{-1} + \frac{1}{2}\,\frac{2\sigma^4/B^2}{m - 1}\left(-\frac{\sigma^4}{4\{(\sigma^2/B - \sigma^2)\sigma^2/B\}^{3/2}}\right) + \frac{1}{2}\,\frac{2\sigma^4}{m(k - 1)}\left(-\frac{(\sigma^2/B)^2}{4\{(\sigma^2/B - \sigma^2)\sigma^2/B\}^{3/2}}\right) + O(m^{-3/2})$$
$$= \sigma^2(1 - B)^{1/2}B^{-1} - \frac{B\sigma^2}{4(m - 1)(1 - B)^{3/2}} - \frac{B\sigma^2}{4m(k - 1)(1 - B)^{3/2}} + O(m^{-3/2})$$
$$= \sigma^2(1 - B)^{1/2}B^{-1} - \frac{k\sigma^2 B(1 - B)^{-3/2}}{4m(k - 1)} + O(m^{-3/2}). \qquad (A.19)$$
Combining (A.15) with (A.17)–(A.19), the proof is completed. □
References

Box, G.E.P., Hunter, W.G., Hunter, J.S., 1978. Statistics for Experimenters. Wiley, New York.
Ghosh, M., 1992. Constrained Bayes estimation with applications. J. Amer. Statist. Assoc. 87, 533–540.
Ghosh, M., Kim, D., Kim, M.J., 2004. Asymptotic mean squared error of constrained James–Stein estimators. J. Statist. Plann. Inference 126, 107–118.
Kim, M.J., 2004. Constrained Bayes and Empirical Bayes Estimation under Balanced Loss Functions. Unpublished Ph.D. Dissertation, Department of Statistics, University of Florida.
Louis, T.A., 1984. Estimating a population of parameter values using Bayes and empirical Bayes methods. J. Amer. Statist. Assoc. 79, 393–398.
Robbins, H., 1956. An empirical Bayes approach to statistics. In: Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 157–164.
Zellner, A., 1988. Bayesian analysis in econometrics. J. Econometrics 37, 27–50.
Zellner, A., 1992. Bayesian and non-Bayesian estimation using balanced loss functions. In: Gupta, S.S., Berger, J.O. (Eds.), Statistical Decision Theory and Related Topics V. Springer, New York, pp. 377–390.