
IIE Transactions (2013) 45, 736–750
Copyright © "IIE"
ISSN: 0740-817X print / 1545-8830 online
DOI: 10.1080/0740817X.2012.705454

Efficient computing budget allocation for finding simplest good designs

QING-SHAN JIA1,∗, ENLU ZHOU2 and CHUN-HUNG CHEN3,4

1. Center for Intelligent and Networked Systems, Department of Automation, TNLIST, Tsinghua University, Beijing 100084, China. E-mail: [email protected]
2. Department of Industrial & Enterprise Systems Engineering, University of Illinois at Urbana–Champaign, IL 61801, USA
3. Department of Electrical Engineering and Institute of Industrial Engineering, National Taiwan University, Taipei, Taiwan
4. Department of Systems Engineering & Operations Research, George Mason University, Fairfax, VA 22030, USA

Received June 2011 and accepted May 2012

In many applications some designs are easier to implement, require less training data and shorter training time, and consume less storage than others. Such designs are called simple designs and are usually preferred over complex ones when they all have good performance. Despite the abundant existing studies on how to find good designs in simulation-based optimization, there exist few studies on finding simplest good designs. This article considers this important problem and makes the following contributions to the subject. First, lower bounds are provided for the probabilities of correctly selecting the m simplest designs with top performance and selecting the best m such simplest good designs, respectively. Second, two efficient computing budget allocation methods are developed to find m simplest good designs and to find the best m such designs, respectively, and their asymptotic optimality is shown. Third, the performance of the two methods is compared with equal allocation over six academic examples and a smoke detection problem in a wireless sensor network.

Keywords: Simulation-based optimization, complexity preference, optimal computing budget allocation, wireless sensor network

1. Introduction

Many systems nowadays follow not only physical laws but also man-made rules. These systems are called discrete-event dynamic systems, and the optimization of their performance enters the realm of Simulation-Based Optimization (SBO). In many applications some designs are easier to implement, require less training data and shorter training time, and consume less storage than others. Such designs are called simple designs and are usually preferred over complex ones when they all have good performance. For example, a wireless sensor network can be used to detect smoke. While a larger sensing radius allows the smoke to be detected faster, it also increases the power consumption and shortens the lifetime of the network. Thus, when the detection is fast enough, a small sensing radius is preferred over a large one.

∗Corresponding author

There are abundant existing studies on SBO. Ranking and Selection (R&S) procedures are well-known procedures that combine simulation and optimization to improve efficiency. Their origin can be traced back to two papers, namely, Bechhofer (1954) for the indifference-zone formulation and Gupta (1956, 1965) for the subset selection formulation. Excellent surveys on R&S can be found in Bechhofer et al. (1995), Kim and Nelson (2003), and Swisher et al. (2003). Chen et al. (2000) and Chen and Yücesan (2005) developed the Optimal Computing Budget Allocation (OCBA) procedure to asymptotically maximize a lower bound of the probability of correctly selecting the best solution candidate. OCBA was later extended to select the best several solution candidates (Chen et al., 2008) and to handle stochastic constraints (Pujowidianto et al., 2009), multiple objective functions (Lee et al., 2004), correlated observation noises (Fu et al., 2007), and opportunity cost (He et al., 2007). A comprehensive introduction to OCBA was recently provided by Chen and Lee (2011). Recent surveys on other methods for SBO can be found in Andradóttir (1998), Fu (2002), Swisher et al. (2004), Tekin and Sabuncuoglu (2004), and Fu et al. (2008).

Despite the abundance of existing studies on finding good designs in SBO, there exist few studies on finding simplest good designs. This problem is challenging due to the following difficulties. First, simulation-based performance evaluation is usually time-consuming and provides only noisy performance estimation. The second difficulty is randomness. Due to the pervasive randomness in the system dynamics, usually multiple runs are required in order to


obtain accurate estimation. The last difficulty is the complexity preference for simple designs, which requires one to consider both the performance and the complexity simultaneously. Existing studies on SBO usually handle the first two difficulties well, but very few consider the complexity preference; addressing it is the main contribution of this article.

The preference for simple designs has been mathematically described by the Kolmogorov complexity (Kolmogorov, 1965), Solomonoff's universal prior (Solomonoff, 1964, 1978), the Levin complexity (Levin, 1973), and the AIXI model (Hutter, 2005), to name just a few. Most of these formulations assume that the performance of designs can be easily evaluated and thus do not address the unique difficulties of simulation-based performance evaluation. The study of SBO with complexity preference has only recently been considered. Jia and Zhao (2009) studied the relationship between performance and complexity of different policies in Markov Decision Processes (MDPs). Jia (2011b) showed that it is possible to obtain simple policies with good performance in MDPs. Jia (2009, 2011a) considered SBO with descriptive complexity preference and developed an adaptive sampling algorithm to find the simplest design with bounded cardinal performance. Later on, Yan et al. (2010, 2012) developed two algorithms (OCBAmSG and OCBAbSG) to find m simplest designs with bounded cardinal performance and to find the best m such designs, respectively. The above methods suit applications where there are clear bounds on the cardinal performance of designs. However, in many applications it is difficult to estimate the performance of the best design a priori, which makes it difficult to identify "good" designs in a cardinal sense. Ho et al. (2007) showed that the probability of correctly identifying the relative order among two designs converges to unity exponentially fast with respect to (w.r.t.) the number of observations that are taken for each design. Note that the standard deviation of cardinal performance estimation using Monte Carlo simulation converges only at the rate of 1/√n, where n is the number of observations. Thus, in comparison, the ordinal values converge much faster than the cardinal ones. Since in many applications we want to find simple designs with the best performance, we focus this article on finding simplest good designs in the ordinal sense.

In this article, we consider the important problem of how to allocate the computing budget so that the simplest good designs can be found with high probability, and we make the following major contributions. First, we mathematically formulate two related problems. One is how to find m simplest designs that have top-g performance. When g > m there could be multiple choices for m such designs. For example, suppose θ1, θ2, and θ3 are the best, the second best, and the third best designs, respectively. Also, suppose their complexities are the same. When g = 3 and m = 2, there are three choices for m simplest designs with top-g performance, namely, {θ1, θ2}, {θ1, θ3}, and {θ2, θ3}. Clearly, the choice of {θ1, θ2} has the best performance. Thus, another related problem is how to find m simplest top-g designs that have the best performance among all of the choices. We develop lower bounds for the probabilities of correctly selecting the two subsets of m simplest good designs, which are denoted as PCSm and PCSb, respectively. Second, we develop efficient computing budget allocation methods to asymptotically optimize the two PCS functions, respectively. The two methods are called Optimal Computing Budget Allocation for m Simplest Good Designs in the ordinal sense (OCBAmSGO) and Optimal Computing Budget Allocation for the best m Simplest Good Designs in the ordinal sense (OCBAbSGO), respectively. Then we numerically compare their performance with equal allocation on academic examples and a smoke detection problem in Wireless Sensor Networks (WSNs).

The rest of this article is organized as follows. We mathematically formulate the two problems in Section 2, present the main results in Section 3, show the experimental results in Section 4, and draw conclusions in Section 5.

2. Problem formulation

In this section we define the m Simplest Good designs (or mSG for short) and the Best m Simplest Good designs (or bSG for short) using the true performance of the designs in Section 2.1 and define the probabilities of correct selection based on a Bayesian model in Section 2.2.

2.1. Definitions of mSG and bSG

Consider a search space of k competing designs Θ = {θ1, . . . , θk}. Let J(θ) be the performance of θ, which can be accurately evaluated only through an infinite number of replications; that is,

$$J(\theta) \overset{\text{a.s.}}{=} \lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} J(\theta,\xi_i), \qquad (1)$$

where

$$J(\theta,\xi_i) = J(\theta) + w(\theta,\xi_i) \qquad (2)$$

is the observation; w(θ, ξi) is the observation noise, which has independent Gaussian distribution N(0, σ²(θ)); and ξi represents the randomness in the ith sample path. Suppose we are considering minimization problems. Sort all of the designs from the smallest to the largest according to J(θ). Let G be the set of the top-g (g < k) designs; i.e., G = {θ_{i_1}, . . . , θ_{i_g}}, where θ_{i_j} is the top jth design. Let C(θ) be the complexity of design θ. Without loss of generality, we assume M different complexities; i.e., {C(θ), θ ∈ Θ} = {1, . . . , M}. Let Θi denote the designs with complexity i; i.e., Θi = {θ ∈ Θ : C(θ) = i}. Then each Θi can be divided into two subsets, namely, Gi = Θi ∩ G, which contains all the good designs with complexity i, and Di = Θi \ Gi, which contains all of the rest of the designs with complexity i, where \ denotes set minus. We then have ∪_{i=1}^{M} Gi = G and D = Θ \ G. It is possible that Gi = ∅ or Di = ∅ for some i. Throughout this article, we make the following assumptions:

1. The complexities of all the designs are known; i.e., C(θ) is known for all θ ∈ Θ.

2. When selecting among good designs, complexity has priority over performance.

A set of designs S is called the m (m < g) simplest good designs (or mSG for short) if all of the following conditions are satisfied:

1. |S| = m;
2. S ⊂ G;
3. max_{θ∈S} C(θ) ≤ min_{θ∈G\S} C(θ).

Furthermore, a set of designs S is called the best mSG (or bSG for short) if all of the following conditions are satisfied:

1. S is an mSG;
2. if there exist θ ∈ S and θ′ ∈ G \ S s.t. C(θ) = C(θ′), then J(θ) ≤ J(θ′).

Note that the first condition above clearly shows that a bSG must be an mSG. The second condition shows that a bSG has the best performance among all mSGs.
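For concreteness, the three mSG conditions can be checked mechanically once the true performance and complexity values are known. The sketch below is an illustration only (the helper `is_mSG` and the toy values are ours, not the article's); it encodes conditions 1 to 3 on a small example in the spirit of the one in the Introduction:

```python
def is_mSG(S, J, C, m, g):
    """Check the three mSG conditions for a candidate set S.

    J, C: dicts mapping design -> true performance / complexity.
    The top-g set G is formed by sorting on J (minimization).
    """
    G = set(sorted(J, key=J.get)[:g])
    if len(S) != m:                       # condition 1: |S| = m
        return False
    if not S <= G:                        # condition 2: S is a subset of G
        return False
    rest = G - S                          # condition 3: S is no more complex
    return max(C[t] for t in S) <= min(C[t] for t in rest)

# Three equally complex good designs, g = 3, m = 2: any 2-subset of
# {1, 2, 3} is an mSG, and {1, 2} is the best one (the bSG).
J = {1: 1.0, 2: 2.0, 3: 3.0, 4: 9.0}
C = {1: 1, 2: 1, 3: 1, 4: 1}
print(is_mSG({1, 2}, J, C, m=2, g=3))  # True
print(is_mSG({1, 4}, J, C, m=2, g=3))  # False: design 4 is not in G
```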

2.2. Definitions of probabilities of correct selection

Note that in the above definitions of G, D, mSG, and bSG the true performance J(θ) is used, which can be obtained only when an infinite number of replications is used. In practice only a finite number of replications is affordable; that is,

$$\bar J(\theta) = \frac{1}{n(\theta)}\sum_{i=1}^{n(\theta)} J(\theta,\xi_i). \qquad (3)$$

We follow the Bayesian model as in Chen (1996), Chen et al. (2000), and Chen et al. (2008). The mean of the simulation output for each design, J(θ), is assumed unknown and treated as a random variable. After the simulation is performed, a posterior distribution for the unknown mean J(θ), p(J(θ) | J(θ, ξi), i = 1, . . . , n(θ)), is constructed based on two pieces of information: (i) prior knowledge of the system's performance and (ii) current simulation output. As in Chen (1996), we assume that the unknown mean J(θ) has a conjugate normal prior distribution and consider noninformative prior distributions, which implies that no prior knowledge is available about the performance of any design before conducting the simulations. In this case the posterior distribution of J(θ) is (cf. DeGroot (1970)):

$$J(\theta) \sim N\!\left(\bar J(\theta),\ \sigma^2(\theta)/n(\theta)\right). \qquad (4)$$
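Under the noninformative prior, the posterior in Equation (4) is determined by the sample mean and σ²(θ)/n(θ) alone. A minimal sketch of this computation (our illustration; σ(θ) is assumed known here, while in practice one would plug in a sample estimate):

```python
def posterior(observations, sigma):
    """Noninformative-prior posterior of the unknown mean J(theta),
    per Eq. (4): N(sample mean, sigma^2 / n)."""
    n = len(observations)
    mean = sum(observations) / n
    return mean, sigma ** 2 / n

# Five noisy replications of one design, known sigma = 2.0:
obs = [1.2, 0.8, 1.1, 0.9, 1.0]
mean, var = posterior(obs, sigma=2.0)
print(mean, var)  # 1.0 0.8
```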

Given the posterior distributions of the J(θ) values, G and D are both random sets. Thus, we define the probability of correctly selecting an mSG as

$$\text{PCS}_m \equiv \Pr\{S \text{ is an mSG}\} = \Pr\Big\{|S|=m,\ S\subset G,\ \max_{\theta\in S}C(\theta)\le\min_{\theta'\in G\setminus S}C(\theta')\Big\}, \qquad (5)$$

and define the probability of correctly selecting a bSG as

$$\text{PCS}_b \equiv \Pr\{S \text{ is a bSG}\} = \Pr\Big\{|S|=m,\ S\subset G,\ \max_{\theta\in S}C(\theta)\le\min_{\theta'\in G\setminus S}C(\theta'),\ \max_{\theta\in\Theta_{\max_{\theta'\in S}C(\theta')}\cap S} J(\theta) \le \min_{\theta''\in\Theta_{\max_{\theta'\in S}C(\theta')}\setminus S} J(\theta'')\Big\}. \qquad (6)$$

Now we can mathematically formulate the following two problems:

(P1) max_{n(θ1),...,n(θk)} PCS_m s.t. ∑_{i=1}^{k} n(θi) = T;

(P2) max_{n(θ1),...,n(θk)} PCS_b s.t. ∑_{i=1}^{k} n(θi) = T;

where n(θi) is the number of replications allocated to design θi. We will provide methods to address the above two problems in the next section.

3. Main results

In this section we address problems P1 and P2 in Sections 3.1 and 3.2, respectively.

3.1. Selecting an mSG

Given the observed performance of all of the designs, we can divide the entire design space Θ into at most 2M subsets, namely, Ĝ1, . . . , ĜM and D̂1, . . . , D̂M, where Ĝ and D̂ represent the observed top-g designs and the rest of the designs, respectively, Ĝi = Θi ∩ Ĝ and D̂i = Θi \ Ĝi. Though there are multiple choices of an mSG, we are specifically interested in the following procedure to select an mSG. Initially, set S = ∅. We start from Ĝ1 and add designs from Ĝ1 to S from the smallest to the largest according to their observed performance J̄(θ). When all of the designs in Ĝi have been added to S and |S| < m, we move on to Ĝ_{i+1} and continue the above procedure until |S| = m. Suppose t satisfies ∑_{i=1}^{t} |Ĝi| < m ≤ ∑_{i=1}^{t+1} |Ĝi|. Let θ_{i,j} be the observed jth best design in Θi. Then we have

$$S = \left(\bigcup_{i=1}^{t} \hat G_i\right) \cup \left\{\theta_{t+1,1},\ \ldots,\ \theta_{t+1,\,m-\sum_{i=1}^{t}|\hat G_i|}\right\}. \qquad (7)$$

This specific choice of S contains the best simplest m designs in Ĝ. It is not only an estimate of an mSG under the given J̄(θ) values but also an estimate of a bSG. This latter point will be explored later in Section 3.2.
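The greedy sweep behind Equation (7) can be sketched as follows (the function `select_S` and the toy data are ours): designs are added level by level, best observed performance first, until |S| = m.

```python
def select_S(J_bar, C, m, g):
    """Select S as in Eq. (7): walk complexity levels 1, 2, ... and,
    within each level, take observed-good designs in order of observed
    performance until |S| = m.  J_bar: observed means; C: complexities."""
    G_hat = set(sorted(J_bar, key=J_bar.get)[:g])   # observed top-g designs
    S = []
    for level in sorted(set(C.values())):           # complexity 1, ..., M
        level_good = sorted((t for t in G_hat if C[t] == level),
                            key=J_bar.get)          # observed G_hat_i
        for t in level_good:
            S.append(t)
            if len(S) == m:
                return set(S)
    return set(S)  # reached only if m > |G_hat|

J_bar = {'a': 3.0, 'b': 1.0, 'c': 2.0, 'd': 4.0, 'e': 0.5}
C     = {'a': 1,   'b': 2,   'c': 2,   'd': 1,   'e': 3}
print(sorted(select_S(J_bar, C, m=2, g=4)))  # ['a', 'b']
```

Here the observed top-4 set is {e, b, c, a}; design a is taken first because it has the lowest complexity, then b, the better of the two complexity-2 good designs.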


Note that complexity has been considered in the choice of S in Equation (7). We have |S| = m, S ⊂ Ĝ, and max_{θ∈S} C(θ) ≤ min_{θ′∈Ĝ\S} C(θ′). Then following Equation (5) we have

$$
\begin{aligned}
\text{PCS}_m &= \Pr\{S \text{ is an mSG}\} = \Pr\Big\{|S|=m,\ S\subset G,\ \max_{\theta\in S}C(\theta)\le\min_{\theta'\in G\setminus S}C(\theta')\Big\}\\
&= \Pr\Big\{S\subset G,\ \max_{\theta\in S}C(\theta)\le\min_{\theta'\in G\setminus S}C(\theta'),\ G=\hat G\Big\} + \Pr\Big\{S\subset G,\ \max_{\theta\in S}C(\theta)\le\min_{\theta'\in G\setminus S}C(\theta'),\ G\ne\hat G\Big\}\\
&\ge \Pr\Big\{S\subset G,\ \max_{\theta\in S}C(\theta)\le\min_{\theta'\in G\setminus S}C(\theta')\ \Big|\ G=\hat G\Big\}\Pr\{G=\hat G\} = \Pr\{G=\hat G\}\\
&= \Pr\big\{J(\theta) < J(\theta') \text{ for all } \theta\in\hat G \text{ and } \theta'\in\hat D\big\}\\
&\ge \Pr\big\{J(\theta)\le\mu_1 \text{ and } J(\theta')>\mu_1 \text{ for all } \theta\in\hat G \text{ and } \theta'\in\hat D\big\}, \qquad (8)
\end{aligned}
$$

where μ1 is a given constant. Due to the independence between the J(θ) values, we have

$$\Pr\big\{J(\theta)\le\mu_1 \text{ and } J(\theta')>\mu_1 \text{ for all } \theta\in\hat G \text{ and } \theta'\in\hat D\big\} = \left(\prod_{\theta\in\hat G}\Pr\{J(\theta)\le\mu_1\}\right)\left(\prod_{\theta'\in\hat D}\Pr\{J(\theta')>\mu_1\}\right). \qquad (9)$$

Following Equations (8) and (9), we have the following Approximate PCS_m (APCS_m):

$$\text{APCS}_m := \left(\prod_{\theta\in\hat G}\Pr\{J(\theta)\le\mu_1\}\right)\left(\prod_{\theta\in\hat D}\Pr\{J(\theta)>\mu_1\}\right). \qquad (10)$$

Then an approximate version of P1 is

(AP1) max_{n(θ1),...,n(θk)} APCS_m s.t. ∑_{i=1}^{k} n(θi) = T.

Note that the idea of APCS_m is that PCS_m is lower bounded by the probability that all observed good enough designs are truly good enough, which can be asymptotically maximized if we follow the allocation procedure in Chen et al. (2008). This explains why the following Theorem 1 leads to a similar allocation procedure as in Chen et al. (2008). We briefly provide the analysis as follows.
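Because APCS_m in Equation (10) is a product of Gaussian probabilities under the posterior (4), it is cheap to evaluate for a given μ1. A small sketch (function and variable names are ours; μ1 is supplied directly here, while Equation (22) below gives the choice used by the method):

```python
import math

def normal_cdf(x):
    """Standard normal CDF via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def apcs_m(J_bar, sigma, n, G_hat, mu1):
    """Eq. (10): product over observed-good designs of Pr{J(theta) <= mu1}
    times product over the rest of Pr{J(theta) > mu1}, with posterior
    J(theta) ~ N(J_bar(theta), sigma(theta)^2 / n(theta))."""
    p = 1.0
    for t in J_bar:
        z = (mu1 - J_bar[t]) / (sigma[t] / math.sqrt(n[t]))
        p *= normal_cdf(z) if t in G_hat else 1.0 - normal_cdf(z)
    return p

# Two designs, design 1 observed good; both factors equal Phi(2) here.
J_bar = {1: 0.0, 2: 2.0}
sigma = {1: 1.0, 2: 1.0}
n     = {1: 4, 2: 4}
val = apcs_m(J_bar, sigma, n, G_hat={1}, mu1=1.0)
print(val)  # ~0.955
```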

Let L_m be the Lagrangian relaxation of AP1:

$$L_m \equiv \text{APCS}_m + \lambda_m\left(\sum_{i=1}^{k} n(\theta_i) - T\right) = \left(\prod_{\theta\in\hat G}\Pr\{J(\theta)\le\mu_1\}\right)\left(\prod_{\theta\in\hat D}\Pr\{J(\theta)>\mu_1\}\right) + \lambda_m\left(\sum_{i=1}^{k} n(\theta_i) - T\right). \qquad (11)$$

The Karush–Kuhn–Tucker (KKT) conditions (cf. Walker (1999)) of AP1 are as follows. For θ ∈ Ĝ:

$$\frac{\partial L_m}{\partial n(\theta)} = \left(\prod_{\theta'\in\hat G,\,\theta'\ne\theta}\Pr\{J(\theta')\le\mu_1\}\right)\left(\prod_{\theta'\in\hat D}\Pr\{J(\theta')>\mu_1\}\right)\phi\!\left(\frac{\mu_1-\bar J(\theta)}{\sigma(\theta)/\sqrt{n(\theta)}}\right)\frac{\mu_1-\bar J(\theta)}{2\sigma(\theta)\sqrt{n(\theta)}} + \lambda_m = 0; \qquad (12)$$

for θ ∈ D̂:

$$\frac{\partial L_m}{\partial n(\theta)} = \left(\prod_{\theta'\in\hat G}\Pr\{J(\theta')\le\mu_1\}\right)\left(\prod_{\theta'\in\hat D,\,\theta'\ne\theta}\Pr\{J(\theta')>\mu_1\}\right)\phi\!\left(\frac{\bar J(\theta)-\mu_1}{\sigma(\theta)/\sqrt{n(\theta)}}\right)\frac{\bar J(\theta)-\mu_1}{2\sigma(\theta)\sqrt{n(\theta)}} + \lambda_m = 0; \qquad (13)$$

and

$$\frac{\partial L_m}{\partial \lambda_m} = \sum_{i=1}^{k} n(\theta_i) - T = 0, \qquad (14)$$

where φ(·) is the probability density function of the standard Gaussian distribution. We now analyze the relationship between n(θ) and n(θ′). First, we consider the case that θ, θ′ ∈ Ĝ and θ ≠ θ′. Following Equation (12) we have

$$
\left(\prod_{\theta''\in\hat G,\,\theta''\ne\theta}\Pr\{J(\theta'')\le\mu_1\}\right)\left(\prod_{\theta''\in\hat D}\Pr\{J(\theta'')>\mu_1\}\right)\phi\!\left(\frac{-\delta_1(\theta)}{\sigma(\theta)/\sqrt{n(\theta)}}\right)\frac{-\delta_1(\theta)}{2\sigma(\theta)\sqrt{n(\theta)}}
= \left(\prod_{\theta''\in\hat G,\,\theta''\ne\theta'}\Pr\{J(\theta'')\le\mu_1\}\right)\left(\prod_{\theta''\in\hat D}\Pr\{J(\theta'')>\mu_1\}\right)\phi\!\left(\frac{-\delta_1(\theta')}{\sigma(\theta')/\sqrt{n(\theta')}}\right)\frac{-\delta_1(\theta')}{2\sigma(\theta')\sqrt{n(\theta')}}, \qquad (15)
$$

where δ1(θ) ≡ J̄(θ) − μ1 for all θ ∈ Θ. Simplifying Equation (15), we have

$$\Pr\{J(\theta')\le\mu_1\}\,\phi\!\left(\frac{-\delta_1(\theta)}{\sigma(\theta)/\sqrt{n(\theta)}}\right)\frac{\delta_1(\theta)}{\sigma(\theta)\sqrt{n(\theta)}} = \Pr\{J(\theta)\le\mu_1\}\,\phi\!\left(\frac{-\delta_1(\theta')}{\sigma(\theta')/\sqrt{n(\theta')}}\right)\frac{\delta_1(\theta')}{\sigma(\theta')\sqrt{n(\theta')}}. \qquad (16)$$


Taking the natural log of both sides, we have

$$\log\Pr\{J(\theta')\le\mu_1\} + \log\tfrac{1}{\sqrt{2\pi}} - \tfrac{\delta_1^2(\theta)}{2\sigma^2(\theta)}n(\theta) + \log\tfrac{\delta_1(\theta)}{\sigma(\theta)} - \tfrac{1}{2}\log n(\theta) = \log\Pr\{J(\theta)\le\mu_1\} + \log\tfrac{1}{\sqrt{2\pi}} - \tfrac{\delta_1^2(\theta')}{2\sigma^2(\theta')}n(\theta') + \log\tfrac{\delta_1(\theta')}{\sigma(\theta')} - \tfrac{1}{2}\log n(\theta'). \qquad (17)$$

Letting n(θ) = α(θ)T and n(θ′) = α(θ′)T and dividing both sides by T, we have

$$\tfrac{1}{T}\log\Pr\{J(\theta')\le\mu_1\} + \tfrac{1}{T}\log\tfrac{1}{\sqrt{2\pi}} - \tfrac{\delta_1^2(\theta)}{2\sigma^2(\theta)}\alpha(\theta) + \tfrac{1}{T}\log\tfrac{\delta_1(\theta)}{\sigma(\theta)} - \tfrac{1}{2T}\big(\log\alpha(\theta)+\log T\big) = \tfrac{1}{T}\log\Pr\{J(\theta)\le\mu_1\} + \tfrac{1}{T}\log\tfrac{1}{\sqrt{2\pi}} - \tfrac{\delta_1^2(\theta')}{2\sigma^2(\theta')}\alpha(\theta') + \tfrac{1}{T}\log\tfrac{\delta_1(\theta')}{\sigma(\theta')} - \tfrac{1}{2T}\big(\log\alpha(\theta')+\log T\big). \qquad (18)$$

Letting T → ∞, we have

$$\frac{\delta_1^2(\theta)}{\sigma^2(\theta)}\alpha(\theta) = \frac{\delta_1^2(\theta')}{\sigma^2(\theta')}\alpha(\theta'). \qquad (19)$$

Rearranging the terms, we have

$$\frac{n(\theta)}{n(\theta')} = \frac{\alpha(\theta)}{\alpha(\theta')} = \left(\frac{\sigma(\theta)/\delta_1(\theta)}{\sigma(\theta')/\delta_1(\theta')}\right)^2. \qquad (20)$$

For the other choices of θ and θ′, it is tedious but straightforward to show that

$$\frac{n(\theta)}{n(\theta')} = \left(\frac{\sigma(\theta)/\delta_1(\theta)}{\sigma(\theta')/\delta_1(\theta')}\right)^2. \qquad (21)$$

Also note that Equation (14) requires that n(θ) ≥ 0 for all θ ∈ Θ. Thus, Equation (20) provides an asymptotically optimal allocation for AP1. We summarize the above results in the following theorem.

Theorem 1. PCS_m is asymptotically maximized when

$$\frac{n(\theta)}{n(\theta')} = \left(\frac{\sigma(\theta)/\delta_1(\theta)}{\sigma(\theta')/\delta_1(\theta')}\right)^2,$$

for all θ, θ′ ∈ Θ.
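Theorem 1 pins down only the ratios n(θ)/n(θ′); to obtain an actual allocation one normalizes the weights (σ(θ)/δ1(θ))² to the available budget T. A sketch of this normalization (helper name is ours):

```python
def theorem1_allocation(sigma, delta1, T):
    """Allocate budget T proportionally to (sigma(theta)/delta1(theta))^2,
    which realizes the ratio condition of Theorem 1."""
    weight = {t: (sigma[t] / delta1[t]) ** 2 for t in sigma}
    total = sum(weight.values())
    return {t: T * w / total for t, w in weight.items()}

sigma  = {1: 1.0, 2: 1.0, 3: 2.0}
delta1 = {1: -2.0, 2: 1.0, 3: 1.0}   # delta1 = J_bar - mu1; sign drops out
alloc = theorem1_allocation(sigma, delta1, T=100)
print({t: round(v, 1) for t, v in alloc.items()})  # {1: 4.8, 2: 19.0, 3: 76.2}
```

Design 3, with the largest noise-to-distance ratio, receives most of the budget, which matches the OCBA intuition of sampling hardest where designs are hardest to distinguish.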

Note that the difference between APCS_m and PCS_m depends on μ1, which is the boundary that separates the top-g designs from the rest of the designs. In order to minimize this difference we should pick μ1 such that APCS_m is maximized. Following a similar analysis as in Chen et al. (2008), we have

$$\mu_1 = \frac{\sigma(\theta_{i_{g+1}})\bar J(\theta_{i_g})/\sqrt{n(\theta_{i_{g+1}})} + \sigma(\theta_{i_g})\bar J(\theta_{i_{g+1}})/\sqrt{n(\theta_{i_g})}}{\sigma(\theta_{i_g})/\sqrt{n(\theta_{i_g})} + \sigma(\theta_{i_{g+1}})/\sqrt{n(\theta_{i_{g+1}})}}. \qquad (22)$$

Following Theorem 1, we have Algorithm 1.

Algorithm 1: Optimal computing budget allocation for m simplest good designs in the ordinal sense (OCBAmSGO)

Step 0. Simulate each design by n0 replications; l ← 0; n_l(θ1) = n_l(θ2) = · · · = n_l(θk) = n0.
Step 1. If ∑_{i=1}^{k} n(θi) ≥ T, stop.
Step 2. Increase the total simulation budget by Δ and compute the new budget allocation n_{l+1}(θ1), . . . , n_{l+1}(θk) using Theorem 1.
Step 3. Simulate design θi for an additional max(0, n_{l+1}(θi) − n_l(θi)) replications, i = 1, . . . , k; l ← l + 1. Go to Step 1.
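The loop of Algorithm 1 can be sketched end to end on a toy problem. Everything below (the synthetic designs, n0 = 10, Δ = 50, and the use of the observed order in Equation (22)) is our illustrative choice, not the article's experimental setup:

```python
import math, random

random.seed(0)
true_J = [0.0, 0.5, 1.0, 3.0, 4.0]    # k = 5 designs, minimization
sigma  = [1.0] * 5                     # known observation-noise std
k, g, T, n0, delta = 5, 2, 500, 10, 50

# Step 0: n0 replications per design
obs = [[random.gauss(true_J[i], sigma[i]) for _ in range(n0)] for i in range(k)]

def mean(xs):
    return sum(xs) / len(xs)

while sum(len(o) for o in obs) < T:                    # Step 1
    J_bar = [mean(o) for o in obs]
    n = [len(o) for o in obs]
    order = sorted(range(k), key=lambda i: J_bar[i])
    ig, ig1 = order[g - 1], order[g]                   # boundary designs
    s = [sigma[i] / math.sqrt(n[i]) for i in range(k)] # posterior stds
    # mu1 per Eq. (22), then delta1 and the Theorem 1 weights
    mu1 = (s[ig1] * J_bar[ig] + s[ig] * J_bar[ig1]) / (s[ig] + s[ig1])
    d1 = [(J_bar[i] - mu1) or 1e-12 for i in range(k)]
    w = [(sigma[i] / d1[i]) ** 2 for i in range(k)]
    # Step 2: grow the budget by delta and recompute the allocation
    target = [(sum(n) + delta) * wi / sum(w) for wi in w]
    # Step 3: simulate each design up to its new target
    for i in range(k):
        for _ in range(max(0, round(target[i]) - n[i])):
            obs[i].append(random.gauss(true_J[i], sigma[i]))

J_bar = [mean(o) for o in obs]
sel = sorted(range(k), key=lambda i: J_bar[i])[:g]     # observed top-g
print(sel)
```

With the wide performance gaps chosen here, the observed top-g set almost surely coincides with the true top-g set {0, 1} once the budget is spent.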

3.2. Selecting a bSG

The choice of S in Equation (7) also provides an estimate of the bSG under the given observed performance. We can divide the entire design space into four subsets:

$$S_1 = \bigcup_{i=1,\,i\ne t+1}^{M} \hat G_i, \qquad (23)$$

$$S_2 = \bigcup_{i=1}^{M} \hat D_i, \qquad (24)$$

$$S_3 = \left\{\theta_{t+1,1},\ \ldots,\ \theta_{t+1,\,m-\sum_{i=1}^{t}|\hat G_i|}\right\}, \qquad (25)$$

$$S_4 = \hat G_{t+1} \setminus S_3. \qquad (26)$$

In other words, S1 contains all of the observed good designs except those in Ĝ_{t+1}; S2 contains all of the observed bad designs, including those in D̂_{t+1}; S3 = S ∩ Ĝ_{t+1}; and S4 contains the designs in Ĝ_{t+1} other than those in S3. Then we have

$$
\begin{aligned}
\text{PCS}_b &= \Pr\{S \text{ is a bSG}\}\\
&\ge \Pr\{S_1\cup S_3\cup S_4 \text{ is good},\ S_2 \text{ is not good},\ S_3 \text{ is better than } S_4\}\\
&\ge \Pr\{J(\theta)\le\mu_1,\ \theta\in S_1;\ J(\theta)>\mu_1,\ \theta\in S_2;\ J(\theta)\le\mu_2,\ \theta\in S_3;\ \mu_2<J(\theta)\le\mu_1,\ \theta\in S_4\}\\
&= \left(\prod_{\theta\in S_1}\Pr\{J(\theta)\le\mu_1\}\right)\left(\prod_{\theta\in S_2}\Pr\{J(\theta)>\mu_1\}\right)\left(\prod_{\theta\in S_3}\Pr\{J(\theta)\le\mu_2\}\right)\left(\prod_{\theta\in S_4}\Pr\{\mu_2<J(\theta)\le\mu_1\}\right)\\
&:= \text{APCS}_b, \qquad (27)
\end{aligned}
$$

where μ2 is the boundary that separates the designs in S3 from the designs in S4. The difference between APCS_b and PCS_b depends on μ1 and μ2. In order to minimize this difference we should pick μ1 and μ2 such that APCS_b is maximized. Following a similar analysis as in Chen et al. (2008), we use Equation (22) to determine the value of μ1,


and we have

$$\mu_2 = \min\left(\mu_1,\ \frac{\sigma(\theta_{t+1,r+1})\bar J(\theta_{t+1,r})/\sqrt{n(\theta_{t+1,r+1})} + \sigma(\theta_{t+1,r})\bar J(\theta_{t+1,r+1})/\sqrt{n(\theta_{t+1,r})}}{\sigma(\theta_{t+1,r})/\sqrt{n(\theta_{t+1,r})} + \sigma(\theta_{t+1,r+1})/\sqrt{n(\theta_{t+1,r+1})}}\right), \qquad (28)$$

where r = m − ∑_{i=1}^{t} |Ĝi|. Then an approximate version of P2 is

(AP2) max_{n(θ), θ∈Θ} APCS_b s.t. ∑_{i=1}^{k} n(θi) = T.

Let L_b be the Lagrangian relaxation of AP2:

$$L_b \equiv \text{APCS}_b + \lambda_b\left(\sum_{i=1}^{k} n(\theta_i) - T\right). \qquad (29)$$

The KKT conditions of AP2 are as follows. For θ ∈ S1:

$$\frac{\partial L_b}{\partial n(\theta)} = \left(\prod_{\theta'\in S_1,\,\theta'\ne\theta}\Pr\{J(\theta')\le\mu_1\}\right)\left(\prod_{\theta'\in S_2}\Pr\{J(\theta')>\mu_1\}\right)\left(\prod_{\theta'\in S_3}\Pr\{J(\theta')\le\mu_2\}\right)\left(\prod_{\theta'\in S_4}\Pr\{\mu_2<J(\theta')\le\mu_1\}\right)\phi\!\left(\frac{-\delta_1(\theta)}{\sigma(\theta)/\sqrt{n(\theta)}}\right)\frac{-\delta_1(\theta)}{2\sigma(\theta)\sqrt{n(\theta)}} + \lambda_b = 0; \qquad (30)$$

for θ ∈ S2:

$$\frac{\partial L_b}{\partial n(\theta)} = \left(\prod_{\theta'\in S_1}\Pr\{J(\theta')\le\mu_1\}\right)\left(\prod_{\theta'\in S_2,\,\theta'\ne\theta}\Pr\{J(\theta')>\mu_1\}\right)\left(\prod_{\theta'\in S_3}\Pr\{J(\theta')\le\mu_2\}\right)\left(\prod_{\theta'\in S_4}\Pr\{\mu_2<J(\theta')\le\mu_1\}\right)\phi\!\left(\frac{\delta_1(\theta)}{\sigma(\theta)/\sqrt{n(\theta)}}\right)\frac{\delta_1(\theta)}{2\sigma(\theta)\sqrt{n(\theta)}} + \lambda_b = 0; \qquad (31)$$

for θ ∈ S3:

$$\frac{\partial L_b}{\partial n(\theta)} = \left(\prod_{\theta'\in S_1}\Pr\{J(\theta')\le\mu_1\}\right)\left(\prod_{\theta'\in S_2}\Pr\{J(\theta')>\mu_1\}\right)\left(\prod_{\theta'\in S_3,\,\theta'\ne\theta}\Pr\{J(\theta')\le\mu_2\}\right)\left(\prod_{\theta'\in S_4}\Pr\{\mu_2<J(\theta')\le\mu_1\}\right)\phi\!\left(\frac{-\delta_2(\theta)}{\sigma(\theta)/\sqrt{n(\theta)}}\right)\frac{-\delta_2(\theta)}{2\sigma(\theta)\sqrt{n(\theta)}} + \lambda_b = 0; \qquad (32)$$

for θ ∈ S4:

$$\frac{\partial L_b}{\partial n(\theta)} = \left(\prod_{\theta'\in S_1}\Pr\{J(\theta')\le\mu_1\}\right)\left(\prod_{\theta'\in S_2}\Pr\{J(\theta')>\mu_1\}\right)\left(\prod_{\theta'\in S_3}\Pr\{J(\theta')\le\mu_2\}\right)\left(\prod_{\theta'\in S_4,\,\theta'\ne\theta}\Pr\{\mu_2<J(\theta')\le\mu_1\}\right)\left[\phi\!\left(\frac{-\delta_1(\theta)}{\sigma(\theta)/\sqrt{n(\theta)}}\right)\frac{-\delta_1(\theta)}{2\sigma(\theta)\sqrt{n(\theta)}} - \phi\!\left(\frac{-\delta_2(\theta)}{\sigma(\theta)/\sqrt{n(\theta)}}\right)\frac{-\delta_2(\theta)}{2\sigma(\theta)\sqrt{n(\theta)}}\right] + \lambda_b = 0; \qquad (33)$$

and

$$\frac{\partial L_b}{\partial \lambda_b} = \sum_{i=1}^{k} n(\theta_i) - T = 0, \qquad (34)$$

where δ2(θ) = J̄(θ) − μ2. We now analyze the relationship among the n(θ) values, θ ∈ Θ.

Case 1: θ, θ′ ∈ S1. Following Equation (30), we have

$$
\left(\prod_{\theta''\in S_1,\,\theta''\ne\theta}\Pr\{J(\theta'')\le\mu_1\}\right)\left(\prod_{\theta''\in S_2}\Pr\{J(\theta'')>\mu_1\}\right)\left(\prod_{\theta''\in S_3}\Pr\{J(\theta'')\le\mu_2\}\right)\left(\prod_{\theta''\in S_4}\Pr\{\mu_2<J(\theta'')\le\mu_1\}\right)\phi\!\left(\frac{-\delta_1(\theta)}{\sigma(\theta)/\sqrt{n(\theta)}}\right)\frac{-\delta_1(\theta)}{\sigma(\theta)\sqrt{n(\theta)}}
$$
$$
= \left(\prod_{\theta''\in S_1,\,\theta''\ne\theta'}\Pr\{J(\theta'')\le\mu_1\}\right)\left(\prod_{\theta''\in S_2}\Pr\{J(\theta'')>\mu_1\}\right)\left(\prod_{\theta''\in S_3}\Pr\{J(\theta'')\le\mu_2\}\right)\left(\prod_{\theta''\in S_4}\Pr\{\mu_2<J(\theta'')\le\mu_1\}\right)\phi\!\left(\frac{-\delta_1(\theta')}{\sigma(\theta')/\sqrt{n(\theta')}}\right)\frac{-\delta_1(\theta')}{\sigma(\theta')\sqrt{n(\theta')}}. \qquad (35)
$$

Simplifying Equation (35), we have

$$\Pr\{J(\theta')\le\mu_1\}\,\phi\!\left(\frac{-\delta_1(\theta)}{\sigma(\theta)/\sqrt{n(\theta)}}\right)\frac{\delta_1(\theta)}{\sigma(\theta)\sqrt{n(\theta)}} = \Pr\{J(\theta)\le\mu_1\}\,\phi\!\left(\frac{-\delta_1(\theta')}{\sigma(\theta')/\sqrt{n(\theta')}}\right)\frac{\delta_1(\theta')}{\sigma(\theta')\sqrt{n(\theta')}}. \qquad (36)$$


Taking the natural log of both sides, we have

$$\log\Pr\{J(\theta')\le\mu_1\} + \log\tfrac{1}{\sqrt{2\pi}} - \tfrac{\delta_1^2(\theta)}{2\sigma^2(\theta)}n(\theta) + \log\tfrac{\delta_1(\theta)}{\sigma(\theta)} - \tfrac{1}{2}\log n(\theta) = \log\Pr\{J(\theta)\le\mu_1\} + \log\tfrac{1}{\sqrt{2\pi}} - \tfrac{\delta_1^2(\theta')}{2\sigma^2(\theta')}n(\theta') + \log\tfrac{\delta_1(\theta')}{\sigma(\theta')} - \tfrac{1}{2}\log n(\theta'). \qquad (37)$$

Let n(θ) = α(θ)T and n(θ′) = α(θ′)T. Dividing both sides by T, we have

$$\tfrac{1}{T}\log\Pr\{J(\theta')\le\mu_1\} + \tfrac{1}{T}\log\tfrac{1}{\sqrt{2\pi}} - \tfrac{\delta_1^2(\theta)}{2\sigma^2(\theta)}\alpha(\theta) + \tfrac{1}{T}\log\tfrac{\delta_1(\theta)}{\sigma(\theta)} - \tfrac{1}{2T}\log(\alpha(\theta)T) = \tfrac{1}{T}\log\Pr\{J(\theta)\le\mu_1\} + \tfrac{1}{T}\log\tfrac{1}{\sqrt{2\pi}} - \tfrac{\delta_1^2(\theta')}{2\sigma^2(\theta')}\alpha(\theta') + \tfrac{1}{T}\log\tfrac{\delta_1(\theta')}{\sigma(\theta')} - \tfrac{1}{2T}\log(\alpha(\theta')T). \qquad (38)$$

Letting T → ∞, we have

$$\frac{\delta_1^2(\theta)}{\sigma^2(\theta)}\alpha(\theta) = \frac{\delta_1^2(\theta')}{\sigma^2(\theta')}\alpha(\theta'). \qquad (39)$$

Rearranging the terms, we have

$$\frac{n(\theta)}{n(\theta')} = \frac{\alpha(\theta)}{\alpha(\theta')} = \left(\frac{\sigma(\theta)/\delta_1(\theta)}{\sigma(\theta')/\delta_1(\theta')}\right)^2. \qquad (40)$$

Following a similar analysis, we have that for θ, θ′ ∈ S1 ∪ S2 ∪ S3:

$$\frac{n(\theta)}{n(\theta')} = \left(\frac{\sigma(\theta)/\nu(\theta)}{\sigma(\theta')/\nu(\theta')}\right)^2, \qquad (41)$$

where

$$\nu(\theta) = \begin{cases} \delta_1(\theta), & \theta\in S_1\cup S_2,\\ \delta_2(\theta), & \theta\in S_3. \end{cases}$$

Case 2: θ ∈ S1, θ′ ∈ S4. Following Equations (30) and (33), we have

$$
\left(\prod_{\theta''\in S_1,\,\theta''\ne\theta}\Pr\{J(\theta'')\le\mu_1\}\right)\left(\prod_{\theta''\in S_2}\Pr\{J(\theta'')>\mu_1\}\right)\left(\prod_{\theta''\in S_3}\Pr\{J(\theta'')\le\mu_2\}\right)\left(\prod_{\theta''\in S_4}\Pr\{\mu_2<J(\theta'')\le\mu_1\}\right)\phi\!\left(\frac{-\delta_1(\theta)}{\sigma(\theta)/\sqrt{n(\theta)}}\right)\frac{-\delta_1(\theta)}{2\sigma(\theta)\sqrt{n(\theta)}}
$$
$$
= \left(\prod_{\theta''\in S_1}\Pr\{J(\theta'')\le\mu_1\}\right)\left(\prod_{\theta''\in S_2}\Pr\{J(\theta'')>\mu_1\}\right)\left(\prod_{\theta''\in S_3}\Pr\{J(\theta'')\le\mu_2\}\right)\left(\prod_{\theta''\in S_4,\,\theta''\ne\theta'}\Pr\{\mu_2<J(\theta'')\le\mu_1\}\right)\left[\phi\!\left(\frac{-\delta_1(\theta')}{\sigma(\theta')/\sqrt{n(\theta')}}\right)\frac{-\delta_1(\theta')}{2\sigma(\theta')\sqrt{n(\theta')}} - \phi\!\left(\frac{-\delta_2(\theta')}{\sigma(\theta')/\sqrt{n(\theta')}}\right)\frac{-\delta_2(\theta')}{2\sigma(\theta')\sqrt{n(\theta')}}\right]. \qquad (42)
$$

Simplifying Equation (42), we have

$$\Pr\{\mu_2<J(\theta')\le\mu_1\}\,\phi\!\left(\frac{-\delta_1(\theta)}{\sigma(\theta)/\sqrt{n(\theta)}}\right)\frac{-\delta_1(\theta)}{\sigma(\theta)\sqrt{n(\theta)}} = \Pr\{J(\theta)\le\mu_1\}\,A, \qquad (43)$$

where

$$A = \phi\!\left(\frac{-\delta_1(\theta')}{\sigma(\theta')/\sqrt{n(\theta')}}\right)\frac{-\delta_1(\theta')}{\sigma(\theta')\sqrt{n(\theta')}} - \phi\!\left(\frac{-\delta_2(\theta')}{\sigma(\theta')/\sqrt{n(\theta')}}\right)\frac{-\delta_2(\theta')}{\sigma(\theta')\sqrt{n(\theta')}}. \qquad (44)$$

Taking the natural log of both sides, we have

$$\log\Pr\{\mu_2<J(\theta')\le\mu_1\} + \log\tfrac{1}{\sqrt{2\pi}} - \tfrac{\delta_1^2(\theta)}{2\sigma^2(\theta)}n(\theta) + \log\tfrac{-\delta_1(\theta)}{\sigma(\theta)} - \tfrac{1}{2}\log n(\theta) = \log\Pr\{J(\theta)\le\mu_1\} + \log A. \qquad (45)$$

(45)

Rearranging the terms, we have

$$-\frac{\delta_1^2(\theta)\alpha(\theta)}{2\sigma^2(\theta)} = \lim_{T\to\infty}\frac{1}{T}\log A. \qquad (46)$$

Note that by L'Hôpital's rule we have

$$\lim_{T\to\infty}\frac{1}{T}\log A = \lim_{T\to\infty}\frac{\mathrm{d}A/\mathrm{d}T}{A}, \qquad (47)$$


where

$$\frac{\mathrm{d}A/\mathrm{d}T}{A} = \frac{\exp\left\{-\frac{\delta_1^2(\theta')}{2\sigma^2(\theta')}\alpha(\theta')T\right\}\left[\frac{\delta_1^3(\theta')}{2\sigma^3(\theta')}\sqrt{\frac{\alpha(\theta')}{T}} + \frac{\delta_1(\theta')}{2\sigma(\theta')}\frac{1}{\sqrt{\alpha(\theta')T^3}}\right] - \exp\left\{-\frac{\delta_2^2(\theta')}{2\sigma^2(\theta')}\alpha(\theta')T\right\}\left[\frac{\delta_2^3(\theta')}{2\sigma^3(\theta')}\sqrt{\frac{\alpha(\theta')}{T}} + \frac{\delta_2(\theta')}{2\sigma(\theta')}\frac{1}{\sqrt{\alpha(\theta')T^3}}\right]}{\exp\left\{-\frac{\delta_1^2(\theta')}{2\sigma^2(\theta')}\alpha(\theta')T\right\}\frac{-\delta_1(\theta')}{\sigma(\theta')\sqrt{\alpha(\theta')T}} - \exp\left\{-\frac{\delta_2^2(\theta')}{2\sigma^2(\theta')}\alpha(\theta')T\right\}\frac{-\delta_2(\theta')}{\sigma(\theta')\sqrt{\alpha(\theta')T}}}. \qquad (48)$$

Dividing both the numerator and the denominator on the right-hand side (RHS) of Equation (48) by exp{−(δ2²(θ′)/2σ²(θ′))α(θ′)T}, we have

\[
\begin{aligned}
\frac{\mathrm{d}A/\mathrm{d}T}{A} ={}&\left(\exp\left\{-\frac{\delta_1^2(\theta') - \delta_2^2(\theta')}{2\sigma^2(\theta')}\alpha(\theta')T\right\}\left[\frac{\delta_1^3(\theta')}{2\sigma^3(\theta')}\sqrt{\frac{\alpha(\theta')}{T}} + \frac{\delta_1(\theta')}{2\sigma(\theta')}\frac{1}{\sqrt{\alpha(\theta')T^3}}\right]\right.\\
&\left.{}- \left[\frac{\delta_2^3(\theta')}{2\sigma^3(\theta')}\sqrt{\frac{\alpha(\theta')}{T}} + \frac{\delta_2(\theta')}{2\sigma(\theta')}\frac{1}{\sqrt{\alpha(\theta')T^3}}\right]\right)\\
&\div\left(\exp\left\{-\frac{\delta_1^2(\theta') - \delta_2^2(\theta')}{2\sigma^2(\theta')}\alpha(\theta')T\right\}\frac{-\delta_1(\theta')}{\sigma(\theta')\sqrt{\alpha(\theta')T}} + \frac{\delta_2(\theta')}{\sigma(\theta')\sqrt{\alpha(\theta')T}}\right).
\end{aligned}
\tag{49}
\]

If |δ2(θ′)| > |δ1(θ′)|, i.e., J(θ′) > (μ1 + μ2)/2, we have
\[
\lim_{T\to\infty}\frac{\mathrm{d}A/\mathrm{d}T}{A} = -\frac{\delta_1^2(\theta')}{2\sigma^2(\theta')}\,\alpha(\theta'). \tag{50}
\]

Combining Equations (46), (47), and (50), we have
\[
\frac{\alpha(\theta)}{\alpha(\theta')} = \left(\frac{\sigma(\theta)/\delta_1(\theta)}{\sigma(\theta')/\delta_1(\theta')}\right)^2. \tag{51}
\]

If |δ2(θ′)| < |δ1(θ′)|, i.e., J(θ′) ≤ (μ1 + μ2)/2, we have
\[
\lim_{T\to\infty}\frac{\mathrm{d}A/\mathrm{d}T}{A} = -\frac{\delta_2^2(\theta')}{2\sigma^2(\theta')}\,\alpha(\theta'). \tag{52}
\]

Combining Equations (46), (47), and (52), we have
\[
\frac{\alpha(\theta)}{\alpha(\theta')} = \left(\frac{\sigma(\theta)/\delta_1(\theta)}{\sigma(\theta')/\delta_2(\theta')}\right)^2. \tag{53}
\]
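As a sanity check on the limits in Equations (50) and (52), one can evaluate (1/T) log A numerically from the definition of A in Equation (44) with n(θ′) = α(θ′)T; the limit should be −min(δ1², δ2²)α(θ′)/(2σ²(θ′)), i.e., the smaller gap dominates. A minimal sketch (the parameter values are our own, chosen only for illustration):

```python
from math import exp, log, pi, sqrt

def log_A(T, delta1, delta2, sigma, alpha):
    """log of A in Equation (44) with n(theta') = alpha * T.

    phi is the standard normal pdf; both terms of A are positive
    because delta1 <= 0 < delta2 for a design in S4.
    """
    n = alpha * T

    def term(delta):
        phi = exp(-(delta * delta) * n / (2.0 * sigma * sigma)) / sqrt(2.0 * pi)
        return phi * (-delta) / (sigma * sqrt(n))

    return log(term(delta1) - term(delta2))

sigma, alpha = 1.0, 1.0

# Case |delta2| > |delta1|: the limit should be -delta1^2 * alpha / (2 sigma^2).
print(log_A(2000.0, -0.5, 1.0, sigma, alpha) / 2000.0)  # close to -0.125

# Case |delta2| < |delta1|: the limit should be -delta2^2 * alpha / (2 sigma^2).
print(log_A(2000.0, -1.0, 0.4, sigma, alpha) / 2000.0)  # close to -0.08
```

Convergence is slow (the error decays like (log T)/T because of the polynomial factors in A), so a moderate tolerance is appropriate at finite T.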

Combining Equations (51) and (53), we have
\[
\frac{n(\theta)}{n(\theta')} =
\begin{cases}
\left(\dfrac{\sigma(\theta)/\delta_1(\theta)}{\sigma(\theta')/\delta_1(\theta')}\right)^2 & \text{if } J(\theta') > \dfrac{\mu_1+\mu_2}{2},\\[3mm]
\left(\dfrac{\sigma(\theta)/\delta_1(\theta)}{\sigma(\theta')/\delta_2(\theta')}\right)^2 & \text{if } J(\theta') \le \dfrac{\mu_1+\mu_2}{2}.
\end{cases}
\tag{54}
\]

Similarly, for θ ∈ S1 ∪ S2 ∪ S3 and θ′ ∈ S4, we have
\[
\frac{n(\theta)}{n(\theta')} =
\begin{cases}
\left(\dfrac{\sigma(\theta)/\nu(\theta)}{\sigma(\theta')/\delta_1(\theta')}\right)^2 & \text{if } J(\theta') > \dfrac{\mu_1+\mu_2}{2},\\[3mm]
\left(\dfrac{\sigma(\theta)/\nu(\theta)}{\sigma(\theta')/\delta_2(\theta')}\right)^2 & \text{if } J(\theta') \le \dfrac{\mu_1+\mu_2}{2}.
\end{cases}
\tag{55}
\]

In other words, we can further split S4 into two subsets:
\[
S_{41} = \left\{\theta \in S_4 : J(\theta) > \frac{\mu_1+\mu_2}{2}\right\}, \tag{56}
\]
\[
S_{42} = \left\{\theta \in S_4 : J(\theta) \le \frac{\mu_1+\mu_2}{2}\right\}. \tag{57}
\]

Combining Equations (41), (54), and (55), we have
\[
\left.\frac{n(\theta)}{(\sigma(\theta)/\delta_1(\theta))^2}\right|_{\theta\in S_1\cup S_2\cup S_{41}} = \left.\frac{n(\theta)}{(\sigma(\theta)/\delta_2(\theta))^2}\right|_{\theta\in S_3\cup S_{42}}. \tag{58}
\]

Then we have the following theorem.

Theorem 2. PCSb is asymptotically maximized when
\[
\left.\frac{n(\theta)}{(\sigma(\theta)/\delta_1(\theta))^2}\right|_{\theta\in S_1\cup S_2\cup S_{41}} = \left.\frac{n(\theta)}{(\sigma(\theta)/\delta_2(\theta))^2}\right|_{\theta\in S_3\cup S_{42}}. \tag{59}
\]

We then have Algorithm 2.

Algorithm 2. Optimal computing budget allocation for the best m simplest good designs in the ordinal sense (OCBAbSGO)

Step 0. Simulate each design by n0 replications; l ← 0; n_l(θ1) = n_l(θ2) = ··· = n_l(θk) = n0.
Step 1. If Σ_{i=1}^{k} n(θi) ≥ T, stop.
Step 2. Increase the total simulation time by Δ and compute the new budget allocation n_{l+1}(θ1), ..., n_{l+1}(θk) using Theorem 2.
Step 3. Simulate design θi for an additional max(0, n_{l+1}(θi) − n_l(θi)) replications, i = 1, ..., k; l ← l + 1. Go to Step 1.
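The allocation in Step 2 follows the ratio rule of Theorem 2: each design receives budget proportional to (σ(θ)/δ(θ))², where δ is δ1 on S1 ∪ S2 ∪ S41 and δ2 on S3 ∪ S42. A minimal sketch of that normalization step, assuming the noise levels and gaps have already been estimated from the current sample statistics (the function and design names below are ours, not from the paper):

```python
def ocba_bsgo_allocation(budget, groups):
    """One allocation step implied by Theorem 2 / Equation (59).

    `groups` maps each design id to (sigma, delta), where delta stands
    for delta_1 if the design lies in S1 u S2 u S41 and for delta_2 if
    it lies in S3 u S42.  Theorem 2 equates n(theta)/(sigma/delta)^2
    across designs, so n(theta) is proportional to (sigma/delta)^2;
    we normalize those weights to the available budget.
    """
    weights = {d: (sigma / delta) ** 2 for d, (sigma, delta) in groups.items()}
    total = sum(weights.values())
    return {d: budget * w / total for d, w in weights.items()}

# Hypothetical noise/gap estimates for five designs (illustrative only).
groups = {
    "theta1": (6.0, 1.0),  # small gap: hard to separate, gets more budget
    "theta2": (6.0, 2.0),
    "theta3": (6.0, 3.0),  # large gap: easy to separate, gets less budget
    "theta4": (6.0, 1.5),
    "theta5": (6.0, 2.5),
}
alloc = ocba_bsgo_allocation(1000.0, groups)
```

As expected, designs with small performance gaps relative to their noise receive the bulk of the budget.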

Note that though the above choices of S in mSG and bSG are the same, their allocations are clearly different. OCBAmSGO tries to make sure that the designs in Ĝ are truly top-g. Then the choice of S in Equation (7) will make sure that the simplest m designs in Ĝ are picked. OCBAbSGO furthermore tries to make sure that the designs in S3 are better than the designs in S4.


4. Numerical results

We compare OCBAmSGO (Algorithm 1, or A1 for short) and OCBAbSGO (Algorithm 2, or A2 for short) with the Equal Allocation (EA) approach over two groups of examples. The first group includes academic examples. The second group includes smoke detection problems in WSNs.

4.1. Academic examples

In order to capture different relationships between the cardinal performance and ordinal indexes of designs, the following types of ordered performance are considered:

- neutral: J(θ[i]) = i − 1, i = 1, ..., 10;
- flat: J(θ[i]) = 9 − 3√(10 − i), i = 1, ..., 10;
- steep: J(θ[i]) = 9 − ((10 − i)/3)², i = 1, ..., 10;

where θ[i] represents the top-ith design. In the neutral type of problems, the performance difference between neighboring designs is equal. In the flat type of problems, most designs have a good performance. On the contrary, in the steep type of problems, most designs have a poor performance. The following two types of relationships between performance and complexity are considered.

1. Simpler is better; i.e., if C(θ) < C(θ′), then J(θ) < J(θ′).
2. Simpler is worse; i.e., if C(θ) < C(θ′), then J(θ) > J(θ′).

When simpler is better, we have θ[i] = θi, i = 1, ..., 10. When simpler is worse, we have θ[i] = θ_{11−i}, i = 1, ..., 10.
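For concreteness, the three performance profiles and the two complexity orderings can be generated with a short script (the helper names are our own):

```python
from math import sqrt

# J(theta_[i]) for the three ordered-performance types, i = 1..10.
neutral = [i - 1 for i in range(1, 11)]
flat = [9 - 3 * sqrt(10 - i) for i in range(1, 11)]
steep = [9 - ((10 - i) / 3) ** 2 for i in range(1, 11)]

# "Simpler is better": theta_[i] = theta_i, so the design with complexity
# index j has rank j.  "Simpler is worse": theta_[i] = theta_{11-i}, so the
# design with complexity index j has rank 11 - j.
def simpler_is_better(ordered):
    """Map complexity index j (1..10) to the performance of theta_j."""
    return {j: ordered[j - 1] for j in range(1, 11)}

def simpler_is_worse(ordered):
    """Map complexity index j (1..10) to the performance of theta_j."""
    return {j: ordered[(11 - j) - 1] for j in range(1, 11)}
```

All three profiles range from 0 (best) to 9 (worst); they differ only in how the performance gaps are distributed across the ranks.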

Combining the above two considerations, we then have six types of problems. In each problem, Θ1 = {θ1, θ2}, Θ2 = {θ3, θ4}, Θ3 = {θ5, θ6}, Θ4 = {θ7, θ8}, and Θ5 = {θ9, θ10}. The performance of the six problems is shown in Figs. 1 and 2.

Fig. 1. The three examples where simpler is better (color figure provided online).

Fig. 2. The three examples where simpler is worse (color figure provided online).

Regard the top-3 designs as good; i.e., g = 3. We are interested in the m simplest good designs, where m = 2. Suppose the observation noise for each design is independent and identically distributed Gaussian N(0, 6²). We use OCBAmSGO, OCBAbSGO, and EA to find the mSG and bSG, respectively, where n0 = 30 and Δ = 10. Their PCSm and PCSb values for different T were estimated over 100 000 independent runs and are shown in Figs. 3–8, respectively. We make the following remarks.

Remark 1. In all of the six problems, OCBAmSGO (Algorithm 1) achieves higher PCSm values than EA and OCBAbSGO (Algorithm 2) achieves higher PCSb values than EA.

Fig. 3. The neutral problem where simpler is better (color figure provided online).

Fig. 4. The flat problem where simpler is better (color figure provided online).

Fig. 5. The steep problem where simpler is better (color figure provided online).

Fig. 6. The neutral problem where simpler is worse (color figure provided online).

Fig. 7. The flat problem where simpler is worse (color figure provided online).

Remark 2. In all of the six problems, when T is fixed, the PCSm that is achieved by OCBAmSGO is higher than the PCSb that is achieved by OCBAbSGO. Similarly, EA achieves higher PCSm values than PCSb values. This is consistent with the fact that a bSG must be an mSG but an mSG is not necessarily a bSG.

Remark 3. When T increases, OCBAmSGO, OCBAbSGO, and EA all achieve higher PCS values. This is consistent with intuition because more computing budget should lead to higher PCS values.

Fig. 8. The steep problem where simpler is worse (color figure provided online).


Remark 4. For a given T, OCBAmSGO, OCBAbSGO, and EA achieve the highest PCS values in the steep problems, lower PCS values in the neutral problems, and the lowest PCS values in the flat problems. This is consistent with intuition because the performance differences among the good designs in the steep problems are larger than those in the neutral problems, which in turn are larger than those in the flat problems.

Remark 5. Note that in our academic examples where simpler is better, PCSm only requires two of the actual top-3 designs to be within the observed top-3. But when simpler is worse, PCSm requires all of the actual top-3 designs to be observed as top-3. Similarly, when simpler is better, PCSb only requires the actual top-2 designs to be observed as top-2. However, when simpler is worse, PCSb requires all of the actual top-3 designs to be observed as top-3 and furthermore requires the truly best design to be observed as better than the truly second-best design. This explains why the PCS values in Figs. 3 to 5 are higher than those in Figs. 6 to 8. Because the performance differences between the good and bad designs in the steep problems are larger than those in the neutral problems, which in turn are larger than those in the flat problems, the flat problem where simpler is worse (Fig. 7) has a significantly lower PCS than the one where simpler is better (Fig. 4). Note that though the PCS values are lower in problems where simpler is worse, they still converge to unity when there is an infinite computing budget. It will be an interesting future research topic to exploit the knowledge of "simpler is worse" or "simpler is better" to develop a better allocation procedure. It will also be interesting to study how to change the complexities of the designs so that a higher PCS may be achieved under a given computing budget.

Fig. 9. A smoke detection problem in a WSN.

Fig. 10. The 16 deployments of the three sensors.

4.2. Smoke detection in WSNs

Consider a WSN with three nodes that are used to detect smoke in an Area of Interest (AoI). The AoI is discretized into a 10 × 10 grid in Fig. 9. A fire may start at any point on the grid in the AoI with equal probability. Once initiated, a smoke particle is generated at the fire source at each unit of time. An existing smoke particle randomly walks to a neighboring point along the grid with a positive probability. There are four possible directions in which to walk. Each direction is taken with some probability; that is,

\[
\Pr\{x_{t+1} = x_t + 1, y_{t+1} = y_t\} \propto d((x_t + 1, y_t), (x_0, y_0)), \tag{60}
\]
\[
\Pr\{x_{t+1} = x_t - 1, y_{t+1} = y_t\} \propto d((x_t - 1, y_t), (x_0, y_0)), \tag{61}
\]
\[
\Pr\{x_{t+1} = x_t, y_{t+1} = y_t + 1\} \propto d((x_t, y_t + 1), (x_0, y_0)), \tag{62}
\]
\[
\Pr\{x_{t+1} = x_t, y_{t+1} = y_t - 1\} \propto d((x_t, y_t - 1), (x_0, y_0)), \tag{63}
\]

where (x_t, y_t) is the position of the smoke particle at time t, (x_0, y_0) is the position of the fire, and d(·, ·) is the distance between two positions. In other words, the fire generates smoke with the specified probabilities. The sensors can be deployed in three of nine possible positions. Once deployed, the sensors use an identical active sensing radius r ∈ {0, 1, 2, 3}, where r = 0 means that the sensor detects the smoke (and thus triggers the alarm) only if a smoke particle arrives at the sensor (this is purely passive detection); r ≥ 1 means that the sensor detects the smoke if a smoke particle is within r grids of the sensor. Power consumption increases with the active sensing radius. Thus, we are interested in a good deployment with a short sensing radius. When we pick out the m simplest good deployments, it is up to the final user to determine which deployment to use; usually only one deployment is eventually selected and implemented. There are 84 possible ways to deploy the three sensors; however, removing all of the symmetric possibilities leaves only 16 to analyze, as seen in Fig. 10. Let r be the complexity measure. Thus, |Θi| = 16, i = 1, 2, 3, 4, and |Θ| = 64. For each deployment and sensing radius we are interested in the probability that a smoke particle is detected within time T0, with T0 = 5 in our numerical experiments. We evaluated the performance of the 64 designs over 10 000 independent simulations and the results are shown in Fig. 11 together with the standard deviations of their performance observations. We regard the top-20 designs as good enough, i.e., g = 20, which is denoted by the big circles in Fig. 11. For different T and m values, we applied OCBAmSGO, OCBAbSGO, and EA to obtain the mSG and bSG and estimated their PCSm and PCSb using 100 000 independent runs of the algorithms; the results are shown in Figs. 12 and 13, respectively, where n0 = 20 and Δ = 10. We make the following remarks.

Fig. 11. The performance of the 64 designs (color figure provided online).

Fig. 12. PCSm values for EA and A1 (m = 19, 20) (color figure provided online).

Fig. 13. PCSb values for EA and A2 (m = 19, 20) (color figure provided online).
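The smoke model can be sketched in a few lines: the walk in Equations (60) to (63) weights each of the four candidate moves by that point's distance to the fire, so smoke drifts away from the source on average. The paper does not pin down the distance metric or the boundary handling, so the sketch below assumes Euclidean distance for d(·, ·), Chebyshev distance for "within r grids", and ignores the grid boundary:

```python
import random

def smoke_step(x, y, x0, y0):
    """One random-walk step per Equations (60) to (63): each move is taken
    with probability proportional to the distance from the candidate point
    to the fire at (x0, y0).  Euclidean distance assumed; grid-boundary
    handling omitted for brevity."""
    moves = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    dists = [((mx - x0) ** 2 + (my - y0) ** 2) ** 0.5 for mx, my in moves]
    return random.choices(moves, weights=dists)[0]

def detected(px, py, sensors, r):
    """True if some sensor with active sensing radius r sees the particle;
    r = 0 reduces to purely passive detection at the sensor itself.
    Chebyshev distance is our assumption for "within r grids"."""
    return any(max(abs(px - sx), abs(py - sy)) <= r for sx, sy in sensors)
```

Estimating the detection probability within T0 = 5 then amounts to repeatedly sampling a fire location, stepping the particles, and averaging the indicator returned by `detected`.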

Remark 6. OCBAmSGO (A1) achieves higher PCSm values than EA for both values of m and for all of the computing budgets T. This clearly demonstrates the advantage of OCBAmSGO over EA even when the observation noise does not have a Gaussian distribution and when there is only a finite computing budget.

Remark 7. OCBAbSGO (A2) achieves higher PCSb values than EA for both values of m and for all of the computing budgets T. Similarly, this demonstrates the advantage of OCBAbSGO over EA even when the observation noise does not have a Gaussian distribution and when there is only a finite computing budget.

Remark 8. When m increases from 19 to 20, both OCBAmSGO and EA achieve lower PCSm values. Similarly, both OCBAbSGO and EA achieve lower PCSb values. However, this does not imply that PCS is a monotonic function of m. For example, suppose Θ = {θ1, θ2, θ3}, Θ1 = {θ1}, Θ2 = {θ2}, Θ3 = {θ3}, and J(θi) = i, i = 1, 2, 3. Let g = 3. When m = 2, as long as there is observation noise, PCSm < 1 and PCSb < 1. However, when m = 3, PCSm = 1 and PCSb = 1. This clearly shows that both PCSm and PCSb are not necessarily decreasing functions with respect to m.

Remark 9. When m = 19, the PCSm values achieved by OCBAmSGO and EA are both higher than the PCSb values achieved by OCBAbSGO and EA, respectively. This is intuitively reasonable because a bSG is an mSG, whereas the reverse is not necessarily true. When m = 20, the PCSm values achieved by OCBAmSGO and EA are equal to the PCSb values achieved by OCBAbSGO and EA, respectively, because in this case the bSG coincides with the mSG.

5. Conclusions

Simple designs with good ordinal performance are of special interest in many applications, especially when it is difficult to specify what cardinal values are good. In this article, we consider the computing budget allocation problem in selecting m simplest good designs and develop OCBAmSGO and OCBAbSGO to asymptotically maximize the probabilities of correctly selecting such an mSG and the best mSG, respectively. We compare their performance with equal allocation on several numerical examples. Though we assume Gaussian observation noise in this article, the numerical results indicate that both OCBAmSGO and OCBAbSGO have a good performance even when the observation noise is not Gaussian and when the computing budget is finite. Note that APCSm and APCSb are lower bounds for PCSm and PCSb, respectively. They are derived by considering only the case that the observed top-g designs are truly top-g. It is possible for the choice of S in Equation (7) to be an mSG (or bSG) even though Ĝ ≠ G; i.e., when an observed top-g design is not truly top-g. Exploring this case may lead to tighter lower bounds and better allocation procedures, which is an important further research direction. Also, note that only a single objective function with no simulation-based constraints is considered in this article. It is an interesting future research topic to extend the work in this article to problems with multiple objective functions and simulation-based constraints. We hope this work brings more insight to finding simple and good designs in general.

Acknowledgements

The authors would like to thank Grover Laporte, the editor, and the anonymous reviewers for their constructive comments on previous versions of this article.

This work has been supported in part by the National Natural Science Foundation of China under grants 60704008, 60736027, 61174072, 61222302, 91224008, and 90924001; the Specialized Research Fund for the Doctoral Program of Higher Education under grant 20070003110; the Tsinghua National Laboratory for Information Science and Technology (TNLIST) Cross-Discipline Foundation; the National 111 International Collaboration Project under grant B06002; the National Science Foundation under grants ECCS-0901543 and CMMI-1130273; the Air Force Office of Scientific Research under grant FA-9550-12-1-0250; the Department of Energy under Award DE-SC0002223; the NIH under grant 1R21DK088368-01; and the National Science Council of Taiwan under Award NSC-100-2218-E-002-027-MY3.

References

Andradottir, S. (1998) Simulation optimization, in Handbook of Simulation, Banks, J. (ed), John Wiley and Sons, New York, NY, pp. 307–333.

Bechhofer, R.E. (1954) A single-sample multiple decision procedure for ranking means of normal populations with known variances. Annals of Mathematical Statistics, 25, 16–39.

Bechhofer, R.E., Santner, T.J. and Goldsman, D. (1995) Design and Analysis of Experiments for Statistical Selection, Screening and Multiple Comparisons, John Wiley & Sons, New York, NY.

Chen, C.H. (1996) A lower bound for the correct subset-selection probability and its application to discrete event system simulations. IEEE Transactions on Automatic Control, 41, 1227–1231.

Chen, C.-H., He, D., Fu, M. and Lee, L.H. (2008) Efficient simulation budget allocation for selecting an optimal subset. INFORMS Journal on Computing, 20(4), 579–595.

Chen, C.-H. and Lee, L.-H. (2011) Stochastic Simulation Optimization: An Optimal Computing Budget Allocation, World Scientific, Hackensack, NJ.

Chen, C.-H., Lin, J., Yucesan, E. and Chick, S.E. (2000) Simulation budget allocation for further enhancing the efficiency of ordinal optimization. Discrete Event Dynamic Systems: Theory and Applications, 10, 251–270.

Chen, C.H. and Yucesan, E. (2005) An alternative simulation budget allocation scheme for efficient simulation. International Journal of Simulation and Process Modeling, 1, 49–57.

DeGroot, M.H. (1970) Optimal Statistical Decisions, McGraw-Hill, New York, NY.

Fu, M.C. (2002) Optimization for simulation: theory vs. practice. INFORMS Journal on Computing, 14, 192–215.

Fu, M.C., Chen, C.-H. and Shi, L. (2008) Some topics for simulation optimization, in Proceedings of the 2008 Winter Simulation Conference, Mason, S.J., Hill, R.R., Monch, L., Rose, O., Jefferson, T. and Fowler, J.W. (eds), IEEE Press, Piscataway, NJ, pp. 27–38.

Fu, M.C., Hu, J.H., Chen, C.H. and Xiong, X. (2007) Simulation allocation for determining the best design in the presence of correlated sampling. INFORMS Journal on Computing, 19(1), 101–111.

Gupta, S.S. (1956) On a decision rule for a problem in ranking means. Ph.D. thesis, Institute of Statistics, University of North Carolina, Chapel Hill, NC.

Gupta, S.S. (1965) On some multiple decision (ranking and selection) rules. Technometrics, 7, 225–245.

He, D., Chick, S.E. and Chen, C.-H. (2007) Opportunity cost and OCBA selection procedures in ordinal optimization for a fixed number of alternative systems. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 37(5), 951–961.

Ho, Y.-C., Zhao, Q.-C. and Jia, Q.-S. (2007) Ordinal Optimization: Soft Optimization for Hard Problems, Springer, New York, NY.

Hutter, M. (2005) Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability, Springer, Berlin, Germany.

Jia, Q.-S. (2009) An adaptive sampling algorithm for simulation-based optimization with descriptive complexity constraints, in Proceedings of the First IEEE Youth Conference on Information, Computing, and Telecommunications, IEEE Press, Piscataway, NJ, pp. 118–121.

Jia, Q.-S. (2011a) An adaptive sampling algorithm for simulation-based optimization with descriptive complexity preference. IEEE Transactions on Automation Science and Engineering, 8(4), 720–731.

Jia, Q.-S. (2011b) On state aggregation to approximate complex value functions in large-scale Markov decision processes. IEEE Transactions on Automatic Control, 56(2), 333–344.

Jia, Q.-S. and Zhao, Q.-C. (2009) Strategy optimization for controlled Markov process with descriptive complexity constraint. Science in China Series F: Information Sciences, 52(11), 1993–2005.

Kim, S.-H. and Nelson, B.L. (2003) Selecting the best system: theory and methods, in Proceedings of the 2003 Winter Simulation Conference, Chick, S., Sanchez, P.J., Ferrin, D. and Morrice, D.J. (eds), IEEE Press, Piscataway, NJ, pp. 101–112.

Kolmogorov, A. (1965) Three approaches to the quantitative definition of information. Problems of Information Transmission, 1(1), 1–7.

Lee, L.H., Chew, E.P., Teng, S. and Goldsman, D. (2004) Optimal computing budget allocation for multi-objective simulation models, in Proceedings of the 2004 Winter Simulation Conference, Ingalls, R.G., Rosetti, M.D., Smith, J.S. and Peters, B.A. (eds), IEEE Press, Piscataway, NJ, pp. 586–594.

Levin, L.A. (1973) Universal sequential search problems. Problems of Information Transmission, 9, 265–266.

Pujowidianto, N.A., Lee, L.H., Chen, C.-H. and Yap, C.M. (2009) Optimal computing budget allocation for constrained optimization, in Proceedings of the 2009 Winter Simulation Conference, Rossetti, M.D., Hill, R.R., Johansson, B., Dunkin, A. and Ingalls, R.G. (eds), IEEE Press, Piscataway, NJ, pp. 584–589.

Solomonoff, R.J. (1964) A formal theory of inductive inference: parts 1 and 2. Information and Control, 7(1), 1–22 and 7(2), 224–254.

Solomonoff, R.J. (1978) Complexity-based induction systems: comparisons and convergence theorems. IEEE Transactions on Information Theory, 24, 422–432.

Swisher, J.R., Hyden, P.D., Jacobson, S.H. and Schruben, L.W. (2004) A survey of recent advances in discrete input parameter discrete-event simulation optimization. IIE Transactions, 36, 591–600.

Swisher, J.R., Jacobson, S.H. and Yucesan, E. (2003) Discrete-event simulation optimization using ranking, selection, and multiple comparison procedures: a survey. ACM Transactions on Modeling and Computer Simulation, 13, 134–154.

Tekin, E. and Sabuncuoglu, I. (2004) Simulation optimization: a comprehensive review on theory and applications. IIE Transactions, 36, 1067–1081.

Walker, R.C. (1999) Introduction to Mathematical Programming, Prentice Hall, Upper Saddle River, NJ.

Yan, S., Zhou, E. and Chen, C.-H. (2010) Efficient simulation budget allocation for selecting the best set of simplest good enough designs, in Proceedings of the 2010 Winter Simulation Conference, Johansson, B., Jain, S., Montoya-Torres, J., Hugan, J. and Yucesan, E. (eds), IEEE Press, Piscataway, NJ, pp. 1152–1159.

Yan, S., Zhou, E. and Chen, C.-H. (2012) Efficient selection of a set of good enough designs with complexity preference. IEEE Transactions on Automation Science and Engineering, 9, 596–606.

Biographies

Qing-Shan Jia received a B.E. degree in Automation in July 2002 and a Ph.D. in Control Science and Engineering in July 2006, both from Tsinghua University, Beijing, China. He is an Associate Professor at the Center for Intelligent and Networked Systems, Department of Automation, TNLIST, Tsinghua University, Beijing, China. He was a Visiting Scholar at Harvard University in 2006 and a Visiting Scholar at the Hong Kong University of Science and Technology in 2010, and is a Visiting Scholar at the Massachusetts Institute of Technology. His research interests include theories and applications of discrete event dynamic systems and simulation-based performance evaluation and optimization of complex systems. He is a senior member of IEEE and serves as a co-guest editor and an associate editor of Discrete Event Dynamic Systems: Theory and Applications, co-guest editor and area editor of IIE Transactions, and an associate editor of IEEE Transactions on Automation Science and Engineering.

Enlu Zhou received a B.S. degree with highest honors in Electrical Engineering from Zhejiang University, China, in 2004 and a Ph.D. in Electrical Engineering from the University of Maryland, College Park, in 2009. Since then she has been an Assistant Professor in the Industrial & Enterprise Systems Engineering Department at the University of Illinois at Urbana–Champaign. Her research interests include Markov decision processes, stochastic control, and simulation optimization. She is a recipient of the Best Theoretical Paper award at the 2009 Winter Simulation Conference and the 2012 AFOSR Young Investigator award.

Chun-Hung Chen received his Ph.D. degree in Engineering Sciences from Harvard University in 1994. He is a Professor of Systems Engineering & Operations Research at George Mason University (GMU) and is also affiliated with National Taiwan University. He was an Assistant Professor of Systems Engineering at the University of Pennsylvania before joining GMU. Sponsored by the NSF, NIH, DOE, NASA, MDA, and FAA, he has worked on the development of very efficient methodology for stochastic simulation optimization and its applications to air transportation systems, semiconductor manufacturing, health care, security networks, power grids, and missile defense systems. He received the "National Thousand Talents" Award from the central government of China in 2011, the Best Automation Paper Award from the 2003 IEEE International Conference on Robotics and Automation, the 1994 Eliahu I. Jury Award from Harvard University, and the 1992 MasPar Parallel Computer Challenge Award. He has served as Co-Editor of the Proceedings of the 2002 Winter Simulation Conference and Program Co-Chair for the 2007 INFORMS Simulation Society Workshop. He has served as a Department Editor for IIE Transactions, Associate Editor of IEEE Transactions on Automatic Control, Area Editor of Journal of Simulation Modeling Practice and Theory, Associate Editor of International Journal of Simulation and Process Modeling, and Associate Editor for the IEEE Conference on Automation Science and Engineering.
