Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best...
![Page 1: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/1.jpg)
Parallel Bayesian Policies for Multiple Comparisons with a Known Standard
Weici Hu, Peter Frazier
Operations Research & Information Engineering (ORIE), Cornell University
CORS/INFORMS International Conference, Montreal, June 14, 2015
![Page 2: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/2.jpg)
In MCS, we use simulation to determine which alternatives perform better than a threshold
MCS = multiple comparisons with a known standard
An MCS problem involves $k$ projects/systems and has a known threshold value $d$
It uses simulation/experiment to decide which among the $k$ projects perform better than the threshold
![Page 3: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/3.jpg)
MCS problems appear in crowdsourced image labeling
![Page 4: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/4.jpg)
MCS appears in Ambulance Positioning
We must allocate ambulances across 11 bases in the city of Edmonton. Which allocations satisfy mandated minimums for percentage of calls answered in time, under a variety of different possible call arrival rates?
[Figure: call arrival rate (2/hr to 20/hr) against allocations of ambulances to bases; each combination is marked as >70% on time or <70% on time]
[Thanks to Shane Henderson and Matt Maxwell for providing the ambulance simulation]
![Page 5: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/5.jpg)
MCS appears in Algorithm Development
A researcher who develops a new algorithm would like to know:
In which problem settings is average-case performance better with Algorithm A than with Algorithm B?
[Figure: horizontal axis from 0 to 100, vertical axis from −0.005 to 0.03; legend: N=10*M, N=3*M, N=1*M]
![Page 6: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/6.jpg)
Given Samples, Estimating the Level Set is Well-Understood
[Figure: y(x) plotted for x = 1 to x = 10]
![Page 7: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/7.jpg)
Given Samples, Estimating the Level Set is Well-Understood
[Figure: y(x) plotted for x = 1 to x = 10, with point markers (●)]
![Page 8: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/8.jpg)
The Optimal Policy Puts Samples Where They Help Most
[Figure: y(x) plotted for x = 1 to x = 10]
![Page 9: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/9.jpg)
The Optimal Policy Puts Samples Where They Help Most
[Figure: y(x) plotted for x = 1 to x = 10]
![Page 10: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/10.jpg)
The Optimal Policy Puts Samples Where They Help Most
[Figure: y(x) plotted for x = 1 to x = 10, with point markers (●)]
![Page 11: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/11.jpg)
Comparison to previous work on the MCS problem
We differ from most of the previous work on MCS by using a Bayesian setting, and/or focusing on the allocation problem.
Compared with the similar work of [Xie & Frazier 2013]:
- We consider a finite horizon, which is more realistic than the geometric and infinite horizons studied in [Xie & Frazier 2013].
- We allow the cost per sample to be optional.
- We look at allocating parallel simulation resources.
![Page 12: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/12.jpg)
Goal: Allocate budget to best support classification
k alternatives
N time periods
$m$ parallel computing resources. Each one can simulate 1 alternative in 1 time period.
$\theta_x$ is the underlying true performance of alternative $x$.
We will observe iid Bernoulli($\theta_x$) samples from alternative $x$.
Our goal is to determine whether each $\theta_x$ is larger or smaller than some known threshold $d_x$.
![Page 13: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/13.jpg)
Goal: Allocate budget to best support classification
$c_x$: optional monetary cost to simulate alternative $x$ once. $c_x = 0$ for simplicity in this talk.
$z_{n,x}$ is the number of simulation resources to use on alternative $x$ at time $n$. It is the decision to be made at each step.
$$\sum_{x} z_{n,x} \le m$$
After we finish sampling, we decide whether to classify each alternative $x$ as above the threshold or below.
If we decide it is above, we get reward $\theta_x - d_x$.
If we decide it is below, we get reward $d_x - \theta_x$.
Goal: Allocate our samples to best support our final classification of alternatives
![Page 14: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/14.jpg)
We use a Bayesian approach
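$Y_{n,x}$ is the number of successes observed after we run $z_{n,x}$ simulations on alternative $x$ at time $n$:
$$Y_{n,x} \mid \theta_x, z_{n,x} \sim \mathrm{Binomial}(z_{n,x}, \theta_x)$$
We use a Beta distribution as a conjugate prior for $\theta_x$:
$$\theta_x \sim \mathrm{Beta}(\alpha_{0,x}, \beta_{0,x})$$
so the posterior is again a Beta distribution:
$$\theta_x \mid z_{1,x}, Y_{1,x}, \ldots, z_{n,x}, Y_{n,x} \sim \mathrm{Beta}(\alpha_{n,x}, \beta_{n,x})$$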
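The posterior parameters follow from standard Beta-Binomial conjugacy: each success adds one to $\alpha$ and each failure adds one to $\beta$. A minimal sketch of that update is below; the function names `update_posterior` and `posterior_mean` are illustrative and do not come from the talk.

```python
def update_posterior(alpha, beta, z, y):
    """Return the Beta posterior parameters after z samples with y successes.

    Standard Beta-Binomial conjugacy: alpha counts successes, beta counts failures.
    """
    if not 0 <= y <= z:
        raise ValueError("need 0 <= y <= z")
    return alpha + y, beta + (z - y)


def posterior_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Example: start from the uniform prior Beta(1, 1), run 3 simulations, see 2 successes.
a, b = update_posterior(1, 1, z=3, y=2)
print(a, b, posterior_mean(a, b))
```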
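A policy $\pi$ is a mapping from histories onto allocations of simulation resources $z_n = (z_{n,1}, \ldots, z_{n,k}) \in \mathbb{N}^k$ satisfying
$$\sum_{x=1}^{k} z_{n,x} \le m$$
The MCS problem under the Bayesian framework is
$$\max_{\pi} \; E^{\pi}\left[ \sum_{x=1}^{k} \left| \frac{\alpha_{N,x}}{\alpha_{N,x} + \beta_{N,x}} - d_x \right| \;\middle|\; \boldsymbol{\alpha}_0, \boldsymbol{\beta}_0 \right] \tag{1}$$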
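The absolute value in the objective can be read as the expected reward of the best final classification. As a short sketch, write $\mu_{N,x}$ (a shorthand introduced here, not used on the slides) for the posterior mean of $\theta_x$ after the last time period, and assume $c_x = 0$:

$$\mu_{N,x} = E[\theta_x \mid \alpha_{N,x}, \beta_{N,x}] = \frac{\alpha_{N,x}}{\alpha_{N,x} + \beta_{N,x}}$$

$$E[\text{reward} \mid \text{classify above}] = \mu_{N,x} - d_x, \qquad E[\text{reward} \mid \text{classify below}] = d_x - \mu_{N,x}$$

$$\max\{\mu_{N,x} - d_x,\; d_x - \mu_{N,x}\} = \left| \mu_{N,x} - d_x \right|$$

Summing over the $k$ alternatives gives the term inside the expectation.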
![Page 15: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/15.jpg)
Dynamic programming gives an optimal solution
We formulate the MCS problem as a dynamic program with
state at time $n$: $S_n = (s_{n,1}, \ldots, s_{n,k})$, the posterior parameters of all the alternatives
value function $V_n(S_n)$: the maximum expected total reward obtainable from time step $n$ onward, given the current state $S_n$
The optimal value $V_0(S_0)$ equals the value of problem (1).
The optimal policy $\pi^*$ is the sequence $z^*_1, \ldots, z^*_N$ that achieves the maximum in Bellman's recursion.
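The slides do not write the recursion out. One plausible form, assuming $c_x = 0$ and that the decision $z_{n+1}$ is taken in state $S_n$ and leads to state $S_{n+1}$ (an indexing convention assumed here), is:

$$V_N(S_N) = \sum_{x=1}^{k} \left| \frac{\alpha_{N,x}}{\alpha_{N,x} + \beta_{N,x}} - d_x \right|$$

$$V_n(S_n) = \max_{z_{n+1} \in \mathbb{N}^k,\; \sum_{x} z_{n+1,x} \le m} E\left[ V_{n+1}(S_{n+1}) \mid S_n, z_{n+1} \right], \qquad n = N-1, \ldots, 0$$

The expectation is over the Binomial outcomes $Y_{n+1,x}$, with $\theta_x$ integrated out under the current Beta posterior.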
![Page 16: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/16.jpg)
Problem: This dynamic program is computationally infeasible
The number of states in the state space at time $n$ is $O((mn)^k)$.
Memory scales exponentially in $k$.
Computation scales exponentially in $k$.
E.g., with $m = k = 8$, at time step $N = 5$ there are $2.35426 \times 10^{12}$ states.
Curse of Dimensionality!
![Page 17: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/17.jpg)
Solution: instead we form an upper bound
![Page 18: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/18.jpg)
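Step 1 in forming an upper bound: Relax the original DP
We perform a Lagrangian relaxation: we replace the hard constraint
$$\sum_{x=1}^{k} z_{n,x} \le m, \quad \forall\, 1 \le n \le N$$
by a linear penalty
$$\lambda_n \left( \sum_{x=1}^{k} z_{n,x} - m \right)$$
where $\boldsymbol{\lambda} = \{\lambda_1, \ldots, \lambda_N\} \ge 0$.
Lemma (1)
For $\boldsymbol{\lambda} \ge 0$, let $V_0^{\boldsymbol{\lambda}}(S_0)$ be the optimal value of the relaxed problem. Then
$$V_0^{\boldsymbol{\lambda}}(S_0) \ge V_0(S_0)$$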
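A brief sketch of why the relaxation can only increase the value, assuming the penalty is subtracted from the reward (the sign convention that matches the $+\,m \sum_n \lambda_n$ term in the decomposition of the relaxed value). Here $R$ is shorthand, introduced for this sketch, for the terminal classification reward. For any policy $\pi$ that is feasible for the original problem, $\sum_x z_{n,x} - m \le 0$ at every $n$, so with $\boldsymbol{\lambda} \ge 0$:

$$E^{\pi}\left[ R - \sum_{n=1}^{N} \lambda_n \left( \sum_{x=1}^{k} z_{n,x} - m \right) \right] \ge E^{\pi}[R]$$

The relaxed problem maximizes the left-hand side over a larger set of policies (those that may violate the constraint), so

$$V_0^{\boldsymbol{\lambda}}(S_0) \ge \max_{\pi \text{ feasible}} E^{\pi}[R] = V_0(S_0)$$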
![Page 19: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/19.jpg)
Step 2 in forming an upper bound: Decompose the relaxed DP into single-alternative DPs
We define an MCS problem on a single alternative $x$, for each $x \in \{1, \ldots, k\}$:
- Up to $m$ resources can be used at each time step.
- The additional cost per sample is $\lambda_n$ at time $n$.
Solving single-alternative MCS problems by dynamic programming is computationally feasible.
Lemma (2)
For any $\boldsymbol{\lambda} > 0$, let $V_{0,x}^{\boldsymbol{\lambda}}(S_{0,x})$ be the value of a single-alternative MCS problem on $x$. Then
$$V_0^{\boldsymbol{\lambda}}(S_0) = \sum_{x=1}^{k} V_{0,x}^{\boldsymbol{\lambda}}(S_{0,x}) + m \sum_{n=1}^{N} \lambda_n \tag{2}$$
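To illustrate why the single-alternative problem is tractable, here is a minimal sketch of its dynamic program. It assumes a Beta posterior state $(\alpha, \beta)$, a terminal reward $|\alpha/(\alpha+\beta) - d|$, and a charge of $\lambda_n$ per sample; the function names and the 0-based indexing of `lambdas` are choices made for this sketch, not taken from the talk.

```python
from functools import lru_cache
from math import exp, lgamma


def betabinom_pmf(y, z, a, b):
    """P(Y = y) when Y | theta ~ Binomial(z, theta) and theta ~ Beta(a, b)."""
    log_choose = lgamma(z + 1) - lgamma(y + 1) - lgamma(z - y + 1)
    log_beta_ratio = (lgamma(a + y) + lgamma(b + z - y) - lgamma(a + b + z)
                      - lgamma(a) - lgamma(b) + lgamma(a + b))
    return exp(log_choose + log_beta_ratio)


def single_alternative_value(a0, b0, d, m, lambdas):
    """Value of the single-alternative problem from prior Beta(a0, b0).

    lambdas[n] is the per-sample penalty charged for the (n+1)-th time period.
    """
    horizon = len(lambdas)

    @lru_cache(maxsize=None)
    def value(a, b, n):
        if n == horizon:
            # Best final classification given the posterior mean.
            return abs(a / (a + b) - d)
        best = float("-inf")
        for z in range(m + 1):  # number of samples to take this period
            cont = sum(betabinom_pmf(y, z, a, b) * value(a + y, b + z - y, n + 1)
                       for y in range(z + 1))
            best = max(best, cont - lambdas[n] * z)
        return best

    return value(a0, b0, 0)


# With a prohibitive penalty no sampling happens; with zero penalty sampling is free.
print(single_alternative_value(1, 1, 0.2, 8, [10.0]))
print(single_alternative_value(1, 1, 0.2, 8, [0.0]))
```

The state space here grows only polynomially in $m$ and $N$, which is what makes solving one such problem per alternative practical.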
![Page 20: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/20.jpg)
Now we have a computable upper bound
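Lemmas 1 and 2 hold for any $\boldsymbol{\lambda} \ge 0$; hence $\sum_{x=1}^{k} V_{0,x}^{\boldsymbol{\lambda}}(S_{0,x}) + m \sum_{n=1}^{N} \lambda_n$ is an upper bound on the value of the MCS problem for any given $\boldsymbol{\lambda}$.
Theorem
An upper bound on the optimal value of the original MCS problem is
$$\inf_{\boldsymbol{\lambda} \ge 0} \left[ \sum_{x=1}^{k} V_{0,x}^{\boldsymbol{\lambda}}(S_{0,x}) + m \sum_{n=1}^{N} \lambda_n \right] \tag{3}$$
Computing method: first-order convex optimization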
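The slides do not spell out the first-order method. One way to see why such a method applies, sketched under the assumption that the relaxed objective subtracts the penalty from the terminal reward $R$, is to write the bound as a function $B(\boldsymbol{\lambda})$ (a name introduced here):

$$B(\boldsymbol{\lambda}) = \max_{\pi} E^{\pi}\left[ R - \sum_{n=1}^{N} \lambda_n \left( \sum_{x=1}^{k} z_{n,x} - m \right) \right]$$

For each fixed policy the expression is affine in $\boldsymbol{\lambda}$, so $B$ is a pointwise maximum of affine functions and therefore convex. If $\pi^{\boldsymbol{\lambda}}$ attains the maximum, a subgradient has components

$$g_n = m - E^{\pi^{\boldsymbol{\lambda}}}\left[ \sum_{x=1}^{k} z_{n,x} \right], \qquad n = 1, \ldots, N$$

and a projected subgradient step with step size $\eta > 0$ (a tuning parameter assumed here) would be

$$\lambda_n \leftarrow \max\{0,\; \lambda_n - \eta\, g_n\}$$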
![Page 21: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/21.jpg)
This analysis also inspires this index policy
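At each time step $n = 1, \ldots, N$:
1. Let $z_{n,x}^{\boldsymbol{\lambda}}(S_{n-1,x})$ be the number of samples taken under an optimal single-alternative policy given $\boldsymbol{\lambda}$, breaking ties arbitrarily.
2. Let $\lambda^* = \inf\left\{ \lambda : \sum_{x} z_{n,x}^{\boldsymbol{\lambda}}(S_{n-1,x}) \le m,\; \boldsymbol{\lambda} = \lambda e \right\}$.
3. Set $\boldsymbol{\lambda}^* = \lambda^* e$.
4. Let $z_n = \{ z_{n,x}^{\boldsymbol{\lambda}^*},\; x = 1, \ldots, k \}$ so as to satisfy $\sum_{x=1}^{k} z_{n,x}^{\boldsymbol{\lambda}^*} \le m$, breaking ties arbitrarily between different allocations $z_n$ that satisfy this constraint.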
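The four steps above can be sketched in code for one time step. This is an illustrative approximation, not the authors' implementation: it assumes a constant penalty $\lambda$ over the remaining periods, approximates $\lambda^*$ by bisection (which presumes the total requested samples do not increase as $\lambda$ grows), breaks ties toward fewer samples, and uses hypothetical function names throughout.

```python
from functools import lru_cache
from math import exp, lgamma


def betabinom_pmf(y, z, a, b):
    """P(Y = y) when Y | theta ~ Binomial(z, theta) and theta ~ Beta(a, b)."""
    log_choose = lgamma(z + 1) - lgamma(y + 1) - lgamma(z - y + 1)
    log_beta_ratio = (lgamma(a + y) + lgamma(b + z - y) - lgamma(a + b + z)
                      - lgamma(a) - lgamma(b) + lgamma(a + b))
    return exp(log_choose + log_beta_ratio)


def single_alt_first_action(a, b, d, m, steps_left, lam):
    """Smallest optimal sample count now, with penalty lam per sample from here on."""

    @lru_cache(maxsize=None)
    def value(a, b, t):
        if t == 0:
            return abs(a / (a + b) - d)
        return max(q(a, b, t, z) for z in range(m + 1))

    def q(a, b, t, z):
        cont = sum(betabinom_pmf(y, z, a, b) * value(a + y, b + z - y, t - 1)
                   for y in range(z + 1))
        return cont - lam * z

    scores = [q(a, b, steps_left, z) for z in range(m + 1)]
    best = max(scores)
    # Tie-break toward fewer samples (the slides allow arbitrary tie-breaking).
    return next(z for z, s in enumerate(scores) if s >= best - 1e-12)


def index_allocation(states, thresholds, m, steps_left, iters=30):
    """Approximate index-policy allocation for one step; always sums to at most m."""

    def alloc(lam):
        return [single_alt_first_action(a, b, d, m, steps_left, lam)
                for (a, b), d in zip(states, thresholds)]

    z = alloc(0.0)
    if sum(z) <= m:
        return z
    # For thresholds strictly between 0 and 1, lam = 1 makes sampling unprofitable,
    # so hi starts feasible and stays feasible throughout the bisection.
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(alloc(mid)) <= m:
            hi = mid
        else:
            lo = mid
    return alloc(hi)


print(index_allocation([(1, 1), (3, 1), (1, 4)], [0.2, 0.2, 0.2], m=3, steps_left=2))
```

Because ties are broken toward fewer samples, the returned allocation may leave some of the $m$ resources unused; step 4 of the policy leaves room for other tie-breaking rules that would assign them.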
![Page 22: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/22.jpg)
Numerical experiment
For each $k = 2, 4, 8, 16$:
$m = k$
$d_x = 0.2$ for all $x \in \{1, \ldots, k\}$
$S_0 = (\boldsymbol{\alpha}_0, \boldsymbol{\beta}_0) = (1, 1)^k$
Plot legend:
- Upper bound
- 95% confidence interval with index policy, based on 10000 replications
- 95% confidence interval with equal allocation policy, based on 50000 replications
![Page 23: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/23.jpg)
Numerical experiment
[Figure: Expected Total Reward / k (vertical axis, 0.318 to 0.334) against Number of Alternatives (k) (horizontal axis, 0 to 18); series: Upper Bound, Index Policy, Equal Allocation Policy]
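As a rough illustration of how the equal allocation baseline could be estimated by replication, the sketch below simulates it under the experiment's prior Beta(1, 1) and threshold 0.2. It assumes that "equal allocation" with $m = k$ means one sample per alternative per time period, and the horizon `n_periods` is a placeholder because the slides do not state $N$; the numbers it prints are not the talk's results.

```python
import random


def simulate_equal_allocation(k, n_periods, d, replications, seed=0):
    """Monte Carlo estimate of expected total reward / k under equal allocation.

    Assumes m = k, so every alternative receives exactly one sample per period.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replications):
        for _x in range(k):
            theta = rng.betavariate(1.0, 1.0)  # true performance drawn from the prior
            alpha, beta = 1, 1                 # Beta(1, 1) prior parameters
            for _n in range(n_periods):
                if rng.random() < theta:       # Bernoulli(theta) observation
                    alpha += 1
                else:
                    beta += 1
            # Reward of the best final classification given the posterior mean.
            total += abs(alpha / (alpha + beta) - d)
    return total / (replications * k)


print(simulate_equal_allocation(k=4, n_periods=5, d=0.2, replications=2000, seed=1))
```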
![Page 24: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/24.jpg)
Conclusion and Future Work
We create a computationally feasible upper bound on the value of the finite-horizon MCS problem that allows parallel computing resources.
We create an index-based policy in the settings studied.
Conjecture: The index policy becomes optimal as $m$ and $k$ increase to infinity.
![Page 25: Parallel Bayesian Policies for Multiple Comparisons with a ... · Goal: Allocate budget to best support classi cation c x optional monetary cost to simulate alternative x once. c](https://reader034.fdocuments.in/reader034/viewer/2022050507/5f9838a9c21c27494537d161/html5/thumbnails/25.jpg)
Thank you!