Transcript of: Information collection in a linear program (scholar.rhsmith.umd.edu/.../files/2010_sp12_lp.pdf)
Information collection in a linear program
Ilya O. Ryzhov, Warren B. Powell
Operations Research and Financial Engineering, Princeton University
Princeton, NJ 08544, USA
International Conference on Stochastic Programming, August 17, 2010
Outline
1 Introduction
2 Mathematical model
3 The knowledge gradient algorithm (derivation; computation; theory: asymptotic optimality)
4 Experimental results
5 Conclusions
Motivation: emergency response
Our goal is to find the shortest (least congested) path across a network.
This is an LP in which each objective coefficient represents the congestion on an edge.
We can measure the local congestion on an individual edge (e.g. from the air) and change our estimate of the congestion on that edge.
Motivation: agricultural planning
We solve an LP to maximize total crop yield subject to acreage constraints in different fields.
The exact yield from planting a certain field is unknown.
Before settling on a plan, we perform expensive soil tests on different fields to improve our beliefs about the yield.
LP formulation
We consider an LP in standard form,

V(c) = max_x c^T x
s.t. Ax = b, x ≥ 0,

where the vector c ∈ R^M is unknown.
We have a Bayesian prior belief about c in which the coefficients are correlated.
We can measure a coefficient (e.g. perform a soil test) and observe a result that changes our beliefs.
We are given N measurements to learn the true optimal value V(c)... what should we measure?
The effect of learning
Changing our estimate of a single objective coefficient can drastically change what we believe to be the optimal solution.
Consider the shortest-path problem:
Correlated beliefs in optimal learning
By measuring one coefficient, we can obtain information about many other coefficients.
In a traffic network, if edge (i, j) is congested, it is likely that edges into i and out of j are congested.
Correlated beliefs in optimal learning
Correlations are modeled using a covariance matrix.
We assume c ∼ N(c^0, Σ^0). Example:

Σ^0 = [ 12  6  3
         6  7  4
         3  4 15 ]

The value Σ^0_{j,k} represents our belief about the covariance of coefficients j and k.
A quick literature review
Stochastic linear programming: theoretical properties of the expected optimal value of a stochastic LP (Madansky 1960, Itami 1974); approximate algorithms for multi-stage problems (Birge 1982).
Parametric linear programming / sensitivity analysis: linear programs with varying objective coefficients (Jansen et al. 1997).
Optimal learning: simple underlying optimization models, e.g. ranking and selection (Bechhofer et al. 1995) and multi-armed bandits (Gittins 1989); recent work on learning with correlated beliefs (Frazier et al. 2009) and independent beliefs on graphs (Ryzhov & Powell 2010).
Our work synthesizes and builds on concepts from all of these areas.
Preliminaries
We assume that the feasible region is known and bounded.
Let x(c) be the optimal solution, i.e. the solution of

V(c) = max_x c^T x
s.t. Ax = b, x ≥ 0.

By strong duality, the dual LP has the same optimal value:

V(c) = min_y b^T y
s.t. A^T y − s = c, s ≥ 0.

Let y(c) and s(c) denote the optimal dual solution.
Learning with correlated Bayesian beliefs
At first, we believe that c ∼ N(c^0, Σ^0).
We measure the jth coefficient and observe

c^1_j ∼ N(c_j, λ_j).

As a result, our beliefs change:

c^1 = c^0 + ((c^1_j − c^0_j) / (λ_j + Σ^0_jj)) Σ^0 e_j
Σ^1 = Σ^0 − (Σ^0 e_j e_j^T Σ^0) / (λ_j + Σ^0_jj)

where e_j = (0, ..., 1, ..., 0)^T is the vector whose jth component is equal to 1.
We repeat the process to obtain c^2, c^3, ...
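The update above can be sketched in a few lines of NumPy. This is a minimal illustration (function and variable names are mine); the 3×3 covariance matrix is the example shown on the earlier slide, and the noise variance λ_j = 1 is an arbitrary choice:

```python
import numpy as np

def update_beliefs(c, Sigma, j, obs, lam_j):
    """One Bayesian update of (c^n, Sigma^n) after observing coefficient j.

    c, Sigma : current mean vector and covariance matrix
    j        : index of the measured coefficient
    obs      : observed value c^{n+1}_j
    lam_j    : variance lambda_j of the measurement noise
    """
    denom = lam_j + Sigma[j, j]
    col = Sigma[:, j]                       # Sigma^n e_j
    c_new = c + (obs - c[j]) / denom * col
    Sigma_new = Sigma - np.outer(col, col) / denom
    return c_new, Sigma_new

# Example: the 3x3 covariance matrix from the slides, flat prior mean.
Sigma0 = np.array([[12.0, 6.0, 3.0], [6.0, 7.0, 4.0], [3.0, 4.0, 15.0]])
c0 = np.zeros(3)
c1, Sigma1 = update_beliefs(c0, Sigma0, j=0, obs=1.0, lam_j=1.0)
```

Because the beliefs are correlated, measuring coefficient 0 shifts every component of the mean (through the column Σ^0 e_0) and shrinks the entire covariance matrix, not just its (0, 0) entry.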
Dynamic programming formulation
The optimal measurement strategy can be described using Bellman's equation:

V^{*,N}(c^N, Σ^N) = V(c^N)
V^{*,n}(c^n, Σ^n) = max_j E[ V^{*,n+1}(c^{n+1}, Σ^{n+1}) | c^n, Σ^n, j^n = j ]

The optimal measurement J^{*,n}(c^n, Σ^n) is the choice of j that achieves the argmax in V^{*,n}(c^n, Σ^n).
Due to the curse of dimensionality, this equation is computationally intractable.
Definition
Originally developed for ranking and selection (Gupta & Miescke 1996, Frazier et al. 2008).
The KG decision rule is given by

J^{KG,n}(c^n, Σ^n) = argmax_j E^n_j [ V(c^{n+1}) − V(c^n) ].

The KG factor

ν^{KG,n}_j = E^n_j [ V(c^{n+1}) − V(c^n) ]

is the expected improvement in our estimate of the optimal value of the LP that is achieved by measuring j.
The future beliefs c^{n+1} are random at time n, meaning that KG computes the expected value of a stochastic LP.
Derivation
It can be shown (Frazier et al. 2009) that, given c^n and Σ^n, and given that we measure j at time n, the conditional distribution of c^{n+1} is

c^{n+1} ∼ c^n + (Σ^n e_j / sqrt(λ_j + Σ^n_jj)) · Z,

where Z is a one-dimensional standard normal random variable.
Thus, the KG factor becomes

ν^{KG,n}_j = E[ V(c^n + Δc^n_j · Z) ] − V(c^n), where Δc^n_j = Σ^n e_j / sqrt(λ_j + Σ^n_jj).
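One way to see this distribution concretely is to sample c^{n+1} directly. A sketch, reusing the 3×3 covariance example from the earlier slide (the choice λ_j = 1 and the names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[12.0, 6.0, 3.0], [6.0, 7.0, 4.0], [3.0, 4.0, 15.0]])
c = np.zeros(3)
j, lam_j = 0, 1.0

# Delta c^n_j = Sigma^n e_j / sqrt(lambda_j + Sigma^n_jj)
dc = Sigma[:, j] / np.sqrt(lam_j + Sigma[j, j])

# c^{n+1} = c^n + dc * Z with Z ~ N(0, 1): all randomness is one-dimensional.
Z = rng.standard_normal(200_000)
samples = c[:, None] + np.outer(dc, Z)
```

The sample covariance of these draws is (up to Monte Carlo noise) the rank-one matrix dc dc^T: a single measurement can only move the belief vector along the fixed direction Δc^n_j, which is why the expectation in the KG factor reduces to a one-dimensional integral.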
Graphical illustration
The solution x(c^n + z Δc^n_j) is constant if z lies in a certain interval.
Varying z rotates the level curve of the objective function.
Derivation (continued)
The set of values of z for which x(c^n + z Δc^n_j) is constant is known (Hadigheh & Terlaky 2006) as the invariant support set.
Let −∞ = z_1 < z_2 < ... < z_I = ∞ be a partition of the real line into invariant support sets.
Let x_i = x(c^n + z Δc^n_j) for z ∈ (z_i, z_{i+1}). Then,

E V(c^n + Δc^n_j · Z) = Σ_i ∫_{z_i}^{z_{i+1}} (c^n + z Δc^n_j)^T x_i φ(z) dz,

where φ is the standard normal pdf.
Graphical illustration
The optimal solution x(c^n + z Δc^n_j) changes at the breakpoints z_i.
The level curve of c^n + z_i Δc^n_j is tangent to a face of the polyhedron.
(The animation frames show the three cases z > z_i, z = z_i, and z < z_i.)
The KG formula
After some algebra, we obtain the expression

ν^{KG,n}_j = Σ_i (b_{i+1} − b_i) f(−|z_i|),

where b_i = (Δc^n_j)^T x_i, f(z) = z Φ(z) + φ(z), and Φ is the standard normal cdf.
This formula gives the exact value of the KG factor, provided that we can compute the breakpoints z_i of the piecewise linear function.
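The formula translates directly into code. A minimal sketch using only the standard library (the function names are mine); it takes the finite interior breakpoints and the per-interval slopes b_i as given:

```python
import math

def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kg_factor(z_breaks, b):
    """nu^{KG,n}_j = sum_i (b_{i+1} - b_i) * f(-|z_i|), with f(z) = z*Phi(z) + phi(z).

    z_breaks : finite interior breakpoints z_1 < ... < z_{I-1}
    b        : slopes b_i = (Delta c^n_j)^T x_i, one per interval (len(b) = I)
    """
    f = lambda z: z * Phi(z) + phi(z)
    return sum((b[i + 1] - b[i]) * f(-abs(z)) for i, z in enumerate(z_breaks))
```

With a single breakpoint at z = 0 and slopes (0, 1), the factor reduces to f(0) = φ(0) ≈ 0.399; with no breakpoints the solution never changes and the factor is zero, matching the intuition that a measurement only has value if it can change the decision.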
Computation of the breakpoints
At time n, we start with one optimal solution x(c^n) for z = 0.
We determine whether z = 0 is itself a breakpoint by solving two LPs:

z^− = min_{y,s,z} z
s.t. A^T y − s − z Δc^n_j = c^n, x(c^n)^T s = 0, s ≥ 0,

and

z^+ = max_{y,s,z} z
s.t. A^T y − s − z Δc^n_j = c^n, x(c^n)^T s = 0, s ≥ 0.

The values z^−, z^+ are the smallest and largest values of z for which x(c^n) is optimal (Roos et al. 1997).
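These two LPs can be set up with scipy.optimize.linprog. The instance below is a toy example of my own, not one from the talk: two variables on the simplex x_1 + x_2 = 1 (so A = [1 1], b = 1), belief c^n = (2, 1) with optimal solution x(c^n) = (1, 0), and direction Δc^n_j = (1, 0):

```python
import numpy as np
from scipy.optimize import linprog

c_n = np.array([2.0, 1.0])      # current belief about the objective
x_star = np.array([1.0, 0.0])   # x(c^n), optimal for max c^T x on x1 + x2 = 1
dc = np.array([1.0, 0.0])       # measurement direction Delta c^n_j

# Decision variables (y, s1, s2, z); y and z are free, s >= 0.
# Rows 1-2: A^T y - s - z*dc = c^n  (here A^T y = (y, y)).
# Row 3:    complementarity x(c^n)^T s = 0.
A_eq = np.array([
    [1.0, -1.0,  0.0, -dc[0]],
    [1.0,  0.0, -1.0, -dc[1]],
    [0.0, x_star[0], x_star[1], 0.0],
])
b_eq = np.array([c_n[0], c_n[1], 0.0])
bounds = [(None, None), (0.0, None), (0.0, None), (None, None)]

z_minus = linprog([0, 0, 0, 1], A_eq=A_eq, b_eq=b_eq, bounds=bounds)    # min z
z_plus = linprog([0, 0, 0, -1], A_eq=A_eq, b_eq=b_eq, bounds=bounds)    # max z
```

In this instance z_minus finds z^− = −1 (the point where c_1 + z ties with c_2), while the maximization is reported unbounded, i.e. z^+ = +∞: x(c^n) stays optimal no matter how large the measured coefficient grows. Since neither value is zero, z = 0 is not a breakpoint for this direction.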
Computation of the breakpoints
If either z^− or z^+ is equal to zero, then z = 0 is a breakpoint.
Suppose that z^− = 0 and z^+ > 0.
(The animation frames show x(c^n) and the neighbouring extreme point x_l(c^n).)
Finding the neighbouring extreme point
The point x_l(c^n) is the optimal solution to the LP

V_l(c^n) = min_x (Δc^n_j)^T x
s.t. Ax = b, (s(c^n))^T x = 0, x ≥ 0.

The quantity (Δc^n_j)^T x_l(c^n) is the left derivative of the piecewise linear function at the breakpoint z = 0.
The right derivative is (Δc^n_j)^T x(c^n) itself.
Finding the next breakpoint
However, x_l(c^n) is also optimal at two breakpoints, zero and z_l(c^n).
(The animation frames show x_l(c^n) at z = 0 and at z = z_l(c^n).)
Finding the next breakpoint
This next breakpoint is the optimal value of the LP

z_l(c^n) = min_{y,s,z} z
s.t. A^T y − s − z Δc^n_j = c^n, (x_l(c^n))^T s = 0, s ≥ 0.

This LP is identical to the one we used to find z^−, but with x(c^n) replaced by x_l(c^n).
We can now find a new z^− and repeat the procedure until z^− = −∞.
Other cases
If z^− < 0 and z^+ = 0, we can find the neighbouring extreme point x_u(c^n) by solving

V_u(c^n) = max_x (Δc^n_j)^T x
s.t. Ax = b, (s(c^n))^T x = 0, x ≥ 0.

The next breakpoint is the optimal value of an LP:

z_u(c^n) = max_{y,s,z} z
s.t. A^T y − s − z Δc^n_j = c^n, (x_u(c^n))^T s = 0, s ≥ 0.

Again, the process can be repeated until z^+ = ∞.
Other cases
If z^− < 0 < z^+, zero is not a breakpoint, but both z^− and z^+ are.
If z^− = z^+ = 0, zero is a breakpoint, but x(c^n) is not an extreme point.
(The frame shows x(c^n) lying between the extreme points x_l(c^n) and x_u(c^n).)
Summary of algorithm for computing KG factors
Given a set of beliefs (c^n, Σ^n), do the following for j = 1, ..., M:
1 Let z = 0 and solve for x(c^n), y(c^n) and s(c^n).
2 Solve two LPs to obtain z^−, z^+ and decide whether z = 0 is a breakpoint.
3 Solve a sequence of LPs to obtain the entire vector z of breakpoints and the set x of invariant solutions.
4 Compute ν^{KG,n}_j using z and x.
Finally, we measure the coefficient with the largest ν^{KG,n}_j.
Asymptotic optimality property of KG
Proposition. For any measurement strategy π,

E^π V(c^N) ≤ E V(c).

Theorem.

lim_{N→∞} E^{KG} V(c^N) = E V(c).

Recall that our objective is to maximize E^π V(c^N).
As N → ∞, the KG method achieves the highest possible value.
Experimental results: shortest-path problem
Ten layered graphs (22 nodes, 50 edges)
Ten larger layered graphs (38 nodes, 102 edges)
Conclusions
We have proposed a new class of optimal learning problems in which the underlying optimization model is a linear program.
We have derived a knowledge gradient method for deciding what to measure in this setting.
The KG method computes the value of a single measurement exactly and is asymptotically optimal.
The algorithm for finding breakpoints terminates in finite time, but is computationally expensive.
References
Bechhofer, R., Santner, T. & Goldsman, D. (1995) Design and Analysis of Experiments for Statistical Selection, Screening and Multiple Comparisons. John Wiley and Sons, New York.
Birge, J. (1982) "The value of the stochastic solution in stochastic linear programs with fixed recourse." Mathematical Programming 24, 314–325.
Frazier, P.I., Powell, W.B. & Dayanik, S. (2008) "A knowledge-gradient policy for sequential information collection." SIAM Journal on Control and Optimization 47:5, 2410–2439.
Frazier, P.I., Powell, W.B. & Dayanik, S. (2009) "The knowledge-gradient policy for correlated normal rewards." INFORMS Journal on Computing 21:4, 599–613.
Gittins, J.C. (1989) Multi-Armed Bandit Allocation Indices. John Wiley and Sons, New York.
Gupta, S. & Miescke, K. (1996) "Bayesian look ahead one stage sampling allocation for selecting the best population." Journal of Statistical Planning and Inference 54, 229–244.
Hadigheh, A. & Terlaky, T. (2006) "Sensitivity analysis in linear optimization: invariant support set intervals." European Journal of Operational Research 169:3, 1158–1175.
Itami, H. (1974) "Expected value of a stochastic linear program and the degree of uncertainty of parameters." Management Science 21:3, 291–301.
Jansen, B., de Jong, J., Roos, C. & Terlaky, T. (1997) "Sensitivity analysis in linear programming: just be careful!" European Journal of Operational Research 101:1, 15–28.
Madansky, A. (1960) "Inequalities for stochastic linear programming problems." Management Science 6:2, 197–204.
Roos, C., Terlaky, T. & Vial, J. (1997) Theory and Algorithms for Linear Optimization: An Interior Point Approach. John Wiley and Sons, Chichester, UK.
Ryzhov, I.O. & Powell, W.B. (2010) "Information collection on a graph." To appear in Operations Research.