1 online convex optimization (with partial information) Elad Hazan @ IBM Almaden Joint work with...
1
online convex optimization (with partial information)
Elad Hazan @ IBM Almaden
Joint work with Jacob Abernethy and Alexander Rakhlin
@ UC Berkeley
2
Online routing
[Figure: network graph with a source node and a destination node]
Edge weights (congestion): not known in advance. Can change arbitrarily between 0 and 10 (say).
3
Online routing
[Figure: network graph with source and destination; observed path length: 3]
Iteratively: pick a path (not knowing the network congestion), then see the length of the path.
4
Online routing
[Figure: network graph with source and destination; observed path length: 5]
Iteratively: pick a path (not knowing the network congestion), then see the length of the path.
5
Online routing
[Figure: network graph with source and destination]
Iteratively: pick a path (not knowing the network congestion), then see the length of the path.
This talk: an efficient and optimal algorithm for online routing.
6
Why is the problem hard?
• Three sources of difficulty:
1. Prediction problem: the future is unknown and unrestricted
2. Partial information: the performance of other decisions is unknown
3. Efficiency: exponentially large decision set
• Similar situations arise in production, web retail, advertising, online routing…
7
Technical Agenda
1. Describe the classical general framework
2. Where classical approaches fall short
3. The modern online convex optimization framework
4. Our new algorithm, and a few technical details
8
Multi-armed bandit problem
Iteratively, for time t = 1, 2, …:
1. Adversary chooses payoffs on the machines, $r_t$
2. Player picks a machine $i_t$
3. Loss $r_t(i_t)$ is revealed
Process is repeated T times (with different, unknown, arbitrary losses every round).
Goal: minimize the regret
Regret = cost(ALG) - cost(smallest-cost fixed machine) = $\sum_t r_t(i_t) - \min_j \sum_t r_t(j)$
Online learning problem: design efficient algorithms with small regret.
[Figure: row of machines with example payoffs 1, 5, -3, 5]
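As a concrete reading of the regret definition above, the following minimal sketch computes it for a small table of losses. The function name `regret` and the loss table are illustrative assumptions; only the first row reuses the example values shown on the slide.

```python
def regret(losses, choices):
    """Regret of a play sequence against the best fixed machine in hindsight.

    losses[t][j] is the loss of machine j in round t (hidden from the player);
    choices[t] is the machine the player actually picked in round t.
    """
    alg_cost = sum(losses[t][i] for t, i in enumerate(choices))
    num_machines = len(losses[0])
    best_fixed_cost = min(
        sum(round_losses[j] for round_losses in losses)
        for j in range(num_machines)
    )
    return alg_cost - best_fixed_cost


# Illustrative losses: the first row reuses the values from the slide,
# the remaining rows are made up for the example.
losses = [
    [1, 5, -3, 5],
    [2, 0, 4, 1],
    [3, 1, 0, 2],
]
print(regret(losses, [1, 2, 0]))  # a poor sequence of picks
print(regret(losses, [2, 2, 2]))  # always playing the best fixed machine
```

Note that the comparison is against a single machine fixed for all T rounds, not against the best machine in each round.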
9
Multi-armed bandit problem
Studied in game theory and statistics (as early as the 1950s), and more recently in machine learning.
• Robbins (1952): statistical setting
• Hannan (1957): game-theoretic setting, full information
• [ACFS ’98]: adversarial bandits
[Figure: row of machines with example payoffs 1, 5, -3, 5]
10
Multi-armed bandit problem
[ACFS ’98], adversarial bandits: Regret = O(√T √K) for K machines and T game iterations. Optimal up to the constant.
[Figure: row of machines with example payoffs 1, 5, -3, 5]
What’s left to do?
Exponential regret and run time for routing (K = number of paths in the graph).
Our result: an efficient algorithm with √T n regret (n = number of edges in the graph).
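For orientation, here is a textbook-style sketch of an EXP3-type adversarial bandit algorithm of the kind [ACFS ’98] refers to. The slides do not give the algorithm, so the parameter choices (`gamma`, `eta = gamma / num_arms`), the loss-based update, and the toy adversary `fixed_losses` are all assumptions made for illustration.

```python
import math
import random


def exp3(loss_fn, num_arms, horizon, gamma=0.1, seed=0):
    """Textbook-style EXP3 for losses in [0, 1]; returns the arms played."""
    rng = random.Random(seed)
    eta = gamma / num_arms      # learning rate (an illustrative choice)
    log_w = [0.0] * num_arms    # log-weights, kept in log space for stability
    played = []
    for t in range(horizon):
        top = max(log_w)
        w = [math.exp(v - top) for v in log_w]
        total = sum(w)
        # Mix the exponential-weights distribution with uniform exploration.
        probs = [(1 - gamma) * v / total + gamma / num_arms for v in w]
        arm = rng.choices(range(num_arms), weights=probs)[0]
        loss = loss_fn(t, arm)  # bandit feedback: only this one loss is seen
        # Dividing by the probability makes the loss estimate unbiased.
        log_w[arm] -= eta * loss / probs[arm]
        played.append(arm)
    return played


def fixed_losses(t, arm):
    # Hypothetical adversary: arm 0 is always free, every other arm costs 1.
    return 0.0 if arm == 0 else 1.0


T = 2000
arms = exp3(fixed_losses, num_arms=4, horizon=T)
total_loss = sum(fixed_losses(t, a) for t, a in enumerate(arms))
print("regret against arm 0:", total_loss)
```

The per-round work is linear in the number of arms, which is why treating every path as an arm is impractical when K is exponential in the graph size.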
11
Online convex optimization
[Figure: in rounds 1, 2, …, T the player picks points $x_1, x_2, \ldots, x_T$, cost functions $f_1, f_2, \ldots, f_T$ are revealed, and the player pays $f_1(x_1), f_2(x_2), \ldots, f_T(x_T)$]
• Convex bounded functions (for this talk, linear functions)
– Total loss = $\sum_t f_t(x_t)$
• Can express online routing as OCO over a polytope in $\mathbb{R}^n$ with O(n) constraints (despite the exponentially large decision set)
Distribution over paths = flow in the graph
The set of all flows = $P \subseteq \mathbb{R}^n$ = polytope of flows
This polytope has a compact representation: its number of facets equals the number of flow constraints (flow conservation and edge capacity), which is small.
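The identification of a distribution over paths with a flow can be checked on a toy graph. In this sketch the graph, the path probabilities, and the edge weights are made up for illustration; each path is encoded by its edge-indicator vector in $\mathbb{R}^n$, and the expected path length becomes a linear function of the flow.

```python
# Toy graph with n = 5 edges; all names and numbers here are illustrative.
edges = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t"), ("a", "b")]
paths = [[0, 2], [1, 3], [0, 4, 3]]   # each path is a list of edge indices
path_probs = [0.5, 0.25, 0.25]        # a distribution over the three paths


def indicator(path, num_edges):
    """Edge-indicator vector of a path: 1 on its edges, 0 elsewhere."""
    vec = [0.0] * num_edges
    for e in path:
        vec[e] = 1.0
    return vec


def flow_from_distribution(paths, probs, num_edges):
    """Convex combination of path indicators = a unit flow from s to t."""
    flow = [0.0] * num_edges
    for path, p in zip(paths, probs):
        for e, v in enumerate(indicator(path, num_edges)):
            flow[e] += p * v
    return flow


flow = flow_from_distribution(paths, path_probs, len(edges))
weights = [3.0, 5.0, 2.0, 1.0, 4.0]   # made-up congestion per edge

# Expected length computed two ways: linear in the flow, and path by path.
expected_length = sum(f * w for f, w in zip(flow, weights))
direct = sum(
    p * sum(weights[e] for e in path) for path, p in zip(paths, path_probs)
)
print(flow, expected_length, direct)
```

The decision variable has one coordinate per edge, so its dimension is n even though the number of paths can grow exponentially with the graph.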
12
Online performance metric: regret
• Loss of our algorithm = $\sum_t f_t(x_t)$
• Regret = Loss(ALG) - Loss(OPT point) = $\sum_t f_t(x_t) - \sum_t f_t(x^*)$
[Figure: cost functions $f_1, f_2, \ldots, f_T$ and the point $x^*$, the minimizer of $\sum_t f_t(x)$ (fixed optimum in hindsight)]
13
Existing techniques
• AdaBoost, Online Gradient Descent, Weighted Majority, Online Newton Step, Perceptron, Winnow…
• All can be seen as special cases of “Follow the regularized leader”
– At time t predict:
Requires complete knowledge of cost functions
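The prediction formula on this slide was not preserved in the transcript. A common textbook form of "Follow the regularized leader" is $x_t = \arg\min_{x \in K} \big[ \eta \sum_{s<t} f_s(x) + R(x) \big]$, where the placement of the step size $\eta$ is an assumption here. The sketch below instantiates it for linear losses on the simplex with the negative-entropy regularizer, where the minimizer has a closed form (exponential weights); the function names are hypothetical.

```python
import math


def ftrl_entropy(cum_loss, eta):
    """Minimizer over the simplex of eta*<cum_loss, x> + sum_i x_i log x_i.

    The closed form is x_i proportional to exp(-eta * cum_loss_i).
    """
    lowest = min(cum_loss)  # shift for numerical stability
    w = [math.exp(-eta * (c - lowest)) for c in cum_loss]
    z = sum(w)
    return [v / z for v in w]


def objective(x, cum_loss, eta):
    """The regularized-leader objective that ftrl_entropy minimizes."""
    linear = eta * sum(c * v for c, v in zip(cum_loss, x))
    entropy_term = sum(v * math.log(v) for v in x if v > 0)
    return linear + entropy_term


cum_loss = [3.0, 1.0, 2.0]  # made-up cumulative losses of three decisions
x = ftrl_entropy(cum_loss, eta=0.5)
print(x, objective(x, cum_loss, 0.5))
```

Computing the cumulative losses requires every coordinate of every past cost function, which is the full-information requirement the slide points out.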
14
Our algorithm: template
Two generic parts:
• Compute the “regularized leader”
• In order to make the above work, estimate functions $g_t$ such that $E[g_t] = f_t$.
Note: we would prefer to use
But we do not have access to $f_t$!
15
Our algorithm: template
1. Compute the current prediction $x_t$ based on previous observations.
2. Play a random variable $y_t$, drawn from a distribution such that:
   1. The distribution is centered at the prediction: $E[y_t] = x_t$
   2. From the information $f_t(y_t)$, we can create a random variable $g_t$ such that $E[g_t] = f_t$
   3. The variance of the random variable $g_t$ is small
16
Simple example: interpolating the cost function
• Want to compute and use: what is $f_t$? (we only see $f_t(x_t)$)
Convex set = 2-dim simplex (distribution on two endpoints)
17
Simple example: interpolating the cost function
• Want to compute and use: what is $f_t$? (we only see $f_t(x_t)$)
[Figure: linear cost over the simplex with endpoint values $f_t(1)$ and $f_t(2)$; at the point $(x_1, x_2)$ the cost is $x_1 f_t(1) + x_2 f_t(2)$]
Convex set = 2-dim simplex (distribution on two endpoints)
18
Simple example: interpolating the cost function
An unbiased estimate of $f_t$ is just as good.
[Figure: with prob. $x_1$ play endpoint 1 and observe $f_t(1)$; with prob. $x_2$ play endpoint 2 and observe $f_t(2)$]
Expected value remains $x_1 f_t(1) + x_2 f_t(2)$
Note that $E[g_t] = f_t$
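The slide does not spell out $g_t$. One standard choice that satisfies $E[g_t] = f_t$ in this two-endpoint example is the importance-weighted estimate: put the observed value divided by the probability of the played endpoint in that coordinate, and zero elsewhere. The sketch below assumes this choice and verifies unbiasedness by computing the expectation exactly; the function names and numbers are hypothetical.

```python
def estimator(observed_value, played, x):
    """Importance-weighted estimate g_t after playing endpoint `played`."""
    g = [0.0] * len(x)
    g[played] = observed_value / x[played]
    return g


def expected_estimator(f, x):
    """Exact expectation of g_t when endpoint i is played with prob. x[i]."""
    exp_g = [0.0] * len(x)
    for i, prob in enumerate(x):
        g = estimator(f[i], i, x)  # the player only ever sees f[i]
        for j in range(len(x)):
            exp_g[j] += prob * g[j]
    return exp_g


f_t = [2.0, 7.0]   # hidden endpoint costs f_t(1), f_t(2) (made-up values)
x_t = [0.3, 0.7]   # current point on the simplex
print(expected_estimator(f_t, x_t))
```

The estimate grows like 1/x_i when an endpoint is rarely played, which is why the template also asks for small variance.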
19
The algorithm: complete definition
• Let R(x) be a self-concordant barrier for the convex set
• Compute the current prediction center:
• Compute the eigenvectors of the Hessian at the current point. Consider the intersection of the eigenvectors and the Dikin ellipsoid
• Sample uniformly from the eigendirections to obtain unbiased estimates $E[g_t] = f_t$
Remember that we need the random variables $y_t$, $g_t$ to satisfy:
1. $E[y_t] = x_t$
2. $E[g_t] = f_t$
3. The variance of the random variable $g_t$ needs to be small
This is where self-concordance is crucial
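The sampling step can be sketched on the unit box $[0,1]^n$ with the log barrier $R(x) = -\sum_i (\log x_i + \log(1 - x_i))$, chosen here because its Hessian is diagonal, so the eigendirections are the coordinate axes and no numerical eigendecomposition is needed. The formulas $y_t = x_t + \varepsilon \lambda_i^{-1/2} e_i$ and $g_t = n\, f_t(y_t)\, \varepsilon \lambda_i^{1/2} e_i$ are one reading of the bullets above, not taken from the slide. The code enumerates all 2n equally likely outcomes to check the first two requirements exactly.

```python
import math


def hessian_diagonal(x):
    """Diagonal of the Hessian of R(x) = -sum(log x_i + log(1 - x_i))."""
    return [1.0 / xi**2 + 1.0 / (1.0 - xi) ** 2 for xi in x]


def outcomes(x, f):
    """All 2n equally likely (y_t, g_t) pairs for a linear loss <f, y>."""
    n = len(x)
    lam = hessian_diagonal(x)  # eigenvalues; eigenvectors are the axes
    result = []
    for i in range(n):
        for sign in (1.0, -1.0):
            # Step to the boundary of the Dikin ellipsoid along axis i.
            y = list(x)
            y[i] += sign / math.sqrt(lam[i])
            observed = sum(fj * yj for fj, yj in zip(f, y))  # bandit feedback
            g = [0.0] * n
            g[i] = n * observed * sign * math.sqrt(lam[i])
            result.append((y, g))
    return result


def mean(vectors):
    size = len(vectors[0])
    return [sum(v[j] for v in vectors) / len(vectors) for j in range(size)]


x_t = [0.2, 0.5, 0.9]
f_t = [1.0, -2.0, 3.0]  # hidden from the player; used only to simulate feedback
pairs = outcomes(x_t, f_t)
print(mean([y for y, _ in pairs]))  # matches x_t up to rounding
print(mean([g for _, g in pairs]))  # matches f_t up to rounding
```

Every sampled point stays strictly inside the box because the step length is shorter than the distance to either face along that axis.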
20
An extremely short survey: IP (interior point) methods
• Probably the most widely used algorithms today: “IP polynomial algorithms in convex programming” [NN ’94]
• To solve a general optimization problem (minimize a convex function over a convex set specified by constraints), reduce to unconstrained optimization via a self-concordant barrier function:
$$\min f(x) \quad \text{s.t.} \quad A_1 \cdot x - b_1 \le 0,\ \ldots,\ A_m \cdot x - b_m \le 0,\quad x \in \mathbb{R}^n$$
$$\min_{x \in \mathbb{R}^n}\ f(x) - \sum_i \log(b_i - A_i x)$$
Barrier function: $R(x) = -\sum_i \log(b_i - A_i x)$
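To make the barrier concrete, the sketch below evaluates the log barrier $R(x) = -\sum_i \log(b_i - A_i x)$ together with its gradient and Hessian for constraints given as rows of `A` and entries of `b`. The helper names and the unit-square example are illustrative assumptions.

```python
import math


def slacks(A, b, x):
    """Slack b_i - A_i . x of every constraint (positive in the interior)."""
    return [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]


def barrier(A, b, x):
    return -sum(math.log(s) for s in slacks(A, b, x))


def barrier_gradient(A, b, x):
    s = slacks(A, b, x)
    return [sum(row[j] / si for row, si in zip(A, s)) for j in range(len(x))]


def barrier_hessian(A, b, x):
    s = slacks(A, b, x)
    n = len(x)
    return [
        [sum(row[j] * row[k] / si**2 for row, si in zip(A, s)) for k in range(n)]
        for j in range(n)
    ]


# Unit square 0 <= x_1, x_2 <= 1 written as A x <= b.
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 0.0, 1.0, 0.0]
center = [0.5, 0.5]
print(barrier(A, b, center))
print(barrier_gradient(A, b, center))
print(barrier_hessian(A, b, center))
print(barrier_gradient(A, b, [0.9, 0.5]))  # grows as x_1 approaches 1
```

The gradient vanishes at the center of the square and blows up near a face, which is how the barrier keeps the unconstrained minimizer inside the feasible set.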
21
The geometric properties of self-concordant functions
• Let R be self-concordant for the convex set K. Then at each point x, the Hessian of R at x defines a local norm.
• The Dikin ellipsoid
• Fact: for all x in the set, $D_R(x) \subseteq K$
• The Dikin ellipsoid characterizes “space to the boundary” in every direction
• It is tight:
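The formulas on this slide were not preserved in the transcript. For reference, the usual textbook definitions of the local norm and the Dikin ellipsoid at an interior point $x$ are given below; that these match what the slide showed is an assumption. They are followed by a one-dimensional worked example with $K = [0, 1]$ and $R(x) = -\log x - \log(1 - x)$.

```latex
\|h\|_x = \sqrt{h^\top \nabla^2 R(x)\, h},
\qquad
D_R(x) = \{\, y : \|y - x\|_x \le 1 \,\}

% Worked example: K = [0,1], R(x) = -\log x - \log(1-x)
R''(x) = \frac{1}{x^2} + \frac{1}{(1-x)^2},
\qquad
R''\!\left(\tfrac12\right) = 8,
\qquad
D_R\!\left(\tfrac12\right)
  = \left[\tfrac12 - \tfrac{1}{\sqrt{8}},\ \tfrac12 + \tfrac{1}{\sqrt{8}}\right]
  \approx [0.146,\ 0.854] \subseteq [0, 1]
```

In this example the ellipsoid shrinks as $x$ moves toward either endpoint, since $R''(x)$ grows without bound there, so it always stays inside $K$.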
22
Unbiased estimators from self-concordant functions
• Eigendirections of the Hessian: create an unbiased gradient estimator in each direction
• Orthogonal basis: complete estimator
• Self-concordance is important for:
– Capturing the geometry of the set (Dikin ellipsoid)
– Controlling the variance of the estimator
23
Online routing: the final algorithm
[Figure: algorithm loop. Labels: $x_t$; $y_t$; flow decomposition for $y_t$; obtain path $p_t$; guess path, observe length (3 in the example); $x_{t+1}$]
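The slide only names the flow decomposition step. One simple way to turn a unit flow $y_t$ on an acyclic graph into a random path $p_t$ whose edge marginals equal the flow is a random walk from the source that picks each outgoing edge with probability proportional to its flow. The sketch below assumes that scheme and a made-up toy graph, and checks the marginals exactly by enumerating the induced path distribution.

```python
import random

# Toy acyclic graph and a valid unit flow on it (illustrative values).
edges = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t"), ("a", "b")]
flow = [0.75, 0.25, 0.5, 0.5, 0.25]


def outgoing(node, flow):
    return [e for e, (u, _) in enumerate(edges) if u == node and flow[e] > 0]


def sample_path(flow, rng, source="s", sink="t"):
    """Random walk: leave each node along edge e with prob. proportional to flow[e]."""
    node, path = source, []
    while node != sink:
        out = outgoing(node, flow)
        e = rng.choices(out, weights=[flow[k] for k in out])[0]
        path.append(e)
        node = edges[e][1]
    return path


def path_distribution(flow, node="s", sink="t", prefix=(), prob=1.0):
    """Exact distribution over paths induced by the random walk above."""
    if node == sink:
        return {prefix: prob}
    out = outgoing(node, flow)
    total = sum(flow[e] for e in out)
    dist = {}
    for e in out:
        dist.update(
            path_distribution(flow, edges[e][1], sink, prefix + (e,),
                              prob * flow[e] / total)
        )
    return dist


dist = path_distribution(flow)
marginals = [0.0] * len(edges)
for path, p in dist.items():
    for e in path:
        marginals[e] += p
print(dist)
print(marginals)  # agrees with `flow` up to rounding
print(sample_path(flow, random.Random(0)))
```

Because the expected indicator vector of the sampled path equals $y_t$, the expected observed length equals the linear loss at $y_t$.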
24
Summary
1. Online optimization algorithm with limited feedback
• Optimal regret: O(√T n)
• Efficient ($n^3$ time per iteration)
• More efficient implementation based on IP theory
Open Questions:
• Dependence on dimension is not known to be optimal
• Algorithm has large variance, $T^{2/3}$; reduce the variance to √T?
• Adaptive adversaries
The End