4.4 Geometry and solution of Lagrangean duals
Solving the Lagrangean dual: $Z_D = \max_{\lambda \ge 0} Z(\lambda)$, where
$Z(\lambda) = \min_{k \in K}\,(c'x^k + \lambda'(b - Ax^k))$ (unless $Z(\lambda) = -\infty$).
Let $f_k = b - Ax^k$ and $h_k = c'x^k$. Then
$Z(\lambda) = \min_{k \in K}\,(h_k + \lambda' f_k)$, which is piecewise linear and concave.
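For intuition, here is a tiny illustrative instance (the numbers are ours, not from the lecture): take $m = 1$ and three extreme points, so that $Z(\lambda)$ is the lower envelope of three affine functions.

```latex
% Illustrative data (hypothetical): K = {1, 2, 3},
%   (h_1, h_2, h_3) = (-2, 0, 1),  (f_1, f_2, f_3) = (3, 1, -2).
\[
Z(\lambda) = \min\{\, -2 + 3\lambda,\ \lambda,\ 1 - 2\lambda \,\}
\]
% The envelope is piecewise linear and concave; its maximum is at
% lambda = 3/5, where pieces 1 and 3 intersect and Z_D = -1/5.
```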
$Z(\lambda)$ is nondifferentiable, but we mimic the gradient-ascent idea for differentiable functions, $\lambda^{t+1} = \lambda^t + \theta_t \nabla Z(\lambda^t)$, $t = 1, 2, \ldots$, with a subgradient in place of the gradient.
Prop 4.5: A function $f : \mathbb{R}^n \to \mathbb{R}$ is concave if and only if for any $x^* \in \mathbb{R}^n$, there exists a vector $s \in \mathbb{R}^n$ such that
$f(x) \le f(x^*) + s'(x - x^*)$, for all $x \in \mathbb{R}^n$.
The vector $s$ is the gradient of $f$ at $x^*$ if $f$ is differentiable. We generalize the above property to nondifferentiable functions.
Def 4.3: Let $f$ be a concave function. A vector $s$ such that
$f(x) \le f(x^*) + s'(x - x^*)$, for all $x \in \mathbb{R}^n$,
is called a subgradient of $f$ at $x^*$. The set of all subgradients of $f$ at $x^*$ is denoted $\partial f(x^*)$ and is called the subdifferential of $f$ at $x^*$.
Prop: The subdifferential $\partial f(x^*)$ is closed and convex.
Pf) $\partial f(x^*)$ is the intersection of closed half-spaces.
If $f$ is differentiable at $x^*$, then $\partial f(x^*) = \{\nabla f(x^*)\}$. For convex functions, a subgradient is defined as a vector $s$ which satisfies
$f(x) \ge f(x^*) + s'(x - x^*)$.
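A one-dimensional illustration of Def 4.3 (our own example, not from the slides): the concave function $f(x) = -|x| = \min\{x, -x\}$.

```latex
\[
\partial f(x) =
\begin{cases}
\{-1\}, & x > 0,\\
[-1,\, 1], & x = 0,\\
\{+1\}, & x < 0.
\end{cases}
\]
% At the kink x* = 0, every s with |s| <= 1 satisfies -|x| <= s x for
% all x, so the subdifferential is a whole interval rather than a point.
```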
Geometric intuition: think in $\mathbb{R}^{n+1}$ space:
$f(x) \le f(x^*) + s'(x - x^*)$
$\iff s'x^* - f(x^*) \le s'x - f(x)$
$\iff (s, -1)'(x^*, f(x^*)) \le (s, -1)'(x, f(x))$
$\iff (s, -1)'\{(x, f(x)) - (x^*, f(x^*))\} \ge 0$.
Picture: $(s, -1)'\{(x, f(x)) - (x^*, f(x^*))\} \ge 0$
[Figure: graph of a concave $f$ with the supporting line $f(x) = f(x^*) + s'(x - x^*)$ at $(x^*, f(x^*))$; the vector $(s, -1)$ makes a nonobtuse angle with $(x, f(x)) - (x^*, f(x^*))$.]
For convex functions: $(s, -1)'\{(x, f(x)) - (x^*, f(x^*))\} \le 0$
[Figure: the analogous picture for a convex $f$, with the line $f(x) = f(x^*) + s'(x - x^*)$ supporting the graph from below.]
Prop 4.6: Let $f : \mathbb{R}^n \to \mathbb{R}$ be a concave function. A vector $x^*$ maximizes $f$ over $\mathbb{R}^n$ if and only if $0 \in \partial f(x^*)$.
Pf) A vector $x^*$ maximizes $f$ over $\mathbb{R}^n$ if and only if
$f(x) \le f(x^*) = f(x^*) + 0'(x - x^*)$, for all $x \in \mathbb{R}^n$,
which is equivalent to $0 \in \partial f(x^*)$.
Prop 4.7: Let $Z(\lambda) = \min_{k \in K}\,(h_k + \lambda' f_k)$ and
$E(\lambda) = \{k \in K : Z(\lambda) = h_k + \lambda' f_k\}$.
Then, for every $\lambda^* \ge 0$ the following relations hold:
(a) For every $k \in E(\lambda^*)$, $f_k$ is a subgradient of the function $Z(\lambda)$ at $\lambda^*$.
(b) $\partial Z(\lambda^*) = \mathrm{conv}(\{f_k : k \in E(\lambda^*)\})$, i.e., a vector $s$ is a subgradient of the function $Z(\lambda)$ at $\lambda^*$ if and only if $s$ is a convex combination of the vectors $f_k$, $k \in E(\lambda^*)$.
Pf) (a) By the definition of $Z(\lambda)$ we have, for every $k \in E(\lambda^*)$,
$Z(\lambda) \le h_k + \lambda' f_k = h_k + \lambda^{*\prime} f_k + f_k'(\lambda - \lambda^*) = Z(\lambda^*) + f_k'(\lambda - \lambda^*)$.
(b) We have shown that $\{f_k : k \in E(\lambda^*)\} \subseteq \partial Z(\lambda^*)$, and since a convex combination of two subgradients is also a subgradient, it follows that
$\mathrm{conv}(\{f_k : k \in E(\lambda^*)\}) \subseteq \partial Z(\lambda^*)$.
Assume that there is $s \in \partial Z(\lambda^*)$ such that $s \notin \mathrm{conv}(\{f_k : k \in E(\lambda^*)\})$, and derive a contradiction. By the separating hyperplane theorem, there exist a vector $d$ and a scalar $\alpha$ such that
$d'f_k \ge \alpha$, for all $k \in E(\lambda^*)$, (4.29)
$d's < \alpha$.
Since $Z(\lambda^*) < h_i + \lambda^{*\prime} f_i$ for all $i \notin E(\lambda^*)$, for sufficiently small $\theta > 0$ we have
$Z(\lambda^* + \theta d) = h_k + f_k'(\lambda^* + \theta d)$, for some $k \in E(\lambda^*)$.
Since $Z(\lambda^*) = h_k + \lambda^{*\prime} f_k$, we obtain
$Z(\lambda^* + \theta d) - Z(\lambda^*) = \theta f_k' d$, for some $k \in E(\lambda^*)$.
From (4.29), we have
$Z(\lambda^* + \theta d) - Z(\lambda^*) \ge \theta\alpha$. (4.30)
Since $s \in \partial Z(\lambda^*)$, we have
$Z(\lambda^* + \theta d) \le Z(\lambda^*) + \theta s'd$ (definition of subgradient),
and since $d's < \alpha$, this gives
$Z(\lambda^* + \theta d) - Z(\lambda^*) < \theta\alpha$,
contradicting (4.30).
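To illustrate Props 4.6 and 4.7 on the toy instance above (hypothetical data): at $\lambda^* = 3/5$, pieces 1 and 3 are both active with value $-1/5$.

```latex
\[
E(\lambda^*) = \{1, 3\}, \qquad
\partial Z(\lambda^*) = \mathrm{conv}\{f_1, f_3\} = \mathrm{conv}\{3, -2\} = [-2,\, 3].
\]
% Since 0 lies in [-2, 3], Prop 4.6 certifies that lambda* = 3/5
% maximizes Z, with Z_D = Z(3/5) = -1/5.
```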
Subgradient algorithm
For differentiable functions, the direction $\nabla f(x)$ is the direction of steepest ascent of $f$ at $x$. For nondifferentiable functions, the direction of a subgradient may not be an ascent direction. However, by moving along the direction of a subgradient, we can move closer to the maximum point.
Ex: $f(x_1, x_2) = \min\{-x_1,\ x_1 - 2x_2,\ x_1 + 2x_2\}$: a concave function.
The direction of a subgradient may not be an ascent direction for $f$, but we can move closer to the maximum point.
[Figure: level sets $f(x) = 0, -1, -2$ in the $(x_1, x_2)$-plane; at the point $x^* = (-2, 0)$ the subgradients $s_1 = (1, -2)$ and $s_2 = (1, 2)$ are drawn, together with the hyperplanes $s_1'(x - x^*) = 0$ and $s_2'(x - x^*) = 0$.]
The subgradient optimization algorithm
Input: A nondifferentiable concave function $Z(\lambda)$.
Output: A maximizer of $Z(\lambda)$ subject to $\lambda \ge 0$.
Algorithm:
1. Choose a starting point $\lambda^1 \ge 0$; let $t = 1$.
2. Given $\lambda^t$, check whether $0 \in \partial Z(\lambda^t)$. If so, then $\lambda^t$ is optimal and the algorithm terminates. Else, choose a subgradient $s^t$ of the function $Z(\lambda^t)$. ($f_k = b - Ax^k$)
3. Let $\lambda_j^{t+1} = \max\{\lambda_j^t + \theta_t s_j^t,\ 0\}$ (projection), where $\theta_t$ is a positive step size parameter. Increment $t$ and go to Step 2.
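A minimal runnable sketch of the algorithm (the function is the two-variable example above; the step rule $\theta_t = 1/t$ anticipates rule (a) of the theorem below; all names are our own):

```python
# Maximize the concave f(x1, x2) = min{-x1, x1 - 2*x2, x1 + 2*x2}
# by the subgradient method with diminishing steps theta_t = 1/t.

def f_and_subgradient(x):
    """Return f(x) and one subgradient: the gradient of a minimizing piece."""
    pieces = [(-x[0], (-1.0, 0.0)),            # piece -x1
              (x[0] - 2 * x[1], (1.0, -2.0)),  # piece x1 - 2*x2
              (x[0] + 2 * x[1], (1.0, 2.0))]   # piece x1 + 2*x2
    return min(pieces, key=lambda p: p[0])

x = [-2.0, 0.0]                     # the point shown in the figure above
best_val, best_x = float("-inf"), x[:]
for t in range(1, 2001):
    val, s = f_and_subgradient(x)
    if val > best_val:              # convergence is not monotone: keep the best
        best_val, best_x = val, x[:]
    theta = 1.0 / t                 # divergent-series step: slow but convergent
    x = [x[0] + theta * s[0], x[1] + theta * s[1]]
    # For the Lagrangean dual, lambda >= 0 would require the projection
    # x = [max(xi, 0.0) for xi in x] here.

print(best_val, best_x)             # tends to the maximum f = 0 at (0, 0)
```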
Choosing the step lengths:
Thm:
(a) If $\sum_{t=1}^{\infty} \theta_t = \infty$ and $\lim_{t \to \infty} \theta_t = 0$, then $Z(\lambda^t) \to Z_D$, the optimal value of $Z(\lambda)$.
(b) If $\theta_t = \theta_0 \rho^t$, $t = 1, 2, \ldots$, for some parameter $0 < \rho < 1$, then $Z(\lambda^t) \to Z_D$ if $\theta_0$ and $\rho$ are sufficiently large.
(c) If $\theta_t = \epsilon_f\,[Z_D^* - Z(\lambda^t)] / \|s^t\|^2$, where $0 < \epsilon_f < 2$ and $Z_D^* \ge Z_D$, then $Z(\lambda^t) \to Z_D^*$, or the algorithm finds $\lambda^t$ with $Z_D^* \ge Z(\lambda^t) \ge Z_D$ for some finite $t$.
(a) guarantees convergence, but is slow (e.g., $\theta_t = 1/t$).
(b) In practice, one may halve the value of $\theta_t$ after a fixed number of iterations.
(c) $Z_D^*$ is typically unknown; one may use a good primal upper bound in place of $Z_D^*$. If the iterates do not converge, decrease $Z_D^*$.
In practice, the stopping condition $0 \in \partial Z(\lambda^t)$ is rarely met. Usually we only find an approximately optimal solution fast and resort to branch-and-bound. Convergence is not monotone. If $Z_D = Z_{LP}$, solving the LP relaxation may give better results (monotone convergence).
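Rule (c) is the one most used in practice; a sketch of the step computation, with a target value standing in for the unknown $Z_D^*$ (names are illustrative):

```python
# Polyak-type step of rule (c); z_target plays the role of Z_D*.
def step_size_rule_c(z_target, z_current, s, eps_f=1.0):
    """theta_t = eps_f * (z_target - Z(lambda_t)) / ||s_t||^2, 0 < eps_f < 2."""
    return eps_f * (z_target - z_current) / sum(si * si for si in s)

# In practice z_target is a good primal bound; if the iterates stall
# without converging, decrease z_target.
```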
Ex: Traveling salesman problem.
For the TSP, we dualize the degree constraints for all nodes except node 1:
$Z(\lambda) = \min \sum_{e \in E} (c_e - \lambda_i - \lambda_j)\, x_e + 2 \sum_{i \in V \setminus \{1\}} \lambda_i$
$\big( = \min \sum_{e \in E} c_e x_e + \sum_{i \in V \setminus \{1\}} \lambda_i (2 - \sum_{e \in \delta(i)} x_e) \big)$
subject to
$\sum_{e \in \delta(1)} x_e = 2$,
$\sum_{e \in E(S)} x_e \le |S| - 1$, for $S \subseteq V \setminus \{1\}$, $S \ne \emptyset, V \setminus \{1\}$,
$\sum_{e \in E(V \setminus \{1\})} x_e = |V| - 2$,
$x_e \in \{0, 1\}$.
The step direction for coordinate $i$ is
$\lambda_i^{t+1} = \lambda_i^t + \theta_t\,(2 - \sum_{e \in \delta(i)} x_e(\lambda^t))$,
and the step size using rule (c) is
$\theta_t = \epsilon_f\,[Z_D^* - Z(\lambda^t)] \,/\, \sum_{i \in V \setminus \{1\}} (2 - \sum_{e \in \delta(i)} x_e(\lambda^t))^2$
$\big( \theta_t = \epsilon_f\,[Z_D^* - Z(\lambda^t)] / \|b - Ax(\lambda^t)\|^2 \big)$.
Note that the $i$-th coordinate of the subgradient direction is two minus the number of edges incident to node $i$ in the optimal one-tree. We do not require $\lambda \ge 0$ here since the dualized constraints are equalities.
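A compact sketch of the resulting Held-Karp-style scheme (our own illustration: node 0 plays the role of node 1, the cost matrix is assumed symmetric with $n \ge 3$ nodes, Prim's algorithm builds the spanning tree, and step rule (a) is used for simplicity):

```python
def one_tree(cost, lam):
    """Min-cost one-tree w.r.t. the modified costs c_ij - lam_i - lam_j."""
    n = len(cost)
    c = [[cost[i][j] - lam[i] - lam[j] for j in range(n)] for i in range(n)]
    in_tree, edges = {1}, []
    while len(in_tree) < n - 1:        # spanning tree on nodes 1..n-1 (Prim)
        i, j = min(((i, j) for i in in_tree
                    for j in range(1, n) if j not in in_tree),
                   key=lambda e: c[e[0]][e[1]])
        edges.append((i, j))
        in_tree.add(j)
    cheapest = sorted(range(1, n), key=lambda j: c[0][j])[:2]
    edges += [(0, j) for j in cheapest]   # join node 0 by its 2 cheapest edges
    z = sum(c[i][j] for i, j in edges) + 2 * sum(lam)   # Z(lambda); lam[0] = 0
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return z, deg

def held_karp_bound(cost, iters=200):
    n = len(cost)
    lam, best = [0.0] * n, float("-inf")
    for t in range(1, iters + 1):
        z, deg = one_tree(cost, lam)
        best = max(best, z)               # lower bound on the tour length
        theta = 1.0 / t                   # rule (a); rule (c) is common too
        for i in range(1, n):             # lam is free: equality constraints
            lam[i] += theta * (2 - deg[i])   # subgradient: 2 minus the degree
    return best
```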
Lagrangean heuristic and variable fixing:
The solution obtained from solving the Lagrangean relaxation may not be feasible for the IP, but it can be close to a feasible solution. We can obtain a feasible solution to the IP using heuristic procedures, and thereby a good upper bound on the optimal value. This is also called a 'primal heuristic'.
We may also fix the values of some variables using information from the Lagrangean relaxation (refer to W, pp. 177-178).
Choosing a Lagrangean dual:
How do we determine which constraints to relax? Consider:
the strength of the Lagrangean dual bound;
the ease of solving $Z(\lambda)$;
the ease of solving the Lagrangean dual problem $Z_D = \max_{\lambda \ge 0} Z(\lambda)$.
Ex: Generalized assignment problem (max problem) (refer to W p. 179):
$z = \max \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij}$
$\sum_{j=1}^{n} x_{ij} \le 1$, for $i = 1, \ldots, m$
$\sum_{i=1}^{m} a_{ij} x_{ij} \le b_j$, for $j = 1, \ldots, n$
$x \in B^{mn}$
Dualize both sets of constraints: $Z_D^1 = \min_{u \ge 0,\, v \ge 0} Z^1(u, v)$, where
$Z^1(u, v) = \max \sum_{i=1}^{m} \sum_{j=1}^{n} (c_{ij} - u_i - a_{ij} v_j)\, x_{ij} + \sum_{i=1}^{m} u_i + \sum_{j=1}^{n} v_j b_j$
$x \in B^{mn}$.
Dualize the first set of (assignment) constraints: $Z_D^2 = \min_{u \ge 0} Z^2(u)$, where
$Z^2(u) = \max \sum_{i=1}^{m} \sum_{j=1}^{n} (c_{ij} - u_i)\, x_{ij} + \sum_{i=1}^{m} u_i$
$\sum_{i=1}^{m} a_{ij} x_{ij} \le b_j$, for $j = 1, \ldots, n$
$x \in B^{mn}$.
Dualize the knapsack constraints: $Z_D^3 = \min_{v \ge 0} Z^3(v)$, where
$Z^3(v) = \max \sum_{i=1}^{m} \sum_{j=1}^{n} (c_{ij} - a_{ij} v_j)\, x_{ij} + \sum_{j=1}^{n} v_j b_j$
$\sum_{j=1}^{n} x_{ij} \le 1$, for $i = 1, \ldots, m$
$x \in B^{mn}$.
$Z_D^1 = Z_D^3 = Z_{LP}$, since for each $i$,
$\mathrm{conv}\{x : \sum_{j=1}^{n} x_{ij} \le 1,\ x_{ij} \in \{0, 1\}$ for $j = 1, \ldots, n\}$
$= \{x : \sum_{j=1}^{n} x_{ij} \le 1,\ 0 \le x_{ij} \le 1$ for $j = 1, \ldots, n\}$.
$Z^1(u, v)$ and $Z^3(v)$ can be solved by inspection. Calculating $Z_D^3$ looks easier than calculating $Z_D^1$, since there are $n$ dual variables compared to $m + n$ for $Z_D^1$.
$Z_D^2 \le Z_{LP}$.
To find $Z^2(u)$, we need to solve $n$ 0-1 knapsack problems. Also note that the information we may obtain while solving the knapsacks cannot be stored and used for subsequent optimizations.
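For concreteness, a sketch of evaluating $Z^3(v)$ by inspection (data layout assumed: $c$ and $a$ are $m \times n$ nested lists, $b$ and $v$ have length $n$; names are ours):

```python
def z3(c, a, b, v):
    """Z^3(v): maximize row by row, since only sum_j x_ij <= 1 remains."""
    m, n = len(c), len(c[0])
    value = sum(v[j] * b[j] for j in range(n))   # constant term sum_j v_j b_j
    for i in range(m):
        # Pick the single best reduced profit c_ij - a_ij v_j in row i,
        # and use it only if it is positive (otherwise set the row to 0).
        value += max(0.0, max(c[i][j] - a[i][j] * v[j] for j in range(n)))
    return value
```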
We may solve the Lagrangean dual by constraint generation (NW pp. 411-412). Recall that
$Z_D = \text{maximize } y$
subject to
$y + \lambda'(Ax^k - b) \le c'x^k$, $k \in K$,
$\lambda' A w^j \le c'w^j$, $j \in J$,
$\lambda \ge 0$.
Given $(y^*, \lambda^*)$ with $\lambda^* \ge 0$, calculate
$Z(\lambda^*) = \min_{x \in \mathrm{conv}(F)}\,(c'x + \lambda^{*\prime}(b - Ax))$ $\big( = \min\,(c' - \lambda^{*\prime}A)x + \lambda^{*\prime} b \big)$.
If $y^* \le Z(\lambda^*)$, stop: $y^* = Z(\lambda^*)$, $\lambda = \lambda^*$ is an optimal solution.
If $y^* > Z(\lambda^*)$, an inequality is violated:
If $Z(\lambda^*) = -\infty$, there is a ray $w^j$ such that $(c' - \lambda^{*\prime}A)w^j < 0$; hence $\lambda' A w^j \le c'w^j$ is violated.
If $Z(\lambda^*) > -\infty$, there is an extreme point $x^k$ such that $Z(\lambda^*) = c'x^k + \lambda^{*\prime}(b - Ax^k)$. Since $y^* > Z(\lambda^*)$, the inequality $y + \lambda'(Ax^k - b) \le c'x^k$ is violated.
Note that max/min are interchanged in NW.
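A sketch of this constraint-generation loop, under simplifying assumptions (c, A, b and the points in F are NumPy arrays, F is a small explicit list so no ray cuts arise, SciPy's linprog solves the master, and the artificial bound LAM_MAX keeps the early masters bounded):

```python
import numpy as np
from scipy.optimize import linprog

LAM_MAX = 1e6   # artificial upper bound on the multipliers

def lagrangean_dual_by_cuts(c, A, b, F, tol=1e-9):
    m = len(b)
    cuts = [F[0]]
    while True:
        # Master: max y  s.t.  y + lam'(A x^k - b) <= c'x^k over generated x^k.
        A_ub = np.array([np.concatenate(([1.0], A @ xk - b)) for xk in cuts])
        b_ub = np.array([c @ xk for xk in cuts])
        res = linprog([-1.0] + [0.0] * m, A_ub=A_ub, b_ub=b_ub,
                      bounds=[(None, None)] + [(0, LAM_MAX)] * m)
        y_star, lam = res.x[0], res.x[1:]
        # Oracle: Z(lam*) by enumeration over the explicit set F.
        xk = min(F, key=lambda x: c @ x + lam @ (b - A @ x))
        z = c @ xk + lam @ (b - A @ xk)
        if y_star <= z + tol:
            return z, lam            # y* = Z(lam*): optimal
        cuts.append(xk)              # the cut for this x^k is violated: add it
```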
Nonlinear Optimization Problems
Geometric understanding of the strength of the bounds provided by the Lagrangean dual.
Let $f, g_1, \ldots, g_m$ be continuous functions and let $X \subseteq \mathbb{R}^n$. Consider
minimize $f(x)$
subject to $g(x) \le 0$, (4.31)
$x \in X$.
Let $Z_P$ be the optimal cost. Consider the Lagrangean function $Z(\lambda) = \min_{x \in X}\,(f(x) + \lambda' g(x))$.
For all $\lambda \ge 0$, $Z(\lambda) \le Z_P$, and $Z(\lambda)$ is a concave function.
Lagrangean dual: $Z_D = \max_{\lambda \ge 0} Z(\lambda)$.
Let $Y = \{(g(x), f(x)) : x \in X\}$. Problem (4.31) can be restated as
minimize $f$
subject to $(g, f) \in Y$, $g \le 0$.
[Figure 4.7: The geometric interpretation of the Lagrangean dual. The set $\mathrm{conv}(Y)$ is drawn in the $(g, f)$-plane; the hyperplane $Z(\lambda) = f + \lambda g$ supports it from below, and $Z_D$ and $Z_P$ are marked on the vertical axis.]
Given that $Z(\lambda) = \min_{x \in X}\,(f(x) + \lambda' g(x))$, we have for a fixed $\lambda$ and for all $(g, f) \in Y$ that $Z(\lambda) \le f + \lambda' g$.
Geometrically, this means that the hyperplane $\{(g, f) : f + \lambda' g = Z(\lambda)\}$ lies below the set $Y$. For $g = 0$ we obtain $f = Z(\lambda)$; that is, $Z(\lambda)$ is the intercept of the hyperplane with the vertical axis. Maximizing $Z(\lambda)$ therefore corresponds to finding a hyperplane $f + \lambda' g = Z(\lambda)$ that lies below the set $Y$ such that the intercept of the hyperplane with the vertical axis is largest.
Thm 4.10: The value $Z_D$ of the Lagrangean dual is equal to the value of the following optimization problem:
minimize $f$
subject to $(g, f) \in \mathrm{conv}(Y)$, $g \le 0$.
Ex 4.7: Ex 4.3 revisited. Here
$X = \{(1, 0), (2, 0), (1, 1), (2, 1), (0, 2), (1, 2), (2, 2), (1, 3), (2, 3)\}$ and
$Y = \{(3, -2), (6, -3), (2, -1), (5, -2), (-2, 1), (1, 0), (4, -1), (0, 1), (3, 0)\}$.
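A quick numeric check of this example (the coordinate order of the transcribed pairs is ambiguous; reading each pair as $(f(x), g(x))$ reproduces the values shown in the figure below, so that reading is used here):

```python
Y = [(3, -2), (6, -3), (2, -1), (5, -2), (-2, 1),
     (1, 0), (4, -1), (0, 1), (3, 0)]

# Z_P: the best f over points satisfying the dualized constraint g <= 0.
z_p = min(f for f, g in Y if g <= 0)                  # -> 1

# Z_D: maximize the lower envelope Z(lam) = min over Y of f + lam*g.
z_d = max(min(f + lam * g for f, g in Y)
          for lam in (t / 1000.0 for t in range(5001)))
print(z_p, z_d)   # 1 and about -1/3 (the maximum is near lam = 5/3)
```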
[Figure: $\mathrm{conv}(Y)$ in the $(g, f)$-plane for Ex 4.7, showing $Z_P = 1$, $Z_D = -1/3$, and the duality gap between them.]