Multistage Discrete Optimization Part II: Duality

Transcript of Multistage Discrete Optimization Part II: Duality

  • Multistage Discrete Optimization Part II: Duality

    Ted Ralphs1

    Joint work with Suresh Bolusani1, Scott DeNegre3, Menal Güzelsoy2, Anahita Hassanzadeh4, Sahar Tahernajad1

    1 COR@L Lab, Department of Industrial and Systems Engineering, Lehigh University; 2 SAS Institute, Advanced Analytics, Operations Research R & D; 3 The Hospital for Special Surgery; 4 Climate Corp

    Friedrich-Alexander-Universität Erlangen-Nürnberg, 20-21 March 2017

    Ralphs et al. (COR@L Lab) Multistage Discrete Optimization: Duality

  • Outline

    1 Introduction

    2 Value Functions: (Continuous) Linear Optimization; Discrete Optimization

    3 Dual Problems: Dual Functions; Subadditive Dual

    4 Approximating the Value Function: Primal Bounding Functions; Dual Bounding Functions

    5 Related Methodologies: Warm Starting; Sensitivity Analysis

    6 Conclusions

  • Mathematical Optimization

    The general form of a mathematical optimization problem is:

    Form of a General Mathematical Optimization Problem

    zMP = min f(x)

    s.t. gi(x) ≤ bi, 1 ≤ i ≤ m                      (MP)

         x ∈ X

    where X ⊆ Rn may be a discrete set. The function f is the objective function, while gi is the constraint function associated with constraint i. Our primary goal is to compute the optimal value zMP. However, we may want to obtain some auxiliary information as well. More importantly, we may want to develop parametric forms of (MP) in which the input data are the output of some other function or process.

  • What is Duality?

    It is difficult to define precisely what is meant by “duality” in general mathematics, though the literature is replete with various examples of it.

    Set Theory and Logic (De Morgan's Laws)

    Geometry (Pascal's Theorem & Brianchon's Theorem)

    Combinatorics (Graph Coloring)

    We are interested in the notions of duality relevant to solving optimization problems. This duality manifests itself in different forms, depending on our point of view.

    Forms of Duality in Optimization

    NP versus co-NP (computational complexity)

    Separation versus optimization (polarity)

    Inverse optimization versus forward optimization

    Weyl-Minkowski duality (representation theorem)

    Economic duality (pricing and sensitivity)

    Primal/dual functions in optimization


  • Economic Interpretation of Duality

    The economic viewpoint interprets the variables as representing possible activities in which one can engage at specific numeric levels. The constraints represent available resources, so that gi(x̂) represents how much of resource i will be consumed at activity levels x̂ ∈ X. With each x̂ ∈ X, we associate a cost f(x̂), and we say that x̂ is feasible if gi(x̂) ≤ bi for all 1 ≤ i ≤ m. The space in which the vectors of activities live is the primal space. On the other hand, we may also want to consider the problem from the viewpoint of the resources in order to ask questions such as

    How much are the resources “worth” in the context of the economic system described by the problem?

    What is the marginal economic profit contributed by each existing activity?

    What new activities would provide additional profit?

    The dual space is the space of resources in which we can frame these questions.



  • Linear Optimization

    For this part of the talk, we focus on (single-level) mixed integer linear optimization problems (MILPs):

    zIP = min {c>x : x ∈ S},                        (MILP)

    where c ∈ Rn, S = {x ∈ Zr+ × Rn−r+ | Ax = b} with A ∈ Qm×n, b ∈ Rm. Note that in this lecture only, we are switching to the equality form of constraints to simplify the presentation. In this context, we can make the concepts outlined earlier more concrete.

    We can think of each row of A as representing a resource and each column as representing an activity or product.

    For each activity, resource consumption is a linear function of activity level.

    We first consider the case r = 0, which is the case of the (continuous) linear optimization problem (LP).

  • The LP Value Function

    Of central importance in duality theory for linear optimization is the value function, defined by

    φLP(β) = min {c>x : x ∈ S(β)},                  (LPVF)

    for a given β ∈ Rm, where S(β) = {x ∈ Rn+ | Ax = β}. We let φLP(β) = ∞ if β ∈ Ω = {β ∈ Rm | S(β) = ∅}. The value function returns the optimal value as a parametric function of the right-hand side vector, which represents available resources.

  • Economic Interpretation of the Value Function

    What information is encoded in the value function?

    Consider the gradient u = φ′LP(β) at a β for which φLP is differentiable.

    The quantity u>∆b represents the marginal change in the optimal value if we change the resource level by ∆b.

    In other words, it can be interpreted as a vector of the marginal costs of the resources.

    For reasons we will see shortly, this is also known as the dual solution vector.

    In the LP case, the gradient is a linear under-estimator of the value function and can thus be used to derive bounds on the optimal value for any β ∈ Rm.

  • Small Example: Fractional Knapsack Problem

    We are given a set N = {1, . . . , n} of items and a capacity W. There is a profit pi and a size wi associated with each item i ∈ N. We want a set of items that maximizes profit subject to the constraint that their total size does not exceed the capacity. In this variant of the problem, we are allowed to take a fraction of an item. For each item i, let variable xi represent the fraction selected.

    Fractional Knapsack Problem

    min ∑j∈N pjxj

    s.t. ∑j∈N wjxj ≤ W

         0 ≤ xj ≤ 1 ∀j ∈ N                         (1)

    What is the optimal solution?
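The greedy rule that answers this question (sort items by profit-to-size ratio, take items whole until the capacity forces a fractional last item) can be sketched as follows; this is for the usual maximization variant described above, with illustrative data of our own:

```python
def fractional_knapsack(profits, sizes, capacity):
    # Sort items by profit/size ratio, best first; take each item fully
    # until the remaining capacity forces a fractional final item.
    order = sorted(range(len(profits)), key=lambda j: profits[j] / sizes[j], reverse=True)
    x = [0.0] * len(profits)
    remaining = capacity
    for j in order:
        if remaining <= 0:
            break
        x[j] = min(1.0, remaining / sizes[j])
        remaining -= x[j] * sizes[j]
    return x, sum(p * xj for p, xj in zip(profits, x))

x, value = fractional_knapsack([60, 100, 120], [10, 20, 30], 50)  # x = [1, 1, 2/3]
```

The exchange argument behind the greedy rule is exactly the duality developed next: the optimal capacity price separates the items taken fully from those not taken at all.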

  • Generalizing the Knapsack Problem

    Let us consider the value function of a (generalized) knapsack problem.

    To be as general as possible, we allow sizes, profits, and even the capacity to be negative.

    We also take the capacity constraint to be an equality.

    This is a proper generalization.

    Example 1

    φLP(β) = min 6y1 + 7y2 + 5y3

    s.t. 2y1 − 7y2 + y3 = β

         y1, y2, y3 ∈ R+

  • Value Function of the (Generalized) Knapsack Problem

    Now consider the value function φLP of Example 1. What do the gradients of this function represent?

    Value Function for Example 1

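For a problem with a single equality constraint, such as Example 1, the value function can be written down directly: the dual feasible region {u | u·aj ≤ cj ∀j} is an interval, and φLP(β) is the larger of u·β over its two endpoints. A small sketch in pure Python (no solver), cross-checked against the primal view that an optimal solution uses a single column:

```python
c = [6.0, 7.0, 5.0]      # objective coefficients of Example 1
a = [2.0, -7.0, 1.0]     # constraint coefficients

# Dual view: u*a_j <= c_j for every column j gives an interval [u_lo, u_hi].
u_lo = max(cj / aj for cj, aj in zip(c, a) if aj < 0)   # here -1
u_hi = min(cj / aj for cj, aj in zip(c, a) if aj > 0)   # here  3

def phi_dual(beta):
    # value function as the max over the two dual extreme points
    return max(u_lo * beta, u_hi * beta)

def phi_primal(beta):
    # with one constraint, an optimal solution uses a single column
    # whose coefficient has the same sign as beta
    if beta == 0:
        return 0.0
    ratios = [cj / aj for cj, aj in zip(c, a) if aj * beta > 0]
    return (min(ratios) if beta > 0 else max(ratios)) * beta
```

For this data, φLP(β) = 3β for β ≥ 0 and −β for β < 0, so the gradients ±slopes are exactly the dual extreme points −1 and 3.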

  • The Dual Optimization Problem

    Can we calculate the gradient of φLP at b directly? Note that for any µ ∈ Rm, we have

    minx≥0 [c>x + µ>(b − Ax)] ≤ c>x∗ + µ>(b − Ax∗) = c>x∗ = φLP(b),

    and thus we have a lower bound on φLP(b). With some simplification, we can obtain a more explicit form for this bound:

    minx≥0 [c>x + µ>(b − Ax)] = µ>b + minx≥0 (c> − µ>A)x,

    where

    minx≥0 (c> − µ>A)x = { 0, if c> − µ>A ≥ 0>; −∞, otherwise. }

  • The Dual Problem (cont’d)

    If we now interpret this quantity as a function

    g(u, β) = { u>β, if c> − u>A ≥ 0>; −∞, otherwise, }

    with parameters u and β, then for fixed first parameter u, g(u, ·) is a linear under-estimator of φLP. An LP dual problem is obtained by computing the u ∈ Rm that gives the under-estimator yielding the strongest bound for a fixed b.

    LP Dual Problem

    maxµ∈Rm g(µ, b) = max b>µ

    s.t. µ>A ≤ c>                                   (LPD)

    An optimal solution to (LPD) is a (sub)gradient of φLP at b.
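The weak-duality inequality derived above is easy to check numerically for Example 1: any µ with µ>A ≤ c> satisfies g(µ, b) = µ>b ≤ φLP(b). A quick sketch, using the closed form of the Example 1 value function:

```python
c = [6.0, 7.0, 5.0]
a = [2.0, -7.0, 1.0]

def phi(beta):
    # value function of Example 1: max over the dual extreme points -1 and 3
    return max(-beta, 3.0 * beta)

def dual_feasible(mu):
    # mu * a_j <= c_j for every column j
    return all(mu * aj <= cj + 1e-12 for cj, aj in zip(c, a))

# every dual feasible mu yields a linear under-estimator of phi
for mu in [-1.0, 0.0, 1.5, 3.0]:
    assert dual_feasible(mu)
    for b in [-4.0, -1.0, 0.0, 2.0, 5.0]:
        assert mu * b <= phi(b) + 1e-9
```

At b = 2, the choice µ = 3 attains φ(2) = 6, so the strongest under-estimator is tight, which is strong duality for the LP case.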

  • Combinatorial Representation of the LP Value Function

    Note that the feasible region of (LPD) does not depend on b. From the fact that there is always an extremal optimum to (LPD), we conclude that the LP value function can be described combinatorially.

    Combinatorial Representation of the LP Value Function

    φLP(β) = maxu∈E u>β                             (LPVF)

    for β ∈ Rm, where E is the set of extreme points of the dual polyhedron D = {u ∈ Rm | u>A ≤ c>} (assuming boundedness). Alternatively, E is in correspondence with the dual feasible bases of A:

    E ≡ { cEA−1E | E is the index set of a dual feasible basis of A }

    Thus, we see that epi(φLP) is a polyhedral cone whose facets correspond to dual feasible bases of A.

  • What is the Importance in This Context?

    The dual problem is important because it gives us a set of optimality conditions. For a given b ∈ Rm, whenever we have

    x∗ ∈ S(b), u ∈ D, and c>x∗ = u>b,

    then x∗ is optimal!

    This means we can write down a set of constraints involving the value function that ensure optimality.

    This set of constraints can then be embedded inside another optimization problem.


  • The MILP Value Function

    We now generalize the notions seen so far to the MILP case. The value function associated with the base instance (MILP) is

    MILP Value Function

    φ(β) = min {c>x : x ∈ S(β)}                     (VF)

    for β ∈ Rm, where S(β) = {x ∈ Zr+ × Rn−r+ | Ax = β}. Again, we let φ(β) = ∞ if β ∈ Ω = {β ∈ Rm | S(β) = ∅}.

  • Example: The (Mixed) Binary Knapsack Problem

    We now consider a further generalization of the previously introduced knapsack problem.

    In this problem, we must take some of the items either fully or not at all.

    In the example, we allow all of the previously introduced generalizations.

    Example 2

    φ(β) = min (1/2)x1 + 2x3 + x4

    s.t. x1 − (3/2)x2 + x3 − x4 = β                 (2)

         x1, x2 ∈ Z+, x3, x4 ∈ R+.
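The Example 2 value function can be evaluated by brute force: enumerate the integer part (x1, x2) over a finite box (a small bound suffices for moderate |β|, an assumption of this sketch) and solve the continuous part x3 − x4 = r in closed form. A sketch in exact rational arithmetic:

```python
from fractions import Fraction

def phi(beta, bound=10):
    # Example 2: min (1/2)x1 + 2x3 + x4  s.t.  x1 - (3/2)x2 + x3 - x4 = beta.
    # For fixed x1, x2 the best continuous part is x3 = max(r, 0), x4 = max(-r, 0).
    beta = Fraction(beta)
    best = None
    for x1 in range(bound):
        for x2 in range(bound):
            r = beta - x1 + Fraction(3, 2) * x2
            cost = Fraction(1, 2) * x1 + 2 * max(r, 0) + max(-r, 0)
            best = cost if best is None else min(best, cost)
    return best
```

For instance, phi(1) returns 1/2 (take x1 = 1) and phi(Fraction(-3, 2)) returns 0 (take x2 = 1), matching the plotted function on the next slide.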

  • Value Function for (Generalized) Mixed Binary Knapsack

    Below is the value function of the optimization problem in Example 2. How do we interpret the structure of this function?

    Value Function for Example 2

    [Figure: plot of the value function z(d) of Example 2 over d ∈ [−4, 4].]

  • Related Work on Value Function

    Duality

    Johnson [1973, 1974, 1979]; Jeroslow [1979]; Wolsey [1981]; Güzelsoy and Ralphs [2007]; Güzelsoy [2009]

    Structure and Construction

    Blair and Jeroslow [1977b, 1979, 1982, 1984, 1985]; Blair [1995]; Kong et al. [2006]; Güzelsoy and Ralphs [2008]; Hassanzadeh and Ralphs [2014]

    Sensitivity and Warm Starting

    Ralphs and Güzelsoy [2005, 2006]; Güzelsoy [2009]; Gamrath et al. [2015]

  • Properties of the MILP Value Function

    The value function is non-convex, lower semi-continuous, and piecewise polyhedral.

    Example 3

    φ(β) = min x1 − (3/4)x2 + (3/4)x3

    s.t. (5/4)x1 − x2 + (1/2)x3 = β                 (Ex2.MILP)

         x1, x2 ∈ Z+, x3 ∈ R+

  • Example: MILP Value Function (Pure Integer)

    Example 4

    φ(β) = min 3x1 + (7/2)x2 + 3x3 + 6x4 + 7x5 + 5x6

    s.t. 6x1 + 5x2 − 4x3 + 2x4 − 7x5 + x6 = β

         x1, x2, x3, x4, x5, x6 ∈ Z+

  • Another Example

    Example 5

    φ(β) = min 3x1 + (7/2)x2 + 3x3 + 6x4 + 7x5 + 5x6

    s.t. 6x1 + 5x2 − 4x3 + 2x4 − 7x5 + x6 = β

         x1, x2, x3 ∈ Z+, x4, x5, x6 ∈ R+

    The structure of this function is inherited from two related functions.

  • Continuous and Integer Restriction of an MILP

    Consider the general form of the second-stage value function

    φ(β) = min c>I xI + c>C xC

    s.t. AIxI + ACxC = β,                           (VF)

         x ∈ Zr2+ × Rn2−r2+

    The structure is inherited from that of the continuous restriction:

    φC(β) = min c>C xC

    s.t. ACxC = β,                                  (CR)

         xC ∈ Rn2−r2+

    for C = {r2 + 1, . . . , n2}, and the similarly defined integer restriction:

    φI(β) = min c>I xI

    s.t. AIxI = β,                                  (IR)

         xI ∈ Zr2+

    for I = {1, . . . , r2}.

  • Discrete Representation of the Value Function

    For β ∈ Rm2, we have that

    φ(β) = min {c>I xI + φC(β − AIxI) : xI ∈ Zr2+}. (3)

    From this we see that the value function is the minimum of a set of translations of φC. The set of shifts, along with φC, describes the value function exactly.

    For x̂I ∈ Zr2+, let

    φC(β, x̂I) = c>I x̂I + φC(β − AI x̂I) ∀β ∈ Rm2.   (4)

    Then we have that φ(β) = minx̂I∈Zr2+ φC(β, x̂I).
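Formulas (3) and (4) can be made concrete for Example 2: the continuous restriction has the closed form φC(r) = 2 max(r, 0) + max(−r, 0), and the value function is the lower envelope of its translations, one per integer assignment. A sketch (the enumeration is truncated at a small bound, which suffices for small |β|):

```python
def phi_C(r):
    # continuous restriction of Example 2: min 2*x3 + x4  s.t.  x3 - x4 = r
    return 2 * max(r, 0.0) + max(-r, 0.0)

def translation(x1, x2):
    # formula (4): phi_C(beta, x_I) = c_I^T x_I + phi_C(beta - A_I x_I)
    return lambda beta: 0.5 * x1 + phi_C(beta - (x1 - 1.5 * x2))

translations = [translation(x1, x2) for x1 in range(8) for x2 in range(8)]

def phi(beta):
    # formula (3): pointwise minimum over the translations
    return min(f(beta) for f in translations)
```

Each translation is a V-shaped copy of φC shifted to the point AIx̂I and raised by c>I x̂I, which is exactly the structure visible in the plots of the value function.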

  • Value Function of the Continuous Restriction

    Example 6

    φC(β) = min 6y1 + 7y2 + 5y3

    s.t. 2y1 − 7y2 + y3 = β

         y1, y2, y3 ∈ R+

  • Related Results

    From the basic structure outlined, we can derive many other useful results.

    Proposition 1. [Hassanzadeh and Ralphs, 2014] The gradient of φ on a neighborhood of a differentiable point is a unique optimal dual feasible solution to (CR).

    Proposition 2. [Hassanzadeh and Ralphs, 2014] If φ is differentiable over a connected set N ⊆ Rm, then there exists x∗I ∈ Zr and E ∈ E such that φ(b) = c>I x∗I + ν>E (b − AIx∗I) for all b ∈ N.

    This last result can be extended to subsets of the domain over which φ is convex. Over such a region, φ coincides with the value function of a translation of the continuous restriction. Putting all of this together, we get a practical finite representation...

  • Points of Strict Local Convexity (Finite Representation)

    Example 7

    Theorem 1. [Hassanzadeh and Ralphs, 2014] Under the assumption that {β ∈ Rm2 | φI(β) [. . .] c>I xI + φC(β − AIxI)}. (5)

  • Interpretation

    It is only possible to get a unique linear price function for resource vectors where the value function is differentiable.

    This only happens when the continuous restriction has a unique dual solution at the current resource vector.

    Otherwise, there is no linear price function that will be valid in an ε-neighborhood of the current resource vector.

    When the dual solution does exist, its value is determined by only the continuous part of the problem!

    Thus, these prices reflect behavior over only a very localized region for which the discrete part of the solution remains constant. In the case of the generalized knapsack problem, the differentiable points have the following two properties:

    the continuous part of the solution is non-zero (and unique); and

    the discrete part of the solution is unique.


  • Dual Bounding Functions

    A dual function F : Rm → R is one that satisfies F(β) ≤ φ(β) for all β ∈ Rm. How do we select such a function? We may choose one that is easy to construct/evaluate or for which F(b) ≈ φ(b). This results in the following generalized dual associated with the base instance (MILP).

    max {F(b) : F(β) ≤ φ(β) ∀β ∈ Rm, F ∈ Υm}        (D)

    where Υm ⊆ {f | f : Rm → R}. We call F∗ strong for this instance if F∗ is a feasible dual function and F∗(b) = φ(b). This dual instance always has a solution F∗ that is strong if the value function is bounded and Υm ≡ {f | f : Rm → R}. Why?

  • Example: LP Relaxation Dual Function

    Example 8

    FLP(d) = min vd

    s.t. 0 ≥ v ≥ −1/2, v ∈ R,                       (6)

    which can be written explicitly as

    FLP(β) = { 0, β ≤ 0; −(1/2)β, β > 0. }

    [Figure: the dual function FLP(d) plotted against the value function z(d) over d ∈ [−4, 4].]
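Whatever its exact derivation, the function in (6) should be a valid dual (lower-bounding) function for Example 2, and this is easy to confirm numerically against a brute-force evaluation of φ on a grid:

```python
def F_LP(beta):
    # the dual function of Example 8 as given in (6)
    return 0.0 if beta <= 0 else -0.5 * beta

def phi(beta, bound=10):
    # brute-force value function of Example 2 (enumerate the integer part,
    # solve the continuous part x3 - x4 = r in closed form)
    best = float("inf")
    for x1 in range(bound):
        for x2 in range(bound):
            r = beta - x1 + 1.5 * x2
            best = min(best, 0.5 * x1 + 2 * max(r, 0.0) + max(-r, 0.0))
    return best

for b in [x / 4 for x in range(-16, 17)]:
    assert F_LP(b) <= phi(b) + 1e-9   # F_LP never exceeds the value function
```

The check passes on the whole grid; the gap F_LP(b) − φ(b) shows how weak a single LP-based dual function can be away from the right-hand sides where it is tight.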


  • The Subadditive Dual

    By considering that

    F(β) ≤ φ(β) ∀β ∈ Rm ⇐⇒ F(β) ≤ c>x ∀x ∈ S(β), ∀β ∈ Rm
                         ⇐⇒ F(Ax) ≤ c>x ∀x ∈ Zr+ × Rn−r+,

    the generalized dual problem can be rewritten as

    max {F(b) : F(Ax) ≤ c>x ∀x ∈ Zr+ × Rn−r+, F ∈ Υm}.

    Can we further restrict Υm and still guarantee a strong dual solution?

    The class of linear functions? NO!

    The class of convex functions? NO!

    The class of subadditive functions? YES!

    See [Johnson, 1973, 1974, 1979, Jeroslow, 1979] for details.

  • The Subadditive Dual

    Let a function F be defined over a domain V. Then F is subadditive if F(v1) + F(v2) ≥ F(v1 + v2) for all v1, v2, v1 + v2 ∈ V. Note that the value function φ is subadditive over Ω. Why? If Υm ≡ Γm ≡ {F : Rm → R | F is subadditive, F(0) = 0}, we can rewrite the dual problem above as the subadditive dual

    max F(b)

    s.t. F(aj) ≤ cj, j = 1, . . . , r,

         F̄(aj) ≤ cj, j = r + 1, . . . , n, and

         F ∈ Γm,

    where the function F̄ is defined by

    F̄(β) = lim supδ→0+ F(δβ)/δ ∀β ∈ Rm.

    Here, F̄ is the upper β-directional derivative of F at zero.
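Subadditivity is easy to test numerically on a grid of sample points. For instance, rounding up is subadditive (one reason ⌈·⌉ appears later as a building block of Chvátal functions) while rounding down is not; a small check with illustrative grid values of our own:

```python
import itertools
import math

def is_subadditive(F, grid, tol=1e-9):
    # F(v1) + F(v2) >= F(v1 + v2) for every pair of grid points
    return all(F(v1) + F(v2) >= F(v1 + v2) - tol
               for v1, v2 in itertools.product(grid, repeat=2))

grid = [x / 4 for x in range(-20, 21)]
assert is_subadditive(math.ceil, grid)        # rounding up is subadditive
assert not is_subadditive(math.floor, grid)   # rounding down is not
```

The floor counterexample is v1 = v2 = 1/2: ⌊1/2⌋ + ⌊1/2⌋ = 0 < ⌊1⌋ = 1, so the required inequality fails.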

  • Example: Upper β-Directional Derivative

    The upper β-directional derivative is the gradient of the value function at εβ for sufficiently small ε. We will see that this structure is related to that of a certain LP.

    Example 9

    [Figure: the upper directional derivative z̄(d) plotted against the value function z(d) over d ∈ [−4, 4].]

  • Weak Duality

    Weak Duality Theorem

    Let x be a feasible solution to the primal problem and let F be a feasible solution to the subadditive dual. Then F(b) ≤ c>x.

    Proof.

    Corollary

    For the primal problem and its subadditive dual:

    1 If the primal problem (resp., the dual) is unbounded, then the dual problem (resp., the primal) is infeasible.

    2 If the primal problem (resp., the dual) is infeasible, then the dual problem (resp., the primal) is infeasible or unbounded.

  • Strong Duality

    Strong Duality Theorem

    If the primal problem (resp., the dual) has a finite optimum, then so does the subadditive dual problem (resp., the primal), and the two optimal values are equal.

    Outline of the Proof. Show that the value function φ, or an extension of φ, is a feasible dual function.

    Note that φ satisfies the dual constraints.

    Ω ≡ Rm: φ ∈ Γm.

    Ω ⊂ Rm: ∃ φe ∈ Γm with φe(β) = φ(β) ∀β ∈ Ω and φe(β) [. . .]

  • Example: Subadditive Dual

    For the instance in Example 2, the subadditive dual is

    max F(b)

    s.t. F(1) ≤ 1/2

         F(−3/2) ≤ 0

         F̄(1) ≤ 2

         F̄(−1) ≤ 1

         F ∈ Γ1,

    and we have the following feasible dual functions:

    1 F1(β) = β/2 is an optimal dual function for β ∈ {0, 1, 2, . . .}.

    2 F2(β) = 0 is an optimal dual function for β ∈ {. . . , −3, −3/2, 0}.

    3 F3(β) = max{(1/2)⌈β − (⌈β⌉ − β)/4⌉, 2⌈−(3/2)(β − (⌈β⌉ − β)/4)⌉} is an optimal dual function for b ∈ [0, 1/4] ∪ [1, 5/4] ∪ [2, 9/4] ∪ . . ..

    4 F4(β) = max{(3/2)⌈(2β)/3 − 2(⌈(2β)/3⌉ − (2β)/3)/3⌉ − β, −(3/4)⌈(2β)/3 − 2(⌈(2β)/3⌉ − (2β)/3)/3⌉ + β/2} is an optimal dual function for b ∈ . . . ∪ [−7/2, −3] ∪ [−2, −3/2] ∪ [−1/2, 0].

  • Example: Feasible Dual Functions

    Example 10

    [Figure: several feasible dual functions F(d) plotted against the value function z(d) over d ∈ [−4, 4].]

    Notice how different dual solutions are optimal for some right-hand sides and not for others. Only the value function is optimal for all right-hand sides.

  • Farkas’ Lemma

    For the primal problem, exactly one of the following holds:

    1 S ≠ ∅

    2 There is an F ∈ Γm with F(aj) ≥ 0, j = 1, . . . , n, and F(b) < 0.

    Proof. Let c = 0 and apply the strong duality theorem to the subadditive dual.

  • Complementary Slackness [Wolsey, 1981]

    For a given right-hand side b, let x∗ and F∗ be feasible solutions to the primal and the subadditive dual problems, respectively. Then x∗ and F∗ are optimal solutions if and only if

    1 x∗j (cj − F∗(aj)) = 0, j = 1, . . . , n, and

    2 F∗(b) = Σnj=1 F∗(aj) x∗j.

    Proof. For an optimal pair we have

    F∗(b) = F∗(Ax∗) = Σnj=1 F∗(aj) x∗j = c>x∗.      (7)

  • Optimality Conditions

    As in the linear programming case, we can derive optimality conditions from the dual optimization problem.

    Optimality conditions for (MILP)

    If x∗ ∈ S, F∗ is feasible for (D), and c>x∗ = F∗(b), then x∗ is an optimal solution to (MILP) and F∗ is an optimal solution to (D).

    These are the optimality conditions achieved in the branch-and-cut algorithm for MILP that prove the optimality of the primal solution. The branch-and-bound tree encodes a solution to the dual.


  • Approximating the Value Function

    In general, it is difficult to construct the value function explicitly. We therefore propose to approximate the value function by either primal (upper) or dual (lower) bounding functions.

    Dual bounds: derived by considering the value function of relaxations of the original problem or by constructing dual functions ⇒ relax constraints.

    Primal bounds: derived by considering the value function of restrictions of the original problem ⇒ fix variables.

  • Primal/Dual Bounding Functions

    Dual (Bounding) Functions

    Definition 1. A dual (bounding) function F : Rm → R is one that satisfies F(β) ≤ φ(β) for all β ∈ Rm.

    Primal (Bounding) Functions

    Definition 2. A primal (bounding) function F : Rm → R is one that satisfies F(β) ≥ φ(β) for all β ∈ Rm.

    Strong Bounding Functions

    Definition 3. A bounding function F is said to be strong at b ∈ Rm if F(b) = φ(b).

  • Strong Primal Bounding Functions

    Strong bounding functions can be used algorithmically both to construct the value function directly and to dynamically construct approximations. These approximations can be used in algorithms for multistage optimization.

    Theorem 2. Let x∗ be an optimal solution to the primal problem with right-hand side b. Then φC(β, x∗I) is a strong primal bounding function at b.

    By repeatedly evaluating φI(β), we can obtain upper approximations (and eventually the full value function).

  • Benders-like Algorithm for Upper Approximation

    Algorithm

    Initialize: Let φ̄(β) = ∞ for all β ∈ B, Γ0 = ∞, x0I = 0, S0 = {x0I}, and k = 0.

    while Γk > 0 do:

        Let φ̄(β) ← min{φ̄(β), φC(β, xkI)} for all β ∈ Rm.

        k ← k + 1.

        Solve

            Γk = max φ̄(β) − c>I xI

            s.t. AIxI = β, xI ∈ Zr+                 (SP)

        to obtain xkI. Set Sk ← Sk−1 ∪ {xkI}.

    end while

    return φ(b) = φ̄(b) for all b ∈ B.
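A sketch of this loop on Example 2, with a grid search over right-hand sides standing in for the exact subproblem (SP), so this illustrates the idea rather than the MINLP formulation on the next slide:

```python
def phi_C(r):                 # continuous restriction of Example 2
    return 2 * max(r, 0.0) + max(-r, 0.0)

def shifted(x1, x2):          # translation phi_C(., x_I), formula (4)
    return lambda b: 0.5 * x1 + phi_C(b - (x1 - 1.5 * x2))

def phi(b):                   # exact values by enumeration, used as reference
    return min(shifted(x1, x2)(b) for x1 in range(8) for x2 in range(8))

grid = [x / 4 for x in range(-12, 13)]          # stand-in for the domain B
upper = {b: float("inf") for b in grid}         # the upper approximation phi-bar
S = [(0, 0)]                                    # S^0 = {x_I^0}
while True:
    f = shifted(*S[-1])                         # fold the newest translation in
    for b in grid:
        upper[b] = min(upper[b], f(b))
    # stand-in for (SP): locate the largest remaining gap on the grid
    gap, b_star = max((upper[b] - phi(b), b) for b in grid)
    if gap <= 1e-9:                             # Gamma^k = 0: done
        break
    S.append(min(((x1, x2) for x1 in range(8) for x2 in range(8)),
                 key=lambda p: shifted(*p)(b_star)))
```

Each pass either closes the largest gap or terminates, so on this finite grid the upper approximation converges to φ after finitely many translations.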

  • Algorithm for Upper Approximation


    Figure 1: Upper bounding functions obtained at right-hand sides bi, i = 1, . . . , 5.


  • Formulating (SP)

    Surprisingly, the “cut generation” problem (SP) can be formulated easily as an MINLP.

    Γk = max θ

    s.t. θ + c>I xI ≤ c>I xiI + (AIxI − AIxiI)>νi,  i = 1, . . . , k − 1

         A>C νi ≤ cC,                               i = 1, . . . , k − 1

         νi ∈ Rm,                                   i = 1, . . . , k − 1

         xI ∈ Zr+.                                  (8)

  • Sample Computational Results

    Figure 2: Normalized approximation gap vs. iteration number.

    http://github.com/tkralphs/ValueFunction


  • Dual Bounding Functions Revisited

    A dual function F : Rm → R is one that satisfies F(β) ≤ φ(β) for all β ∈ Rm. How do we select such a function? We may choose one that is easy to construct/evaluate or for which F(b) ≈ φ(b). This results in the following generalized dual associated with the base instance (MILP).

    max {F(b) : F(β) ≤ φ(β) ∀β ∈ Rm, F ∈ Υm}        (D)

    where Υm ⊆ {f | f : Rm → R}. We call F∗ strong for this instance if F∗ is a feasible dual function and F∗(b) = φ(b). This dual instance always has a solution F∗ that is strong if the value function is bounded and Υm ≡ {f | f : Rm → R}. Why?

  • Dual Functions from Branch and Bound

    Recall that a dual function F : Rm → R is one that satisfies F(β) ≤ φ(β) for all β ∈ Rm. Observe that any branch-and-bound tree yields a lower approximation of the value function.

  • Dual Functions from Branch-and-Bound [Wolsey, 1981]

    Let T be the set of terminating nodes of the tree. At a terminating node t ∈ T we solve:

    φt(β) = min c>x

    s.t. Ax = β, lt ≤ x ≤ ut, x ≥ 0                 (9)

    The dual at node t (with πl and πu the duals of the lower and upper bounds):

    φt(β) = max πtβ + πtl lt + πtu ut

    s.t. πtA + πtl + πtu ≤ c>, πtl ≥ 0, πtu ≤ 0     (10)

    We obtain the following strong dual function:

    mint∈T {π̂tβ + π̂tl lt + π̂tu ut},                 (11)

    where (π̂t, π̂tl, π̂tu) is an optimal solution to the dual (10).
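Evaluating the strong dual function (11) is just taking a minimum of affine functions, one per terminating node. A tiny sketch with made-up leaf data (the slopes and constants are illustrative, not derived from any particular instance):

```python
# each leaf t contributes an affine piece: slope = constraint duals pi_t,
# intercept = pi_l^t * l^t + pi_u^t * u^t collapsed into a single constant
leaves = [(0.5, 0.0), (-1.0, 2.0), (2.0, -3.0)]   # hypothetical (slope, intercept)

def F(beta):
    # formula (11): minimum over the terminating nodes
    return min(slope * beta + intercept for slope, intercept in leaves)
```

The minimum of affine functions is concave and piecewise linear, which is why a single tree yields a well-structured lower approximation of the (non-convex) value function.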

  • Iterative Refinement

    The tree obtained from evaluating φ(β) yields a dual function strong at β. By solving for other right-hand sides, we obtain additional dual functions that can be aggregated. These additional solves can be done within the same tree, eventually yielding a single tree representing the entire function.

    [Figure: two branch-and-bound trees. In the first, Node 0 branches on x2 into Node 1 (x2 = 0) and Node 2 (x2 ≥ 1). In the second, Node 2 is further branched into Node 3 (x2 = 1) and Node 4 (x2 ≥ 2).]

  • Tree Representation of the Value Function

    Continuing the process, we eventually generate the entire value function. Consider the strengthened dual

    φ∗(β) = mint∈T { q>It ytIt + φtN\It (β − WIt ytIt) },  (12)

    where It is the set of indices of fixed variables and ytIt are the values of the corresponding variables in node t; φtN\It is the value function of the linear optimization problem at node t, including only the unfixed variables.

    Theorem 3. [Hassanzadeh and Ralphs, 2014] Under the assumption that {β ∈ Rm2 | φI(β) [. . .]

  • Example of Value Function Tree

    [Figure: branch-and-bound tree representing the value function. The root branches on y3. Fixing y3 = 1, . . . , 5 yields leaf functions max{β + 5, −2β − 1}, max{β + 10, −2β − 2}, max{β + 15, −2β − 3}, max{β + 20, −2β − 4}, and max{β + 25, −2β − 5}; y3 ≥ 6 yields β + 30. Under y3 = 0, branching on y2 yields: y2 = 0: max{−2β, β}; y2 = 1: max{−2β + 14, β − 1}; y2 = 2: max{2β + 28, β − 2}; y2 ≥ 3: −2β + 42.]

  • Correspondence of Nodes and Local Stability Regions


  • Describing the Value Function by Parametric Inequalities

    For an ILP, the value function can be obtained by a finite number of limited operations on elements of the RHS:

    (i) rational multiplication

    (ii) nonnegative combination

    (iii) rounding

    (iv) taking the maximum

    Operations (i)–(iii) generate the Chvátal functions; adding (iv) gives the Gomory functions.

  • Chvátal and Gomory Functions

    Let L^m = {f | f : R^m → R, f is linear}.

    Chvátal functions are the smallest set of functions C^m such that
        1. If f ∈ L^m, then f ∈ C^m.
        2. If f1, f2 ∈ C^m and α, β ∈ Q+, then αf1 + βf2 ∈ C^m.
        3. If f ∈ C^m, then ⌈f⌉ ∈ C^m.

    Gomory functions are the smallest set of functions G^m ⊆ C^m with the additional property that
        1. If f1, f2 ∈ G^m, then max{f1, f2} ∈ G^m.

    Theorem 4. For PILPs (r = n), if φ(0) = 0, then there is a g ∈ G^m such that g(d) = φ(d) for all d ∈ R^m with S(d) ≠ ∅.

    This result can be extended to MILPs by the addition of a correction term. The resulting form of the value function is called the Jeroslow Formula.
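    The closure operations above can be mimicked directly in code. A minimal sketch for m = 1 (illustrative only; the function names are our own):

```python
import math

def linear(alpha):
    """Base case: a linear function of the RHS (the set L^m)."""
    return lambda b: alpha * b

def combine(f1, f2, a1, a2):
    """Nonnegative combination a1*f1 + a2*f2 (a1, a2 >= 0)."""
    assert a1 >= 0 and a2 >= 0
    return lambda b: a1 * f1(b) + a2 * f2(b)

def round_up(f):
    """Rounding: if f is in C^m, so is ceil(f)."""
    return lambda b: math.ceil(f(b))

def take_max(f1, f2):
    """Taking the maximum: the extra operation defining G^m."""
    return lambda b: max(f1(b), f2(b))

c = round_up(linear(0.5))       # Chvátal function: ceil(b/2)
h = combine(c, c, 1, 1)         # still Chvátal: 2*ceil(b/2)
g = take_max(c, linear(0.0))    # Gomory function: max{ceil(b/2), 0}

print(c(7), h(7), g(-3))        # evaluate at sample RHS values
```

    Every function built this way stays within the corresponding class, mirroring the recursive definitions above.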


  • Gomory’s Procedure [Blair and Jeroslow, 1977a]

    There is a Chvátal function that is optimal to the subadditive dual of an ILP with RHS b ∈ Ω_IP and φ(b) > −∞.

    The procedure: in iteration k, we solve the following LP

        φ^{k−1}(β) = min  cx
                     s.t. Ax = β
                          Σ_{j=1}^n f_i(a_j) x_j ≥ f_i(β),  i = 1, ..., k − 1
                          x ≥ 0

    The kth cut, k > 1, depends on the RHS and is written as

        f_k(β) = ⌈ Σ_{i=1}^m λ^{k−1}_i β_i + Σ_{i=1}^{k−1} λ^{k−1}_{m+i} f_i(β) ⌉,

    where λ^{k−1} = (λ^{k−1}_1, ..., λ^{k−1}_{m+k−1}) ≥ 0.


  • Gomory’s Procedure (cont.)

    Assume that b ∈ Ω_IP, φ(b) > −∞, and the algorithm terminates after k + 1 iterations.
    If u^k is the optimal dual solution to the LP in the final iteration, then

        F_k(β) = Σ_{i=1}^m u^k_i β_i + Σ_{i=1}^k u^k_{m+i} f_i(β)

    is a Chvátal function with F_k(b) = φ(b); furthermore, it is optimal to the subadditive dual problem.
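    As a sanity check of this claim, consider a toy pure ILP (our own example, not from the slides) whose value function is exactly a Gomory function of the RHS:

```python
import math

# Toy pure ILP:  phi(b) = min{ x1 : 2*x1 - x2 = b, x1, x2 in Z_+ }.
# Its value function is the Gomory function max{ceil(b/2), 0}, illustrating
# that the optimal subadditive dual value can be expressed as a
# Chvátal/Gomory function of the RHS.

def phi_bruteforce(b, limit=100):
    """Enumerate x1; x2 is forced by the equality constraint."""
    for x1 in range(limit):
        x2 = 2 * x1 - b
        if x2 >= 0:          # feasibility of x2 in Z_+
            return x1        # smallest feasible x1 is optimal
    return None

def gomory(b):
    return max(math.ceil(b / 2), 0)

for b in range(-4, 9):
    assert phi_bruteforce(b) == gomory(b)
print("phi(b) = max{ceil(b/2), 0} on [-4, 8]")
```

    The brute-force value and the closed-form Gomory function agree for every integer RHS tested.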


  • Branch-and-Cut Method

    We have seen that it is easy to get a strong dual function from branch-and-bound.
    Note, however, that it is not subadditive in general.
    To obtain a subadditive function, we can include the variable bounds explicitly as constraints, but then the function may not be strong.
    For branch-and-cut, we also have to take care of the cuts.

    Case 1: We know the subadditive representation of each cut.
    Case 2: We know the RHS dependency of each cut.
    Case 3: Otherwise, we can use some proximity results or the variable bounds.
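    Since the dual function from branch-and-bound need not be subadditive, one can probe a candidate numerically. The sketch below (a hypothetical function of our own, not from the slides) searches for violations of F(a + b) ≤ F(a) + F(b):

```python
# A dual function built as a minimum of affine pieces (the branch-and-bound
# form) can fail subadditivity, e.g. whenever F(0) < 0.

def F(beta):
    """Minimum of two affine pieces; a valid bound, but not subadditive."""
    return min(2 * beta - 1, -beta + 6)

def subadditive_violations(points):
    """Return all pairs (a, b) with F(a + b) > F(a) + F(b)."""
    return [(a, b) for a in points for b in points
            if F(a + b) > F(a) + F(b) + 1e-9]

viol = subadditive_violations(range(-5, 6))
print("subadditivity violated at", len(viol), "pairs, e.g.", viol[0])
```

    Here F(0) = −1 < 0, so already F(0 + 0) > F(0) + F(0), confirming the function is not subadditive.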


  • Case 1

    If we know the subadditive representation of each cut, then at a node t, we solve the LP relaxation of the following problem:

        φ_t(b) = min  cx
                 s.t. Ax ≥ b
                      x ≥ l_t
                     −x ≥ −g_t
                      H_t x ≥ h_t
                      x ∈ Z^r_+ × R^{n−r}_+

    where g_t, l_t ∈ R^r are the branching bounds applied to the integer variables and H_t x ≥ h_t is the set of added cuts, each of the form

        Σ_{j∈I} F_{tk}(a^k_j) x_j + Σ_{j∈N\I} F̄_{tk}(a^k_j) x_j ≥ F_{tk}(σ_k(b)),   k = 1, ..., ν(t),

    where
        ν(t) is the number of cuts generated so far,
        a^k_j, j = 1, ..., n, are the columns of the problem from which the kth cut is constructed,
        σ_k(b) is the mapping of b to the RHS of the corresponding problem.


  • Case 1 (cont.)

    Let T be the set of leaf nodes and let u^t, ū^t, û^t, and w^t be the dual feasible solution used to prune t ∈ T. Then

        F(β) = min_{t∈T} { u^t β + ū^t l_t − û^t g_t + Σ_{k=1}^{ν(t)} w^t_k F_{tk}(σ_k(β)) }

    is an optimal dual function, that is, φ(b) = F(b).
    Again, we obtain a subadditive function if the variables are bounded.
    However, we may not know the subadditive representation of each cut.


  • Methods for Constructing Dual Functions

    There is a wide range of other methods for constructing dual functions, arising mainly from other solution algorithms.

    Explicit construction
        The Value Function ⇒ discussed today
        Generating Functions

    Relaxations
        Lagrangian Relaxation
        Quadratic Lagrangian Relaxation
        Corrected Linear Dual Functions

    Primal Solution Algorithms
        Cutting Plane Method ⇒ discussed today
        Branch-and-Bound Method ⇒ discussed today
        Branch-and-Cut Method ⇒ discussed today


  • Representing/Embedding the Approximations

    In practice, we generally want to embed these approximations in other optimization problems, and doing this in a computationally efficient way is difficult.

    1. The primal bounding functions we discussed can be represented by points of strict local convexity.
       Embedding the approximation using this representation involves explicitly listing these points and choosing one (binary variables).
       The corresponding continuous part of the function can be generated dynamically or can also be represented explicitly by dual extreme points.

    2. The dual bounding functions must generally be represented explicitly in terms of their polyhedral pieces.
       In this case, the points of strict local convexity are implicit, and the selection is of the relevant piece or pieces.
       This yields a much larger representation.


  • Outline

    1 Introduction

    2 Value Functions
          (Continuous) Linear Optimization
          Discrete Optimization

    3 Dual Problems
          Dual Functions
          Subadditive Dual

    4 Approximating the Value Function
          Primal Bounding Functions
          Dual Bounding Functions

    5 Related Methodologies
          Warm Starting
          Sensitivity Analysis

    6 Conclusions


  • Warm Starting

    Many optimization algorithms can be viewed as iterative procedures for satisfying optimality conditions (based on duality).
    These conditions provide a measure of "distance from optimality."
    Warm-starting information is additional input data that allows an algorithm to quickly get "close to optimality."
    In mixed integer linear optimization, the duality gap is the usual measure.
    As in linear programming, a feasible dual function may quickly reduce the gap.

    What is a feasible dual function, and where do we get one?


  • Valid Disjunctions

    Consider the implicit optimality conditions employed in branch and bound.
    Let P_1, ..., P_s be a set of polyhedra whose union contains the feasible set and which differ from P only in the variable bounds.
    Let B_i be the optimal basis for the LP min_{x^i ∈ P_i} c⊤x^i.
    Then the following is a valid dual function:

        L(β) = min{ c_{B_i} (B_i)^{−1} β + γ_i | 1 ≤ i ≤ s },

    where γ_i is a constant term associated with the nonbasic variables fixed at nonzero bounds.
    A similar function yields an upper bound.
    If this disjunction is the set of leaf nodes of a branch-and-bound tree, it can be used to "warm start" the computation.
    Alternatively, we can use this disjunction to strengthen the root relaxation in some way (disjunctive cuts, etc.).
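    A minimal sketch of evaluating L(β), assuming a toy instance and hand-derived dual data (both hypothetical, not from the slides): for z(β) = min{x : 2x ≥ β, x ∈ Z_+} and the disjunction x ≤ 1 versus x ≥ 2, each leaf LP contributes an affine piece, and their minimum remains a valid lower bound as β varies.

```python
import math

# Each leaf LP contributes an affine piece (slope, constant), where the
# slope plays the role of cB_i (B_i)^{-1} and the constant is gamma_i.
PIECES = [
    (0.5, 0.0),  # LP over {2x >= beta, 0 <= x <= 1}: x = beta/2 is basic
    (0.0, 2.0),  # LP over {2x >= beta, x >= 2}: x sits at its bound of 2
]

def L(beta):
    """Lower-bounding dual function: minimum of the affine pieces."""
    return min(slope * beta + gamma for slope, gamma in PIECES)

def z(beta):
    """True value function of the toy ILP  min{x : 2x >= beta, x in Z_+}."""
    return max(math.ceil(beta / 2), 0)

# The bound stays valid as the RHS beta is perturbed.
for beta in range(-3, 11):
    assert L(beta) <= z(beta)
print("L(beta) is a valid lower bound on z(beta) for beta in [-3, 10]")
```

    Because each piece is dual feasible for its leaf LP, the minimum is a globally valid lower bound, which is what makes it usable for warm starting.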


  • Outline

    1 Introduction

    2 Value Functions
          (Continuous) Linear Optimization
          Discrete Optimization

    3 Dual Problems
          Dual Functions
          Subadditive Dual

    4 Approximating the Value Function
          Primal Bounding Functions
          Dual Bounding Functions

    5 Related Methodologies
          Warm Starting
          Sensitivity Analysis

    6 Conclusions


  • Sensitivity Analysis

    Primal and dual bounding functions can be evaluated with modified problem data to obtain bounds on the optimal value in the obvious way.
    In the case of a branch-and-bound tree, the function

        L(β) = min{ c_{B_i} (B_i)^{−1} β + γ_i | 1 ≤ i ≤ s }

    provides a valid lower bound as a function of the right-hand side.
    The corresponding upper bounding function is

        U(c) = min{ c_{B_i} (B_i)^{−1} b + γ_i | 1 ≤ i ≤ s, x̂^i ∈ S }.

    These functions can be used for local sensitivity analysis, just as one would do in continuous linear optimization.

        For changes in the right-hand side, the lower bound remains valid.
        For changes in the objective function, the upper bound remains valid.
        One can also make other modifications, such as adding variables or constraints.


  • Conclusions

    It is possible to generalize the duality concepts that are familiar to us from continuous linear optimization.
    Making any of it practical is difficult, but we will see in the next lectures that this is possible in some cases.


  • References I

    C.E. Blair. A closed-form representation of mixed-integer program value functions. Mathematical Programming, 71:127–136, 1995.

    C.E. Blair and R.G. Jeroslow. The value function of a mixed integer program: I. Discrete Mathematics, 19(2):121–138, 1977a.

    C.E. Blair and R.G. Jeroslow. The value function of a mixed integer program: II. Discrete Mathematics, 25:7–19, 1979.

    C.E. Blair and R.G. Jeroslow. The value function of an integer program. Mathematical Programming, 23:237–273, 1982.

    C.E. Blair and R.G. Jeroslow. Constructive characterization of the value function of a mixed-integer program: I. Discrete Applied Mathematics, 9:217–233, 1984.

    C.E. Blair and R.G. Jeroslow. Constructive characterization of the value function of a mixed-integer program: II. Discrete Applied Mathematics, 10:227–240, 1985.


  • References II

    G. Gamrath, B. Hiller, and J. Witzig. Reoptimization techniques for MIP solvers. In Proceedings of the 14th International Symposium on Experimental Algorithms, 2015.

    M. Güzelsoy. Dual Methods in Mixed Integer Linear Programming. PhD thesis, Lehigh University, 2009. URL http://coral.ie.lehigh.edu/~ted/files/papers/MenalGuzelsoyDissertation09.pdf.

    M. Güzelsoy and T.K. Ralphs. Duality for mixed-integer linear programs. International Journal of Operations Research, 4:118–137, 2007. URL http://coral.ie.lehigh.edu/~ted/files/papers/MILPD06.pdf.

    M. Güzelsoy and T.K. Ralphs. The value function of a mixed-integer linear program with a single constraint. Technical report, COR@L Laboratory, Lehigh University, 2008. URL http://coral.ie.lehigh.edu/~ted/files/papers/ValueFunction.pdf.



  • References III

    A. Hassanzadeh and T.K. Ralphs. On the value function of a mixed integer linear optimization problem and an algorithm for its construction. Technical report, COR@L Laboratory Report 14T-004, Lehigh University, 2014. URL http://coral.ie.lehigh.edu/~ted/files/papers/MILPValueFunction14.pdf.

    R.G. Jeroslow. Minimal inequalities. Mathematical Programming, 17(1):1–15, 1979.

    E.L. Johnson. Cyclic groups, cutting planes and shortest paths. Mathematical Programming, pages 185–211, 1973.

    E.L. Johnson. On the group problem for mixed integer programming. In Approaches to Integer Programming, pages 137–179. Springer, 1974.

    E.L. Johnson. On the group problem and a subadditive approach to integer programming. Annals of Discrete Mathematics, 5:97–112, 1979.

    N. Kong, A.J. Schaefer, and B. Hunsaker. Two-stage integer programs with stochastic right-hand sides: a superadditive dual approach. Mathematical Programming, 108(2):275–296, 2006.



  • References IV

    T.K. Ralphs and M. Güzelsoy. The SYMPHONY callable library for mixed-integer linear programming. In Proceedings of the Ninth INFORMS Computing Society Conference, pages 61–76, 2005. doi: 10.1007/0-387-23529-9_5. URL http://coral.ie.lehigh.edu/~ted/files/papers/SYMPHONY04.pdf.

    T.K. Ralphs and M. Güzelsoy. Duality and warm starting in integer programming. In The Proceedings of the 2006 NSF Design, Service, and Manufacturing Grantees and Research Conference, 2006. URL http://coral.ie.lehigh.edu/~ted/files/papers/DMII06.pdf.

    L.A. Wolsey. Integer programming duality: Price functions and sensitivity analysis. Mathematical Programming, 20(1):173–195, 1981. ISSN 0025-5610.

