How to use your favorite MIP Solver: modeling, solving, cannibalizing
Andrea Lodi, University of Bologna, Italy
January-February, 2012 @ Universität Wien
A. Lodi, How to use your favorite MIP Solver
Setting
• We consider a general Mixed Integer Program in the form:
$\max\{c^T x : Ax \le b,\ x \ge 0,\ x_j \in \mathbb{Z}\ \ \forall j \in I\}$ (1)
where matrix A does not have a special structure.
• Thus, the problem is solved through branch-and-bound and the bounds are computed by
iteratively solving the LP relaxations through a general-purpose LP solver.
• The course basically covers MIP, but we will try to discuss, when possible, how crucial the
LP component (the engine) is, and how much the whole framework is built on top of the capability
of effectively solving LPs.
• Roughly speaking, using the LP computation as a tool, MIP solvers integrate the
branch-and-bound and the cutting plane algorithms through variations of the general
branch-and-cut scheme [Padberg & Rinaldi 1987] developed in the context of the Traveling
Salesman Problem (TSP).
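The interplay between branching and LP bounds described above can be sketched on a toy case. The example below is an illustration only, not the algorithm of any particular solver: it assumes a 0-1 knapsack problem, whose LP relaxation can be solved greedily instead of by a general-purpose LP solver. The function names `lp_bound` and `branch_and_bound` and the data are hypothetical.

```python
def lp_bound(values, weights, capacity, fixed):
    """LP relaxation of a 0-1 knapsack with some variables fixed to 0 or 1.

    Returns (bound, fractional_index). The bound is None if the fixings are
    infeasible; fractional_index is None if the LP optimum is integral.
    Assumes strictly positive weights.
    """
    total = 0.0
    remaining = capacity
    for j, value in fixed.items():
        if value == 1:
            total += values[j]
            remaining -= weights[j]
    if remaining < 0:
        return None, None
    # Greedy by value/weight ratio solves the knapsack LP relaxation.
    free = [j for j in range(len(values)) if j not in fixed]
    free.sort(key=lambda j: values[j] / weights[j], reverse=True)
    for j in free:
        if weights[j] <= remaining:
            total += values[j]
            remaining -= weights[j]
        elif remaining > 0:
            # First item that does not fit is taken fractionally.
            return total + values[j] * remaining / weights[j], j
        else:
            break
    return total, None


def branch_and_bound(values, weights, capacity):
    """Depth-first branch-and-bound for a maximization knapsack."""
    best_value = 0.0          # the empty knapsack is always feasible
    nodes = 0
    stack = [{}]              # each node is a dict of variable fixings
    while stack:
        fixed = stack.pop()
        nodes += 1
        bound, frac = lp_bound(values, weights, capacity, fixed)
        if bound is None or bound <= best_value:
            continue          # prune by infeasibility or by bound
        if frac is None:
            best_value = bound  # integral LP solution: new incumbent
            continue
        stack.append({**fixed, frac: 0})
        stack.append({**fixed, frac: 1})
    return best_value, nodes


best, explored = branch_and_bound([60, 100, 120], [10, 20, 30], 50)
print(best, explored)
```

A general MIP solver replaces the greedy bound with a simplex solve warm-started from the parent node, which is why the LP engine matters so much in what follows.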
Outline
1. The building blocks of a MIP solver.
We will run over the first 50 exciting years of MIP by showing some crucial milestones, and we
will highlight the building blocks that make today's solvers effective from both a
performance and an application viewpoint.
2. How to use a MIP solver as a sophisticated (heuristic) framework.
Nowadays MIP solvers should not be conceived as black-box exact tools. In fact, they provide
countless options for their smart use as hybrid algorithmic frameworks, which might turn
out to be especially interesting in applied contexts. We will review some of those options and
possible hybridizations, including some real-world applications.
3. Modeling and algorithmic tips to make a solver effective in practice.
The capability of a solver to produce good, potentially optimal, solutions depends on the
selection of the right model and the use of the right algorithmic tools the solver provides. We
will discuss useful tips, from simple to sophisticated, which allow a smart use of a MIP solver.
Finally, we will show that this is NOT the end of the story and many challenges for MIP
technology are still to be faced.
PART 3
1. The building blocks of a MIP solver
2. How to use a MIP solver as a sophisticated (heuristic) framework
3. Modeling and algorithmic tips to make a solver effective in practice
• Outline:
– Solving difficult MIPs [heavily based on Klotz & Newman 2011, PART-II-2.pdf]
1. Lack of node throughput because of LP solving
2. Lack of progress in the best (mixed-)integer solution
3. Lack of progress in the best bound
4. Lack of node throughput because of numerical instability
– MIP Challenges
Solving difficult MIPs: four common reasons
• There are four common reasons that integer programs can require a significant amount of
solution time.
1. There is a lack of node throughput due to troublesome linear programming node solves.
2. There is a lack of progress in the best integer solution, i.e., the lower bound for a
maximization problem such as (1).
3. There is a lack of progress in the best bound, i.e., the upper bound.
4. There is insufficient node throughput due to numerical instability in the problem data or
excessive memory usage.
Lack of node throughput because of LP solving
... solution. Therefore, alternate optimal bases can result in different branching variable selections.
Different branching selections, in turn, can cause significant performance variation if the model
formulation or optimizer features are not sufficiently robust to consistently solve the model quickly.
This notion of performance variability in integer programs is discussed in more detail in Danna
(2008). However, regardless of whether an integer program is consistently or only occasionally
difficult to solve, the guidelines described in this section can help address the performance problem.
We now discuss each potential performance bottleneck and suggest an associated remedy.

3.1 Lack of Node Throughput Due to Troublesome Linear Programming Node Solves

Because processing each node in the branch-and-bound tree requires the solution of a linear
program, the choice of a linear programming algorithm can profoundly influence performance. An
interior point method may be used for the root node solve; it is less frequently used than the
simplex method at the child nodes because it lacks a basis and hence, the ability to start with
an initial solution - an important ability when processing tens or hundreds of thousands of nodes.
However, conducting different runs in which the practitioner invokes, alternately, the primal or the
dual simplex method at the child nodes is a good idea. Consider the following two node logs, the
former corresponding to solving the root and child node linear programs with the dual simplex
method and the latter with the primal simplex method.

Node Log #1: Node Linear Programs Solved with Dual Simplex
         Nodes                                           Cuts/
   Node   Left     Objective  IInf  Best Integer     Best Node    ItCnt
      0      0      -89.0000     6                    -89.0000     5278
      0      0      -89.0000     6                    Fract: 4    12799
      0      2      -89.0000     6                    -89.0000    12799
      1      1    infeasible                          -89.0000    20767
      2      2      -89.0000     5                    -89.0000    27275
      3      1    infeasible                          -89.0000    32502
    ...
      8      2      -89.0000     8                    -89.0000    65717
      9      1    infeasible                          -89.0000    73714
    ...
Solution time = 177.33 sec. Iterations = 73714 Nodes = 10 (1)
Lack of node throughput because of LP solving (cont.d)
Node Log #2: Node Linear Programs Solved with Primal Simplex
         Nodes                                           Cuts/
   Node   Left     Objective  IInf  Best Integer     Best Node    ItCnt
      0      0      -89.0000     5                    -89.0000     6603
      0      0      -89.0000     5                    Fract: 5     7120
      0      2      -89.0000     5                    -89.0000     7120
      1      1    infeasible                          -89.0000     9621
      2      2      -89.0000     5                    -89.0000    10616
      3      1    infeasible                          -89.0000    12963
    ...
      8      2      -89.0000     8                    -89.0000    21522
      9      1    infeasible                          -89.0000    23891
    ...
Solution time = 54.37 sec. Iterations = 23891 Nodes = 10 (1)

The iteration count for the root node solve shown in Node Log #1, which occurred without
any advanced start information, is 5,278 iterations. Computing the average iteration count
across all node LP solves, there are 11 solves (10 nodes, and 1 extra solve for cut generation at node
0) and 73,714 iterations, which were performed in a total of 177 seconds. The summary line at the
end of the log indicates in parentheses that one unexplored node remains. So, the average solution
time per solve is approximately 16 seconds, and the average number of iterations per solve is about 6,701.
In Node Log #2, the solution time is 54 seconds, at which point the algorithm has performed 11
solves, and the iteration count is 23,891. Hence, the average number of iterations per solve is about
2,172. Thus, in Node Log #1, the 10 child node LPs require more iterations, 6,844, on average,
than the root node LP (which requires 5,278), despite the advanced basis at the child node solves
that was absent at the root node solve. Any time this is true, or even when the average node LP
iteration count is more than 30-50% of the root node iteration count, an opportunity for improving
node LP solve times exists by changing algorithms or algorithmic settings. In Node Log #2, the
10 child node LPs require 1,729 iterations, on average, which is far fewer than the 6,603 required by
the root node LP.
Lack of node throughput because of LP solving (cont.d)
• Summary of the dual simplex output:
– 5,278 iterations for the root solve, without any advanced start information,
– overall, 73,714 iterations and 177 seconds for 11 solves, i.e., approximately 6,701
iterations and 16 seconds per solve.
• Summary of the primal simplex output:
– 6,603 iterations for the root solve, without any advanced start information,
– overall, 23,891 iterations and 54 seconds for 11 solves, i.e., approximately 2,172
iterations and 5 seconds per solve.
• Thus, with the dual simplex, each solve after the first one requires on average 6,844 ≈ (73,714 -
5,278)/10 iterations, which is higher than the 5,278 iterations required for the first solve!
• Any time this is true, or even when the average node LP iteration count is more than 30-50%
of the root node iteration count, an opportunity for improving node LP solve times exists by
changing algorithms or algorithmic settings.
• Indeed, in the primal simplex case, each additional solve requires on average only 1,729
iterations, far fewer than the 6,603 of the first solve.
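The averages on this slide follow from three numbers per log: the root iteration count, the final iteration count, and the number of LP solves. The sketch below recomputes them; the function name `iteration_stats` is hypothetical, and using 0.30 (the lower end of the 30-50% range) as the warning threshold is an assumption made for illustration.

```python
def iteration_stats(root_iters, total_iters, n_solves):
    """Return (avg iterations per solve, avg per child solve, child/root ratio)."""
    avg_per_solve = total_iters / n_solves
    avg_child = (total_iters - root_iters) / (n_solves - 1)
    return avg_per_solve, avg_child, avg_child / root_iters


THRESHOLD = 0.30  # assumed: lower end of the 30-50% rule of thumb

# Values read from Node Log #1 (dual simplex) and Node Log #2 (primal simplex).
dual = iteration_stats(root_iters=5278, total_iters=73714, n_solves=11)
primal = iteration_stats(root_iters=6603, total_iters=23891, n_solves=11)

for name, (per_solve, child, ratio) in (("dual", dual), ("primal", primal)):
    flag = "worth investigating" if ratio > THRESHOLD else "looks fine"
    print(f"{name}: {per_solve:.0f} per solve, {child:.0f} per child solve, "
          f"child/root = {ratio:.2f} ({flag})")
```

With these inputs the dual simplex ratio is above 1, while the primal simplex ratio stays below the assumed threshold.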
Lack of progress in the best (mixed-)integer solution
... (or even all) of the penalty variables set to nonzero values.
• Solve a related, auxiliary problem to get a solution (e.g. the Feasopt method in CPLEX,
which looks for feasible solutions by minimizing infeasibilities), provided that the gain from
the starting solution exceeds the auxiliary solve time.
• Use the solution from a previous solve for the next solve when solving a sequence of models.
To see the advantages of providing a starting point, compare Node Log #5 with Node Log
#4. Log #4 shows that CPLEX with default settings takes about 1589 seconds to find a first
feasible solution, with an associated gap of 4.18%. Log #5 illustrates the results obtained by
solving a sequence of 5 faster optimizations (see Lambert et al. (2011) for details) to obtain a
starting solution with a gap of 2.23%. The total computation time to obtain the starting solution
was 623 seconds. So, providing an initial feasible solution shortens the time to obtain the first
solution, and if we let the algorithm with the initial solution run for an additional 1589 - 623 = 966
seconds, the gap for the instance with the initial solution improves to 1.53%.

Node Log #4: No initial practitioner-supplied solution
Root relaxation solution time = 131.45 sec.
         Nodes                                           Cuts/
   Node   Left     Objective  IInf  Best Integer     Best Node    ItCnt     Gap
      0      0   1.09590e+07  2424                 1.09590e+07   108111
      0      0   1.09570e+07  2531                     Cuts: 4   108510
      0      0   1.09405e+07  2476                     Cuts: 2   109208
Heuristic still looking.
Heuristic still looking.
Heuristic still looking.
Heuristic still looking.
Heuristic still looking.
      0      2   1.09405e+07  2476                 1.09405e+07   109208
Elapsed real time = 384.09 sec. (tree size = 0.01 MB)
      1      3   1.08913e+07  2488                 1.09405e+07   109673
      2      4   1.09261e+07  2326                 1.09405e+07   109977
    ...
   1776   1208   1.05645e+07    27                 1.09164e+07   474242
   1814   1246   1.05588e+07    31                 1.09164e+07   478648
   1847   1277   1.05554e+07   225                 1.09164e+07   484687
*  1880+  1300                       1.04780e+07   1.09164e+07   491469   4.18%
   1880   1302   1.05474e+07   228   1.04780e+07   1.09164e+07   491469   4.18%
Elapsed real time = 1589.38 sec. (tree size = 63.86 MB)

Node Log #5: An initial solution supplied by the practitioner
Root relaxation solution time = 93.92 sec.
         Nodes                                           Cuts/
   Node   Left     Objective  IInf  Best Integer     Best Node    ItCnt     Gap
*     0+     0                       1.07197e+07                 108111     ---
      0      0   1.09590e+07  2424   1.07197e+07   1.09590e+07   108111   2.23%
      0      0   1.09570e+07  2531   1.07197e+07       Cuts: 4   108538   2.21%
    ...
    485    433   1.09075e+07  2398   1.07197e+07   1.08840e+07   244077   1.53%
    487    434   1.08237e+07  2303   1.07197e+07   1.08840e+07   244350   1.53%
    497    439   1.08637e+07  1638   1.07197e+07   1.08840e+07   245391   1.53%
Elapsed real time = 750.11 sec. (tree size = 32.61 MB)
    501    443   1.08503e+07  1561   1.07197e+07   1.08840e+07   245895   1.53%
    ...
Elapsed real time = 984.03 sec. (tree size = 33.00 MB)
   1263    674   1.08590e+07  2574   1.07197e+07   1.08840e+07   314814   1.53%

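The gaps quoted in Node Logs #4 and #5 can be re-derived from the Best Integer and Best Node columns. The sketch below assumes the usual relative-gap definition |best node - best integer| / (1e-10 + |best integer|); the function name `relative_gap` is hypothetical and the inputs are the values printed in the logs.

```python
def relative_gap(best_integer, best_node):
    """Relative MIP gap; the 1e-10 term guards against division by zero."""
    return abs(best_node - best_integer) / (1e-10 + abs(best_integer))


# Log #4: first incumbent found after about 1589 seconds.
gap_log4 = relative_gap(best_integer=1.04780e7, best_node=1.09164e7)

# Log #5: practitioner-supplied start, at the root and later in the tree.
gap_log5_root = relative_gap(best_integer=1.07197e7, best_node=1.09590e7)
gap_log5_late = relative_gap(best_integer=1.07197e7, best_node=1.08840e7)

for label, gap in (("Log #4", gap_log4),
                   ("Log #5 root", gap_log5_root),
                   ("Log #5 later", gap_log5_late)):
    print(f"{label}: {100 * gap:.2f}%")
```

In Log #5 the incumbent never changes, so the whole improvement from 2.23% to 1.53% comes from the best node bound moving.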
In the absence of a readily identifiable initial solution, various branching strategies can aid in
obtaining initial and subsequent solutions. These branching strategies may be purely based on the ...
Lack of progress in the best bound
$P_1$ represents the convex hull of all integer feasible solutions of the MIP, while $P_2$ represents the
feasible region of the LP relaxation. Adding cuts yields the region $P_3$, which contains all integer
solutions of the MIP, but contains only a subset of the fractional solutions feasible for $P_2$.
$P_1 := \mathrm{conv}\{x \in \mathbb{Z}^n : Ax \le b,\ x \ge 0\}$
$P_2 := \{x \in \mathbb{R}^n : Ax \le b,\ x \ge 0\}$
$P_3 := P_2 \cap \{x \in \mathbb{R}^n : a_i^T x \le b_i,\ i = 1, 2, 3\}$
Cuts must satisfy
1) $a_i^T x \le b_i \ \ \forall x \in P_1$ (validity)
2) $\exists\, x \in P_2 : a_i^T x > b_i$ (separation)
Figure 2: Convex hull ($P_1 \subseteq P_3 \subseteq P_2$, with $P_3$ cut out of $P_2$ by the three cuts $a_1^T x \le b_1$, $a_2^T x \le b_2$, $a_3^T x \le b_3$)
Node log #6 exemplifies progress in best integer solution but not in the best bound:

Node Log #6: Progress in Best Integer Solution but not in the Best Bound
         Nodes                                           Cuts/
   Node   Left     Objective  IInf  Best Integer     Best Node    ItCnt     Gap
    300    296     2018.0000    27     3780.0000      560.0000     3703  85.19%
*   300+   296                   0     2626.0000      560.0000     3703  78.67%
*   393    368                   0     2590.0000      560.0000     4405  78.38%
    400    372      560.0000   291     2590.0000      560.0000     4553  78.38%
    500    472      810.0000   175     2590.0000      560.0000     5747  78.38%
    ...
*  7740+  5183                   0     1710.0000      560.0000    66026  67.25%
   7800   5240     1544.0000   110     1710.0000      560.0000    66279  67.25%
   7900   5325      944.0000   176     1710.0000      560.0000    66801  67.25%
   8000   5424     1468.0000    93     1710.0000      560.0000    67732  67.25%

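The diagnosis in Node Log #6 can be made mechanical by comparing how far each side of the gap has moved. The sketch below is only an illustration on values transcribed from the log; the helper name `bound_progress` is hypothetical.

```python
# (node, best_integer, best_node) triples transcribed from Node Log #6.
LOG6 = [
    (300, 3780.0, 560.0),
    (300, 2626.0, 560.0),
    (393, 2590.0, 560.0),
    (7740, 1710.0, 560.0),
    (8000, 1710.0, 560.0),
]


def bound_progress(rows):
    """Return (incumbent change, best-bound change) between first and last row."""
    _, first_integer, first_bound = rows[0]
    _, last_integer, last_bound = rows[-1]
    return abs(last_integer - first_integer), abs(last_bound - first_bound)


incumbent_change, bound_change = bound_progress(LOG6)
print(f"incumbent moved by {incumbent_change}, best bound moved by {bound_change}")
```

Here the incumbent improves substantially over thousands of nodes while the best bound does not move at all, which is the pattern this section addresses.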
Lack of progress in the best bound (cont.d)
• Most solvers offer parameter settings that can help improve progress of the best node or
tighten the formulation of the model by moving the value of the linear programming relaxation
closer to that of the optimal integer objective.
In particular:
– Best-bound-first node selection.
– Strong branching.
– Probing.
– More aggressive levels of cut generation.
• If those mechanisms fail, then the user must carefully look at the model and
– Change the model formulation by using alternate variable definitions.
– Revisit the use of elastic/indicator variables, i.e., those relaxing a constraint by allowing for
violations (penalized in the objective function).
– Look for additional cutting planes, either within the standard families but not
discovered by the solver, or specific to the model at hand.
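Lack of progress in the best bound (cont.d)
• Let us consider again cut generation and the following small MIP:
\begin{align}
\max\quad & 3x_1 + 2x_2 + x_3 + 2x_4 + x_5 \tag{2}\\
\text{subject to}\quad & x_1 + x_2 \le 1 \tag{3}\\
& x_1 + x_3 \le 1 \tag{4}\\
& x_2 + x_3 \le 1 \tag{5}\\
& 4x_3 + 3x_4 + 5x_5 \le 10 \tag{6}\\
& x_1 + 3x_4 \le 2 \tag{7}\\
& 3x_2 + 4x_5 \le 5 \tag{8}\\
& x \in \{0, 1\}^5 \tag{9}
\end{align}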
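Since the model has only 32 binary points, it can be checked by plain enumeration. The sketch below does that and then tests one candidate cut against the validity and separation conditions of Figure 2. The clique inequality x1 + x2 + x3 ≤ 1, obtained by combining (3)-(5), is an assumption chosen for illustration and is not stated on the slide; all helper names are hypothetical.

```python
from itertools import product

OBJECTIVE = (3, 2, 1, 2, 1)
CONSTRAINTS = [            # (coefficients, right-hand side)
    ((1, 1, 0, 0, 0), 1),   # (3)
    ((1, 0, 1, 0, 0), 1),   # (4)
    ((0, 1, 1, 0, 0), 1),   # (5)
    ((0, 0, 4, 3, 5), 10),  # (6)
    ((1, 0, 0, 3, 0), 2),   # (7)
    ((0, 3, 0, 0, 4), 5),   # (8)
]


def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x))


def satisfies_all(x):
    return all(dot(a, x) <= b for a, b in CONSTRAINTS)


# Enumerate all binary points and keep the feasible ones.
feasible = [x for x in product((0, 1), repeat=5) if satisfies_all(x)]
best_x = max(feasible, key=lambda x: dot(OBJECTIVE, x))
best_value = dot(OBJECTIVE, best_x)

# Candidate cut (assumed for illustration): x1 + x2 + x3 <= 1.
CLIQUE_COEFFS, CLIQUE_RHS = (1, 1, 1, 0, 0), 1

# Validity: no integer feasible point violates the cut.
cut_is_valid = all(dot(CLIQUE_COEFFS, x) <= CLIQUE_RHS for x in feasible)

# Separation: a fractional point that satisfies (3)-(8) and the bounds
# 0 <= x <= 1 but violates the cut.
x_frac = (0.5, 0.5, 0.5, 0.0, 0.0)
cut_separates = satisfies_all(x_frac) and dot(CLIQUE_COEFFS, x_frac) > CLIQUE_RHS

print(best_x, best_value, cut_is_valid, cut_separates)
```

Enumeration is only viable because the example is tiny; the point is to see validity and separation checked on concrete numbers.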
Lack of progress in the best bound (cont.d)
• Adding cuts does not always help branch-and-bound performance.
• While adding cuts can remove integer infeasibilities, it also results in more constraints in each node LP.
• More constraints can make these LPs harder to solve. Without a commensurate
speed-up in solution time associated with processing fewer nodes, cuts may not be worth
adding.
• Some solvers have internal logic to automatically assess the trade-offs between adding cuts and
node LP solve time. However, if the solver lacks such logic or fails to make a good decision, the
user may need to look at the branch-and-bound output.
• In other cases, the computational effort required to derive the cuts needed to effectively solve
the model may exceed the performance benefit they provide.
Lack of node throughput because of numerical instability
• Because the solver solves LPs at each node of the branch-and-bound tree, the practitioner must
be careful to avoid LP numerical performance issues (see Section 3 of Klotz and Newman,
reading material PART-II-1.pdf).
• Especially important is avoiding, when possible, large differences in orders of magnitude in the data,
to preclude the introduction of unnecessary round-off error. Such differences in input values
create round-off error in floating point calculations, which makes it difficult for the algorithm to
distinguish between this error and a legitimate value [Koch et al. 2011].
• Elastic/indicator variables model logical expressions such as “if z = 0, then x = 0”, which are
imposed through arbitrarily large coefficients (referred to as “big-M's”):
$x - 100000000000\, z \le 0$ (10)
$0 \le x \le 5000$; $z$ binary (11)
• However, the coefficient $10^{11}$ can be safely and effectively replaced by 5000, which forbids a
solution in which $z = 10^{-8}$ (a value that a solver's integrality tolerance would typically accept
as 0) and $x = 1000$ from being feasible.
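The effect of the big-M value can be checked with a few lines of arithmetic. The sketch below assumes an integrality tolerance of 1e-5, a typical default that varies by solver and is not taken from the slides; the function name `max_x_allowed` is hypothetical.

```python
import math

INT_TOL = 1e-5     # assumed integrality tolerance (solver dependent)
X_UPPER = 5000.0   # upper bound on x from (11)


def max_x_allowed(big_m, z):
    """Largest x satisfying x - big_m * z <= 0 and 0 <= x <= X_UPPER."""
    return min(X_UPPER, big_m * z)


z = 1e-8                                   # within INT_TOL of 0, so treated as z = 0
loose = max_x_allowed(big_m=1e11, z=z)     # about 1000: "if z = 0 then x = 0" leaks
tight = max_x_allowed(big_m=X_UPPER, z=z)  # about 5e-5: x is effectively 0

print(f"z = {z:g} counts as integral: {z <= INT_TOL}")
print(f"big-M = 1e11 allows x up to {loose:g}")
print(f"big-M = 5000 allows x up to {tight:g}")
```

Using the variable's own upper bound as the coefficient keeps the logical implication intact at the tolerances a solver works with, and avoids mixing coefficients of very different magnitude.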
Tightening the formulation
• As anticipated, sometimes the user might be required to add problem-specific cutting planes.
However, before doing that, it is often useful to identify elements of the model making it
difficult, specifically, those that contain the constraints and variables from which useful cuts can
be derived.
– Simplify the model if necessary.
For example, try to identify any constraints or integrality restrictions that are not involved in
the slow performance by systematically removing constraints and restrictions and solving the
resulting model.
– Identify the constraints that prevent the objective from improving.
With a maximization problem, this typically means identifying the constraints that force
prizes not to be gained.
– Determine how removing integrality restrictions allows the root node relaxation to improve.
In weak formulations, the root node relaxation objective tends to be significantly better than
the optimal objective of the associated MIP. The variables with fractional solutions in the
root node relaxation help identify the constraints and variables that motivate additional cuts.
A. Lodi, How to use your favorite MIP Solver 15
Tightening the formulation (cont.d)
• Model characteristics from which to derive cuts are:
– Linear or logical combinations of constraints.
As discussed, combining constraints is the basis of current cut generation techniques.
Knowledge of the problem at hand can suggest which constraints should be combined.
– The optimization of one or more related models.
Extracting a small(er) instance with the same characteristics as the problem at hand to play with
is often very instructive.
– Use of the incumbent solution objective value.
Template cuts are based on detecting infeasibilities, while optimality cuts might lead in some
special cases to effective partitions of the solution space.
– Disjunctions.
– The exploitation of infeasibility.
Infeasibility considerations on the model might allow one to remove useless pieces of the search
tree.
A. Lodi, How to use your favorite MIP Solver 16
Tightening the formulation, Example 1
• The very small MIP
13429x1 + 26850x2 + 26855x3 + 40280x4 +
40281x5 + 53711x6 + 53714x7 + 67141x8 = 45094583 (12)
xj ≥ 0, integer, j = 1, . . . , 8 (13)
presents the following (disappointing) computational behavior:
A. Lodi, How to use your favorite MIP Solver 17
Tightening the formulation, Example 1 (cont.d)
Running CPLEX 12.2.0.2 with default settings results in no conclusion after over 7 hours and
2 billion nodes, as illustrated in Node Log #7:

Node Log #7
              Nodes                                         Cuts/
       Node    Left   Objective  IInf  Best Integer     Best Node     ItCnt      Gap
...
 2054970910   13066      0.0000     1                      0.0000  25234328
Elapsed real time = 27702.98 sec. (tree size = 2.70 MB, solutions = 0)
 2067491472   14446      0.0000     1                      0.0000  25388082
 2080023238   12892      0.0000     1                      0.0000  25542160
 2092548561   15366      0.0000     1                      0.0000  25696280
...
-------
Total (root+branch&cut) = 28302.29 sec.

MIP - Node limit exceeded, no integer solution.
Current MIP best bound = 0.0000000000e+00 (gap is infinite)
Solution time = 28302.31 sec. Iterations = 25787898 Nodes = 2100000004 (16642)

However, note that all the coefficients in the model are very close to integer multiples of the
coefficient of x1. Therefore, we can separate the left hand side into the part that is an integer
multiple of this coefficient, and the much smaller remainder terms:
\begin{align}
& 13429\,\underbrace{(x_1 + 2x_2 + 2x_3 + 3x_4 + 3x_5 + 4x_6 + 4x_7 + 5x_8)}_{x} \tag{19}\\
& \quad - 8x_2 - 3x_3 - 7x_4 - 6x_5 - 5x_6 - 2x_7 - 4x_8 \tag{20}\\
& \quad = 3358 \cdot 13429 + 1 = 3359 \cdot 13429 - 13428 \tag{21}
\end{align}
This constraint resembles the one from which we previously derived the mixed integer rounding
cut. But, instead of separating the integer and fractional components, we separate the components
A. Lodi, How to use your favorite MIP Solver 18
Tightening the formulation, Example 1 (cont.d)
• The behavior is then improved by the addition of the two cuts
x1 + 2x2 + 2x3 + 3x4 + 3x5 + 4x6 + 4x7 + 5x8 ≥ 3359 (14)
8x2 + 3x3 + 7x4 + 6x5 + 5x6 + 2x7 + 4x8 ≥ 13428 (15)
that are exact multiples of the coefficient of x1 from the remaining terms. We now perform the
disjunction on x in an analogous manner, again using the nonnegativity of the variables.
\[
x \le 3358 \;\Rightarrow\; \underbrace{-8x_2 - 3x_3 - 7x_4 - 6x_5 - 5x_6 - 2x_7 - 4x_8}_{\le 0} \ge 1 \tag{22}
\]
Thus, if x ≤ 3358, the model is infeasible. Therefore, infeasibility implies that x ≥ 3359 is a
valid cut. We can derive an additional cut from the other side of the disjunction on x:
\[
x \ge 3359 \;\Rightarrow\; -8x_2 - 3x_3 - 7x_4 - 6x_5 - 5x_6 - 2x_7 - 4x_8 \le -13428 \tag{23}
\]
This analysis shows that we either have an infeasible model, or that constraints (24) (using the
infeasibility argument above) and (25) (multiplying (23) through by −1) are globally valid cuts.
x1 + 2x2 + 2x3 + 3x4 + 3x5 + 4x6 + 4x7 + 5x8 ≥ 3359 (24)
8x2 + 3x3 + 7x4 + 6x5 + 5x6 + 2x7 + 4x8 ≥ 13428 (25)
Adding these cuts enables CPLEX 12.2.0.2 to easily identify that the model is infeasible (see Node
Log #8).

Node Log #8
              Nodes                                         Cuts/
       Node    Left   Objective  IInf  Best Integer     Best Node     ItCnt      Gap

          0       0      0.0000     1                      0.0000         1
          0       0      0.0000     2                  MIRcuts: 1         3
          0       0      0.0000     2                  MIRcuts: 1         5
          0       0      cutoff                                           5
Elapsed real time = 0.23 sec. (tree size = 0.00 MB, solutions = 0)
Mixed integer rounding cuts applied: 1
...
MIP - Integer infeasible.
Current MIP best bound is infinite.
Solution time = 0.46 sec. Iterations = 5 Nodes = 0
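The arithmetic behind the decomposition in Example 1 can be checked mechanically. The following is a minimal sketch in plain Python (no solver involved); the variable names are illustrative and not taken from the slides. It splits each coefficient of constraint (12) into the nearest multiple of the coefficient of x1 minus a small remainder, and splits the right-hand side in the two ways used in the derivation.

```python
# Coefficients and right-hand side of constraint (12).
coeffs = [13429, 26850, 26855, 40280, 40281, 53711, 53714, 67141]
rhs = 45094583
base = coeffs[0]  # coefficient of x1

# Nearest integer multiple of `base` for each coefficient (integer arithmetic only).
multiples = [(c + base // 2) // base for c in coeffs]
# Remainders r_j such that coeffs[j] == base * multiples[j] - r_j.
remainders = [base * m - c for m, c in zip(multiples, coeffs)]

# Right-hand side written as q * base + r and as (q + 1) * base - gap.
q, r = divmod(rhs, base)
gap = (q + 1) * base - rhs

print("aggregated coefficients:", multiples)
print("remainder coefficients: ", remainders)
print("rhs =", q, "*", base, "+", r, "=", q + 1, "*", base, "-", gap)
```

The aggregated coefficients are those of the cut on x, the remainders are the coefficients of the second cut, and `q + 1` and `gap` are the two right-hand sides.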
A. Lodi, How to use your favorite MIP Solver 19
Tightening the formulation, Example 2
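• The following Mixed Integer Quadratic Program
\begin{align}
\max\ & \sum_{i=1}^{n} \sum_{j=i+1}^{n} d_{ij} x_i x_j \tag{16}\\
\text{subject to}\ & \sum_{j=1}^{n} x_j \le k \tag{17}\\
& x \in \{0,1\}^n \tag{18}
\end{align}
can be classically reformulated as a MIP by introducing binary variables z_{ij} = x_i x_j and the constraints
\begin{align}
z_{ij} &\le x_i, & \forall\, i < j \tag{19}\\
z_{ij} &\le x_j, & \forall\, i < j \tag{20}\\
x_i + x_j &\le 1 + z_{ij}, & \forall\, i < j. \tag{21}
\end{align}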
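That the three linear constraints really force zij = xi xj for binary values can be confirmed by enumerating the four cases. The snippet below is a small illustrative check, not part of the original slides; the function name is hypothetical.

```python
from itertools import product

def linearization_feasible(xi, xj, zij):
    """True if (xi, xj, zij) satisfies zij <= xi, zij <= xj, xi + xj <= 1 + zij."""
    return zij <= xi and zij <= xj and xi + xj <= 1 + zij

# For each binary pair (xi, xj), collect the binary values of zij that remain feasible.
forced = {}
for xi, xj in product((0, 1), repeat=2):
    forced[(xi, xj)] = [z for z in (0, 1) if linearization_feasible(xi, xj, z)]
    print(f"xi={xi}, xj={xj}: feasible zij = {forced[(xi, xj)]}, product = {xi * xj}")
```

In every case exactly one value of zij survives, and it equals the product. The equivalence holds only at binary points; once integrality is relaxed, fractional values of zij are admitted as well.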
• The performance of the CPLEX solver is as follows.
A. Lodi, How to use your favorite MIP Solver 20
Tightening the formulation, Example 2 (cont.d)
while (30) forces zij to 1. So, regardless of the values of xi and xj, zij = xixj, and we can replace
occurrences of xixj with zij to obtain the linearized reformulation above.
Using this linearized model with n = 60 and k = 24, Node Log #9 gives the results. The
instance of the model has 1830 binary variables, and 5311 constraints; CPLEX processes over
4 million nodes before running out of memory after about 4 hours. This level of performance
indicates significant potential for improvement. Due to the large size of the branch-and-bound
tree, we set CPLEX’s file parameter to instruct CPLEX to efficiently swap the memory associated
with the branch-and-bound tree to disk. This enables the run to proceed further than with default
settings in which CPLEX stores the tree in physical memory. All other parameter settings remain
at defaults, so CPLEX makes use of all four available processors. CPLEX runs for just over four
hours, terminating when the size of the swap file for the branch-and-bound tree exceeds memory
limits. At that point the solution has an objective value of 3483.0000, proven to be within 51.32% of
optimal. Although we do not provide the output here, the original MIQP formulation in (MIQP)
performs even worse.

Node Log #9
              Nodes                                         Cuts/
       Node    Left   Objective  IInf  Best Integer     Best Node     ItCnt      Gap

*        0+       0                          0.0000                    2247      ---
          0       0   7640.4000  1830        0.0000     7640.4000      2247      ---
*        0+       0                         19.0000     7640.4000      2247      ---

...

*        0+       0                       3185.0000     7445.4286      2286  133.77%
          0       2   7628.5333  1829     3185.0000     7445.4286      2286  133.77%
Elapsed real time = 4.09 sec. (tree size = 0.01 MB, solutions = 8)
         35      37   6579.2308  1378     3185.0000     7445.4286      6615  133.77%
...
    4332613 3675298   4936.6750  1099     3483.0000     5270.8377  1.78e+08   51.33%
    4341075 3682375   3889.4643   714     3483.0000     5270.4545  1.79e+08   51.32%

...
CPLEX Error 1803: Failure on temporary file write.

Solution pool: 25 solutions saved.

MIP - Error termination, no tree: Objective = 3.4830000000e+03
Current MIP best bound = 5.2704102564e+03 (gap = 1787.41, 51.32%)
Solution time = 15031.18 sec. Iterations = 178699476 Nodes = 4342299 (3682262)

Experimentation with non-default parameter settings as described in Section 3 yields modest
performance improvements, but does not come close to enabling CPLEX to find an optimal solution
to the model.
We carefully examine a smaller model instance with n = 3 and k = 2 to assess how removing
integrality restrictions yields an artificially high objective function value:
max 3z12 + 4z13 + 5z23
subject to x1 + x2 + x3 ≤ 2
z12 − x1 ≤ 0
z12 − x2 ≤ 0
x1 + x2 ≤ 1 + z12
z13 − x1 ≤ 0
z13 − x3 ≤ 0
x1 + x3 ≤ 1 + z13
z23 − x2 ≤ 0
z23 − x3 ≤ 0
x2 + x3 ≤ 1 + z23
x1, x2, x3, z12, z13, z23 binary
The optimal solution to this MILP consists of setting z23 = x2 = x3 = 1, yielding an objective
value of 5. By contrast, relaxing integrality enables a fractional solution consisting of setting all
x and z variables to 2/3, yielding a much better objective value of 8. Note that the difference
A. Lodi, How to use your favorite MIP Solver 21
Tightening the formulation, Example 2, improved
x1 = x2 = · · · = xk = 1, and xk+1 = · · · = xn = 0. From (32), zij = 1 if and only if 1 ≤ i ≤ k,
1 ≤ j ≤ k, and i < j. We can therefore count the number of z variables that equal 1 when
x1 = x2 = · · · = xk = 1. Specifically, there are k(k − 1) pairs (i, j) with i ≠ j, but only half of them
have i < j. So, at most k(k − 1)/2 of the zij variables can be set to 1 when k of the x variables are
set to 1. In other words,
\[
\sum_{i=1}^{n} \sum_{j=i+1}^{n} z_{ij} \le k(k-1)/2
\]
is a globally valid cut.
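The counting argument can be checked by brute force on small instances. The sketch below is an illustration added here, not code from the slides or the excerpted paper. It enumerates every binary vector x with at most k ones, counts the pairs i < j with xi = xj = 1 (the z variables forced to 1 by the linearization), and compares the maximum with k(k − 1)/2. It assumes k ≤ n.

```python
from itertools import combinations, product

def max_pairs_selected(n, k):
    """Largest number of pairs i < j with x_i = x_j = 1 over binary x with sum(x) <= k."""
    best = 0
    for x in product((0, 1), repeat=n):
        if sum(x) <= k:
            pairs = sum(x[i] * x[j] for i, j in combinations(range(n), 2))
            best = max(best, pairs)
    return best

for n, k in [(3, 2), (5, 3), (6, 4)]:
    bound = k * (k - 1) // 2
    print(f"n={n}, k={k}: max pairs = {max_pairs_selected(n, k)}, k(k-1)/2 = {bound}")
```

On these instances the maximum equals the bound, so the cut is valid and tight at integer points. Enumeration is exponential in n and is meant only as a sanity check on toy sizes.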
Adding this cut to the instance with n = 60 and k = 24 enables CPLEX to solve the model
to optimality in just over 2 hours and 30 minutes on the same machine using identical settings
as the previous run without the cut. (See Node Log #10.) Note that the cut tightened the
formulation significantly, as can be seen by the much better root node objective value of 4552.4000,
which compares favorably to the root node objective value of 7640.4000 on the instance without
the cut. Furthermore, the cut enabled CPLEX to add numerous zero-half cuts to the model that
it could not with the original formulation. The zero-half cuts resulted in additional progress in the
best node value that was essential to solving the model to optimality in a reasonable amount of
time.

Node Log #10
              Nodes                                         Cuts/
       Node    Left   Objective  IInf  Best Integer     Best Node     ItCnt      Gap
*        0+       0                          0.0000                    1161      ---
          0       0   4552.4000   750        0.0000     4552.4000      1161      ---
*        0+       0                          6.0000     4552.4000      1161      ---
...
*        0+       0                       3477.0000     3924.7459     37882   12.88%
          0       2   3924.7459  1281     3477.0000     3924.7459     37882   12.88%
Elapsed real time = 51.42 sec. (tree size = 0.01 MB, solutions = 31)
          1       3   3919.3378  1212     3477.0000     3924.7459     39886   12.88%
          2       4   3910.8201  1243     3477.0000     3924.7459     42289   12.88%
          3       5   3910.8041  1144     3477.0000     3919.3355     44070   12.72%
...
     125571    7819      cutoff           3590.0000     3599.7046  60456851    0.27%
Elapsed real time = 9149.19 sec. (tree size = 234.98 MB, solutions = 43)
Nodefile size = 196.38 MB (168.88 MB after compression)
*    126172    7231    integral     0     3591.0000     3599.7046  60571398    0.24%
     127700    5225      cutoff           3591.0000     3598.0159  60769494    0.20%
     131688       6      cutoff           3591.0000     3592.5939  60980430    0.04%
Zero-half cuts applied: 2244
Solution pool: 44 solutions saved.
MIP - Integer optimal solution: Objective = 3.5910000000e+03
Solution time = 9213.79 sec. Iterations = 60980442 Nodes = 131695

Given the modest size of the model, a run time of 2.5 hours to optimality suggests potential
for additional improvements in the formulation. However, by adding one globally valid cut, we see
a dramatic performance improvement nonetheless. Furthermore, the derivation of this cut draws
heavily on the guidelines proposed for tightening the formulation. By using a small instance of
the model, we can easily identify how removal of integrality restrictions enables the objective to
improve. Furthermore, we use infeasibility to derive the cut: by recognizing that the simplified
MILP model is infeasible when z12 + z13 + z23 ≥ 2, we show that z12 + z13 + z23 ≤ 1 is a valid cut.
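The claims about the small instance with n = 3 and k = 2 can be reproduced without a solver. The following sketch is an added illustration whose names (`D`, `K`, `feasible`, `objective`) are hypothetical. It enumerates all binary assignments to find the integer optimum, then uses exact rational arithmetic to confirm that the all-2/3 point is feasible for the relaxation, has objective value 8, and violates the cut z12 + z13 + z23 ≤ 1.

```python
from fractions import Fraction
from itertools import product

D = {(1, 2): 3, (1, 3): 4, (2, 3): 5}  # objective coefficients d_ij of the small instance
K = 2                                  # cardinality bound k

def feasible(x, z):
    """Check the constraints of the n = 3, k = 2 model, ignoring integrality."""
    if sum(x.values()) > K:
        return False
    return all(
        z[i, j] <= x[i] and z[i, j] <= x[j] and x[i] + x[j] <= 1 + z[i, j]
        for (i, j) in D
    )

def objective(z):
    return sum(D[p] * z[p] for p in D)

# Enumerate all 2^6 binary assignments of (x1, x2, x3, z12, z13, z23).
best = None
cut_holds = True
for bits in product((0, 1), repeat=6):
    x = dict(zip((1, 2, 3), bits[:3]))
    z = dict(zip(D, bits[3:]))
    if feasible(x, z):
        cut_holds = cut_holds and sum(z.values()) <= 1
        if best is None or objective(z) > best:
            best = objective(z)

# The fractional point from the text: every variable equal to 2/3.
two_thirds = Fraction(2, 3)
x_frac = {i: two_thirds for i in (1, 2, 3)}
z_frac = {p: two_thirds for p in D}

print("integer optimum:", best)
print("cut holds at every integer feasible point:", cut_holds)
print("fractional point feasible:", feasible(x_frac, z_frac))
print("fractional objective:", objective(z_frac))
print("fractional z sum:", sum(z_frac.values()))
```

The fractional point has z12 + z13 + z23 = 2, so the cut removes it while keeping every integer feasible solution.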
5 Conclusion
Today’s hardware and software allow practitioners to formulate and solve increasingly large and
detailed models. However, optimizers have become less straightforward, often providing many
methods for implementing their algorithms to enhance performance given various mathematical
structures. Additionally, the literature regarding methods to increase the tractability of mixed
integer linear programming problems contains a high degree of theoretical sophistication. Both of
these facts might lead a practitioner to conclude that developing the skills necessary to successfully
solve difficult mixed integer programs is too time consuming or difficult. This paper attempts to
refute that perception, illustrating that practitioners can implement many techniques for improving
performance without expert knowledge in the underlying theory of integer programming, thereby
enabling them to solve larger and more detailed models with existing technology.
A. Lodi, How to use your favorite MIP Solver 22
MIP Challenges
• Overall, a big challenge from both the performance and the modeling viewpoints is accuracy. It is
in some sense a new issue: an old one that became very important once it was realized that MIP
solvers can now really solve these problems.
The MIPLIB 2010 paper [Koch et al. 2011] includes, for the first time, scripts to run automated
tests in a predefined way, and a solution checker to test the accuracy of provided solutions
using exact arithmetic.
• Some difficult MIPs are encountered because of:
– bad modeling, i.e.,
∗ the model has numerical difficulties,
∗ the MIP modeling capability is not sufficient with respect to the real problem;
– large size;
– knapsack constraints with huge coefficients and general-integer variables with large bounds;
– scheduling components with disjunctive constraints and fundamental continuous variables.
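To make the idea of checking a solution in exact arithmetic concrete, here is a minimal Python sketch based on the standard `fractions` module. It is not the MIPLIB 2010 checker and does not reproduce its interface; the function names and the toy constraint are assumptions made for illustration. Data given as decimal strings are parsed as exact rationals, so the check is free of floating-point rounding.

```python
from fractions import Fraction

def exact_slack(row, rhs, x):
    """Exact value of rhs - row . x, with all data converted to rationals."""
    return Fraction(rhs) - sum(Fraction(a) * Fraction(v) for a, v in zip(row, x))

def is_exactly_integral(v):
    """True if the value, read exactly, is an integer."""
    return Fraction(v).denominator == 1

# Toy constraint x1 + x2 <= 0.3 evaluated at x = (0.1, 0.2).
slack_exact = exact_slack(["1", "1"], "0.3", ["0.1", "0.2"])  # exact decimals
slack_float = 0.3 - (1 * 0.1 + 1 * 0.2)                       # binary floating point

print("exact slack:", slack_exact)
print("float slack:", slack_float)
print("3 integral:", is_exactly_integral("3"))
print("2.9999999 integral:", is_exactly_integral("2.9999999"))
```

In exact arithmetic the constraint is tight, while the floating-point evaluation reports a tiny violation. Discrepancies of this kind are why solvers work with tolerances and why an exact checker is useful as an independent referee.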
A. Lodi, How to use your favorite MIP Solver 23
MIP Challenges, performance
• The performance of MIP solvers can/must be improved in many different directions.
Among them, my favorite ones are:
– branching vs cutting
– sophisticated techniques for general-integer and continuous variables
– performance variability
– revisiting good “old” methods
– cutting plane exploitation
– symmetric MIPs
A. Lodi, How to use your favorite MIP Solver 24
MIP Challenges: branching vs cutting
[Figure, built up in two steps. "First wisdom": the point x∗, the hyperplanes αTx = α0 and αTx = α0 + 1, the points x∗,1 and x∗,2, and the hyperplane βTx = β0. "Second wisdom": the point x∗ with the hyperplanes βTx = β0, αTx = α0 and αTx = α0 + 1.]
A. Lodi, How to use your favorite MIP Solver 25
MIP Challenges, branching vs cutting (cont.d)
• The previous slide highlights the possibility of using traditional cutting plane theory in the
branching context [Karamanov & Cornuéjols 2005, 2010].
• It seems that a better coordination of these two fundamental ingredients of MIP solvers is
crucial for strong improvements.
• In the context of hard knapsack constraints, branching on variables is not effective, while (pure)
basis reduction methods have proven to be very powerful [Eisenbrand; Aardal; Pataki; . . . ].
• On the other hand, a tight integration of basis reduction techniques within MIP solvers has not
yet been achieved. One possibility for such an integration is the use of partial reformulations,
but an intriguing option is exploiting these reformulations to generate cuts in the original space
of variables [Aardal & Wolsey 2009].
• Branching on appropriate disjunctions has recently been proposed in the context of highly
symmetric MIPs [Ostrowski, Linderoth, Rossi & Smriglio 2009].
• Finally, the use of bilevel programming for computing strong multiple disjunctions (i.e.,
disjunctions involving more than 2 children) has recently been shown to be effective for special
0-1 MIPs [Lodi, Ralphs, Rossi & Smriglio 2011].
A. Lodi, How to use your favorite MIP Solver 26
MIP Challenges, performance (cont.d)
• A very important class of MIPs is 0/1 IPs. Many of the sophisticated techniques already
discussed were originally proposed for this class and eventually extended to general MIPs.
• For example, branching on variables is particularly natural and effective in the 0/1 case, while it
is not when general-integer variables play a central role.
• Another example is given by models in which continuous variables are important: for those
variables MIP solvers do not do much (heuristics, strengthening, ...).
• An urgent MIP challenge is dealing with general-integer and continuous variables
with special-purpose techniques.
• Cutting plane generation has been a key step for the success of MIP solvers, but are we using
cuts in the best way? Far from it!
• Fundamental questions about the use of cutting planes remain open, among which:
– stabilization/saturation issues,
– cut selection,
– cut interaction.
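To make the cut selection and cut interaction questions concrete, here is a minimal sketch of one common heuristic: rank candidate cuts by efficacy (the Euclidean distance by which a cut separates the current LP solution) and skip cuts that are nearly parallel to ones already chosen. This is an illustration added here, not a method taken from the slides; the function names and the two thresholds are hypothetical.

```python
import math

def efficacy(coefs, rhs, x_star):
    """Distance by which the cut coefs . x <= rhs is violated at x_star (negative if satisfied)."""
    norm = math.sqrt(sum(a * a for a in coefs))
    violation = sum(a * x for a, x in zip(coefs, x_star)) - rhs
    return violation / norm if norm > 0 else 0.0

def parallelism(c1, c2):
    """Absolute cosine of the angle between two cut coefficient vectors."""
    dot = sum(a * b for a, b in zip(c1, c2))
    n1 = math.sqrt(sum(a * a for a in c1))
    n2 = math.sqrt(sum(b * b for b in c2))
    return abs(dot) / (n1 * n2) if n1 > 0 and n2 > 0 else 0.0

def select_cuts(cuts, x_star, min_efficacy=1e-4, max_parallelism=0.9):
    """Greedy selection over cuts given as (coefs, rhs) pairs meaning coefs . x <= rhs.

    Both thresholds are assumed values chosen for illustration only.
    """
    ranked = sorted(cuts, key=lambda c: efficacy(c[0], c[1], x_star), reverse=True)
    chosen = []
    for coefs, rhs in ranked:
        if efficacy(coefs, rhs, x_star) < min_efficacy:
            break  # remaining cuts separate x_star even less
        if all(parallelism(coefs, other) <= max_parallelism for other, _ in chosen):
            chosen.append((coefs, rhs))
    return chosen
```

A greedy filter of this kind addresses interaction only pairwise; how a whole family of cuts interacts over several rounds is exactly the kind of question the slide lists as open.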
MIP Challenges, performance (cont.d)
• The already discussed performance variability (some good/neutral features that might not be
monotonically helpful, or, worse, can deteriorate performance) [Koch et al. 2011] is due to
imperfect tie-breaking but is also related to the interaction of key ingredients of MIP.
• One example is finding a (near-)optimal solution very early in the search tree: this directly
improves the primal bound but might sometimes hurt the proof of optimality (or at
least not help it).
• A deeper understanding through sophisticated testing techniques is needed [Hooker; McGeoch;
Margot].
• The negative example suggests an additional, crucial question: besides preventing good primal
solutions from hurting the optimality proof, how can we use them to obtain a strong speed-up instead?
• Good “old” methods have been rediscovered and revisited over the years, Gomory Mixed
Integer cuts being the most notable example. Recently:
– strong Benders cutting planes [Fischetti, Salvagnin & Zanette 2009];
– cutting plane use with the lexicographic simplex [Zanette, Fischetti & Balas 2010];
– cutting planes from group relaxation [Gomory; Richard; Dey; Wolsey; Dash & Günlük; ...].
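As a reminder of what a Gomory Mixed Integer cut looks like, the sketch below derives one from a single simplex tableau row x_B + sum_j a_j x_j = b, where x_B is an integer basic variable with fractional value b and all nonbasic variables are nonnegative and at their lower bound 0. It uses the textbook formula and returns the cut in the form sum_j coef_j x_j >= f0 with f0 = frac(b). The function name and the data layout are assumptions made for illustration, not any solver's API.

```python
import math

def frac(v):
    """Fractional part in [0, 1), also for negative numbers."""
    return v - math.floor(v)

def gmi_cut(row, rhs, is_integer, eps=1e-9):
    """Return (coefs, f0) for the GMI cut sum_j coefs[j] * x_j >= f0.

    row        : tableau coefficients a_j of the nonbasic variables
    rhs        : value b of the integer basic variable (must be fractional)
    is_integer : flags telling which nonbasic variables are integer
    """
    f0 = frac(rhs)
    if f0 < eps or f0 > 1 - eps:
        raise ValueError("the basic variable is already integral: no cut")
    coefs = []
    for a, integer in zip(row, is_integer):
        if integer:
            fj = frac(a)
            # Integer columns use the fractional part, strengthened when fj > f0.
            coefs.append(fj if fj <= f0 else f0 * (1 - fj) / (1 - f0))
        else:
            # Continuous columns use the coefficient itself, scaled when negative.
            coefs.append(a if a >= 0 else f0 * (-a) / (1 - f0))
    return coefs, f0
```

For instance, the row x_B + 0.25 x_1 + 0.75 x_2 + x_3 - 1.5 x_4 = 2.5, with x_1, x_2 integer and x_3, x_4 continuous, gives the cut 0.25 x_1 + 0.25 x_2 + x_3 + 1.5 x_4 >= 0.5.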
MIP Challenges, the modeling viewpoint
• Besides developing additional tools in the spirit of the ones described before (among all the
possibilities, I would like a tool for detecting minimal sources of numerical instability),
the main challenge from a modeling/application viewpoint seems to be dissemination.
• More precisely, an interesting direction would be extending the modeling (and solving)
capability of the MIP framework.
• Two success stories in this direction are:
1. SCIP (Solving Constraint Integer Programs [Achterberg 2007]), whose main feature is a tight
integration of Constraint Programming (CP) and satisfiability (SAT) techniques within a MIP
solver.
2. Bonmin (Basic Open-source Nonlinear Mixed INteger programming [Bonami et al. 2008]),
which has been developed for convex MINLP within the framework of the MIP solver Cbc [Forrest].
MIP Modeling, CP and SCIP
• SCIP can handle arbitrary (non-linear) constraints in a Constraint Programming fashion:
– A global constraint defines combinatorially a portion of the feasible region, i.e., it is able to
check feasibility of an assignment of values to variables.
– Moreover, a global constraint contains an algorithm that prunes (filters) values from the
variable domains so as to reduce as much as possible the search space.
• In other words, a higher-level modeling layer has been added, of which MIP is just one of the
options, so as to allow a beneficial interaction among different modeling and solving
technologies.
• This is especially effective for those applications, like some classes of scheduling problems, in
which none of those technologies, in isolation, would outperform the others [Heinz & Beck
2012].
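The two roles of a global constraint described above (checking an assignment and filtering variable domains) can be illustrated with a toy all-different constraint. This is a minimal sketch and not SCIP's constraint handler interface: the class and method names are hypothetical, and the filtering rule is deliberately the weakest one (a value fixed in one variable is removed from all the others).

```python
class AllDifferent:
    """Toy global constraint: all variables must take pairwise distinct values."""

    def check(self, assignment):
        """Feasibility check of a complete assignment, given as a list of values."""
        return len(set(assignment)) == len(assignment)

    def filter(self, domains):
        """Prune the domains (a list of sets) and return pruned copies.

        Returns None when some domain becomes empty, i.e. the current
        node of the search tree can be discarded.
        """
        domains = [set(d) for d in domains]
        if any(not d for d in domains):
            return None
        changed = True
        while changed:  # repeat until no more values can be removed
            changed = False
            for i, d in enumerate(domains):
                if len(d) != 1:
                    continue
                (value,) = d
                for j, other in enumerate(domains):
                    if j != i and value in other:
                        other.discard(value)
                        changed = True
                        if not other:
                            return None
        return domains
```

With domains {1}, {1, 2} and {1, 2, 3}, filtering alone fixes the three variables to 1, 2 and 3 without any branching. Production CP solvers use stronger, matching-based filtering for this constraint.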
MIP Modeling, Convex (and Non-Convex) MINLPs and Bonmin
• A network design example in water distribution, instance fossolo
• The model does not have special difficulties besides the so-called Hazen-Williams equation
modeling pressure loss in water pipes. However, such an equation is very “bad” ...
• A classical MIP model from the 1980s linearizes this equation, but Cplex does not find any
feasible solution for fossolo in 2 days of CPU time (!!), while Bonmin finds a very accurate
one in a few seconds. Using the diameters computed by Bonmin, the MIP does not certify the
solution to be feasible even when allowing 1,000 linearization points.
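To give a feeling for why the Hazen-Williams equation is hard to linearize, the sketch below evaluates the head loss in its commonly quoted SI form, h = 10.67 L Q^1.852 / (C^1.852 d^4.8704), and measures the worst gap between the curve and a piecewise-linear interpolant in the flow Q for a fixed diameter. The constants, the choice of linearizing in the flow only, and all function names are assumptions made for illustration; the model actually used for fossolo may differ in these details.

```python
def hazen_williams_head_loss(flow, diameter, length, roughness):
    """Head loss (m) for flow in m^3/s, diameter and length in m (assumed SI form)."""
    return 10.67 * length * flow ** 1.852 / (roughness ** 1.852 * diameter ** 4.8704)

def max_linearization_error(diameter, length, roughness, q_max, pieces, samples=1000):
    """Largest gap between the curve and its piecewise-linear interpolant on [0, q_max]."""
    step = q_max / pieces
    worst = 0.0
    for k in range(samples + 1):
        q = q_max * k / samples
        i = min(int(q / step), pieces - 1)  # index of the segment containing q
        q_lo, q_hi = i * step, (i + 1) * step
        h_lo = hazen_williams_head_loss(q_lo, diameter, length, roughness)
        h_hi = hazen_williams_head_loss(q_hi, diameter, length, roughness)
        approx = h_lo + (h_hi - h_lo) * (q - q_lo) / step
        exact = hazen_williams_head_loss(q, diameter, length, roughness)
        worst = max(worst, abs(approx - exact))
    return worst
```

Two features stand out: the interpolation error shrinks only gradually as breakpoints are added, and the head loss is extremely sensitive to the diameter (doubling it divides the loss by about 2^4.87, roughly 29), so small errors in one quantity translate into large feasibility violations in the other.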
End of the course: concluding remarks
• We have seen
1. The building blocks of a MIP solver.
2. How to use a MIP solver as a sophisticated (heuristic) framework.
3. Modeling and algorithmic tips to make a solver effective in practice.
Finally, we discussed some MIP Challenges.
• In summary, MIP technology provides, through its commercial and noncommercial solvers, a
challenging, reliable, flexible and effective environment for application-oriented optimization.
1. challenging: a lot of good theoretical, methodological and experimental work is needed;
2. reliable: the software tools are stable;
3. flexible: it is open to hybridization, cannibalization, extensions;
4. effective: problems that were considered impossible only a few years ago can regularly be
solved nowadays.
• All of the above look like solid reasons for developing the skills for using (and, why not,
improving on) MIP technology.