Constraint Programming Tutorial
April 2006
John Hooker, Carnegie Mellon University
Outline
Overview
Examples
Freight transfer
Traveling salesman problem
Continuous global optimization
Product configuration
Machine scheduling
Overview
Basic idea of CP: each constraint is viewed as (invoking) a procedure.
• The procedure reduces the search space by reducing (filtering) the variable domains.
MILP models are purely declarative.
• Constraints merely describe what the solution should be like.
Overview
CP can exploit problem substructure: a highly structured subset of constraints is represented by a single global constraint.
• Each global constraint is associated with a specialized filtering or domain reduction algorithm.
• Domains reduced by one constraint are passed on to the next constraint for further reduction. This is constraint propagation.
Overview
Advantages of CP relative to MILP:
• More powerful modeling framework.
• Allows the solver to exploit problem substructure.
• More constraints make the problem easier, even when they destroy overall problem structure.
• Can use multiple formulations and link them with channeling constraints.
• Good performance when constraints contain only a few variables, or with min/max objectives.
• Good performance on sequencing, resource-constrained scheduling, combinatorial, or disjunctive problems for which the MILP relaxation tends to be weak.
Overview
Disadvantages of CP relative to MILP:
• User must draw from a large library of global constraints.
• User must think about the solution algorithm when writing the model (often true in MILP!).
• Models may not be transportable from one solver to another.
• Slower on highly-analyzed, highly structured problems (e.g., TSP).
• Slower on problems that have good MILP relaxations (e.g., knapsack constraints, min cost objectives).
Overview
Myth: CP is better on “highly constrained” problems.
• A tight constraint with many variables (e.g., a cost constraint) may make the problem insoluble for CP.
• CP can be very fast on a loosely constrained problem if the constraints “propagate well.”
Overview
Integrated methods:
• There is no need to decide between MILP and CP.
• Combine them to exploit complementary strengths.
• Some of the following examples will suggest how to do this.
Overview
Similarity between MILP and CP solvers:
• Both use branching. What happens at a node of the search tree differs.
Differences:
• CP seeks a feasible rather than an optimal solution. This is a minor difference, since CP easily incorporates optimization.
• CP relies on filtering and propagation at a node, whereas MILP relies on relaxation.
MILP vs. CP Solution Approaches

Inference:
• CP (feasibility): domain filtering, propagation, nogoods.
• MILP: cutting planes, Benders cuts.
Constraint store / relaxation:
• CP: variable domains.
• MILP: continuous relaxation (LP, Lagrangean, etc.).
Branching:
• CP: branch by splitting a nonsingleton domain.
• MILP: branch on a variable with fractional value.
MILP vs. CP Solution Approaches

Search backtracks…
• CP: when at least one domain is empty.
• MILP: when the relaxation is infeasible, has an integral solution, or the tree is pruned by a bound.
A feasible solution is obtained at a node…
• CP: when the domains are singletons.
• MILP: when the solution of the relaxation is integral.
Bounding:
• CP: none.
• MILP: use the bound from the continuous relaxation.
Outline
Example: Freight transfer
Bounds propagation
Bounds consistency
Knapsack cuts
Linear Relaxation
Branching search
Outline
Example: Traveling salesman problem
Filtering for all-different constraint
Relaxation of all-different
Arc consistency
Outline
Example: Continuous global optimization
Nonlinear bounds propagation
Function factorization
Interval splitting
Lagrangean propagation
Reduced-cost variable fixing
Outline
Example: Product configuration
Variable indices
Filtering for element constraint
Relaxation of element
Relaxing a disjunction of linear systems
Outline
Example: Machine scheduling
Edge finding
Benders decomposition and nogoods
Example: Freight Transfer
Bounds propagation
Bounds consistency
Knapsack cuts
Linear Relaxation
Branching search
Freight Transfer
Transport 42 tons of freight using at most 8 trucks.
Trucks come in 4 sizes: 7, 5, 4, and 3 tons.
How many trucks of each size should be used to minimize cost?
Let xj = number of trucks of size j. The model is

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
xj ∈ {0,1,2,3}

(For example, 40 is the cost of a size-4 truck.)
Freight Transfer
We will solve the problem by branching search.
At each node of the search tree:
Reduce the variable domains using bounds propagation.
Solve a linear relaxation of the problem to get a bound on the optimal value.
Bounds Propagation
The domain of xj is the set of values xj can have in some feasible solution.
Initially each xj has domain {0,1,2,3}.
Bounds propagation reduces the domains.
Smaller domains result in less branching.
Bounds Propagation
First reduce the domain of x1 using 7x1 + 5x2 + 4x3 + 3x4 ≥ 42:

x1 ≥ ⌈(42 − 5x2 − 4x3 − 3x4)/7⌉

Setting each of x2, x3, x4 to the max element 3 of its domain,

x1 ≥ ⌈(42 − 5·3 − 4·3 − 3·3)/7⌉ = ⌈6/7⌉ = 1

So the domain of x1 is reduced from {0,1,2,3} to {1,2,3}.
No reductions for x2, x3, x4 are possible.
Bounds Propagation
In general, let the domain of xi be {Li, …, Ui}.

An inequality ax ≥ b (with a ≥ 0) can be used to raise Li to

max{ Li, ⌈(b − Σj≠i aj Uj)/ai⌉ }

An inequality ax ≤ b (with a ≥ 0) can be used to reduce Ui to

min{ Ui, ⌊(b − Σj≠i aj Lj)/ai⌋ }
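These two rules translate directly into code. A minimal sketch in Python (the names are ours, not from the tutorial), applied to the freight transfer constraints:

    import math

    def propagate_geq(a, b, lo, up):
        """Raise lower bounds using a.x >= b with a >= 0 (integer variables)."""
        for i in range(len(a)):
            if a[i]:
                slack = b - sum(a[j] * up[j] for j in range(len(a)) if j != i)
                lo[i] = max(lo[i], math.ceil(slack / a[i]))

    def propagate_leq(a, b, lo, up):
        """Reduce upper bounds using a.x <= b with a >= 0 (integer variables)."""
        for i in range(len(a)):
            if a[i]:
                slack = b - sum(a[j] * lo[j] for j in range(len(a)) if j != i)
                up[i] = min(up[i], math.floor(slack / a[i]))

    lo, up = [0, 0, 0, 0], [3, 3, 3, 3]
    propagate_geq([7, 5, 4, 3], 42, lo, up)   # raises lo to [1, 0, 0, 0]
    propagate_leq([1, 1, 1, 1], 8, lo, up)    # no further reduction
    print(lo, up)                             # [1, 0, 0, 0] [3, 3, 3, 3]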
Bounds Propagation
Now propagate the reduced domains through the other constraint, x1 + x2 + x3 + x4 ≤ 8, perhaps reducing domains further:

x2 ≤ ⌊(8 − x1 − x3 − x4)/1⌋

Setting each variable on the right to the min element of its domain,

x2 ≤ 8 − 1 − 0 − 0 = 7

and similarly for the other variables. No further reduction is possible.
Bounds Consistency
Again let {Lj, …, Uj} be the domain of xj.
A constraint set is bounds consistent if for each j :
xj = Lj in some feasible solution and
xj = Uj in some feasible solution.
Bounds consistency ⇒ we will not set xj to any infeasible values during branching.
Bounds propagation achieves bounds consistency for a single inequality.
7x1 + 5x2 + 4x3 + 3x4 ≥ 42 is bounds consistent when the domains are x1 ∈ {1,2,3} and x2, x3, x4 ∈ {0,1,2,3}.
But not necessarily for a set of inequalities.
Bounds Consistency
Bounds propagation may not achieve consistency for a set.
Consider the set of inequalities

x1 + x2 ≥ 1
x1 − x2 ≥ 0

with domains x1, x2 ∈ {0,1}; the feasible solutions are (x1, x2) = (1,0), (1,1).
Bounds propagation has no effect on the domains. From x1 + x2 ≥ 1:

x1 ≥ 1 − U2 = 1 − 1 = 0, x2 ≥ 1 − U1 = 1 − 1 = 0

From x1 − x2 ≥ 0:

x1 ≥ 0 + L2 = 0 + 0 = 0, x2 ≤ U1 − 0 = 1 − 0 = 1

The constraint set is not bounds consistent, because x1 = 0 in no feasible solution.
Knapsack Cuts
Inequality constraints (knapsack constraints) imply cutting planes.
Cutting planes make the linear relaxation tighter, and its solution is a stronger bound.
For the ≥ constraint, each maximal packing corresponds to a cutting plane (knapsack cut).
In 7x1 + 5x2 + 4x3 + 3x4 ≥ 42, the terms 7x1 and 5x2 form a maximal packing because:
(a) they alone cannot sum to ≥ 42, even if each xj takes the largest value in its domain (3);
(b) no superset of these terms has this property.
Knapsack Cuts
This maximal packing, {7x1, 5x2} in

7x1 + 5x2 + 4x3 + 3x4 ≥ 42

corresponds to the knapsack cut

x3 + x4 ≥ ⌈(42 − 7·3 − 5·3)/max{4,3}⌉ = 2

The numerator is the minimum value of 4x3 + 3x4 needed to satisfy the inequality, and max{4,3} is the largest coefficient of x3, x4.
Knapsack Cuts
In general, J is a packing for ax ≥ b (with a ≥ 0) if

Σj∈J aj Uj < b

If J is a packing for ax ≥ b, then the remaining terms must cover the gap:

Σj∉J aj xj ≥ b − Σj∈J aj Uj

So we have a knapsack cut:

Σj∉J xj ≥ ⌈(b − Σj∈J aj Uj) / maxj∉J{aj}⌉
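As an illustration, the maximal packings and their cuts for a small instance can be enumerated by brute force (fine for 4 variables; the code and names are our own sketch):

    import math
    from itertools import combinations

    def knapsack_cuts(a, b, up):
        """Yield (indices, rhs) cuts from the maximal packings of a.x >= b, a >= 0."""
        n = len(a)
        packs = lambda J: sum(a[j] * up[j] for j in J) < b
        for r in range(1, n):
            for J in combinations(range(n), r):
                if packs(J) and not any(packs(J + (k,))
                                        for k in range(n) if k not in J):
                    rest = [j for j in range(n) if j not in J]
                    gap = b - sum(a[j] * up[j] for j in J)
                    yield rest, math.ceil(gap / max(a[j] for j in rest))

    for rest, rhs in knapsack_cuts([7, 5, 4, 3], 42, [3, 3, 3, 3]):
        print("sum of x_j for j in %s >= %d" % ([j + 1 for j in rest], rhs))
    # x3+x4 >= 2, x2+x4 >= 2, x2+x3 >= 3, and x1 >= 1 (the earlier propagation result)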
Knapsack Cuts
Valid knapsack cuts for

7x1 + 5x2 + 4x3 + 3x4 ≥ 42

are

x3 + x4 ≥ 2
x2 + x4 ≥ 2
x2 + x3 ≥ 3
Linear Relaxation
We now have a linear relaxation of the freight transfer problem:

min 90x1 + 60x2 + 50x3 + 40x4
7x1 + 5x2 + 4x3 + 3x4 ≥ 42
x1 + x2 + x3 + x4 ≤ 8
x3 + x4 ≥ 2
x2 + x4 ≥ 2
x2 + x3 ≥ 3
Lj ≤ xj ≤ Uj
Branching Search
At each node of the search tree:
Reduce domains with bounds propagation.
Solve a linear relaxation to obtain a lower bound on the value of any feasible solution in the subtree rooted at the current node.
If relaxation is infeasible, or its optimal value is no better than the best feasible solution found so far, backtrack.
Otherwise, if solution of relaxation is feasible, remember it and backtrack.
Otherwise, branch by splitting a domain.
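Here is a compact sketch of this loop for the freight transfer problem, using scipy's linprog for the relaxation. For brevity it omits the knapsack cuts (so its bounds are weaker than those in the tree below) and splits domains at the fractional value; the names are ours:

    from math import ceil, floor
    from scipy.optimize import linprog

    COST = [90, 60, 50, 40]

    def propagate(lo, up):
        """One pass of bounds propagation through both freight constraints."""
        a = [7, 5, 4, 3]
        for i in range(4):
            slack = 42 - sum(a[j] * up[j] for j in range(4) if j != i)
            lo[i] = max(lo[i], ceil(slack / a[i]))                          # from a.x >= 42
            up[i] = min(up[i], 8 - sum(lo[j] for j in range(4) if j != i))  # from sum <= 8
        return all(lo[i] <= up[i] for i in range(4))

    def search(lo, up, best=(float("inf"), None)):
        lo, up = lo[:], up[:]
        if not propagate(lo, up):
            return best                                   # empty domain: backtrack
        res = linprog(COST, A_ub=[[-7, -5, -4, -3], [1, 1, 1, 1]],
                      b_ub=[-42, 8], bounds=list(zip(lo, up)))
        if not res.success or res.fun >= best[0]:
            return best                                   # infeasible or pruned by bound
        frac = [i for i in range(4) if abs(res.x[i] - round(res.x[i])) > 1e-6]
        if not frac:
            return (res.fun, res.x)                       # integral: new incumbent
        i = frac[0]
        split = floor(res.x[i])                           # branch by splitting the domain
        best = search(lo, up[:i] + [split] + up[i + 1:], best)
        return search(lo[:i] + [split + 1] + lo[i + 1:], up, best)

    print(search([0, 0, 0, 0], [3, 3, 3, 3]))             # cost 530, e.g. x = (3, 2, 2, 1)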
Branching search tree. Each node shows the domains after bounds propagation and the solution of the linear relaxation:

• Root: domains {1,2,3}, {0,1,2,3}, {0,1,2,3}, {0,1,2,3}; relaxation solution x = (2 1/3, 3, 2 2/3, 0), value 523 1/3.
• Branch x1 ∈ {1,2}: domains {1,2}, {2,3}, {1,2,3}, {1,2,3}; relaxation infeasible.
• Branch x1 = 3: domains {3}, {0,1,2,3}, {0,1,2,3}, {0,1,2,3}; solution x = (3, 2.6, 2, 0), value 526.
  • Branch x2 ∈ {0,1,2}: domains {3}, {0,1,2}, {1,2,3}, {0,1,2,3}; solution x = (3, 2, 2.75, 0), value 527.5.
    • Branch x3 ∈ {1,2}: domains {3}, {1,2}, {1,2}, {1,2,3}; solution x = (3, 2, 2, 1), value 530: feasible solution.
    • Branch x3 = 3: domains {3}, {0,1,2}, {3}, {0,1,2,3}; solution x = (3, 1.5, 3, 0.5), value 530: backtrack due to bound.
  • Branch x2 = 3: domains {3}, {3}, {0,1,2}, {0,1,2}; solution x = (3, 3, 0, 2), value 530: feasible solution.
Branching Search

Two optimal solutions were found (cost = 530):

(x1, x2, x3, x4) = (3, 2, 2, 1) and (x1, x2, x3, x4) = (3, 3, 0, 2)
Example: Traveling Salesman
Filtering for all-different constraint
Relaxation of all-different
Arc consistency
Traveling Salesman
A salesman visits each city once; the distance from city i to city j is cij. Minimize the total distance traveled.
Let xi = the i-th city visited. The model is

min Σi c(xi, xi+1)
all-different(x1, …, xn), xj ∈ {1, …, n}

where c(xi, xi+1) is the distance from the i-th city visited to the next, and city n+1 = city 1.
Can be solved by branching + domain reduction + relaxation.
Filtering for All-different
The best known filtering algorithm for all-different is based on maximum cardinality bipartite matching and a theorem of Berge.
Goal: filter infeasible values from variable domains.
Filtering reduces the domains and therefore reduces branching.
Filtering for All-different
Consider all-different(x1, x2, x3, x4, x5) with domains

x1 ∈ {1}
x2 ∈ {2,3,5}
x3 ∈ {1,2,3,5}
x4 ∈ {1,5}
x5 ∈ {1,2,3,4,5,6}
[Figure: bipartite graph with variables x1, …, x5 on one side and values 1, …, 6 on the other; edges indicate the domains.]

• Indicate the domains with edges.
• Find a maximum cardinality bipartite matching.
• Mark edges in alternating paths that start at an uncovered vertex.
• Mark edges in alternating cycles.
• Remove unmarked edges not in the matching.
Filtering for All-different
x1 ∈ {1}
x2 ∈ {2,3,5} ⇒ x2 ∈ {2,3}
x3 ∈ {1,2,3,5} ⇒ x3 ∈ {2,3}
x4 ∈ {1,5} ⇒ x4 ∈ {5}
x5 ∈ {1,2,3,4,5,6} ⇒ x5 ∈ {4,6}
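For small instances this filtering can be reproduced with a brute-force check: keep value v in the domain of xi only if some assignment of distinct values exists with xi = v. The sketch below tests this with Kuhn's augmenting-path matching; it achieves the same arc consistency but is far slower than the matching-plus-marking algorithm just described:

    def has_matching(domains):
        """Kuhn's augmenting-path test: can all variables get distinct values?"""
        assign = {}                                  # value -> variable using it
        def augment(i, seen):
            for v in domains[i]:
                if v not in seen:
                    seen.add(v)
                    if v not in assign or augment(assign[v], seen):
                        assign[v] = i
                        return True
            return False
        return all(augment(i, set()) for i in range(len(domains)))

    def filter_alldiff(domains):
        """Arc consistency for all-different: drop v from D_i if no solution has x_i = v."""
        return [[v for v in dom
                 if has_matching(domains[:i] + [[v]] + domains[i + 1:])]
                for i, dom in enumerate(domains)]

    doms = [[1], [2, 3, 5], [1, 2, 3, 5], [1, 5], [1, 2, 3, 4, 5, 6]]
    print(filter_alldiff(doms))
    # [[1], [2, 3], [2, 3], [5], [4, 6]] -- matches the reductions above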
Relaxation of All-different
All-different(x1,x2,x3,x4) with domains xj ∈ {1,2,3,4} has the convex hull relaxation:
x1 + x2 + x3 + x4 = 1 + 2 + 3 + 4

x1+x2+x3, x1+x2+x4, x1+x3+x4, x2+x3+x4 each ≥ 1 + 2 + 3

x1+x2, x1+x3, x1+x4, x2+x3, x2+x4, x3+x4 each ≥ 1 + 2

each xj ≥ 1
This is the tightest possible linear relaxation, but it may be too weak (and have too many inequalities) to be useful.
Arc Consistency
A constraint set S containing variables x1, …, xn with domains D1, …, Dn is arc consistent if the domains contain no infeasible values.
That is, for every j and every v ∈Dj, xj = v in some feasible solution of S.
This is actually generalized arc consistency (since there may be more than 2 variables per constraint).
Arc consistency can be achieved by filtering all infeasible values from the domains.
Arc Consistency
The matching algorithm achieves arc consistency for all-different.
Practical filtering algorithms often do not achieve full arc consistency, since it is not worth the time investment.
The primary tools of constraint programming are filtering algorithms that achieve or approximate arc consistency.
Filtering algorithms are analogous to cutting plane algorithms in integer programming.
Arc Consistency
Domains that are reduced by a filtering algorithm for one constraint can be propagated to other constraints.
Similar to bounds propagation.
But propagation may not achieve arc consistency for a constraint set, even if it achieves arc consistency for each individual constraint.
Again, similar to bounds propagation.
Arc Consistency
For example, consider the constraint set
all-different(x1, x2)
all-different(x1, x3)
all-different(x2, x3)

with domains x1 ∈ {1,2}, x2 ∈ {2,3}, x3 ∈ {2,3}.
No further domain reduction is possible.
Each constraint is arc consistent for these domains.
But x1 = 2 in no feasible solution of the constraint set.
So the constraint set is not arc consistent.
Arc Consistency
On the other hand, sometimes arc consistency maintenance alone can solve a problem—without branching.
Consider a graph coloring problem.
Color the vertices red, blue or green so that adjacent vertices have different colors.
These are binary constraints (they contain only 2 variables).
Arbitrarily color two adjacent vertices red and green.
Reduce domains to maintain arc consistency for each constraint.
Graph coloring problem that can be solved by arc consistency maintenance alone.
Example: Continuous Global Optimization
Nonlinear bounds propagation
Function factorization
Interval splitting
Lagrangean propagation
Reduced-cost variable fixing
Continuous Global Optimization
Today’s constraint programming solvers (CHIP, ILOG Solver) emphasize propagation.
Today’s integer programming solvers (CPLEX, Xpress-MP) emphasize relaxation.
Current global solvers (BARON, LGO) already combine propagation and relaxation.
Perhaps in the near future, all solvers will combine propagation and relaxation.
Continuous Global Optimization
Consider a continuous global optimization problem:
max x1 + x2
4x1x2 = 1
2x1 + x2 ≤ 2
x1 ∈ [0,1], x2 ∈ [0,2]
The feasible set is nonconvex, and there is more than one local optimum.
Nonlinear programming techniques try to find a local optimum.
Global optimization techniques try to find a global optimum.
Continuous Global Optimization

[Figure: the feasible set of 4x1x2 = 1, 2x1 + x2 ≤ 2 in the (x1, x2) plane, showing the global optimum and a local optimum.]
Continuous Global Optimization

• Branch by splitting continuous interval domains.
• Reduce domains with nonlinear bounds propagation and Lagrangean propagation (reduced-cost variable fixing).
• Relax the problem by factoring functions.
Nonlinear Bounds Propagation
To propagate 4x1x2 = 1, solve it for x1 = 1/(4x2) and x2 = 1/(4x1). Then

x1 ≥ 1/(4U2) = 1/(4·2) = 1/8
x2 ≥ 1/(4U1) = 1/(4·1) = 1/4

This yields domains x1 ∈ [0.125, 1], x2 ∈ [0.25, 2].
Further propagation of 2x1 + x2 ≤ 2 yields x1 ∈ [0.125, 0.875], x2 ∈ [0.25, 1.75].
Cycling through the 2 constraints converges to the fixed point x1 ∈ [0.146, 0.854], x2 ∈ [0.293, 1.707].
In practice, this process is terminated early, due to decreasing returns for computation time.
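A sketch of this cycle in Python (the pass count and the function name are ours):

    def propagate(x1, x2, passes=50):
        """Cycle interval propagation through 4*x1*x2 = 1 and 2*x1 + x2 <= 2."""
        (l1, u1), (l2, u2) = x1, x2
        for _ in range(passes):
            l1 = max(l1, 1 / (4 * u2))           # from x1 = 1/(4*x2)
            if l2 > 0:
                u1 = min(u1, 1 / (4 * l2))
            l2 = max(l2, 1 / (4 * u1))           # from x2 = 1/(4*x1)
            if l1 > 0:
                u2 = min(u2, 1 / (4 * l1))
            u1 = min(u1, (2 - l2) / 2)           # from 2*x1 + x2 <= 2
            u2 = min(u2, 2 - 2 * l1)
        return (l1, u1), (l2, u2)

    print(propagate((0.0, 1.0), (0.0, 2.0)))
    # ((0.1464..., 0.8535...), (0.2928..., 1.7071...)) -- the fixed point above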
Nonlinear Bounds Propagation

[Figure: propagating the intervals [0,1], [0,2] through the constraints yields [1/8, 7/8], [1/4, 7/4].]
Function Factorization

Factor complex functions into elementary functions that have known linear relaxations.
Write 4x1x2 = 1 as 4y = 1, where y = x1x2.
This factors 4x1x2 into the linear function 4y and the bilinear function x1x2.
The linear function 4y is its own linear relaxation.
The bilinear function y = x1x2 has the relaxation

L2x1 + L1x2 − L1L2 ≤ y ≤ U2x1 + L1x2 − L1U2
U2x1 + U1x2 − U1U2 ≤ y ≤ L2x1 + U1x2 − U1L2
Function Factorization

We now have a linear relaxation of the nonlinear problem:

max x1 + x2
4y = 1
2x1 + x2 ≤ 2
L2x1 + L1x2 − L1L2 ≤ y ≤ U2x1 + L1x2 − L1U2
U2x1 + U1x2 − U1U2 ≤ y ≤ L2x1 + U1x2 − U1L2
Lj ≤ xj ≤ Uj
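A sketch that builds and solves this relaxation with scipy's linprog, for given bounds on x1 and x2 (the helper name is ours; the optimal value depends on the current bounds):

    from scipy.optimize import linprog

    def mccormick_relaxation(b1, b2):
        """Linear relaxation of: max x1 + x2 s.t. 4*x1*x2 = 1, 2*x1 + x2 <= 2,
        with y standing in for x1*x2, over the boxes b1 = (L1, U1), b2 = (L2, U2)."""
        (L1, U1), (L2, U2) = b1, b2
        c = [-1, -1, 0]                      # maximize x1 + x2 (variables x1, x2, y)
        A_ub = [[2, 1, 0],                   # 2*x1 + x2 <= 2
                [L2, L1, -1],                # y >= L2*x1 + L1*x2 - L1*L2
                [U2, U1, -1],                # y >= U2*x1 + U1*x2 - U1*U2
                [-U2, -L1, 1],               # y <= U2*x1 + L1*x2 - L1*U2
                [-L2, -U1, 1]]               # y <= L2*x1 + U1*x2 - U1*L2
        b_ub = [2, L1 * L2, U1 * U2, -L1 * U2, -U1 * L2]
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=[[0, 0, 4]], b_eq=[1],
                      bounds=[b1, b2, (None, None)])
        return res.x, -res.fun

    x, value = mccormick_relaxation((0.125, 0.875), (0.25, 1.75))
    print(x, value)    # the relaxation's solution and bound over the propagated box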
Interval Splitting

Solve the linear relaxation.
[Figure: the relaxation's solution in the (x1, x2) plane.]
Since the solution is infeasible, split an interval and branch: x2 ∈ [0.25, 1] or x2 ∈ [1, 1.75].
[Figure: the two branches x2 ∈ [0.25, 1] and x2 ∈ [1, 1.75].]

In the branch x2 ∈ [0.25, 1], the solution of the relaxation is feasible, with value 1.25. This becomes the best feasible solution so far.
In the branch x2 ∈ [1, 1.75], the solution of the relaxation is not quite feasible, with value 1.854.
Lagrangean Propagation

In the branch x2 ∈ [1, 1.75], the optimal value of the relaxation is 1.854, and the Lagrange multiplier (dual variable) of 2x1 + x2 ≤ 2 in its solution is 1.1.
Any reduction ∆ in the right-hand side of 2x1 + x2 ≤ 2 reduces the optimal value of the relaxation by 1.1∆.
So any reduction ∆ in the left-hand side (now 2) has the same effect.
Lagrangean Propagation
Any reduction ∆ in the left-hand side of 2x1 + x2 ≤ 2 reduces the optimal value of the relaxation (now 1.854) by 1.1∆.
The optimal value should not be reduced below the lower bound 1.25 (the value of the best feasible solution so far).
So we should have 1.1[2 − (2x1 + x2)] ≤ 1.854 − 1.25, or

2x1 + x2 ≥ 2 − (1.854 − 1.25)/1.1 = 1.451

This inequality can be propagated to reduce domains.
It reduces the domain of x2 from [1,1.714] to [1.166,1.714].
Reduced-cost Variable Fixing

In general, given a constraint g(x) ≤ b with Lagrange multiplier λ, the following constraint is valid:

g(x) ≥ b − (v − L)/λ

where v is the optimal value of the relaxation and L is a lower bound on the optimal value of the original problem (the value of the best feasible solution so far).
If g(x) ≤ b is a nonnegativity constraint −xj ≤ 0 with Lagrange multiplier λ (i.e., the reduced cost of xj is −λ), then

−xj ≥ −(v − L)/λ, or xj ≤ (v − L)/λ

If xj is a 0-1 variable and (v − L)/λ < 1, then we can fix xj to 0. This is reduced-cost variable fixing.
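In code the test is a one-liner. The reduced cost below is a made-up number, used only for illustration with the running example's v and L:

    def can_fix_to_zero(v, L, reduced_cost):
        """0-1 variable with reduced cost -lam can be fixed to 0 if (v - L)/lam < 1."""
        lam = -reduced_cost
        return lam > 0 and (v - L) / lam < 1

    print(can_fix_to_zero(v=1.854, L=1.25, reduced_cost=-0.7))   # True: 0.604/0.7 < 1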
Example: Product Configuration
Variable indices
Filtering for element constraint
Relaxation of element
Relaxing a disjunction of linear systems
Product Configuration
We want to configure a computer by choosing the type of power supply, the type of disk drive, and the type of memory chip.
We also choose the number of disk drives and memory chips.
Use only 1 type of disk drive and 1 type of memory.
Constraints:
Generate enough power for the components.
Disk space at least 700.
Memory at least 850.
Minimize weight subject to these constraints.
Product Configuration

[Figure: a personal computer configured from a choice of power supplies, disk drives, and memory chips.]
Product Configuration
Let
ti = type of component i installed.
qi = quantity of component i installed.
These are the problem variables.
This problem will illustrate the element constraint.
Product Configuration

min Σj cj vj
vj = Σi qi A(i, ti, j), for all j
Lj ≤ vj ≤ Uj, for all j

where
• vj = amount of attribute j produced (< 0 if consumed): memory, heat, power, weight, etc.
• cj = unit cost of producing attribute j
• qi = quantity of component i installed
• A(i, ti, j) = amount of attribute j produced by type ti of component i

Note that ti is a variable index.
Variable Indices

Variable indices are implemented with the element constraint.
The y in xy is a variable index.
Constraint programming solvers implement xy by replacing it with a new variable z and adding the constraint element(y, (x1,…,xn), z).
This constraint sets z equal to the y-th element of the list (x1,…,xn).
So qi A(i, ti, j) is replaced by zij, with the new constraint element(ti, (qi Ai1j, …, qi Aimj), zij) for each i, and Σi qi A(i, ti, j) becomes Σi zij.
The element constraint can be propagated and relaxed.
Filtering for Element

element(y, (x1,…,xn), z) can be processed with a domain reduction algorithm that maintains arc consistency:

Dz ← Dz ∩ (∪j∈Dy Dxj)
Dy ← Dy ∩ { j | Dxj ∩ Dz ≠ ∅ }
Dxj ← Dxj ∩ Dz if Dy = {j}, and Dxj otherwise

where Dz, Dy, Dxj are the domains of z, y, xj.
Filtering for Element

Example: element(y, (x1, x2, x3, x4), z). The initial domains are

Dz = {20, 30, 60, 80, 90}
Dy = {1, 3, 4}
Dx1 = {10, 50}
Dx2 = {10, 20}
Dx3 = {40, 50, 80, 90}
Dx4 = {40, 50, 70}

The reduced domains are

Dz = {80, 90}
Dy = {3}
Dx1 = {10, 50}
Dx2 = {10, 20}
Dx3 = {80, 90}
Dx4 = {40, 50, 70}
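A sketch of one pass of these rules, run on the example's domains (dictionary-based, with our own naming):

    def filter_element(Dy, Dx, Dz):
        """One pass of the three reduction rules for element(y, (x1,...,xn), z)."""
        Dz = Dz & set().union(*(Dx[j] for j in Dy))          # rule for D_z
        Dy = {j for j in Dy if Dx[j] & Dz}                   # rule for D_y
        if len(Dy) == 1:                                     # rule for D_xj
            j = next(iter(Dy))
            Dx[j] = Dx[j] & Dz
        return Dy, Dx, Dz

    Dy = {1, 3, 4}
    Dx = {1: {10, 50}, 2: {10, 20}, 3: {40, 50, 80, 90}, 4: {40, 50, 70}}
    Dz = {20, 30, 60, 80, 90}
    Dy, Dx, Dz = filter_element(Dy, Dx, Dz)
    print(Dy, Dx[3], Dz)     # {3} {80, 90} {80, 90} -- matching the reductions above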
Relaxation of Element
element(y,(x1,…,xn),z) can be given a continuous relaxation.
It implies the disjunction ∨k (z = xk).
In general, a disjunction of linear systems can be given a convex hull relaxation.
This provides a way to relax the element constraint.
Relaxing a Disjunction of Linear Systems
Consider the disjunction

∨k (Ak x ≤ bk)

It describes a union of polyhedra defined by Ak x ≤ bk.
Every point x in the convex hull of this union is a convex combination of points x̄k in the polyhedra.

[Figure: polyhedra A1x ≤ b1, A2x ≤ b2, A3x ≤ b3, their convex hull, and a point x expressed as a convex combination of points x̄k.]
Relaxing a Disjunction of Linear Systems
So we have

x = Σk αk x̄k, where Ak x̄k ≤ bk for all k, Σk αk = 1, αk ≥ 0

Using the change of variable xk = αk x̄k, we get the convex hull relaxation

x = Σk xk, Ak xk ≤ αk bk for all k, Σk αk = 1, αk ≥ 0
Relaxation of Element

The convex hull relaxation of element(y, (x1,…,xn), z) is the convex hull relaxation of the disjunction ∨k (z = xk), which simplifies to

z = Σk x̄kk
xi = Σk x̄ki, for all i

where x̄ki is the disaggregated copy of xi in disjunct k.
Product Configuration

Returning to the product configuration problem, the model with the element constraint is

min Σj cj vj
Lj ≤ vj ≤ Uj, all j
vj = Σi zij, all j
element(ti, (qi Ai1j, …, qi Aimj), zij), all i, j

with domains

t1 ∈ {A,B,C} (type of power supply)
t2 ∈ {A,B} (type of disk drive)
t3 ∈ {A,B,C} (type of memory chip)
vj ∈ [Lj, Uj], all j
q1 ∈ {1} (number of power supplies)
q2, q3 ∈ {1,2,3} (number of disks and memory chips)
Product Configuration

We can use knapsack cuts to help filter domains. Since vj ∈ [Lj, Uj], the constraint vj = Σi zij implies

Lj ≤ Σi zij ≤ Uj

where each zij takes one of the values qi Ai1j, …, qi Aimj.
This implies the knapsack inequalities

Lj ≤ Σi maxk{Aikj} qi and Σi mink{Aikj} qi ≤ Uj

These can be used to reduce the domains of the qi.
Product Configuration

Using the lower bounds Lj for power, disk space, and memory, we have the knapsack inequalities

0 ≤ max{150, 100, 70} q1 + max{−30, −50} q2 + max{−20, −25, −30} q3
700 ≤ max{0, 0, 0} q1 + max{500, 800} q2 + max{0, 0, 0} q3
850 ≤ max{0, 0, 0} q1 + max{0, 0} q2 + max{250, 300, 400} q3

which simplify to

0 ≤ 150 q1 − 30 q2 − 20 q3
700 ≤ 800 q2
850 ≤ 400 q3

Propagation of these reduces the domain of q3 from {1,2,3} to {3}.
Product Configuration

Propagation of all the constraints reduces the domains to

q1 ∈ {1}, q2 ∈ {1,2}, q3 ∈ {3}
t1 ∈ {C}, t2 ∈ {A,B}, t3 ∈ {B,C}
z11 ∈ [150,150], z21 ∈ [−75,−30], z31 ∈ [−90,−75]
z12 ∈ [0,0], z22 ∈ [700,1600], z32 ∈ [0,0]
z13 ∈ [0,0], z23 ∈ [0,0], z33 ∈ [900,1200]
z14 ∈ [350,350], z24 ∈ [140,600], z34 ∈ [75,90]
v1 ∈ [0,45], v2 ∈ [700,1600], v3 ∈ [900,1200], v4 ∈ [565,1040]
Product Configuration

Using the convex hull relaxation of the element constraints, we have a linear relaxation of the problem:

min Σj cj vj
vj = Σi Σk∈Dti Aikj qik, all j
qi = Σk∈Dti qik, all i
vjL ≤ vj ≤ vjU, all j (current bounds on vj)
qiL ≤ qi ≤ qiU, all i (current bounds on qi)
vjL ≤ Σi maxk∈Dti{Aikj} qi, all j
Σi mink∈Dti{Aikj} qi ≤ vjU, all j
qik ≥ 0, all i, k

Here qik > 0 when type k of component i is chosen, and qik is then the quantity installed.
Product Configuration

The solution of the linear relaxation at the root node is

q1C = 1, q2A = 2, q3B = 3, other qik = 0
(v1, …, v4) = (15, 1000, 900, 705)

That is: use 1 type C power supply (t1 = C), 2 type A disk drives (t2 = A), and 3 type B memory chips (t3 = B).
Since only one qik > 0 for each i, there is no need to branch on the ti's.
This solution is feasible and therefore optimal. The problem is solved at the root node. The minimum weight is 705.
Example: Machine Scheduling
Edge finding
Benders decomposition and nogoods
Cumulative Scheduling
Machine Scheduling
We want to schedule 5 jobs on 2 machines.
Each job has a release time and deadline.
Processing times and costs differ on the two machines.
We want to minimize total cost while observing the time windows.
We will first study propagation for the 1-machine problem (edge finding).
We will then solve the 2-machine problem using Benders decomposition and propagation on the 1-machine problem.
Machine Scheduling

[Table: release times, deadlines, and the processing times of each job on machines A and B.]
Machine Scheduling

Jobs must run one at a time on each machine (disjunctive scheduling). For this we use the constraint

disjunctive(t, p)

where t = (t1, …, tn) are the start times (variables) and p = (pi1, …, pin) are the processing times on machine i (constants).
The best known filtering algorithm for the disjunctive constraint is edge finding.
Edge finding does not achieve full arc or bounds consistency.
Machine Scheduling

The 2-machine problem can be written

min Σj f(yj, j)
tj + p(yj, j) ≤ dj, all j
disjunctive((tj | yj = i), (pij | yj = i)), all i

where tj = start time of job j, yj = machine assigned to job j, and f(i, j), pij are the cost and processing time of job j on machine i.
We will first look at a scheduling problem on 1 machine.
Edge Finding

Let's try to schedule jobs 2, 3, 5 on machine A.
Initially, the earliest start time Ej and latest end time Lj of each job are its release time rj and deadline dj:

[E2, L2] = [0, 8], [E3, L3] = [2, 7], [E5, L5] = [3, 7]

[Figure: the time windows of jobs 2, 3, 5 on a time line from 0 to 10.]
Edge Finding

Job 2 must precede jobs 3 and 5, because jobs 2, 3, 5 will not fit between the earliest start time of jobs 3, 5 and the latest end time of jobs 2, 3, 5.
Therefore job 2 must end before jobs 3, 5 start.
[Figure: the windows again, marking min{E3, E5}, max{L2, L3, L5}, and the processing times pA2, pA3, pA5.]
Edge Finding

Job 2 precedes jobs 3, 5 because

max{L2, L3, L5} − min{E3, E5} < pA2 + pA3 + pA5

This is called edge finding because it finds an edge in the precedence graph for the jobs.
Edge Finding

In general, job k must precede the set S of jobs on machine i when

max j∈S∪{k} {Lj} − min j∈S {Ej} < Σ j∈S∪{k} pij
Edge Finding

Since job 2 must precede jobs 3 and 5, the earliest start times of jobs 3 and 5 can be updated: they cannot start until job 2 is finished.
This updates [E3, L3] to [3, 7].
[Figure: the updated windows.]
Edge Finding

Edge finding also determines that job 3 must precede job 5, since

max{L3, L5} − min{E5} = 7 − 3 = 4 < 5 = pA3 + pA5

This updates [E5, L5] to [5, 7], and there is too little time to run job 5. The schedule is infeasible.
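A sketch of this chain of deductions in Python. The windows are those of the example; the processing times pA2 = 3, pA3 = 2, pA5 = 3 are the values used in the computations above:

    def must_precede(k, S, E, L, p):
        """Edge-finding test: job k must precede the set S of jobs."""
        T = S | {k}
        return max(L[j] for j in T) - min(E[j] for j in S) < sum(p[j] for j in T)

    E = {2: 0, 3: 2, 5: 3}          # earliest start times
    L = {2: 8, 3: 7, 5: 7}          # latest end times
    p = {2: 3, 3: 2, 5: 3}          # processing times on machine A

    print(must_precede(2, {3, 5}, E, L, p))     # True: 8 - 2 = 6 < 3 + 2 + 3
    # jobs 3 and 5 cannot start until job 2 is finished
    E[3], E[5] = max(E[3], E[2] + p[2]), max(E[5], E[2] + p[2])
    print(must_precede(3, {5}, E, L, p))        # True: 7 - 3 = 4 < 2 + 3
    E[5] = max(E[5], E[3] + p[3])               # now [E5, L5] = [5, 7]
    print(L[5] - E[5] < p[5])                   # True: infeasible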
Edge Finding

Both edge finding and the resulting updates require enumeration of exponentially many subsets of jobs… if done naively.
Both can be done in n log n time (n = number of jobs) if one is clever.
Start with the Jackson pre-emptive schedule.
Benders Decomposition
We will solve the 2-machine scheduling problem with logic-based Benders decomposition.
First solve an assignment problem that allocates jobs to machines (master problem).
Given this allocation, schedule the jobs assigned to each machine (subproblem).
If a machine has no feasible schedule:
Determine which jobs cause the infeasibility.
Generate a nogood (Benders cut) that excludes assigning these jobs to that machine again.
Add the Benders cut to the master problem and re-solve it.
Repeat until the subproblems are feasible. (The loop is sketched below.)
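Schematically, the loop looks as follows; solve_master and schedule_machine are hypothetical placeholders for an MILP solver and a 1-machine CP scheduler:

    def benders_scheduling(jobs, machines, solve_master, schedule_machine):
        """Logic-based Benders: assign jobs in the master, schedule them in the subproblem."""
        cuts = []
        while True:
            assign = solve_master(cuts)              # min-cost assignment respecting all cuts
            ok = True
            for m in machines:
                on_m = [j for j in jobs if assign[j] == m]
                if schedule_machine(m, on_m) is None:    # no feasible schedule on m
                    cuts.append((m, on_m))               # record a nogood
                    ok = False
            if ok:
                return assign

In the master each recorded pair (m, on_m) becomes the cut Σ j∈on_m (1 − xmj) ≥ 1; a sharper implementation would first shrink on_m to a smaller subset that still causes the infeasibility.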
Benders Decomposition

We will solve the master problem as an integer programming problem. Let binary variable xij = 1 when yj = i.
The master problem (which assigns jobs to machines) is

min Σj f(yj, j)
s.t. Benders cuts

or, in terms of the xij,

min Σij fij xij
s.t. Benders cuts
xij ∈ {0,1}
Benders Decomposition

We will add a relaxation of the subproblem to the master problem, to speed up solution:

min Σij fij xij
pA1xA1 + pA2xA2 + pA3xA3 + pA4xA4 + pA5xA5 ≤ 10
pA2xA2 + pA3xA3 + pA4xA4 + pA5xA5 ≤ 8
pA3xA3 + pA4xA4 + pA5xA5 ≤ 5
pA4xA4 + pA5xA5 ≤ 3
pB2xB2 + pB3xB3 + pB4xB4 + pB5xB5 ≤ 8
pB3xB3 + pB4xB4 + pB5xB5 ≤ 5

For example, if jobs 3, 4, 5 are all assigned to machine A, their total processing time on that machine must fit between their earliest release time (2) and latest deadline (7), which gives the constraint with right-hand side 5.
Benders Decomposition

The solution of the master problem is

xA1 = xA2 = xA3 = xA5 = 1, xB4 = 1, other xij = 0

That is, assign jobs 1, 2, 3, 5 to machine A and job 4 to machine B.
The subproblem is to schedule these jobs on the 2 machines.
Machine B is trivial to schedule (only 1 job).
We found above that jobs 2, 3, 5 cannot be scheduled on machine A.
Benders Decomposition

So we create a Benders cut (nogood) that excludes assigning jobs 2, 3, 5 (the jobs that caused the infeasibility) to machine A:

(1 − xA2) + (1 − xA3) + (1 − xA5) ≥ 1

Add this constraint to the master problem. The solution of the master problem is now

xA1 = xA2 = xA5 = 1, xB3 = xB4 = 1, other xij = 0
Benders Decomposition
So we schedule jobs 1, 2, 5 on machine A, jobs 3, 4 on machine B.
These schedules are feasible.
The resulting solution is optimal, with min cost 130.
Cumulative Scheduling

Jobs can run simultaneously, provided the total resource consumption rate never exceeds a limit. For this we use the constraint

cumulative(t, p, c, C)

where t = start times (variables), p = processing times (constants), c = resource consumption rates (constants), and C = the resource limit (a constant).
There are several filtering methods.
Cumulative Scheduling
Filtering methods:
Time tabling.
Edge finding.
Extended edge finding.
Not-first/not-last.
Energetic reasoning.
All can be done in polynomial time, using clever algorithms and data structures. (A minimal time-tabling check is sketched below.)
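The simplest of these can be sketched briefly. In time tabling, job j must be running during [Lj − pj, Ej + pj) whenever that interval is nonempty (its compulsory part), and the stacked compulsory parts must never exceed C. A minimal overload check along these lines (our own sketch; a real filter would also tighten the Ej and Lj):

    def timetable_overload(E, L, p, c, C):
        """True if the compulsory parts alone already exceed the capacity C."""
        events = []
        for j in p:
            start, end = L[j] - p[j], E[j] + p[j]    # latest start, earliest end
            if start < end:                          # job j has a compulsory part
                events += [(start, c[j]), (end, -c[j])]
        load = 0
        for t, delta in sorted(events):              # ends sort before starts at equal t
            load += delta
            if load > C:
                return True
        return False

    # two jobs whose compulsory parts overlap on [1, 3) with rate 1 + 2 > C = 2
    print(timetable_overload(E={1: 0, 2: 1}, L={1: 4, 2: 4},
                             p={1: 3, 2: 3}, c={1: 1, 2: 2}, C=2))   # True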