ICCV2009: MAP Inference in Discrete Models: Part 4
Transcript of ICCV2009: MAP Inference in Discrete Models: Part 4
Pushmeet Kohli
ICCV 2009
Course programme
9.30-10.00 Introduction (Andrew Blake)
10.00-11.00 Discrete Models in Computer Vision (Carsten Rother)
15min Coffee break
11.15-12.30 Message Passing: DP, TRW, LP relaxation (Pawan Kumar)
12.30-13.00 Quadratic pseudo-boolean optimization (Pushmeet Kohli)
1 hour Lunch break
14:00-15.00 Transformation and move-making methods (Pushmeet Kohli)
15:00-15.30 Speed and Efficiency (Pushmeet Kohli)
15min Coffee break
15:45-16.15 Comparison of Methods (Carsten Rother)
16:30-17.30 Recent Advances: Dual-decomposition, higher-order, etc. (Carsten Rother + Pawan Kumar)
All material will be online after the conference: http://research.microsoft.com/en-us/um/cambridge/projects/tutorial/
E(x) = ∑i fi(xi) + ∑ij gij(xi, xj) + ∑c hc(xc)
Unary / Pairwise / Higher Order
Image Segmentation
E(x) = ∑i ci xi + ∑i,j dij |xi − xj|
E: {0,1}n → ℝ
n = number of pixels
Space of Problems
[Figure: map of the space of problems (n = number of variables) — Tree-Structured Pair-wise: O(n3); Submodular Functions: O(n6); the Segmentation Energy lies in the submodular class; CSP and MAXCUT are NP-Hard]
More General Minimization Problems
st-mincut and Pseudo-boolean optimization
Speed and Efficiency
A pseudo-boolean function f: {0,1}n → ℝ is submodular if
f(A) + f(B) ≥ f(A∨B) + f(A∧B) for all A, B ∈ {0,1}n
(∨ = element-wise OR, ∧ = element-wise AND)
Example: n = 2, A = [1,0], B = [0,1]:
f([1,0]) + f([0,1]) ≥ f([1,1]) + f([0,0])
Property: the sum of submodular functions is submodular
E(x) = ∑i ci xi + ∑i,j dij |xi − xj|
The binary image segmentation energy is submodular
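The definition above can be checked directly by brute force for small n. The sketch below (illustrative Python, not part of the original slides; coefficients are made up) tests the element-wise OR/AND inequality over all pairs, and confirms that the segmentation-style energy ∑i ci xi + ∑i,j dij |xi − xj| with dij ≥ 0 is submodular.

```python
from itertools import product

def is_submodular(f, n):
    """Brute-force check of f(A) + f(B) >= f(A OR B) + f(A AND B)
    over all pairs A, B in {0,1}^n (exponential; small n only)."""
    for A in product([0, 1], repeat=n):
        for B in product([0, 1], repeat=n):
            A_or_B = tuple(a | b for a, b in zip(A, B))
            A_and_B = tuple(a & b for a, b in zip(A, B))
            if f(A) + f(B) < f(A_or_B) + f(A_and_B):
                return False
    return True

# Segmentation-style energy: sum_i c_i x_i + sum_ij d_ij |x_i - x_j|
# (example coefficients, chosen only for illustration)
c = [2, -1, 3]
d = {(0, 1): 4, (1, 2): 1}

def E(x):
    return (sum(ci * xi for ci, xi in zip(c, x))
            + sum(dij * abs(x[i] - x[j]) for (i, j), dij in d.items()))

print(is_submodular(E, 3))  # True
```

Negating a pairwise term (e.g. −|x0 − x1|) breaks the inequality, so the same checker returns False for it.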
Discrete analogues of concave functions [Lovász, '83]
Widely applied in Operations Research
Applications in Machine Learning:
MAP Inference in Markov Random Fields
Clustering [Narasimhan , Jojic, & Bilmes, NIPS 2005]
Structure Learning [Narasimhan & Bilmes, NIPS 2006]
Maximizing the spread of influence through a social network [Kempe, Kleinberg & Tardos, KDD 2003]
Polynomial time algorithms:
Ellipsoid Algorithm [Grötschel, Lovász & Schrijver '81]
First strongly polynomial algorithms: [Iwata et al. '00] [A. Schrijver '00]
Current best: O(n5 Q + n6), where Q is the function evaluation time [Orlin '07]
Symmetric functions E(x) = E(1−x) can be minimized in O(n3)
Minimizing pairwise submodular functions: can be transformed to st-mincut/max-flow [Hammer, 1965]
Very low empirical running time, ~O(n)
E(x) = ∑i fi(xi) + ∑ij gij(xi, xj)
[Figure: st-graph with Source, Sink and nodes v1, v2; edge capacities 2, 5, 9, 4, 1, 2]
Graph (V, E, C)
Vertices V = {v1, v2 ... vn}
Edges E = {(v1, v2) ....}
Costs C = {c(1, 2) ....}
What is an st-cut?
An st-cut (S, T) divides the nodes between source and sink.
What is the cost of an st-cut?
The sum of the costs of all edges going from S to T, e.g. 5 + 1 + 9 = 15.
What is the st-mincut?
The st-cut with the minimum cost, here 2 + 2 + 4 = 8.
[Figure: example cuts on the st-graph above]
Construct a graph such that:
1. Any st-cut corresponds to an assignment of x
2. The cost of the cut is equal to the energy of x : E(x)
[Figure: the st-mincut on the constructed graph yields the minimizing solution of E(x)]
[Hammer, 1965] [Kolmogorov and Zabih, 2002]
E(x) = ∑i θi(xi) + ∑i,j θij(xi, xj)
θij(0,1) + θij(1,0) ≥ θij(0,0) + θij(1,1) for all ij
E(x) = ∑i ci xi + ∑i,j cij xi(1−xj), cij ≥ 0
Equivalent (transformable)
[Figure: graph construction built up term by term — Source (0), Sink (1), nodes a1, a2]
E(a1,a2) = 2a1
E(a1,a2) = 2a1 + 5ā1
E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2
E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2
E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2
Cut corresponding to a1 = 1, a2 = 1: E(1,1) = 11 = cost of cut
st-mincut: a1 = 1, a2 = 0, E(1,0) = 8 = st-mincut cost
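The claim that the st-mincut cost equals the minimum energy can be sanity-checked by enumerating all four labellings of the example energy (a Python sketch, not part of the original slides):

```python
from itertools import product

# E(a1,a2) = 2a1 + 5(1-a1) + 9a2 + 4(1-a2) + 2 a1 (1-a2) + (1-a1) a2
def E(a1, a2):
    return (2*a1 + 5*(1 - a1) + 9*a2 + 4*(1 - a2)
            + 2*a1*(1 - a2) + (1 - a1)*a2)

# exhaustively evaluate the energy on all 2^2 labellings
energies = {(a1, a2): E(a1, a2) for a1, a2 in product([0, 1], repeat=2)}
best = min(energies, key=energies.get)
print(best, energies[best])  # (1, 0) 8 -- matches the st-mincut cost
```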
[Figure: the same st-graph with edge capacities 2, 5, 9, 4, 2, 1]
Solve the dual maximum flow problem:
Compute the maximum flow between Source and Sink s.t.
Edges: Flow ≤ Capacity
Nodes: Flow in = Flow out
Max-flow/min-cut theorem: assuming non-negative capacities, in every network the maximum flow equals the cost of the st-mincut.
Augmenting Path Based Algorithms
1. Find a path from source to sink with positive capacity
2. Push the maximum possible flow through this path
3. Repeat until no path can be found
[Figure: the residual capacities are updated after each augmentation; the flow grows 0 → 2 → 6 → 8, at which point no augmenting path remains]
[Figure: flow is pushed through the graph for E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2; each augmentation reparametrizes the energy]
2a1 + 5ā1 = 2(a1 + ā1) + 3ā1 = 2 + 3ā1
E(a1,a2) = 2 + 3ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2
9a2 + 4ā2 = 4(a2 + ā2) + 5a2 = 4 + 5a2
E(a1,a2) = 6 + 3ā1 + 5a2 + 2a1ā2 + ā1a2
3ā1 + 5a2 + 2a1ā2 = 2(ā1 + a2 + a1ā2) + ā1 + 3a2 = 2(1 + ā1a2) + ā1 + 3a2
since F1 = ā1 + a2 + a1ā2 and F2 = 1 + ā1a2 agree on all assignments:
a1 a2 | F1 F2
0  0  | 1  1
0  1  | 2  2
1  0  | 1  1
1  1  | 1  1
E(a1,a2) = 8 + ā1 + 3a2 + 3ā1a2
No more augmenting paths possible.
Total Flow = 8 is a bound on the optimal solution; the residual graph has only positive coefficients.
Tight bound → inference of the optimal solution becomes trivial:
a1 = 1, a2 = 0, E(1,0) = 8 = st-mincut cost
Augmenting Path and Push-Relabel
[Slide credit: Andrew Goldberg]
[Table: worst-case complexities of augmenting-path and push-relabel algorithms; n: #nodes, m: #edges, U: maximum edge weight; the algorithms assume non-negative edge weights]
Ford-Fulkerson: choose any augmenting path
[Figure: graph with terminal edges of capacity 1000 and a middle edge of capacity 1; good augmenting paths avoid the middle edge, while a bad augmenting path crosses it and pushes only one unit of flow, leaving residual capacities of 999]
We will have to perform 2000 augmentations!
Worst case complexity: O(m × Total_Flow) (pseudo-polynomial bound: depends on flow)
Dinic: choose the shortest augmenting path
Worst case complexity: O(m n2)
(n: #nodes, m: #edges)
Specialized algorithms for vision problems Grid graphs
Low connectivity (m ~ O(n))
Dual search tree augmenting path algorithm[Boykov and Kolmogorov PAMI 2004]
• Finds approximate shortest augmenting paths efficiently
• High worst-case time complexity
• Empirically outperforms other algorithms on vision problems
Efficient code available on the web
http://www.adastral.ucl.ac.uk/~vladkolm/software.html
E(x) = ∑i ci xi + ∑i,j dij |xi − xj|
x* = arg min E(x)
How to minimize E(x)?
E: {0,1}n → ℝ, 0 → fg, 1 → bg
n = number of pixels
Sink (1)
Source (0)
Graph *g;
for (all pixels p) {
    /* Add a node to the graph */
    nodeID(p) = g->add_node();
    /* Set cost of terminal edges */
    set_weights(nodeID(p), fgCost(p), bgCost(p));
}
for (all adjacent pixels p, q)
    add_weights(nodeID(p), nodeID(q), cost(p, q));
g->compute_maxflow();
label_p = g->is_connected_to_source(nodeID(p));
// label of pixel p (0 or 1)
[Figure: segmentation graph — Source (0), Sink (1); terminal edges fgCost(a1), bgCost(a1), fgCost(a2), bgCost(a2); pairwise edge cost(p,q)]
Result: a1 = bg, a2 = fg
Lunch
MIT Press, summer 2010
Topics of this course and much, much more
Contributors: usual suspects – lecturers on this course + Boykov,
Kolmogorov, Weiss, Freeman, ....
one for the office and one for home
www.research.microsoft.com/vision/MRFbook
Advances in Markov Random Fields for Computer Vision
More General Minimization Problems
st-mincut and Pseudo-boolean optimization
Speed and Efficiency
Non-submodular Energy Functions
Mixed (Real-Integer) Problems
Higher Order Energy Functions
Multi-label Problems
Ordered Labels ▪ Stereo (depth labels)
Unordered Labels ▪ Object segmentation ('car', 'road', 'person')
Minimizing general non-submodular functions is NP-hard.
Commonly used method is to solve a relaxation of the problem
E(x) = ∑i θi(xi) + ∑i,j θij(xi, xj)
θij(0,1) + θij(1,0) < θij(0,0) + θij(1,1) for some ij
[Boros and Hammer, ‘02]
pairwise nonsubmodular
unary
pairwise submodular
[Equation: the pairwise terms θpq(0,0), θpq(0,1), θpq(1,0), θpq(1,1) are replaced by symmetrized terms θ̃pq(0,0), θ̃pq(0,1), θ̃pq(1,0), θ̃pq(1,1) over the doubled variable set]
[Boros and Hammer, ‘02]
Double the number of variables: xp → (xp, x̄p), where x̄p = (1 − xp)
Ignore the constraint and solve
Local Optimality
[Rother, Kolmogorov, Lempitsky, Szummer] [CVPR 2007]
[Figure: QPBO labels some nodes p, q, r, s, t (0/1) and leaves others unresolved (?); probing node p fixes it to 0 and to 1 in turn and compares the resulting labellings]
Probe node p. What can we say about the variables?
• r → is always 0
• s → is always equal to q
• t → is 0 when q = 1
Probe nodes in an order until the energy is unchanged.
The simplified energy preserves global optimality and (sometimes) gives the global minimum.
The result depends slightly on the order.
• Property: E(y’) ≤ E(y) [autarky property]
[Figure: a partial labelling x from QPBO, a complete labelling y (e.g. from BP), and the fused result y' = FUSE(x, y), which keeps the labelled part of x and fills the rest from y]
Non-submodular Energy Functions
Mixed (Real-Integer) Problems
Higher Order Energy Functions
Multi-label Problems
Ordered Labels ▪ Stereo (image intensity, depth)
Unordered Labels ▪ Object segmentation ('car', 'road', 'person')
[Kumar et al, 05] [Kohli et al, 06,08]
Image
colour appearance based Segmentation
Need for a human like segmentation
Segmentation Result
x – binary image segmentation (xi ∊ {0,1})
ω – non-local parameter (lives in some large set Ω)
constantunary
potentialspairwise
potentials
E(x,ω) = C(ω) + ∑i θi(ω, xi) + ∑i,j θij(ω, xi, xj)
≥ 0
Rough Shape Prior
Stickman Model
ωPose
θi (ω, xi) Shape Prior
[Kohli et al, 06,08]
x – binary image segmentation (xi ∊ {0,1})
ω – non-local parameter (lives in some large set Ω)
constantunary
potentialspairwise
potentials
E(x,ω) = C(ω) + ∑i θi(ω, xi) + ∑i,j θij(ω, xi, xj)
≥ 0
ωTemplate Position
Scale Orientation
[Kohli et al, 06,08] [Lempitsky et al, 08]
x – binary image segmentation (xi ∊ {0,1})
ω – non-local parameter (lives in some large set Ω)
constantunary
potentialspairwise
potentials
E(x,ω) = C(ω) + ∑i θi(ω, xi) + ∑i,j θij(ω, xi, xj)
≥ 0
{x*,ω*} = arg min E(x,ω)
• Standard “graph cut” energy if ω is fixed
x,ω
[Kohli et al, 06,08] [Lempitsky et al, 08]
Local Method: Gradient Descent over ω
ω*
ω* = arg minω minx E(x, ω)
Submodular
[Kohli et al, 06,08]
Local Method: Gradient Descent over ω
ω* = arg minω minx E(x, ω)
Submodular
Dynamic Graph Cuts
15–20× speedup!
E (x,ω1)
E (x,ω2)
Similar Energy Functions
[Kohli et al, 06,08]
[Kohli et al, 06,08]
Global Method: Branch and Mincut
[Lempitsky et al, 08]
Produces the global optimal solution
Exhaustively explores Ω in the worst case
Ω0
Ω0
Ω (space of w) is hierarchically clustered
Standard best-first branch-and-bound search:
Small fraction of nodes is visited
lowest lower bound
A
B
C
30,000,000 shapes
Exhaustive search: 30,000,000 mincuts
Branch-and-Mincut: 12,000 mincuts
Speed-up: 2500 times(30 seconds per 312x272 image)
[Lempitsky et al, 08]
Left ventricle epicardium tracking (work in progress)
Branch & Bound segmentation
Shape prior from other sequences
5,200,000 templates
≈20 seconds per frame
Speed-up 1150
Data courtesy: Dr Harald Becher, Department of Cardiovascular Medicine, University of Oxford
Original sequence No shape prior
[Lempitsky et al, 08]
Non-submodular Energy Functions
Mixed (Real-Integer) Problems
Higher Order Energy Functions
Multi-label Problems
Ordered Labels▪ Stereo (depth labels)
Unordered Labels▪ Object segmentation ( ‘car’, `road’, `person’)
Pairwise functions have limited expressive power
Inability to incorporate region based likelihoods and priors
Field of Experts Model[Roth & Black CVPR 2005 ][Potetz, CVPR 2007]
Minimize Curvature [Woodford et al. CVPR 2008 ]
Other Examples:[Rother, Kolmogorov, Minka & Blake, CVPR 2006][Komodakis and Paragios, CVPR 2009][Rother, Kohli, Feng, Jia, CVPR 2009][Ishikawa, CVPR 2009]And many others ...
E(x) = ∑i ci xi + ∑i,j dij |xi − xj|
E: {0,1}n → ℝ, 0 → fg, 1 → bg
n = number of pixels
[Boykov and Jolly ‘ 01] [Blake et al. ‘04] [Rother, Kolmogorov and Blake `04]
Image Unary Cost Segmentation
Patch Dictionary (Tree)
[Figure: patch dictionary (tree) with costs Cmax and C1]
h(Xp) = { C1 if xi = 0 for all i ∈ p; Cmax otherwise }
E(x) = ∑i ci xi + ∑i,j dij |xi − xj| + ∑p hp(Xp)
E: {0,1}n → ℝ, 0 → fg, 1 → bg
n = number of pixels
[Kohli et al. '07]
[Kohli et al. ‘07]
E(x) = ∑i ci xi + ∑i,j dij |xi − xj| + ∑p hp(Xp)
Image Pairwise Segmentation Final Segmentation
p
E: {0,1}n → R
0 →fg, 1→bg
n = number of pixels
[Kohli et al. ‘07]
T
Sst-mincut
Pairwise SubmodularFunction
Higher Order Submodular
Functions
Billionnet and Minoux [DAM 1985]; Kolmogorov & Zabih [PAMI 2004]; Freedman & Drineas [CVPR 2005]; Kohli, Kumar, Torr [CVPR 2007, PAMI 2008]; Kohli, Ladicky, Torr [CVPR 2008, IJCV 2009]; Ramalingam, Kohli, Alahari, Torr [CVPR 2008]; Zivny et al. [CP 2008]
Exact Transformation
?
Identified transformable families of higher order function s.t.
1. Constant or polynomial number of auxiliary variables (a) added
2. All pairwise functions (g) are submodular
PairwiseSubmodular
Function
Higher Order Function
Example: H(X) = F(∑i xi), with F concave
[Figure: plot of the concave function H(X) against ∑i xi]
Simple example using auxiliary variables:
f(x) = { 0 if all xi = 0; C1 otherwise }, x ∈ L = {0,1}n
minx f(x) = minx, a∈{0,1} [ C1 a + C1 ā ∑i xi ]
Higher order submodular function → quadratic submodular function
∑xi < 1 → a = 0 (ā = 1) → f(x) = 0
∑xi ≥ 1 → a = 1 (ā = 0) → f(x) = C1
[Figure: cost as a function of ∑xi — the line C1 ∑xi (a = 0) and the constant C1 (a = 1); the lower envelope of concave functions is concave]
More generally: minx f(x) = minx, a∈{0,1} [ f1(x) a + f2(x) ā ]
[Figure: lower envelope of the concave functions f1(x) and f2(x) over ∑xi]
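The auxiliary-variable transformation above can be verified exhaustively for a small n (a Python sketch, not from the slides; C1 = 3 and n = 4 are arbitrary choices):

```python
from itertools import product

C1 = 3  # arbitrary positive constant for the example

def f(x):
    # higher-order potential: 0 if all variables are 0, C1 otherwise
    return 0 if sum(x) == 0 else C1

def pairwise_min(x):
    # minimize the quadratic function C1*a + C1*(1-a)*sum(x) over a in {0,1}
    return min(C1*a + C1*(1 - a)*sum(x) for a in (0, 1))

# the minimum over the auxiliary variable reproduces f(x) exactly
for x in product([0, 1], repeat=4):
    assert f(x) == pairwise_min(x)
print("transformation is exact for all 2^4 labellings")
```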
Transforming Potentials with 3 variables [Woodford, Fitzgibbon, Reid, Torr, CVPR 2008]
Transforming general “sparse” higher order functions [Rother, Kohli, Feng, Jia, CVPR 2009][Ishikawa, CVPR 2009][Komodakis and Paragios, CVPR 2009]
Test Image Test Image(60% Noise)
TrainingImage
Result
PairwiseEnergy
P(x)
Minimized usingst-mincut or max-product
message passing
Test Image Test Image(60% Noise)
TrainingImage
Result
PairwiseEnergy
P(x)
Minimized usingst-mincut or max-product
message passing
Higher Order Structure not Preserved
Minimize:
Where:
Higher Order Function (|c| = 10×10 = 100)
Assigns a cost to each of the 2^100 possible labellings!
Exploit function structure to transform it to a Pairwise function
E(x) = P(x) + ∑c hc(Xc)
hc: {0,1}|c| → ℝ
p1 p2 p3
Test Image Test Image(60% Noise)
TrainingImage
PairwiseResult
Higher-Order Result
Learned Patterns
[Joint work with Carsten Rother ]
Non-submodular Energy Functions
Mixed (Real-Integer) Problems
Higher Order Energy Functions
Multi-label Problems
Ordered Labels▪ Stereo (depth labels)
Unordered Labels▪ Object segmentation ( ‘car’, `road’, `person’)
Exact Transformation to QPBF
Move making algorithms
miny E(y), where E(y) = ∑i fi(yi) + ∑i,j gij(yi, yj)
y ∈ Labels L = {l1, l2, …, lk}
[Roy and Cox ’98] [Ishikawa ’03] [Schlesinger & Flach ’06]
[Ramalingam, Alahari, Kohli, and Torr ’08]
So what is the problem?
Multi-label problem Em(y1, y2, ..., yn) → binary-label problem Eb(x1, x2, ..., xm)
yi ∈ L = {l1, l2, …, lk}, xi ∈ {0,1}
such that:
Let Y and X be the set of feasible solutions, then
1. One-One encoding function T:X->Y
2. arg min Em(y) = T(arg min Eb(x))
• Popular encoding scheme [Roy and Cox ’98, Ishikawa ’03, Schlesinger & Flach ’06]
# Nodes = n * k
# Pairwise = m * k2
• Popular encoding scheme [Roy and Cox ’98, Ishikawa ’03, Schlesinger & Flach ’06]
# Nodes = n * k
# Pairwise = m * k2
Ishikawa’s result:
E(y) = ∑i θi(yi) + ∑i,j θij(yi, yj)
y ∈ Labels L = {l1, l2, …, lk}
θij(yi, yj) = g(|yi − yj|), a convex function
[Figure: plot of g(|yi − yj|) against |yi − yj|]
• Popular encoding scheme [Roy and Cox ’98, Ishikawa ’03, Schlesinger & Flach ’06]
# Nodes = n * k
# Pairwise = m * k2
Schlesinger & Flach '06:
E(y) = ∑i θi(yi) + ∑i,j θij(yi, yj)
y ∈ Labels L = {l1, l2, …, lk}
θij(li+1, lj) + θij(li, lj+1) ≥ θij(li, lj) + θij(li+1, lj+1)
Image MAP Solution
Scanlinealgorithm
[Roy and Cox, 98]
Applicability: cannot handle truncated costs (non-robust)
Computational cost: very high, problem size = |Variables| × |Labels|
Gray-level image denoising (1 Mpixel image): ~2.5 × 10^8 graph nodes
[Figure: θij(yi, yj) = g(|yi − yj|); truncating the cost gives discontinuity-preserving potentials, Blake & Zisserman '83, '87]
Transformation | Unary Potentials | Pairwise Potentials | Complexity
Ishikawa Transformation [03] | Arbitrary | Convex and Symmetric | T(nk, mk2)
Schlesinger Transformation [06] | Arbitrary | Submodular | T(nk, mk2)
Hochbaum [01] | Linear | Convex and Symmetric | T(n, m) + n log k
Hochbaum [01] | Convex | Convex and Symmetric | O(mn log n log nk)
Other "less known" algorithms
T(a,b) = complexity of maxflow with a nodes and b edges
Exact Transformation to QPBF
Move making algorithms
miny E(y), where E(y) = ∑i fi(yi) + ∑i,j gij(yi, yj)
y ∈ Labels L = {l1, l2, …, lk}
[Boykov , Veksler and Zabih 2001] [Woodford, Fitzgibbon, Reid, Torr, 2008]
[Lempitsky, Rother, Blake, 2008] [Veksler, 2008] [Kohli, Ladicky, Torr 2008]
[Figure: energy over the solution space; a search neighbourhood around the current solution contains the optimal move]
Key property of the move space: a bigger move space means
• better solutions
• but finding the optimal move is hard
[Figure: energy over the solution space with a larger search neighbourhood around the current solution xc(t)]
Minimizing Pairwise Functions[Boykov Veksler and Zabih, PAMI 2001]
• Series of locally optimal moves
• Each move reduces energy
• Optimal move by minimizing submodular function
Space of solutions (x): L^n
Move space (t): 2^n (the search neighbourhood of the current solution)
n = number of variables, L = number of labels
Kohli et al. ‘07, ‘08, ‘09Extend to minimize Higher order Functions
Minimize over move variables t
x = t x1 + (1-t) x2
New solution
Current Solution
Second solution
Em(t) = E(t x1 + (1-t) x2)
For certain x1 and x2, the move energy is sub-modular QPBF
[Boykov , Veksler and Zabih 2001]
• Variables labeled α, β can swap their labels
[Boykov , Veksler and Zabih 2001]
Sky
House
Tree
GroundSwap Sky, House
• Variables labeled α, β can swap their labels
[Boykov , Veksler and Zabih 2001]
Move energy is submodular if:
Unary Potentials: Arbitrary
Pairwise potentials: Semi-metric
θij(la, lb) ≥ 0
θij(la, lb) = 0 ⇔ a = b
Examples: Potts model, Truncated Convex
[Boykov , Veksler and Zabih 2001]
• Variables labeled α, β can swap their labels
[Boykov, Veksler, Zabih]
• Variables take label a or retain current label
[Boykov , Veksler and Zabih 2001]
Sky
House
Tree
Ground
Initialize with TreeStatus: Expand GroundExpand HouseExpand Sky
[Boykov, Veksler, Zabih][Boykov , Veksler and Zabih 2001]
• Variables take label a or retain current label
Move energy is submodular if:
Unary Potentials: Arbitrary
Pairwise potentials: Metric
[Boykov, Veksler, Zabih]
θij(la, lb) + θij(lb, lc) ≥ θij(la, lc)
Metric = semi-metric + triangle inequality
Examples: Potts model, Truncated linear
Cannot solve truncated quadratic
• Variables take label a or retain current label
[Boykov , Veksler and Zabih 2001]
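The semi-metric/metric distinction above decides which move algorithm applies, and can be checked mechanically for small label sets. This Python sketch (not from the slides; label range and truncation constants are illustrative) confirms that Potts and truncated linear are metrics, while the truncated quadratic is only a semi-metric:

```python
from itertools import product

def is_semimetric(theta, labels):
    # theta >= 0, and theta(a,b) == 0 exactly when a == b
    return (all(theta(a, b) >= 0 for a, b in product(labels, repeat=2))
            and all((theta(a, b) == 0) == (a == b)
                    for a, b in product(labels, repeat=2)))

def is_metric(theta, labels):
    # semi-metric plus the triangle inequality
    return (is_semimetric(theta, labels)
            and all(theta(a, b) + theta(b, c) >= theta(a, c)
                    for a, b, c in product(labels, repeat=3)))

labels = range(5)
potts      = lambda a, b: 0 if a == b else 1
trunc_lin  = lambda a, b: min(abs(a - b), 2)
trunc_quad = lambda a, b: min((a - b) ** 2, 4)

print(is_metric(potts, labels))           # True: expansion applies
print(is_metric(trunc_lin, labels))       # True: expansion applies
print(is_metric(trunc_quad, labels))      # False: 1 + 1 < 4 for labels 0,1,2
print(is_semimetric(trunc_quad, labels))  # True: swap still applies
```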
Expansion and Swap can be derived as a primal dual scheme
Get solution of the dual problem which is a lower bound on the energy of solution
Weak guarantee on the solution
[Komodakis et al 05, 07]
E(x) ≤ 2 (dmax / dmin) E(x*)
where dmax and dmin are the largest and smallest pairwise costs between distinct labels, θij(li, lj) = g(|li − lj|)
Move Type | First Solution | Second Solution | Guarantee
Expansion | Old solution | All alpha | Metric
Fusion | Any solution | Any solution | —
Minimize over move variables t
x = t x1 + (1-t) x2
New solution
First solution
Second solution
Move functions can be non-submodular!!
x = t x1 + (1-t) x2
x1, x2 can be continuous
[Figure: fusion of two solutions x1 and x2 into x]
Optical Flow Example
Final Solution
Solution from
Method 1
Solution from
Method 2
[Woodford, Fitzgibbon, Reid, Torr, 2008] [Lempitsky, Rother, Blake, 2008]
Move variables can be multi-label
Optimal move found out by using the Ishikawa Transform
Useful for minimizing energies with truncated convex pairwise potentials
θij (yi,yj) = min(|yi-yj|2,T)
|yi-yj|
θij (yi,yj)
T
x = (t ==1) x1 + (t==2) x2 +… +(t==k) xk
[Veksler, 2007]
[Veksler, 2008]
ImageNoisy Image
Range Moves
Expansion Move
Why?
3,600,000,000 pixels, created from about 800 8-megapixel images
[Kopf et al. (MSR Redmond) SIGGRAPH 2007 ]
[Kopf et al. (MSR Redmond) SIGGRAPH 2007 ]
Processing videos: a 1-minute video at 1 Mpixel resolution = 3.6 B pixels
3D reconstruction [500 × 500 × 500 = 0.125 B voxels]
Kohli & Torr (ICCV05, PAMI07)
Can we do better?
Segment
Segment
First Frame
Second Frame[Kohli & Torr, ICCV05 PAMI07]
Kohli & Torr (ICCV05, PAMI07)[Kohli & Torr, ICCV05 PAMI07]
Image → Flow → Segmentation
[Figure: dynamic graph cuts — minimize the energy EA of frame 1 to get solution SA; reparametrize EB using the differences between A and B; minimizing the simpler reparametrized energy EB* gives SB for frame 2, reusing the computation. 3–100000× speedup!]
[Kohli & Torr, ICCV05 PAMI07] [Komodakis & Paragios, CVPR07]
Reparametrized Energy
Kohli & Torr (ICCV05, PAMI07)
E(a1,a2) = 2a1 + 5ā1+ 9a2 + 4ā2 + 2a1ā2 + ā1a2
E(a1,a2) = 8 + ā1+ 3a2 + 3ā1a2
Original Energy
E(a1,a2) = 2a1 + 5ā1+ 9a2 + 4ā2 + 7a1ā2 + ā1a2
E(a1,a2) = 8 + ā1+ 3a2 + 3ā1a2 + 5a1ā2
New Energy
New Reparametrized Energy
[Kohli & Torr, ICCV05 PAMI07] [Komodakis & Paragios, CVPR07]
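Both reparametrizations above claim to preserve the energy for every labelling, which can be checked by enumeration (a Python sketch, not from the slides):

```python
from itertools import product

def E_orig(a1, a2):
    return (2*a1 + 5*(1 - a1) + 9*a2 + 4*(1 - a2)
            + 2*a1*(1 - a2) + (1 - a1)*a2)

def E_reparam(a1, a2):
    return 8 + (1 - a1) + 3*a2 + 3*(1 - a1)*a2

def E_new(a1, a2):
    # pairwise term 2*a1*(1-a2) changed to 7*a1*(1-a2)
    return (2*a1 + 5*(1 - a1) + 9*a2 + 4*(1 - a2)
            + 7*a1*(1 - a2) + (1 - a1)*a2)

def E_new_reparam(a1, a2):
    # previous reparametrization plus the 5*a1*(1-a2) difference
    return 8 + (1 - a1) + 3*a2 + 3*(1 - a1)*a2 + 5*a1*(1 - a2)

for a1, a2 in product([0, 1], repeat=2):
    assert E_orig(a1, a2) == E_reparam(a1, a2)
    assert E_new(a1, a2) == E_new_reparam(a1, a2)
print("reparametrized energies agree on every labelling")
```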
Original Problem(Large)
Fast partially optimal
algorithm
ApproximateSolution
[ Alahari Kohli & Torr CVPR ‘08]
Approximation algorithm
(Slow)
Original Problem(Large)
Fast partially optimal
algorithm
ApproximateSolution
Approximation algorithm
(Slow)
Reduced Problem
Solved Problem(Global Optima)
Approximation algorithm
ApproximateSolution
Fast partially optimal algorithm
[Kovtun ‘03] [ Kohli et al. ‘09]
[ Alahari Kohli & Torr CVPR ‘08]
(Fast)
Original Problem(Large)
Fast partially optimal
algorithm
ApproximateSolution
Tree ReweightedMessage Passing
(9.89 sec)
Reduced Problem
Solved Problem(Global Optima) Total Time
(0.30 sec)
ApproximateSolution
sky
Building
Airplane
Grass
sky
Building
Airplane
Grass
sky
Building
Airplane
Grass
3–100× speed up
Tree ReweightedMessage Passing
[ Alahari Kohli & Torr CVPR ‘08]
Fast partially optimal algorithm
[Kovtun ‘03] [ Kohli et al. ‘09]
Minimization with Complex Higher Order Functions
Connectivity
Counting Constraints
Hybrid algorithms
Connections between Messages Passing algorithms BP, TRW, and graph cuts
Space of Problems
CSP
MAXCUT
NP-Hard
Which functions are exactly solvable?
Approximate solutions of NP-hard problems
Scalability and Efficiency
Which functions are exactly solvable?Boros Hammer [1965], Kolmogorov Zabih [ECCV 2002, PAMI 2004] , Ishikawa [PAMI 2003], Schlesinger [EMMCVPR 2007], Kohli Kumar Torr [CVPR2007, PAMI 2008] , Ramalingam Kohli Alahari Torr [CVPR 2008] , Kohli Ladicky Torr [CVPR 2008, IJCV 2009] , Zivny Jeavons [CP 2008]
Approximate solutions of NP-hard problemsSchlesinger [76 ], Kleinberg and Tardos [FOCS 99], Chekuri et al. [01], Boykov et al. [PAMI 01], Wainwright et al. [NIPS01], Werner [PAMI 2007], Komodakis et al. [PAMI, 05 07], Lempitsky et al. [ICCV 2007], Kumar et al. [NIPS 2007], Kumar et al. [ICML 2008], Sontag and Jakkola [NIPS 2007], Kohli et al. [ICML 2008], Kohli et al. [CVPR 2008, IJCV 2009], Rother et al. [2009]
Scalability and Efficiency Kohli Torr [ICCV 2005, PAMI 2007], Juan and Boykov [CVPR 2006], Alahari Kohli Torr [CVPR 2008] , Delong and Boykov [CVPR 2008]
Iterated Conditional Modes (ICM)
Simulated Annealing
Dynamic Programming (DP)
Belief Propagation (BP)
Tree-Reweighted (TRW), Diffusion
Graph Cut (GC)
Branch & Bound
Relaxation methods:
…
Classical Move making algorithms
Combinatorial Algorithms
Message passing
Convex Optimization (Linear Programming, ...)