Tractable Higher Order Models in Computer Vision (Part II)

Slides from Carsten Rother, Sebastian Nowozin, Pushmeet Kohli, Microsoft Research Cambridge

Presented by Xiaodan Liang

Part II

• Submodularity
• Move making algorithms
• Higher-order model: $P^n$ Potts model

Feature selection

Factoring distributions

Problem inherently combinatorial!

Example: Greedy algorithm for feature selection

Key property: Diminishing returns

Naïve Bayes model: class variable Y ("Sick") with features X1 ("Fever"), X2 ("Rash"), X3 ("Male"). New feature: X1.

• Selection A = {}: adding X1 will help a lot! (large improvement)
• Selection B = {X2, X3}: adding X1 doesn't help much (small improvement)

Submodularity: for $A \subseteq B$, $F(A \cup \{s\}) - F(A) \ge F(B \cup \{s\}) - F(B)$

Theorem [Krause, Guestrin UAI '05]: Information gain F(A) in Naïve Bayes models is submodular!

Why is submodularity useful?

Theorem [Nemhauser et al '78]: For monotone submodular F, the greedy maximization algorithm returns $A_{\text{greedy}}$ with

$F(A_{\text{greedy}}) \ge (1 - 1/e) \max_{|A| \le k} F(A)$, where $1 - 1/e \approx 63\%$

• Greedy algorithm gives a near-optimal solution!
• For info-gain: the guarantee is the best possible unless P = NP! [Krause, Guestrin UAI '05]
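The greedy rule behind this guarantee can be sketched in a few lines. This is a minimal illustration, not the slides' own code: the set-coverage objective `coverage` and the toy table `COVERS` are hypothetical stand-ins for information gain, chosen because coverage is monotone and submodular.

```python
import math
from itertools import combinations


def greedy_maximize(ground_set, F, k):
    """Add, k times, the element with the largest marginal gain F(A + s) - F(A)."""
    selected = set()
    for _ in range(k):
        best, best_gain = None, float("-inf")
        # sorted() only makes tie-breaking deterministic
        for s in sorted(ground_set - selected):
            gain = F(selected | {s}) - F(selected)
            if gain > best_gain:
                best, best_gain = s, gain
        if best is None:  # ground set exhausted
            break
        selected.add(best)
    return selected


# Hypothetical toy instance: each "feature" covers a few items.
COVERS = {
    "X1": {1, 2, 3},
    "X2": {3, 4},
    "X3": {4, 5, 6, 7},
    "X4": {1, 7},
}


def coverage(A):
    """Number of items covered by the chosen features (monotone submodular)."""
    covered = set()
    for name in A:
        covered |= COVERS[name]
    return len(covered)


chosen = greedy_maximize(set(COVERS), coverage, k=2)
# Exhaustive optimum over all subsets of size 2, for comparison only.
optimum = max(coverage(set(c)) for c in combinations(sorted(COVERS), 2))
print(sorted(chosen), coverage(chosen), optimum)
```

On this toy instance greedy happens to reach the exhaustive optimum; in general only the (1 - 1/e) bound is guaranteed.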

Submodularity in Machine Learning

• Many ML problems are submodular, i.e., for F submodular require:
• Minimization: A* = argmin F(A)
  – Structure learning ($A^* = \arg\min I(X_A; X_{V \setminus A})$)
  – Clustering
  – MAP inference in Markov Random Fields
  – …
• Maximization: A* = argmax F(A)
  – Feature selection
  – Active learning
  – Ranking
  – …

Set functions

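Submodular set functions

• Set function F on V is called submodular if, for all $A, B \subseteq V$:

$F(A) + F(B) \ge F(A \cup B) + F(A \cap B)$

• Equivalent diminishing returns characterization: for $A \subseteq B$ and $s \notin B$,

$F(A \cup \{s\}) - F(A) \ge F(B \cup \{s\}) - F(B)$

(adding s to the smaller set A gives a large improvement, adding it to the larger set B gives a small improvement)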
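For small ground sets both characterizations can be checked exhaustively. The sketch below is an illustration only: the helper names `powerset`, `is_submodular` and `has_diminishing_returns` are hypothetical, and the two test functions (a concave and a convex function of |A|) are chosen just to exercise the check.

```python
from itertools import combinations


def powerset(V):
    """All subsets of V as frozensets."""
    items = sorted(V)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_submodular(V, F, tol=1e-12):
    """Check F(A) + F(B) >= F(A | B) + F(A & B) for every pair of subsets."""
    subsets = powerset(V)
    return all(F(A) + F(B) + tol >= F(A | B) + F(A & B)
               for A in subsets for B in subsets)


def has_diminishing_returns(V, F, tol=1e-12):
    """Check F(A + s) - F(A) >= F(B + s) - F(B) for all A <= B and s not in B."""
    subsets = powerset(V)
    for B in subsets:
        for A in subsets:
            if not A <= B:
                continue
            for s in V - B:
                if F(A | {s}) - F(A) + tol < F(B | {s}) - F(B):
                    return False
    return True


V = frozenset("abc")
concave_of_size = lambda A: len(A) ** 0.5   # concave in |A|
convex_of_size = lambda A: len(A) ** 2      # convex in |A|

print(is_submodular(V, concave_of_size), has_diminishing_returns(V, concave_of_size))
print(is_submodular(V, convex_of_size), has_diminishing_returns(V, convex_of_size))
```

The enumeration is exponential in |V|, so this is only a sanity check for toy examples, not a practical test.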

Submodularity and supermodularity

Example: Mutual information

Closedness properties

$F_1, \dots, F_m$ submodular functions on V and $\lambda_1, \dots, \lambda_m > 0$

Then: $F(A) = \sum_i \lambda_i F_i(A)$ is submodular!

Submodularity closed under nonnegative linear combinations!

Extremely useful fact!!
– $F_\theta(A)$ submodular $\Rightarrow$ $\sum_\theta P(\theta) F_\theta(A)$ submodular!
– Multicriterion optimization: $F_1, \dots, F_m$ submodular, $\lambda_i \ge 0$ $\Rightarrow$ $\sum_i \lambda_i F_i(A)$ submodular
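The closedness claim follows in one line from the definition. A short derivation, assuming the union/intersection form of submodularity for each $F_i$ and nonnegative weights $\lambda_i$:

```latex
\begin{align*}
F(A) + F(B)
  &= \sum_i \lambda_i \bigl( F_i(A) + F_i(B) \bigr) \\
  &\ge \sum_i \lambda_i \bigl( F_i(A \cup B) + F_i(A \cap B) \bigr)
     && \text{(each $F_i$ submodular, $\lambda_i \ge 0$)} \\
  &= F(A \cup B) + F(A \cap B).
\end{align*}
```

The sign of the weights matters: a negative $\lambda_i$ would flip the inequality for that term.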

Submodularity and Concavity

[Plot: g(|A|) against |A|]

Maximum of submodular functions

Suppose F1(A) and F2(A) submodular.

Is F(A) = max(F1(A),F2(A)) submodular?

[Plot against |A|: F1(A), F2(A) and F(A) = max(F1(A),F2(A))]

max(F1,F2) not submodular in general!

Minimum of submodular functions

Well, maybe F(A) = min(F1(A),F2(A)) instead?

A        F1(A)   F2(A)   F(A)
∅        0       0       0
{a}      1       0       0
{b}      0       1       0
{a,b}    1       1       1

$F(\{b\}) - F(\emptyset) = 0 < F(\{a,b\}) - F(\{a\}) = 1$

min(F1,F2) not submodular in general!

But stay tuned

Submodularity and convexity

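The submodular polyhedron $P_F$

Example: V = {a,b}

A        F(A)
∅        0
{a}      -1
{b}      2
{a,b}    0

$P_F$ is described by the constraints

$x(\{a\}) \le F(\{a\})$
$x(\{b\}) \le F(\{b\})$
$x(\{a,b\}) \le F(\{a,b\})$

[Figure: $P_F$ drawn in the (x{a}, x{b}) plane, axes from -2 to 2]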

Lovasz extension

Example: Lovasz extension

A        F(A)
∅        0
{a}      -1
{b}      2
{a,b}    0

$g(w) = \max \{ w^T x : x \in P_F \}$

w = [0,1], want g(w)

Greedy ordering: e1 = b, e2 = a, since $w(e_1) = 1 > w(e_2) = 0$

$x_w(e_1) = F(\{b\}) - F(\emptyset) = 2$
$x_w(e_2) = F(\{b,a\}) - F(\{b\}) = -2$
$x_w = [-2, 2]$

$g([0,1]) = [0,1]^T [-2,2] = 2 = F(\{b\})$
$g([1,1]) = [1,1]^T [-1,1] = 0 = F(\{a,b\})$

[Figure: the (w{a}, w{b}) plane with the corners {}, {a}, {b}, {a,b} of the unit square and the points [-1,1] and [-2,2] of $P_F$]
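The greedy evaluation in this example can be reproduced with a short sketch. It assumes nonnegative weights w and F(∅) = 0; the names `lovasz_extension`, `TABLE` and `F` are hypothetical, and the table holds the values of F from the slide.

```python
def lovasz_extension(w, F):
    """Evaluate g(w) greedily.

    w: dict mapping each element to its weight.
    F: set function on frozensets with F(frozenset()) == 0.
    Returns (g(w), x_w) where x_w holds the marginal gains along the ordering.
    """
    # Sort by decreasing weight; Python's sort is stable, so ties keep
    # the insertion order of the dict.
    order = sorted(w, key=lambda e: -w[e])
    x = {}
    prefix = frozenset()
    for e in order:
        grown = prefix | {e}
        x[e] = F(grown) - F(prefix)   # marginal gain of e given the prefix
        prefix = grown
    g = sum(w[e] * x[e] for e in w)
    return g, x


# Values of F on V = {a, b} as given in the example.
TABLE = {
    frozenset(): 0,
    frozenset("a"): -1,
    frozenset("b"): 2,
    frozenset("ab"): 0,
}
F = TABLE.__getitem__

g01, x01 = lovasz_extension({"a": 0, "b": 1}, F)
g11, x11 = lovasz_extension({"a": 1, "b": 1}, F)
print(g01, x01)
print(g11, x11)
```

At the corners of the unit square the returned value coincides with F of the corresponding set, which is the property the following slide relies on.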

Why is this useful?

Theorem [Lovasz '83]: g(w) attains its minimum in $[0,1]^n$ at a corner!

If we can minimize g on $[0,1]^n$, can minimize F… (at corners, g and F take same values)

F(A) submodular $\Rightarrow$ g(w) convex (and efficient to evaluate)

Does the converse also hold? No, consider $g(w_1,w_2,w_3) = \max(w_1, w_2 + w_3)$, with $w_1, w_2, w_3$ corresponding to {a}, {b}, {c}:

$F(\{a,b\}) - F(\{a\}) = 0 < F(\{a,b,c\}) - F(\{a,c\}) = 1$

Minimizing a submodular function

Ellipsoid algorithm

Interior Points algorithm

Example: Image denoising

Example: Image denoising

Pairwise Markov Random Field

[Figure: 3×3 grid of hidden nodes Y1, …, Y9, each attached to an observed node X1, …, X9]

Xi: noisy pixels
Yi: "true" pixels

$P(x_1,\dots,x_n,y_1,\dots,y_n) = \prod_{i,j} \psi_{i,j}(y_i,y_j) \prod_i \phi_i(x_i,y_i)$

Want $\arg\max_y P(y \mid x) = \arg\max_y \log P(x,y) = \arg\min_y \sum_{i,j} E_{i,j}(y_i,y_j) + \sum_i E_i(y_i)$

$E_{i,j}(y_i,y_j) = -\log \psi_{i,j}(y_i,y_j)$

When is this MAP inference efficiently solvable (in high treewidth graphical models)?
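On a 3×3 binary image the MAP problem is small enough to solve by enumeration, which makes the energy concrete. This is only a sketch: the Ising-style energies (a unit penalty when y_i differs from x_i, a penalty of 0.6 when neighbouring y_i, y_j differ) and all names below are assumptions for illustration, not the potentials used in the slides.

```python
from itertools import product

ROWS, COLS = 3, 3


def grid_edges():
    """4-connected neighbour pairs (i, j), pixels indexed row-major."""
    edges = []
    for r in range(ROWS):
        for c in range(COLS):
            i = r * COLS + c
            if c + 1 < COLS:
                edges.append((i, i + 1))      # right neighbour
            if r + 1 < ROWS:
                edges.append((i, i + COLS))   # neighbour below
    return edges


EDGES = grid_edges()


def energy(y, x, unary_weight=1.0, pair_weight=0.6):
    """sum_i E_i(y_i) + sum_{i,j} E_ij(y_i, y_j) with assumed Ising-style terms."""
    unary = sum(unary_weight * (yi != xi) for yi, xi in zip(y, x))
    pairwise = sum(pair_weight * (y[i] != y[j]) for i, j in EDGES)
    return unary + pairwise


def map_by_enumeration(x):
    """argmin_y energy(y, x) over all 2^9 binary labellings."""
    return min(product((0, 1), repeat=len(x)), key=lambda y: energy(y, x))


noisy = (1, 1, 1,
         1, 0, 1,
         1, 1, 1)
denoised = map_by_enumeration(noisy)
print(denoised, energy(denoised, noisy))
```

Enumeration scales as 2^n, which is exactly why the question of efficient solvability on the slide matters for real images.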

MAP inference in Markov Random Fields

[Kolmogorov et al, PAMI '04, see also: Hammer, Ops Res '65]

Constrained minimization

Part II

• Submodularity
• Move making algorithms
• Higher-order model: $P^n$ Potts model

Multi-Label problems

Move making

Expansion moves and swap moves for this problem

Metric and semi-metric potential functions

• If the pairwise potential functions define a metric, then the energy function in equation (8) can be approximately minimized using α-expansions.

• If the pairwise potential functions define a semi-metric, it can be minimized using αβ-swaps.
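Whether a given pairwise potential qualifies can be checked directly on its label-by-label cost table. The sketch below assumes the usual definitions (semi-metric: symmetric, nonnegative, zero exactly on the diagonal; metric: additionally the triangle inequality); the function names and the three example tables are hypothetical.

```python
def is_semi_metric(theta, tol=1e-12):
    """theta[a][b] is the cost of assigning labels a and b to neighbours."""
    n = len(theta)
    for a in range(n):
        for b in range(n):
            if abs(theta[a][b] - theta[b][a]) > tol:      # symmetry
                return False
            if theta[a][b] < -tol:                        # nonnegativity
                return False
            if (abs(theta[a][b]) <= tol) != (a == b):     # zero iff a == b
                return False
    return True


def is_metric(theta, tol=1e-12):
    """Semi-metric plus the triangle inequality."""
    if not is_semi_metric(theta, tol):
        return False
    n = len(theta)
    return all(theta[a][b] <= theta[a][c] + theta[c][b] + tol
               for a in range(n) for b in range(n) for c in range(n))


LABELS = range(4)
potts = [[0 if a == b else 1 for b in LABELS] for a in LABELS]
truncated_linear = [[min(abs(a - b), 2) for b in LABELS] for a in LABELS]
truncated_quadratic = [[min((a - b) ** 2, 4) for b in LABELS] for a in LABELS]

for name, table in [("Potts", potts),
                    ("truncated linear", truncated_linear),
                    ("truncated quadratic", truncated_quadratic)]:
    print(name, is_semi_metric(table), is_metric(table))
```

Under these definitions the truncated quadratic table is a semi-metric but not a metric, so it would fall under the swap case rather than the expansion case.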

Move Energy

• Each move:
• A transformation function:
• The energy of a move t:
• The optimal move:

Submodular set functions play an important role in energy minimization as they can be minimized in polynomial time
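The formulas on this slide did not survive extraction, so the following is a sketch under assumptions rather than the slide's definitions: it uses the common convention that a move is a binary vector t, that t_i = 0 assigns α (expansion) or α versus β (swap), and that the move energy is E_m(t) = E(T(x, t)). The names `expand`, `swap`, `optimal_move`, `UNARY` and `energy` are hypothetical, and the optimal move is found by brute force instead of a graph cut.

```python
from itertools import product


def expand(x, t, alpha):
    """Assumed T_alpha: t_i = 0 switches variable i to alpha, t_i = 1 keeps x_i."""
    return tuple(alpha if ti == 0 else xi for xi, ti in zip(x, t))


def swap(x, t, alpha, beta):
    """Assumed T_alpha_beta: only variables labelled alpha or beta may change."""
    out = []
    for xi, ti in zip(x, t):
        if xi in (alpha, beta):
            out.append(alpha if ti == 0 else beta)
        else:
            out.append(xi)
    return tuple(out)


def optimal_move(x, energy, transform):
    """t* = argmin_t E(T(x, t)), by enumerating all binary move vectors."""
    return min(product((0, 1), repeat=len(x)),
               key=lambda t: energy(transform(x, t)))


# Hypothetical toy energy: 4 variables on a chain, 3 labels, Potts pairwise term.
UNARY = [[0, 2, 2], [2, 2, 1], [1, 2, 0], [2, 0, 2]]


def energy(x):
    unary = sum(UNARY[i][xi] for i, xi in enumerate(x))
    pairwise = sum(x[i] != x[i + 1] for i in range(len(x) - 1))
    return unary + pairwise


x0 = (0, 0, 0, 0)
t_star = optimal_move(x0, energy, lambda x, t: expand(x, t, alpha=2))
x1 = expand(x0, t_star, alpha=2)
print(t_star, x1, energy(x0), energy(x1))
```

Because the identity move is always available, the energy never increases after an optimal move; when the move energy is submodular, the enumeration can be replaced by a polynomial-time minimization.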

The swap move algorithm

The expansion move algorithm

Higher order potential

• The class of higher order clique potentials for which the expansion and swap moves can be computed in polynomial time

The clique potentials take the form:

• Question you should be asking: Can my higher order potential be solved using α-expansions?

• Show that move energy is submodular for all xc

Moves for Higher Order Potentials

• Form of the Higher Order Potentials

Clique Inconsistency function:
– Sum Form
– Max Form

Pairwise potential:

[Figure: clique c containing the variables xi, xj, xk, xl, xm]

Theoretical Results: Swap

• Move energy is always submodular if the function defining the clique potential is non-decreasing concave.

See paper for proofs

Condition for Swap move

Concave Function:

Prove

• All projections on two variables of any αβ-swap move energy are submodular.

• The cost of any configuration

Substitute

Constraint 1:
Lemma 1:
Constraint 2:

The theorem is true

Condition for alpha expansion

• Metric:

Part II

• Submodularity
• Move making algorithms
• Higher-order model: $P^n$ Potts model

Image Segmentation

$E(X) = \sum_i c_i x_i + \sum_{i,j} d_{ij} |x_i - x_j|$

$E: \{0,1\}^n \to \mathbb{R}$

0 → fg, 1 → bg

n = number of pixels

[Boykov and Jolly '01] [Blake et al. '04] [Rother et al. '04]

[Figure panels: Image, Unary Cost, Segmentation]

Pn Potts Potentials

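Patch Dictionary (Tree)

$h(X_p) = \begin{cases} 0 & \text{if } x_i = 0 \text{ for all } i \in p \\ C_{\max} & \text{otherwise} \end{cases}$

[Plot: h(Xp) for a patch p, taking the values 0 and Cmax]

[slide credits: Kohli]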

Pn Potts Potentials

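$E(X) = \sum_i c_i x_i + \sum_{i,j} d_{ij} |x_i - x_j| + \sum_p h_p(X_p)$

$h(X_p) = \begin{cases} 0 & \text{if } x_i = 0 \text{ for all } i \in p \\ C_{\max} & \text{otherwise} \end{cases}$

$E: \{0,1\}^n \to \mathbb{R}$

0 → fg, 1 → bg

n = number of pixels

[slide credits: Kohli]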
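Evaluating this energy term by term shows what the patch potential adds on top of the pairwise model. A minimal sketch on a hypothetical 4-pixel chain; the function names, the coefficient values and the single patch are made up for illustration.

```python
def pn_potts(x_patch, c_max):
    """h(X_p): 0 if every pixel in the patch is 0, C_max otherwise."""
    return 0 if all(xi == 0 for xi in x_patch) else c_max


def energy(x, unary, pairwise, patches, c_max):
    """E(X) = sum_i c_i x_i + sum_ij d_ij |x_i - x_j| + sum_p h_p(X_p)."""
    e = sum(c * xi for c, xi in zip(unary, x))
    e += sum(d * abs(x[i] - x[j]) for (i, j), d in pairwise.items())
    e += sum(pn_potts([x[i] for i in p], c_max) for p in patches)
    return e


# Hypothetical toy instance: 4 pixels on a chain, one patch over pixels 0..2.
unary = [2, -1, 2, 2]                       # c_i
pairwise = {(0, 1): 1, (1, 2): 1, (2, 3): 1}  # d_ij
patches = [(0, 1, 2)]
c_max = 3

for x in [(0, 0, 0, 0), (0, 1, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1)]:
    without_patch = energy(x, unary, pairwise, [], c_max)
    with_patch = energy(x, unary, pairwise, patches, c_max)
    print(x, without_patch, with_patch)
```

The labelling (0, 0, 0, 1) pays nothing extra because the patch itself stays all zero, while a single deviating pixel inside the patch already costs the full C_max.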

Theoretical Results: Expansion

• Move energy is always submodular if the function defining the clique potential is increasing linear.

See paper for proofs

$P^n$ Potts Model

[Figure: clique c, shown first with Cost: $\gamma$ and then with Cost: $\gamma_{\max}$]

Optimal moves for $P^n$ Potts

• Computing the optimal swap move

[Figure: clique c whose variables carry Label 1 (α), Label 2 (β), Label 3 or Label 4]

Case 1: Not all variables assigned label 1 or 2

Move energy is independent of tc and can be ignored.

Optimal moves for $P^n$ Potts

• Computing the optimal swap move

[Figure: clique c whose variables carry Label 1 (α), Label 2 (β), Label 3 or Label 4]

Case 2: All variables assigned label 1 or 2

Can be minimized by solving an st-mincut problem

Solving the Move Energy

Add a constant

This transformation does not affect the solution: we can add a constant K to all possible values of the clique potential without changing the optimal move.

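Solving the Move Energy

• Computing the optimal swap move

[Figure: st-graph with Source, Sink, nodes v1, v2, …, vn and auxiliary nodes Ms and Mt]

$t_i = 0 \Leftrightarrow v_i \in$ Source Set
$t_j = 1 \Leftrightarrow v_j \in$ Sink Set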

Solving the Move Energy

• Computing the optimal swap move

[Figure: st-graph with Source, Sink, nodes v1, v2, …, vn and auxiliary nodes Ms and Mt]

Case 1: all xi = α (vi ∈ Source)
Cost:

Case 2: all xi = β (vi ∈ Sink)
Cost:

Case 3: xi = α, β (vi ∈ Source, Sink)
Cost:

Recall that the cost of an st-mincut is the sum of weights of the edges included in the st-mincut which go from the source set to the sink set.
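The three cases can be checked numerically on one possible graph. The edge weights on the slides were lost in extraction, so everything below is an assumption: the graph has edges Source→Ms and Ms→vi of weight w_d = γ_β + κ, and vi→Mt and Mt→Sink of weight w_e = γ_α + κ, with κ = γ_max − γ_α − γ_β playing the role of the added constant K. The sketch enumerates the placements of the two auxiliary nodes instead of running a max-flow solver.

```python
from itertools import product


def pn_potts_swap_cost(t, gamma_alpha, gamma_beta, gamma_max):
    """Clique cost after a swap move when all variables take alpha or beta."""
    if all(ti == 0 for ti in t):
        return gamma_alpha          # every variable ends with label alpha
    if all(ti == 1 for ti in t):
        return gamma_beta           # every variable ends with label beta
    return gamma_max                # mixed labels


def cut_cost(t, gamma_alpha, gamma_beta, gamma_max):
    """Cheapest st-cut of the assumed graph that is consistent with t.

    t_i = 0 puts v_i in the source set, t_i = 1 in the sink set.
    Requires gamma_max >= gamma_alpha and gamma_max >= gamma_beta,
    so that both edge weights are nonnegative.
    """
    kappa = gamma_max - gamma_alpha - gamma_beta
    w_d = gamma_beta + kappa
    w_e = gamma_alpha + kappa
    n_sink = sum(t)
    n_source = len(t) - n_sink
    best = float("inf")
    for ms_in_source, mt_in_source in product((True, False), repeat=2):
        cost = 0
        # Ms: either Source->Ms is cut, or every Ms->v_i with v_i in the sink set
        cost += w_d * n_sink if ms_in_source else w_d
        # Mt: either Mt->Sink is cut, or every v_i->Mt with v_i in the source set
        cost += w_e if mt_in_source else w_e * n_source
        best = min(best, cost)
    return best


gamma_alpha, gamma_beta, gamma_max = 1, 2, 5
kappa = gamma_max - gamma_alpha - gamma_beta
mismatches = [
    t for t in product((0, 1), repeat=4)
    if cut_cost(t, gamma_alpha, gamma_beta, gamma_max)
    != pn_potts_swap_cost(t, gamma_alpha, gamma_beta, gamma_max) + kappa
]
print(kappa, mismatches)
```

With these assumed weights the cut cost equals the clique cost plus the same constant κ for every move, so minimizing the cut also minimizes the move energy.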

Optimal moves for $P^n$ Potts

• The expansion move energy
• Similar graph construction.

Experimental Results

• Texture Segmentation

Unary (Colour), Pairwise (Smoothness), Higher Order (Texture)

[Figure panels: Original, Pairwise, Higher order]

Experimental Results

Original Swap (3.2 sec)

Expansion (2.5 sec)

Pairwise Higher Order

Swap (4.2 sec)

Expansion (3.0 sec)

Experimental Results

Original

Pairwise Higher Order

Swap (4.7 sec)

Expansion (3.7 sec)

Swap (5.0 sec)

Expansion (4.4 sec)

More Higher-order models