Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition:...

59
Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 1 / 31

Transcript of Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition:...

Page 1: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Visual Recognition: Inference in Graphical Models

Raquel Urtasun

TTI Chicago

Feb 28, 2012

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 1 / 31

Page 2: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Graphical models

Applications

Representation

Inference

message passing (LP relaxations)graph cuts

Learning

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 2 / 31

Page 3: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Inference with graph cuts

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 3 / 31

Page 4: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Submodular Functions

A Pseudo-boolean function f : {0, 1}n → < is submodular if

f (A) + f (B) ≥ f (A ∨ B)︸ ︷︷ ︸OR

+ f (A ∧ B)︸ ︷︷ ︸AND

∀A,B ∈ {0, 1}n

Example: n = 2, A = [1, 0], B = [0, 1]

f ([1, 0]) + f ([0, 1]) ≥ f ([1, 1]) + f ([0, 0])

Sum of submodular functions is submodular → Easy to proof.

Some energies in computer vision can be submodular

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 4 / 31

Page 5: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Minimizing submodular Functions

Pairwise submodular functions can be transformed to st-mincut/max-flow[Hammer, 65].

Very low running time ∼ O(n)

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 5 / 31

Page 6: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

The ST-mincut problem

Suppose we have a graph G = {V ,E ,C}, with vertices V , Edges E andcosts C .

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 6 / 31

Page 7: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

The ST-mincut problem

An st-cut (S,T) divides the nodes between source and sink.

The cost of a st-cut is the sum of cost of all edges going from S to T

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 7 / 31

Page 8: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

The ST-mincut problem

The st-mincut is the st-cut with the minimum cost

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 8 / 31

Page 9: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Back to our energy minimization

Construct a graph such that

1 Any st-cut corresponds to an assignment of x

2 The cost of the cut is equal to the energy of x : E(x)

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 9 / 31

Page 10: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

St-mincut and Energy Minimization

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 10 / 31

Page 11: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

How are they equivalent?

A B

C D

0 1

0

1 xi

xj

= A + 0 0

C-A C-A

0 1

0

1

0 D-C

0 D-C

0 1

0

1

0 B+C-A-D

0 0

0 1

0

1 + +

if x1=1 add C-A

if x2 = 1 add D-C

B+C-A-D ! 0 is true from the submodularity of !ij

A = !ij (0,0) B = !ij(0,1) C = !ij

(1,0) D = !ij (1,1)

!ij (xi,xj) = !ij(0,0) + (!ij(1,0)-!ij(0,0)) xi + (!ij(1,0)-!ij(0,0)) xj + (!ij(1,0) + !ij(0,1) - !ij(0,0) - !ij(1,1)) (1-xi) xj

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 11 / 31

Page 12: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 13: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 14: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 15: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 16: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 17: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 18: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 19: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 20: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Graph Construction

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 12 / 31

Page 21: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

How to compute the St-mincut?

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 13 / 31

Page 22: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

How does the code look like

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 14 / 31

Page 23: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

How does the code look like

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 14 / 31

Page 24: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

How does the code look like

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 14 / 31

Page 25: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

How does the code look like

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 14 / 31

Page 26: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Graph cuts for multi-label problems

Exact Transformation to QPBF [Roy and Cox 98] [Ishikawa 03] [Schlesingeret al. 06] [Ramalingam et al. 08]

Very high computational cost

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 15 / 31

Page 27: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Alternative: Move making

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 16 / 31

Page 28: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Alternative: Move making

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 16 / 31

Page 29: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Computing the Optimal Move

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 17 / 31

Page 30: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Move Making Algorithms

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 18 / 31

Page 31: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Energy Minimization

Consider pairwise MRFs

E (f ) =∑

{p,q}∈N

Vp,q(fp, fq) +∑p

Dp(fp)

with N defining the interactions between nodes, e.g., pixels

Dp non-negative, but arbitrary.

Same as before, where Vp,q ≡ −θα and Dp ≡ −θp.

This is the graph-cuts notation.

Important to notice it’s the same thing.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 19 / 31

Page 32: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Energy Minimization

Consider pairwise MRFs

E (f ) =∑

{p,q}∈N

Vp,q(fp, fq) +∑p

Dp(fp)

with N defining the interactions between nodes, e.g., pixels

Dp non-negative, but arbitrary.

Same as before, where Vp,q ≡ −θα and Dp ≡ −θp.

This is the graph-cuts notation.

Important to notice it’s the same thing.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 19 / 31

Page 33: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Energy Minimization

Consider pairwise MRFs

E (f ) =∑

{p,q}∈N

Vp,q(fp, fq) +∑p

Dp(fp)

with N defining the interactions between nodes, e.g., pixels

Dp non-negative, but arbitrary.

Same as before, where Vp,q ≡ −θα and Dp ≡ −θp.

This is the graph-cuts notation.

Important to notice it’s the same thing.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 19 / 31

Page 34: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Energy Minimization

Consider pairwise MRFs

E (f ) =∑

{p,q}∈N

Vp,q(fp, fq) +∑p

Dp(fp)

with N defining the interactions between nodes, e.g., pixels

Dp non-negative, but arbitrary.

Same as before, where Vp,q ≡ −θα and Dp ≡ −θp.

This is the graph-cuts notation.

Important to notice it’s the same thing.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 19 / 31

Page 35: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Metric vs Semimetric

Two general classes of pairwise interactions

Metric if it satisfies for any set of labels α, β, γ

V (α, β) = 0 ↔ α = β

V (α, β) = V (β, α) ≥ 0

V (α, β) ≤ V (α, γ) + V (γ, β)

Semi-metric if it satisfies for any set of labels α, β, γ

V (α, β) = 0 ↔ α = β

V (α, β) = V (β, α) ≥ 0

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 20 / 31

Page 36: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Metric vs Semimetric

Two general classes of pairwise interactions

Metric if it satisfies for any set of labels α, β, γ

V (α, β) = 0 ↔ α = β

V (α, β) = V (β, α) ≥ 0

V (α, β) ≤ V (α, γ) + V (γ, β)

Semi-metric if it satisfies for any set of labels α, β, γ

V (α, β) = 0 ↔ α = β

V (α, β) = V (β, α) ≥ 0

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 20 / 31

Page 37: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Examples for 1D label set

Truncated quadratic is a semi-metric

V (α, β) = min(K , |α− β|2)

with K a constant.

Truncated absolute distance is a metric

V (α, β) = min(K , |α− β|)

with K a constant.

For multi-dimensional, replace | · | by any norm.

Potts model is a metric

V (α, β) = K · T (α 6= β)

with T (·) = 1 if the argument is true and 0 otherwise.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 21 / 31

Page 38: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Examples for 1D label set

Truncated quadratic is a semi-metric

V (α, β) = min(K , |α− β|2)

with K a constant.

Truncated absolute distance is a metric

V (α, β) = min(K , |α− β|)

with K a constant.

For multi-dimensional, replace | · | by any norm.

Potts model is a metric

V (α, β) = K · T (α 6= β)

with T (·) = 1 if the argument is true and 0 otherwise.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 21 / 31

Page 39: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Examples for 1D label set

Truncated quadratic is a semi-metric

V (α, β) = min(K , |α− β|2)

with K a constant.

Truncated absolute distance is a metric

V (α, β) = min(K , |α− β|)

with K a constant.

For multi-dimensional, replace | · | by any norm.

Potts model is a metric

V (α, β) = K · T (α 6= β)

with T (·) = 1 if the argument is true and 0 otherwise.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 21 / 31

Page 40: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Examples for 1D label set

Truncated quadratic is a semi-metric

V (α, β) = min(K , |α− β|2)

with K a constant.

Truncated absolute distance is a metric

V (α, β) = min(K , |α− β|)

with K a constant.

For multi-dimensional, replace | · | by any norm.

Potts model is a metric

V (α, β) = K · T (α 6= β)

with T (·) = 1 if the argument is true and 0 otherwise.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 21 / 31

Page 41: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Binary Moves

α− β moves works for semi-metrics

α expansion works for V being a metric

Figure: Figure from P. Kohli tutorial on graph-cuts

For certain x1 and x2, the move energy is sub-modular QPBF

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 22 / 31

Page 42: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Swap Move

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 23 / 31

Page 43: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Swap Move

[Source: P. Kohli]

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 23 / 31

Page 44: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Expansion Move

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 24 / 31

Page 45: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Expansion Move

[Source: P. Kohli]Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 24 / 31

Page 46: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 25 / 31

Page 47: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 25 / 31

Page 48: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 25 / 31

Page 49: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 25 / 31

Page 50: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 25 / 31

Page 51: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

More formally

Any labeling can be uniquely represented by a partition of image pixelsP = {Pl |l ∈ L}, where Pl = {p ∈ P|fp = l} is a subset of pixels assignedlabel l .

There is a one to one correspondence between labelings f and partitions P.

Given a pair of labels α, β, a move from a partition P (labeling f ) to a newpartition P’ (labeling f ′) is called an α− β swap if Pl = P ′ for any labell 6= α, β.

The only difference between P and P ′ is that some pixels that were labeledin P are now labeled in P ′, and vice-versa.

Given a label l , a move from a partition P (labeling f ) to a new partition P ′

(labeling f ′) is called an α-expansion if Pα ⊂ P ′α and P ′

l ⊂ Pl .

An α-expansion move allows any set of image pixels to change their labelsto α.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 25 / 31

Page 52: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Example

Figure: (a) Current partition (b) local move (c) α− β-swap (d) α-expansion.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 26 / 31

Page 53: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Algorithms

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 27 / 31

Page 54: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Finding optimal Swap move

Given an input labeling f (partition P) and a pair of labels α, β we want tofind a labeling f̂ that minimizes E over all labelings within one α− β-swapof f .

This is going to be done by computing a labeling corresponding to aminimum cut on a graph Gαβ = (Vαβ , Eαβ).

The structure of this graph is dynamically determined by the currentpartition P and by the labels α, β.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 28 / 31

Page 55: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Finding optimal Swap move

Given an input labeling f (partition P) and a pair of labels α, β we want tofind a labeling f̂ that minimizes E over all labelings within one α− β-swapof f .

This is going to be done by computing a labeling corresponding to aminimum cut on a graph Gαβ = (Vαβ , Eαβ).

The structure of this graph is dynamically determined by the currentpartition P and by the labels α, β.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 28 / 31

Page 56: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Finding optimal Swap move

Given an input labeling f (partition P) and a pair of labels α, β we want tofind a labeling f̂ that minimizes E over all labelings within one α− β-swapof f .

This is going to be done by computing a labeling corresponding to aminimum cut on a graph Gαβ = (Vαβ , Eαβ).

The structure of this graph is dynamically determined by the currentpartition P and by the labels α, β.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 28 / 31

Page 57: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Graph Construction

The set of vertices includes the two terminals α and β, as well as imagepixels p in the sets Pα and Pβ (i.e., fp ∈ {α, β}).

Each pixel p ∈ Pαβ is connected to the terminals α and β, called t-links.

Each set of pixels p, q ∈ Pαβ which are neighbors is connected by an edgeep,q

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 29 / 31

Page 58: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Computing the Cut

Any cut must have a single t-link not cut.

This defines a labeling

There is a one-to-one correspondences between a cut and a labeling.

The energy of the cut is the energy of the labeling.

See Boykov et al, ”fast approximate energy minimization via graph cuts”PAMI 2001.

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 30 / 31

Page 59: Visual Recognition: Inference in Graphical Modelsrurtasun/courses/Visual... · Visual Recognition: Inference in Graphical Models Raquel Urtasun TTI Chicago Feb 28, 2012 Raquel Urtasun

Properties

For any cut, then

Raquel Urtasun (TTI-C) Visual Recognition Feb 28, 2012 31 / 31