Discrete Optimization in Computer Vision Nikos Komodakis Ecole des Ponts ParisTech, LIGM Traitement...


Discrete Optimization in Computer Vision

Nikos Komodakis

Ecole des Ponts ParisTech, LIGM

Traitement de l’information et vision artificielle

Message passing algorithms for energy minimization

Message-passing algorithms

Central concept: messages

These methods work by propagating messages across the MRF graph

Widely used algorithms in many areas

Message-passing algorithms

But how do messages relate to optimizing the energy?

Let’s look at a simple example first: we will examine the case where the MRF graph is a chain

Message-passing on chains

MRF graph

Message-passing on chains

Corresponding lattice or trellis

Message-passing on chains

Global minimum in linear time

Optimization proceeds in two passes:

Forward pass (dynamic programming)

Backward pass

Message-passing on chains

(example on board)

(algebraic derivation of messages)
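The derivation done on the board can be sketched as follows for a chain with nodes p, q, r, s. The symbols $\theta_p$ (unary potential) and $\theta_{pq}$ (pairwise potential) are notation assumed here. The only step used is that a minimization can be pushed inward past the terms that do not depend on its variable.

```latex
\begin{aligned}
\min_{\mathbf{x}} E(\mathbf{x})
 &= \min_{x_p, x_q, x_r, x_s} \big[ \theta_p(x_p) + \theta_{pq}(x_p, x_q)
    + \theta_q(x_q) + \theta_{qr}(x_q, x_r)
    + \theta_r(x_r) + \theta_{rs}(x_r, x_s) + \theta_s(x_s) \big] \\
 &= \min_{x_s} \Big[ \theta_s(x_s) + \min_{x_r} \Big[ \theta_r(x_r) + \theta_{rs}(x_r, x_s)
    + \min_{x_q} \Big[ \theta_q(x_q) + \theta_{qr}(x_q, x_r)
    + \underbrace{\min_{x_p} \big[ \theta_p(x_p) + \theta_{pq}(x_p, x_q) \big]}_{M_{pq}(x_q)}
    \Big] \Big] \Big] \\
 &= \min_{x_s} \Big[ \theta_s(x_s) + \min_{x_r} \big[ \theta_r(x_r) + \theta_{rs}(x_r, x_s)
    + M_{qr}(x_r) \big] \Big],
 \qquad M_{qr}(x_r) = \min_{x_q} \big[ \theta_q(x_q) + M_{pq}(x_q) + \theta_{qr}(x_q, x_r) \big] \\
 &= \min_{x_s} \big[ \theta_s(x_s) + M_{rs}(x_s) \big],
 \qquad M_{rs}(x_s) = \min_{x_r} \big[ \theta_r(x_r) + M_{qr}(x_r) + \theta_{rs}(x_r, x_s) \big]
\end{aligned}
```

Each message depends on a single label, so every inner minimization is computed once per label and then reused.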

(Figure: chain MRF with nodes p, q, r, s.)

Forward pass (dynamic programming)

Terms of the energy that involve $x_p$: $\theta_p(x_p) + \theta_{pq}(x_p, x_q)$

$M_{pq}(j) = \min_i \left[ \theta_p(i) + \theta_{pq}(i, j) \right]$

(Figure: trellis annotated with the values of the message $M_{pq}$.)

Forward pass (dynamic programming)

$M_{qr}(k) = \min_j \left[ \theta_q(j) + M_{pq}(j) + \theta_{qr}(j, k) \right]$

(Figure: trellis annotated with the values of the message $M_{qr}$.)

Forward pass (dynamic programming)

(Figure: trellis annotated with the values of the messages $M_{qr}$ and $M_{rs}$; the last message arrives at node s.)

Min-marginal for node s and label j:

$\min_{\mathbf{x}:\, x_s = j} E(\mathbf{x})$

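Backward pass

(Figure: the labels $x_s$, $x_r$, $x_q$, $x_p$ are recovered in this order along the chain.)

$M_{rs}(x_s) = \min_j \left[ \theta_r(j) + M_{qr}(j) + \theta_{rs}(j, x_s) \right]$

$x_r = \arg\min_j \left[ \theta_r(j) + M_{qr}(j) + \theta_{rs}(j, x_s) \right]$

$M_{qr}(x_r) = \min_j \left[ \theta_q(j) + M_{pq}(j) + \theta_{qr}(j, x_r) \right]$

$x_q = \arg\min_j \left[ \theta_q(j) + M_{pq}(j) + \theta_{qr}(j, x_r) \right]$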
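The two passes can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `chain_min_sum`, the list-based layout of `unary` and `pairwise`, and the example costs are all assumptions made here.

```python
def chain_min_sum(unary, pairwise):
    """Min-sum message passing on a chain (hypothetical helper).

    unary[p][i]       : cost of label i at node p
    pairwise[p][i][j] : cost of labels (i, j) on the edge (p, p+1)
    Returns (minimum energy, one optimal labeling).
    """
    n = len(unary)
    msgs = [[0.0] * len(unary[0])]  # no message arrives at the first node
    back = []                       # back[p][j]: best label of p given x_{p+1} = j
    # Forward pass: each message reuses the previous one.
    for p in range(n - 1):
        msg, arg = [], []
        for j in range(len(unary[p + 1])):
            costs = [unary[p][i] + msgs[p][i] + pairwise[p][i][j]
                     for i in range(len(unary[p]))]
            best = min(range(len(costs)), key=costs.__getitem__)
            msg.append(costs[best])
            arg.append(best)
        msgs.append(msg)
        back.append(arg)
    # Min-marginals of the last node: unary cost plus incoming message.
    last = [unary[-1][j] + msgs[-1][j] for j in range(len(unary[-1]))]
    # Backward pass: pick the best last label, then follow the stored argmins.
    x = [min(range(len(last)), key=last.__getitem__)]
    for p in range(n - 2, -1, -1):
        x.append(back[p][x[-1]])
    x.reverse()
    return min(last), x


# Illustrative costs for a 4-node chain with 2 labels (made-up numbers).
potts = [[0.0, 0.5], [0.5, 0.0]]
example_unary = [[1.0, 0.5], [0.2, 1.5], [1.0, 0.3], [0.7, 0.1]]
example_energy, example_labels = chain_min_sum(example_unary, [potts] * 3)
```

With n nodes and L labels per node, the forward pass evaluates L x L candidate costs per edge, so the whole procedure is linear in n.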

Message-passing on chains

How can I compute min-marginals for any node in the chain?

How to compute min-marginals for all nodes efficiently?

What is the running time of message-passing on chains?

Message-passing on trees

We can apply the same idea to tree-structured graphs

Slight generalization from chains

The resulting algorithm is called belief propagation (it is also known under many other names, e.g., max-product, min-sum, etc.). For chains, it is also often called the Viterbi algorithm.

Belief propagation (BP)

Dynamic programming: global minimum in linear time

BP:

Inward pass (dynamic programming)

Outward pass

Gives min-marginals

BP on a tree [Pearl’88]

(Figure: tree containing the nodes p, q, r, with the root and the leaves marked.)

Inward pass (dynamic programming)

Terms of the energy that involve $x_p$: $\theta_p(x_p) + \theta_{pq}(x_p, x_q)$

$M_{pq}(j) = \min_i \left[ \theta_p(i) + \theta_{pq}(i, j) \right]$

(Figure: tree annotated with the values of the message $M_{pq}$.)

Inward pass (dynamic programming)

$M_{qr}(k) = \min_j \left[ \theta_q(j) + M_{pq}(j) + \theta_{qr}(j, k) \right]$

(Figure: tree annotated with message values.)

Inward pass (dynamic programming)

(Figure: the remaining messages are passed from the leaves toward the root.)

Outward pass

(Figure: messages are passed from the root back toward the leaves.)

BP on a tree: min-marginals

Min-marginal for node q and label j:

$\min_{\mathbf{x}:\, x_q = j} E(\mathbf{x}) = \theta_q(j) + M_{pq}(j) + M_{rq}(j)$

Belief propagation: message-passing on trees

min-marginals = ???

min-marginals = sum of all incoming messages at the node + unary potential
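The inward and outward passes, followed by this sum, can be sketched as below. It is a minimal illustration under assumptions made here: nodes are numbered 0..n-1, node 0 is taken as the root, and the names `tree_min_marginals`, `unary`, `edges` and `pairwise` are hypothetical.

```python
from collections import defaultdict


def tree_min_marginals(unary, edges, pairwise):
    """Min-sum BP on a tree (hypothetical helper).

    unary[p][i]            : cost of label i at node p
    edges                  : list of (p, q) pairs forming a tree
    pairwise[(p, q)][i][j] : cost of x_p = i, x_q = j for each listed edge
    Returns mm with mm[q][j] = min of E(x) over all labelings with x_q = j.
    """
    nbrs = defaultdict(list)
    for p, q in edges:
        nbrs[p].append(q)
        nbrs[q].append(p)

    def theta(p, q, i, j):
        # Look up the pairwise cost regardless of how the edge was listed.
        if (p, q) in pairwise:
            return pairwise[(p, q)][i][j]
        return pairwise[(q, p)][j][i]

    # Depth-first order from the root: every node appears after its parent.
    order, parent, stack = [], {0: None}, [0]
    while stack:
        p = stack.pop()
        order.append(p)
        for q in nbrs[p]:
            if q not in parent:
                parent[q] = p
                stack.append(q)

    msg = {}  # msg[(p, q)][j] is the message from p to q for label j of q

    def send(p, q):
        incoming = [msg[(r, p)] for r in nbrs[p] if r != q]
        msg[(p, q)] = [
            min(unary[p][i] + sum(m[i] for m in incoming) + theta(p, q, i, j)
                for i in range(len(unary[p])))
            for j in range(len(unary[q]))
        ]

    for p in reversed(order):  # inward pass: leaves toward the root
        if parent[p] is not None:
            send(p, parent[p])
    for p in order:            # outward pass: root toward the leaves
        if parent[p] is not None:
            send(parent[p], p)

    # Min-marginal = unary potential + sum of all incoming messages.
    return [[unary[q][j] + sum(msg[(r, q)][j] for r in nbrs[q])
             for j in range(len(unary[q]))]
            for q in range(len(unary))]
```

Each edge carries exactly two messages, one per direction, so the min-marginals of all nodes cost only about twice as much as a single inward pass.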

What is the running time of message-passing for trees?

Message-passing on chains

Essentially, message passing on chains is dynamic programming

Dynamic programming means reuse of computations

Generalizing belief propagation

Key property: min(a+b, a+c) = a + min(b, c)

BP can be generalized to any operators satisfying the above property

E.g., instead of (min,+), we could have:

(max,*): the resulting algorithm is called max-product. What does it compute?

(+,*): the resulting algorithm is called sum-product. What does it compute?
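One way to see the generalization is to write the forward pass on a chain once and plug in different operator pairs. The sketch below is an illustration under assumptions made here: `chain_pass` is a hypothetical helper, the costs are made-up numbers, and potentials are taken to be exp(-cost) so that sums of costs become products.

```python
import math
import operator


def chain_pass(unary, pairwise, combine, aggregate):
    """Forward pass on a chain with generic operators (hypothetical helper).

    combine   : plays the role of '+' in min-sum (operator.add or operator.mul)
    aggregate : plays the role of 'min' (min, max or sum over an iterable)
    """
    belief = list(unary[0])
    for p in range(len(unary) - 1):
        msg = [aggregate(combine(belief[i], pairwise[p][i][j])
                         for i in range(len(belief)))
               for j in range(len(unary[p + 1]))]
        belief = [combine(unary[p + 1][j], msg[j]) for j in range(len(msg))]
    return aggregate(belief)


costs_unary = [[1.0, 0.5], [0.2, 1.5], [1.0, 0.3]]
costs_pair = [[[0.0, 0.5], [0.5, 0.0]]] * 2

# (min, +) on costs: the minimum energy.
min_energy = chain_pass(costs_unary, costs_pair, operator.add, min)

# Potentials exp(-cost): adding costs corresponds to multiplying potentials.
pot_unary = [[math.exp(-c) for c in row] for row in costs_unary]
pot_pair = [[[math.exp(-c) for c in row] for row in table] for table in costs_pair]

# (max, *): the largest unnormalized probability, equal to exp(-min_energy).
max_prob = chain_pass(pot_unary, pot_pair, operator.mul, max)

# (+, *): the sum over all labelings, i.e. the normalizing constant.
partition = chain_pass(pot_unary, pot_pair, operator.mul, sum)
```

Under these assumptions, max-product finds the most probable labeling, and sum-product sums over all labelings, which yields the normalizing constant here and the marginals when messages are passed in both directions.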

Belief propagation as a distributive algorithm

BP works distributively

(as a result, it can be parallelized)

Essentially BP is a decentralized algorithm

Global results through local exchange of information

Simple example to illustrate this: counting soldiers

Counting soldiers in a line

Can you think of a distributive algorithm for the commander to count the soldiers?

(From David MacKay’s book “Information Theory, Inference, and Learning Algorithms”)

Counting soldiers in a line

Counting soldiers in a tree

Can we do the same for this case?

Counting soldiers in a tree
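The counting rule can be sketched as follows: each soldier tells each neighbor how many soldiers stand on his own side of the link, which is one plus whatever the other neighbors reported. This is a minimal illustration; the function `count_soldiers` and the numbering of soldiers from 0 are assumptions made here, and the recursion only terminates when the graph has no loops.

```python
def count_soldiers(edges, n):
    """Each of the n soldiers computes the total using only local messages.

    edges : list of (a, b) pairs; assumed to form a tree (a line is a special case)
    Returns the total as seen by every soldier.
    """
    nbrs = {v: [] for v in range(n)}
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)

    msg = {}  # msg[(a, b)]: number of soldiers on a's side of the link a-b

    def message(a, b):
        # Computed once and cached, so every link carries two messages in total.
        if (a, b) not in msg:
            msg[(a, b)] = 1 + sum(message(c, a) for c in nbrs[a] if c != b)
        return msg[(a, b)]

    # A soldier's total: himself plus everything reported by his neighbors.
    return [1 + sum(message(c, v) for c in nbrs[v]) for v in range(n)]


line_totals = count_soldiers([(0, 1), (1, 2), (2, 3), (3, 4)], 5)
tree_totals = count_soldiers([(0, 1), (0, 2), (2, 3), (2, 4), (4, 5)], 6)
```

Every soldier ends up with the same global count although nobody ever sees more than his immediate neighbors.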

Counting soldiers

Simple example to illustrate BP

The same idea can be used in cases which are seemingly more complex:

counting paths through a point in a grid

probability of passing through a node in the grid
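The path-counting case can be sketched with a forward and a backward sweep whose products give the counts through each node. The setup is an assumption made here for illustration: a rectangular grid, paths that start at the top-left corner, end at the bottom-right corner, and only move right or down; `path_counts` is a hypothetical helper.

```python
def path_counts(rows, cols):
    """Count right/down paths through every node of a rows x cols grid.

    Returns (total, through) where through[r][c] is the number of
    corner-to-corner paths that pass through node (r, c).
    """
    fwd = [[0] * cols for _ in range(rows)]  # paths from the start to (r, c)
    bwd = [[0] * cols for _ in range(rows)]  # paths from (r, c) to the end
    for r in range(rows):
        for c in range(cols):
            if r == 0 and c == 0:
                fwd[r][c] = 1
            else:
                fwd[r][c] = ((fwd[r - 1][c] if r > 0 else 0)
                             + (fwd[r][c - 1] if c > 0 else 0))
    for r in range(rows - 1, -1, -1):
        for c in range(cols - 1, -1, -1):
            if r == rows - 1 and c == cols - 1:
                bwd[r][c] = 1
            else:
                bwd[r][c] = ((bwd[r + 1][c] if r < rows - 1 else 0)
                             + (bwd[r][c + 1] if c < cols - 1 else 0))
    total = fwd[rows - 1][cols - 1]
    # Paths through a node = (ways to reach it) * (ways to finish from it).
    through = [[fwd[r][c] * bwd[r][c] for c in range(cols)] for r in range(rows)]
    return total, through


total, through = path_counts(3, 3)
# Probability that a uniformly random path passes through the center node.
p_center = through[1][1] / total
```

The forward and backward counts play the role of the two messages arriving at a node, and each is computed once and reused for all nodes.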

In general, we have used the same idea for minimizing MRFs (a much more general problem)

Graphs with loops

How about counting these soldiers?

Hmmm…overcounting?