
Primal-dual Algorithm for Convex Markov Random Fields

Vladimir Kolmogorov

University College London

GDR (Optimisation Discrète, Graph Cuts et Analyse d'Images) Paris, 29 November 2005

Note: these slides contain animation

Convex MRF functions

$$E(x) = \sum_{p} D_p(x_p) + \sum_{(p,q)} V_{pq}(x_q - x_p)$$

• Functions Dp(·), Vpq(·) are convex

• xp ∈ {0, …, K−1} (K is # of labels)

• Goal: compute global minimum of E
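
To make the objective concrete, here is a small Python sketch (my own illustration, not from the talk) that evaluates this energy on a toy chain; the quadratic Dp and absolute-difference Vpq are hypothetical convex choices.

```python
# Minimal sketch (not from the talk): evaluate the convex MRF energy
# E(x) = sum_p D_p(x_p) + sum_(p,q) V_pq(x_q - x_p) on a small chain.

def mrf_energy(x, data_cost, pair_cost, edges):
    unary = sum(data_cost(p, x[p]) for p in range(len(x)))
    pairwise = sum(pair_cost(p, q, x[q] - x[p]) for (p, q) in edges)
    return unary + pairwise

# Toy instance: 4 pixels on a chain, K = 8 labels.
observed = [1, 5, 6, 2]                                   # hypothetical data
data_cost = lambda p, label: (label - observed[p]) ** 2   # convex D_p
pair_cost = lambda p, q, d: 2 * abs(d)                    # convex V_pq
edges = [(0, 1), (1, 2), (2, 3)]

print(mrf_energy([1, 4, 5, 2], data_cost, pair_cost, edges))   # prints 16
```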

Example: Panoramic image stitching [Levin, Zomet, Peleg, Weiss ’04]

• Main idea: gradients of output image x should match gradients of input images

• x = (x1, …, xn) – output image (e.g. xp ∈ {0, …, 255})

• Energy function:

$$E(x) = \sum_{(p,q)} V_{pq}(x_q - x_p)$$

[Figure: plot of Vpq as a function of xq − xp over the range −255 to 255]

• Vpq(·) is convex!
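
The slide does not spell out the exact form of Vpq; a plausible convex choice for gradient matching (my assumption, not stated in the talk) penalises the deviation of the output gradient xq − xp from a target gradient gpq taken from the input images, as in this Python sketch.

```python
# Sketch of a stitching-style pairwise term (assumed form, not from the slide):
# penalise deviation of the output gradient x_q - x_p from a target gradient
# g_pq taken from one of the input images; any convex penalty of d keeps
# V_pq convex in d = x_q - x_p.

def make_pair_cost(target_gradient):
    def pair_cost(p, q, d):
        return abs(d - target_gradient[(p, q)])   # L1 penalty, convex in d
    return pair_cost

# Toy example: hypothetical target gradients along a 3-pixel scanline.
g = {(0, 1): 3, (1, 2): -1}
Vpq = make_pair_cost(g)
print(Vpq(0, 1, 3), Vpq(0, 1, 0))   # 0 (gradient matched), 3 (mismatch penalised)
```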


Algorithms for MRF minimisation

• Arbitrary Dp(·), convex Vpq(·)
  – “Battleship” construction [Ishikawa’03], [Ahuja et al.’04]

• Construct graph with O(nK) nodes

• Minimum cut gives global minimum

– Needs a lot of memory!

• Convex Dp(·), convex Vpq(·)
  – Dual algorithms (maintain dual variables – flow f)

• [Karzanov et al. ’97], [Ahuja et al. ’03].

• Best complexity is O(nm log(n²/m) log(nK))

– Primal algorithm (maintains primal variables – configuration x)
  • Iterative min cut [Bioucas-Dias & Valadão’05]

• Advantage: relies on maxflow algorithm

• Complexity?

• New results (convex Dp, convex Vpq):

– Establishing complexity of primal algorithm
  • At most 2K steps

– Extending primal algorithm to primal-dual algorithm
  • Maintains both primal and dual variables

• Can be sped up using Dijkstra’s shortest path procedure

• Experimentally much faster than primal algorithm


Overview of primal algorithm (iterative min cut)

[Figure: for each pixel, a column of nodes, one node per label from label 0 up to label K−1; the current configuration x picks one node per pixel]

Primal algorithm (iterative min cut)

Graph: one node per pixel and per label (label 0, label 1, …, label K−1)

• Start with arbitrary configuration x

• Procedure UP:

$$b = \arg\min_{b \in \{0,1\}^n} E(x + b), \qquad x := x + b$$

• Procedure DOWN:

$$b = \arg\min_{b \in \{0,1\}^n} E(x - b), \qquad x := x - b$$

• Alternate UP and DOWN

[Animation: a sequence of UP and DOWN moves applied to x]

• Done!
  – UP and DOWN do not decrease energy
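
To illustrate the control flow of the primal algorithm, here is a Python sketch of the UP/DOWN loop (my own illustration); it minimises over b ∈ {0,1}^n by brute-force enumeration, whereas the talk solves each step with a single min cut / max flow computation.

```python
# Sketch of the iterative UP/DOWN primal algorithm (illustration only).
# Each step minimises E(x + b) or E(x - b) over b in {0,1}^n; here that is
# done by brute-force enumeration, whereas the talk solves it with one min cut.
from itertools import product

def best_move(x, sign, K, E):
    """Best x' = x + sign*b over b in {0,1}^n that stays inside the label range."""
    best = list(x)
    for b in product((0, 1), repeat=len(x)):
        y = [xi + sign * bi for xi, bi in zip(x, b)]
        if all(0 <= yi < K for yi in y) and E(y) < E(best):
            best = y
    return best

def primal_algorithm(x, K, E):
    while True:
        improved = False
        for sign in (+1, -1):                 # procedure UP, then procedure DOWN
            y = best_move(x, sign, K, E)
            if E(y) < E(x):
                x, improved = y, True
        if not improved:                      # neither UP nor DOWN decreases E
            return x

# Toy instance: 4-pixel chain, K = 8 labels, convex D_p and V_pq (hypothetical).
observed = [1, 5, 6, 2]
def E(x):
    return (sum((x[p] - observed[p]) ** 2 for p in range(4))
            + sum(2 * abs(x[p + 1] - x[p]) for p in range(3)))

print(primal_algorithm([0, 0, 0, 0], 8, E))
```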

Discussion

• [Bioucas-Dias & Valadão’05]:

– Procedure yields global minimum!
  • No unary terms Dp, terms Vpq are convex

– Straightforward extension to convex MRF functions
  • Convex Dp, convex Vpq

– Non-polynomial bound on the number of steps

• [Murota ’00,’03] (steepest descent algorithm)
  – Procedure yields global minimum for L♮-convex functions

• Convex MRF functions are a special case of L♮-convex functions
  – O(nK) bound on the number of steps

• New result:
  – Global minimum after at most 2K steps
  – Holds for L♮-convex functions (including convex MRF functions)

Contribution #1: Complexity of primal algorithm

Background

Two classes of functions

• Consider function E(x) = E(x1, …, xn)

– xp ∈ {0, …, K−1}

• Algorithm can be applied to any such function:

UP:

$$b = \arg\min_{b \in \{0,1\}^n} E(x + b), \qquad x := x + b$$

DOWN:

$$b = \arg\min_{b \in \{0,1\}^n} E(x - b), \qquad x := x - b$$

• Question #1: When can UP and DOWN be solved efficiently?

• Question #2: When does the algorithm yield a global minimum?

Submodular functions

L♮-convex functions

Submodular functions

• E is submodular if for all configurations x, y

$$E(x \wedge y) + E(x \vee y) \le E(x) + E(y)$$

– “∧” and “∨” are component-wise minimum/maximum:

$$(x \wedge y)_p = \min\{x_p, y_p\}, \qquad (x \vee y)_p = \max\{x_p, y_p\}$$

• Definition is extended from binary variables (K=2) to multi-valued variables (K>2)

• Can be minimised in time polynomial in K, n, m
  – Functions with unary, pairwise and ternary terms: reduction to min cut/max flow [Kovtun’04]
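
A brute-force Python sketch (illustration only) of this definition on a toy two-variable energy: it forms the component-wise minimum and maximum and checks the inequality over all pairs of configurations.

```python
# Sketch: check the submodularity inequality E(x^y) + E(xvy) <= E(x) + E(y)
# on all pairs of configurations of a small toy function (illustration only).
from itertools import product

def meet(x, y):   # component-wise minimum, (x ^ y)_p = min(x_p, y_p)
    return tuple(min(a, b) for a, b in zip(x, y))

def join(x, y):   # component-wise maximum, (x v y)_p = max(x_p, y_p)
    return tuple(max(a, b) for a, b in zip(x, y))

def is_submodular(E, n, K):
    configs = list(product(range(K), repeat=n))
    return all(E(meet(x, y)) + E(join(x, y)) <= E(x) + E(y)
               for x in configs for y in configs)

# Toy energy: arbitrary unary terms (they always cancel in the inequality)
# plus a convex pairwise term of x_q - x_p on a 2-variable chain.
E = lambda x: (x[0] % 3) + (x[1] - 2) ** 2 + abs(x[1] - x[0])
print(is_submodular(E, n=2, K=4))   # True: convex V_pq of the difference is submodular
```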

L♮-convex functions

• E is L♮-convex if for all configurations x, y

$$E\left(\left\lfloor \tfrac{x+y}{2} \right\rfloor\right) + E\left(\left\lceil \tfrac{x+y}{2} \right\rceil\right) \le E(x) + E(y)$$

– “⌊·⌋” and “⌈·⌉” are component-wise round-down and round-up (floor and ceiling)

• Note: in the continuous case, E is convex if for all x, y

$$2\,E\left(\tfrac{x+y}{2}\right) \le E(x) + E(y)$$
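
The same kind of brute-force check works for L♮-convexity; the Python sketch below (my own illustration) tests discrete midpoint convexity on a toy energy with convex terms and on one with a non-convex unary term.

```python
# Sketch: check discrete midpoint convexity (L-natural convexity) of a toy energy,
# E(floor((x+y)/2)) + E(ceil((x+y)/2)) <= E(x) + E(y), by brute force.
from itertools import product

def midpoints(x, y):
    lo = tuple((a + b) // 2 for a, b in zip(x, y))          # component-wise floor
    hi = tuple((a + b + 1) // 2 for a, b in zip(x, y))      # component-wise ceil
    return lo, hi

def is_Lnatural_convex(E, n, K):
    configs = list(product(range(K), repeat=n))
    for x in configs:
        for y in configs:
            lo, hi = midpoints(x, y)
            if E(lo) + E(hi) > E(x) + E(y):
                return False
    return True

# Convex D_p and convex V_pq of the difference -> L-natural convex energy.
E_convex = lambda x: (x[0] - 1) ** 2 + (x[1] - 3) ** 2 + abs(x[1] - x[0])
# Non-convex unary term -> still submodular, but not L-natural convex here.
E_nonconvex = lambda x: (x[0] % 2) + abs(x[1] - x[0])
print(is_Lnatural_convex(E_convex, n=2, K=5))      # True
print(is_Lnatural_convex(E_nonconvex, n=2, K=5))   # False: x[0] % 2 violates midpoint convexity
```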

Submodularity and L♮-convexity

• K=2: submodular functions = L♮-convex functions

• K>2: submodular functions ⊃ L♮-convex functions

• Example:

$$E(x) = \sum_{p} D_p(x_p) + \sum_{(p,q)} V_{pq}(x_q - x_p)$$

– Dp arbitrary, Vpq convex ⇒ E is submodular

– Dp convex, Vpq convex ⇒ E is L♮-convex

[Figure: the data term D1 plotted against x1, for an arbitrary profile and for a convex profile]

Contribution #1: Complexity of primal algorithm for L♮-convex functions

Proof overview

• For configuration x, define Δ⁺(x), Δ⁻(x)

– 0 ≤ Δ⁺(x), Δ⁻(x) < K

• Prove that UP and DOWN do not increase Δ⁺(x), Δ⁻(x)

• Prove that if Δ⁺(x) > 0, then UP will decrease it
  – Similarly for Δ⁻(x) and DOWN

• Prove that Δ⁺(x) = Δ⁻(x) = 0 implies that x is a global minimum


Property of submodular functions

• Let OPT(E) be the set of global minima of E

• There exist minimal and maximal optimal configurations:

$$x^{\min}, x^{\max} \in OPT(E), \qquad x^{\min} \le x \le x^{\max} \ \text{ for all } x \in OPT(E)$$

• In general, not true for non-submodular functions! (e.g. Potts interactions)

Defining Δ⁺(x)

• Step 1: Let E⁺ be a restriction of E to configurations y ≥ x

• Step 2: Let x⁺ be the minimal optimal configuration of E⁺

• Step 3: Define Δ⁺(x) = || x⁺ − x || = max_p { x⁺p − xp }

[Figure: configurations x and x⁺]

Defining Δ⁻(x)

• Step 1: Let E⁻ be a restriction of E to configurations y ≤ x

• Step 2: Let x⁻ be the maximal optimal configuration of E⁻

• Step 3: Define Δ⁻(x) = || x − x⁻ || = max_p { xp − x⁻p }

[Figure: configurations x and x⁻]
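
As a sanity check on these definitions, the following brute-force Python sketch (illustration only, not an efficient procedure) computes Δ⁺(x) and Δ⁻(x) for a small energy by enumerating the restricted configuration sets.

```python
# Brute-force sketch of Delta+(x) and Delta-(x) (illustration only):
# restrict E to y >= x (resp. y <= x), pick the minimal (resp. maximal)
# optimal configuration, and take the max-norm distance to x.
from itertools import product

def delta_plus(x, K, E):
    feasible = [y for y in product(range(K), repeat=len(x))
                if all(yp >= xp for yp, xp in zip(y, x))]
    best = min(E(y) for y in feasible)
    optima = [y for y in feasible if E(y) == best]
    # For a submodular restriction the component-wise min of optima is itself optimal.
    x_plus = tuple(min(y[p] for y in optima) for p in range(len(x)))
    return max(a - b for a, b in zip(x_plus, x))

def delta_minus(x, K, E):
    feasible = [y for y in product(range(K), repeat=len(x))
                if all(yp <= xp for yp, xp in zip(y, x))]
    best = min(E(y) for y in feasible)
    optima = [y for y in feasible if E(y) == best]
    x_minus = tuple(max(y[p] for y in optima) for p in range(len(x)))  # maximal optimum
    return max(a - b for a, b in zip(x, x_minus))

E = lambda x: (x[0] - 1) ** 2 + (x[1] - 3) ** 2 + abs(x[1] - x[0])
print(delta_plus((0, 0), 5, E), delta_minus((4, 4), 5, E))
```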

Algorithm’s behaviour

[Animation: over successive UP/DOWN steps, the two quantities shown on the slides decrease from (2, 3) to (1, 3), (0, 3), (0, 2), (0, 1), and finally (0, 0)]

Contribution #2: Primal-dual algorithm

Primal-dual algorithm

• Primal algorithm maintains only primal variables (configuration x)
  – Each maxflow problem is solved independently

• Motivation: reuse flow from previous computation

• New primal-dual algorithm
  – Applies to convex MRF functions

– Maintains both primal variables (configuration x) and dual variables (flow f )

– Upon termination, optimal x and f

– Can be sped up via Dijkstra’s algorithm

– Experimentally much faster than primal algorithm

Flow and reparameterisation

[Figure: edge (p, q) with terms Dp(xp), Vpq(xq − xp), Dq(xq); pushing a flow f along the edge reparameterises them as Dp(xp) + f·xp, Vpq(xq − xp) + f·(xq − xp) and Dq(xq) − f·xq]

Flow and reparameterisation

[Figure: reparameterised terms Dp(xp) + fp·xp and Vpq(xq − xp) + fpq·(xq − xp)]

• Flow: vector f = { fp , fpq } satisfying antisymmetry and flow conservation constraints

• Any flow defines a reparameterisation
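
The Python sketch below (my own check, not the talk's code) illustrates the reparameterisation: pushing a flow f along an edge adds f·xp to Dp, f·(xq − xp) to Vpq and subtracts f·xq from Dq (signs chosen so the total cancels), so every labeling keeps the same energy.

```python
# Sketch: a flow-style reparameterisation leaves the total energy unchanged.
# Pushing f along edge (p, q) adds f*x_p to D_p, f*(x_q - x_p) to V_pq,
# and subtracts f*x_q from D_q (signs reconstructed so the changes cancel).
from itertools import product

def energy(x, D, V, edges):
    return sum(D[p](x[p]) for p in range(len(x))) + \
           sum(V[(p, q)](x[q] - x[p]) for (p, q) in edges)

def reparameterise(D, V, edge, f):
    p, q = edge
    D2, V2 = dict(enumerate(D)), dict(V)
    D2[p] = lambda l, Dp=D[p]: Dp(l) + f * l
    D2[q] = lambda l, Dq=D[q]: Dq(l) - f * l
    V2[edge] = lambda d, Vpq=V[edge]: Vpq(d) + f * d
    return [D2[i] for i in range(len(D))], V2

D = [lambda l: (l - 1) ** 2, lambda l: (l - 3) ** 2]
V = {(0, 1): lambda d: 2 * abs(d)}
D2, V2 = reparameterise(D, V, (0, 1), f=5)

for x in product(range(4), repeat=2):          # every labeling keeps the same energy
    assert energy(x, D, V, [(0, 1)]) == energy(x, D2, V2, [(0, 1)])
print("reparameterisation preserves E(x) for all x")
```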

Optimality conditions

[Figure: reparameterised terms Dp(xp) + fp·xp and Vpq(xq − xp) + fpq·(xq − xp)]

• ( x, f ) is an optimal primal-dual pair iff

$$D_p^{f}(x_p) = \min_{z} D_p^{f}(z), \qquad V_{pq}^{f}(x_q - x_p) = \min_{z} V_{pq}^{f}(z)$$

where $D_p^{f}$ and $V_{pq}^{f}$ denote the unary and pairwise terms reparameterised by the flow f.
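
For completeness, here is a hedged Python sketch checking these optimality conditions for a given pair (x, f); the reparameterised terms and the ranges scanned by the minimisation are hypothetical stand-ins.

```python
# Sketch: check the primal-dual optimality conditions for a pair (x, f):
# each reparameterised unary term D^f_p and pairwise term V^f_pq must attain
# its minimum at the current x_p (resp. x_q - x_p). Ranges scanned are assumed.

def is_optimal_pair(x, D_f, V_f, edges, K):
    diffs = range(-(K - 1), K)                  # possible values of x_q - x_p
    unary_ok = all(D_f[p](x[p]) == min(D_f[p](z) for z in range(K))
                   for p in range(len(x)))
    pair_ok = all(V_f[e](x[e[1]] - x[e[0]]) == min(V_f[e](z) for z in diffs)
                  for e in edges)
    return unary_ok and pair_ok

# Toy reparameterised terms (hypothetical): both minimised at the labeling below.
D_f = [lambda l: abs(l - 2), lambda l: abs(l - 3)]
V_f = {(0, 1): lambda d: abs(d - 1)}
print(is_optimal_pair([2, 3], D_f, V_f, [(0, 1)], K=5))   # True
print(is_optimal_pair([2, 2], D_f, V_f, [(0, 1)], K=5))   # False: conditions violated
```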

Primal-dual algorithm

• UP:
  – MAXFLOW-UP
    • Construct graph for minimising E(x + b), b ∈ {0,1}^n
    • Compute maximum flow
    • Update x and f accordingly
  – DIJKSTRA-UP
    • Update x

• DOWN: similar

[Figure: optimality conditions at nodes and at edges]

• Algorithm’s property:
  – Maintains optimality condition for edges (but not necessarily for nodes)

DIJKSTRA-UP

• Increase xp until Dp(xp) starts increasing

• Maintain optimality condition for edges

• Compute maximal such labeling x
  – Dijkstra’s shortest path algorithm


• Complexity is preserved (at most 2K steps)

Experimental results

[Figures, three examples: input pair, maximal optimal configuration, minimal optimal configuration, and their average]

Running times

[Bar chart: running time in seconds (0–180 s) on the three examples, comparing the primal algorithm with the primal-dual algorithm]

Running times

• Initialisation is important

– If x = global minimum, then terminates in 2 steps

• Two-stage process:
  – Solve the problem in the overlap area

• Small graph

– Use it as initialisation for the full image
  • Experimentally, second stage takes 3 steps

Running times

• MCNF (minimum cost network flow, [Goldberg’97])

• Primal-dual, 1 stage

• Primal-dual, 2 stages

[Bar chart: running time in seconds (0–35 s) on the three examples for the three methods above]

Conclusions

• Complexity of primal algorithm for minimising L♮-convex functions
  – Tight bound on the number of steps
  – Improves bounds of Murota and Bioucas-Dias et al.

• New primal-dual algorithm
  – Applies to convex MRF functions
  – Experimentally much faster than primal algorithm
  – With good initialisation, outperforms MCNF