Post on 28-Nov-2020

Graph Cut based Inference with Co-occurrence Statistics

Ľubor Ladický, Chris Russell, Pushmeet Kohli, Philip Torr

Image labelling Problems

• Image Denoising

• Geometry Estimation

• Object Segmentation

Assign a label to each image pixel

[Figure: example labelling with labels Building, Sky, Tree, Grass]

Standard CRF Energy

Pairwise CRF models

Data term Smoothness term
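
The formula on this slide did not survive extraction. As a reminder, a pairwise CRF energy is usually written in the form below. The symbols are assumptions, not taken from the slide: \psi_i are unary potentials over the pixel set \mathcal{V}, and \psi_{ij} are pairwise potentials over the set \mathcal{E} of neighbouring pixel pairs.

```latex
E(\mathbf{x}) =
  \underbrace{\sum_{i \in \mathcal{V}} \psi_i(x_i)}_{\text{data term}}
  +
  \underbrace{\sum_{(i,j) \in \mathcal{E}} \psi_{ij}(x_i, x_j)}_{\text{smoothness term}}
```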

Restricted expressive power

Structures in CRF

• Taskar et al. 02 – associative potentials

• Kohli et al. 08 – segment consistency

• Woodford et al. 08 – planarity constraint

• Vicente et al. 08 – connectivity constraint

• Nowozin & Lampert 09 – connectivity constraint

• Roth & Black 09 – field of experts

• Ladický et al. 09 – consistency over several scales

• Woodford et al. 09 – marginal probability

• Delong et al. 10 – label occurrence costs

Standard CRF Energy for Object Segmentation

Pairwise CRF models

Cannot encode global consistency of labels!

Local context

Image from Torralba et al. 10

Detection Suppression

[Figure: detector outputs labelled road, table, chair; keyboard, table, car; road]

If we have 1000 categories (detectors), and each detector produces 1 false positive every 10 images, we will have 100 false alarms per image… pretty much garbage…

[Torralba et al. 10, Leibe & Schiele 09, Barinova et al. 10]

Encoding Co-occurrence

Co-occurrence is a powerful cue [Heitz et al. '08] [Rabinovich et al. '07]

• Thing – Thing

• Stuff – Stuff

• Stuff – Thing

[ Images from Rabinovich et al. 07 ]

Proposed solutions:

1. Csurka et al. 08 – Hard decision for label estimation

2. Torralba et al. 03 – GIST based unary potential

3. Rabinovich et al. 07 – Fully connected CRF

So...

What properties should these global co-occurrence potentials have?

Desired properties

1. No hard decisions

Incorporation in probabilistic framework

Unlikely possibilities are not completely ruled out

Desired properties

1. No hard decisions 2. Invariance to region size

Cost for occurrence of {people, house, road, etc.} is invariant to image area

The only possible solution:

Local context / Global context

Cost defined over the set of assigned labels L(x), i.e. the labels that appear at least once in the labelling x

[Figure: example label set L(x)]

Desired properties

1. No hard decisions 2. Invariance to region size 3. Parsimony – simple solutions preferred

L(x)={ building, tree, grass, sky }

L(x)={ aeroplane, tree, flower, building, boat, grass, sky }
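
A minimal sketch of properties 2 and 3, assuming a simple count-based cost K·|L(x)| purely for illustration (the function names and the value of K are hypothetical, not from the slides). It shows that a cost defined on L(x) ignores how many pixels carry each label, and that a labelling using more labels pays more.

```python
def label_set(labelling):
    """Return L(x): the set of labels used at least once in the labelling."""
    return frozenset(labelling)


def count_cost(labels, k=1.0):
    """Hypothetical parsimony cost K * |L(x)|, chosen only for illustration."""
    return k * len(labels)


# Two labellings of very different size that use the same four labels.
small = ["building", "tree", "grass", "sky"]
large = ["building"] * 500 + ["tree"] * 200 + ["grass"] * 200 + ["sky"] * 100

# A labelling that additionally uses three unlikely labels.
cluttered = small + ["aeroplane", "flower", "boat"]

cost_small = count_cost(label_set(small))
cost_large = count_cost(label_set(large))
cost_cluttered = count_cost(label_set(cluttered))

print(cost_small, cost_large, cost_cluttered)
```

The first two costs coincide because only the label set matters, not region size; the third is higher because the label set is larger.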

Desired properties

1. No hard decisions 2. Invariance to region size 3. Parsimony – simple solutions preferred 4. Efficiency

a) Memory requirements grow as O(n) with the image size and number of labels

b) Inference tractable

Previous work

• Torralba et al. (2003) – Gist-based unary potentials

• Rabinovich et al. (2007) – complete pairwise graphs

• Csurka et al. (2008) – hard estimation of labels present

Related work

• Zhu & Yuille 1996 – MDL prior

• Bleyer et al. 2010 – Surface Stereo MDL prior

• Hoiem et al. 2007 – 3D Layout CRF MDL prior

• Delong et al. 2010 – label occurrence cost

$C(x) = K \, |L(x)|$

$C(x) = \sum_{L} K_L \, \delta_L(x)$

All special cases of our model
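
A small sketch of why both formulas fit under a cost defined on the whole label set. All names, labels and prices here are made up for illustration; δ_L(x) is taken to mean "label L occurs somewhere in x". Each special case is tabulated as a lookup over subsets of labels, which is the most general form a label-set cost can take.

```python
from itertools import combinations

LABELS = ["cow", "grass", "boat", "water"]
K_PER_LABEL = {"cow": 2.0, "grass": 0.5, "boat": 2.0, "water": 0.5}  # made-up prices


def label_set(labelling):
    """L(x): labels that occur at least once."""
    return frozenset(labelling)


def mdl_cost(labels, k):
    """C(x) = K |L(x)|: the same price K for every label present."""
    return k * len(labels)


def occurrence_cost(labels, per_label):
    """C(x) = sum_L K_L delta_L(x): a label-specific price for each label present."""
    return sum(per_label[label] for label in labels)


def all_subsets(labels):
    """Enumerate every subset of the label list as a frozenset."""
    for size in range(len(labels) + 1):
        for combo in combinations(labels, size):
            yield frozenset(combo)


# Tabulate both special cases as general set functions C(L).
mdl_table = {s: mdl_cost(s, 1.0) for s in all_subsets(LABELS)}
occ_table = {s: occurrence_cost(s, K_PER_LABEL) for s in all_subsets(LABELS)}

present = label_set(["cow", "grass", "grass", "cow"])
print(mdl_table[present], occ_table[present])
```

A general table can also hold entries that neither formula can express, such as an extra charge when two particular labels appear together.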

Inference

Pairwise CRF Energy

Inference

IP formulation (Schlesinger 73)

Pairwise CRF Energy with co-occurrence

IP formulation with co-occurrence

• Pairwise CRF cost

• Pairwise CRF constraints

• Co-occurrence cost

• Inclusion constraints

• Exclusion constraints
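
The equations on these slides were lost in extraction. For orientation only, the energy with a co-occurrence term can be written as below. The notation is assumed rather than taken from the slides: \psi_i and \psi_{ij} are the unary and pairwise potentials, and C is a cost on the set of labels L(\mathbf{x}) used by the labelling. The exact variables and constraints of the integer program are not reproduced here.

```latex
E(\mathbf{x}) =
  \sum_{i \in \mathcal{V}} \psi_i(x_i)
  + \sum_{(i,j) \in \mathcal{E}} \psi_{ij}(x_i, x_j)
  + C\big(L(\mathbf{x})\big),
\qquad
L(\mathbf{x}) = \{\, l : \exists\, i \in \mathcal{V},\ x_i = l \,\}
```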

LP relaxation

Relaxed constraints

Very slow! An 80 × 50 subsampled image takes 20 minutes

Inference: Our Contribution

Pairwise representation

• One auxiliary variable Z ∈ 2^L, ranging over subsets of the label set L

• Infinite pairwise costs if x_i ∉ Z [see technical report]

• Solvable using standard methods: BP, TRW etc.

Relatively faster but still computationally expensive!
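
A brute-force sketch of the auxiliary-variable idea, under two assumptions that are not stated on the slide: the auxiliary variable ranges over subsets of the label set, and the label-set cost never decreases when a label is added. The cost function and label names are hypothetical. Minimising over the auxiliary variable then recovers the cost of the labels actually used.

```python
from itertools import combinations, product

INF = float("inf")
LABELS = ("a", "b", "c")


def subsets(labels):
    """Enumerate every subset of the label tuple as a frozenset."""
    for size in range(len(labels) + 1):
        for combo in combinations(labels, size):
            yield frozenset(combo)


def cost(label_subset):
    """Hypothetical monotone cost: 1 per label, plus 5 if 'a' and 'c' co-occur."""
    penalty = 5.0 if {"a", "c"} <= label_subset else 0.0
    return len(label_subset) + penalty


def pairwise_term(x_i, z):
    """Pairwise cost between a pixel label and the auxiliary state z."""
    return 0.0 if x_i in z else INF


def min_over_auxiliary(x):
    """min over z of cost(z) + sum_i pairwise_term(x_i, z)."""
    return min(
        cost(z) + sum(pairwise_term(x_i, z) for x_i in x)
        for z in subsets(LABELS)
    )


# Check every labelling of three pixels.
agrees = all(
    min_over_auxiliary(x) == cost(frozenset(x))
    for x in product(LABELS, repeat=3)
)
print(agrees)
```

The auxiliary variable has exponentially many states in the number of labels, which is consistent with the slide's remark that this route is still expensive.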

Inference using Moves

Graph Cut based move making algorithms [Boykov, Veksler, Zabih 01]

α-expansion transformation function

• Series of locally optimal moves

• Each move reduces energy

• Optimal move by minimizing submodular function

Space of Solutions (x): L^N

Move Space (t): 2^N

N: number of variables

L: number of labels

[Figure: search neighbourhood around the current solution]
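
A minimal sketch of the α-expansion transformation function and of the two space sizes. The convention that t_i = 0 switches pixel i to α and t_i = 1 keeps its current label is an assumption (the slide's formula was lost), and the function name is hypothetical.

```python
def expand(x, t, alpha):
    """Apply one alpha-expansion move.

    Assumed convention: pixel i takes alpha if t[i] == 0, else keeps x[i].
    """
    return [alpha if t_i == 0 else x_i for x_i, t_i in zip(x, t)]


x = ["sky", "sky", "tree", "grass"]
t = [1, 0, 0, 1]
moved = expand(x, t, "building")

num_labels = 4                  # sky, tree, grass, building
num_vars = len(x)
solution_space = num_labels ** num_vars   # L^N labellings in total
move_space = 2 ** num_vars                # 2^N labellings reachable in one move

print(moved, solution_space, move_space)
```

Each move searches only 2^N of the L^N labellings, but the best move in that neighbourhood can be found with a graph cut when the move energy is submodular.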

Inference using Moves

Label indicator functions

Co-occurrence representation

Inference using Moves

Move Energy

Cost of current label set

Decomposition to α-dependent and α-independent part

α-independent α-dependent

Either α or all labels in the image after the move
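
One reading of this statement is that, after any α-expansion move, the image either contains α or still contains exactly the labels it had before. The sketch below checks that reading by brute force on a toy labelling; the convention t_i = 0 meaning "switch to α" and all names are assumptions.

```python
from itertools import product


def expand(x, t, alpha):
    """Assumed convention: pixel i takes alpha if t[i] == 0, else keeps x[i]."""
    return [alpha if t_i == 0 else x_i for x_i, t_i in zip(x, t)]


def alpha_or_all_labels(x, alpha):
    """True if every move leaves alpha present or the label set unchanged."""
    current = set(x)
    for t in product((0, 1), repeat=len(x)):
        after = set(expand(x, t, alpha))
        if not (alpha in after or after == current):
            return False
    return True


x = ["sky", "tree", "tree", "grass", "sky"]
holds = alpha_or_all_labels(x, "building")
print(holds)
```

Under this reading the label set after a move is determined by which of the current labels survive, which is what allows the cost to be split into an α-dependent and an α-independent part.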

submodular non-submodular

Inference

Move Energy

non-submodular

Non-submodular energy overestimated by E'(t)

– E'(t) = E(t) for current solution

– E'(t) ≥ E(t) for any other labelling
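
These two conditions are enough to guarantee that a move never increases the true energy. As a short derivation, with symbols that are assumptions rather than slide notation: let t^{0} be the move that keeps the current solution and t^{*} a minimiser of the overestimate E'.

```latex
E(t^{*}) \;\le\; E'(t^{*}) \;\le\; E'(t^{0}) \;=\; E(t^{0})
```

The first inequality is the overestimation property, the second holds because t^{*} minimises E', and the final equality is tightness at the current solution.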

Occurrence - tight

Co-occurrence overestimation

General case [See the paper]

Quadratic representation

Application: Object Segmentation

Standard MRF model for Object Segmentation

Label based Costs

Cost defined over the assigned labels L(x)

Training of label based potentials

Indicator variables for occurrence of each label

Label set costs

Approximated by 2nd order representation
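
A sketch of what a second-order approximation of a label-set cost could look like: one coefficient per label present plus one per pair of labels present together. The functional form is an assumption based on the phrase "2nd order representation", and the coefficients below are invented; the transcript does not say how they are trained.

```python
from itertools import combinations


def second_order_cost(present, unary, pairwise):
    """C(L) ~ sum_l c_l [l in L] + sum_{k<l} c_kl [k in L][l in L].

    `pairwise` is keyed by alphabetically sorted label pairs; missing pairs cost 0.
    """
    labels = sorted(present)
    cost = sum(unary[label] for label in labels)
    cost += sum(pairwise.get(pair, 0.0) for pair in combinations(labels, 2))
    return cost


# Invented coefficients for illustration only.
unary = {"cow": 1.0, "grass": 0.5, "boat": 1.0, "water": 0.5}
pairwise = {("boat", "cow"): 4.0}   # cows and boats rarely share an image

likely = second_order_cost({"cow", "grass"}, unary, pairwise)
unlikely = second_order_cost({"cow", "boat"}, unary, pairwise)
print(likely, unlikely)
```

Only O(|L|^2) coefficients are needed this way, instead of one entry per subset of labels.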

Experiments

• Methods

– Segment CRF

– Segment CRF + Co-occurrence Potential

– Associative HCRF [Ladický et al. '09]

– Associative HCRF + Co-occurrence Potential

• Datasets

MSRC-21

• Number of Images: 591

• Number of Classes: 21

• Training Set: 50%

• Test Set: 50%

PASCAL VOC 2009

• Number of Images: 1499

• Number of Classes: 21

• Training Set: 50%

• Test Set: 50%

MSRC – Qualitative

VOC 2010 – Qualitative

Quantitative Results

MSRC-21

PASCAL VOC 2009

Summary and further work

• Incorporated label based potentials in CRFs

• Proposed feasible inference

• Open questions

– Optimal training method for co-occurrence

– Bounds of graph cut based inference

• Questions?