Discussion of Pictorial Structures Pedro Felzenszwalb Daniel Huttenlocher Sicily Workshop September,...

Discussion of Pictorial Structures

Pedro Felzenszwalb
Daniel Huttenlocher

Sicily Workshop
September, 2006

What are Pictorial Structures?

Local appearance
– Part models
– Parts feature detection

Global geometry
– Not necessarily fully connected graph

Joint optimization
– Combine appearance and geometry without hard constraints
  • “Stretch and fit”
  • Qualitative

Pictorial Structure Models

Parts have match quality at each location
– Location in a configuration space
– No feature detection

Maps for parts combined together into overall quality map
– According to underlying graph structure
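
One common way to write this combination is as an energy minimization over part locations. The notation below is assumed for illustration and does not appear on the slides: l_i is the location of part i, m_i(l_i) is the mismatch cost of part i at that location (the negated quality map), d_ij is the deformation cost between connected parts, and E is the edge set of the underlying graph.

```latex
L^{*} = \arg\min_{L = (l_1, \dots, l_n)}
  \left( \sum_{i=1}^{n} m_i(l_i) \;+\; \sum_{(i,j) \in E} d_{ij}(l_i, l_j) \right)
```

When the graph is a tree, each sum over edges can be minimized one part at a time from the leaves toward the root, which is what makes the overall quality map cheap to compute.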

A History of Pictorial Structures

Fischler and Elschlager original 1973 paper

Burl, Weber and Perona ECCV 1998
– Probabilistic formulation
– Full joint Gaussian spatial model
– Computational challenges led to feature-based

Felzenszwalb and Huttenlocher CVPR 2000
– Explicit revisiting of FE73 for trees, probabilistic
– Efficient algorithms using distance transforms

Crandall et al. CVPR 2005, ECCV 2006
– Low tree-width graph structures, unsupervised

Matching Pictorial Structures

Cost map for each part

Distance transform (soft max) using spatial model

Shift and combine
– Localize root then recursively other parts
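
The three steps above can be sketched for a star-shaped model on a 1D grid of locations. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `dt1d` and `match_star`, the quadratic deformation cost with weight `w`, and the clamping of ideal part positions at the grid border are all choices made here for illustration. The lower-envelope procedure in `dt1d` is the linear-time generalized distance transform for quadratic costs.

```python
import numpy as np

def dt1d(f, w=1.0):
    """Generalized distance transform on a 1D grid (hypothetical helper).

    Returns d[x] = min_q f[q] + w * (x - q)**2 and the minimizing q for each x,
    computed in linear time via the lower envelope of parabolas.
    """
    f = np.asarray(f, dtype=float)
    n = len(f)
    v = np.zeros(n, dtype=int)      # grid positions of parabolas in the envelope
    z = np.empty(n + 1)             # boundaries between consecutive parabolas
    z[0], z[1] = -np.inf, np.inf
    k = 0
    for q in range(1, n):
        while True:
            p = v[k]
            # Intersection of the parabola rooted at q with the one rooted at p.
            s = ((f[q] + w * q * q) - (f[p] + w * p * p)) / (2.0 * w * (q - p))
            if s <= z[k]:
                k -= 1              # parabola p is hidden, drop it
            else:
                break
        k += 1
        v[k] = q
        z[k], z[k + 1] = s, np.inf
    d = np.empty(n)
    arg = np.empty(n, dtype=int)
    k = 0
    for x in range(n):
        while z[k + 1] < x:
            k += 1
        arg[x] = v[k]
        d[x] = f[v[k]] + w * (x - v[k]) ** 2
    return d, arg

def match_star(root_cost, part_costs, offsets, w=1.0):
    """Match a star model: one root plus parts at ideal offsets from the root.

    Assumption: ideal positions that fall outside the grid are clamped to the
    border, which is a simplification made for this sketch.
    """
    n = len(root_cost)
    total = np.asarray(root_cost, dtype=float).copy()
    back = []
    for cost, off in zip(part_costs, offsets):
        d, arg = dt1d(cost, w)                       # distance transform of the part's cost map
        idx = np.clip(np.arange(n) + off, 0, n - 1)  # shift into the root's frame
        total += d[idx]                              # combine with the root's map
        back.append(arg[idx])
    r = int(np.argmin(total))                        # localize the root
    return r, [int(b[r]) for b in back], float(total[r])  # then read off the parts
```

No part location is thresholded before the final `argmin`: every location of every part contributes through its transformed cost map, and the only decision is made at the root.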

Learning Models

Automatically determine which spatial relationships to represent [FH03]

Weakly supervised learning [CH06]
– Learn part appearance and geometric relations simultaneously
– No labeling of part locations
– Use large number of patches, similar to Ullman
– Better detection accuracy than strongly supervised

Car (rear) star topology

Parts as Context

No part detected without using context provided by other parts
– Detect overall configuration composed of parts in a spatial arrangement

Allows for weak evidence for a part
– Unlike feature detection

Combination of matches can constrain pose

In contrast to scene-level context
– More spatial regularity

Factored Models

For n parts in fixed arrangement with k templates per part
– Exponential number of possibilities, O(k^n)

For variable arrangement, another exponential factor

Important both for representation and algorithmic efficiency

Pictorial structures takes particular advantage of this factoring
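
A small worked count makes the gap concrete. This is an illustrative sketch: the function names and the example values n = 6 and k = 10 are assumptions chosen here, not figures from the talk. An unfactored model needs one template per joint choice of part appearances, while a factored model stores each part's templates once.

```python
def joint_template_count(n, k):
    """Whole-object templates needed without factoring: one per combination."""
    return k ** n

def factored_template_count(n, k):
    """Part templates stored when appearance is factored by part."""
    return n * k

n_parts, k_templates = 6, 10  # hypothetical example values
print(joint_template_count(n_parts, k_templates))     # 1000000
print(factored_template_count(n_parts, k_templates))  # 60
```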

Closely Related Work

Ioffe and Forsyth, Ramanan and Forsyth human body pose
– Part detection but very “dense” part locations

Constellation models
– Fergus, Perona, Zisserman and others
– Hard feature detection in contrast with BWP98 soft feature matching

Amit’s patch models
– No assumption of independent part appearance

Fergus and Zisserman star models

What’s Important

No decisions until the end
– No feature detection
  • Quality maps or likelihoods
– No hard geometric constraints
  • Deformation costs or priors

Efficient algorithms
– Dynamic programming is critical; without it, can’t get away without making intermediate decisions
– Not applicable to all problems, need good factorizations of geometry and appearance

Some Pros

Good for categorical object recognition
– Qualitative descriptions of appearance
– Factoring variability in appearance and geometry

Deals well with occlusion
– In contrast to hard feature detection

Weakly supervised learning algorithms

Sampling as a way of dealing with models that don’t factor (more on Saturday)

Some Cons/Limitations

Most applicable to 2D objects defined by relatively small number of parts

Unclear how to extend to large number of transformation parameters per part
– Explicit representation grows exponentially

No known way of using to index into model databases

Role of Spatial Constraints

For k-fans, spatial information substantially improves detection accuracy
– However, limited by relatively small number of parts compared to features in a bag

General question
