An Efficient Branch-and-Bound Algorithm for Optimal Human Pose Estimation

Min Sun, Murali Telaprolu, Honglak Lee, Silvio Savarese

Vision Lab, Dept. of Electrical Engineering and Computer Science, University of Michigan at Ann Arbor, USA

1. Overview
GOAL: Propose an efficient and exact inference algorithm based on branch-and-bound (BB) to solve the human pose estimation problem on loopy graphical models.

Motivation:
- Cast the human pose estimation problem as a MAP-MRF inference problem.
- Solving MAP inference on general MRFs is challenging.

Given the model

$$f(h; \theta) = \sum_{i \in N} \theta^u_i(h_i) + \sum_{ij \in \varepsilon} \theta^p_{ij}(h_i, h_j),$$

search for the best body part locations

$$h^{MAP} = \arg\max_h f(h; \theta).$$

Models:
[Figure: a tree model and a loopy model; legend: inferred state variables, tree edges, loopy edges, $\theta^u$, $\theta^p$.]
- Tree Model [3,8,11]. Pros: efficient inference by dynamic programming ($O(H^2)$). Cons: approximated model; common misclassification errors.
- Loopy Model [6,7,9,10]. Pros: more interactions between parts. Cons: exact inference is NP-hard; requires approximate inference [1] or a reduced state space.

Contributions:
- Solve loopy models with a large number of part hypotheses exactly.
- Efficiency is provably guaranteed when the number of nodes is small.
- Up to 74 times faster than competing techniques [2].
- Enable more accurate results than non-loopy models [3].

Key Intuitions:
- Flexible bound by relaxing the loopy model into a mixture of star models.
- Novel data structure (BMT) and an efficient search routine (OBMS) to significantly reduce the time complexity of calculating the bound in each branch of the BB search, from $O(H^2)$ to $O(Q \log_2 H)$ ($Q \ll H$).

Source code available online (http://www.eecs.umich.edu/vision/BBproj.html).
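As a concrete reading of the objective $f(h; \theta)$, here is a minimal Python sketch that scores a pose hypothesis and finds the MAP by exhaustive search on a toy loopy model. The function names (`score`, `brute_force_map`), the data layout, and all numbers are illustrative assumptions and are not taken from the poster or its released code. Exhaustive search is shown only as a reference for tiny problems.

```python
import itertools

def score(h, theta_u, theta_p):
    """f(h; theta): unary terms over parts plus pairwise terms over edges."""
    unary = sum(theta_u[i][h[i]] for i in range(len(h)))
    pairwise = sum(table[h[i]][h[j]] for (i, j), table in theta_p.items())
    return unary + pairwise

def brute_force_map(theta_u, theta_p):
    """Exhaustive MAP search; cost grows as H^n, so toy sizes only."""
    domains = [range(len(u)) for u in theta_u]
    return max(itertools.product(*domains),
               key=lambda h: score(h, theta_u, theta_p))

# Hypothetical loopy model: 3 parts in a triangle, 2 candidate locations each.
theta_u = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.0]]
theta_p = {
    (0, 1): [[0.0, -1.0], [1.0, 0.0]],
    (1, 2): [[0.0, 0.0], [-3.0, 1.0]],
    (0, 2): [[2.0, 0.0], [0.0, 0.0]],
}

h_map = brute_force_map(theta_u, theta_p)
print(h_map, score(h_map, theta_u, theta_p))  # (0, 0, 0) 3.5
```

The triangle of edges is what makes this model loopy: removing any one edge would leave a tree that dynamic programming could solve directly.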

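2. Branch-and-Bound Basics
- Bound: $(LB(\mathcal{H}_N),\ UB(\mathcal{H}_N))$ of $f(h^{MAP}; \theta)$.
- Branching: $\mathcal{H}_N = \mathcal{H}_N^1 \cup \mathcal{H}_N^2$; $\mathcal{H}_N^1 \cap \mathcal{H}_N^2 = \emptyset$.
- Stopping criterion: $UB(\mathcal{H}_N) = LB(\mathcal{H}_N)$.

[Figure: BB search tree; $\mathcal{H}_N$ is split into $\mathcal{H}_N^1$ and $\mathcal{H}_N^2$, and $\mathcal{H}_N^1$ is split further into $\mathcal{H}_N^{11}$ and $\mathcal{H}_N^{12}$.]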
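The bound, branch, and stop steps can be sketched as a generic best-first search. This is a minimal illustration under stated assumptions, not the authors' implementation: the callback names (`upper`, `lower`, `split`), the two-variable toy problem, and the simple "sum of independent maxima" upper bound are all hypothetical stand-ins chosen to keep the example short.

```python
import heapq
import itertools

def branch_and_bound(root, upper, lower, split):
    """Best-first BB. upper(region) -> UB; lower(region) -> (LB, h)."""
    counter = itertools.count()        # tie-breaker: regions are never compared
    best_val, best_h = lower(root)
    heap = [(-upper(root), next(counter), root)]
    while heap:
        neg_ub, _, region = heapq.heappop(heap)
        if -neg_ub <= best_val:        # best remaining UB meets the incumbent LB
            break
        for child in split(region):
            lb, h = lower(child)
            if lb > best_val:
                best_val, best_h = lb, h
            ub = upper(child)
            if ub > best_val:          # otherwise the child cannot win: prune
                heapq.heappush(heap, (-ub, next(counter), child))
    return best_val, best_h

# Hypothetical toy problem: maximize U0[x] + U1[y] + P[x][y].
U0 = [3, 1, 0, 2]
U1 = [0, 4, 1, 1]
P = [[0, -5, 0, 0], [0, 2, 0, 0], [1, 0, 0, 3], [0, 0, 2, 0]]

def f(x, y):
    return U0[x] + U1[y] + P[x][y]

def upper(region):
    xs, ys = region                    # relax: maximize each term independently
    return (max(U0[x] for x in xs) + max(U1[y] for y in ys)
            + max(P[x][y] for x in xs for y in ys))

def lower(region):
    h = (region[0][0], region[1][0])   # any feasible point gives a lower bound
    return f(*h), h

def split(region):
    k = 0 if len(region[0]) >= len(region[1]) else 1
    states = region[k]
    if len(states) < 2:
        return []
    mid = len(states) // 2
    return [region[:k] + (half,) + region[k + 1:]
            for half in (states[:mid], states[mid:])]

root = (tuple(range(4)), tuple(range(4)))
best_val, best_h = branch_and_bound(root, upper, lower, split)
print(best_val, best_h)  # 7 (1, 1)
```

The search is exact because a region is discarded only when its upper bound cannot exceed the best lower bound found so far.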

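3. Flexible Bound
[Figure: star-model relaxation with parts Torso, Head, and Upper Arm, labeled with $\theta^u$ and $\beta$.]

Mixture of Star Models (time complexity $O(H^2)$):

$$f^R(h; \theta^u, \beta) = \sum_{i \in N} \Big( \theta^u_i(h_i) + \sum_{j \in N_i} \max_{\hat{h}_j \in \mathcal{H}_j} \beta_{ij}(h_i, \hat{h}_j) \Big)$$

$$\text{s.t.} \quad \beta_{ij}(h_i, h_j) + \beta_{ji}(h_j, h_i) = \theta^p_{ij}(h_i, h_j).$$

Bounds:

$$UB(\mathcal{H}_N; \theta^u, \beta) = \max_{h \in \mathcal{H}_N} f^R(h; \theta^u, \beta) = \sum_{i \in N} \max_{h_i \in \mathcal{H}_i} \Big( \theta^u_i(h_i) + \sum_{j \in N_i} \max_{\hat{h}_j \in \mathcal{H}_j} \beta_{ij}(h_i, \hat{h}_j) \Big) \ge f(h^{MAP}; \theta)$$

$$LB(\mathcal{H}_N; \theta^u, \beta) = f(h^*(\mathcal{H}_N); \theta) \le f(h^{MAP}; \theta) \quad \text{(constant time)}$$

Tightness of the Bound:
- $\beta$ controls the tightness of the bound.
- Primal-Dual-Gap (PDG) $= UB(\mathcal{H}_N; \theta^u, \beta) - LB(\mathcal{H}_N; \theta^u, \beta)$.
- PDG is reduced by solving for $\beta$ using message passing [1].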
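To make the star-model relaxation concrete, the sketch below computes the upper and lower bounds for a toy triangle model. It assumes the simplest valid choice of $\beta$, splitting each pairwise table evenly between its two endpoints; the poster instead tunes $\beta$ by message passing [1], which this sketch does not attempt. All names and numbers are illustrative assumptions.

```python
def score(h, theta_u, theta_p):
    """Original objective f(h; theta)."""
    return (sum(theta_u[i][h[i]] for i in range(len(h)))
            + sum(t[h[i]][h[j]] for (i, j), t in theta_p.items()))

def star_bounds(theta_u, theta_p, domains):
    """Return (UB, LB, h_star) for the region given by per-part domains."""
    # Assumed split: beta_ij(h_i, h_j) = beta_ji(h_j, h_i) = theta_p / 2,
    # stored as beta[(i, j)][h_i][h_j] so the two shares sum to theta_p.
    beta = {}
    for (i, j), t in theta_p.items():
        rows, cols = range(len(t)), range(len(t[0]))
        beta[(i, j)] = [[0.5 * t[a][b] for b in cols] for a in rows]
        beta[(j, i)] = [[0.5 * t[a][b] for a in rows] for b in cols]

    ub, h_star = 0.0, []
    for i in range(len(theta_u)):
        def lam(hi, i=i):
            # Star centred at part i: every neighbour picks its own best state.
            return theta_u[i][hi] + sum(
                max(b[hi][hj] for hj in domains[j])
                for (c, j), b in beta.items() if c == i)
        best = max(domains[i], key=lam)
        ub += lam(best)
        h_star.append(best)
    lb = score(h_star, theta_u, theta_p)   # any feasible h gives a lower bound
    return ub, lb, h_star

# Hypothetical triangle model with 2 states per part.
theta_u = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.0]]
theta_p = {
    (0, 1): [[0.0, -1.0], [1.0, 0.0]],
    (1, 2): [[0.0, 0.0], [-3.0, 1.0]],
    (0, 2): [[2.0, 0.0], [0.0, 0.0]],
}
domains = [[0, 1], [0, 1], [0, 1]]
ub, lb, h_star = star_bounds(theta_u, theta_p, domains)
print(ub, lb, h_star)  # 6.0 1.5 [0, 1, 0]
```

On this toy model the true MAP value is 3.5, so the bounds bracket it with a gap of 4.5. The even split is loose here, which illustrates why $\beta$ is worth optimizing and why branching is still needed.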

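4. Efficient Bound
Naive Bound: time complexity $O(H^2)$; H is the number of states.

Branch-Max-Tree (BMT): similar to [5].
- 1D array: $\max_{h_j \in \mathcal{H}_j} A[h_j] = \max_{h_j \in \mathcal{H}_j} \beta_{ij}(h_i, h_j)$.
- Efficient data structure: given A = [10 8 -1 5], the BMT has leaves [10 8 -1 5], branch maxima [10 5], and root 10, so that Max(A[0..3]) = 10, Max(A[0,1]) = 10, and Max(A[2,3]) = 5.
- Building time: O(H). Query time: O(1).
- Time complexity: O(H).

Opportunity Branch-Max-Search (OBMS)
- Re-define $UB(\mathcal{H}_N; \theta^u, \beta) = \sum_{i \in N} \max_{h_i \in \mathcal{H}_i} \lambda_i(h_i; \mathcal{H}_N)$.
- 1D array: $\max_{h_i \in \mathcal{H}_i} A[h_i] = \max_{h_i \in \mathcal{H}_i} \lambda_i(h_i; \mathcal{H}_N)$.
- Observations: 1) only a few states compete for the max; 2) A is a bound for $\hat{\mathcal{H}}_N \subset \mathcal{H}_N$: $A[h_i] \ge \hat{A}[h_i] = \lambda_i(h_i; \hat{\mathcal{H}}_N)$.
- Opportunity search for $\max_{h_i \in \mathcal{H}_i} \hat{A}[h_i]$ by updating the BMT. Given A = [10 8 -1 5] and Â = [6 7 -2 4], where A[0] was the max:
  1. Update BMT by Â[0]: leaves [6 8 -1 5], branch maxima [8 5], root 8. Is Â[0] the max? No.
  2. Update BMT by Â[1]: leaves [6 7 -1 5], branch maxima [7 5], root 7. Is Â[1] the max? Yes.
  The remaining entries of A are upper bounds on the corresponding entries of Â, so the search can stop here.
- Time complexity: $O(Q \log_2 H)$, $Q \ll H$.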
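The worked example with A = [10 8 -1 5] and Â = [6 7 -2 4] can be replayed with a small max-tree. This is a sketch under assumptions: it uses an ordinary array-backed segment tree with point updates as a stand-in for the BMT, and the class and function names (`BranchMaxTree`, `opportunistic_max`) are hypothetical. It reproduces the update-and-check loop, not the exact query structure of [5].

```python
class BranchMaxTree:
    """Array-backed binary max-tree over a 1D array (stand-in for the BMT)."""

    def __init__(self, values):
        self.size = 1
        while self.size < len(values):
            self.size *= 2
        self.tree = [float("-inf")] * (2 * self.size)
        self.tree[self.size:self.size + len(values)] = values
        for k in range(self.size - 1, 0, -1):          # O(H) build
            self.tree[k] = max(self.tree[2 * k], self.tree[2 * k + 1])

    def update(self, idx, value):
        """Replace one leaf and refresh its ancestors: O(log H)."""
        k = self.size + idx
        self.tree[k] = value
        k //= 2
        while k >= 1:
            self.tree[k] = max(self.tree[2 * k], self.tree[2 * k + 1])
            k //= 2

    def max(self):
        return self.tree[1]

    def argmax(self):
        k = 1
        while k < self.size:                            # walk down to the max leaf
            k = 2 * k if self.tree[2 * k] >= self.tree[2 * k + 1] else 2 * k + 1
        return k - self.size

def opportunistic_max(bmt, tightened):
    """Refresh only the current max leaf until a refreshed leaf stays on top.

    Assumes tightened(idx) <= the stored value, so stale leaves remain
    valid upper bounds. Returns (argmax, max, number of refreshed leaves).
    """
    refreshed = set()
    while True:
        idx = bmt.argmax()
        if idx in refreshed:
            return idx, bmt.max(), len(refreshed)
        bmt.update(idx, tightened(idx))
        refreshed.add(idx)

A = [10, 8, -1, 5]
A_hat = [6, 7, -2, 4]
bmt = BranchMaxTree(A)
initial = (bmt.max(), bmt.tree[2], bmt.tree[3])
idx, value, q = opportunistic_max(bmt, lambda k: A_hat[k])
print(initial)        # (10, 10, 5)
print(idx, value, q)  # 1 7 2
```

Only two of the four leaves are refreshed before the maximum of Â is certified, which is the saving the $O(Q \log_2 H)$ figure refers to when the number of refreshed states stays small.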

6. Experiment Results
Human Pose Estimation (HPE): extending a tree model (CPS) [3] to a fully-connected model.

Video Pose Estimation: solving the Stretchable Model (SM) [4].

Baseline method: CP [2], with time complexity $O(H^q)$ (q is the max cluster size).

[Figure: qualitative results on Buffy, Stickmen, and VideoPose2, ours vs. Sapp et al.; legend: torso, upper arms, lower arms, head.]

[Figure: log-log scatter plot of our time (sec) vs. CP time (sec), both axes from 10^-1 to 10^4, with hard problems marked.]

[Figure: accuracy (%) vs. pixel error threshold (15 to 40) for elbow and wrist, ours vs. [4].]

- With OBMS, 74 times faster than CP [2].
- Solve 13 out of 18 sequences within 20 minutes.

References
[1] A. Globerson and T. Jaakkola. Fixing max-product: Convergent message passing algorithms for MAP LP-relaxations. In NIPS’08.

[2] D. Sontag, T. Meltzer, A. Globerson, Y. Weiss, and T. Jaakkola. Tightening LP relaxations for MAP using message-passing. In UAI’08.

[3] B. Sapp, A. Toshev, and B. Taskar. Cascaded models for articulated pose estimation. In ECCV’10.

[4] B. Sapp, D. Weiss, and B. Taskar. Parsing human motion with stretchable models. In CVPR’11.

[5] D. Harel and R. E. Tarjan. Fast algorithms for finding nearest common ancestors. SIAM Journal on Computing, 1984.

[6] M. Sun, M. Telaprolu, H. Lee, and S. Savarese. Efficient and Exact MAP-MRF Inference using Branch and Bound. In AISTATS’12.

[7] M. Sun, M. Telaprolu, H. Lee, and S. Savarese. An Efficient Branch-and-Bound Algorithm for Optimal Human Pose Estimation. In CVPR’12.

[8] P. Felzenszwalb and D. P. Huttenlocher. Pictorial Structures for Object Recognition. In IJCV’05.

[9] D. Tran and D. Forsyth. Improved human parsing with a full relational model. In ECCV’10.

[10] T. P. Tian and S. Sclaroff. Fast globally optimal 2D human detection with loopy graph models. In CVPR’10.

[11] Y. Yang and D. Ramanan. Articulated pose estimation using flexible mixtures of parts. In CVPR’11.

[12] W. Choi, K. Shahid, and S. Savarese. Learning Context for Collective Activity Recognition. In CVPR’11.

Acknowledgement
We acknowledge the support of the ONR grant N000141110389 and ARO grant W911NF-09-1-0310.


7. Conclusion
- Flexible bound for MRFs with different structures (HPE and SM).
- Faster than state-of-the-art exact inference (CP) [2].
- The loopy model achieves higher accuracy than the tree model [3].
- BMT and OBMS are used to speed up the bound computation from $O(H^2)$ to $O(Q \log_2 H)$.

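5. Branching Strategy
Guided Variable Selection (GVS): split the selected variable; e.g., for human pose, select a specific body part.
- Re-define Upper Bound:

$$UB(\mathcal{H}_N; \theta^u, \beta) = \max_{h \in \mathcal{H}_N} f^R(h; \theta^u, \beta) = \sum_{i \in N} \lambda_i(h^*_i; \mathcal{H}_N)$$

$$\lambda_i(h_i; \mathcal{H}_N) = \theta^u_i(h_i) + \sum_{j \in N_i} \max_{\hat{h}_j \in \mathcal{H}_j} \beta_{ij}(h_i, \hat{h}_j)$$

- Re-define Lower Bound:

$$LB(\mathcal{H}_N; \theta^u, \beta) = f(h^*; \theta) = \sum_{i \in N} \hat{\lambda}_i(h^*)$$

$$\hat{\lambda}_i(h) = \theta^u_i(h_i) + \sum_{j \in N_i} \beta_{ij}(h_i, h_j)$$

- Define Node-wise-Local-Primal-Dual-Gap (NLPDG):

$$\delta_i(h^*) = \lambda_i(h^*_i; \mathcal{H}_N) - \hat{\lambda}_i(h^*)$$

- Properties of NLPDG:
  - Non-negative: $\delta_i(h^*) = \sum_{j \in N_i} \Big( \max_{\hat{h}_j \in \mathcal{H}_j} \beta_{ij}(h^*_i, \hat{h}_j) - \beta_{ij}(h^*_i, h^*_j) \Big) \ge 0$.
  - Primal-Dual-Gap (PDG) = sum of NLPDG: $\sum_{i \in N} \delta_i(h^*) = UB(\mathcal{H}_N; \theta^u, \beta) - LB(\mathcal{H}_N; \theta^u, \beta) = \text{PDG}$.
  - PDG = 0 implies the BB search is completed.
- Select the variable with the largest NLPDG.

Variable State Ordering (VSO)
- Split by domain knowledge; e.g., for human pose, by the location of the body part.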
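The node-wise gaps can be computed directly from their definitions. The sketch below does this for a toy triangle model and picks the variable with the largest gap as the one to split. It assumes an even split of each pairwise table into $\beta_{ij}$ and $\beta_{ji}$ purely for illustration; the helper names and numbers are hypothetical and not from the poster.

```python
def split_pairwise(theta_p):
    """Assumed beta: half of each pairwise table, as beta[(i, j)][h_i][h_j]."""
    beta = {}
    for (i, j), t in theta_p.items():
        rows, cols = range(len(t)), range(len(t[0]))
        beta[(i, j)] = [[0.5 * t[a][b] for b in cols] for a in rows]
        beta[(j, i)] = [[0.5 * t[a][b] for a in rows] for b in cols]
    return beta

def nlpdg(theta_u, beta, domains):
    """Return (h_star, deltas), where deltas[i] is the NLPDG of part i."""
    n = len(theta_u)
    neighbors = {i: [j for (c, j) in beta if c == i] for i in range(n)}

    def lam(i, hi):
        # lambda_i(h_i; H_N): neighbours maximize independently.
        return theta_u[i][hi] + sum(
            max(beta[(i, j)][hi][hj] for hj in domains[j])
            for j in neighbors[i])

    h_star = [max(domains[i], key=lambda hi, i=i: lam(i, hi)) for i in range(n)]

    def lam_hat(i):
        # hat-lambda_i(h*): neighbours are fixed to their own h*_j.
        return theta_u[i][h_star[i]] + sum(
            beta[(i, j)][h_star[i]][h_star[j]] for j in neighbors[i])

    deltas = [lam(i, h_star[i]) - lam_hat(i) for i in range(n)]
    return h_star, deltas

# Hypothetical triangle model with 2 states per part.
theta_u = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.0]]
theta_p = {
    (0, 1): [[0.0, -1.0], [1.0, 0.0]],
    (1, 2): [[0.0, 0.0], [-3.0, 1.0]],
    (0, 2): [[2.0, 0.0], [0.0, 0.0]],
}
domains = [[0, 1], [0, 1], [0, 1]]

h_star, deltas = nlpdg(theta_u, split_pairwise(theta_p), domains)
branch_var = max(range(len(deltas)), key=deltas.__getitem__)
print(h_star, deltas, sum(deltas), branch_var)
# [0, 1, 0] [0.5, 2.5, 1.5] 4.5 1
```

Here the gaps sum to 4.5, which equals UB minus LB for this region, and part 1 contributes the most, so it would be the variable selected for branching.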
