Optimization & Learning for Registration of Moving Dynamic Textures
Junzhou Huang1, Xiaolei Huang2, Dimitris Metaxas1
Rutgers University1, Lehigh University2
Outline
Background
Goals & Problems
Related Work
Proposed Method
Experimental Results
Discussion & Conclusion
Background
Dynamic Textures (DT): texture videos captured by a static camera, exhibiting certain stationary properties over time
Moving Dynamic Textures (MDT): dynamic textures captured by a moving camera
DT example: [Kwatra et al. SIGGRAPH’03]; MDT example: [Fitzgibbon ICCV’01]
Background
Video registration: required by many video analysis applications
Traditional assumptions: static scene, rigidity, brightness constancy [Bergen et al. ECCV’92; Black et al. ICCV’93]
Relaxing the rigidity assumption: dynamic textures [Fitzgibbon ICCV’01; Doretto et al. IJCV’03; Yuan et al. ECCV’04; Chan et al. NIPS’05; Vidal et al. CVPR’05; Lin et al. PAMI’07; Rav-Acha et al., Dynamic Vision Workshop at ICCV’05; Vidal et al. ICCV’07]
Our Goal
Registration of Moving Dynamic Textures: recover the camera motion and register the image frames of an MDT sequence
Translation to the left Translation to the right
Complex Optimization Problem
Complex optimization w.r.t. both the camera motion and the dynamic texture model: a chicken-and-egg problem
Challenges: estimating the mean image, the Linear Dynamic System (LDS) model, and the camera motion
Related Work
Fitzgibbon, ICCV’01: pioneering attempt; stochastic rigidity; non-linear optimization
Vidal et al., CVPR’05: time-varying LDS model; static assumption within small time windows; a simple and general framework, but it often underestimates the motion
Formulation Registration of MDT
I(t): the video frame; W(t): the camera motion parameters; y0: the desired average image of the video
y(t): appearance of the DT; x(t): dynamics of the DT
Generative Model
(Figure: graphical model in which the dynamics nodes x(t-1), x(t), x(t+1) drive the appearance nodes y(t-1), y(t), y(t+1), which together with the average image y0 and the camera motions W(t-1), W(t), W(t+1) generate the observed frames I(t-1), I(t), I(t+1).)
Generative image model for an MDT
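The graphical model is consistent with the standard LDS formulation of dynamic textures. A plausible sketch of the generative equations, where the transition matrix A, output matrix C, noise terms v(t), w(t), and warp operator T are illustrative symbols not named on the slides:

```latex
% Dynamics: linear transition with Gaussian process noise
x(t) = A\, x(t-1) + v(t), \qquad v(t) \sim \mathcal{N}(0, Q)
% Appearance: average image plus the dynamic component
y(t) = y_0 + C\, x(t) + w(t), \qquad w(t) \sim \mathcal{N}(0, R)
% Observation: the appearance warped by the camera motion W(t)
I(t) = \mathcal{T}\big(y(t);\, W(t)\big)
```

Under this reading, registration amounts to inverting the warp T so that the de-warped frames are explained by a single, simpler LDS around y0.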
First Observation
A good registration, i.e., one according to the accurate camera motion, should simplify the dynamic texture model while preserving all useful information
Used by Fitzgibbon, ICCV’01: minimizing the entropy of an autoregressive process
Used by Vidal et al., CVPR’05: optimizing a time-varying LDS model as a piecewise LDS model
Second Observation
A good registration, i.e., one according to the accurate camera motion, should lead to a sharp average image whose derivative filter statistics are similar to those of the input image frames
Statistics of derivative filters in images: Student-t distributions / heavy-tailed image priors [Huang et al. CVPR’99; Roth et al. CVPR’05]
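The heavy-tailed claim is easy to check numerically. A minimal sketch (not the authors' code): the excess kurtosis of horizontal derivative responses is near zero for Gaussian noise but large and positive for a piecewise-constant image, whose sparse large edges produce Student-t-like tails.

```python
import numpy as np

rng = np.random.default_rng(0)

def derivative_kurtosis(image):
    """Excess kurtosis of horizontal derivative filter responses.
    Near 0 for Gaussian statistics; large and positive for heavy tails."""
    dx = np.diff(image, axis=1).ravel()
    z = (dx - dx.mean()) / dx.std()
    return np.mean(z ** 4) - 3.0

# Piecewise-constant "texture": mostly zero derivatives plus sparse
# large jumps at block boundaries -> heavy-tailed response statistics.
blocks = rng.integers(0, 255, size=(64, 16)).astype(float)
textured = np.repeat(blocks, 8, axis=1)      # 64 x 128 image

noise = rng.normal(size=(64, 128))           # Gaussian control image

k_tex = derivative_kurtosis(textured)
k_noise = derivative_kurtosis(noise)
# k_tex is far above k_noise, reflecting the heavy-tailed image prior.
```

A sharp, well-registered average image keeps this heavy-tailed character; a blurred (mis-registered) average washes the tails out toward Gaussian.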
Prior Models
The Average Image Prior
The Motion Prior
The Dynamics Prior
Average image prior: Student-t distribution over derivative filter responses
Model parameters learned via the contrastive divergence method
(a) Before registration, (b) in the middle of registration, (c) after registration
Motion prior: Gaussian perturbation
Uncertainty in the motion is modeled as a Gaussian perturbation about the mean estimate M0, with a diagonal covariance matrix S
Motivated by [Pickup et al. NIPS’06]
Dynamics prior: GPDM / MAR model
Marginalizing over all possible mappings between appearance and dynamics
Motivated by [Wang et al. NIPS’05] and [Moon et al. CVPR’06]
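As a hedged sketch of the two priors (the density notation below is ours, not from the slides):

```latex
% Motion prior: Gaussian perturbation about the mean estimate M_0,
% with diagonal covariance S
p(M) = \mathcal{N}(M \mid M_0,\, S)
% Dynamics prior: marginalize over the appearance-to-dynamics
% mapping C, in the spirit of the GPDM
p(Y \mid X) = \int p(Y \mid X, C)\, p(C)\, dC
```

Marginalizing out C avoids committing to one appearance-dynamics mapping before the registration is known, which is what makes the joint problem tractable.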
Joint Optimization
Generative image model
Optimize the final marginal likelihood
Scaled conjugate gradients (SCG) algorithm
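The slides name scaled conjugate gradients (SCG) as the optimizer. As a stand-in sketch (Polak-Ribiere nonlinear CG with a backtracking line search, not the authors' SCG), here is the kind of routine that would minimize such a negative log-likelihood, demonstrated on a toy quadratic:

```python
import numpy as np

def cg_minimize(f, grad, x0, iters=100, tol=1e-8):
    """Polak-Ribiere nonlinear conjugate gradients with a backtracking
    (Armijo) line search; a simple stand-in for the SCG optimizer."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        if g @ d >= 0:              # safeguard: restart with steepest descent
            d = -g
        step = 1.0
        # Backtrack until the Armijo sufficient-decrease condition holds.
        while f(x + step * d) > f(x) + 1e-4 * step * (g @ d):
            step *= 0.5
        x = x + step * d
        g_new = grad(x)
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))  # PR+ rule
        d = -g_new + beta * d
        g = g_new
    return x

# Toy objective standing in for a negative log-likelihood:
# f(x) = 0.5 x^T A x - b^T x, minimized at x* = A^{-1} b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_opt = cg_minimize(lambda x: 0.5 * x @ A @ x - b @ x,
                    lambda x: A @ x - b,
                    np.zeros(2))
```

SCG differs from plain CG in that it scales steps using curvature information instead of a line search, but the descent structure above is the same.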
Procedures
Obtain the image derivative prior model
Divide the long sequence into many short image sequences
Initialize the video registration
Perform model optimization with the proposed prior models until convergence
With the estimated y0, Y, and X, obtain the camera motion iteratively by maximum likelihood estimation using SCG optimization
Obtaining Data
Three DT video sequences: DT data from [Kwatra et al. SIGGRAPH’03]
Synthesized MDT video sequences: 60 frames each; no motion in frames 1–20 and 41–60; camera motion with speed [1, 0] in frames 21–40
Grass MDT Video
The average image
(a) One frame, (b) the average image after registration, (c) the average image before registration
Grass MDT Video
The statistics of derivative filter responses
(Figure: probability distributions of gradient (derivative filter) responses over the range −60 to 60, for the input images and for the average image before and after registration.)
Evaluation / Comparison
False Estimation Fraction (FEF)
Comparison with two classical methods:
Hybrid method [Bergen et al. ECCV’92; Black et al. ICCV’93]
Vidal’s method [Vidal et al. CVPR’05]
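The FEF formula is not spelled out on the slides; the relative-error definition below is inferred from the reported numbers (110 px ground truth vs. 104.52 px estimate giving 4.98%), so treat it as an assumption:

```python
def false_estimation_fraction(estimated, ground_truth):
    """Relative error of the total estimated camera displacement.
    Definition inferred from the slides' reported numbers."""
    return abs(estimated - ground_truth) / abs(ground_truth)

ours = false_estimation_fraction(104.52, 110.0)   # -> 4.98%
vidal = false_estimation_fraction(60.0, 85.0)     # -> 29.41%
```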
Waterfall MDT Video Motion estimation
(a) Ground truth, (b) by hybrid method, (c) by Vidal’s, (d) by our method
Waterfall MDT Video The average image and its statistics
The average image and its derivative filter response distribution after registration by: (a) our method, (b) Vidal’s method, (c) hybrid method
FEF Comparison
On three synthesized MDT videos
Experiment on a real MDT video: moving flower bed video, 554 frames total
Ground truth motion: 110 pixels; estimated motion: 104.52 pixels (FEF 4.98%)
Conclusions
Proposed: powerful priors for MDT registration
Solved for: the camera motion, the average image of the video, and the dynamic texture model
What have we learned?
Correct registration simplifies the DT model while preserving useful information
Better registration leads to a sharper average image
Thank you!
Future work
More complex camera motion
Different metrics for performance evaluation
Multiple dynamic texture segmentation
Experiment on a real MDT video: moving flower bed video
Our method: 554 frames total; ground truth motion 110 pixels; estimated 104.52 pixels (FEF 4.98%)
Vidal’s method: 250 frames [Vidal et al. CVPR’05]; ground truth motion 85 pixels; estimated 60 pixels (FEF 29.41%)