Optimization & Learning for Registration of Moving Dynamic Textures
Junzhou Huang1, Xiaolei Huang2, Dimitris Metaxas1
Rutgers University1, Lehigh University2
Outline
Background
Goals & Problems
Related Work
Proposed Method
Experimental Results
Discussion & Conclusion
Background
Dynamic Textures (DT): texture videos captured by a static camera, exhibiting certain stationary properties over time
Moving Dynamic Textures (MDT): dynamic textures captured by a moving camera
DT example: [Kwatra et al. SIGGRAPH’03]; MDT example: [Fitzgibbon ICCV’01]
Background
Video registration: required by many video analysis applications
Traditional assumptions: static scene, rigidity, brightness constancy [Bergen et al. ECCV’92; Black et al. ICCV’93]
Relaxing the rigidity assumption: dynamic textures [Fitzgibbon ICCV’01; Doretto et al. IJCV’03; Yuan et al. ECCV’04; Chan et al. NIPS’05; Vidal et al. CVPR’05; Lin et al. PAMI’07; Rav-Acha et al., Dynamic Vision Workshop at ICCV’05; Vidal et al. ICCV’07]
Our Goal
Registration of Moving Dynamic Textures: recover the camera motion and register the image frames of an MDT sequence
Translation to the left Translation to the right
Complex Optimization Problem
Complex optimization w.r.t. both the camera motion and the dynamic texture model: a chicken-and-egg problem
Challenges: estimating the mean image, the Linear Dynamic System (LDS) model, and the camera motion
Related Work
Fitzgibbon, ICCV’01: pioneering attempt; stochastic rigidity; non-linear optimization
Vidal et al., CVPR’05: time-varying LDS model; static assumption within small time windows; a simple and general framework, but it often underestimates the motion
Formulation Registration of MDT
I(t): the video frame; W(t): the camera motion parameters; y0: the desired average image of the video
y(t): appearance of the DT; x(t): dynamics of the DT
Generative Model
(Figure: graphical model in which the dynamics nodes x(t-1), x(t), x(t+1) drive the appearance nodes y(t-1), y(t), y(t+1), which together with the average image y0 and the camera motions W(t-1), W(t), W(t+1) generate the observed frames I(t-1), I(t), I(t+1).)
Generative image model for an MDT
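The graphical model is consistent with the standard LDS formulation of dynamic textures. A plausible sketch of the generative equations, where the transition matrix A, output matrix C, noise terms v(t), w(t), and warp operator T are illustrative symbols not named on the slides:

```latex
% Dynamics: linear transition with Gaussian process noise
x(t) = A\, x(t-1) + v(t), \qquad v(t) \sim \mathcal{N}(0, Q)
% Appearance: average image plus the dynamic component
y(t) = y_0 + C\, x(t) + w(t), \qquad w(t) \sim \mathcal{N}(0, R)
% Observation: the appearance warped by the camera motion W(t)
I(t) = \mathcal{T}\big(y(t);\, W(t)\big)
```

Under this reading, registration amounts to inverting the warp T so that the de-warped frames are explained by a single, simpler LDS around y0.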
First Observation
A good registration, i.e., one according to the accurate camera motion, should simplify the dynamic texture model while preserving all useful information
Used by Fitzgibbon, ICCV’01: minimizing the entropy of an autoregressive process
Used by Vidal et al., CVPR’05: optimizing a time-varying LDS model as a piecewise LDS model
Second Observation
A good registration, i.e., one according to the accurate camera motion, should lead to a sharp average image whose derivative filter statistics are similar to those of the input image frames
Statistics of derivative filters in images: Student-t distributions / heavy-tailed image priors [Huang et al. CVPR’99; Roth et al. CVPR’05]
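The heavy-tailed claim is easy to check numerically. A minimal sketch (not the authors' code): the excess kurtosis of horizontal derivative responses is near zero for Gaussian noise but large and positive for a piecewise-constant image, whose sparse large edges produce Student-t-like tails.

```python
import numpy as np

rng = np.random.default_rng(0)

def derivative_kurtosis(image):
    """Excess kurtosis of horizontal derivative filter responses.
    Near 0 for Gaussian statistics; large and positive for heavy tails."""
    dx = np.diff(image, axis=1).ravel()
    z = (dx - dx.mean()) / dx.std()
    return np.mean(z ** 4) - 3.0

# Piecewise-constant "texture": mostly zero derivatives plus sparse
# large jumps at block boundaries -> heavy-tailed response statistics.
blocks = rng.integers(0, 255, size=(64, 16)).astype(float)
textured = np.repeat(blocks, 8, axis=1)      # 64 x 128 image

noise = rng.normal(size=(64, 128))           # Gaussian control image

k_tex = derivative_kurtosis(textured)
k_noise = derivative_kurtosis(noise)
# k_tex is far above k_noise, reflecting the heavy-tailed image prior.
```

A sharp, well-registered average image keeps this heavy-tailed character; a blurred (mis-registered) average washes the tails out toward Gaussian.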
Prior Models
The Average Image Prior
The Motion Prior
The Dynamics Prior
Average image prior: Student-t distribution over derivative filter responses
Model parameters learned via the contrastive divergence method
(a) Before registration, (b) in the middle of registration, (c) after registration
Motion prior: Gaussian perturbation
Uncertainty in the motion is modeled as a Gaussian perturbation about the mean estimate M0, with a diagonal covariance matrix S
Motivated by [Pickup et al. NIPS’06]
Dynamics prior: GPDM / MAR model
Marginalizing over all possible mappings between appearance and dynamics
Motivated by [Wang et al. NIPS’05] and [Moon et al. CVPR’06]
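As a hedged sketch of the two priors (the density notation below is ours, not from the slides):

```latex
% Motion prior: Gaussian perturbation about the mean estimate M_0,
% with diagonal covariance S
p(M) = \mathcal{N}(M \mid M_0,\, S)
% Dynamics prior: marginalize over the appearance-to-dynamics
% mapping C, in the spirit of the GPDM
p(Y \mid X) = \int p(Y \mid X, C)\, p(C)\, dC
```

Marginalizing out C avoids committing to one appearance-dynamics mapping before the registration is known, which is what makes the joint problem tractable.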
Joint Optimization
Generative image model
Optimize the final marginal likelihood
Scaled conjugate gradients (SCG) algorithm
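The slides name scaled conjugate gradients (SCG) as the optimizer. As a stand-in sketch (Polak-Ribiere nonlinear CG with a backtracking line search, not the authors' SCG), here is the kind of routine that would minimize such a negative log-likelihood, demonstrated on a toy quadratic:

```python
import numpy as np

def cg_minimize(f, grad, x0, iters=100, tol=1e-8):
    """Polak-Ribiere nonlinear conjugate gradients with a backtracking
    (Armijo) line search; a simple stand-in for the SCG optimizer."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        if g @ d >= 0:              # safeguard: restart with steepest descent
            d = -g
        step = 1.0
        # Backtrack until the Armijo sufficient-decrease condition holds.
        while f(x + step * d) > f(x) + 1e-4 * step * (g @ d):
            step *= 0.5
        x = x + step * d
        g_new = grad(x)
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))  # PR+ rule
        d = -g_new + beta * d
        g = g_new
    return x

# Toy objective standing in for a negative log-likelihood:
# f(x) = 0.5 x^T A x - b^T x, minimized at x* = A^{-1} b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_opt = cg_minimize(lambda x: 0.5 * x @ A @ x - b @ x,
                    lambda x: A @ x - b,
                    np.zeros(2))
```

SCG differs from plain CG in that it scales steps using curvature information instead of a line search, but the descent structure above is the same.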
Procedures
Obtain the image derivative prior model
Divide the long sequence into many short image sequences
Initialize the video registration
Perform model optimization with the proposed prior models until convergence
With the estimated y0, Y, and X, obtain the camera motion iteratively by maximum likelihood estimation using SCG optimization
Obtaining Data
Three DT video sequences: DT data from [Kwatra et al. SIGGRAPH’03]
Synthesized MDT video sequences: 60 frames each; no motion in frames 1–20 and 41–60; camera motion with speed [1, 0] in frames 21–40
Grass MDT Video
The average image
(a) One frame, (b) the average image after registration, (c) the average image before registration
Grass MDT Video
The statistics of derivative filter responses
(Figure: probability distributions of gradient (derivative filter) responses over the range −60 to 60, for the input images and for the average image before and after registration.)
Evaluation / Comparison
False Estimation Fraction (FEF)
Comparison with two classical methods:
Hybrid method [Bergen et al. ECCV’92; Black et al. ICCV’93]
Vidal’s method [Vidal et al. CVPR’05]
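The FEF formula is not spelled out on the slides; the relative-error definition below is inferred from the reported numbers (110 px ground truth vs. 104.52 px estimate giving 4.98%), so treat it as an assumption:

```python
def false_estimation_fraction(estimated, ground_truth):
    """Relative error of the total estimated camera displacement.
    Definition inferred from the slides' reported numbers."""
    return abs(estimated - ground_truth) / abs(ground_truth)

ours = false_estimation_fraction(104.52, 110.0)   # -> 4.98%
vidal = false_estimation_fraction(60.0, 85.0)     # -> 29.41%
```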
Waterfall MDT Video Motion estimation
(a) Ground truth, (b) by hybrid method, (c) by Vidal’s, (d) by our method
Waterfall MDT Video The average image and its statistics
The average image and its derivative filter response distribution after registration by: (a) our method, (b) Vidal’s method, (c) hybrid method
FEF Comparison
On three synthesized MDT videos
Experiment on a real MDT video: moving flower bed video, 554 frames total
Ground truth motion: 110 pixels; estimated motion: 104.52 pixels (FEF 4.98%)
Conclusions
Proposed: powerful priors for MDT registration
Solved for: the camera motion, the average image of the video, and the dynamic texture model
What have we learned?
Correct registration simplifies the DT model while preserving useful information
Better registration leads to a sharper average image
Thank you!
Future work
More complex camera motion
Different metrics for performance evaluation
Multiple dynamic texture segmentation
Experiment on a real MDT video: moving flower bed video
Our method: 554 frames total; ground truth motion 110 pixels; estimated 104.52 pixels (FEF 4.98%)
Vidal’s method: 250 frames [Vidal et al. CVPR’05]; ground truth motion 85 pixels; estimated 60 pixels (FEF 29.41%)