A General Framework for Tracking Multiple People from a Moving Camera
1
A General Framework for Tracking Multiple People from a Moving Camera
Wongun Choi, Caroline Pantofaru, Silvio Savarese
IEEE Transactions on Pattern Analysis and Machine Intelligence, July 2013
2
Overview
• Motivation
• Related Work
• Introduction
• Proposed Method
• Experiment Result
• Conclusion
3
Motivation
1. The final goal is to track multiple people from a moving camera, in both outdoor and indoor video scenes.
2. There are several challenges to solve:
   1) People appear in a variety of poses
   2) Multiple people in the same scene have complex motion patterns
   3) The scene and the illumination change
4
Related work
1. Tracking by online learning:
   Learning an appearance model [10], [5], [34], [7], [26]
   Color histogram and mean shift [10]
2. Tracking with a moving camera:
   Probabilistic framework with multiple detectors [42], [43]
   Stereo and graphical model [12], [13]

[5] S. Avidan. Ensemble tracking. PAMI, 2007.
[7] C. Bibby and I. Reid. Robust real-time visual tracking using pixelwise posteriors. In ECCV, 2008.
[10] D. Comaniciu and P. Meer. Mean shift: A robust approach toward feature space analysis. PAMI, 2002.
[12] A. Ess, B. Leibe, K. Schindler, and L. van Gool. A mobile vision system for robust multi-person tracking. In CVPR, 2008.
[13] A. Ess, B. Leibe, K. Schindler, and L. van Gool. Robust multi person tracking from a mobile platform. PAMI, 2009.
[26] S. Kwak, W. Nam, B. Han, and J. Han. Learning occlusion with likelihoods for visual tracking. In ICCV, 2011.
[34] D. Ramanan, D. Forsyth, and A. Zisserman. Tracking people by learning their appearance. PAMI, Jan. 2007.
[42] C. Wojek, S. Walk, S. Roth, and B. Schiele. Monocular 3d scene understanding with explicit occlusion reasoning. In CVPR, 2011.
[43] C. Wojek, S. Walk, and B. Schiele. Multi-cue onboard pedestrian detection. In CVPR, 2009.
5
Introduction (1)
The proposed method addresses each of these issues:
1) People appear in a variety of poses: fuse multiple person detection methods and other observations
2) Complex motion patterns of multiple people in the same scene: build a motion model that captures the interaction between targets
3) Changing scene and illumination: propose a novel 3D model that explains the process of video generation
6
Introduction (2)
Observation cues:
7
Introduction (3)
Build 3D model:
8
Introduction (4)
Particle filter:
1. Definition: a posterior density estimation algorithm that estimates the posterior density of the state space by directly implementing the Bayesian recursion equations
2. Sampling generates the state distribution of the posterior, and resampling reconstructs the new distribution
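The sample, weight, and resample cycle can be illustrated with a minimal sketch. It assumes a 1D state, a random-walk motion model, and a Gaussian observation likelihood; these choices, and the names `particle_filter_step` and `track`, are hypothetical and are not the models used in the paper.

```python
import math
import random


def particle_filter_step(particles, observation, rng, motion_std=1.0, obs_std=1.0):
    """One predict / weight / resample cycle for a 1D state (illustrative only)."""
    # Predict: propagate each particle through an assumed random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: assumed Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in predicted]
    if sum(weights) == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw an equally weighted particle set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def track(observations, n_particles=500, seed=0):
    """Return the posterior-mean estimate after each observation."""
    rng = random.Random(seed)
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        particles = particle_filter_step(particles, z, rng)
        estimates.append(sum(particles) / len(particles))
    return estimates
```

Calling `track([5.0] * 20)` returns one estimate per observation; with a constant observation the estimates settle near the observed value.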
9
Introduction (5)
Reversible-Jump Markov Chain Monte Carlo (RJMCMC): a class of algorithms for sampling from probability distributions by constructing a Markov chain that allows the dimensionality of the state to change
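A toy example may help show what "changing the dimensionality of the state" means. The sketch below samples a variable-length list of 1D target positions with a birth move (add a target) and a death move (remove the most recently added target). The target density, the two moves, and the name `rjmcmc_target_count` are assumptions chosen for illustration; they are not the proposal moves of the tracker described in these slides.

```python
import random


def rjmcmc_target_count(intensity, length, n_iter, seed=0):
    """Sample variable-length lists of positions on [0, length].

    The unnormalised target density of an ordered list x of k positions is
    prod(intensity(x_i)) / k!, so k follows a Poisson distribution whose mean
    is the integral of the intensity. Returns k after every iteration.
    """
    rng = random.Random(seed)
    state = []
    counts = []
    for _ in range(n_iter):
        k = len(state)
        if rng.random() < 0.5:
            # Birth: propose a new position uniformly and append it (k -> k + 1).
            u = rng.uniform(0.0, length)
            accept = intensity(u) * length / (k + 1)
            if rng.random() < accept:
                state.append(u)
        elif k > 0:
            # Death: propose removing the last position (k -> k - 1).
            # This is the exact reverse of the birth move above.
            accept = k / (intensity(state[-1]) * length)
            if rng.random() < accept:
                state.pop()
        counts.append(len(state))
    return counts
```

With a constant intensity of 3 on a unit interval, the long-run average of the returned counts should be close to 3, which gives a quick sanity check on the acceptance ratios.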
10
Proposed Method
System overview:
1. Use observation cues to generate detection hypotheses and an observation model
2. Build a motion model that accounts for both people's unexpected motions and the interactions between people
3. Run the sampling procedure of the RJ-MCMC tracker, which includes evaluation (resampling)
11
Proposed Method
Model representation:
12
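Proposed Method
The camera, the targets, and the geometric features are treated as random variables, and their relationship is modeled by a joint posterior probability. The tracking problem can then be formulated as finding the maximum a posteriori (MAP) solution. The posterior is built from three terms:
(a) Observation likelihood
(b) Motion model (transition model)
(c) Posterior at time t-1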
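The three terms fit the standard Bayesian filtering recursion. The equation on the slide did not survive in this transcript, so the symbols below are assumptions: \Omega_t stands for the joint state at time t and \chi_t for the observations at time t.

```latex
P(\Omega_t \mid \chi_{0:t}) \;\propto\;
\underbrace{P(\chi_t \mid \Omega_t)}_{\text{(a) observation likelihood}}
\int
\underbrace{P(\Omega_t \mid \Omega_{t-1})}_{\text{(b) motion model}}\,
\underbrace{P(\Omega_{t-1} \mid \chi_{0:t-1})}_{\text{(c) posterior at } t-1}
\, d\Omega_{t-1}
```

The MAP solution is the state \Omega_t that maximizes the left-hand side.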
13
Proposed Method
(a) Observation likelihood:
Camera projection function:
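As a generic illustration of what a camera projection function does, the sketch below maps a 3D point to pixel coordinates with a basic pinhole model. The pinhole model, the coordinate convention, and the name `project_point` are assumptions; the specific projection function used in the paper is not reproduced here.

```python
def project_point(point_xyz, focal_length, principal_point):
    """Project a 3D point in camera coordinates to pixel coordinates.

    Assumed convention: x points right, y points down, z points forward,
    and focal_length is expressed in pixels.
    """
    x, y, z = point_xyz
    if z <= 0.0:
        raise ValueError("point must be in front of the camera (z > 0)")
    u0, v0 = principal_point
    # Perspective division followed by a shift to the image center.
    return (focal_length * x / z + u0, focal_length * y / z + v0)
```

For example, `project_point((1.0, 0.5, 2.0), 500.0, (320.0, 240.0))` maps a point two units in front of the camera into a 640 × 480 image.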
14
Proposed Method
Target Observation Likelihood:
j: detector index
w_j: weight for detector j
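One common way to fuse several detectors with per-detector weights is a log-linear combination: each detector j contributes a log-likelihood score, scaled by its weight w_j. The sketch below assumes that form; the exact formula on the slide is missing from this transcript, and the name `target_observation_likelihood` is hypothetical.

```python
import math


def target_observation_likelihood(log_scores, weights):
    """Combine per-detector log-likelihood scores into one unnormalised likelihood.

    Assumed form: exp(sum_j w_j * score_j), where score_j is the log-likelihood
    reported by detector j for a target hypothesis.
    """
    if len(log_scores) != len(weights):
        raise ValueError("need exactly one weight per detector score")
    return math.exp(sum(w * s for w, s in zip(weights, log_scores)))
```

Under this form, a weight of zero removes a detector from the fusion, which is one way a tracker can adapt to the sensors that are available.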
15
Proposed Method
Target Observation Likelihood:
1) Pedestrian detector
2) Upper body detector
3) Target-specific detector based on an appearance model
4) Detector based on upper-body shape from depth
5) Face detector
6) Skin detector
7) Motion detector
16
Proposed Method
Pedestrian and upper body detectors using HOG:
17
Proposed Method
Face detector using the OpenCV Viola-Jones face detector:
18
Proposed Method
Skin color detector using a threshold in HSV color space:
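A threshold in HSV space can be sketched with the standard-library `colorsys` module. The threshold values below (`h_max`, `s_range`, `v_min`) are placeholder assumptions for illustration; the slides do not state the values that were actually used.

```python
import colorsys


def is_skin(r, g, b, h_max=50.0 / 360.0, s_range=(0.15, 0.75), v_min=0.35):
    """Classify one 8-bit RGB pixel as skin by thresholding in HSV space.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h <= h_max and s_range[0] <= s <= s_range[1] and v >= v_min


def skin_mask(image):
    """Apply is_skin to every pixel of an image given as rows of (r, g, b) tuples."""
    return [[is_skin(*pixel) for pixel in row] for row in image]
```

`skin_mask` returns a boolean mask with the same layout as the input, which a tracker could then summarize inside each candidate bounding box.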
19
Proposed Method
Depth shape detector using the world coordinate system:
20
Proposed Method
Motion detector: project motion points into the image plane and apply a threshold:
21
Proposed Method
Geometric feature likelihood from an interest point detector:
is the uniform distribution
22
Proposed Method
Motion prior:
23
Proposed Method
Camera motion prior:
24
Proposed Method
Target motion prior:
25
Proposed Method
Existence prior:
26
Proposed Method
Motion prior:
• Independent
• Interacting
27
Proposed Method
Independent motion prior:
update
28
Proposed Method
Interacting motion prior:
Mode variable
29
Proposed Method
Repulsion:
Group motion:
Repulsion force
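The two interaction modes can be pictured as pairwise potentials selected by a mode variable: repulsion penalizes two targets that occupy nearly the same position, and group motion rewards two targets that move with similar velocity. The functional forms, constants, and function names below are illustrative assumptions only; the formulas on the slides are not preserved in this transcript.

```python
import math


def repulsion_potential(pos_i, pos_j, strength=2.0):
    """Assumed form: near 0 when two targets almost overlap, near 1 when far apart."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(pos_i, pos_j))
    if dist_sq == 0.0:
        return 0.0
    return math.exp(-1.0 / (strength * dist_sq))


def group_potential(vel_i, vel_j, strength=3.0):
    """Assumed form: 1 when velocities match, decaying as they diverge."""
    diff_sq = sum((a - b) ** 2 for a, b in zip(vel_i, vel_j))
    return math.exp(-strength * diff_sq)


def interaction_potential(mode, pos_i, vel_i, pos_j, vel_j):
    """Select the pairwise potential according to a hypothetical mode variable."""
    if mode == "repulsion":
        return repulsion_potential(pos_i, pos_j)
    if mode == "group":
        return group_potential(vel_i, vel_j)
    raise ValueError("mode must be 'repulsion' or 'group'")
```

Multiplying such potentials over all interacting pairs gives a prior that favors configurations where people avoid collisions or walk together.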
30
Proposed Method
Tracking by Reversible Jump Markov Chain Monte Carlo particle filtering:
Sampling:
Convert posterior problem:
31
Experimental result
Using the ETH dataset [12]
Video frame rate: ~14 Hz
Resolution: 640 × 480 pixels
32
Experimental result
Single-frame detection accuracy is measured by the overlap ratio between the ground-truth bounding box and the tracked bounding box.
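A widely used definition of the overlap ratio between two boxes is intersection over union (IoU). The sketch below assumes that definition and an `(x_min, y_min, x_max, y_max)` box format; the slides do not spell out the exact formula or the acceptance threshold.

```python
def overlap_ratio(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the intersection rectangle, clamped at zero.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    union = area_a + area_b - inter
    return inter / union if union > 0.0 else 0.0
```

A tracked box is then typically counted as a correct detection when its overlap ratio with the ground-truth box exceeds a chosen threshold.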
33
Experimental result
34
Conclusion
• Combines a probabilistic model with joint variables
  – Relationship between the camera, the targets, and the geometric features
• Combines multiple cues
  – Adaptable to different sensor configurations and different environments
• Allows people to interact
• Automatically detects people