Multi-View Super Vector for Action Recognition
![Page 1: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/1.jpg)
Multi-View Super Vector for Action Recognition
Zhuowei Cai, Yu Qiao, Xiaojiang Peng, Limin Wang
Shenzhen Institutes of Advanced Technology, CAS; Chinese University of Hong Kong
![Page 2: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/2.jpg)
Content
• Motivation
• M-PCCA model & MVSV representation
• Experimental Results
![Page 4: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/4.jpg)
Actions in video clips can be captured by ...
• HOG (static feature)
• HOF, MBHx/MBHy (dynamic features)
*Video from the ChaLearn Looking at People challenge
![Page 5: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/5.jpg)
![Page 6: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/6.jpg)
Feature Fusion - Concatenation
HOG HOF
Concatenation before GMM : HOG + HOF
Defect : features presumed to be strongly correlated
![Page 7: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/7.jpg)
Feature Fusion - Kernel Average
HOG HOF
Concatenation after GMM : CodeHOG + CodeHOF
Defect : features presumed to be mutually independent
CodeHOG CodeHOF
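The slide titles this scheme "Kernel Average" but describes it as concatenation after GMM encoding. For linear kernels the two views coincide, which the following minimal sketch checks numerically. The code matrices here are random stand-ins (hypothetical, not real Fisher vectors or VLAD codes), and the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-video codes for two descriptors (stand-ins for CodeHOG / CodeHOF).
n_videos = 6
code_hog = rng.standard_normal((n_videos, 8))
code_hof = rng.standard_normal((n_videos, 5))

# Kernel-level fusion: average the two linear kernels.
k_hog = code_hog @ code_hog.T
k_hof = code_hof @ code_hof.T
k_avg = 0.5 * (k_hog + k_hof)

# Same result: concatenate the codes, scale by 1/sqrt(2), use one linear kernel.
code_cat = np.hstack([code_hog, code_hof]) / np.sqrt(2.0)
k_cat = code_cat @ code_cat.T

print(np.allclose(k_avg, k_cat))
```

Because the fused kernel is a plain sum of per-descriptor kernels, no cross terms between CodeHOG and CodeHOF appear, which is the independence assumption the slide lists as the defect.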
![Page 8: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/8.jpg)
Decomposition
HOG HOF
HOG-Specific | HOG/HOF-Shared | HOF-Specific
![Page 9: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/9.jpg)
HOG-Specific | HOG/HOF-Shared | HOF-Specific
Merit : features are decomposed into relatively independent components
![Page 10: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/10.jpg)
HOG-Specific | HOG/HOF-Shared | HOF-Specific
M-PCCA: Mixture of Probabilistic Canonical Correlation Analyzers
![Page 11: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/11.jpg)
Content
• Motivation
• M-PCCA model & MVSV representation
• Experimental Results
![Page 12: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/12.jpg)
Mixture of Probabilistic Models
Latent Variable Models
$V = W Z + Z_v$, with $Z \sim \mathcal{N}(0, I)$, $Z_v \sim \mathcal{N}(\mu, \Phi)$
• Probabilistic Principal Component Analysis: $\Phi = \sigma^2 I$. *M. Tipping, C. Bishop
• Probabilistic Factor Analysis: $\Phi$ is diagonal.
• Probabilistic Canonical Correlation Analyzer. *F. Bach, M. Jordan; A. Klami, S. Kaski
$X = W_x Z + Z_x$
$Y = W_y Z + Z_y$
$Z \sim \mathcal{N}(0, I)$, $Z_x \sim \mathcal{N}(\mu_x, \Phi_x)$, $Z_y \sim \mathcal{N}(\mu_y, \Phi_y)$
Mixture Version: M-PPCA *M. Tipping, C. Bishop; M-FA *Z. Ghahramani, G. Hinton
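To make the PCCA generative model concrete, the sketch below samples from X = Wx Z + Zx and Y = Wy Z + Zy and checks that the two views are coupled only through the shared latent Z, so that Cov(X, Y) is approximately Wx Wy^T. The dimensions, seed, and diagonal private covariances are illustrative assumptions, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
dx, dy, dz, n = 4, 3, 2, 200_000  # assumed sizes, for illustration only

W_x = rng.standard_normal((dx, dz))
W_y = rng.standard_normal((dy, dz))
mu_x, mu_y = rng.standard_normal(dx), rng.standard_normal(dy)
# Assumption: diagonal private covariances Phi_x, Phi_y (kept simple on purpose).
phi_x = rng.uniform(0.5, 1.0, dx)
phi_y = rng.uniform(0.5, 1.0, dy)

# Shared latent Z ~ N(0, I); private parts Zx ~ N(mu_x, Phi_x), Zy ~ N(mu_y, Phi_y).
Z = rng.standard_normal((n, dz))
Z_x = mu_x + rng.standard_normal((n, dx)) * np.sqrt(phi_x)
Z_y = mu_y + rng.standard_normal((n, dy)) * np.sqrt(phi_y)
X = Z @ W_x.T + Z_x
Y = Z @ W_y.T + Z_y

# Empirical covariances versus the model's predictions.
Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
cross_cov = Xc.T @ Yc / (n - 1)
cov_x = Xc.T @ Xc / (n - 1)
err_cross = np.abs(cross_cov - W_x @ W_y.T).max()
err_x = np.abs(cov_x - (W_x @ W_x.T + np.diag(phi_x))).max()
print(err_cross, err_x)
```

Both errors shrink as the number of samples grows. The private terms only affect the within-view covariance, which is why the model separates shared from view-specific information.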
![Page 13: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/13.jpg)
M-PCCA
EM Learning Algorithm
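A minimal sketch of the E-step responsibilities for a mixture of linear-Gaussian components, assuming each component k has the standard marginal N(mu_k, W_k W_k^T + Phi_k) over the stacked descriptor v = [x; y], with W_k = [Wx; Wy] and Phi_k block-diagonal in Phi_x, Phi_y. The function names are hypothetical and the M-step is omitted.

```python
import numpy as np

def log_gaussian(V, mean, cov):
    """Row-wise log N(v | mean, cov) for V of shape (n, d)."""
    d = V.shape[1]
    diff = V - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def responsibilities(V, weights, means, Ws, Phis):
    """Posterior gamma[i, k] of component k for descriptor v_i."""
    log_p = np.stack(
        [np.log(w) + log_gaussian(V, m, W @ W.T + Phi)
         for w, m, W, Phi in zip(weights, means, Ws, Phis)],
        axis=1,
    )
    log_p -= log_p.max(axis=1, keepdims=True)  # stabilise before exponentiating
    gamma = np.exp(log_p)
    return gamma / gamma.sum(axis=1, keepdims=True)

# Toy usage: two well-separated components, three descriptors near each mean.
rng = np.random.default_rng(0)
d, q = 5, 2
Ws = [rng.standard_normal((d, q)) for _ in range(2)]
Phis = [0.5 * np.eye(d) for _ in range(2)]
means = [np.full(d, -10.0), np.full(d, 10.0)]
V = np.vstack([means[0] + 0.1 * rng.standard_normal((3, d)),
               means[1] + 0.1 * rng.standard_normal((3, d))])
gamma = responsibilities(V, [0.5, 0.5], means, Ws, Phis)
print(gamma.round(3))
```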
![Page 14: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/14.jpg)
M-PCCA
![Page 15: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/15.jpg)
M-PCCA
X = Wx Z + Zx
Y = Wy Z + Zy
![Page 16: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/16.jpg)
M-PCCA
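[Figure: latent variables Z1, Z2, Z3 of the mixture components]
Shared Part:
$Z = \{z_k\}_{k=1}^{K} = \Big\{ \sum_i \gamma_{i,k}\, [(W_x^k)^T, (W_y^k)^T]\, \Sigma_k^{-1} (v_i - \mu_k) \Big\}_{k=1}^{K}$
where $v_i = [x_i; y_i]$, $\gamma_{i,k}$ is the posterior weight of component $k$ for $v_i$, and $\mu_k$, $\Sigma_k$ are the mean and covariance of $v$ under component $k$.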
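The shared part can be sketched as follows: for each component, accumulate the responsibility-weighted posterior means of the latent variable, then concatenate over components. This assumes the standard linear-Gaussian posterior mean E[z | v, k] = W_k^T Sigma_k^{-1} (v - mu_k) with Sigma_k = W_k W_k^T + Phi_k. The function name and the toy inputs are hypothetical.

```python
import numpy as np

def shared_part(V, gamma, means, Ws, Phis):
    """Concatenate z_k = sum_i gamma[i, k] * E[z | v_i, k] over components."""
    parts = []
    for k, (m, W, Phi) in enumerate(zip(means, Ws, Phis)):
        sigma = W @ W.T + Phi
        # Row i holds E[z | v_i, k] = W^T Sigma^{-1} (v_i - mu_k).
        post_mean = np.linalg.solve(sigma, (V - m).T).T @ W
        parts.append(gamma[:, k] @ post_mean)
    return np.concatenate(parts)

# Toy usage with random parameters and random (normalised) responsibilities.
rng = np.random.default_rng(0)
n, d, q, K = 10, 5, 2, 3
V = rng.standard_normal((n, d))
Ws = [rng.standard_normal((d, q)) for _ in range(K)]
Phis = [np.diag(rng.uniform(0.5, 1.0, d)) for _ in range(K)]
means = [rng.standard_normal(d) for _ in range(K)]
gamma = rng.uniform(size=(n, K))
gamma /= gamma.sum(axis=1, keepdims=True)

Z = shared_part(V, gamma, means, Ws, Phis)
print(Z.shape)
```

The resulting vector has K * q entries, one latent block per mixture component.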
![Page 17: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/17.jpg)
M-PCCA
![Page 18: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/18.jpg)
M-PCCA
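Private Part:
$g_x = \Big\{ \frac{\partial E(L)}{\partial \mu_x^k},\ \frac{\partial E(L)}{\partial \Phi_x^k} \Big\}_{k=1}^{K}$
$g_y = \Big\{ \frac{\partial E(L)}{\partial \mu_y^k},\ \frac{\partial E(L)}{\partial \Phi_y^k} \Big\}_{k=1}^{K}$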
![Page 19: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/19.jpg)
M-PCCA
Shared Part + Private Part (gx, gy) = Multi-View Super Vector
![Page 20: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/20.jpg)
Content
• Motivation
• M-PCCA model & MVSV representation
• Experimental Results
![Page 21: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/21.jpg)
[Figure: MVSV with SVM classifier on HMDB51 with various configurations. Panels: performance w.r.t. number of components; performance w.r.t. latent dimension.]
![Page 22: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/22.jpg)
Results (#components = 256, dimension = 45)

| HMDB51 | FV d-level | FV k-level | VLAD d-level | VLAD k-level | MVSV k-level |
|---|---|---|---|---|---|
| HOG+MBH | 50.9% | 50.4% | 47.0% | 48.5% | 52.1% |
| HOG+HOF | 47.0% | 48.3% | 44.4% | 47.7% | 48.9% |
| MBHx+MBHy | 49.2% | 49.1% | 45.2% | 47.0% | 51.1% |
| Combine | 52.4% | 53.2% | 51.5% | 52.6% | 55.9% |

| UCF101 | FV d-level | FV k-level | VLAD d-level | VLAD k-level | MVSV k-level |
|---|---|---|---|---|---|
| HOG+HOF | 76.1% | 77.7% | 75.7% | 77.5% | 78.9% |
| MBHx+MBHy | 78.9% | 78.7% | 75.6% | 76.3% | 80.9% |
| Combine | 81.1% | 81.9% | 80.6% | 81.0% | 83.5% |
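As a quick reading aid, the snippet below computes the margin of MVSV over the strongest FV/VLAD baseline on the "Combine" rows. The numbers are copied from the results above; the dictionary layout is only an illustrative choice.

```python
# Accuracy (%) on the "Combine" rows, copied from the results slide.
combine = {
    "HMDB51": {"FV d-level": 52.4, "FV k-level": 53.2,
               "VLAD d-level": 51.5, "VLAD k-level": 52.6, "MVSV": 55.9},
    "UCF101": {"FV d-level": 81.1, "FV k-level": 81.9,
               "VLAD d-level": 80.6, "VLAD k-level": 81.0, "MVSV": 83.5},
}

gains = {}
for dataset, row in combine.items():
    baselines = {name: acc for name, acc in row.items() if name != "MVSV"}
    best_name = max(baselines, key=baselines.get)
    gains[dataset] = round(row["MVSV"] - baselines[best_name], 1)
    print(f"{dataset}: MVSV {row['MVSV']}% vs best baseline {best_name} "
          f"{baselines[best_name]}% (gain {gains[dataset]} points)")
```

In both datasets the strongest baseline on the combined descriptors is FV with k-level fusion.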
![Page 23: Multi-View Super Vector for Action Recognition](https://reader036.fdocuments.in/reader036/viewer/2022062518/568144d8550346895db1a3d2/html5/thumbnails/23.jpg)
Fusion schemes compared:
• Descriptor-level: X, Y → Fusion → GMM → SVM → Score
• Kernel-level (linear): X → GMM, Y → GMM; then Fusion → SVM → Score
• MVSV: X, Y → M-PCCA → Fusion → SVM → Score
• Score-level: X → GMM → SVM → Score, Y → GMM → SVM → Score; then Fusion → Score
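Of the schemes on this slide, score-level fusion is the simplest to sketch: each descriptor gets its own classifier and only the per-class scores are combined. The slide does not say how the scores are fused, so the weighted average below is an assumption, and the score values are made up for illustration.

```python
import numpy as np

def score_level_fusion(scores_x, scores_y, weight_x=0.5):
    """Fuse per-class scores of two classifiers by a weighted average (assumed rule)."""
    fused = weight_x * scores_x + (1.0 - weight_x) * scores_y
    return fused, fused.argmax(axis=1)

# Hypothetical SVM decision scores: 2 videos, 3 action classes, one array per descriptor.
scores_x = np.array([[0.2, 1.1, -0.5],
                     [0.9, 0.1, 0.0]])
scores_y = np.array([[0.4, -0.3, 0.1],
                     [-0.2, 0.3, 1.5]])

fused, labels = score_level_fusion(scores_x, scores_y)
print(fused)
print(labels)
```

Unlike descriptor-level or kernel-level fusion, this scheme never lets the classifier see both descriptors at once, so any correlation between X and Y is used only through the final scores.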