Fusion of Skeletal and Silhouette-based Features for Human Action Recognition with RGB-D Devices
ALEXANDROS A. CHAARAOUI
JOSÉ R. PADILLA-LÓPEZ
FRANCISCO FLÓREZ-REVUELTA
Sydney, December 2, 2013
3rd IEEE Workshop on Consumer Depth Cameras for Computer Vision (CDC4CV)
© A.A. Chaaraoui, J.R. Padilla-López and F. Flórez-Revuelta (CDC4CV’13)
1. Introduction
Motivation:
Use of both skeleton and silhouette in previous works
Problems with the skeleton: joint positions are imprecise or noisy
because of occlusion by body parts or objects
Example: Pick-up and Throw
1. Introduction
Motivation:
Use of both skeleton and silhouette in previous works
Problems with silhouettes: the only available
viewpoint can be unfavourable for recognition
Examples: Tennis Serve, Forward Punch, Hammer
1. Introduction
Solution:
Fusing different features that complement each other:
skeleton, RGB colour, silhouette (2D), volume (3D)…
In this work, we fuse skeleton and silhouette
2. Fusion of skeleton and silhouette
Concatenation of skeleton and silhouette features
Skeleton: 3D coordinates of the joints
Silhouette: Radial summary
[Figure: skeleton model with joints numbered 1 to 20]
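The concatenation described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the optional L2 normalisation of each part, and the length of the silhouette descriptor are all assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) of one skeleton joint


def l2_normalise(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fuse_features(joints: Sequence[Joint],
                  silhouette_feature: Sequence[float],
                  normalise: bool = True) -> List[float]:
    """Concatenate a flattened skeleton with a silhouette descriptor.

    `silhouette_feature` stands in for the radial summary; how it is
    computed is outside this sketch. Normalising each part before
    concatenation is an assumption, added so that neither part
    dominates a later distance computation.
    """
    skeleton_feature = [coord for joint in joints for coord in joint]
    silhouette = list(silhouette_feature)
    if normalise:
        skeleton_feature = l2_normalise(skeleton_feature)
        silhouette = l2_normalise(silhouette)
    return skeleton_feature + silhouette
```

With the 20 joints of the skeleton model, the skeleton part contributes 60 values, so a frame is described by 60 values plus the length of the silhouette descriptor.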
3. Classification method based on a bag of key poses
[1] A.A. Chaaraoui, P. Climent-Pérez and F. Flórez-Revuelta. Silhouette-based human action recognition using sequences of key poses. Pattern Recognition Letters, 34(15):1799-1807, 2013.
3. Classification method based on a bag of key poses
Sequence recognition
Transform a sequence into a sequence of key poses
using the bag of key poses
Sequence matching using dynamic time warping
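The two steps above can be sketched as follows. This is a hedged illustration rather than the method of [1]: the Euclidean distance between key poses, the nearest-neighbour decision in `classify`, and all function names are assumptions made for the example, and the key poses are taken as already learned.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def to_key_pose_sequence(frames: Sequence[Vector],
                         key_poses: Sequence[Vector]) -> List[int]:
    """Replace every frame feature by the index of its nearest key pose."""
    return [min(range(len(key_poses)),
                key=lambda k: euclidean(frame, key_poses[k]))
            for frame in frames]


def dtw_distance(seq_a: Sequence[int], seq_b: Sequence[int],
                 key_poses: Sequence[Vector]) -> float:
    """Dynamic time warping cost between two key pose index sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(key_poses[seq_a[i - 1]], key_poses[seq_b[j - 1]])
            # Extend the cheapest of the three allowed warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def classify(frames: Sequence[Vector], key_poses: Sequence[Vector],
             training: Sequence[Tuple[Sequence[int], str]]) -> str:
    """Label of the training sequence with the lowest DTW cost (assumed rule)."""
    query = to_key_pose_sequence(frames, key_poses)
    return min(training,
               key=lambda item: dtw_distance(query, item[0], key_poses))[1]
```

Because dynamic time warping lets one element align with several, two performances of the same action at different speeds can still match with a low cost.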
4. Experimentation
Evaluation with the MSR Action3D dataset
4. Experimentation
Cross-subject validation as in [2]:
Training: actors 1, 3, 5, 7 and 9
Testing: actors 2, 4, 6, 8 and 10
[2] W. Li, Z. Zhang, and Z. Liu. Action recognition based on a bag of 3D points. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 9-14, 2010.
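The cross-subject split listed above can be sketched as follows. The sample representation (a dictionary with an `"actor"` key) and the function name are hypothetical, chosen only for the example; the actor lists are the ones given on the slide.

```python
from typing import Dict, List, Sequence, Tuple

Sample = Dict[str, object]  # hypothetical record, e.g. {"actor": 3, "action": "a02"}

TRAIN_ACTORS = (1, 3, 5, 7, 9)
TEST_ACTORS = (2, 4, 6, 8, 10)


def cross_subject_split(samples: Sequence[Sample],
                        train_actors: Sequence[int] = TRAIN_ACTORS,
                        test_actors: Sequence[int] = TEST_ACTORS
                        ) -> Tuple[List[Sample], List[Sample]]:
    """Split samples by performing actor so no actor appears in both sets."""
    train = [s for s in samples if s["actor"] in train_actors]
    test = [s for s in samples if s["actor"] in test_actors]
    return train, test
```

Splitting by actor rather than by sample means the test set measures how well the classifier generalises to people it has not seen during training.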
4. Experimentation
Confusion matrices for AS1 (each row lists only its non-zero entries, left to right; column positions are not preserved):
Skeleton
a02 a03 a05 a06 a10 a13 a18 a20
a02 0.92 0.08
a03 1.00
a05 0.91 0.09
a06 0.09 0.73 0.18
a10 1.00
a13 1.00
a18 1.00
a20 0.14 0.07 0.29 0.50
Silhouette
a02 a03 a05 a06 a10 a13 a18 a20
a02 0.67 0.25 0.08
a03 0.58 0.42
a05 0.18 0.73 0.09
a06 0.18 0.82
a10 1.00
a13 0.07 0.93
a18 0.33 0.20 0.07 0.40
a20 0.07 0.14 0.07 0.14 0.57
Fusion
a02 a03 a05 a06 a10 a13 a18 a20
a02 1.00
a03 1.00
a05 0.09 0.91
a06 0.18 0.73 0.09
a10 1.00
a13 1.00
a18 1.00
a20 0.29 0.71
4. Experimentation
Leave-one-actor-out:
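The leave-one-actor-out protocol named above can be sketched as a fold generator: each actor in turn is held out for testing while all remaining actors are used for training. The sample representation (a dictionary with an `"actor"` key) and the function name are hypothetical, chosen only for the example.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

Sample = Dict[str, object]  # hypothetical record, e.g. {"actor": 3, "action": "a02"}


def leave_one_actor_out(samples: Sequence[Sample]
                        ) -> Iterator[Tuple[object, List[Sample], List[Sample]]]:
    """Yield (held_out_actor, train, test) once per distinct actor."""
    actors = sorted({s["actor"] for s in samples})
    for held_out in actors:
        train = [s for s in samples if s["actor"] != held_out]
        test = [s for s in samples if s["actor"] == held_out]
        yield held_out, train, test
```

A recognition rate for the whole test would then typically be obtained by averaging the per-fold results, although the averaging rule is not stated in this transcript.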
5. Conclusions and future work
Straightforward fusion of skeleton and silhouette
Improvement in the recognition rate
Include also side and top projected silhouettes
Select the weight for each feature vector
Feature subset selection
5. Conclusions and future work
We have already applied the approach in [3] for
feature selection to the fusion of skeleton and
silhouette
[3] A.A. Chaaraoui, J.R. Padilla-López, P. Climent-Pérez, and F. Flórez-Revuelta. Evolutionary joint selection to improve human action recognition with RGB-D devices. Expert Systems with Applications, 41(3):786-794, 2014.
[Table: results for the Cross-Subject and LOAO tests]
5. Conclusions and future work
Should we create a large bank of features and
select them appropriately?