Continuous human action recognition in ambient assisted living scenarios

Francisco Flórez-Revuelta, Alexandros Andre Chaaraoui. Continuous Human Action Recognition in Ambient Assisted Living Scenarios. International Workshop on Enhanced Living Environments (ELEMENT 2014), Würzburg, Germany.

description

Paper presented at the First International Workshop on Enhanced Living EnvironMENTs (ELEMENT-2014).

Abstract: Ambient assisted living technologies and services make it possible to help elderly and impaired people and increase their personal autonomy. Specifically, vision-based approaches enable the recognition of human behaviour, on which valuable services can in turn be built. However, a main constraint is that these approaches have to work online and in real time. In this work, a human action recognition method based on a bag-of-key-poses model and sequence alignment is extended to support continuous human action recognition. The detection of action zones is proposed to locate the most discriminative segments of an action. For the recognition, a method based on a sliding and growing window approach is presented. Furthermore, an evaluation scheme particularly designed for ambient assisted living scenarios is introduced. Experimental results on two publicly available datasets are provided. These show that the proposed action zones lead to a significant improvement and allow real-time processing.

Transcript of Continuous human action recognition in ambient assisted living scenarios

Page 1: Continuous human action recognition in ambient assisted living scenarios

Francisco Flórez-Revuelta

Continuous Human Action Recognition in Ambient Assisted Living Scenarios

International Workshop on Enhanced Living Environments (ELEMENT 2014) Würzburg, Germany

Alexandros Andre Chaaraoui

Page 2: Continuous human action recognition in ambient assisted living scenarios

Overview

Human action recognition in AAL

Human action recognition with a bag of key poses

Continuous human action recognition

Experimentation

Page 3: Continuous human action recognition in ambient assisted living scenarios

Architecture for AAL

[Diagram. Components shown: Camera 1, Camera 2, ..., Camera N, each with Motion Detection and Human Behaviour Analysis; Multi-view Human Behaviour Analysis; Privacy; Reasoning System; Alarm Actuators; Setup and Profiles DB (Activities, Inhabitants, Objects, ...); Log; Event; Long-term analysis; Environmental Sensor Information; Caregiver.]

Page 4: Continuous human action recognition in ambient assisted living scenarios

Human behaviour analysis

Page 5: Continuous human action recognition in ambient assisted living scenarios

Bag of key poses

Page 6: Continuous human action recognition in ambient assisted living scenarios

Use with RGB and RGB-D data

Page 7: Continuous human action recognition in ambient assisted living scenarios

Results with RGB data

[Results shown for the Weizmann, MuHAVi-8, MuHAVi-14, DHA and IXMAS datasets.]

Page 8: Continuous human action recognition in ambient assisted living scenarios

Results with RGB-D data

Page 9: Continuous human action recognition in ambient assisted living scenarios

More information

Original method: Chaaraoui, A.A.; Climent-Pérez, P.; Flórez-Revuelta, F.: Silhouette-based Human Action Recognition using Sequences of Key Poses, Pattern Recognition Letters, 34(15):1799–1807, 2013.

Multi-view action recognition: Chaaraoui, A.A.; Climent-Pérez, P.; Flórez-Revuelta, F.: An Efficient Approach for Multi-view Human Action Recognition Based on Bag-of-Key-Poses, Lecture Notes in Computer Science, 7559:29–40, 2012.

Evolutionary optimisation: Chaaraoui, A.A.; Flórez-Revuelta, F.: Optimizing human action recognition based on a cooperative coevolutionary algorithm, Engineering Applications of Artificial Intelligence, 31:116–125, 2014.

Incremental learning: Chaaraoui, A.A.; Flórez-Revuelta, F.: Adaptive Human Action Recognition With an Evolving Bag of Key Poses, IEEE Transactions on Autonomous Mental Development, 6(2):139–152, 2014.

Use of RGB-D data: Chaaraoui, A.A.; Padilla-López, J.R.; Climent-Pérez, P.; Flórez-Revuelta, F.: Evolutionary joint selection to improve human action recognition with RGB-D devices, Expert Systems with Applications, 41(3):786–794, 2014.

Page 10: Continuous human action recognition in ambient assisted living scenarios

Continuous recognition

The previous methods work with pre-segmented sequences.

However, their accurate recognition and outstanding temporal performance led us to extend the method to continuous scenarios.

A sliding and growing window is used to process the continuous stream at different overlapping locations and scales.

A null class is considered in order to discard unknown actions and avoid false positives. This class corresponds to all the behaviours that may be observed but were not modelled during learning.

Continuous human action recognition is performed by detecting and classifying action zones.

Page 11: Continuous human action recognition in ambient assisted living scenarios

Action zones

Action sequences may contain irrelevant segments which are common among actions and therefore ambiguous for classification.

Action zones = the most discriminative segments, with respect to the other action classes, in the course of an action.

Action zones are shorter than the original sequences, so the matching time is significantly reduced.

This makes it possible to consider a larger number of sliding windows at every moment.

Page 14: Continuous human action recognition in ambient assisted living scenarios

Learning of action zones

The bag of key poses is built similarly to the original method.

The discrimination value w_kp of each key pose is obtained.

For each training sequence of action class a and each temporal instant t:

1.  For each action class a, the nearest neighbour key pose kp_a(t) is obtained

2.  The raw class evidence values are computed for all the classes

3.  Normalisation is applied with respect to the highest value observed
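Step 1 above can be illustrated with a minimal sketch. It assumes, purely for illustration, that every frame and every key pose is a fixed-length feature vector and that Euclidean distance is used; the names `nearest_key_pose`, `nearest_per_class` and the toy `bag` are hypothetical and are not taken from the slides.

```python
import numpy as np

def nearest_key_pose(frame_feature, key_poses):
    """Return (index, distance) of the key pose closest to frame_feature.

    key_poses is a 2-D array with one key pose per row (assumed layout).
    """
    dists = np.linalg.norm(key_poses - frame_feature, axis=1)
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])

def nearest_per_class(frame_feature, bag):
    """For every action class, find the nearest neighbour key pose kp_a(t)."""
    return {a: nearest_key_pose(frame_feature, kps) for a, kps in bag.items()}

# Toy bag of key poses: two classes with two 2-D key poses each (made-up values).
bag = {
    "walk": np.array([[0.0, 0.0], [1.0, 0.0]]),
    "wave": np.array([[0.0, 2.0], [3.0, 3.0]]),
}
frame = np.array([0.9, 0.1])
matches = nearest_per_class(frame, bag)
```

Here `matches` maps each class to the index and distance of its closest key pose for the given frame, which is the per-class information the later evidence computation would start from.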

Page 16: Continuous human action recognition in ambient assisted living scenarios

Learning of action zones

4.  Gaussian smoothing is performed centred on the current frame, considering only the frames from temporal instants u ≤ t

5.  The final class evidence H(t) is obtained by attenuating the resulting value

Action zones are detected by defining thresholds HT_1(t), HT_2(t), …, HT_A(t). For a sequence belonging to an action class, the action zone is determined by the frames where the final class evidence exceeds the threshold of that class.
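The normalisation, causal smoothing and thresholding steps can be sketched as follows. This is only an illustration under stated assumptions: the raw class evidence is taken as a given 1-D array, normalisation uses the maximum of that array, the Gaussian width `sigma` is a free parameter, the attenuation of step 5 is left out because its formula is not available here, and a frame is assumed to belong to the action zone when its evidence exceeds a single scalar threshold. All function names are hypothetical.

```python
import numpy as np

def normalise(raw):
    """Scale the evidence by the highest value observed (assumed: max of this array)."""
    raw = np.asarray(raw, dtype=float)
    peak = raw.max()
    return raw / peak if peak > 0 else raw

def causal_gaussian_smooth(values, sigma=2.0):
    """Gaussian smoothing centred on frame t that only uses frames u <= t."""
    values = np.asarray(values, dtype=float)
    out = np.empty_like(values)
    for t in range(len(values)):
        u = np.arange(t + 1)
        weights = np.exp(-0.5 * ((u - t) / sigma) ** 2)
        out[t] = np.dot(weights, values[: t + 1]) / weights.sum()
    return out

def action_zone(evidence, threshold):
    """Indices of the frames whose evidence exceeds the class threshold (assumption)."""
    return np.flatnonzero(np.asarray(evidence) > threshold).tolist()

# Made-up raw evidence for one training sequence of one class.
raw = [0.0, 0.0, 1.0, 4.0, 4.0, 4.0, 1.0, 0.0, 0.0]
smoothed = causal_gaussian_smooth(normalise(raw), sigma=1.0)
zone = action_zone(smoothed, threshold=0.5)
```

Because the smoothing is causal, the value at frame t never depends on later frames, which is what would make the same computation usable on a live stream.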

Page 17: Continuous human action recognition in ambient assisted living scenarios

Continuous recognition

The action zones of every learning sequence then constitute the knowledge base.

A sliding and growing window is used to process the continuous stream at different overlapping locations and scales.

These segments of key poses are compared with the learned action zones using DTW.

In some cases, even the nearest key pose is very different from the input frame.

Therefore, a set of threshold parameters DT_1, DT_2, …, DT_A indicates the highest allowed distance to trigger the recognition.

If the match is not good enough, the frame is labelled as null class.
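A minimal sketch of this matching step is given below. It assumes that windows and action zones are sequences of feature vectors, that the frame-to-frame cost is the Euclidean distance, and that the per-class thresholds are applied to the DTW distance of the best match; the slides do not specify these details, so `dtw_distance`, `classify_window` and the toy data are hypothetical.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping distance between two sequences of vectors."""
    a = np.asarray(seq_a, dtype=float)
    b = np.asarray(seq_b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def classify_window(window, knowledge_base, max_distance):
    """Label a window with the class of the closest action zone, or "null".

    knowledge_base: list of (label, action_zone) pairs.
    max_distance: dict label -> highest allowed distance (the DT thresholds).
    """
    best_label, best_dist = "null", np.inf
    for label, zone in knowledge_base:
        dist = dtw_distance(window, zone)
        if dist < best_dist:
            best_label, best_dist = label, dist
    if best_label != "null" and best_dist > max_distance[best_label]:
        return "null", best_dist  # match not good enough: null class
    return best_label, best_dist

# Toy knowledge base with one action zone per class (made-up values).
kb = [
    ("walk", [[0, 0], [1, 0], [2, 0]]),
    ("wave", [[0, 0], [0, 1], [0, 2]]),
]
thresholds = {"walk": 1.0, "wave": 1.0}
known = classify_window([[0, 0], [1, 0], [1, 0], [2, 0]], kb, thresholds)
unknown = classify_window([[10, 10], [11, 10]], kb, thresholds)
```

In this toy run the first window warps onto the "walk" zone at zero cost, while the second is far from every zone and falls back to the null class. Since DTW cost grows with the product of the two sequence lengths, shorter action zones directly reduce the work per window.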

Page 18: Continuous human action recognition in ambient assisted living scenarios

Experimentation

Two sets of parameters have to be established: HT_1(t), HT_2(t), …, HT_A(t) and DT_1, DT_2, …, DT_A.

Both sets are tuned with an evolutionary algorithm that searches for the best performing combination.

Comparison with the ground truth is performed at segment level.

An action must be recognised with a delay lower than τ frames.
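One possible reading of the segment-level comparison with a maximum delay τ is sketched below. The scoring rule is an assumption made for illustration, not the evaluation scheme of the paper: a ground-truth segment counts as detected if a recognition with the same label is emitted between its start and fewer than τ frames after its end, and every unmatched recognition counts as a false positive. The function `evaluate` and the sample data are hypothetical.

```python
def evaluate(ground_truth, detections, tau):
    """Count (true positives, false positives, false negatives) at segment level.

    ground_truth: list of (label, start_frame, end_frame).
    detections:   list of (label, frame) recognitions emitted by the system.
    tau:          maximum accepted delay in frames after the segment end.
    """
    matched = set()
    true_pos = 0
    for label, start, end in ground_truth:
        for k, (d_label, frame) in enumerate(detections):
            if k in matched:
                continue  # each recognition may explain only one segment
            if d_label == label and start <= frame < end + tau:
                matched.add(k)
                true_pos += 1
                break
    false_pos = len(detections) - len(matched)
    false_neg = len(ground_truth) - true_pos
    return true_pos, false_pos, false_neg

# Made-up example with tau = 60 frames.
gt = [("walk", 0, 100), ("wave", 150, 220)]
det = [("walk", 130), ("wave", 300), ("sit", 50)]
counts = evaluate(gt, det, tau=60)  # → (1, 2, 1)
```

In the example, "walk" is recognised 30 frames after the segment ends and is accepted, whereas "wave" arrives 80 frames late and is counted both as a missed segment and as a false positive.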

Page 19: Continuous human action recognition in ambient assisted living scenarios

Experimentation

Validation with the multi-view IXMAS and the single-view Weizmann datasets.

The window grows in 5-frame steps and, when length_max is reached, it slides 10 frames.

A delayed recognition is accepted for τ = 60 frames ≅ 2 seconds.

NOTE: Approach 1: Use of action zones. Approach 2: Use of the whole sequences.
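The sliding and growing window with these step sizes can be sketched as a generator of frame ranges. The sketch assumes an initial window length `length_min` (not stated in the slides) and that the window shrinks back to that length after each slide; the function name and all parameter names are hypothetical.

```python
def sliding_growing_windows(n_frames, length_min, length_max,
                            grow_step=5, slide_step=10):
    """Yield (start, end) frame ranges, end exclusive.

    At each location the window grows by grow_step frames until length_max
    is reached; it then slides slide_step frames and starts again at
    length_min (assumed behaviour).
    """
    start = 0
    while start + length_min <= n_frames:
        length = length_min
        while length <= length_max and start + length <= n_frames:
            yield start, start + length
            length += grow_step
        start += slide_step

windows = list(sliding_growing_windows(n_frames=30, length_min=10, length_max=20))
# → [(0, 10), (0, 15), (0, 20), (10, 20), (10, 25), (10, 30), (20, 30)]
```

Each yielded range would be converted to its sequence of key poses and matched against the learned action zones, so the number of ranges per second bounds the matching work needed for real-time operation.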

Page 20: Continuous human action recognition in ambient assisted living scenarios

Francisco (Paco) Flórez-Revuelta

@fflorezrevuelta

www.dtic.ua.es/~florez

franciscoflorezrevuelta