
Retrieving Actions in Group Contexts

Tian Lan, Yang Wang, Greg Mori, Stephen Robinovitch
Simon Fraser University

Sept. 11, 2010

Outline

• Contextual Representation of Actions

• Action Retrieval as Ranking

• Results and Future Work

Nursing Home

• Fall analysis in nursing home surveillance videos
– A system is expected to automatically rank videos by their relevance to the fall action

Action-Action Context

Context: what are other people doing?

Actions in Group Context

• Motivation
– Human actions are rarely performed in isolation; the actions of individuals in a group can serve as context for each other.

• Goal
– Explore the benefit of contextual information for action retrieval in challenging real-world applications

Action Context Descriptor

[Figure: action context descriptor, combining the action of the focal person with the actions in its context; labels: τ, z]

Action Context Descriptor

[Figure: feature descriptor (e.g. HOG by Dalal & Triggs), multi-class SVM, one score per action class, and a max over action class scores]
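The idea on these two slides can be sketched in a few lines. This is only a minimal illustration, assuming that the descriptor concatenates the focal person's per-class SVM scores with an element-wise max over the scores of the other people in a single context region; the slides do not spell out the pooling regions, and the function name and score values below are hypothetical.

```python
from typing import List, Sequence


def action_context_descriptor(
    focal_scores: Sequence[float],
    neighbor_scores: List[Sequence[float]],
) -> List[float]:
    """Concatenate the focal person's per-class scores with the
    element-wise max over the context people's per-class scores."""
    num_classes = len(focal_scores)
    if neighbor_scores:
        context = [max(s[k] for s in neighbor_scores) for k in range(num_classes)]
    else:
        # Assumption: with nobody else nearby, the context part is all zeros.
        context = [0.0] * num_classes
    return list(focal_scores) + context


# Hypothetical SVM scores over 3 action classes (illustrative values only).
focal = [0.9, -0.2, 0.1]
others = [[0.1, 0.8, -0.5], [0.3, 0.2, 0.4]]
descriptor = action_context_descriptor(focal, others)
```

The resulting vector has twice as many entries as there are action classes: the first half describes the focal person, the second half summarizes what the surrounding people appear to be doing.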

Classification or Retrieval

• Previous work
– Most work in human action understanding focuses on action classification.

Classification or Retrieval

• Most surveillance tasks are retrieval tasks
– Retrieve a short video segment containing a particular action from thousands of hours of video.

• The “action of interest” is a rare event
– Extremely imbalanced classes

Action Retrieval

Rank videos according to their relevance to falls

Query: fall

Learning

• Input: document-rank pairs (x_i, y_i)

• Optimization

Joachims, KDD 06
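The optimization problem itself did not survive in this transcript. For reference, the pairwise Ranking SVM objective is commonly written as below; this is the textbook form and not necessarily the exact variant shown in the talk. The notation is assumed here: w is the weight vector of a linear ranking function h(x) = w^T x, C is a trade-off constant, ξ_ij are slack variables, and a larger y_i means a more relevant document.

```latex
\min_{w,\ \xi \ge 0} \quad \frac{1}{2}\lVert w \rVert^2 + C \sum_{(i,j):\, y_i > y_j} \xi_{ij}
\qquad \text{s.t.} \qquad
w^\top x_i \ \ge\ w^\top x_j + 1 - \xi_{ij} \quad \forall (i,j):\, y_i > y_j
```

Each constraint asks that a more relevant document score at least a margin of 1 above a less relevant one, with the slack ξ_ij absorbing violations.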

Ranking SVM

• Ranking function h(x)

[Figure: h(x) orders the examples into ranks r1, r2, r3]

Action Retrieval - training

[Figure: training videos labeled by relevance level: very relevant, relevant, irrelevant]
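Training from graded relevance labels can be illustrated with a small pairwise ranker. The sketch below is not the solver used in the talk: it assumes a linear ranking function and runs plain subgradient descent on a regularized pairwise hinge loss. The function names, hyperparameters and toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def preference_pairs(labels: Sequence[int]) -> List[Tuple[int, int]]:
    """All index pairs (i, j) where example i is more relevant than j."""
    n = len(labels)
    return [(i, j) for i in range(n) for j in range(n) if labels[i] > labels[j]]


def train_pairwise_ranker(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    reg: float = 0.01,
    lr: float = 0.1,
    epochs: int = 200,
) -> List[float]:
    """Subgradient descent on a regularized pairwise hinge loss."""
    w = [0.0] * len(features[0])
    pairs = preference_pairs(labels)
    for _ in range(epochs):
        grad = [reg * wk for wk in w]
        for i, j in pairs:
            diff = [a - b for a, b in zip(features[i], features[j])]
            if dot(w, diff) < 1.0:  # margin violated for this pair
                grad = [g - d / len(pairs) for g, d in zip(grad, diff)]
        w = [wk - lr * g for wk, g in zip(w, grad)]
    return w


# Hypothetical 2-D descriptors with graded relevance:
# 2 = very relevant, 1 = relevant, 0 = irrelevant.
X = [[2.0, 0.1], [1.0, 0.3], [0.2, 0.5], [0.0, 0.9]]
y = [2, 1, 0, 0]
w = train_pairwise_ranker(X, y)
ranking = sorted(range(len(X)), key=lambda i: dot(w, X[i]), reverse=True)
```

Only the relative order of the labels matters here: pairs with equal relevance generate no constraint, which is one reason a ranking loss can cope with a rare "very relevant" class.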

Dataset

• Nursing Home Dataset
– 5 action categories (per person): walking, standing, sitting, bending and falling
– 18 video clips
– Query: fall

• Collective Activity Dataset (Choi et al., VS 09)
– 5 action categories (per person): crossing, waiting, queuing, walking, talking
– 44 video clips
– Query: each of the five actions

Dataset

• Nursing Home Dataset

Dataset

• Collective Activity Dataset

System Overview

[Figure: pipeline with blocks Video, Person Detector, Person Descriptor and RankSVM; labels: u, v]

• Person detector
– Pedestrian detection by Felzenszwalb et al.
– Background subtraction

• Person descriptor
– HOG by Dalal & Triggs
– LST by Loy et al. (CVPR 09)

Baselines

• Context vs. No Context
– Action Context Descriptor
– Original feature descriptors, e.g. HOG (Dalal & Triggs, CVPR 05), LST (Loy et al., CVPR 09)

• RankSVM vs. SVM

• Methods
– Context + RankSVM (our method)
– Context + SVM
– No Context + RankSVM
– No Context + SVM

Retrieval Results

[Figure: retrieval results on the Nursing Home Dataset]

Retrieval Results

[Figure: retrieval results on the Collective Activity Dataset; panels numbered 1 to 8]

Action Classification

[Figure: action classification on the Collective Activity Dataset; reference shown: [10] Choi et al., VS 09]

Conclusion

• A new contextual feature descriptor to represent actions
– Action context (AC) descriptor

• We formulate the problem as a retrieval task.

Future Work

• Contextual feature descriptors
– How can we encode only the useful context?

• RankSVM loss
– Optimize the NDCG score
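For readers unfamiliar with NDCG, here is a minimal sketch of how the score is typically computed for one ranked list. The slides do not say which variant is meant, so the exponential gain (2^rel - 1) and the log2 discount used below are assumptions, and the function names are hypothetical.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain: (2^rel - 1) / log2(position + 1),
    with positions counted from 1."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2) for rank, rel in enumerate(relevances)
    )


def ndcg(ranked_relevances: Sequence[float]) -> float:
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0


# Relevance of each retrieved video, in ranked order
# (2 = very relevant, 1 = relevant, 0 = irrelevant).
perfect = ndcg([2, 1, 0, 0])
swapped = ndcg([0, 1, 2, 0])
```

Because the discount grows with rank position, placing the very relevant video third instead of first lowers the score, which is the behaviour a retrieval-oriented loss would aim to reward.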

Thank you!

