Generic Object and Action Detection with LARK (Locally Adaptive

28
 Generic Object and Action Detection with LARK (Locally Adaptive Regression Kernels) Haejong Seo University of California, Santa Cruz Mentor: Gary Bradski

Transcript of Generic Object and Action Detection with LARK (Locally Adaptive

Page 1: Generic Object and Action Detection with LARK (Locally Adaptive

   

Generic Object and Action Detection withLARK (Locally Adaptive Regression Kernels)

Haejong SeoUniversity of California, Santa Cruz

Mentor: Gary Bradski

Page 2: Generic Object and Action Detection with LARK (Locally Adaptive

   Haejong Seo (summer Internship) (1)

MotivationHuman­Robot InteractionRobot­Robot Interaction

Where to look?

Is there any motion?

Where is a bottle of beer located?How big is this bottle?What pose is this ? 

Is he/she waving hands to me?Is other PR2 approaching to me?

122

3

4

Page 3: Generic Object and Action Detection with LARK (Locally Adaptive

   

MotivationDOT (dominant orientation templates) by Stefan

1) can handle object detection2) run with a single web­cam3) pretty fast

However, can not deal with 

Haejong Seo (summer Internship) (2)

BiGGPy (binarized gradients grid pyramid) by Gary 

3

1 2 4

LARK can tackle all the problems like 

Page 4: Generic Object and Action Detection with LARK (Locally Adaptive

   

Motivation & Goal

static saliency object detection

space­time saliency action detection

Haejong Seo (summer Internship) (3)

Develop fast and robust detection systems in open sourse

1

2 4

3

Page 5: Generic Object and Action Detection with LARK (Locally Adaptive

   

OutlineOutline

LARK Overview

Saliency Detection

Object Detection

Conclusion

Space­time Saliency Detection

Action Detection

1

2

3

4

5

6

Page 6: Generic Object and Action Detection with LARK (Locally Adaptive

   

LARK (locally adaptive regression kernels)

Haejong Seo (summer Internship) (4)

●Euclidean distance vs. Geodesic distance

Page 7: Generic Object and Action Detection with LARK (Locally Adaptive

   

Image as a Surface Embedded in the Euclidean 3‐space

Arclength on the surface

Chain rule

Haejong Seo (summer Internship) (5)

Page 8: Generic Object and Action Detection with LARK (Locally Adaptive

   

LARK   self­similarity →

Haejong Seo (summer Internship) (6)

Page 9: Generic Object and Action Detection with LARK (Locally Adaptive

   

LARK (example)

Haejong Seo (summer Internship) (7)

Page 10: Generic Object and Action Detection with LARK (Locally Adaptive

   

LARK (speed­up)Step1: downsample by a factor of 4

Step 2: interpolate C = [C11, C12, C22] after computing in a lower scale 

C11 C22 C12

Haejong Seo (summer Internship) (8)

0.02 sec (70 times faster)

Page 11: Generic Object and Action Detection with LARK (Locally Adaptive

   

Saliency Detection

Haejong Seo (summer Internship) (9)

LARK self­resemblance

Saliency map

thresholding

Page 12: Generic Object and Action Detection with LARK (Locally Adaptive

   

Saliency Detection (video)

Haejong Seo (summer Internship) (10)

Page 13: Generic Object and Action Detection with LARK (Locally Adaptive

   

Object Detection

Haejong Seo (summer Internship) (11)

Compute LARKCompute LARK

templatetemplate

imageimage

Page 14: Generic Object and Action Detection with LARK (Locally Adaptive

   

Object Detection (speed­up)

Use saliency to reduce search space

Haejong Seo (summer Internship) (12)

Page 15: Generic Object and Action Detection with LARK (Locally Adaptive

   

Face Detection (video)

Haejong Seo (summer Internship) (13)

One template Three templates

Page 16: Generic Object and Action Detection with LARK (Locally Adaptive

   

Object Detection (video)

Haejong Seo (summer Internship) (14)

Door knob PR2

Drawing Small robot

Three templates

Page 17: Generic Object and Action Detection with LARK (Locally Adaptive

   

3­D Object Detection (speed­up)

Pyramid searchPyramid searchTree structure for template

Haejong Seo (summer Internship) (15)

Page 18: Generic Object and Action Detection with LARK (Locally Adaptive

   

3­D Object Detection (video)

Haejong Seo (summer Internship) (18)

CD case mouse

naked organizer

Page 19: Generic Object and Action Detection with LARK (Locally Adaptive

   

3­D Object Detection (video)

Haejong Seo (summer Internship) (19)

Two objects Three objects

Page 20: Generic Object and Action Detection with LARK (Locally Adaptive

   

3­D LARK   self­similarity in 3­D→

Haejong Seo (summer Internship) (19)

Page 21: Generic Object and Action Detection with LARK (Locally Adaptive

   

Space­time Saliency Detection

Haejong Seo (summer Internship) (11)

Page 22: Generic Object and Action Detection with LARK (Locally Adaptive

   

Space­time Saliency (video)

Haejong Seo (summer Internship) (12)

Page 23: Generic Object and Action Detection with LARK (Locally Adaptive

   

Action Detection

Haejong Seo (summer Internship) (20)

template

Input video

LARKs(30~35 frames)

Page 24: Generic Object and Action Detection with LARK (Locally Adaptive

   

Action Detection (speed­up)

Haejong Seo (summer Internship) (21)

space­time saliencypyramid search

5 frames 5 frames

35 frames

7 frames of 3­D LARK (3x3 (space)x5 (time))

Page 25: Generic Object and Action Detection with LARK (Locally Adaptive

   

Action Detection (video)

Haejong Seo (summer Internship) (22)

4 actions

sitting down

moving closer

boxing

waving

Page 26: Generic Object and Action Detection with LARK (Locally Adaptive

   

Code Availability

Package larks: service that trains object templates and detects objects locations and poses (available now)

 → stacks/object_recognition_experimental/larks

Haejong Seo (summer Internship) (23)

Package saliency: service that provides salient regions in images and videos (will be available) 

 → cturtle/wg­ros­pkg­unreleased/sandbox/saliency

Package actiondetection: service that detect generic human actions in videos (will be available) 

 → cturtle/wg­ros­pkg­unreleased/sandbox/actiondetection

Page 27: Generic Object and Action Detection with LARK (Locally Adaptive

   

Discussion & Future Work

Use parts­based detection to deal with occlusion (Steve Gould)

Use a tracking algorithm to avoid blinking effects

Improve scalability   build a common tree for all the objects→

Haejong Seo (summer Internship) (24)

Learn threshold values for each object and action 

Use LARK as a post filter for BiGGPy

Page 28: Generic Object and Action Detection with LARK (Locally Adaptive

   

Thank you!

Any Questions?