Visual Object Recognition

Visual Object Recognition Rob Fergus Courant Institute, New York University http://cs.nyu.edu/~fergus/icml_tutorial/


Transcript of Visual Object Recognition

Page 1: Visual Object Recognition

Visual Object Recognition

Rob Fergus

Courant Institute,

New York University

http://cs.nyu.edu/~fergus/icml_tutorial/

Page 2: Visual Object Recognition

Agenda

• Introduction

• Bag-of-words models

• Visual words with spatial location

• Part-based models

• Discriminative methods

• Segmentation and recognition

• Recognition-based image retrieval

• Datasets & Conclusions

Page 3: Visual Object Recognition

Li Fei-Fei, Princeton

Rob Fergus, NYU

Antonio Torralba, MIT

Recognizing and Learning Object Categories: Year 2007

http://people.csail.mit.edu/torralba/shortCourseRLOC

Page 4: Visual Object Recognition

Agenda

• Introduction

• Bag-of-words models

• Visual words with spatial location

• Part-based models

• Discriminative methods

• Segmentation and recognition

• Recognition-based image retrieval

• Datasets & Conclusions

Page 5: Visual Object Recognition

So what does object recognition involve?

Page 6: Visual Object Recognition

Classification: are there street-lights in the image?

Page 7: Visual Object Recognition

Detection: localize the street-lights in the image
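The difference between the two tasks on the last two slides can be sketched in terms of what each one returns. This is only an illustration: the `Box` type, the labels, the scores, and the threshold are all made up, and classification is derived from box scores here purely to keep the sketch short (a real classifier need not localize anything).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """One hypothetical detector output: a labeled, scored bounding box."""
    label: str
    x: int
    y: int
    width: int
    height: int
    score: float


def detect(boxes: List[Box], label: str, threshold: float = 0.5) -> List[Box]:
    """Detection: return where the confident instances of `label` are."""
    return [b for b in boxes if b.label == label and b.score >= threshold]


def classify(boxes: List[Box], label: str, threshold: float = 0.5) -> bool:
    """Classification: answer only whether `label` is present in the image."""
    return len(detect(boxes, label, threshold)) > 0


# Made-up detector output for one street scene.
scene = [
    Box("street-light", 40, 10, 12, 80, 0.9),
    Box("street-light", 300, 20, 10, 70, 0.7),
    Box("car", 120, 150, 90, 50, 0.8),
    Box("street-light", 500, 30, 8, 40, 0.2),  # below threshold, ignored
]

has_street_lights = classify(scene, "street-light")   # yes/no answer
street_light_boxes = detect(scene, "street-light")    # list of locations
```

Classification collapses the scene to a single yes/no answer, while detection keeps one box per confident instance.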

Page 8: Visual Object Recognition

Object categorization

mountain

building

tree

banner

vendor

people

street lamp

Page 9: Visual Object Recognition

Scene and context categorization

• outdoor
• city
• …

Page 10: Visual Object Recognition

Application: Assisted driving

[Figure: road scene with both axes in meters; detections labeled Ped, Ped, Car]

Lane detection

Pedestrian and car detection

• Collision warning systems with adaptive cruise control
• Lane departure warning systems
• Rear object detection systems

Page 11: Visual Object Recognition

Application: Computational photography

Page 12: Visual Object Recognition

Application: Improving online search

Query: STREET

Organizing photo collections

Page 13: Visual Object Recognition

Challenges 1: viewpoint variation

Michelangelo 1475-1564

Page 14: Visual Object Recognition

Challenges 2: scale

Page 15: Visual Object Recognition

Challenges 3: illumination

slide credit: S. Ullman

Page 16: Visual Object Recognition

Challenges 4: background clutter

Bruegel, 1564

Page 17: Visual Object Recognition

Challenges 5: occlusion

http://lh5.ggpht.com/_wJc6t2hDl2M/RrL7Gh6sS7I/AAAAAAAAAYY/n3xaHc2opls/DSC00633.JPG

Page 18: Visual Object Recognition

Challenges 6: deformation

http://img.timeinc.net/time/asia/magazine/2007/1112/racehorse_1112.jpg

Page 19: Visual Object Recognition

History: single object recognition

Object 1 Object 2

Object 3

Page 20: Visual Object Recognition

Single object recognition history: Geometric methods

David Lowe [1985]

Rothwell et al. [1992]

Page 21: Visual Object Recognition

Single object recognition history: Appearance-based methods

• Murase & Nayar, 1995
• Schmid & Mohr, 1997
• Lowe et al., 1999, 2003
• Mahamud and Hebert, 2000
• Ferrari et al., 2004
• Rothganger et al., 2004
• Moreels and Perona, 2005
• …

Page 22: Visual Object Recognition

Challenges 7: intra-class variation

Shoe class

Instance 1

Instance 2

Instance 3

Page 23: Visual Object Recognition

History: early object categorization

Page 24: Visual Object Recognition

• Fischler, Elschlager, 1973
• Turk and Pentland, 1991
• Belhumeur, Hespanha, & Kriegman, 1997
• Rowley & Kanade, 1998
• Schneiderman & Kanade, 2004
• Viola and Jones, 2000
• Heisele et al., 2001

• Amit and Geman, 1999
• LeCun et al., 1998
• Belongie and Malik, 2002
• DeCoste and Scholkopf, 2002
• Simard et al., 2003

• Poggio et al., 1993
• Agarwal and Roth, 2002
• Schneiderman & Kanade, 2004
• …

Page 25: Visual Object Recognition

Page 26: Visual Object Recognition

Three main issues

• Representation
– How to represent an object category

• Learning
– How to form the classifier, given training data

• Recognition
– How the classifier is to be used on novel data
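The three issues can be made concrete with a deliberately tiny pipeline. Everything below is an assumption chosen for brevity, not a method from the tutorial: the representation is a 4-bin intensity histogram, learning averages the histograms of each class, and recognition picks the nearest class mean. The toy images and the labels "night" and "day" are made up.

```python
from collections import defaultdict
from math import dist


def represent(image, bins=4, max_value=256):
    """Representation: normalized intensity histogram of a grayscale image
    given as a list of rows of integer pixel values in [0, max_value)."""
    pixels = [p for row in image for p in row]
    hist = [0.0] * bins
    for p in pixels:
        hist[min(p * bins // max_value, bins - 1)] += 1.0
    return [h / len(pixels) for h in hist]


def learn(training_set):
    """Learning: form the classifier by averaging each class's features."""
    grouped = defaultdict(list)
    for image, label in training_set:
        grouped[label].append(represent(image))
    return {
        label: [sum(column) / len(features) for column in zip(*features)]
        for label, features in grouped.items()
    }


def recognize(model, image):
    """Recognition: apply the classifier to a novel image by choosing
    the class whose mean feature vector is closest."""
    feature = represent(image)
    return min(model, key=lambda label: dist(model[label], feature))


# Made-up 2x2 training images.
dark = [[10, 20], [30, 40]]
bright = [[200, 220], [240, 250]]

model = learn([(dark, "night"), (bright, "day")])
prediction = recognize(model, [[5, 15], [25, 60]])  # a novel dark image
```

Each function maps onto one bullet above, so swapping in a richer representation or a different learner changes one function without touching the other two.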

Page 27: Visual Object Recognition

Representation

– Generative / discriminative / hybrid

Page 28: Visual Object Recognition

Representation

– Generative / discriminative / hybrid

– Appearance only or location and appearance

Page 29: Visual Object Recognition

Representation

– Generative / discriminative / hybrid

– Appearance only or location and appearance

– Invariances
• Viewpoint
• Illumination
• Occlusion
• Scale
• Deformation
• Clutter
• etc.

Page 30: Visual Object Recognition

Representation

– Generative / discriminative / hybrid

– Appearance only or location and appearance

– Invariances

– Part-based or global with sub-window

Page 31: Visual Object Recognition

Representation

– Generative / discriminative / hybrid

– Appearance only or location and appearance

– Invariances

– Parts or global w/sub-window

– Use set of features or each pixel in image

Page 32: Visual Object Recognition

Learning

– Unclear how to model categories, so learn rather than manually specify

Page 33: Visual Object Recognition

Learning

– Unclear how to model categories, so learn rather than manually specify

– Methods of training: generative vs. discriminative

Page 34: Visual Object Recognition

Learning

– Unclear how to model categories, so learn rather than manually specify

– Methods of training: generative vs. discriminative

– Level of supervision
• Manual segmentation; bounding box; image labels; noisy labels

Contains a motorbike

Page 35: Visual Object Recognition

Learning

– Unclear how to model categories, so learn rather than manually specify

– Methods of training: generative vs. discriminative

– Level of supervision
• Manual segmentation; bounding box; image labels; noisy labels

– Training images
• Issue of over-fitting (typically limited training data)
• Negative images for discriminative methods

Page 36: Visual Object Recognition

Learning

– Unclear how to model categories, so learn rather than manually specify

– Methods of training: generative vs. discriminative

– Level of supervision
• Manual segmentation; bounding box; image labels; noisy labels

– Training images
• Issue of over-fitting (typically limited training data)
• Negative images for discriminative methods

– Priors
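The "generative vs. discriminative" distinction can be illustrated on a toy 1-D problem. This is a sketch under stated assumptions, not the tutorial's own method: the feature values are made up, the generative model is one Gaussian per class combined with class priors via Bayes' rule, and the discriminative model is logistic regression fit by plain gradient descent (the learning rate and epoch count are arbitrary).

```python
import math


def fit_gaussian(xs):
    """Generative step: model how one class produces its feature values."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, var


def log_gaussian(x, mu, var):
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def generative_predict(x, params, priors):
    """Pick the class maximizing log prior + log likelihood (Bayes' rule)."""
    scores = {c: math.log(priors[c]) + log_gaussian(x, *params[c]) for c in params}
    return max(scores, key=scores.get)


def fit_logistic(xs, ys, lr=0.1, epochs=500):
    """Discriminative step: fit the decision boundary directly."""
    w, b, n = 0.0, 0.0, len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x / n
            grad_b += (p - y) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def discriminative_predict(x, w, b):
    return 1 if w * x + b > 0 else 0


# Made-up 1-D features: class 0 (background) and class 1 (object).
class0 = [-2.5, -2.0, -1.5, -1.0]
class1 = [1.0, 1.5, 2.0, 2.5]

params = {0: fit_gaussian(class0), 1: fit_gaussian(class1)}
priors = {0: 0.5, 1: 0.5}
w, b = fit_logistic(class0 + class1, [0] * 4 + [1] * 4)
```

The generative route needs only examples of each class plus a prior, whereas the discriminative route needs negative examples to place the boundary, which is the point about negative images made above.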

Page 37: Visual Object Recognition

Recognition

– Scale / orientation range to search over

– Speed

– Context
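The link between the search range and speed can be seen by counting candidate windows in a plain sliding-window search. The sketch below is an assumption for illustration only: the image size, base window size, scales, and stride are arbitrary, and orientation is left out (searching over it would multiply the count again).

```python
from typing import Iterator, Sequence, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def sliding_windows(
    image_w: int,
    image_h: int,
    base: int = 16,
    scales: Sequence[float] = (1.0, 2.0),
    stride: int = 8,
) -> Iterator[Window]:
    """Enumerate square candidate windows at every scale and position."""
    for scale in scales:
        size = int(round(base * scale))
        # Only positions where the window fits entirely inside the image.
        for y in range(0, image_h - size + 1, stride):
            for x in range(0, image_w - size + 1, stride):
                yield (x, y, size, size)


# A tiny 64x48 image already yields 50 windows for the classifier to score.
windows = list(sliding_windows(64, 48))
window_count = len(windows)
```

Every extra scale adds another full pass over positions, so the classifier is evaluated once per window and the range searched directly drives running time.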

Page 38: Visual Object Recognition

Recognition

– Context enables pruning of detector output

Hoiem, Efros, Hebert, 2006
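One way context can prune detector output is through scene geometry. The sketch below is a simplified illustration, not necessarily the method of the cited work: it assumes a level camera over a flat ground plane with a known horizon row, so an object standing on the ground has an expected pixel height proportional to how far its base lies below the horizon. The `Detection` type, the object and camera heights, and the tolerance are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    bottom_y: float  # image row of the box's bottom edge (rows grow downward)
    height: float    # box height in pixels
    score: float


def expected_height(bottom_y, horizon_y, object_height_m, camera_height_m):
    """Expected pixel height of a ground-plane object (flat-ground assumption)."""
    return object_height_m * (bottom_y - horizon_y) / camera_height_m


def prune(
    detections: List[Detection],
    horizon_y: float,
    object_height_m: float = 1.7,   # assumed pedestrian height
    camera_height_m: float = 1.6,   # assumed camera height above ground
    tolerance: float = 2.0,         # accept sizes within a factor of 2
) -> List[Detection]:
    """Keep only detections whose size is plausible given the horizon."""
    kept = []
    for d in detections:
        if d.bottom_y <= horizon_y:
            continue  # base at or above the horizon: not standing on the ground
        ratio = d.height / expected_height(
            d.bottom_y, horizon_y, object_height_m, camera_height_m
        )
        if 1.0 / tolerance <= ratio <= tolerance:
            kept.append(d)
    return kept


# Made-up pedestrian detections with the horizon at row 100.
plausible = Detection(bottom_y=260, height=170, score=0.6)
too_large = Detection(bottom_y=120, height=170, score=0.9)
floating = Detection(bottom_y=90, height=60, score=0.8)

kept = prune([plausible, too_large, floating], horizon_y=100)
```

Here the highest-scoring box is discarded because its size contradicts the scene layout, which is the sense in which context prunes detections that appearance alone would accept.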