9.913 Pattern Recognition for Vision Class9 - Object Detection and Recognition Bernd Heisele.

Post on 18-Jan-2018

9.913 Pattern Recognition for Vision

Class9 - Object Detection and Recognition

Bernd Heisele

Outline

• Object Detection

• Object Recognition

Object Detection

• Task: Given an input image, determine if there are objects of a given class in the image and where they are located.
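The task stated above can be sketched as a sliding-window scan. This is a minimal illustration, not the system from the slides: the fixed window size, the `threshold` and `step` parameters, and the `classify` scoring function are all assumptions introduced here.

```python
def sliding_window_detect(image, window, classify, threshold=0.0, step=1):
    """Return (row, col, score) for every window whose score exceeds threshold.

    image: 2-D list of gray values; window: (height, width).
    classify: hypothetical scoring function, patch -> float.
    """
    win_h, win_w = window
    rows, cols = len(image), len(image[0])
    detections = []
    for r in range(0, rows - win_h + 1, step):
        for c in range(0, cols - win_w + 1, step):
            # Cut out the patch whose top-left corner is (r, c).
            patch = [row[c:c + win_w] for row in image[r:r + win_h]]
            score = classify(patch)
            if score > threshold:
                detections.append((r, c, score))
    return detections


# Stand-in classifier for the toy example: mean gray value minus 0.5.
def mean_brightness(patch):
    values = [v for row in patch for v in row]
    return sum(values) / len(values) - 0.5


# Toy image in which the "object" is a bright 2x2 block.
toy_image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
hits = sliding_window_detect(toy_image, (2, 2), mean_brightness)
```

The list `hits` answers both parts of the task: an empty list means no object of the class was found, and each entry gives a location. A real detector would typically also scan over several image scales.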

Face Detection System Architecture

Testing

Image Features

ROC for Image Features

Gray

Gray + Haar

Haar

Gray + Grad
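An ROC curve of the kind compared here plots the detection rate against the false-positive rate as the decision threshold varies. The following is a minimal sketch of that computation, assuming real-valued classifier scores and binary labels; the function and variable names are illustrative and do not come from the slides.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold.

    scores: classifier outputs (higher means more object-like).
    labels: 1 for object, 0 for non-object.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    # Sweep the threshold from the highest score downward.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
curve = roc_points(toy_scores, toy_labels)
```

A feature set whose curve lies closer to the top-left corner detects more objects at the same false-positive rate.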

Positive Training Data

Real vs. Synthetic

Real

Synthetic

ROC for Classifiers

LDA

Linear SVM

Poly2

Global vs. Components

(Whole Face)

Component-based Detection

Some Examples

ROC Component vs. Global

• About 40000 faces

• 68 people

• 13 poses

• 43 illumination conditions

• CMU PIE database

Training on Faces

Positive

Facial Negative

Non-facial Negative

Use the remainder of the face in the negative training set

Training on Faces

Red: Trained on facial and non-facial negative set.

Blue: Trained only on non-facial negative set.

Pair-wise Biasing

Often, many components classify correctly, with only a few errors.

Use the pair-wise relative position information from training data to bias the result image.
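The slides do not give the exact biasing rule. One plausible reading, stated here purely as an assumption, is that the score of one component at a position is raised when a second component responds at the position where training data says it should be. The function name, the `weight` parameter, and the additive form are hypothetical.

```python
def bias_result_image(result_i, result_j, offset_ij, weight=0.5):
    """Add to each score of component i a weighted score of component j,
    read at the position where j is expected relative to i.

    result_i, result_j: 2-D lists of classifier scores (result images)
                        of the same size.
    offset_ij: (d_row, d_col), expected position of j relative to i,
               assumed to be estimated from training data.
    """
    rows, cols = len(result_i), len(result_i[0])
    d_row, d_col = offset_ij
    biased = [row[:] for row in result_i]
    for r in range(rows):
        for c in range(cols):
            rj, cj = r + d_row, c + d_col
            # Positions whose partner falls outside the image get no bias.
            if 0 <= rj < rows and 0 <= cj < cols:
                biased[r][c] += weight * result_j[rj][cj]
    return biased


# Toy example: component i has two equally strong responses; component j
# responds one column to the right of the first one, where it is expected.
eye = [[1.0, 0.0, 1.0]]
nose = [[0.0, 1.0, 0.0]]
biased_eye = bias_result_image(eye, nose, offset_ij=(0, 1))
```

In the toy example the first response is supported by the second component and rises to 1.5, while the unsupported response stays at 1.0, so the ambiguity between the two candidates is resolved.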

Pair-wise Biasing

Result Images

Biased Results

ROC Pair-wise Biasing

Red: Trained on facial and non-facial negative set.

Blue: Trained only on non-facial negative set.

Dashed: Biasing and trained on facial and non-facial negative set.

Pedestrian Detection

Object Recognition

• Task: Given an image of an object of a particular class, identify which exemplar it is.

Recognition System Architecture

Multi-class Classification with SVM

Training: N (N-1) / 2; Classification: N - 1

Training: N; Classification: N

The two architectures have similar performance.
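The counts above can be checked with a short sketch. It reads the first architecture as pairwise (one vs. one) SVMs evaluated in an elimination tournament and the second as one vs. all; the slides give only the counts, so this reading and all names below are assumptions.

```python
def pairwise_counts(n_classes):
    """(SVMs to train, SVM evaluations per test image) for pairwise SVMs
    evaluated as an elimination tournament."""
    return n_classes * (n_classes - 1) // 2, n_classes - 1


def one_vs_all_counts(n_classes):
    """(SVMs to train, SVM evaluations per test image) for one vs. all."""
    return n_classes, n_classes


def tournament_classify(classes, pair_decide):
    """Eliminate one class per pairwise comparison.

    pair_decide: hypothetical function (class_a, class_b) -> winning class,
                 standing in for the SVM trained on that pair.
    """
    winner = classes[0]
    evaluations = 0
    for challenger in classes[1:]:
        winner = pair_decide(winner, challenger)
        evaluations += 1
    return winner, evaluations


# Toy stand-in for the pairwise SVMs: the larger label always wins.
winner, evaluations = tournament_classify([3, 1, 4, 2, 0], max)
```

For five classes the pairwise scheme trains 10 SVMs but evaluates only 4 per test image, whereas one vs. all trains and evaluates 5.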

Global Approach

1. Detect and extract face

2. Feed gray values of extracted face into N SVMs

3. Classify based on maximum output

Each SVM is trained one vs. all.
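Steps 2 and 3 can be sketched as follows. Plain linear decision functions stand in for the trained SVMs, and the labels, weights, and function names are made up for illustration.

```python
def classify_max_output(x, svms):
    """Evaluate every one-vs-all decision function and return the label
    with the maximum output, together with all outputs.

    svms: dict mapping person label -> hypothetical decision function.
    """
    scores = {label: decide(x) for label, decide in svms.items()}
    return max(scores, key=scores.get), scores


def linear_decision(weights, bias):
    """Build a toy linear decision function w . x + b."""
    def decide(x):
        return sum(w * v for w, v in zip(weights, x)) + bias
    return decide


toy_svms = {
    "person_a": linear_decision([1.0, -1.0], 0.0),
    "person_b": linear_decision([-1.0, 1.0], 0.0),
    "person_c": linear_decision([0.5, 0.5], -1.0),
}
# The input stands in for the gray values of an extracted face.
label, scores = classify_max_output([0.9, 0.1], toy_svms)
```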

Global Approach with Clustering

T1. Partition training images of each person into viewpoint-specific clusters

T2. Train an SVM on each cluster.

R1. Detect and extract face

R2. Feed extracted face to all SVMs

R3. Take maximum over all SVM outputs
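Recognition steps R2 and R3 can be sketched as below; the clustering in T1 and the training in T2 are not shown. The decision functions are toy stand-ins for trained SVMs, and keying them by a (person, cluster) pair is an assumption made for this illustration.

```python
def classify_with_clusters(x, cluster_svms):
    """Evaluate all cluster-specific decision functions and return the
    person (and cluster) whose function gives the maximum output.

    cluster_svms: dict mapping (person, cluster_id) -> decision function.
    """
    best_key = max(cluster_svms, key=lambda key: cluster_svms[key](x))
    person, cluster_id = best_key
    return person, cluster_id


def closeness_to(center):
    """Toy decision function: negative squared distance to a cluster center."""
    def decide(x):
        return -sum((a - b) ** 2 for a, b in zip(x, center))
    return decide


toy_cluster_svms = {
    ("person_a", "frontal"): closeness_to([0.0, 0.0]),
    ("person_a", "rotated"): closeness_to([1.0, 0.0]),
    ("person_b", "frontal"): closeness_to([0.0, 1.0]),
    ("person_b", "rotated"): closeness_to([1.0, 1.0]),
}
person, cluster = classify_with_clusters([0.9, 0.1], toy_cluster_svms)
```

Only the person is needed for recognition; the winning cluster is returned as well because it indicates which viewpoint matched.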

Component-based Approach

1. Detect face and extract components

2. Combine the gray values of the components into a feature vector and feed it to the N SVMs

3. Take maximum over all SVM outputs
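Step 2 can be sketched as a concatenation of component patches. The box format `(top, left, height, width)` and the function name are hypothetical; the slides do not say how components are represented.

```python
def component_feature_vector(face, component_boxes):
    """Concatenate the gray values of each component into one flat vector.

    face: 2-D list of gray values.
    component_boxes: list of (top, left, height, width) tuples, assumed to
                     come from the component detector.
    """
    features = []
    for top, left, height, width in component_boxes:
        # Append the component's rows one after another.
        for row in face[top:top + height]:
            features.extend(row[left:left + width])
    return features


toy_face = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
toy_boxes = [(0, 0, 1, 2), (1, 2, 2, 2)]
vector = component_feature_vector(toy_face, toy_boxes)
```

For the vector to have the same length for every face, each component would need a fixed size across images, for example by rescaling the extracted patches.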

ROC Component vs. Global Recognition

• Trained and tested on frontal and rotated faces.