Object recognition


Transcript of Object recognition

Page 1: Object recognition

Object recognition

Page 2: Object recognition

Object Classes

Page 3: Object recognition

Individual Recognition

Page 4: Object recognition

Is this a dog?

Page 5: Object recognition

Variability of Airplanes Detected

Page 6: Object recognition

Variability of Horses Detected

Page 7: Object recognition

Class Non-class

Page 8: Object recognition

Class Non-class

Page 9: Object recognition

Page 10: Object recognition

Recognition with 3-D primitives

Geons

Page 11: Object recognition

Visual Class: Common Building Blocks

Page 12: Object recognition

Optimal Class Components?

• Large features are too rare

• Small features are found everywhere

Find features that carry the highest amount of information

Page 13: Object recognition

Entropy

Entropy: H = -Σi p(xi) log2 p(xi)

For a binary variable x ∈ {0, 1}:

p = (0.5, 0.5)   →  H = 1.0
p = (0.1, 0.9)   →  H = 0.47
p = (0.01, 0.99) →  H = 0.08
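As a quick numerical check of the table values (entropy in bits), a minimal sketch with illustrative names:

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a discrete distribution p (zero-probability terms contribute 0)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Values from the table above
for dist in [(0.5, 0.5), (0.1, 0.9), (0.01, 0.99)]:
    print(dist, round(entropy(dist), 2))   # 1.0, 0.47, 0.08
```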

Page 14: Object recognition

Mutual Information I(x,y)

X alone: p(x) = (0.5, 0.5), H(X) = 1.0

X given Y:

Y = 0:  p(x) = (0.8, 0.2), H = 0.72
Y = 1:  p(x) = (0.1, 0.9), H = 0.47

H(X|Y) = 0.5·0.72 + 0.5·0.47 = 0.595

I(X,Y) = H(X) - H(X|Y) = 1 - 0.595 = 0.405
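The same arithmetic, reproduced end to end (the `entropy` helper is repeated so the snippet stands alone):

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

H_X = entropy([0.5, 0.5])                                        # 1.0
H_X_given_Y = 0.5 * entropy([0.8, 0.2]) + 0.5 * entropy([0.1, 0.9])
print(round(H_X_given_Y, 3))                                     # 0.595
print(round(H_X - H_X_given_Y, 3))                               # 0.405
```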

Page 15: Object recognition

Mutual information

I(C;F) = H(C) - H(C|F)

where H(C) = -Σc P(c) log P(c)

[Figure: the class entropy H(C) compared with the conditional entropies H(C) when F = 1 and H(C) when F = 0.]

Page 16: Object recognition

Mutual Information II

I(X,Y) = Σx,y p(x,y) log [ p(x,y) / (p(x) p(y)) ]
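A direct implementation of this double sum; the example joint distribution is illustrative, not from the slides:

```python
import numpy as np

def mutual_information(p_xy):
    """I(X;Y) in bits from a joint distribution given as a 2-D array p_xy[x, y]."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x * p_y)[mask])))

# An illustrative joint distribution (rows: x = 0, 1; columns: y = 0, 1)
p_xy = np.array([[0.40, 0.05],
                 [0.10, 0.45]])
print(round(mutual_information(p_xy), 3))   # 0.397, which equals H(X) - H(X|Y) for this joint
```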

Page 17: Object recognition

Computing MI from Examples

• Mutual information can be measured from examples:

100 Faces: feature detected 44 times.  100 Non-faces: feature detected 6 times.

H(C) = 1, H(C|F) = 0.8475, Mutual information: 0.1525
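These numbers can be re-derived from the raw counts; the sketch below assumes equal class priors (100 faces, 100 non-faces) and rounds slightly differently than the slide:

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

faces, non_faces = 100, 100
hits_face, hits_nonface = 44, 6                  # feature detections in each set

p_F1 = (hits_face + hits_nonface) / (faces + non_faces)               # p(F = 1)
p_C_given_F1 = hits_face / (hits_face + hits_nonface)                 # p(face | F = 1)
p_C_given_F0 = (faces - hits_face) / (faces + non_faces - hits_face - hits_nonface)

H_C = entropy([0.5, 0.5])                                             # 1.0
H_C_given_F = (p_F1 * entropy([p_C_given_F1, 1 - p_C_given_F1])
               + (1 - p_F1) * entropy([p_C_given_F0, 1 - p_C_given_F0]))
print(round(H_C_given_F, 3), round(H_C - H_C_given_F, 3))             # ~0.847, ~0.153 (the slide's 0.8475 / 0.1525 up to rounding)
```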

Page 18: Object recognition

Full KL Classification Error

[Diagram: variables C and F with the class prior p(C), the generative distribution p(F|C), and the classifier's approximation q(C|F).]

Page 19: Object recognition

Optimal classification features

• Theoretically: maximizing delivered information minimizes classification error

• In practice: informative object components can be identified in training images

Page 20: Object recognition

Mutual Info vs. Threshold

[Plot: mutual information as a function of the detection threshold for face fragments: forehead, hairline, mouth, eye, nose, nosebridge, long_hairline, chin, twoeyes.]

Selecting Fragments

Page 21: Object recognition

Adding a New Fragment (max-min selection)

ΔMI(Fi, Fk) = MI(C; Fi, Fk) - MI(C; Fk)

Select: maxi mink ΔMI(Fi, Fk)

(Min. over the existing fragments Fk, Max. over the entire candidate pool Fi)
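A self-contained sketch of this greedy max-min loop; MI(C; ·) is estimated by plain counting over binary fragment-response vectors, and the data layout, helper names, and toy data are my own illustration rather than the lecture's implementation:

```python
import numpy as np
from collections import Counter

def entropy_from_counts(counter):
    p = np.array(list(counter.values()), dtype=float)
    p /= p.sum()
    return float(-np.sum(p * np.log2(p)))

def mi_class_features(labels, feats):
    """MI(C; F) for a subset of binary fragments, estimated by counting.
    feats: (n_samples, m) 0/1 array; labels: length-n_samples class labels."""
    keys = [tuple(row) for row in feats]
    H_C  = entropy_from_counts(Counter(labels))
    H_F  = entropy_from_counts(Counter(keys))
    H_CF = entropy_from_counts(Counter(zip(labels, keys)))
    return H_C + H_F - H_CF

def max_min_select(labels, responses, n_select):
    """Greedy max-min selection over a pool of binary fragment responses."""
    pool = list(range(responses.shape[1]))
    # start with the single most informative fragment
    first = max(pool, key=lambda i: mi_class_features(labels, responses[:, [i]]))
    selected, pool = [first], [i for i in pool if i != first]
    while pool and len(selected) < n_select:
        def gain(i):  # min over the already-selected fragments of the added MI
            return min(mi_class_features(labels, responses[:, [i, k]])
                       - mi_class_features(labels, responses[:, [k]])
                       for k in selected)
        best = max(pool, key=gain)
        selected.append(best)
        pool.remove(best)
    return selected

# Toy usage: 200 images, 20 candidate fragments
rng = np.random.default_rng(0)
labels = np.repeat([1, 0], 100)
responses = (rng.random((200, 20)) < 0.3).astype(int)
print(max_min_select(labels, responses, n_select=5))
```

Each round adds the candidate whose worst-case added information over the already selected fragments is largest, which discourages redundant fragments.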

Page 22: Object recognition

Highly Informative Face Fragments

Page 23: Object recognition

Intermediate Complexity

[Plots a, b: 100 × merit and weight vs. relative object size, and relative mutual information and 100 × merit, weight vs. relative resolution.]

Page 24: Object recognition

Decision

Combine all detected fragments Fk:

∑wk Fk > θ

Page 25: Object recognition

Optimal Separation

SVM / Perceptron

∑wk Fk = θ is a hyperplane
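As one concrete way of finding such a separating hyperplane, here is a minimal perceptron sketch on toy fragment data (the data and learning-rate choices are illustrative; an SVM would instead maximize the margin):

```python
import numpy as np

def perceptron(F, y, epochs=100, lr=1.0):
    """Learn w, theta so that sign(F @ w - theta) matches labels y in {-1, +1}."""
    w = np.zeros(F.shape[1])
    theta = 0.0
    for _ in range(epochs):
        errors = 0
        for x, t in zip(F, y):
            if t * (x @ w - theta) <= 0:      # misclassified: update
                w += lr * t * x
                theta -= lr * t
                errors += 1
        if errors == 0:                       # converged: all points correctly classified
            break
    return w, theta

# Toy demo: the class is determined by fragments 0 and 1, so the data are separable
rng = np.random.default_rng(1)
F = rng.integers(0, 2, size=(50, 5))
y = np.where(F[:, 0] + F[:, 1] >= 1, 1, -1)
w, theta = perceptron(F, y)
print(np.mean(np.sign(F @ w - theta) == y))   # 1.0 on this separable toy data
```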

Page 26: Object recognition

Combining fragments linearly

Conditional independence: P(F1, F2 | C) = p(F1 | C) p(F2 | C)

Likelihood-ratio decision (NC = non-class):

p(F | C) / p(F | NC) > θ

With independence this becomes a product over fragments:

Πi p(Fi | C) / p(Fi | NC) > θ

Taking logs, define the fragment weights

w(Fi) = log [ p(Fi | C) / p(Fi | NC) ]

and decide "class" when Σ w(Fi) > θ.

Page 27: Object recognition

• Σ w(Fi) > θ

• If Fi = 1, take w(Fi=1) = log [ p(Fi=1 | C) / p(Fi=1 | NC) ]

• If Fi = 0, take w(Fi=0) = log [ p(Fi=0 | C) / p(Fi=0 | NC) ]

• Instead, sum over the detected fragments only: Σ wi > θ, with wi = w(Fi=1) - w(Fi=0). (The constant terms Σi w(Fi=0) are absorbed into the threshold; a sketch follows below.)
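A minimal sketch of this weighting scheme; the conditional probabilities are estimated from binary training detection matrices, and the `eps` clipping (to avoid log 0) is my addition:

```python
import numpy as np

def fragment_weights(F_class, F_nonclass, eps=1e-3):
    """wi = w(Fi=1) - w(Fi=0), estimated from binary detection matrices
    of shape (n_images, n_fragments); eps keeps the logs finite."""
    p1_c  = np.clip(F_class.mean(axis=0),    eps, 1 - eps)   # p(Fi=1 | C)
    p1_nc = np.clip(F_nonclass.mean(axis=0), eps, 1 - eps)   # p(Fi=1 | NC)
    w1 = np.log(p1_c / p1_nc)                                # w(Fi=1)
    w0 = np.log((1 - p1_c) / (1 - p1_nc))                    # w(Fi=0)
    return w1 - w0

def classify(detections, w, theta):
    """Sum the weights of the detected fragments only (Fi = 1) and threshold."""
    return detections @ w > theta
```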

Page 28: Object recognition

Class II

Page 29: Object recognition

Class Non-class

Page 30: Object recognition

Fragments with positions

∑wk Fk > θ

On all detected fragments within their regions

Page 31: Object recognition

Horse-class features

Page 32: Object recognition

Examples of Horses Detected

Page 33: Object recognition

Interest points (Harris), SIFT Descriptors

| Ix²   IxIy |
| IxIy  Iy²  |

Page 34: Object recognition

Harris Corner Operator

H = | <Ix²>   <IxIy> |
    | <IxIy>  <Iy²>  |

Averages < . > are taken within a neighborhood.

Corner: the two eigenvalues λ1, λ2 of H are large.

Indirectly: 'Corner' = det(H) - k·trace²(H)
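A minimal Harris response along these lines, using SciPy; the Gaussian window and k = 0.04 are common default choices rather than values given in the slide:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(image, sigma=1.5, k=0.04):
    """Corner response R = det(H) - k * trace(H)**2 at every pixel."""
    image = image.astype(float)
    Ix = sobel(image, axis=1)                 # horizontal gradient
    Iy = sobel(image, axis=0)                 # vertical gradient
    # Averages <.> within a neighborhood: Gaussian-weighted sums
    Ixx = gaussian_filter(Ix * Ix, sigma)
    Iyy = gaussian_filter(Iy * Iy, sigma)
    Ixy = gaussian_filter(Ix * Iy, sigma)
    det_H = Ixx * Iyy - Ixy ** 2
    trace_H = Ixx + Iyy
    return det_H - k * trace_H ** 2

# Corners are then local maxima of the response above a threshold, e.g.:
# R = harris_response(img); corners = np.argwhere(R > 0.01 * R.max())
```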

Page 35: Object recognition

Harris Corner Examples

Page 36: Object recognition

SIFT descriptor

David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110

Example: 4×4 sub-regions, a histogram of 8 orientations in each.

V = 128 values: g(1,1), …, g(1,8), … , g(16,1), …, g(16,8)
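A much-simplified descriptor along these lines (orientation histograms over a 4×4 grid, 8 bins each, 128 values) can be sketched as below; real SIFT additionally handles scale and rotation normalization, Gaussian weighting, interpolation, and contrast thresholds:

```python
import numpy as np

def simple_sift_like_descriptor(patch, grid=4, bins=8):
    """128-D descriptor: an 8-bin gradient-orientation histogram in each of 4x4 sub-regions.

    `patch` is a square grayscale array (e.g. 16x16) around a keypoint.
    This is a simplified sketch, not the full SIFT of Lowe (2004).
    """
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ori = np.mod(np.arctan2(gy, gx), 2 * np.pi)            # orientation in [0, 2*pi)

    n = patch.shape[0] // grid                              # sub-region side length
    desc = []
    for i in range(grid):
        for j in range(grid):
            m = mag[i * n:(i + 1) * n, j * n:(j + 1) * n].ravel()
            o = ori[i * n:(i + 1) * n, j * n:(j + 1) * n].ravel()
            hist, _ = np.histogram(o, bins=bins, range=(0, 2 * np.pi), weights=m)
            desc.append(hist)
    v = np.concatenate(desc)                                # 4 * 4 * 8 = 128 values
    return v / (np.linalg.norm(v) + 1e-12)                  # normalize against illumination changes
```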

Page 37: Object recognition

SIFT

Page 38: Object recognition

Constellation of Patches, using interest points

Fergus, Perona, Zisserman 2003

Six-part motorcycle model, joint Gaussian.

Page 39: Object recognition

Bag of Words and Unsupervised Classification

Object → Bag of 'words'

Page 40: Object recognition

Bag of visual words: a large collection of image patches

1. Feature detection and representation

• Regular grid – Vogel & Schiele, 2003
– Fei-Fei & Perona, 2005

Page 41: Object recognition

Each class has its own word histogram

Page 42: Object recognition

pLSA: classify documents automatically, find related documents, etc., based on word frequency.

Documents contain different 'topics' such as Economics, Sports, Politics, France… Each topic has its typical word frequencies: Economics will have a high occurrence of 'interest', 'bonds', 'inflation', etc.

We observe the probabilities p(wi | dn) of words in documents.

Each document contains several topics zk. A word has a different probability in each topic, p(wi | zk), and a given document has a mixture of topics, p(zk | dn). The word-frequency model is:

p(wi | dn) = Σk p(wi | zk) p(zk | dn)

pLSA was used to discover topics, and to arrange documents according to their topics.

Page 43: Object recognition

pLSA

The word-frequency model is:

p(wi | dn) = Σk p(wi | zk) p(zk | dn)

We observe p(wi | dn) and find the best p(wi | zk) and p(zk | dn) to explain the data.

pLSA was used to discover topics, and then to arrange documents according to their topics.
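pLSA is normally fitted with EM. A compact dense-array sketch of that procedure for the model above (variable names and iteration counts are illustrative; the full words × topics × docs responsibility array only suits small problems):

```python
import numpy as np

def plsa(counts, n_topics, n_iter=100, seed=0):
    """Fit p(w|z) and p(z|d) by EM for the model p(w|d) = sum_z p(w|z) p(z|d).

    counts: (n_words, n_docs) word-document count matrix.
    """
    rng = np.random.default_rng(seed)
    n_words, n_docs = counts.shape
    p_w_z = rng.random((n_words, n_topics)); p_w_z /= p_w_z.sum(axis=0, keepdims=True)
    p_z_d = rng.random((n_topics, n_docs)); p_z_d /= p_z_d.sum(axis=0, keepdims=True)

    for _ in range(n_iter):
        # E-step: responsibilities p(z | d, w), shape (n_words, n_topics, n_docs)
        joint = p_w_z[:, :, None] * p_z_d[None, :, :]
        joint /= joint.sum(axis=1, keepdims=True) + 1e-12
        # M-step: re-estimate both factors from the expected counts n(w,d) * p(z|d,w)
        expected = counts[:, None, :] * joint
        p_w_z = expected.sum(axis=2)
        p_w_z /= p_w_z.sum(axis=0, keepdims=True) + 1e-12
        p_z_d = expected.sum(axis=0)
        p_z_d /= p_z_d.sum(axis=0, keepdims=True) + 1e-12
    return p_w_z, p_z_d
```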

Page 44: Object recognition

Discovering objects and their location in images

Sivic, Russell, Efros, Freeman & Zisserman, ICCV 2005

Uses simple ‘visual words’ for classification

Not the best classifier, but obtains unsupervised classification, using pLSA

Page 45: Object recognition

Visual words – unsupervised classification

• Four classes: faces, cars, airplanes, motorbikes, plus non-class. Training images are mixed.

• Allowed 7 topics: one per class; the background includes 3 topics.

• Visual words: local patches described with SIFT descriptors (say, local 10×10 patches).

codewords dictionary

Page 46: Object recognition

Learning

• Data: the matrix Dij = p(wi | Ij)
• During learning, discover 'topics' (classes + background)
• p(wi | Ij) = Σk p(wi | Tk) p(Tk | Ij)
• Optimize over p(wi | Tk), p(Tk | Ij)
• The topics are expected to discover the classes
• Got mainly one topic per class image.

Page 47: Object recognition

Results of learning

Page 48: Object recognition

Classifying a new image

• New image I:

• Measure p(wi | I)

• Find topics for the new image:

• p(wi | I) = Σk p(wi | Tk) p(Tk | I)

• Optimize over the topics Tk

• Find the largest (non-background) topic
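Holding the learned p(w | T) fixed and re-running the same EM update only over p(T | I) is one standard way to carry out this "optimize over the topics" step for a single new image (a sketch consistent with the pLSA snippet above; names are illustrative):

```python
import numpy as np

def fold_in(counts_new, p_w_z, n_iter=50, seed=0):
    """Estimate the topic mixture p(T|I) for one new image, with p(w|T) held fixed.

    counts_new: (n_words,) visual-word counts of the new image.
    p_w_z: (n_words, n_topics) word-topic probabilities from the learning step.
    """
    rng = np.random.default_rng(seed)
    n_topics = p_w_z.shape[1]
    p_z_d = rng.random(n_topics); p_z_d /= p_z_d.sum()
    for _ in range(n_iter):
        joint = p_w_z * p_z_d[None, :]                      # (n_words, n_topics)
        joint /= joint.sum(axis=1, keepdims=True) + 1e-12   # p(z | I, w)
        p_z_d = (counts_new[:, None] * joint).sum(axis=0)
        p_z_d /= p_z_d.sum() + 1e-12
    return p_z_d   # classify by the largest non-background topic
```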

Page 49: Object recognition

Classifying a new image

Page 50: Object recognition

On general model learning

• The goal is to classify C using a set of features F.
• F have been selected (they must have high MI(C;F)).
• The next goal is to use F to decide on the class C.

• Probabilistic approach:
• Use observations to learn the joint distribution p(C,F).
• In a new image, F is observed; find the most likely C:
• maxC p(C,F)

Page 51: Object recognition

General model learning

• To learn the joint distribution p(C,F):
• The model is of the form pθ(C,F)
  – Or: pθ(C,X,F)

• For example, for words in documents we had:
  – p(w,D) = Π p(wi,D)
  – p(wi | D) = Σk p(wi | Tk) p(Tk | D)

• Training examples are used to determine the optimal θ by maximizing pθ(data):
  – max(C,X,θ) pθ(C,X,F)

• When θ is known, classify a new example (see the sketch below):
  – max(C,X) pθ(C,X,F)
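For fully observed, discrete C and F this recipe reduces to counting; the sketch below (illustrative names, no latent X) estimates p(C,F) by maximum likelihood and classifies a new F by maximizing over C. With latent variables, as in pLSA, the maximization over θ would instead be done with EM.

```python
import numpy as np
from collections import Counter

def fit_joint(train_C, train_F):
    """Maximum-likelihood estimate of p(C, F) from (class, feature-vector) examples."""
    counts = Counter(zip(train_C, map(tuple, train_F)))
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}

def classify(p_joint, F, classes):
    """Return argmax over C of p(C, F) for an observed feature vector F."""
    return max(classes, key=lambda c: p_joint.get((c, tuple(F)), 0.0))
```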