Christopher M. Bishop Object Recognition: A Statistical Learning Perspective Microsoft Research, Cambridge Sicily, 2003


Page 1:

Christopher M. Bishop

Object Recognition: A Statistical Learning Perspective

Microsoft Research, Cambridge

Sicily, 2003

Page 2:

Object Recognition Workshop, Sicily

Christopher M. Bishop

Question 1

• “Will visual category recognition be solved by an architecture based on classification of feature vectors using advanced learning algorithms?”

• No
– large number of classes
– many degrees of freedom of variability (geometric, photometric, ...)
– transformations are highly non-linear in the pixel values (objects live on non-linear manifolds)
– occlusion
– expensive to provide detailed labelling of training data

Page 3:

Question 2

• “If we want to achieve a human-like capacity to recognise 1000s of visual categories, learning from a few examples, what will move us forward most significantly?”

• Large training sets
– algorithms which can effectively utilize lots of unlabelled/partially labelled data
• But: should the models be generative or discriminative?

Page 4:

Generative vs. Discriminative Models

• Generative approach: separately model class-conditional densities and priors, then evaluate posterior probabilities using Bayes’ theorem

• Discriminative approaches:

1. model posterior probabilities directly

2. just predict class label (no inference stage)
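The three options above can be contrasted in a minimal sketch. Everything numeric here is an assumption made for illustration and does not come from the slides: a one-dimensional input, two classes named C1 and C2, Gaussian class-conditional densities with means 0 and 2 and unit variance, and equal priors. The logistic weights in `posterior_direct` are simply the values that reproduce the generative posterior for those assumed densities; a real discriminative model would learn them from data.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian, used as the class-conditional p(x | C_k)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# Hypothetical two-class problem: (mean, std) of p(x | C_k), and priors p(C_k).
CLASS_CONDITIONALS = {"C1": (0.0, 1.0), "C2": (2.0, 1.0)}
PRIORS = {"C1": 0.5, "C2": 0.5}

def posterior(x, class_conditionals=CLASS_CONDITIONALS, priors=PRIORS):
    """Generative route via Bayes' theorem:
    p(C_k | x) = p(x | C_k) p(C_k) / sum_j p(x | C_j) p(C_j)."""
    joint = {k: gaussian_pdf(x, *class_conditionals[k]) * priors[k] for k in priors}
    evidence = sum(joint.values())
    return {k: v / evidence for k, v in joint.items()}

def posterior_direct(x, w=-2.0, b=2.0):
    """Discriminative option 1: model p(C1 | x) directly as a logistic sigmoid.
    w and b are assumed values matching the Gaussians above, not learned ones."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def predict_label(x, threshold=1.0):
    """Discriminative option 2: map x straight to a label, with no probabilities.
    The threshold is the midpoint of the two assumed class means."""
    return "C1" if x < threshold else "C2"
```

For example, `posterior(0.0)` and `posterior_direct(0.0)` agree on p(C1 | x = 0), whereas `predict_label(0.0)` returns only the label and so gives nothing to feed into a later decision stage.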

Page 5:

Generative vs. Discriminative

Page 6:

Advantages of Knowing Posterior Probabilities

• No re-training if loss matrix changes
– inference hard, decision stage is easy
• Reject option: don’t make decision when largest probability is less than threshold
• Compensating for skewed class priors
• Combining models
– e.g. independent measurements:
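Each of the four advantages corresponds to a small operation on the posteriors, sketched below. The function names, class names and numbers are hypothetical. The formula that followed "independent measurements" is not in this transcript, so `combine_independent` assumes the standard result for measurements that are conditionally independent given the class: p(C_k | x_A, x_B) is proportional to p(C_k | x_A) p(C_k | x_B) / p(C_k).

```python
def normalise(scores):
    """Rescale non-negative scores so that they sum to one."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}

def decide(posteriors, loss, reject_threshold=None):
    """Decision stage: choose the decision with minimum expected loss.

    loss[true_class][decision] is the cost of making `decision` when the
    truth is `true_class`. Returns None (reject) when the largest posterior
    is below reject_threshold.
    """
    if reject_threshold is not None and max(posteriors.values()) < reject_threshold:
        return None
    decisions = next(iter(loss.values())).keys()
    expected = {d: sum(posteriors[k] * loss[k][d] for k in posteriors)
                for d in decisions}
    return min(expected, key=expected.get)

def reweight_priors(posteriors, train_priors, new_priors):
    """Compensate for skewed priors: divide out the training-set prior,
    multiply by the prior expected at test time, then renormalise."""
    return normalise({k: posteriors[k] * new_priors[k] / train_priors[k]
                      for k in posteriors})

def combine_independent(post_a, post_b, priors):
    """Combine two models under the assumed conditional independence:
    p(C_k | x_A, x_B) proportional to p(C_k | x_A) p(C_k | x_B) / p(C_k)."""
    return normalise({k: post_a[k] * post_b[k] / priors[k] for k in priors})
```

With posteriors of 0.7 for C1 and 0.3 for C2, a 0/1 loss matrix gives the decision C1. Raising the cost of missing C2 to 100 flips the decision to C2 without touching the model, and a reject threshold of 0.8 returns `None`.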

Page 7:

Unlabelled Data

[Figure legend: Class 1, Class 2, Test point]

Page 8:

Unlabelled Data

Page 9:

Generative Methods

• Relatively straightforward to characterize invariances
• They can handle partially labelled data
• They wastefully model variability which is unimportant for classification
• They scale badly with the number of classes and the number of invariant transformations (slow on test data)
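One common way a generative model can use partially labelled data is expectation-maximisation (EM) on a mixture model, sketched below. The slides do not name a method, so EM is an assumption here, as are the two-class, one-dimensional Gaussian setup, the shared fixed standard deviation, and the function name `fit_partially_labelled`. Labelled points keep their known class; unlabelled points contribute through their posterior responsibilities.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def fit_partially_labelled(points, n_iter=50, std=1.0):
    """EM for a two-class 1-D Gaussian mixture with a shared, fixed std.

    points is a list of (x, label) pairs with label 0, 1, or None (unlabelled).
    Needs at least one labelled example per class. Returns (means, priors).
    """
    # Initialise each class mean from its labelled examples only.
    means = []
    for k in (0, 1):
        xs = [x for x, y in points if y == k]
        means.append(sum(xs) / len(xs))
    priors = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: class responsibilities; labelled points keep their known class.
        resp = []
        for x, y in points:
            if y is None:
                joint = [priors[k] * gaussian_pdf(x, means[k], std) for k in (0, 1)]
                total = sum(joint)
                resp.append([j / total for j in joint])
            else:
                resp.append([1.0 if y == k else 0.0 for k in (0, 1)])
        # M-step: re-estimate means and priors from the soft counts.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, (x, _) in zip(resp, points)) / nk
            priors[k] = nk / len(points)
    return means, priors
```

With one labelled point per class at 1.0 and 5.0, and unlabelled points at -1, 0, 6 and 7, the estimated means move from the labelled values towards roughly 0 and 6, where the unlabelled data place the clusters.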

Page 10:

Discriminative Methods

• They use the flexibility of the model in relevant regions of input space
• They can be extremely fast once trained
• They interpolate between training examples, and hence can fail if novel inputs are presented
• They don’t easily handle compositionality (e.g. faces can have glasses and/or moustaches and/or hats)

Page 11:

Hybrid Approaches

• Generatively inspired models, trained discriminatively
– state of the art in speech recognition
– hidden Markov model handles time-warp invariances
– parameters determined by maximum mutual information, not maximum likelihood
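The difference between the two training criteria can be written out for the same generative model. The notation is an assumption, not taken from the slides: \(\mathbf{x}_n\) is an observation sequence, \(c_n\) its correct class, \(\theta\) the model parameters (e.g. of the hidden Markov models), and \(p(c)\) a fixed class prior. With the prior held fixed, the maximum mutual information criterion is equivalent to maximising the posterior probability of the correct class.

```latex
\begin{align}
\theta_{\mathrm{ML}}  &= \arg\max_{\theta} \sum_{n} \ln p(\mathbf{x}_n \mid c_n, \theta), \\
\theta_{\mathrm{MMI}} &= \arg\max_{\theta} \sum_{n} \ln
  \frac{p(\mathbf{x}_n \mid c_n, \theta)\, p(c_n)}
       {\sum_{j} p(\mathbf{x}_n \mid c_j, \theta)\, p(c_j)} .
\end{align}
```

Maximum likelihood fits each class model to its own data in isolation. The denominator in the second objective also penalises probability mass that the competing class models assign to \(\mathbf{x}_n\), which is what makes the training discriminative.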