Using classifiers to compute similarities between face images. Prof. Lior Wolf, Tel-Aviv University


Description

Prof. Lior Wolf, Tel-Aviv University. He is a faculty member at the School of Computer Science at Tel-Aviv University. Previously, he was a post-doctoral associate in Prof. Poggio's lab at MIT. He graduated from the Hebrew University, Jerusalem, where he worked under the supervision of Prof. Shashua. He was awarded the 2008 Sackler Career Development Chair, the Colton Excellence Fellowship for new faculty (2006-2008), the Max Shlumiuk award for 2004, and the Rothschild fellowship for 2004. His joint work with Prof. Shashua at ECCV 2000 received the best paper award, and their work at ICCV 2001 received a Marr Prize honorable mention. He was also awarded the best paper award at the post-ICCV workshop on eHeritage in 2009. In addition, Lior has held several development, consulting, and advisory positions in computer vision companies, including face.com and Superfish, and is a co-founder of FDNA.

Presentation topic: Using classifiers to compute similarities between images of faces.

Key points: The One-Shot Similarity (OSS) is a framework for classifier-based similarity functions. It is based on the use of background samples and was shown to excel in tasks ranging from face recognition to document analysis. In this talk we will present the framework as well as the following results: (1) when using a version of LDA as the underlying classifier, this score is a Conditionally Positive Definite kernel and may be used within kernel methods (e.g., SVM); (2) OSS can be efficiently computed; and (3) a metric learning technique that is geared toward improved OSS performance.

Transcript of Using classifiers to compute similarities between face images. Prof. Lior Wolf, Tel-Aviv University

1

LEARNING VISUAL SIMILARITY USING CLASSIFIERS
Lior Wolf, The Blavatnik School of Computer Science, Tel-Aviv University

Collaborators (students): Yaniv Taigman (face.com), Tal Hassner (Open University), Orit Kliper-Gross (Weizmann Institute), Itay Maoz (Tel-Aviv University)

The Blavatnik School of Computer Science, Tel-Aviv University

An example of higher education in Israel

2

• A school in the Faculty of Exact Sciences, which also includes Mathematics, Physics, Chemistry, Geophysics and Planetary Sciences
• Originated in the 1970s as part of the School of Mathematics; a separate school since 2000
• 39 faculty members
• ~1000 undergraduates
• ~200 MSc students
• ~70 PhD students
• Post-docs and other research personnel

3

School ranking in the world

• TAU/CS ranked #29 in number of citations, Thomson Scientific, for the years 2000-2010 [Technion #33, Weizmann #72, HebrewU #105]
• TAU/CS ranked #28 by the Shanghai Academic Ranking of World Universities in Computer Science, 2011 [Weizmann #12, Technion #15, HebrewU #21]
• TAU/CS ranked #14 in the world in CS impact (Scientometrics, Vol. 76, No. 2, 2008)
• 12 TAU/CS faculty in positions 1-100 in the "list of most central computer scientists in Theory of Computer Science" (Kuhn & Wattenhofer, SIGACT News, Dec '07)

4

Computer vision in search

[Diagram: raw data (images, video, audio) → preprocessing → information (objects, tags, IDs, context); a query against this information yields search results.]

5

The pain: too many images

On Facebook: over 1,000,000,000 photos uploaded each month, shared by 200,000,000+ users; tens of billions served per week. No tags, no photos…

"can I see all my photos?"
"tagging takes hours, can you do that for me?"

6

The evolution of perceptual search

[Diagram: a progression from no vision, through low-level and mid-level vision, toward high-level vision (scene understanding): text-based image search → search with basic properties → specialization in face identification → catalog-based search → gist-based image similarity → re-ranking by similarity.]

7

Photo Finder for Facebook

8

9

THE 1st MOBILE APP TO FIND 3D ITEMS

10

WHAT MAKES IT SO HARD?

High-level vision: what is where?

High-level vision: scene understanding

A happy couple walks in a field. What kind of field? Where? Which season? How old are they? Gender? How attractive? What are they wearing?

11

LEARNING VISUAL SIMILARITY USING CLASSIFIERS
Lior Wolf, The Blavatnik School of Computer Science, Tel-Aviv University

Collaborators (students): Yaniv Taigman (face.com), Tal Hassner (Open University), Orit Kliper-Gross (Weizmann Institute), Itay Maoz (Tel-Aviv University)

YaC, Moscow, September 19, 2011

12

13

The Pair-Matching Problem

Training:

14

The Pair-Matching Problem

Training:

Modeling never-before-seen objects. A natural setup for image retrieval with no categories.

15

Instances

Face Recognition
Video Face Recognition
Document Analysis
Video Action Recognition

16

The Pair-Matching Problem

Training:

17

Labeled Faces in the Wild (LFW)

Training: 13,000 labeled images of faces collected from the web; 5,749 individuals; 1-150 images per individual.

18

Restricted Protocol

10-fold cross-validation tests on randomly generated splits, each with 300 same pairs and 300 not-same pairs.
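A minimal sketch of this evaluation loop (the data layout, the function names, and the choice of reporting a standard error are illustrative; LFW supplies the actual splits):

```python
import numpy as np

def ten_fold_accuracy(splits, train_and_eval):
    """splits: list of 10 folds, each holding 600 labeled pairs
    (300 same + 300 not-same). `train_and_eval(train_folds, test_fold)`
    trains on nine folds and returns accuracy on the held-out one."""
    accs = np.array([
        train_and_eval([s for j, s in enumerate(splits) if j != i],
                       splits[i])
        for i in range(len(splits))
    ])
    # Mean accuracy and its standard error, as on the results slide.
    return accs.mean(), accs.std(ddof=1) / np.sqrt(len(accs))
```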

19

Pipeline (take 1)*

* “Descriptor Based Methods in the Wild,” ECCVw’08

[Diagram: training pairs labeled "same" / "not same" (note: no use of subject labels!). For each pair i = 1, 2, …, a single similarity score Sim(·,·) is computed; a classifier (e.g., SVM) then learns a threshold on this score.]

20

Pipeline (take 1)*

* “Descriptor Based Methods in the Wild,” ECCVw’08

[Diagram: training with multiple descriptors \ similarities over "same" / "not same" pairs. Each pair i is represented by a vector of scores $(s_{i,1}, s_{i,2}, \dots, s_{i,n})$, one per descriptor-similarity combination; a classifier (e.g., SVM) is trained on these vectors.]

21

Some Questions

How to represent the images? Grayscale, edge responses [Brunelli & Poggio '93], C1-Gabor [e.g., Riesenhuber & Poggio '99], SIFT [Lowe '04], LBP [e.g., Ojala, Pietikainen & Harwood '96], …

Which similarity to use? L2, correlation, learned metrics [e.g., Bilenko et al. '04, Cristianini et al. '02, Hertz et al. '04, …], "hand-crafted" metrics [e.g., Belongie et al. '01]

Later on: how can subject IDs help improve pair-matching performance?
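As one concrete example of a listed representation, an LBP face descriptor can be computed along these lines (a sketch using scikit-image; the grid size and LBP parameters are illustrative choices, not the slides' exact settings):

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_descriptor(gray_image, grid=(7, 7), P=8, R=1):
    """Concatenated histograms of uniform LBP codes over a grid of
    image blocks, a common face representation."""
    lbp = local_binary_pattern(gray_image, P, R, method='uniform')
    n_codes = P + 2                      # uniform patterns + "other"
    feats = []
    for row in np.array_split(lbp, grid[0], axis=0):
        for block in np.array_split(row, grid[1], axis=1):
            h, _ = np.histogram(block, bins=n_codes, range=(0, n_codes))
            feats.append(h / max(h.sum(), 1))  # per-block normalization
    return np.concatenate(feats)
```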

22

One-Shot Similarity (OSS) Score*

What: a measure of the similarity between two vectors.
Input: the two vectors, plus a set of "background samples".
How: use "one-shot learning" (classification with one positive example).

* “Descriptor Based Methods in the Wild,” ECCVw’08 “The One-Shot Similarity Kernel”, ICCV’09

23

Computing the "One-Shot" Similarity

Given two vectors p and q and a set "A" of background examples:

Step a: Model1 = train(p, A)
Step b: Score1 = classify(q, Model1)
Step c: Model2 = train(q, A)
Step d: Score2 = classify(p, Model2)

One-Shot-Sim = (Score1 + Score2) / 2

[Diagram: p and q compared via classifiers trained against the background set A.]
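A minimal sketch of these four steps in Python, assuming scikit-learn's LinearDiscriminantAnalysis as the underlying classifier (the function name and the use of decision_function as the confidence score are illustrative choices):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def one_shot_similarity(p, q, A):
    """Steps a-d of the slide: two one-shot classifiers, averaged."""
    def score(pos, probe):
        # Steps a/c: train a binary classifier with the single positive
        # example `pos` against the background set A.
        X = np.vstack([pos[None, :], A])
        y = np.r_[1, np.zeros(len(A))]
        model = LinearDiscriminantAnalysis().fit(X, y)
        # Steps b/d: signed confidence that `probe` is the positive class.
        return model.decision_function(probe[None, :])[0]

    return 0.5 * (score(p, q) + score(q, p))

# Usage with toy data:
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 16))            # background samples
p, q = rng.normal(size=16), rng.normal(size=16)
print(one_shot_similarity(p, q, A))
```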

24

Euclidean vs. One-Shot Visualized

[Figure: similarity maps induced by the Euclidean distance vs. the One-Shot similarity on sample data.]

25

Euclidean vs. One-Shot Visualized

[Figure: a second example of Euclidean vs. One-Shot similarity maps.]

26

Computing the “One-Shot” Similarity

* “The One-Shot Similarity Kernel”, ICCV’09

Using LDA as the underlying classifier:

$$\mathrm{OSS}(x_i, x_j, A) = \frac{(x_i-\mu_A)^{\top} S_W^{+}\left(x_j-\frac{x_i+\mu_A}{2}\right)}{\left\|S_W^{+}(x_i-\mu_A)\right\|} + \frac{(x_j-\mu_A)^{\top} S_W^{+}\left(x_i-\frac{x_j+\mu_A}{2}\right)}{\left\|S_W^{+}(x_j-\mu_A)\right\|}$$

where $\mu_A$ is the mean of set $A$, and $S_W^{+}$ is the pseudo-inverse of the intra-class covariance matrix.

27

Computing the “One-Shot” Similarity

* “The One-Shot Similarity Kernel”, ICCV’09

Using Free-Scale LDA as the underlying classifier:

$$\mathrm{OSS}_{FS}(x_i, x_j, A) = (x_i-\mu_A)^{\top} S_W^{+}\left(x_j-\frac{x_i+\mu_A}{2}\right) + (x_j-\mu_A)^{\top} S_W^{+}\left(x_i-\frac{x_j+\mu_A}{2}\right)$$

where $\mu_A$ is the mean of set $A$, and $S_W^{+}$ is the pseudo-inverse of the intra-class covariance matrix.
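A direct NumPy transcription of this closed form might look as follows (a sketch; estimating $S_W$ as the covariance of the background set is a simplification of the paper's intra-class estimator):

```python
import numpy as np

def free_scale_oss(xi, xj, A):
    """Closed-form One-Shot Similarity with Free-Scale LDA.

    A : (m, d) array of background samples.
    """
    mu = A.mean(axis=0)
    # Pseudo-inverse of the (simplified) intra-class covariance; note it
    # depends only on A, not on the two vectors being compared.
    Sw_pinv = np.linalg.pinv(np.cov(A, rowvar=False))
    s1 = (xi - mu) @ Sw_pinv @ (xj - (xi + mu) / 2)
    s2 = (xj - mu) @ Sw_pinv @ (xi - (xj + mu) / 2)
    return s1 + s2
```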

28

Some Properties of the OSS*

* “The One-Shot Similarity Kernel”, ICCV’09

• Uses unlabeled training data.
• OSS based on Free-Scale LDA is a CPD (Conditionally Positive Definite) kernel.
• May be efficiently computed. Complexity: $S_W^{+}$ is independent of the two vectors compared, and so is computed only once. Also, repeated comparisons of a vector $x_i$ to different vectors $x_j$ may be performed in O(n).
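The efficiency claim can be realized for the Free-Scale variant by caching one projection vector and one scalar per descriptor, after which each comparison costs two dot products; a sketch under those assumptions (all names illustrative):

```python
import numpy as np

def build_cache(X, A):
    """Per-vector cache so each OSS_FS comparison is linear in d.

    Returns (W, b) with W[k] = Sw^+ (x_k - mu_A) and
    b[k] = W[k] . (x_k + mu_A) / 2.
    """
    mu = A.mean(axis=0)
    Sw_pinv = np.linalg.pinv(np.cov(A, rowvar=False))  # computed once
    W = (X - mu) @ Sw_pinv.T
    b = np.einsum('kd,kd->k', W, (X + mu) / 2)
    return W, b

def oss_fs_cached(X, i, j, W, b):
    # OSS_FS(x_i, x_j, A) as two linear evaluations.
    return (W[i] @ X[j] - b[i]) + (W[j] @ X[i] - b[j])
```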

29

Some Properties of the OSS*

* “The One-Shot Similarity Kernel”, ICCV’09

30

Some Properties of the OSS*

* “The One-Shot Similarity Kernel”, ICCV’09

OSS based on Free-Scale LDA is a CPD Kernel

31

Metric learning for OSS*

*“One Shot Similarity Metric Learning for Action Recognition”, In submission.

Instead of examples $x_i$, use $Tx_i$ for some "optimal" transformation $T$. Define

$$\mathrm{OSS_{ML}}(x_i, x_j, A, T) = \mathrm{OSS}(Tx_i, Tx_j, TA)$$

The transformation $T$ is obtained by a gradient-descent procedure that optimizes the score

$$f(T) = \sum_{\text{same pairs}} \mathrm{OSS_{ML}}(x_i, x_j, A, T) - \sum_{\text{not-same pairs}} \mathrm{OSS_{ML}}(x_i, x_j, A, T)$$
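A minimal sketch of such a procedure, using the Free-Scale closed form and a finite-difference gradient purely for clarity (the paper's method would use an analytic gradient; all names, step sizes, and the lack of regularization are assumptions of this sketch):

```python
import numpy as np

def oss_fs(xi, xj, A):
    # Free-Scale LDA OSS, as in the closed form above.
    mu = A.mean(axis=0)
    S = np.linalg.pinv(np.cov(A, rowvar=False))
    return ((xi - mu) @ S @ (xj - (xi + mu) / 2)
            + (xj - mu) @ S @ (xi - (xj + mu) / 2))

def objective(T, same, not_same, A):
    # f(T): summed OSS_ML over same pairs minus over not-same pairs.
    g = lambda pair: oss_fs(T @ pair[0], T @ pair[1], A @ T.T)
    return sum(map(g, same)) - sum(map(g, not_same))

def learn_T(same, not_same, A, dim, lr=1e-3, eps=1e-5, steps=50):
    """Gradient ascent on f(T); finite differences for clarity only."""
    T = np.eye(dim)
    for _ in range(steps):
        grad = np.zeros_like(T)
        base = objective(T, same, not_same, A)
        for idx in np.ndindex(*T.shape):
            T2 = T.copy(); T2[idx] += eps
            grad[idx] = (objective(T2, same, not_same, A) - base) / eps
        T += lr * grad
    return T
```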

32

The Unrestricted Protocol

10-fold cross-validation tests on randomly generated splits, each with 300 same pairs and 300 not-same pairs.

Training now includes subject labels.

33

Multiple One-Shots*

We now have IDs. How do we use them? Compute multiple OSS scores, each time using background examples from a single class.

* “Multiple One-Shots for Utilizing Class Label Information,” BMVC’09

34

Multiple One-Shots

ID-based OSS

35

Multiple One-Shots

We now have IDs. How do we use them? Compute multiple OSS scores, each time using background examples from a single class. Discrimination is then based on different sources of variation: subject ID, pose, etc. (see the sketch below).
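A sketch of this idea (the function names are illustrative; `oss` can be, e.g., the Free-Scale closed form sketched earlier):

```python
import numpy as np

def multi_oss_features(xi, xj, backgrounds, oss):
    """One OSS score per class-specific background set.

    backgrounds: dict mapping a class label (a subject ID, a pose
    bin, ...) to an (m_c, d) array of that class's examples. The
    returned score vector feeds the final same/not-same classifier.
    """
    return np.array([oss(xi, xj, A_c) for A_c in backgrounds.values()])
```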

36

The Pose Issue

Most confident wrong results*

* “Descriptor Based Methods in the Wild,” ECCVw’08

37

Getting Poses

• 7 fiducial points (eyes, mouth, nose)
• 14 x,y coordinates
• 14-D vector of alignment errors (similarity transform)
• Project onto the first principal component
• Bin into 10 classes

To compute pose-based OSS, you need sets of images in the same pose… (see the sketch below)
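A sketch of this pose-binning recipe in NumPy (the equal-width binning and all names are illustrative assumptions):

```python
import numpy as np

def pose_bins(alignment_errors, n_bins=10):
    """Bin faces by pose from their 14-D alignment-error vectors.

    alignment_errors: (n, 14) residuals of a similarity transform
    fit to 7 fiducial points (2 coordinates each).
    """
    X = alignment_errors - alignment_errors.mean(axis=0)
    # First principal component via SVD of the centered data.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    proj = X @ Vt[0]
    # Equal-width bins over the projection -> 10 pose classes.
    edges = np.linspace(proj.min(), proj.max(), n_bins + 1)[1:-1]
    return np.digitize(proj, edges)
```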

38

Multiple One-Shots

Pose-based OSS

39

Multiple One-Shots - Examples

Identity / Pose

5 ID-based OSS and 5 pose-based OSS scores

40

Multiple One-Shots - Examples

Identity / Pose

41

Multiple One-Shots - Examples

Identity / Pose

42

Pipeline*

* “Multiple One-Shots for Utilizing Class Label Information,” BMVC’09

Input image pair

Image alignment

Commercial alignment software by

43

Pipeline*

* “Multiple One-Shots for Utilizing Class Label Information,” BMVC’09

Input image pair

Image alignment

Feature vectors

Using:
• SIFT [Lowe '04]
• LBP [Ojala et al. '96, '01, '02]
• TPLBP, FPLBP [Wolf et al. '08]

44

Pipeline*

* “Multiple One-Shots for Utilizing Class Label Information,” BMVC’09

Input image pair

Image alignment

Feature vectors

PCA+ITML

Information Theoretic Metric Learning [Davis et al. '07]
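A sketch of this stage, assuming the third-party metric-learn package as a stand-in for the original ITML implementation of Davis et al. '07 (the package choice, dimensionality, and default parameters are assumptions):

```python
import numpy as np
from sklearn.decomposition import PCA
from metric_learn import ITML_Supervised  # third-party stand-in for ITML

def pca_itml(features, subject_ids, n_components=100):
    """Compress descriptors with PCA, then learn an ITML metric from
    subject labels and map the data into that metric space."""
    reduced = PCA(n_components=n_components).fit_transform(features)
    itml = ITML_Supervised().fit(reduced, subject_ids)
    return itml.transform(reduced)
```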

45

Pipeline*

* “Multiple One-Shots for Utilizing Class Label Information,” BMVC’09

Input image pair

Image alignment

Feature vectors

PCA+ITML

Multiple OSS scores

20 subjects, 10 poses

46

Pipeline*

* “Multiple One-Shots for Utilizing Class Label Information,” BMVC’09

Input image pair

Image alignment

Feature vectors

PCA+ITML

Multiple OSS scores

SVM classifier

Output: same \ not-same
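A sketch of this final stage (the shapes, random stand-in data, and linear kernel are illustrative): the 30 OSS scores per pair (20 subject-based + 10 pose-based) feed a standard SVM.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
oss_scores = rng.normal(size=(600, 30))      # stand-in for real OSS vectors
labels = np.r_[np.ones(300), np.zeros(300)]  # 300 same, 300 not-same pairs

clf = SVC(kernel='linear').fit(oss_scores, labels)
same_or_not = clf.predict(oss_scores[:5])    # 1 = same, 0 = not-same
```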

47

Pipeline – Multiple Descriptors*

* “Multiple One-Shots for Utilizing Class Label Information,” BMVC’09

[Diagram: image alignment → per-descriptor feature vectors (SIFT; LBP) → PCA+ITML per descriptor → multiple OSS scores per descriptor → a single SVM classifier → output: same \ not-same.]

48

Results

0.7847 ± 0.0051 [WHT '08]
0.8398 ± 0.0035 [WHT '08 + alignment]
0.8517 ± 0.0061 [this work, only LBP]
0.8950 ± 0.0051 [this work, multi-desc.]
0.9753 [Kumar et al. '09, HUMAN]

49

Pair-Matching of Sets

* "Face Recognition in Unconstrained Videos with Matched Background Similarity," CVPR 2011.

50

Pair-Matching of Sets

Training:

51

Conventional methods

• All-pairs comparison: distances between all frames of the first video and all frames of the second video.
• Pose-based methods: compare the two most frontal faces in each video, or the two faces with the most similar pose.
• Algebraic set-to-set methods, such as max correlation, projection, and Procrustes.
• Non-algebraic methods, such as PMK and LLC.

52

Matched B/G similarity

• X1 & X2: sets of video-frame descriptors.
• B: background set of faces.

Similarity = MBGS(X1, X2, B):
  B1 = Find_Nearest_Neighbors(X1, B)
  Model1 = train(X1, B1)
  Confidences1 = classify(X2, Model1)
  Sim1 = mean(Confidences1)

[Diagram: frame sets X1 and X2 compared to produce a similarity score.]

53

Matched B/G similarity

• X1 & X2: sets of video-frame descriptors.
• B: background set of faces.

Similarity = MBGS(X1, X2, B):
  B1 = Find_Nearest_Neighbors(X1, B)
  Model1 = train(X1, B1)
  Confidences1 = classify(X2, Model1)
  Sim1 = mean(Confidences1)
  B2 = Find_Nearest_Neighbors(X2, B)
  Model2 = train(X2, B2)
  Confidences2 = classify(X1, Model2)
  Sim2 = mean(Confidences2)
  Similarity = (Sim1 + Sim2) / 2
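A minimal Python sketch of this procedure (the classifier choice, the neighbor count k, and the deduplication of neighbors are assumptions; the paper's exact training details are simplified):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import LinearSVC

def mbgs(X1, X2, B, k=50):
    """Matched Background Similarity between two frame-descriptor sets.

    X1, X2: (n_i, d) frame descriptors per video; B: (m, d) background faces.
    """
    def one_side(Xa, Xb):
        # Select the background faces most similar to Xa (the "matched" B/G).
        nn = NearestNeighbors(n_neighbors=k).fit(B)
        idx = np.unique(nn.kneighbors(Xa, return_distance=False))
        Ba = B[idx]
        # Train Xa-vs-matched-background, then score the other set's frames.
        X = np.vstack([Xa, Ba])
        y = np.r_[np.ones(len(Xa)), np.zeros(len(Ba))]
        clf = LinearSVC().fit(X, y)
        return clf.decision_function(Xb).mean()

    return 0.5 * (one_side(X1, X2) + one_side(X2, X1))
```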

54

Thank you!

Software available: http://www.cs.tau.ac.il/~wolf