Using classifiers to compute similarities between face images. Prof. Lior Wolf, Tel-Aviv University
1
LEARNING VISUAL SIMILARITY USING CLASSIFIERS
Lior Wolf, The Blavatnik School of Computer Science, Tel-Aviv University
Collaborators (students): Yaniv Taigman (face.com), Tal Hassner (Open University), Orit Klipper-Gross (Weizmann Institute), Itay Maoz (Tel-Aviv University)
The Blavatnik School of Computer Science, Tel-Aviv University
An example of higher education in Israel
• A school in the Faculty of Exact Sciences, which also includes Mathematics, Physics, Chemistry, and Geophysics and Planetary Sciences
• Originated in the 1970s as part of the School of Mathematics; a separate school since 2000
• 39 faculty members, ~1000 undergrads, ~200 MSc students, ~70 PhD students, plus post-docs and other research personnel
3
School ranking in the world
• TAU/CS ranked #29 in number of citations, Thomson Scientific, for the years 2000-2010 [Technion #33, Weizmann #72, HebrewU #105]
• TAU/CS ranked #28 in Computer Science by the Shanghai Academic Ranking of World Universities, 2011 [Weizmann #12, Technion #15, HebrewU #21]
• TAU/CS ranked #14 in the world in CS impact, Scientometrics, Vol. 76, No. 2, 2008
• 12 TAU/CS faculty in positions 1-100 in the "list of most central computer scientists in Theory of Computer Science" (Kuhn & Wattenhofer, SIGACT News, Dec '07)
4
Computer vision in search
[Diagram: preprocessing turns raw data (images, video, audio) into information (objects, tags, IDs, context); a query is matched against this information to produce search results.]
5
The pain: too many images
Over 1,000,000,000 photos uploaded each month, shared by 200,000,000+ users; tens of billions served per week. No tags, no photos…
On Facebook: "can I see all my photos?" "tagging takes hours, can you do that for me?"
6
The evolution of perceptual search
• Text-based image search
• Search with basic properties
• Specialization in face identification
• Catalog-based search
• Gist-based image similarity
• Reranking by similarity
From no vision, through low-level and mid-level vision, to high-level vision: scene understanding
7
Photo Finder for Facebook
8
9
THE 1st MOBILE APP TO FIND 3D ITEMS
10
WHAT MAKES IT SO HARD?
High-level vision: what is where? Scene understanding.
"A happy couple walks in a field."
What kind of field? Where? Which season? How old are they? What gender? How attractive? What are they wearing?
11
YaC, Moscow, September 19, 2011
12
13
The Pair-Matching Problem
Training:
14
The Pair-Matching Problem
Training:
Modeling never-before-seen objects. A natural setup for image retrieval with no categories.
15
Instances:
• Face Recognition
• Video Face Recognition
• Document Analysis
• Video Action Recognition
16
The Pair-Matching Problem
Training:
17
Labeled Faces in the Wild (LFW)
Training: 13,000 labeled images of faces collected from the web; 5,749 individuals, 1-150 images per individual.
18
Restricted Protocol
10-fold cross-validation tests on randomly generated splits, each with 300 same pairs and 300 not-same pairs.
19
Pipeline (take 1)*
* "Descriptor Based Methods in the Wild," ECCVw'08
Training (note: no use of identity labels!): for each image pair i = 1, 2, …, compute a similarity score Sim(·, ·) between the two images, labeled same or not-same; a classifier (e.g. SVM) then learns a threshold on these scores.
20
Pipeline (take 1)*
Training with multiple descriptors / similarities: each pair i is represented by a vector of scores (s_{i,1}, s_{i,2}, …, s_{i,n}), one per descriptor-similarity combination, labeled same or not-same, and a classifier (e.g. SVM) is trained on these vectors.
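The per-pair similarity vectors can be sketched as follows; the descriptor names and the two similarity measures here are illustrative placeholders, not the exact set used in the talk:

```python
import numpy as np

def neg_l2(a, b):
    # negative Euclidean distance, so larger = more similar
    return -float(np.linalg.norm(a - b))

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

SIMS = [neg_l2, cosine]   # illustrative similarity measures

def similarity_vector(feats1, feats2):
    """One image pair -> vector (s_1, ..., s_n): one score per
    (descriptor, similarity-measure) combination."""
    return np.array([sim(feats1[name], feats2[name])
                     for name in feats1 for sim in SIMS])
```

These vectors, labeled same / not-same, are then fed to a binary classifier such as a linear SVM.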
21
Some Questions
How to represent the images? Grayscale, edge responses [Brunelli & Poggio '93], C1-Gabor [e.g., Riesenhuber & Poggio '99], SIFT [Lowe '04], LBP [e.g., Ojala, Pietikainen & Harwood '96], …
Which similarity to use? L2, correlation, learned metrics [e.g., Bilenko et al. '04, Cristianini et al. '02, Hertz et al. '04, …], "hand-crafted" metrics [e.g., Belongie et al. '01]
Later on: how can subject IDs help improve pair-matching performance?
22
One-Shot Similarity (OSS) Score*
What: a measure of the similarity between two vectors.
Input: the two vectors, and a set of "background samples".
How: use "one-shot learning" (classification with one positive example).
* "Descriptor Based Methods in the Wild," ECCVw'08; "The One-Shot Similarity Kernel," ICCV'09
23
Computing the "One-Shot" Similarity
Given vectors p and q, and a set A of background examples:
Step a: Model1 = train(p, A)
Step b: Score1 = classify(q, Model1)
Step c: Model2 = train(q, A)
Step d: Score2 = classify(p, Model2)
One-Shot-Sim = (Score1 + Score2) / 2
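The two-step procedure above can be sketched in a few lines; a simple midpoint linear discriminant stands in here for the underlying classifier (the papers use LDA or SVM):

```python
import numpy as np

def train(pos, A):
    """One-shot model: a single positive example vs. the background set A.
    A midpoint linear discriminant stands in for LDA/SVM."""
    mu = A.mean(axis=0)
    w = pos - mu                   # direction from background mean to positive
    b = w @ (pos + mu) / 2.0       # decision threshold at the midpoint
    return w, b

def classify(x, model):
    w, b = model
    return float(w @ x - b)        # signed confidence, not a hard label

def one_shot_similarity(p, q, A):
    score1 = classify(q, train(p, A))   # steps a + b
    score2 = classify(p, train(q, A))   # steps c + d
    return (score1 + score2) / 2.0
```

Note that the background set A is unlabeled: only p and q enter as "positive" examples, once each.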
24
Euclidean vs. One-Shot Visualized
Euclidean
One-Shot
25
Computing the "One-Shot" Similarity*
* "The One-Shot Similarity Kernel", ICCV'09
Using LDA as the underlying classifier:

$$\mathrm{OSS}(x_i, x_j, A) = \frac{\left(x_i - \frac{x_j + \mu_A}{2}\right)^\top S_W^+ \left(x_j - \mu_A\right)}{\left\|S_W^+ \left(x_j - \mu_A\right)\right\|} + \frac{\left(x_j - \frac{x_i + \mu_A}{2}\right)^\top S_W^+ \left(x_i - \mu_A\right)}{\left\|S_W^+ \left(x_i - \mu_A\right)\right\|}$$

where $\mu_A$ is the mean of set A, and $S_W^+$ is the pseudo-inverse of the intra-class covariance matrix.
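A minimal sketch of the LDA-based OSS in closed form, assuming the intra-class covariance is estimated from the background set A alone:

```python
import numpy as np

def oss_lda(xi, xj, A):
    """One-Shot Similarity with LDA as the underlying classifier,
    computed in closed form (sketch of the formula above)."""
    mu = A.mean(axis=0)
    S_w = np.cov(A, rowvar=False, bias=True)  # intra-class covariance of A
    S_p = np.linalg.pinv(S_w)                 # pseudo-inverse S_W^+
    def one_side(a, b):
        v = S_p @ (b - mu)                    # LDA projection direction
        return (a - (b + mu) / 2.0) @ v / np.linalg.norm(v)
    return float(one_side(xi, xj) + one_side(xj, xi))
```

In practice $S_W^+$ depends only on A, so it is factored out and computed once for all pairs.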
27
Computing the "One-Shot" Similarity
Using Free-Scale LDA as the underlying classifier:

$$\mathrm{OSS}_{FS}(x_i, x_j, A) = \left(x_i - \frac{x_j + \mu_A}{2}\right)^\top S_W^+ \left(x_j - \mu_A\right) + \left(x_j - \frac{x_i + \mu_A}{2}\right)^\top S_W^+ \left(x_i - \mu_A\right)$$

where $\mu_A$ is the mean of set A, and $S_W^+$ is the pseudo-inverse of the intra-class covariance matrix.
28
Some Properties of the OSS
• Uses unlabeled training data
• OSS based on Free-Scale LDA is a CPD (conditionally positive definite) kernel
• May be efficiently computed. Complexity: $S_W^+$ is independent of the two vectors compared, and so is computed only once; repeated comparisons of a vector $x_i$ to different $x_j$ may be performed in O(n).
29
Metric learning for OSS*
* "One Shot Similarity Metric Learning for Action Recognition", in submission.
Instead of examples $x_i$, use $Tx_i$ for some "optimal" transformation $T$. The transformation $T$ is obtained by a gradient descent procedure that optimizes the score:

$$\mathrm{OSS}_{ML}(x_i, x_j, A, T) = \mathrm{OSS}(Tx_i, Tx_j, TA)$$

$$f(T) = \sum_{\text{same}} \mathrm{OSS}_{ML}(x_i, x_j, A, T) - \sum_{\text{not same}} \mathrm{OSS}_{ML}(x_i, x_j, A, T) - \left\|T - T_0\right\|^2$$
32
The Unrestricted Protocol
10-fold cross-validation tests on randomly generated splits, each with 300 same pairs and 300 not-same pairs. Training now includes subject labels.
33
Multiple One-Shots*
* "Multiple One-Shots for Utilizing Class Label Information," BMVC'09
We now have IDs. How do we use them? Compute multiple OSS scores, each time using examples from a single class.
34
Multiple One-Shots
ID-based OSS
35
Multiple One-Shots
We now have IDs. How do we use them? Compute multiple OSS scores, each time using examples from a single class. Discrimination is based on different sources of variation: subject ID, pose, etc.
36
The Pose Issue
Most confident wrong results
37
Getting Poses
• 7 fiducial points (eyes, mouth, nose) → 14 x,y coordinates
• 14-D vector of alignment errors (similarity transformation)
• Project onto the first principal component
• Bin into 10 classes
To compute pose-based OSS, you need sets of images in the same pose…
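The pose-binning steps above can be sketched as follows; equal-frequency bin edges are an assumption, since the slides only say "bin to 10 classes":

```python
import numpy as np

def pose_bins(err_vectors, n_bins=10):
    """Bin faces into pose classes: project each 14-D alignment-error
    vector onto the first principal component, then quantize.
    Equal-frequency edges are an assumption, not from the slides."""
    X = err_vectors - err_vectors.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    proj = X @ Vt[0]                              # first-PC projection
    edges = np.quantile(proj, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(proj, edges)               # bin index 0 .. n_bins-1
```

Images sharing a bin index then serve as a same-pose set for a pose-based OSS.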
38
Multiple One-Shots
Pose-based OSS
39
Multiple One-Shots - Examples
Identity / Pose: 5 ID-based OSS and 5 pose-based OSS scores
40
Pipeline
• Input image pair
• Image alignment (commercial alignment software)
• Feature vectors, using: SIFT [Lowe '04]; LBP [Ojala et al. '96, '01, '02]; TPLBP, FPLBP [Wolf et al. '08]
• PCA + ITML (Information Theoretic Metric Learning [Davis et al. '07])
• Multiple OSS scores (20 subjects, 10 poses)
• SVM classifier
• Output: same \ not-same
47
Pipeline – Multiple Descriptors
Image alignment feeds per-descriptor branches (SIFT, LBP, …); each branch computes feature vectors, then PCA + ITML, then multiple OSS scores; a single SVM classifier combines all scores into the output: same \ not-same.
48
Results
0.7847 ± 0.0051 [WHT '08]
0.8398 ± 0.0035 [WHT '08 + alignment]
0.8517 ± 0.0061 [this work, only LBP]
0.8950 ± 0.0051 [this work, multi-desc.]
0.9753 [Kumar et al. '09 - HUMAN]
49
Pair-Matching of Sets
* "Face Recognition in Unconstrained Videos with Matched Background Similarity," CVPR 2011.
50
Pair-Matching of Sets
Training:
51
Conventional methods
• All-pairs comparison: distances between all frames of the first video and all frames of the second video.
• Pose-based methods: comparing the two most frontal faces in each video, or the two faces with the most similar pose.
• Algebraic set-to-set methods, such as max correlation, projection, and Procrustes.
• Non-algebraic methods, such as PMK and LLC.
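The max-correlation baseline mentioned above can be sketched as the largest canonical correlation between the linear spans of the two frame sets (assuming frames are rows and each set has full row rank):

```python
import numpy as np

def max_correlation(X1, X2):
    """Algebraic set-to-set baseline (sketch): the largest canonical
    correlation between the spans of two frame-descriptor sets,
    i.e. the cosine of the smallest principal angle."""
    def orthobasis(X):
        Q, _ = np.linalg.qr(X.T)   # columns: orthonormal basis of the span
        return Q
    s = np.linalg.svd(orthobasis(X1).T @ orthobasis(X2), compute_uv=False)
    return float(s.max())
```

A score of 1 means the two subspaces share a direction; 0 means they are orthogonal.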
52
Matched B/G similarity
• X1 & X2: sets of video frame descriptors.
• B: background set of faces.

Similarity = MBGS(X1, X2, B):
  B1 = Find_Nearest_Neighbors(X1, B)
  Model1 = train(X1, B1)
  Confidences1 = classify(X2, Model1)
  Sim1 = mean(Confidences1)
  B2 = Find_Nearest_Neighbors(X2, B)
  Model2 = train(X2, B2)
  Confidences2 = classify(X1, Model2)
  Sim2 = mean(Confidences2)
  Similarity = (Sim1 + Sim2) / 2
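A runnable sketch of the MBGS procedure, with a simple linear discriminant standing in for the SVM used in the paper, and the neighbor count k chosen arbitrarily:

```python
import numpy as np

def nearest_neighbors(X, B, k):
    """For each frame in X, collect its k nearest background faces in B."""
    d = ((X[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)  # |X| x |B|
    idx = np.unique(np.argsort(d, axis=1)[:, :k])
    return B[idx]

def train(X, neg):
    """Stand-in linear classifier; the paper trains an SVM here."""
    w = X.mean(axis=0) - neg.mean(axis=0)
    b = w @ (X.mean(axis=0) + neg.mean(axis=0)) / 2.0
    return w, b

def classify(X, model):
    w, b = model
    return X @ w - b                 # per-frame signed confidences

def mbgs(X1, X2, B, k=3):
    """Matched Background Similarity between two frame sets (sketch)."""
    sim1 = classify(X2, train(X1, nearest_neighbors(X1, B, k))).mean()
    sim2 = classify(X1, train(X2, nearest_neighbors(X2, B, k))).mean()
    return float((sim1 + sim2) / 2.0)
```

Matching the background to each set keeps the negatives "hard", so the classifier separates the set from its closest impostors rather than from arbitrary faces.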
54
Thank you!
Software available: http://www.cs.tau.ac.il/~wolf