Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May...

34
Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz
  • date post

    21-Dec-2015
  • Category

    Documents

  • view

    217
  • download

    0

Transcript of Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May...

Page 1: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Supervised Distance Metric Learning

Presented at CMU’s Computer Vision Misc-Read Reading Group

May 9, 2007

by Tomasz Malisiewicz

Page 2: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Overview

Short Metric Learning Overview Eric Xing et al (2002) Kilian Weinberger et al (2005)

Local Distance Functions Andrea Frome et al (2006)

Page 3: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

What is a metric?

Technically, a metric on a space X is a function

Satisfies : non-negativity, identity, symmetry, triangle inequality

Just forget about the technicalities, and think of it as a distance function that measures similarity

Depending on context and mathematical properties other useful names: Comparison function, distance function, distance metric, similarity measure, kernel function, matching measure

1: d

Page 4: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

When are distance functions used?

Almost everywhere

Density estimation (e.g. parzen windows)

Clustering (e.g. k-means) Instance-based classifiers (e.g. NN)

Page 5: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

How to choose a distance function?

A priori Euclidean Distance L1 distance

Cross Validation within small class of functions e.g. choosing order of polynomial for a

kernel

Page 6: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Learning distance function from data

Unsupervised Metric Learning (aka Manifold Learning) Linear: e.g. PCA Non-linear: e.g. LLE, Isomap

Supervised Metric Learning (using labels associated with points) Global Learning Local Learning

Page 7: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

A Mahalanobis Distance Metric

)()(||)(||),( 22ji

TTjijiji xxAAxxxxAxxD

Most Commonly Used Distance Metric in Machine Learning Community

Equivalent to first applying linear transformation y = Ax, then using Euclidean distance in new space of y’s

Page 8: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Supervised Distance Metric Learning

Global vs Local

Page 9: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Supervised Global Distance Metric Learning (Xing et al 2002)

Input data has labels (C classes) Consider all pairs of points from the same

class (equivalence class) Consider all pairs of points from different

classes (inequivalence class) Learn a Mahalanobis Distance Metric that

brings equivalent points closer together while staying far from inequivalent points

E. Xing, A. Ng, and M. Jordan, “Distance metric learning with application to clusteringwith side information,” in NIPS, 2003.

Page 10: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Supervised Global Distance Metric Learning (Xing et al 2002)

Page 11: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Supervised Global Distance Metric Learning (Xing et al 2002)

Convex Optimization problem Minimize pairwise distances between all

similarly labeled examples

Page 12: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Supervised Global Distance Metric Learning (Xing et al 2002)

Anybody see a problem with this approach?

Page 13: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Problems with Global Distance Metric Learning

Problem with multimodal classes

Page 14: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Supervised Local Distance Metric Learning

Many different supervised distance metric learning algorithms that do not try to bring all points from same class together

Many approaches still try to learn Mahalanobis distance metric

Account for multimodal data by integrating only local constraints

Page 15: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Supervised Local Distance Metric Learning (Weinberger et al 2005)

Consider a KNN Classifier: for each Query Point x, we want the K-nearest neighbors of same class to become closer to x in new metric

K.Q. Weinberger, J. Blitzer, and L.K. Saul, “Distance metric learning for large marginnearest neighbor classification,” in NIPS, 2005.

Page 16: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Supervised Local Distance Metric Learning (Weinberger et al 2005)

Convex Objective Function (SDP)

Penalize large distances between each input and target neighbors

Penalize small distances between each input and all other points of different class

Points from different classes are separated by large margin

K.Q. Weinberger, J. Blitzer, and L.K. Saul, “Distance metric learning for large marginnearest neighbor classification,” in NIPS, 2005.

Page 17: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Distance Metric Learning by Andrea Frome

A. Frome, Y. Singer, and J. Malik, “Image Retrieval and Classification Using Local Distance Functions,” in NIPS 2006.

Finally we get something more interesting than Mahalanobis Distance Metric!

Page 18: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Per-exemplar distance function

Goal: instead of learning a deformation of the space of exemplars, want to learn a distance function for each exemplar (think KNN classifier)

Page 19: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Distance functions and Learning Procedure

If there are N training images, we will solve N separate learning problems (the training image for a particular learning problem is referred to as the focal image)

Each learning problem solved with a subset of the remaining training images (called the learning set)

Page 20: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Elementary Distances

Distance functions built on top of elementary distance measures between patch-based features

Each input is not treated as a fixed-length vector

Page 21: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Elementary Patch-based Distances

Distance function is combination of elementary patch-based distances (distance from patch to image)

M patches in Focal Image Image to Image distance is linear

combination of patch to image distancesDistance between j-th patch in F and entire image I

Page 22: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Learning from triplets

Goal is to learn w for each Focal Image from Triplets of Focal Image I, Similar Image, and Different Image

Each Triplet gives us one constraint

Page 23: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Max Margin Formulation

Learn w for each Focal Image independently

Weights must be positive T triplets for each learning problem Slack Vars (like non-separable SVM)

Page 24: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Visual Features and Patch-to-Image Distance

Geometric Blur descriptors (for shape) at two different scales

Naïve Color Histogram descriptor Sampled from 400 or fewer edge points No geometric relations between features Distance between feature f and image I is

the smallest L2 distance between f and feature f’ in I of the same feature type

Page 25: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Image Browsing Example

Given a Focal Image, sort all training images by distance from Focal Image

Page 26: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Image Retrieval

Given a novel Image Q (not in training set), want to sort all training images with respect to distance to Q

Problem: local distances are not directly comparable (weights learned independently and not on same scale)

Page 27: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Second Round of Training: Logistic Regression to the Rescue

For each Focal Image I, fit Logistic Regression model to the binary (in-class versus out-of-class) training labels and learned distances.

To classify query image Q, we can get distance from Q to each training image Ii, then use logistic function to get probability that Q is in the same class as Ii

Assign Q to the class with the highest mean (or sum) probability

Page 28: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Choosing Triplets for Learning

For Focal Image I, images in same category are “similar” images and images from different category are “different” images

Want to select images that are similar to the focal image according to at least one elementary distance function

Page 29: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Choosing Triplets for Learning

For each of the M elementary distances in focal image F, we find the top K closest images

If K contains both in-class and out-of-class images, create all triplets from (F,in-class,out-class) combinations

If K are all in-class images, get closest out-class image then make K triplets (reverse if all K are out-class)

Page 30: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Caltech 101

Page 31: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Caltech 101: 60.3% Mean Recognition Rate

Page 32: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Conclusions

Metric Learning similar to Feature Selection Seems like for different visual object

categories, different notions of similarity are needed: color informative for some classes, local shape for others, geometry for others

Metric Learning well suited for instance-based classifiers like KNN

Local learning more meaningful than global learning

Page 33: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

References

NIPS 06 Workshop on Learning to Compare Examples

Liu Yang’s Distance Metric Learning Comprehensive Survey

Xing et al, Weinberger et al, Frome et al

Page 34: Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.

Questions?

Thanks for Listening

Questions?