Approximate Correspondences in High Dimensions

MIT CSAIL Vision interfaces


Kristen Grauman*

Trevor Darrell

MIT CSAIL

(*) UT Austin…


Key challenges: robustness

• Illumination

• Object pose

• Clutter

• Viewpoint

• Intra-class appearance

• Occlusions


Key challenges: efficiency

• Thousands to millions of pixels in an image

• 3,000-30,000 human recognizable object categories

• Billions of images indexed by Google Image Search

• 18 billion+ prints produced from digital camera images in 2004

• 295.5 million camera phones sold in 2005


Local representations

Superpixels [Ren et al.]

Shape context [Belongie et al.]

Maximally Stable Extremal Regions [Matas et al.]

Geometric Blur [Berg et al.]

SIFT [Lowe]

Salient regions [Kadir et al.]

Harris-Affine [Schmid et al.]

Spin images [Johnson and Hebert]

Describe component regions or patches separately


How to handle sets of features?

• Each instance is unordered set of vectors

• Varying number of vectors per instance


Partial matching

Compare sets by computing a partial matching between their features.


Pyramid match overview

[Figure: optimal partial matching]


Computing the partial matching

• Optimal matching: O(m^3)

• Greedy matching: O(m^2 log m)

• Pyramid match: O(dmL), with L pyramid levels

for sets with m features of dimension d
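To make the difference between the first two options concrete, here is a minimal sketch in plain Python. The function names are hypothetical, Euclidean distance is an assumed cost, and the brute-force search is exponential: it only stands in for a proper optimal assignment solver on tiny inputs.

```python
from itertools import permutations
from math import dist


def greedy_partial_matching(xs, ys):
    """Cost of a greedy matching: repeatedly take the closest unused pair."""
    if len(xs) > len(ys):
        xs, ys = ys, xs  # match every point of the smaller set
    pairs = sorted(
        (dist(x, y), i, j) for i, x in enumerate(xs) for j, y in enumerate(ys)
    )
    used_x, used_y, cost = set(), set(), 0.0
    for d, i, j in pairs:
        if i not in used_x and j not in used_y:
            used_x.add(i)
            used_y.add(j)
            cost += d
    return cost


def optimal_partial_matching(xs, ys):
    """Cost of the best matching, by brute force (illustration only)."""
    if len(xs) > len(ys):
        xs, ys = ys, xs
    return min(
        sum(dist(x, ys[j]) for x, j in zip(xs, perm))
        for perm in permutations(range(len(ys)), len(xs))
    )


xs = [(0,), (20,)]
ys = [(11,), (35,)]
print(greedy_partial_matching(xs, ys))   # 44.0
print(optimal_partial_matching(xs, ys))  # 26.0
```

In this toy case the greedy choice of the single closest pair (20 with 11) forces an expensive second pair, so the greedy cost exceeds the optimal one.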


Pyramid match overview

Pyramid match measures similarity of a partial matching between two sets:

• Place multi-dimensional, multi-resolution grid over point sets

• Consider points matched at finest resolution where they fall into same grid cell

• Approximate optimal similarity with worst case similarity within pyramid cell

No explicit search for matches!


Pyramid match

$K_\Delta(X, Y) = \sum_i w_i N_i$

• $N_i$: number of newly matched pairs at level i

• $w_i$: measure of difficulty of a match at level i

• $K_\Delta$: approximate partial match similarity

[Grauman and Darrell, ICCV 2005]


Pyramid extraction

Histogram pyramid: level i has bins of size $2^i$
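A minimal sketch of pyramid extraction in plain Python follows. It assumes non-negative coordinates and a bin side length that doubles at each level (side 2^i at level i, as in the cited ICCV 2005 formulation); the function name and the sparse dictionary layout are illustrative choices, not the authors' code.

```python
from collections import Counter


def extract_pyramid(points, num_levels):
    """Return one sparse histogram per level.

    Level i uses axis-aligned bins of side 2**i (an assumption of this
    sketch); each histogram maps a bin index tuple to a point count.
    """
    pyramid = []
    for level in range(num_levels):
        side = 2 ** level
        histogram = Counter(
            tuple(int(coord // side) for coord in point) for point in points
        )
        pyramid.append(histogram)
    return pyramid


points = [(1, 1), (2, 3), (6, 7)]
for level, histogram in enumerate(extract_pyramid(points, 4)):
    print(level, dict(histogram))
```

Only occupied bins are stored, so the memory per set grows with the number of points and levels rather than with the total number of grid cells.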


Counting matches

Histogram intersection
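As a sketch of how histogram intersection counts matches, the code below sums per-bin minimum counts and then differences successive levels so that each pair is counted only at the finest level where it first shares a bin. The histograms are hand-written sparse dictionaries and the weights are supplied by the caller; all names here are hypothetical.

```python
def histogram_intersection(h_x, h_y):
    """Sum of per-bin minimum counts: the matches implied at one level."""
    return sum(min(count, h_y.get(b, 0)) for b, count in h_x.items())


def new_matches_per_level(pyr_x, pyr_y):
    """Matches that first appear at each level (finest level first)."""
    counts, previous = [], 0
    for h_x, h_y in zip(pyr_x, pyr_y):
        current = histogram_intersection(h_x, h_y)
        counts.append(current - previous)
        previous = current
    return counts


def pyramid_match(pyr_x, pyr_y, weights):
    """Weighted sum of newly matched pairs, one weight per level."""
    return sum(w * n for w, n in zip(weights, new_matches_per_level(pyr_x, pyr_y)))


# 1-D toy sets X = {0, 3, 5} and Y = {0, 2, 7}, binned with sides 1, 2, 4.
pyr_x = [{0: 1, 3: 1, 5: 1}, {0: 1, 1: 1, 2: 1}, {0: 2, 1: 1}]
pyr_y = [{0: 1, 2: 1, 7: 1}, {0: 1, 1: 1, 3: 1}, {0: 2, 1: 1}]
print(new_matches_per_level(pyr_x, pyr_y))             # [1, 1, 1]
print(pyramid_match(pyr_x, pyr_y, [1.0, 0.5, 0.25]))   # 1.75
```

The halving weights in the usage lines are one plausible choice that discounts matches found in coarser bins; other weightings can be passed in unchanged.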


Example pyramid match

[Figure: pyramid match vs. optimal match]


Approximating the optimal partial matching

Randomly generated, uniformly distributed point sets with m = 5 to 100, d = 2


PM preserves rank…


and is robust to clutter…


Learning with the pyramid match

• Kernel-based methods

– Embed data into a Euclidean space via a similarity function (kernel), then seek linear relationships among embedded data

– Efficient and good generalization

– Include classification, regression, clustering, dimensionality reduction,…

• Pyramid match forms a Mercer kernel


Category recognition results

ETH-80 data set

Kernels compared (by complexity): Pyramid match; Match [Wallraven et al.]

[Plots: time (s) and accuracy vs. mean number of features]


Category recognition results

[Plot: results by time of publication (2004, 6/05, 12/05, 3/06, 6/06); annotations: 0.002 s / match, 5 s / match; Pyramid match kernel over spatial features with quantized appearance]


Vocabulary-guided pyramid match

But rectangular histogram may scale poorly with input dimension…

Build data-dependent histogram structure…

New Vocabulary-guided PM [NIPS 06]:

• Hierarchical k-means over training set

• Irregular cells; record diameter of each bin

• VG pyramid structure stored in O(k^L) space; stored once

• Individual histograms still stored sparsely
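The sketch below illustrates how a point might be inserted into a vocabulary-guided pyramid once the cell centers are known. The tree is hand-built here; in the talk's setting the centers would come from hierarchical k-means over a training set, which this sketch does not implement. The node layout and function names are hypothetical, and bin diameters are not recorded.

```python
from collections import Counter
from math import dist


def make_node(center, children=()):
    """A tree node: a cell center plus its child cells (hypothetical layout)."""
    return {"center": center, "children": list(children)}


def bin_path(root, point):
    """Descend from the root, picking the nearest child center at each level."""
    path, node = [], root
    while node["children"]:
        children = node["children"]
        index = min(
            range(len(children)),
            key=lambda h: dist(point, children[h]["center"]),
        )
        path.append(index)
        node = children[index]
    return tuple(path)


def vg_histograms(root, points):
    """Sparse counts keyed by path prefix; prefix length is the tree depth."""
    hist = Counter()
    for point in points:
        path = bin_path(root, point)
        for depth in range(len(path) + 1):
            hist[path[:depth]] += 1
    return hist


root = make_node((5, 5), [
    make_node((2, 2), [make_node((1, 1)), make_node((3, 3))]),
    make_node((8, 8), [make_node((7, 7)), make_node((9, 9))]),
])
print(bin_path(root, (1, 2)))  # (0, 0)
print(dict(vg_histograms(root, [(1, 2), (9, 8), (3, 3)])))
```

Each insertion compares the point against at most k centers per level, so a tree with branching factor k and L levels needs on the order of kL distance computations per point.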



[Figure panels: uniform bins vs. vocabulary-guided bins]

Vocabulary-guided bins:

• Tune pyramid partitions to the feature distribution

• Accurate for d > 100

• Requires initial corpus of features to determine pyramid structure

• Small cost increase over uniform bins: kL distances against bin centers to insert points



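$n_{ij}(X)$: count for hist. X, level i, cell j

$w_{ij}$: weight for level i, cell j

(1) $w_{ij} \approx$ diameter of cell: Mercer kernel

(2) $w_{ij} \approx d_{ij}(X) + d_{ij}(Y)$: upper bound ($d_{ij}(H)$ = max dist. of H's pts in cell i,j to center)

$c_h(n)$: child h of node n, e.g. $c_2(n_{11})$

$\sum_i \sum_j w_{ij} \left[ \min(n_{ij}(X), n_{ij}(Y)) - \sum_h \min(c_h(n_{ij}(X)), c_h(n_{ij}(Y))) \right]$

$w_{ij}$ * (# matches in cell j level i - # matches in children) = W * # new matches @ level i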
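The weighted sum described above can be sketched as follows. Histograms are assumed to be sparse dictionaries keyed by the path of child indices from the root (an illustrative encoding, not the authors' data structure), and the per-cell weights are passed in by the caller, so either weighting option can be plugged in.

```python
def vg_pyramid_match(hist_x, hist_y, weights):
    """Sum over cells of weight * (matches in cell - matches in its children).

    hist_x, hist_y: dict mapping a cell path (tuple) to a point count.
    weights: dict mapping every cell path to its weight w_ij.
    """
    total = 0.0
    for cell, weight in weights.items():
        here = min(hist_x.get(cell, 0), hist_y.get(cell, 0))
        in_children = sum(
            min(count, hist_y.get(child, 0))
            for child, count in hist_x.items()
            if len(child) == len(cell) + 1 and child[:len(cell)] == cell
        )
        total += weight * (here - in_children)  # newly matched pairs in this cell
    return total


hist_x = {(): 3, (0,): 2, (1,): 1, (0, 0): 1, (0, 1): 1, (1, 1): 1}
hist_y = {(): 3, (0,): 1, (1,): 2, (0, 0): 1, (1, 0): 1, (1, 1): 1}
weights = {(): 8.0, (0,): 4.0, (1,): 4.0,
           (0, 0): 1.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 1.0}
print(vg_pyramid_match(hist_x, hist_y, weights))  # 10.0
```

In the toy data two pairs match in leaf cells (weight 1 each) and one more pair is only matched at the root (weight 8), giving 10.0.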


Results: Evaluation criteria

• Quality of match scores: How similar are the rankings produced by the approximate measure to those produced by the optimal measure?

• Quality of correspondences: How similar is the approximate correspondence field to the optimal one?

• Object recognition accuracy: Used as a match kernel over feature sets, what is the recognition output?
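For the first criterion, one common way to compare two rankings is Spearman rank correlation. The slide does not name the statistic, so its use here is an assumption, and the sketch below also assumes there are no tied scores.

```python
def ranks(scores):
    """Rank of each score (1 = smallest); assumes no ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(a, b):
    """Spearman rank correlation between two equally long score lists."""
    n = len(a)
    squared_diffs = sum((x - y) ** 2 for x, y in zip(ranks(a), ranks(b)))
    return 1 - 6 * squared_diffs / (n * (n * n - 1))


optimal_costs = [1.0, 2.5, 4.0, 7.0]      # made-up scores for illustration
approximate_costs = [1.2, 3.9, 3.0, 8.0]  # swaps the middle two items
print(spearman(optimal_costs, approximate_costs))
```

A value of 1 means the approximate measure orders the examples exactly as the optimal measure does; the toy data swaps one adjacent pair and so scores slightly lower.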


Match score quality

ETH-80 images, sets of SIFT features

[Plots: match score quality for d=8 and d=128; recovered panel title: Uniform bin pyramid match]

Dense SIFT (d=128); k=10, L=5 for VG PM; PCA for low-dim feats


Bin structure and match counts

Data-dependent bins allow more gradual distance ranges

[Plots for d=3, d=8, d=13, d=68, d=113, d=128]


Approximate correspondences

Use pyramid intersections to compute smaller explicit matchings.
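One way to read this idea: instead of searching over all pairs, restrict the explicit matching to points that share a bin. The sketch below assumes a uniform grid with a single bin side and a greedy pairing inside each bin; both are simplifications chosen for illustration, and the function names are hypothetical.

```python
from collections import defaultdict
from math import dist


def bin_index(point, side):
    """Grid cell of a point for axis-aligned bins of the given side."""
    return tuple(int(coord // side) for coord in point)


def correspondences_within_bins(xs, ys, side):
    """Pair points of xs and ys only when they fall into the same grid cell."""
    cells = defaultdict(lambda: ([], []))
    for i, x in enumerate(xs):
        cells[bin_index(x, side)][0].append(i)
    for j, y in enumerate(ys):
        cells[bin_index(y, side)][1].append(j)

    matches = []
    for in_x, in_y in cells.values():
        # Small per-cell problem: greedily take the closest unused pairs.
        candidates = sorted((dist(xs[i], ys[j]), i, j) for i in in_x for j in in_y)
        used_i, used_j = set(), set()
        for _, i, j in candidates:
            if i not in used_i and j not in used_j:
                used_i.add(i)
                used_j.add(j)
                matches.append((i, j))
    return sorted(matches)


xs = [(1, 1), (5, 5), (9, 1)]
ys = [(6, 6), (2, 1), (14, 14)]
print(correspondences_within_bins(xs, ys, 4))  # [(0, 1), (1, 0)]
```

Points with no partner in their cell (here the third point of each set) stay unmatched, which is what makes the result a partial matching.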


Correspondence examples


Approximate correspondences

ETH-80 images, sets of SIFT descriptors


Impact on recognition accuracy

• VG-PMK as kernel for SVM

• Caltech-4 data set

• SIFT descriptors extracted at Harris and MSER interest points


Sets of features elsewhere

diseases as sets of gene expressions

documents as bags of words

methods as sets of instructions
