Recent Advances of Compact Hashing for Large-Scale Visual Search



Shih-Fu Chang, Columbia University

October 2012

Joint work with Junfeng He (Facebook), Sanjiv Kumar (Google), Wei Liu (IBM Research), and Jun Wang (IBM Research)


Outline

• Lessons learned in designing hashing functions
  – The importance of balancing hash bucket size
  – How to incorporate supervised information
• Prediction of NN search difficulty & hashing performance
• Demo: Bag of Hash Bits for Mobile Visual Search

Fast Nearest Neighbor Search

• Applications: image search, texture synthesis, denoising, …
• Avoid exhaustive search (O(n) time complexity)

(Figures: image search; photo tourism patch search; dense matching with coherence-sensitive hashing, Korman & Avidan ’11.)

Locality-Sensitive Hashing

• Hash code collision probability is proportional to the original similarity
• l: # hash tables, K: # hash bits per table

(Figure: random hash functions map each point to a compact binary code, e.g., 101, used to index the point.)

[Indyk & Motwani 1998] [Datar et al. 2004]
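To make the mechanics concrete, here is a minimal random-hyperplane LSH sketch in Python (an illustration, not the exact construction on the slide); function names and defaults are illustrative.

```python
# Minimal random-hyperplane LSH sketch: K random projections per table,
# l independent tables. Similar vectors collide with probability that grows
# with their cosine similarity; using l > 1 tables raises the chance of at
# least one collision for true neighbors.
import numpy as np

def make_lsh_tables(dim, K=16, l=4, seed=0):
    """Return l random projection matrices, each producing K hash bits."""
    rng = np.random.default_rng(seed)
    return [rng.standard_normal((K, dim)) for _ in range(l)]

def hash_code(x, W):
    """K-bit code: the sign pattern of K random projections, as a 0/1 tuple."""
    return tuple((W @ x > 0).astype(int))
```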

Hash Table Based Search

• O(1) search time by table lookup
• Bucket size is important (affects accuracy & post-processing cost)

(Figure: a query q is hashed to the code 01101, which addresses one bucket among the n database points xi; nearby codes such as 01111 and 01100 address neighboring buckets.)
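A hedged sketch of the lookup step, assuming codes are stored as tuples of bits; probing all buckets at Hamming radius 1 is the "search neighbor buckets" idea discussed later.

```python
# Table lookup with optional radius-1 probing: O(1) access to the exact
# bucket, plus the K buckets whose keys differ in exactly one bit.
from collections import defaultdict

def build_table(codes):
    """codes: iterable of (item_id, K-bit tuple). Buckets keyed by code."""
    table = defaultdict(list)
    for item_id, code in codes:
        table[code].append(item_id)
    return table

def probe(table, code, radius=1):
    """Return candidates from the exact bucket and all radius-1 neighbors."""
    candidates = list(table.get(code, []))
    if radius >= 1:
        for i in range(len(code)):
            flipped = code[:i] + (1 - code[i],) + code[i + 1:]
            candidates.extend(table.get(flipped, []))
    return candidates
```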

Different Approaches

• Unsupervised Hashing: LSH ’98, SH ’08, KLSH ’09, AGH ’10, PCAH, ITQ ’11
• Semi-Supervised Hashing: SSH ’10, WeaklySH ’10
• Supervised Hashing: RBM ’09, BRE ’10, MLH ’11, LDAH ’11, ITQ ’11, KSH ’12

PCA + Minimize Quantization Errors

• PCA to maximize variance in each hash dimension
• Find the optimal rotation in the subspace to minimize quantization error

ITQ method, Gong & Lazebnik, CVPR ’11
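A compact sketch of the ITQ alternation under its usual formulation (binarize, then solve an orthogonal Procrustes problem for the rotation); parameter names and defaults are illustrative.

```python
# ITQ sketch: project zero-centered data onto the top-c PCA directions, then
# alternate between (a) binarizing the rotated data and (b) updating the
# rotation R to minimize the quantization error ||B - V R||_F.
import numpy as np

def itq(X, c=32, n_iter=50, seed=0):
    X = X - X.mean(axis=0)                      # zero-center the data
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    W = Vt[:c]                                  # top-c PCA directions (c x d)
    V = X @ W.T                                 # n x c projected data
    rng = np.random.default_rng(seed)
    R, _ = np.linalg.qr(rng.standard_normal((c, c)))  # random initial rotation
    for _ in range(n_iter):
        B = np.sign(V @ R)                      # fix R: binarize
        U, _, Qt = np.linalg.svd(V.T @ B)       # fix B: Procrustes solution
        R = U @ Qt
    return W, R                                 # codes: sign((x - mean) @ W.T @ R)
```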

Effects of Minimizing Quantization Error

• 580K tiny images, PCA-ITQ, Gong & Lazebnik, CVPR ’11

(Figure: PCA + random rotation vs. PCA-ITQ optimal alignment.)

Utilize Supervised Labels

• Semantic category supervision
• Metric supervision (similar / dissimilar pairs)

(Figure: sample points connected by "similar" and "dissimilar" pairwise constraints.)

Design Hash Codes to Match Supervised Information

• Preferred hashing function: similar pairs receive the same bit, dissimilar pairs receive different bits (0 vs. 1)

Adding Supervised Labels to PCA Hash

Wang, Kumar, Chang, CVPR ’10, ICML ’10

• Relaxation: find projections W from an "adjusted" covariance matrix that combines a label-fitting term (over similar/dissimilar pairs) with the PCA covariance matrix
• Solution W: eigenvectors of the adjusted covariance matrix
• If no supervision (S = 0), it reduces to PCA hash
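A hedged sketch of this relaxed solution, assuming the adjusted covariance takes the form Xlᵀ S Xl + η XᵀX (rows as samples) as in the SSH paper; eta and the label convention are assumptions.

```python
# SSH-style projections: top eigenvectors of the "adjusted" covariance
# matrix, which blends label fitting with PCA variance. With S = 0 this
# reduces to plain PCA hashing.
import numpy as np

def ssh_projections(X, Xl, S, n_bits=32, eta=1.0):
    """X: n x d unlabeled data; Xl: m x d labeled data;
    S: m x m pairwise labels (+1 similar, -1 dissimilar, 0 unknown)."""
    X = X - X.mean(axis=0)
    Xl = Xl - Xl.mean(axis=0)
    M = Xl.T @ S @ Xl + eta * (X.T @ X)        # adjusted covariance (d x d)
    eigvals, eigvecs = np.linalg.eigh(M)       # M is symmetric
    W = eigvecs[:, np.argsort(eigvals)[::-1][:n_bits]]  # top eigenvectors
    return W                                   # codes: sign(X @ W)
```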

Semi-Supervised Hashing (SSH)

• 1 million GIST images; 1% labeled, 99% unlabeled

(Figure: precision @ top 1K for SSH vs. supervised RBM, random LSH, and unsupervised SH.)

Problem of Orthogonal Projections

• Many buckets become empty as # bits increases
• Need to search many neighboring buckets at query time

(Figure: precision @ Hamming radius 2.)

• Explicitly optimize two terms:
  – Preserve similarity (search accuracy)
  – Balanced bucket size: maximize entropy, minimize mutual information I (search time)

ICA Type Hashing (SPICA Hash, He et al., CVPR ’11)

• Search accuracy, preserving similarity: minimize
  $D(Y) = \sum_{p,q=1}^{N} W_{pq}\,\|Y_p - Y_q\|^2$
• Balanced bucket size:
  $\min I(y_1, \ldots, y_k, \ldots, y_M)$ while $\sum_{p=1}^{N} E(y_p) = 0$
• Fast ICA to find non-orthogonal projections

The Importance of Balanced Bucket Size

(Figure: bucket size vs. bucket index for LSH and SPICA Hash; SPICA yields balanced bucket sizes.)

• Simulation over 1M tiny image samples
• The largest bucket of LSH contains 10% of all 1M samples
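A small sketch for quantifying balance, e.g., to reproduce a comparison like the one above: the entropy of the bucket-size distribution (maximal when all buckets are equally full) and the share of points in the largest bucket.

```python
# Bucket-balance diagnostics for a set of hash codes: entropy of the
# bucket-size distribution, and the fraction of all samples that fall into
# the single largest bucket (10% for LSH in the slide's simulation).
import numpy as np
from collections import Counter

def bucket_stats(codes):
    """codes: iterable of hashable bucket keys, one per sample."""
    sizes = np.array(list(Counter(codes).values()), dtype=float)
    p = sizes / sizes.sum()                     # bucket occupancy distribution
    entropy = -(p * np.log2(p)).sum()           # high = balanced
    return entropy, sizes.max() / sizes.sum()   # (entropy, largest-bucket share)
```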

Different Approaches

• Unsupervised Hashing: LSH ’98, SH ’08, KLSH ’09, AGH ’10, PCAH, ITQ ’11
• Semi-Supervised Hashing: SSH ’10, WeaklySH ’10
• Supervised Hashing: RBM ’09, BRE ’10, MLH ’11, LDAH ’11, ITQ ’11, KSH ’12

Better Ways to Handle Supervised Information?

• BRE [Kulis & Darrell, ’10] and MLH [Norouzi & Fleet, ’11] fit the Hamming distance between H(xi) and H(xj); MLH uses a hinge loss
• But optimizing the Hamming distance (D_H, XOR) directly is not easy!

A New Supervision Form: Code Inner Products

Liu, Wang, Ji, Jiang, Chang, CVPR ’12

• Supervised hashing with labeled data: a pairwise label matrix S (+1 for similar pairs, −1 for dissimilar pairs) over samples x1, x2, x3, …
• Fit code inner products to the pairwise labels: the code matrix times its transpose should match r·S

(Figure: the code matrix of x1, x2, x3 with entries in {−1, +1}, multiplied by its transpose, yields the matrix of code inner products, which is fitted to the pairwise label matrix.)

• Proof: code inner product ≡ Hamming distance (for r-bit codes in {±1}, ⟨h_i, h_j⟩ = r − 2·D_H(h_i, h_j))
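The identity is easy to verify numerically; the following illustrative check (not from the slides) confirms that for K-bit {−1, +1} codes, inner products and Hamming distances carry the same information.

```python
# Numerical check of <h_i, h_j> = K - 2 * D_H(h_i, h_j) for codes in {-1, +1}:
# agreeing bits contribute +1 and disagreeing bits -1, so the inner product
# is (K - D_H) - D_H. Fitting inner products thus fits Hamming distances.
import numpy as np

rng = np.random.default_rng(0)
K = 48
H = rng.choice([-1, 1], size=(5, K))          # five random K-bit codes
inner = H @ H.T                                # pairwise code inner products
hamming = (H[:, None, :] != H[None, :, :]).sum(axis=2)
assert np.array_equal(inner, K - 2 * hamming)
```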

Code Inner Product Enables Efficient Optimization

• Much easier/faster to optimize, and extends to kernels
• Design hash codes (a sample × hash-bit matrix) to match the supervised information

Liu, Wang, Ji, Jiang, Chang, CVPR 2012

Extend Code Inner Product to Kernels

• Following KLSH, construct a hash function using a kernel function and m anchor samples, with zero-mean normalization applied to k(x)

(Figure: the l × r code matrix in {−1, +1} equals sgn of the zero-centered kernel matrix between l samples and m anchors, times the m × r hash coefficients.)
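A sketch of such a kernel hash function under common assumptions (RBF kernel, precomputed anchors and coefficients); gamma, the anchor choice, and the training of the coefficients A are outside this snippet.

```python
# KLSH-style kernel hash: kernel map to m anchors, zero-mean normalization,
# then a signed linear combination per bit.
import numpy as np

def kernel_hash(X, anchors, A, k_mean, gamma=1.0):
    """X: n x d queries; anchors: m x d anchor samples; A: m x r hash
    coefficients; k_mean: m-vector, mean kernel map over training data."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-gamma * d2)            # n x m kernel matrix
    K_bar = K - k_mean                 # zero-mean normalization of k(x)
    return np.sign(K_bar @ A)          # n x r codes in {-1, +1}
```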

Benefits of Code Inner Product

• CIFAR-10: 60K object images from 10 classes; 1K query images
• 1K supervised labels
• KSH0: spectral relaxation; KSH: sigmoid hashing function

(Figure: comparison against other supervised methods.)

Open issue: empty buckets and bucket balance are not addressed

Speedup by Code Inner Product (CVPR 2012)

  Method   Train Time (48 bits)   Test Time (48 bits)
  SSH      2.1                    0.9×10−5
  LDAH     0.7                    0.9×10−5
  BRE      494.7                  2.9×10−5
  MLH      3666.3                 1.8×10−5
  KSH0     7.0                    3.3×10−5
  KSH      156.1                  4.3×10−5

Significant training speedup over BRE and MLH.

Tiny-1M: Visual Search Results (CVPR 2012)

(Figure: KSH returns more visually relevant results.)

Comparison of Hashing vs. KD-Tree

• Photo Tourism patch set (Notre Dame subset, 103K samples), 512-D GIST

(Figure: Anchor Graph Hashing and Supervised Hashing vs. KD-Tree.)

Understand Difficulty of Approximate Nearest Neighbor Search

He, Kumar, Chang, ICML 2012

• How difficult is approximate nearest neighbor search in a given dataset?
• Toy example: x is an ε-approximate NN of query q if d(q, x) ≤ (1 + ε)·d(q, x*), where x* is the true NN. If nearly every point qualifies, search is not meaningful!
• Can we define a concrete measure of the difficulty of search in a dataset?
• A naïve search approach: randomly pick a point and compare it to the true NN

Relative Contrast

• Relative contrast compares a query’s distance to a random point against its distance to the nearest neighbor: C_r = D_mean / D_min
• High relative contrast → easier search
• If C_r → 1, search is not meaningful

He, Kumar, Chang, ICML 2012
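Relative contrast can be estimated empirically straight from this definition; a minimal sketch, assuming queries are not contained in the database.

```python
# Empirical relative contrast: ratio of the mean query-to-database distance
# to the query-to-NN distance, averaged over queries. C_r near 1 means the
# NN is barely closer than a random point, i.e., search is not meaningful.
import numpy as np

def relative_contrast(X, queries, p=2):
    """X: n x d database; queries: q x d (assumed disjoint from X)."""
    ratios = []
    for q in queries:
        d = np.linalg.norm(X - q, ord=p, axis=1)  # Lp distances to all points
        ratios.append(d.mean() / d.min())         # D_mean / D_min
    return float(np.mean(ratios))
```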

Estimation of Relative Contrast

• Estimated in closed form using the CLT and a binomial approximation
• φ: standard Gaussian CDF
• σ′: a function of data properties (dimensionality and sparsity)
• n: data size; p: Lp distance

Synthetic Data

• Data sampled randomly from U[0,1]
• s: probability of a non-zero element in each dimension; d: feature dimension

(Figures: relative contrast as d, s, p, and database size vary.)

• Higher dimensionality → bad; sparser vectors → good
• Lower p → good; larger database → good

Predict Hashing Performance on Real-World Data

  Dataset        Dimensionality (d)   Sparsity (s)   Relative Contrast (C_r), p = 1
  SIFT           128                  0.89           4.78
  Gist           384                  1.00           1.83
  Color Hist     1382                 0.027          3.19
  ImageNet BoW   10000                0.024          1.90

(Figures: LSH retrieval performance at 16 and 28 bits on these datasets.)

Mobile Search System by Hashing

• Light computing, low bit rate, big-data indexing

He, Feng, Liu, Cheng, Lin, Chung, Chang. Mobile Product Search with Bag of Hash Bits and Boundary Reranking, CVPR 2012.

Estimate the Complexity

• 500 local features per image
  – Feature size ~128 KB per image
  – More than 10 seconds for transmission over 3G
• Database indexing
  – 1 million images need 0.5 billion local features
  – Finding matched features becomes challenging
• Idea: directly compute compact hash codes on mobile devices

Approach: Hashing

• Each local feature is coded as hash bits
  – Locality-sensitive, efficient for high dimensions
• Each image is represented as a Bag of Hash Bits, e.g.:

  011001100100111100…
  110110011001100110…

Bit Reuse for Multi-Table Hashing

• To reduce transmission size: reuse a single hash bit pool by random subsampling
• Optimal hash bit pool (e.g., 80 bits, PCA Hash or SPICA Hash)
• Random subsets of the pool form Table 1, Table 2, …, Table 12 (32 bits each)
• Union the results from all tables (see the sketch below)
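A sketch of the bit-reuse scheme with the numbers from the slide (80-bit pool, 12 tables of 32 bits each); the sampling seed and key format are illustrative.

```python
# Bit reuse: hash each feature once into a single 80-bit pool, then build 12
# tables of 32 bits each by randomly subsampling bit positions, so only 80
# bits per feature are ever transmitted from the phone.
import numpy as np

def make_bit_subsets(pool_bits=80, n_tables=12, bits_per_table=32, seed=0):
    rng = np.random.default_rng(seed)
    return [rng.choice(pool_bits, size=bits_per_table, replace=False)
            for _ in range(n_tables)]

def table_keys(code, subsets):
    """code: length-80 0/1 array; returns one bucket key per table."""
    return [tuple(code[idx]) for idx in subsets]

# At query time, candidate sets retrieved from the 12 tables are unioned.
```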

Rerank Results with Boundary Features

• Use automatic salient-object segmentation for every image in the DB [Cheng et al., CVPR 2011]
• Compute boundary features: normalized central distance, Fourier magnitude
• Invariance: translation, scaling, rotation

Boundary Feature – Central Distance: distance to center D(n), then FFT: F(n) (see the sketch below)
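A hedged sketch of this boundary feature as described: normalized central distance followed by Fourier magnitudes; the sampling density and the number of kept coefficients are assumptions.

```python
# Boundary feature sketch: sample the distance from boundary points to the
# shape center (translation invariance via centering, scale invariance via
# mean normalization) and keep FFT magnitudes (rotation / start-point
# invariance, since rotation circularly shifts D(n)).
import numpy as np

def boundary_feature(boundary_xy, n_samples=128, n_coeffs=32):
    """boundary_xy: (m, 2) ordered boundary points of a segmented object."""
    idx = np.linspace(0, len(boundary_xy) - 1, n_samples).astype(int)
    pts = boundary_xy[idx]
    center = pts.mean(axis=0)                  # translation invariance
    D = np.linalg.norm(pts - center, axis=1)   # central distance D(n)
    D = D / D.mean()                           # scale invariance
    return np.abs(np.fft.fft(D))[:n_coeffs]    # Fourier magnitude F(n)
```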

Reranking with Boundary Features

Server:
• 1 million product images crawled from Amazon, eBay, and Zappos
• Hundreds of categories: shoes, clothes, electrical devices, groceries, kitchen supplies, movies, etc.

Speed:
• Feature extraction: ~1 s
• Transmission: 80 bits/feature, ~1 KB/image
• Server search: ~0.4 s
• Download/display: 1–2 s

Mobile Product Search System: Bag of Hash Bits and Boundary Features

Video demo (52″)

He, Feng, Liu, Cheng, Lin, Chung, Chang. Mobile Product Search with Bag of Hash Bits and Boundary Reranking, CVPR 2012.

Performance:
• Baseline [Chandrasekhar et al., CVPR ’10]: client compresses local features with CHoG; server uses BoW with a vocabulary tree (1M codes)
• Result: 30% higher recall and 6×–30× search speedup

Summary

• Some ideas discussed:
  – Bucket balancing is important
  – Code inner product: an efficient form of supervised hashing
  – Insights on predicting search difficulty
  – Large-scale mobile search: a good test case for hashing
• Open issues:
  – Supervised hashing vs. attribute discovery
  – Hashing beyond point-to-point search
  – Hashing that incorporates structured relations (spatio-temporal)

References

• (Supervised Kernel Hash) W. Liu, J. Wang, R. Ji, Y. Jiang, and S.-F. Chang. Supervised Hashing with Kernels. CVPR 2012.
• (Difficulty of Nearest Neighbor Search) J. He, S. Kumar, and S.-F. Chang. On the Difficulty of Nearest Neighbor Search. ICML 2012.
• (Hash-Based Mobile Product Search) J. He, T. Lin, J. Feng, X. Liu, and S.-F. Chang. Mobile Product Search with Bag of Hash Bits and Boundary Reranking. CVPR 2012.
• (Hashing with Graphs) W. Liu, J. Wang, S. Kumar, and S.-F. Chang. Hashing with Graphs. ICML 2011.
• (Iterative Quantization) Y. Gong and S. Lazebnik. Iterative Quantization: A Procrustean Approach to Learning Binary Codes. CVPR 2011.
• (Semi-Supervised Hash) J. Wang, S. Kumar, and S.-F. Chang. Semi-Supervised Hashing for Scalable Image Retrieval. CVPR 2010.
• (ICA Hashing) J. He, R. Radhakrishnan, S.-F. Chang, and C. Bauer. Compact Hashing with Joint Optimization of Search Accuracy and Time. CVPR 2011.