Gang Wang, Derek Hoiem, David Forsyth. INTRODUCTION, APPROACH (implementation details), EXPERIMENTS, CONCLUSION.

Learning Image Similarity from Flickr Groups Using Stochastic Intersection Kernel Machines

ICCV 2009, UIUC

Gang Wang Derek Hoiem David Forsyth

INTRODUCTION

APPROACH (implementation details)

EXPERIMENTS

CONCLUSION

OUTLINE

Using online photo-sharing sites → Flickr (groups)

Determine which images are similar, and how they are similar

Learn these group membership likelihoods

Because learning the categories would take a long time, propose a new method for stochastic learning of SVMs using the Histogram Intersection Kernel (HIK): SIKMA

SIKMA combines [14] and [18]

Introduction

Related work: algorithm classes for training very large-scale kernel SVMs

i) Exploit the sparseness of the Lagrange multipliers → SMO [22]

ii) Use stochastic gradient descent, without touching every example

http://0rz.tw/BDHWJ

Kivinen [14] → method that applies to kernel machines

Maji [18] → very fast evaluation of a histogram intersection kernel

Construction

http://0rz.tw/BDHWJ

Flickr provides an organizational structure: how people like to group images

The SIKMA classifier allows efficient and accurate learning of these categories

This property generalizes well, even when the test dataset was not obtained from Flickr

Construction conclusion

Approach (SIKMA)

Suppose we have a list of training examples

For the test example u, compute the classification score
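
The formulas on this slide did not survive in the transcript. A hedged reconstruction, assuming the standard kernel-machine form and the usual definition of the histogram intersection kernel (the symbols \alpha_i and the index d are notation chosen here, not taken from the slide):

```latex
\[
  \{(x_1, y_1), \dots, (x_T, y_T)\}, \qquad y_i \in \{-1, +1\}
\]
\[
  f(u) = \sum_{i=1}^{T} \alpha_i \, K(u, x_i),
  \qquad
  K(u, x) = \sum_{d=1}^{D} \min(u_d, x_d)
\]
```

Because K is a sum over the D feature dimensions, f(u) also splits into D one-dimensional functions, one per dimension.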

Approach (SIKMA)

Approximate the gradient by replacing the sum over all examples (batch) with a sum over some subset, chosen at random. It is usual to consider a single example.

New decision function

It is expensive to calculate f_{t-1}. The NORMA algorithm [14] keeps a set of support vectors of fixed length by dropping the oldest ones.

Doing so comes at a considerable cost in accuracy!
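
The decision-function formula itself is missing from the transcript. As a sketch only, the single-example update used by NORMA-style learners with hinge loss can be written as below; the regularization weight \lambda, the learning rate \eta, and the indicator \sigma_t are assumed notation, not taken from the slide:

```latex
\[
  f_t = (1 - \lambda \eta)\, f_{t-1} + \eta \, \sigma_t \, y_t \, K(x_t, \cdot),
  \qquad
  \sigma_t =
  \begin{cases}
    1 & \text{if } y_t \, f_{t-1}(x_t) < 1, \\
    0 & \text{otherwise.}
  \end{cases}
\]
```

Every example with \sigma_t = 1 adds a support vector, which is why evaluating f_{t-1} grows expensive unless old support vectors are dropped.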

Approach (SIKMA)

D is the feature dimension

                            SIKMA     Conventional SVM solver
Computational complexity    O(TMD)    O(T^2 D)
Space cost                  O(MD)     O(T^2)

Evaluation is O(D) for each example

Approach

T: # of training examples; M: # of quantization bins; D: # of feature dimensions
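
To make the O(MD) space and O(D) evaluation costs concrete, here is a minimal sketch of a table-based stochastic learner for the histogram intersection kernel. It is not the authors' implementation: the class name `TableHIKMachine`, the parameters `lam`, `eta`, and `max_value`, the hinge loss, and the nearest-bin lookup are all assumptions made for illustration.

```python
import numpy as np


class TableHIKMachine:
    """Stochastic hinge-loss learner storing one lookup table per feature dimension."""

    def __init__(self, num_dims, num_bins, max_value=1.0, lam=1e-4, eta=0.1):
        # grid[m] is the feature value represented by bin m (shared by all dimensions).
        self.grid = np.linspace(0.0, max_value, num_bins)
        # tables[d, m] holds the per-dimension function at grid[m]: O(M * D) storage.
        self.tables = np.zeros((num_dims, num_bins))
        self.lam = lam
        self.eta = eta
        self._dims = np.arange(num_dims)

    def _bins(self, x):
        # Nearest-bin quantization of each feature value (an assumed simplification).
        step = self.grid[1] - self.grid[0]
        idx = np.rint(np.asarray(x, dtype=float) / step).astype(int)
        return np.clip(idx, 0, len(self.grid) - 1)

    def score(self, x):
        # One table lookup per dimension: O(D).
        return float(self.tables[self._dims, self._bins(x)].sum())

    def partial_fit(self, x, y):
        # y must be +1 or -1. Check the margin before changing the tables.
        violated = y * self.score(x) < 1.0
        # Shrink step from the regularizer.
        self.tables *= 1.0 - self.eta * self.lam
        if violated:
            # Add eta * y * min(s, x_d) at every grid value s: O(M * D) per example.
            x = np.asarray(x, dtype=float)
            self.tables += self.eta * y * np.minimum(self.grid[None, :], x[:, None])
```

One pass over T examples with this update costs O(TMD), and no support vectors need to be stored.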

Measuring image similarity

Similarity is measured with a simple Euclidean distance between the SVM outputs.

Since the groups have names, we can also perform text-based queries (e.g., get images like “people dancing”) and determine how two images are similar
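
A small sketch of the distance described above, assuming each image is represented by a vector with one SVM output per Flickr group. The function names are hypothetical.

```python
import numpy as np


def prediction_distance(scores_a, scores_b):
    """Euclidean distance between two vectors of per-group SVM outputs."""
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    return float(np.linalg.norm(a - b))


def rank_by_similarity(query_scores, database_scores):
    """Return database row indices ordered from most to least similar to the query."""
    db = np.asarray(database_scores, dtype=float)
    query = np.asarray(query_scores, dtype=float)
    dists = np.linalg.norm(db - query, axis=1)
    return np.argsort(dists, kind="stable")
```

A smaller distance means the two images are predicted to belong to similar sets of groups.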

Approach

Use four types of features:

SIFT feature: detect and describe local patches

Gist feature: 960-dimensional Gist descriptor

Color feature: RGB space, values range from 1 to 512

Gradient feature: the whole image is represented as a 256-dimensional vector
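
One way a color value in the range 1 to 512 can arise is by quantizing each RGB channel into 8 levels (8 x 8 x 8 = 512 codes). The slide does not state this, so the quantization scheme, the function names, and the histogram step below are assumptions for illustration only.

```python
import numpy as np


def color_codes(rgb_image, levels=8):
    """Map each 8-bit RGB pixel to an integer code in 1..levels**3."""
    img = np.asarray(rgb_image, dtype=np.uint16)
    # Each channel becomes an integer in 0..levels-1.
    q = img * levels // 256
    r, g, b = q[..., 0], q[..., 1], q[..., 2]
    return r * levels * levels + g * levels + b + 1


def color_histogram(rgb_image, levels=8):
    """Count how many pixels fall on each color code."""
    codes = color_codes(rgb_image, levels).ravel().astype(np.int64)
    return np.bincount(codes - 1, minlength=levels ** 3)
```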

Implementation details

Combine the outputs of these four classifiers into a final prediction, using a validation data set

SIKMA Training Time and Test Accuracy

For 103 Flickr categories, using 15,000 to 30,000 positive images and 60,000 negative images.

The average AP over these categories is 0.433
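
For reference, a sketch of how average precision (AP) can be computed for one category. The slides do not say which AP variant was used; this is the plain non-interpolated form, and the function name is hypothetical.

```python
def average_precision(scores, labels):
    """Non-interpolated AP: mean of precision at the rank of each positive example."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0
```

The reported figure would then be the mean of this value over the 103 categories.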

Experiments

Experiments: image matching with feedback

Select the top five negative examples and five randomly chosen positive examples from among the top 50 ranked images

y_i is 1 if example i is positive, otherwise 0
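
The selection step above can be sketched as follows. The function name, its parameters, and the fixed random seed are assumptions; how the selected labels are then used is not recoverable from the transcript.

```python
import random


def select_feedback(ranked_ids, is_positive, top_k=50, n_neg=5, n_pos=5, seed=0):
    """Pick feedback examples from the top_k ranked images.

    Returns a dict mapping image id to y_i (1 for positive, 0 for negative).
    """
    rng = random.Random(seed)
    top = list(ranked_ids[:top_k])
    # Highest-ranked negatives, in ranking order.
    negatives = [i for i in top if not is_positive[i]][:n_neg]
    # Positives drawn at random from the same top_k window.
    candidates = [i for i in top if is_positive[i]]
    positives = rng.sample(candidates, min(n_pos, len(candidates)))
    labels = {i: 0 for i in negatives}
    labels.update({i: 1 for i in positives})
    return labels
```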

Experiments

Experiments: text-based queries

A Flickr category can be described with several words, so we can support text-based queries.

Given a word query, find the Flickr groups whose descriptions contain that word

Test this on the Corel data set, with two queries: “airplane” and “sunset”.
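
A minimal sketch of the lookup step, assuming group descriptions are available as plain strings. The function name and the example group names and descriptions are hypothetical.

```python
import re


def groups_matching_query(query, group_descriptions):
    """Return names of groups whose description contains the query word."""
    word = query.strip().lower()
    matches = []
    for name, text in group_descriptions.items():
        # Split the description into lowercase alphanumeric tokens.
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        if word in tokens:
            matches.append(name)
    return matches


# Hypothetical descriptions, for illustration only.
example_groups = {
    "group_a": "Airplane and aircraft photos",
    "group_b": "Sunset and sunrise skies",
    "group_c": "People dancing",
}
```

Images can then be ranked by the outputs of the classifiers for the matching groups.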

Conclusion

SIKMA: an algorithm to quickly train an SVM with the histogram intersection kernel using tens of thousands of training examples

Two images that are likely to belong to the same Flickr groups are considered similar.

Experimental results show that matching with prediction features works better than matching with visual features
