Learning Image Similarity from Flickr Groups Using
Stochastic Intersection Kernel Machines
ICCV 2009, UIUC
Gang Wang, Derek Hoiem, David Forsyth
OUTLINE
INTRODUCTION
APPROACH (implementation details)
EXPERIMENTS
CONCLUSION
Use online photo-sharing sites → Flickr (groups)
Determine which images are similar, and how they are similar
Learn these group membership likelihoods
Because learning this many categories would take a long time, propose a new method for stochastic learning of SVMs using the Histogram Intersection Kernel (HIK): SIKMA
Combines [14] and [18]
Introduction
Related work: algorithm classes for training very large-scale kernel SVMs
i) Exploit the sparseness of the Lagrange multipliers → SMO [22]
ii) Use stochastic gradient descent, without touching every example
http://0rz.tw/BDHWJ
Kivinen [14] → method applies to kernel machines
Maji [18] → very fast evaluation of a histogram intersection kernel
Construction
Flickr provides an organizational structure: how people like to group
The SIKMA classifier allows efficient and accurate learning of these categories
This property generalizes well, even when the test dataset was not obtained from Flickr
Construction conclusion
Approach(SIKMA)
Suppose we have a list of training examples
For the test example u, compute the classification score
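The equations from this slide did not survive in the transcript. As a hedged reminder only, a kernel machine with the histogram intersection kernel usually takes the form below. The coefficients \alpha_i and the omission of a bias term are assumptions, not the slide's exact notation; T and D are the number of training examples and feature dimensions.

```latex
f(u) = \sum_{i=1}^{T} \alpha_i \, K(u, x_i),
\qquad
K(u, x) = \sum_{d=1}^{D} \min(u_d, x_d)
```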
Approach(SIKMA)
Approximate the gradient by replacing the sum over all examples (batch) with a sum over a randomly chosen subset. It is usual to consider a single example.
New decision function
It is expensive to calculate f_{t-1}. The NORMA algorithm [14] keeps a fixed-size set of support vectors by dropping the oldest ones.
Doing so comes at a considerable cost in accuracy!
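To make the cost of evaluating f_{t-1} concrete, here is a minimal sketch of single-example stochastic gradient descent for a kernel machine, in the spirit of NORMA [14]. The hinge loss, the function names, and the parameters eta (step size), lam (regularization) and max_sv (truncation length) are assumptions for illustration, not the authors' code.

```python
import random


def hik(u, x):
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(u, x))


def predict(support, u):
    """f(u) as a kernel expansion over the stored (alpha, x) pairs."""
    return sum(alpha * hik(u, sx) for alpha, sx in support)


def train_kernel_sgd(examples, eta=0.1, lam=0.01, max_sv=None, seed=0):
    """One randomized pass over (x, y) pairs with y in {-1, +1}.

    If max_sv is set, only the newest max_sv support vectors are kept,
    which mimics dropping the oldest ones.
    """
    rng = random.Random(seed)
    order = list(range(len(examples)))
    rng.shuffle(order)
    support = []
    for t in order:
        x, y = examples[t]
        # Evaluating f_{t-1}(x_t) touches every stored support vector.
        score = predict(support, x)
        # Regularization shrinks all existing coefficients.
        support = [(alpha * (1.0 - eta * lam), sx) for alpha, sx in support]
        # Hinge loss: the example contributes only if the margin is violated.
        if y * score < 1:
            support.append((eta * y, x))
        if max_sv is not None and len(support) > max_sv:
            support = support[-max_sv:]
    return support
```

Without truncation the expansion grows with the number of margin violations, so each step gets slower. With truncation the cost is bounded, but older examples are forgotten.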
Approach(SIKMA)
                            SIKMA     Conventional SVM solver
Computational complexity    O(TMD)    O(T^2 D)
Space cost                  O(MD)     O(T^2)
Evaluation for each example is O(D), where D is the feature dimension
Approach
T: # of training examples; M: # of quantization bins; D: # of feature dimensions
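One way to read the O(MD) space and O(D) evaluation figures: because the intersection kernel is a sum over dimensions, the decision function can be written as f(u) = sum_d h_d(u_d), and each h_d can be stored as a table over M quantization levels. The sketch below assumes features scaled to [0, 1], evenly spaced levels, nearest-level lookup, and a hinge-loss update. These choices and all names are illustrative, not the authors' implementation.

```python
def make_grid(num_bins):
    """Quantization levels evenly spaced in [0, 1] (assumed feature range)."""
    return [m / (num_bins - 1) for m in range(num_bins)]


def new_tables(dim, num_bins):
    """One table of M values per feature dimension: O(MD) memory."""
    return [[0.0] * num_bins for _ in range(dim)]


def evaluate(tables, grid, u):
    """O(D): one lookup per dimension at the nearest quantization level."""
    last = len(grid) - 1
    total = 0.0
    for d, value in enumerate(u):
        m = min(last, max(0, round(value * last)))
        total += tables[d][m]
    return total


def sgd_step(tables, grid, x, y, eta=0.1, lam=0.01):
    """O(MD): update every table entry for one training example."""
    score = evaluate(tables, grid, x)
    coeff = eta * y if y * score < 1 else 0.0  # hinge-loss subgradient
    shrink = 1.0 - eta * lam                   # regularization
    for d, value in enumerate(x):
        row = tables[d]
        for m, level in enumerate(grid):
            row[m] = shrink * row[m] + coeff * min(level, value)
    return score
```

Calling sgd_step once per training example gives O(TMD) total work, and no support vectors need to be stored or dropped.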
Measuring image similarity
Use a simple Euclidean distance between the SVM outputs.
Since the groups have names, we can also perform text-based queries (e.g., get images like "people dancing") and determine how two images are similar
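The similarity measure can be sketched as follows, assuming each image is represented by a list of per-group SVM outputs. The function names and the dictionary layout are hypothetical.

```python
import math


def prediction_distance(outputs_a, outputs_b):
    """Euclidean distance between two vectors of per-group SVM outputs."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(outputs_a, outputs_b)))


def rank_by_similarity(query_outputs, database):
    """Return image ids sorted from most to least similar to the query.

    database maps an image id to its vector of SVM outputs.
    """
    return sorted(
        database,
        key=lambda image_id: prediction_distance(query_outputs, database[image_id]),
    )
```

The groups with the largest outputs for both images give a rough indication of how the two images are similar.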
Approach
Use four types of features:
SIFT feature: detect and describe local patches
Gist feature: 960-dimensional Gist descriptor
Color feature: RGB space, values range from 1 to 512
Gradient feature: the whole image is represented as a 256-dimensional vector
Implementation details
Combine the outputs of these four classifiers into a final prediction on a validation data set
SIKMA Training Time and Test Accuracy
For 103 Flickr categories, using 15,000 to 30,000 positive images and 60,000 negative images.
The average AP over these categories is 0.433
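For reference, average precision (AP) for one category can be computed as in this sketch. It uses the common non-interpolated definition; the slides do not say which AP variant was used, so treat that as an assumption.

```python
def average_precision(scores, labels):
    """Non-interpolated AP: mean precision at the rank of each positive.

    scores: classifier outputs, higher means more confident.
    labels: 1 for positive images, 0 for negative images.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0
```

Averaging this value over all categories gives a figure comparable in kind to the one reported above.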
Experiments
Experiments
Image matching with feedback
Select the top five negative examples and five randomly chosen positive examples from among the top 50 ranked images
y_i is 1 if the example is positive, otherwise 0
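The feedback selection step can be sketched like this. The function name, the seed parameter, and the data layout (a ranked list of image ids plus a positive/negative lookup) are assumptions for illustration.

```python
import random


def select_feedback(ranked_ids, is_positive, top_k=50, per_class=5, seed=0):
    """Pick feedback examples from the top_k ranked images.

    Returns (positives, negatives, labels), where labels[i] is 1 for a
    positive example and 0 for a negative one.
    """
    top = ranked_ids[:top_k]
    # Top-ranked negatives: the highest-ranked images marked as not relevant.
    negatives = [i for i in top if not is_positive[i]][:per_class]
    # Positives are drawn at random from the relevant images in the top_k.
    candidates = [i for i in top if is_positive[i]]
    rng = random.Random(seed)
    positives = rng.sample(candidates, min(per_class, len(candidates)))
    labels = {i: 1 for i in positives}
    labels.update({i: 0 for i in negatives})
    return positives, negatives, labels
```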
Experiments
Text-based queries
Since a Flickr category can be described with several words, we can support text-based queries.
Given a word query, find the Flickr groups whose descriptions contain that word
Test this on the Corel data set with two queries: "airplane" and "sunset".
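The group lookup behind a text query can be sketched as a word match against group descriptions. The group names and descriptions below are made up for illustration, and whole-word, case-insensitive matching is an assumption.

```python
def find_groups(query, group_descriptions):
    """Return names of groups whose description contains the query word."""
    word = query.lower()
    return [
        name
        for name, description in group_descriptions.items()
        if word in description.lower().split()
    ]


# Hypothetical group descriptions, not taken from the paper.
groups = {
    "aviation": "airplane and airport photos",
    "evening-sky": "sunset and sunrise over water",
    "pets": "dogs and cats",
}
matching = find_groups("sunset", groups)
```

Images can then be ranked by the SVM output of the matching groups' classifiers.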
Conclusion
SIKMA: an algorithm to quickly train an SVM with the histogram intersection kernel using tens of thousands of training examples
Two images that are likely to belong to the same Flickr groups are considered similar.
Experimental results show that matching with prediction features works better than matching with visual features