Convolutional Patch Representations for Image Retrieval An unsupervised approach
![Page 1: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/1.jpg)
Convolutional Patch Representations for Image Retrieval: an Unsupervised Approach
29th Mar 2016
Original slides by Eva Mohedano, Insight Centre for Data Analytics (Dublin City University)
Mattis Paulin, Julien Mairal, Matthijs Douze, Zaid Harchaoui, Florent Perronnin, Cordelia Schmid
![Page 2: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/2.jpg)
Overview
Published at ICCV 2015 (a.k.a. Local Convolutional Features With Unsupervised Training for Image Retrieval)
Deep Convolutional Architecture to produce patch-level descriptors
• Unsupervised framework
• Comparison in patch and retrieval datasets
• “RomePatches” dataset
![Page 3: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/3.jpg)
Related Work
• Shallow patch descriptors
• Deep learning for image retrieval
• Deep patch descriptors
![Page 4: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/4.jpg)
Related Work
• Shallow patch descriptors
SIFT – Scale-Invariant Feature Transform
- stereo matching
- retrieval
- classification
SURF, BRIEF, LIOP, (…)
Hand crafted → Relatively small number of parameters.
Note: A patch is a region extracted from an image.
![Page 5: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/5.jpg)
Related Work
• Deep learning for image retrieval
A CNN trained on a sufficiently large labeled dataset (ImageNet) produces intermediate-layer activations that can be used as image descriptors.
Those descriptors work for a wide variety of tasks, including image retrieval.
![Page 6: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/6.jpg)
Related Work
• Deep learning for image retrieval
source image: http://pubs.sciepub.com/ajme/2/7/9/
![Page 7: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/7.jpg)
Related Work
• Deep learning for image retrieval
source image: http://pubs.sciepub.com/ajme/2/7/9/
Fully connected layers → Global Image Descriptors
● Compact representation
● Lack of geometric invariance
Below the state of the art in image retrieval
Compute at different scales (Babenko, Razavian)
![Page 8: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/8.jpg)
Related Work
• Deep learning for image retrieval
source image: http://pubs.sciepub.com/ajme/2/7/9/
Convolutional layers
![Page 9: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/9.jpg)
Related Work
• Deep patch descriptors
Three different kinds of supervision:
1. Category labels of ImageNet. [Long et al, 2014]
2. Surrogate patch labels: each class is a given patch under different transformations. [Fischer et al, 2014]
3. Matching/non-matching pairs. [Simo-Serra et al, 2015]
These works focused on patch-level metrics, not image retrieval.
All approaches required some kind of supervision.
![Page 10: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/10.jpg)
Image Retrieval Pipeline
• Interest point detection
Hessian-Affine detector.
Rotation invariance.
• Interest point description
Feature representation in a Euclidean space
• Patch Matching
VLAD encoding.
Power normalization with exponent 0.5 + L2-norm.
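The encoding step of this pipeline can be sketched in NumPy. This is only an illustrative sketch, not the authors' code: the function name `vlad_encode`, the brute-force nearest-centroid assignment, and the assumption that the centroids come from a k-means run done elsewhere are all hypothetical choices made for this example.

```python
import numpy as np

def vlad_encode(descriptors, centroids, alpha=0.5):
    """Aggregate local descriptors (n, d) into one VLAD vector of size k*d.

    centroids: (k, d) visual words, assumed to be learned beforehand
    (e.g. with k-means); alpha is the power-normalization exponent.
    """
    # Squared Euclidean distance from every descriptor to every centroid.
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)

    k, d = centroids.shape
    vlad = np.zeros((k, d))
    for j in range(k):
        members = descriptors[nearest == j]
        if len(members):
            # Sum of residuals between descriptors and their assigned centroid.
            vlad[j] = (members - centroids[j]).sum(axis=0)

    vlad = vlad.ravel()
    # Power normalization (signed), here with exponent 0.5 as on the slide.
    vlad = np.sign(vlad) * np.abs(vlad) ** alpha
    # Final L2 normalization so images can be compared with a dot product.
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad
```

With this representation, two images can be compared by the dot product of their VLAD vectors.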
![Page 11: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/11.jpg)
![Page 12: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/12.jpg)
Convolutional Descriptors
Patch size = 51x51 – Optimal for SIFT on Oxford dataset.
CNN extended to retrieval by:
• Encoding local descriptors with a model trained on an unrelated classification task
• Devising a surrogate classification problem that is as related as possible to image retrieval
• Using unsupervised learning: Convolutional Kernel Network
![Page 13: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/13.jpg)
Convolutional Descriptors
• Using unsupervised learning: Convolutional Kernel Network
Feature representation based on a kernel (feature) map -- Data independent
![Page 14: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/14.jpg)
Convolutional Descriptors
• Using unsupervised learning: Convolutional Kernel Network
Projection in Hilbert space
Explicit kernel map can be computed to approximate it for computational efficiency.
- Sub-sample of patches
- Stochastic Gradient Optimization
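The idea of replacing the kernel with an explicit finite-dimensional map can be illustrated with a short sketch. It assumes the Gaussian-kernel approximation used in the Convolutional Kernel Network formulation (Mairal et al., 2014), where the kernel between two sub-patches is approximated by a dot product of features built from learned anchor points `W` and weights `eta`. The function names and the plain squared-error objective below are illustrative assumptions; the stochastic gradient optimization over `W` and `eta` is not shown.

```python
import numpy as np

def ckn_feature_map(x, W, eta, sigma):
    """Explicit map psi(x) whose dot products approximate a Gaussian kernel.

    x: (n, d) sub-patches, W: (p, d) anchor points, eta: (p,) weights >= 0.
    """
    d2 = ((x[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return np.sqrt(eta)[None, :] * np.exp(-d2 / sigma**2)

def approximation_loss(x, y, W, eta, sigma):
    """Mean squared gap between the exact kernel and its approximation
    over sampled pairs (x_i, y_i) of sub-patches."""
    exact = np.exp(-((x - y) ** 2).sum(axis=1) / (2 * sigma**2))
    approx = (ckn_feature_map(x, W, eta, sigma)
              * ckn_feature_map(y, W, eta, sigma)).sum(axis=1)
    return np.mean((exact - approx) ** 2)
```

Training would then amount to minimizing `approximation_loss` over `W` and `eta` on a random sub-sample of patch pairs, without any labels.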
![Page 15: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/15.jpg)
Convolutional Descriptors
• Using unsupervised learning: Convolutional Kernel Network
4 possible inputs
From left to right: CKN-raw, CKN-mean subs (mean subtraction), CKN-white (mean subtraction + PCA-whitening), CKN-grad (fully invariant to color)
Only CKN-raw, CKN-white and CKN-grad are evaluated.
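The input variants can be sketched as simple preprocessing functions. This is a minimal sketch under assumptions, not the paper's implementation: the helper names are hypothetical, sub-patches are assumed to be flattened into rows, the small `eps` regularizer is an added choice, and the gradient variant is assumed to operate on a single-channel patch.

```python
import numpy as np

def mean_subtract(patches):
    """Subtract each flattened sub-patch's own mean (rows of an (n, d) array)."""
    return patches - patches.mean(axis=1, keepdims=True)

def fit_pca_whitening(patches, eps=1e-5):
    """Estimate a PCA-whitening transform from a sample of sub-patches.

    Returns the dataset mean (d,) and a (d, d) matrix; eps avoids dividing
    by near-zero eigenvalues (an assumption of this sketch).
    """
    mean = patches.mean(axis=0)
    centered = patches - mean
    cov = centered.T @ centered / len(centered)
    eigvals, eigvecs = np.linalg.eigh(cov)
    return mean, eigvecs / np.sqrt(eigvals + eps)

def whiten(patches, mean, transform):
    """Apply a previously fitted whitening transform."""
    return (patches - mean) @ transform

def gradient_map(gray_patch):
    """Per-pixel (dx, dy) gradient of a single-channel patch, shape (h, w, 2)."""
    dy, dx = np.gradient(gray_patch.astype(float))
    return np.stack([dx, dy], axis=-1)
```

In this sketch, a CKN-white style input would apply `mean_subtract` followed by `whiten`, while a CKN-grad style input would use `gradient_map` on a grayscale patch, discarding color.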
![Page 16: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/16.jpg)
Experiments
Datasets:
1. RomePatches-Image
2. Oxford
3. UKbench and Holidays
CKN trained on 1M sub-patches for 300K iterations, with a mini-batch size of 1000.
![Page 17: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/17.jpg)
Experiments
![Page 18: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/18.jpg)
Conclusions
• CKNs offer similar, and sometimes better, performance than CNNs in the context of patch description.
• Good patch retrieval translates into good image retrieval.
• CKNs are orders of magnitude faster to train than CNNs (10 min vs 2-3 days on a modern GPU)
• Fully unsupervised – no labels.
![Page 19: Convolutional Patch Representations for Image Retrieval An unsupervised approach](https://reader031.fdocuments.in/reader031/viewer/2022030318/58ecdab31a28ab38568b46d9/html5/thumbnails/19.jpg)
Resources
RomePatches + Code (although the code is not accessible!)
Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks
- Code with augmentations in MATLAB
- Code for training models.
- Models already trained :-)
Triplet network + Code!
- Greyscale local patches of 32x32, tested on matching datasets