Scalable image recognition model with deep embedding
![Page 2: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/2.jpg)
Motivation
![Page 3: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/3.jpg)
Motivation: the rise of DNNs
• Deep neural networks achieve the best performance on a variety of visual tasks.
![Page 4: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/4.jpg)
Motivation: the spread of mobile devices
• Devices such as smartphones, in-car cameras, GoPros, and IoT devices are becoming widespread.
![Page 5: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/5.jpg)
A huge number of valuable images are stored not on servers but on mobile and IoT devices
![Page 6: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/6.jpg)
Motivation: exploit DNN
• High performance brought by DNNs
• Valuable data brought by mobile and IoT devices
How can we exploit the best of both worlds?
![Page 7: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/7.jpg)
Solution: client-server system
La Tour Eiffel
Averaging 7-12 sec
Can't support real-time applications
![Page 8: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/8.jpg)
Or, another way
![Page 9: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/9.jpg)
Solution: pure mobile system
[Diagram labels: Dataset, LibLinear, Feature extraction, Classification, or Further processing]
Send the low-dimensional feature to the server for more complicated jobs
![Page 10: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/10.jpg)
Problem: limited storage and computing power
• A DNN model has too many parameters to fit in a storage- and compute-limited system such as a mobile or IoT device.
• How can we perform image classification on mobile and IoT devices?
![Page 11: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/11.jpg)
Krizhevsky et al. model size (AlexNet)
A. Krizhevsky, I. Sutskever, and G. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” NIPS, 2012.
Layer: Model size (MB)
Conv1: float*(48+48)*(3*11^2) = 0.1
Conv2: float*(128+128)*(48*5^2) = 1.2
Conv3: float*(192+192)*(256*3^2) = 3.4
Conv4: float*(192+192)*(192*3^2) = 2.5
Conv5: float*(128+128)*(192*3^2) = 1.7
FC6: float*((128+128)*6^2)*4096 = 144 (66%)
FC7: float*4096*4096 = 64 (29%)
Total = 217 MB
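The per-layer sizes above can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: `float` is taken to be 4 bytes, 1 MB is taken to be 2^20 bytes, and biases and the final FC8 layer are ignored, as in the slide's formulas. The variable names are illustrative only.

```python
# Assumptions for this sketch: 4-byte floats, 1 MB = 2**20 bytes,
# biases and the FC8 layer ignored (as in the slide's formulas).
BYTES_PER_FLOAT = 4
MIB = 1024 ** 2

# Weight counts per layer, copied from the slide's expressions.
layers = {
    "Conv1": (48 + 48) * (3 * 11**2),
    "Conv2": (128 + 128) * (48 * 5**2),
    "Conv3": (192 + 192) * (256 * 3**2),
    "Conv4": (192 + 192) * (192 * 3**2),
    "Conv5": (128 + 128) * (192 * 3**2),
    "FC6": ((128 + 128) * 6**2) * 4096,
    "FC7": 4096 * 4096,
}

# Convert weight counts to megabytes and sum them.
sizes_mb = {name: n * BYTES_PER_FLOAT / MIB for name, n in layers.items()}
total_mb = sum(sizes_mb.values())

for name, mb in sizes_mb.items():
    print(f"{name}: {mb:6.1f} MB ({mb / total_mb:.0%} of total)")
print(f"Total: {total_mb:.0f} MB")

# Share of the model taken by the two fully connected layers.
fc_share = (sizes_mb["FC6"] + sizes_mb["FC7"]) / total_mb
```

The fully connected layers dominate: FC6 and FC7 alone account for 208 of the roughly 217 MB.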
![Page 12: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/12.jpg)
Solution: semantic-rich low-dim. feature
• In recent years, the activations of the fully connected layers of the AlexNet model have come to be viewed as a general, high-level semantic feature.
• 95% of the model parameters belong to the fully connected layers.
![Page 13: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/13.jpg)
Solution: semantic-rich low-dim. feature
Drop the fully connected layers from the final model while still encoding their information!
![Page 14: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/14.jpg)
How ?
![Page 15: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/15.jpg)
Kernel Preserving Projection (KPP)
• Find a linear transformation that projects features into a lower-dimensional space while “preserving the relevance distance in kernel space”.
Y.-C. Su et al., “Scalable Mobile Visual Classification by Kernel Preserving Projection over High Dimensional Features,” IEEE, 2014.
![Page 16: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/16.jpg)
Kernel Preserving Projection (KPP)
• Find an explicit transform such that:
• In matrix representation, we want to find a matrix
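As a rough illustration of the idea (not the formulation of Su et al.), a kernel-preserving linear projection can be sketched as follows. The sketch assumes the goal is a matrix `P` such that inner products of projected features approximate kernel values. The eigendecomposition-plus-ridge fit, the function names, and the parameters `gamma`, `m`, and `reg` are all assumptions made for this example, not the MVProjection or L1MVProjection methods named in these slides.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """RBF kernel matrix for the rows of X (shape: n samples x d dims)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def fit_projection(X, K, m, reg=1e-6):
    """Fit P (d x m) so that (X P)(X P)^T approximates K (hypothetical helper)."""
    # Target embedding Y from the top-m eigenpairs of K, so K ~= Y Y^T.
    w, V = np.linalg.eigh(K)            # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:m]
    Y = V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))
    # Ridge-regularized least squares for X P ~= Y.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ Y)

# Toy data: 40 samples with 60-dimensional features, projected to 10 dims.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 60))
K = rbf_kernel(X, gamma=1.0 / X.shape[1])
P = fit_projection(X, K, m=10)

Z = X @ P                                # low-dimensional features
rel_err = np.linalg.norm(Z @ Z.T - K) / np.linalg.norm(K)
print(P.shape, f"relative kernel error: {rel_err:.2f}")
```

Once `P` is learned offline, only the `d x m` matrix needs to be stored on the device, and a linear classifier can be trained directly on `Z`.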
![Page 17: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/17.jpg)
Kernel Preserving Projection(KPP)
• MVProjection:
• L1MVProjection:
![Page 18: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/18.jpg)
Deep Embedding
• Experimental results show that on hand-crafted features, the RBF kernel performs best.
• Though infinite-dimensional, the RBF space itself is semantically meaningless!
![Page 19: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/19.jpg)
Deep Embedding
• For RBF kernel,
• For Deep Embedding,
![Page 20: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/20.jpg)
Deep Embedding
![Page 21: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/21.jpg)
Not only is the model reduced, but also the classifier
![Page 22: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/22.jpg)
Result
In the experiment, we use LIBLINEAR as our classifier and perform 10-fold cross-validation on the Scene15 benchmark dataset. We first compare KPP (RBF) with other methods on a state-of-the-art hand-crafted feature (VLAD) to show how KPP outperforms them.
![Page 23: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/23.jpg)
Result
![Page 24: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/24.jpg)
Result-Deep Embed
- The accuracy boost from 75.6% (hand-crafted) to 89.5% (AlexNet) shows the power of DNNs.
- Deep embedding outperforms the other methods by a large margin on DNN features.
The final model results in:
- Only 14% of the parameters required, 86% of the space saved (217 MB -> 30 MB).
- Accuracy drops by only 1.12 percentage points (89.5% -> 88.38%).
- Suitable for computing on mobile and IoT devices!
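The percentages quoted above follow directly from the reported figures. A minimal check, with variable names chosen here purely for illustration:

```python
# Figures as reported on the slide.
full_mb, reduced_mb = 217, 30
base_acc, embed_acc = 89.5, 88.38

kept = reduced_mb / full_mb      # fraction of the original model retained
drop = base_acc - embed_acc      # accuracy loss in percentage points

print(f"kept {kept:.0%} of the model, saved {1 - kept:.0%}")
print(f"accuracy drop: {drop:.2f} points")
```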
![Page 25: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/25.jpg)
Result-Deep Embed
[Figure residue: model size labels, including 30 MB]
![Page 27: Scalable image recognition model with deep embedding](https://reader038.fdocuments.in/reader038/viewer/2022103118/55c324b4bb61ebbe128b45d5/html5/thumbnails/27.jpg)
Thank you !