DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)
![Page 1: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/1.jpg)
DeepFix: A Fully Convolutional Neural Network for Predicting Human Fixations
Srinivas S S Kruthiventi, Kumar Ayush, and R. Venkatesh Babu (arXiv October 2015) [URL]
Slides by Xavier Giró-i-Nieto, from the Computer Vision Reading Group (27/10/2015). https://imatge.upc.edu/web/teaching/computer-vision-reading-group
![Page 2: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/2.jpg)
Introduction
2
![Page 3: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/3.jpg)
Introduction
3
Bottom-up attention
● Automatic
● Reflexive
● Stimulus-driven
![Page 4: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/4.jpg)
Introduction
4
Top-down attention
● Subject’s prior knowledge
● Expectations
● Task-oriented
● Memory
● Behavioral goals
![Page 5: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/5.jpg)
Introduction
5
Visual Attentional Mechanisms
Bottom-up
● Automatic
● Reflexive
● Stimulus-driven
Top-down
● Subject’s prior knowledge
● Expectations
● Task-oriented
● Memory
● Behavioral goals
![Page 6: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/6.jpg)
Introduction
![Page 7: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/7.jpg)
Introduction
7
DeepFix
Classic method
![Page 10: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/10.jpg)
The ingredients
10
![Page 11: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/11.jpg)
Very deep network
11
Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014)
● Inspired by Oxford’s VGG net (19 layers).
● 20 layers.
● Small kernel sizes.
![Page 12: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/12.jpg)
Fully convolutional network (FCN)
12
● Fully connected layers at the end are replaced by convolutional layers with very large receptive fields.
● They capture the global context of the scene.
● End-to-end training
Long, J., Shelhamer, E., & Darrell, T. (2015). Fully Convolutional Networks for Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3431-3440)
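As a rough illustration of the general idea behind fully convolutional networks, the sketch below checks that a fully connected layer applied to a flattened feature map gives the same result as a convolution whose kernel covers the whole map. The shapes and the random weights are hypothetical, and this is only the textbook equivalence, not the specific layers used in DeepFix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature map: 4 channels, 3x3 spatial extent.
features = rng.standard_normal((4, 3, 3))

# A fully connected layer with 2 outputs acting on the flattened features.
fc_weights = rng.standard_normal((2, 4 * 3 * 3))
fc_out = fc_weights @ features.reshape(-1)

# The same weights viewed as 2 convolution kernels of size 4x3x3.
conv_kernels = fc_weights.reshape(2, 4, 3, 3)

# With a kernel as large as the input there is a single valid position,
# so the convolution reduces to one weighted sum per output filter.
conv_out = np.einsum("ochw,chw->o", conv_kernels, features)

print(np.allclose(fc_out, conv_out))
```

On a larger input the convolutional form would slide over the feature map and return a spatial grid of outputs instead of a single vector, which is what makes dense prediction possible.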
![Page 13: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/13.jpg)
13
Inception layers
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., ... & Rabinovich, A. (2015). Going Deeper With Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1-9)
● GoogLeNet
● Different kernel sizes operating in parallel.
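A minimal sketch of "different kernel sizes operating in parallel", assuming a single-channel input and fixed averaging kernels in place of learnt ones. The function names `conv2d_same` and `inception_like` are made up for this example, and the real inception module also contains 1x1 reductions and pooling branches that are omitted here.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Naive 2-D cross-correlation with zero padding (odd kernel sizes only)."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def inception_like(image, kernel_sizes=(1, 3, 5)):
    """Run one branch per kernel size and stack the outputs as channels."""
    branches = [
        conv2d_same(image, np.full((k, k), 1.0 / (k * k)))  # averaging kernel
        for k in kernel_sizes
    ]
    return np.stack(branches, axis=0)  # shape: (num_branches, H, W)

image = np.arange(36, dtype=float).reshape(6, 6)
feature_maps = inception_like(image)
print(feature_maps.shape)
```

Each branch sees the same input at a different spatial scale, and the stacked output lets the next layer combine them.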
![Page 14: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/14.jpg)
14
Location Biased Convolutional (LBC) layer
● Centre-bias
![Page 15: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/15.jpg)
The network
15
![Page 16: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/16.jpg)
Architecture
16
Small convolutional filters of 3x3 with stride of 1 to allow a large depth without increasing the memory requirement
![Page 17: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/17.jpg)
Architecture
17
Max pooling layers (in red) reduce computation.
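To make the saving concrete, here is a small sketch of 2x2 max pooling with stride 2, which is an assumption since the slide does not give the pooling size or stride. Each pooling step of this kind leaves a quarter of the spatial positions for the following layers to process.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a single-channel map."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    trimmed = feature_map[:h2 * 2, :w2 * 2]  # drop an odd last row/column
    # Group pixels into 2x2 blocks and keep the maximum of each block.
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))

x = np.arange(16).reshape(4, 4)
pooled = max_pool_2x2(x)
print(pooled.shape)
```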
![Page 18: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/18.jpg)
Architecture
18
Gradual increase in the number of channels to progressively learn richer semantic representations: 64, 128, 256, 512...
![Page 19: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/19.jpg)
Architecture
19
Weights initialized from VGG-16 net for stable and effective learning
![Page 20: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/20.jpg)
Architecture
20
A 3x3 convolution kernel with hole size 2 has a receptive field of 5x5.
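The arithmetic behind this figure can be sketched as follows, assuming "hole size" means the spacing between adjacent kernel taps (so hole size 1 is an ordinary convolution). The helper name `effective_kernel_size` is made up for this example.

```python
def effective_kernel_size(kernel_size, hole_size):
    """Spatial extent covered by a kernel whose taps are hole_size pixels apart."""
    # (hole_size - 1) zeros are inserted between each pair of adjacent taps.
    return kernel_size + (kernel_size - 1) * (hole_size - 1)

print(effective_kernel_size(3, 2))
```

The number of weights stays at 3x3; only the area they cover grows.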
![Page 21: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/21.jpg)
Architecture
21
Capture multi-scale semantic structure using two inception-style convolutional modules
![Page 22: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/22.jpg)
Architecture
22
Very large receptive fields of 25x25 by introducing holes of size 6 in kernels
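The slide does not state the kernel size for these layers. As a hedged sketch, the code below builds the zero-filled ("holed") version of a kernel and shows that a 5x5 kernel with hole size 6 would cover 25x25, whereas a 3x3 kernel with the same hole size would cover only 13x13. The 5x5 size is therefore an assumption inferred from the numbers on the slide, and `dilate_kernel` is a name made up for this example.

```python
import numpy as np

def dilate_kernel(kernel, hole_size):
    """Spread the taps of a square kernel hole_size pixels apart, filling with zeros."""
    k = kernel.shape[0]
    size = k + (k - 1) * (hole_size - 1)
    dilated = np.zeros((size, size), dtype=kernel.dtype)
    dilated[::hole_size, ::hole_size] = kernel
    return dilated

print(dilate_kernel(np.ones((5, 5)), 6).shape)
print(dilate_kernel(np.ones((3, 3)), 6).shape)
```

The dilated kernel still has only 25 non-zero weights, so the receptive field grows without adding parameters.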
![Page 23: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/23.jpg)
Architecture
23
Location Biased Convolutional (LBC) layers
![Page 24: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/24.jpg)
Architecture
24
Location Biased Convolutional (LBC) layers
![Page 25: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/25.jpg)
Architecture
25
● Constant during training
● Learnt during training
● Weights from the c’th filter in a convolutional layer
● Input blob
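One plausible reading of these annotations is that the layer adds, to the usual filter response on the input blob, a second term computed from fixed location maps with their own learnt weights. The sketch below follows that reading under several assumptions: 1x1 kernels for brevity, Gaussian centre maps as the constant location maps, and hypothetical names (`centre_prior`, `lbc_response`) and values throughout. It is not taken from the paper.

```python
import numpy as np

def centre_prior(height, width, sigma):
    """Gaussian map peaking at the image centre (kept constant during training)."""
    ys = np.linspace(-1.0, 1.0, height)[:, None]
    xs = np.linspace(-1.0, 1.0, width)[None, :]
    return np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma ** 2))

def lbc_response(features, feature_weights, location_maps, location_weights, bias=0.0):
    """Location-biased response of one output filter, using 1x1 kernels.

    features:         (C, H, W) input blob
    feature_weights:  (C,) learnt weights of the filter on the input blob
    location_maps:    (K, H, W) constant location maps
    location_weights: (K,) learnt weights on the location maps
    """
    data_term = np.tensordot(feature_weights, features, axes=1)
    location_term = np.tensordot(location_weights, location_maps, axes=1)
    return np.maximum(data_term + location_term + bias, 0.0)  # ReLU

features = np.zeros((2, 5, 5))  # a blank input isolates the location term
location_maps = np.stack([centre_prior(5, 5, 0.5), centre_prior(5, 5, 1.0)])
response = lbc_response(features, np.ones(2), location_maps, np.array([1.0, 0.5]))
print(response.shape)
```

With a blank input the response is highest at the centre, which is how such a layer could encode a centre bias that an ordinary, translation-invariant convolution cannot.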
![Page 26: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/26.jpg)
Architecture
26
The final output, of size W/8 x H/8, is upsampled.
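The slide does not say which interpolation is used. As a minimal sketch, the code below upsamples by a factor of 8 with nearest-neighbour repetition, chosen only because it is the simplest option; a smoother interpolation such as bilinear may well be what is used in practice.

```python
import numpy as np

def upsample_nearest(saliency, factor=8):
    """Repeat every pixel factor times along both axes."""
    return np.repeat(np.repeat(saliency, factor, axis=0), factor, axis=1)

small = np.array([[1, 2], [3, 4]])
full = upsample_nearest(small)
print(full.shape)
```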
![Page 27: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/27.jpg)
Experiments
27
![Page 28: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/28.jpg)
Training
28
2nd stage
MIT 1003
CAT2000
Mouse clicks from Microsoft COCO
The paper does not mention how to go from eye fixations to heat maps!
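For reference, a common convention in saliency work is to place a point at each fixation and blur the result with a Gaussian. The sketch below follows that convention; it is not taken from the paper, and the function names, `sigma`, and the peak normalisation are all assumptions.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Normalised 1-D Gaussian truncated at three standard deviations."""
    radius = int(3 * sigma)
    xs = np.arange(-radius, radius + 1)
    kernel = np.exp(-xs ** 2 / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def fixations_to_heat_map(fixations, height, width, sigma=2.0):
    """Blur a map of (row, col) fixation points into a heat map with peak 1.

    Assumes the image is larger than the Gaussian kernel in both dimensions.
    """
    fixation_map = np.zeros((height, width))
    for row, col in fixations:
        fixation_map[row, col] += 1.0
    kernel = gaussian_kernel_1d(sigma)
    # Separable blur: filter along columns first, then along rows.
    blurred = np.apply_along_axis(np.convolve, 0, fixation_map, kernel, mode="same")
    blurred = np.apply_along_axis(np.convolve, 1, blurred, kernel, mode="same")
    peak = blurred.max()
    return blurred / peak if peak > 0 else blurred

heat = fixations_to_heat_map([(16, 16)], 32, 32)
print(heat.shape)
```

The choice of `sigma` matters in practice, since it is often tied to the visual angle covered by the fovea, and that choice is exactly the kind of detail the slide notes as missing.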
![Page 29: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/29.jpg)
Training
29
● End-to-end (as JuntingNet)
● Caffe framework
● 1 day on a K40 GPU!
![Page 30: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/30.jpg)
Results
30
![Page 31: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/31.jpg)
Results
31
![Page 32: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/32.jpg)
Results
32
![Page 33: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/33.jpg)
Results
33
![Page 34: DeepFix: a fully convolutional neural network for predicting human fixations (UPC Reading Group)](https://reader034.fdocuments.in/reader034/viewer/2022052116/587d6c971a28ab32318b6ec5/html5/thumbnails/34.jpg)
Results
34