Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf ·...
Transcript of Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf ·...
![Page 1: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/1.jpg)
Imagenet
Day 2 Lecture 4
Xavier Giró-i-Nieto
![Page 2: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/2.jpg)
2
ImageNet ILSRVCLi Fei-Fei, “How we’re teaching computers to understand pictures” TEDTalks 2014.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... & Fei-Fei, L. (2015). Imagenet large scale visual recognition challenge. arXiv preprint arXiv:1409.0575. [web]
![Page 3: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/3.jpg)
3
ImageNet ILSRVC
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... & Fei-Fei, L. (2015). Imagenet large scale visual recognition challenge. arXiv preprint arXiv:1409.0575. [web]
![Page 4: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/4.jpg)
4
ImageNet ILSRVC
● 1,000 object classes (categories).
● Images:○ 1.2 M train○ 100k test.
![Page 5: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/5.jpg)
ImageNet ILSRVC
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... & Fei-Fei, L. (2015). Imagenet large scale visual recognition challenge. arXiv preprint arXiv:1409.0575. [web]
● Top 5 error rate
![Page 6: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/6.jpg)
Slide credit: Rob Fergus (NYU)
Image Classification 2012
-9.8%
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... & Fei-Fei, L. (2014). Imagenet large scale visual recognition challenge. arXiv preprint arXiv:1409.0575. [web] 6
ImageNet ILSRVCBased on SIFT + Fisher Vectors
![Page 7: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/7.jpg)
AlexNet (Supervision)
7Slide credit: Junting Pan, “Visual Saliency Prediction using Deep Learning Techniques” (ETSETB-UPC 2015)
Orange
A Krizhevsky, I Sutskever, GE Hinton “Imagenet classification with deep convolutional neural networks” Part of: Advances in Neural Information Processing Systems 25 (NIPS 2012)
![Page 8: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/8.jpg)
8Slide credit: Junting Pan, “Visual Saliency Prediction using Deep Learning Techniques” (ETSETB-UPC 2015)
AlexNet (Supervision)
![Page 9: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/9.jpg)
9Slide credit: Junting Pan, “Visual Saliency Prediction using Deep Learning Techniques” (ETSETB-UPC 2015)
AlexNet (Supervision)
![Page 10: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/10.jpg)
10Image credit: Deep learning Tutorial (Stanford University)
AlexNet (Supervision)
![Page 11: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/11.jpg)
11Image credit: Deep learning Tutorial (Stanford University)
AlexNet (Supervision)
![Page 12: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/12.jpg)
12Image credit: Deep learning Tutorial (Stanford University)
AlexNet (Supervision)
![Page 13: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/13.jpg)
13
RectifiedLinearUnit (non-linearity)
f(x) = max(0,x)
Slide credit: Junting Pan, “Visual Saliency Prediction using Deep Learning Techniques” (ETSETB-UPC 2015)
AlexNet (Supervision)
![Page 14: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/14.jpg)
14
Dot Product
Slide credit: Junting Pan, “Visual Saliency Prediction using Deep Learning Techniques” (ETSETB-UPC 2015)
AlexNet (Supervision)
![Page 15: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/15.jpg)
ImageNet Classification 2013
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... & Fei-Fei, L. (2015). Imagenet large scale visual recognition challenge. arXiv preprint arXiv:1409.0575. [web]
Slide credit: Rob Fergus (NYU)
15
ImageNet ILSRVC
![Page 16: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/16.jpg)
The development of better convnets is reduced to trial-and-error.
16
Zeiler-Fergus (ZF)
Visualization can help in proposing better architectures.
Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer Vision–ECCV 2014 (pp. 818-833). Springer International Publishing.
![Page 17: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/17.jpg)
“A convnet model that uses the same components (filtering, pooling) but in reverse, so instead of mapping pixels to features does the opposite.”
Zeiler, Matthew D., Graham W. Taylor, and Rob Fergus. "Adaptive deconvolutional networks for mid and high level feature learning." Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011.
17
Zeiler-Fergus (ZF)
![Page 18: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/18.jpg)
18
Zeiler-Fergus (ZF)
Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer Vision–ECCV 2014 (pp. 818-833). Springer International Publishing.
Dec
onvN
etC
onvN
et
![Page 19: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/19.jpg)
Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer Vision–ECCV 2014 (pp. 818-833). Springer International Publishing. 19
Zeiler-Fergus (ZF)
![Page 20: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/20.jpg)
Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer Vision–ECCV 2014 (pp. 818-833). Springer International Publishing. 20
Zeiler-Fergus (ZF)
![Page 21: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/21.jpg)
21
The smaller stride (2 vs 4) and filter size (7x7 vs 11x11) results in more distinctive features and fewer “dead" features.
AlexNet (Layer 1) ZF (Layer 1)
Zeiler-Fergus (ZF): Stride & filter size
![Page 22: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/22.jpg)
22
Cleaner features in ZF, without the aliasing artifacts caused by the stride 4 used in AlexNet.
AlexNet (Layer 2) ZF (Layer 2)
Zeiler-Fergus (ZF)
![Page 23: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/23.jpg)
23
Regularization with more dropout: introduced in the input layer.
Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. (2012). Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580.
Chicago
Zeiler-Fergus (ZF): Drop out
![Page 24: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/24.jpg)
24
Zeiler-Fergus (ZF): Results
![Page 25: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/25.jpg)
25
Zeiler-Fergus (ZF): Results
![Page 26: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/26.jpg)
ImageNet Classification 2013
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... & Fei-Fei, L. (2015). Imagenet large scale visual recognition challenge. arXiv preprint arXiv:1409.0575. [web]
-5%
26
E2E: Classification: ImageNet ILSRVC
![Page 27: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/27.jpg)
E2E: Classification
27
![Page 29: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/29.jpg)
E2E: Classification: GoogLeNet
29
● 22 layers, but 12 times fewer parameters than AlexNet.
Szegedy, Christian, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. "Going deeper with convolutions." CVPR 2015. [video] [slides] [poster]
![Page 30: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/30.jpg)
E2E: Classification: GoogLeNet
30
![Page 31: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/31.jpg)
E2E: Classification: GoogLeNet
31Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." ICLR 2014.
![Page 32: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/32.jpg)
E2E: Classification: GoogLeNet
32Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." ICLR 2014.
Multiple scales
![Page 33: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/33.jpg)
E2E: Classification: GoogLeNet (NiN)
33
3x3 and 5x5 convolutions deal with different scales.
Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." ICLR 2014. [Slides]
![Page 34: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/34.jpg)
E2E: Classification: GoogLeNet
34Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." ICLR 2014.
Dimensionality reduction
![Page 35: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/35.jpg)
35
1x1 convolutions does dimensionality reduction (c3<c2) and accounts for rectified linear units (ReLU).
Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." ICLR 2014. [Slides]
E2E: Classification: GoogLeNet (NiN)
![Page 36: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/36.jpg)
E2E: Classification: GoogLeNet
36
In GoogLeNet, the Cascaded 1x1 Convolutions compute reductions before the expensive 3x3 and 5x5 convolutions.
![Page 37: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/37.jpg)
E2E: Classification: GoogLeNet
37Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." ICLR 2014.
![Page 38: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/38.jpg)
E2E: Classification: GoogLeNet
38
They somewhat spatial invariance, and has proven a benefitial effect by adding an alternative parallel path.
![Page 39: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/39.jpg)
E2E: Classification: GoogLeNet
39
Two Softmax Classifiers at intermediate layers combat the vanishing gradient while providing regularization at training time.
...and no fully connected layers needed !
![Page 40: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/40.jpg)
E2E: Classification: GoogLeNet
40
![Page 41: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/41.jpg)
E2E: Classification: GoogLeNet
41NVIDIA, “NVIDIA and IBM CLoud Support ImageNet Large Scale Visual Recognition Challenge” (2015)
![Page 42: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/42.jpg)
E2E: Classification: GoogLeNet
42Szegedy, Christian, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. "Going deeper with convolutions." CVPR 2015. [video] [slides] [poster]
![Page 43: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/43.jpg)
E2E: Classification: VGG
43Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." International Conference on Learning Representations (2015). [video] [slides] [project]
![Page 44: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/44.jpg)
E2E: Classification: VGG
44Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." International Conference on Learning Representations (2015). [video] [slides] [project]
![Page 45: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/45.jpg)
E2E: Classification: VGG: 3x3 Stacks
45
Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." International Conference on Learning Representations (2015). [video] [slides] [project]
![Page 46: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/46.jpg)
E2E: Classification: VGG
46
Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." International Conference on Learning Representations (2015). [video] [slides] [project]
● No poolings between some convolutional layers.● Convolution strides of 1 (no skipping).
![Page 47: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/47.jpg)
E2E: Classification
47
3.6% top 5 error…with 152 layers !!
![Page 48: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/48.jpg)
E2E: Classification: ResNet
48He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep Residual Learning for Image Recognition." arXiv preprint arXiv:1512.03385 (2015). [slides]
![Page 49: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/49.jpg)
E2E: Classification: ResNet
49
● Deeper networks (34 is deeper than 18) are more difficult to train.
He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep Residual Learning for Image Recognition." arXiv preprint arXiv:1512.03385 (2015). [slides]
Thin curves: training errorBold curves: validation error
![Page 50: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/50.jpg)
ResNet
50
● Residual learning: reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions
He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep Residual Learning for Image Recognition." arXiv preprint arXiv:1512.03385 (2015). [slides]
![Page 51: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/51.jpg)
E2E: Classification: ResNet
51He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep Residual Learning for Image Recognition." arXiv preprint arXiv:1512.03385 (2015). [slides]
![Page 52: Imagenet - GitHub Pagesimatge-upc.github.io/telecombcn-2016-dlcv/slides/D2L4-imagenet.pdf · Residual learning: reformulate the layers as learning residual functions with reference](https://reader036.fdocuments.in/reader036/viewer/2022081515/5ec67a2ddb0d1917dc626bf3/html5/thumbnails/52.jpg)
52
Thanks ! Q&A ?Follow me at
https://imatge.upc.edu/web/people/xavier-giro
@DocXavi/ProfessorXavi