CNN Basics
Chongruo Wu, University of California, Davis
1. Forward: compute the output of each layer
2. Back propagation: compute gradient
3. Updating: update the parameters with computed gradient
Overview
1. Forward: Conv, Fully Connected, Pooling, Non-linear Function
Loss Functions
2. Back Propagation: Computing Gradient (Chain Rule)
3. Updating Parameters: SGD
4. Training
Agenda
Before Deep Learning
Slide credit: Kristen Grauman
Convolution (One Channel)
Convolution, One Channel
Gif Animation, https://goo.gl/2L2KdP
● Kernels are learned.
● Here we show the forward process.
● After each iteration, the parameters of the kernels are changed.
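As a concrete illustration of this forward process in PyTorch (a hypothetical sketch; the 5x5 input and single 3x3 kernel are just example sizes):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 5, 5)   # (batch, channels, height, width)
k = torch.randn(1, 1, 3, 3)   # one 3x3 kernel; in a real network this is learned
y = F.conv2d(x, k)            # slide the kernel over x
print(y.shape)                # torch.Size([1, 1, 3, 3])
```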
Convolution, Padding
Gif Animation, https://goo.gl/Rm38CJ
Pad = 1
output size = input size
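In general the output size is (input + 2*pad - kernel) / stride + 1, so a 3x3 kernel with pad = 1 and stride = 1 preserves the input size. A minimal sketch (sizes are illustrative):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 5, 5)
k = torch.randn(1, 1, 3, 3)
y = F.conv2d(x, k, padding=1)  # (5 + 2*1 - 3) / 1 + 1 = 5
print(y.shape)                 # torch.Size([1, 1, 5, 5]): output size = input size
```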
Convolution, Stride
Gif Animation, https://goo.gl/Rm38CJ
Stride > 1
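With stride > 1 the kernel jumps several pixels at a time, shrinking the output according to the same size formula (again an illustrative sketch):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 5, 5)
k = torch.randn(1, 1, 3, 3)
y = F.conv2d(x, k, stride=2)   # (5 - 3) / 2 + 1 = 2
print(y.shape)                 # torch.Size([1, 1, 2, 2])
```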
Dilated Convolution
Gif Animation, https://goo.gl/Rm38CJ
Dilation > 1
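A dilated kernel has holes between its taps: a 3x3 kernel with dilation = 2 covers a 5x5 field while still using only 9 weights. Sketch:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 7, 7)
k = torch.randn(1, 1, 3, 3)
y = F.conv2d(x, k, dilation=2)  # effective field: 3 + (3-1)*(2-1) = 5
print(y.shape)                  # torch.Size([1, 1, 3, 3]): (7 - 5) / 1 + 1 = 3
```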
Convolution (Multiple Channels)
Convolution, Multiple Channels
Representation for Feature Map:
( batch, channels, height, width )
For an input image:
( batch, 3, height, width )
where the 3 channels are R, G, B.
(Slides 14-19: convolution-layer figures; slide credit: Andrej Karpathy, Bay Area Deep Learning School, 2016)
Slide credit: Andrej Karpathy, Bay Area Deep Learning School, 2016
In PyTorch: conv2 = nn.Conv2d(3, 6, kernel_size=5)
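Expanded into a runnable sketch (the 32x32 input size is illustrative):

```python
import torch
import torch.nn as nn

conv2 = nn.Conv2d(3, 6, kernel_size=5)  # 3 input channels -> 6 output channels
x = torch.randn(1, 3, 32, 32)           # a batch of one RGB image
y = conv2(x)
print(y.shape)                          # torch.Size([1, 6, 28, 28])
# Each of the 6 output channels has its own 3x5x5 kernel,
# so the layer holds 6*3*5*5 = 450 weights (plus 6 biases).
```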
(Slides 21-24: figures; slide credit: Andrej Karpathy, Bay Area Deep Learning School, 2016)
Convolution, Extended Work
Depthwise Convolution
● Model compression (see the sketch below)
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
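A hedged sketch of the idea using PyTorch's groups argument (channel counts are illustrative, not the paper's):

```python
import torch.nn as nn

# Regular conv: every output channel mixes all input channels.
regular   = nn.Conv2d(32, 32, kernel_size=3, padding=1)             # 32*32*3*3 = 9216 weights
# Depthwise conv: groups = in_channels, i.e. one kernel per channel.
depthwise = nn.Conv2d(32, 32, kernel_size=3, padding=1, groups=32)  # 32*1*3*3 = 288 weights
# MobileNets follow this with a 1x1 "pointwise" conv to mix channels again.
pointwise = nn.Conv2d(32, 64, kernel_size=1)
```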
Convolution, Extended Work
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Input Feature Map
Convolution, Extended Work
Deformable Convolutional Networks
Deformable Convolution
Non-Linear Function
Non-Linear Function
Rectified Linear Unit (ReLU)
Conv -> ReLU -> Conv
Why do we need a non-linear function?
● Conv is a linear operation.
● Two consecutive convolutional layers with no ReLU in between collapse into a single convolutional layer (see the sketch below).
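A quick demonstration with plain matrices standing in for linear layers (a hypothetical sketch; conv layers are linear maps too):

```python
import torch

W1 = torch.randn(4, 4)
W2 = torch.randn(4, 4)
x  = torch.randn(4)

# Two stacked linear maps are just one linear map:
print(torch.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x))  # True

# A ReLU in between breaks the equivalence; that non-linearity
# is what lets depth add expressive power.
y = W2 @ torch.relu(W1 @ x)
```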
Extended Work, PReLU
Parametric Rectified Linear Unit (PReLU)
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
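PReLU keeps a learnable slope a for the negative side: PReLU(x) = max(0, x) + a * min(0, x). In PyTorch:

```python
import torch
import torch.nn as nn

prelu = nn.PReLU()             # one learnable slope `a`, initialized to 0.25
x = torch.tensor([-2.0, 3.0])
print(prelu(x))                # [-0.5, 3.0]: negatives are scaled by a, not zeroed
```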
(Slides 32-33: figures; slide credit: Andrej Karpathy, Bay Area Deep Learning School, 2016)
Pooling
(Slides 35-36: pooling figures; slide credit: Andrej Karpathy, Bay Area Deep Learning School, 2016)
Pooling
● Max pooling
● Average pooling
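Both are one-liners in PyTorch, and pooling has no learnable parameters (sizes illustrative):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 6, 28, 28)
max_pool = nn.MaxPool2d(kernel_size=2, stride=2)  # keep the strongest value in each 2x2 window
avg_pool = nn.AvgPool2d(kernel_size=2, stride=2)  # average each 2x2 window
print(max_pool(x).shape, avg_pool(x).shape)       # both torch.Size([1, 6, 14, 14])
```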
Fully Connected Layer
Fully Connected Layer
Representation for Feature Map:
( batch, channels, height, width )
Fully Connected Layer:
( batch, channels, 1, 1) or ( batch, channels )
#parameters: input_channel * output_channel
Stanford cs231n.
Fully Connected Layer
Usually we apply a reshape operation to the input feature map:
( n, c, h, w ) -> ( n, c*h*w, 1, 1) or ( n, c*h*w )
#parameters: input_channel * output_channel
Stanford cs231n.
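A minimal sketch of the reshape-then-linear pattern (the 16x5x5 feature map and 120 outputs are illustrative):

```python
import torch
import torch.nn as nn

x = torch.randn(8, 16, 5, 5)     # (n, c, h, w) feature map
x = x.view(x.size(0), -1)        # reshape to (n, c*h*w) = (8, 400)
fc = nn.Linear(16 * 5 * 5, 120)  # #parameters: 400 * 120 weights (+ 120 biases)
print(fc(x).shape)               # torch.Size([8, 120])
```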
Convolutional vs. Fully Connected Layer
Convolutional Layer
● Local information
● Kernels are shared
Fully Connected Layer
● Global information
● #parameters large
Stanford cs231n.
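To make the parameter-count contrast concrete (a hypothetical 64-channel 32x32 feature map):

```python
import torch.nn as nn

# Convolution: the same 3x3 kernels are shared across all spatial positions.
conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)
print(sum(p.numel() for p in conv.parameters()))  # 36928 = 64*64*3*3 weights + 64 biases

# A fully connected layer mapping a flattened 64x32x32 map to the same size
# would need (64*32*32)**2 weights, so we only compute the count:
print((64 * 32 * 32) ** 2)                        # 4294967296, about 4.3 billion
```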
1. Forward: Conv, Fully Connected, Pooling, Non-linear Function
Loss Functions
2. Back Propagation: Computing Gradient (Chain Rule)
3. Updating Parameters: SGD
4. Training
Agenda
1. Forward: compute the output of each layer
2. Back propagation: compute gradient
3. Updating: update the parameters with computed gradient
Overview
Loss Functions
● Softmax (written out below)
● Regression
● Contrastive/Triplet Loss
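For reference, the softmax (cross-entropy) loss as defined in the cited cs231n notes, for one example with class scores s and true class y_i:

$$ L_i = -\log\left(\frac{e^{s_{y_i}}}{\sum_j e^{s_j}}\right) $$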
(Slides 45-50: softmax classifier walkthrough figures; slide credits: Stanford cs231n; Andrej Karpathy, Bay Area Deep Learning School, 2016)
y_i is the ground truth
Stanford cs231n.
Ground truth (one-hot vector): [ 1, 0, 0 ]
(Slides 55-56: figures; slide credit: Stanford cs231n.)
Loss Functions
● Softmax
● Regression
Bounding box regression
● Contrastive/Triplet Loss
Object (face) recognition, clustering
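A hedged mapping of the three families onto standard PyTorch losses (the choices are illustrative; e.g. SmoothL1 is one common bounding-box regression loss):

```python
import torch.nn as nn

softmax_loss    = nn.CrossEntropyLoss()   # classification: softmax + negative log-likelihood
regression_loss = nn.SmoothL1Loss()       # e.g. bounding-box regression
triplet_loss    = nn.TripletMarginLoss()  # embeddings for recognition / clustering
```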
Define Network
https://goo.gl/mQEw15
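A minimal network definition in the spirit of the linked tutorial (layer sizes are illustrative, assuming 3x32x32 inputs):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, kernel_size=5)
        self.pool  = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)
        self.fc1   = nn.Linear(16 * 5 * 5, 120)
        self.fc2   = nn.Linear(120, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # conv -> ReLU -> pool
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(x.size(0), -1)             # flatten (n, 16, 5, 5) -> (n, 400)
        return self.fc2(F.relu(self.fc1(x)))  # class scores

net = Net()
print(net(torch.randn(1, 3, 32, 32)).shape)   # torch.Size([1, 10])
```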
Network Case Study (optional)
(Slides 61-64: network case studies, including the VGG network; slide credit: Andrej Karpathy, Bay Area Deep Learning School, 2016)
1. Forward: Conv, Fully Connected, Pooling, Non-linear Function
Loss Functions
2. Back Propagation: Computing Gradient (Chain Rule)
3. Updating Parameters: SGD
4. Training
Agenda
1. Forward: compute the output of each layer
2. Backward: compute gradient
3. Update: update the parameters with computed gradient
Overview
Back Propagation
Parameters: Convolutional Layer, Fully Connected Layer, PReLU, Batch Normalization
No Parameters: Pooling, ReLU, Dropout
(Slides 68-79: chain-rule walkthrough figures; slide credit: Stanford cs231n.)
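The flavor of that walkthrough, reproduced on cs231n's classic example f(x, y, z) = (x + y) * z with x = -2, y = 5, z = -4:

```python
# Forward pass.
x, y, z = -2.0, 5.0, -4.0
q = x + y                   # q = 3
f = q * z                   # f = -12

# Backward pass: apply the chain rule node by node, from the output back.
df_dq = z                   # f = q * z  ->  df/dq = z = -4
df_dz = q                   # f = q * z  ->  df/dz = q = 3
df_dx = df_dq * 1.0         # q = x + y  ->  dq/dx = 1, so df/dx = df/dq * dq/dx = -4
df_dy = df_dq * 1.0         # likewise df/dy = -4
print(df_dx, df_dy, df_dz)  # -4.0 -4.0 3.0
```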
Convolutional Layer
Fully Connected Layer
Back Propagation
“Once you finish your computation you can call .backward() and have all the gradients computed automatically.”
Stanford cs231n; PyTorch Tutorial, www.pytorch.org
Back Propagation
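The same toy example as above, with autograd doing the backward pass:

```python
import torch

x = torch.tensor(-2.0, requires_grad=True)
y = torch.tensor( 5.0, requires_grad=True)
z = torch.tensor(-4.0, requires_grad=True)

f = (x + y) * z
f.backward()                    # one call runs the chain rule through the whole graph
print(x.grad, y.grad, z.grad)   # tensor(-4.) tensor(-4.) tensor(3.)
```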
1. Forward: Conv, Fully Connected, Pooling, Non-linear Function
Loss Functions
2. Back Propagation: Computing Gradient (Chain Rule)
3. Updating Parameters: SGD
4. Training
Agenda
1. Forward: compute the output of each layer
2. Backward: compute gradient
3. Update: update the parameters with computed gradient
Overview
Updating Gradient
Stanford cs231n.
Updating Gradient
Andrew Ng’s Machine Learning course, Coursera
Updating Gradient
Deep Residual Learning for Image Recognition
decrease the learning rate after several epochs
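A sketch of vanilla SGD plus step decay in PyTorch (the model, data, and the divide-by-10-every-30-epochs schedule are illustrative; the ResNet paper divides the rate by 10 when the error plateaus):

```python
import torch
import torch.nn as nn
import torch.optim as optim

net = nn.Linear(10, 2)                           # placeholder model
optimizer = optim.SGD(net.parameters(), lr=0.1)  # step(): w <- w - lr * dL/dw
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(90):
    optimizer.zero_grad()
    loss = net(torch.randn(4, 10)).sum()         # stand-in for a real training loss
    loss.backward()
    optimizer.step()                             # update the parameters
    scheduler.step()                             # lr *= 0.1 at epochs 30 and 60
```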
Updating Gradient
Stanford cs231n.
Momentum
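The momentum update from the cs231n notes keeps a running velocity: v <- mu * v - lr * grad, then w <- w + v. In PyTorch it is a single argument:

```python
import torch.nn as nn
import torch.optim as optim

net = nn.Linear(10, 2)
# momentum=0.9 is the usual choice: past gradients keep pushing the update direction.
optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
```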
Other Updating Strategies
Stanford cs231n.
● SGD
● Momentum
● RMSprop
● Adagrad
● Adam
Gif Animation, http://cs231n.github.io/neural-networks-3/
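All of these are available in torch.optim (the learning rates here are illustrative, not recommendations):

```python
import torch.nn as nn
import torch.optim as optim

params = list(nn.Linear(10, 2).parameters())

optim.SGD(params, lr=0.01)                # vanilla SGD
optim.SGD(params, lr=0.01, momentum=0.9)  # SGD + momentum
optim.RMSprop(params, lr=0.001)
optim.Adagrad(params, lr=0.01)
optim.Adam(params, lr=0.001)
```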
1. Forward: compute the output of each layer
2. Backward: compute gradient
3. Update: update the parameters with computed gradient
Overview
1. Forward: Conv, Fully Connected, Pooling, Non-linear Function
Loss Functions
2. Back Propagation: Computing Gradient (Chain Rule)
3. Updating Parameters: SGD
4. Training
Agenda
PyTorch Zero to All, HKUST
Learning Rate
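Putting the whole agenda together, a hedged minimal training loop (random tensors stand in for a real dataset; the learning rate is the knob discussed above):

```python
import torch
import torch.nn as nn
import torch.optim as optim

net = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.9)

for step in range(100):
    inputs = torch.randn(8, 32)              # placeholder batch
    labels = torch.randint(0, 10, (8,))      # placeholder labels

    optimizer.zero_grad()                    # clear old gradients
    loss = criterion(net(inputs), labels)    # 1. forward: compute output and loss
    loss.backward()                          # 2. backward: compute gradients
    optimizer.step()                         # 3. update: apply the gradients
```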
Resources
Course: Stanford CS231n, http://cs231n.stanford.edu/
Bay Area Deep Learning School, https://goo.gl/MvWKYr
Thank You