CNN Basics - University of California, Davis
Transcript of slides by Chongruo Wu

Page 1

CNN Basics

Chongruo Wu

Page 2

1. Forward: compute the output of each layer
2. Back propagation: compute the gradient
3. Updating: update the parameters with the computed gradient

Overview

Page 3

1. Forward: Conv, Fully Connected, Pooling, Non-linear Function

Loss functions

2. Back Propagation, Computing Gradient: Chain Rule

3. Updating Parameters SGD

4. Training

Agenda

Page 4

1. Forward: Conv, Fully Connected, Pooling, Non-linear Function

Loss functions

2. Backward, Computing Gradient: Chain Rule

3. Updating Parameters SGD

4. Training

Agenda

Page 5

Before Deep Learning

Slide credit: Kristen Grauman

Page 6

Before Deep Learning

Slide credit: Kristen Grauman

Page 7

Convolution ( One Channel )

Page 8

Convolution, One Channel

Gif Animation, https://goo.gl/2L2KdP

● Kernels are learned.
● Here we show the forward process.
● After each iteration, the parameters of the kernels are changed.
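A minimal sketch of this one-channel forward pass, not from the slides (the input and kernel values are made up), using PyTorch's functional convolution:

import torch
import torch.nn.functional as F

x = torch.arange(25, dtype=torch.float32).reshape(1, 1, 5, 5)  # (batch, channel, height, width)
k = torch.ones(1, 1, 3, 3)                                     # one 3x3 kernel (values illustrative)

y = F.conv2d(x, k)          # slide the kernel; each output value is a sum of elementwise products
print(y.shape)              # torch.Size([1, 1, 3, 3]): a 3x3 kernel over a 5x5 input, no padding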

Page 9

Convolution, Padding

Gif Animation, https://goo.gl/Rm38CJ

Pad = 1

output size = input size
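A hedged sketch of the pad = 1 case (sizes are illustrative): with a 3x3 kernel and stride 1, padding by 1 keeps the output the same size as the input, following output = (input + 2*pad - kernel) / stride + 1.

import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, stride=1, padding=1)
x = torch.randn(1, 1, 5, 5)
print(conv(x).shape)   # torch.Size([1, 1, 5, 5]): (5 + 2*1 - 3) / 1 + 1 = 5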

Page 10

Convolution, Stride

Gif Animation, https://goo.gl/Rm38CJ

Stride > 1
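A corresponding sketch for stride > 1 (sizes are illustrative): the kernel skips positions, so the output shrinks.

import torch
import torch.nn as nn

conv_s2 = nn.Conv2d(1, 1, kernel_size=3, stride=2, padding=1)
print(conv_s2(torch.randn(1, 1, 8, 8)).shape)   # torch.Size([1, 1, 4, 4]): (8 + 2*1 - 3) / 2 + 1 = 4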

Page 11

Dilated Convolution

Gif Animation, https://goo.gl/Rm38CJ

Dilation > 1
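A sketch of dilation (sizes are illustrative): a 3x3 kernel with dilation 2 covers a 5x5 region while still having only 9 weights.

import torch
import torch.nn as nn

conv_d2 = nn.Conv2d(1, 1, kernel_size=3, dilation=2)
print(conv_d2(torch.randn(1, 1, 7, 7)).shape)   # torch.Size([1, 1, 3, 3]): effective kernel size is 5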

Page 12

Convolution ( Multiple Channels )

Page 13

Convolution, Multiple Channels

Representation for Feature Map:

( batch, channels, height, width )

For input image

( batch, 3 , height, width )

R,G,B
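A small sketch of this representation (the batch size and image size are made up):

import torch

x = torch.randn(8, 3, 224, 224)   # a batch of 8 RGB images: (batch, 3, height, width)
print(x.shape)                    # torch.Size([8, 3, 224, 224])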

Pages 14-19 (image slides)

Andrej Karpathy, Bay Area Deep Learning School, 2016

Page 20

Andrej Karpathy, Bay Area Deep Learning School, 2016

In PyTorch: conv2 = nn.Conv2d(3, 6, kernel_size=5)
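Expanding the slide's one-liner into a runnable sketch (the 32x32 input size is an assumption): this layer maps 3 input channels to 6 output channels with 5x5 kernels, so its weight tensor is (6, 3, 5, 5).

import torch
import torch.nn as nn

conv2 = nn.Conv2d(3, 6, kernel_size=5)             # the layer from the slide
x = torch.randn(1, 3, 32, 32)                      # assumed input, e.g. a CIFAR-10 image
print(conv2(x).shape)                              # torch.Size([1, 6, 28, 28])
print(conv2.weight.shape)                          # torch.Size([6, 3, 5, 5])
print(sum(p.numel() for p in conv2.parameters()))  # 6*3*5*5 weights + 6 biases = 456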

Pages 21-24 (image slides)

Andrej Karpathy, Bay Area Deep Learning School, 2016

Page 25

Convolution, Extended Work

Depthwise Convolution

● Model compression

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
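A hedged sketch (channel counts are illustrative) of depthwise convolution via PyTorch's groups argument, the building block MobileNets pair with 1x1 convolutions; the point is the much smaller parameter count.

import torch
import torch.nn as nn

in_ch, out_ch = 32, 64
x = torch.randn(1, in_ch, 56, 56)

# Standard convolution: every output channel looks at all input channels.
standard = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)

# Depthwise separable: a per-channel 3x3 conv, then a 1x1 conv to mix channels.
depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch)
pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(standard))                      # 18496
print(count(depthwise) + count(pointwise))  # 320 + 2112 = 2432, far fewer parameters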

Page 26

Convolution, Extended Work

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

Input Feature Map

Page 27

Convolution, Extended Work

Deformable Convolutional Networks

Deformable Convolution

Page 28

Convolution, Extended Work

Deformable Convolutional Networks

Deformable Convolution

Page 29

Non-Linear Function

Page 30

Non-Linear Function

Rectified Linear Unit (ReLU)

Conv -> ReLU -> Conv

Why do we need a non-linear function?

● Conv is a linear operation.

● Two consecutive convolutional layers with no ReLU between them are equivalent to a single convolutional layer.
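A minimal sketch of that last point (layer sizes are made up): without a ReLU, two 1x1 convolutions compose into a single linear map; with a ReLU in between they do not.

import torch
import torch.nn as nn

torch.manual_seed(0)
conv_a = nn.Conv2d(4, 4, kernel_size=1, bias=False)
conv_b = nn.Conv2d(4, 4, kernel_size=1, bias=False)
merged = nn.Conv2d(4, 4, kernel_size=1, bias=False)

# Without a non-linearity, conv_b(conv_a(x)) is one linear map whose weight
# matrix is the product of the two weight matrices.
with torch.no_grad():
    merged.weight.copy_((conv_b.weight.view(4, 4) @ conv_a.weight.view(4, 4)).view(4, 4, 1, 1))

x = torch.randn(1, 4, 8, 8)
print(torch.allclose(conv_b(conv_a(x)), merged(x), atol=1e-6))              # True
print(torch.allclose(conv_b(torch.relu(conv_a(x))), merged(x), atol=1e-6))  # False: ReLU breaks the equivalence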

Page 31

Extended Work, PReLU

Parametric Rectified Linear Unit (PReLU)

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
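A short sketch of PReLU in PyTorch (input values made up): unlike ReLU, the negative-side slope is a learned parameter rather than a fixed zero.

import torch
import torch.nn as nn

prelu = nn.PReLU()                       # one learnable slope, initialized to 0.25
x = torch.tensor([-2.0, -0.5, 0.0, 1.0, 3.0])
print(prelu(x))                          # negatives scaled by the slope, positives unchanged
print(prelu.weight)                      # Parameter containing tensor([0.2500]); it receives gradients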

Pages 32-33 (image slides)

Andrej Karpathy, Bay Area Deep Learning School, 2016

Page 34

Pooling

Pages 35-36 (image slides)

Andrej Karpathy, Bay Area Deep Learning School, 2016

Page 37

Pooling

● Max pooling

● Average pooling
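A minimal sketch of both pooling operations listed above (input values made up), with a 2x2 window and stride 2; note that pooling has no learnable parameters.

import torch
import torch.nn as nn

x = torch.tensor([[[[1., 2., 5., 6.],
                    [3., 4., 7., 8.],
                    [0., 1., 2., 3.],
                    [1., 0., 4., 5.]]]])            # shape (1, 1, 4, 4)

print(nn.MaxPool2d(kernel_size=2, stride=2)(x))     # [[4., 8.], [1., 5.]]
print(nn.AvgPool2d(kernel_size=2, stride=2)(x))     # [[2.5, 6.5], [0.5, 3.5]]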

Page 38

Fully Connected Layer

Page 39

Fully Connected Layer

Representation for Feature Map:

( batch, channels, height, width )

Fully Connected Layer:

( batch, channels, 1, 1) or ( batch, channels )

#parameters: input_channel * output_channel

Stanford cs231n.

Page 40

Fully Connected Layer

Usually we apply a reshape operation to the input feature map:

( n, c, h, w ) -> ( n, c*h*w, 1, 1) or ( n, c*h*w )

#parameters: input_channel * output_channel

Stanford cs231n.
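A hedged sketch of the reshape followed by a fully connected layer (the feature-map size and output width are assumptions):

import torch
import torch.nn as nn

x = torch.randn(8, 16, 5, 5)            # (n, c, h, w) feature map from the last conv/pool layer
flat = x.view(x.size(0), -1)            # (n, c*h*w) = (8, 400)

fc = nn.Linear(16 * 5 * 5, 120)         # weights: 400 * 120 (plus 120 biases)
print(fc(flat).shape)                   # torch.Size([8, 120])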

Page 41

Convolutional Vs. Fully Connected Layer

Convolutional Layer

● Local information
● Kernels are shared

Fully Connected Layer

● Global information
● Large number of parameters

Stanford cs231n.

Page 42

1. Forward: Conv, Fully Connected, Pooling, Non-linear Function

Loss functions

2. Back Propagation, Computing Gradient: Chain Rule

3. Updating Parameters SGD

4. Training

Agenda

Page 43

1. Forward: compute the output of each layer
2. Back propagation: compute the gradient
3. Updating: update the parameters with the computed gradient

Overview

Page 44

Loss Functions

● Softmax

● Regression

● Contrastive/Triplet Loss
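For the softmax loss, a minimal PyTorch sketch (the scores and label follow the style of the cs231n example; nn.CrossEntropyLoss applies the softmax internally):

import torch
import torch.nn as nn

scores = torch.tensor([[3.2, 5.1, -1.7]])   # unnormalized class scores for one image
target = torch.tensor([0])                   # ground-truth class index

criterion = nn.CrossEntropyLoss()
loss = criterion(scores, target)             # -log softmax(scores)[target]
print(loss.item())                           # about 2.04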

Page 45

Stanford cs231n.

Page 46

Andrej Karpathy, Bay Area Deep Learning School, 2016

Pages 47-49 (image slides)

Stanford cs231n.

Page 50

Stanford cs231n.

y_i is the ground truth

Page 51

Stanford cs231n. y_i is the ground truth.

Page 52

Stanford cs231n.

Page 53

Stanford cs231n.

[ 1, 0, 0 ]

Ground truth (one-hot vector)

Page 54

Stanford cs231n.

[ 1, 0, 0 ]

Ground truth (one-hot vector)

Pages 55-56 (image slides)

Stanford cs231n.

Page 57

Loss Functions

● Softmax

● Regression: bounding-box regression

● Contrastive/Triplet Loss: object (face) recognition, clustering
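A hedged sketch of the contrastive/triplet idea using PyTorch's built-in triplet loss (the embeddings are random placeholders): the anchor is pulled toward the positive and pushed away from the negative by at least the margin.

import torch
import torch.nn as nn

anchor   = torch.randn(8, 128)   # embeddings of 8 query faces (128-D, illustrative)
positive = torch.randn(8, 128)   # same identities as the anchors
negative = torch.randn(8, 128)   # different identities

triplet = nn.TripletMarginLoss(margin=1.0)
print(triplet(anchor, positive, negative))  # a scalar loss to minimize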

Page 58

Andrej Karpathy, Bay Area Deep Learning School, 2016

Page 59

Define Network

https://goo.gl/mQEw15
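The linked slide defines a network in PyTorch; a minimal sketch in the same spirit (layer sizes are assumptions, roughly LeNet-style for a 32x32 RGB input):

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, kernel_size=5)
        self.conv2 = nn.Conv2d(6, 16, kernel_size=5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)   # (n, 6, 14, 14)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)   # (n, 16, 5, 5)
        x = x.view(x.size(0), -1)                    # flatten for the fully connected layers
        x = F.relu(self.fc1(x))
        return self.fc2(x)                           # class scores

net = Net()
print(net(torch.randn(1, 3, 32, 32)).shape)          # torch.Size([1, 10])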

Page 60

Network Case Study (optional)

Pages 61-62 (image slides)

Andrej Karpathy, Bay Area Deep Learning School, 2016

Page 63

Andrej Karpathy, Bay Area Deep Learning School, 2016

VGG Network

Page 64

Andrej Karpathy, Bay Area Deep Learning School, 2016

VGG Network

Page 65

1. Forward: Conv, Fully Connected, Pooling, Non-linear Function

Loss functions

2. Back Propagation, Computing Gradient: Chain Rule

3. Updating Parameters SGD

4. Training

Agenda

Page 66

1. Forward: compute the output of each layer
2. Backward: compute the gradient
3. Update: update the parameters with the computed gradient

Overview

Page 67

Back Propagation

Layers with parameters: Convolutional Layer, Fully Connected Layer, PReLU, Batch Normalization

Layers with no parameters: Pooling, ReLU, Dropout
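A quick hedged check of that split in PyTorch (the module sizes are illustrative): parameterized layers expose tensors through .parameters(), while pooling, ReLU, and dropout have nothing to update.

import torch.nn as nn

for m in [nn.Conv2d(3, 6, 5), nn.Linear(120, 10), nn.PReLU(), nn.BatchNorm2d(6),
          nn.MaxPool2d(2), nn.ReLU(), nn.Dropout()]:
    n = sum(p.numel() for p in m.parameters())
    print(type(m).__name__, n)   # Conv2d 456, Linear 1210, PReLU 1, BatchNorm2d 12, then 0, 0, 0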

Pages 68-79 (image slides)

Stanford cs231n.

Page 80

Convolutional Layer

Fully Connected Layer

Back Propagation

Page 81

"Once you finish your computation you can call .backward() and have all the gradients computed automatically."

Stanford cs231n. PyTorch Tutorial, www.pytorch.org

Back Propagation
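A minimal autograd sketch illustrating that quote (the tensor values are made up): the forward pass builds a graph, and .backward() fills in .grad for every parameter.

import torch

w = torch.tensor([2.0], requires_grad=True)    # a parameter
x = torch.tensor([3.0])                        # an input
loss = (w * x - 1.0).pow(2).sum()              # forward: loss = (w*x - 1)^2

loss.backward()                                # backward: gradients computed automatically
print(w.grad)                                  # tensor([30.]) = 2*(w*x - 1)*x = 2*5*3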

Page 82

1. Forward: Conv, Fully Connected, Pooling, Non-linear Function

Loss functions

2. Backward, Computing Gradient: Chain Rule

3. Updating Parameters SGD

4. Training

Agenda

Page 83

1. Forward: compute the output of each layer
2. Backward: compute the gradient
3. Update: update the parameters with the computed gradient

Overview

Page 84

Updating Gradient

Stanford cs231n.
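These slides show the vanilla gradient-descent update; a hedged sketch of the rule and its PyTorch form (the model and learning rate are illustrative):

import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 2)
optimizer = optim.SGD(model.parameters(), lr=0.01)

loss = model(torch.randn(4, 10)).pow(2).mean()   # a stand-in loss
optimizer.zero_grad()
loss.backward()
optimizer.step()     # for each parameter p: p = p - lr * p.grad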

Page 85

Updating Gradient

Andrew Ng’s Machine Learning course, Coursera

Page 86

Updating Gradient

Deep Residual Learning for Image Recognition

Decrease the learning rate after several epochs.
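A sketch of such a schedule using a step decay (the step size and decay factor are assumptions, not from the slide):

import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)  # divide lr by 10 every 30 epochs

for epoch in range(90):
    optimizer.step()                     # stand-in for one epoch of training updates
    scheduler.step()
print(optimizer.param_groups[0]["lr"])   # 0.1 -> 0.01 -> 0.001 -> 0.0001 after 90 epochs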

Page 87

Stanford cs231n.

Page 88

Updating Gradient

Stanford cs231n.

Momentum
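A hedged sketch of the momentum update written out by hand (hyperparameters are illustrative); this follows the cs231n formulation, which optim.SGD(..., momentum=0.9) implements up to minor details.

import torch

lr, mu = 0.01, 0.9
w = torch.randn(5)                 # a parameter vector
v = torch.zeros_like(w)            # velocity, initialized to zero

for _ in range(100):
    grad = 2 * w                   # gradient of a toy loss, sum(w**2)
    v = mu * v - lr * grad         # accumulate a running direction
    w = w + v                      # step along the velocity
print(w.norm())                    # tends toward 0, the minimum of the toy loss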

Page 89

Stanford cs231n.

Page 90

Other Updating Strategies

Stanford cs231n.

● SGD

● Momentum

● RMSprop

● Adagrad

● Adam

Gif Animation, http://cs231n.github.io/neural-networks-3/
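All of these share the same optimizer interface in PyTorch; a brief hedged sketch of swapping between them (learning rates are illustrative defaults):

import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 2)

sgd      = optim.SGD(model.parameters(), lr=0.01)
momentum = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
rmsprop  = optim.RMSprop(model.parameters(), lr=0.001)
adagrad  = optim.Adagrad(model.parameters(), lr=0.01)
adam     = optim.Adam(model.parameters(), lr=0.001)
# Each is used the same way: zero_grad(), loss.backward(), optimizer.step().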

Page 91

1. Forward: compute the output of each layer
2. Backward: compute the gradient
3. Update: update the parameters with the computed gradient

Overview

Page 92

1. Forward: Conv, Fully Connected, Pooling, Non-linear Function

Loss functions

2. Back Propagation, Computing Gradient: Chain Rule

3. Updating Parameters SGD

4. Training

Agenda

Page 93

PyTorch Zero To All, HKUST
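Putting the agenda items together, a minimal hedged training-loop sketch (the dataset and hyperparameters are placeholders) in the forward / backward / update order the lecture describes:

import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Placeholder data: 100 samples, 20 features, 3 classes.
inputs = torch.randn(100, 20)
labels = torch.randint(0, 3, (100,))

for epoch in range(10):
    optimizer.zero_grad()             # clear old gradients
    outputs = model(inputs)           # 1. forward: compute the output of each layer
    loss = criterion(outputs, labels)
    loss.backward()                   # 2. backward: compute gradients via the chain rule
    optimizer.step()                  # 3. update: adjust parameters with the gradients
    print(epoch, loss.item())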

Page 94

Learning Rate

Page 95

Resources

Page 96

Resources

Course: Stanford CS231n, http://cs231n.stanford.edu/

Page 97

Resources

Bay Area Deep Learning School, https://goo.gl/MvWKYr

Page 98

Thank You