Fei-Fei Li & Justin Johnson & Serena Yeung, Lecture 3 - April 10, 2018
Midterm Review, May 4th, 2018
Midterm Review
50 minutes is short!
This is just to help you get going with your studies.
Overview of today’s session
Summary of Course Material:
● How we “power” neural networks:
  ○ Loss function
  ○ Optimization
● How we build complex network models
  ○ Nonlinear Activations
  ○ Convolutional Layers
● How we “rein in” complexity
  ○ Regularization
Practice Midterm Problems
Q&A, time permitting
Lecture 3: Loss Functions and Optimization
An optimization problem
At the end of the day, we want to train a model that performs a desired task well, and a proxy for best achieving this is minimizing a loss function
SVM/Softmax Loss
- We have some dataset of (x,y)
- We have a score function
- We have a loss function, e.g.:
  - Softmax
  - SVM
  - Full loss
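The formulas on this slide did not survive extraction. As a hedged reconstruction, the standard textbook definitions are written out here, assuming a linear score function, an SVM margin of 1, and a regularizer R(W) with weight \lambda (these symbols are assumptions, not recovered from the slide):

```latex
\begin{align*}
s &= f(x_i; W) = W x_i
  && \text{(score function, assumed linear)} \\
L_i^{\text{softmax}} &= -\log \frac{e^{s_{y_i}}}{\sum_j e^{s_j}}
  && \text{(softmax / cross-entropy loss)} \\
L_i^{\text{SVM}} &= \sum_{j \neq y_i} \max\left(0,\; s_j - s_{y_i} + 1\right)
  && \text{(multiclass SVM / hinge loss)} \\
L &= \frac{1}{N} \sum_{i=1}^{N} L_i + \lambda R(W)
  && \text{(full loss: data loss plus regularization)}
\end{align*}
```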
Know how to derive the SVM and Softmax gradients!
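One way to practice these derivations is to implement both gradients and compare them against a numerical gradient. The sketch below assumes a linear score function `X @ W`, a margin of 1, and no regularization term; the function names and array shapes are illustrative choices, not taken from the slides.

```python
import numpy as np

def softmax_loss(W, X, y):
    """Mean softmax loss and gradient w.r.t. W. X: (N, D), W: (D, C), y: (N,)."""
    N = X.shape[0]
    scores = X @ W
    scores -= scores.max(axis=1, keepdims=True)   # shift for numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(N), y]).mean()
    dscores = probs.copy()
    dscores[np.arange(N), y] -= 1.0               # dL_i/ds = p - onehot(y_i)
    return loss, X.T @ dscores / N

def svm_loss(W, X, y, margin=1.0):
    """Mean multiclass hinge loss and gradient w.r.t. W (same shapes as above)."""
    N = X.shape[0]
    scores = X @ W
    correct = scores[np.arange(N), y][:, None]
    margins = np.maximum(0.0, scores - correct + margin)
    margins[np.arange(N), y] = 0.0                # the true class is excluded from the sum
    loss = margins.sum() / N
    dscores = (margins > 0).astype(float)         # +1 for every class that violates the margin
    dscores[np.arange(N), y] = -dscores.sum(axis=1)  # -(number of violations) for the true class
    return loss, X.T @ dscores / N

def numerical_grad(f, W, eps=1e-5):
    """Centered finite differences, used only to check the analytic gradients."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + eps
        f_plus = f(W)
        W[idx] = old - eps
        f_minus = f(W)
        W[idx] = old
        grad[idx] = (f_plus - f_minus) / (2 * eps)
    return grad
```

A quick sanity check: with all-zero weights and C classes, the softmax loss should be log(C) and the SVM loss should be C - 1.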
Stochastic Gradient Descent (SGD)
Full sum expensive when N is large!
Approximate sum using a minibatch of examples
32 / 64 / 128 common
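A minimal sketch of minibatch SGD follows. It assumes a least-squares loss as a stand-in for whatever loss is being trained; the function name `sgd_least_squares` and its default hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def sgd_least_squares(X, y, lr=0.1, batch_size=32, steps=500, seed=0):
    """Minimize mean((X @ w - y)**2) using minibatch gradient estimates."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    w = np.zeros(D)
    for _ in range(steps):
        # Sample a minibatch instead of summing over all N examples.
        idx = rng.choice(N, size=batch_size, replace=False)
        Xb, yb = X[idx], y[idx]
        grad = 2.0 * Xb.T @ (Xb @ w - yb) / batch_size   # minibatch gradient estimate
        w -= lr * grad                                   # vanilla SGD step
    return w
```

Each step touches only `batch_size` examples, so the cost per update does not grow with N.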
Learning Rate Loss Curves
Optimization: Problems with SGD
What if the loss function has a local minimum or saddle point?
Zero gradient, gradient descent gets stuck
Optimization: Problems with SGD
What if loss changes quickly in one direction and slowly in another?
What does gradient descent do?
Very slow progress along shallow dimension, jitter along steep direction
Loss function has high condition number: ratio of largest to smallest singular value of the Hessian matrix is large
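The effect can be reproduced on a toy problem. The sketch below assumes a quadratic loss L(w) = 0.5 * w^T H w with a hand-picked diagonal Hessian and learning rate; both are illustrative choices rather than anything from the slides.

```python
import numpy as np

# Assumed toy Hessian: curvature 1 along the shallow axis, 100 along the steep axis.
H = np.diag([1.0, 100.0])
singular_values = np.linalg.svd(H, compute_uv=False)
condition_number = singular_values.max() / singular_values.min()

def gradient_descent(w0, lr, steps):
    """Plain gradient descent on L(w) = 0.5 * w^T H w; returns every iterate."""
    w = np.array(w0, dtype=float)
    path = [w.copy()]
    for _ in range(steps):
        w = w - lr * (H @ w)      # the gradient of the quadratic is H @ w
        path.append(w.copy())
    return np.array(path)

# The learning rate must stay below 2 / 100 for the steep axis to converge at all.
path = gradient_descent([1.0, 1.0], lr=0.019, steps=50)
print("condition number:", condition_number)
print("shallow coordinate after 50 steps:", path[-1, 0])
print("steep coordinate, first three iterates:", path[:3, 1])
```

The steep coordinate flips sign on every step (the jitter), while the shallow coordinate shrinks by less than 2% per step.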
Optimization: Problems with SGD
Our gradients come from minibatches so they can be noisy!
Update Rules
- SGD
- Momentum
- Nesterov Momentum
- AdaGrad
- RMSProp
- Adam
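For reference, the update rules can be sketched as single-step functions. This is a minimal sketch using commonly quoted formulations and default hyperparameters; the `state` dictionary, the function signatures, and the defaults are assumptions made for illustration. The Nesterov variant uses the common reformulation in which the gradient is evaluated at the current parameters.

```python
import numpy as np

def sgd(w, g, state, lr=1e-2):
    return w - lr * g

def momentum(w, g, state, lr=1e-2, rho=0.9):
    state["v"] = rho * state.get("v", 0.0) + g            # running "velocity"
    return w - lr * state["v"]

def nesterov(w, g, state, lr=1e-2, rho=0.9):
    v_old = state.get("v", 0.0)
    state["v"] = rho * v_old - lr * g
    return w - rho * v_old + (1 + rho) * state["v"]       # look-ahead correction

def adagrad(w, g, state, lr=1e-2, eps=1e-7):
    state["sq"] = state.get("sq", 0.0) + g * g            # sum of squared gradients
    return w - lr * g / (np.sqrt(state["sq"]) + eps)

def rmsprop(w, g, state, lr=1e-2, decay=0.99, eps=1e-7):
    state["sq"] = decay * state.get("sq", 0.0) + (1 - decay) * g * g  # leaky average
    return w - lr * g / (np.sqrt(state["sq"]) + eps)

def adam(w, g, state, lr=1e-2, beta1=0.9, beta2=0.999, eps=1e-8):
    t = state["t"] = state.get("t", 0) + 1
    state["m"] = beta1 * state.get("m", 0.0) + (1 - beta1) * g        # first moment
    state["v"] = beta2 * state.get("v", 0.0) + (1 - beta2) * g * g    # second moment
    m_hat = state["m"] / (1 - beta1 ** t)                 # bias correction
    v_hat = state["v"] / (1 - beta2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps)
```

Each function takes the current weights, the current gradient, and a per-optimizer `state` dict that persists between calls, for example `state = {}` followed by `w = adam(w, g, state)` inside the training loop.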
Lecture 6: Training Neural Networks, Part I
Activation Functions
- Sigmoid
- tanh
- ReLU
- Leaky ReLU
- Maxout
- ELU
Activation Functions
Sigmoid
- Squashes numbers to range [0,1]
- Historically popular since they have nice interpretation as a saturating “firing rate” of a neuron
3 problems:
1. Saturated neurons “kill” the gradients
2. Sigmoid outputs are not zero-centered
3. exp() is a bit computationally expensive
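The first two problems are easy to see numerically. A minimal sketch, with input values chosen only for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Local gradient d(sigmoid)/dx = sigmoid(x) * (1 - sigmoid(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

xs = np.array([-10.0, 0.0, 10.0])
outputs = sigmoid(xs)            # always positive, so never zero-centered
local_grads = sigmoid_grad(xs)   # peaks at 0.25 for x = 0, nearly 0 when saturated
print(outputs)
print(local_grads)
```

Because the local gradient multiplies the upstream gradient during backpropagation, a saturated neuron passes almost nothing back to earlier layers.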
Consider what happens when the input to a neuron is always positive...
What can we say about the gradients on w?
Always all positive or all negative :(
(this is also why you want zero-mean data!)
[Figure: allowed gradient update directions, zig zag path, hypothetical optimal w vector]
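The sign argument can be checked directly. The sketch assumes a single neuron f = w · x + b with a scalar upstream gradient dL/df; the helper name `weight_gradient` is hypothetical.

```python
import numpy as np

def weight_gradient(x, upstream):
    """For f = w . x + b, dL/dw = (dL/df) * x, where dL/df is a scalar."""
    return upstream * x

rng = np.random.default_rng(0)
x_positive = rng.uniform(0.1, 1.0, size=5)       # e.g. outputs of a sigmoid layer
x_centered = x_positive - x_positive.mean()      # zero-mean version of the same inputs

print(weight_gradient(x_positive, +0.7))   # every component positive
print(weight_gradient(x_positive, -0.7))   # every component negative
print(weight_gradient(x_centered, +0.7))   # mixed signs become possible
```

With all-positive inputs the update can only move w into the all-positive or all-negative orthant on each step, which is what forces the zig zag path.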
Activation Functions
ReLU (Rectified Linear Unit)
- Computes f(x) = max(0,x)
- Does not saturate (in +region)
- Very computationally efficient
- Converges much faster than sigmoid/tanh in practice (e.g. 6x)
- Actually more biologically plausible than sigmoid
- Not zero-centered output
- An annoyance:
  hint: what is the gradient when x < 0?
[Figure: data cloud with an active ReLU and a dead ReLU]
dead ReLU will never activate => never update
h = WX + b
o = relu(h)
do / dh = 0
dL / dh = 0
dL / dh * dh / dW = 0
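The chain of zeros can be traced in code. This sketch assumes a single ReLU unit, data stored row-wise (so the pre-activation is `X @ w + b` rather than WX + b), and an upstream gradient of ones; the function name and the weight and bias values are hypothetical.

```python
import numpy as np

def relu_unit_grads(X, w, b):
    """Forward and backward pass of one ReLU unit; returns (o, dL/dw, dL/db)."""
    h = X @ w + b                       # pre-activation for every example
    o = np.maximum(0.0, h)              # o = relu(h)
    dL_do = np.ones_like(o)             # assumed upstream gradient
    do_dh = (h > 0).astype(float)       # 0 wherever the unit is inactive
    dL_dh = dL_do * do_dh
    return o, X.T @ dL_dh, dL_dh.sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                      # the "data cloud"
w = np.array([1.0, -1.0, 0.5])

o_dead, dw_dead, db_dead = relu_unit_grads(X, w, b=-100.0)   # unit sits far off the data cloud
o_live, dw_live, db_live = relu_unit_grads(X, w, b=0.0)      # unit is active for some examples
print(dw_dead, db_dead)    # all zeros: no example can ever update this unit
print(dw_live, db_live)
```

Since neither w nor b receives a gradient from any example, the dead unit stays dead under gradient descent.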
Vanishing/Exploding Gradient
Vanishing Gradient:
- Gradient becomes too small
- Some causes:
  - Choice of activation function
  - Multiplying many small numbers together
Exploding Gradient:
- Gradient becomes too large
Vanilla RNN Gradient Flow
[Figure: unrolled RNN with hidden states h0 h1 h2 h3 h4 and inputs x1 x2 x3 x4]
Computing gradient of h0 involves many factors of W (and repeated tanh)
Largest singular value > 1: Exploding gradients
Largest singular value < 1: Vanishing gradients
Gradient clipping: Scale gradient if its norm is too big
Bengio et al, “Learning long-term dependencies with gradient descent is difficult”, IEEE Transactions on Neural Networks, 1994
Pascanu et al, “On the difficulty of training recurrent neural networks”, ICML 2013
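Both ideas fit in a few lines. The sketch below ignores the tanh nonlinearity and simply multiplies a gradient by W^T repeatedly; the function names and the clipping threshold of 5.0 are illustrative assumptions.

```python
import numpy as np

def backprop_norms(W, steps=20):
    """Norm of a gradient after repeated multiplication by W^T (tanh ignored)."""
    g = np.ones(W.shape[0])
    norms = []
    for _ in range(steps):
        g = W.T @ g
        norms.append(np.linalg.norm(g))
    return norms

def clip_gradient(grad, max_norm=5.0):
    """Rescale grad so its L2 norm is at most max_norm; direction is preserved."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

print(backprop_norms(0.5 * np.eye(3))[-1])   # largest singular value 0.5: norm vanishes
print(backprop_norms(1.5 * np.eye(3))[-1])   # largest singular value 1.5: norm explodes
print(clip_gradient(np.array([30.0, 40.0])))
```

Clipping addresses only the exploding case; a vanishing gradient cannot be fixed by rescaling.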
Convolution Layer
32x32x3 image
5x5x3 filter
convolve (slide) over all spatial locations
activation map: 28x28x1
Convolution Layer
32x32x3 image
activation maps: 28x28x6
For example, if we had 6 5x5 filters, we’ll get 6 separate activation maps:
We stack these up to get a “new image” of size 28x28x6!
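The 28x28x6 shape follows from the usual output-size formula (N + 2P - K) / S + 1. A minimal sketch of a naive convolution, assuming no padding, no bias, and a height x width x channels layout (the function names and layout are illustrative choices):

```python
import numpy as np

def conv_output_size(n, k, stride=1, pad=0):
    """Spatial output size of a convolution along one dimension."""
    return (n + 2 * pad - k) // stride + 1

def conv_forward_naive(x, filters, stride=1):
    """x: (H, W, C) image; filters: (F, k, k, C). Returns (out_h, out_w, F)."""
    H, W, C = x.shape
    F, k, _, _ = filters.shape
    out_h = conv_output_size(H, k, stride)
    out_w = conv_output_size(W, k, stride)
    out = np.zeros((out_h, out_w, F))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i * stride:i * stride + k, j * stride:j * stride + k, :]
            # Dot product of each filter with the local k x k x C patch.
            out[i, j, :] = np.tensordot(filters, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(32, 32, 3))
six_filters = rng.normal(size=(6, 5, 5, 3))
activation_maps = conv_forward_naive(image, six_filters)
print(activation_maps.shape)
```

Each filter spans the full input depth, so one filter yields one 28x28 map and six filters stack into six channels.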
Convolution Layer
In contrast to a fully connected layer, each term in the output depends on spatially local ‘subregions’ of the input
Question: connection between an FC layer and a convolutional layer?
Answer: FC looks like a convolution layer with filter size HxW
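The equivalence can be verified numerically. This sketch assumes an H x W x C input flattened in row-major order and K output units; the names `fc_forward` and `fc_as_conv` are hypothetical.

```python
import numpy as np

def fc_forward(W_fc, x):
    """Fully connected layer (no bias): flatten the input, then matrix-multiply."""
    return W_fc @ x.reshape(-1)

def fc_as_conv(W_fc, x):
    """Same layer viewed as K filters of size H x W x C applied at a single location."""
    K = W_fc.shape[0]
    filters = W_fc.reshape(K, *x.shape)      # one full-size filter per output unit
    return np.tensordot(filters, x, axes=([1, 2, 3], [0, 1, 2]))

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4, 3))               # H x W x C input
W_fc = rng.normal(size=(10, 4 * 4 * 3))      # K = 10 output units
print(np.allclose(fc_forward(W_fc, x), fc_as_conv(W_fc, x)))
```

With an H x W filter and no padding there is exactly one valid position, so the "activation map" is 1x1xK.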
Drawbacks of increased complexity: Overfitting (Bias vs Variance)
Source: Wikipedia
Combat overfitting
● Increase data quantity/quality
  ○ Data augmentation
● Impose extra constraints
  ○ On model parameters: L2 regularization
  ○ On layer outputs: Batchnorm
● Introduce randomness/uncertainty
  ○ Dropout
  ○ Batchnorm
  ○ Stochastic depth, drop connect
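Two of these techniques are short enough to sketch: an L2 penalty on the weights and inverted dropout. The function names, the keep probability of 0.5, and the choice of the inverted variant are assumptions made for illustration.

```python
import numpy as np

def l2_penalty(W, lam):
    """L2 regularization term lam * sum(W**2) and its gradient 2 * lam * W."""
    return lam * np.sum(W * W), 2.0 * lam * W

def dropout_forward(x, p_keep=0.5, train=True, rng=None):
    """Inverted dropout: rescale at train time so test time is a plain pass-through."""
    if not train:
        return x
    rng = np.random.default_rng() if rng is None else rng
    mask = (rng.random(x.shape) < p_keep) / p_keep   # each entry is 0 or 1 / p_keep
    return x * mask

activations = np.ones(100_000)
dropped = dropout_forward(activations, p_keep=0.5, rng=np.random.default_rng(0))
print(dropped.mean())    # close to 1.0: the expected activation is preserved
```

Dividing by `p_keep` during training keeps the expected activation unchanged, so no rescaling is needed at test time.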
Practice Midterm Problems
Receptive field size
‘Input data seen/received’ in single activation layer ‘pixel’
[Figure: Input and Conv2d activation layer]
Summary
Note: Generally when we refer to ‘receptive field’, we mean with respect to input data/layer 0/original image, not with respect to direct input to the layer
(Need to compute recursively!)
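The formula from the slide did not survive extraction, so the sketch below uses the standard layer-by-layer recursion for the receptive field with respect to the original input: each layer adds (k - 1) times the product of all earlier strides. The function name and the `(kernel_size, stride)` encoding are assumptions.

```python
def receptive_field(layers):
    """Receptive field w.r.t. the original input.

    layers: list of (kernel_size, stride) pairs, ordered from input to output.
    """
    rf = 1      # a single input pixel sees itself
    jump = 1    # distance, in input pixels, between neighbouring units of the current layer
    for k, s in layers:
        rf += (k - 1) * jump    # each extra kernel tap reaches `jump` input pixels further
        jump *= s               # strides compound from layer to layer
    return rf

print(receptive_field([(3, 1)]))            # one 3x3 conv
print(receptive_field([(3, 1), (5, 1)]))    # 3x3 conv followed by 5x5 conv
print(receptive_field([(3, 2), (3, 1)]))    # stride in an early layer widens later layers
```

The same function applies to pooling layers, since only kernel size and stride matter for the receptive field.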
Going back to activation dimensions...
Cumulative receptive field of layer output = layer input
Activation dimensions
Receptive field size
Conv2d (k=3, s=1) → Conv2d (k=5, s=1)
Conv2d (k=3, s=1): m=1, n=3
![Page 61: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/61.jpg)
Receptive field size
k=3, s=1, m=1
n=3
Conv2d k=3, s=1
Conv2d k=5, s=1
![Page 62: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/62.jpg)
Receptive field size
k=5, s=1, m=3
Conv2d k=3, s=1
Conv2d k=5, s=1
![Page 63: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/63.jpg)
Receptive field size
k=5, s=1, m=3
n=7
Conv2d k=3, s=1
Conv2d k=5, s=1
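The numbers on these receptive-field slides (k=3, s=1, m=1 giving n=3, then k=5, s=1, m=3 giving n=7) are consistent with the recurrence n = k + s(m - 1), where m is the extent measured at a layer's output and n is the extent it covers at that layer's input. The formula itself is not printed in this transcript, so treat it as an assumption; the function name `input_extent` is also made up for illustration. A minimal sketch:

```python
def input_extent(k, s, m):
    """Input positions covered by m adjacent outputs of a conv layer.

    k: kernel size, s: stride, m: extent at the layer's output.
    Assumed recurrence (not stated on the slide): n = k + s * (m - 1).
    """
    return k + s * (m - 1)


# First step on the slides: k=3, s=1, m=1.
n_first = input_extent(k=3, s=1, m=1)

# Second step: the previous result becomes m for the k=5 layer.
n_second = input_extent(k=5, s=1, m=n_first)

print(n_first, n_second)
```

Chaining the call once per layer, starting from m=1 at the output, gives the receptive field of the whole stack at the input.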
![Page 64: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/64.jpg)
Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 9 - May 1, 2018
Case Study: VGGNet
[Simonyan and Zisserman, 2014]
Q: Why use smaller filters? (3x3 conv)
[Figure: layer-by-layer architecture diagrams of AlexNet, VGG16, and VGG19]
Stack of three 3x3 conv (stride 1) layers has same effective receptive field as one 7x7 conv layer
But deeper, more non-linearities
And fewer parameters: 3 * (3^2 C^2) vs. 7^2 C^2 for C channels per layer
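The two claims above (same effective receptive field, fewer parameters) can be checked with a few lines of arithmetic. This is only a sketch: it counts weights and ignores biases, as the slide's formula does, and the value C = 64 and the helper names are chosen here for illustration.

```python
def conv_weights(k, c):
    """Weights in a k x k conv with c input and c output channels (no biases)."""
    return k * k * c * c


def stacked_receptive_field(kernels):
    """Effective receptive field of stride-1 conv layers applied in sequence."""
    m = 1
    for k in kernels:
        m = k + (m - 1)  # stride 1: each layer widens the field by k - 1
    return m


C = 64  # illustrative channel count, not taken from the slide

three_3x3 = 3 * conv_weights(3, C)  # 3 * (3^2 * C^2) = 27 * C^2
one_7x7 = conv_weights(7, C)        # 7^2 * C^2 = 49 * C^2

print(stacked_receptive_field([3, 3, 3]), stacked_receptive_field([7]))
print(three_3x3, one_7x7)
```

For any C the ratio is 27 to 49, so the three-layer stack uses a little over half the weights of the single 7x7 layer while covering the same input extent.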
![Page 65: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/65.jpg)
![Page 66: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/66.jpg)
![Page 67: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/67.jpg)
![Page 68: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/68.jpg)
Chain Rule
![Page 69: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/69.jpg)
Chain Rule?
![Page 70: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/70.jpg)
Chain Rule!
![Page 71: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/71.jpg)
![Page 72: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/72.jpg)
![Page 73: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/73.jpg)
![Page 74: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/74.jpg)
![Page 75: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/75.jpg)
![Page 76: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/76.jpg)
Fei-Fei Li & Justin Johnson & Serena Yeung Lecture 6 - April 19, 2018
[Plot: loss vs. time]
Bad initialization a prime suspect
![Page 77: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/77.jpg)
![Page 78: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/78.jpg)
![Page 79: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/79.jpg)
![Page 80: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/80.jpg)
![Page 81: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/81.jpg)
![Page 82: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/82.jpg)
![Page 83: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/83.jpg)
![Page 84: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/84.jpg)
Symmetry Breaking
[Diagram: computational graph with nodes W1 X + b1, max(x, 0), W2 X + b2, Loss; values b1, max(b1, 0), b2, L on the edges]
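The values b1, max(b1, 0), and b2 on the diagram are what a two-layer ReLU network produces when W1 and W2 are both zero. Reading the slide that way is an assumption, as are the squared-error loss, the example numbers, and the helper names below. Under those assumptions, a small pure-Python sketch shows that the forward pass ignores the input entirely and that W1 receives a zero gradient on the first step:

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forward_backward(W1, b1, W2, b2, x, t):
    """One forward/backward pass of y = W2 * relu(W1 x + b1) + b2.

    Loss is 0.5 * ||y - t||^2 (an illustrative choice, not from the slide).
    """
    a = [v + b for v, b in zip(matvec(W1, x), b1)]   # W1 x + b1
    h = [max(v, 0.0) for v in a]                      # max(x, 0)
    y = [v + b for v, b in zip(matvec(W2, h), b2)]   # W2 h + b2
    dy = [yi - ti for yi, ti in zip(y, t)]            # dL/dy
    # Chain rule back through W2, the ReLU, and W1.
    dh = [sum(W2[i][j] * dy[i] for i in range(len(dy))) for j in range(len(h))]
    da = [g if v > 0 else 0.0 for g, v in zip(dh, a)]
    dW1 = [[g * xk for xk in x] for g in da]
    return a, h, y, dW1


x = [1.0, -2.0, 0.5]
t = [1.0]
b1 = [0.3, -0.1]
b2 = [0.2]
W1 = [[0.0] * 3 for _ in range(2)]  # zero-initialized weights
W2 = [[0.0] * 2 for _ in range(1)]

a, h, y, dW1 = forward_backward(W1, b1, W2, b2, x, t)
print(a, h, y)  # equals b1, max(b1, 0), b2
print(dW1)      # every entry is zero
```

Because W2 is zero, no gradient flows back to the hidden layer, so the rows of W1 cannot move apart in this step; initializing the weights with small random values avoids starting from this state.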
![Page 85: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/85.jpg)
![Page 86: Midterm Review - cs231n.stanford.educs231n.stanford.edu/slides/2018/cs231n_2018_midterm_review.pdf · Fei-Fei Li & Justin Johnson & Serena Yeung Midterm ReviewLecture 3 - May 4th,](https://reader031.fdocuments.in/reader031/viewer/2022021909/5be8628f09d3f2bf7c8be80a/html5/thumbnails/86.jpg)