Implementing a Neural Network from Scratch
Implementing a Neural Network from Scratch #2
Christian Bartz, Joseph Bethge
Last Exercise: Feedback
● how long did it take?
● most difficult/easy task?
● favorite/most disliked task?
● more suggestions, comments?
What We Will Do With You
● tasks for this exercise
● LENGTHy introduction
● time to hack
● outlook
● at home: finish any remaining tasks by next time (in three weeks)!
Prepare your Environment
● we added some tests for today’s tasks
● stash your changes
    git stash
● fetch the updates from GitHub
    git fetch
● rebase your current branch on our master
    git rebase origin/master
● apply stash
    git stash apply
Tasks for today
1. Initialization - Christian
2. Sigmoid - Joseph
3. ReLU - Christian
4. Adam - Joseph
5. Dropout - Christian
Bonus:
1. Convolution including tests!
2. Pooling functions (max_pooling, average_pooling) including tests!
3. Tanh
Task 1: Initialization
Should we initialize a network with zeros everywhere?
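A small experiment can make the answer concrete. The sketch below is not part of LENGTH: it assumes a hypothetical two-layer network (`train_step`, sigmoid hidden units, mean squared error) and trains it starting from all-zero weights. Under these assumptions every hidden unit receives the same gradient, so the columns of `w1` stay identical to each other.

```python
import numpy as np

def train_step(w1, w2, x, target, lr=0.1):
    """One gradient step for a tiny 2-layer net (hypothetical helper, not LENGTH code)."""
    hidden = 1.0 / (1.0 + np.exp(-(x @ w1)))   # sigmoid hidden layer, shape (batch, hidden)
    out = hidden @ w2                          # linear output layer, shape (batch, 1)
    grad_out = 2.0 * (out - target) / len(x)   # gradient of the mean squared error
    grad_w2 = hidden.T @ grad_out
    grad_hidden = grad_out @ w2.T
    grad_w1 = x.T @ (grad_hidden * hidden * (1.0 - hidden))
    return w1 - lr * grad_w1, w2 - lr * grad_w2

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3))
target = rng.normal(size=(8, 1))

# start with zeros everywhere
w1 = np.zeros((3, 4))
w2 = np.zeros((4, 1))
for _ in range(50):
    w1, w2 = train_step(w1, w2, x, target)

# every hidden unit (column of w1) still has exactly the same weights
symmetric = np.allclose(w1, w1[:, :1])
print("hidden units identical:", symmetric)
```

Because all hidden units keep computing the same feature, the network behaves like one with a single hidden unit. Random initial weights break this symmetry.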
Task 1: Initialization (length/initializers/xavier.py)
● it is important to have a good initialization
○ allows convergence
○ enables faster convergence
● why do we care about initialization instead of simply using
W = np.random.randn(n), with n being the number of inputs?
Task 1: Initialization (length/initializers/xavier.py)
● using the right initialization we get evenly distributed activations
○ makes training easier
○ mitigates saturation of activation functions and vanishing gradients
[Figure: activation distributions, naive initialization vs. normalized initialization]
Task 1: Initialization (length/initializers/xavier.py)
● we can get evenly distributed activation values by scaling the random weights:
[Figure: activation distributions, naive initialization vs. normalized initialization]
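The effect of the scaling can be sketched as follows. The code assumes one common Xavier/Glorot variant (normal distribution with standard deviation sqrt(2 / (n_in + n_out))); the variant expected in length/initializers/xavier.py may differ, and all helper names are made up for this illustration. It pushes random data through a few tanh layers and measures how many activations end up saturated.

```python
import numpy as np

def naive_init(n_in, n_out, rng):
    # unscaled standard normal weights
    return rng.standard_normal((n_in, n_out))

def xavier_init(n_in, n_out, rng):
    # assumed variant: Glorot normal, std = sqrt(2 / (fan_in + fan_out))
    std = np.sqrt(2.0 / (n_in + n_out))
    return rng.standard_normal((n_in, n_out)) * std

def saturated_fractions(init, n_layers=5, width=256, seed=0):
    """Fraction of tanh activations with |value| > 0.99, measured after each layer."""
    rng = np.random.default_rng(seed)
    h = rng.standard_normal((512, width))
    fractions = []
    for _ in range(n_layers):
        h = np.tanh(h @ init(width, width, rng))
        fractions.append(float(np.mean(np.abs(h) > 0.99)))
    return fractions

naive = saturated_fractions(naive_init)
scaled = saturated_fractions(xavier_init)
print("naive :", naive)
print("scaled:", scaled)
```

With the naive weights most activations sit at the flat ends of tanh, where gradients vanish. With the scaled weights only a small fraction does.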
Task 2: Sigmoid (length/functions/sigmoid.py)
● forward pass:
○ trivial to implement for one value
○ batch processing: use numpy methods instead
● backward pass:
○ stepwise derivatives
○ direct derivative (lecture)
○ chain rule!
● why are we using non-linearities?
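The forward and backward bullets above can be sketched with plain NumPy. This is a minimal illustration and not the LENGTH `Function` interface: the names `sigmoid_forward` and `sigmoid_backward` are assumptions. The backward pass uses the direct derivative sigmoid(x) * (1 - sigmoid(x)) and applies the chain rule by multiplying with the incoming gradient.

```python
import numpy as np

def sigmoid_forward(x):
    # works element-wise on a whole batch at once
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_backward(y, grad_output):
    # y is the cached forward output; chain rule: dL/dx = dL/dy * y * (1 - y)
    return grad_output * y * (1.0 - y)

x = np.array([[-2.0, 0.0, 3.0]])
y = sigmoid_forward(x)
grad_x = sigmoid_backward(y, np.ones_like(x))
print(y, grad_x)
```

Caching the forward output avoids recomputing the exponential in the backward pass. For very large negative inputs `np.exp(-x)` can overflow, so a production version would need a numerically safer formulation.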
Task 3: ReLU (length/functions/relu.py)
● very simple activation function
● enables faster convergence, why?
○ does not saturate for positive inputs
○ stable gradient (constant 1 for positive inputs)
● forward pass:
○ element-wise maximum
● backward pass:
○ only sub-differentiable!
○ think about every case
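A minimal NumPy sketch of both passes, assuming hypothetical function names rather than the LENGTH `Function` interface. The case x == 0 has no unique derivative; the sketch picks 0 there, which is one valid subgradient and a common convention.

```python
import numpy as np

def relu_forward(x):
    # element-wise maximum of the input and zero
    return np.maximum(x, 0.0)

def relu_backward(x, grad_output):
    # x > 0: gradient passes through unchanged
    # x < 0: gradient is blocked
    # x == 0: subgradient chosen as 0 (assumption, any value in [0, 1] is valid)
    return grad_output * (x > 0)

x = np.array([-1.5, 0.0, 2.0])
out = relu_forward(x)
grad = relu_backward(x, np.array([10.0, 10.0, 10.0]))
print(out, grad)
```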
Task 4: Adam (length/optimizers/adam.py)
● baseline: SGD - “man walking the steepest way down”
    param_deltas = [self.lr * grad for grad in gradients]
● Adam - “ball rolling down the hill”
○ does anyone know how Adam works?
○ how could we implement this?
Task 4: Adam (length/optimizers/adam.py)
alpha = 0.001    # learning rate
beta1 = 0.9
beta2 = 0.999
epsilon = 1e-8

def init(self):
    self.m = 0
    self.v = 0
    self.t = 0

def adam(self, gradient):
    # increase timestep (needed for bias correction)
    self.t += 1
    # adapt first-order moment (mean):
    # 90 % previous moment, 10 % new gradients
    self.m = beta1 * self.m + (1 - beta1) * gradient
    # adapt second-order moment (variance):
    # 99.9 % previous moment, 0.1 % element-wise square of new gradients
    self.v = beta2 * self.v + (1 - beta2) * (gradient ** 2)
    # bias correction, most relevant for the first iterations
    # (m and v were initialized with zero)
    m_corrected = self.m / (1 - (beta1 ** self.t))
    v_corrected = self.v / (1 - (beta2 ** self.t))
    # calculate parameter delta (alpha = learning rate)
    delta = alpha * m_corrected / ((v_corrected ** 0.5) + epsilon)
    return delta

what is the influence of v?
● v decreases delta on alternating gradients
● abs(delta) <= alpha (approximately, in typical cases)
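The update rule above can be tried out in isolation. The class below is a self-contained sketch and not the interface expected by length/optimizers/adam.py: the class name, the `delta` method, and the toy objective f(w) = sum((w - 3)^2) are all assumptions made for illustration.

```python
import numpy as np

class AdamSketch:
    """Stand-alone Adam update rule (hypothetical class, not the LENGTH optimizer API)."""

    def __init__(self, alpha=0.001, beta1=0.9, beta2=0.999, epsilon=1e-8):
        self.alpha, self.beta1, self.beta2, self.epsilon = alpha, beta1, beta2, epsilon
        self.m = 0.0  # first moment estimate
        self.v = 0.0  # second moment estimate
        self.t = 0    # timestep

    def delta(self, gradient):
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * gradient
        self.v = self.beta2 * self.v + (1 - self.beta2) * gradient ** 2
        m_corrected = self.m / (1 - self.beta1 ** self.t)
        v_corrected = self.v / (1 - self.beta2 ** self.t)
        return self.alpha * m_corrected / (np.sqrt(v_corrected) + self.epsilon)

# minimize the toy objective f(w) = sum((w - 3)^2), gradient 2 * (w - 3)
optimizer = AdamSketch(alpha=0.1)
w = np.array([0.0, 10.0])
first_delta = None
for step in range(500):
    gradient = 2.0 * (w - 3.0)
    d = optimizer.delta(gradient)
    if first_delta is None:
        first_delta = d.copy()
    w = w - d

print("first delta:", first_delta)
print("final w:", w)
```

In the first step the bias-corrected moments equal the gradient and its square, so the delta has magnitude close to alpha regardless of how large the gradient is. This illustrates the bound on abs(delta) mentioned above.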
Task 5: Dropout (length/functions/dropout.py)
● regularization function that randomly drops units
● why does this help the training of the network?
○ forces network to find meaningful features
● anything we have to think of?
○ no dropout at testing time!
○ scaling necessary!
Task 5: Dropout (length/functions/dropout.py)
● forward pass:
○ drop a value in input with probability p
○ scale the remaining outputs by 1 / (1 - p), so that the expected activation stays the same
● backward pass:
○ set gradients of dropped units to 0, scale the remaining gradients like in the forward pass
● testing time:
○ do nothing
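The three cases above can be sketched as a small class. It assumes "inverted dropout", where kept values are scaled by 1 / (1 - p) during training so that nothing has to be done at test time. The class layout and the `train` flag are illustrative assumptions, not the LENGTH `Function` interface.

```python
import numpy as np

class DropoutSketch:
    """Inverted dropout (hypothetical stand-alone class, not the LENGTH API)."""

    def __init__(self, p, rng=None):
        self.p = p  # probability of dropping a value
        self.rng = rng if rng is not None else np.random.default_rng()
        self.mask = None

    def forward(self, x, train=True):
        if not train:
            return x  # testing time: do nothing
        keep = self.rng.random(x.shape) >= self.p
        # fold the scaling into the mask so forward and backward stay consistent
        self.mask = keep / (1.0 - self.p)
        return x * self.mask

    def backward(self, grad_output):
        # dropped units get gradient 0, kept units get the same scaling as in forward
        return grad_output * self.mask

layer = DropoutSketch(p=0.5, rng=np.random.default_rng(0))
x = np.ones((1000, 100))
out = layer.forward(x, train=True)
grad = layer.backward(np.ones_like(x))
print("mean activation during training:", out.mean())
```

Storing the mask in the forward pass is the design choice that makes the backward pass trivial: the same units are dropped and the same scaling is applied.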
LENGTH - Lightning-fast Extensible Neural-network Guarding The HPI
LENGTH - Recap
● very simple neural network implementation based on Chainer
○ entirely written in Python using NumPy
○ simple, object-oriented API
○ uses dynamic computational graph
LENGTH - Computational Graph
● Static Computational Graph (define and run)
● Dynamic Computational Graph (define by run)
[Figure: computational graph for (x + y) * z with x = 3, y = 4, z = 2, result 14]
LENGTH - Backward Computation
import numpy as np
import length.functions as F
from length.graph import Graph

# create input data and prepare computational graph
x = Graph(np.array([3], dtype=np.float32))
y = Graph(np.array([4], dtype=np.float32))
z = Graph(np.array([2], dtype=np.float32))

# perform computation and keep track of computational graph
h = F.add(x, y)
out = F.multiply(h, z)

>>> out.visualize()
id | layer         | next
1  | input (1,)    | 4
2  | input (1,)    | 4
3  | input (1,)    | 5
4  | Add (1,)      | 5
5  | Multiply (1,) | 6
LENGTH - Backward Computation (length/graph.py)
● out.backward(optimizer) → starts computation of gradients and update of learnable parameters

def backward(self, optimizer):
    # the gradient of the output with respect to itself is df/df = 1
    if self.data.size == 1 and self.grad is None:
        self.grad = np.ones((1,), dtype=constants.DTYPE)

    # prepare gradient computation for each function in computational graph
    candidate_layers = []
    seen_layers = set()

    def add_candidate_layer(candidate):
        if candidate is not None and candidate not in seen_layers:
            candidate_layers.append(candidate)
            seen_layers.add(candidate)

    add_candidate_layer(self)
LENGTH - Backward Computation (length/graph.py)
def backward(self, optimizer):
    […]
    # as long as we are not at the top of the computational graph, we go on
    while candidate_layers:
        candidate_layer = candidate_layers.pop()
        if candidate_layer.creator is None:
            continue

        # set optimizer if necessary
        if candidate_layer.creator.needs_optimizer:
            candidate_layer.creator.optimizer = optimizer

        # compute gradients of this layer/function
        gradients = candidate_layer.creator.backward(candidate_layer.grad)

        for predecessor, gradient in zip(candidate_layer.predecessors, gradients):
            predecessor.grad = gradient
            if gradient is not None:
                # the gradient flows to another layer (does not happen with loss layers)
                add_candidate_layer(predecessor)
LENGTH - Layers (length/layer.py and length/layers/)
class Layer(Function):
    needs_optimizer = True
    name = "Layer"

    def internal_update(self, parameter_deltas):
        raise NotImplementedError

    def backward(self, gradients):
        # compute gradients with respect to inputs and parameters of the layer
        gradients = super().backward(gradients)
        input_gradient = gradients[:len(self.inputs)]
        parameter_gradients = gradients[len(self.inputs):]

        # use optimizer to compute updates for internal parameters, based on computed gradients
        if len(parameter_gradients) > 0:
            parameter_deltas = self.optimizer.run_update_rule(parameter_gradients, self)
            self.internal_update(parameter_deltas)

        return input_gradient
LENGTH - Backward Computation (length/graph.py)
def backward(self, optimizer):
    […]
    while candidate_layers:
        […]
        # find next functions to compute gradients for and scatter gradients to them
        for predecessor, gradient in zip(candidate_layer.predecessors, gradients):
            predecessor.grad = gradient
            if gradient is not None:
                # the gradient flows to another layer (does not happen with loss layers)
                add_candidate_layer(predecessor)
LENGTH - Backward Computation
● do you see any problems with this backward implementation?
○ cannot handle networks with graphs that split at a certain point
[Figure: example of a computational graph that splits, with * and + nodes]
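One plausible reading of the problem: when a value feeds two functions, it receives two gradients, and an assignment such as `predecessor.grad = gradient` keeps only one of them. The sketch below is a separate toy graph, not a patch for length/graph.py, and all of its names are hypothetical. It shows the usual remedy: visit nodes in reverse topological order and accumulate gradients with `+=`.

```python
class Node:
    """Toy graph node (hypothetical, unrelated to the LENGTH Graph class)."""

    def __init__(self, value, parents=(), local_grads=()):
        self.value = value
        self.parents = parents          # nodes this node was computed from
        self.local_grads = local_grads  # d(self) / d(parent) for each parent
        self.grad = 0.0

def add(a, b):
    return Node(a.value + b.value, (a, b), (1.0, 1.0))

def multiply(a, b):
    return Node(a.value * b.value, (a, b), (b.value, a.value))

def topological_order(output):
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    return order

def backward(output):
    output.grad = 1.0
    # reverse topological order: a node is processed only after all of its consumers
    for node in reversed(topological_order(output)):
        for parent, local in zip(node.parents, node.local_grads):
            parent.grad += node.grad * local  # accumulate instead of overwrite

# x is used in two branches: out = x * 2 + x * 3
x, two, three = Node(3.0), Node(2.0), Node(3.0)
out = add(multiply(x, two), multiply(x, three))
backward(out)
print(out.value, x.grad)
```

Here `x.grad` ends up as 2 + 3 = 5, the sum over both branches. With plain assignment it would hold the gradient of only one branch.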
How Can We Improve Our Results?
def __init__(self):
    self.fully_connected_1 = FullyConnected(784, 512)
    self.fully_connected_2 = FullyConnected(512, 512)
    self.fully_connected_3 = FullyConnected(512, 10)
    [...]

def forward(self, batch, train=True):
    [...]
    hidden = self.fully_connected_1(batch.data)
    hidden = self.fully_connected_2(hidden)
    self.predictions = self.fully_connected_3(hidden)
    self.loss = F.mean_squared_error(self.predictions, batch.labels)

● replace mean_squared_error with softmax_cross_entropy
● add dropout
● add relu/sigmoid
● use adam
    python train.py --optimizer adam
● increase layer size
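To illustrate the first suggestion, here is a NumPy sketch of what a softmax cross-entropy loss computes. It is not the LENGTH implementation: the function name, the integer class labels, and the returned gradient are assumptions for this example. Subtracting the row maximum before exponentiating is a standard trick to keep the computation numerically stable.

```python
import numpy as np

def softmax_cross_entropy_sketch(logits, labels):
    """Mean cross-entropy over a batch and its gradient w.r.t. the logits.

    logits: float array of shape (batch, classes)
    labels: integer class indices of shape (batch,)  (assumed label format)
    """
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    rows = np.arange(len(labels))
    loss = -log_probs[rows, labels].mean()
    # gradient: softmax probabilities minus one-hot labels, averaged over the batch
    grad = np.exp(log_probs)
    grad[rows, labels] -= 1.0
    return loss, grad / len(labels)

logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
labels = np.array([0, 2])
loss, grad = softmax_cross_entropy_sketch(logits, labels)
print(loss)
print(grad)
```

Compared with mean squared error on raw outputs, this loss treats the outputs as class scores and penalizes confident wrong predictions strongly, which usually suits classification tasks such as this 10-class problem better.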
Task Overview - Time to Hack!
1. Initialization
■ length/initializers/xavier.py
2. Sigmoid
■ length/functions/sigmoid.py
3. ReLU
■ length/functions/relu.py
4. Adam
■ length/optimizers/adam.py
5. Dropout
■ length/functions/dropout.py
Run tests with: pytest
Run actual training: python train.py --optimizer [sgd,adam]

$ python train.py --optimizer adam
train: epoch: 0, loss: 0.12, accuracy 0.94, iteration: 900
running test set...
test: epoch: 0, loss: 0.18, accuracy 0.96
train: epoch: 1, loss: 0.14, accuracy 0.94, iteration: 900
running test set...
test: epoch: 1, loss: 0.11, accuracy 0.98
train: epoch: 2, loss: 0.09, accuracy 0.97, iteration: 900
running test set...
test: epoch: 2, loss: 0.08, accuracy 0.99
train: epoch: 3, loss: 0.02, accuracy 0.98, iteration: 900
running test set...
test: epoch: 3, loss: 0.11, accuracy 0.98
train: epoch: 4, loss: 0.03, accuracy 1.00, iteration: 900
running test set...
test: epoch: 4, loss: 0.10, accuracy 0.98
train: epoch: 5, loss: 0.10, accuracy 0.97, iteration: 900
running test set...
test: epoch: 5, loss: 0.08, accuracy 0.99
train: epoch: 6, loss: 0.04, accuracy 0.98, iteration: 900
running test set...
test: epoch: 6, loss: 0.08, accuracy 0.99
train: epoch: 7, loss: 0.05, accuracy 0.98, iteration: 900
running test set...
test: epoch: 7, loss: 0.08, accuracy 0.99
train: epoch: 8, loss: 0.00, accuracy 1.00, iteration: 900
running test set...
test: epoch: 8, loss: 0.07, accuracy 1.00
train: epoch: 9, loss: 0.00, accuracy 1.00, iteration: 900
running test set...
test: epoch: 9, loss: 0.08, accuracy 0.99
Next Time
We use a real framework for inference with a trained model.
Send an email or visit us anytime with questions!
Christian: [email protected] H-1.11
Joseph: [email protected] H-1.21
Please get the students to leave the room in order to end the presentation.
All Tasks
1. Data Loading
2. Initialization
3. Fully Connected Layer
4. Mean Squared Error
5. SGD
6. Sigmoid
7. ReLU
8. Adam
9. Dropout
Bonus Bonus:
1. tanh