“Hello world” of deep learning (Deep Learning Toolkit: Keras), by Hung-yi Lee
![Page 1: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/1.jpg)
“Hello world” of deep learning
![Page 2: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/2.jpg)
Keras
TensorFlow or Theano
• Very flexible
• Need some effort to learn
Keras (interface of TensorFlow or Theano)
• Easy to learn and use (still has some flexibility)
• You can modify it if you can write TensorFlow or Theano
If you want to learn Theano:
• http://speech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015_2/Lecture/Theano%20DNN.ecm.mp4/index.html
• http://speech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015_2/Lecture/RNN%20training%20(v6).ecm.mp4/index.html
![Page 3: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/3.jpg)
Keras
• François Chollet is the author of Keras.
• He currently works for Google as a deep learning engineer and researcher.
• Keras means horn in Greek
• Documentation: http://keras.io/
• Example: https://github.com/fchollet/keras/tree/master/examples
![Page 4: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/4.jpg)
Impressions of using Keras
Thanks to 沈昇勳 for providing the image
![Page 5: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/5.jpg)
Example Application
• Handwritten Digit Recognition
Input: 28 x 28 image → Machine → output: “1”
“Hello world” for deep learning
MNIST Data: http://yann.lecun.com/exdb/mnist/
Keras provides dataset loading functions: http://keras.io/datasets/
![Page 6: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/6.jpg)
Keras
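Network structure (from input to output):
• Input: 28x28
• Hidden layer: 500
• Hidden layer: 500
• Output layer: Softmax, giving y1, y2, ……, y10
Available activation functions include: softplus, softsign, relu, tanh, hard_sigmoid, linear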
![Page 7: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/7.jpg)
Keras
Several alternatives for the loss function: https://keras.io/objectives/
![Page 8: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/8.jpg)
Keras
Step 3.1: Configuration
• Optimizers: SGD, RMSprop, Adagrad, Adadelta, Adam, Adamax, Nadam
Step 3.2: Find the optimal network parameters
• Inputs: training data (images) and labels (digits)
• The other settings are explained in the following slides
![Page 9: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/9.jpg)
Keras
Step 3.2: Find the optimal network parameters
https://www.tensorflow.org/versions/r0.8/tutorials/mnist/beginners/index.html
• Training data (images): numpy array of shape (number of training examples) x (28 x 28 = 784)
• Labels (digits): numpy array of shape (number of training examples) x 10
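The two array shapes can be illustrated with a small NumPy sketch. It uses random stand-in images rather than real MNIST, the label values are hypothetical, and the scaling to [0, 1] is a common preprocessing choice that the slide does not state.

```python
import numpy as np

# Stand-in for a set of grayscale digit images: 5 images of 28 x 28 pixels.
# Random values are used so the sketch runs without downloading MNIST.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 28, 28))
digits = np.array([3, 0, 9, 1, 7])  # hypothetical labels, one per image

# Flatten each image into a 784-dimensional vector.
# Scaling to [0, 1] is an assumption, not something the slide specifies.
x_train = images.reshape(len(images), 28 * 28).astype("float32") / 255.0

# One-hot encode the labels: row n has a single 1 in column digits[n].
y_train = np.zeros((len(digits), 10), dtype="float32")
y_train[np.arange(len(digits)), digits] = 1.0

print(x_train.shape)  # (5, 784)
print(y_train.shape)  # (5, 10)
```

Each row of `x_train` is one training example and the matching row of `y_train` is its label, so both arrays share the same first dimension.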
![Page 10: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/10.jpg)
Keras
How to use the neural network (testing):
case 1:
case 2:
Save and load models: http://keras.io/getting-started/faq/#how-can-i-save-a-keras-model
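The code shown on the Keras slides is not preserved in this transcript, so the following is only a sketch of how the steps might look end to end with a recent Keras release. Several things are assumptions: the sigmoid activations, the Adam optimizer, the two epochs, the random stand-in data, the file name, and the reading of the two testing cases as `evaluate` (labelled data) and `predict` (unlabelled data).

```python
import os
import tempfile

import numpy as np

try:
    import keras                  # standalone Keras
except ImportError:
    from tensorflow import keras  # Keras bundled with TensorFlow

# Random stand-in data with the shapes described on the slides
# (N x 784 inputs, N x 10 one-hot labels). This is not real MNIST.
rng = np.random.default_rng(0)
x = rng.random((200, 784)).astype("float32")
y = np.eye(10, dtype="float32")[rng.integers(0, 10, size=200)]

# Step 1: network structure 784 -> 500 -> 500 -> 10 with a softmax output.
# The sigmoid hidden activations are an assumption.
model = keras.Sequential([
    keras.Input(shape=(784,)),
    keras.layers.Dense(500, activation="sigmoid"),
    keras.layers.Dense(500, activation="sigmoid"),
    keras.layers.Dense(10, activation="softmax"),
])

# Step 2 and step 3.1: choose the loss function and the optimizer.
model.compile(loss="categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])

# Step 3.2: find the network parameters.
model.fit(x, y, batch_size=100, epochs=2, verbose=0)

# Testing with labels available (assumed "case 1"): loss and accuracy.
loss, accuracy = model.evaluate(x, y, verbose=0)

# Testing without labels (assumed "case 2"): class probabilities per input.
probabilities = model.predict(x[:5], verbose=0)
predicted_digits = probabilities.argmax(axis=1)

# Save the model to a temporary file and load it back.
path = os.path.join(tempfile.mkdtemp(), "mnist_sketch.keras")
model.save(path)
restored = keras.models.load_model(path)
```

Because the data is random, the reported accuracy is meaningless here; the sketch only shows the order of the calls and the shapes involved.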
![Page 11: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/11.jpg)
Keras
• Using a GPU to speed up training
• Way 1 (on the command line)
• THEANO_FLAGS=device=gpu0 python YourCode.py
• Way 2 (in your code, before Theano is imported)
• import os
• os.environ["THEANO_FLAGS"] = "device=gpu0"
![Page 12: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/12.jpg)
Live Demo
![Page 13: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/13.jpg)
Mini-batch
Each example $x^n$ is fed to the network (NN); its output $y^n$ is compared with $\hat{y}^n$ to give the cost $C^n$.
• Randomly initialize network parameters
• Pick the 1st batch ($x^1, x^{31}, \dots$): $L' = C^1 + C^{31} + \cdots$, then update parameters once
• Pick the 2nd batch ($x^2, x^{16}, \dots$): $L'' = C^2 + C^{16} + \cdots$, then update parameters once
• …
• Until all mini-batches have been picked: one epoch
• Repeat the above process
We do not really minimize total loss!
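The mini-batch procedure can be sketched in NumPy on a toy problem. Everything specific here is an assumption made for illustration: the linear model, the squared-error cost, the learning rate, and the data sizes. The batch cost is also averaged rather than summed, which only rescales the step size.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data for a linear model y = x @ w (hypothetical, for illustration only).
n_examples, n_features, batch_size = 1000, 5, 100
true_w = rng.normal(size=n_features)
x = rng.normal(size=(n_examples, n_features))
y = x @ true_w

w = rng.normal(size=n_features)  # randomly initialize the parameters
learning_rate = 0.1              # assumed value
updates = 0

for epoch in range(20):  # repeat the whole process
    for start in range(0, n_examples, batch_size):
        xb = x[start:start + batch_size]  # pick the next mini-batch
        yb = y[start:start + batch_size]
        error = xb @ w - yb
        # Gradient of the cost of this mini-batch only, not of the total loss.
        grad = 2.0 * xb.T @ error / len(xb)
        w -= learning_rate * grad         # update parameters once
        updates += 1

print(updates)  # 200, i.e. 20 epochs x 10 mini-batches per epoch
print(np.abs(w - true_w).max())
```

Each update uses only the cost of the current mini-batch, which is the sense in which the total loss is never minimized directly.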
![Page 14: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/14.jpg)
Mini-batch
• 100 examples in a mini-batch
• Pick the 1st batch: $L' = C^1 + C^{31} + \cdots$, then update parameters once
• Pick the 2nd batch: $L'' = C^2 + C^{16} + \cdots$, then update parameters once
• …
• Until all mini-batches have been picked: one epoch
• Repeat 20 times
• Batch size = 1: stochastic gradient descent
Batch size influences both speed and performance. You have to tune it.
![Page 15: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/15.jpg)
Speed
• Smaller batch size means more updates in one epoch
• E.g. 50000 examples
• batch size = 1, 50000 updates in one epoch
• batch size = 10, 5000 updates in one epoch
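The update counts above follow from dividing the number of examples by the batch size. A minimal sketch, where the helper name `updates_per_epoch` is hypothetical and the rounding up covers a final partial batch:

```python
import math


def updates_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of parameter updates in one pass over the training data."""
    # Round up so that a final, smaller batch still counts as one update.
    return math.ceil(num_examples / batch_size)


for batch_size in (1, 10, 100):
    print(batch_size, updates_per_epoch(50000, batch_size))
# 1 50000
# 10 5000
# 100 500
```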
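GTX 980 on MNIST with 50000 training examples
[Figure: training time comparison; bar values 166s and 17s; labels: 1 epoch, 10 epoch]
With batch size = 1, one epoch gives 50000 updates; with batch size = 10, ten epochs give 50000 updates. Batch size = 1 and batch size = 10 update the parameters the same number of times in the same period.
Batch size = 10 is more stable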
![Page 16: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/16.jpg)
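Speed - Matrix Operation
Network inputs $x_1, x_2, \dots, x_N$; outputs $y_1, y_2, \dots, y_M$; layer parameters $W^1, b^1, W^2, b^2, \dots, W^L, b^L$; layer outputs $a^1, a^2, \dots, y$.
$$y = f(x) = \sigma\left(W^L \cdots \sigma\left(W^2\, \sigma\left(W^1 x + b^1\right) + b^2\right) \cdots + b^L\right)$$
Forward pass (Backward pass is similar)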
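The forward pass is a chain of matrix-vector products, each followed by the activation. A minimal NumPy sketch, assuming a sigmoid for $\sigma$ and small, arbitrary layer sizes chosen only for illustration:

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(x, weights, biases):
    """Apply a = sigmoid(W @ a + b) layer by layer, starting from a = x."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a


rng = np.random.default_rng(0)
sizes = [4, 3, 3, 2]  # N = 4 inputs, two hidden layers, M = 2 outputs (assumed)
weights = [rng.normal(size=(n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=n_out) for n_out in sizes[1:]]

x = rng.normal(size=sizes[0])
y = forward(x, weights, biases)
print(y.shape)  # (2,)
```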
![Page 17: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/17.jpg)
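Speed - Matrix Operation
• Why is mini-batch faster than stochastic gradient descent?
Stochastic Gradient Descent: $z^1 = W^1 x$ is computed separately for each example $x$, one matrix-vector product at a time.
Mini-batch: the examples are placed side by side as the columns of one matrix, so $\begin{bmatrix} z^1 & z^1 \end{bmatrix} = W^1 \begin{bmatrix} x & x \end{bmatrix}$ is a single matrix-matrix product (the two $x$ are different examples).
Practically, which one is faster?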
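The two ways of computing $z^1$ give the same numbers; only the number of separate products differs. A small NumPy sketch, where the matrix sizes are assumptions that echo the 784-input, 500-unit layer and a batch of 100:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(500, 784))  # first-layer weights (random stand-in)
X = rng.normal(size=(784, 100))   # 100 examples, one per column

# Stochastic gradient descent style: one matrix-vector product per example.
Z_loop = np.stack([W1 @ X[:, n] for n in range(X.shape[1])], axis=1)

# Mini-batch style: a single matrix-matrix product for all examples.
Z_batch = W1 @ X

print(Z_batch.shape)                 # (500, 100)
print(np.allclose(Z_loop, Z_batch))  # True
```

Timing the two variants is left out on purpose, since the result depends on the hardware and the linear algebra library in use.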
![Page 18: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/18.jpg)
Performance
• Larger batch size yields more efficient computation.
• However, it can yield worse performance
![Page 19: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/19.jpg)
Shuffle the training examples for each epoch
Epoch 1: mini-batches ($x^1, x^{31}, \dots$) and ($x^2, x^{16}, \dots$)
Epoch 2: the same examples are regrouped into different mini-batches
Don’t worry. This is the default of Keras.
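In Keras this behaviour corresponds to the `shuffle` argument of `fit`, which defaults to `True`. What shuffling does to the mini-batches can be sketched by hand in NumPy; the sizes below are arbitrary and chosen only to keep the printout short.

```python
import numpy as np

rng = np.random.default_rng(0)
n_examples, batch_size = 12, 4  # assumed sizes for illustration

batches_per_epoch = []
for epoch in range(2):
    # Draw a fresh random order of the example indices for this epoch.
    order = rng.permutation(n_examples)
    # Cut the shuffled order into consecutive mini-batches.
    batches = [order[i:i + batch_size]
               for i in range(0, n_examples, batch_size)]
    batches_per_epoch.append(batches)
    print(f"Epoch {epoch + 1}:", [b.tolist() for b in batches])
```

Every example still appears exactly once per epoch; only its mini-batch companions change.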
![Page 20: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/20.jpg)
Analysis
Network inputs: x1, x2, ……, xN
Arranging the weights according to the pixels they are connected to
Red: positive
Blue: negative
When does the neuron have the largest output?
The neurons in the first layer usually detect parts of the digits.
![Page 21: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/21.jpg)
![Page 22: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/22.jpg)
![Page 23: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/23.jpg)
![Page 24: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/24.jpg)
Try another task
http://top-breaking-news.com/
Document → Machine → category: Politics, Sports, or Economy
Input features: “president” in document, “stock” in document
Example categories: Sports, Politics, Finance
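Features such as “president” in document can be encoded as a vector of 0s and 1s, one entry per vocabulary word. A minimal sketch, where the function name, the vocabulary, and the example sentences are all hypothetical and splitting on whitespace stands in for real tokenization:

```python
def document_features(text, vocabulary):
    """Return 1 for each vocabulary word that occurs in the text, else 0."""
    words = set(text.lower().split())
    return [1 if word in words else 0 for word in vocabulary]


vocabulary = ["president", "stock", "game"]  # hypothetical vocabulary

print(document_features("The president met the press", vocabulary))
# [1, 0, 0]
print(document_features("Stock prices rose after the game", vocabulary))
# [0, 1, 1]
```

A vector like this plays the role that the 784 pixel values play in the digit task: it is the fixed-length input that the network maps to a category.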
![Page 25: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/25.jpg)
Try another task
![Page 26: “Hello world” of deep learningDeep Learning Toolkit: Keras Author Hung-yi Lee Created Date 10/28/2016 2:11:18 AM ...](https://reader033.fdocuments.in/reader033/viewer/2022053118/609ddb6da44d8174706ec65c/html5/thumbnails/26.jpg)
Live Demo