Introduction to Deep Learning - Robot Learning Laboratory
Introduction to Deep Learning: Recurrent Neural Networks
Prof. Songhwai Oh, ECE, SNU
Prof. Songhwai Oh (ECE, SNU) Introduction to Deep Learning 1
RECURRENT NEURAL NETWORKS (RNN)
Adapted from slides by Hyemin Ahn
Recurrent Neural Networks : For what?
• Humans remember and use the pattern of a sequence.
• Try ‘a b c d e f g …’
• But how about ‘z y x w v u t s …’?
• The idea behind RNNs is to make use of sequential information.
• Let’s learn the pattern of a sequence and utilize it (estimate, generate, etc.)
• But HOW?
Recurrent Neural Networks : Typical RNNs
[Diagram: INPUT → HIDDEN STATE → OUTPUT, with a one-step delay feeding the hidden state back into itself.]
• RNNs are called RECURRENT because they perform the same task for every element of a sequence, with each output depending on the previous computations.
• RNNs have a memory that captures information about what has been computed so far.
• The hidden state captures some information about the sequence.
• If we use plain tanh recurrences, the vanishing or exploding gradient problem happens.
• To overcome this, we use LSTM or GRU.
  • LSTM: long short-term memory
  • GRU: gated recurrent unit
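The recurrence described above can be sketched in a few lines of NumPy. This is a minimal vanilla-RNN step (weight names and sizes are illustrative, not from the slides): the same weights are applied at every element of the sequence, and the hidden state carries the memory forward.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One step of a vanilla RNN: the new hidden state mixes the
    current input with the previous hidden state (the 'memory')."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

# Tiny example: run a length-3 sequence through the recurrence.
rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 8
W_xh = rng.normal(size=(input_dim, hidden_dim)) * 0.1
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
b_h = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)                   # initial hidden state
for x_t in rng.normal(size=(3, input_dim)):
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)  # same weights every step
```

Because tanh squashes into (-1, 1) and the same `W_hh` is multiplied in at every step, repeated products of its Jacobian are exactly what makes gradients vanish or explode over long sequences.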
Recurrent Neural Networks : LSTM
• Let’s think about a machine that guesses the dinner menu from the items in a shopping bag.
Umm, Carbonara!
Cell state: the internal memory unit, like a conveyor belt!
Forget some memories!
LSTM learns (1) how to forget part of the memory in the cell state C_{t-1} when the previous hidden state h_{t-1} and the new input x_t are given, and (2) then how to add a new memory computed from the given h_{t-1} and x_t.
Insert some memories!
Figures from http://colah.github.io/posts/2015‐08‐Understanding‐LSTMs/
Forget gate
Input gate
Output gate
Cell state
Hidden state
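The forget, input, and output gates and the cell/hidden state updates above can be collected into one step function. This is a minimal NumPy sketch of the standard LSTM equations; stacking the four gate pre-activations into one weight matrix `W` is an illustrative implementation choice, not something from the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W maps [h_prev; x_t] to the four stacked gate
    pre-activations (forget, input, output, candidate); b matches."""
    H = h_prev.shape[0]
    z = np.concatenate([h_prev, x_t]) @ W + b
    f = sigmoid(z[0*H:1*H])            # forget gate: what to erase
    i = sigmoid(z[1*H:2*H])            # input gate: what to write
    o = sigmoid(z[2*H:3*H])            # output gate: what to expose
    c_tilde = np.tanh(z[3*H:4*H])      # candidate new memory
    c_t = f * c_prev + i * c_tilde     # conveyor-belt cell update
    h_t = o * np.tanh(c_t)             # hidden state from cell state
    return h_t, c_t

# Tiny example with illustrative sizes.
rng = np.random.default_rng(0)
D, H = 3, 5
W = rng.normal(size=(H + D, 4 * H)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
h, c = lstm_step(rng.normal(size=(D,)), h, c, W, b)
```

Note that the cell state `c_t` is updated only by elementwise gating and addition, which is why gradients flow along it more easily than through a plain tanh recurrence.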
Recurrent Neural Networks : GRU
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)
C_t = f_t ∗ C_{t-1} + i_t ∗ C̃_t
h_t = o_t ∗ tanh(C_t)

Maybe we can simplify this structure efficiently!

GRU

z_t = σ(W_z · [h_{t-1}, x_t])   (update gate)
r_t = σ(W_r · [h_{t-1}, x_t])   (reset gate)
h̃_t = tanh(W · [r_t ∗ h_{t-1}, x_t])
h_t = (1 − z_t) ∗ h_{t-1} + z_t ∗ h̃_t
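The GRU update on this slide can be sketched as one NumPy step; biases are omitted as in the figures, and the weight names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W_z, W_r, W_h):
    """One GRU step: two gates instead of the LSTM's three,
    and no separate cell state."""
    hx = np.concatenate([h_prev, x_t])
    z = sigmoid(hx @ W_z)                    # update gate
    r = sigmoid(hx @ W_r)                    # reset gate
    h_tilde = np.tanh(np.concatenate([r * h_prev, x_t]) @ W_h)
    return (1 - z) * h_prev + z * h_tilde    # blend old and new memory

# Tiny example with illustrative sizes.
rng = np.random.default_rng(1)
D, H = 3, 5
W_z = rng.normal(size=(H + D, H)) * 0.1
W_r = rng.normal(size=(H + D, H)) * 0.1
W_h = rng.normal(size=(H + D, H)) * 0.1
h = gru_step(rng.normal(size=(D,)), np.zeros(H), W_z, W_r, W_h)
```

The simplification relative to the LSTM: the update gate `z` plays the roles of both forget and input gates, and the hidden state itself serves as the memory.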
Sequence to Sequence Model: What is it?
[Diagram: an LSTM/GRU encoder reads the input sequence (steps 1–5), and an LSTM/GRU decoder generates the output sequence. Example: transition from ‘Western Food’ to ‘Korean Food’.]
Sequence to Sequence Model: Implementation
• The simplest way to implement a sequence-to-sequence model is to just pass the last hidden state of the encoder to the first GRU cell of the decoder!
• However, this method gets weaker when the decoder needs to generate a longer sequence.
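The hand-off described above can be sketched as follows. This is a toy example, not the slides' implementation: a stand-in tanh cell replaces a real GRU, and the readout that turns a hidden state into the next decoder input is an illustrative placeholder.

```python
import numpy as np

def cell_step(x_t, h_prev, W):
    # Stand-in recurrent cell (a single tanh layer, not a full GRU),
    # just to show how the hidden state is threaded through.
    return np.tanh(np.concatenate([h_prev, x_t]) @ W)

def seq2seq(inputs, n_out, W_enc, W_dec, start_token):
    h = np.zeros(W_enc.shape[1])
    for x_t in inputs:                  # --- encoder ---
        h = cell_step(x_t, h, W_enc)
    # The encoder's LAST hidden state seeds the decoder's FIRST cell.
    outputs, y = [], start_token
    for _ in range(n_out):              # --- decoder ---
        h = cell_step(y, h, W_dec)
        y = h[:start_token.shape[0]]    # toy readout: slice of h
        outputs.append(y)
    return outputs

# Tiny example with illustrative sizes.
rng = np.random.default_rng(2)
D, H = 4, 8
W_enc = rng.normal(size=(H + D, H)) * 0.1
W_dec = rng.normal(size=(H + D, H)) * 0.1
outs = seq2seq(rng.normal(size=(5, D)), 3, W_enc, W_dec, np.zeros(D))
```

The weakness on long outputs is visible in the structure: the single vector `h` handed from encoder to decoder is the only channel carrying the entire input sequence.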
Sequence to Sequence Model: Attention Decoder
Bidirectional GRU Encoder
Attention GRU Decoder
• For each GRU cell constituting the decoder, let’s pass the encoder’s information differently!

s_i = (1 − z_i) ∗ s_{i-1} + z_i ∗ s̃_i,   s̃_i = tanh(W · [r_i ∗ s_{i-1}, c_i])
c_i = Σ_j α_{ij} h_j
α_{ij} = exp(e_{ij}) / Σ_k exp(e_{ik}),   e_{ij} = v_a^T tanh(W_a s_{i-1} + U_a h_j)
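The attention step can be sketched as below, assuming the standard additive (Bahdanau-style) form: score each encoder hidden state against the previous decoder state, normalize with a softmax, and take the weighted sum as the context vector. All weight names here are assumptions for illustration.

```python
import numpy as np

def attention_context(s_prev, enc_states, W_a, U_a, v_a):
    """Additive attention: returns the context vector c_i and the
    attention weights alpha over the encoder states h_1..h_T."""
    # e_ij = v_a^T tanh(W_a s_{i-1} + U_a h_j), one score per h_j
    scores = np.tanh(s_prev @ W_a + enc_states @ U_a) @ v_a
    alpha = np.exp(scores - scores.max())     # stable softmax
    alpha = alpha / alpha.sum()
    context = alpha @ enc_states              # c_i = sum_j alpha_ij h_j
    return context, alpha

# Tiny example with illustrative sizes.
rng = np.random.default_rng(3)
T, H, A = 5, 6, 4
enc_states = rng.normal(size=(T, H))
ctx, alpha = attention_context(rng.normal(size=(H,)), enc_states,
                               rng.normal(size=(H, A)),
                               rng.normal(size=(H, A)),
                               rng.normal(size=(A,)))
```

Each decoder step recomputes its own `alpha`, so different output positions can draw on different parts of the input sequence instead of one fixed summary vector.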
Wrap Up
• Recurrent neural networks
• LSTM
• GRU
• Seq2Seq