![Page 1: Visualizing and Understanding Recurrent Networks (web.cs.ucdavis.edu/~yjlee/teaching/ecs289g-fall2016/ismail2.pdf)](https://reader034.fdocuments.in/reader034/viewer/2022052022/60375645135dd66cf575a5ca/html5/thumbnails/1.jpg)
Visualizing and Understanding Recurrent Networks
Andrej Karpathy, Justin Johnson, Li Fei-Fei
Presented by: Ismail
LSTM (Long Short-Term Memory)
RNN
Let’s pause for a moment...
Experiments in the paper
Datasets:
● Leo Tolstoy’s War and Peace (WP) novel -- 3,258,256 characters, vocabulary size K = 87
● Linux Kernel (LK) -- 6,206,996 characters, vocabulary size K = 101
Training (cross product of):
● type (LSTM/RNN/GRU)
● number of layers (1/2/3)
● number of parameters (4 settings)
● both datasets (WP & LK)
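The cross product above can be enumerated with a few lines of Python. This is only a sketch of how such a grid might be built: the slide says "4 settings" for the number of parameters without naming them, so the `size_settings` labels below are hypothetical placeholders, and `experiment_grid` is a name made up for this illustration.

```python
from itertools import product

# Factors listed on the slide. The size labels are placeholders (assumption):
# the slide only states that there are 4 parameter-count settings.
model_types = ["LSTM", "RNN", "GRU"]
num_layers = [1, 2, 3]
size_settings = ["size_1", "size_2", "size_3", "size_4"]  # hypothetical labels
datasets = ["WP", "LK"]


def experiment_grid():
    """Return one dict per (type, depth, size, dataset) combination."""
    return [
        {"type": t, "layers": n, "size": s, "dataset": d}
        for t, n, s, d in product(model_types, num_layers, size_settings, datasets)
    ]


grid = experiment_grid()
print(len(grid))  # 3 * 3 * 4 * 2 = 72
```

Enumerating the grid this way makes the total explicit: 3 types, 3 depths, 4 sizes and 2 datasets give 72 training runs.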
Experiments in the paper
● depth >= 2 is beneficial
● LSTM, GRU >> RNN
Internal Mechanisms of LSTMs
Understanding long-range interactions of LSTMs
n-gram vs n-NN
- The best recurrent network outperforms the 20-gram model in test-set cross-entropy loss (WP -- 1.077 vs 1.195; LK -- 0.84 vs 0.889)
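To make the comparison concrete, here is a minimal sketch of a character-level n-gram model scored by the mean negative log-probability per character. It is not the baseline from the paper: add-one smoothing is an assumption chosen for brevity, the natural log is assumed for the loss, and the function names `train_char_ngram` and `cross_entropy` are made up for this example.

```python
import math
from collections import Counter


def train_char_ngram(text, n):
    """Count n-grams and their (n-1)-character contexts in `text`."""
    ngram_counts = Counter()
    context_counts = Counter()
    for i in range(n - 1, len(text)):
        context = text[i - (n - 1):i]
        ngram_counts[(context, text[i])] += 1
        context_counts[context] += 1
    return ngram_counts, context_counts


def cross_entropy(text, n, ngram_counts, context_counts, vocab_size):
    """Mean negative log-probability (in nats) per predicted character."""
    total = 0.0
    count = 0
    for i in range(n - 1, len(text)):
        context = text[i - (n - 1):i]
        # Add-one smoothing: an assumption of this sketch, not the paper's method.
        numerator = ngram_counts[(context, text[i])] + 1
        denominator = context_counts[context] + vocab_size
        total -= math.log(numerator / denominator)
        count += 1
    return total / count


# Toy data: a bigram model (n = 2) over a two-character vocabulary.
ngrams, contexts = train_char_ngram("abababab", n=2)
loss = cross_entropy("abab", 2, ngrams, contexts, vocab_size=2)
print(loss)
```

A lower value means the model assigns higher probability to the characters that actually occur, which is the sense in which the recurrent network beats the 20-gram model on both datasets.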
Error analysis
- A character counts as an error if the probability assigned to it at the previous time step is < 0.5
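The error criterion can be sketched in Python as follows. The data layout is an assumption made for this illustration: `predicted[i]` is taken to be the distribution over characters that a model produced one step earlier for position `i`, and `find_errors` is a hypothetical helper name.

```python
def find_errors(text, predicted, threshold=0.5):
    """Flag each character whose assigned probability is below `threshold`.

    `predicted[i]` is assumed to be a dict mapping candidate characters to the
    probability the model gave them, at the previous time step, for text[i].
    """
    errors = []
    for ch, dist in zip(text, predicted):
        # A character missing from the distribution is treated as probability 0.
        errors.append(dist.get(ch, 0.0) < threshold)
    return errors


# Toy example with made-up probabilities.
text = "the"
predicted = [
    {"t": 0.6, "a": 0.4},  # "t" received 0.6, so it is not an error
    {"h": 0.3, "o": 0.7},  # "h" received 0.3, so it is an error
    {"e": 0.9, "a": 0.1},  # "e" received 0.9, so it is not an error
]
errors = find_errors(text, predicted)
print(errors)  # [False, True, False]
```

With a threshold of 0.5, at most one character can be a non-error at each position, so a flagged character is one where the model did not place most of its probability mass on the correct answer.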
Unique errors
LSTM on “}”
Training dynamics
Other RNN-based applications
Feels like it was an RNN day. ;)
Questions?