Deep Learning for Natural Language Processing
Stephen Clark et al…
DeepMind and University of Cambridge
5. Recurrent Neural Networks
Felix Hill, DeepMind
What are neural nets for?
How can you apply a neural net to language?
“language does not naturally go here, ahem, but fortunately…..”
what’s the issue here????
That’s the whole point!!
What is James doing in the store room?
searching for a book…
What is that empty cup doing over there?
err..being a cup?
time flies like an arrow
fruit flies like a banana
The networks that are good at Go and Atari were first developed for this reason!
Finding structure in time - Elman, 1990
The simple recurrent network (now RNN)
[Slides 17 to 33: animation of the simple recurrent network reading one word per step: now, this, is, a, story, all, about, how, my, life, got, flipped, turned, upside, down. Each slide shows the current word alongside the words already read.]
Suppose we have a vocabulary of 100k words.
How many weights are there in Elman’s network?
$$h_t = \tanh(U h_{t-1} + W x_t)$$
$$y_t = V h_t$$
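The recurrence and the weight-counting question can be made concrete with a small sketch. This is only an illustration under stated assumptions, not Elman's original setup: inputs are taken to be one-hot word vectors, the output has one unit per vocabulary word, there are no bias terms (none appear in the equations), and the hidden size of 100 is a made-up value chosen for the example. The names `elman_step` and `count_weights` are hypothetical.

```python
import numpy as np

def elman_step(U, W, V, h_prev, x_t):
    """One step of the recurrence: h_t = tanh(U h_{t-1} + W x_t), y_t = V h_t."""
    h_t = np.tanh(U @ h_prev + W @ x_t)
    y_t = V @ h_t
    return h_t, y_t

def count_weights(vocab_size, hidden_size):
    """Entries in U (H x H), W (H x vocab) and V (vocab x H); biases are not counted."""
    return hidden_size * hidden_size + 2 * hidden_size * vocab_size

# Toy run over the first few words of the example sentence.
rng = np.random.default_rng(0)
vocab = ["now", "this", "is", "a", "story"]
H = 4  # assumed hidden size for the toy run

U = rng.normal(scale=0.1, size=(H, H))
W = rng.normal(scale=0.1, size=(H, len(vocab)))
V = rng.normal(scale=0.1, size=(len(vocab), H))

h = np.zeros(H)  # empty context before the first word
for word in vocab:
    x = np.zeros(len(vocab))
    x[vocab.index(word)] = 1.0  # one-hot encoding of the current word (assumption)
    h, y = elman_step(U, W, V, h, x)

# Vocabulary of 100k words, with an assumed hidden size of 100.
n_weights = count_weights(vocab_size=100_000, hidden_size=100)
print(n_weights)  # 20010000
```

Under these assumptions the count is dominated by the two vocabulary-sized matrices W and V; the recurrent matrix U contributes only the square of the hidden size.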
What is represented here?
[Figure, two slides: the sentence "now this is a story all about how my life got flipped turned upside down", first with "down" as the incoming word and then with "down" appended]
Finding structure in time
Finding more structure in time
Any downsides?
[Figure: the recurrent network with "down" as the current word (left) = a network drawn out across the earlier words "now this is a story all about how my life got flipped turned upside" (right)]
“Vanishing” gradients
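[Figure: the network unrolled over "now this is a story all about how my life got flipped turned upside", drawn as a chain from input $x$ through weights $w_1, \dots, w_{n-1}, w_n$ and activations $a_1, a_2, \dots, a_n$ to the output $f(x)$ and the cost $c(f(x), y)$ against the target $y$]
$$\frac{dC}{dw_1} \propto \sigma'(z_1) \times w_2 \times \sigma'(z_2) \times w_3 \cdots \times w_n \times \sigma'(z_n) \times \frac{dC}{da_n}$$
where $a_i = \sigma(z_i)$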
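The product form of the gradient can be checked numerically on a toy chain. The sketch below assumes a scalar network with $z_i = w_i a_{i-1}$, $a_0 = x$, a sigmoid for $\sigma$, and a squared-error cost; the weight values and the function names are made up for illustration and do not come from the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(ws, x):
    """Run the scalar chain z_i = w_i * a_{i-1}, a_i = sigmoid(z_i); return all z_i and a_n."""
    zs, a = [], x
    for w in ws:
        z = w * a
        zs.append(z)
        a = sigmoid(z)
    return zs, a

def cost(ws, x, y):
    """Squared-error cost C = 0.5 * (a_n - y)^2 (an assumed choice of cost)."""
    _, a_n = forward(ws, x)
    return 0.5 * (a_n - y) ** 2

def grad_w1(ws, x, y):
    """dC/dw_1 = x * s'(z_1) * w_2 * s'(z_2) * ... * w_n * s'(z_n) * dC/da_n."""
    zs, a_n = forward(ws, x)
    dsig = [sigmoid(z) * (1.0 - sigmoid(z)) for z in zs]
    g = x * dsig[0]
    for w, d in zip(ws[1:], dsig[1:]):
        g *= w * d
    return g * (a_n - y)

ws = [0.8, -1.2, 0.5, 1.5, -0.7]  # hypothetical weights w_1..w_5
x, y = 1.0, 0.0

analytic = grad_w1(ws, x, y)

# Central finite difference on w_1 as an independent check.
eps = 1e-6
plus = [ws[0] + eps] + ws[1:]
minus = [ws[0] - eps] + ws[1:]
numeric = (cost(plus, x, y) - cost(minus, x, y)) / (2 * eps)

print(analytic, numeric)
```

Since the sigmoid's derivative never exceeds 0.25, each extra layer multiplies the gradient by a factor that is small unless the corresponding weight is large, which is why the gradient at $w_1$ is tiny even for this five-step chain.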
“Vanishing” gradients
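[Figure: the unrolled chain of weights $w_1, \dots, w_{n-1}, w_n$ and activations $a_1, a_2, \dots, a_n$ from input $x$ to the cost $c(f(x), y)$, with a plot of $\sigma(x)$ and $\sigma'(x)$]
In an RNN:
$$\frac{dC}{dw_1} \propto w^{n} \times \sigma'(z_1) \times \cdots \times \sigma'(z_n) \times \frac{dC}{da_n}$$
(or exploding)
Small change, big consequences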
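A minimal sketch of why a shared recurrent weight makes the gradient either vanish or explode. It assumes a scalar version of the recurrence with zero input, $h_t = \tanh(u\, h_{t-1})$, and tracks the sensitivity $dh_n/dh_0$ rather than $dC/dw_1$; the same repeated factor (weight times activation slope) appears in both. The function name and the values of `u` are hypothetical.

```python
import math

def state_sensitivity(u, h0, n_steps):
    """Return dh_n/dh_0 for h_t = tanh(u * h_{t-1}) as a chain-rule product."""
    h, grad = h0, 1.0
    for _ in range(n_steps):
        h = math.tanh(u * h)
        # d tanh(u * h_prev) / d h_prev = u * (1 - h_t^2)
        grad *= u * (1.0 - h * h)
    return grad

for u in (0.5, 1.5):
    for n in (5, 10, 20):
        print(f"u={u} steps={n} dh_n/dh_0={state_sensitivity(u, h0=1e-9, n_steps=n):.3e}")
```

With the state started very close to zero, the tanh slope stays near 1, so the product behaves like $u^n$: it shrinks towards zero for $u = 0.5$ and grows rapidly for $u = 1.5$. Over many more steps the state saturates and the slope factor falls, so this toy model only shows the early growth.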
One final thing…no output words…..
BPTT
But, more typically…
http://www.cs.toronto.edu/~ilya/rnn.html
References
Finding structure in time (Elman, 1990)
Description and analysis of a recurrent neural network, inference of structure in unsegmented sequences
Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks (Graves et al., 2006)
Scales Elman up to the ML age
Recurrent neural network-based language model (Mikolov et al., 2010)
Scales Graves up to running text
Learning to understand phrases by embedding the dictionary (Hill et al., 2015)
Learns to predict words from dictionary definitions