Deep Learning for Natural Language Processing · 2020-01-27

Deep Learning for Natural Language Processing Stephen Clark et al. University of Cambridge and DeepMind

13. Sentence Representations

Felix Hill, DeepMind

What does a sentence mean?

Classical perspective

• Sentence representations are logical expressions.

• Sentence understanding is parsing and combining constituents to obtain logical form.

• Syntax guides semantics.
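
The classical pipeline can be illustrated with a minimal Python sketch. It assumes a hypothetical three-word lexicon and a fixed subject-verb-object parse; the names `LEXICON` and `logical_form` are illustrative and do not come from the lecture. Word meanings are constants or functions, and the parse decides the order in which they are applied.

```python
# Toy lexicon (hypothetical): names denote constants; a transitive verb
# denotes a curried function that takes its object first, then its subject.
LEXICON = {
    "John": "John",
    "Mary": "Mary",
    "loves": lambda obj: lambda subj: ("loves", subj, obj),
}


def logical_form(subject, verb, obj):
    """Compose a subject-verb-object sentence in the order its parse dictates."""
    verb_phrase = LEXICON[verb](LEXICON[obj])   # VP -> V NP
    return verb_phrase(LEXICON[subject])        # S  -> NP VP


print(logical_form("John", "loves", "Mary"))  # ('loves', 'John', 'Mary')
```

Swapping the subject and object changes the resulting expression, which is the sense in which syntax guides semantics here.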

Classical perspective

Pros:

• Intuitive and interpretable(?) representations.

• Leverage the power of predicate logic to model semantics.

• Evaluate the truth of statements, derive conclusions, etc.
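
Evaluating truth and deriving conclusions, listed among the advantages of the classical perspective, can be sketched in a few lines of Python. The world `FACTS` and the single rule "loves(x, y) implies cares_about(x, y)" are hypothetical and chosen only for illustration; a real system would use a proper theorem prover.

```python
# Toy world (hypothetical): the set of ground facts that hold.
FACTS = {("loves", "John", "Mary"), ("human", "John"), ("human", "Mary")}


def is_true(statement, facts):
    """A ground statement is true exactly when it is one of the facts."""
    return statement in facts


def derive(facts):
    """Apply one illustrative rule: loves(x, y) implies cares_about(x, y)."""
    conclusions = {("cares_about", f[1], f[2]) for f in facts if f[0] == "loves"}
    return facts | conclusions


print(is_true(("loves", "John", "Mary"), FACTS))
print(("cares_about", "John", "Mary") in derive(FACTS))
```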

Thanks to Jay McClelland for examples

Cons (1)

- “John loves Mary”: loves(John, Mary)

- “John loves ice cream”: loves(John, ice cream)

All meaning is context-dependent

Cons (2)

- “the tiger threatens the giraffe”: threatens(tiger, giraffe)

- “the protege threatens the master”: threatens(protege, master)

- “the scandal threatens the profits”: threatens(scandal, profits)

Cons (2)

- “Dave pushed the button”

- “Dave pushed the trainees”

- “Dave pushed the agenda”

- “Dave pushed the drugs”

Metaphoricity is the rule, not the exception

Cons (3)

- “the apple was in the container”

- “the juice was in the container”

Cons (3)

- “the man cut his steak”

Where did you come from?

How many animals of each kind did Moses take on the Ark?

“The haystack was important because the cloth ripped.”

Meaning is not in language; language indicates meaning

Neural networks to the rescue

• Nothing is an atom, everything a molecule (in theory)

• Linguistic signal (e.g. words), perceptual clues (e.g. vision) and semantic knowledge all represented similarly

• Representations of one information type constrain and interact with representations of others

Sentence representations in neural nets

Kalchbrenner & Blunsom, 2014

[Figure: two panels, each mapping a sentence to a vector and then to a + / - / ? label.]

Can we improve on this?

One approach...

[SHIFT, SHIFT, REDUCE, SHIFT, SHIFT, REDUCE, REDUCE]

[SHIFT, SHIFT, SHIFT, SHIFT, REDUCE, REDUCE, REDUCE]

[SHIFT, SHIFT, SHIFT, REDUCE, SHIFT, REDUCE, REDUCE]

Figure due to Sam Bowman. Reproduced with author's permission.
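
A SHIFT/REDUCE sequence fully determines a binary tree over the words of a sentence. The following minimal Python sketch shows how: SHIFT moves the next word from the buffer onto a stack, and REDUCE merges the top two stack entries into one node. The function name `apply_transitions` and the four-word example sentence are assumptions made for illustration; the three transition sequences are the ones listed on the slide.

```python
def apply_transitions(tokens, transitions):
    """Build a binary tree from tokens by following SHIFT/REDUCE transitions."""
    buffer = list(tokens)   # words still to be read, left to right
    stack = []              # partial trees built so far
    for action in transitions:
        if action == "SHIFT":
            stack.append(buffer.pop(0))     # move the next word onto the stack
        elif action == "REDUCE":
            right = stack.pop()
            left = stack.pop()
            stack.append((left, right))     # merge the top two entries
        else:
            raise ValueError(f"unknown transition: {action}")
    if buffer or len(stack) != 1:
        raise ValueError("transitions do not yield a single tree")
    return stack[0]


tokens = ["the", "cat", "sat", "down"]  # assumed example sentence
S, R = "SHIFT", "REDUCE"
print(apply_transitions(tokens, [S, S, R, S, S, R, R]))
print(apply_transitions(tokens, [S, S, S, S, R, R, R]))
print(apply_transitions(tokens, [S, S, S, R, S, R, R]))
```

The three sequences yield three different bracketings of the same four words, so a model that composes vectors at each REDUCE would produce a different sentence representation for each.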

Stanford NLI task

(1) the man inspects a painting in a museum

(2) the man is sleeping

CONTRADICTION

[Figure: a shift-reduce encoder processing “the cat sat down” with a buffer and a stack; composition, tracking and transition components are shown across a reduce step and a shift step.]

Bowman et al. 2015 (see also Socher et al. 2013)

Wang and Jiang (2015)

Parallel interactive processing wins

perception and conceptual knowledge?

Knowledge from stories

“Skip-Thought Vectors”, Kiros et al. 2015
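
Skip-thought training, as described by Kiros et al., encodes a sentence and trains decoders to reconstruct the sentences immediately before and after it in running text. A minimal sketch of how such training examples could be assembled from an ordered story follows. The function name and the example story are hypothetical, and the encoder and decoders themselves are omitted.

```python
def skip_thought_triples(sentences):
    """Yield (previous, current, next) for each sentence with both neighbours."""
    for i in range(1, len(sentences) - 1):
        yield sentences[i - 1], sentences[i], sentences[i + 1]


# Hypothetical story; any ordered list of sentences would do.
story = [
    "She opened the door.",
    "The room was dark.",
    "She reached for the light.",
    "Nothing happened.",
]

for previous, current, following in skip_thought_triples(story):
    # `current` is the encoder input; the other two are decoder targets.
    print(current, "->", previous, "|", following)
```

No labels are needed: the ordering of sentences in the story is the only supervision.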

Fast knowledge from stories

“Learning distributed representations of sentences from unlabelled data”

Hill et al. 2015

“Learning distributed representations of sentences from unlabelled data”

Hill et al. 2015

Sequential Denoising Auto-Encoder

Knowledge from raw text
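
A denoising auto-encoder is trained to recover a clean sentence from a corrupted copy, so it needs only raw text. The sketch below shows one way to produce the corrupted input, following the kind of noise described by Hill et al.: words are deleted at random and adjacent words are swapped at random. The function name and the default probabilities are placeholders chosen for illustration, not values taken from the lecture.

```python
import random


def corrupt(words, p_drop=0.1, p_swap=0.1, rng=random):
    """Return a noisy copy of `words` for a denoising auto-encoder to repair."""
    # Drop each word independently with probability p_drop.
    kept = [w for w in words if rng.random() >= p_drop]
    # Swap each non-overlapping adjacent pair with probability p_swap.
    for i in range(0, len(kept) - 1, 2):
        if rng.random() < p_swap:
            kept[i], kept[i + 1] = kept[i + 1], kept[i]
    return kept


sentence = "she went for a walk in the forest".split()
noisy = corrupt(sentence, rng=random.Random(0))
# Training pair: the model reads `noisy` and must output `sentence`.
print(noisy, "->", sentence)
```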

“Learning distributed representations of sentences from unlabelled data”

Hill et al. 2015

Knowledge from dictionaries

“Learning distributed representations of sentences from unlabelled data”

Hill et al. 2015

Knowledge from images?

Next time: full “embodiment”

Richer representation spaces

Auto-encoding for representation learning

http://kvfrans.com/variational-autoencoders-explained/

loss = pixel reconstruction loss

Auto-encoding via a richer latent space

http://kvfrans.com/variational-autoencoders-explained/

VAE: variational auto-encoder

loss = pixel reconstruction loss + KL divergence between the distribution given by the encoder predictions of mean and variance and a unit normal distribution
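
For a diagonal Gaussian encoder the KL term has a closed form (Kingma & Welling), so the whole loss can be computed directly. The sketch below is a plain-Python illustration; the function names are hypothetical, and having the encoder output the log-variance rather than the variance is an assumption made here for numerical convenience.

```python
import math


def kl_to_unit_normal(mean, log_var):
    """KL( N(mean, exp(log_var)) || N(0, 1) ), summed over latent dimensions."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mean, log_var)
    )


def vae_loss(reconstruction_loss, mean, log_var):
    """Total loss = reconstruction term + KL regulariser."""
    return reconstruction_loss + kl_to_unit_normal(mean, log_var)


# The KL term vanishes when the encoder already predicts a unit normal.
print(kl_to_unit_normal([0.0, 0.0], [0.0, 0.0]))
print(vae_loss(2.0, [1.0], [0.0]))
```

The KL term grows as the predicted mean moves away from zero or the predicted variance moves away from one, which is what pulls the latent codes toward a common, smooth region.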

She went for a walk in the forest

VAE for text

Benefits of VAE

1. Smooth(er) latent space of representations

2. Generate from the model
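
Both benefits correspond to simple operations on latent vectors. The sketch below, with hypothetical function names, walks a straight line between two latent codes and draws a fresh code from the unit normal prior. The decoder is omitted: in a trained text VAE each vector would be passed to the decoder to produce a sentence.

```python
import random


def interpolate(z_start, z_end, steps):
    """Evenly spaced points on the straight line from z_start to z_end."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    path = []
    for k in range(steps):
        t = k / (steps - 1)
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


def sample_prior(dim, rng=random):
    """Draw a latent code from the unit normal prior."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


# Benefit 1: neighbouring points on the path should decode to similar outputs.
print(interpolate([0.0, 0.0], [1.0, 2.0], 3))
# Benefit 2: a sample from the prior is a valid input for the decoder.
print(sample_prior(4, random.Random(0)))
```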

Conclusions

• The meaning of language is not in the language itself

• Neural networks provide a model for combining the necessary information sources

• Finding and using the right information is just as important as elaborate modelling

Reading

Formal semantics: Montague, R. (1970). English as a formal language.

Meaning in context: McClelland, J. L. (1992). Can connectionist models discover the structure of natural language?

Dictionary definitions to guide meaning: Hill, F., Cho, K., & Korhonen, A. (2015). Learning to understand phrases by embedding the dictionary. TACL.

Skip-Thought Vectors: Kiros, R. et al. (2015). Skip-thought vectors. NIPS.

Comparison of sentence representations (SDAE, FastSent): Hill, F., Cho, K., & Korhonen, A. (2015). Learning distributed representations of sentences from unlabelled data. NAACL.

Variational AutoEncoder: Kingma, D. P., & Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

Variational AutoEncoder for sentences: Bowman, S. R., Vilnis, L., Vinyals, O., Dai, A. M., Jozefowicz, R., & Bengio, S. (2015). Generating sentences from a continuous space. arXiv preprint arXiv:1511.06349.