NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng...

36
deeplearning.ai NLP and Word Embeddings Word representation

Transcript of NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng...

Page 1: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

deeplearning.ai

NLPandWordEmbeddings

Wordrepresentation

Page 2: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Word representationV = [a, aaron, …, zulu, <UNK>]

1-hot representation

Apple(456)

Orange(6257) I want a glass of orange ______.

I want a glass of apple______.

King(4914)000⋮1⋮000

Woman(9853)00000⋮1⋮0

Man(5391)

0000⋮1⋮00

Queen(7157)

00000⋮1⋮0

0⋮1⋮00000

00000⋮1⋮0

Page 3: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Featurized representation: word embeddingApple(456)

Orange(6257)

King(4914)

Woman(9853)

Man(5391)

Queen(7157)

I want a glass of orange ______.I want a glass of apple______.

-0.95 0.97 0.00 0.01

0.93 0.95 -0.01 0.00

0.7 0.69 0.03 -0.02

0.02 0.01 0.95 0.97

Page 4: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Visualizing word embeddings

fish

dogcat

applegrape

orangeonethree

two

four

king

man

queen

woman

[van der Maaten and Hinton., 2008. Visualizing data using t-SNE]

Page 5: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

deeplearning.ai

NLPandWordEmbeddings

Usingwordembeddings

Page 6: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Named entity recognition example

Sally Johnson is an orange farmer

1 1 0 0 0 0

Robert Lin is an apple farmer

Page 7: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Transfer learning and word embeddings

1. Learn word embeddings from large text corpus. (1-100B words)

(Or download pre-trained embedding online.)

2. Transfer embedding to new task with smaller training set. (say, 100k words)

3. Optional: Continue to finetune the word embeddings with newdata.

Page 8: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Relation to face encoding

$(&)

$(()

)*

[Taigman et. al., 2014. DeepFace: Closing the gap to human level performance]

f($(&))

f($(())

Page 9: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

deeplearning.ai

NLPandWordEmbeddings

Propertiesofwordembeddings

Page 10: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

AnalogiesApple(456)

Orange(6257)

King(4914)

Woman(9853)

Man(5391)

Queen(7157)

Gender

Royal

Age

Food

−10.010.030.09

10.020.020.01

-0.950.930.700.02

0.970.950.690.01

0.00-0.010.030.95

0.010.00-0.020.97

[Mikolov et. al., 2013, Linguistic regularities in continuous space word representations]

Page 11: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Analogies using word vectorsfish

dog

cat

applegrape

orangeone

three

two

four

kingman

queen

woman

()*+ − (,-)*+ ≈ (/0+1 − (?

Page 12: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Cosine similarity

345((,, (/0+1 − ()*+ + (,-)*+)

Man:Woman as Boy:GirlOttawa:Canada as Nairobi:KenyaBig:Bigger as Tall:TallerYen:Japan as Ruble:Russia

Page 13: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

deeplearning.ai

NLPandWordEmbeddings

Embeddingmatrix

Page 14: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Embedding matrix

In practice, use specialized function to look up an embedding.

Page 15: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

deeplearning.ai

NLPandWordEmbeddings

Learningwordembeddings

Page 16: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Neural language modelI want a glass of orange ______.4343 9665 1 3852 6163 6257

I

want

a

glass

of

orange

*+,+,

*-../*0*,1/2*.0.,*.2/3

4

44

44

4

5+,+,

5-../505,1/25.0.,

5.2/3[Bengio et. al., 2003, A neural probabilistic language model]

Page 17: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Other context/target pairsI want a glass of orange juice to go along with my cereal.

Context: Last 4 words.

4 words on left & right

Last 1 word

Nearby 1 word

Page 18: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

deeplearning.ai

NLPandWordEmbeddings

Word2Vec

Page 19: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Skip-gramsI want a glass of orange juice to go along with my cereal.

[Mikolov et. al., 2013. Efficient estimation of word representations in vector space.]

Page 20: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

ModelVocab size = 10,000k

Page 21: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Problems with softmax classification

! " # = %&'()*∑ %&,()*-.,...01-

How to sample the context #?

Page 22: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

deeplearning.ai

NLPandWordEmbeddings

Negativesampling

Page 23: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Defining a new learning problem

I want a glass of orange juice to go along with my cereal.

[Mikolov et. al., 2013. Distributed representation of words and phrases and their compositionality]

Page 24: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Model

Softmax: ! " # = %&'()*∑ %&,()*-.,...01-

context wordorangeorangeorange

juicekingbook

target?

theof

orangeorange

100

00

Page 25: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Selecting negative examples

context wordorangeorangeorange

juicekingbook

target?

theof

orangeorange

100

00

Page 26: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

deeplearning.ai

NLPandWordEmbeddings

GloVe wordvectors

Page 27: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

GloVe (global vectors for word representation)

I want a glass of orange juice to go along with my cereal.

[Pennington et. al., 2014. GloVe: Global vectors for word representation]

Page 28: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Model

Page 29: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

A note on the featurization view of word embeddings

minimize ∑ ∑ ( )*+ ,*-.+ + 0* − 0+2 − log)*+678,888

+:778,888*:7

King(4914)

Woman(9853)

Man(5391)

Queen(7157)

-0.950.930.700.02

0.970.950.690.01

−10.010.030.09

10.020.020.01

GenderRoyalAgeFood

Page 30: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

deeplearning.ai

NLPandWordEmbeddings

Sentimentclassification

Page 31: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Sentiment classification problem! "

The dessert is excellent.

Service was quite slow.

Good for a quick meal, but nothing special.

Completely lacking in good taste, good service, and good ambience.

Page 32: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Simple sentiment classification model

The

desert

is

excellent

#$%&$

#&'($

#'(%'

#)*$+

,

,

,

,

-$%&$

-&'($

-'(%'

-)*$+

8928 2468 4694 3180The dessert is excellent

“Completely lacking in good taste, good service, and good ambience.”

Page 33: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

RNN for sentiment classification

Completely lacking in good …. ambience

, , , , ,

-*$6& -'%(( -''&7 -)$$& -))+

"8

softmax

⋯:;+< :;*< :;&< :;)< :;'< :;*+<

Page 34: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

deeplearning.ai

NLPandWordEmbeddings

Debiasingwordembeddings

Page 35: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

The problem of bias in word embeddings

[Bolukbasi et. al., 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings]

Man:Woman as King:Queen

Man:Computer_Programmer as Woman:

Father:Doctor as Mother:

Word embeddings can reflect gender, ethnicity, age, sexual orientation, and other biases of the text used to train the model.

Homemaker

Nurse

Page 36: NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng GloVe (global vectors for word representation) I want a glass of orange juice to

AndrewNg

Addressing bias in word embeddings1. Identify bias direction.

2. Neutralize: For every word that is not definitional, project to get rid of bias.

3. Equalize pairs.

[Bolukbasi et. al., 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings]