NLP and Word Embeddings - web.stanford.edu · NLP and Word Embeddings GloVe word vectors. Andrew Ng...

Post on 27-Aug-2018


deeplearning.ai

NLP and Word Embeddings

Word representation

Andrew Ng

Word representation

V = [a, aaron, …, zulu, <UNK>]

1-hot representation

Man (5391), Woman (9853), King (4914), Queen (7157), Apple (456), Orange (6257)

[One-hot column vectors: each word's vector has a single 1 at that word's index and 0 everywhere else.]

I want a glass of orange ______.
I want a glass of apple ______.
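A minimal sketch of the 1-hot representation, assuming a vocabulary of 10,000 words (the size used later in the slides) and treating the slide indices as 0-based array positions for simplicity. It shows that the inner product of any two distinct one-hot vectors is zero, so this representation carries no notion of apple being closer to orange than to king.

```python
import numpy as np

VOCAB_SIZE = 10_000  # assumed vocabulary size


def one_hot(index, size=VOCAB_SIZE):
    """Return a vector with a single 1 at `index` and 0 elsewhere."""
    v = np.zeros(size)
    v[index] = 1.0
    return v


o_apple, o_orange, o_king = one_hot(456), one_hot(6257), one_hot(4914)

# Distinct one-hot vectors are orthogonal, whatever the words mean.
print(o_apple @ o_orange, o_apple @ o_king)  # 0.0 0.0
```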

Featurized representation: word embedding

Man (5391), Woman (9853), King (4914), Queen (7157), Apple (456), Orange (6257)

          King     Queen    Apple    Orange
          (4914)   (7157)   (456)    (6257)
Gender    -0.95     0.97     0.00     0.01
Royal      0.93     0.95    -0.01     0.00
Age        0.70     0.69     0.03    -0.02
Food       0.02     0.01     0.95     0.97

I want a glass of orange ______.
I want a glass of apple ______.

Visualizing word embeddings

[Figure: t-SNE plot of word embeddings. Labeled points: man, woman, king, queen, dog, cat, fish, apple, grape, orange, one, two, three, four.]

[van der Maaten and Hinton, 2008. Visualizing data using t-SNE]

Using word embeddings

Named entity recognition example

Sally Johnson is an orange farmer

1 1 0 0 0 0

Robert Lin is an apple farmer


Transfer learning and word embeddings

1. Learn word embeddings from large text corpus. (1-100B words)

(Or download pre-trained embedding online.)

2. Transfer embedding to new task with smaller training set. (say, 100k words)

3. Optional: Continue to finetune the word embeddings with new data.

Relation to face encoding

[Figure: inputs $x^{(i)}$ and $x^{(j)}$, their encodings $f(x^{(i)})$ and $f(x^{(j)})$, and the output $\hat{y}$.]

[Taigman et al., 2014. DeepFace: Closing the gap to human level performance]

Properties of word embeddings

Analogies

          Man      Woman    King     Queen    Apple    Orange
          (5391)   (9853)   (4914)   (7157)   (456)    (6257)
Gender    -1        1       -0.95     0.97     0.00     0.01
Royal      0.01     0.02     0.93     0.95    -0.01     0.00
Age        0.03     0.02     0.70     0.69     0.03    -0.02
Food       0.09     0.01     0.02     0.01     0.95     0.97

[Mikolov et al., 2013. Linguistic regularities in continuous space word representations]
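As a worked example, read each word's four values (Gender, Royal, Age, Food) as a column vector. The notation $e_{word}$ for that vector is assumed here to match the later slides. The two difference vectors are then nearly equal and dominated by the Gender component, which is what makes the analogy work:

```latex
\begin{align*}
e_{man} - e_{woman} &=
  \begin{bmatrix} -1 - 1 \\ 0.01 - 0.02 \\ 0.03 - 0.02 \\ 0.09 - 0.01 \end{bmatrix}
  = \begin{bmatrix} -2 \\ -0.01 \\ 0.01 \\ 0.08 \end{bmatrix} \\
e_{king} - e_{queen} &=
  \begin{bmatrix} -0.95 - 0.97 \\ 0.93 - 0.95 \\ 0.70 - 0.69 \\ 0.02 - 0.01 \end{bmatrix}
  = \begin{bmatrix} -1.92 \\ -0.02 \\ 0.01 \\ 0.01 \end{bmatrix}
\end{align*}
```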

Analogies using word vectors

[Figure: embedding-space plot with labeled points man, woman, king, queen, dog, cat, fish, apple, grape, orange, one, two, three, four.]

$e_{man} - e_{woman} \approx e_{king} - e_{?}$

Cosine similarity

$\text{sim}(e_w, e_{king} - e_{man} + e_{woman})$

Man:Woman as Boy:Girl
Ottawa:Canada as Nairobi:Kenya
Big:Bigger as Tall:Taller
Yen:Japan as Ruble:Russia
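A small sketch of analogy completion with cosine similarity, $\text{sim}(u, v) = \frac{u^T v}{\lVert u \rVert \, \lVert v \rVert}$. It uses the six toy 4-dimensional vectors from the Analogies table (feature order Gender, Royal, Age, Food). The function names are illustrative, and excluding the three input words from the candidates is a common convention assumed here rather than something the slides state.

```python
import numpy as np

# Toy embeddings from the Analogies table: Gender, Royal, Age, Food.
EMBEDDINGS = {
    "man":    np.array([-1.00,  0.01,  0.03, 0.09]),
    "woman":  np.array([ 1.00,  0.02,  0.02, 0.01]),
    "king":   np.array([-0.95,  0.93,  0.70, 0.02]),
    "queen":  np.array([ 0.97,  0.95,  0.69, 0.01]),
    "apple":  np.array([ 0.00, -0.01,  0.03, 0.95]),
    "orange": np.array([ 0.01,  0.00, -0.02, 0.97]),
}


def cosine_similarity(u, v):
    """Cosine of the angle between u and v."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def complete_analogy(a, b, c, embeddings):
    """Solve a:b as c:w by maximizing sim(e_w, e_c - e_a + e_b) over w."""
    target = embeddings[c] - embeddings[a] + embeddings[b]
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine_similarity(embeddings[w], target))


answer = complete_analogy("man", "woman", "king", EMBEDDINGS)
print(answer)  # queen
```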

Embedding matrix

Embedding matrix

In practice, use a specialized function to look up an embedding.
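To illustrate why a lookup is preferred, the sketch below compares multiplying an embedding matrix by a one-hot vector with reading the corresponding column directly. The names `E`, `o_j` and `j`, the embedding dimension of 300, and the 0-based indexing are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim = 10_000, 300  # embed_dim is an illustrative choice

# Column j of E holds the embedding of word j.
E = rng.standard_normal((embed_dim, vocab_size))

j = 6257  # index of "orange" on the slides, used here as a 0-based column
o_j = np.zeros(vocab_size)
o_j[j] = 1.0  # one-hot vector for word j

e_via_matmul = E @ o_j  # multiplies by 9,999 zeros: O(embed_dim * vocab_size)
e_via_lookup = E[:, j]  # reads one column: O(embed_dim)

print(np.allclose(e_via_matmul, e_via_lookup))  # True
```

Both give the same vector, but the lookup avoids a large matrix-vector product, which is why deep learning libraries provide dedicated embedding layers.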

Learning word embeddings

Neural language model

I want a glass of orange ______.
4343 9665 1 3852 6163 6257

I        $o_{4343}$  →  $E$  →  $e_{4343}$
want     $o_{9665}$  →  $E$  →  $e_{9665}$
a        $o_{1}$     →  $E$  →  $e_{1}$
glass    $o_{3852}$  →  $E$  →  $e_{3852}$
of       $o_{6163}$  →  $E$  →  $e_{6163}$
orange   $o_{6257}$  →  $E$  →  $e_{6257}$

[Bengio et al., 2003. A neural probabilistic language model]
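A rough forward-pass sketch of a neural language model of this kind: the six context embeddings are looked up, concatenated, passed through one hidden layer, and fed to a softmax over the vocabulary. The layer sizes, the tanh hidden layer, the random untrained weights, and all variable names are assumptions for illustration, not the exact architecture on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim, hidden_dim, context_len = 10_000, 50, 64, 6  # illustrative

E = rng.standard_normal((embed_dim, vocab_size)) * 0.01  # embedding matrix
W1 = rng.standard_normal((hidden_dim, context_len * embed_dim)) * 0.01
b1 = np.zeros(hidden_dim)
W2 = rng.standard_normal((vocab_size, hidden_dim)) * 0.01
b2 = np.zeros(vocab_size)


def predict_next_word(context_ids):
    """Return a probability distribution over the next word."""
    x = np.concatenate([E[:, i] for i in context_ids])  # stack the embeddings
    h = np.tanh(W1 @ x + b1)                             # hidden layer
    logits = W2 @ h + b2
    logits -= logits.max()                               # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()                           # softmax


context = [4343, 9665, 1, 3852, 6163, 6257]  # "I want a glass of orange"
probs = predict_next_word(context)
print(probs.shape, round(float(probs.sum()), 6))
```

Training would adjust `E` together with the other weights, so the embeddings are learned as a by-product of predicting the next word.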

Other context/target pairs

I want a glass of orange juice to go along with my cereal.

Context: Last 4 words.

4 words on left & right

Last 1 word

Nearby 1 word
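The deterministic context choices listed above can be sketched with plain list slicing. Taking "juice" as the target word is an assumption made for the example. The "nearby 1 word" choice involves random sampling and is not shown here.

```python
sentence = "I want a glass of orange juice to go along with my cereal".split()
t = sentence.index("juice")  # target position (assumed target word)

contexts = {
    "last 4 words": sentence[max(0, t - 4):t],
    "4 words on left & right": sentence[max(0, t - 4):t] + sentence[t + 1:t + 5],
    "last 1 word": sentence[t - 1:t],
}

for name, words in contexts.items():
    print(f"{name}: {words}")
```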

Word2Vec

Skip-grams

I want a glass of orange juice to go along with my cereal.

[Mikolov et al., 2013. Efficient estimation of word representations in vector space]
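A minimal sketch of how skip-gram training pairs might be generated: for each context word, a target is drawn at random from the words within a fixed window around it. The function name, the window size of 5, and sampling one target per context position are illustrative assumptions.

```python
import random


def skipgram_pairs(tokens, window=5, pairs_per_context=1, seed=0):
    """Return (context, target) pairs with each target within +/- window."""
    rng = random.Random(seed)
    pairs = []
    for c, context in enumerate(tokens):
        lo, hi = max(0, c - window), min(len(tokens), c + window + 1)
        neighbours = [i for i in range(lo, hi) if i != c]
        for _ in range(pairs_per_context):
            t = rng.choice(neighbours)
            pairs.append((context, tokens[t]))
    return pairs


tokens = "I want a glass of orange juice to go along with my cereal".split()
pairs = skipgram_pairs(tokens)
for context, target in pairs[:3]:
    print(context, "->", target)
```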

Model

Vocab size = 10,000

Problems with softmax classification

$p(t \mid c) = \dfrac{e^{\theta_t^T e_c}}{\sum_{j=1}^{10,000} e^{\theta_j^T e_c}}$

How to sample the context $c$?
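The cost of this softmax can be seen in a short sketch: every evaluation of $p(t \mid c)$ needs one dot product per vocabulary word to form the denominator. The embedding dimension, the random parameters, and the names `E` and `Theta` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim = 10_000, 50  # embed_dim is illustrative

E = rng.standard_normal((embed_dim, vocab_size)) * 0.1      # e_c are columns
Theta = rng.standard_normal((vocab_size, embed_dim)) * 0.1  # theta_t are rows


def p_target_given_context(c):
    """Softmax over all vocabulary words for context word index c."""
    scores = Theta @ E[:, c]   # one dot product per vocabulary word
    scores -= scores.max()     # numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()  # denominator sums 10,000 terms


p = p_target_given_context(6257)
print(p.shape, round(float(p.sum()), 6))
```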

Negative sampling

Defining a new learning problem

I want a glass of orange juice to go along with my cereal.

[Mikolov et al., 2013. Distributed representations of words and phrases and their compositionality]

Model

Softmax: $p(t \mid c) = \dfrac{e^{\theta_t^T e_c}}{\sum_{j=1}^{10,000} e^{\theta_j^T e_c}}$

context   word    target?
orange    juice   1
orange    king    0
orange    book    0
orange    the     0
orange    of      0
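In negative sampling as described by Mikolov et al. (2013), the single large softmax is replaced by binary logistic classifiers of the form $P(y = 1 \mid c, t) = \sigma(\theta_t^T e_c)$, trained on one positive pair and $k$ sampled negatives. The sketch below computes the loss for one such group. The function names and the choice $k = 4$ (matching the four 0-labelled rows in the table) are assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def negative_sampling_loss(e_c, theta_pos, theta_negs):
    """Binary cross-entropy over one positive and k negative targets."""
    pos_term = -np.log(sigmoid(theta_pos @ e_c))               # label 1
    neg_term = -np.sum(np.log(sigmoid(-(theta_negs @ e_c))))   # labels 0
    return float(pos_term + neg_term)


rng = np.random.default_rng(0)
embed_dim, k = 50, 4
e_orange = rng.standard_normal(embed_dim) * 0.1         # context "orange"
theta_juice = rng.standard_normal(embed_dim) * 0.1      # positive "juice"
theta_others = rng.standard_normal((k, embed_dim)) * 0.1  # king, book, the, of

loss = negative_sampling_loss(e_orange, theta_juice, theta_others)
print(loss)
```

Each update touches only k + 1 output vectors instead of all 10,000.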

Selecting negative examples

context   word    target?
orange    juice   1
orange    king    0
orange    book    0
orange    the     0
orange    of      0
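One heuristic for choosing negatives, reported in the Mikolov et al. (2013) paper cited for this section, is to sample words in proportion to their corpus frequency raised to the 3/4 power. This sits between sampling by raw frequency (too many words like "the" and "of") and sampling uniformly. The word counts below are made up for illustration.

```python
import numpy as np


def negative_sampling_distribution(counts, power=0.75):
    """Return P(w) proportional to count(w) ** power."""
    weights = np.asarray(counts, dtype=float) ** power
    return weights / weights.sum()


counts = {"the": 1000, "of": 600, "king": 20, "book": 15}  # made-up counts
words = list(counts)
raw = np.array(list(counts.values()), dtype=float)
raw_freq = raw / raw.sum()
probs = negative_sampling_distribution(raw)

rng = np.random.default_rng(0)
negatives = rng.choice(words, size=4, p=probs)  # k = 4 negative words
print(dict(zip(words, probs.round(3))), list(negatives))
```

Compared with raw frequencies, the 3/4 power lowers the share of very frequent words and raises the share of rare ones.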

GloVe word vectors

GloVe (global vectors for word representation)

I want a glass of orange juice to go along with my cereal.

[Pennington et al., 2014. GloVe: Global vectors for word representation]

Model

$\text{minimize} \; \sum_{i=1}^{10,000} \sum_{j=1}^{10,000} f(X_{ij}) \left( \theta_i^T e_j + b_i + b'_j - \log X_{ij} \right)^2$

A note on the featurization view of word embeddings

          Man      Woman    King     Queen
          (5391)   (9853)   (4914)   (7157)
Gender    -1        1       -0.95     0.97
Royal      0.01     0.02     0.93     0.95
Age        0.03     0.02     0.70     0.69
Food       0.09     0.01     0.02     0.01
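A sketch of the GloVe objective on a toy co-occurrence matrix, written with both bias terms added as in Pennington et al. (2014) and with that paper's weighting function $f$ ($x_{max} = 100$, exponent 0.75, $f(0) = 0$). The matrix sizes, random data, and variable names are assumptions. The last lines illustrate the point behind the featurization note: the objective depends on $\theta_i$ and $e_j$ only through $\theta_i^T e_j$, so replacing them with $A\theta_i$ and $A^{-T} e_j$ for any invertible $A$ leaves the loss unchanged, and individual embedding axes need not line up with readable features such as Gender or Royal.

```python
import numpy as np


def glove_weight(x, x_max=100.0, alpha=0.75):
    """Weighting f(X_ij) from the GloVe paper; f(0) = 0."""
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0)


def glove_loss(X, Theta, E, b, b_prime):
    """Sum over i, j of f(X_ij) * (theta_i^T e_j + b_i + b'_j - log X_ij)^2."""
    log_X = np.log(np.where(X > 0, X, 1.0))  # cells with X_ij = 0 get weight 0
    residual = Theta @ E.T + b[:, None] + b_prime[None, :] - log_X
    return float(np.sum(glove_weight(X) * residual ** 2))


rng = np.random.default_rng(0)
n, d = 6, 4                                   # toy vocabulary and dimension
X = rng.integers(0, 20, size=(n, n)).astype(float)  # made-up co-occurrences
Theta = rng.standard_normal((n, d))           # rows are theta_i
E = rng.standard_normal((n, d))               # rows are e_j
b, b_prime = rng.standard_normal(n), rng.standard_normal(n)

A = rng.standard_normal((d, d)) + 3.0 * np.eye(d)  # an invertible mixing matrix
loss = glove_loss(X, Theta, E, b, b_prime)
loss_mixed = glove_loss(X, Theta @ A.T, E @ np.linalg.inv(A), b, b_prime)
print(np.isclose(loss, loss_mixed))
```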

Sentiment classification

Sentiment classification problem

$x$ (review text) → $y$ (sentiment)

The dessert is excellent.

Service was quite slow.

Good for a quick meal, but nothing special.

Completely lacking in good taste, good service, and good ambience.


Simple sentiment classification model

The dessert is excellent
8928 2468 4694 3180

The         $o_{8928}$  →  $E$  →  $e_{8928}$
dessert     $o_{2468}$  →  $E$  →  $e_{2468}$
is          $o_{4694}$  →  $E$  →  $e_{4694}$
excellent   $o_{3180}$  →  $E$  →  $e_{3180}$

“Completely lacking in good taste, good service, and good ambience.”
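A minimal sketch of one simple model of this kind, assuming the word embeddings are averaged and the average is passed to a softmax classifier. The averaging step, the five output classes, the layer sizes, and the random untrained weights are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim, num_classes = 10_000, 50, 5  # illustrative sizes

E = rng.standard_normal((embed_dim, vocab_size)) * 0.1
W = rng.standard_normal((num_classes, embed_dim)) * 0.1
b = np.zeros(num_classes)


def predict_sentiment(word_ids):
    """Average the word embeddings, then apply a softmax classifier."""
    avg = E[:, word_ids].mean(axis=1)  # word order is discarded here
    logits = W @ avg + b
    logits -= logits.max()
    probs = np.exp(logits)
    return probs / probs.sum()


review = [8928, 2468, 4694, 3180]  # "The dessert is excellent"
probs = predict_sentiment(review)
probs_reversed = predict_sentiment(review[::-1])
print(np.allclose(probs, probs_reversed))  # True
```

Because averaging ignores word order, a review such as the quoted one, which repeats "good" three times inside a negative sentence, is hard for a model like this to classify correctly.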

RNN for sentiment classification

Completely   lacking      in           good         …   ambience
$E$          $E$          $E$          $E$              $E$
$e_{1852}$   $e_{4966}$   $e_{4427}$   $e_{3882}$   …   $e_{330}$

$a^{<0>}$ → $a^{<1>}$ → $a^{<2>}$ → $a^{<3>}$ → $a^{<4>}$ → ⋯ → $a^{<10>}$ → softmax → $\hat{y}$
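A sketch of a many-to-one RNN for this task: the hidden state is updated once per word embedding, and only the final state is passed to a softmax. The basic tanh recurrence, the sizes, and the random weights are assumptions. Only the word indices visible on the slide are used, and the elided middle words are left out.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim, hidden_dim, num_classes = 10_000, 50, 32, 5  # illustrative

E = rng.standard_normal((embed_dim, vocab_size)) * 0.5
Waa = rng.standard_normal((hidden_dim, hidden_dim)) * 0.5
Wax = rng.standard_normal((hidden_dim, embed_dim)) * 0.5
ba = np.zeros(hidden_dim)
Wya = rng.standard_normal((num_classes, hidden_dim)) * 0.5
by = np.zeros(num_classes)


def rnn_sentiment(word_ids):
    """Run the RNN over the words and classify from the final hidden state."""
    a = np.zeros(hidden_dim)  # initial state a<0>
    for i in word_ids:
        a = np.tanh(Waa @ a + Wax @ E[:, i] + ba)  # a<t> from a<t-1> and e_i
    logits = Wya @ a + by
    logits -= logits.max()
    probs = np.exp(logits)
    return probs / probs.sum()


word_ids = [1852, 4966, 4427, 3882, 330]  # indices shown on the slide
y_hat = rnn_sentiment(word_ids)
y_hat_reversed = rnn_sentiment(word_ids[::-1])
print(y_hat.round(3), np.allclose(y_hat, y_hat_reversed))
```

Unlike an average of embeddings, the recurrence is sensitive to word order, so "lacking in good" and "good in lacking" produce different outputs.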

Debiasing word embeddings

The problem of bias in word embeddings

Man:Woman as King:Queen

Man:Computer_Programmer as Woman:Homemaker

Father:Doctor as Mother:Nurse

Word embeddings can reflect gender, ethnicity, age, sexual orientation, and other biases of the text used to train the model.

[Bolukbasi et al., 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings]

Addressing bias in word embeddings

1. Identify bias direction.

2. Neutralize: For every word that is not definitional, project to get rid of bias.

3. Equalize pairs.

[Bolukbasi et al., 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings]
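The three steps above can be sketched as follows, under simplifying assumptions: the bias direction is taken as the normalized average of difference vectors for a few definitional pairs (the cited paper uses a more elaborate procedure), the equalize step follows the closed form from Bolukbasi et al. for unit-norm embeddings, and the embeddings are random stand-ins rather than trained vectors. All function names are illustrative.

```python
import numpy as np


def unit(v):
    return v / np.linalg.norm(v)


def bias_direction(pairs, emb):
    """Step 1: average the difference vectors of definitional pairs."""
    return unit(np.mean([emb[a] - emb[b] for a, b in pairs], axis=0))


def neutralize(e, g):
    """Step 2: remove the component of e along the unit bias direction g."""
    return e - (e @ g) * g


def equalize(e1, e2, g):
    """Step 3: make a pair differ only along g, symmetrically (unit-norm inputs)."""
    mu = (e1 + e2) / 2
    mu_orth = mu - (mu @ g) * g                  # shared part orthogonal to g
    scale = np.sqrt(abs(1 - mu_orth @ mu_orth))
    sign = np.sign((e1 - e2) @ g)
    return mu_orth + sign * scale * g, mu_orth - sign * scale * g


rng = np.random.default_rng(0)
words = ["man", "woman", "father", "mother", "doctor"]
emb = {w: unit(rng.standard_normal(50)) for w in words}  # random stand-ins

g = bias_direction([("woman", "man"), ("mother", "father")], emb)
doctor = neutralize(emb["doctor"], g)
mother, father = equalize(emb["mother"], emb["father"], g)

print(abs(doctor @ g) < 1e-12)
print(np.isclose(np.linalg.norm(doctor - mother), np.linalg.norm(doctor - father)))
```

After these steps the neutralized word has no component along the bias direction and is equidistant from both members of the equalized pair.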