Deep Generative Model for Joint Alignment and Word Representation

Transcript (67 pages)

Page 1: Title

Deep Generative Model for Joint Alignment and Word Representation
Embedalign

Miguel Rios, Wilker Aziz, Khalil Sima’an
University of Amsterdam
Statistical Language Processing and Learning Lab

June 22, 2018

Page 2: Outline

1 Introduction
2 Model
3 Evaluation
4 Conclusions and Future Work

Page 3: Introduction

TL;DR
- A generative model that embeds words in their complete observed context
- The model learns from bilingual sentence-aligned corpora by marginalising latent lexical alignments
- Words are embedded as probability densities
- The model shows competitive results on context-dependent Natural Language Processing applications


Page 5: Introduction

Discriminative embedding models: word2vec

"In the event of a chemical spill, most children know they should evacuate as advised by people in charge."

Place words in R^d so as to answer questions like:
"Have I seen this word in this context?"

Fit a binary classifier on
- positive examples
- negative examples
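The "have I seen this word in this context?" classifier can be sketched as skip-gram with negative sampling; the following is a minimal toy version, where sizes, data, and variable names are illustrative rather than taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 50, 8                         # toy vocabulary size and embedding dim
W_in = rng.normal(0, 0.1, (V, d))    # target-word embeddings
W_out = rng.normal(0, 0.1, (V, d))   # context-word embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(center, context, k=2, lr=0.1):
    """One binary-classification update: the observed (center, context)
    pair is a positive example; k randomly drawn words act as negatives."""
    pairs = [(context, 1.0)] + [(n, 0.0) for n in rng.integers(0, V, k)]
    for ctx, label in pairs:
        u, v = W_in[center].copy(), W_out[ctx].copy()
        g = sigmoid(u @ v) - label          # gradient of logistic loss wrt score
        W_in[center] -= lr * g * v
        W_out[ctx] -= lr * g * u

before = sigmoid(W_in[3] @ W_out[7])
for _ in range(20):
    sgns_step(3, 7)
after = sigmoid(W_in[3] @ W_out[7])   # the observed pair is now scored as more probable
```

The point of the sketch is the limitation discussed next: the model only ever answers a seen/unseen question about (word, context) pairs, with negatives invented by sampling.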

Page 6: Introduction

"In the event of a chemical spill, most children know they should evacuate as advised by people in charge."

Limitations
- Representation learning is an unsupervised problem: we only observe positive/complete context
- The distributional hypothesis is strong, but fails when the context is not discriminative
- Word senses are collapsed into one vector




Page 13: Embedalign

"In the event of a chemical spill, most children know they should evacuate as advised by people in charge."

Generative model to induce word representations
- Learn from positive examples
- Learn from richer (less ambiguous) context
- Foreign text is a proxy for sense supervision (Diab and Resnik, 2002)

"En caso de un derrame de productos químicos, la mayoría de los niños saben que deben abandonar el lugar según lo aconsejado por las personas a cargo."

Page 14: Variational Autoencoder

Defines a joint probability distribution over data and latent variables: p(x, z)

[Graphical model: latent z generates observed x, parameters θ, plate over N]

The joint decomposes into likelihood and prior: p(x, z) = p(x | z) p(z)
- Draw latent variable z_i ~ p(z)
- Draw word x_i ~ p(x | z_i)
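The two sampling steps above can be traced in a few lines; toy sizes and a linear-softmax decoder stand in for the real networks here:

```python
import numpy as np

rng = np.random.default_rng(0)
d, V = 4, 10                     # toy latent dimension and vocabulary size
W = rng.normal(size=(V, d))      # decoder parameters θ

def softmax(s):
    e = np.exp(s - s.max())
    return e / e.sum()

# generative story for one word, as on the slide:
z = rng.normal(size=d)           # z_i ~ p(z) = N(0, I)
probs = softmax(W @ z)           # p(x | z): a categorical over the vocabulary
x = rng.choice(V, p=probs)       # x_i ~ p(x | z)
```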


Page 18: Variational Autoencoder

The posterior p_θ(z | x) is intractable

Approximate it with q_φ(z | x)

[Graphical model: inference network q_φ(z | x) alongside the generative model, plate over N]
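With a Gaussian approximation q_φ(z | x) = N(μ, σ²) and a standard-normal prior, the objective is the evidence lower bound, estimated with a reparameterised sample; a minimal sketch (the closed-form Gaussian KL is standard, the toy log-likelihood is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def elbo_estimate(mu, sigma, log_lik):
    """Single-sample ELBO: E_q[log p(x|z)] - KL(q(z|x) || N(0, I)),
    with the reparameterisation z = mu + sigma * eps, eps ~ N(0, I)."""
    eps = rng.normal(size=mu.shape)
    z = mu + sigma * eps
    # closed-form KL between N(mu, sigma^2) and N(0, I), summed over dims
    kl = 0.5 * np.sum(sigma**2 + mu**2 - 1.0 - 2.0 * np.log(sigma))
    return log_lik(z) - kl

mu, sigma = np.zeros(4), np.ones(4)
val = elbo_estimate(mu, sigma, lambda z: -0.5 * np.sum(z * z))
```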


Page 20: Generative Model

quickly evacuate the area / deje el lugar rápidamente

[Graphical model: for each sentence pair in the corpus of |B| pairs, latent embeddings z and latent alignments a generate the m L1 words x and the n L2 words y; parameters θ]
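The likelihood of a sentence pair with the latent alignments summed out can be sketched numerically; this assumes uniform, IBM1-style alignments and toy linear-softmax decoders, so it illustrates the marginalisation rather than the actual networks:

```python
import numpy as np

rng = np.random.default_rng(0)
d, Vx, Vy = 4, 10, 12                # toy sizes
Wx = rng.normal(size=(Vx, d))        # decoder for L1 words x
Wy = rng.normal(size=(Vy, d))        # decoder for L2 words y

def softmax(s):
    e = np.exp(s - s.max())
    return e / e.sum()

def log_lik_pair(x_ids, y_ids, Z):
    """log p(x, y | z) with the latent alignments a summed out:
    each y_j aligns uniformly to one of the m L1 positions, so
    p(y_j | x, z) = (1/m) * sum_i p(y_j | z_i)."""
    m = len(x_ids)
    ll = sum(np.log(softmax(Wx @ Z[i])[xi]) for i, xi in enumerate(x_ids))
    for yj in y_ids:
        ll += np.log(np.mean([softmax(Wy @ Z[i])[yj] for i in range(m)]))
    return ll

Z = rng.normal(size=(3, d))          # one latent embedding per L1 word
ll = log_lik_pair([1, 4, 7], [2, 5, 9, 11], Z)
```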


Page 30: Learning

[Figure: inference network over x1 x2 x3 = evacuate1 the2 area3]

1 Read the sentence
2 Predict posterior mean μ_i and std σ_i
3 Sample z_i ~ N(μ_i, σ_i²)
4 Predict categorical distributions
5 Generate observations: evacuate1 the2 area3 / deje1 el2 lugar3
6 Maximise a lower bound on the likelihood (Kingma and Welling, 2014)
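Putting the steps together, the lower bound maximised in step 6 can be written as follows; this is a sketch following Kingma and Welling (2014) and the uniform-alignment factorisation of the generative model, with m L1 words and n L2 words:

```latex
\mathcal{L}(\theta, \phi) =
\mathbb{E}_{q_\phi(z \mid x)}\Bigg[
  \sum_{i=1}^{m} \log p_\theta(x_i \mid z_i)
  + \sum_{j=1}^{n} \log \frac{1}{m} \sum_{i=1}^{m} p_\theta(y_j \mid z_i)
\Bigg]
- \sum_{i=1}^{m} \mathrm{KL}\!\left( q_\phi(z_i \mid x) \,\middle\|\, \mathcal{N}(0, I) \right)
```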


Page 36: Data

Corpus           Sentence pairs (million)   Tokens (million)
Europarl En-Fr   1.7                        42.5
Europarl En-De   1.7                        43.5

Page 37: Architecture

[Architecture diagram]
Inference model: x → embedding (128d) → h: BiRNN (100d) → location u (100d), scale s (100d) → sample z (100d)
Generative model: f(z) → x and g(z_aj) → y
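The inference side of the diagram can be sketched as below. This is a simplification with toy sizes: plain tanh RNN cells stand in for whatever recurrent unit the real BiRNN uses, and softplus is one common way (assumed here, not stated on the slide) to keep the predicted scale positive:

```python
import numpy as np

rng = np.random.default_rng(0)
V, e, h, d = 20, 16, 10, 8        # toy sizes (the slides use 128d embeddings, 100d states)
E = rng.normal(0, 0.1, (V, e))    # embedding layer
Wf, Uf = rng.normal(0, 0.1, (h, e)), rng.normal(0, 0.1, (h, h))  # forward RNN
Wb, Ub = rng.normal(0, 0.1, (h, e)), rng.normal(0, 0.1, (h, h))  # backward RNN
Wu = rng.normal(0, 0.1, (d, 2 * h))   # location head u
Ws = rng.normal(0, 0.1, (d, 2 * h))   # scale head s

def encode(word_ids):
    """BiRNN encoder: per position, predict a Gaussian q(z_i | x) and sample z_i."""
    X = [E[i] for i in word_ids]
    hf, fs = np.zeros(h), []
    for x in X:                                   # forward pass
        hf = np.tanh(Wf @ x + Uf @ hf); fs.append(hf)
    hb, bs = np.zeros(h), []
    for x in reversed(X):                         # backward pass
        hb = np.tanh(Wb @ x + Ub @ hb); bs.append(hb)
    bs.reverse()
    Zs, mus, sigmas = [], [], []
    for f, b in zip(fs, bs):
        hcat = np.concatenate([f, b])
        mu = Wu @ hcat                            # u: location
        sigma = np.log1p(np.exp(Ws @ hcat))       # s: softplus keeps the scale positive
        z = mu + sigma * rng.normal(size=d)       # reparameterised sample
        Zs.append(z); mus.append(mu); sigmas.append(sigma)
    return np.array(Zs), np.array(mus), np.array(sigmas)

Z, mu, sigma = encode([3, 1, 4])
```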

Page 38: Word Alignment

The proposal will not now be implemented
Les propositions ne seront pas mises en application maintenant

Page 39: Word Alignment

Model selection on the Dev set (AER ↓):

Model      En-Fr          En-De
IBM1       32.45          46.71
IBM2       22.61          40.11
EmbAlign   29.43 ± 1.84   48.09 ± 2.12
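The AER metric in the table compares predicted links against sure and possible gold links; a small self-contained implementation of the standard definition:

```python
def aer(predicted, sure, possible):
    """Alignment Error Rate (Och & Ney, 2003):
    AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|),
    where S (sure links) is a subset of P (possible links).
    Links are (source_position, target_position) pairs; lower is better."""
    A, S = set(predicted), set(sure)
    P = set(possible) | S
    return 1.0 - (len(A & S) + len(A & P)) / (len(A) + len(S))

perfect = aer({(1, 1), (2, 2)}, {(1, 1), (2, 2)}, {(1, 1), (2, 2)})   # exact match: 0.0
```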

Page 40: Lexical Substitution


Page 43: Lexical Substitution

Model                              GAP ↑          Training size
Random                             30.0           -
SkipGram (Melamud et al., 2015)    44.9           ukWaC-2B
BSG (Brazinskas et al., 2017)      46.1           ukWaC-2B
En                                 21.31 ± 1.05   Euro-42M
En-Fr                              42.19 ± 0.57   Euro-42M
En-De                              42.07 ± 0.47   Euro-42M

Page 44: Sentence Evaluation (SentEval)



Page 48: Sentence Evaluation (SentEval)

Metrics: ACC ↑ for MR, CR, SUBJ, MPQA, SST, TREC, SICK-E; ACC/F1 ↑ for MRPC; CORR ↑ for SICK-R, STS14.

Model                             MR     CR     SUBJ   MPQA   SST    TREC   MRPC        SICK-R   SICK-E   STS14
SkipGram En                       70.96  76.16  87.24  86.87  73.64  65.20  70.7/80.1   0.710    76.2     0.45/0.49
En                                57.5   67.1   72.0   70.8   57.0   58.0   70.6/80.3   0.648    74.4     0.59/0.59
En-Fr                             64.0   71.8   79.1   81.5   64.7   58.4   72.1/81.2   0.682    74.6     0.60/0.59
En-De                             62.6   68.0   77.3   82.0   65.0   66.8   70.4/79.8   0.681    75.5     0.58/0.58
Combo                             66.1   72.4   82.4   84.4   69.8   69.0   71.9/80.6   0.727    76.3     0.62/0.61
SkipGram (Conneau et al., 2017)   77.7   79.8   90.9   88.3   79.7   83.6   72.5/81.4   0.803    78.7     0.65/0.64
nmt En-Fr (Conneau et al., 2017)  64.7   70.1   84.8   81.5   -      82.8   -           -        -        0.42/0.43


Page 50: Conclusions

Generative training
- the model learns from positive examples
- no need for a context window

Translation data
- less ambiguous embeddings
- the model helps with semantic tasks, e.g. paraphrasing


Page 56: Future Work

We modify the alignment distribution
- from IBM1 to IBM2: En-Fr 29.43 → 18.20 AER

We model word and sentence embeddings
- Movie Reviews 66.10 → 70.55 ACC
- Microsoft Paraphrase 71.90/80.6 → 72.93/81.27 ACC/F1
- SICK-R 0.727 → 0.770 CORR

We will expand the distributional context to multiple foreign languages at once


Page 63: DGM4NLP research at UvA-SLPL

Try the pre-trained Europarl model on SentEval:
https://github.com/uva-slpl/embedalign/blob/master/notebooks/senteval_embedalign.ipynb

ACL-18 tutorial, Variational Inference and Deep Generative Models:
http://acl2018.org/tutorials/


Page 65: DGM4NLP research at UvA-SLPL

Intro to probability and statistics:
http://students.brown.edu/seeing-theory/

Probabilistic Graphical Models:
http://pgm.stanford.edu
https://www.youtube.com/watch?v=1z5gAHVujlE&list=PLBAGcD3siRDjiQ5VZQ8t0C7jkHQ8fhuq8

Variational Autoencoder:
https://www.dropbox.com/s/v6ua3d9yt44vgb3/cover_and_thesis.pdf?dl=1
