Pre-Translation for Neural Machine Translation
Jan Niehues, Eunah Cho, Thanh-Le Ha and Alex Waibel
KIT - Institute for Anthropomatics and Robotics
KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association
www.kit.edu
2016-12-15

Motivation

Neural machine translation sets state-of-the-art
  End-to-end neural network approach to machine translation
Comparison to SMT
  Significant improvements
    Automatic metrics
    Manual evaluation
  More fluent translation

Motivation

NMT has different problems
  Small vocabulary
  Problems translating rare words
    English: the goalie parried
    NMT: der Gott
    NMT (gloss): the god
Combine SMT and NMT
  Simplify the task of NMT

Outline

Motivation
MT approaches
Idea
  Pipeline
  Mixed Input
Evaluation
Conclusion

Statistical Machine Translation (SMT)

Build translations from blocks of source and target words (phrase pairs)

Neural Machine Translation (NMT)

Neural network to predict the most probable target sequence
Jointly train model
Large improvements in translation quality

Neural Machine Translation (NMT)

Fixed vocabulary size
Byte pair encoding (Sennrich et al. 2016)
  Represent all words with n sub-words
  Start with character representation
  Join the most common bigram (adjacent symbol pair) into a new symbol
Example (merges applied step by step, final segmentation last):
  t h e _ g o a l i e _ p a r r i e d
  t h e _ g o a l ie _ p a r r ie d
  t h e _ g o a l ie _ p a r r ied
  t h e _ g o a l ie _ pa r r ied
  the _ go al ie _ par ried
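
The merge procedure on this slide can be sketched in a few lines of Python. This is a toy illustration, not the implementation used in the talk: the function names are made up for this sketch, and the merge list `SLIDE_MERGES` is a hypothetical one chosen only so that it reproduces the final segmentation shown in the example (the slides show the first three merges, the remaining ones are assumed).

```python
from collections import Counter
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]


def merge_pair(symbols: Sequence[str], pair: Pair) -> Tuple[str, ...]:
    """Replace every adjacent occurrence of `pair` by one joined symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words: Sequence[str], num_merges: int) -> List[Pair]:
    """Learn up to `num_merges` merges, most frequent pair first."""
    # Every word starts as a sequence of single characters.
    vocab = Counter(tuple(w) for w in words)
    merges: List[Pair] = []
    for _ in range(num_merges):
        pairs: Counter = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break  # nothing left to merge
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab: Counter = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def apply_bpe(word: str, merges: Sequence[Pair]) -> List[str]:
    """Segment one word by replaying the learned merges in order."""
    symbols: Tuple[str, ...] = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Hypothetical merge list (assumption): the first three merges follow the
# slide, the rest are picked to reach "the _ go al ie _ par ried".
SLIDE_MERGES = [("i", "e"), ("ie", "d"), ("p", "a"), ("t", "h"), ("th", "e"),
                ("g", "o"), ("a", "l"), ("pa", "r"), ("r", "ied")]

print(apply_bpe("goalie", SLIDE_MERGES))
print(apply_bpe("parried", SLIDE_MERGES))
```

With a real corpus, `learn_bpe` would be run once on the training data and the resulting merge list reused at translation time, so that rare words such as "goalie" are always split into known sub-words.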

Difference SMT/NMT

SMT:
  Handle large vocabulary
  Easily extensible
    Add translation via new phrase pairs
NMT:
  Joint model
  Long context
  Better generalization due to word embeddings

Pre-Translation

Combine advantages of both approaches
  Exploit advantages of SMT
  Combinations of other approaches have been successful
Idea:
  Use SMT as input to NMT
  Encode words using byte pair encoding
    Use translation of words not in NMT vocabulary

Related Work

Combination of SMT and rule-based MT (Dugast et al., 2007; Simard et al., 2007)
Automatic post-editing (Junczys-Dowmunt and Grundkiewicz, 2016)
Preprocessing for PBMT
  Compound splitting
  Pre-reordering
Handling of rare words in NMT (Luong et al., 2014; Sennrich et al., 2015)

Pipeline

Input: Source sentence
  Translate using PBMT
  Translate from PBMT German to German using NMT
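
The two-stage pipeline can be pictured as a simple function composition. The sketch below is only an illustration under stated assumptions: `pipeline_translate`, `toy_pbmt` and `toy_nmt` are hypothetical names, and the toy translators are stand-ins for a real PBMT system and a German-to-German NMT model.

```python
from typing import Callable, List

Tokens = List[str]
Translator = Callable[[Tokens], Tokens]


def pipeline_translate(source: Tokens, pbmt: Translator, nmt_refine: Translator) -> Tokens:
    """Stage 1: PBMT produces a German draft. Stage 2: NMT rewrites the draft."""
    draft = pbmt(source)
    return nmt_refine(draft)


def toy_pbmt(tokens: Tokens) -> Tokens:
    # Stand-in for a phrase-based system: word-by-word lexicon lookup (assumption).
    lexicon = {"the": "der", "goalie": "Torwart"}
    return [lexicon.get(t, t) for t in tokens]


def toy_nmt(tokens: Tokens) -> Tokens:
    # Stand-in for the German-to-German NMT model: returns the draft unchanged.
    return list(tokens)


print(pipeline_translate(["the", "goalie"], toy_pbmt, toy_nmt))
```

Note that in this setup the NMT stage only sees the PBMT output, not the original source sentence.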

Mixed Input

Input: Source sentence
  Translate using PBMT
  Combine source and PBMT translation
  Translate joined text using NMT

Mixed Input

Implementation:
  Join source sentence and PBMT translation
    the goalie der Torwart
  RNN state encodes source and PBMT translation
  Language-specific word embeddings
    E_the E_goalie D_der D_Torwart
  BPE for word encoding
    E_the E_go E_al E_ie D_der D_Tor D_wart
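
A minimal sketch of how such a joined input could be built, assuming the source and the PBMT output are already tokenized (or BPE-segmented). The function name `mixed_input` and its parameters are hypothetical; only the `E_` / `D_` prefixes are taken from the slide.

```python
from typing import List


def mixed_input(source_tokens: List[str], pbmt_tokens: List[str],
                src_prefix: str = "E_", tgt_prefix: str = "D_") -> List[str]:
    """Concatenate source and PBMT tokens, tagging each token with its language.

    The prefix keeps English and German tokens apart in the NMT vocabulary,
    so each language gets its own embeddings even for identical strings.
    """
    tagged_source = [src_prefix + tok for tok in source_tokens]
    tagged_pbmt = [tgt_prefix + tok for tok in pbmt_tokens]
    return tagged_source + tagged_pbmt


# Word-level input, as on the slide.
print(" ".join(mixed_input(["the", "goalie"], ["der", "Torwart"])))

# The same input after BPE segmentation of both sides.
print(" ".join(mixed_input(["the", "go", "al", "ie"], ["der", "Tor", "wart"])))
```

The resulting token sequence is then fed to the NMT encoder as a single sentence.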

Training

Training data:
  Parallel corpus
  PBMT translation of corpus
Problem:
  PBMT tends to overfit on the training data
    Filter singletons from phrase table
    Successfully used in other models
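
Singleton filtering can be illustrated as follows. This is a sketch under assumptions: the phrase table is modelled as a plain list of extracted (source phrase, target phrase) occurrences, which is not how real PBMT toolkits store it, and `filter_singletons` with its `min_count` parameter is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

PhrasePair = Tuple[str, str]


def filter_singletons(extracted: Iterable[PhrasePair],
                      min_count: int = 2) -> Dict[PhrasePair, int]:
    """Keep only phrase pairs that were extracted at least `min_count` times."""
    counts = Counter(extracted)
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Toy extraction result (made-up data for illustration only).
extracted_pairs = [
    ("the goalie", "der Torwart"),
    ("the goalie", "der Torwart"),
    ("the goalie parried", "der Torwart pariert"),  # seen once: a singleton
]
print(filter_singletons(extracted_pairs))
```

Dropping pairs seen only once keeps the PBMT system from reproducing long memorized phrases on its own training data, so the translations used to train the NMT model look more like what PBMT produces on unseen text.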

Experiments

Training data:
  WMT EN-DE data
PBMT
  In-house translation system
NMT
  Nematus
  BPE with 40K operations

Results English - German

System                   Dev/Valid           Test
                         tst2014   tst2015   tst2016
NMT                      20.79     23.34     27.65
NMT Ensemble             21.42     24.03     28.89
PBMT                     19.76     21.80     26.42
Advanced PBMT            21.62     23.34     28.13
Pipeline                 20.56     22.04     26.75
Pipeline Advanced        21.76     22.92     27.61
Mix                      21.88     24.11     28.04
Mix Advanced             22.53     24.37     29.62
Mix Advanced Ensemble    23.16     25.35     30.67
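
To make the comparison concrete, the differences between systems can be computed directly from the scores on the slide (presumably BLEU; the slide does not name the metric). The dictionary layout and the helper `gain` are assumptions made for this sketch.

```python
# Scores copied from the results slide: (tst2014, tst2015, tst2016).
SCORES = {
    "NMT": (20.79, 23.34, 27.65),
    "NMT Ensemble": (21.42, 24.03, 28.89),
    "PBMT": (19.76, 21.80, 26.42),
    "Advanced PBMT": (21.62, 23.34, 28.13),
    "Pipeline": (20.56, 22.04, 26.75),
    "Pipeline Advanced": (21.76, 22.92, 27.61),
    "Mix": (21.88, 24.11, 28.04),
    "Mix Advanced": (22.53, 24.37, 29.62),
    "Mix Advanced Ensemble": (23.16, 25.35, 30.67),
}
TEST_SETS = ("tst2014", "tst2015", "tst2016")


def gain(system: str, baseline: str, test_set: str = "tst2016") -> float:
    """Score difference of `system` over `baseline` on one test set."""
    i = TEST_SETS.index(test_set)
    return round(SCORES[system][i] - SCORES[baseline][i], 2)


print(gain("Mix", "NMT"))                             # mixed input vs. single NMT
print(gain("Mix Advanced Ensemble", "NMT Ensemble"))  # best system vs. NMT ensemble
print(gain("Pipeline Advanced", "Advanced PBMT"))     # pipeline vs. its PBMT input
```

On tst2016 the mixed-input systems come out ahead of their NMT counterparts, while the advanced pipeline stays slightly below the advanced PBMT system it starts from.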

Result by Word Frequency

(Figure not included in transcript.)

Examples

English: Then with a shot which the goalie parried with his knee in the 35th minute.

PBMT: Dann mit einem Schuss, die der Torwart pariert mit seinem Knie in der 35. Minute.

NMT: Dann mit einem Schuss, den der Gott mit seinem Knie in der 35. Minute.

Pre: Dann mit einem Schuss, das der Torwart mit seinem Knie in der 35. Minute pariert.

Pre (gloss): Then with a shot, that the goalie with his knee in the 35th minute parried.

Examples

English: ... a riot in the stadium.
PBMT: ... einen Aufruhr im Stadion.
NMT: ... einen Riot im Stadion.
Pre: ... einen Aufruhr im Station.
Pre (gloss): ... a riot in_the stadium.

Alignment

(Figure not included in transcript.)

Conclusion

Combine advantages of NMT and SMT
  Improve handling of rare words
  Easy handling of different input streams
  Increase overall translation performance
Further work:
  Do we need to do a full translation?

Thanks