Roee Aharoni - 2017 - Towards String-to-Tree Neural Machine Translation
Towards String-to-Tree Neural Machine Translation
Roee Aharoni & Yoav Goldberg, NLP Lab, Bar-Ilan University
ACL 2017
NMT is all the rage!
• Driving the current state-of-the-art (Sennrich et al., 2016)
• Widely adopted by the industry
Seq2Seq with Attention (Bahdanau et al., 2015)

f̂ = argmax_{f′} p(f′ | e)
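As a rough illustration of the attention step (our simplified sketch: it uses a dot-product score, whereas Bahdanau et al. score with a small feed-forward network), the decoder compares its state against every encoder state and builds a softmax-weighted context vector over the source:

```python
import math

# Simplified attention sketch (dot-product scoring is our assumption,
# not the exact Bahdanau et al. formulation).
def attention_context(decoder_state, encoder_states):
    # score each source position against the current decoder state
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    # softmax-normalize the scores (shifted by the max for stability)
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # context vector: attention-weighted sum of the encoder states
    dim = len(encoder_states[0])
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

weights, context = attention_context([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

The weights form the soft alignment that later slides inspect.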
Syntax was all the rage!
• The “previous” state-of-the-art was syntax-based SMT
• i.e. systems that used linguistic information (usually represented as parse trees)
• “Beaten” by NMT in 2016
• Can we bring the benefits of syntax into the recent neural systems?
From Rico Sennrich, “NMT: Breaking the Performance Plateau”, 2016
From Williams, Sennrich, Post & Koehn (2016), “Syntax-based Statistical Machine Translation”
Syntax: Constituency Structure
• A constituency (a.k.a. phrase-structure) grammar defines a set of rewrite rules which describe the structure of the language
• Groups words into larger units (constituents)
• Defines a hierarchy between constituents
• Draws relations between different constituents (words, phrases, clauses…)
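Such constituency trees are conventionally written as bracketed strings, e.g. "(S (NP the boy) (VP sleeps))". A minimal sketch (not the authors' code, assuming that usual bracketed format) of reading such a string back into a nested structure:

```python
# Minimal sketch (not the authors' code): parse a bracketed
# constituency string into nested {label, children} nodes.
def parse(tokens):
    """Parse tokens like ['(S', '(NP', 'the', 'boy', ')', ...]."""
    tok = tokens.pop(0)
    if tok.startswith("("):
        node = {"label": tok[1:], "children": []}
        while tokens[0] != ")":
            node["children"].append(parse(tokens))
        tokens.pop(0)  # consume the closing bracket
        return node
    return {"label": tok, "children": []}  # a leaf, i.e. a word

tokens = "(S (NP the boy) (VP sleeps))".replace(")", " ) ").split()
tree = parse(tokens)
print(tree["label"])  # S
```

The nesting directly encodes the constituent hierarchy described above.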
Why Syntax Can Help MT?
• Hints as to which word sequences belong together
• Helps in producing well-structured sentences
• Allows informed reordering decisions according to the syntactic structure
• Encourages long-distance dependencies when selecting translations
String-to-Tree Translation
(Figure: a source string translated into a target parse tree)
Our Approach: String-to-Tree NMT
• Main idea: translate a source sentence into a linearized tree of the target sentence
• Inspired by works on RNN-based syntactic parsing (Vinyals et al., 2015; Choe & Charniak, 2016)
• Allows using the seq2seq framework as-is
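The linearization itself can be sketched as follows (our reconstruction; the exact token format in the paper may differ): the target parse tree is flattened into a sequence of bracket and word tokens, which a standard seq2seq decoder can then emit one token at a time.

```python
# Sketch of tree linearization (our reconstruction, not the authors'
# exact format): flatten a nested (label, children) tree into tokens.
def linearize(tree):
    if isinstance(tree, str):        # a leaf: a target word
        return [tree]
    label, children = tree
    out = ["(" + label]              # opening bracket carries the label
    for child in children:
        out.extend(linearize(child))
    out.append(")")
    return out

tree = ("S", [("NP", ["the", "boy"]), ("VP", ["sleeps"])])
print(" ".join(linearize(tree)))  # (S (NP the boy ) (VP sleeps ) )
```

Training pairs then look like (source string, linearized target tree), so no change to the seq2seq machinery is needed.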
Experimental Details
• We used the Nematus toolkit (Sennrich et al. 2017)
• Joint BPE segmentation (Sennrich et al. 2016)
• For training, we parse the target side using the BLLIP parser (McClosky, Charniak and Johnson, 2006)
• Requires some care to make BPE, tokenization, and the parser work together
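As a toy illustration of the BPE step (with a made-up merge list, not merges learned from the WMT data): segmentation greedily applies the learned merge operations, in order, to each word's character sequence.

```python
# Toy BPE segmentation sketch; the merge list here is invented for
# illustration, not learned from a corpus.
def bpe_segment(word, merges):
    """Greedily apply merge operations to a character sequence."""
    symbols = list(word)
    for a, b in merges:                      # merges in learned order
        i = 0
        while i < len(symbols) - 1:
            if symbols[i] == a and symbols[i + 1] == b:
                symbols[i:i + 2] = [a + b]   # merge the adjacent pair
            else:
                i += 1
    return symbols

merges = [("l", "o"), ("lo", "w")]
print(bpe_segment("lower", merges))  # ['low', 'e', 'r']
```

With joint BPE, one merge list is learned over source and target together, so both sides share subword units.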
Experiments - Large Scale
• German to English, 4.5 million parallel training sentences from WMT16
• Train two NMT models using the same setup (same settings as the SOTA neural system in WMT16):
• syntax-aware (bpe2tree)
• syntax-agnostic baseline (bpe2bpe)
• The syntax-aware model performs better in terms of BLEU
(BLEU results chart: single model and 5-model ensemble)
Experiments - Low Resource
• German/Russian/Czech to English, 140k–180k parallel training sentences (News Commentary v8)
• The syntax-aware model performs better in terms of BLEU in all cases (12 comparisons)
• Up to 2+ BLEU improvement
Looking Beyond BLEU
Accurate Trees
• 99% of the predicted trees in the development set had valid bracketing
• Eye-balling the predicted trees found them well-formed and following the syntax of English
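The bracketing-validity check can be sketched as a simple balance test (our illustration): a predicted token sequence is valid iff no bracket closes before it opens and every opened bracket is eventually closed.

```python
# Sketch of a bracketing-validity check for predicted tree tokens.
def valid_bracketing(tokens):
    depth = 0
    for tok in tokens:
        if tok.startswith("("):
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:        # closed a bracket that was never opened
                return False
    return depth == 0            # everything opened was closed

print(valid_bracketing("(S (NP the boy ) (VP sleeps ) )".split()))  # True
```

Note the decoder is not constrained to emit balanced brackets, which is why the 99% figure is worth reporting.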
Where Syntax Helps? Alignments
• The attention-based model induces soft alignments between the source and the target
• The syntax-aware model produces more sensible alignments
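A common way to turn soft attention into hard alignments (our sketch; the paper's exact procedure may differ) is to align each target token to the source position that receives the highest attention weight:

```python
# Sketch: induce hard alignments from an attention matrix by taking
# the argmax source position for each target token.
def induce_alignments(attention):
    """attention[t][s] = weight on source s when producing target t."""
    return [max(range(len(row)), key=row.__getitem__) for row in attention]

attn = [
    [0.7, 0.2, 0.1],   # target token 0 attends mostly to source 0
    [0.1, 0.1, 0.8],   # target token 1 attends mostly to source 2
]
print(induce_alignments(attn))  # [0, 2]
```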
Attending to Source Syntax
• We inspected the attention weights during the production of the tree’s opening brackets
• The model consistently attends to the main verb (“hatte”) or to structural markers (question marks, hyphens…) in the source sentence
• Indicates the system implicitly learns source syntax to some extent (Shi, Padhi and Knight, 2016) and possibly plans the decoding accordingly
Where Syntax Helps? Structure
Structure (I) - Reordering
• German to English translation requires a significant amount of reordering during translation
• Quantifying reordering shows that the syntax-aware system performs more reordering during the training process
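One simple way to quantify reordering from word alignments (our assumption for illustration; the paper's exact metric may differ) is average distortion, the mean distance between aligned target and source positions:

```python
# Sketch of a reordering metric (average distortion); the paper's
# exact quantification may differ.
def reordering_score(alignment):
    """alignment: list of (target_pos, source_pos) links."""
    if not alignment:
        return 0.0
    return sum(abs(t - s) for t, s in alignment) / len(alignment)

monotone = [(0, 0), (1, 1), (2, 2)]
reordered = [(0, 2), (1, 0), (2, 1)]
print(reordering_score(monotone))   # 0.0
print(reordering_score(reordered))
```

A perfectly monotone translation scores 0; the more crossing links, the higher the score.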
Structure (I) - Reordering
• We would like to interpret the increased reordering from a syntactic perspective
• We extract GHKM rules (Galley et al., 2004) from the dev set using the predicted trees and attention-induced alignments
• The most common rules reveal linguistically sensible transformations, like moving the verb from the end of a German constituent to the beginning of the matching English one
• More examples in the paper
(Example rules shown: German → English)
Structure (II) - Relative Constructions
• A common linguistic structure is relative constructions, e.g. “The XXX which YYY”, “A XXX whose YYY”…
• The words that connect the clauses in such constructions are called relative pronouns, e.g. “who”, “which”, “whom”…
• The syntax-aware system produced more relative pronouns due to the syntactic context
Structure (II) - Relative Constructions
Source: “Guangzhou, das in Deutschland auch Kanton genannt wird…”
Reference: “Guangzhou, which is also known as Canton in Germany…”
Syntax-Agnostic: “Guangzhou, also known in Germany, is one of…”
Syntax-Based: “Guangzhou, which is also known as the canton in Germany,…”
Structure (II) - Relative Constructions
Source: “Zugleich droht der stark von internationalen Firmen abhängigen Region ein Imageschaden…”
Reference: “At the same time, the image of the region, which is heavily reliant on international companies…”
Syntax-Agnostic: “At the same time, the region's heavily dependent region…”
Syntax-Based: “At the same time, the region, which is heavily dependent on international firms…”
Human Evaluation
• We performed a small-scale human evaluation using Mechanical Turk on the first 500 sentences in newstest2015
• Two turkers per sentence
• The syntax-aware translations had an advantage over the baseline
(Results chart: counts of “2bpe better”, “neutral” and “2tree better” judgments)
Conclusions
• Neural machine translation can clearly benefit from target-side syntax. Other recent work includes:
• Eriguchi et al., 2017; Wu et al., 2017 (Dependency)
• Nadejde et al., 2017 (CCG)
• A general approach - can be easily incorporated into other neural language generation tasks like summarization, image caption generation…
• Larger picture: don’t throw away your linguistics! Neural systems can also leverage symbolic linguistic information