Stochastic Inversion Transduction Grammars Dekai Wu
11-734 Advanced Machine Translation Seminar
Presented by:
Sanjika Hewavitharana
04/13/2006
Overview
Simple Transduction Grammars
Inversion Transduction Grammars (ITGs)
Stochastic ITGs
Parsing with SITGs
Applications of SITGs
Main Reading: Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora (1997)
Introduction
Mathematical models of translation
  IBM Models (Brown et al.): string generates string
  Syntax-based (Yamada & Knight): tree generates string
  ITG (Wu): two trees are generated simultaneously
ITGs
  A formalism for modeling bilingual sentence pairs
  Not intended for use as full translation models, but for parallel corpus analysis
  Extract useful structures from input data
Generative view rather than translation view
  Two output trees are generated simultaneously, one for each language
Transduction Grammars
A simple transduction grammar is a CFG whose terminals are pairs of symbols (or singletons)
Can be used to model the generation of bilingual sentence pairs
E The Financial Secretary and I will be accountable.
C
$A \to x/y$, $A \to x/\epsilon$, $A \to \epsilon/y$, where $x \in L_1$, $y \in L_2$
Transduction Grammar Rules E.g.
Simple Rules:
Inversion Rule:
Transduction Grammars
In general, simple transduction grammars are not very useful
  The two languages would have to share exactly the same grammatical structure
  So some sentence pairs cannot be generated
ITG removes the rigid parallel ordering constraint
  Constituent order in one language may be the inverse of the other language
  Order is the same for both (square brackets): $A \to [B\ C]$
  Order is inverted for one (angle brackets): $A \to \langle B\ C \rangle$
ITGs
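e.g.
  Straight: $A \to [B\ C]$ gives $A_e \to B_e\ C_e$ and $A_c \to B_c\ C_c$
  Inverted: $A \to \langle B\ C \rangle$ gives $A_e \to B_e\ C_e$ and $A_c \to C_c\ B_c$
With ITG we can parse the previous sentence pair
  Inversion rule: $VP \to \langle VV\ PP \rangle$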
ITG Parse Tree
Expressiveness of ITGs
Not all matchings are possible with an ITG
  e.g. ‘inside-out’ matchings are not allowed
This helps to reduce the combinatorial growth of matchings with the number of tokens
  The number of matchings eliminated increases rapidly as the number of tokens increases
  The author claims this is a benefit
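The restriction can be made concrete by counting. The sketch below is an illustration only, not code from the paper: it treats a complete matching of n tokens as a permutation and assumes a matching is ITG-reachable when it can be split recursively into two adjacent blocks combined in straight or inverted order. The function names `itg_reachable` and `count_itg_matchings` are made up for this example, and singletons are ignored.

```python
from functools import lru_cache
from itertools import permutations


@lru_cache(maxsize=None)
def itg_reachable(perm):
    """True if the permutation can be built from straight/inverted binary combinations."""
    if len(perm) <= 1:
        return True
    for k in range(1, len(perm)):
        left, right = perm[:k], perm[k:]
        straight = max(left) < min(right)   # left block maps entirely before the right block
        inverted = min(left) > max(right)   # left block maps entirely after the right block
        if (straight or inverted) and itg_reachable(left) and itg_reachable(right):
            return True
    return False


def count_itg_matchings(n):
    """Number of complete matchings of n tokens that an ITG can generate."""
    return sum(itg_reachable(p) for p in permutations(range(n)))


for n in range(1, 7):
    total = len(list(permutations(range(n))))
    print(n, count_itg_matchings(n), "of", total)
```

For four tokens the two excluded permutations are the ‘inside-out’ ones, (1, 3, 0, 2) and (2, 0, 3, 1); every other ordering of four tokens is reachable under these assumptions.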
Normal Form of ITG
For any ITG there exists an equivalent grammar in the normal form
The right-hand side of every rule is one of:
  A terminal couple: $A \to x/y$
  A terminal singleton: $A \to x/\epsilon$ or $A \to \epsilon/y$
  A pair of non-terminals with straight orientation: $A \to [B\ C]$
  A pair of non-terminals with inverted orientation: $A \to \langle B\ C \rangle$
Stochastic ITGs
A probability can be assigned to each rewrite rule, e.g.
  $a_{NN \to [A\ N]} = 0.4$
  $b_A(x, y) = 0.001$
The probabilities of all the rules with a given left-hand side must sum to 1
An SITG gives the most probable (maximum-likelihood) matching parse for a sentence pair
  Similar to Viterbi or CYK (chart) parsing
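The sum-to-1 condition is easy to check mechanically. The following is a minimal sketch under assumed conventions: rules are stored as a dictionary keyed by (left-hand side, right-hand side), and every rule and value other than the 0.4 for NN → [A N] is made up for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy rule table: (left-hand side, right-hand side) -> probability.
toy_rules = {
    ("NN", "[A N]"): 0.4,
    ("NN", "<A N>"): 0.1,
    ("NN", "x1/y1"): 0.5,
    ("A", "x2/y2"): 0.999,
    ("A", "x3/y3"): 0.001,
}


def is_normalized(rules, tol=1e-9):
    """True if the rule probabilities for each left-hand side sum to 1."""
    totals = defaultdict(float)
    for (lhs, _rhs), p in rules.items():
        totals[lhs] += p
    return all(math.isclose(total, 1.0, abs_tol=tol) for total in totals.values())


print(is_normalized(toy_rules))
```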
Parsing with SITGs
Every node (q) in the parse tree has 5 elements:
  Begin & end indices for language-1 string (s, t)
  Begin & end indices for language-2 string (u, v)
  Non-terminal category (i)
Each cell (in the chart) stores the probability of the most likely parse covering the appropriate substrings, rooted in the appropriate category
Parsing with SITGs - Algorithm
Initialize the cells corresponding to terminals using a translation lexicon
For the other cells, recursively find the most probable way of obtaining that nonterminal category.
Compute the probability by multiplying the probability of the rule by the probabilities of both the constituents
Store that probability plus the orientation of the rule
Complexity: $O(n^3 m^3)$
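The steps above can be sketched as a memoized chart parser. This is a simplified illustration rather than the paper's implementation: it assumes a single nonterminal (so the category index is dropped), a lexicon keyed by word pairs with `None` marking the empty side of a singleton, and one probability each for the straight and inverted rules. The names `sitg_viterbi`, `toy_lexicon`, `p_straight` and `p_inverted` and the romanized toy words are hypothetical. It also allows any split in which both children cover at least one word, which may differ in detail from the split condition used in the paper.

```python
from functools import lru_cache


def sitg_viterbi(e, c, lexicon, p_straight, p_inverted):
    """Most probable parse of a sentence pair under a one-nonterminal SITG.

    `lexicon` maps (e_word or None, c_word or None) to a probability;
    None stands for the empty side of a singleton. All names are illustrative.
    """
    e, c = tuple(e), tuple(c)

    @lru_cache(maxsize=None)
    def best(s, t, u, v):
        # Chart cell for e[s:t] paired with c[u:v]: (probability, backpointer).
        n_e, n_c = t - s, v - u
        cell = (0.0, None)
        # Initialization: couples and singletons come from the lexicon.
        if n_e <= 1 and n_c <= 1 and n_e + n_c > 0:
            pair = (e[s] if n_e else None, c[u] if n_c else None)
            p = lexicon.get(pair, 0.0)
            if p > 0.0:
                cell = (p, ("lex", pair))
        # Recursion: try every split point in both strings, in both orientations.
        for S in range(s, t + 1):
            for U in range(u, v + 1):
                options = (
                    ("[]", p_straight, (s, S, u, U), (S, t, U, v)),
                    ("<>", p_inverted, (s, S, U, v), (S, t, u, U)),
                )
                for orient, rule_p, left, right in options:
                    # Skip splits that leave a child with no words at all.
                    if any(k[1] - k[0] + k[3] - k[2] == 0 for k in (left, right)):
                        continue
                    p = rule_p * best(*left)[0] * best(*right)[0]
                    if p > cell[0]:
                        # Store the probability plus the orientation of the rule.
                        cell = (p, (orient, left, right))
        return cell

    def tree(span):
        # Follow backpointers to print the bracketing of a span.
        back = best(*span)[1]
        if back[0] == "lex":
            return "/".join(w if w is not None else "ε" for w in back[1])
        orient, left, right = back
        return f"{orient[0]}{tree(left)} {tree(right)}{orient[1]}"

    root = (0, len(e), 0, len(c))
    prob = best(*root)[0]
    return prob, (tree(root) if prob > 0.0 else None)


# Hypothetical toy lexicon (romanized words, made-up probabilities).
toy_lexicon = {("read", "du"): 0.25, ("books", "shu"): 0.25}
prob, parse = sitg_viterbi(["read", "books"], ["shu", "du"], toy_lexicon, 0.3, 0.2)
print(prob, parse)
```

There are on the order of n²m² spans and nm split points per span, which is where the $O(n^3 m^3)$ bound comes from. In the toy pair the two words appear in opposite order, so the only non-zero derivation uses the inverted rule.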
Applications of SITGs
Segmentation
Bracketing
Alignment
Bilingual Constraint Transfer
Mining parallel sentences from comparable corpora
[Wu & Fung 2005]
Applications of SITGs - Segmentation
Word boundaries are not marked in Chinese text
  No word chunks available for matching
One option: do word segmentation as preprocessing
  Might produce chunks that do not agree bilingually
Solution: extend the algorithm to accommodate segmentation
  Allow the initialization step to find strings of any length in the translation lexicon
  The recursive step stores the most probable way of creating a constituent, whether it came from the lexicon or from rules
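The extended initialization step might look like the sketch below. It is an assumption-laden illustration: the lexicon is taken to be keyed by (English phrase, Chinese string), the Chinese side is treated as a list of unsegmented characters, and the function name `seed_chart` and the probabilities are invented for the example.

```python
def seed_chart(e_words, c_chars, lexicon):
    """Seed chart cells with every span pair found in the lexicon, at any length."""
    chart = {}
    for s in range(len(e_words)):
        for t in range(s + 1, len(e_words) + 1):
            for u in range(len(c_chars)):
                for v in range(u + 1, len(c_chars) + 1):
                    # English words are joined with spaces, Chinese characters without.
                    key = (" ".join(e_words[s:t]), "".join(c_chars[u:v]))
                    if key in lexicon:
                        chart[(s, t, u, v)] = lexicon[key]
    return chart


# Hypothetical lexicon entries with made-up probabilities.
toy_phrase_lexicon = {("in", "在"): 0.5, ("Hong Kong", "香港"): 0.9}
seeded = seed_chart(["in", "Hong", "Kong"], list("在香港"), toy_phrase_lexicon)
print(seeded)
```

Because multi-character entries enter the chart directly, the recursive step can weigh them against constituents built from shorter pieces, so segmentation falls out of the parse instead of being fixed beforehand.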
Applications of SITGs – Bracketing
How to assign structure to a sentence with no grammar available?
  Especially problematic for minority languages
A solution using ITGs:
  Get a parallel corpus pairing it with some other language
  Get a reasonable translation dictionary
  Parse it with a bracketing transduction grammar
Bracketing Transduction Grammar
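A minimal ITG
  Only one nonterminal: A
  Production rules:
  $A \xrightarrow{a} [A\ A]$
  $A \xrightarrow{a} \langle A\ A \rangle$
  $A \xrightarrow{b_{ij}} u_i/v_j$
  $A \xrightarrow{b_{i\epsilon}} u_i/\epsilon$
  $A \xrightarrow{b_{\epsilon j}} \epsilon/v_j$
The lexical translation probabilities $b_{ij}$ have prominence
  Small probability values for the two singleton production rules
  Also, a very small value for $a$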
Bracketing with Singletons
Singletons cause bracketing errors
Some refinements:
  Depending on the language, bias singleton attachment either to the left or to the right of a constituent
  Apply a series of transformations that push the singletons as close as possible to couples
  e.g. [ x A B ] ⇌ x A B ⇌ x A B ⇌ [x A ] B
Before:
After:
Bracketing Experiments
Used 2000 Chinese-English sentence-pairs from HKUST corpus
Some filtering:
  Remove sentence pairs that were not adequately covered by the lexicon (>1 unknown word)
  Remove sentence pairs with too many unmatched words (>2)
Bracketing precision:
  80% for English
  78% for Chinese
Errors mainly due to lexical imperfections
  A statistical lexicon (~6.5k English, ~5.5k Chinese words)
Can be improved with extra information
  e.g. POS, grammar-based bracketer
Applications of SITGs - Alignment
Alignments (phrasal or word) are a natural byproduct of bilingual parsing
Unlike ‘parse-parse-match’ methods, this
  Doesn’t require a robust grammar for both languages
  Guarantees compatibility between parses
  Has a principled way of choosing between possible alignments
Provides a more reasonable ‘distortion penalty’
Recent empirical studies show ITGs produce better alignments in various applications [Wu & Fung 2005]
Bilingual Constraint Transfer
A high-quality parse for one language can be leveraged to get structure for the other
Alter the parsing algorithm: only allow constituents that match the parse that already exists for the well-studied language
This works for any sort of constraint supplied for the well-studied language
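One simple way to read "match the existing parse" is as a no-crossing test: a candidate span in the well-studied language is kept only if it does not partially overlap any bracket of the given parse. The sketch below assumes half-open (start, end) word spans; the function names `crosses` and `span_allowed` are hypothetical, and the actual constraint used in the paper may be stricter or looser.

```python
def crosses(span, bracket):
    """True if two half-open spans overlap without one containing the other."""
    (s, t), (a, b) = span, bracket
    return (s < a < t < b) or (a < s < b < t)


def span_allowed(span, brackets):
    """True if the candidate span is compatible with every bracket of the known parse."""
    return not any(crosses(span, bracket) for bracket in brackets)


# Hypothetical bracketing of a five-word sentence: [[w0 w1] [w2 w3 w4]]
known_brackets = [(0, 5), (0, 2), (2, 5)]
print(span_allowed((1, 3), known_brackets))  # straddles the (0, 2) boundary
print(span_allowed((2, 4), known_brackets))  # nested inside (2, 5)
```

In a chart parser this filter would be applied before a cell is filled, so cells for disallowed spans simply keep probability zero.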
References:
Dekai Wu (1997), Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora, Computational Linguistics, Vol. 23, no. 3, pp. 377-403.
Dekai Wu (1995), Grammarless Extraction of Phrasal Translation Examples from Parallel Texts, 6th Intl. Conf. on Theoretical and Methodological Issues in Machine Translation, Vol. 2, pp. 354-372, Leuven, Belgium.
Dekai Wu and Pascale Fung (2005), Inversion Transduction Grammar Constraints for Mining Parallel Sentences from Quasi-Comparable Corpora, 2nd Intl. Joint Conf. on Natural Language Processing (IJCNLP-2005), Jeju, Korea, October.