Symmetric Probabilistic Alignment
Jae Dong Kim
Committee: Jaime G. Carbonell, Ralf D. Brown, Peter J. Jansen
Motivation
In the CMU EBMT system, alignment has received less study than the other components.
We want to investigate a new sub-sentential aligner which uses translation probabilities in a symmetric fashion.
Outline
Introduction
Symmetric Probabilistic Alignment
Experiments and Results
Conclusions
Future Work
Sub-sentential Alignment
The CMU EBMT system refers to translation examples to translate an unseen source sentence
Since it is hard to find an exactly matching example sentence, the system finds the longest match
  Encapsulated local context
  Local reordering
The aligner should work on fragments (sub-sentences)
Need for a new aligner
Relatively less studied compared to the other components
The old aligner
  Heuristic based
  Builds a correspondence table
  Finds the longest target fragment and the shortest target fragment
  Checks every substring of the longest one that includes the shortest one
  Fast but doesn't use probabilities
Related Work
IBM models (Brown et al, 93)
HMM (Vogel et al, 96)
Competitive link (Melamed, 97)
Explicit Syntactic Information (Yamada et al, 02)
ISA (Zhang, 03)
The SPA differs from the above in that it aligns sub-sentences using translation probabilities and some heuristics when the boundaries of the source fragment are given.
Basic Algorithm (1)
Assumptions:
  A bilingual probabilistic dictionary is available
  Contiguous source fragments are translated into contiguous target fragments
  Fragments are translated independently of surrounding context
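Given a source fragment $s_{i_1}, \ldots, s_{i_k}$ and a candidate target fragment $t_{j_1}, \ldots, t_{j_l}$:
$$\hat{t} = \arg\max_{t} \left( \prod_{p=1}^{k} \max\left( \max_{q=1}^{l} p(t_{j_q} \mid s_{i_p}),\ \epsilon \right) \right)^{\frac{1}{k}} \times \left( \prod_{q=1}^{l} \max\left( \max_{p=1}^{k} p(s_{i_p} \mid t_{j_q}),\ \epsilon \right) \right)^{\frac{1}{l}}$$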
Basic Algorithm (2)
Assume that we are considering a candidate target fragment 't2 t3 t4' given a source fragment 's7 s8 s9'
Source -> Target Translation Score
S_tmp = max( p(t2|s7), p(t3|s7), p(t4|s7), ε )
      x max( p(t2|s8), p(t3|s8), p(t4|s8), ε )
      x max( p(t2|s9), p(t3|s9), p(t4|s9), ε )
S_st = S_tmp^{1/3}
Basic Algorithm (3)
Source <- Target Translation Score
S_tmp = max( p(s7|t2), p(s8|t2), p(s9|t2), ε )
      x max( p(s7|t3), p(s8|t3), p(s9|t3), ε )
      x max( p(s7|t4), p(s8|t4), p(s9|t4), ε )
S_ts = S_tmp^{1/3}
Source <-> Target Translation Score
Score = S_st * S_ts
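The scoring described in Basic Algorithm (1)-(3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPA implementation: the dictionary layout (keys of the form `(word, conditioning_word)`), the value of `EPSILON`, and the function names `directional_score`, `symmetric_score` and `best_target_fragment` are all hypothetical, and the search simply enumerates every contiguous target span.

```python
from math import prod

EPSILON = 1e-6  # assumed floor value; the slides only name it ε


def directional_score(from_words, to_words, prob, epsilon=EPSILON):
    """Geometric mean, over from_words, of the best p(to | from), floored at epsilon.

    prob maps (to_word, from_word) -> p(to_word | from_word); missing pairs count as 0.
    """
    best = [
        max(max((prob.get((t, f), 0.0) for t in to_words), default=0.0), epsilon)
        for f in from_words
    ]
    return prod(best) ** (1.0 / len(best))


def symmetric_score(source, target, p_t_given_s, p_s_given_t, epsilon=EPSILON):
    """Score = S_st * S_ts for one candidate target fragment."""
    s_st = directional_score(source, target, p_t_given_s, epsilon)
    s_ts = directional_score(target, source, p_s_given_t, epsilon)
    return s_st * s_ts


def best_target_fragment(source, target_sentence, p_t_given_s, p_s_given_t):
    """Return the (start, end) span, end exclusive, with the highest symmetric score."""
    spans = [
        (start, end)
        for start in range(len(target_sentence))
        for end in range(start + 1, len(target_sentence) + 1)
    ]
    return max(
        spans,
        key=lambda span: symmetric_score(
            source, target_sentence[span[0]:span[1]], p_t_given_s, p_s_given_t
        ),
    )
```

The floor ε keeps a single untranslated word from driving the product to zero, and the 1/k and 1/l exponents normalize for fragment length so that fragments of different sizes can be compared.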
Restrictions (1)
Untranslated word penalty
  [Figure: source fragment "s7 s8 s9" aligned to target fragment "t2 t3 t4"]
Anchor context
  [Figure: two alignments between source "s6 s7 s8 s9 s10" and target "t1 t2 t3 t4 t5"]
Restrictions (2)
Length penalty
  “t2 ... t30” for “s7 s8 s9”. Realistic?
  We expect the target fragment length to be proportional to the source fragment length.
Distance penalty
  “t45 t46 t47” for “s7 s8 s9”. Realistic? Maybe.
  Between languages with similar word order, we might expect a proportional position.
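The slides do not give formulas for these two penalties, so the following is only one possible way to turn the proportionality idea into numbers. Both functional forms (a min/max length ratio and a relative-position difference), as well as the names `length_penalty` and `distance_penalty`, are assumptions made for illustration. Each function returns a factor in (0, 1] that could multiply the alignment score.

```python
def length_penalty(src_len, tgt_len, src_sent_len, tgt_sent_len):
    """Assumed form: ratio between the expected and the actual target fragment length.

    The expected length scales the source fragment length by the
    sentence-length ratio. Returns 1.0 for a perfectly proportional fragment.
    """
    expected = src_len * tgt_sent_len / src_sent_len
    return min(expected, tgt_len) / max(expected, tgt_len)


def distance_penalty(src_start, src_len, tgt_start, tgt_len,
                     src_sent_len, tgt_sent_len):
    """Assumed form: 1 minus the gap between relative fragment centers.

    Positions are 0-based word indices; centers are normalized by
    sentence length so both lie in [0, 1].
    """
    src_center = (src_start + src_len / 2) / src_sent_len
    tgt_center = (tgt_start + tgt_len / 2) / tgt_sent_len
    return 1.0 - abs(src_center - tgt_center)
```

With these assumed forms, aligning "s7 s8 s9" to "t2 ... t30" in two 50-word sentences gives a length factor of 3/29, and aligning it to "t45 t46 t47" gives a distance factor of about 0.24.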
Combined Aligner
Set a threshold for the SPA
The SPA produces only results with a score higher than the threshold
For each source fragment
  If there is a result from the SPA -> use the SPA result
  Otherwise, use the IBM result
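The fallback rule above amounts to a small dispatch function. The sketch below assumes two hypothetical callables: `spa_align`, which returns a `(span, score)` pair or `None`, and `ibm_align`, which returns a span. Neither interface comes from the slides.

```python
from typing import Callable, Optional, Tuple

Span = Tuple[int, int]


def combined_align(
    fragment: Tuple[str, ...],
    spa_align: Callable[[Tuple[str, ...]], Optional[Tuple[Span, float]]],
    ibm_align: Callable[[Tuple[str, ...]], Span],
    threshold: float,
) -> Tuple[Span, str]:
    """Use the SPA result if its score exceeds the threshold, else the IBM result.

    Returns the chosen target span and a label naming the aligner that produced it.
    """
    result = spa_align(fragment)
    if result is not None:
        span, score = result
        if score > threshold:
            return span, "SPA"
    return ibm_align(fragment), "IBM"
```

Returning the aligner label alongside the span makes it easy to count how often each aligner is actually used at a given threshold.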
Alignment Accuracy (1)
Evaluation Metrics
  F1 (Precision, Recall), based on positions
Data
  English-Chinese
  Xinhua news wire
  Training data: 1m sentence pairs
    Trained GIZA++ with default parameters
    For the SPA, used the dictionary by GIZA++
  Test data:
    366 sentence pairs - 3 copies by 3 people
    20 more sentence pairs - 1 copy by another
    27286 source fragments, 3-8 words long
Alignment Accuracy (2)
Data
  French-English
  Canadian Hansard
  Training data: 1m sentence pairs
    Trained GIZA++ with default parameters
    For the SPA, used the dictionary by GIZA++
  Test data:
    91 sentence pairs
    12466 source fragments, 3-8 words long
Alignment Accuracy (3)
Alignments to be compared
  Random: random alignment to a reasonably long target fragment
  Positional: alignment to a proportionally positioned target fragment
  Oracle: the best possible contiguous human alignment
  SPA-uni: unidirectional basic alignment
  SPA-basic: bidirectional basic alignment
  SPA: the best SPA alignment with restrictions
  IBM4: non-contiguous alignment by IBM Model 4
  COMB: the combination of SPA and IBM4 alignments
  SPA-top10: the best of top 10 alignment results of SPA
Alignment Accuracy : En-Cn
SPA-basic outperformed SPA-uni
Among the SPA configurations, the best applied the untranslated word penalty and the length penalty (u, l)
Our significance test showed that the difference between IBM4 and COMB is significant
               Recall    Precision  F1
Random         0.321979  0.372175   0.345262
Positional     0.582254  0.576207   0.579215
Oracle         0.905602  0.861449   0.882974
SPA-uni        0.942574  0.355970   0.516776
SPA-basic      0.869897  0.473884   0.613538
SPA (u,l)      0.733485  0.693883   0.713135
IBM4           0.738995  0.807471   0.771717
COMB           0.756338  0.804163   0.779517
Alignment Accuracy : Fr-En
               Recall    Precision  F1
Random         0.193939  0.238384   0.213877
Positional     0.668841  0.728991   0.697622
Oracle         0.980509  0.937717   0.958636
SPA-uni        0.880979  0.281680   0.426874
SPA-basic      0.707808  0.712078   0.709936
SPA (u,a,l,d)  0.781466  0.801407   0.791311
IBM4           0.777064  0.965592   0.861130
COMB           0.781734  0.960679   0.862018
SPA-basic outperformed SPA-uni
Among the SPA configurations, the best applied all the restrictions (u, a, l, d)
Our significance test showed that the difference between IBM4 and COMB is not significant
Human Alignment Evaluation
               Recall    Precision  F1
ltao/xuwang    0.858758  0.980900   0.915774
ltao/sandy     0.742688  0.982903   0.846076
xuwang/ltao    0.896835  0.976533   0.934989
xuwang/sandy   0.783359  0.987704   0.873742
sandy/ltao     0.959004  0.950798   0.954884
sandy/xuwang   0.968574  0.961476   0.965012
Rough idea about how much humans agree on alignment
EBMT Performance (1)
Data
  French-English (Canadian Hansard)
  20k training sentence pairs
  Test
    Development set: 100 sentence pairs
    2 reference set: 2 references for 100 source sentences
    Evaluation set: 10 X 100 sentence pairs
Evaluation Metric
  BLEU
EBMT Performance (2)
        Devtest  2refTest  Test
EBMT    0.1632   0.2400    0.13455
SPA     0.2214   0.2896    0.17287
IBM4    0.2197   0.2785    0.17549
COMB    0.2240   0.2815    0.17506
SPA, IBM4 and COMB perform significantly better than EBMT (the old aligner)
For 'Test', SPA outperformed EBMT by 28.5 %
Among SPA, IBM4 and COMB, none is significantly better than the others
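The 28.5 % figure is the relative BLEU improvement on the 'Test' scores for SPA and EBMT. A quick check of the arithmetic, assuming that is how the percentage was derived:

```latex
\frac{\mathrm{BLEU}_{\mathrm{SPA}} - \mathrm{BLEU}_{\mathrm{EBMT}}}{\mathrm{BLEU}_{\mathrm{EBMT}}}
  = \frac{0.17287 - 0.13455}{0.13455}
  = \frac{0.03832}{0.13455}
  \approx 0.285
```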
Conclusions
Improved EBMT performance
The combined aligner worked best on the English-Chinese set
Bidirectional alignment worked better than unidirectional alignment
Future Work
Incorporating human dictionaries to cover more general domains
Non-contiguous alignment
Co-training of the SPA and a dictionary
Experiments on different data sets and different language pairs
Experiments with different metrics
Speed up
References
Ying Zhang, Stephan Vogel, and Alex Waibel. 2003. Integrated Phrase Segmentation and Alignment Model for Statistical Machine Translation. Submitted to Proc. of International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE), Beijing, China.
Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2):263-311.
Stephan Vogel, Hermann Ney, and Christoph Tillmann. 1996. HMM-based word alignment in statistical translation. In COLING '96: The 16th Int. Conf. on Computational Linguistics, pages 836-841, Copenhagen, August.
I. Dan Melamed. 1997. A Word-to-Word Model of Translational Equivalence. In Procs. of the ACL97, pages 490-497, Madrid, Spain.
K. Yamada and K. Knight. 2002. A decoder for syntax-based statistical MT. In ACL '02.
Alignment Accuracy Calculation
Human Answer
... under the unemployment insurance plan of the other country ...
Machine Answer
... under the unemployment insurance plan of the other country ...
Precision: 4/20 = 0.2
Recall: 4/8 = 0.5
F1 = 0.2857
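Position-based precision, recall and F1 can be computed from two sets of target word positions, one marked by the human and one produced by the aligner. The sketch below is a minimal illustration. The function name `position_prf` and the example positions in the usage note are hypothetical and are not the data from the slide.

```python
def position_prf(human_positions, machine_positions):
    """Precision, recall and F1 over target word positions.

    A position counts as correct when it appears in both the human
    and the machine alignment.
    """
    human = set(human_positions)
    machine = set(machine_positions)
    correct = len(human & machine)
    precision = correct / len(machine) if machine else 0.0
    recall = correct / len(human) if human else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For instance, with human positions {3, 4, 5, 6} and machine positions {4, 5, 6, 7, 8}, three positions are shared, so precision is 3/5 and recall is 3/4.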