Towards Interactive and Automatic Refinement of
Translation Rules
Ariadna Font Llitjós
PhD Thesis Proposal
Jaime Carbonell (advisor), Alon Lavie (co-advisor)
Lori Levin, Bonnie Dorr (Univ. Maryland)
5 November 2004
Interactive and Automatic Rule Refinement 2
Outline
• Introduction
• Thesis statement and scope
• Preliminary Research
  – Interactive elicitation of error information
  – A framework for automatic rule adaptation
• Proposed Research
• Contributions and Thesis Timeline
Machine Translation (MT)
• Source Language (SL) sentence: Gaudi was a great artist
  Spanish translation: Gaudi era un gran artista
• MT system outputs:
  *Gaudi estaba un artista grande
  *Gaudi era un artista grande
Spanish Adjectives (Completed Work: Automatic Rule Adaptation)
• General order (grande = big in size):
  English NP: DET ADJ N → Spanish NP: DET N ADJ
  a big house → una casa grande
• Exception (gran = exceptional):
  English NP: DET ADJ N → Spanish NP: DET ADJ N
  a great artist → un gran artista
Commercial and Online Systems
Correct translation: Gaudi era un gran artista
• Systran, Babelfish (AltaVista), WorldLingo, Translated.net:
  *Gaudi era ∅ gran artista
• ImTranslation:
  *El Gaudi era un gran artista
• 1-800-Translate:
  *Gaudi era un fenomenal artista
Post-editing
• Current solutions:
  – Post-editing [Allen, 2003] by human linguists or editors (experts)
  – Automated Post-Editing prototype module (APE) [Allen & Hogan, 2000], to alleviate the tedious task of correcting the most frequent errors over and over
• There is no solution that fully automates the post-editing process.
Drawbacks of Current Methods
• Manual post-editing: corrections do not generalize
  Gaudi era un artista grande
  Juan es un amigo grande (Juan is a great friend)
  Era una oportunidad grande (It was a great opportunity)
• APE: humans need to predict all the errors ahead of time and code the post-editing rules; given a new error, a new rule must be written by hand.
My Solution
• Automate post-editing efforts by feeding them back into the MT system.
• Possible alternatives:
  – Automatic learning of post-editing rules
    + system independent
    - several thousand sentences might need to be corrected for the same error
  – Automatic refinement of translation rules
    + attacks the core of the problem
    - only for transfer-based MT systems (need rules to fix!)
Related Work
• Fixing machine translation / rule adaptation: [Corston-Oliver & Gammon, 2003], [Imamura et al., 2003], [Menezes & Richardson, 2001], [Su et al., 1995], [Brill, 1993], [Gavaldà, 2000], [Callison-Burch, 2004]
• Post-editing: [Allen & Hogan, 2000]
• My thesis: no pre-existing training data required, no human reference translations required, non-expert user feedback
Resource-poor Scenarios (AVENUE)
- Lack of electronic parallel data
- Lack of computational linguists → lack of a manual grammar
Why bother?
- Indigenous communities have difficult access to crucial information that directly affects their lives (such as land laws, plagues, health warnings, etc.)
- Preservation of their language and culture
Resource-poor languages: Mapudungun, Quechua, Aymara
How is MT possible for resource-poor languages?
Bilingual speakers
AVENUE Project Overview
[Architecture diagram, in three stages: Elicitation (Elicitation Tool, Elicitation Corpus) produces a Word-Aligned Parallel Corpus; Rule Learning (Learning Module) produces Transfer Rules, alongside handcrafted rules; the Run-Time System (Run-Time Transfer System, with Lexical Resources and a Morphological analyzer) produces a translation Lattice.]
My Thesis
[Same architecture diagram, with two additions: a Translation Correction Tool on the run-time output (the lattice) and a Rule Refinement Module that feeds refined Transfer Rules back into the system.]
Related Work
[Diagram: my thesis sits at the intersection of post-editing, rule adaptation, and fixing machine translation, applied to resource-poor languages.]
Thesis Statement
Given a rule-based transfer MT system, we can:
- Extract useful information from non-expert bilingual speakers to correct MT output.
- Automatically refine and expand translation rules, given corrected and aligned translation pairs and some error information.
So that the set of refined rules has better coverage and higher overall MT quality.
Assumptions
• No pre-existing parallel training data
• No pre-existing human reference translations
• The SL sentence needs to be fully parsed by the translation grammar.
• Bilingual speakers can give enough information about the MT errors.
Scope
Evaluate automatic refinement under the following conditions:
1. User correction information only.
2. Correction and error information.
3. Correction and error information, plus further user interaction.
Both in manually written and automatically learned grammars [AMTA 2004].
Technical Challenges
• Elicit minimal MT information from non-expert users
• Automatically refine and expand translation rules, minimally, for both manually written and automatically learned grammars
• Automatically evaluate the refinement process
Preliminary Work
• Interactive elicitation of error information
• A framework for automatic rule adaptation
Error Typology for Automatic Rule Refinement (simplified)
(Completed Work: Interactive elicitation of error information)
• Missing word
• Extra word
• Wrong word order (local vs. long distance; word vs. phrase; ± word change)
• Incorrect word (sense, form, selectional restrictions, idiom)
• Wrong agreement
• Missing constraint
• Extra constraint
TCTool (Demo)
(Interactive elicitation of error information)
Actions:
• Add a word
• Delete a word
• Modify a word
• Change word order
1st Eng2Spa User Study [LREC 2004]
(Completed Work: Interactive elicitation of error information)
• MT error classification: 9 linguistically motivated classes [Flanagan, 1994], [White et al., 1994]: word order, sense, agreement error (number, person, gender, tense), form, incorrect word and no translation

                        precision   recall
  error detection          90%        89%
  error classification     72%        71%
Translation Rules

{NP,8}
NP::NP : [DET ADJ N] -> [DET N ADJ]
(
 (X1::Y1) (X2::Y3) (X3::Y2)
 ;; English parsing:
 ((x0 def) = (x1 def))   ; NP definiteness = DET definiteness
 (x0 = x3)               ; NP = N (N is the head of the NP)
 ;; Spanish generation:
 ((y1 agr) = (y2 agr))   ; DET agreement = N agreement
 ((y3 agr) = (y2 agr))   ; ADJ agreement = N agreement
 (y2 = x3)               ; pass the features of English N to Spanish N
)

ADJ::ADJ |: [nice] -> [bonito]
(
 (X1::Y1)
 ((x0 pos) = adj)
 ((x0 form) = nice)
 ((y0 agr num) = sg)     ; Spanish ADJ is singular in number
 ((y0 agr gen) = masc)   ; Spanish ADJ is masculine in gender
)
Automatic Rule Refinement Framework
(Completed Work: Automatic Rule Adaptation)
• Find the best rule refinement (RR) operations given:
  • a grammar (G),
  • a lexicon (L),
  • a (set of) source language sentence(s) (SL),
  • a (set of) target language sentence(s) (TL),
  • its parse tree (P), and
  • a minimal correction of TL (TL')
such that translation quality after refinement exceeds the quality before (TQ2 > TQ1).
• This can also be expressed as:
  max TQ(TL | TL', P, SL, RR(G, L))
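The objective above can be sketched as a search over candidate refinements. The following minimal Python illustration is not the thesis system: the translation-quality function TQ is stood in by plain string similarity to the corrected translation TL', and translate() and the candidate list are hypothetical placeholders.

```python
from difflib import SequenceMatcher

def tq(hypothesis, tl_corrected):
    """Stand-in TQ score: similarity of a translation to the user's correction TL'."""
    return SequenceMatcher(None, hypothesis, tl_corrected).ratio()

def best_refinement(candidates, translate, sl, tl_corrected):
    """Return the candidate refined (grammar, lexicon) whose output maximizes TQ."""
    return max(candidates, key=lambda gl: tq(translate(gl, sl), tl_corrected))
```

For instance, with a toy translate() that maps a refined grammar "R1" to "Gaudi era un gran artista" and the unrefined "R0" to "Gaudi era un artista grande", best_refinement picks "R1" when TL' is the user's correction.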
Types of Refinement Operations
(Completed Work: Automatic Rule Adaptation)
1. Refine a translation rule: R0 → R1 (change R0 to make it more specific or more general)
   R0: NP: [DET ADJ N] → [DET N ADJ]
       a nice house → *una casa bonito
   R1: same rule + (N gender = ADJ gender)
       a nice house → una casa bonita
2. Bifurcate a translation rule:
   R0 → R0 (same, general rule)
      → R1 (add a new, more specific rule)
   R0: NP: [DET ADJ N] → [DET N ADJ]
       a nice house → una casa bonita
   R1: NP: [DET ADJ N] → [DET ADJ N]  (ADJ type: pre-nominal)
       a great artist → un gran artista
Formalizing Error Information
(Completed Work: Automatic Rule Adaptation)
Wi = error, Wi' = correction, Wc = clue word
Example:
  a nice house → *una casa bonito (R0: NP: [DET ADJ N] → [DET N ADJ])
  corrected: una casa bonita (R1 adds N gender = ADJ gender)
  Wi = bonito, Wi' = bonita, Wc = casa
Triggering Feature Detection
(Completed Work: Automatic Rule Adaptation)
Comparison at the feature level to detect the triggering feature(s)
→ Delta function: δ(Wi, Wi')
Examples:
  δ(bonito, bonita) = {gender}
  δ(comiamos, comia) = {person, number}
  δ(mujer, guitarra) = ∅
If the δ set is empty, we need to postulate a new binary feature.
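The δ function can be sketched in a few lines of Python. The tiny feature lexicon below is invented for illustration only; the real system would read the features from its lexical entries.

```python
# Hypothetical feature lexicon, for illustration only.
FEATURES = {
    "bonito":   {"gender": "masc", "number": "sg"},
    "bonita":   {"gender": "fem",  "number": "sg"},
    "comiamos": {"person": "1",    "number": "pl"},
    "comia":    {"person": "3",    "number": "sg"},
    "mujer":    {"gender": "fem",  "number": "sg"},
    "guitarra": {"gender": "fem",  "number": "sg"},
}

def delta(wi, wi_prime):
    """Return the set of features whose values differ between Wi and Wi'."""
    fi, fp = FEATURES[wi], FEATURES[wi_prime]
    return {f for f in fi.keys() & fp.keys() if fi[f] != fp[f]}
```

delta("mujer", "guitarra") comes back empty, which is exactly the case where a new binary feature must be postulated.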
Deciding on the Refinement Operation
(Completed Work: Automatic Rule Adaptation)
Given:
- the action performed by the user (add, delete, modify, change word order), and
- the error information available (error type, clue word, word alignments, etc.),
the system decides on a refinement action.
Rule Refinement Operations
[Decision tree over the four user actions (Modify, Add, Delete, Change Word Order), branching on the presence of a clue word (±Wc), word alignments (±al), whether the δ set is empty, whether a matching rule exists, and whether POSi = POSi'.]
Proposed Work
- Rule Refinement Example
- Batch mode implementation
- Interactive mode implementation
- User Studies
- Evaluation
Rule Refinement Example (Automatic Rule Adaptation)
Change word order
SL: Gaudí was a great artist
MT system output: TL: *Gaudí era un artista grande
Goal (given by user correction): Gaudí era un gran artista
1. Error Information Elicitation (Automatic Rule Adaptation)
2. Variable Instantiation from the Log File (Automatic Rule Adaptation)
Correcting actions:
1. Word order change (artista grande → grande artista): Wi = grande
2. Edited grande into gran: Wi' = gran; identified artista as the clue word → Wc = artista
In this case, even if the user had not identified Wc, the refinement process would have been the same.
3. Retrieve Relevant Lexical Entries (Automatic Rule Adaptation)
• No lexical entry for [great → gran]
• Duplicate the lexical entry [great → grande] and change its TL side:

Original entry:
ADJ::ADJ |: [great] -> [grande]
((X1::Y1)
 (…)
 ((y0 agr num) = sg)
 ((y0 agr gen) = masc))

New entry:
ADJ::ADJ |: [great] -> [gran]
((X1::Y1)
 (…)
 ((y0 agr num) = sg)
 ((y0 agr gen) = masc))

(Morphological analyzer: grande = gran)
4. Finding Triggering Feature(s)
Feature δ function: δ(Wi, Wi') = ∅
→ need to postulate a new binary feature: feat1

5. Blame Assignment (from MT system output)
tree: <((S,1 (NP,2 (N,5:1 "GAUDI") )
        (VP,3 (VB,2 (AUX,17:2 "ERA") )
          (NP,8 (DET,0:3 "UN")
            (N,4:5 "ARTISTA")
            (ADJ,5:4 "GRANDE") ) ) ) )>
(The rule labels S,1 … NP,8 in the tree point back into the grammar.)
6. Variable Instantiation in the Rules (Automatic Rule Adaptation)
Wi = grande → POSi = ADJ = Y3, y3
Wc = artista → POSc = N = Y2, y2

{NP,8} ;; Y1 Y2 Y3
NP::NP : [DET ADJ N] -> [DET N ADJ]
(
 (X1::Y1) (X2::Y3) (X3::Y2)
 ((x0 def) = (x1 def))
 (x0 = x3)
 ((y1 agr) = (y2 agr))   ; det-noun agreement
 ((y3 agr) = (y2 agr))   ; adj-noun agreement
 (y2 = x3)
)
7. Refining Rules (Automatic Rule Adaptation)
• Bifurcate NP,8 → NP,8 (R0) + NP,8' (R1) (flip the order of ADJ and N)

{NP,8'}
NP::NP : [DET ADJ N] -> [DET ADJ N]
(
 (X1::Y1) (X2::Y2) (X3::Y3)
 ((x0 def) = (x1 def))
 (x0 = x3)
 ((y1 agr) = (y3 agr))   ; det-noun agreement
 ((y2 agr) = (y3 agr))   ; adj-noun agreement
 (y2 = x3)
 ((y2 feat1) =c + )
)
8. Refining Lexical Entries (Automatic Rule Adaptation)

ADJ::ADJ |: [great] -> [grande]
((X1::Y1)
 ((x0 form) = great)
 ((y0 agr num) = sg)
 ((y0 agr gen) = masc)
 ((y0 feat1) = -))

ADJ::ADJ |: [great] -> [gran]
((X1::Y1)
 ((x0 form) = great)
 ((y0 agr num) = sg)
 ((y0 agr gen) = masc)
 ((y0 feat1) = +))
Done? Not Yet (Automatic Rule Adaptation)
NP,8 (R0)                  ADJ(grande) [feat1 = -]
NP,8' (R1) [feat1 =c +]    ADJ(gran)   [feat1 = +]
Current output:
  un artista grande
  un artista gran
  un gran artista
  *un grande artista
Need to restrict the application of the general rule (R0) to just post-nominal ADJs.
Add Blocking Constraint (Automatic Rule Adaptation)
NP,8 (R0) [feat1 = -]      ADJ(grande) [feat1 = -]
NP,8' (R1) [feat1 =c +]    ADJ(gran)   [feat1 = +]
Output now:
  un artista grande
  *un artista gran
  un gran artista
  *un grande artista
Can we also eliminate the incorrect translations automatically?
Making the Grammar Tighter (Automatic Rule Adaptation)
• If Wc = artista:
  – Add [feat1 = +] to N(artista)
  – Add an agreement constraint between N and ADJ to NP,8 (R0): ((N feat1) = (ADJ feat1))
Output now:
  *un artista grande
  *un artista gran
  un gran artista
  *un grande artista
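The filtering effect of the feat1 constraints can be sketched with a toy constraint check. The feature values follow the slides (ADJ(grande) feat1 = -, ADJ(gran) feat1 = +, N(artista) feat1 = + after tightening); the two rule predicates are a simplified illustration, not the transfer engine's unifier.

```python
# Hypothetical feat1 values after refinement, for illustration only.
LEX = {"grande": "-", "gran": "+", "artista": "+"}

def r0_ok(noun, adj):
    # R0 (general, post-nominal): ADJ must carry feat1=- and agree with N.
    return LEX[adj] == "-" and LEX[adj] == LEX[noun]

def r1_ok(noun, adj):
    # R1 (bifurcated, pre-nominal): value constraint ADJ feat1 =c +.
    # (noun is unused here; kept only for a uniform rule interface.)
    return LEX[adj] == "+"

def candidates(noun):
    """Generate the surviving NP translations for 'un <noun> <adj>'."""
    out = []
    for adj in ("grande", "gran"):
        if r0_ok(noun, adj):
            out.append(f"un {noun} {adj}")   # post-nominal order (R0)
        if r1_ok(noun, adj):
            out.append(f"un {adj} {noun}")   # pre-nominal order (R1)
    return out
```

With N(artista) marked feat1 = +, only "un gran artista" survives, matching the slide.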
Batch Mode Implementation (Proposed Work: Automatic Rule Adaptation)
• Given a set of user corrections, apply the refinement module.
• For refinement operations on errors that can be refined fully automatically using:
  1. Correction information only
  2. Correction and error information (error type, clue word)
1. Correction info only
Examples:
  It is a nice house – *Es una casa bonito → Es una casa bonita
  Gaudi was a great artist – *Gaudi era un artista grande → Gaudi era un gran artista
2. Correction and error info
Example:
  I am proud of you – *Estoy orgullosa de tu → Estoy orgullosa de ti
  (PP → PREP NP)
Interactive Mode Implementation (Proposed Work: Automatic Rule Adaptation)
• Extra error information is required to determine the triggering context automatically.
  → Need to give other relevant sentences (minimal pairs) to the user at run-time.
• For refinement operations on errors that can be refined fully automatically but:
  3. require further user interaction
3. Further user interaction
Example:
  I see them – *Veo los → Los veo
Example Requiring a Minimal Pair (Proposed Work: Automatic Rule Adaptation)
1. Run the SL sentence through the transfer engine:
   I see them → *veo los    Correct TL: los veo
2. Wi = los, but no Wi' nor Wc. Need a minimal pair to determine the appropriate refinement:
   I see cars → veo autos
3. Triggering feature(s): δ(veo los, veo autos)
   δ(los, autos) = {pos}
   PRON(los) [pos = pron]  vs.  N(autos) [pos = n]
Refining and Adding Constraints (Proposed Work)
VP,3:  [VP NP] → [VP NP]                          (veo los, veo autos)
VP,3': [VP NP] → [NP VP] + [NP pos =c pron]       (los veo, *autos veo)
• Percolate triggering features up to the constituent level:
  NP: [PRON] → [PRON] + [NP pos = PRON pos]
• Block application of the general rule (VP,3):
  VP,3: [VP NP] → [VP NP] + [NP pos = (*NOT* pron)]   (*veo los, veo autos)
Generalization Power (Proposed Work)
When the triggering feature already exists in the feature language (pos, gender, number, etc.):
  I see them – *veo los → los veo
Also fixes:
- I love him → lo amo (before: *amo lo)
- They called me yesterday → me llamaron ayer (before: *llamaron me ayer)
- Mary helps her with her homework → Maria le ayuda con sus tareas (before: *Maria ayuda le con sus tareas)
User Studies
• TCTool: new MT error classification (Eng2Spa)
• Different language pair: Mapudungun or Quechua → Spanish
• Batch vs. interactive mode
• Amount of information elicited: just corrections vs. corrections + error information
Evaluation of Refined MT Output
1. Evaluate the best translation: automatic evaluation metrics (BLEU, NIST, METEOR)
2. Evaluate the translation candidate list: size, precision (includes parsimony)
1. Evaluate Best Translation
Hypothesis file (translations to be evaluated automatically):
• Raw MT output: best sentence (picked by the user as correct or requiring the least amount of correction)
• Refined MT output: use the METEOR score at the sentence level to pick the best candidate from the list
→ Run all automatic metrics on the new hypothesis file, using user corrections as reference translations.
2. Evaluate Translation Candidate List
• Precision = tp / (tp + fp), where tp is binary {0,1} (1 if the list contains the user correction) and tp + fp is the total number of TL candidates.
[Figure: example candidate lists with precisions 0/3, 1/5, 1/5, 1/3 and 1/2; X marks candidates that differ from the user correction.]
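This list-level precision is a one-liner; the sketch below is an illustration of the definition above.

```python
def list_precision(candidates, user_correction):
    """1/len(list) if the user's corrected translation is in the candidate list, else 0."""
    tp = 1 if user_correction in candidates else 0
    return tp / len(candidates)
```

A list of five candidates that contains the correction scores 1/5; a list of three that misses it scores 0/3.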
Expected Contributions
• An efficient online GUI to display translations and alignments and solicit pinpoint fixes from non-expert bilingual users.
• An expandable set of rule refinement operations, triggered by user corrections, to automatically refine and expand different types of grammars.
• A mechanism to automatically evaluate rule refinements, with user corrections as reference translations.
Thesis Timeline
  Research component                                  Duration (months)
  Back-end implementation                             8
  User studies                                        3
  Resource-poor language (data + manual grammar)      2
  Adapt system to new language pair                   1
  Evaluation                                          1
  Write and defend thesis                             3
  Total                                               18
Expected graduation date: May 2006
Thanks!
Questions?
Some Questions
• What if user corrections differ (user noise)?
• More than one correction per sentence?
• Wc example
• Data set
• When is it appropriate to refine vs. bifurcate?
• Lexical bifurcate
• Is the refinement process deterministic?
Others
• TCTool demo simulation
• RR operation patterns
• Automatic evaluation feasibility study
• AMTA paper results
• User studies map
• Precision, recall, F1
• NIST, BLEU, METEOR
Precision, Recall and F1
• Precision = tp / (tp + fp)   (fp: selected, incorrect)
• Recall = tp / (tp + fn)      (fn: correct, not selected)
• F1 = 2PR / (P + R)
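These standard definitions translate directly to code; a minimal sketch:

```python
def precision(tp, fp):
    """tp / (tp + fp); fp counts items selected but incorrect."""
    return tp / (tp + fp)

def recall(tp, fn):
    """tp / (tp + fn); fn counts items correct but not selected."""
    return tp / (tp + fn)

def f1(p, r):
    """Harmonic mean of precision P and recall R."""
    return 2 * p * r / (p + r)
```

For the error-detection figures reported earlier (P = 0.90, R = 0.89), F1 works out to about 0.895.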
Automatic Evaluation Metrics
• BLEU: averages the precision of unigrams, bigrams and up to 4-grams, and applies a length penalty [Papineni, 2001].
• NIST: instead of plain n-gram precision, the information gain of each n-gram is taken into account [NIST 2002].
• METEOR: assigns most of the weight to recall instead of precision, and uses stemming [Lavie, 2004].
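To make the BLEU description concrete, here is a deliberately simplified single-reference sketch (geometric mean of 1- to 4-gram modified precisions, with a brevity penalty); it is illustrative only, not the official implementation, and it smooths zero counts crudely.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hyp, ref, max_n=4):
    """Simplified single-reference BLEU: clipped n-gram precisions + brevity penalty."""
    h, r = hyp.split(), ref.split()
    log_p = 0.0
    for n in range(1, max_n + 1):
        hn, rn = ngrams(h, n), ngrams(r, n)
        match = sum(min(c, rn[g]) for g, c in hn.items())  # clipped matches
        total = max(sum(hn.values()), 1)
        log_p += math.log(max(match, 1e-9) / total) / max_n  # crude smoothing
    bp = 1.0 if len(h) > len(r) else math.exp(1 - len(r) / max(len(h), 1))
    return bp * math.exp(log_p)
```

An identical hypothesis and reference score 1.0; a hypothesis sharing no n-grams with the reference scores near 0.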
Proposed Work
Data Set
• Split the development set (~400 sentences) into:
  – Dev set: run user studies, develop the refinement module, validate functionality
  – Test set: evaluate the effect of refinement operations
• Plus a wild test set (from naturally occurring text)
Requirement: sentences need to be fully parsed by the grammar.
Refine vs. Bifurcate
• Batch mode → bifurcate (no way to tell whether the original rule should never apply).
• Interactive mode → refine (change the original rule) if we can get enough evidence that the original rule never applies.
• Corrections involving agreement constraints seem to hold for all cases → refine.
(Open research question)
More than One Correction per Sentence
[Figure: corrections A and B apply to different words of the TL sentence; they are handled one at a time.]
"Tetris" approach to Automatic Rule Refinement.
Assumption: different corrections to different words → different errors.
Exception: Structural Divergences
He danced her out of the room → *La bailó fuera de la habitación
  (her he-danced out of the room)
Correct: La sacó de la habitación bailando
  (her he-take-out of the room dancing)
We have no way of knowing that these corrections are related.
Do one error at a time; if TQ decreases over the test set, hypothesize that it is a divergence.
Feed it to the Rule Learner as a new (manually corrected) training example.
Constituent Order Change
I gave him the tools → *di a él las herramientas (I-gave to him the tools)
Correct: le di las herramientas a él (him I-gave the tools to him)
Desired refinement:
  VP → VP PP(a PRON) NP  ⇒  VP → VP NP PP(a PRON)
We can extract constituent information from the MT output (parse tree) and treat this as one error/correction.
More than One Correction per Error
[Figure: corrections A and B apply to the same TL word.]
Example: edit and move the same word (like gran, bailó).
Occam's razor. Assumption: both corrections are part of the same error.
Wc Example
I am proud of you → *Estoy orgullosa de tu (I-am proud of you-nom), Wc = de, tu → ti
Correct: Estoy orgullosa de ti (I-am proud of you-oblique)
Without the Wc information, we would need to increase the ambiguity of the grammar significantly!
  +[you → ti]:  I love you → *ti quiero (te quiero, …)
                you read → *ti lees (tu lees, …)
Lexical Bifurcate
• Should the system copy all the features to the new entry?
  – Good starting point
  – Might want to copy just a subset of features (possibly POS-dependent)
Open research question.
[TCTool demo simulation screenshots (Automatic Rule Adaptation): SL + best TL picked by the user; changing word order; changing "grande" into "gran".]
Input to the RR Module (Automatic Rule Adaptation)
- User correction log file
- Transfer engine output (+ parse tree):

sl: I see them
tl: VEO LOS
tree: <((S,0 (VP,3 (VP,1 (V,1:2 "VEO") )
        (NP,0 (PRON,2:3 "LOS") ) ) ) )>

sl: I see cars
tl: VEO AUTOS
tree: <((S,0 (VP,3 (VP,1 (V,1:2 "VEO") )
        (NP,2 (N,1:3 "AUTOS") ) ) ) )>
Types of RR Operations (Completed Work: Automatic Rule Adaptation)
• Grammar:
  – Bifurcate: R0 → R0 + R1 [= R0' + constr]       Cov[R0] ≤ Cov[R0, R1]
  – Bifurcate: R0 → R1 [= R0 + constr = -]
               → R2 [= R0' + constr =c +]          Cov[R0] ≤ Cov[R1, R2]
  – Refine:    R0 → R1 [= R0 + constr]             Cov[R0] > Cov[R1]
• Lexicon:
  – Lex0 → Lex0 + Lex1 [= Lex0 + constr]
  – Lex0 → Lex1 [= Lex0 + constr]
  – Lex0 → Lex0 + Lex1 [≈ Lex0 + different TL word]
  – ∅ → Lex1 (adding a lexical item)
Manual vs. Learned Grammars [AMTA 2004] (Automatic Rule Adaptation)
• Automatic MT evaluation:
                      NIST   BLEU   METEOR
  Manual grammar      4.3    0.16   0.60
  Learned grammar     3.7    0.14   0.55
• Manual inspection:
Human Oracle Experiment (Completed Work: Automatic Rule Adaptation)
• As a feasibility experiment, we compared raw MT output with manually corrected MT: the difference is statistically significant (confidence interval test).
• This is an upper bound on how much difference we should expect any refinement approach to make.
Is the Order Deterministic?
• RR operation application is not deterministic: it depends on the order of the corrected sentences input to the system.
• Example:
  1st: gran artista → bifurcate (2 rules)
  2nd: casa bonito → add the agreement constraint to only 1 rule (the original, general rule)
  → the specific rule is still incorrect (missing the agreement constraint)
  1st: casa bonito → add the agreement constraint
  2nd: gran artista → bifurcate
  → both rules have the agreement constraint (optimal order)
User Noise?
Solution: have several users evaluate and correct the same test set, with an agreement threshold (90%) on:
- the correction
- the error information (type, clue word)
Only modify the grammar if there is enough evidence of an incorrect rule.
User Studies Map (Proposed Work)
[Table: user studies planned for the RR module over manual and learned grammars; Eng2Spa with corrections only, corrections + error info, and active learning; Mapu2Spa; in batch and interactive modes.]
Recycle corrections of machine translation output back into the system by refining and expanding the existing translation rules.
1. Correction info only
Examples:
  It is a nice house – *Es una casa bonito → Es una casa bonita
  John and Mary fell – *Juan y Maria ∅ cayeron → Juan y Maria se cayeron
  Gaudi was a great artist – *Gaudi era un artista grande → Gaudi era un gran artista
1. Correction info only (continued)
Examples:
  Gaudi was a great artist – *Gaudi era un artista grande → Gaudi era un gran artista
  *Es una casa bonito → Es una casa bonita
  J y M cayeron → J y M se cayeron
  I will help him fix the car – *Ayudaré a él a arreglar el auto → Le ayudaré a arreglar el auto
1. Correction info only (continued)
Examples:
  I will help him fix the car – *Ayudaré a él a arreglar el auto → Le ayudaré a arreglar el auto
  I would like to go – *Me gustaria que ir → Me gustaria ∅ ir
Interactive and Automatic Rule Refinement 90
Modify Add Delete Change W Order
+Wc –Wc +Wc –Wc +al –al Wi Wc Wi(…) Wc –Wc
δ≠∅ δ=∅ +rule –rule +al –al =Wi Wi Wi’ RuleLearner
POSi=POSi’ POSi≠POSi’
Rule Refinement Operations
I am proud of you – Estoy orgullosa tuEstoy orgullosa de ti
PP PREP NP
2. Correction and Error info
3. Further user interaction
Examples:
  Wally plays the guitar – *Wally juega la guitarra → Wally toca la guitarra
  I saw the woman – *Vi ∅ la mujer → Vi a la mujer
  I see them – *Veo los → Los veo
Outside the Scope of the Thesis
Examples:
  John read the book – *A Juan leyó el libro → ∅ Juan leyó el libro
  Where are you from? – *Donde eres tu de? → De donde eres