September 2004 CSAW 2004

Extraction of Bilingual Information from Parallel Texts

Mike Rosner

Outline

• Machine Translation
• Traditional vs. Statistical Architectures
• Experimental Results
• Conclusions

Translational Equivalence: many-to-many relation

[Figure: many-to-many mapping between SOURCE and TARGET]

Traditional Machine Translation

Remarks

• Character of System
– Knowledge based.
– High quality results if domain is well delimited.
– Knowledge takes the form of specialised rules (analysis; synthesis; transfer).

• Problems
– Limited coverage.
– Knowledge acquisition bottleneck.
– Extensibility.

Statistical Translation

• Robust
• Domain independent
• Extensible
• Does not require language specialists
• Uses noisy channel model of translation

Noisy Channel Model: Sentence Translation (Brown et al. 1990)

[Figure: source sentence → noisy channel → target sentence]

The Problem of Translation

• Given a sentence T of the target language, seek the sentence S from which a translator produced T, i.e. find S that maximises P(S|T).

• By Bayes' theorem, P(S|T) = P(S) x P(T|S) / P(T), whose denominator is independent of S.

• Hence it suffices to maximise P(S) x P(T|S).
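The argument above can be checked on a toy example. The sketch below assumes three hypothetical candidate source sentences for a single fixed T, with made-up values for P(S) and P(T|S); none of the numbers come from the slides. It only illustrates that dividing by P(T) does not change which S wins.

```python
# Hypothetical candidates S for one fixed target sentence T.
# Each value is (P(S), P(T|S)); the numbers are illustrative assumptions.
candidates = {
    "John loves Mary": (0.020, 0.30),
    "Mary loves John": (0.020, 0.05),
    "John Mary loves": (0.001, 0.30),
}

# P(T) is the same for every candidate. Here it is summed over the
# toy candidate set only, which is enough to show it is a constant.
p_t = sum(p_s * p_t_s for p_s, p_t_s in candidates.values())


def joint(s):
    """P(S) x P(T|S) for candidate s."""
    p_s, p_t_s = candidates[s]
    return p_s * p_t_s


def posterior(s):
    """P(S|T) by Bayes' theorem."""
    return joint(s) / p_t


best_by_joint = max(candidates, key=joint)
best_by_posterior = max(candidates, key=posterior)
print(best_by_joint, best_by_posterior)
```

Both rankings pick the same sentence, because the posterior is the joint score scaled by the same positive constant for every candidate.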

A Statistical MT System

[Diagram: Source Language and Translation Model boxes, combined as P(S) * P(T|S) = P(S,T); the Decoder takes T and outputs S]

The Three Components of a Statistical MT model

1. Method for computing language model probabilities (P(S))

2. Method for computing translation probabilities (P(T|S))

3. Method for searching amongst source sentences for one that maximises P(S) * P(T|S)

Probabilistic Language Models

• General: P(s1 s2 ... sn) = P(s1) * P(s2|s1) * ... * P(sn|s1 ... s(n-1))

• Trigram: P(s1 s2 ... sn) = P(s1) * P(s2|s1) * P(s3|s1,s2) * ... * P(sn|s(n-2),s(n-1))

• Bigram: P(s1 s2 ... sn) = P(s1) * P(s2|s1) * ... * P(sn|s(n-1))
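As a rough illustration of the bigram case, the sketch below estimates probabilities by relative frequency from a tiny corpus. The corpus, the function names and the `<s>` start marker (so that P(s1) is treated as P(s1|<s>)) are assumptions made for the example; no smoothing is applied, so unseen bigrams get probability 0.

```python
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their left contexts in whitespace-tokenised text."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # (previous, current) pairs
    return contexts, bigrams


def bigram_prob(sentence, contexts, bigrams):
    """P(s1 ... sn) as a product of relative-frequency bigram estimates."""
    tokens = ["<s>"] + sentence.split()
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if contexts[prev] == 0:
            return 0.0  # context never seen in training
        prob *= bigrams[(prev, word)] / contexts[prev]
    return prob


corpus = ["John loves Mary", "John loves Sue", "Mary loves John"]
contexts, bigrams = train_bigram(corpus)
print(bigram_prob("John loves Mary", contexts, bigrams))
```

A trigram model follows the same pattern with two words of context instead of one.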

A Simple Alignment Based Translation Model

Assumption: target sentence is generated from the source sentence word-by-word

S: John loves Mary

T: Jean aime Marie

Sentence Translation Probability

• According to this model, the translation probability of the sentence is just the product of the translation probabilities of the words.
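For the John loves Mary example, the product can be sketched as follows. The word translation probabilities are hypothetical values chosen for illustration, and the function assumes the strict one-to-one, same-order alignment of the simple model.

```python
import math

# Hypothetical word translation probabilities P(t|s), keyed by (target, source).
t_prob = {
    ("Jean", "John"): 0.9,
    ("aime", "loves"): 0.8,
    ("Marie", "Mary"): 0.9,
}


def sentence_translation_prob(source, target, t_prob):
    """P(T|S) under the word-by-word model: product of P(t|s) over aligned words."""
    src, tgt = source.split(), target.split()
    if len(src) != len(tgt):
        raise ValueError("the simple model assumes one target word per source word")
    # Unknown word pairs are given probability 0 in this sketch.
    return math.prod(t_prob.get((t, s), 0.0) for s, t in zip(src, tgt))


print(sentence_translation_prob("John loves Mary", "Jean aime Marie", t_prob))
```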

More Realistic Example

The proposal will not now be implemented

Les propositions ne seront pas mises en application maintenant

Some Further Parameters

• Word Translation Probability: P(t|s)

• Fertility: the number of words in the target that are paired with each source word: (0 – N)

• Distortion: the difference in sentence position between the source word and the target word: P(i|j,l)

Searching

• Maintain list of hypotheses. Initial hypothesis: (Jean aime Marie | *)

• Search proceeds iteratively. At each iteration we extend the most promising hypotheses with additional words:
Jean aime Marie | John(1) *
Jean aime Marie | * loves(2) *
Jean aime Marie | * Mary(3) *
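The idea of keeping a list of hypotheses and extending the most promising ones can be sketched with a small beam search. This is a deliberately simplified stand-in, not the decoder of Brown et al.: it assumes one source word per target word in the same order (no fertility or distortion), and all probability tables are hypothetical.

```python
import heapq

# Hypothetical P(t|s), keyed by (target word, source word).
t_prob = {
    ("Jean", "John"): 0.9, ("Jean", "Jack"): 0.1,
    ("aime", "loves"): 0.7, ("aime", "likes"): 0.3,
    ("Marie", "Mary"): 0.9, ("Marie", "Marie"): 0.1,
}
# Hypothetical bigram language model P(word | previous), keyed by (previous, word).
lm_prob = {
    ("<s>", "John"): 0.5, ("<s>", "Jack"): 0.5,
    ("John", "loves"): 0.6, ("John", "likes"): 0.4,
    ("Jack", "loves"): 0.5, ("Jack", "likes"): 0.5,
    ("loves", "Mary"): 0.7, ("loves", "Marie"): 0.3,
    ("likes", "Mary"): 0.7, ("likes", "Marie"): 0.3,
}


def decode(target, beam_width=2):
    """Return the best source sentence found and its score P(S) * P(T|S)."""
    hypotheses = [(1.0, [])]  # (score so far, source words so far)
    for t in target.split():
        extended = []
        for score, words in hypotheses:
            prev = words[-1] if words else "<s>"
            for (tw, s), p_trans in t_prob.items():
                if tw != t:
                    continue
                p_lm = lm_prob.get((prev, s), 0.0)
                extended.append((score * p_lm * p_trans, words + [s]))
        # Keep only the most promising hypotheses for the next iteration.
        hypotheses = heapq.nlargest(beam_width, extended, key=lambda h: h[0])
        if not hypotheses:
            return "", 0.0
    best_score, best_words = hypotheses[0]
    return " ".join(best_words), best_score


print(decode("Jean aime Marie"))
```

A wider beam explores more alternatives at higher cost; with `beam_width=1` the search becomes greedy.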

Parameter Estimation

• In general: large quantities of data.
• For the language model, we need only source language text.
• For the translation model, we need pairs of sentences that are translations of each other.
• Use EM Algorithm (Baum 1972) to optimize model parameters.
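To make the EM step concrete, here is a minimal sketch that estimates only word translation probabilities P(t|s) from sentence pairs, starting from uniform values. It ignores fertility and distortion, so it is closer to what is commonly called IBM Model 1 than to the full model in the slides; the toy corpus and function name are assumptions.

```python
from collections import defaultdict


def train_word_translation(pairs, iterations=10):
    """Estimate P(t|s) by EM from (source sentence, target sentence) string pairs."""
    corpus = [(s.split(), t.split()) for s, t in pairs]
    target_vocab = {t for _, tgt in corpus for t in tgt}
    uniform = 1.0 / len(target_vocab)
    t_prob = defaultdict(lambda: uniform)  # keyed by (target word, source word)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (t, s) pairings
        total = defaultdict(float)  # expected count of s being paired at all
        # E-step: share each target word among the source words of its sentence.
        for src, tgt in corpus:
            for t in tgt:
                norm = sum(t_prob[(t, s)] for s in src)
                for s in src:
                    frac = t_prob[(t, s)] / norm
                    count[(t, s)] += frac
                    total[s] += frac
        # M-step: renormalise expected counts into probabilities.
        for (t, s), c in count.items():
            t_prob[(t, s)] = c / total[s]
    return dict(t_prob)  # only word pairs that co-occur in some sentence pair


pairs = [
    ("John loves Mary", "Jean aime Marie"),
    ("John sleeps", "Jean dort"),
    ("Mary sleeps", "Marie dort"),
]
model = train_word_translation(pairs)
for s in ["John", "loves", "Mary", "sleeps"]:
    best = max((t for (t, sw) in model if sw == s), key=lambda t: model[(t, s)])
    print(s, "->", best)
```

Even with three sentence pairs, co-occurrence alone is enough for each source word to settle on its intended translation; realistic vocabularies need far more data, as the slide notes.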

Experiment (Brown et al. 1990)

• Hansard. 40,000 pairs of sentences = approx. 800,000 words in each language.
• Considered 9,000 most common words in each language.
• Assumptions (initial parameter values)
– each of the 9,000 target words equally likely as translations of each of the source words.
– each of the fertilities from 0 to 25 equally likely for each of the 9,000 source words.
– each target position equally likely given each source position and target length.

English: not

French         Probability
pas            .469
ne             .460
non            .024
pas du tout    .003
faux           .003
plus           .002
ce             .002
que            .002
jamais         .002

Fertility      Probability
2              .758
0              .133
1              .106

English: hear

French         Probability
bravo          .992
entendre       .005
entendu        .002
entends        .001

Fertility      Probability
0              .584
1              .416

Bajada 2003/4

• 400 sentence pairs from Malta/EU accession treaty

• Three different types of alignment
– Paragraph (precision 97%, recall 97%)
– Sentence (precision 91%, recall 95%)
– Word: 2 translation models
   • Model 1: distortion independent
   • Model 2: distortion dependent

Bajada 2003/4

                         Model 1    Model 2
word pairs present       244        244
word pairs identified    145        145
correct                  58         77
incorrect                87         68
precision                40%        53%
recall                   24%        32%
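The precision and recall figures can be reproduced from the counts, assuming precision is correct / identified and recall is correct / present. That reading is an assumption, but it matches the reported percentages.

```python
def precision_recall(correct, identified, present):
    """Precision = correct / identified; recall = correct / present."""
    return correct / identified, correct / present


# Counts taken from the table: (correct, identified, present).
results = {"Model 1": (58, 145, 244), "Model 2": (77, 145, 244)}

for name, (correct, identified, present) in results.items():
    precision, recall = precision_recall(correct, identified, present)
    print(f"{name}: precision {precision:.0%}, recall {recall:.0%}")
```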

Conclusion/Future Work

• Larger data sets

• Finer models of word/word translation probabilities taking into account
– fertility
– morphological variants of the same words

• Role and tools for bilingual informant (not linguistic specialist)
