Page 1:

The Impact of Statistical Word Alignment Quality and Structure in Phrase-based Statistical Machine Translation

A doctoral dissertation by:

Francisco Guzmán

CS Department

Tecnológico de Monterrey

24 – November 2011

Page 2:

In a world of increasing information …

Information on the Internet is growing exponentially!

More content: News, blogs, status updates, tweets, etc.

Page 3:

… and many languages …

4000-5000 different languages in the world

Access to information is limited by the language barrier.

Page 4:

… we need Machine Translation

as a quick and cheap means to perform translation.

Page 5:

Pop quiz: what is she saying?

Options

A) I had a big sandwich for lunch

B) I’ve had enough with Greece’s Papandreou

C) We have a huge problem with European debt

D) Europe is in a very difficult situation

Page 6:

Online translators are examples …

Page 7:

… of Machine Translation …

Page 8:

… that use statistical methods …

Machine Translation

Statistical

Rule-based

Example-based

Started in the 80s (IBM Candide).

Statistical analysis of bilingual texts.

Not bound to a specific source/target language.

Has proven to be very effective, as long as we have enough training data.

Page 9:

… to get the best translation

Model the probability of a source language sentence f being translated into a target language sentence e

Page 10:

… to get the best translation

[Diagram: Translation Model, Language Model, Decoder, Translation]

Language Model = fluency

Translation Model = fidelity

Decoder = search for the optimal translation
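As background, the three components above correspond to the standard noisy-channel formulation of SMT. The notation below is an assumption about how the talk wrote it; only the roles of the components come from the slide.

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f) \;=\; \arg\max_{e} \underbrace{P(f \mid e)}_{\text{translation model}} \; \underbrace{P(e)}_{\text{language model}}
```

Here the argmax is the search performed by the decoder, P(f | e) rewards fidelity to the source sentence f, and P(e) rewards fluency of the target sentence e.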

Page 11:

Yet better translators are necessary

Google Music competirá con la tienda dominante, iTunes de Apple, y otros servicios digitales de música.

La tienda de Google venderá canciones por un precio estimado a un dólar la canción, informó el Wall Street Journal.

Google Music Store to compete with dominant, Apple iTunes and other digital music services.

Google's store sold songs for an estimated price for a dollar a song, the Wall Street Journal.

Page 12:

This is a story

Page 13:

This talk outline:

0 - Motivation

1 - Word Alignments and SMT

· Quality discrepancy
· Partial explanations
· Hypothesis and objectives

2 - Alignments and the translation model

3 - The translation model and translation quality

4 - Improving SMT using alignment structure

Conclusions

Page 14:

Word Alignments and Phrase-based SMT

Je ne bois pas du lait

I don’t drink milk

veux

want

Page 15:

Phrase-Based SMT …

Phrases are chunks of words.

The idea is that phrases move as units during the translation process.

Phrases capture some contextual information.

There is much less reordering to do.

Page 16:

… is based on word alignments.

Preprocessing

Word Alignments

Phrase Extraction

Translation Model

Page 17:

Word Alignments …

Represent word-to-word (lexical) translations

La casa es blanca

The house is white

Je ne veux pas du lait

I don’t want milk

Page 18:

… are important …

How phrases are extracted (which ones, how many, etc).

How phrases are scored.

Certain features (lexical).

Preprocessing

Word Alignments

Phrase Extraction

Translation Model

Page 19:

… increasing its quality …

There are three basic quantities that we can measure:
· true positives (tp), or matches
· false positives (fp), or type I errors
· false negatives (fn), or type II errors

There are several metrics to measure the alignment quality:
· Precision
· Recall
· F-score
· AER (Alignment Error Rate)

Example: tp = 1, fn = 2, fp = 1
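The quantities above can be sketched in a few lines of Python. This is a minimal illustration, not the dissertation's evaluation code: links are represented as hypothetical (source index, target index) pairs, every reference link is assumed to be a "sure" link (in which case AER reduces to 1 minus the balanced F-score), and the toy link sets are invented to reproduce the counts tp = 1, fn = 2, fp = 1.

```python
from typing import Dict, Set, Tuple

Link = Tuple[int, int]  # (source word index, target word index)


def alignment_scores(hypothesis: Set[Link], reference: Set[Link]) -> Dict[str, float]:
    """Precision, recall, balanced F-score and AER against one reference.

    Assumption: all reference links are "sure" links.
    """
    tp = len(hypothesis & reference)  # links present in both sets
    fp = len(hypothesis - reference)  # proposed links missing from the reference
    fn = len(reference - hypothesis)  # reference links that were not proposed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    total = len(hypothesis) + len(reference)
    aer = 1.0 - 2 * tp / total if total else 0.0
    return {"precision": precision, "recall": recall, "f_score": f_score, "aer": aer}


# Toy example (invented): tp = 1, fp = 1, fn = 2.
reference = {(0, 0), (1, 1), (2, 2)}
hypothesis = {(0, 0), (1, 2)}
scores = alignment_scores(hypothesis, reference)
print(scores)
```

For this toy pair the precision is 0.5, the recall is 1/3, the F-score is 0.4 and the AER is 0.6.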

Page 20:

… has promoted developments …

Concern about improving alignment quality.

The recent availability of human annotated data.

Development of discriminative approaches that use alignment quality as the tuning metric: Moore (2005); Taskar et al. (2005); Blunsom and Cohn (2006); Niehues and Vogel (2008)

Under the presumption that better alignments meant better translations.

Page 21:

… to improve translation?

Despite the improvements in alignment quality, translation quality improvements remained small.

Some studies started to look into this phenomenon.

Lopez and Resnik, 2006

Page 22:

Looking into detail …

Alignment Quality vs Translation Quality: Fraser and Marcu (2006); Vilar et al. (2006)

Alignment Quality vs Translation Pipeline: Lopez et al. (2006); Ayan and Dorr (2006)

Alignment Structure vs Translation: Lambert et al. (2009, 2010); Guzman et al. (2009)

Page 23:

… the performance mismatch

Translation performance: measured with automatic metrics that compare the output against a set of references, e.g. Bilingual Evaluation Understudy (BLEU), which is widely used.

Alignment performance: obtained by comparing against a human-generated reference alignment, e.g. Alignment Error Rate (AER) and F-measure, both widely used.

Page 24:

Alignment Quality vs. Translation

Page 25:

AQ vs TQ

Fraser and Marcu (2007)

Evaluated correlation between BLEU and AER.

Proposed a variation of F-measure that balances precision and recall.

Vilar and Ney (2006)

Better BLEU scores can be obtained with “degraded” alignments.

Mismatch between alignment and translation models.

Metrics fail because they do not account for structure.

AER != BLEU

It is important to have better alignment metrics

↓ AER = ↑ BLEU

We need to regard not only alignment quality but also structure
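For reference, the F-measure variation attributed to Fraser and Marcu above is commonly written with a weight α that trades precision against recall. The formula is given here as background; the symbol names are assumptions and the slide itself does not show it.

```latex
F_{\alpha} \;=\; \frac{1}{\dfrac{\alpha}{\text{Precision}} + \dfrac{1 - \alpha}{\text{Recall}}}, \qquad 0 \le \alpha \le 1
```

With α = 0.5 this reduces to the usual balanced F-score; values of α closer to 1 weight precision more heavily and values closer to 0 weight recall more heavily.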

Page 26:

Alignment Quality in the Pipeline

Page 27:

AQ vs the pipeline

Ayan and Dorr (2006)

Analyze the quality of the alignments and resulting phrase tables.

In-depth analysis of phrase table coverage.

Lopez and Resnik (2006)

Compare effect of different alignments in decoding.

Analyze variations in the decoder search space due to variations in alignment quality.

We have to study the effects of alignments on the Translation Model

We need better feature engineering.

Page 28:

Alignment Structure vs. Pipeline

Je ne veux pas du lait

I don’t want milk

Target Gaps

Source Gaps

Diagonality

Crossings

Links
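Some of the structural properties named above can be counted directly from a set of links. The sketch below is illustrative only: the link set for the example sentence pair is a hypothetical alignment (the links drawn on the slide are not recoverable), gaps are taken to mean unaligned words, and diagonality is left out because its definition is not given here.

```python
from typing import Dict, Set, Tuple

Link = Tuple[int, int]  # (source word index, target word index)


def structure_stats(links: Set[Link], src_len: int, tgt_len: int) -> Dict[str, int]:
    """Count links, unaligned words on each side, and crossing link pairs."""
    aligned_src = {s for s, _ in links}
    aligned_tgt = {t for _, t in links}
    ordered = sorted(links)
    # Two links cross when their source order and target order disagree.
    crossings = sum(
        1
        for i, (s1, t1) in enumerate(ordered)
        for s2, t2 in ordered[i + 1:]
        if (s1 - s2) * (t1 - t2) < 0
    )
    return {
        "links": len(links),
        "source_gaps": src_len - len(aligned_src),
        "target_gaps": tgt_len - len(aligned_tgt),
        "crossings": crossings,
    }


source = ["Je", "ne", "veux", "pas", "du", "lait"]
target = ["I", "don't", "want", "milk"]
# Hypothetical alignment: "ne ... pas" both link to "don't", "du" is unaligned.
links = {(0, 0), (1, 1), (3, 1), (2, 2), (5, 3)}
stats = structure_stats(links, len(source), len(target))
print(stats)
```

Under this assumed alignment there are 5 links, 1 source gap ("du"), no target gaps, and 1 crossing (the pas/don't link crosses the veux/want link).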

Page 29:

AS vs the pipeline

Lambert et al. (2009, 2010)
· Effect of the number of links on phrase table size and ambiguity.
· Analysis of link length, crossings, etc.
· Bivariate correlation analysis.

If we study alignment structure we find interesting relationships

Page 30:

The story behind …

It is important to have better alignment metrics

We need to regard not only alignment quality but also structure

We have to study the effects of alignments on the Translation Model

We need better feature engineering.

If we study alignment structure we find interesting relationships

Page 31:

… what we need to build …

More inclusive analysis
· Using quality AND structure.
· Several training stages involved.
· Multivariate approach.

Predictive models
· Identify most relevant variables.
· Help us to design better features.

Page 32:

… our hypothesis …

Alignment structure has a large impact on how a translation model is estimated. Hence, it should also have a large impact on Machine Translation performance. Thus, by controlling the impact of alignment structure we will be able to improve Machine Translation performance.

Page 33:

… which leads to objectives …

Analyze the impact of alignment structure at different stages of the training pipeline

Provide models that measure the impact of alignment structure on phrase-based translation model estimation

Provide a model that measures the impact of alignment structure and the translation model on translation quality

Use alignment structure to improve alignment training and translation modeling, and thereby machine translation performance

Page 34:

This talk outline:

0 - Motivation

1 - Word Alignments and SMT

· Quality discrepancy
· Partial explanations
· Hypothesis and objectives

2 - Alignments and the translation model

3 - The translation model and translation quality

4 - Improving SMT using alignment structure

Conclusions

Page 35:

Effects of Alignments in the Translation Model

Page 36:

Alignment and the TM

Preprocessing

Word Alignments

Phrase Extraction

Translation Model

Page 37:

Phrase Extraction (PX)

Phrases are extracted up to a max length N

Consistency is defined as follows:
· Any phrase pair must contain at least one link.
· Any word inside the phrase pair must be linked exclusively to words inside the same phrase pair.

Extract all phrases that are consistent with the word alignment.

Page 38:

Phrase Extraction (up to length 4)

the | la

house | casa

is | es

white | blanca

the house | la casa

house is | casa es

is white | es blanca

the house is | la casa es

house is white | casa es blanca

the house is white | la casa es blanca
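The list above can be reproduced with a brute-force sketch of consistent phrase extraction. This is a minimal illustration rather than the extraction code used in the experiments: the function names are hypothetical, and the one-to-one monotone alignment for the example sentence is an assumption.

```python
from typing import List, Set, Tuple

Link = Tuple[int, int]  # (index in first sentence, index in second sentence)


def is_consistent(links: Set[Link], a0: int, a1: int, b0: int, b1: int) -> bool:
    """True if the span pair has at least one link and no link leaves it."""
    has_link = False
    for a, b in links:
        a_in = a0 <= a <= a1
        b_in = b0 <= b <= b1
        if a_in != b_in:  # a word inside is linked to a word outside
            return False
        has_link = has_link or a_in
    return has_link


def extract_phrases(first: List[str], second: List[str],
                    links: Set[Link], max_len: int = 4) -> List[Tuple[str, str]]:
    """Enumerate every span pair up to max_len words and keep the consistent ones."""
    pairs = []
    for a0 in range(len(first)):
        for a1 in range(a0, min(a0 + max_len, len(first))):
            for b0 in range(len(second)):
                for b1 in range(b0, min(b0 + max_len, len(second))):
                    if is_consistent(links, a0, a1, b0, b1):
                        pairs.append((" ".join(first[a0:a1 + 1]),
                                      " ".join(second[b0:b1 + 1])))
    return pairs


english = "the house is white".split()
spanish = "la casa es blanca".split()
full = {(0, 0), (1, 1), (2, 2), (3, 3)}  # assumed monotone alignment
pairs = extract_phrases(english, spanish, full)
sparse_pairs = extract_phrases(english, spanish, full - {(2, 2)})  # drop is/es
print(len(pairs), len(sparse_pairs))
```

With the full alignment the sketch yields the 10 phrase pairs listed above. Dropping the is/es link leaves one unaligned word on each side and the count rises to 15, because unaligned words can attach to neighboring phrases without violating consistency.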

Page 39:

Consistency

Page 40:

The effect of alignments in PX

Objective: determine which characteristic was more relevant: quality or structure?

Analyzed different types of Chinese-English alignments:
· Discriminative (DWA-1, DWA-2, …, DWA-9)
· Generative (GIZA-S2T, GIZA-T2S)
· Heuristic (SYM)

Diversity in the precision/recall balance between alignments.

Page 41:

Alignment Density metrics

Literature

Avg. number of links

Ours

Avg. Number of source and target gaps

Gap rates

Page 42:

What do we mean by gaps?

In this phrase pair: 1 gap on the target phrase

In this phrase pair: 2 gaps on the source phrase

Alignment matrix

Page 43:

Phrase-pair metrics

Quantitative
· Number of phrase pairs
· Singletons (unique entries)
· Phrase length
· Gaps (unaligned words inside phrase pair)

Quality
· Manual evaluation

Page 44:

Number of Phrases

PT grows as our alignment gets sparser

Related to unaligned words rather than number of links

[Figure: Number of generated phrase pairs. Series: Phrases, Links. Horizontal axis: alignments. Vertical axes: number of links, number of phrases.]

Page 45:

Number of Phrases

[Figure: Number of generated phrase pairs. Series: Phrases, Unaligned Source, Unaligned Target. Horizontal axis: alignments. Vertical axes: number of unaligned words, number of phrases.]

PT grows as our alignment gets sparser

Related to unaligned words rather than number of links

Page 46:

Human Evaluation of Phrase Pairs

Setup:
· Bilingual Chinese-English speakers
· Each subject was asked whether a phrase pair was adequate
· No contextual information
· Included a noisy input

Possible answers: YES, NO

Page 47:

Results

Most dense alignments fare better.

Phrase pairs with gaps: 3 times more errors for hand-aligned (HA) data.

Random pairings are usually bad.

Adequacy by Alignments (% of test cases)

Alignments | Yes | No
HA-no gap | 92 | 8
HA-gap | 76 | 24
Random | 8 | 92

Page 48:

Summary

High precision alignments

More gaps

Less links

More phrase pairs

More unique phrase pairs

More gaps in phrases

Longer phrases

More coverage

More TM sparsity

Use less phrases

Lower quality phrase pairs

Page 49:

Word alignments and TM

Preprocessing

Word Alignments

Phrase Extraction

Translation Model

Page 50:

Going further: Translation Model

More than just phrase-pairs

Translation probability features:
· Phrasal phi(e|f) (PT1)
· Lexical lex(e|f) (PT2)
· Inverse phrasal phi(f|e) (PT3)
· Inverse lexical lex(f|e) (PT4)

Estimated using MLE
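As an illustration of the MLE estimate, the phrasal feature phi(e|f) can be computed as a relative frequency over extracted phrase pairs. The sketch below also computes an average entropy per source phrase; that entropy definition (Shannon entropy of phi(.|f), averaged over source phrases) is an assumption made for illustration and may differ from the one used in the dissertation. The phrase pairs are invented.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

PhrasePair = Tuple[str, str]  # (source phrase f, target phrase e)


def mle_phrase_probs(pairs: List[PhrasePair]) -> Dict[PhrasePair, float]:
    """phi(e|f) = count(f, e) / count(f), i.e. relative frequency."""
    pair_counts = Counter(pairs)
    src_counts = Counter(f for f, _ in pairs)
    return {(f, e): c / src_counts[f] for (f, e), c in pair_counts.items()}


def average_entropy(probs: Dict[PhrasePair, float]) -> float:
    """Mean over source phrases of the entropy (in bits) of phi(.|f)."""
    by_source = defaultdict(list)
    for (f, _), p in probs.items():
        by_source[f].append(p)
    entropies = [-sum(p * math.log2(p) for p in ps) for ps in by_source.values()]
    return sum(entropies) / len(entropies)


# Invented extraction output: "casa" seen three times, "blanca" once.
extracted = [("casa", "house"), ("casa", "house"), ("casa", "home"),
             ("blanca", "white")]
probs = mle_phrase_probs(extracted)
avg_h = average_entropy(probs)
print(probs, round(avg_h, 3))
```

Here phi(house|casa) is 2/3 and phi(home|casa) is 1/3, so "casa" contributes about 0.918 bits while the unambiguous "blanca" contributes 0, giving an average of about 0.459 bits.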

Page 51:

Predicting TM characteristics

Setup
· Different alignments
· Resampling
· Tested on unseen Es-En, Ar-En, Ch-En

Methodology
· Get best model, using multivariate linear regression
· Report R2 on unseen data
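The methodology can be sketched as fitting an ordinary least squares model on training samples and reporting R2 on held-out samples. Everything below is a stand-in: the data are synthetic (generated exactly from y = 2 + 3*x1 - x2), the two predictors are hypothetical placeholders for alignment variables, and the pure-Python solver is only meant to keep the example dependency-free.

```python
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(rows: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Return [intercept, coef_1, ...] from the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(beta: Sequence[float], rows: Sequence[Sequence[float]]) -> List[float]:
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]


def r_squared(y: Sequence[float], y_hat: Sequence[float]) -> float:
    mean = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, y_hat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Synthetic data: y = 2 + 3*x1 - x2 (x1, x2 are hypothetical predictors).
train_x = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
train_y = [3, 7, 6, 11, 9]
test_x = [(6, 2), (7, 9)]
test_y = [18, 14]

beta = fit_ols(train_x, train_y)
r2_unseen = r_squared(test_y, predict(beta, test_x))
print([round(b, 3) for b in beta], round(r2_unseen, 3))
```

Because the synthetic data are noise-free, the fit recovers the generating coefficients and R2 on the unseen rows is 1; with real measurements the held-out R2 indicates how well the model generalizes.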

Page 52:

Variables

Alignment
· Quality: F-measure, Precision, Recall
· Structure, Density: Links, Gaps
· Structure, Distortion: Crossings, Rel. Distortion, Diagonality

Phrase-table (TM)
· Entries: Number of entries
· TM Features: Average entropy
· Alignment: Density, Distortion
· Phrase-length

Page 53:

New variables

Literature
· Alignment distortion: Rel. distortion, Crossings
· Translation model: Phrase Length, Entries

Ours
· Alignment distortion: Diagonality
· Translation model: Avg. feature entropy

Page 55:

Number of entries

PNE, with predictors: Target gaps, Link density, Diagonality

[Figure: R2 for train, Es, Ar and Ch; axis range 0.84 to 0.98.]

Page 56:

Phrasal Entropy (phi)

Inv. Phi (f|e): Source Gaps

Phi (e|f): Target gaps

Page 57:

Lexical Entropy (lex)

Inv Lex(f|e)

Diag.

Lex (e|f)

Diag.

Page 58:

Model summary

Gaps

Links TM Size

Source Phrase Length

Coverage

Phrasal Entropy

Use less phrases

Diag.

More TM sparsity

Lexical Entropy

More TM sparsity

Page 59:

Summary

Alignment structure has an impact on the translation model

Most of the translation model features can be predicted knowing the alignment characteristics
· Size of phrase-table
· Average length
· Phrasal feature entropy

Most relevant alignment characteristics
· Links
· Gaps
· Diagonality

Page 60:

This talk outline:

0 - Motivation

1 - Word Alignments and SMT

· Quality discrepancy
· Partial explanations
· Hypothesis and objectives

2 - Alignments and the translation model

3 - The translation model and translation quality

4 - Improving SMT using alignment structure

Conclusions

Page 61:

Are you still with me?

Page 62:

Predicting MT performance

Page 63:

Our goal

To investigate which features from our PT (with a special focus on alignment) and from the first-best translations help to predict translation quality scores (BLEU, METEOR, TER).

Build predictive multivariate regression models

Test robustness of our prediction models

Page 64:

Alignment structure vs. translation

No easy way to measure

Problem: the TM is very large!

[Diagram: input, decoder, TM, translation, alignment info]

Page 65:

Filter TM to input

Doc 1

Doc 2

Doc N

Translation tasks

Translation Model

TM2 TM2

Filtering

To analyze the available translation options at decoding time
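Filtering a translation model to a given input can be sketched as keeping only the entries whose source phrase occurs in the documents to be translated. The phrase-table layout below (a dictionary from source phrase to a list of target options) and the function names are hypothetical; real phrase tables carry feature scores and are usually filtered on disk.

```python
from typing import Dict, List, Set


def source_ngrams(tokens: List[str], max_len: int) -> Set[str]:
    """All contiguous word sequences of up to max_len words."""
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(tokens) - n + 1)
    }


def filter_phrase_table(table: Dict[str, List[str]], sentences: List[str],
                        max_len: int = 4) -> Dict[str, List[str]]:
    """Keep entries whose source side appears somewhere in the input sentences."""
    needed: Set[str] = set()
    for sentence in sentences:
        needed |= source_ngrams(sentence.split(), max_len)
    return {src: options for src, options in table.items() if src in needed}


# Invented toy table and a one-sentence "document".
table = {
    "la casa": ["the house"],
    "casa": ["house", "home"],
    "el perro": ["the dog"],
}
filtered = filter_phrase_table(table, ["la casa es blanca"])
print(sorted(filtered))
```

Only "la casa" and "casa" survive for this input, which is the set of translation options a decoder could actually use for that document.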

Page 66:

Experimental Setup

7 different types of alignments
· DWA{4,5,6,7}
· GROW-DIAG(-FINAL)(-AND)

Sampling from different document sets
· Created small translation tasks (100 sentences each)
· 24 for En-Es train
· 8 for En-Es test
· 4 different docs for Ar-En, Ch-En

Page 67:

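[Diagram: experimental pipeline with numbered steps 1 to 5. Elements shown: alignment types (DWA-4, GROW-DIAG); translation tasks (Doc 1, Doc 2, …, Doc N); filtering of translation models per task (GD1, GD2); translation producing first-best outputs (FB-GD1, FB-GD2); references (Doc 1, Doc 2, …, Doc N); evaluation (BLEU); FB measurements and PT measurements (FB var, PT var); MODEL.]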

Page 68:

Variables to measure

Group | Phrase Table variable | First Best Hypotheses
Phrase Table entries | PSU: Src Unique (%) |
Phrase Table entries | PTU: Tgt Unique (%) |
Phrase Table entries | PNE: Number of entries |
Alignment Density | PSG: Source Gaps | FSG: Source Gaps
Alignment Density | PTG: Target Gaps | FTG: Target Gaps
Alignment Density | PLK: Link density | FLK: Link density
Alignment Dimension | PSL: Source length | FSLP: Source length
Alignment Dimension | PTL: Target length | FTLP: Target length
Alignment Distortion | PCR: Crossings | FCR: Crossings
Alignment Distortion | PDG: Diagonality | FDG: Diagonality
Alignment Distortion | PDT: Relative distortion | FDT: Relative distortion
Translation Model Features | PT1: P(f|e) avg entropy | FT1: P(f|e) avg cost
Translation Model Features | PT2: lex(f|e) avg entropy | FT2: lex(f|e) avg cost
Translation Model Features | PT3: P(e|f) avg entropy | FT3: P(e|f) avg cost
Translation Model Features | PT4: lex(e|f) avg entropy | FT4: lex(e|f) avg cost
Language Model | | FLM: LM cost

Translation quality: BLEU (Bleu), MET (Meteor), TER (TER)

Page 69:

Modeling issues

Specification
· Many features to include. Which ones?

Estimation
· Dealing with a large number of features
· Feature reduction
· Feature selection
· Regularization

Stepwise regression

Page 70:

Methodology

Stepwise Regression (Search) (Hair et al., 2010)
· Start with an empty base model.
· Build a regression model with each one of the possible predictors.
· Add the most significant predictor to the base model.
· Start over with the remaining predictors.
· Continue until no other significant predictors can be added to the model.
· At any time: discard any predictor that becomes irrelevant after adding the latest predictor.
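The search procedure above can be sketched as follows. Note the simplification: a proper stepwise regression uses significance tests to add and drop predictors, whereas this sketch substitutes a fixed R2-gain threshold (`min_gain`, a hypothetical parameter) so that it stays dependency-free. The data are synthetic, with y = 1 + 2*gaps - 3*links and an unrelated "crossings" column; the column names are placeholders.

```python
from typing import Dict, List, Sequence, Tuple


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r2_of_fit(columns: Dict[str, Sequence[float]], y: Sequence[float],
              subset: List[str]) -> float:
    """R2 of an OLS fit of y on the named columns plus an intercept."""
    X = [[1.0] + [columns[name][i] for name in subset] for i in range(len(y))]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    y_hat = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, y_hat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def stepwise(columns: Dict[str, Sequence[float]], y: Sequence[float],
             min_gain: float = 0.01) -> Tuple[List[str], float]:
    """Forward selection with a backward check after each addition."""
    selected: List[str] = []
    current = 0.0  # R2 of the empty (intercept-only) model
    for _ in range(2 * len(columns)):  # guard against cycling
        candidates = [n for n in columns if n not in selected]
        if not candidates:
            break
        best = max(candidates, key=lambda n: r2_of_fit(columns, y, selected + [n]))
        best_r2 = r2_of_fit(columns, y, selected + [best])
        if best_r2 - current < min_gain:
            break  # no remaining predictor helps enough
        selected.append(best)
        current = best_r2
        for name in selected[:-1]:  # drop predictors that became irrelevant
            reduced = [n for n in selected if n != name]
            reduced_r2 = r2_of_fit(columns, y, reduced)
            if current - reduced_r2 < min_gain:
                selected, current = reduced, reduced_r2
    return selected, current


columns = {
    "gaps": [1, 2, 3, 4, 5, 6, 7, 8],
    "links": [2, 1, 4, 3, 6, 5, 8, 7],
    "crossings": [5, 3, 8, 1, 7, 2, 6, 4],
}
y = [1 + 2 * g - 3 * l for g, l in zip(columns["gaps"], columns["links"])]
chosen, final_r2 = stepwise(columns, y)
print(chosen, round(final_r2, 3))
```

On this synthetic data the search keeps "links" and "gaps" and leaves out "crossings", since adding it does not improve the fit.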

Page 71:

RESULTS

General models: BLEU, METEOR, TER

Targeted for easy translation tasks: BLEU, METEOR, TER

Targeted for hard translation tasks: BLEU, METEOR, TER

Page 72:

BLEU

High determination coefficient (>50%) even for unseen data.

Helpful features by language pair:
· Ar-En: PT4, PT2, FLM
· Ch-En: PT4, PNE, PT2, FLM
· Es-En: (all)

[Diagram: BLEU, with predictors: Entr. Inv. Lexical features, Entr. Lexical features, TM Size, Hyp. Target gaps, Hyp. LM Cost]

Page 73:

Other metrics

METEOR

Similar to BLEU

Lower percentage of variance explained (~40%)

TER

Simpler model (no FLM)

Inverse coefficients

Harder to predict (~30%)

Page 74:

Summary

Predictive models for translation are language dependent

General models => rely heavily on the translation model
· Entries
· Lexical entropy

Targeted models => rely on hypothesis characteristics
· Language model
· Translation costs

Target gaps = bad translations

Page 75:

Summary

TM size

Lexical entropy

Translation Quality

Hyp. Target gaps

Inv. Lexical entropy

LM cost

Gaps

Links

Diag.

Page 76:

Controlling the effects of structure

TM size

Translation Quality

Hyp. Target gaps

Gaps

Links

Page 77:

This talk outline:

0 - Motivation

1 - Word Alignments and SMT

· Quality discrepancy
· Partial explanations
· Hypothesis and objectives

2 - Alignments and the translation model

3 - The translation model and translation quality

4 - Improving SMT using alignment structure

Conclusions


Improving Machine Translation Using Word Alignment Structure


At two different stages

Preprocessing

Word Alignments

Phrase Extraction

Translation Model

At the translation model stage: Create new features that incorporate alignment gaps

At the alignment training stage: Include more alignment gaps in the training metrics


Alignment Metrics

Traditional metrics focus on “positive links”: Precision, Recall, F-measure

Our approach: Focus on positive null links. Focus on positive and negative links.


F0: Including gaps

We incorporate gaps (null alignments) into the computation of the F-measure
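The slide does not show the F0 formula, so the following is only a minimal sketch of one way to fold gaps into an F-measure. It assumes that every unaligned source or target word is counted as an explicit link to a NULL token (represented as `None`), and that precision and recall are then computed over links and NULL links together. The function name and the zero-based index convention are hypothetical.

```python
def f_measure_with_gaps(hyp_links, ref_links, src_len, tgt_len):
    """F-measure over alignment links, counting unaligned words as NULL links.

    Links are (source_index, target_index) pairs with zero-based indices.
    Assumption: a word with no link contributes one link to NULL (None).
    """
    def with_nulls(links):
        links = set(links)
        aligned_src = {s for s, _ in links}
        aligned_tgt = {t for _, t in links}
        nulls = {(s, None) for s in range(src_len) if s not in aligned_src}
        nulls |= {(None, t) for t in range(tgt_len) if t not in aligned_tgt}
        return links | nulls

    hyp, ref = with_nulls(hyp_links), with_nulls(ref_links)
    correct = len(hyp & ref)
    if correct == 0:
        return 0.0
    precision = correct / len(hyp)
    recall = correct / len(ref)
    return 2 * precision * recall / (precision + recall)


reference = {(0, 0), (1, 1)}            # third word unaligned on both sides
hypothesis = {(0, 0), (1, 1), (2, 2)}   # aligner linked the gap words
score = f_measure_with_gaps(hypothesis, reference, src_len=3, tgt_len=3)
```

Under this assumption, a hypothesis that links words the reference leaves unaligned is penalized twice: once for the spurious link and once for the missed gaps.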


Balanced Accuracy

Take into account the ‘true’ negatives

Balance between precision and specificity
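As a rough illustration, balanced accuracy can be computed over the full grid of possible source-target links, so that correctly omitted links (true negatives) count. The formula is not shown on the slide; this sketch assumes the common definition, the mean of sensitivity (recall) and specificity, which may differ from the exact variant used in the dissertation. The function name is hypothetical.

```python
def balanced_accuracy(hyp_links, ref_links, src_len, tgt_len):
    """Balanced accuracy over all src_len x tgt_len candidate links.

    Assumption: BA = (sensitivity + specificity) / 2, the usual definition.
    """
    hyp, ref = set(hyp_links), set(ref_links)
    total = src_len * tgt_len
    tp = len(hyp & ref)           # links in both alignments
    fp = len(hyp - ref)           # links only in the hypothesis
    fn = len(ref - hyp)           # links only in the reference
    tn = total - tp - fp - fn     # cells left empty by both
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return (sensitivity + specificity) / 2


reference = {(0, 0), (1, 1)}
hypothesis = {(0, 0), (1, 1), (2, 2)}
ba = balanced_accuracy(hypothesis, reference, src_len=3, tgt_len=3)
```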


Alignment Experimental Results

Tuning: 200 training examples (Spanish-English)

Test: 220 alignments (Spanish-English)

Systems: Baseline (DWA-F), DWA-F0, DWA-BA

[Chart: Alignment evaluation on test data. Test results (P, F, F0, BA; axis 60% to 90%) by training metric (F, F0, BA)]


Alignment Structure

[Charts by system (Bacc, F0, F, HA): Target Gaps (avg), ATG, axis 0.000 to 0.060; Links (avg), ALK, axis 0.000 to 1.600; Model Size (millions of entries), PNE (M), axis 0 to 25]


Translation Results

[Chart: BLEU on NC07 and NC08 for systems DWA-F, DWA-5BA and DWA-F0; axis 35.0% to 37.0%]


New metrics summary

Different tunings provide different results

Tuning towards F0 provides the best results: most precise alignment, larger phrase table, best quality

Tuning towards BA provides the most human-like structure: more compact alignments, fewer translation options


Alignment structure in translation

Target gaps were an indicator of bad quality.

Can we improve translation using that information?


Target Gap Feature

Include gap counts as a feature in the translation model
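The slide does not spell out how the gap count is computed. A minimal sketch, assuming a target gap is a target word inside a phrase pair that has no alignment link to any source word, and that the count is stored as one extra score per phrase-table entry. The function name and the example phrase pair are hypothetical.

```python
def count_target_gaps(target_phrase, links):
    """Number of target words in a phrase pair with no alignment link.

    links: (source_index, target_index) pairs local to the phrase pair.
    """
    aligned_targets = {t for _, t in links}
    return sum(1 for j in range(len(target_phrase)) if j not in aligned_targets)


# Hypothetical phrase pair: "salsa especial" -> "the special sauce",
# where "the" is left unaligned.
target = ["the", "special", "sauce"]
links = [(0, 2), (1, 1)]
gaps = count_target_gaps(target, links)
```

In a log-linear phrase-based model, such a count would receive its own weight during tuning, letting the decoder learn how strongly to penalize (or reward) phrase pairs with unaligned target words.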


Experimental Setup

7 systems: 4 based on discriminative alignments (DWA-4, DWA-5, DWA-6, DWA-7); 3 based on heuristic symmetrization (GD, GDF, GDFA)

Training (Spanish-English): Europarl, UN, News Commentary (WMT10 train). About 8 million training sentences.


Experimental Setup

7 different test sets (1 ref): 4 in-domain: Europarl Proceedings (WMT06, WMT07, WMT08), Acquis Communautaire (AC); 3 out-of-domain: news-based (NW09, NW10, SC09)

Two settings: Baseline (canonical features); + target gap feature

Total: 15623 sentences (test time approx. 6 hr decoding per system, multithreaded)


General results


Translation gains by task

Not so beneficial for out of domain

Larger improvements for in-domain data


Translation gains by system

Fewer gaps

More gaps


T-gap feature summary

Improves translation estimation: in most cases, the best translations come from system + gaps; originally a liability, gaps are turned into an advantage

Target gap feature is useful: when dealing with in-domain data; when we have more target gaps


Using limited training data

Use source and target gap features

Translate Ch-En data.

Different systems: DWA (0.1-0.9), SYM

Test sets (4 refs): News, Web blogs

Conditions: Baseline; System + gap features


News Test

[Chart (NewsWire): BLEU (axis 21 to 25.5) for systems DWA-0.1 through DWA-0.9 and SYM; series: Baseline, Unalign-Feat]

Gap features yield best results


Web blogs

[Chart (Web): BLEU (axis 18 to 23) for systems DWA-0.1 through DWA-0.9 and SYM; series: Baseline, Unalign-Feat]

Larger improvements: 2 BP


T-gap + S-gap summary

Using both gaps can improve translation

Very useful in limited training

Chinese-English task: up to 2 BP of improvement


This talk outline:

0- Motivation

1- Word Alignments and SMT

· Quality discrepancy
· Partial explanations
· Hypothesis and objectives

2- Alignments and the translation model

3- The translation model and translation quality.

4- Improving SMT using alignment structure

Conclusions


Conclusions


Revisiting hypothesis

Alignment structure has a large impact on how a translation model is estimated. Hence, it should also have a large impact on Machine Translation performance. Thus, by controlling the impact of alignment structure we will be able to improve Machine Translation performance.


Break apart

Alignment structure has a large impact on how a translation model is estimated. Yes: many features of the translation model can be determined knowing the alignment.

Hence, it should also have a large impact on Machine Translation performance. Yes: the size of the translation model is a large contributor to quality. So are target gaps.

By controlling the impact of alignment structure we will be able to improve Machine Translation performance. Yes: at two stages, alignment training and TM features.


Objectives => Contributions

Analyze the impact of alignment structure at different stages of the training pipeline

Provide models that measure the impact of alignment structure in phrase-based translation model estimation

Provide a model that measures the impact of alignment structure and the translation model on translation quality

Use alignment structure to improve alignment training and translation modeling, and thereby machine translation performance


Future Work

Couple new alignment metrics + new decoding features: interactions

Explore use of alignment distortion (e.g. diagonality) as decoding features

Explore other model specification alternatives

Use hierarchical models to model dependencies between AL => TM => TQ


END


I took the sea bass and fried it with the special sauce

Tomé la lubina y la freí con la salsa especial

http://www.1-800-translate.com/machine_trans
http://www.ackuna.com/badtranslator
http://translationparty.com/


Stop criteria

Suggested conservative thresholds (Hair, 2010)

Partial F-statistics: P-value to enter: 0.01; P-value to remove: 0.05

Yield most compact models

To guard against spurious effects (capitalization on chance) and high collinearity, we repeated the procedure blocking the original variables and checked the result on a Spanish CV set.


$F_{[N-1,\,N-n-1]} = \dfrac{(1-R^2)(N-p-1)}{(1-R'^2)(N-p-2)}$


BLEU: Translation Quality

Bilingual Evaluation Understudy.

Widely used.

Ranges from 0 to 1.

Compares n-grams from the candidate translations with the reference translations.

Precision oriented.

Brevity penalty (to avoid too short translations).

Ranges of acceptable BLEU differ depending on the task.
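The points above can be illustrated with a simplified sentence-level sketch: clipped n-gram precision (n = 1 to 4) combined by geometric mean and multiplied by a brevity penalty. It assumes a single reference and no smoothing, whereas BLEU as normally reported is computed over a whole test corpus and may use several references. The function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified BLEU for one candidate and one reference (token lists)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: punish candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "a b c d e".split()
short_candidate = "a b c d".split()
score = bleu(short_candidate, reference)
```

Here every n-gram of the short candidate appears in the reference, so all precisions are 1 and the score is determined by the brevity penalty alone.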
