The Effect of Pseudo Relevance Feedback on MT-Based CLIR Yan Qu, Alla N. Eilerman Hongming Jin,...

The Effect of Pseudo Relevance Feedback on

MT-Based CLIR

Yan Qu, Alla N. Eilerman

Hongming Jin, David A. Evans

CLARITECH Corporation

The Effect of Pseudo Relevance Feedback on MT-Based CLIR © 2000, CLARITECH Corporation, April 12, 2000

Outline

• Our approach to Cross-Language Information Retrieval (CLIR)

• Objectives of this work

• Review of previous work with Pseudo Relevance Feedback (PRF)

• System diagram

• Data for experiments

• Error analysis of MT-based query translation

• The effect of PRF on French monolingual retrieval

• The effect of PRF on English-to-French cross-language retrieval

• Summary and conclusions


Our Approach to CLIR

• Used MT-based query translation to bridge the language gap

• Adapted pseudo relevance feedback to CLIR
– pre-translation query expansion
– post-translation query expansion
– combined (pre- and post-translation) query expansion


Objectives

• Identify factors that affect the quality of MT-based query translation

• Evaluate the effectiveness of using pseudo relevance feedback for improving CLIR performance

• Identify contexts for selecting these feedback methods


Relevance Feedback in Monolingual Retrieval

• Relevance feedback (Salton & Buckley, 1990; Evans et al., 1999)

• Pseudo relevance feedback (PRF) (Evans & Lefferts, 1994; Milic-Frayling et al., 1998)

• Both have been demonstrated to be effective in improving retrieval performance


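Pseudo Relevance Feedback in CLIR

Query expansion methods: PRE-TRANSLATION, POST-TRANSLATION, COMBINED

Translation resource     Previous work
Parallel corpora         Carbonell et al., 1997
Bilingual dictionaries   Ballesteros & Croft, 1998
MT system                ??? (pre-translation), ??? (post-translation), ??? (combined)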


CLIR with Simple MT-based Query Translation

Queries in SL → MT → Queries in TL → Retrieval → Ranked list from TL Database


CLIR with Query Expansion Before MT

Queries in SL → Retrieval & Thesaurus Extraction → Thesaurus Terms in SL → MT → Queries in TL & thesaurus terms in TL → Retrieval


CLIR with Query Expansion After MT

Queries in SL → MT → Queries in TL → Retrieval & Thesaurus Extraction → Thesaurus Terms in TL → Retrieval


Process Summary

[Diagram: the query paths combined. Components shown: Queries in SL, MT, Queries in TL, Retrieval, Ranked list from TL database, Thesaurus terms in SL, and Queries in TL & thesaurus terms in TL.]
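The three query paths can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the CLARIT implementation: `translate`, `retrieve_sl`, and `retrieve_tl` are hypothetical callables standing in for the MT system and the source- and target-language retrieval engines, and `expand` uses a plain frequency count in place of CLARIT thesaurus extraction.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence

Query = List[str]
Translate = Callable[[Query], Query]
# A retrieval function returns ranked documents, each given as a list of terms.
Retrieve = Callable[[Query], Sequence[Sequence[str]]]


def expand(query: Query, ranked_docs: Sequence[Sequence[str]],
           top_docs: int = 5, top_terms: int = 3) -> Query:
    """Pseudo relevance feedback: treat the top-ranked documents as relevant
    and append their most frequent terms that are not already in the query.
    (Frequency counting is an assumption standing in for thesaurus extraction.)"""
    counts = Counter()
    for doc in ranked_docs[:top_docs]:
        counts.update(t for t in doc if t not in query)
    return query + [t for t, _ in counts.most_common(top_terms)]


def build_tl_query(query_sl: Query, translate: Translate,
                   retrieve_sl: Optional[Retrieve] = None,
                   retrieve_tl: Optional[Retrieve] = None,
                   top_terms: int = 3) -> Query:
    """Return the target-language query.

    retrieve_sl only  -> pre-translation expansion
    retrieve_tl only  -> post-translation expansion
    both              -> combined expansion
    neither           -> simple MT-based query translation
    """
    query = list(query_sl)
    if retrieve_sl is not None:   # expand in the source language before MT
        query = expand(query, retrieve_sl(query), top_terms=top_terms)
    query = translate(query)      # MT is treated as a black box
    if retrieve_tl is not None:   # expand in the target language after MT
        query = expand(query, retrieve_tl(query), top_terms=top_terms)
    return query
```

The final ranked list would then come from one more call to the target-language retrieval function on the returned query.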


CLARIT English NLP

• Used for processing the English corpus and the English queries

• Consists of a parser and morphological analyzer

• Uses an English lexicon and grammar to identify linguistic structures in texts

• Supplemented by a “stop word” list to filter out substantive words that are extraneous to the topics (e.g., document, relevant)


French Text Processing (Pseudo-NLP Approach)

• Goal: to obtain mostly correct phrase segmentation

• Manually constructed resources
– lexicon of closed-class categories with 1081 entries
– “stop word” lexicon including 525 words and their inflected forms that are extraneous to the topics (e.g., document, pertinent)
– grammar based on the CLARIT English grammar and adapted to accommodate French categories
– no French morphological normalization


English-to-French Translation

• SYSTRAN Enterprise translation software

• Translation direction: English to French

• Client-server architecture

• Translation is a black box to our system

• No special or additional resources were used to supplement the translation process


Data Sources for Experiments

• TREC-6 CLIR track data collections provided by NIST (Voorhees & Harman, 1998)
– 250 MB collection of French SDA news (1988-1990) from the Swiss News Agency: 141,656 documents
– 750 MB collection of English AP news (1988-1990) from the Associated Press: 242,918 documents


Topics for Experiments

• TREC-6 CLIR track topics provided by NIST (Voorhees & Harman, 1998)
– 22 English topics for the English-to-French cross-language runs
– 22 French topics for the French monolingual runs
– Equivalent across languages
– Prepared by humans
– Composed of the title, description, and narrative fields


A Sample English Topic

<num> Number: CL1

<E-title> Waldheim Affair

<E-desc> Description:

Reasons for controversy surrounding Waldheim's World War II actions.

<E-narr> Narrative:

Revelations about Austrian President Kurt Waldheim’s participation in Nazi crimes during World War II are argued on both sides. Relevant documents are those that express doubts about the truth of these revelations. Documents that just discuss the affair are not relevant.


An Ideal French Topic

<num> Number: CL1

<F-title> Affaire Waldheim

<F-desc> Description:

Raisons de la controverse à l'égard des agissements de Waldheim pendant la deuxième guerre mondiale.

<F-narr> Narrative:

Les révélations sur la participation du président autrichien Kurt Waldheim aux crimes nazis pendant la deuxième guerre mondiale font l'objet de controverses. Les documents pertinents font état de doutes sur la culpabilité de Waldheim. Les articles qui ne font que mentionner l'affaire ne sont pas valables.


CLARIT Queries

• Composed of the title, description, and narrative fields

• Processed automatically into query vectors


Sample English Query Vector

<cf="1" tf="1">waldheim affair</>
<cf="1" tf="1">waldheim world war ii</>
<cf="1" tf="1">nazi crime</>
<cf="1" tf="1">austrian president kurt waldheim</>
<cf="1" tf="1">austrian president</>
<cf="1" tf="1">controversy surround</>
<cf="1" tf="1">president kurt waldheim</>
<cf="1" tf="1">kurt waldheim</>
<cf="1" tf="3">waldheim</>
<cf="1" tf="1">kurt</>
<cf="1" tf="2">revelation</>
<cf="1" tf="1">austrian</>
<cf="1" tf="1">participation</>
<cf="1" tf="1">surround</>
<cf="1" tf="1">truth</>


Sample French Query Vector

<cf="1" tf="1">crimes nazis</>
<cf="1" tf="1">affaire waldheim</>
<cf="1" tf="1">président autrichien kurt waldheim</>
<cf="1" tf="1">président autrichien</>
<cf="1" tf="1">controverses</>
<cf="1" tf="1">agissements</>
<cf="1" tf="1">kurt waldheim</>
<cf="1" tf="1">culpabilité</>
<cf="1" tf="4">waldheim</>
<cf="1" tf="2">deuxième guerre mondiale</>
<cf="1" tf="2">deuxième guerre</>
<cf="1" tf="1">doutes</>
<cf="1" tf="1">révélations</>
<cf="1" tf="1">nazis</>
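The query vector notation shown on these slides is regular enough to read back mechanically. The sketch below is an illustration only: `parse_query_vector` is a hypothetical helper, it assumes every entry has exactly the `<cf="…" tf="…">term</>` shape seen above, and it keeps `cf` and `tf` as plain integers because the slides do not define them.

```python
import re
from typing import List, Tuple

# One entry looks like: <cf="1" tf="3">waldheim</>
ENTRY_RE = re.compile(r'<cf="(\d+)" tf="(\d+)">(.*?)</>')


def parse_query_vector(text: str) -> List[Tuple[str, int, int]]:
    """Return (term, cf, tf) triples in the order they appear in the text."""
    return [(term, int(cf), int(tf)) for cf, tf, term in ENTRY_RE.findall(text)]


sample = '<cf="1" tf="3">waldheim</><cf="1" tf="1">kurt waldheim</>'
for term, cf, tf in parse_query_vector(sample):
    print(term, cf, tf)
```

Multiword terms such as "kurt waldheim" come back as a single string, which matches how the vectors treat phrases as units.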


Topic and Query Statistics

                  English Topics   Ideal French Topics
Avg. # of Words   43               51.3
Avg. # of Terms   17.3             15.4


Evaluation

• Relevance judgments on the French SDA news, prepared by NIST judges (TREC-6)

• Evaluation measures:
– eleven-point average precision (N=1000 documents)
– precision at low recall levels (10, 20, and 100 documents)
– recall
– exact precision
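The measures listed above can be written out as a short Python sketch. Two assumptions are made here and are not stated on the slides: eleven-point average precision is taken to be the mean of interpolated precision at recall levels 0.0, 0.1, ..., 1.0, and "exact precision" is read as R-precision (precision after R documents, where R is the number of relevant documents), following the label used in trec_eval output. Function names are illustrative.

```python
from typing import List, Set


def precision_at(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k retrieved documents that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k


def recall(ranked: List[str], relevant: Set[str]) -> float:
    """Fraction of all relevant documents that appear in the ranked list."""
    return sum(1 for d in ranked if d in relevant) / len(relevant)


def r_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Precision after R documents, R = number of relevant documents (assumed
    meaning of 'exact precision')."""
    return precision_at(ranked, relevant, len(relevant))


def eleven_point_avg_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of interpolated precision at recall 0.0, 0.1, ..., 1.0."""
    points = []  # (recall, precision) after each relevant document
    hits = 0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    total = 0.0
    for i in range(11):
        level = i / 10
        # Interpolated precision: best precision at any recall >= this level.
        total += max((p for r, p in points if r >= level), default=0.0)
    return total / 11
```

With a ranked list of up to 1000 documents per topic, the per-topic values would be averaged over the 22 topics to give figures comparable to the tables in this deck.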


English-to-French Retrieval vs. French Monolingual Retrieval (without PRF)

                        F-nf (baseline)   EF-nf    Percentage of baseline
RelRetDocCount          1006              845      84%
Recall                  0.7306            0.6137   84%
Average Precision       0.2548            0.1862   73%
Precision at 10 Docs    0.3727            0.3136   84%
Precision at 20 Docs    0.3386            0.2818   83%
Precision at 100 Docs   0.1909            0.1618   85%
Exact Precision         0.3143            0.2213   70%


Types of Translation Errors

• E1: missing translation of an English term

• E2: unnecessary translation of a borrowed English term

• E3: wrong sense disambiguation

• E4: wrong sense disambiguation caused by removed capitalization

• E5: word-by-word translation of a multiword (idiomatic) term

• E6: wrong phrase construction

• E7: broken phrase


Error Type 1: Missing Translation

English: agencies’

Ideal French translation: (des) agences

MT output: (d’)agencies


Error Type 2: Unnecessary Translation

English: fast food

Ideal French translation: fast food

MT output: aliments de préparation rapide (food of fast preparation)


Error Type 3: Wrong Sense Disambiguation

English: logging

Ideal French translation: déforestation (deforestation)

MT output: notation (notation)


Error Type 4: Wrong Disambiguation Caused by Removed Capitalization

English: aids (AIDS)

Ideal French translation: sida (SIDA “AIDS”)

MT output: aides (assistants)


Error Type 5: Word-by-Word Translation of a Multiword Idiomatic Term

English: death penalty

Ideal French translation: la peine de mort

MT output: la pénalité de la mort


Error Type 6: Wrong Phrase Construction

English: austrian president kurt waldheim’s participation

Ideal French translation: la participation du président autrichien kurt waldheim

MT output: la participation autrichienne de waldheim de kurt de président


Error Type 7: Broken Phrase

English: sex education

Ideal French translation: éducation sexuelle

MT output: éducation de sexe


Error Distributions

[Bar chart: frequency (axis 0 to 25) of each error type, E1 through E7. Callouts label wrong sense disambiguation, word-by-word translation, wrong phrase construction, and broken phrases.]


The Effect of PRF on French Monolingual Retrieval

                        F-nf (baseline)   F-prf    Increase
RelRetDocCount          1006              1147     14%
Recall                  0.7306            0.8330   14%
Average Precision       0.2548            0.2968   16%
Precision at 10 Docs    0.3727            0.4273   15%
Precision at 20 Docs    0.3386            0.3523   4%
Precision at 100 Docs   0.1909            0.2236   17%
Exact Precision         0.3143            0.334    6%


The Effect of PRF on English-to-French Retrieval

                        EF-nf (baseline)   EF-pf-pre   Increase   EF-pf-post   Increase   EF-pf-comb   Increase
RelRetDocCount          845                1010        19.5%      1010         19.5%      1047         23.9%
Recall                  0.6137             0.7335      19.5%      0.7335       19.5%      0.7603       23.9%
Average Precision       0.1862             0.2099      12.7%      0.2392       28.5%      0.2176       16.9%
Precision at 10 Docs    0.3136             0.3455      10.2%      0.3409       8.7%       0.3455       10.2%
Precision at 20 Docs    0.2818             0.2977      5.6%       0.3023       7.3%       0.3045       8.1%
Precision at 100 Docs   0.1618             0.1864      15.2%      0.1973       21.9%      0.1864       15.2%
Exact Precision         0.2213             0.2552      15.3%      0.2582       16.7%      0.2617       18.3%
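The "Increase" and "Percentage of baseline" columns in these tables follow from simple ratios. The sketch below shows the arithmetic, using average precision values copied from the tables; the function names are illustrative and the rounding is assumed to match the slides.

```python
def relative_increase(baseline: float, value: float) -> float:
    """Percentage change of value relative to baseline."""
    return 100.0 * (value - baseline) / baseline


def percent_of_baseline(baseline: float, value: float) -> float:
    """Value expressed as a percentage of baseline."""
    return 100.0 * value / baseline


# Average precision: EF-nf = 0.1862, EF-pf-post = 0.2392, F-prf = 0.2968
print(round(relative_increase(0.1862, 0.2392), 1))   # post-translation PRF vs. no feedback
print(round(percent_of_baseline(0.2968, 0.2392)))    # cross-language vs. monolingual, both with PRF
```

The same two functions reproduce the other rows when given the corresponding baseline and run values.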


English-to-French Retrieval vs. French Monolingual Retrieval (with PRF)

                        F-prf (baseline)   EF-prf-pre   Percentage of baseline   EF-prf-post   Percentage of baseline   EF-prf-comb   Percentage of baseline
RelRetDocCount          1147               1010         88%                      1010          88%                      1047          91%
Recall                  0.8330             0.7335       88%                      0.7335        88%                      0.7603        91%
Average Precision       0.2968             0.2099       71%                      0.2392        81%                      0.2176        73%
Precision at 10 Docs    0.4273             0.3455       81%                      0.3409        80%                      0.3455        81%
Precision at 20 Docs    0.3523             0.2977       85%                      0.3023        86%                      0.3045        86%
Precision at 100 Docs   0.2236             0.1864       83%                      0.1973        88%                      0.1864        83%
Exact Precision         0.334              0.2552       76%                      0.2582        77%                      0.2617        78%


Cross-Language Retrieval vs. Monolingual Retrieval

[Bar chart: average recall, average precision, exact precision, and precision at 100 docs (axis 0.0 to 0.9) for the runs F-nf, F-pf, EF-nf, EF-pf-pre, EF-pf-post, and EF-pf-comb.]


Performance of Different PRF Methods

[Chart: per-query values (axis -0.2000 to 0.3500) for pf-pre, pf-post, and pf-comb. Callouts mark Topic 1009 (Effects of logging) and Topic 1016 (Tuberculosis).]


Topic 1009 “Effects of Logging”

• Key concept lost due to wrong sense disambiguation (E3 error): logging (felling trees) → notation (notation)

• Pre-translation feedback
– neutralized the effect of the translation error by bringing useful thesaurus terms (tropical forest, tree, earth, sea, ocean, land, atmosphere, carbon dioxide, ozone depletion, greenhouse effect, global warming, destruction, pollution, damage, environment, environmentalist, conference, organization, world, nation, country)
Result: 688% increase in average precision

• Post-translation feedback
– returned some useful terms
– introduced noise caused by the wrong translation of logging

• Combined feedback
– created a strong base query prior to translation
– further improved it with appropriate terms after translation
– avoided too much noise



Topic 1016 “Tuberculosis”

• Key term is translated correctly: tuberculosis → tuberculose

• Translation errors affected some important terms:
– aids (AIDS) → aides (assistants)
– third-world (countries) → le troisième-monde (the third world)

• Pre-translation and combined feedback created additional sources of errors and noise by introducing
– ambiguous thesaurus terms (cases, tests), which were mistranslated (caisse instead of cas, essai instead of test)
– acronyms (AIDS, CDC, HIV), either mistranslated or not translated
Result: 29-30% decrease in average precision

• Post-translation feedback compensated for translation errors by bringing
– correct terms (SIDA “AIDS”, tiers monde “third world”)
– additional useful terms (bacille, tuberculeux, virus, infectées, maladie, risque, santé, problème, etc.)
Result: 32% increase in average precision


Performance of Different PRF Methods

PRE-TRANSLATION   POST-TRANSLATION   COMBINED   NUMBER OF
FEEDBACK          FEEDBACK           FEEDBACK   QUERIES
+                 +                  +          9
+                 –                  +          2
+                 –                  –          1
–                 +                  +          1
–                 +                  –          8
–                 –                  –          1
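The outcome patterns in this table can be tallied per method. The sketch below assumes that "+" means the method improved on the no-feedback run for that query and "–" means it did not; the slide does not define the signs, so this reading is an assumption. ASCII "-" is used in the code in place of "–".

```python
# (pre-translation, post-translation, combined, number of queries),
# copied from the table; "+"/"-" interpretation is an assumption.
OUTCOMES = [
    ("+", "+", "+", 9),
    ("+", "-", "+", 2),
    ("+", "-", "-", 1),
    ("-", "+", "+", 1),
    ("-", "+", "-", 8),
    ("-", "-", "-", 1),
]


def queries_helped(outcomes):
    """Count, for each feedback method, the queries marked '+'."""
    totals = {"pre": 0, "post": 0, "combined": 0}
    for pre, post, comb, n in outcomes:
        for name, sign in (("pre", pre), ("post", post), ("combined", comb)):
            if sign == "+":
                totals[name] += n
    return totals


print(queries_helped(OUTCOMES))
print(sum(n for *_, n in OUTCOMES))  # total number of queries in the table
```

Under that reading, the tally gives a per-method count out of the 22 topics, which is one way to compare the three methods at the query level rather than by averaged scores.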


Decision Tree for Selecting PRF Methods

Most key terms translated correctly?

• Yes: All three methods behave similarly; generally improve retrieval performance

• No: Loss of meaning compensated by context terms?
– Yes: Post-MT feedback is generally better than pre-MT and combined feedback
– No: Pre-MT and combined feedback are generally better than post-MT feedback
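The decision tree can be written as a small function. This is a sketch under two assumptions: the two yes/no judgments are supplied by the caller (the slides do not say how they would be measured), and the leaves are assigned to the Yes and No branches in the left-to-right order in which they appear on the slide. The function name and return strings are illustrative.

```python
def suggest_prf_method(key_terms_translated_correctly: bool,
                       context_compensates_for_loss: bool = False) -> str:
    """Map the two decision-tree questions to a feedback recommendation."""
    if key_terms_translated_correctly:
        # All three methods behave similarly and generally help.
        return "any"
    if context_compensates_for_loss:
        # The translated query still retrieves usable documents to expand from.
        return "post-MT"
    # Key meaning is lost in translation, so expand before translating.
    return "pre-MT or combined"


print(suggest_prf_method(True))
print(suggest_prf_method(False, context_compensates_for_loss=True))
print(suggest_prf_method(False, context_compensates_for_loss=False))
```

Under this reading, Topic 1016 (AIDS mistranslated, but tuberculose intact) would fall under "post-MT", and Topic 1009 (logging mistranslated) under "pre-MT or combined".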


Summary

• Adopted pseudo relevance feedback for query expansion in CLIR with MT-based query translation

• Conducted analysis of translation errors

• Evaluated empirically the effect of three feedback methods on retrieval performance

• Examined contexts where different feedback methods are effective


Conclusions

• Wrong sense disambiguation and inappropriate translation of multi-word terms are the most frequent translation errors when using MT.

• All feedback methods demonstrated significant performance improvement in CLIR compared with not using feedback.

• The use of PRF in general helps to reduce the negative effect of translation errors.

• Post-translation feedback generally outperforms pre-translation and combined feedback.

• The effectiveness of different feedback methods depends on the types of translation errors and the relative importance of the terms affected by these errors.


Future Work

• Investigate the effect of query length

• Investigate the effect of context

• Develop measures to evaluate the original query quality

• Develop measures to evaluate the translated query quality

• Investigate the empirical conditions for selecting different feedback methods


The End
