Development of an Intelligent Translation Memory

Page 1: Development of an Intelligent Translation Memory

Development of an Intelligent Translation Memory

MorphoLogic http://www.morphologic.hu

SZAK Publishers http://www.szak.hu

Balázs Kis ([email protected])

IKTA5-146/2002

Rome, 21 May 2003

Page 2: Development of an Intelligent Translation Memory

Project Details

Duration: 3 March 2003 – 25 February 2005

Budget: total 96.8 M HUF [387 200 €]; funding 57.1 M HUF [228 400 €]

Consortium: MorphoLogic Ltd. (84 %), SZAK Publishers Ltd. (16 %)

Project leader: dr. Gábor Prószéky

Page 3: Development of an Intelligent Translation Memory

The Problem and Its Impact (1.)

Current state-of-the-art translation memories:
  store previously translated segments and their translations
  offer look-up for similar source segments
  are backed by character-based fuzzy indexes

Advantage: this approach is language independent, and inexpensive to develop and support
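
The character-based look-up described on this slide can be illustrated with a minimal sketch. It assumes a plain Levenshtein edit distance normalized by the length of the longer segment; actual TM products use their own indexes and scoring formulas. The names `levenshtein`, `similarity`, `lookup` and `MEMORY` are hypothetical and chosen for illustration only. The 0.7 default mirrors the 70% threshold mentioned on the sample-match slide, and the stored pair is taken from that slide.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (dynamic programming, two rows)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed fuzzy score: 1.0 for identical strings, lower as edits grow."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def lookup(memory, source, threshold=0.7):
    """Return (score, stored_source, stored_translation) hits, best first."""
    hits = [(similarity(source, s), s, t) for s, t in memory]
    return sorted((h for h in hits if h[0] >= threshold), reverse=True)


# Toy in-memory TM holding a single segment pair (from the sample-match slide).
MEMORY = [
    ("FrontPage opens the current page in Page view.",
     "A FrontPage az aktuális oldalt a Page nézetben nyitja meg."),
]

# A near-duplicate query (only the full stop is missing) is found ...
near_hits = lookup(MEMORY, "FrontPage opens the current page in Page view")
# ... while a structurally parallel sentence with different words is not.
far_hits = lookup(MEMORY, "Word opens the second file in Print Layout view.")
```

Nothing in this scoring depends on the language of the segments, which is the advantage the slide points out; the same property is why it cannot see that two sentences share a grammatical structure.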

Page 4: Development of an Intelligent Translation Memory

The Problem and Its Impact (2.)

Disadvantages of current TM technologies:
  they ignore relationships between syntactic structures, therefore
  long segments or those with similar meaning or syntactic structure often stay hidden, so
  many segments included in the translation memory are simply lost

Page 5: Development of an Intelligent Translation Memory

Before the project started...

MorphoLogic had at hand:
  Human Language Technology modules from morphology to every level of syntactic parsing
  a localisation department with very specific technological needs (still pending)

SZAK Publishers had at hand:
  many years' experience with translation and terminology
  a parallel corpus of technical texts of approx. 1.5 million words (under processing for project needs)

Page 6: Development of an Intelligent Translation Memory

Main Objective

Development of a Translation Memory equipped with Linguistic Intelligence:
  finding source segments based on their grammatical similarity;
  making changes to stored translations according to the current source segment

Long-term objective: an improvement in the quality of translations and a decrease in the translation effort (time)

Page 7: Development of an Intelligent Translation Memory

Project Constraints

An important remark: this will be a language-dependent translation memory (linguistic intelligence assumes language-specific HLT modules)

First phase: using English and Hungarian HLT modules

Page 8: Development of an Intelligent Translation Memory

Project Contents

The result is an integrated CAT tool (CAT = Computer Assisted Translation)

The tool consists of:
  a terminology management module (already available)
  a text alignment program
  a translation memory

Page 9: Development of an Intelligent Translation Memory

Project Phases

1. Planning and Specification (completed)

2. Corpus Building

3. Core Research Phase: Development of Grammatical Proximity Search and Translation Correction modules

4. Implementation of Database Engine

5. Integration and Test Translation

Page 10: Development of an Intelligent Translation Memory

Grammatical Proximity Search

Research on Non-Exact Matching of Phrases and Sentences (this is not fuzzy!):
  a procedure for matching grammatical structures normalized by means of syntactic and semantic features
  critical evaluation of some "traditional" procedures

Research on Adapting Stored Translations to the current source segment
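
To make the idea of matching normalized structures more concrete, here is a toy sketch. It is not the project's procedure: it assumes that normalization can be imitated by replacing known phrases with coarse category labels taken from a hand-written lexicon, whereas the real system would rely on morphological and syntactic analysis. `LEXICON`, the labels `APP`, `NP` and `VIEW`, and the functions `normalize` and `same_structure` are all hypothetical.

```python
# Hypothetical phrase lexicon: surface phrase -> coarse category label.
# In a real system these labels would come from HLT modules, not a table.
LEXICON = {
    "FrontPage": "APP",
    "Word": "APP",
    "the current page": "NP",
    "the second file": "NP",
    "Page view": "VIEW",
    "Print Layout view": "VIEW",
}


def normalize(segment: str):
    """Return (skeleton, slots) for a segment.

    skeleton: list of category labels and lower-cased literal words
    slots:    list of (label, original phrase) in order of appearance
    """
    words = segment.rstrip(".").split()
    # Try longer phrases first so "Print Layout view" wins over shorter ones.
    phrases = sorted(LEXICON, key=lambda p: -len(p.split()))
    skeleton, slots = [], []
    i = 0
    while i < len(words):
        for phrase in phrases:
            parts = phrase.split()
            if words[i:i + len(parts)] == parts:
                skeleton.append(LEXICON[phrase])
                slots.append((LEXICON[phrase], phrase))
                i += len(parts)
                break
        else:
            # No lexicon phrase starts here: keep the word as a literal.
            skeleton.append(words[i].lower())
            i += 1
    return skeleton, slots


def same_structure(a: str, b: str) -> bool:
    """Two segments match if their normalized skeletons are identical."""
    return normalize(a)[0] == normalize(b)[0]


stored = "FrontPage opens the current page in Page view."
current = "Word opens the second file in Print Layout view."
skeleton, slots = normalize(stored)   # ['APP', 'opens', 'NP', 'in', 'VIEW']
match = same_structure(stored, current)
```

Both sentences reduce to the same skeleton, so they match exactly at the structural level even though most of their characters differ. The `slots` list records which phrases filled each position, which is the information an adaptation step would need.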

Page 11: Development of an Intelligent Translation Memory

A sample match

Stored source segment: FrontPage opens the current page in Page view.

Stored translation: A FrontPage az aktuális oldalt a Page nézetben nyitja meg.

Current source segment (recognized): Word opens the second file in Print Layout view.

Adapted translation: A Word a második fájlt a Print Layout nézetben nyitja meg.

Traditional TMs do not find a match with the default 70% threshold!
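
The sample can be reproduced with a small sketch under strong assumptions. `difflib.SequenceMatcher.ratio()` from the Python standard library stands in for a traditional TM's character-based score; it is not the formula of any particular product. The `REPLACEMENTS` list is hand-written and represents what slot alignment plus terminology look-up would have to deliver; `fix_articles` and `adapt` are hypothetical helpers. The only linguistic rule encoded is that the Hungarian definite article is "az" before a vowel and "a" before a consonant.

```python
import difflib

STORED_SOURCE = "FrontPage opens the current page in Page view."
STORED_TARGET = "A FrontPage az aktuális oldalt a Page nézetben nyitja meg."
CURRENT_SOURCE = "Word opens the second file in Print Layout view."

# Character-based score of the two source segments (stand-in measure).
char_score = difflib.SequenceMatcher(None, STORED_SOURCE, CURRENT_SOURCE).ratio()
found_by_fuzzy_match = char_score >= 0.7   # False: below the 70% threshold

# Assumed result of slot alignment and terminology look-up (hand-written):
# phrase in the stored translation -> phrase needed for the current segment.
REPLACEMENTS = [
    ("FrontPage", "Word"),
    ("aktuális oldalt", "második fájlt"),
    ("Page nézetben", "Print Layout nézetben"),
]

VOWELS = "aáeéiíoóöőuúüű"


def fix_articles(sentence: str) -> str:
    """Re-select 'a'/'az' according to the first letter of the next word."""
    words = sentence.split()
    for i, word in enumerate(words[:-1]):
        if word.lower() in ("a", "az"):
            article = "az" if words[i + 1][0].lower() in VOWELS else "a"
            # Preserve sentence-initial capitalization.
            words[i] = article.capitalize() if word[0].isupper() else article
    return " ".join(words)


def adapt(stored_target: str, replacements) -> str:
    """Swap slot fillers in the stored translation, then repair articles."""
    adapted = stored_target
    for old, new in replacements:
        adapted = adapted.replace(old, new, 1)
    return fix_articles(adapted)


adapted = adapt(STORED_TARGET, REPLACEMENTS)
```

The article repair is what turns "az aktuális oldalt" into "a második fájlt" rather than leaving "az" in place. It is a small instance of the language-specific knowledge that a purely character-based TM does not have; the toy rule treats every "a"/"az" token as an article, which a real analyzer would not do.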

Page 12: Development of an Intelligent Translation Memory

Expected Results...

Experiments start: autumn 2003

First test version: end of 2003

Page 13: Development of an Intelligent Translation Memory

Further Steps

Making the tool known in Hungary and abroad

Improvement of services based on user feedback

Addition of further language pairs
