Development of an Intelligent Translation Memory
MorphoLogic http://www.morphologic.hu
SZAK Publishers http://www.szak.hu
Balázs Kis ([email protected])
IKTA5-146/2002
Rome, 21 May 2003
Project Details
Duration: 3 March 2003 – 25 February 2005
Budget: total 96.8 M HUF [387 200 €]; funding 57.1 M HUF [228 400 €]
Consortium: MorphoLogic Ltd. (84 %), SZAK Publishers Ltd. (16 %)
Project leader: Dr. Gábor Prószéky
The Problem and Its Impact (1.)
Current state-of-the-art translation memories:
store previously translated segments and their translations
offer look-up for similar source segments, backed by character-based fuzzy indexes
Advantage: this approach is language-independent, and inexpensive to develop and support
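A character-based fuzzy look-up of the kind described above can be sketched as follows. This is a minimal illustration, not the scoring of any particular TM product: it assumes plain Levenshtein edit distance normalized by the longer segment, and the function names (`levenshtein`, `similarity`, `lookup`) and the 0.7 default threshold are choices made for this sketch. The stored sentence pair is borrowed from the sample match later in these slides.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """1.0 for identical strings, falling towards 0.0 as they diverge."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def lookup(memory, source, threshold=0.7):
    """Return (score, stored_source, stored_translation) hits, best first.

    `memory` maps stored source segments to their translations; only
    entries scoring at or above `threshold` are returned.
    """
    hits = [(similarity(source, stored), stored, target)
            for stored, target in memory.items()]
    return sorted((h for h in hits if h[0] >= threshold), reverse=True)


# Illustrative memory with a single entry taken from the slides.
memory = {
    "FrontPage opens the current page in Page view.":
        "A FrontPage az aktuális oldalt a Page nézetben nyitja meg.",
}
for score, stored, target in lookup(memory, "FrontPage opens the current file in Page view."):
    print(f"{score:.2f}  {stored}  ->  {target}")
```

Nothing in this look-up knows about words or grammar, which is why it carries over to any language pair unchanged. A production system would use an index rather than scanning every stored segment.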
The Problem and Its Impact (2.)
Disadvantages of current TM technologies:
they ignore relationships between syntactic structures; therefore
long segments, or those with similar meaning or syntactic structure, often stay hidden; so
many segments included in the translation memory are simply lost
Before the project started...
MorphoLogic had at hand:
Human Language Technology modules from morphology to every level of syntactic parsing
a localisation department with very specific technological needs (still pending)
SZAK Publishers had at hand:
many years' experience with translation and terminology
a parallel corpus of technical texts of approx. 1.5 million words (being processed for project needs)
Main Objective
Development of a Translation Memory equipped with Linguistic Intelligence:
finding source segments based on their grammatical similarity
making changes to stored translations according to the current source segment
Long-term objective: an improvement in the quality of translations and a decrease in the translation effort (time)
Project Constraints
An important remark: this will be a language-dependent translation memory (linguistic intelligence assumes language-specific HLT modules)
First phase: using English and Hungarian HLT modules
Project Contents
The result is an integrated CAT tool (CAT = Computer Assisted Translation)
The tool consists of:
a terminology management module (already available)
a text alignment program
a translation memory
Project Phases
1. Planning and Specification (completed)
2. Corpus Building
3. Core Research Phase: Development of Grammatical Proximity Search and Translation Correction modules
4. Implementation of Database Engine
5. Integration and Test Translation
Grammatical Proximity Search
Research on Non-Exact Matching of Phrases and Sentences (this is not fuzzy!)
A procedure for matching grammatical structures normalized by means of syntactic and semantic features
Critical evaluation of some "traditional" procedures
Research on Adapting Stored Translations to the current source segment
A sample match
Stored source segment: FrontPage opens the current page in Page view.
Stored translation: A FrontPage az aktuális oldalt a Page nézetben nyitja meg.
Current source segment (recognized): Word opens the second file in Print Layout view.
Adapted translation: A Word a második fájlt a Print Layout nézetben nyitja meg.
Traditional TMs do not find a match with the default 70% threshold!
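The sample match can be reproduced with a toy sketch. It is not the project's method: the hard-coded `LEXICON` is a hypothetical stand-in for the output of real English and Hungarian HLT modules, and all names (`normalize`, `adapt`, `char_similarity`, the slot labels) are invented for illustration. The idea shown is only this: replace recognized phrases with slot labels, compare the resulting skeletons, and, when they agree, swap the slot fillers in the stored translation. A character-level score (assumed here to be Levenshtein distance normalized by the longer segment) is included for comparison with the 70% threshold.

```python
import re

STORED_SRC = "FrontPage opens the current page in Page view."
STORED_TGT = "A FrontPage az aktuális oldalt a Page nézetben nyitja meg."
CURRENT_SRC = "Word opens the second file in Print Layout view."

# Hypothetical lexicon: English phrase -> (slot label, Hungarian rendering).
# A real system would derive this from parsers and terminology data.
LEXICON = {
    "FrontPage": ("APP", "FrontPage"),
    "Word": ("APP", "Word"),
    "the current page": ("OBJ", "az aktuális oldalt"),
    "the second file": ("OBJ", "a második fájlt"),
    "Print Layout": ("VIEW", "Print Layout"),
    "Page": ("VIEW", "Page"),
}
# Longest phrases first, whole words only, so "Page" never fires inside "FrontPage".
PATTERN = re.compile(r"\b(?:" + "|".join(
    re.escape(p) for p in sorted(LEXICON, key=len, reverse=True)) + r")\b")


def normalize(segment):
    """Return (skeleton, fillers): known phrases replaced by slot labels."""
    fillers = []

    def to_slot(match):
        fillers.append(match.group(0))
        return "<" + LEXICON[match.group(0)][0] + ">"

    return PATTERN.sub(to_slot, segment), fillers


def adapt(stored_src, stored_tgt, current_src):
    """Rewrite stored_tgt for current_src; None if the skeletons differ."""
    stored_skel, stored_fill = normalize(stored_src)
    current_skel, current_fill = normalize(current_src)
    if stored_skel != current_skel:
        return None
    adapted = stored_tgt
    for old, new in zip(stored_fill, current_fill):
        # Naive first-occurrence replacement; enough for this sample only.
        adapted = adapted.replace(LEXICON[old][1], LEXICON[new][1], 1)
    return adapted


def char_similarity(a, b):
    """1 - Levenshtein(a, b) / max(len(a), len(b)), at character level."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b), 1)


print(normalize(STORED_SRC)[0])
print(adapt(STORED_SRC, STORED_TGT, CURRENT_SRC))
print(char_similarity(STORED_SRC, CURRENT_SRC) < 0.7)
```

Both source segments reduce to the same skeleton, so the stored translation is reused even though the character-level score stays below the threshold. The sequential string replacement in `adapt` would break on sentences where one filler occurs inside another; real adaptation needs alignment between source and target phrases.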
Expected Results...
Experiments start: Autumn 2003
First test version: end of 2003
Further Steps
Making the tool known in Hungary and abroad
Improvement of services based on user feedback
Addition of further language pairs