Download - ELIA Lyon 2015 Static

© 2015 Autodesk

Past, Present and Future of Language Technology for Localisation at Autodesk
Dr. Ventsislav Zhechev

Autodesk Development Sàrl, Neuchâtel, Switzerland

16 April 2015

■ Early adopters of MT for post-editing
  ■ around 2001 with Systran, Rule-based Machine Translation
  ■ pilot for FIGS
  ■ discontinued after several months: deemed too expensive and immature

Past

■ Reintroduction of MT for post-editing
  ■ in 2009, using free open-source tools
  ■ Moses toolkit, Statistical Machine Translation
■ First translator productivity test
  ■ promising results
  ■ widely cited as the first larger-scale productivity test in an industrial setting (Plitt & Masselot, 2010)

Past

■ Deployed MT for production for FIGS
  ■ progressively introduced new languages
■ Proof-of-concept infrastructure
  ■ many interdependencies between systems
  ■ difficult to maintain
  ■ difficult to grow

Past

Past MT Infrastructure
[Diagram in two builds: several Passolo clients and WorldServer, and MT servers hosting the language groups FR/CS/DE, FR/PL/JA, CS/ZH/KO and DE/PL/IT; the second build adds a tentative group KO/HU/RU?]

■ Updated MT infrastructure: MT Info Service
  ■ developed 2011–2013
  ■ based on Moses open-source toolkit
  ■ scalable
  ■ modular
■ Integrated with localisation processes for software and documentation
  ■ all non-exact-match software strings
  ■ all documentation strings with TM score <75%

Present
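
The selection rule in the last two bullets above can be sketched as follows. This is only an illustration, not the MT Info Service itself: the `Segment` type, the content-type labels and the reading of "exact match" as a TM score of 100 are assumptions; only the 75% threshold comes from the slide.

```python
from dataclasses import dataclass

# From the slide: documentation strings with a TM score below 75% go to MT.
DOC_TM_THRESHOLD = 75
# Assumption: an exact match is a TM score of 100.
EXACT_MATCH = 100


@dataclass
class Segment:
    text: str
    content_type: str  # "software" or "documentation" (hypothetical labels)
    tm_score: int      # best translation-memory match, 0 to 100


def needs_mt(segment: Segment) -> bool:
    """Return True if the segment should be sent to MT under the slide's rule."""
    if segment.content_type == "software":
        return segment.tm_score < EXACT_MATCH
    if segment.content_type == "documentation":
        return segment.tm_score < DOC_TM_THRESHOLD
    raise ValueError(f"unknown content type: {segment.content_type}")


if __name__ == "__main__":
    for seg in (
        Segment("Open file", "software", 100),
        Segment("Open file...", "software", 92),
        Segment("Click OK to continue.", "documentation", 80),
        Segment("Click OK to finish.", "documentation", 60),
    ):
        print(seg.content_type, seg.tm_score, needs_mt(seg))
```

Keeping the two thresholds as named constants makes it easy to experiment with a different cut-off per content type.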

Present MT Infrastructure
[Diagram in two builds: several Passolo clients and WorldServer, a Load Balancer, and MT servers hosting the language groups FR/CS/DE, FR/PL/JA, CS/ZH/KO and DE/PL/IT; the group KO/HU/RU is marked tentative (KO/HU/RU?) in the first build and shown as added in the second]

■ Main benefits of current MT setup
  ■ adapted to best handle Autodesk content
  ■ increased translator productivity
    ■ to 100% for some translators
  ■ reduced costs
    ■ up to 30% discount
  ■ in use for 14 production languages
    ■ cs, de, es, fr, hu, it, ja, ko, pl, pt-br, pt-pt, ru, zh-hans, zh-hant

Present

Present: productivity increase from MT use
[Chart: translation from scratch, MT post-editing and MT productivity increase for Translators 1 to 4; axis from 0% to 125%]

■ Over time, user acceptance of MT has grown significantly
  ■ mostly due to improvements in MT quality
■ Largest complaint became the incorrect translation of terminology

Present

■ In-house terminology group disbanded in 2012
  ■ terminology databases unmaintained
  ■ new approach necessary to support translators

Present

■ Introduced a new web-based translation lookup tool
  ■ NeXLT (in homage to an earlier tool called XLT)
  ■ http://langtech.autodesk.com/nexlt
■ Easy access for translators
■ Search for product-specific translations across all Autodesk published content

Present

■ MT cannot handle new content well
  ■ but its main use is for segments with low fuzzy score, i.e. new segments
■ Product-specific terminology handling
  ■ extract terminology from new content
  ■ term translation and approval on the Term Translation Central, http://langtech.autodesk.com/ttc
  ■ automatic integration with other systems

Present

■ MT applies terminology on the fly
  ■ systems requesting MT specify the product
  ■ available terminology is retrieved
  ■ terminology translations are fixed throughout the MT process
■ Translators need to do fewer term lookups

Present
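
One possible way to keep term translations fixed throughout MT is to mask known terms before translation and restore the approved translations afterwards. The sketch below illustrates that idea only; it is not a description of how the Autodesk engines do it. The `GLOSSARY` contents, the `__TERMn__` placeholder format and the `fake_mt` stand-in engine are all hypothetical.

```python
import re
from typing import Callable, Dict

# Hypothetical per-product glossary; real entries would come from a terminology store.
GLOSSARY: Dict[str, Dict[str, str]] = {
    "InfraWorks": {"hydrology property": "propriétés hydrologiques"},
}


def translate_with_terms(source: str, product: str, mt: Callable[[str], str]) -> str:
    """Mask known terms, run MT on the masked text, then restore approved translations."""
    terms = GLOSSARY.get(product, {})
    slots: Dict[str, str] = {}
    masked = source
    # Longest terms first, so a shorter term cannot split a longer one.
    for i, term in enumerate(sorted(terms, key=len, reverse=True)):
        token = f"__TERM{i}__"
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        masked, count = pattern.subn(token, masked)
        if count:
            slots[token] = terms[term]
    output = mt(masked)
    for token, approved in slots.items():
        output = output.replace(token, approved)
    return output


def fake_mt(text: str) -> str:
    """Stand-in for a real MT engine: it only handles the demo sentence."""
    return (text.replace("You can enter this information as a",
                         "Vous pouvez entrer ces informations en tant que")
                .replace("of the model.", "du modèle."))


if __name__ == "__main__":
    print(translate_with_terms(
        "You can enter this information as a hydrology property of the model.",
        "InfraWorks",
        fake_mt,
    ))
```

A real engine would also need to leave the placeholder tokens untouched and handle inflection of the restored term, which this sketch ignores.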

Present Terminology Infrastructure
[Diagram: Load Balancer and MT servers hosting the language groups FR/CS/DE, FR/PL/JA, CS/ZH/KO, DE/PL/IT and KO/HU/RU, together with Term Translation Central and the products AutoCAD, Civil 3D and BIM 360]

■ Publishing unedited MT
  ■ using rule-based MT for language variants
    ■ EN-US to EN-UK and FR-FR to FR-CA via Prompsit
  ■ using statistical MT for content that would otherwise remain untranslated
    ■ via both in-house MT engines and MS Translator Hub
    ■ sales support articles, knowledge base, wiki

Present

■ Use of MT for bootstrapping localisation into new languages
  ■ e.g. Google Translate for pre-translating Autodesk InfraWorks into Arabic
  ■ a vendor used rule-based MT to translate into PT-BR from ES for an existing product

Present

■ How can we better help translators do their work?
  ■ tight integration of all localisation tools, with a focus on language technology
  ■ CAT tool, Translation Memory, Terminology, Machine Translation should all work as one

Future

■ A central role for language technology
  ■ automate menial tasks
  ■ ensure translation consistency
■ Backed by our new Central Repository
  ■ currently in proof-of-concept phase
  ■ houses all translated segments
    ■ both software and documentation
  ■ provides all systems with translated content

Future

Future Modern Infrastructure
[Diagram: Central Repository together with Term Translation Central, Machine Translation, NeXLT, Passolo, WorldServer and Crowd; Translators via CAT Tool]

■ Integrate the CAT tool with our terminology process
  ■ highlight terminology in source strings
  ■ prompt the user with the appropriate term translation, if not present in the target segment
■ The terminology information will be automatically retrieved by the CAT tool
  ■ the CAT tool will know about Autodesk products
  ■ terms will be available immediately on approval

Future
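
The two checks described above (highlighting terms in the source, and prompting when the approved translation is missing from the target) could look roughly like this. It is a minimal sketch under simplifying assumptions: the glossary is a plain dictionary, matching is by exact case-insensitive substring, and inflected forms are not handled. The function names are hypothetical and not part of any CAT tool API.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical glossary for one product: source term -> approved translation.
GLOSSARY: Dict[str, str] = {"hydrology property": "propriétés hydrologiques"}


def highlight_spans(source: str, glossary: Dict[str, str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, term) for every glossary term found in the source."""
    spans = []
    for term in glossary:
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, source, re.IGNORECASE):
            spans.append((match.start(), match.end(), term))
    return sorted(spans)


def missing_term_translations(source: str, target: str,
                              glossary: Dict[str, str]) -> Dict[str, str]:
    """Return the terms whose approved translation does not appear in the target."""
    found = {term for _, _, term in highlight_spans(source, glossary)}
    return {term: glossary[term] for term in sorted(found)
            if glossary[term].lower() not in target.lower()}


if __name__ == "__main__":
    src = "You can enter this information as a hydrology property of the model."
    draft = "Vous pouvez entrer ces informations en tant que"
    print(highlight_spans(src, GLOSSARY))
    print(missing_term_translations(src, draft, GLOSSARY))
```

The dictionary returned by `missing_term_translations` is what a tool would show as a prompt; an empty dictionary means no prompt is needed.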

Future Integrating Terminology in CAT Tool
[Animated example in five builds]
Source segment: You can enter this information as a hydrology property of the model.
Term Translation Central entry: English "hydrology property", French "propriétés hydrologiques", Product "InfraWorks"
Prompted term translation: propriétés hydrologiques
Target segment as typed: Vous pouvez entrer ces informations en tant que propriétés hydrologiques du mo…

■ On-the-fly processing of new terms
  ■ infer term translations from translators’ work
    ■ if a source segment contains a previously identified term without an existing translation
    ■ infer the term translation by analysing the segment translation the translator produced
    ■ prompt the user for confirmation
  ■ immediate integration of newly stored term translations for MT of subsequent segments

Future
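
The slide does not say how the term translation is inferred. One common technique is to read it off a word alignment between the source segment and the translator's target segment, and the sketch below assumes that such an alignment is already available from some aligner. The function name, the tokenisation and the alignment pairs in the demo are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def infer_term_translation(src_tokens: Sequence[str],
                           tgt_tokens: Sequence[str],
                           alignment: Sequence[Tuple[int, int]],
                           term_tokens: Sequence[str]) -> Optional[str]:
    """Return the target span aligned to the term, or None if no clean span exists."""
    n = len(term_tokens)
    lowered = [t.lower() for t in src_tokens]
    term = [t.lower() for t in term_tokens]
    # Locate the first occurrence of the term in the source segment.
    for start in range(len(src_tokens) - n + 1):
        if lowered[start:start + n] == term:
            break
    else:
        return None
    # Collect the target positions aligned to any token of the term.
    tgt_idx: List[int] = sorted({t for s, t in alignment if start <= s < start + n})
    if not tgt_idx:
        return None
    lo, hi = tgt_idx[0], tgt_idx[-1]
    # Reject the span if it also covers words aligned to source words outside the term.
    for s, t in alignment:
        if lo <= t <= hi and not (start <= s < start + n):
            return None
    return " ".join(tgt_tokens[lo:hi + 1])


if __name__ == "__main__":
    src = "You can enter this information as a hydrology property of the model .".split()
    tgt = ("Vous pouvez entrer ces informations en tant que "
           "propriétés hydrologiques du modèle .").split()
    # Hand-written alignment (source index, target index) for the demo only.
    links = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (5, 6), (5, 7),
             (7, 9), (8, 8), (9, 10), (10, 10), (11, 11), (12, 12)]
    print(infer_term_translation(src, tgt, links, ["hydrology", "property"]))
```

The returned candidate would then be shown to the translator for confirmation, as in the bullet list, rather than being stored automatically.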

Future Integrating Terminology in CAT Tool
[Animated example in six builds]
Source segment: You can enter this information as a hydrology property of the model.
Term Translation Central entry: English "hydrology property", French empty (term not translated yet), Product "InfraWorks"
Target segment produced by the translator: Vous pouvez entrer ces informations en tant que propriétés hydrologiques du modèle.
Confirmation prompt: hydrology property ✓ X
Final build: the Term Translation Central entry lists French "propriétés hydrologiques"

■ Integrate software UI reference processing in the CAT tool
  ■ similar to the terminology process
■ UI strings are generally translated before corresponding documentation
  ■ highlight UI references in source strings
  ■ prompt the user with the appropriate UI translation, if not present in the target segment

Future

■ Augment all linguistic resources with UI screenshots (where available)
  ■ provide vital context for translators to help disambiguate use cases
■ This is a major current user request!

Future

■ Automatic analysis of post-editing performance
  ■ based on live data provided by the CAT tool
  ■ will allow us to
    ■ react quickly to MT quality issues affecting translators’ productivity
    ■ identify content problematic for MT
  ■ will help with vendor negotiations
  ■ will give us clues as to when to retrain the MT engines based on incoming translation volume

Future
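
As a rough illustration of the kind of analysis meant here, the sketch below computes throughput in words per hour from timing records and derives a per-translator productivity increase of post-editing over translation from scratch. The record format and mode labels are assumptions; the slides do not specify what data the CAT tool would deliver or how productivity is measured.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass
class EditRecord:
    translator: str
    mode: str        # "scratch" or "post-edit" (hypothetical labels)
    words: int       # source words in the segment
    seconds: float   # time spent on the segment


def words_per_hour(records: Iterable[EditRecord]) -> Dict[Tuple[str, str], float]:
    """Aggregate throughput per (translator, mode)."""
    words: Dict[Tuple[str, str], int] = defaultdict(int)
    seconds: Dict[Tuple[str, str], float] = defaultdict(float)
    for r in records:
        key = (r.translator, r.mode)
        words[key] += r.words
        seconds[key] += r.seconds
    return {k: words[k] * 3600 / seconds[k] for k in words if seconds[k] > 0}


def productivity_increase(records: Iterable[EditRecord]) -> Dict[str, float]:
    """Percentage gain of post-editing over translation from scratch, per translator."""
    rates = words_per_hour(records)
    result: Dict[str, float] = {}
    for (translator, mode), rate in rates.items():
        if mode != "post-edit":
            continue
        baseline = rates.get((translator, "scratch"))
        if baseline:
            result[translator] = (rate / baseline - 1) * 100
    return result


if __name__ == "__main__":
    demo = [
        EditRecord("T1", "scratch", 500, 3600),
        EditRecord("T1", "post-edit", 1000, 3600),
        EditRecord("T2", "scratch", 600, 3600),
        EditRecord("T2", "post-edit", 450, 1800),
    ]
    print(productivity_increase(demo))
```

A drop in this figure for a given product or language could serve as a trigger to look at MT quality or to retrain an engine.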

Future Automatic MT Evaluation, JFS Metric (Joint Fuzzy Score)
[Chart: JFS by product (AutoCAD, 3ds Max, Inventor, Revit, Civil 3D, Vault, Map 3D, Navisworks) and language (CZ, DE, ES, FR, IT, JA, KO, PL, PT-BR, RU, ZH-HANS, ZH-HANT); axis from 35% to 80%]

■ Automatic analysis of translation consistency
  ■ by evaluating the use of terminology and UI reference translations
■ Active promotion of translation consistency
  ■ by prompting translators to use the appropriate translations as stored in our Central Repository
■ Applicable to consistency both within and across products/content types

Future
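
A simple corpus-level consistency figure could be the share of segments containing a term in which the approved translation was actually used. The sketch below computes that share per term; the glossary, the exact-substring matching and the function name are assumptions made for illustration, not the analysis Autodesk has in mind.

```python
import re
from typing import Dict, Iterable, Tuple


def term_consistency(pairs: Iterable[Tuple[str, str]],
                     glossary: Dict[str, str]) -> Dict[str, float]:
    """For each term seen in a source segment, return the fraction of those
    segments whose target contains the approved translation."""
    seen: Dict[str, int] = {}
    consistent: Dict[str, int] = {}
    for source, target in pairs:
        for term, approved in glossary.items():
            if re.search(r"\b" + re.escape(term) + r"\b", source, re.IGNORECASE):
                seen[term] = seen.get(term, 0) + 1
                if approved.lower() in target.lower():
                    consistent[term] = consistent.get(term, 0) + 1
    return {term: consistent.get(term, 0) / count for term, count in seen.items()}


if __name__ == "__main__":
    glossary = {"hydrology property": "propriétés hydrologiques"}
    segments = [
        ("You can enter this information as a hydrology property of the model.",
         "Vous pouvez entrer ces informations en tant que propriétés hydrologiques du modèle."),
        ("Select a hydrology property.",
         "Sélectionnez une propriété d'hydrologie."),
    ]
    print(term_consistency(segments, glossary))
```

Running the same function over segments from different products or content types gives the cross-product view mentioned in the last bullet.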

■ Main goals
  ■ Provide translators with all the information they need to produce the best translation within their working environment, to avoid context switching between tools
  ■ When unachievable, make it as easy as possible to find necessary information, by integrating shortcuts in the CAT tool

Future

■ Main goals
  ■ Add intelligence to the CAT tool by enhancing its interaction with our services to make sure the latest and most appropriate translations are presented for post-editing
  ■ Pave the way for future technologies
    ■ interactive translation
    ■ on-line MT training and adaptation

Future

© 2015 Autodesk, Inc. All rights reserved.