iMT Language Solutions
SDL Proprietary and Confidential
Machine Translation: Latest Innovations and their Impact on Commercial Translation
Claudiu Stiube, MT Customer Solutions Manager
SDL Language Customer Success Summit 2015
Evolution of MT: Brief history of Machine Translation
Timeline markers: 1950s, 2002, 2010, 2011, 2014, 2015
Milestones shown on the timeline:
o War-time cryptography requirements, with subsequent experiments & investment in automated translation
o SDL acquires RBMT engine…establishes MT group dedicated to improving quality for enterprise applications
o SDL acquires Language Weaver / BeGlobal Statistical Machine Translation (SMT)
o First SDL Post-Editing projects using SMT go into production
o Post-Editing booms: 4-fold increase
o SDL launches PE Certification Program
o SDL launches XMT next-generation MT platform
Overview: The SDL MT Team
Who we are
First to commercialize Statistical Machine Translation
o 50+ Professionals
o Over 10 Nationalities
o Across 5 Time Zones
o 8 Locations
Widespread team of language lovers:
o Ex-translators
o Computational Linguists
o Project Managers
o Data Specialists
o Post-Editors
o Architects
…all gathered from the four corners of SDL!
What we do
Drive MT Adoption:
Educate, promote and support MT usage in existing SDL accounts & new opportunities
Custom Engine Builds:
o Design
o Create
o Test
o Implement
o Monitor
…custom Statistical Machine Translation engines
Linguistic Projects:
Semantic annotation projects for US Government bodies & academic institutes
How we do it
Two Research Labs:
o Los Angeles, CA
o Cambridge, UK
o 100s of Scientific Publications
o Over 50 Patents Approved or Filed
We’re Evangelists…about Machine Translation, using automation to accelerate productivity
Common MT Use-Cases
o Communication Channels
o Consumer Preferences
o Increased Global Competition
o Export Market Growth
Right translation method, right price, right time
Chart axes: Quality, Volume
Translation methods: Human Translation, Post-Edit, Machine Translation
Content types plotted:
o Blogs
o User Forums
o Reviews
o Chat
o Support
o FAQ
o Websites
o Wikis
o Knowledge Base
o Alerts/Notifications
o Help
o User Guides
o Documentation
o Newsletters
o Advertising Content
o Legal
Translator productivity
Description:
○ Direct access to machine translation from SDL Trados Studio
Benefits:
○ Improve the efficiency of translators by providing results of machine translation to them for segments that do not match entries in translation memory
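The benefit described above amounts to a lookup order: consult the translation memory first, and offer machine translation only for segments with no usable match. A minimal Python sketch of that order follows. It is an illustration under stated assumptions, not the SDL Trados Studio API: the names `best_tm_match` and `suggest`, the 0.75 threshold, and the `machine_translate` callable are all hypothetical, and `difflib` similarity stands in for a real fuzzy-match score.

```python
from difflib import SequenceMatcher


def best_tm_match(segment, memory):
    """Return (score, target) for the closest TM source, or (0.0, None) if nothing matches."""
    best_score, best_target = 0.0, None
    for source, target in memory.items():
        # Character-level similarity in [0, 1]; a stand-in for a real fuzzy-match score.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best_score:
            best_score, best_target = score, target
    return best_score, best_target


def suggest(segment, memory, machine_translate, threshold=0.75):
    """Prefer a TM match at or above the (hypothetical) threshold, otherwise fall back to MT."""
    score, target = best_tm_match(segment, memory)
    if target is not None and score >= threshold:
        return {"origin": "TM", "target": target}
    # No usable TM entry: hand the translator an MT suggestion to post-edit.
    return {"origin": "MT", "target": machine_translate(segment)}
```

A caller would pass its own MT client as `machine_translate`; any function that takes a source string and returns a target string fits.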
Live chat translation
Description:
○ Real-time translation of web-based chat conversations
Benefits:
o Reduces cost of staffing the support/sales operations as they do not need multi-lingual agents
o Customer acquisition rates and satisfaction are much higher if you engage the customer in chat.
Community forum translation
Description:
○ Translation of user-generated content in web-based community forums
Benefits:
o Enable interactions between customers who speak different languages
o Leverage community expertise across languages instead of only within the language of community experts
Knowledge base content translation
Description:
○ Translation of knowledge base content for local language customers of technical solutions
Benefits:
o Reduces customer support costs and activity level by allowing remote language customers to directly access solutions
o Increases customer satisfaction by providing solutions in their native language
Web content translation
Description:
○ Integrate with web content management system to translate web site
○ Embedding MT into the web site to support translation “on demand”
Benefits:
○ Ability to translate large volumes of web content that would not otherwise be translated because of cost
○ Real-time translation can facilitate support for multi-lingual content with minimal changes to the development and storage of the source content
Case study: MT for online customer reviews
Requirements:
o Share customer reviews with international audiences
o Automate the translation of customer reviews into 13 languages
Results:
o Reduced bounce rate from 70% to 25%
o Increased user dwell times and page views
o Economically translate 1 billion words/month
Case study: MT for instant MS Office translation
[a large global retail client]
Requirements:
o Improve communication among geographically scattered company employees
o Fast, low-cost translation of MS Outlook emails & MS Office business documents
Results:
o BeGlobal Machine Translation integrated via API with MS Office apps
o Any employee can instantly translate emails or attachments with a simple double-click
Case study: MT for speedier translation, reduced cost
Requirements:
o Economically and quickly translate content for 4,000 hotels, 4 million words per language
Results:
o Trained MT engine integrated with CMS, Web CMS, Translation Memory, Terminology Management
o Human post-edit review
Engine training: Making MT smarter
o Customized engines
o Domain verticals
o Baselines
Baselines
o Core generic MT engines for each language pair
o Data mined from reliable sources available in the public domain, covering various subjects
o Contain hundreds of millions of words of bilingual data (100Ms+)
o Work well for general & varied content
o Can be used as backup for verticals & customized engines
Domain verticals
o Trained statistical engines exclusive for a domain
o Data selected from sources within a domain or industry
o MT output more likely to follow technical terminology
o Solution used when client-specific data is not available or not enough for a customization
Customized engines
o Optimize the MT output for specific client projects
o Training based on client-specific bilingual data
o More data usually has a positive effect on the MT output
o Quality & consistency of data is as important as quantity
o Adherence to client-specific terminology & style
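The three engine tiers above imply a fallback order: a customized engine when there is enough client-specific bilingual data, a domain vertical when there is not, and the baseline as backup. The Python sketch below spells out that order. It is only an illustration: the deck gives no figure for "enough" data, so `MIN_CLIENT_SEGMENTS` is a made-up cut-off, and `Project`, `choose_engine`, and the tier labels are hypothetical names rather than SDL identifiers.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cut-off; the slides do not say how much client data is "enough".
MIN_CLIENT_SEGMENTS = 50_000


@dataclass
class Project:
    client_segments: int      # aligned client-specific bilingual segments available
    domain: Optional[str]     # e.g. "travel"; None when no domain is identified


def choose_engine(project, vertical_domains, min_client_segments=MIN_CLIENT_SEGMENTS):
    """Pick the most specific engine tier the available data supports."""
    if project.client_segments >= min_client_segments:
        return "customized"
    if project.domain in vertical_domains:
        # Too little client data for a customization: use the domain vertical.
        return "vertical:" + project.domain
    # Baselines serve as the backup for both other tiers.
    return "baseline"
```

For example, a project with little client data in a domain that has a trained vertical resolves to that vertical, while an unrecognized domain falls through to the baseline engine.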
How SDL trains an MT engine
Phases: Content Evaluation, MT Customization, Production, QA
Steps shown in the workflow diagram:
o Source Content
o Apply Translation Memory
o Training Data Prep & Engine Customization
o Prep of Testing Material
o Evaluate MT Output
o Refine Training or Deploy for Production
o Integrate MT on Translation Process
o Machine Translation
o Post-Edit
o Quality Assessment & Translation Delivery
o Update Translation Memory
Components: SDL MT Server, Translation Memory
Continuous improvement
o SDL MT Group developers are constantly researching ways to improve Generic, Vertical, and Customized MT Engines
o SDL Research Scientists are continuously improving the Statistical Machine Translation algorithms (e.g. Language Models, Translation Models, Reordering Models, Syntax, Transliteration, Rule-Based Components, etc.)
o SDL Data Engineers are continuously mining large amounts of good data used by the statistical algorithms
Introducing SDL XMT…
A NEW, modular & flexible technology that will power the “next generation” of SDL MT
Technology generations (timeline markers: 2002, 2003, 2008, 2015):
o Word-based Machine Translation
o Phrase-based Machine Translation
o Syntax-based Machine Translation
o XMT
Legacy MT
Foreign Language -> Legacy MT (Monolithic Phrase-based) -> Your Language
SDL XMT: Next generation technology, higher quality
o Modular & Flexible
o “State-of-the-Art” Machine Learning
o Better Translation Quality
o Rapid Research Transition
Foreign Language -> XMT -> Your Language
MODULAR COMPONENTS
o Neural Networks
o Compound Splitting
o Phrase-Based
o Finite State Automata
o String to Tree
o Rule-Based
o Tree to String
o Pre-Ordering
o Transliteration
o Hidden Markov Model
o Hyper Graphs
o ……
Language Learning in XMT
Continuous improvement by learning from Post-Editing.
Machine Translation
○ The machine learns how to translate from source to target during the training process
○ The machine does not learn during the translation process
Machine Translation + Language Learning
○ The machine learns how to translate from source to target during the training process
○ The machine learns & improves seamlessly, continuously, and in real-time from user feedback during the translation process
○ See it in action: SDL XMT
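The contrast between the two columns above can be shown with a toy model. In the Python sketch below, a plain lookup table stands in for a trained engine: `StaticEngine` is fixed once "training" ends, while `LearningEngine` also accepts post-edits at translation time. This is not how XMT is implemented (the slides do not describe the mechanism); the class names and the `post_edit` method are hypothetical and chosen only to show the difference in behavior.

```python
class StaticEngine:
    """Learns only at training time: the table is fixed once the engine is built."""

    def __init__(self, phrase_table):
        # "Training": copy the source -> target pairs into the engine.
        self.phrase_table = dict(phrase_table)

    def translate(self, source):
        # Unknown segments pass through unchanged in this toy model.
        return self.phrase_table.get(source, source)


class LearningEngine(StaticEngine):
    """Also learns during translation, from post-editor feedback."""

    def post_edit(self, source, corrected):
        # Feedback takes effect immediately for the next request.
        self.phrase_table[source] = corrected
```

With the same starting table, both engines give the same output until a post-editor corrects a segment; from then on only the learning engine returns the corrected translation.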
How to Deploy MT Post-Edit
SDL iMT: Key steps in the process
○ Evaluate content and translation assets
○ Train MT engines for your content or use existing solution
○ Configure the trained MT engines with SDL’s translation environment (TMS, WS, Studio)
○ Post-edit the MT output to full publishable quality
○ SDL infrastructure to support these steps
Process diagram: Evaluate, Train MT, Configure, Post-Edit, all supported by SDL Infrastructure
Quality in MT
o Building blocks are already in place, as much of the content is pulled from the engines
o Allows the linguist to focus on refining the output
o Custom engines pull in client terminology & style
o Fewer resources mean greater consistency
o Trained, certified linguists well-versed in handling MT output
Post-Editing quality requirements
When post-editing to publishable quality, the following basic principles still apply:
o The same references must be used as for conventional translation (project-specific guidelines, TMs, glossaries, termbases, etc.)
o Grammar, spelling and punctuation must be correct
o Appropriate style & correct terminology must be used consistently
o The translation must read well and be suitable for its intended purpose
Customer
User Guide
Features to watch out for in MT output…
o Incorrect Formatting
o Additional or Missing words
o Words Not Localized or Wrong Flavor
o Gender, Number, Agreement or Verb Inflection Issues
o Articles & Prepositions
o Syntax & Word Order Issues
o Wrong Punctuation
o Inconsistent or Non-compliant Terminology
o Mistranslations
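Two of the items above, non-compliant terminology and wrong punctuation, lend themselves to simple automated flags that a post-editor can review. The Python sketch below is a rough illustration, not an SDL tool: `terminology_issues` and `punctuation_mismatch` are hypothetical helpers, the termbase is assumed to be a plain source-term to target-term mapping, and matching is naive substring comparison with no handling of inflection or language-specific punctuation.

```python
import string


def terminology_issues(source, target, termbase):
    """List (source_term, required_term) pairs whose required term is missing from the target."""
    src, tgt = source.lower(), target.lower()
    return [
        (term, required)
        for term, required in termbase.items()
        if term.lower() in src and required.lower() not in tgt
    ]


def punctuation_mismatch(source, target):
    """True when source and target do not end with the same ASCII punctuation mark."""

    def last_mark(text):
        text = text.rstrip()
        return text[-1] if text and text[-1] in string.punctuation else ""

    return last_mark(source) != last_mark(target)
```

Checks like these only raise candidates; inflected forms or legitimately different punctuation conventions will produce false alarms, so the linguist still makes the final call.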
Post-Editing Machine Translation certification
○ The demand for MT solutions is growing quickly & Post-Editing is becoming a mainstream skill for translators
○ In response, SDL have created Post-Editing Certification – released in June 2014
○ 85% of in-house staff completed the Certification in 2014
○ 2,500+ freelancers signed up for the course
○ The Certification covers the theory behind Machine Translation as well as practical approaches to Post-Editing
○ Our Certification is for anyone impacted by Post-Editing – certified translators can offer an extended skill set
Copyright © 2008-2015 SDL plc. All rights reserved. All company names, brand names, trademarks,
service marks, images and logos are the property of their respective owners.
This presentation and its content are SDL confidential unless otherwise specified, and may not be
copied, used or distributed except as authorised by SDL.
Global Customer Experience Management