machine translation manuel herranz PangeaMT TAUS Barcelona


description

How machine translation is about empowering users, and how users can be empowered with DIY SMT technology to build their own statistical machine translation solutions.

Transcript of machine translation manuel herranz PangeaMT TAUS Barcelona

Page 1: machine translation manuel herranz PangeaMT TAUS Barcelona

Don't be afraid to provide the tools to those who need them

Manuel Herranz – PangeaMT - Pangeanic

www.pangea.com.mt

- User empowerment - DIY SMT

Page 2: machine translation manuel herranz PangeaMT TAUS Barcelona

PangeaMT – putting open standards to work… well

http://t.co/HDTboxQ

USERS

80% like, 19% not like, 1% done before

User empowerment

Page 3: machine translation manuel herranz PangeaMT TAUS Barcelona

The meaning of USER is becoming closely related to COMMUNITY, POWER, FEEDBACK, ACCOUNTABILITY

Page 4: machine translation manuel herranz PangeaMT TAUS Barcelona

Humankind's constant search for

TOOLS

better, more, other things

Page 5: machine translation manuel herranz PangeaMT TAUS Barcelona

An instrument for making material changes on other objects […]. Tools are the primary means by which human beings control and manipulate their physical environment – Encyclopedia Britannica.

Page 6: machine translation manuel herranz PangeaMT TAUS Barcelona

History of the world: largely a fight to control

resources, tools [technology]

MT: Another translator out of business...?

Page 7: machine translation manuel herranz PangeaMT TAUS Barcelona

In the 20th and 21st centuries, also a fight to control and manipulate

INFORMATION [data]

ACCESS [data]

Page 8: machine translation manuel herranz PangeaMT TAUS Barcelona

History of the world: largely a fight to control

INFORMATION [data]

ACCESS [data]

21st century IS THE ERA OF

• SHARING
• OPEN

Page 9: machine translation manuel herranz PangeaMT TAUS Barcelona

21st century IS THE ERA OF

• SHARING
• OPEN

* Communities

* Source (Linux, others)

* Data

Page 10: machine translation manuel herranz PangeaMT TAUS Barcelona

USERS have the power

“We cannot solve the problem using the same tools and the way of thinking that created it.” A. Einstein

Page 11: machine translation manuel herranz PangeaMT TAUS Barcelona

MT at Pangeanic, from Trial to Production

2007 and before

• Rule-based (RB) tests with commercial software
• Insufficiently good output
• Only internal production
• EU Post-Editing Award

2007/08

• V1: Small data sets (2-5M words), automotive & electronics (ES), then Fr/It/De in other fields

2009/10

• Division born
• Hundreds of engine trials and language combinations
• Open-Source to commercial
• TMX / XLIFF workflows

2011/12

• DIY SMT
• Empower Users
• Glossary
• Automated re-training
• Transfer architecture and know-how to users
• Compatibility with commercial formats (ttx, sdlxliff, itd)

Page 12: machine translation manuel herranz PangeaMT TAUS Barcelona

MT at Pangeanic, from Trial to Production

• Users provide information to improve [they are the source & target]

• Potential MT users wanted to be another Pangeanic = build their own systems

- Some can, some can’t
- Others want turnkey developments
- Others prefer SaaS
- Most want to unwrap the black box but without walking the road

Page 13: machine translation manuel herranz PangeaMT TAUS Barcelona

Predictions

[Chart: user empowerment over the years 2009-2018. YEAR 2016: 000's of customized MT systems.]

Tech. not the realm of a few providers

Page 16: machine translation manuel herranz PangeaMT TAUS Barcelona

Predictions

[Chart: MT acceptance and user empowerment over the years 2009-2018]

Until 2011

• MT acceptance growth.
• Translator engagement challenge
• Need for data has been addressed – still more work to be done.
• Users and practitioners can now build their own systems.

In 5 years... (YEAR 2016)

• 000's of customized MT systems
• Tech. not the realm of a few providers

After 2016

• Combinations??
• Supra-engines??
• World-knowledge??
• ...suggestions...???

Page 17: machine translation manuel herranz PangeaMT TAUS Barcelona

Summary

• USER EMPOWERMENT: give people the tools so they can grow their own solutions
• PangeaMT provides infrastructure
• Cloud Training: so users concentrate on production, not on technical bits & updates
• Pressure for data availability coming from users will benefit efforts for standardization

Page 18: machine translation manuel herranz PangeaMT TAUS Barcelona

Thank you!

MANY QUESTIONS PLEASE!!

[email protected]