Source: ufal.mff.cuni.cz/mtm13/files/01-mtm-intro-chris-dyer.pdf
Machine Translation
Chris Dyer
Language Technologies Institute
Machine Learning Department
MT Marathon 2013
September 9, 2013
Monday, September 9, 13
Overview
• A Brief History of MT
• Modeling: Language & Translation
• Using MT
• Evaluating MT
• Coffee
MT Timeline
• 1940s - WW2 code breaking
• 1947 - Weaver letter outlining translation as a problem in cryptography
• 1954 - Georgetown Experiments showed “promise” of Russian-English MT
• 1966 - ALPAC report shifts funding to basic research in computational linguistics
• 1968 - MT company SYSTRAN founded (still in existence)
• 1970s - advances in formal languages and automata theory; development of statistical speech recognition techniques at IBM and Princeton
• 1993 - Weaver’s model of translation prototyped by IBM; statistical revolution
• 1999 - Open source reimplementation of IBM models
• 2000s - Major modeling advances, rediscovery of syntax, large scale funding
• 2006 - Open source Moses decoder development begins
• 2006 - Google Translate launches
• 2010 - SDL acquires Language Weaver
• 2013 - Prague MT Marathon!
One naturally wonders if the problem of translation could conceivably be treated as a problem in cryptography. When I look at an article in Russian, I say: ‘This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode.’
Warren Weaver to Norbert Wiener, March, 1947
Claude Shannon. “A Mathematical Theory of Communication” 1948.
[Diagram: message M → Encoder → sent transmission Y, with p(y) → “Noisy” channel, p(x|y) → received transmission X → Decoder → recovered message M′]
Shannon’s theory tells us:
1) how much data you can send
2) the limits of compression
3) why your download is so slow
4) how to translate
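[Diagram: sent transmission Y, with p(y) → “Noisy” channel, p(x|y) → received transmission X → Decoder → recovered message M′ / Y′]
$$
\begin{aligned}
y' &= \arg\max_y \, p(y \mid x) \\
   &= \arg\max_y \, \frac{p(x \mid y)\, p(y)}{p(x)} \\
   &= \arg\max_y \, p(x \mid y)\, p(y)
\end{aligned}
$$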
Denominator doesn’t depend on y.
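[Diagram: English (sent transmission Y) → “Noisy” channel → Česky (received transmission X) → Decoder → English′ (recovered message Y′)]
$$
y' = \arg\max_y \, p(x \mid y)\, p(y)
$$
$$
e' = \arg\max_e \, \underbrace{p(f \mid e)}_{\text{translation model}} \; \underbrace{p(e)}_{\text{language model}}
$$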
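A minimal sketch of the decision rule above, assuming a tiny hand-made candidate set for one fixed source sentence f. The tables `lm` and `tm`, their probabilities, the candidate sentences, and the function name `decode` are all hypothetical and chosen only for illustration; a real decoder searches a far larger space.

```python
import math

# Hypothetical language model p(e): made-up probabilities for three candidates.
lm = {
    "I want a flight to Prague": 0.6,
    "to Prague a flight want I": 0.1,
    "I want a flight Prague to": 0.3,
}

# Hypothetical translation model p(f | e) for one fixed source sentence f.
tm = {
    "I want a flight to Prague": 0.2,
    "to Prague a flight want I": 0.5,
    "I want a flight Prague to": 0.2,
}

def decode(candidates, tm, lm):
    """Return argmax_e p(f|e) * p(e), computed in log space."""
    return max(candidates, key=lambda e: math.log(tm[e]) + math.log(lm[e]))

best = decode(list(lm), tm, lm)
print(best)
```

In this toy setup the second candidate has the highest translation-model score, but the language model outweighs it, so the fluent word order wins.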
Division of labor
• Translation model
  • translation back into the source
  • learned from (source, target) translations
  • adequacy of translation
• Language model
  • probability of the output sentence
  • learned from any target language corpus
  • fluency of translation
Intuition for Division of Labor
• Better use of data
• We have parallel data
• We also have (lots of) monolingual data
• ... use both
• Use weaker translation models
• Language modeling is hard ...
• Translation modeling is language modeling ++
• Language models are used in many applications
The Big Question in MT
• How do we design language and translation models?
• Considerations
• Is the model correct?
• Is prediction (inference) tractable?
• Is there data to learn the parameters?
Language Models
• What is the probability of a sentence (in a particular language)?
• There are an infinite number of grammatical sentences in any language
• This is the era of Big Data...but infinity is bigger
• For machine translation
• Naive model (model language)
• Naive parameterization (no relationship between words like walk walked walking walks)
• Clever estimation of parameters (Lecture later this week)
My legal name is Alexander Perchov. But all of my many friends dub me Alex, because that is a more flaccid-to-utter version of my legal name. Mother dubs me Alexi-stop-spleening-me!, because I am always spleening her. If you want to know why I am always spleening her, it is because I am always elsewhere with friends, and disseminating so much currency, and performing so many things that can spleen a mother. Father used to dub me Shapka, for the fur hat I would don even in the summer month. He ceased dubbing me that because I ordered him to cease dubbing me that. It sounded boyish to me, and I have always thought of myself as very potent and generative. I have many many girls, believe me, and they all have a different name for me. One dubs me Baby, not because I am a baby, but because she attends to me.
Probability Models of Text
• Sequence of symbols (bytes, letters, characters, morphemes, words, ...)
• Let $\Sigma$ denote the set of symbols
• Lots of possible sequences ($\Sigma^*$ is infinitely large!)
• Probability distributions over $\Sigma^*$?
Probability Models of Text
• How are probability models built?
• Make some independence assumptions
• Make assumptions about the (conditional) distributions of the events
• Estimate parameters from a sample of data
$$
P(x, y) = P(x) \times P(y) \iff x, y \text{ are independent}
$$
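As an illustration of the first step, one common independence assumption (an assumption made here, not something the slide states) is that each symbol depends only on the symbol before it. The first equality below is the chain rule and is exact; the approximation is the modeling choice. The symbols $w_i$ and $n$ are introduced only for this sketch, with $w_0$ taken to be a start-of-sentence symbol.

$$
p(w_1, \ldots, w_n) = \prod_{i=1}^{n} p(w_i \mid w_1, \ldots, w_{i-1}) \approx \prod_{i=1}^{n} p(w_i \mid w_{i-1})
$$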
I want a flight to Prague STOP
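The sentence above, produced one word at a time and closed with STOP, can be scored by a model that conditions each word on the previous one. What follows is a minimal sketch of such a bigram model with maximum-likelihood estimates; the names `START`, `train_bigram`, `sentence_prob`, and the two-sentence corpus are hypothetical and exist only for this illustration.

```python
from collections import Counter

START, STOP = "<s>", "STOP"

def train_bigram(sentences):
    """Maximum-likelihood bigram estimates from tokenized sentences."""
    bigrams, contexts = Counter(), Counter()
    for words in sentences:
        tokens = [START] + words + [STOP]
        for prev, cur in zip(tokens, tokens[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return {pair: n / contexts[pair[0]] for pair, n in bigrams.items()}

def sentence_prob(words, probs):
    """Product of bigram probabilities; 0.0 if any bigram was never seen."""
    tokens = [START] + words + [STOP]
    p = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        p *= probs.get((prev, cur), 0.0)
    return p

# Made-up training data for the sketch.
corpus = [
    "I want a flight to Prague".split(),
    "I want a coffee".split(),
]
probs = train_bigram(corpus)
p = sentence_prob("I want a flight to Prague".split(), probs)
print(p)
```

With plain maximum-likelihood estimates, any sentence containing an unseen bigram gets probability zero, which is one reason parameter estimation needs more care than simple counting.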
Chukchi (Siberian language): Təmeyŋəlevtpəγtərkən
English: I have a bad headache
Agglutinative and polysynthetic languages have rich word-formation processes.
Translation Models
• Is a string of words e a meaning-preserving translation of a string of words f?
• Conditional model p(f | e)
• Challenges
• There are an infinite number of sentences
• How do we learn parameters of the models?
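One way to make the conditional model p(f | e) concrete is a word-based model in the style of IBM Model 1, where each word of f is generated independently from a uniformly chosen word of e (or an empty NULL word). This is a sketch under that assumption, not the method of the slides; the table `t`, its values, the `NULL` token, and the function name `model1_likelihood` are hypothetical, and the sentence-length term is omitted.

```python
NULL = "<null>"

# Hypothetical word-translation table t(f_word | e_word); values are made up.
t = {
    ("eine", "a"): 0.8,
    ("Autobombe", "car"): 0.4,
    ("Autobombe", "bomb"): 0.5,
    ("explodierte", "exploded"): 0.9,
}

def model1_likelihood(f_words, e_words, t):
    """p(f | e) up to the length constant: every word of f is generated
    independently from a uniformly chosen word of e (or NULL)."""
    e_with_null = [NULL] + e_words
    prob = 1.0
    for f in f_words:
        prob *= sum(t.get((f, e), 0.0) for e in e_with_null) / len(e_with_null)
    return prob

score = model1_likelihood(
    ["eine", "Autobombe", "explodierte"],
    ["a", "car", "bomb", "exploded"],
    t,
)
print(score)
```

Because the model factors over single words, its parameters are word pairs rather than sentence pairs, which is what makes estimation from a finite parallel corpus feasible.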
• Make independence assumptions
• Permute the source words into the target language order
• Pick translations for individual words / phrases
• Probability of particular translations
• Look at source context
• Ensure that the output is fluent & idiomatic
• Computational challenges
• Searching all word permutations is NP-hard
• Massive numbers of translation alternatives
In der Innenstadt explodierte eine Autobombe
A car bomb exploded downtown
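To give a feel for the reordering problem listed under computational challenges, the sketch below scores every ordering of five words exhaustively and prints how fast the number of orderings grows. The `fluent` bigram set, the `score` function, and `best_order` are hypothetical stand-ins; a real system would score orderings with a language model and could not enumerate them.

```python
import math
from itertools import permutations

def best_order(words, score):
    """Score every ordering of `words`; feasible only for tiny inputs."""
    return max(permutations(words), key=score)

# Hypothetical set of "fluent" adjacent word pairs, made up for this sketch.
fluent = {("a", "car"), ("car", "bomb"), ("bomb", "exploded"), ("exploded", "downtown")}

def score(order):
    """Count adjacent pairs in `order` that appear in the fluent set."""
    return sum(pair in fluent for pair in zip(order, order[1:]))

words = ["downtown", "exploded", "a", "car", "bomb"]
print(" ".join(best_order(words, score)))

# The number of orderings of n words is n!, which quickly becomes unmanageable.
for n in (5, 10, 20):
    print(n, math.factorial(n))
```

Five words already give 120 orderings, and each added word multiplies the count, so practical decoders restrict or prune the reorderings they consider.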
A car bomb exploded downtown
In der Innenstadt explodierte eine Autobombe
• String-to-string translation
• Became popular in the 1990s with statistical MT
• State-of-the-art for many (most?) language pairs
• Especially: Closely related language pairs
• Especially: Typologically similar language pairs
• ~ Google Translate / Bing Translator
• Limitations
• Independence assumptions are wrong (too strong and too weak)
• No structural information available to improve modeling
• Models can be learned directly from parallel data
Greek
Egyptian
![Page 64: Machine Translationufal.mff.cuni.cz/mtm13/files/01-mtm-intro-chris-dyer.pdf · • 1993 - Weaver’s model of translation prototyped by IBM; statistical revolution • 1999 - Open](https://reader034.fdocuments.in/reader034/viewer/2022052019/6032851c9a669459782355f4/html5/thumbnails/64.jpg)
A car bomb exploded downtown
In der Innenstadt explodierte eine Autobombe
![Page 66: Machine Translationufal.mff.cuni.cz/mtm13/files/01-mtm-intro-chris-dyer.pdf · • 1993 - Weaver’s model of translation prototyped by IBM; statistical revolution • 1999 - Open](https://reader034.fdocuments.in/reader034/viewer/2022052019/6032851c9a669459782355f4/html5/thumbnails/66.jpg)
[Figure: source “In der Innenstadt explodierte eine Autobombe” → target “A car bomb exploded downtown”, with a parse tree over the source sentence]
• Tree-to-string translation
• Syntactic analysis of source (parse)
• Transfer from tree to string
• Source trees have some benefits
• Proxy for semantic relationships
• Syntax is a natural source of reordering constraints
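The idea of transferring from a source tree to a target string can be sketched on the running example. The flat three-constituent parse, the reordering rule, and the phrase lexicon below are hypothetical and hand-written for illustration; a real tree-to-string system would learn such rules from parsed parallel data.

```python
# Toy source parse: (label, child, child, ...) for internal nodes,
# (label, "words") for leaves. The analysis is an assumption, not parser output.
SOURCE_TREE = (
    "S",
    ("PP", "in der Innenstadt"),
    ("V", "explodierte"),
    ("NP", "eine Autobombe"),
)

# Hypothetical transfer rules: (parent label, child labels) -> target order,
# given as indices into the source children.
REORDER_RULES = {
    ("S", ("PP", "V", "NP")): (2, 1, 0),  # German PP V NP -> English NP V PP
}

# Hypothetical phrase lexicon for the leaves.
LEXICON = {
    "in der Innenstadt": "downtown",
    "explodierte": "exploded",
    "eine Autobombe": "a car bomb",
}


def transfer(node):
    """Recursively turn a source tree into a target string."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        # Leaf: translate the phrase, passing unknown phrases through unchanged.
        return LEXICON.get(children[0], children[0])
    child_labels = tuple(child[0] for child in children)
    # Fall back to the source order when no rule matches.
    order = REORDER_RULES.get((label, child_labels), tuple(range(len(children))))
    return " ".join(transfer(children[i]) for i in order)


print(transfer(SOURCE_TREE))  # prints: a car bomb exploded downtown
```

The reordering decision is attached to a syntactic configuration rather than to particular words, which is the sense in which syntax supplies reordering constraints.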
![Page 70: Machine Translationufal.mff.cuni.cz/mtm13/files/01-mtm-intro-chris-dyer.pdf · • 1993 - Weaver’s model of translation prototyped by IBM; statistical revolution • 1999 - Open](https://reader034.fdocuments.in/reader034/viewer/2022052019/6032851c9a669459782355f4/html5/thumbnails/70.jpg)
[Figure: source “In der Innenstadt explodierte eine Autobombe” → target “A car bomb exploded downtown”, with a parse tree over the target sentence]
• String-to-tree translation
• Transfer from source string to target tree
• Formally a generalization of monolingual parsing
• Intuition: knowing the target language well matters more than knowing the source language well
The best Chinese-English systems are string-to-tree
![Page 72: Machine Translationufal.mff.cuni.cz/mtm13/files/01-mtm-intro-chris-dyer.pdf · • 1993 - Weaver’s model of translation prototyped by IBM; statistical revolution • 1999 - Open](https://reader034.fdocuments.in/reader034/viewer/2022052019/6032851c9a669459782355f4/html5/thumbnails/72.jpg)
[Figure: source “In der Innenstadt explodierte eine Autobombe” → target “A car bomb exploded downtown”, with parse trees over both the source and the target sentence]
• Tree-to-tree translation
• Use syntax to predict syntax
• Benefits
• As parsers improve, MT will improve (we hope)
• Rich information for modeling in source and target
• Downside: where does the syntax come from?
![Page 76: Machine Translationufal.mff.cuni.cz/mtm13/files/01-mtm-intro-chris-dyer.pdf · • 1993 - Weaver’s model of translation prototyped by IBM; statistical revolution • 1999 - Open](https://reader034.fdocuments.in/reader034/viewer/2022052019/6032851c9a669459782355f4/html5/thumbnails/76.jpg)
[Figure: source “In der Innenstadt explodierte eine Autobombe” → target “A car bomb exploded downtown”]
Syntax: parse tree over “A car bomb exploded downtown”
Semantics (“logical form”): detonate :arg0 bomb :arg1 car :loc downtown :time past
![Page 78: Machine Translationufal.mff.cuni.cz/mtm13/files/01-mtm-intro-chris-dyer.pdf · • 1993 - Weaver’s model of translation prototyped by IBM; statistical revolution • 1999 - Open](https://reader034.fdocuments.in/reader034/viewer/2022052019/6032851c9a669459782355f4/html5/thumbnails/78.jpg)
[Figure: source “In der Innenstadt explodierte eine Autobombe” → target “A car bomb exploded downtown”, with parse trees over both sentences]
explodieren :arg0 Bombe :arg1 Auto :loc Innenstadt :tempus imperf
detonate :arg0 bomb :arg1 car :loc downtown :time past
Interlingua (“meaning”): report_event[ factivity=true explode(e, bomb, car) loc(e, downtown) ]
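The two language-specific frames on this slide can be read as small predicate-argument structures. The sketch below parses the slide's `pred :role value` notation into a dictionary and maps the German frame onto the English one. The concept and role tables are hypothetical hand-written stand-ins for whatever a real semantic transfer component would learn, and the parser assumes single-token role values.

```python
GERMAN_FRAME = "explodieren :arg0 Bombe :arg1 Auto :loc Innenstadt :tempus imperf"
ENGLISH_FRAME = "detonate :arg0 bomb :arg1 car :loc downtown :time past"

# Hypothetical transfer tables (illustration only).
CONCEPTS = {
    "explodieren": "detonate",
    "Bombe": "bomb",
    "Auto": "car",
    "Innenstadt": "downtown",
    "imperf": "past",
}
ROLES = {"tempus": "time"}


def parse_frame(text):
    """Parse 'pred :role value :role value ...' into a dict."""
    tokens = text.split()
    frame = {"pred": tokens[0]}
    # Roles sit at odd positions, their values at the following even positions.
    for role, value in zip(tokens[1::2], tokens[2::2]):
        frame[role.lstrip(":")] = value
    return frame


def transfer_frame(frame):
    """Map role names and concepts from the source to the target inventory."""
    return {ROLES.get(role, role): CONCEPTS.get(value, value)
            for role, value in frame.items()}


german = parse_frame(GERMAN_FRAME)
print(transfer_frame(german))
print(transfer_frame(german) == parse_frame(ENGLISH_FRAME))  # prints: True
```

At this level word order has disappeared entirely, so the transfer step only has to relate concepts and roles; the cost moves into analysis (string to frame) and generation (frame to string), which the sketch does not attempt.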
![Page 82: Machine Translationufal.mff.cuni.cz/mtm13/files/01-mtm-intro-chris-dyer.pdf · • 1993 - Weaver’s model of translation prototyped by IBM; statistical revolution • 1999 - Open](https://reader034.fdocuments.in/reader034/viewer/2022052019/6032851c9a669459782355f4/html5/thumbnails/82.jpg)
More Abstract Models
• Modeling challenges
• What are the right abstract representations?
• How do we support more abstraction without sacrificing accuracy on frequent elements?
• Computational challenges
• Large search spaces
• Error propagation in pipelines
• Learning challenges
• Nonconvexity
• Where does the data come from?
![Page 84: Machine Translationufal.mff.cuni.cz/mtm13/files/01-mtm-intro-chris-dyer.pdf · • 1993 - Weaver’s model of translation prototyped by IBM; statistical revolution • 1999 - Open](https://reader034.fdocuments.in/reader034/viewer/2022052019/6032851c9a669459782355f4/html5/thumbnails/84.jpg)
[Figure: the same source-to-target diagram with its syntactic, semantic, and interlingua levels, now labelled “Hidden”]
![Page 89: Machine Translationufal.mff.cuni.cz/mtm13/files/01-mtm-intro-chris-dyer.pdf · • 1993 - Weaver’s model of translation prototyped by IBM; statistical revolution • 1999 - Open](https://reader034.fdocuments.in/reader034/viewer/2022052019/6032851c9a669459782355f4/html5/thumbnails/89.jpg)
Other Challenges
European parliament language (training):
I declare resumed the session of the European Parliament adjourned on Friday 17 December 1999, and I would like once again to wish you a happy new year in the hope that you enjoyed a pleasant festive period.
Human language (testing):
[Example not captured in the transcript; the slide annotates it with: spelling error, abbreviations, nonstandard contractions]
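One simple way to put a number on this training/test mismatch is the out-of-vocabulary (OOV) rate: the share of test tokens never seen in training. The sketch below is a minimal version under stated assumptions: the function names and the regex tokenizer are hypothetical, and the noisy test message is invented for illustration, since the slide's own test example is not preserved in this transcript.

```python
import re

TRAINING_TEXT = (
    "I declare resumed the session of the European Parliament adjourned on "
    "Friday 17 December 1999, and I would like once again to wish you a happy "
    "new year in the hope that you enjoyed a pleasant festive period."
)

# Invented examples (not from the slides): one in-domain, one noisy.
CLEAN_TEST = "I would like to wish you a happy new year"
NOISY_TEST = "hapy new yr, c u tmrw"


def tokenize(text):
    """Lowercase and keep runs of letters, digits, and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


def oov_rate(train_text, test_text):
    """Fraction of test tokens that never occur in the training text."""
    vocab = set(tokenize(train_text))
    test_tokens = tokenize(test_text)
    if not test_tokens:
        return 0.0
    unseen = [tok for tok in test_tokens if tok not in vocab]
    return len(unseen) / len(test_tokens)


print(oov_rate(TRAINING_TEXT, CLEAN_TEST))
print(oov_rate(TRAINING_TEXT, NOISY_TEST))
```

Misspellings, abbreviations, and nonstandard contractions all surface as unseen tokens, so a model trained only on parliamentary text has no translation entries for them.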
![Page 90: Machine Translationufal.mff.cuni.cz/mtm13/files/01-mtm-intro-chris-dyer.pdf · • 1993 - Weaver’s model of translation prototyped by IBM; statistical revolution • 1999 - Open](https://reader034.fdocuments.in/reader034/viewer/2022052019/6032851c9a669459782355f4/html5/thumbnails/90.jpg)
Questions?