Google Translate + TectoMT

download Google Translate + TectoMT

If you can't read please download the document

Transcript of Google Translate + TectoMT

  • 1. Google Translate & TectoMT Martin Majli

2. Plan

  • What is Google Translate?

3. What is TectoMT? 4. Why combine? 5. Results 6. TODO 7. Google Translate

  • Google service

8. 34 languages 9. AJAX API 10. http://translate.google.com/ 11. Google Translate API

  • http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&q=it's lunchtime!&langpair=en%7Ccs

12. {"responseData": {"translatedText":"pestvka na obd, je to!"}, "responseDetails": null, "responseStatus": 200} 13. TectoMT

  • a highly modular extendable NLP software system

14. composed of numerous (mostly previously existing) NLP tools integrated into a uniform infrastructure 15. aimed at (not limited to) developing MT system 16. http://ufal.mff.cuni.cz/tectomt/ 17. Why combine? 18. Why combine? (cont.)

  • Google
  • Great amount of text

19. Computing power - 450k servers (2006) TectoMT

  • Linguistic approach

20. Hypothesis: translation choices are conditioned rather by governing/dependent words than by linear predecessors/followers 21. Translation 22. Examples

  • This time the fall in stocks on Wall Street is responsible for the drop.
  • GT: Tato doba je pokles zsob na Wall Street je zodpovdn za pokles.

23. GT+TMT: Tato doba je pokles zsob na wall street, je zodpovdn za pokles. 24. Examples (cont.)

  • Stocks fall in Asia
  • GT: Zsoby pd v Asii

25. GT+TMT: Zsoba pd v asie. 26. Examples (cont.)

  • The stock exchange in Taiwan dropped by 3.6 percent according to the local index.
  • GT: Burze, na Tchaj-wanu klesla o 3,6 procent v zvislosti na mstn rejstk.

27. Burze tchaj - wan klesl o 3, 6 procent na mstn rejstk. 28. Examples (cont.)

  • Despite the fact that it is a part of China, Hong Kong determines its currency policy separately, that is, without being dependent on the Chinese Central Bank.
  • GT: Navzdory skutenosti, e je soust ny, Hong Kong uruje jej mny politice samostatn, tedy bez zvislosti na nsk centrln bance.

29. GT+TMT: Navzdory skutenosti, e je soust ny, hong kong uruje jej mny politice samostatn, tedy bez na zvislosti nsk centrln bance. 30. Errors

  • Name entity
  • Kevin Rudd => Kevin Rudd => kevin rudd

31. Numbers! 32. Results BLEU NIST Google Translate 0.1543 5.2542 GT + Parahrasing 0.0953 4.4241 33. TODO

  • Somehow improve :)
  • Analyse english sentence

34. Translate subtrees 35. Merge subtrees 36. Analyze + Parahrase czech sentence 37. Questions? ? 38. Thank you!