Initiation of Standardization on Network-based Speech-to-speech Translation at ITU-T SG16



Page 1: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16

Initiation of Standardization on Network-based Speech-to-speech Translation at ITU-T SG16

National Institute of Information and Communications Technology, Japan

Satoshi Nakamura, Chiori Hori

Contact: Satoshi Nakamura, Chiori Hori
Organization: NICT
Country: Japan
Tel: +81-774 95 1370
Fax: +81-774 95 1308
Email: [email protected] [email protected]

INTERNATIONAL TELECOMMUNICATION UNION
TELECOMMUNICATION STANDARDIZATION SECTOR
STUDY PERIOD 2009-2012

COM 16 – C 196 – E
October 2009
English only
Original: English
Question(s): 7, 21, 22/16

Page 2: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16

Many Languages All Over the World

http://en.wikipedia.org/wiki/List_of_language_families

Page 3: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16

Breaking Language Boundaries

Language boundaries are one of the barriers to mutual understanding.

Speech-to-Speech Translation (S2ST) technologies are an effective means of removing language boundaries between people who speak different languages.

S2ST technologies have been studied.

Page 4: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16

Speech-to-Speech Translation (S2ST)

[Figure: S2ST pipeline. Japanese speech 「私は学校に行く」 passes through Speech Recognition (ASR), Machine Translation (MT), and Speech Synthesis (TTS) to produce English speech "I go to school". The figure also shows corpora as a resource.]

Speech Recognition (ASR)

Convert to Japanese phoneme sequence "w", "a", "t", …: w a t a sh i w a g a xtu k o o n i …

Convert to word sequence using lexicon and grammar: 私は学校に行く

Machine Translation (MT)

Convert to English word sequence: 「私は」⇒ "I", 「学校に」⇒ "to school", 「行く」⇒ "go", giving "I to school go"

Reorder word sequences according to English grammar: "I" "to school" "go" ⇒ "I" "go" "to school", giving "I go to school"

Speech Synthesis (TTS)

Select appropriate waveform for English text
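The three stages on this slide can be sketched as a chain of functions. The following is a toy illustration only, not the system described in the slides: the function names are hypothetical, the "speech" input is already text, the lexicon holds just the three phrases from the example, and the reordering rule (move the final verb after the subject) is an assumption that covers this one sentence pattern.

```python
# Toy lexicon taken from the slide's example: Japanese phrase -> English phrase.
LEXICON = {"私は": "I", "学校に": "to school", "行く": "go"}


def recognize(speech: str) -> list[str]:
    """Stand-in for ASR: split the input into known lexicon phrases (longest first)."""
    words, rest = [], speech
    candidates = sorted(LEXICON, key=len, reverse=True)
    while rest:
        match = next((w for w in candidates if rest.startswith(w)), None)
        if match is None:
            raise ValueError(f"no lexicon entry matches: {rest!r}")
        words.append(match)
        rest = rest[len(match):]
    return words


def translate(words: list[str]) -> list[str]:
    """Stand-in for MT: word-for-word lookup, then reorder to English word order."""
    glossed = [LEXICON[w] for w in words]        # e.g. ["I", "to school", "go"]
    if len(glossed) >= 3:
        subject, *middle, verb = glossed
        glossed = [subject, verb, *middle]       # subject-object-verb -> subject-verb-object
    return glossed


def synthesize(words: list[str]) -> str:
    """Stand-in for TTS: return the text that a real system would render as a waveform."""
    return " ".join(words)


def s2st(speech: str) -> str:
    """Run the three stages in order: ASR, then MT, then TTS."""
    return synthesize(translate(recognize(speech)))


print(s2st("私は学校に行く"))
```

Each stage only consumes the previous stage's output, which is what allows the modules to be hosted separately in a network-based system.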

Page 5: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16

Stand Alone and Client-server S2ST Systems

Stand-alone system

Packages the entire set of speech translation functions into a handheld PC.

Example: Japanese speech 「おはようございます。」 ⇒ English speech "Good morning."

Client-server system

[Figure: languages shown are Japanese, English, Chinese, and Indonesian.]

Page 6: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16

Why Network-based?

Stand-alone systems have limited resources, and the language pairs they can support are limited.

ASR/MT/TTS systems for many languages are available and need to be maintained by each country.

Broadband networks are available.

Page 7: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16

Standardization on Network-based S2ST

[Figure: two S2ST clients, one for Language A and one for Language B, connected through ASR, MT, and TTS modules in each direction. Speech of Language A passes through ASR, MT, and TTS to become synthesized speech for the Language B client, and speech of Language B takes the reverse path. The ASR and TTS modules draw on a lexicon and speech data for their language; the MT modules draw on a parallel corpus and lexicon.]

Standardization

Parallel corpus, speech data, lexicon

Data format for ASR and MT results

Communication protocol among modules
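The slide names a data format for ASR and MT results as a standardization target but does not define one. As a purely illustrative sketch of what such a format has to carry, the messages below are serialized as JSON with a type tag. Every field name, the `AsrResult` and `MtResult` classes, and the choice of JSON are assumptions made for this example, not part of any ITU-T draft.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class AsrResult:
    language: str       # language of the recognized speech, e.g. "ja" (assumed code)
    text: str           # recognized word sequence
    confidence: float   # hypothetical recognizer score in [0, 1]


@dataclass
class MtResult:
    source_language: str
    target_language: str
    source_text: str
    target_text: str


MESSAGE_TYPES = {"AsrResult": AsrResult, "MtResult": MtResult}


def encode(message) -> str:
    """Serialize a result with a type tag so the receiving module can dispatch on it."""
    payload = {"type": type(message).__name__, "body": asdict(message)}
    return json.dumps(payload, ensure_ascii=False)


def decode(payload: str):
    """Rebuild the dataclass named by the type tag."""
    data = json.loads(payload)
    return MESSAGE_TYPES[data["type"]](**data["body"])


asr = AsrResult(language="ja", text="私は学校に行く", confidence=0.9)
wire = encode(asr)
assert decode(wire) == asr   # the message survives a round trip unchanged
print(wire)
```

An agreed format of this kind is what would let an ASR module maintained in one country hand its output to an MT module maintained in another.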

Page 8: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16

Lexicon for overall S2ST systems

An example of a lexicon for overall modules in S2ST systems (cells marked ・・ are left open on the slide):

Entry | Attribute | Japanese | Korean | Chinese | English
Osaka | Surface | 大阪 | Osaka | 大阪 | Osaka
Osaka | Pronunciation | おおさか | ・・ | ダーバン (Daban), Da4ban3 | Ōsaka, ɔː s a k a
Osaka | Accent | 4モーラ0型 (4 morae, accent type 0) | | 四声三声 (4th tone, 3rd tone) |
Tokyo | Surface | 東京 | ・・ | 東京 | Tokyo
Tokyo | Pronunciation | とうきょう | ・・ | トンジン, Tong1jing1 | Tōkyō
Tokyo | Accent | ・・ | | ・・ | ・・

Global standardization of the lexicon format, together with a system to collect and provide lexicons for all languages, is required to maintain a reliable lexicon across S2ST systems.
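One way to picture such a shared lexicon is as a mapping from an entry to per-language forms, each carrying the three attributes from the slide (surface, pronunciation, accent). The sketch below is an assumption-laden illustration: the `Form` class, the language codes, and the helper functions are hypothetical, and only values that appear on the slide are filled in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Form:
    surface: str              # written form
    pronunciation: str = ""   # reading or phonetic transcription
    accent: str = ""          # accent or tone information


# Entry -> language code -> form. Language codes are assumed for this example.
LEXICON = {
    "Osaka": {
        "ja": Form("大阪", "おおさか", "4モーラ0型"),
        "zh": Form("大阪", "Da4ban3", "四声三声"),
        "en": Form("Osaka", "ɔː s a k a"),
    },
    "Tokyo": {
        "ja": Form("東京", "とうきょう"),
        "zh": Form("東京", "Tong1jing1"),
        "en": Form("Tokyo", "Tōkyō"),
    },
}


def find_entry(surface: str, language: str) -> Optional[str]:
    """Reverse lookup: which entry has this surface form in this language?"""
    for entry, forms in LEXICON.items():
        form = forms.get(language)
        if form is not None and form.surface == surface:
            return entry
    return None


def convert(surface: str, source: str, target: str) -> Optional[Form]:
    """Map a surface form in the source language to the target-language form."""
    entry = find_entry(surface, source)
    if entry is None:
        return None
    return LEXICON[entry].get(target)


print(convert("大阪", "ja", "en"))
```

Because ASR, MT, and TTS all read the same entry, a proper noun added once becomes recognizable, translatable, and pronounceable in every module that uses the lexicon.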

Page 9: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16

Asian Network-Based S2ST System by A-STAR Consortium

1. National Institute of Information and Communications Technology (NICT), Japan

2. Electronics and Telecommunications Research Institute (ETRI), Korea

3. Chinese Academy of Sciences (CASIA), China

4. National Electronics and Computer Technology Center (NECTEC), Thailand

5. Agency for the Assessment and Application of Technology (BPPT), Indonesia

6. Center for Development of Advanced Computing (CDAC), India

7. Institute of Information Technology (IOIT), Vietnam

8. Institute for Infocomm Research (I2R), Singapore

Page 10: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16

Server Location for Network-based S2ST

Page 11: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16

Speech Translation using Distributed Service Servers

Example: Korean-to-Thai speech translation

[Figure: a speech translation service client and distributed ASR, MT, and TTS servers.]

① Speech recognition (Korean), ASR server: speech (Korean) ⇒ text (Korean)

② Language translation (Korean→Thai), MT server: text (Korean) ⇒ translated text (Thai)

③ Speech synthesis (Thai), TTS server: translated text (Thai) ⇒ synthesized speech (Thai)
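The Korean-to-Thai example amounts to choosing one server per step according to the languages involved. A minimal sketch of that routing follows, assuming a hypothetical in-process registry in place of real network servers. The registry keys, language codes, and stand-in handlers are all invented for illustration, and whether the client or the servers drive the sequence is not specified by the slide.

```python
from typing import Callable, Dict, Tuple

Handler = Callable[[str], str]

# (module, language or language pair) -> handler standing in for a remote server.
REGISTRY: Dict[Tuple[str, str], Handler] = {}


def register(module: str, language: str, handler: Handler) -> None:
    """Announce that a server for this module and language is available."""
    REGISTRY[(module, language)] = handler


# Stand-in servers that only wrap their input so the data flow stays visible.
register("ASR", "ko", lambda speech: f"text(ko)<{speech}>")
register("MT", "ko-th", lambda text: f"text(th)<{text}>")
register("TTS", "th", lambda text: f"speech(th)<{text}>")


def translate_speech(speech: str, source: str, target: str) -> str:
    """Route the input through ASR, MT, and TTS servers chosen by language."""
    steps = [("ASR", source), ("MT", f"{source}-{target}"), ("TTS", target)]
    data = speech
    for key in steps:
        if key not in REGISTRY:
            raise LookupError(f"no server registered for {key}")
        data = REGISTRY[key](data)
    return data


print(translate_speech("speech(ko)", "ko", "th"))
```

Adding a language pair in this picture means registering more servers rather than rebuilding the client, which is the practical argument for a common protocol among modules.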

Page 12: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16

S2ST Client and Server

Page 13: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16
Page 14: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16

Scope of Standardization

Draft | Title | Scope | Target Date
F.S2STreqs | Functional Requirements for Network-based S2ST | Definition of network-based S2ST; functions and service requirements of network-based S2ST | During this Study Period (2009-2012)
H.S2STarch | Architectural Requirements for Network-based S2ST | Functional architectures, mechanisms and interfaces of network-based S2ST | During this Study Period (2009-2012)

Table: Draft roadmap to develop standards for network-based S2ST

Page 15: Initiation of Standardization on  Network-based Speech-to-speech Translation  at ITU-T SG16

Conclusion

We would like to invite more people to take part in standardization activities on network-based S2ST systems.

By leveraging standardization, network-based S2ST systems can cover more languages.