Rhetorical Group plc

Marc Moens

January 2001

Focus of Rhetorical Group

Speech Synthesis
  Producing natural sounding speech
  Talking computers
  Voices modelled on human voices and often almost indistinguishable from the original voice

Different from speech recognition
  Used in dictation, voice-controlled operating systems
  Different companies, targeting different markets

Core product: rVoice

Technological breakthrough

Old technology:
  Formant-based synthesis
  Diphone-based synthesis

  Lack vitality
  Monotonic
  Not suitable for extended use

Technological breakthrough

rVoice:
  Unit selection

  More natural sounding
  Suitable for extended use
  In many applications: almost indistinguishable from a human voice

“Welcome to our new speech synthesiser.”
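
The slides name unit selection without describing it. As a rough illustration only, the textbook idea can be sketched as a search over recorded fragments that minimises a "target cost" (how well a fragment fits what is wanted) plus a "join cost" (how smoothly neighbouring fragments connect). Everything below is an assumption made for illustration and not rVoice's implementation: the `Unit` fields, both cost functions, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A recorded speech fragment (hypothetical minimal feature set)."""
    phone: str    # phone label this unit realises
    pitch: float  # mean pitch in Hz
    source: str   # identifier of the recording it was cut from


def target_cost(wanted_pitch: float, unit: Unit) -> float:
    # Mismatch between the pitch we want and the pitch the unit has.
    return abs(wanted_pitch - unit.pitch)


def join_cost(left: Unit, right: Unit) -> float:
    # Simplification: units from the same recording join for free;
    # otherwise the pitch jump at the join is penalised.
    if left.source == right.source:
        return 0.0
    return abs(left.pitch - right.pitch)


def select_units(targets, inventory):
    """Pick one unit per target so that total target + join cost is minimal.

    targets:   list of (phone, wanted_pitch) pairs
    inventory: dict mapping phone -> non-empty list of candidate Units
    """
    if not targets:
        return []

    # best[i][j] = (cheapest cost of a path ending in candidate j, back-pointer)
    best = []
    for i, (phone, wanted) in enumerate(targets):
        row = []
        for unit in inventory[phone]:
            cost = target_cost(wanted, unit)
            if i == 0:
                row.append((cost, None))
            else:
                prev_units = inventory[targets[i - 1][0]]
                total, back = min(
                    (best[i - 1][k][0] + join_cost(prev, unit), k)
                    for k, prev in enumerate(prev_units)
                )
                row.append((cost + total, back))
        best.append(row)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(inventory[targets[i][0]][j])
        j = best[i][j][1]
    return path[::-1]


inventory = {
    "h": [Unit("h", 100.0, "a"), Unit("h", 120.0, "b")],
    "i": [Unit("i", 130.0, "a"), Unit("i", 121.0, "b")],
}
chosen = select_units([("h", 100.0), ("i", 121.0)], inventory)
print([u.source for u in chosen])
```

In this toy inventory, choosing each fragment on target cost alone would mix the two recordings. The search instead keeps both fragments from recording "a", because the smoother join outweighs the slightly worse pitch match.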

Speech synthesis

Human voice vs Synthesised Voice
  Under controlled conditions
  Mixing the human voice with the synthesised voice

“Previously he was vice president of Eastern Edison.”

“Mrs Hill said many of the 25 countries that she placed under varying degrees of scrutiny had made genuine progress on this touchy issue.”

Rhetorical Team

Senior Management:
  Marc Moens (CEO)
  Paul Taylor (CTO)
  Peter Denyer (Chairman)

Other management:
  Keith Edwards (applications manager)
  Ian Hodson (product development manager)
  Art Blokland (consultancy manager)

Full Team: 35 people

rVoice outlook

A variety of applications and platforms:
  Telephony industry
  Games
  Internet
  Mobile communications

A variety of input mechanisms:
  Text (TTS)
  Concept to speech – in conjunction with language generation
  Domain specific applications

A variety of voices and languages:
  rVoice rapid voice prototyper allows new voices to be added to the system in a matter of weeks
  Different accents and languages covered

All within a single generic system

rVoice core capabilities: domain specific synthesis

Flexible, scalable domain specific synthesis
  Airline information
  Car directions
  Financial news

rVoice core capabilities: multi-linguality

Currently only English available

Plans:
  German and French by Q2 2001
  Spanish Q3
  Dutch and Italian Q4

Same engine for all languages

rVoice core capabilities: text analysis

Robust statistical:
  Text normalisation ($1.43 > one dollar forty three cents)
  POS tagging
  Phrase break prediction
  Letter-to-sound rule transduction (including automatic training)
  Syntactic parsing
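
To make the text normalisation example concrete, here is a minimal sketch that expands small dollar amounts such as "$1.43" into words. It is a hand-written, rule-based stand-in, not the statistical approach the slide describes. The function names, the regular expression, and the limit of 999 dollars are all assumptions made for illustration.

```python
import re

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999, without hyphens or 'and'."""
    if not 0 <= n <= 999:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"
    hundreds, rest = divmod(n, 100)
    head = f"{ONES[hundreds]} hundred"
    return head if rest == 0 else f"{head} {number_to_words(rest)}"


# "$" + 1-3 digits + optional two-digit cents. The lookahead rejects amounts
# that continue with more digits (e.g. "$1234.50" or "$1.435"), which this
# sketch deliberately leaves untouched.
DOLLARS = re.compile(r"\$(\d{1,3})(?:\.(\d{2}))?(?!\d|\.\d)")


def _expand(match: re.Match) -> str:
    dollars = int(match.group(1))
    parts = [f"{number_to_words(dollars)} {'dollar' if dollars == 1 else 'dollars'}"]
    if match.group(2) is not None:
        cents = int(match.group(2))
        parts.append(f"{number_to_words(cents)} {'cent' if cents == 1 else 'cents'}")
    return " ".join(parts)


def normalise(text: str) -> str:
    """Replace supported dollar amounts in text with their spoken form."""
    return DOLLARS.sub(_expand, text)


print(normalise("$1.43"))  # one dollar forty three cents
```

A real normaliser also has to handle dates, abbreviations, ordinals and ambiguous tokens, where the right reading depends on context. That is where statistical methods usually come in.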

Talking Heads

Ongoing work on rFace
  Ability to capture 3D model of any head, and combine it with speech

System Overview

Two systems:
  rVoice developer
    single user stand alone system with scripting language and graphical tools
  rVoice run
    compact fast run-time system, multithreaded
    Client server architecture and telephony hardware communications

Current Platforms

Solaris 2.5, 2.5.1, 2.6, 2.7
FreeBSD 2.2, 3.x
Linux (Redhat 4.1, 5.0, 5.1, 5.2, 6.0 and other Linux distributions)
OSF (Dec Alphas)
SGI (Irix)
HPs (HPUX)
Windows 95, 98, NT 4.0, 2000: Visual C++ v5.0 and v6.0

Speed and Size

rVoice 1.0 aims:
  10 simultaneous channels on Pentium 1GHz
  256M RAM, of which
    15M taken up by each channel
    75M of shared resource
  Higher number of channels available with proportionate voice quality reduction
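
The memory figures can be checked with a little arithmetic. The sketch below assumes a simple linear budget, with one shared block plus a fixed amount per channel. The slide does not state that model explicitly, and the function names are hypothetical.

```python
def memory_needed_mb(channels: int, per_channel_mb: int = 15, shared_mb: int = 75) -> int:
    """Total memory under the assumed linear model: shared + channels * per-channel."""
    return shared_mb + channels * per_channel_mb


def max_channels(total_mb: int = 256, per_channel_mb: int = 15, shared_mb: int = 75) -> int:
    """Largest channel count whose memory fits in total_mb under the same model."""
    return max(0, (total_mb - shared_mb) // per_channel_mb)


print(memory_needed_mb(10))  # 225
print(max_channels())        # 12
```

Under this reading, the 10-channel target uses 225M of the 256M. Memory alone would allow 12 channels, so the stated target leaves some headroom. The slide does not say whether CPU load or another resource sets the limit of 10.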

System Development Schedule

Basic Prototype: January 31, 2001

Alpha release: February 28, 2001 (single thread)

Beta release: April 15, 2001 (multi-threaded)

Full release: May, June 2001

Development Schedule: capabilities

First basic British voice: 6th December 2000

Five British voices: end January 2001

Five American voices: mid February 2001

VoiceXML: end January 2001

Fast unit selection: end January 2001

Future Plans

Extension to new platforms including games and mobile devices

Development and integration with:
  language generation
  information extraction and retrieval

Contact

Marc Moens
Rhetorical Group plc
4, Buccleuch Place
Edinburgh EH8 9LW
Tel: 0131 650 4427, 07979 596770
Fax: 0131 650 4587
Email: [email protected]

Peter Denyer
Rhetorical Group plc
4, Buccleuch Place
Edinburgh EH8 9LW
Tel: 07770 416 699
Fax: 0131 650 4587
Email: [email protected]

Make rVoice your voice.
