Rhetorical Group plc, Marc Moens, January 2001

Focus of Rhetorical Group
Speech synthesis
  Producing natural-sounding speech
  Talking computers
  Voices modelled on human voices and often almost indistinguishable from the original voice
Different from speech recognition
  Used in dictation, voice-controlled operating systems
  Different companies, targeting different markets
Core product: rVoice

Technological breakthrough
Old technology:
  Formant-based synthesis
  Diphone-based synthesis
  Lack vitality
  Monotonic
  Not suitable for extended use

Technological breakthrough
rVoice:
  Unit selection
  More natural sounding
  Suitable for extended use
  In many applications: almost indistinguishable from a human voice
“Welcome to our new speech synthesiser.”
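The slides name unit selection without describing how it works. As a rough illustration of the general technique (not of rVoice itself), the sketch below picks one recorded unit per target position so that the sum of a target cost (how well a unit matches what is wanted) and a join cost (how smoothly neighbouring units connect) is minimal, using dynamic programming. The function name `select_units`, the two cost functions, and the toy pitch values are all assumptions made for this example.

```python
def select_units(targets, candidates, target_cost, join_cost):
    """Return the cheapest sequence of units, one per target position.

    targets:    list of target specifications (here: desired pitch in Hz)
    candidates: list of lists; candidates[i] holds the units usable at position i
    """
    # best[i][j] = (cost of the cheapest path ending in candidates[i][j],
    #               index of the chosen predecessor in candidates[i - 1])
    best = [[(target_cost(targets[0], c), None) for c in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for c in candidates[i]:
            cost, prev = min(
                (best[i - 1][k][0] + join_cost(p, c), k)
                for k, p in enumerate(candidates[i - 1])
            )
            row.append((cost + target_cost(targets[i], c), prev))
        best.append(row)

    # Trace back from the cheapest final unit.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return path


# Toy data (hypothetical): each unit is (label, pitch in Hz).
targets = [100, 110, 120]
candidates = [
    [("a1", 100), ("a2", 130)],
    [("b1", 140), ("b2", 112)],
    [("c1", 118), ("c2", 90)],
]

def pitch_target_cost(target_pitch, unit):
    # Distance between the wanted pitch and the unit's pitch.
    return abs(target_pitch - unit[1])

def pitch_join_cost(left_unit, right_unit):
    # Pitch jump at the boundary between two adjacent units.
    return abs(left_unit[1] - right_unit[1])

chosen = select_units(targets, candidates, pitch_target_cost, pitch_join_cost)
print([label for label, _ in chosen])  # ['a1', 'b2', 'c1']
```

A real system would compare many more features than pitch and search a far larger unit inventory; the sketch only shows the shape of the search.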

Speech synthesis
Human voice vs synthesised voice
  Under controlled conditions
  Mixing the human voice with the synthesised voice
“Previously he was vice president of Eastern Edison.”
“Mrs Hill said many of the 25 countries that she placed under varying degrees of scrutiny had made genuine progress on this touchy issue.”

Rhetorical Team
Senior management:
  Marc Moens (CEO)
  Paul Taylor (CTO)
  Peter Denyer (Chairman)
Other management:
  Keith Edwards (applications manager)
  Ian Hodson (product development manager)
  Art Blokland (consultancy manager)
Full team: 35 people

rVoice outlook
A variety of applications and platforms:
  Telephony industry
  Games
  Internet
  Mobile communications
A variety of input mechanisms:
  Text (TTS)
  Concept to speech (in conjunction with language generation)
  Domain-specific applications
A variety of voices and languages:
  rVoice rapid voice prototyper allows new voices to be added to the system in a matter of weeks
  Different accents and languages covered
All within a single generic system

rVoice core capabilities: domain-specific synthesis
Flexible, scalable domain-specific synthesis:
  Airline information
  Car directions
  Financial news

rVoice core capabilities: multi-linguality
Currently only English available
Plans:
  German and French by Q2 2001
  Spanish Q3
  Dutch and Italian Q4
Same engine for all languages

rVoice core capabilities: text analysis
Robust statistical:
  Text normalisation ($1.43 > one dollar forty three cents)
  POS tagging
  Phrase break prediction
  Letter-to-sound rule transduction (including automatic training)
  Syntactic parsing
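To make the text normalisation example above concrete, here is a minimal sketch that expands small dollar amounts into words. It is a hand-written rule, not the statistical approach the slide refers to, and the names `number_to_words`, `expand_dollars` and `normalise` are assumptions made for this illustration. It only covers amounts below $1000 with exactly two decimal places.

```python
import re

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer in the range 0..999."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0..999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + (" " + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


# "$" followed by 1-3 digits, a dot, and exactly two digits.
DOLLAR_AMOUNT = re.compile(r"\$(\d{1,3})\.(\d{2})(?!\d)")


def expand_dollars(match):
    """Turn one regex match such as "$1.43" into its spoken form."""
    dollars = int(match.group(1))
    cents = int(match.group(2))
    parts = [number_to_words(dollars), "dollar" if dollars == 1 else "dollars"]
    if cents:
        parts += [number_to_words(cents), "cent" if cents == 1 else "cents"]
    return " ".join(parts)


def normalise(text):
    """Replace every supported dollar amount in the text with words."""
    return DOLLAR_AMOUNT.sub(expand_dollars, text)


print(normalise("The share price rose by $1.43 today."))
# The share price rose by one dollar forty three cents today.
```

Dates, abbreviations, larger numbers and other currencies would each need their own handling, which is one reason a trained, statistical normaliser is attractive.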

Talking Heads
Ongoing work on rFace
  Ability to capture a 3D model of any head and combine it with speech

System Overview
Two systems:
rVoice developer
  Single-user, stand-alone system with scripting language and graphical tools
rVoice run
  Compact, fast run-time system, multithreaded
  Client-server architecture and telephony hardware communications

Current Platforms
Solaris 2.5, 2.5.1, 2.6, 2.7
FreeBSD 2.2, 3.x
Linux (Red Hat 4.1, 5.0, 5.1, 5.2, 6.0 and other Linux distributions)
OSF (DEC Alphas), SGI (IRIX), HP (HP-UX)
Windows 95, 98, NT 4.0, 2000: Visual C++ v5.0 and v6.0

Speed and Size
rVoice 1.0 aims:
  10 simultaneous channels on a 1 GHz Pentium
  256M RAM, of which
    15M taken up by each channel
    75M of shared resource
  Higher number of channels available with proportionate voice quality reduction
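The memory figures above can be checked with a little arithmetic. The sketch below assumes memory is the only constraint and ignores the operating system's own needs; the function names are made up for this illustration, and only the 15M, 75M and 256M figures come from the slide.

```python
def memory_needed_mb(channels, per_channel_mb=15, shared_mb=75):
    """Total memory for a given number of channels plus the shared resource."""
    return shared_mb + channels * per_channel_mb


def max_channels(total_mb, per_channel_mb=15, shared_mb=75):
    """Largest channel count that fits in total_mb of memory."""
    available = total_mb - shared_mb
    return max(available // per_channel_mb, 0)


print(memory_needed_mb(10))  # 225
print(max_channels(256))     # 12
```

Ten channels therefore need 225M, which leaves 31M of the 256M unused under these assumptions.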

System Development Schedule
Basic prototype: January 31, 2001
Alpha release: February 28, 2001 (single thread)
Beta release: April 15, 2001 (multi-threaded)
Full release: May, June 2001

Development Schedule: capabilities
First basic British voice: 6th December 2000
Five British voices: end January 2001
Five American voices: mid February 2001
VoiceXML: end January 2001
Fast unit selection: end January 2001

Future Plans
Extension to new platforms including games and mobile devices
Development and integration with:
  language generation
  information extraction and retrieval

Contact
Marc Moens
Rhetorical Group plc
4, Buccleuch Place
Edinburgh EH8 9LW
Tel: 0131 650 4427, 07979 596770
Fax: 0131 650 4587
Email: [email protected]

Peter Denyer
Rhetorical Group plc
4, Buccleuch Place
Edinburgh EH8 9LW
Tel: 07770 416 699
Fax: 0131 650 4587
Email: [email protected]