The Voice-Enabled Web: VoiceXML and Related Standards for Telephone Access to Web Applications
14 Feb. 2002
Christophe Strobbe
K.U.Leuven - ESAT-SCD-DocArch
Overview
• Voice browsers
• History of voice markup languages
• W3C Speech Interface Framework
• Communication Architecture
• VoiceXML 2.0
• Grammars
• SALT
• Not covered: WAP/WML, Voice over IP
Voice Browser
Device (hardware and software) that interprets voice markup languages to generate voice output and interpret voice input.
Companies
History
1990s: companies developed their own markup languages:
• PhoneML (AT&T)
• PhoneML (Lucent)
• VoxML (Motorola)
• TalkML (HP Labs)
• SpeechML (IBM)
=> VoiceXML Forum : VoiceXML 1.0
• 1998: W3C Voice Browser Workshop
VoiceXML Specification History
• April 1999 – Initial spec – Request For Comment
• August 1999 – 0.9 Spec released
• March 2000 – 1.0 Spec released
• October 2001 – 2.0 Working Draft (W3C)
• March 2002 – next Working Draft
• 4th quarter 2002 – 2.0 Recommendation W3C?
Why Voice Markup Languages?
• “Voicifying” web pages by adding a few VoiceXML tags is not feasible:
– basic design principles that make a good web page are very different from those that make an efficient voice interface
– e.g. Raggett & Ben-Natan: “Voice Browsers” (W3C, 1998)
• … unless you want to create a multimodal interface (cf. SALT)?
Speech Interface Framework
[Diagram: W3C Speech Interface Framework. Components: User, Telephone System, ASR, DTMF tone recognizer, Language Understanding, Context Interpretation, Dialog Manager, Media Planning, Language Generation, TTS, Prerecorded audio player, World Wide Web. Associated specifications: VoiceXML 2.0, Speech Synthesis ML, Speech Recognition Grammar ML, N-gram Grammar ML, Natural Language Semantics ML, Lexicon, Reusable Components.]
Communication Architecture
What is VoiceXML?
For creating audio dialogs that include
• Synthesized speech
• Digitized audio
• Recognition of spoken and DTMF key input
• Recording of spoken input
• Telephony
• Mixed-initiative conversations
Major goal: bring the advantages of web-based development and content delivery to interactive voice response applications.
Advantages of VoiceXML
As perceived by Motorola et al:
• People want a better mobile user interface while on the go
• Device Independent
• Open standards create and drive market demand
• Easy to program since similar to other XML-based languages
• Utilizes existing web infrastructure
Developing applications
• To develop VoiceXML applications you have to learn several languages:
– VoiceXML
– ECMAScript (JavaScript/JScript)
– a grammar format (GSL, JSGF, Speech Recognition Grammar Specification)
– a back end scripting language (Perl, Java, …)
• Web developers are used to this kind of environment
VoiceXML Basics
• XML-based
• More structured than HTML (describes structure and semantics of data, not presentation)
– Must close all tags (e.g. <prompt> </prompt>)
• Structure of language described in a Document Type Definition (DTD)
VoiceXML Applications
• An application consists of a single application root document as well as zero or more other documents
• The application root document is loaded whenever any other document is accessed
• The application root document grammars and variables are visible in other application documents
[Diagram: a document root with three documents below it]
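As an illustration of the root/leaf relationship, here is a minimal sketch of a two-document application. The file names `app-root.vxml` and `order.vxml`, the variable `shopname`, and the spoken phrases are invented for this example, and the GSL grammar type is platform-specific. The root document declares a variable and a link:

```xml
<?xml version="1.0"?>
<!-- app-root.vxml: hypothetical application root document -->
<vxml version="1.0">
  <!-- Declared in the root, so visible in every document of the application -->
  <var name="shopname" expr="'Corner Cafe'"/>

  <!-- The grammar of a root-level link stays active in all application documents -->
  <link next="app-root.vxml#goodbye">
    <grammar type="application/x-gsl">[goodbye]</grammar>
  </link>

  <form id="goodbye">
    <block>
      <prompt>Thank you for calling.</prompt>
      <exit/>
    </block>
  </form>
</vxml>
```

A leaf document names its root through the `application` attribute and can then read the root's variables:

```xml
<?xml version="1.0"?>
<!-- order.vxml: hypothetical leaf document that names its root -->
<vxml version="1.0" application="app-root.vxml">
  <form id="welcome">
    <block>
      <!-- shopname lives in the application scope provided by the root -->
      <prompt>Welcome to <value expr="application.shopname"/>.</prompt>
    </block>
  </form>
</vxml>
```

When the interpreter fetches `order.vxml`, it also loads `app-root.vxml`, so the caller can say "goodbye" from within the leaf document.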
VoiceXML Documents
• Documents can contain two types of dialogs:
– forms (<form>)
– menus (<menu>)
• Other elements:
– <meta>: metadata, defined as name/value pair
– <var>: for declaring variables
– <script>: for client-side ECMAScript
– <catch>: for catching events
– <link>: transitions to other dialogs
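The document-level elements listed above can be combined roughly as follows. This is a sketch only: the file name, the variable `attempts`, the helper function `plural`, and the phrases are invented for illustration, and the GSL grammar type is platform-specific.

```xml
<?xml version="1.0"?>
<!-- elements.vxml: hypothetical document showing document-level elements -->
<vxml version="1.0">
  <meta name="author" content="example author"/>

  <!-- Document-scoped variable, initialised by an ECMAScript expression -->
  <var name="attempts" expr="0"/>

  <!-- Client-side ECMAScript helper, callable from expr attributes -->
  <script><![CDATA[
    function plural(n, word) {
      return n + ' ' + word + (n == 1 ? '' : 's');
    }
  ]]></script>

  <!-- Runs whenever a nomatch event is thrown in this document -->
  <catch event="nomatch">
    <assign name="attempts" expr="attempts + 1"/>
    <prompt>Sorry, that is <value expr="plural(attempts, 'failed attempt')"/>.</prompt>
    <reprompt/>
  </catch>

  <!-- Saying "main menu" in any dialog of this document jumps to #main -->
  <link next="#main">
    <grammar type="application/x-gsl">[(main menu)]</grammar>
  </link>

  <form id="main">
    <field name="drink">
      <prompt>Would you like coffee, tea, or juice?</prompt>
      <grammar type="application/x-gsl">[coffee tea juice]</grammar>
      <filled>
        <prompt>You chose <value expr="drink"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Because `<var>`, `<script>`, `<catch>` and `<link>` sit directly under `<vxml>`, they apply to every dialog in the document rather than to a single form.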
Forms and menus
• Forms may contain zero or more <field> elements
– the user must provide a value for the field before proceeding to the next element in the form
– each field may specify a grammar that defines the allowable inputs
• Menus may contain one or more <choice> elements
– a menu presents the user with a choice of options and then transitions to another dialog
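A menu that hands over to a form might look like the sketch below. The file names `mainmenu.vxml` and `hours.vxml`, the dialog ids, and the wording of the prompts are assumptions made for this example.

```xml
<?xml version="1.0"?>
<!-- mainmenu.vxml: hypothetical menu that routes the caller to other dialogs -->
<vxml version="1.0">
  <menu id="main">
    <!-- <enumerate/> reads out the text of each choice in turn -->
    <prompt>Welcome. Say one of: <enumerate/></prompt>
    <!-- Each choice names the dialog or document to go to when it is selected -->
    <choice next="#order">order</choice>
    <choice next="hours.vxml">opening hours</choice>
    <noinput>Please say one of: <enumerate/></noinput>
    <nomatch>Sorry, I did not understand. <reprompt/></nomatch>
  </menu>

  <form id="order">
    <!-- The field is visited until the caller supplies a value matching the grammar -->
    <field name="item">
      <prompt>Would you like coffee, tea, or juice?</prompt>
      <grammar type="application/x-gsl">[coffee tea juice]</grammar>
      <filled>
        <prompt>Your <value expr="item"/> will be ready momentarily.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

The menu needs no explicit grammar here, since the text of each `<choice>` is used to generate one, whereas the field states its allowable inputs in a `<grammar>` element.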
VoiceXML Example
<?xml version="1.0"?>
<!-- helloworld.vxml -->
<vxml version="1.0">
  <form>
    <block>
      <prompt>
        Hello World!
      </prompt>
    </block>
  </form>
</vxml>
Example with Grammar
<vxml version="1.0">
  <meta name="maintainer" content="[email protected]"/>
  <form id="hello">
    <field name="item">
      <prompt>Would you like coffee, tea, or juice?</prompt>
      <grammar type="application/x-gsl">[coffee tea juice]</grammar>
      <filled>
        <prompt>Your <value expr="item"/>
          will be ready momentarily</prompt>
      </filled>
    </field>
  </form>
</vxml>
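The GSL grammar in this example is a vendor-specific format. As a sketch, the same three alternatives could be written in the XML form of the W3C Speech Recognition Grammar Specification roughly as follows. The rule name `drink` is invented for illustration, and the namespace and attributes follow the form later published as SRGS 1.0, so a platform based on an earlier working draft may expect slightly different syntax.

```xml
<?xml version="1.0"?>
<!-- drink.grxml: hypothetical SRGS grammar equivalent to [coffee tea juice] -->
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice" root="drink">
  <!-- The root rule accepts exactly one of the three words -->
  <rule id="drink" scope="public">
    <one-of>
      <item>coffee</item>
      <item>tea</item>
      <item>juice</item>
    </one-of>
  </rule>
</grammar>
```

A field would then typically reference the file instead of embedding the grammar inline, for example with `<grammar src="drink.grxml" type="application/srgs+xml"/>`, assuming the platform supports the XML grammar form.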
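Dynamic VoiceXML
#!perl -w
# CGI script that returns a VoiceXML document to the voice browser.
use strict;

# HTTP header: content type, then the blank line that ends the headers.
print "Content-type: text/x-vxml\n\n";

# The VoiceXML document to send; a real script would build this from data.
my $HOMEBUFFER = '<?xml version="1.0"?>
<vxml version="1.0">
  <form>
    <block>
      <prompt> Hello World </prompt>
    </block>
  </form>
</vxml>';
print $HOMEBUFFER;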
Other Markup Languages
• JSML: JSpeech Markup Language (Sun)
• Dialog ML (Dennis Heuer)
• SABLE (SABLE Consortium)
• DMML (Dialogue Moves Markup Language)
• SALT: Speech Application Language Tags (SALT Forum)
• (CallXML, Telephony Markup Language, …)
Progress since March 2000 (VoiceXML 1.0)?
SALT
• Speech Application Language Tags (SALT Forum)
• SALT Forum founded by Microsoft, Intel, …; 15 October 2001
• very simple set of tags for extending existing markup languages (XHTML, XML)
• specification available Q1 2002
• specification submitted to standards body (W3C??) mid 2002