Source: tts.speech.cs.cmu.edu/courses/11492/slides/sds_treedialogs.pdf

Speech Processing 11-492/18-492

Spoken Dialog Systems

Tree based dialogs

VoiceXML

State-based Dialogs

Simple state-based dialog systems

Get Name

Get Account number

Get PIN

Present balance

Go back to start or exit
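
The five steps above can be read as a small state machine. The following is a minimal sketch only: the state names mirror the slide, while the function names and the single optional loop back to the start are assumptions made for illustration.

```python
from typing import List, Optional

# State names follow the slide; the linear ordering is an assumption.
STATES = ["get_name", "get_account_number", "get_pin",
          "present_balance", "restart_or_exit"]


def next_state(current: str, user_wants_restart: bool = False) -> Optional[str]:
    """Return the state that follows `current`, or None when the call ends."""
    if current == "restart_or_exit":
        return "get_name" if user_wants_restart else None
    return STATES[STATES.index(current) + 1]


def walk(restart_once: bool = False) -> List[str]:
    """Trace the states visited in one call, optionally looping back once."""
    visited: List[str] = []
    state: Optional[str] = "get_name"
    restarts_left = 1 if restart_once else 0
    while state is not None:
        visited.append(state)
        wants_restart = state == "restart_or_exit" and restarts_left > 0
        if wants_restart:
            restarts_left -= 1
        state = next_state(state, wants_restart)
    return visited
```

A real system would attach a prompt, a recognizer call, and error handling to each state; here only the transitions are modeled.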

State-based Dialogs

Get Name:

What is your name?

ASR Name

May be correct (in the database)

May be unknown (not in database)

May not be a name (What do I say?/Help/Repeat)

Should you echo the recognized name?

Confirmation (or not)

State-based dialog

Get name

Check in database

Ask again if not

Deal with help

Get account number

Check in database (with name)

Confirm account number and name

For security
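
The "get name" state described above (check the database, ask again if not found, deal with help, confirm) might be sketched as follows. Everything here is hypothetical: `get_name`, `HELP_WORDS`, the retry limit, and the `confirm` callback are illustrative names, and the ASR output is simulated as a list of strings.

```python
from typing import Callable, Iterable, Optional, Set

# Assumed examples of inputs that are not names.
HELP_WORDS = {"help", "repeat", "what do i say"}


def get_name(asr_results: Iterable[str], database: Set[str],
             confirm: Callable[[str], bool], max_tries: int = 3) -> Optional[str]:
    """Consume recognized utterances until a known, confirmed name is found."""
    results = iter(asr_results)
    for _ in range(max_tries):
        heard = next(results, None)
        if heard is None:
            break  # caller stopped talking
        heard = heard.strip().lower()
        if heard in HELP_WORDS:
            continue  # a real system would play a help prompt here
        if heard not in database:
            continue  # unknown name: ask again
        if confirm(heard):
            return heard  # in the database and confirmed by the caller
    return None  # give up after max_tries turns
```

Note that in this sketch a help request uses up one of the allowed turns; whether it should is a design decision the slide leaves open.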

State-based Interaction

Trees can get very large

User can get lost easily

You want to minimize the number of turns

Faster throughput means more calls

Faster throughput means happier customer

The level of help

First time users *need* a successful call

Otherwise, they won't call back

Having very helpful prompts is good

At the start, but they get annoying quickly

Designing prompts is a craft

What is spoken should be understood

How much should you tailor it to the user?

VoiceXML

A W3C standard for voice browsing

XML-based “programming” language for speech

Output synthesized (and recorded) speech

Recognition of speech

Recording of spoken input

Telephony features

VoiceXML

From grammars (JSGF: Java Speech Grammar Format)

From tri-grams

From “Domain Managers”

Credit card numbers

City, State

VoiceXML

<ssml> markup

Choice of voice

Choice of language

Choice of how to pronounce things

Specify breaks, timing, emphasis
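
As an illustration of the markup choices listed above, here is a minimal sketch that builds a small SSML fragment with Python's standard library. The elements `speak`, `voice`, `break`, and `emphasis` are SSML elements; the function name, the voice name `example-voice`, and the default values are assumptions, and the namespace declaration a full SSML document would carry is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def build_ssml(text_before: str, emphasized: str,
               voice_name: str = "example-voice",  # hypothetical voice name
               language: str = "en-US", pause: str = "500ms") -> str:
    """Return an SSML string: plain text, a pause, then an emphasized phrase."""
    # Choice of language on the root element.
    speak = ET.Element("speak", {"version": "1.0", "xml:lang": language})
    # Choice of voice.
    voice = ET.SubElement(speak, "voice", {"name": voice_name})
    voice.text = text_before
    # Breaks and timing.
    ET.SubElement(voice, "break", {"time": pause})
    # Emphasis.
    emph = ET.SubElement(voice, "emphasis", {"level": "strong"})
    emph.text = emphasized
    return ET.tostring(speak, encoding="unicode")
```

Calling `build_ssml("Your bus is", "61C")` yields one `speak` element containing a `voice` element with the text, a `break`, and an `emphasis` child.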

Structure

<vxml version="2.0">
  <form>
    <block>
    </block>
  </form>
  <form>
    <block>
      Goodbye!
    </block>
  </form>
</vxml>
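
The document structure above (a `vxml` root holding forms, each holding a block) can also be produced programmatically. This is a sketch under assumptions: the form ids `first` and `last`, the function name, and the first message are hypothetical; the `goto` is included because a VoiceXML interpreter does not fall through from one form to the next on its own.

```python
import xml.etree.ElementTree as ET


def build_two_form_document(first_message: str, last_message: str) -> str:
    """Return a two-form VoiceXML skeleton as a string."""
    vxml = ET.Element("vxml", {"version": "2.0"})

    # First form: say something, then jump explicitly to the second form.
    first = ET.SubElement(vxml, "form", {"id": "first"})  # hypothetical id
    block = ET.SubElement(first, "block")
    block.text = first_message
    ET.SubElement(block, "goto", {"next": "#last"})

    # Second form: say the closing message.
    last = ET.SubElement(vxml, "form", {"id": "last"})  # hypothetical id
    ET.SubElement(last, "block").text = last_message

    return ET.tostring(vxml, encoding="unicode")
```

For example, `build_two_form_document("Hello!", "Goodbye!")` returns a document whose second block contains the slide's "Goodbye!" text.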

Basic Tags

<field> gather info from the user through speech

<record> record data from the user

<subdialog> perform a sub-dialog

<field> tag

<form>
  <field>
    <prompt>Which bus line do you want?</prompt>
    <help>Please say your desired bus number, e.g. 61C</help>
  </field>
</form>

Flow of Control

Goto: <goto next="#GetBusNumber"/>

<prompt> Sorry that bus no longer runs</prompt>

<prompt> Sorry it’ll be a long wait </prompt>

<prompt> One will be along shortly </prompt>
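
The three prompts above read like the branches of a conditional, but the conditions themselves do not appear in this transcript. The sketch below therefore invents them purely for illustration: the function name, the meaning of `None`, and the 30-minute threshold are all assumptions, not the original slide's logic.

```python
from typing import Optional


def wait_prompt(minutes_until_next_bus: Optional[int],
                long_wait_threshold: int = 30) -> str:
    """Pick one of the three prompts based on an assumed waiting time."""
    if minutes_until_next_bus is None:
        # Assumed meaning: no further departures exist for this route.
        return "Sorry that bus no longer runs"
    elif minutes_until_next_bus > long_wait_threshold:
        # Assumed threshold for a "long" wait.
        return "Sorry it'll be a long wait"
    else:
        return "One will be along shortly"
```

In VoiceXML itself the same branching would be written with conditional elements inside a block or filled handler rather than in a host language.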

Variables

<prompt> I just wanted to say <value expr="var1"/> </prompt>

Recognition Grammars

Speech Recognition Grammar Specification (SRGS)

Augmented BNF

$order = I would like a $drink;

$drink = coke | pepsi | mountain_dew;
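
To make the two rules above concrete, here is a small sketch that expands them into a regular expression and tests utterances against it. It handles only what the slide uses (literal words, `|` alternatives, and `$rule` references); it is not an SRGS parser, and the names `RULES`, `rule_to_regex`, and `matches` are made up for this example.

```python
import re
from typing import Dict

# The two rules from the slide, keyed by rule name (without the `$`).
RULES: Dict[str, str] = {
    "order": "I would like a $drink",
    "drink": "coke | pepsi | mountain_dew",
}


def rule_to_regex(name: str, rules: Dict[str, str]) -> str:
    """Expand a rule: `|` separates alternatives, `$x` refers to rule x."""
    alternatives = []
    for alt in rules[name].split("|"):
        parts = []
        for token in alt.split():
            if token.startswith("$"):
                parts.append(rule_to_regex(token[1:], rules))  # nested rule
            else:
                parts.append(re.escape(token))  # literal word
        alternatives.append(r"\s+".join(parts))
    return "(?:" + "|".join(alternatives) + ")"


def matches(utterance: str, root: str = "order") -> bool:
    """True if the whole utterance is covered by the root rule."""
    pattern = rule_to_regex(root, RULES)
    return re.fullmatch(pattern, utterance.strip(), flags=re.IGNORECASE) is not None
```

With these rules, "I would like a pepsi" is accepted, while "I would like a tea" and a bare "pepsi" are rejected, which shows how tightly a grammar constrains what the recognizer can return.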

VoiceXML Browsers

Compatibility

Not as compatible as one would like

<object> elements can differ between browsers (but are useful)

City, State recognizers

ECMAScript (JavaScript)

Beyond VoiceXML (in VoiceXML)

Mixing HTML/CGI scripts in VoiceXML

Use PHP to generate VoiceXML files

Use URLs (with ?...) to calculate/get data

http://weather.com?zip=“15213”

Use URLs to get waveforms

http://tts.com?text=“Hello World”
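
The URL-based tricks above amount to building query strings and referencing the result from a prompt. A minimal sketch, assuming the slide's example hosts are placeholders rather than real services, and with `data_url` and `audio_prompt` as made-up helper names:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def data_url(base: str, **params: str) -> str:
    """Append URL-encoded query parameters to a base URL."""
    return f"{base}?{urlencode(params)}"


def audio_prompt(url: str) -> str:
    """Wrap a waveform URL in a VoiceXML prompt with an audio element."""
    prompt = ET.Element("prompt")
    ET.SubElement(prompt, "audio", {"src": url})
    return ET.tostring(prompt, encoding="unicode")
```

For instance, `data_url("http://tts.com", text="Hello World")` gives `http://tts.com?text=Hello+World`; the space must be encoded, which the quoted form on the slide glosses over.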

VoiceXML future

N-gram grammar Markup Language

Many browsers have own extensions

Pronunciation Lexicon Markup Language

A way to add new items to the lexicon

Hard to find good standards

Call Control Markup Language

For management and logging of calls

Microsoft SALT

SALT tags

listen, dtmf, prompt, bind, grammar (plus SSML)

Designed for the desktop, not just the phone

Designed for shared documents

Viewing (HTML) and Speech (SALT)

Available Systems

Nuance

BeVocal

Tellme

Tellme Studio

OpenVXI/publicvoicexml.org

Many others

SDS Architecture
