
  • ibm.com/redbooks Redpaper

    Front cover

Adding Voice to your Portlet Applications

Juan R. Rodriguez
Eric Derksen

Muhammed Omarjee
Leandro Pedroso

    Run voice portlets with WebSphere Voice Application Access V5.0


    Use actions and messaging to integrate voice portlets

http://www.redbooks.ibm.com/

  • International Technical Support Organization

    Adding Voice to your Portlet Applications

    July 2004

Copyright International Business Machines Corporation 2004. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

    First Edition (July 2004)

    This edition applies to Version 5, Release 0 of WebSphere Voice Toolkit and WebSphere Voice Application Access.

Note: Before using this information and the product it supports, read the information in "Notices" on page vii.

  • Contents

Notices . . . . . vii
Trademarks . . . . . viii

Preface . . . . . ix
The team that wrote this Redpaper . . . . . ix
Become a published author . . . . . x
Comments welcome . . . . . x

Chapter 1. Overview . . . . . 1
1.1 Introduction to voice portlet applications . . . . . 2
1.1.1 Terminology . . . . . 3
1.2 Developing a speech application based on a GUI portlet . . . . . 3
1.2.1 What is a speech application? . . . . . 5
1.2.2 What is VoiceXML? . . . . . 5
1.2.3 Converting GUI widgets to speech user interfaces . . . . . 5
1.2.4 Designing dialogs . . . . . 6
1.3 Implementing a basic speech application . . . . . 7
1.3.1 Example of a basic speech application . . . . . 7
1.4 Developing voice portlet applications . . . . . 8
1.4.1 Voice portal content structure . . . . . 8

Chapter 2. Voice and portlet toolkit environment . . . . . 11
2.1 Software prerequisites for the development environment . . . . . 12
2.2 Getting started . . . . . 12
2.3 Installing the development environment . . . . . 12
2.3.1 Installing WebSphere Studio Application Developer . . . . . 12
2.3.2 Portal Toolkit V5.0.2 for WebSphere Studio . . . . . 13
2.3.3 Voice Toolkit for WebSphere Studio . . . . . 14
2.3.4 Installing the Voice Aggregator . . . . . 18
2.4 The voice perspective . . . . . 19
2.4.1 Testing the microphone . . . . . 19
2.4.2 Creating a VoiceXML file . . . . . 20
2.4.3 Using the VoiceXML editor . . . . . 21
2.4.4 Running and debugging VoiceXML files . . . . . 24
2.5 The voice portlet perspective . . . . . 25
2.5.1 Creating a new portlet application project . . . . . 26
2.5.2 File naming conventions . . . . . 28
2.5.3 Creating .vxml files for your portlet . . . . . 28
2.5.4 Converting .vxml files to .jsv files . . . . . 29
2.5.5 Defining a test environment on a local server . . . . . 30
2.5.6 Starting the server . . . . . 31
2.5.7 Test the voice portlet . . . . . 32
2.5.8 Portal log files . . . . . 33
2.5.9 Debugging the voice portlet . . . . . 33

Chapter 3. VoiceXML fragments . . . . . 37
3.1 Grammars . . . . . 38
3.1.1 Built-in grammars . . . . . 38
3.1.2 Inline grammars . . . . . 38

    Copyright IBM Corp. 2004. All rights reserved. iii

3.1.3 External grammars . . . . . 40
3.1.4 Reviewing grammar results in your application . . . . . 42
3.2 Help . . . . . 42
3.2.1 Help functions in voice portals . . . . . 43
3.2.2 Self-revealing help . . . . . 44
3.3 Reusable dialog components . . . . . 45

Chapter 4. Using the Portlet API . . . . . 49
4.1 Overview of core portlet components . . . . . 50
4.1.1 Portlet modes . . . . . 50
4.1.2 MIME types . . . . . 51
4.1.3 Portlet descriptor . . . . . 51
4.1.4 Portlet API tag library . . . . . 52
4.1.5 Voice Aggregator tag libraries . . . . . 53
4.2 Action events . . . . . 53
4.2.1 Sample scenario . . . . . 55
4.2.2 Portlet descriptor . . . . . 60
4.2.3 Running the sample scenario . . . . . 61
4.3 Portlet messaging . . . . . 63
4.3.1 Sample scenario enhancements . . . . . 64
4.3.2 Running the sample scenario . . . . . 69
4.4 Global commands . . . . . 70
4.4.1 Catching errors . . . . . 71
4.4.2 Catching help requests . . . . . 72
4.4.3 Input mode switching . . . . . 72
4.4.4 A sample scenario using DTMF . . . . . 72
4.4.5 A sample voice portlet using general commands . . . . . 73
4.4.6 A sample scenario using DTMF and voice . . . . . 74
4.5 Other Portlet API services . . . . . 76
4.6 National language support . . . . . 77
4.6.1 Setting the locale for your application . . . . . 77
4.6.2 Whole resource translation . . . . . 79
4.6.3 Resource bundles translation . . . . . 80

Chapter 5. Sample scenarios . . . . . 83
5.1 Scenario 1: a company directory voice portlet . . . . . 84
5.1.1 Directory application . . . . . 84
5.1.2 Directory model . . . . . 85
5.1.3 Starting the development . . . . . 86
5.2 Scenario 2: ice cream voice portlet . . . . . 89
5.2.1 Using the Call Flow Builder . . . . . 89
5.2.2 Generating a voice application from the call flow . . . . . 93
5.3 Scenario 3: stock application . . . . . 94
5.3.1 Grammar file . . . . . 95
5.3.2 Dialogue component . . . . . 96
5.3.3 Extracting the data . . . . . 97
5.3.4 Giving the caller the information . . . . . 97

Chapter 6. Deploying and running voice portlet applications . . . . . 99
6.1 Basic architecture . . . . . 100
6.2 Installation . . . . . 100
6.2.1 Software prerequisites . . . . . 100
6.2.2 Installation order . . . . . 101
6.3 Configuration . . . . . 102

    iv Adding Voice to your Portlet Applications

6.4 Deployment . . . . . 108

Appendix A. Additional tools in the Voice Toolkit . . . . . 109
A.1 Grammar file test tool . . . . . 110
A.2 Pronunciations Builder . . . . . 114
A.3 Audio Recorder . . . . . 117

Appendix B. Installing the Voice Toolkit without Internet connection . . . . . 119

Appendix C. Additional material . . . . . 123
C.1 Locating the Web material . . . . . 123
C.2 Using the Web material . . . . . 123
C.2.1 System requirements for downloading the Web material . . . . . 123
C.2.2 How to use the Web material . . . . . 124

Related publications . . . . . 125
IBM Redbooks . . . . . 125
Other publications . . . . . 125
Online resources . . . . . 125
How to get IBM Redbooks . . . . . 126
Help from IBM . . . . . 126



  • Notices

    This information was developed for products and services offered in the U.S.A.

    IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

    IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A.

    The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

    This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

    Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

    IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

    Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

    This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

    COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrates programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces.


Trademarks

The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:

AIX
Balance
CallFlow
Cloudscape
DB2
Eserver
eServer
Everyplace
ibm.com
IBM
Illustra
MVS
Notes
Perform
Redbooks (logo)
Redbooks
TME
ViaVoice
WebSphere

    The following terms are trademarks of other companies:

    Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

    Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

    UNIX is a registered trademark of The Open Group in the United States and other countries.

    Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product, and service names may be trademarks or service marks of others.


  • Preface

This IBM Redpaper is intended for solution architects and application developers who want to develop and test voice portlet applications and who have previously developed graphical user interface (GUI) portlet applications or VoiceXML applications. It shows how to develop a voice portlet using VoiceXML and the Portlet API, and it introduces the underlying concepts of speech applications. The Voice Toolkit V5.0, Portal Toolkit V5.0.2, and WebSphere Studio Application Developer V5.1 are used to develop sample scenarios that illustrate how to implement voice portlet applications. VoiceXML and the Portlet API are described with the help of example GUI portlets that progress to speech user interface (SUI) portlets. Deployment scenarios are included to illustrate how to use voice-enabled portlet applications.

A basic knowledge of Java, servlets, JSPs, the model-view-controller (MVC) pattern, and VoiceXML is recommended for developing voice portlet applications.

The team that wrote this Redpaper

This Redpaper was produced by a team of specialists from around the world while working at the International Technical Support Organization, Raleigh Center.

    Juan R. Rodriguez is a Consultant at the IBM ITSO Center, Raleigh. He received his Master of Science degree in Computer Science from Iowa State University. He writes extensively and teaches IBM classes worldwide on such topics as networking, Web technologies, and information security. Before joining the IBM ITSO, he worked at the IBM laboratory in the Research Triangle Park (North Carolina, USA) as a designer and developer of networking products.

    Eric Derksen works as a Technical Sales Specialist in the Advanced Technology Group, Region North, IBM EMEA WebSphere Pervasive Computing Division. He joined IBM in 1999 and has been working with Voice and Call Center products since then. First, he worked as a contact center consultant for IBM Business Consulting Services (BCS). In 2003, Eric became a technical sales representative for the IBM Software Group. He has knowledge of the IBM WebSphere pervasive portfolio and assists in architectural decisions, application development, proof-of-concepts and pre-sales activities involving IBM voice systems. He has worked on several interactive voice response (IVR), dual tone multiple frequency (DTMF), speech, and voicemail implementations throughout EMEA.

    Muhammed Omarjee is an IT Specialist with Standard Bank Group, South Africa. As an employee of the Group IT Solutions Center, he works as an analyst / programmer to develop mobile banking and IVR applications. Areas of expertise are centered around pervasive computing and Java (J2EE) technologies. Muhammed is a co-author of other IBM Redbooks related to pervasive computing. He holds a National Diploma in Information Technology from the Technikon Witwatersrand in Johannesburg, South Africa.


Leandro Pedroso is a System Specialist at IBM Brazil, where he has worked for seven years. He graduated as an Electronics Engineer from the Instituto Mauá de Tecnologia, São Caetano do Sul, Brazil, and currently works as a designer and implementor of voice application systems, including computer telephony integration (CTI) and IVR. He has also worked with other contact center technologies such as problem and incident management.

    Thanks to the following people for their contributions to this project:

George Kroner, Margaret Ticknor
International Technical Support Organization, Raleigh Center

Mary Fisher, Baiju Mandalia
IBM Boca Raton, Florida, USA

Adam Orentlicher, Jing Fu, Mai Pham
IBM Yorktown Heights, New York, USA

Girish Dhanakshirur (Boca Raton), Mai Pham (Fishkill)
IBM, USA

Karen Matthews
International Technical Support Organization, Raleigh Center

Become a published author

Join us for a two- to six-week residency program! Help write an IBM Redbook dealing with specific products or solutions, while getting hands-on experience with leading-edge technologies. You'll team with IBM technical professionals, Business Partners and/or customers.

    Your efforts will help increase product acceptance and customer satisfaction. As a bonus, you'll develop a network of contacts in IBM development labs, and increase your productivity and marketability.

    Find out more about the residency program, browse the residency index, and apply online at:

    ibm.com/redbooks/residencies.html

Comments welcome

Your comments are important to us!

    We want our papers to be as helpful as possible. Send us your comments about this Redpaper or other Redbooks in one of the following ways:

Use the online "Contact us" review redbook form found at:

    http://www.redbooks.ibm.com/contacts.html

    Send your comments in an Internet note to:

    [email protected]



  • Mail your comments to:

IBM Corporation, International Technical Support Organization
Dept. HZ8, Building 662
P.O. Box 12195
Research Triangle Park, NC 27709-2195



  • Chapter 1. Overview

Many business enterprises are currently providing mobile computing access by adding voice recognition to their computer systems. The availability of high-quality automated speech recognition and speech synthesis technologies, combined with lower-cost and higher-performance hardware, makes automated voice transactions feasible for most enterprise applications. Some examples are:

Accessing business information, including corporate front desk tasks such as identifying callers, automated telephone order processing, service help desks, order tracking, airline arrival and departure information, cinema and theater booking services, and home banking services.

    Accessing public information, including community information such as weather, traffic conditions, school closures, directions and events, news, stock market information, and business and e-commerce transactions.

    Accessing personal information, including calendars, address and telephone lists, to-do lists, shopping lists, and calorie counters.

    Assisting users to communicate with others by sending and receiving voice-mail and e-mail messages using their voices.

    This chapter provides an overview of voice portlet application development using the following software:

    WebSphere Studio Application Developer Version 5.1

    Portal Toolkit Version 5.0.2

    WebSphere Voice Application Access Version 5.0 Voice Aggregator

    Voice Toolkit Version 5.0

    Also in this chapter, the following topics will be discussed:

    1.1, Introduction to voice portlet applications on page 2

    1.2, Developing a speech application based on a GUI portlet on page 3

    1.3, Implementing a basic speech application on page 7

    1.4, Developing voice portlet applications on page 8



1.1 Introduction to voice portlet applications

Portlets are reusable components that provide access to Web-based content, applications, and other resources. Web pages, applications, and syndicated content feeds can be accessed through portlets. Companies can create their own portlets or select them from a catalog of third-party portlets. Portlets are intended to be assembled into a portal page, with multiple instances of the same portlet displaying different data for each individual user.

A voice portlet is an application, running on a portal server such as IBM WebSphere Portal, that provides its functionality through a speech user interface (SUI) as well as a graphical user interface (GUI). It can function as a stand-alone application or as a front-end interface to a back-end application. A voice portlet generates Voice eXtensible Markup Language (VoiceXML) documents that implement the SUI. The advantage is that users can access voice portlets anytime, anywhere, from any telephone.
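To make this concrete, the following is a minimal, illustrative sketch of the kind of VoiceXML document a voice portlet might generate; it is not taken from the product, and the form name and greeting text are invented for this example. Element names follow the VoiceXML 2.0 specification:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- A single dialog (form) containing one executable block -->
  <form id="welcome">
    <block>
      <!-- The prompt is rendered to the caller as synthesized speech -->
      <prompt>Welcome to the company voice portal.</prompt>
    </block>
  </form>
</vxml>
```

A voice browser (the VoiceXML interpreter behind the telephony channel) fetches such a document over HTTP and plays the prompt to the caller, much as a Web browser fetches and renders an HTML page.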

Users can interact with enterprise data (that is, data available through Web-style interfaces such as servlets, Active Server Pages (ASPs), JavaServer Pages (JSPs), JavaBeans, or Common Gateway Interface (CGI) scripts) using speech and a telephone instead of a keyboard and a mouse. Voice portlets give people who do not have access to a computer, due to time, location, or cost constraints, a convenient way to communicate and receive information. Providing conversational access to Web-based data enables companies to reach this largely untapped market.

    Advantages of providing voice portlets, rather than speech applications working on the voice server, include:

    Personalization: Users can customize their applications.

    Cooperation: You can provide voice portlets that cooperate with, and share information with, other portlets.

Multi-channel reuse: Channels such as the Web, mobile devices, and telephones can share and reuse the same business logic to access and present data. All the Web-based back-end connections, Web services, credential vaults, and so forth are readily available to the voice channel.

    Figure 1-1 illustrates how audio devices access portlet applications using WebSphere Voice Application Access (WVAA) acting as an extension of WebSphere Portal. Portlet applications use the Portlet API to communicate with WebSphere Portal. In turn, enterprise back-end applications and other Internet applications are also accessible to provide dynamic content. For voice devices, VoiceXML code is generated.

Figure 1-1 Portlet applications (diagram: audio devices access portlet applications 1, 2, and 3 through the Portlet API, via the WVAA and WEA components of WebSphere Portal; behind the portal sit Web applications, Web services, and other enterprise applications)


  • A voice portlet application can be created from an existing GUI (graphical user interface) portlet by cloning the existing functions of the GUI and then converting it to a voice portlet.

    Alternatively, a voice portlet application can also be created from an existing speech application by designing and coding a voice portlet and expanding its functions to match those of the existing speech application.

1.1.1 Terminology

The terminology described in Table 1-1 is used in this document.

    Table 1-1 General terminology

1.2 Developing a speech application based on a GUI portlet

To develop a voice portlet from an existing GUI portlet, first create a working, stand-alone speech application based on the specifications of the existing GUI portlet. This can make problem determination easier. Because you cannot yet implement functions that depend on portlet-specific data management, you can use dummy values for these functions. See 5.1, Scenario 1: a company directory voice portlet on page 84 for a sample scenario describing how to create and test a speech application in VoiceXML by changing the sample GUI portlet into a speech application.
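As an illustration of the dummy-value approach, the sketch below hard-codes a result that the finished voice portlet would instead obtain through the portal's data management. This is not code from the sample scenario; the form name, the grammar file name names.grxml, and the extension number are invented for this example:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="directory">
    <field name="person">
      <prompt>Say the name of the person you are looking for.</prompt>
      <!-- names.grxml is a hypothetical external grammar listing valid names -->
      <grammar src="names.grxml" type="application/srgs+xml"/>
      <filled>
        <!-- Dummy value: the real voice portlet would look the extension
             up in a back-end directory instead of hard-coding it -->
        <prompt>The telephone extension is 1 2 3 4.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Once the dialog flow has been tested stand-alone, the hard-coded prompt can be replaced by dynamic content generated through the Portlet API.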

    If you are familiar with speech application development or you already have a speech application that you would like to change to a voice portlet, you may skip this section.

    Note: If you are not familiar with developing portlets or speech applications, it may be helpful to develop and troubleshoot a speech application before developing a portlet

    Term Meaning

    GUI Graphical user interface

    Portal Server WebSphere Portal Server

    SUI Speech user interface

    Voice Toolkit Voice Toolkit for WebSphere Studio

    Voice Portal Tools Voice Portlet Development and Debug Tools installed as a feature of the Voice Toolkit for WebSphere Studio

    Voice Portal Voice Portal of WebSphere Voice Application Access

    VoiceXML or .vxml Voice eXtensible Markup Language (VoiceXML), which is an XML-based markup language for creating distributed voice applications

    WPS WebSphere Portal Server

    WSAD WebSphere Studio Application Developer

    WSSD WebSphere Studio Site Developer

    WVAA WebSphere Voice Application Access

    WVS WebSphere Voice Server

    Chapter 1. Overview 3

Table 1-2 illustrates the terms used in describing speech technology.

Table 1-2 Speech technology terminology
Speech Recognition: The ability of a computer to decode human speech and convert it to text
Speech Synthesis: The ability of a computer to read out loud, that is, to generate audio output from text input. Speech synthesis is often referred to as text-to-speech (TTS) technology.

Table 1-3 illustrates the terms used in describing speech recognition.

Table 1-3 Speech recognition terminology
Grammar: The rules-based specification of words, phrases, and sentences that the system can recognize
Accuracy: A quantitative measure of a speech recognition system's performance
Acceptance / Rejection: The two states of recognition results that can be returned by the speech engine. (Recognized text is the string of characters and symbols returned by the recognizer.)

Table 1-4 illustrates the terms used in describing text-to-speech (TTS) technology.

Table 1-4 Text-to-speech (TTS) technology terminology
Annotations: Alphanumeric tags (embedded in the input text) that change the speech characteristics (speed, emphasis, tone, and so on)
Prosody: The degree to which the speech has appropriate rhythms and variations in amplitude

Table 1-5 illustrates the terms used in describing voice response systems.

Table 1-5 Voice Response system terminology
DTMF: Dual-tone multi-frequency, the tones generated by the phone keypad
Prompt: The audio played during a phone call
Barge-in: The ability to interrupt a prompt as it is playing by saying something or pressing a key

Table 1-6 illustrates the terms used in describing important types of recognition events.

Table 1-6 Important recognition events terminology
Help: System recognition of a request for help


1.2.1 What is a speech application?
A speech application is one that uses spoken, audible input and output rather than the visual input and output used in GUI programs. Users can access speech applications anytime, anywhere, from any telephonic device.

    Each VoiceXML document specifies an interaction, or dialog, between the user and the application. The information played can be:

    Prerecorded audio files

    Synthesized by the text-to-speech engine from text specified in your VoiceXML files

    The spoken words by a caller are passed to the speech-recognition engine. Based on the user's input, the VoiceXML browser proceeds with the interaction specified by the VoiceXML application. The Web server can use server-side programs to locate or update records in a back-end enterprise database and return the information to the VoiceXML browser. The VoiceXML browser presents this information to the user either by playing prerecorded audio files or by synthesizing speech based on the data retrieved from the database.

    A speech application consists of one or more VoiceXML documents that describe the dialogs and one or more grammars that specify valid utterances. Servlets and Common Gateway Interface (CGI) programs can be used with VoiceXML documents.

1.2.2 What is VoiceXML?
As stated in 1.1.1, "Terminology" on page 3, VoiceXML is an XML-based markup language for creating distributed voice applications. VoiceXML is an industry standard that has been defined by the VoiceXML Forum (http://www.voicexml.org), of which IBM is a founding member. It has been accepted for submission by the World Wide Web Consortium (W3C) as a standard for voice markup on the Web. The current version is 2.0.

    Application developers can use VoiceXML to create Web-based voice applications that users can access by telephone or other pervasive devices. VoiceXML is designed to accept either spoken input, DTMF input, or both.
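As a hedged sketch of this (the form and field names below are our own; the elements and the built-in digits field type are from the VoiceXML 2.0 specification), a single field can accept either spoken digits or DTMF key presses:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="pin">
    <!-- The built-in "digits" type accepts spoken digits and DTMF alike -->
    <field name="pincode" type="digits">
      <prompt>Please say or key in your four digit PIN.</prompt>
      <filled>
        <prompt>You entered <value expr="pincode"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```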

1.2.3 Converting GUI widgets to speech user interfaces
The presentation of information in a spoken format is often different from the presentation in a visual format, due to the obvious differences between the interfaces. For this reason, transcoding (that is, using a tool to automatically convert HTML files to VoiceXML) might not be the most effective way to create speech applications. In the initial phase, your goal is to define the proposed functionality and create an initial design. This involves designing dialogs and defining grammars. For information about best practices in the design of speech user interfaces, consult the VoiceXML Programmer's Guide (cited in "Related publications" on page 125). Another source of information is the "Getting started" section in the Getting Started with Developing Voice Portlet Applications manual that comes with the Voice Portlet Tool plug-ins in WebSphere Studio Application Developer or WebSphere Studio Site Developer.

Table 1-6 Important recognition events terminology (continued)
Noinput: No audible input was recognized by the system within a defined time period
Nomatch: Input detected by the system for which there was no match in the system's grammar(s)



1.2.4 Designing dialogs
The following sections address topics that the other documents do not include, such as tips for converting specific GUI widgets to appropriate speech user interfaces.

Menus to select transactions or queries
If there is more than one option on a panel, provide a menu to present users with a list of choices.

    Freeform input field

    You need to determine the type of user input that the application needs. For example, an input field that accepts freeform input is not valid in a speech application. If your GUI portlet has this type of input field, you should change the functional specification.

    Choices for list boxes or radio buttons

    The prompts for a list box or radio button will depend on the choices available. For example:

a. If a user can make the choice without a prompt (users might have access to a catalog), then ask a question. For this type of list, the solution might be as simple as providing a list and prompting: "Which item?"

    b. If users need choices provided, and there are relatively few items (typically fewer than ten), provide the choices in a menu.

    c. If users need choices provided, and there are many choices (more than twenty), then redesign the application so users can navigate through the choices or abandon this feature in the voice application.
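The menu guidance above can be sketched in VoiceXML 2.0 along these lines (the choice names and targets below are invented for illustration; the elements are from the VoiceXML 2.0 specification):

```xml
<menu id="main">
  <!-- <enumerate/> reads the list of choices to the caller -->
  <prompt>Main menu. Say one of: <enumerate/></prompt>
  <choice next="#orders">orders</choice>
  <choice next="#quotes">quotes</choice>
  <choice next="#directory">directory</choice>
</menu>
```

Keeping such a menu to a handful of choices follows the "fewer than ten" rule of thumb above.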

Auditory information versus visual information
When you develop a speech application based on a GUI application, you will need to change visual information to text.

Icons, pictures, and bold fonts are used to highlight specific types of information (such as a mark indicating "Important"). To replace these visuals, consider using an auditory inflection technique. For example, try changing volume or pitch, providing a special dialog to prompt for items marked "Important", or adding more informational text to indicate the meaning.

    Tables. Table information might be easy to understand visually, but it can be difficult to convey verbally. You might try to change table information, such as the information in Table 1-7, into a sentence or list.

    Table 1-7 Typical data that may need to be converted to speech

Symbols. If text includes symbols, change them to text that can be converted to speech correctly. For example, change "v11" to "version 11". Depending on the intelligence of the text-to-speech (TTS) engine and the preprocessors provided, you can use symbols such as "$"; the TTS engine will recognize this as currency. Furthermore, you can define the types in your voice application by specifically addressing the TTS type, for instance to specify that a prompt is to be spoken as a date.
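For instance, a prompt can mark a string to be read as a date. This sketch uses the say-as element that VoiceXML 2.0 adopts from SSML; attribute support varies by TTS engine, and the date shown is invented:

```xml
<prompt>
  Your order will ship on
  <!-- tells the TTS engine to read this as a date, not as a number -->
  <say-as interpret-as="date">11/05/2004</say-as>.
</prompt>
```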

Category              Item No.  Item Name        Price
Food and wine         6170      Tomato juice     570
Ornaments             6005      Xmas candle      20
Electric appliance    3401      Washing machine  500
Business Accessories  7000      CD-R             10
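A row of Table 1-7, for example, might be rendered as a sentence in a prompt (the wording below is our own):

```xml
<prompt>
  In the food and wine category, item 6170, tomato juice, costs 570.
</prompt>
```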


One answer to one question dialog
When there is more than one input field required in a dialog, consider presenting the fields as a series of questions. For example:

    C: Which item?

    H: Orange juice.

    C: How many?

    H: Five.
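The dialog above can be sketched as a VoiceXML form with one field per question. In this sketch, items.grxml is a hypothetical grammar file, and the quantity field uses the VoiceXML 2.0 built-in number type:

```xml
<form id="order">
  <field name="item">
    <prompt>Which item?</prompt>
    <grammar src="items.grxml" type="application/srgs+xml"/>
  </field>
  <field name="quantity" type="number">
    <prompt>How many?</prompt>
  </field>
  <block>
    <!-- Echo both answers back to the caller -->
    <prompt>Ordering <value expr="quantity"/> <value expr="item"/>.</prompt>
  </block>
</form>
```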

1.3 Implementing a basic speech application
This section describes a simple VoiceXML application that is based on an existing GUI portlet.

    The following items need to be created:

    VoiceXML documents. These documents describe dialogs between the system and users.

Grammars. A grammar is an enumeration of the set of utterances (words and phrases) that constitute the acceptable user responses to a given prompt. It should include all the words and phrases to be recognized.
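As a minimal sketch, an SRGS XML grammar file might look like the following (the rule name and words are our own; the namespace and elements are from the SRGS 1.0 specification):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" root="item" mode="voice" xml:lang="en-US">
  <!-- The root rule lists every utterance the caller may say -->
  <rule id="item">
    <one-of>
      <item>orange juice</item>
      <item>tomato juice</item>
      <item>apple juice</item>
    </one-of>
  </rule>
</grammar>
```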

1.3.1 Example of a basic speech application
Example 1-1 illustrates a basic voice dialog with a simple inline Speech Recognition Grammar Specification (SRGS) XML grammar.

    Example 1-1 Sample speech application

<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>Hello World</block>
    <field name="name">
      <prompt>please say your name</prompt>
      <grammar version="1.0" root="names">
        <rule id="names">
          <one-of>
            <item>John</item>
            <item>Jeffrey</item>
            <item>card</item>
            <item>David</item>
            <item>World</item>
          </one-of>
        </rule>
      </grammar>
    </field>
  </form>
</vxml>

    Note: Discussing the VoiceXML specification and grammar specifications is beyond the scope of this document. For a list of those specifications, see:

    http://www.voicexml.org/spec.html.

    For some guidance on this topic, see Chapter 3, VoiceXML fragments on page 37.



1.4 Developing voice portlet applications
This section briefly describes voice portlet applications. For more details and sample scenarios, see the next chapters in this book.

What is a VoiceXML fragment?
A VoiceXML fragment is a VoiceXML document that does not have any of the top-level tags, such as <vxml>, the DOCTYPE declaration, and any XML processing statements.

    The VoiceXML fragment is augmented by the WebSphere Voice Application Access component to make it a complete browsable VoiceXML document.
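As a hedged illustration, a VoiceXML fragment might contain only a form, with no XML declaration, DOCTYPE, or <vxml> element (the content shown is invented):

```xml
<!-- A VoiceXML fragment: no XML declaration, DOCTYPE, or <vxml> element -->
<form id="greeting">
  <block>
    <prompt>Welcome to the voice portlet.</prompt>
  </block>
</form>
```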

What is a voice portlet?
A voice portlet is a portlet designed to contain VoiceXML fragments to implement a speech user interface. The voice portlet can be allocated to the page or place where the VoiceXML markup is enabled. A voice portlet can also contain other markups, such as HTML (Hypertext Markup Language) or WML (Wireless Markup Language), for multichannel purposes. Each of these markups has its own unique features.

Voice portal overview
WebSphere Voice Application Access (WVAA) Voice Portal handles and aggregates the voice portlets. It provides the following three major functions:

User authentication through a speech user interface. If callers want access to a personalized page of WebSphere Portal, they need to be authenticated. On the other hand, if they want access to an anonymous page of WebSphere Portal, they do not need to be authenticated. For user authentication, WebSphere Voice Application Access Voice Portal generates a VoiceXML document to collect user information such as the user ID and password, which is passed to WebSphere Portal for authentication.

    Navigation in a voice portal menu structure. If there are two or more voice applications, they are grouped into one or more pages, which can be further grouped into one or more places. WebSphere Voice Application Access Voice Portal constructs a three-level menu structure (place/page/application) so that users can navigate through the menu structure and execute a target application.

    Provide a common execution environment for all voice applications. WebSphere Voice Application Access Voice Portal calls the target voice portlet that is chosen by the caller. The voice portlet returns to the WebSphere Voice Application Access Voice Portal when completed. During the execution of the voice portlet, WebSphere Voice Application Access Voice Portal provides a common execution environment. That includes global variables, event handlers, links, and so on.

1.4.1 Voice portal content structure
The content of the voice portal is organized in a three-level tree structure, with two levels containing grouped elements and voice portlets at the lowest level. The voice portal shows only the places and pages where the VoiceXML markup is enabled.

    The following levels are available:

    1. Voice portlet: at the lowest level

    2. Page: used to group the portlets

    3. Label: used to label pages


The root of the structure is the portal home, from which the call starts, flowing down a path to authentication. The whole tree structure is user-dependent and defined by the GUI preferences, as shown in Figure 1-2.

Figure 1-2 A sample portal home structure (the figure shows a tree with the Portal Home at the root, Labels One and Two below it, Pages One through Four below the labels, and Portlets One through Four on the pages)



Chapter 2. Voice and portlet toolkit environment

This chapter describes the installation, configuration, and testing of the development environment needed to design, build, and test voice portlet applications.

    The main topics discussed here are as follows:

    2.1, Software prerequisites for the development environment on page 12

    2.2, Getting started on page 12

    2.3, Installing the development environment on page 12

    2.4, The voice perspective on page 19

    2.5, The voice portlet perspective on page 25


    Copyright IBM Corp. 2004. All rights reserved. 11

2.1 Software prerequisites for the development environment
To create, debug, and run a voice portlet, install the following programs or files:

    WebSphere Studio Version 5.1.0 (higher versions are currently not supported)

Portal Toolkit V5.0.2 for WebSphere Studio

WebSphere Portal V5.0.2

WebSphere Application Server Fix Pack 2

    Voice Aggregator from the WebSphere Voice Application Access Voice Toolkit V5.0

2.2 Getting started
It is not our intention in this chapter to replace the installation documents for each of the products mentioned here. Instead, we provide a summary of the steps needed to prepare your development environment. For the installation instructions of the individual products, refer to:

    Portal Toolkit (PortalToolkit_Install.htm), part of the installation package

    Voice toolkit documents, part of the download package for the Voice Toolkit:

    readme_install.htm file (provides installation hardware and software requirements as well as system prerequisites)

Voice_Application_Development.pdf (title: "Getting started with developing voice applications")

Voice_Portlet_Development.pdf (title: "Getting started with developing voice portlet applications")

2.3 Installing the development environment
The following sections are a step-by-step guide to installing a voice-enabled portlet environment.

2.3.1 Installing WebSphere Studio Application Developer
1. Run launchpad.exe from the WebSphere Studio V5.1 CD if auto-run does not start this automatically.

    2. Click Install IBM WebSphere Studio.

    3. Change the default Directory Name to C:\WSAD.

    4. Ensure the check boxes under Required Features are checked.

    5. If you plan to debug portlets for WebSphere Portal 4.2 locally, also select WebSphere Application Server v4.0 under Optional Run-Time Environments.

Note: Previous versions of the toolkits, such as those listed in "Hints" on page 18, need to be uninstalled before you install the development environment.

Important: You must install the WebSphere Studio base product with a short path name, without spaces, periods, or dollar signs in the directory name. This is a requirement for debugging and testing voice portlets.


  • 6. When the installation finishes, restart the system.

    7. Verify that the installation was completed successfully by starting WebSphere Studio.

2.3.2 Portal Toolkit V5.0.2 for WebSphere Studio
From the following Web site, download the toolkit to a temp directory (such as C:\Temp), and take note of the directory location:

    http://www.ibm.com/websphere/portal/toolkit

    We recommend using the following instructions:

    Start the installation process by unpacking the files. Then pause to read the PortalToolkit_Install.htm document located in the temp directory (you can leave the install panel open while you do this). Read the instructions carefully in Step 4 Installation Scenarios Using WebSphere Portal 5.0.0 and WebSphere Portal 5.0.2. Then follow the steps below:

    1. Install the WebSphere Studio base product in a short path name.

    2. Install WebSphere Application Server interim fixes, prerequisites for the Portal Toolkit V5.0.2.

    On the download site, you should download the WebSphere Application Server Fix Pack 2 (known as WebSphere Application Server 5.0.2 Fixes). This file is named WAS502Windows.zip. While there, you can also download the Portal V5.0 Fix Pack 2. This file is named Fixpack2.zip. You will need this file in Step 4 of the Portal Toolkit installation instructions. Download these files to a directory on your hard drive.

    3. This step continues the Portal Toolkit installation wizard. On the installation panel, select to install both of the following components, and follow the instructions:

    a. Portal Toolkit V5.0.2 and

    b. WebSphere Portal V5.0 for Test Environment.

    4. Install the Portal 5.0 Fix Pack 2. This file is named Fixpack2.zip. You downloaded this file in Step 2. Follow the instructions that come with the software to install the fix pack.

Important: Upon starting Studio, you are asked to enter the path for the workspace directory; as before, use a short path name with no spaces, periods, or dollar signs. We suggest using C:\WSAD\workspace.

    Note: Do NOT apply WebSphere Studio 5.1.0 Interim Fix 001 if you plan to use the Web Service Client Portlet Project wizard.

    Note: Take care in naming the directories that you will use to extract files. Use simple names.

    Note: As the Portal Toolkit is being installed, you will need to insert the WebSphere Portal V5.0.2 CD #2 to locate the WebSphere Portal 5.0 installer, named wpsinstall.jar.

    Chapter 2. Voice and portlet toolkit environment 13


2.3.3 Voice Toolkit for WebSphere Studio
Be aware that an active Internet connection is required to download the Voice Toolkit. The concatenative text-to-speech (CTTS) engine is packaged as a JAR file. Upon download, the file is placed into the temporary directory structure created by the base Voice Toolkit install.

    To download WebSphere Studio Voice Toolkit from the Internet, do the following:

1. Go to http://www.ibm.com/software/pervasive/voice_toolkit/ and select Download Voice Toolkit for WebSphere Studio on the right-hand side of the window. The Voice Toolkit is free of charge, but you are required to register in order to download the toolkit.

    2. Proceed through the enrollment and login process and select the download you require. See Figure 2-1 showing the download options.

    Figure 2-1 Download page for Voice Toolkit

    3. Run the Voice Toolkit install file VoiceTools_setup.exe

    4. Select Next on the Voice Toolkit for WebSphere Studio welcome window.

    5. The readme file is displayed within the install window, providing specific details on the Voice Toolkit. Click Next.

    Note: The Voice Toolkit can be installed on a machine with no Internet connection, but special instructions must be followed. Refer to Appendix B, Installing the Voice Toolkit without Internet connection on page 119.



6. A list of toolkit features is displayed. Features that are already installed are marked (INSTALLED) and can be left unchecked. See Figure 2-2 on page 15.

For a first-time install, select:

VoiceXML Application Development and Debug
Voice Portlet Application Development and Debug

If you are only installing additional languages, leave all options unchecked.

    Figure 2-2 Installed feature list

7. The Voice Portlet Application Development and Debug feature window appears, listing all languages supported for TTS (see Figure 2-3 on page 16). Typically, CTTS offers better quality but requires a larger download. Items marked with an asterisk (*) require an Internet download. Select the required language option and click Next.


  • Figure 2-3 Voice Portlet CTTS language support

    8. The VoiceXML Application Development and Debug feature window appears as shown in Figure 2-4. It lists all languages supported by the VoiceXML mark-up language for TTS and CTTS engines. Select the language required and click Next.

    Figure 2-4 VoiceXML CTTS language support

    9. The installation wizard now verifies available disk space.


10. When there is sufficient disk space, the software license agreement is displayed. Scroll down to read it, and select the radio button to indicate acceptance of the terms of the agreement. Click Next.

    11.The installation wizard now processes the selected options. Please wait for the next window to appear.

    12.A summary is displayed that provides information about the install location, the chosen CTTS features, and required disk space as shown in Figure 2-5. If correct, click Next.

    Figure 2-5 Summary of install options

13. The install procedure now begins. This is quite a large download and, depending on the type of Internet connection you have, it may take a while to download and install.

    14.Upon completion, it is recommended that you save all your work, close all applications, and restart your development machine.


2.3.4 Installing the Voice Aggregator
To install the Voice Aggregator, follow these steps:

    1. Insert the CD for WebSphere Voice Application Access.

    2. Unzip \install\\voice.zip into the \runtimes\wvaa directory.

3. Edit install.properties to specify the correct user ID, password, port, and host information. For the host name, use localhost. Also specify the correct paths to:

a. appserver.dir = /runtimes/base_v5

b. portalserver.dir = /runtimes/portal_v50

Prefix each path with the path to the WebSphere Studio base product, such as C:\wssd or C:\wsad, and make sure that the directory specified by temp.dir exists.

    4. Run install.bat.

    5. After you complete the installation of the software prerequisites, restart WebSphere Studio. Again, make sure that the workspace directory does not have spaces, periods, or dollar signs.
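Putting the step 3 settings together, the edited install.properties might look like the following sketch (the drive letter and directory values are assumptions for illustration; note the forward slashes):

```properties
# Paths to the WebSphere Studio test environments (forward slashes required)
appserver.dir=C:/wsad/runtimes/base_v5
portalserver.dir=C:/wsad/runtimes/portal_v50
# This directory must already exist
temp.dir=C:/temp
```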

    Hints:

    You should be aware of the following issues:

    1. Errors that occur during the installation process will be logged in the voicetools_log.txt file located in the directory identified by your TMP environment variable.

    2. Errors may occur if you have a previous version of one of the following components:

Voice Toolkit for WebSphere Studio
Portal Toolkit for WebSphere Studio
Multimodal Toolkit for WebSphere Studio
WebSphere Voice Server SDK (must have JRE 1.3.1 to uninstall)
ViaVoice Outloud

    After uninstalling the speech run-time engine, you may need to manually delete the speech root environment variables. To do this follow these steps:

    a. Right-click the My Computer desktop icon and select Properties.

    b. On the System Properties dialog, click the Advanced tab, and then click the Environment Variables button.

    c. In the Environment Variables window, under System variables, scroll to find the SPCH_ROOT, SPCH_BIN, and IBMVS environment variables. Select these variables, and then click Delete.

    d. Click OK to close the dialog box.

    3. For the changes to take effect, reboot your system.

    Note: Be sure to use forward slashes in the path name as shown in the examples above.


  • 2.4 The voice perspective

This is the perspective provided by the Voice Toolkit. It is used to handle and create all the voice resources you will need to design your voice application to run in WebSphere Voice Server or any other VoiceXML 2.0-compliant browser. These resources include VoiceXML, grammar, and audio files.

In this section, we explain a few features of this perspective to enable you to start the development work. You can find more information about the additional voice tools in Appendix A, "Additional tools in the Voice Toolkit" on page 109.

To open the voice perspective in WebSphere Studio Application Developer, select Window -> Open Perspective -> Other...

    Then select Voice Perspective as shown in Figure 2-6.

    Figure 2-6 Select Voice Perspective

2.4.1 Testing the microphone
The first thing you should do after preparing your development environment is test the microphone. This perspective has an Audio Analysis Tool for this task.

1. Click the microphone icon available in the Studio toolbar for this perspective, or click Run -> Test Microphone. The tool displays a panel similar to the one in Figure 2-7 on page 20.


  • Figure 2-7 Audio Analysis Tool

2. Press the Display Script button. (The Hide Script button appears in Figure 2-7 because the button toggles between the two options, and the image shows the panel after Display Script has already been pressed.)

    3. Click Start and read the script to test your microphone. When the test is finished you will hear a tone and the results of the test appear in the window.

2.4.2 Creating a VoiceXML file
The primary file for a voice application is the VoiceXML file. First you need to create a Voice project. Then follow the instructions below to create a VoiceXML file:

1. Select File -> New -> VoiceXML File to start the New VoiceXML File wizard.

    2. In the New VoiceXML File window shown in Figure 2-8 on page 21, type a unique name for your VoiceXML file (file extension .vxml).

    3. Click the Finish button. This launches the VoiceXML editor and opens a basic VoiceXML file with the required heading.
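The generated file begins with the standard VoiceXML heading, roughly along these lines (the exact content produced by the wizard may differ by toolkit version):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form>
    <!-- add your dialog here -->
  </form>
</vxml>
```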


  • Figure 2-8 New voiceXML file

2.4.3 Using the VoiceXML editor
To write VoiceXML code using the toolkit, you can manually enter your own code or you can use the built-in reusable dialog components.

Writing your own code
To use the editor to write your own VoiceXML code, you need to know the features and syntax of the language. You can learn about VoiceXML from a variety of sources, including:

Voice Extensible Markup Language (VoiceXML) Version 2.0, available on the Web at http://www.w3.org/TR/voicexml20/. This document provides details on the VoiceXML language supported in this toolkit.

Speech Recognition Grammar Specification Version 1.0, available on the Web at http://www.w3.org/TR/speech-grammar/. This document provides details on the syntax used to create speech grammars, so that you can learn how to specify the words and word patterns a speech recognizer should recognize.

The VoiceXML Programmer's Guide (pgmguide.pdf), packaged with the Voice Toolkit and available online at http://www-306.ibm.com/software/pervasive/voice_toolkit/. This guide provides guidance on creating and testing voice applications.

    Other downloadable documents (.PDF format) are available on the IBM Publications Center Web site at:

    http://www.elink.ibmlink.ibm.com/public/applications/publications/cgibin/pbi.cgi



  • To use the Web site, select your country and search for keywords such as VoiceXML or Voice Server to find documents related to your specific connection environment.

Example
The following example provides full VoiceXML code that you can copy and use to test your installation. Copy and paste it into your newly created VoiceXML file (extension .vxml):

    Example 2-1 Sample application to test your setup

    exit

Welcome to the virtual ice cream shop. Please make your choice: Vanilla, Chocolate, Rocky Road, or Exit.

    Tips:

    This section includes the following tips:

For detailed information about the features available in the VoiceXML editor, refer to the online help. Select Help -> Help Contents -> Voice Portlet developer information -> VoiceXML editor.

The VoiceXML editor automatically colorizes the source code as you type, using the color scheme defined in the Preferences window. To define your preferences, select Window -> Preferences -> Voice Tools -> VoiceXML Files -> VoiceXML Styles.

If you like using Content Assist, the preference is enabled so that Content Assist appears automatically whenever you type a < character.

  • vanilla chocolate rocky road

    Regular or French?

    regular french

    That is a fine ice cream.

    Excellent choice!

    Chocolate is my favorite flavor.

    Are you sure? It is kind of expensive.

    OK! One scoop of Rocky Road coming up!

    All right.


  • Thanks for visiting the virtual ice cream shop. Goodbye!

2.4.4 Running and debugging VoiceXML files
You can test the voice application that you just created by following the steps below. Refer to Figure 2-9, Figure 2-10, and Figure 2-11 on page 25. Remember to keep your .vxml file selected while following these steps.

    Figure 2-9 Running VoiceXML applications

1. Click Run As -> VoiceXML Application. When the VoiceXML application starts, you will see the voice browser print each message as the system recognizes your audible answers.

2. You will be asked several credit card questions. Answer each question, then at the end of the dialog you should hear: "Thanks for visiting the virtual ice cream shop. Goodbye!" This is the answer we included in our program at the end of 2.4.3, "Using the VoiceXML editor" on page 21.

    This example is quite simple, but you can see how the voice browser works with voice recognition and DTMF commands (See Table 1-5 on page 4 for a definition of this term).

If you use the wrong tag inside a form by mistake, the VoiceXML editor won't alert you, but when you run the .vxml file, you can see the errors printed in the Call Simulator. Figure 2-10 shows a common error and its exact location.

    Figure 2-10 Call Simulator prompting errors

    Note: We will be using the virtual ice cream shop in a number of other examples in this paper.

    24 Adding Voice to your Portlet Applications

  • Another important feature of the Call Simulator is a DTMF keypad simulator (pictured in Figure 2-11) that can be used during your testing. Instead of voice responses, try using the DTMF keypad to input the credit card number that we used in our previous test example.

    Figure 2-11 DTMF Keypad Simulator

2.5 The voice portlet perspective

Now that we have covered the VoiceXML approach to voice applications, let's continue with the transformation of this voice application into a voice portlet.

As this is your first time developing this kind of application in your environment, you need to open the voice perspective in WebSphere Studio Application Developer by clicking Window -> Open Perspective -> Other. Then select the Voice Portlet Perspective. See Figure 2-12.

    Figure 2-12 Select Voice Perspective

Note: If you intend to debug your VoiceXML file, select the point at which you want to break into the code and insert a breakpoint. Then click Run -> Debug As -> VoiceXML Application. This action launches the debug perspective and begins to debug your VoiceXML file like any other Java application.


2.5.1 Creating a new portlet application project

All the files for a portlet must be within a voice portlet project. To start a voice portlet project:

1. Select File -> New -> Portlet Application Project.

    The Create a Portlet Project wizard appears, as shown in Figure 2-13, with the default option to create a basic portlet selected.

    2. Give your portlet project a name. Make sure you select the Configure advanced options check box (otherwise the settings described in steps 3-7 will not be presented). Then click Next.

    Figure 2-13 Create a Portlet Project Wizard

    3. On the J2EE Settings panel, accept the defaults and click Next.

    4. On the Portlet Settings panel, accept the defaults and click Next.

    5. On the Event Handling panel, accept the defaults and click Next.

    6. On the Single Sign-On panel, accept the defaults and click Next.

    7. For the next step, be sure to select Add VoiceXML markup support as shown in Figure 2-14 on page 27. After that, click Finish.

    Note: Make sure you select Configure advanced options in Figure 2-13. Otherwise, you will not be presented with additional settings.


  • Figure 2-14 Create a Portlet Project Wizard - Miscellaneous

    8. When a prompt asks whether you want to change to the Portlet perspective, select No.

The portlet.xml file opens in its respective editor. Remember that you are creating a voice portlet: the aggregator offers the portlets available in the portal to users, who select your portlet by speaking its name. If you choose a complex name, users might not be able to understand and select it. Change the name of your Concrete Portlet Application in the portlet.xml file accordingly.

    The file structure required for developing the voice portlet appears in the J2EE Navigator panel as illustrated in Figure 2-15 on page 28.


  • Figure 2-15 J2EE File Structure

Notice that under the jsp folder there are two other folders, where the .jsp and .jsv (JavaServer Page with VoiceXML) files reside. Because the files are stored in separate folders by markup type, you don't have to specify the complete path to refer to these files in your Java file.

2.5.2 File naming conventions

To illustrate the importance of file naming conventions, open the file SimpleVoiceProjectPortlet.java. At the beginning of the code, you can see the following constant string declaration:

    public static final String VIEW_JSP = "/simplevoiceproject/jsp/SimpleVoiceProjectPortletView.";

This string defines the location of the .jsp file for the view mode. You don't need to say in which directory the file is located; based on its extension, the controller can find the correct file.

The file extension is determined by the getJspExtension method, which uses the markup name of the request to identify the specific media type the user is using.

For this reason, it's very important to use the same name for both the .jsp and .jsv files. It keeps the controller code simple, and later, if you enable another media type in your portlet, you only have to create the new view file with its respective extension.
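The extension-selection convention described above can be sketched in plain Java. This is an illustrative stand-alone class, not the generated portlet code; the real controller uses the toolkit's getJspExtension method, and the markup names used here ("html", "vxml") are assumptions:

```java
// Illustrative sketch of the view-resolution naming convention:
// the base path ends with a dot, and a markup-dependent extension
// is appended to select the .jsp (HTML) or .jsv (VoiceXML) view.
public class ViewResolver {
    static final String VIEW_JSP =
        "/simplevoiceproject/jsp/SimpleVoiceProjectPortletView.";

    // Hypothetical mapping: "vxml" markup -> .jsv, anything else -> .jsp
    static String getJspExtension(String markupName) {
        return "vxml".equals(markupName) ? "jsv" : "jsp";
    }

    static String resolveView(String markupName) {
        return VIEW_JSP + getJspExtension(markupName);
    }

    public static void main(String[] args) {
        System.out.println(resolveView("html"));
        System.out.println(resolveView("vxml"));
    }
}
```

Because the two view files share one base name, enabling a new media type only requires adding a view file with the matching extension.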

2.5.3 Creating .vxml files for your portlet

The primary file for a voice application is the VoiceXML (.vxml) file. The process to create this file is quite similar to the process described in 2.4, "The voice perspective" on page 19.

Note: We recommend that you use the same names for both fields and forms for the same reason; it's easier for the controller to recognize.


However, you must pay particular attention to placing this file in the correct location. Follow the instructions below to create a VoiceXML file:

    1. Select File New VoiceXML File to start the New VoiceXML File wizard.

2. In the New VoiceXML File window, expand the project you just created and select the WebContent -> icecream -> jsp -> vxml directory so that the entire parent folder appears in the first field. To publish to the application server properly, the file must be created under WebContent or a child of WebContent.

3. Enter the name for your VoiceXML file (file extension .vxml), following the naming conventions mentioned in 2.5.2, "File naming conventions" on page 28. In this case, the name should be SimpleVoiceProjectPortletView.vxml.

    4. Click the Finish button. This launches the VoiceXML editor and opens a basic VoiceXML file with the required heading.

    The editor and resources used in this perspective are the same as the ones used for the voice perspective.

2.5.4 Converting .vxml files to .jsv files

The toolkit allows two extensions for the JavaServer Pages used in portlets: .jsp and .jsv. We recommend using .jsp for HTML-based JavaServer Pages and .jsv for VoiceXML-based JavaServer Pages. There are several ways to get the VoiceXML code from a VoiceXML file into a VoiceXML-based JavaServer Page.

    The simplest way is to convert the VoiceXML file to a VoiceXML JavaServer Page.

    To do this task:

    1. If open, close the VoiceXML file that you want to convert to a VoiceXML JavaServer Page.

2. Since you will be renaming the .vxml file, we recommend that you make a copy of it first. Select the file in the navigator, then press Ctrl+C followed directly by Ctrl+V. Accept the new name: Copy of SimpleVoiceProjectPortletView.vxml. This way you preserve the .vxml file for possible later use.

    3. Right-click the selected VoiceXML file in the Navigator, and select Rename.

    4. Delete the extension .vxml and type the new extension .jsv. Press Enter.

    5. Answer Yes to the overwrite warning.

    6. Open the .jsv file by double-clicking it in the Navigator.

7. You need to complete the conversion to a .jsv fragment by deleting the following four lines from the file:

    The XML processing instruction (<?xml version="1.0"?>)

    The document type declaration (<!DOCTYPE vxml ...>)

    The VoiceXML root (or document) start-tag (<vxml ...>)

    The VoiceXML root (or document) end-tag (</vxml>) at the end of the file. Then select File -> Save.

You need to remove these lines because the Voice Aggregator, a feature of Portal Server, adds them as it processes global commands, global event handlers, and global variables. If you plan to use the file as a fully formed VoiceXML JavaServer Page rather than as a voice portlet fragment, you do not need to remove anything.


Insert a JSP page directive as the first line in the file. This explicitly defines the file as a JavaServer Page with VoiceXML content.

8. Save the file by selecting File -> Save.

9. Search for the <exit/> tag and replace it. The <exit/> tag stops the entire system (which you do not want to do in a portlet); the replacement tag returns control to the Voice Aggregator.

2.5.5 Defining a test environment on a local server

Use the following steps:

    1. From the Voice Portlet perspective, select the Server Configuration tab at the bottom of the Navigator.

    2. If the server configuration has been defined in the toolkit, you will be able to expand the Servers folder and see your server configuration in the list. If it has not been defined, you must do the following:

a. Right-click Servers, and select New -> Server and Server Configuration to start the Create a New Server and Server Configuration wizard (shown in Figure 2-16 on page 31).

    Tip:

If you rename the VoiceXML file while it is open, or use File -> Save As to convert it to a .jsv file, you must close and re-open the VoiceXML JavaServer Page so that the toolkit recognizes it as a VoiceXML JavaServer Page. Pay attention to the naming convention and the location of the .jsv file.

Another easy way to create a .jsv file from a .vxml file is to copy all the contents of the .vxml file except the XML processing instruction, the DOCTYPE declaration, and the <vxml> start and end tags, which are conveniently located at the top and bottom of the file. Paste this code into the .jsv file, replacing everything except the header.


  • Figure 2-16 Configuring new WebSphere Studio Application Developer Server Environment

b. Type a name for your server (such as Myserver) in the Server name field. From the Server type list, select WebSphere Portal Version 5.0 and highlight Test Environment, as shown in Figure 2-16. Then click Next.

    c. On the WebSphere Server Configuration Settings panel, accept the default values, and click Finish. The server configuration will now appear in the Server Configuration view when you expand the Servers folder.

2.5.6 Starting the server

To do this task:

1. Save everything by selecting File -> Save All. The server runs only saved versions of files.

    2. Right-click your project folder in the Navigator and select Run on Server.

    3. Click the default Use an existing server button to accept it. In the Server list, highlight the server you just created and click Finish. This publishes your project and starts the portlet running on the local server. The startup status of the server shows in the Console view. In a few moments, a Web Browser opens, indicating that the portlet is running.


2.5.7 Test the voice portlet

To test the voice portlet running on the local server, you need to point the speech browser to the voice portal.

1. Make sure that your local server is started, and then select Run -> Run VoiceXML JavaServer Page. If this is the first time you are running the portlet, click the New button.

    2. On the Run panel, in the Name field, type a name for this configuration, such as Simple Voice Project.

    3. In the Local JavaServer Page URL field, enter the following URL:

    http://localhost:9081/wps/portal/!ut/p/.cmd/LoginUserNoAuth?userid=wpsadmin&password=wpsadmin

4. Specify the correct port, user ID, and password. See Figure 2-17 below to verify your settings.

    Figure 2-17 Running your Voice Portlet

    5. Click Apply, and then click Run to start the voice browser. Your published application should start.


2.5.8 Portal log files

Sometimes you receive an error during testing and the Call Simulator doesn't show anything. This probably means that the server encountered an error. These errors don't show clearly in the Console tab.

    The details of the errors in the portal can be found in the log files located at:

    \runtimes\portal_v50\log.

    Look for the error we forced in our project, shown in Table 2-1.

    Table 2-1 Part of portal log file showing the location of the error

2.5.9 Debugging the voice portlet

There are two phases of debugging a .jsv file. The first phase allows you to debug the .jsv file on the server during its construction, so you can debug all the .jsp commands and scriptlets you developed. In this phase, you can insert a breakpoint by simply double-clicking beside any .jsp code. Notice that it is impossible to insert a breakpoint next to .vxml code using this procedure. To execute this initial phase of debugging your voice portlet, follow the steps below:

    1. From the Voice Portlet perspective, select the Server Configuration tab at the bottom of the Navigator.

    2. Right-click your server and choose Stop. Wait until the server stops processing completely.

    3. Open the .jsv file that you want to debug.

    The second phase involves debugging your VoiceXML code. Now that your .vxml file is already constructed, you can debug your dialogs and .vxml tags.

    To insert a .vxml breakpoint in your code, follow the steps below:

    4. Put the cursor beside any .vxml tag.

5. Click Run -> Add VoiceXML Breakpoint, as shown in Figure 2-18 on page 34.

Tip: When your application stops during testing and you can't restart it because your last session was not terminated in the Call Simulator, you have the following options:

1. Click the Terminate button in the Call Simulator.
2. Close the Call Simulator.
3. Restart the Studio.

    2004.05.20 16:29:45.203 E com.ibm.wps.pe.pc.legacy.impl.PortletContextImpl include javax.servlet.ServletException:

    Unable to compile class for JSP

    An error occurred between lines: 37 and 40 in the jsp file: /directory/jsp/vxml/DirectoryPortletView.jsv


  • Figure 2-18 Adding VoiceXML Breakpoint

6. Right-click your project folder in the Navigator and select Debug on Server.

    7. Select your server and click Finish. The server will be started in debug mode and the Debug Perspective will launch.

    8. In the panel shown in Figure 2-19, click the Skip radio button, check the Disable step-by-step mode check box, and click OK.

    Figure 2-19 Step-by-step debug

    Now you are ready to debug your .jsv file. Wait for the server to complete the debug mode start-up process before proceeding. You can check whether the server is ready by clicking the Server tab in the Debug Perspective. When the server is in debug mode, follow the steps below.

9. Click the Debug icon as shown in Figure 2-20 on page 35 and choose the VoiceXML JavaServer Page that you configured during the test session described in 2.5.7, "Test the voice portlet" on page 32.


  • Figure 2-20 Debugging JSV files

10. When the Studio stops at the first breakpoint, press the F6 key until you reach the end of the file. Notice that you're debugging only the .jsp commands and tags.

After that, a new tab named Portlet.vxml appears in your Debug perspective.

11. Now you can press the F6 key to debug your .vxml file step by step.


  • Chapter 3. VoiceXML fragments

This chapter describes the use of the VoiceXML markup language. It is not intended to cover the full specification in depth. For the full specification, please visit: http://www.voicexml.org/spec.html

    In this chapter we will touch on some vital VoiceXML pieces and provide snippets of example code. The features we discuss here are:

    3.1, Grammars on page 38

    3.1.2, Inline grammars on page 38

    3.1.3, External grammars on page 40

    3.1.4, Reviewing grammar results in your application on page 42

    3.2, Help on page 42

    3.2.1, Help functions in voice portals on page 43

    3.2.2, Self-revealing help on page 44

    3.3, Reusable dialog components on page 45


    Copyright IBM Corp. 2004. All rights reserved. 37


3.1 Grammars

VoiceXML grammars define selectable words or phrases that instruct the voice application to take action. Grammars can be present in a voice application in multiple ways. The grammar specification is part of the VoiceXML 2.0 specification and is referred to as the Speech Recognition Grammar Specification (SRGS) 1.0.

    According to the VoiceXML specification, a grammar:

    specifies a set of utterances that a user may speak to perform an action or supply information

returns a corresponding semantic interpretation for a matching utterance. This may be a simple value (such as a string), a flat set of attribute-value pairs (such as day, month, and year), or a nested object (for a complex request).

3.1.1 Built-in grammars

By using the type attribute of the <field> tag, a specific built-in grammar can be selected. Built-in grammars are optional, but they greatly facilitate and speed up the development of a voice application. The IBM VoiceXML browser supports built-in grammars. Guidelines and descriptions for the built-in grammars are provided in the VoiceXML 2.0 specification.

These built-in grammars are:

boolean - The input is affirmative or negative input for the specific language. In DTMF, 1 is affirmative and 2 is negative.

date - The input is a full date: day, month, and year. The result is a fixed-length date string in the format yyyymmdd.

digits - The input is one or more spoken digits, 0 to 9.

currency - The input is a currency value, sometimes with a currency indicator (USD, GBP). If an indicator is not provided, the currency is based on the locale.

number - The input is a spoken full number. In DTMF, * is used as the decimal sign.

phone - The input is a phone number.

time - The input is a spoken time, resulting in a five-character string hhmmx, where x equals a for AM, p for PM, and h for 24-hour notation.

3.1.2 Inline grammars

In Example 3-1 an inline grammar is used, meaning that the valid choices are placed within the <grammar> tags, directly in the voice document. This type is not advisable for more complex grammars.

Note: When using abbreviations in a grammar, be sure to use CAPITALS. This instructs the TTS and ASR engines to treat the word as an abbreviation.
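As a quick illustration of the built-in types listed above, a built-in grammar is selected simply by setting the type attribute on a <field>. The field names and prompts below are illustrative:

```
<field name="scoops" type="number">
  <prompt>How many scoops would you like?</prompt>
</field>

<field name="wantCone" type="boolean">
  <prompt>Would you like a cone with that?</prompt>
</field>
```

When filled, scoops holds the recognized number, and wantCone holds true or false, with no grammar authoring required.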


Example 3-1 XML inline grammar

<field name="conetype"> <!-- field name is illustrative -->
  <prompt>Regular or French?</prompt>
  <grammar version="1.0" root="conetype" type="application/srgs+xml">
    <rule id="conetype">
      <one-of>
        <item>regular</item>
        <item>french</item>
      </one-of>
    </rule>
  </grammar>
</field>

A more advanced use of inline grammars is when the actual grammar choices are also used to speak back the valid options to the caller. This is usually done in <menu> structures.

Example 3-2 Menu dynamic grammar using <enumerate>

<menu>
  <prompt>Select one of the following numbers: <enumerate/></prompt>
  <choice next="#one">One</choice>
  <choice next="#two">Two</choice>
  <choice next="#three">Three</choice>
</menu>

Example 3-2 will generate the following menu prompt: Select one of the following numbers: One, Two, Three. The grammar is dynamically generated to be One or Two or Three, and when an option is selected, control goes to the corresponding item.

XML versus ABNF

The SRGS specification dictates that a VoiceXML 2.0 compliant browser must support the XML (Extensible Markup Language) format and should support the ABNF (Augmented Backus-Naur Form) format. By definition, the ABNF format is shorter and more readable than the XML format; however, it may be less portable, because support for it is not required.

Compare the syntax in Example 3-3 to Example 3-1.

Example 3-3 ABNF inline grammar

<field name="conetype"> <!-- field name is illustrative -->
  <prompt>Regular or French?</prompt>
  <grammar mode="voice" type="application/srgs">
    #ABNF 1.0;
    language en-US;
    mode voice;
    root $rule3;
    public $rule3 = regular|french;
  </grammar>
</field>

    We will provide another example of ABNF syntax when we cover external grammars.

Voice portal considerations

Due to the simplicity of inline grammars, voice portals readily recognize their lexicon. Using them requires less effort on your part to achieve pure VoiceXML.


3.1.3 External grammars

As mentioned before, grammars can also be referenced as external files, keeping the voice application separate from the grammar.

XML versus ABNF

Just as with inline grammars, there are two types of external grammars: XML and ABNF. To illustrate the similarities and differences, we use the same example for both.

From a .vxml file, the grammars are referenced as follows:

XML:

<grammar src="IceCreamGrammar.grxml" type="application/srgs+xml"/>

ABNF:

<grammar src="IceCreamGrammar.gram" type="application/srgs"/>

    How to handle this reference in the voice portal is discussed in Voice portal considerations on page 42.

    In this example, we demonstrate how to use multiple rules in the external grammar to return only the desired results to the voice application. We will do this for both XML and ABNF types.

The challenge for this example is to enable the caller to say sentences like:

    Give me chocolate ice cream please.
    I will have vanilla ice cream please.
    Rocky Road would be nice.

Only chocolate, vanilla, and rocky road should be returned to the voice application for further navigation.

Let's start with the ABNF grammar shown in Example 3-4.

Example 3-4 ABNF external grammar

#ABNF 1.0 iso-8859-1;
mode voice;
root $main_rule;
tag-format <semantics/1.0>;

$beginning = I would like | I love | Ill have | Make it | Give me;
$icecream = $flavor (ice cream)<0-1> {$ = $flavor};
$flavor = (chocolate|vanilla|rocky road);
$closure = please | would be nice;

public $main_rule = $beginning<0-1> $icecream $closure<0-1> {$ = $icecream};

In the above example there are five rules defined:

$main_rule: as the name indicates, this is the main rule, the one that is public in scope.
$beginning: this grammar rule captures the beginning of the sentence.
$icecream: this rule captures the actual flavor, including a possible post-utterance of "ice cream". Only the value of $flavor is tagged.
$flavor: this is the actual desired result and the most important part of this grammar. It is the basis for further use in the application.
$closure: this is the ending of the sentence.

    Note: WebSphere Portals current architecture does not allow the dynamic updating of an external grammar without the use of a supporting servlet.


The <0-1> denotes that the preceding utterance can occur zero times or once; in other words, it is optional. In the $icecream rule, {$ = $flavor} sets the result of the rule to the mandatory input and leaves out the optional "ice cream". The same goes for $main_rule, which uses the other rules to pass the results back to the application. As you can see, the beginning and closing statements are optional in $main_rule.

    In order to test whether this grammar satisfies your application needs, refer to Appendix A, Additional tools in the Voice Toolkit. Appendix A also describes a way of generating all the possible sentences covered by this grammar.

    This is how Example 3-4 would look in the XML type format:

Example 3-5 XML external grammar

<?xml version="1.0" encoding="iso-8859-1"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         mode="voice" root="main_rule" tag-format="semantics/1.0">

  <rule id="beginning">
    <one-of>
      <item>I would like</item>
      <item>I love</item>
      <item>Ill have</item>
      <item>Make it</item>
      <item>Give me</item>
    </one-of>
  </rule>

  <rule id="icecream">
    <ruleref uri="#flavor"/>
    <item repeat="0-1">ice cream</item>
    <tag>$ = $flavor</tag>
  </rule>

  <rule id="flavor">
    <one-of>
      <item>vanilla</item>
      <item>chocolate</item>
      <item>rocky road</item>
    </one-of>
  </rule>

  <rule id="closure">
    <one-of>
      <item>please</item>
      <item>would be nice</item>
    </one-of>
  </rule>

  <rule id="main_rule" scope="public">
    <item repeat="0-1"><ruleref uri="#beginning"/></item>
    <ruleref uri="#icecream"/>
    <item repeat="0-1"><ruleref uri="#closure"/></item>
    <tag>$ = $icecream</tag>
  </rule>

</grammar>


  • Example 3-5 demonstrates the lengthy nature of an XML grammar format.

Voice portal considerations

External grammars take some consideration when used in the voice portal environment. You have to use the Portlet API to encode a URL for the external grammar file.

In order to use the Portlet API, add the portlet tag library declaration to the top of your .jsv file. The grammar file can then be referred to in your application through an encoded URL.

The path (icecream/jsp/vxml/IceCreamGrammar.gram) starts with the name of the project package you are developing. The /jsp/vxml/IceCreamGrammar.gram part is subject to change depending on the location in which you create the grammar file.
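A hedged sketch of the two additions described above, the tag library declaration and the encoded grammar reference, follows. The taglib URI, prefix, and encodeURI tag name are assumptions and may differ in your toolkit version:

```
<%@ taglib uri="/WEB-INF/tld/portlet.tld" prefix="portletAPI" %>

<grammar type="application/srgs"
         src='<portletAPI:encodeURI path="/icecream/jsp/vxml/IceCreamGrammar.gram"/>'/>
```

Encoding the URL through the Portlet API lets the portal rewrite the grammar reference so the voice browser can fetch it from the correct context path.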

3.1.4 Reviewing grammar results in your application

If you are using optional grammar rules in either your external or inline grammars, you may need to review the full result of the grammar.

    The next snippet will illustrate how to do that:

Example 3-6 VXML example to review grammar results

<field name="mainChoice">
  <prompt>Please make your choice: Vanilla, Chocolate, or Rocky Road.</prompt>
  <grammar src="IceCreamGrammar.grxml" type="application/srgs+xml"/>
  <filled>
    <prompt>
      Your selected flavor is: <value expr="mainChoice"/>,
      You said: "<value expr="mainChoice$.utterance"/>",
      The confidence level: <value expr="mainChoice$.confidence"/>.
    </prompt>
  </filled>
</field>

Example 3-6 also shows why we want the grammar to return only the flavor. The routines in the <filled> section make decisions based on the value of mainChoice, which is filled in by the grammar. If we didn't use tags in the grammar, this routine would fail, because the condition would not be met.

3.2 Help

This section covers the use of help in voice applications.