
Voice Extreme™ Toolkit Programmer’s Manual With Sensory Speech 6 Technology

© 2002 Sensory, Inc.

P/N 80-0200-D


Contents

Overview
The Development Environment
    The Editor
    The Project Manager
    Programmer
    Program Options
    Error Management
    Debugging Terminal
    Folder Structure
    Commands
        File Menu
        Edit Menu
        View Menu
        Macros Menu
        Project Menu
        Tools Menu
        Window Menu
        Help Menu
        Toolbar
        Taskbar
        Statusbar
Sample Projects
    Speaker Independent (sidemo.vep)
    Speaker Independent Math (simath.vep)
    Continuous Listening (clsi.vep)
    Speaker Dependent Sfx (sdsfx.vep)
    Speaker Verification (svdemo.vep)
    Word Spot (wsdemo.vep)
    Record and Play (rpdemo.vep)
    Touch Tones (ttones.vep)
    Music (music.vep)
Voice Extreme™ Development Board
    Voice Extreme™ Module
Creating Your First Voice Extreme™ Project
    Creating a ‘Hello World’ Program
        Creating the Project
        Building the Project
        Downloading
    Extending the ‘Hello World’ Program (add a blinking LED feature)
    Creating a Speaker Independent (SI) Sample Program
    Creating a Speaker Dependent (SD) Sample Program
Voice Extreme™ Language: VE-C Introduction
    VE-C/ANSI C comparison
    VE-C Data Types
        Standard Data Types
        Built-in Sensory Data Types
        Derived Data Types
        Storage Options for the Data Types
    VE-C Language
        Functions, Blocks, Statements and Expressions
        Operators
        Control Statements
        Functions, Calls and Returns
        Preprocessor Features
    Using the Sensory Technologies: Common Issues
        Output Devices
        Jumpout Output
        Debug Output
        Technology Configuration
        Accessing Multiple Results
    Using the Sensory Technologies: Specific Examples
        Speech Synthesis
        Pattern Generation
        Speaker Independent Speech Recognition
        Speaker Dependent Speech Recognition
        Speaker Verification
        Continuous Listening
        WordSpot
        Record and Play
        TouchTones (DTMF)
        Music
        Other Voice Extreme™ Built-In functions
    Debugging and Troubleshooting Tips
        Debugging Functions and Macros
        Run-time Error Checking and Reporting
        Error Messages from the Parser
    Voice Extreme™ Data Files
        Sentence Table Format
        Speech File
        Weights file
        Music file
Voice Extreme™ Language: VE-C Built-in Functions
    Technology Configuration
    Speech Synthesis
    Pattern Generation
    Speaker Independent Recognition
    Speaker Dependent Recognition
    Speaker Verification
    Continuous Listening
    WordSpot
    Record and Play
    DTMF
    Music
    Serial (RS-232) Communication
    Debug Output
    I/O
    Keypad Functions
    Timing Functions
    Utility Functions
Serial Packets Overview
    Serial Packets Implementation
Disclaimer
Voice Extreme™ Toolkit Limited Warranty
SENSORY Software End User License Agreement
The Interactive Speech™ Product Line


Overview

The Voice Extreme™ Integrated Development Environment (IDE) allows quick development of application programs using Sensory Speech™ technologies. Voice Extreme allows a developer to write the application program in a higher-level “C-like” language on a PC, accessing the Sensory Speech technologies through calls to “built-in” functions. The Voice Extreme™ package streamlines development by:

- Allowing programs to be written in VE-C, a higher-level language very similar to the commonly used ANSI C
- Allowing a simple means of linking the program and the data files it requires
- Providing access to technology functions in a manner that conceals many of the details
- Providing access to hardware features such as I/O ports, timers and an RS-232 interface
- Providing a set of “header” files that define commonly used VE-C macros and constants

This help file provides an introduction to the process of developing a Voice Extreme™ application, the Voice Extreme™ language (VE-C) and its data types, Sensory technologies and the Voice Extreme™ IDE software.

The Voice Extreme™ Toolkit contains:

- Voice Extreme™ Development Board
- Voice Extreme™ Module
- Power supply (may be an option)
- Serial cable for PC RS-232 connection
- Quick Start Guide
- Software CD containing the Voice Extreme™ IDE, Quick Synthesis™, sample projects, sample data files and documentation


The Development Environment

Voice Extreme™ IDE is the development environment for creating VE-C applications. It is composed of the following modules:

- Editor with syntax coloring, macro generation, bookmarks, line reminders and other utilities
- Project Management and Project Editing
- Debug and parsing of errors
- Single button builder that parses, assembles, links and checks the project
- Programmer Utilities

Toolbars Include the most important commands that can be selected from the menus.

Working Area This is the working area where all windows are shown. You can move, resize, maximize or iconize each window. Using the Window Menu you can also cascade, horizontally tile or vertically tile all windows.

Statusbar Shows a description of each menu item.

Taskbar Displays an icon and title for each open window. Click on an icon to activate the associated window. You can also find a list of all open windows in the Window Menu.


The Editor

Voice Extreme™ IDE is equipped with an editor which has been specially developed for this VE-C language, but can also be used for standard ANSI C.

[Figure: editor window, with callouts for Bookmark, Line reminder, Vertical window splitter, Horizontal window splitter and Document Toolbar]

The main features are:

- Syntax coloring
- Bookmarks
- Line reminders
- Line numbering
- Vertical/horizontal split
- Tabs customization
- “Auto-Indent” function
- Case control
- Search and replace function
- Macros
- Multi Undo and Redo function
- Text selection: free, word, row, column and full document
- Copy, cut and paste functions
- Printing of the full document or of the selected text
- Color or black & white printing
- Line number printing

Note that a pop-up menu can be activated by clicking the right mouse button on the editor window. This menu contains all main editor functions.


The “Document Toolbar” contains some editor functions.

Save (CTRL-S) Save the document.

Parse (F12) Parse the document (.VEC). This is very useful for checking errors without running the complete Build Process.

Assemble (CTRL-F12) Assemble the Sentence Table source (.VEA).

Toggle bookmark (F5 or Mouse Left Button click on the left margin of the document) Toggle a bookmark on/off. The bookmarks feature allows the programmer to mark one or more lines, and to move quickly between them. Bookmarks are represented by a light blue highlighting of the line number.

Go to previous bookmark (CTRL-UP) Move the cursor to the previous bookmark.

Go to next bookmark (CTRL-DOWN) Move the cursor to the next bookmark.

Clear all bookmarks (CTRL-F5) Clear all bookmarks.

Toggle line reminder (F6 or Mouse Right Button click on the left margin of the document) Toggle a line reminder on/off. Unlike bookmarks, “Line Reminders” are used only for marking a line, and you cannot browse them. They are represented by a small yellow mark near the line number. Note that you can mark a line with both a Bookmark and a Line Reminder.

Clear all line reminders (CTRL-F6) Clear all line reminders.

Toggle Error List Toggle on/off the Error List.

Move cursor (go to…) Enter the desired values in the Row and Col fields, then press this button to move the cursor to that position.
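The difference between bookmarks and line reminders described above can be pictured with a small model. The following is only a sketch in plain ANSI C, not the IDE's implementation: the `Margin` type, the function names and the fixed `MAX_LINES` limit are all hypothetical, and the choice not to wrap around at the ends of the document is an assumption, since the manual does not say what happens there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LINES 1000          /* arbitrary document size for this sketch */

/* Hypothetical model of the editor margin: both kinds of mark can be
 * toggled per line, but only bookmarks can be browsed. */
typedef struct {
    bool bookmark[MAX_LINES];
    bool reminder[MAX_LINES];
} Margin;

/* Toggle a bookmark on a 0-based line (F5 in the editor). */
void toggle_bookmark(Margin *m, int line)
{
    assert(m != NULL && line >= 0 && line < MAX_LINES);
    m->bookmark[line] = !m->bookmark[line];
}

/* Toggle a line reminder on a 0-based line (F6 in the editor). */
void toggle_reminder(Margin *m, int line)
{
    assert(m != NULL && line >= 0 && line < MAX_LINES);
    m->reminder[line] = !m->reminder[line];
}

/* Nearest bookmarked line after the cursor, or -1 if there is none.
 * Assumption: the search does not wrap to the top of the document. */
int next_bookmark(const Margin *m, int cursor)
{
    assert(m != NULL);
    for (int i = (cursor < 0) ? 0 : cursor + 1; i < MAX_LINES; ++i)
        if (m->bookmark[i])
            return i;
    return -1;
}

/* Nearest bookmarked line before the cursor, or -1 if there is none. */
int prev_bookmark(const Margin *m, int cursor)
{
    assert(m != NULL);
    for (int i = ((cursor < MAX_LINES) ? cursor : MAX_LINES) - 1; i >= 0; --i)
        if (m->bookmark[i])
            return i;
    return -1;
}
```

Note that there is deliberately no `next_reminder` function: a reminder on a line is skipped by the navigation functions, and a line may carry both kinds of mark at once.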


The Project Manager

The “Project Manager” enables creation and editing of projects prior to compiling and linking. Each project can contain only one source document (with an unlimited number of includes) and all desired data files (speech, sentence table, weights and music). For documentation purposes, there is a window that describes the project.

[Figure: Project Manager window, with callouts for Toolbar, Project description, Project file list and Horizontal splitter]

The following commands are available:

Add document Use this button to add the main source to the project.

Add Speech Use this button to add one or more speech files to the project.

Add Sentence Table Use this button to add one or more sentence table files to the project.

Add Weights Use this button to add one or more weights files to the project.

Add Music Use this button to add one or more music files to the project.

Remove Use this button to remove the selected file from the project.

Details Use this button to show details of the selected file (name, path, size, etc.).

Rename the project Use this button to rename the project.

Save the project Use this button to save the project. Note that it is enabled only if the project has been modified.

Note that a pop-up menu can be activated by clicking the right mouse button on the file list. This menu contains all main project functions.


Programmer

The Programmer has been implemented to allow more flexible and quicker programming of Voice Extreme™ Modules. To show or hide the Programmer window, go to “Tools” then “Voice Extreme Programmer”.

[Figure: Programmer window, with callouts for Toolbar and Binary file details]

To load the binary file (.VEB), click on the “Load binary file” button. Next, start the download process with the “Start download” button.


Program Options

The Options window allows personalization of the program. Press “Ok” to save all changes and “Cancel” to ignore them.

[Figure: Options window, with callouts for Ignore changes, Save and exit, and Sections]

Section: Editor

Syntax color and style The integrated editor recognizes programming syntax, and allows syntax-sensitive text color and style customization. The following syntax types are supported: Number, Symbol, String, Comment, Preprocessor block, Character, Language reserved words, VE built-in functions, VE macros and VE Constants. First, select the desired syntax group. For each group, use “Fore” (foreground color) and “Back” (background color) sliders to select text color, and “N” (normal), “B” (bold) and “I” (italic) buttons for the style.

Tab size and char The tab size can vary from 2 to 22 characters. If you want to replace the tab with the same number of space characters, disable the option “Use tab char”.

Auto-indent Enables the auto-indent function.

Margin Disable this option to hide the gray margin on the left side of the editor window that contains row numbers, bookmarks and line reminders. In this mode, bookmarks and line reminders will be indicated by row color and not as symbols.

Line numbers Displays line numbers.
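As an illustration of the “Tab size and char” option, the effect of disabling “Use tab char” can be sketched as follows. This is a hypothetical helper in plain ANSI C, not code from the IDE; the name `expand_tabs` and its signature are assumptions. It takes the manual's wording literally and writes the same number of spaces for every tab, whereas a real editor may instead align text to the next tab stop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the IDE: copies `in` to `out`, writing
 * `tab_size` spaces for every tab character. Returns the output length,
 * or -1 if tab_size is outside 2..22 or the buffer is too small. */
int expand_tabs(const char *in, int tab_size, char *out, size_t out_size)
{
    assert(in != NULL && out != NULL);
    if (tab_size < 2 || tab_size > 22 || out_size == 0)
        return -1;

    size_t used = 0;
    for (; *in != '\0'; ++in) {
        size_t need = (*in == '\t') ? (size_t)tab_size : 1;
        if (used + need >= out_size)   /* keep room for the terminator */
            return -1;
        if (*in == '\t')
            memset(out + used, ' ', need);
        else
            out[used] = *in;
        used += need;
    }
    out[used] = '\0';
    return (int)used;
}
```

The range check mirrors the 2 to 22 character limit stated for the tab size; everything else about the helper is illustrative.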

Section: Print

Page header A customized header may be printed at the top of each page by entering text in the window and checking the “Enable” option.

Page footer A customized footer may be printed at the bottom of each page by entering text in the window and checking the “Enable” option.

Note that the text you enter will be centered on the page, with the document name on the left and the page number on the right.

Use syntax colors Enables color printing (the colors correspond to those set in the editor).

Print line numbers Enables printing of line numbers.


Section: Tools

Includes folder (VEinclude) (Builder) This is the path of the folder that holds included files. The parser searches this folder for any file included with #include <filename.veh>. Note that if the file to be included is in the same folder as the document, use #include "filename.veh".

Close Build window automatically (Builder) Enable this option if you want the Build window to close automatically at the end of the process. If there are errors, the window will remain open so that you can review them.

Serial port (Programmer) This parameter sets the number of the serial port that is connected to the development board.

Close Download window automatically (Programmer) Enable this option if you want the Download window to close automatically at the end of the process. If there are errors, the window will remain open so that you can review them.
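The two #include forms described under “Includes folder” differ only in where the file is looked up. The rule can be modelled with the short sketch below, written in plain ANSI C. It is not the parser's actual code: `resolve_include`, its parameters and the example folder names are hypothetical, and the sketch assumes that the angle-bracket form is resolved only against the includes folder and the quoted form only against the document's folder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the lookup rule, not the IDE's parser.
 * directive: text such as "#include <ve.veh>" or "#include \"my.veh\"".
 * Writes "<folder>\<file>" to out and returns 0, or returns -1 if the
 * directive is malformed or the result does not fit in out. */
int resolve_include(const char *directive, const char *include_dir,
                    const char *document_dir, char *out, size_t out_size)
{
    assert(directive != NULL && include_dir != NULL);
    assert(document_dir != NULL && out != NULL);

    const char *open = strpbrk(directive, "<\"");
    if (open == NULL)
        return -1;

    char close_ch = (*open == '<') ? '>' : '"';
    const char *close = strchr(open + 1, close_ch);
    if (close == NULL || close == open + 1)
        return -1;

    /* <file> uses the includes folder, "file" uses the document folder. */
    const char *base = (*open == '<') ? include_dir : document_dir;
    int name_len = (int)(close - open - 1);
    int n = snprintf(out, out_size, "%s\\%.*s", base, name_len, open + 1);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

For example, with an includes folder of `C:\VE\Include` and a document stored in `C:\Proj`, the directive `#include <ve.veh>` would resolve to `C:\VE\Include\ve.veh`, while `#include "my.veh"` would resolve to `C:\Proj\my.veh`.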

Section: Application

Project default folder In this text box you can specify the default project folder where Voice Extreme™ IDE will store your projects (to browse, click the folder icon on the right).

Action performed at application startup This parameter allows certain actions to be performed automatically upon each launch of the Voice Extreme™ IDE:

- No action
- Create a new document
- Load last document
- Load last project / Load main source when project is loaded

Make a backup copy Enable this parameter to create a backup copy (.BAK) of each saved document or project.


Error Management

During the Build or Parse process, all parser errors are shown at the bottom of the editor window.

[Figure: editor window with the Error List below the Editor, separated by a Splitter]

An error list appears automatically if errors occur. The error report includes the line number, error type and description. Clicking on an error positions the cursor on the corresponding line of code. The line is highlighted according to the class of error: red (error) or yellow (warning). When editing is resumed on the line containing the error, the highlighting disappears. The error list can be toggled on/off using the “Toggle Error List” button, or resized by moving the splitter up/down.


Debugging Terminal

The Debugging Terminal can be used to receive debug data over the serial port from a Voice Extreme™ application that uses the Debug Output functions (see Debug Output section).

The Debugging Terminal is very easy to use: just open it and press the “Open Connection” button. This makes the terminal ready to receive data. Note that during the Download process, the terminal connection will be temporarily closed until the process ends.


Folder Structure

After a successful installation of Voice Extreme™ IDE, you will find the following folder structure (usually under the "c:\Program Files" folder, but this may vary depending on Windows localization):

Sensory                 This is the main folder where all Sensory programs are installed.
VoiceExtreme            All program files are stored in this folder.
Bin                     This folder contains all executables used by the build process.
Docs                    This folder contains the Voice Extreme™ documentation.
Samples                 This is the default folder for storing project files.
Samples\Coach           This folder contains the new “Coach Extreme” sample project.
Samples\Source          This folder contains all example project sources.
Samples\Data\Music      This folder contains all sample music files.
Samples\Data\Speech     This folder contains all sample speech files.
Samples\Data\Weights    This folder contains all sample weights sets plus some extra weights in English, German (DemoSI) and Japanese (DemoSI).
Include                 This is the default include folder (VEinclude). Note that the ‘ve.veh’ file is located in this folder.
Include\Speech          This folder contains all speech include files.
Include\Weights         This folder contains all weights include files.

Note that the sample files are provided for demonstration purposes only. They are not intended for use in final products and such use is strictly prohibited.


Commands

File Menu

The “File Menu” contains all commands to manage a document (new, load, save, print, etc.) as well as a list of the last five viewed documents.

New… Use this button to create a new document ("untitled.vec"). There are three options:

- Empty document (CTRL-N)
- Standard Voice Extreme™ document (CTRL-SHIFT-N)
- Standard Sentence Table document

Open… (CTRL-O) Load a document.

Save (CTRL-S) Save the current document.

Save as… Save the document under a different name.

Print… (CTRL-P) Print the entire document or the selected text.

Exit Close the Voice Extreme™ IDE program.

Edit Menu

The “Edit Menu” contains commands to manage the text (copy, cut, paste, select, etc.) as well as the standard “Search” and “Replace”.

Undo (CTRL-Z) Undo the last edit.

Redo (CTRL-Y) Redo the last undone edit.

Copy (CTRL-C or CTRL-INS) Copy selected text to the clipboard.

Cut (CTRL-X or SHIFT-DEL) Delete selected text and copy it to the clipboard.

Paste (CTRL-V or SHIFT-INS) Paste the clipboard contents into the document.

Select line (CTRL-L) Select an entire line of text.

Select all (CTRL-A) Select the entire document.

Find (CTRL-F) Show the standard “Search text” window.

Find next (F3) Repeat the last search.

Replace (F4) Search and replace text.

View Menu The “View Menu” allows toolbars to be shown or hidden. In addition, details of the current document as well as the Options window can be displayed.

Document Toolbar Show/hide the Document toolbar.

Project Toolbar Show/hide the Project toolbar.

Bookmarks Toolbar Show/hide the Bookmarks toolbar.

Debug Toolbar Show/hide the Debug toolbar.

Application Toolbar Show/hide the Application toolbar.

Tools Toolbar Show/hide the Tools toolbar.

Taskbar Show/hide the Taskbar.

Status Bar Show/hide the Status bar.

Document details (CTRL-D) Show all document details (name, path, size, etc.).

Macros Menu The “Macro Menu” can be used to paste commonly used code blocks into the document. Note that the Macro will replace any selected text.

Document header Import the standard document header.

Comment separator (line) (CTRL-SHIFT-L) Import the standard line separator.

Comment block (CTRL-SHIFT-B) Import a comment block. Note that unlike other macros, this one does not replace the selected text, but places it between the start (/*) and end (*/) comments.

Main () Import the main() block.

If (CTRL-SHIFT-I) Import the if() statement.

If…Else (CTRL-SHIFT-E) Import the if() else statement.

While (CTRL-SHIFT-W) Import the while() statement.

Do…While (CTRL-SHIFT-D) Import the do while() statement.

Switch…case (CTRL-SHIFT-C) Import the switch case statement.

#Include <…> Import the standard #include statement. Note that before importing this macro you will be prompted to select the file to be included. The file path will be inserted between “<” and “>”, which becomes necessary when the file to be included is in a different folder.

#Include <ve.veh> Import the standard #include <ve.veh> statement.

Project Menu The “Project Menu” contains all commands for Project management (new, load, save, build, etc.) as well as a list of the last five viewed documents.

New… (CTRL-SHIFT-N) Create a new project ("untitled.vep")

Open… (CTRL-SHIFT-O) Load a project.

Save (CTRL-SHIFT-S) Save the current project.

Save as… Save the project under a different name.

Build (F9) Build the current project.

Download (F10) Download the project binary file (.VEB) into the Voice Extreme™ Module.

Project window (F11) Show or hide the project window.

Tools Menu The “Tools Menu” contains all utility commands.

Quick Synthesis Launch Sensory Quick Synthesis™. This button is enabled only if Quick Synthesis™ has been installed.

Debug Terminal Show the integrated Voice Extreme™ Debug Terminal.

Voice Extreme™ Programmer Show the integrated Voice Extreme™ Programmer, very useful for programming multiple modules.

Options… Show the program options window.

Windows calculator Launch the standard Windows Calculator (calc.exe).

Window Menu The “Window Menu” contains all commands necessary to manage the program windows. It also includes a dynamic list of open windows (same function as the Taskbar).

Cascade Cascades all open windows.

Tile horizontal Tiles all open windows horizontally.

Tile vertical Tiles all open windows vertically.

Arrange icons Arranges all iconized windows at the bottom of the program.

Help Menu

Help contents (F1) Show the Voice Extreme™ help file.

Sensory web site… Launch your default browser and connect to the Sensory web site.

About… Retrieve information about the software.

Toolbar The Toolbar contains the most frequently used buttons from the menus. Each toolbar can be moved on the desktop or docked on the left, right, top, or bottom of the window. When you close the program, your toolbar configuration is saved.

Taskbar The Taskbar shows a list of all open windows (similar to the Windows Taskbar).

Statusbar The Statusbar gives you a description of menu items or any program messages.


Sample Projects

Speaker Independent (sidemo.vep) This program demonstrates the Speaker Independent Speech Recognition technology. Operation: After an initial BEEP the program loops forever, waiting for button presses.

Press Button A for a full prompt.

Press Button B for a short prompt.

Respond to the prompt by saying one of the 6 possible words. The program tells you what word you said or announces an error. Notes: The program is linked with both a SPEECH and a WEIGHTS data file. This program calls the PatGenW and Recog functions and checks their return values. It illustrates the use of the confidence level stored in the WEIGHTS file and special processing for NOTA (None Of The Above) recognition. Files:

sidemo.vec (source) sidemo.ves (speech) si6.vew (weights)

Speaker Independent Math (simath.vep) This program demonstrates the Speaker Independent Speech Recognition technology, using the Prior function to emphasize the correct answer in a weight set. Operation: After an initial BEEP the program loops forever, waiting for the user to press Button A. When the button is pressed, the program generates a random math problem, asks it and waits for an answer. If the answer has a low confidence level, the program re-prompts for confirmation, and then announces the final Correct/Incorrect result. Note: This program illustrates the use of more than one weight set, using Prior to emphasize the right answer, accessing the multiple results from recognition and using them to re-prompt for confirmation.

Files: simath.vec (source) simath.ves (speech) noyes.vew (weights) digits.vew (weights)

Continuous Listening (clsi.vep) This program demonstrates Continuous Listening used with the Speaker Independent technology. Operation: The program loops forever, waiting for the user to say "Place call" and beeps at the beginning of each loop. It lights the green LED to signal that it is "ready" for the first word, the red LED when it is "busy" processing either word, and the yellow LED to signal that it is "triggered" for the 2nd word. It lights all of the LEDs to signal that it is "activated"; i.e. it has recognized both words. It also announces "Sensory Powered". If any PatGen or Recognition error condition occurs, the program beeps and waits again for "Place call". Note: This program illustrates the use of the confidence levels stored in the weights file and the use of word durations to speed rejection of inappropriate words. It also can be edited to show the effects of the different MicDistances. Files:

clsi.vec (source) sensopow.ves (speech) sidemo.ves (speech) cldemoa.veo (sentence table) place.vew (weights) call.vew (weights)
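The two-word sequence described for clsi.vep can be pictured as a small state machine. The following is a minimal sketch in plain C, not the sample's actual source: the state names, the LED flag values and the functions cl_next and cl_leds are hypothetical and exist only to illustrate the ready / triggered / activated flow.

```c
#include <stdbool.h>

/* Hypothetical state names for the two-word "Place call" sequence. */
typedef enum { CL_READY, CL_TRIGGERED, CL_ACTIVATED } ClState;

/* Hypothetical LED flags, used only to name which LEDs are lit. */
enum { LED_GREEN = 1, LED_YELLOW = 2, LED_RED = 4 };

/* Advance the state after one recognition attempt.
   word_ok is true when the expected word was recognized without error. */
ClState cl_next(ClState s, bool word_ok)
{
    if (!word_ok)
        return CL_READY;                     /* any error: wait for "Place" again */
    switch (s) {
    case CL_READY:     return CL_TRIGGERED;  /* first word heard */
    case CL_TRIGGERED: return CL_ACTIVATED;  /* second word heard */
    default:           return CL_READY;      /* start a new loop */
    }
}

/* LEDs lit while waiting in each state (red alone means "busy"). */
int cl_leds(ClState s)
{
    switch (s) {
    case CL_READY:     return LED_GREEN;
    case CL_TRIGGERED: return LED_YELLOW;
    default:           return LED_GREEN | LED_YELLOW | LED_RED;
    }
}
```

Starting from CL_READY, two successful calls to cl_next reach CL_ACTIVATED, and a failed attempt at any point returns to CL_READY.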


Speaker Dependent Sfx (sdsfx.vep)

This program demonstrates Speaker Dependent technology. Operation: After an initial BEEP the program loops forever, waiting for button presses.

Button A initiates a recognition phase, where the program prompts for a command, then attempts to recognize it and emit the corresponding sound (it Beeps if none trained).

Button B initiates the training phase, where the program prompts for up to six words (e.g. FORWARD, LASER, SHIELD, TARGET, ROTATE, STOP). This phase can be aborted by pressing Button A or by not responding to the prompt for the next password (it beeps if all are trained).

Button C announces all commands already trained (beeps if none trained).

Notes: The program is linked with two sentence tables, one for the sound effects and one for prompts and error messages.

It illustrates the use of simultaneous PatGen and Record, where the actual voice pattern for each command is saved and can be replayed. It illustrates the use of variables in Flash (non-volatile) memory, so that commands trained during one session are still available in later sessions (i.e. after a system reset). It illustrates the use of RecogSD to compare a newly trained template with others already in the set, in order to reject a command which is too similar to one already trained. The lines which actually configure and output signals in response to the various commands have been commented out, but serve as examples of how one might use such a feature.

Files: sdsfx.vec (source) sddemo.ves (speech) lasers.vea (speech) sddemoa.veo (sentence table) lasersa.veo (sentence table) si6.vew (weights)
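The button behavior described for sdsfx.vep depends on how many of the six command slots are already trained. The bookkeeping might look like the sketch below, written in plain C. The trained[] flag array and both function names are assumptions made for illustration; the sample itself keeps its templates in Flash.

```c
#define MAX_COMMANDS 6   /* the sample trains up to six words */

/* Return the index of the first untrained slot, or -1 if all are trained
   (the case where Button B only beeps). */
int next_free_slot(const unsigned char trained[MAX_COMMANDS])
{
    int i;
    for (i = 0; i < MAX_COMMANDS; i++)
        if (!trained[i])
            return i;
    return -1;
}

/* Count trained commands; zero is the case where Buttons A and C only beep. */
int trained_count(const unsigned char trained[MAX_COMMANDS])
{
    int i, n = 0;
    for (i = 0; i < MAX_COMMANDS; i++)
        if (trained[i])
            n++;
    return n;
}
```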

Speaker Verification (svdemo.vep) This program demonstrates Speaker Verification technology. Operation: After an initial BEEP the program loops forever, waiting for button presses. The user may want to run HyperTerminal to see debug printout from the Speaker Verification process.

Button A initiates a recognition phase, where the program waits for the user to say each password in order (it BEEPs if none trained).

Button B initiates the training phase, where the program prompts for up to four passwords. The red LED is on during Pattern Generation. This phase can be aborted by pressing Button A or by not responding to the prompt for the next password (it BEEPs if all are trained).

Button C allows the user to cycle the "level" which controls the tradeoff between false accepts and false rejects (1-5, with 5 being the strictest).

Notes: This program illustrates direction of debug output to the serial port and also the need to temporarily disable the RS-232 lines during PatGen. It also uses the #pragma directives. Files:

svdemo.vec (source) svdemo.ves (speech) svdemoa.veo (sentence table)
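Cycling the strictness "level" with Button C amounts to stepping through 1 to 5 and wrapping around. A minimal sketch in plain C follows; the function name next_level and the max parameter are hypothetical and are not taken from svdemo.vec.

```c
/* Step a strictness setting through 1..max, wrapping back to 1.
   Out-of-range input is also reset to 1. */
int next_level(int level, int max)
{
    return (level < 1 || level >= max) ? 1 : level + 1;
}
```

Calling next_level(level, 5) on each Button C press would give the sequence 1, 2, 3, 4, 5, 1, and so on.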

Word Spot (wsdemo.vep) This program demonstrates Word Spot technology. Operation: After an initial BEEP the program loops forever, waiting for button presses.

Button A initiates a recognition phase, where the program waits for the user to say each password in order (it BEEPs if none trained). The green LED comes on when the program is listening for the first word, the yellow LED comes on when the program is listening for the second. When the second word is recognized, the red LED comes on for a second. If a timeout occurs, the program double BEEPs.


Button B initiates the training phase, where the program prompts for up to two passwords. The red LED is on during Pattern Generation. This phase can be aborted by pressing Button A or by not responding to the prompt for the next password (it BEEPs if all are trained).

Button C allows the user to cycle the "knob" which controls the tradeoff between false accepts and false rejects (1-4, with 4 being the strictest).

Notes: This program illustrates the use of prompted Word Spotting (with no timeout when listening for the first word of a sequence) and non-prompted (3 second timeout when listening for the second word of the sequence). Files:

wsdemo.vec (source) sddemo.ves (speech) wsdemoa.veo (sentence table)

Record and Play (rpdemo.vep) This program demonstrates Record and Play technology. It illustrates the 3 compression levels (4-bit, 3-bit, 2-bit) and the amount of storage required for various lengths and compression levels of recording. Operation: After an initial BEEP the program loops forever, waiting for button presses.

Button A allows you to make a single recording, which you can stop by pressing any button. The size in blocks of the recording is announced and the uncompressed recording is played.

Button B allows you to compress the recording to the next level (3-bit or 2-bit). The size in blocks of the recording is announced and the compressed recording is played.

Button C replays the current recording at the current compression level, announcing the level first.

Notes: This demo illustrates the use of variables in Flash (non-volatile) memory, so that the compression level during one session is still available in later sessions (i.e. after a Reset). Files:

rpdemo.vec (source) rpdemo.ves (speech)
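To get a feel for how the compression level affects storage, the arithmetic can be sketched in plain C. This is a rough estimate only: it assumes storage scales directly with bits per sample and ignores any header or framing overhead. The manual does not state the sample rate or the block size, so samples and block_bytes are hypothetical parameters supplied by the caller.

```c
/* Bytes needed for `samples` audio samples at `bits` bits per sample,
   rounded up to a whole byte. */
unsigned long recording_bytes(unsigned long samples, unsigned bits)
{
    return (samples * bits + 7UL) / 8UL;
}

/* Whole storage blocks needed for `bytes`, rounding up.
   block_bytes is an assumed value; the real block size is not given here. */
unsigned long recording_blocks(unsigned long bytes, unsigned long block_bytes)
{
    return (bytes + block_bytes - 1UL) / block_bytes;
}
```

Under these assumptions a 2-bit recording takes half the space of the same recording at 4 bits.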

Touch Tones (ttones.vep) This program demonstrates DTMF technology. Operation: After an initial BEEP the program loops forever, waiting for the user to press Button A or B. Depending on which button is pressed, the program announces and dials a local or long distance phone number, preceded by a dial tone. Notes: This program illustrates the use of strings, pointers and "const" variables. Files:

ttones.vec (source) numbers.ves (speech)
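For background, each DTMF digit is a pair of tones, one from a low-frequency group and one from a high-frequency group. The sketch below, in plain C, looks up the standard pair for a key using a const string, much as the sample's notes suggest. It does not call any toolkit function; DtmfPair and dtmf_lookup are names invented for this illustration.

```c
#include <string.h>

typedef struct { int low_hz; int high_hz; } DtmfPair;

/* Look up the standard DTMF tone pair for a keypad character.
   Returns 1 and fills *out on success, 0 for an invalid key. */
int dtmf_lookup(char key, DtmfPair *out)
{
    static const char keys[] = "123A456B789C*0#D";    /* 4x4 grid, row-major */
    static const int rows[4] = { 697, 770, 852, 941 };
    static const int cols[4] = { 1209, 1336, 1477, 1633 };
    const char *p;

    if (key == '\0')
        return 0;               /* strchr would match the terminator */
    p = strchr(keys, key);
    if (p == NULL)
        return 0;
    out->low_hz = rows[(p - keys) / 4];
    out->high_hz = cols[(p - keys) % 4];
    return 1;
}
```

Dialing a number stored as a string is then a loop over its characters, looking up one tone pair per digit.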

Music (music.vep) This program demonstrates Music. Operation: When all LEDs are on, press one of the 3 buttons to play one of 3 tunes. The corresponding LED lights while the tune is playing. Pressing any key on the 3X4 keypad stops the current tune. Notes: The program is linked with both NOTES and TUNES, which occupy ~64K of the .veb file size. Files:

music.vec (source) musicdwc.vem (music) musiclis.vem (music)


Voice Extreme™ Development Board The Voice Extreme™ Development board is the larger of the two circuit boards in the Voice Extreme™ Toolkit, and provides a convenient means of interfacing the Voice Extreme™ Module with the development PC, as well as many handy “accessory” functions. The Development Board offers several features:

Speaker The onboard speaker has a fixed volume and is intended for application debugging. For better audio quality, use of an external speaker with adjustable volume is recommended. Plugging an external speaker into the speaker jack will disable the onboard speaker.

Prototyping Area A grid of 0.1” through-holes for use by the application developer to add external circuitry.

RS-232 Port A 9-pin connector for connecting the serial cable to the development PC.

I/O Port A standard 20-pin shrouded header with 0.1” centers for wiring the I/O lines from the development board to the target application. The I/O pins are initially assigned and configured as:

I/O     Description
P0.0    Serial port input from Host (RCV)
P0.1    Serial port output to Host (XMT)
P0.2    Unused, input, weak pull-up
P0.3    Unused, input, weak pull-up
P0.4    Unused, input, weak pull-up
P0.5    DO NOT USE (Flash address bit A16)
P0.6    DO NOT USE (Flash address bit A17)
P0.7    Unused, input, weak pull-up
P1.0    Button A, input, strong P/U (0=pressed)
P1.1    Button B, input, strong P/U (0=pressed)
P1.2    Button C, input, strong P/U (0=pressed)
P1.3    Red LED, output (0=on)
P1.4    Unused, input, weak pull-up
P1.5    Green LED, output (0=on)
P1.6    Yellow LED, output (0=on)
P1.7    Serial port enable, output (0=off, 1=on)


If an application is stand-alone, the two serial I/O pins, P0.0 and P0.1, and the serial port enable, P1.7, may be used for other purposes; note, however, that programs are downloaded via asynchronous serial I/O.

Since I/O pins P0.5 and P0.6 are connected to the address bus of the Flash memory, they should not be used under any circumstances.

A 4X5 Matrix Keypad is supported. When the keypad is being scanned, the columns are driven (active low) and the rows are sensed (pulled high). All previous configuration and output values for these pins are saved before the scan and restored afterwards. The I/O pinouts are as follows:

Pin     P0.5  P1.5  P0.6  P1.6  P0.2
P0.3    1     2     3     A     E
P1.3    4     5     6     B     F
P0.4    7     8     9     C     G
P1.4    *     0     #     D     H
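The keypad table maps each row/column crossing to a key legend. The sketch below, in plain C, shows how a scan result might be decoded with that table. It is only an illustration: the toolkit performs the actual scan itself, and keypad_key, keypad_decode and the low_rows[] representation are hypothetical.

```c
#define KEYPAD_ROWS 4
#define KEYPAD_COLS 5

/* Key legends from the table: rows are sensed on P0.3, P1.3, P0.4, P1.4;
   columns are driven on P0.5, P1.5, P0.6, P1.6, P0.2. */
static const char keypad_map[KEYPAD_ROWS][KEYPAD_COLS + 1] = {
    "123AE", "456BF", "789CG", "*0#DH"
};

/* Return the key at (row, col), or 0 when the position is out of range. */
char keypad_key(int row, int col)
{
    if (row < 0 || row >= KEYPAD_ROWS || col < 0 || col >= KEYPAD_COLS)
        return 0;
    return keypad_map[row][col];
}

/* Decode one scan. low_rows[c] is a bit mask of the rows that read low
   while column c was driven low. Returns the first pressed key, or 0. */
char keypad_decode(const unsigned char low_rows[KEYPAD_COLS])
{
    int r, c;
    for (c = 0; c < KEYPAD_COLS; c++)
        for (r = 0; r < KEYPAD_ROWS; r++)
            if (low_rows[c] & (1u << r))
                return keypad_map[r][c];
    return 0;
}
```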

Voice Extreme™ Module The Voice Extreme™ (VE) module is the heart of the Voice Extreme™ system, and is intended to be removed from the Development Board and wired into the target application.

Microphone As with the speaker, the onboard microphone is intended for application debugging. In an actual product, the microphone should be mounted in a location that is convenient for the user and will result in the highest possible signal-to-noise ratio. Plugging an external microphone into the external microphone jack will disable the onboard microphone.

Power Jack A 2.1mm jack for connecting the external wall-mount power supply. Power regulation is provided on the development board. Polarity of the connector is indicated on the silkscreen of the printed circuit board.

External Speaker Jack Accepts speakers with stereo or mono 3.5mm plugs. Stereo speakers will only play out of one speaker. This output is intended to drive 8-ohm speakers, and provides up to 350 mW of power.

External Microphone Jack Accepts microphones with stereo or mono 3.5mm plugs.

Reset Switch Hardware reset of the VE Module.

Download Switch Places the VE Module in a state such that it is waiting for a program to be downloaded from the development PC.

Switch A, B and C Tactile momentary buttons available for use by the application developer.

LED 1, 2 and 3 Green, Yellow and Red LEDs available for use by the application developer.

Power LED A green LED indicating power has been connected to the board.


Voice Extreme™ Module The Voice Extreme™ (VE) module is the heart of the Voice Extreme™ system, and is intended to be removed from the Development Board and wired into the target application. The VE module offers several features:

Mounting Hole Four mounting holes are provided for secure mechanical connections in mounting the module in an application. When the module header connector is used to connect the module to another board, the mounting holes are normally not required.

Module Connector A standard header with 0.1” centers to carry signals between the Development Board and the Module.

VE IC The Voice Extreme™ Integrated Circuit (VE IC) is the central speech processor. Please refer to the Voice Extreme™ IC Datasheet for a detailed specification.

VE ROM This 64KB OTP ROM contains the VE interpreter and the speech technology code. If the VE IC is purchased from Sensory, the VE interpreter and speech technology code are masked into an internal ROM on the VE IC. In this case, the external ROM is not required.

Microphone Gain Resistor An optional microphone resistor may be added to the VE module. From the factory, the microphone gain is pre-set to a level suitable for arms-length user interfaces. An application note is available for selecting a gain resistor for other applications.

Osc 1 Crystal A 14.318 MHz crystal used to establish the frequency of Timer 1 on the VE IC. Timer 1 is used for all speech recognition functions.

Osc 2 Crystal A 32.768 kHz crystal used to establish the frequency of Timer 2 on the VE IC. Timer 2 is useful for timekeeping applications since it is not interrupted, unlike Timer 1. Note that Timer 2 is not available on the TQFP versions of the VE IC.

2MB Flash Memory This memory is required on the VE module and all VE applications. Due to the powerful dynamic memory handler of VE system software, this Flash is designed to store the application code, speaker independent weight sets, speech templates, record and playback data, program data, and music data.
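A 32.768 kHz crystal is convenient for timekeeping because 32768 is exactly 2^15, so a count of oscillator cycles converts to whole seconds with a 15-bit shift. The sketch below shows that arithmetic in plain C. It says nothing about how Timer 2 is actually prescaled or read on the VE IC; the function names and the raw tick count are assumptions for illustration.

```c
#define OSC2_HZ 32768UL   /* 32.768 kHz crystal: 2^15 cycles per second */

/* Whole seconds elapsed after `ticks` oscillator cycles. */
unsigned long ticks_to_seconds(unsigned long ticks)
{
    return ticks >> 15;   /* same as ticks / OSC2_HZ */
}

/* Split a second count into hours, minutes and seconds. */
void seconds_to_hms(unsigned long s, unsigned *h, unsigned *m, unsigned *sec)
{
    *h = (unsigned)(s / 3600UL);
    *m = (unsigned)((s % 3600UL) / 60UL);
    *sec = (unsigned)(s % 60UL);
}
```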


Voice Extreme™ Module Connector A standard 34-pin header with 0.1” centers that carries signals between the Development Board and the Voice Extreme™ module. It is referenced as J3 on the Development Board and J1 on the VE Module.

Pin  Description    Pin  Description                                Pin  Description
1    GND            13   DOWNLOAD                                   25   P1.0
2    GND            14   RESET                                      26   P1.1
3    GND            15   VDD                                        27   P1.2
4    MIC-IN         16   PDN                                        28   P1.3
5    N.C.           17   P0.0 Serial port input from Host (RCV)     29   P1.4
6    DAC-OUT        18   P0.1 Serial port output to Host (XMT)      30   P1.5
7    GND            19   P0.2                                       31   P1.6
8    GND            20   P0.3                                       32   P1.7 Serial port enable, output (0=off, 1=on)
9    AUDIO-RET      21   P0.4                                       33   GND
10   AUDIO-OUT      22   P0.5 DO NOT USE (Flash address bit A16)    34   PND
11   PWM0           23   P0.6 DO NOT USE (Flash address bit A17)
12   PWM1           24   P0.7

When designing your own hardware, note that the above table references the VE Module (NOT the development board).

Note also that if an application is stand-alone, the two serial I/O pins, P0.0 and P0.1, and the serial port enable, P1.7, may be used for other purposes; however, programs are downloaded via asynchronous serial I/O.

Since I/O pins P0.5 and P0.6 are connected to the address bus of the Flash memory, they should not be used under any circumstances.

A 4X5 Matrix Keypad is supported. When the keypad is being scanned, the columns are driven (active low) and the rows are sensed (pulled high). All previous configuration and output values for these pins are saved before the scan and restored afterwards. The I/O pinouts are as follows:

Pin     P0.5  P1.5  P0.6  P1.6  P0.2
P0.3    1     2     3     A     E
P1.3    4     5     6     B     F
P0.4    7     8     9     C     G
P1.4    *     0     #     D     H


Getting Started

What is a Voice Extreme™ Application? Before we get started, you may find it helpful to review an overview of the Voice Extreme™ system. (If you just want to get going, you can safely skip to Creating Your First Voice Extreme™ Project.)

A Voice Extreme™ application consists of a program file plus any data files it needs, linked together into a binary image file that can be downloaded to a 2Mbyte Flash data memory. This program is then executed by the Voice Extreme™ interpreter, which runs in program memory built into the VE IC. Please refer to the Voice Extreme Manual for a more detailed explanation and example code.

Voice Extreme™ data files are described in detail in the “Voice Extreme™ Data Files” section of the Help file. They consist of:

Speech synthesis files, also known as vocabulary tables (.VES files)
Speech sentences files (.VEO files)
Weights files, for use with Speaker Independent recognition (.VEW files)
Notes and tunes files, for use with the Music technology (.VEM files)
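The file extensions used throughout this manual identify what each file contributes to a project. As a quick reference, the plain C sketch below classifies a file name by its extension. The enum VeFileKind and the function ve_file_kind are hypothetical helpers written for this illustration and are not part of the toolkit.

```c
#include <ctype.h>
#include <string.h>

typedef enum {
    VE_UNKNOWN, VE_SOURCE, VE_HEADER, VE_PROJECT, VE_BINARY,
    VE_SPEECH, VE_SENTENCES, VE_WEIGHTS, VE_MUSIC
} VeFileKind;

/* Classify a file name by its three-letter extension (case-insensitive). */
VeFileKind ve_file_kind(const char *name)
{
    static const struct { const char *ext; VeFileKind kind; } table[] = {
        { "vec", VE_SOURCE },  { "veh", VE_HEADER },    { "vep", VE_PROJECT },
        { "veb", VE_BINARY },  { "ves", VE_SPEECH },    { "veo", VE_SENTENCES },
        { "vew", VE_WEIGHTS }, { "vem", VE_MUSIC }
    };
    const char *dot = strrchr(name, '.');
    char ext[4];
    size_t i;

    if (dot == NULL || strlen(dot + 1) != 3)
        return VE_UNKNOWN;
    for (i = 0; i < 3; i++)
        ext[i] = (char)tolower((unsigned char)dot[1 + i]);
    ext[3] = '\0';
    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(ext, table[i].ext) == 0)
            return table[i].kind;
    return VE_UNKNOWN;
}
```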

Most programs begin with a series of statements that define, and optionally initialize, the data used by the program. VE-C allows both standard C data types and built-in data types specifically designed for use with the Sensory technologies. Data types used by Voice Extreme™ applications include:

Integers for use as counters, indices, error codes
Characters and strings of characters for messages
Boolean, or true/false, variables to control program flow
Templates (and arrays of templates) for use with Sensory recognition
Built-in Sensory data types (SPEECH, SENTENCES, WEIGHTS, NOTES, TUNES) contained in data files and linked together with Voice Extreme™ programs

The data declarations are followed by a series of control statements that define the program flow. The Voice Extreme™ language, VE-C, is very similar to ANSI-standard C. The following types of statements are available:

Assignment of values to variables
Arithmetic and logical operations on variables
Looping (for, while, do…while)
Branching (if…else, switch…case, goto)
Calls to built-in functions that access Sensory technologies and system resources


Creating Your First Voice Extreme™ Project No C manual would be complete without a “hello world” example. We will build a sample application that speaks “hello world” and runs on the Voice Extreme™ Toolkit. To build our application:

1. Create a document
2. Create a project
3. Build the project
4. Download the project to flash memory on the VE module

Note that we need a vocabulary containing one phrase, “hello world”. For this example, Quick Synthesis™ was used to produce a speech file, “hello.ves”. This file contains the label “VPhelloworld”, which allows our program to link to it.

Creating a ‘Hello World’ Program Using the editor, we first create a document and save it:

Go to "File", "New" then “Empty document” (CTRL-N) or press the button.

An "untitled.vec" document will be created. Enter the following text:

    /* My first Voice Extreme program */
    extern SPEECH VPhelloworld;     // Label or address of speech data

    main ()
    {
        Talk(0, &VPhelloworld);     // Say "hello world"
    }

Save it by pressing the button on the editor toolbar, using “hello” as a file name (the extension will be added automatically).

The file “hello.vec” will be created.

Creating the Project To generate the binary file, we need to create a project that contains the document, and then add any data files required for linking.

Go to "Project" then "New". A document named "untitled.vep" will be created, and the project window will be opened.

Add the document to the project by pressing the button.

Add the speech file “hello.ves” by pressing the button; the file is located in the “samples\data\speech” folder.

Change the project description, if you like. Save the project. Go to “Project”, then “Save” and use “hello” as a file name (the extension will be added automatically). A file named “hello.vep” will be created.

Building the Project At this point, we can compile our document and link it with the vocabulary file to produce a binary image in “hello.veb”.

Go to “Project”, then “Build” or press the button.

The "Build process" box will show each step performed by the builder, reporting any errors (shown in red). The blue progress bar provides an estimate of the remaining time to complete the build. At the end of the build process, the program makes a single beep and the "Build progress" box reports: "BUILD PROCESS OK”. If there are any errors, the text in the "Build progress" box turns red, and an error message will be reported.


Downloading After a successful build, the binary image file is generated, in this case “hello.veb”. Now you are ready for the “Download” process.

Go to “Project” then “Download” or press the button.

The "Download process" box will show each step performed by the programmer, reporting any errors (shown in red). The blue progress bar provides an estimate of the remaining time. At the end of the download process the program makes a single beep, and the download progress box reports: "DOWNLOAD PROCESS OK". If there are any errors, the text in the download progress box turns red and an error message will be reported.

The application is now stored in non-volatile memory in the module, so the module says “hello world” every time it is switched off and back on.

Extending the ‘Hello World’ Program (add a blinking LED feature) 1. Open the “hello.vec” file or a new file and enter the following text:

    extern SPEECH VPhelloworld;     // Start of the speech table
    #include <ve.veh>               // Standard VE defs

    main()
    {
        ConfigureIO (1, 3, 3);      // Configure the red LED (P1.3) to an output
        ConfigureIO (1, 5, 3);      // Configure the green LED (P1.5) to an output
        ConfigureIO (1, 6, 3);      // Configure the yellow LED (P1.6) to an output
        BEEP;
        while(FOREVER)              // Infinite loop
        {
            if (ButtonAPressed)
                Talk(0, &VPhelloworld);     // Say "hello world"
            else if (ButtonBPressed)
            {
                // Red, Yellow and Green LEDs are active low
                DebugD8(1);                 // Debug speech "1"
                WritePort1(0xDF);           // 0xDF = 11011111b, green LED
                DelayMilliSeconds(250);     // on, all other LEDs off
                DebugD8(2);                 // Debug speech "2"
                WritePort1(0xBF);           // 0xBF = 10111111b, yellow LED
                DelayMilliSeconds(250);     // on, all other LEDs off
                DebugD8(3);                 // Debug speech "3"
                WritePort1(0xF7);           // 0xF7 = 11110111b, red LED
                DelayMilliSeconds(250);     // on, all other LEDs off
                DebugD8(4);                 // Debug speech "4"
                WritePort1(0xBF);           // 0xBF = 10111111b, yellow LED
                DelayMilliSeconds(250);     // on, all other LEDs off
                DebugD8(5);                 // Debug speech "5"
                WritePort1(0xDF);           // 0xDF = 11011111b, green LED
                DelayMilliSeconds(250);     // on, all other LEDs off
                DebugD8(6);                 // Debug speech "6"
            }
        }
    }

2. Build and download the program Build the project by pressing the button. After a successful build, download the “hello.veb” by pressing the button and pressing the “DOWNLOAD” button on the Voice Extreme development board. After you download the program, press button A or B on the development board to play “Hello World” speech or toggle the LED and play debug speech.
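The Port 1 values written in the LED example follow directly from the pin assignments: each LED is active low, so the byte has a 0 in the LED's bit position and 1s everywhere else. The plain C sketch below derives those bytes. The macro names and the function just_on are hypothetical; only the bit positions (P1.3 red, P1.5 green, P1.6 yellow) come from the I/O table.

```c
/* Port 1 bit positions taken from the development board I/O table. */
#define RED_LED_BIT    3
#define GREEN_LED_BIT  5
#define YELLOW_LED_BIT 6

/* Byte that drives only the given Port 1 bit low (LED on)
   and leaves every other bit high (LED off). */
unsigned char just_on(unsigned bit)
{
    return (unsigned char)(0xFFu & ~(1u << bit));
}
```

With these definitions, just_on(GREEN_LED_BIT) is 0xDF, just_on(YELLOW_LED_BIT) is 0xBF and just_on(RED_LED_BIT) is 0xF7, matching the values passed to WritePort1 in the listing.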

Creating a Speaker Independent (SI) Sample Program. Now let’s replace the button press with a continuous listening SI word set.

Note that Sensory does not currently offer tools to create custom SI vocabularies. Custom SI word sets can be ordered through Sensory at additional cost. Inquiries should be sent to: techsupport@sensoryinc.com.

1. Open the “hello.vec” file or a new file and enter the following text:

    extern SPEECH VPhelloworld;     // Address of speech data
    extern WEIGHTS WT_call;         // Weight file for word "call"
    const uint8 length = 0;         // 0 for short duration
    const uint8 conf = 80;          // Confidence threshold, 0 being the
                                    // worst score and 100 being the best score
    #include <ve.veh>               // Standard VE definitions
    uint8 result;

    main()
    {
        while(FOREVER)                          // Infinite loop
        {
            JustGreenOn;                        // System "ready"
            result = CLPatGenW( 1, &WT_call);   // Returns 0 if a valid
                                                // pattern was created
            if ( result != 0)                   // On a PatGen error, go back to
                continue;                       // the beginning of the while loop
            JustRedOn;                          // System "busy"
            result = CheckDuration( length );   // Check the duration of the
                                                // pattern created
            if ( result != 0 )
                continue;                       // Try again on duration error
            result = Recog( &WT_call);          // Try to recognize the pattern
            if ( result != 0 )
                continue;                       // Try again on recognition error
            result = GetRecogLevel1();          // Get the best recognition score
            if ( result < conf )                // Try again if the score is below
                continue;                       // the specified threshold
            JustYellowOn;                       // System "triggered"
            result = GetRecogMatch1();          // Get the best match
            if ( result == 0 )
                Talk(0, &VPhelloworld);         // Say "hello world"
        }
    }

2. Add the weight file to your project

Open the project window by clicking the “Project” pull down menu and then selecting the “Project Window”, or just press F11. Add the SI weight file, “call.vew” by pressing the button. Sample weight files can be found in the “sensory\samples\data\weights” folder. Save the project by pressing the button.

3. Build and download the program

Build the project by pressing the button. After a successful build, download the “hello.veb” by pressing the button and pressing the “DOWNLOAD” button on the Voice Extreme development board. After you download the program, say “call” to play the “hello world” speech.
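The SI listing accepts an utterance only after a chain of checks: pattern generation, duration, recognition, confidence score and finally the matched word. That gating logic can be exercised on a PC with the plain C sketch below. The struct SiAttempt and the function si_should_speak are hypothetical stand-ins; each field only mirrors the return value of the corresponding built-in call in the listing and does not reproduce the toolkit's behavior.

```c
/* Stand-ins for the values the VE-C listing obtains from built-in calls. */
typedef struct {
    int patgen_err;    /* CLPatGenW result (0 = valid pattern) */
    int duration_err;  /* CheckDuration result (0 = OK) */
    int recog_err;     /* Recog result (0 = OK) */
    int level;         /* GetRecogLevel1 score (0 worst, 100 best) */
    int match;         /* GetRecogMatch1 result (the listing checks for 0) */
} SiAttempt;

/* Return 1 when the attempt passes every check and the match is 0. */
int si_should_speak(const SiAttempt *a, int conf)
{
    if (a->patgen_err != 0)   return 0;  /* no valid pattern */
    if (a->duration_err != 0) return 0;  /* wrong duration */
    if (a->recog_err != 0)    return 0;  /* recognition error */
    if (a->level < conf)      return 0;  /* score below the threshold */
    return a->match == 0;
}
```

With conf set to 80 as in the listing, a score of exactly 80 is accepted and 79 is rejected, because the listing retries only when the score is strictly below the threshold.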

Creating a Speaker Dependent (SD) Sample Program. 1. Open the “hello.vec” file or a new file and enter the following text:

extern SPEECH VPhelloworld; // start of the speech table extern SENTENCES SNsddemoa; TEMPLATE SD_Templates[1]; // allocate space in flash for 1 templates // Sentence table messages. Refer to sddemoa.vea for more information. #define SEN_SAY_WORD_1 1 // index of the first sentence #define SEN_SAY_A_WORD 33 // These index numbers are defined #define SEN_REPEAT 66 // in sddemoa.vea sentence table #define SEN_BEEP 71 #include <ve.veh> // Standard VE definitions const uint8 length = 0; // 0 = short; 1 = medium; 2 = long duration const uint16 conf = 130; // Confidence level threshold, 0 being the // best score and 255 being the worst score uint8 result; // Declare an 8-bit variable called result uint8 msg; uint8 tries = 3; uint8 word = 0; main() { //--------------------------------------------------------------------------- // Start the training process msg = SEN_SAY_WORD_1; // Setup the correct message to play


    while (FOREVER)
    {
        SenTalk(msg, &SNsddemoa);       // Prompt the user
        GreenOn;                        // Show the user it's ready to speak
        result = PatGen(STANDARD);      // Create a template of the spoken word
                                        // The template will be saved in
                                        // the UNKNOWN buffer
        GreenOff;                       // Pattern generation complete
        if (result == 0)                // Proceed if there is no PatGen error
        {
            if (word == 0)
            {
                PutTemplate(UNKNOWN, 0, SD_Templates);  // Save the first
                                                        // utterance into flash
                msg = SEN_REPEAT;       // Setup the message to “Repeat”
                word = 1;
            }
            else
            {
                // load the first template into the KNOWN buffer
                GetTemplate(KNOWN, 0, SD_Templates);
                // compare & store the average of UNKNOWN & KNOWN template
                // in UNKNOWN buffer
                result = TrainSD(UNKNOWN, KNOWN, UNKNOWN);
                if (result == 0)        // If the templates match
                {
                    PutTemplate(UNKNOWN, 0, SD_Templates);  // save the average
                                                            // template into flash
                    break;              // Exits the while loop
                }
                else                    // the templates are different
                {
                    msg = SEN_SAY_WORD_1;   // Start the training process again
                    word = 0;
                }
            }
        }
        else
            msg = SEN_REPEAT;
    }   // end of while loop

    // Training complete
    //---------------------------------------------------------------------------
    SenTalk(SEN_BEEP, &SNsddemoa);      // Beep twice to inform the user
    SenTalk(SEN_BEEP, &SNsddemoa);      // that training is complete

    while (FOREVER)
    {
        if (ButtonAPressed)
        {
            SenTalk(SEN_SAY_A_WORD, &SNsddemoa);
            YellowOn;                   // Ready to speak
            result = PatGen(STANDARD);  // Create a template of the spoken word
            YellowOff;                  // PatGen done
            if (result == 0)
            {
                result = RecogSD(1, SD_Templates);  // Try to recognize the template
                                                    // found in UNKNOWN buffer
                if (result == 0)                    // If the word is recognized, play
                    Talk(0, &VPhelloworld);         // the "Hello World" speech


                else
                {
                    SenTalk(SEN_BEEP, &SNsddemoa);  // otherwise, beep twice
                    SenTalk(SEN_BEEP, &SNsddemoa);
                }
            }
        }
    }
}

2. Add the speech file and sentence table to your project

Open the project window by clicking the “Project” pull down menu and selecting the “Project Window”, or just press F11. Add the speech file “sddemo.ves” by pressing the button, and add the sentence table “sddemoa.veo” by pressing the button. Sample speech files and sentence tables can be found in the “sensory\samples\data\speech” folder. Save the project by pressing the button.

3. Build and download the program

Build the project by pressing the button, or just press F9. After a successful build, download the “hello.veb” by pressing the button and pressing the “DOWNLOAD” button on the Voice Extreme development board.

After you download the program, the unit will automatically restart and start the training process. Follow the prompts and train a word or short phrase. Once the word has been trained, press button A to start the recognition. After the prompt, say the trained word. If the word is recognized, the unit will say “Hello World”; otherwise, it will beep twice.


Voice Extreme™ Language: VE-C

Introduction

The Voice Extreme™ language, VE-C, is very similar to ANSI-standard C. It is assumed that the Voice Extreme™ programmer is familiar with C or a similar higher-level language. This section describes some of the features of the language, but it is also helpful to have a standard C reference manual, such as Kernighan and Ritchie’s The C Programming Language.

VE-C/ANSI C comparison

VE-C recognizes both C-style comments (/* … */) and C++-style comments (// …).

VE-C has no limit on the length of identifiers, but the assembler limits symbols to 62 characters. Therefore, do not create any labels longer than 62 characters. The VE-C compiler generates labels that it uses when referencing string literals. It also generates a few segment names and a few labels for data structures that are used exclusively by VE. All of these names begin with the 3-character sequence “VE_”. A user can avoid conflicts with these names by not using any identifier that begins with this sequence.

All VE-C variables and constants are static; i.e. space is statically allocated rather than using a stack. All constant objects are stored in flash memory. Variables declared with the flash (or FLASH) storage class are stored in flash memory. Variables declared with no storage class are stored in on-chip RAM. On-chip RAM storage for variables is limited to approximately 500 bytes. VE-C does not recognize the C storage class keywords static, auto, and register nor the type qualifier volatile.

VE-C has seven basic types: int8, int16, int24, uint8, uint16, uint24, and void. sint8, sint16, sint24 are alternative names for int8, int16, int24. char and bool are functionally the same as int8; uchar is the same as uint8. VE-C recognizes the following built-in types that are not a part of C: TEMPLATE, WEIGHTS, SPEECH, SENTENCES, NOTES, TUNES. Type names can be written in all upper case or all lower case.

The VE-C interpreter (that runs on the RSC) treats all operands as 24-bit values. If the operand is signed and shorter than 24 bits (i.e., its type is int8 or int16), the interpreter sign-extends the value to 24 bits. Otherwise the interpreter zero-extends the value to 24 bits. The interpreter treats the extended 24-bit values as signed values. This means that VE-C can properly operate on only those unsigned values that are less than 0x800000. Arithmetic and logical operations on unsigned values that are greater than or equal to 0x800000 behave as expected. Comparison operations on these values treat them as signed (and therefore negative) values.

VE-C does not recognize the type specifiers: short, long, unsigned, float, double. VE-C does not recognize long or floating constants or enumerated (enum) constants. VE-C recognizes only the following forms of multi-character constants inside string and character literals:

\n      Line feed. This generates a new line, which is a carriage return followed by a line feed.
\t      Horizontal tab.
\v      Vertical tab.
\b      Backspace.
\r      Carriage return.
\f      Form feed.
\a      Audible alert (bell).
\\      Backslash.
\"      Double quote.
\'      Single quote.
\ooo    Unsigned 8-bit value represented by ooo, a sequence of 1 or more octal digits. If the value is larger than octal 377 (decimal 255), the constant’s value is its least significant byte. A special case of this is '\0', representing a zero byte.
\xhh    Unsigned 8-bit value represented by hh, a sequence of 1 or more hex digits. If the value is larger than hex FF (decimal 255), the constant’s value is its least significant byte.
\.      The period represents any single character that has not already been described in this list. The value of this character constant is the numeric value of the character that follows the backslash; e.g. '\?' is the same as '?'.

VE-C does not recognize wide-char string literals. VE-C does not recognize the type pointer to function. VE-C does not allow assignment to structures or unions. VE-C does not allow bitfields within structures or unions. VE-C also does not allow a user to declare structures or unions that contain members that are any of the built-in data types or that have type “array of” any of the built-in data types. Integers may be added to pointers or subtracted from pointers. Before the addition or subtraction, the integer is multiplied by the size of the object to which the pointer points. VE does not allow subtraction of pointers. VE-C does not recognize the operators / and % (division and modulo).
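The pointer-scaling rule above can be illustrated with a minimal sketch in standard C, compiled on a desktop compiler rather than with VE-C. It assumes standard C pointer semantics match the description; the function name third_element is hypothetical and is not part of the Voice Extreme library.

```c
#include <stdint.h>

/* Return the third element of a small table by adding an integer to a
   pointer. The compiler scales the integer by the element size, so
   "p + 2" advances 2 * sizeof(uint16_t) = 4 bytes, not 2 bytes. */
uint16_t third_element(void)
{
    uint16_t table[3] = {1000, 2000, 3000};
    uint16_t *p = table;    /* points at table[0] */

    p = p + 2;              /* now points at table[2] */
    return *p;
}
```

Because the scaling is implicit, the same expression works unchanged if the element type is later widened; only the number of bytes skipped changes.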

VE-C does not recognize the conditional operator (condition ? iftrue : iffalse). VE-C always evaluates both operands of logical AND (&&) and logical OR (||) operators, rather than stopping as soon as the value of the expression can be determined. The current version of the VE-C compiler restricts user-defined functions to those that do not use arguments. Because all variables, including local function variables, are allocated statically, care must be taken in writing recursive functions. VE-C allows only ANSI-standard function declarations and definitions, i.e. the types of the parameters must be specified inside the parentheses, not in a list that follows. VE-C does not allow variable length function parameter lists (i.e. ellipses …).
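The difference between standard C short-circuit evaluation and the always-evaluate rule described above for && and || matters when the right operand has a side effect. The following is a sketch in standard C that models both behaviours; probe, c_and and ve_and are hypothetical names introduced for illustration, and ve_and is only a model of the described VE-C behaviour, not VE-C code.

```c
int calls = 0;                      /* counts how often probe() runs */

/* An operand with a visible side effect. */
int probe(int value)
{
    calls++;
    return value;
}

/* Standard C: the right operand is skipped when the left operand
   already decides the result. */
int c_and(int left, int right_value)
{
    return left && probe(right_value);
}

/* Model of the described VE-C rule: both operands are evaluated
   before they are combined. */
int ve_and(int left, int right_value)
{
    int right = probe(right_value); /* always evaluated */
    return (left != 0) & (right != 0);
}
```

With a FALSE left operand, c_and never calls probe, while ve_and calls it once. Code ported from standard C that relies on a guard such as "pointer is valid && use pointer" therefore needs nested if statements under the rule described here.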

The VE-C compiler does not provide access to the standard C runtime library. It does, however, provide access to about 100 built-in functions, described in the section “Voice Extreme Data Files”. The VE-C preprocessor recognizes the directives #include and #define but does not recognize the following directives: #if, #ifdef, #ifndef, #else, #elif, #endif, #undef, #error. If the filename for an #include is enclosed in angle brackets (#include <file>) the compiler looks for it in the folder defined by the VEInclude environment variable. If this variable is not defined, or if the filename is enclosed in double quotes, the compiler looks in the local folder. VE-C recognizes only the following two types of #pragma statements:

    #pragma VE_APP_TEXT "<string>"

and

    #pragma VE_APP_VERSION <number>

Note that multi-character constants (e.g. \n) are not supported inside pragma strings. VE-C does not allow substitution of variables in user-defined macros (#define identifier(identifier-list) token-sequence) nor does it recognize and replace the trigraph sequences (in the form ??x, where x is one of the following characters: = / ' ( ) ! < > -). The following identifiers are not predefined in VE-C: __LINE__, __FILE__, __DATE__, __TIME__, __STDC__.


VE-C does not support inline assembly using the asm directive.

VE-C does not recognize the following C and C++ keywords: asm, auto, catch, class, const_cast, default, delete, double, dynamic_cast, enum, explicit, float, friend, inline, int(1), long(1), mutable, namespace, new, operator, private, protected, public, register, reinterpret_cast, short, signed(1), static(2), static_cast, template(3), this, throw, try, typedef, typeid, typename, unsigned(1), using, virtual, volatile and wchar_t.

Notes:
(1) VE recognizes the following standard data types: int8, int16, int24, sint8, sint16, sint24, uint8, uint16, uint24, char, uchar, bool and void.
(2) VE-C declares all variables as static.
(3) There is a VE data type called TEMPLATE, but it refers to speaker-dependent voice templates and has no relation to the "template" reserved word.

VE-C Data Types

This section explains in detail how Voice Extreme™ implements the data types. Note that the types can be spelled with all lower case or all upper case letters. An experienced C programmer might want to just scan this section to understand the differences between ANSI C and the Voice Extreme™ language (VE-C). Many programs use only the standard types and the built-in Sensory types, so the inexperienced programmer should not be intimidated by the more complex types.

Standard Data Types

Unsigned integers. Unsigned integers are numbers that can only be 0 or positive. VE-C allows 3 types of unsigned integers, depending on whether the number is stored in 1, 2 or 3 bytes. Since each byte is 8 bits, the types are uint8, uint16 and uint24. In multi-byte integers, the low order byte is stored first. Note that all VE-C addresses are stored as 24-bit unsigned integers. For example:

    UINT8 j;                // declares a variable “j” that can store numbers
                            // from 0 to 255
    uint8 count = 128U;     // Initializes a count variable to 128. The “U” tells
                            // the parser that this is an “unsigned” number and
                            // avoids a warning
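The statement that the low order byte is stored first can be made concrete with a short sketch. This is standard C for a desktop compiler, not VE-C, and the helper names store_uint24 and load_uint24 are hypothetical; the sketch only models the byte order described above.

```c
#include <stdint.h>

/* Store a 24-bit value in 3 bytes, low order byte first. */
void store_uint24(uint8_t out[3], uint32_t value)
{
    out[0] = (uint8_t)(value & 0xFFu);          /* low order byte */
    out[1] = (uint8_t)((value >> 8) & 0xFFu);   /* middle byte */
    out[2] = (uint8_t)((value >> 16) & 0xFFu);  /* high order byte */
}

/* Rebuild the 24-bit value from the same 3-byte layout. */
uint32_t load_uint24(const uint8_t in[3])
{
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16);
}
```

Under this layout the value 0x123456 occupies the bytes 0x56, 0x34, 0x12 in increasing address order, which is what a byte-indexed view of a 24-bit address would show.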

Signed integers. Signed integers are numbers that can be positive, negative or 0. As with unsigned integers, VE-C allows 3 types of signed integers: int8, int16 and int24. sint8, sint16 and sint24 are alternate names. For example:

    sint8 error;            // declares a variable “error” that can store
                            // numbers from -128 to 127

Char, Uchar. These data types are functionally the same as int8 and uint8, respectively, except that they normally only contain ASCII values. For example:

    uchar initial = 'A';    // declares a variable “initial” which has
                            // the value 'A' = 0x41

Bool. This data type is functionally the same as int8, except that it normally contains just the Boolean values 0 = FALSE or 1 = TRUE. For example,

bool onoff; // declares a variable “onoff” which will either be TRUE or FALSE

Void. This data type is used to describe functions that either require no arguments or that return no values. In the “Hello World” example, “main()” is equivalent to “void main(void)”. In function descriptions, a function that does not return a value will be described as void.

Built-in Sensory Data Types

Template. A template is a 128-byte representation of a speech pattern. Typically a Voice Extreme™ program using the Speaker Dependent, Verification, or Wordspot technologies will declare storage for an array of TEMPLATEs. Voice Extreme™ automatically reserves template storage in flash memory, starting each template on a 128-byte boundary. For example:

TEMPLATE passwords[4]; // declare storage for 4 passwords in flash

Note that a TEMPLATE is the only built-in Sensory data type that actually requires the Voice Extreme™ program to reserve storage space. The following types are always declared as “extern” to the program; that is, the data is actually stored in a file linked with the program. The “Voice Extreme™ Data Files” section contains more detailed descriptions of the data files and how they are generated.

Weight. Weight sets are used with Speaker Independent Recognition. These data files typically contain a label starting with “WT”. Corresponding files in the weights folder contain the external definition of the label and a definition for the confidence level for the set. For example, the sample file “digits.veh” contains the lines:

    extern WEIGHTS WTdigits;
    #define CONFdigits 61

Speech. Speech vocabulary tables typically contain a label starting with “VP”. Corresponding sample files in the include\speech folder contain the external declaration of the label and definitions of the individual utterances in the set. For example, the sample file “hello.veh” contains the lines:

    extern SPEECH VPhelloworld;
    #define MSG_HELLO_WORLD 0       // First utterance in the set

Sentences. Sentence tables typically contain a label starting with “SN”. The sample speech folder contains a source “.vea” file as well as an object “.veo” file. Corresponding files in the include\speech folder contain the external declaration of the label and definitions of the individual sentences in the set. For example, svdemoa.veh contains the lines:

    extern SENTENCES SNsvdemoa;
    #define SEN_PLEASE_SAY_FIRST 1  // First sentence in the set

Tunes, Notes. A program using Music technology needs to link with both these types of data. Example:

    extern NOTES VEmusicdwc;
    extern TUNES VEmusiclis;

Derived Data Types

Arrays. An array is a group of same-typed variables stored consecutively in memory that are accessed with an integer 0-based index. For example:

    uint8 digits[10];       // The variable “digits” is an array of 10
                            // unsigned 8-bit integers
    digits[0] = 0;          // Initialize the first element of the array

VE-C allows multi-dimensional arrays, such as:

    UINT8 phoneNumbers[4][10];      // Storage for four 10-digit phone numbers

Arrays can be initialized by specifying the contents for each member; e.g.

    char id[2] = {'V','E'};         // Initialize the two id bytes

Strings. A string is an array of characters, terminated by a null (zero) byte. For example:

    uchar name[8] = "Sensory";      // Be sure to define enough storage for the
                                    // final null byte!

Pointers. A pointer is a variable that points to, i.e. contains the address of, another variable. For example:

    uint8 number;       // The variable “number” is a standard unsigned 8-bit integer
    uint8 *ptr;         // The variable “ptr” is a 24-bit pointer to an unsigned 8-bit
                        // integer
    ptr = &number;      // This statement causes “ptr” to contain the address
                        // of “number”
    *ptr = 1;           // This statement causes “number” to contain the value 1

Structures. A structure is a group of possibly mixed-type variables stored consecutively in memory. They are accessed by their structure member names, either directly or through a pointer. For example:

    struct entry                    // Define what an entry contains
    {
        int8 type;                  // An entry consists of a 1-byte type
        uchar name[30];             // associated with a 30-character name
    };
    struct entry directory[10];     // directory is an array of 10 entries
    struct entry *currentEntry;     // currentEntry points to an entry
    directory[0].type = 0;          // Initialize the type member in the first
                                    // entry
    currentEntry = &directory[1];   // Point to the second directory entry
    currentEntry->name[0] = 'A';    // Initialize part of the name member in
                                    // the second entry

Unions. A union is a single variable that may be used to hold objects of different types. A union is used when it is convenient for the program sometimes to consider a variable to be of one type and at other times to treat it as another type. As with structures, the members are accessed by name, either directly or through a pointer. For example:

    union {                         // Define a 3 byte ( 24-bit ) area called “addr”
        SPEECH *pointer;            // Sometimes the 3 bytes will point to data
        uint8 bytes[3];             // At others, we’ll index the individual bytes
    } addr;
    addr.pointer = &VPhelloworld;   // Use addr as a 24-bit pointer
    addr.bytes[2] = 0x01;           // Change a single byte of addr

User defined types. Additional types may be defined with the typedef statement. This is sometimes done to improve program readability. For example:

    typedef sint8 ERROR_CODE;       // NOTE: usually the new type name is upper case
    ERROR_CODE error;               // Reserve space for an error code
    error = PatGen( STANDARD);      // Assign a value to the error code

Note that VE-C allows arbitrarily complicated combinations of the above data types. For example, one could define a structure containing arrays of pointers to unions! Note also that VE-C does not allow the Sensory-defined types (e.g. TEMPLATE) to be members of structures or unions. Pointers to these types may be members.

Storage Options for the Data Types

Voice Extreme™ has access to two storage areas. The first is an area of on-chip RAM that is approximately 500 bytes long. If a variable declaration is not preceded by a storage class identifier, Voice Extreme™ will attempt to allocate storage for it in RAM. This area is volatile; i.e. its contents are not preserved over a reset. Any variable in this area is initialized to 0 at power-up, unless the program specifies another initial value. For example:

    uint8 count;                // declares a 1 byte RAM variable, initialized to 0
    char name[4] = "Rob";       // declares 4 bytes of RAM, initialized to
                                // 'R', 'o', 'b', and '\0'

Voice Extreme™ also has access to 2Mbit of non-volatile flash memory. Since this memory is also used to store the actual program and any data files, the exact amount available for data storage varies. The program specifies that a variable is to be located in flash by preceding its declaration with the FLASH (or flash) identifier. Any uninitialized flash variable is set to 0xFF at the time the flash image is built. If data is stored in an uninitialized variable during the execution of the program, then the variable will still contain this value the next time the program is powered up. Initialized flash variables are set to their initial value every time the program resets. Example:

    flash uint8 id[4];              // Declares a 4 byte uninitialized area in flash memory.
                                    // These would be 0xFF the first time the program ran.
                                    // If the program stores data here, then the data will
                                    // still be there after a system reset
    FLASH char id[2] = {'V','E'};   // These are initialized each time the
                                    // program resets

When Voice Extreme™ programs use data in files that are linked with the program file, the VE-C parser doesn’t need to reserve storage for them. The program only needs to know the name and type of the data and declare it “external” to the program. E.g., in the “Hello World” example, the program doesn’t actually contain a speech table; it just links to a speech table in a data file.

extern SPEECH VPhelloworld; // declare the name of an external speech table

Variables that are declared constant are stored in a special area of flash memory and may not be changed during the course of the program.

const char id[8] = “Sensory”; // declare a constant string

VE-C Language Functions, Blocks, Statements and Expressions

A Voice Extreme™ program is organized as a series of function blocks, with “main()” being the reset point. Variables that are declared outside function blocks are considered to be “global” and may be accessed by any following function block. Variables that are declared inside a function block are known as “local variables” and are only accessible to that particular function.

A block is made up of a series of statements, which can declare variables, control the flow of the program, operate on variables, invoke the preprocessor, or simply be commentary. Single statements are terminated with “;” and multiple statements can be “blocked”, using curly braces, {}. C allows the use of blank lines to improve program readability. More than one statement can be on a line, although this style is generally not encouraged. Any characters following “//” on a single line or enclosed between “/*” and “*/” are considered to be comments.

An expression can be a constant, a variable, a variable with a unary operator, two variables connected with a binary operator or a function call. Every expression has a value which is considered to be TRUE if non-zero and FALSE if zero.

Numeric constants may be specified as decimal, hexadecimal (starting with a leading 0x, digits 0-f) or octal (starting with a leading 0, digits 0-7). A character constant is a single character written within single quotes, as in 'A', which specifies the standard ASCII value of the character. Strings are a list of characters written within double quotes, with a final zero (NULL) byte appended. Certain non-graphic characters can also be represented in strings, such as '\n' (new line), '\t' (tab) and '\0' (NULL).

Names for constants, variables and functions are a sequence of alphanumeric characters and underscore characters that begins with an underscore or an alphabetic character.

Operators

Assignment. The above sections have already included examples of the assignment operator (=). It can be used both in declaring a variable to assign its initial value and during the course of the program. To repeat some examples above,

    uint8 j = 1;        // Declare and initialize an 8-bit variable
    digits[0] = 0;      // Change the first element of an array

The value of an assignment expression is the value assigned to its left-most variable, so complex expressions such as the following are legal:

    if (result = Recog(&WTdigits))  // assigns a value to “result” and also tests it
                                    // Note: a single “=” means “assign”

Arithmetic operators. These operators allow arithmetic operations on signed or unsigned integers. VE-C only implements binary addition (+), subtraction (-) and multiplication (*); it does not allow division(/) or modulo(%). All arithmetic is performed as though the numbers were 24-bit quantities, with the results truncated to the number of bits available in the result variable. For example:

    int8 number1, number2, number3;
    int16 answer;
    answer = number1 + number2;     // Note: answer is large enough to handle any
                                    // result of this addition
    number3 = number1 * number2;    // Note: if number1 = 16 and number2 = 16,
                                    // the operation overflows and number3 = 0;
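The truncation rule above (compute wide, then keep only the bits that fit in the result variable) can be modelled with a short sketch. It is written in standard C for a desktop compiler and uses a 32-bit intermediate as a stand-in for the 24-bit interpreter arithmetic; the function names mul_store_int8 and add_store_int16 are hypothetical.

```c
#include <stdint.h>

/* Multiply two 8-bit values and keep only the low 8 bits of the
   product, modelling a store into an int8 variable. */
int8_t mul_store_int8(int8_t a, int8_t b)
{
    int32_t wide = (int32_t)a * (int32_t)b;     /* wide intermediate */
    return (int8_t)(uint8_t)(wide & 0xFF);      /* truncate to 8 bits */
}

/* Add two 8-bit values into a 16-bit result, which is always wide
   enough to hold the sum. */
int16_t add_store_int16(int8_t a, int8_t b)
{
    return (int16_t)((int32_t)a + (int32_t)b);
}
```

For 16 * 16 the wide product is 256 (0x100), whose low 8 bits are zero, which matches the overflow noted in the example above. Choosing a result variable one size larger than the operands, as with answer, avoids the loss.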

Unary operators include two’s complement negation (-) and the increment (++) and decrement (--) operators. These operators may precede or follow the variable they affect. For example:

    SINT8 j = -1;       // j is stored as 0xff
    uint8 ctr = 0;
    ctr++;              // ctr now = 1
    ctr--;              // ctr now = 0
    if ( ctr++ )…       // would evaluate as FALSE if ctr originally = 0, but
    if ( ++ctr )…       // would evaluate as TRUE if ctr originally = 0

Bitwise logical operators. These operators include AND (&), inclusive OR (|), exclusive OR (^), left shift (<<), right shift (>>) and unary one’s complement (~).

    uint8 j = 0;
    uint8 k = 1;
    uint8 m,n;
    m = j | 0x0f;       // m will contain the value 0x0f
    n = m & k;          // n will contain the value 0x01
    n = j ^ m;          // n now contains the value 0x0f
    n = k << 2;         // n now contains the value 0x04
    n = n >>1;          // n now contains the value 0x02
    j = ~j;             // j now contains the value 0xff
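The bitwise results can be checked mechanically. The sketch below repeats the same operations in standard C on 8-bit operands starting from j = 0x00 and k = 0x01; the struct and function names are hypothetical, and the explicit casts model the truncation to an 8-bit result variable.

```c
#include <stdint.h>

/* One field per operation, so each result can be inspected. */
struct bitwise_results {
    uint8_t or_mask;    /* j | 0x0f         */
    uint8_t and_k;      /* or_mask & k      */
    uint8_t xor_jm;     /* j ^ or_mask      */
    uint8_t shl2;       /* k << 2           */
    uint8_t shr1;       /* shl2 >> 1        */
    uint8_t not_j;      /* ~j, low 8 bits   */
};

struct bitwise_results bitwise_demo(void)
{
    uint8_t j = 0, k = 1;
    struct bitwise_results r;

    r.or_mask = (uint8_t)(j | 0x0f);
    r.and_k   = (uint8_t)(r.or_mask & k);
    r.xor_jm  = (uint8_t)(j ^ r.or_mask);
    r.shl2    = (uint8_t)(k << 2);
    r.shr1    = (uint8_t)(r.shl2 >> 1);
    r.not_j   = (uint8_t)~j;
    return r;
}
```

Note that exclusive OR with zero returns the other operand unchanged, so 0x00 ^ 0x0f is 0x0f.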

Note that C allows an abbreviated syntax for arithmetic and logical operators that assign a value to the first operand. For example, j = j+1; can also be expressed as j += 1;

Relational operators. These operators allow comparisons of signed or unsigned integers and include less than (<), less than or equal (<=), greater than (>), greater than or equal (>=), equal (==) and not equal (!=). Care needs to be taken with signed and unsigned integers. For example:

    uint8 j = 0;
    uint8 k;
    sint8 m = 0;
    sint8 n;
    k = j-1;            // k will contain the value 0xff which is interpreted as 255
    n = m-1;            // n will contain the value 0xff which is interpreted as -1
    if ( j < k ) …      // this will evaluate as TRUE
    if ( m < n ) …      // this will evaluate as FALSE
    if ( j == 0 ) …     // this will evaluate as TRUE
    if ( j != 0 ) …     // this will evaluate as FALSE
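The signed/unsigned caution above follows from the interpreter model given in the VE-C/ANSI C comparison: operands are widened to 24 bits (sign-extended if signed, zero-extended otherwise) and then compared as signed 24-bit values. The sketch below models that description in standard C. The helper names are hypothetical and the code is an approximation of the described behaviour, not the interpreter itself.

```c
#include <stdint.h>

/* Interpret the low 24 bits of raw24 as a signed 24-bit number. */
int32_t as_signed24(uint32_t raw24)
{
    raw24 &= 0xFFFFFFu;                     /* keep only 24 bits */
    if (raw24 & 0x800000u)                  /* bit 23 set: negative */
        return (int32_t)raw24 - 0x1000000;
    return (int32_t)raw24;
}

/* Widen 8-bit operands to 24 bits as the text describes. */
uint32_t extend_int8(int8_t v)   { return (uint32_t)(int32_t)v & 0xFFFFFFu; }
uint32_t extend_uint8(uint8_t v) { return (uint32_t)v; }

/* Model of "<": both sides are compared as signed 24-bit values. */
int ve_less_than(uint32_t a24, uint32_t b24)
{
    return as_signed24(a24) < as_signed24(b24);
}
```

In this model a uint8 holding 0xff widens to 0x0000ff (255) while a sint8 holding 0xff widens to 0xffffff (-1), which is why the two comparisons above give different answers. It also shows the limit stated earlier: a uint24 value of 0x800000 or more compares as negative.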

Note that one of the most common programming errors with C is forgetting that the equals operator is a double equal sign. In the following example:

    if ( j = 1 ) …      // this will always evaluate as TRUE, no matter what
                        // value j originally had, because it assigns the value 1
                        // to j, rather than comparing j with 1 !!

Logical connective operators. These operators include unary NOT(!), binary AND (&&) and binary OR (||). These operators differ from the bitwise operators described above in that they operate on the TRUE (non-zero) or FALSE (zero) values of the operands. For example,

    uint8 j = 0x0f;
    uint8 k = 0xf0;
    if ( j && k )…      // this evaluates as TRUE. Note: (j & k) would=0 or FALSE
    if ( j || k )…      // this evaluates as TRUE = 1. Note: (j | k) would=0xff
    if ( !k ) …         // this evaluates as FALSE and doesn’t change the value of k

Note that VE-C differs from standard C in that it always evaluates both operands of a logical connective expression. Standard C stops after the first operand if it is FALSE in an && expression and TRUE in an || expression.

Sizeof operator. C provides a compile-time unary operator called sizeof that can be used to compute the size, in bytes, of any object. It is most useful with complex data structures. For example,

    TEMPLATE passwords[4];
    uint16 tsize = sizeof passwords;    // should set tsize to 4*128 = 512
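The sizeof arithmetic can be reproduced in standard C with a stand-in type. The sketch below assumes only what the text states, that a template occupies 128 bytes; template_t and passwords_size are hypothetical names, and the real TEMPLATE type is built into VE-C and is not defined this way.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the built-in TEMPLATE type: 128 bytes. */
typedef struct {
    uint8_t bytes[128];
} template_t;

/* Size in bytes of an array of four templates. */
size_t passwords_size(void)
{
    template_t passwords[4];
    return sizeof passwords;    /* element size times element count */
}
```

Because sizeof is evaluated at compile time, it costs nothing at run time and stays correct if the array length is changed later.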

“Casting”. Occasionally it is useful to convert a variable of one type to another type. This is sometimes used to bypass strict type-checking for function calls, as described below. A variable is “cast” as another by preceding it with the desired type, in parentheses. Caveat emptor: be careful that a difference in type size does not lead to unwanted truncation of the value. For example, if j were signed and k were unsigned, the following line would “cast” k as signed.

j = (sint8) k;
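The effect of such a cast, and the truncation risk mentioned above, can be sketched in standard C. The function names as_sint8 and narrow_to_uint8 are hypothetical; the first spells out the two's complement reinterpretation explicitly instead of relying on the compiler's conversion rules.

```c
#include <stdint.h>

/* Reinterpret an unsigned 8-bit value as a signed 8-bit value
   (two's complement), the effect of j = (sint8) k. */
int8_t as_sint8(uint8_t k)
{
    return (k >= 128) ? (int8_t)((int)k - 256) : (int8_t)k;
}

/* Narrowing a 16-bit value to 8 bits silently drops the high byte. */
uint8_t narrow_to_uint8(uint16_t wide)
{
    return (uint8_t)(wide & 0xFFu);
}
```

A value of 0xff therefore reads as 255 before the cast and as -1 after it, and a 16-bit value such as 0x1234 keeps only 0x34 when cast to an 8-bit type.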

Control Statements

The following types of statements allow the Voice Extreme™ programmer to control the flow of the program. In the examples below we will make the hello world example more sophisticated.

For loops. These statements are generally used to allow the program to execute a group of instructions for a specified number of repetitions. The for statement specifies the variable to be used as a counter, its initial value, its ending value, and how it is changed at the end of each loop. For example:

    uint8 j;                        // j will be used as the loop counter
    for (j = 0; j < 3; j++)         // start j at 0, increment it each loop,
                                    // stop after 2
        Talk(0, &VPhelloworld);     // Say “hello world” 3 times

While loops. These statements allow the program to loop while a specified condition is TRUE. For example:

    while (1)
        Talk(0, &VPhelloworld);     // Say “hello world” forever


A variant is the do…while loop, where the condition is tested at the end of the loop instead of the beginning. For example:

    uint8 j = 0;
    do
    {
        Talk(0, &VPhelloworld);     // Say “hello world” twice
        j++;                        // Increment j
    } while (j < 2);                // Note the use of curly braces to block statements

If…else statements. These statements allow the program to execute one group of instructions when a condition is TRUE and a different group when the condition is FALSE. For example, the following program alternates speech with silence.

    BOOL onoff = TRUE;
    while (1)
    {
        if (onoff)
            Talk(0, &VPhelloworld);     // Say “hello world”
        else
            DelaySeconds(1);            // Do nothing for one second
        onoff = !onoff;                 // Toggle onoff
    }

Note that if statements do not need to include an else clause, and also that if…else clauses can be nested. If more than one statement is to be executed on a branch, then the group of statements must be blocked with curly braces.

Switch statement. This statement allows for multi-way branching based on the value of a variable. The above program could also be written as:

    bool onoff = TRUE;
    while (1)
    {
        switch (onoff)
        {
            case TRUE:
                Talk(0, &VPhelloworld);     // Say “hello world”
                break;                      // This ends the processing of this case
            case FALSE:
                DelaySeconds(1);            // Do nothing for one second
                break;
        }
        onoff = !onoff;                     // Toggle onoff
    }

Break, continue statements. These statements allow a means for early termination of loops and are often used in handling error conditions. break causes the innermost enclosing for, while or do loop, or switch statement, to be exited immediately. continue causes the next iteration of the enclosing loop to begin. The following two programs would both result in only 10 instances of speech.

    uint8 j=1;
    while (j++)                     // This would loop until j went from 255 to 0
    {
        Talk(0, &VPhelloworld);
        if ( j > 10 )               // Let’s just stop after only 10 times
            break;
    }


    uint8 j=1;
    while (j++)                     // This will loop until j goes from 255 to 0
    {
        if ( j > 10 )
            continue;               // After 10 times, skip the talking
        Talk(0, &VPhelloworld);
    }
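The two loops can be traced by counting calls instead of speaking. The sketch below is standard C with an 8-bit counter and a plain counter variable standing in for Talk(); the function names are hypothetical and the limit parameter plays the role of the constant 10 in the examples.

```c
#include <stdint.h>

/* Count stand-in Talk() calls when the loop is left with break. */
int talks_with_break(uint8_t limit)
{
    int talks = 0;
    uint8_t j = 1;

    while (j++) {               /* tests the old j, then increments it */
        talks++;                /* stand-in for Talk(...) */
        if (j > limit)
            break;              /* leave the loop entirely */
    }
    return talks;
}

/* Count stand-in Talk() calls when the talking is skipped with continue. */
int talks_with_continue(uint8_t limit)
{
    int talks = 0;
    uint8_t j = 1;

    while (j++) {               /* ends only after j wraps from 255 to 0 */
        if (j > limit)
            continue;           /* skip the talk, keep looping */
        talks++;
    }
    return talks;
}
```

In this model the break version stops right after the tenth call. The continue version keeps iterating until the 8-bit counter wraps: it counts nine calls while j runs from 2 to 10, skips every pass from 11 to 255, and counts one more on the final pass, where j has just wrapped to 0. Both totals are 10, but the continue loop takes far more iterations to finish and its last call arrives late.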

Goto’s and labels. These statements are generally frowned upon by C programmers, but they are supported in VE-C. For example:

    if ( error )
        goto abort;
    else
    {
        ...
    }
    abort:
        AllLedsOn;
        ...

Return statements. These statements are used to end the execution of a function and optionally return a value. For example:

    return;         // Just finish the function and return to the caller
    return size;    // Finish the function and return a value
    return (1);     // This is an alternate syntax for returning a value

Functions, Calls and Returns Programs are often divided into smaller units called functions, or subroutines. Functions are used when the same series of statements is needed at several places in the program. They are also useful in breaking down complex logic into smaller, more understandable, pieces. They optionally return a result that can be tested or assigned to a variable. Variables that are only used in a given function can be declared “locally” at the beginning of the function block and are not available outside of the function. Local variables declared with initial values will be set to the initial value each time the function is called.

VE-C built-in functions are often called with arguments, which allow the function to perform the same operation on different data or in a slightly different manner. The arguments are enclosed in parentheses and separated with commas. The arguments passed to a function can be arbitrarily complex expressions. The compiler performs “type checking”; i.e. it makes sure that the arguments passed to a function are the type that the function expects. Sometimes this can be subverted by the use of “casting”, as described above. C uses “call by value”; i.e. the actual value of the argument at the time of the call is the value the function uses. User defined functions are often first declared using a function prototype statement, which describes the arguments that the function requires and the type of value that it returns. For example:

    sint8 AskAndRecord(void);               // A function with no arguments
    void CycleLevel();                      // A “procedure” that doesn’t return a value
    sint16 Add(sint8 first, sint8 second);  // A function with arguments and a return value

For example, the function Add, described above, could be called in the following ways:

    result = Add(1, 1);                 // Called with two constants
    if (Add(j, 1)) …                    // Tests result of adding a constant and a variable
    result = Add(Add (j, k), 1<<4);     // Nested use of Add with a constant expression
    result = Add(1, "one");             // The compiler will not allow this mistake

User defined functions must be defined before they are called, either by including the function body in the source program preceding any calls to it, or by including a function prototype before the calls.
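Call by value and the Add prototype can be tried out in standard C. The sketch below is not VE-C: the typedefs assume that sint8 and sint16 are 8-bit and 16-bit signed integers, and the body of Add and the helper CallerValueAfterAdd are hypothetical, written only to show that the function works on copies of its arguments.

```c
#include <stdint.h>

/* Assumed standard C equivalents of the VE-C integer types. */
typedef int8_t  sint8;
typedef int16_t sint16;

/* Hypothetical body for the Add prototype shown above. */
sint16 Add(sint8 first, sint8 second)
{
    sint16 sum = (sint16)(first + second);  /* 16-bit result holds any sint8 sum */
    first = 0;                              /* changes only this function's copy */
    return sum;
}

/* Returns the caller's variable after it has been passed to Add. */
sint8 CallerValueAfterAdd(void)
{
    sint8 j = 5;
    (void)Add(j, 1);    /* Add receives the value 5, not the variable j */
    return j;           /* j is unchanged by the call                   */
}
```

Because the return type is wider than the argument types, a call such as Add(100, 100) yields 200 even though 200 does not fit in a sint8.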


The following program is an example of the first style:

    void SayHello()
    {                               // Define the function body first
        Talk(0, &VPhelloworld);     // Say “hello world”
    }                               // Note that no “return” statement is required
                                    // for a function that returns “void”
    main()
    {
        SayHello();                 // Call the function
    }

The following program uses the function prototype method, so that “main()” can be near the beginning of the file:

    void SayHello(void);            // Declare the function
    main()
    {
        SayHello();                 // Call the function
    }
    SayHello()                      // Define the function body
    {                               // Note that the “void”s are optional
        UINT8 index = 0;            // index is an initialized local variable
        Talk(index, &VPhelloworld); // Say “hello world”
    }

Built-in Functions. Most C compilers provide a library of functions and also allow user-defined functions. VE-C includes a large number of built-in functions that provide access to Sensory technologies, as well as hardware and general purpose operations. The syntax for these functions is described in detail in the “VE Built-In Functions” section and their purposes will be introduced in the next few chapters. The function prototypes for these functions are “built-in” to VE-C, meaning that no definition file for them needs to be included in a Voice Extreme™ program.

User Defined Functions. VE-C supports a limited user-defined function capability. The current restrictions are:

Functions cannot use arguments; any value that the function needs must be passed in a global variable.

Local variables are allocated statically, rather than on a stack. This restriction means that care must be taken in writing recursive functions (functions that call themselves) because the local variables could be overwritten.

This limited ability would support the AskAndRecord and CycleLevel functions described above, but would not support the Add function. The sample programs supplied with Voice Extreme™ include many examples of user-defined functions.
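Both restrictions can be illustrated in standard C. The sketch below is not VE-C and every name in it is hypothetical: AddGlobals shows an Add-like operation whose inputs and result travel through global variables, and the two SumDown functions show what happens to a recursive function when its locals are a single shared copy (emulated here with the static keyword) rather than a fresh copy per call.

```c
/* Hypothetical globals standing in for the arguments and result of Add. */
int add_first;
int add_second;
int add_result;

/* An argument-free Add: inputs and output travel through globals. */
void AddGlobals(void)
{
    add_result = add_first + add_second;
}

/* Global "argument" for the two recursive functions below. */
int sum_n;

/* Locals are static, so there is only one copy of each, which is how
   the manual says VE-C allocates local variables. */
int SumDownStaticLocal(void)
{
    static int saved;
    static int rest;

    saved = sum_n;                  /* remember this level's value */
    if (sum_n == 0)
        return 0;
    sum_n = sum_n - 1;
    rest = SumDownStaticLocal();    /* nested calls overwrite 'saved' */
    return saved + rest;
}

/* The same logic with ordinary stack locals, for comparison. */
int SumDownStackLocal(void)
{
    int saved = sum_n;              /* a fresh copy for every call */
    int rest;

    if (sum_n == 0)
        return 0;
    sum_n = sum_n - 1;
    rest = SumDownStackLocal();
    return saved + rest;
}
```

Starting from sum_n = 3, the stack version returns 3 + 2 + 1 = 6, while the static version returns 0 because every level ends up reading the value stored by the deepest call. A recursive VE-C function therefore needs to finish using a local before it calls itself, or keep the value somewhere the nested call does not touch.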

Preprocessor Features

VE-C contains a preprocessor to allow a few features that simplify the development and readability of application programs. Preprocessor statements begin with “#” and do not have a “;” at the end.

#define statements allow a user to define a symbolic name for a particular string of characters. #define can be used to name a constant or to define a macro. This improves the readability of a program and makes it easy to change just one statement in order to change a parameter used throughout a program. For example:

    extern SPEECH VPhelloworld;
    #define MSG_HIWORLD 0                                   // define the index in the speech table
    #define SayHelloWorld Talk(MSG_HIWORLD, &VPhelloworld)  // define a macro
    main ()
    {
        SayHelloWorld;
    }


#include statements allow a user to insert another file into the VE-C program. #include files often contain #define statements for constants and macros used by many different applications. If the file name is specified in double quotes, then Voice Extreme™ looks for it in the local folder. If it is in angle brackets, then it is assumed to be in a special VEInclude folder, which is set up in the Options window. VE-C comes with a set of standard #include files, such as ve.veh, that define commonly used constants and macros.

    #include <ve.veh>           // Include the standard VE-C definitions
    #include "mymacros.veh"     // Also include locally defined macros

#pragma statements allow compiler-specific definitions. VE-C has two of these that allow a program to define a text description and an integer version number. The string and version number are stored with the program and are accessible to the program through the built-in functions GetApplicationText() and GetApplicationVersion(). For example:

#pragma VE_APP_TEXT "Speaker Verification Sample Program"

Note that the text is considered to be all of the characters within the enclosing double quotes. Escape sequences are not translated; i.e. if the string were "Hello\n", then the application text would contain the word "Hello" followed by the two characters '\' and 'n' rather than a newline. Note also that the application text cannot contain a double quote character. If the text is longer than the space allotted for it (currently 94 characters), then it is truncated to the maximum length. A null (\0) byte is always appended to the application text so that it is a standard string.

#pragma VE_APP_VERSION 1

Note that the application version should be specified as a decimal number and is treated as a uint8 variable. If the version number is larger than 255 then it is stored modulo 256.
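The storage rules for the two #pragma values can be sketched in standard C. This is an illustration only, not the toolkit's implementation: the helper names store_app_text and store_app_version are hypothetical, and the sketch assumes that the 94-character limit does not include the terminating null byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Limit quoted in the manual; the null terminator is assumed to be extra. */
#define APP_TEXT_MAX 94

/* Hypothetical helper: copies the text as-is (no escape translation),
   truncates it to APP_TEXT_MAX characters and appends a null byte.
   Returns the number of characters stored. */
size_t store_app_text(char dest[APP_TEXT_MAX + 1], const char *text)
{
    size_t len = strlen(text);

    if (len > APP_TEXT_MAX)
        len = APP_TEXT_MAX;
    memcpy(dest, text, len);
    dest[len] = '\0';
    return len;
}

/* Hypothetical helper: a version number is kept in a uint8, so values
   above 255 are stored modulo 256. */
uint8_t store_app_version(unsigned int version)
{
    return (uint8_t)(version % 256u);
}
```

With these rules a version of 256 is stored as 0 and 257 as 1, so version numbers are best kept in the range 0 to 255.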

Using the Sensory Technologies: Common Issues

Voice Extreme™ accesses the Sensory Technologies through the built-in functions that are described in detail in the “VE-C Built-in Functions” section. This section discusses some of the issues common to all technologies. Voice Extreme™ has been designed so that the application programmer can access the technologies without having to be familiar with the details of how they are invoked; however there are some decisions that need to be made in using any technology. In the examples below, extensive use is made of the definitions in <ve.veh>.

Output Devices

The Talk, Play, DTMF and Music technologies can drive either or both of two output devices, the DAC (analog waveforms) or the PWM (pulse width modulator). Different technologies can use different output devices. The VE-C default is to send all output to BOTH; if this is acceptable then the VE-C program does not need to call SetOutput. For example:

    SetOutput(TALK_SETUP, PWM);     // Speech output is to PWM only
    SetOutput(MUSIC_SETUP, DAC);    // Music output is to DAC only

Note that Debug Speech Output, described below, is always sent to BOTH (DAC and PWM).

Jumpout Output

All technologies except recognition may be interrupted. The “stop condition” can either be an IO event, such as pushing a button on the Development Board, or a keypad event, such as pressing or releasing a key on the keypad. The VE-C default is to disallow any interrupt condition for any technology. To enable a stop condition, the program needs to call two functions. The first tells what type of interrupt will be used by which technology; the second specifies more details about the condition. For example:

    SetStopCondition(TALK_SETUP, IO);   // Speech is stopped
    StopOnButtonAPressed;               // when Button A is pressed

Different technologies can use different stop conditions in the same program, so we could also simultaneously have:


    SetStopCondition(MUSIC_SETUP, KEYPAD);  // Music is stopped
    StopOnAnyKeyPressed;                    // when any key is pressed

Debug Output

VE-C provides many functions to help the developer during program development. Debug output can either be spoken or sent over the RS-232 port to a program such as HyperTerminal. To enable debug output, the program needs to include a call to any of the Debug functions and also to specify where the output is directed. The programmer may want to include lots of calls to Debug functions and then selectively enable/disable the debug output. The VE-C defaults are to speak GENERAL debug and to turn off all other debug. For example:

    SetDebug(RECOG_RESULT, RS232);  // Send Recog results to HyperTerminal
    DebugH8(ctr);                   // Speak the current contents of “ctr”
    DebugRecogSI();                 // Print the SI Recog results
    SetDebug(GENERAL, NONE);        // Turn off GENERAL debug speech

Note that Debug Speech Output is always sent to both DAC and PWM and is not interruptible.

Technology Configuration

Many of the technologies can be configured to behave differently based on the settings of certain parameters; Voice Extreme™ provides intelligent default values for these parameters. Advanced applications may necessitate changing some of these settings and Voice Extreme™ provides mechanisms for doing this. In the “VE Built-In Functions” section, a number of functions named “Set…” are described. These functions must be called before the call to the technology function. They do not need to be called if the VE-C default is acceptable. For example:

    PlayMusic(1, &VEmusicdwc, &VEmusiclis);     // Play tune 1 with the default filter
    SetMusicFilter(0);                          // Change the filter to minimize “wobbles”
    PlayMusic(0, &VEmusicdwc, &VEmusiclis);     // Play tune 0 with the new filter

Accessing Multiple Results

Most functions return a single value; however many of the recognition and training functions return multiple results, such as the index of the “winning” word and its score. VE-C stores these results in its own storage and allows the program to look at the results using “accessor” functions, which have names beginning with “Get…”. The current results are only valid until the next call to a recognition or training function, which is generally long enough. An application program could allocate its own storage and save the results by calling each accessor function and storing its return value. For example:

    result = Recog(&WTcall);                    // Try to recognize the answer
    if ((level = GetRecogLevel1()) < CONFcall)  // Is the score high enough?
        Talk(MSG_WHAT, &TalkTable);             // No, “what did you say?”
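Saving results into application storage can be sketched as follows. This is standard C with hypothetical stubs, not the toolkit itself: the stub functions imitate a single shared result store that is overwritten by each recognition, the 8-bit return types are an assumption, and the SavedResult structure and save_current_result helper are invented for the illustration.

```c
#include <stdint.h>

/* Hypothetical model of the toolkit's single, shared result store. */
uint8_t current_match;
uint8_t current_level;

/* Hypothetical stubs for a recognition call and two accessors. */
void RecogStub(uint8_t match, uint8_t level)
{
    current_match = match;      /* each recognition overwrites the store */
    current_level = level;
}

uint8_t GetRecogMatch1Stub(void) { return current_match; }
uint8_t GetRecogLevel1Stub(void) { return current_level; }

/* Application-owned copy of one set of results. */
typedef struct {
    uint8_t match;
    uint8_t level;
} SavedResult;

/* Calls each accessor once and stores its return value. */
SavedResult save_current_result(void)
{
    SavedResult saved;

    saved.match = GetRecogMatch1Stub();
    saved.level = GetRecogLevel1Stub();
    return saved;
}
```

A copy taken with save_current_result before the next recognition keeps its values, whereas the accessors themselves report only the most recent results.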

Using the Sensory Technologies: Specific Examples

Speech Synthesis

VE-C allows speech synthesis of either words or sentences. In all discussions regarding the Speech Synthesis technology, the terms “speech” and “word” are used both in the strict sense and in the general sense of an utterance, which includes individual words as well as brief phrases or even sound effects like a <beep> or a specific duration of silence.

Example 1: ‘Talk’ function

The Talk function allows a VE-C program to speak a word in a SPEECH file that is linked into the application. For example:

    extern SPEECH VPsidemo3;
    Talk(0, &VPsidemo3);    // Say the first word in the VPsidemo3 vocabulary.
                            // Note: speech tables are 0-indexed


Example 2: ‘SenTalk’ function

The SenTalk function allows a VE-C program to speak a sentence in a SENTENCES file. Sentence tables are useful to simplify the development of application programs supporting multiple languages. The VE-C program needs to link to both the SENTENCES file and the SPEECH files it references. The format of a VE-C sentence file is described in the “Voice Extreme Data Files” section. For example:

    #include <speech\sddemoa>       // Include extern labels and definitions
    for ( j = 1 ; j <= 4 ; j++ )    // Say the first 4 sentences
        SenTalk(j, &SNsddemoa);     // in the SNsddemoa table.
                                    // Note: sentence tables are 1-indexed

Note that silences can be embedded within sentences if the vocabulary table contains entries which represent differing lengths of silence. Silences between sentences can be produced by calling the DelayMilliSeconds or DelaySeconds functions.

Pattern Generation

Pattern Generation is the first step in using any of Sensory’s recognition technologies. PatGen listens to speech and generates a pattern that is then used to recognize or train a word. There are several varieties of Pattern Generation, depending on whether the pattern is to be used with Speaker Independent, Speaker Dependent, Continuous Listening or WordSpot technology. The calling sequence and the returns from each of these PatGen functions are very similar.

Pattern Generation can be run in any of three different ways:

1. Pattern Generation can run by itself, which is the STANDARD mode. For example:

    PatGenW(STANDARD, &WTSI6);  // Listen for a word in the SI set
    Recog(&WTSI6);              // and try to recognize it

2. It can also run in conjunction with Voice Recording, so that the actual speech is also saved. In this case, PatGen runs in a BACKGROUND mode and is started before the recording. For example:

    PatGen(BACKGROUND);             // Simultaneous PatGen and Record
    RecordRP(0, NO_THRESH, 0);      // Record the speech
    if (!GetPatGenResult())         // Now check for any errors from PatGen
        RecogSD(ctr, templates);    // and recognize if everything went well

3. If RP_THRESH is specified, PatGen also runs in the BACKGROUND, but only provides threshold information, so the recording can be post-processed to remove initial silence and glitches. For example:

    PatGen(RP_THRESH);              // PatGen thresholding only for Record
    RecordRP(0, FULL_THRESH, 0);    // Record the speech while removing all silences
    if (!GetPatGenResult())         // Now check for any errors from PatGen
        RecogSD(ctr, templates);    // and recognize if everything went well

Configuration functions allow customization of the maximum number of words considered in generating a pattern, the amount of silence allowed before and between the words, and the level of error reporting. Except for Continuous Listening and WordSpot applications, Pattern Generation is typically preceded by a speech prompt. Before recognition can begin there must be time for the room echo of the prompt to decay away. Voice Extreme™ automatically monitors the echo level and detects when the echoes have died; but if sufficient decay is not achieved within 150 milliseconds or if the user begins speaking during the echo decay period, a “spoke too soon” error is reported. A call to SetPatGenNoErrors (TRUE) provides an unobtrusive way of dealing with “spoke too soon” errors.

Speaker Independent Speech Recognition

Speaker Independent Recognition involves linking the program to a WEIGHTS file, which is used to guide the neural-net processing during SI Recognition. The program must use PatGenW to listen for the pattern and Recog to try to recognize the pattern in the WEIGHTS set. Optionally, the Prior function may be used to post-process the Recog results in order to add emphasis to one word in the set (e.g., the “right” answer to a math problem), or to remove a word in the WEIGHTS set from consideration during SI recognition. For example:

SI Recognition with ‘Prior’ity

    PatGenW(STANDARD, &Wtdigits);                   // Listen for a word in the set
    Recog(&Wtdigits);                               // Try to recognize it as a digit
    Prior(answer, 1);                               // Give emphasis to “answer”
    if ((GetRecogMatch1() < GetRecogSetSize())      // If some digit is recognized and
        && (GetRecogLevel1() >= CONFdigits))        // the confidence level is high,
        ...                                         // process the recognized digit

Speaker Independent weight sets include an extra recognition class, REJECT (or NOTA -- none of the above), which helps recognition in two ways. First, if the user says an unexpected word, the recognizer will usually return the REJECT class. For example, if the application prompts with “Continue?” and is expecting YES/NO, but the user says, “continue”, the application can deal with this result in an appropriate way (e.g., by saying, “Please say yes or no”). Second, if an abrupt or irregular noise occurs during recognition (such as a sneeze or a door slamming), the recognizer can return the REJECT class. The NOTA class is always the highest index class in the weight set. For example, the Yes/No SI weight set has the following classes:

    Index   Word
    0       No/Nope
    1       Yes/Yeah/Yep
    2       NOTA
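The way an application might act on the Yes/No weight set, including the NOTA class, can be sketched in standard C. The sketch is not VE-C: the Action type, the function names and the min_level parameter are hypothetical, and the numeric scale of the confidence level is an assumption. The only facts taken from the text are the class indexes and the rule that NOTA is always the highest index in the set.

```c
/* Class indexes of the Yes/No weight set, as listed in the table. */
#define YESNO_NO        0
#define YESNO_YES       1
#define YESNO_NOTA      2
#define YESNO_CLASSES   3

/* Hypothetical actions an application could take. */
typedef enum { ACTION_NO, ACTION_YES, ACTION_REPROMPT } Action;

/* NOTA is always the highest index class in a weight set. */
int is_nota(int match, int class_count)
{
    return match == class_count - 1;
}

/* Re-prompt on NOTA or on a low score; otherwise act on the answer. */
Action yesno_action(int match, int level, int min_level)
{
    if (is_nota(match, YESNO_CLASSES) || level < min_level)
        return ACTION_REPROMPT;     /* e.g. say "Please say yes or no" */
    return (match == YESNO_YES) ? ACTION_YES : ACTION_NO;
}
```

Treating NOTA and a low confidence level the same way keeps the application logic simple: in both cases the user is asked again.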

Speaker Dependent Speech Recognition

Speaker Dependent Recognition is generally used when a single user needs to discriminate between words in a vocabulary. Smaller vocabularies give better recognition results, with the maximum practical size being about 64 words; an application could potentially switch between different vocabularies, if needed. The Speaker Dependent technology involves training a set of templates, storing them in flash memory and then performing recognition against the trained set. In the training phase, PatGen is used to generate patterns, TrainSD is often used to average two templates to increase the accuracy of recognition, and PutTemplate and GetTemplate are used to transfer templates between temporary and permanent storage. In the recognition phase, PatGen is again used to generate a template and RecogSD is used to perform the recognition. The MaskTemplate function allows individual templates to be temporarily or permanently removed from the set. A configuration function allows variation of the performance level, i.e. the tradeoffs between speed, ease of use and accuracy.


In addition to storage in flash memory, there are two storage areas in the Voice Extreme’s internal register memory, called ‘UNKNOWN’ and ‘KNOWN’. PatGen always uses the UNKNOWN storage area, so if two templates are averaged, the first needs to be saved before the second is generated.

Example 1: SD Training

    TEMPLATE templates[4];                  // Reserve enough room for a set of 4 SD templates
    uint8 ctr=0;                            // Use this as an index to the current template
    PatGen(STANDARD);                       // Create first pattern
    PutTemplate(UNKNOWN, ctr, templates);   // Save the first try
    PatGen(STANDARD);                       // Create a second pattern
    GetTemplate(KNOWN, ctr, templates);     // Retrieve the first try
    TrainSD(UNKNOWN, KNOWN, UNKNOWN);       // Average into UNKNOWN
    PutTemplate(UNKNOWN, ctr, templates);   // and save the result
    ctr++;                                  // Now go on to the next template
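The data flow between the UNKNOWN area, the KNOWN area and the stored template set can be modelled in standard C. Everything in this sketch is hypothetical: a template is reduced to four integers, averaging is assumed to be a plain arithmetic mean, and the function names only mirror the roles of PatGen, PutTemplate, GetTemplate and TrainSD. The real template format and training algorithm are not described in this section.

```c
#include <string.h>

#define TEMPLATE_LEN 4      /* hypothetical template size */
#define SET_SIZE     4      /* room for a set of 4 templates */

typedef struct {
    int v[TEMPLATE_LEN];
} Template;

Template unknown_area;          /* models the UNKNOWN storage area */
Template known_area;            /* models the KNOWN storage area   */
Template stored_set[SET_SIZE];  /* models the template set in flash */

/* Like PatGen: a new pattern always overwrites the UNKNOWN area. */
void pat_gen(const int *samples)
{
    memcpy(unknown_area.v, samples, sizeof unknown_area.v);
}

/* Like PutTemplate(UNKNOWN, ...) and GetTemplate(KNOWN, ...). */
void put_unknown(int index) { stored_set[index] = unknown_area; }
void get_known(int index)   { known_area = stored_set[index]; }

/* Like TrainSD(UNKNOWN, KNOWN, UNKNOWN), assuming a simple mean. */
void train_average(void)
{
    int i;

    for (i = 0; i < TEMPLATE_LEN; i++)
        unknown_area.v[i] = (unknown_area.v[i] + known_area.v[i]) / 2;
}

/* The training sequence from Example 1 for one word. */
void train_one_word(int index, const int *first, const int *second)
{
    pat_gen(first);         /* first try lands in UNKNOWN             */
    put_unknown(index);     /* save it before it is overwritten       */
    pat_gen(second);        /* second try replaces the first          */
    get_known(index);       /* bring the first try back into KNOWN    */
    train_average();        /* average both tries into UNKNOWN        */
    put_unknown(index);     /* and store the averaged template        */
}
```

If the first put_unknown call were omitted, the second pat_gen would destroy the first pattern and there would be nothing left to average, which is the reason for the save step in the example.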

After all of the templates are trained and stored, the recognition phase might look like:

Example 2: SD Recognition with masking

    PatGen(STANDARD);                   // Listen for a word
    MaskTemplate(0, 3, templates);      // mask out the last template
    result = RecogSD(ctr,templates);    // Recognize against the whole set of templates

Speaker Verification

Speaker Verification is generally used to discriminate between two speakers, e.g. in a password protection application. The use of Speaker Verification technology is quite similar to that of Speaker Dependent, in that it requires a training and a recognition phase and involves a set of templates. In Speaker Verification, this template set is limited to four words and is generally used to store a password sequence, since requiring a sequence of passwords increases the security of a verification system. SV uses its own TrainSV and RecogSV functions. A configuration function allows variation of the security level, i.e. the tradeoff between false accepts and false rejects. The training phase would be identical to that described in the Speaker Dependent section, except that TrainSV would be called instead of TrainSD.

Example 1: SV Training

    TEMPLATE templates[4];                  // Reserve enough room for a set of 4 SV templates
    uint8 ctr=0;                            // Use this as an index to the current template
    PatGen(STANDARD);                       // Create first pattern
    PutTemplate(UNKNOWN, ctr, templates);   // Save the first try
    PatGen(STANDARD);                       // Create a second pattern
    GetTemplate(KNOWN, ctr, templates);     // Retrieve the first try
    TrainSV(UNKNOWN, KNOWN, UNKNOWN);       // Average into UNKNOWN
    PutTemplate(UNKNOWN, ctr, templates);   // and save the result
    ctr++;                                  // Now go on to the next template

The recognition phase would usually involve listening for all of the trained passwords and then making a pass/fail decision based on the composite score. A parameter to the RecogSV function specifies whether or not the passwords need to be spoken in strict order. For example:

Example 2: SV Recognition

    for (i=0; i<ctr; i++)                               // prompt and recognize all passwords
    {
        SenTalk(SEN_PLEASE_SAY_FIRST + i, &SNsvdemoa);
        PatGen(STANDARD);
        RecogSV(i+1, ctr, 1, templates);                // NOTE: recognize in strict order
    }
    if (GetRecogSVResult())                             // test overall pass/fail result
        SenTalk (SEN_PASSWORD_REJECTED, &SNsvdemoa);
    else
        SenTalk (SEN_PASSWORD_ACCEPTED, &SNsvdemoa);

Note that the above example did not make an accept/reject decision until all of the passwords were processed. An earlier pass/fail decision could be implemented by calling the GetRecogSVWordResult() function after each individual word to see if it was recognized.

Continuous Listening

Continuous Listening is a variation of Sensory’s speech recognition technology that provides the capability to listen continuously for a “trigger” word or phrase to be spoken. This technology does not recognize words embedded in speech; the WordSpot technology is available for those applications. CL is generally used to recognize a short command sequence, such as “Place call”. Each of these words is recognized individually, with the first word being a “trigger” word and the second word actually causing an “action” to be performed. CL can be used with either the Speaker Independent or Speaker Dependent/Verification technologies. SI involves recognition against a weight set; SD/SV would also involve a training phase such as those described above. CL uses its own PatGen functions and also provides a quick means of testing words by simply checking their duration. Configuration functions allow the program to specify the amount of time to wait for the first word of a phrase and to control the tradeoff between accuracy and responsiveness. For example, with SI:

Example 1: CL with SI Recognition

    if (!CLPatGenW(0, &WTplace))                    // If PatGen is successful
        if ((CheckDuration(SHORT_DUR) == 0)         // If it’s a short word,
            && (Recog(&WTplace) == 0)               // recognized as word[0] of the set
            && (GetRecogLevel1() >= CONFplace))     // with high enough confidence
            AllLedsOn;                              // Signal "success"

Note that when Continuous Listening is used with the Speaker Verification technology, SetSVSecurityLevel(0) is often used to force a special mode for CL. The “size” parameter for RecogSV is then forced to 1 and SV uses a special “extra low” security level that is more appropriate for CL applications. Although CL can trigger based on a single word, Sensory recommends using two or more words in sequence unless a prompt is used. This will greatly reduce the number of false triggers (random words falsely recognized as the trigger). Each word should be a multi-syllabic word or a brief phrase, ideally with 3-5 syllables. Since there is a brief delay of about 0.5 seconds between the recognition of one word and listening for the next, the best designs use trigger words that are naturally separated by other speech or words that cannot easily be run together.

WordSpot

WordSpot provides the ability to recognize trigger words embedded in continuous speech; thus the password sequence “Robert Henson” could be recognized if spoken as “My name is Robert John Henson”. WS can only be used with the Speaker Dependent technology; thus it always requires a training phase using the PatGenWS function. Unlike CL technology, which can listen for and recognize a single word out of a set of words, WS can only listen for and recognize a single word at a time. This makes it very useful for a “gateway” or “trigger” word, but not useful for selecting a command word from a list of commands. The WordSpot recognition function allows control of the amount of time to “listen” for a word. A configuration function allows the program to control the tradeoff between false accepts and false rejects.


For example, the training phase could be identical to the one described in the “Speaker Dependent” section, except that it would call PatGenWS.

Example 1: WS Training

    TEMPLATE templates[2];                  // Reserve enough room for a set of 2 wordspot templates
    uint8 ctr=0;                            // Use this as an index to the current template
    PatGenWS(STANDARD);                     // Create first pattern
    PutTemplate(UNKNOWN, ctr, templates);   // Save the first try
    PatGenWS(STANDARD);                     // Create a second pattern
    GetTemplate(KNOWN, ctr, templates);     // Retrieve the first try
    TrainSD(UNKNOWN, KNOWN, UNKNOWN);       // Average into UNKNOWN
    PutTemplate(UNKNOWN, ctr, templates);   // and save the result
    ctr++;                                  // Now go on to the next template

The recognition phase could then consist of:

Example 2: WS Recognition

    SetStopCondition(WS, IO);           // Allow any button press to abort WordSpot
    StopOnAnyButtonPressed;             // since no timeouts will be used
    for (i=0; i<ctr; i++)               // Recognize all passwords in order, no timeout
        if (WordSpot(0, i, templates))
            break;                      // Abort on any error
    if (i == ctr)
        ...                             // Success = all words recognized
    else
        ...                             // Perform action for failure or interrupt

Although WordSpot can trigger based on a single word, Sensory recommends using two or more words in sequence unless a prompt is used. This will greatly reduce the number of false triggers (random words falsely recognized as the trigger). Each word should be a multi-syllabic word or a brief phrase, ideally with 3-5 syllables. Since there is a brief delay between the recognition of one word and listening for the next, the best designs use trigger words that are naturally separated by other speech or words that cannot easily be run together.

As mentioned earlier, WS is very useful for gateway words. Usually WS is used in tandem with SD recognition. For example, an application to control a light switch might use WS technology to continuously recognize the word “Lights”. When that word is recognized, SD technology could be used to recognize the words “on” or “off”. If the word “Lights” was spoken accidentally, the application program would only listen for the command words “on” and “off” for a few seconds. If the command word was not recognized, then the program would go back and listen for the WS word again.

Typically a timeout would be used on the second word of a WordSpot sequence, so that if the second word is not spoken, the program can resume listening for the first word. In the example above, the word “Robert” would not have a timeout, but “Henson” would.
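The light switch design described above amounts to a two-state machine, which can be sketched in standard C. The sketch is an illustration under stated assumptions, not VE-C: the state, event and structure names are hypothetical, and the events stand for the outcomes that a real program would obtain from the WordSpot and RecogSD calls and from its timeout handling.

```c
/* Which kind of word the program is currently listening for. */
typedef enum { LISTEN_GATEWAY, LISTEN_COMMAND } ListenState;

/* Hypothetical outcomes of one listening attempt. */
typedef enum {
    EV_GATEWAY_HEARD,       /* WS recognized "Lights"              */
    EV_COMMAND_ON,          /* SD recognized "on"                  */
    EV_COMMAND_OFF,         /* SD recognized "off"                 */
    EV_NOT_RECOGNIZED,      /* something else was said             */
    EV_TIMEOUT              /* nothing was said within a few seconds */
} ListenEvent;

typedef struct {
    ListenState state;
    int light_on;
} LightController;

/* Advances the controller by one event. */
void light_step(LightController *c, ListenEvent ev)
{
    if (c->state == LISTEN_GATEWAY) {
        /* Only the gateway word matters here; commands are ignored. */
        if (ev == EV_GATEWAY_HEARD)
            c->state = LISTEN_COMMAND;
        return;
    }

    /* LISTEN_COMMAND: act on a command word, if there was one. */
    if (ev == EV_COMMAND_ON)
        c->light_on = 1;
    else if (ev == EV_COMMAND_OFF)
        c->light_on = 0;

    /* Success, failure or timeout all return to the gateway word. */
    c->state = LISTEN_GATEWAY;
}
```

Because command words are ignored while the controller is waiting for the gateway word, saying “on” in ordinary conversation has no effect unless it follows “Lights”.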

Record and Play

The Voice Record and Playback technologies allow a voice recording to be saved in flash memory for later playback. Post-processing functions are available to clean up and compress the recording or to erase entire recordings, allowing more efficient use of the flash memory. The VE-C program does not need to explicitly declare any storage for voice recordings; a memory manager built into Voice Extreme™ keeps track of available space and maintains a 0-indexed table of recordings. Recordings share the 2Mb flash with both the VE-C program and its data files, so that the exact amount of storage available varies with the application, but the maximum is just under 60 seconds of uncompressed speech. For example, to record, store and play speech all that is needed is:

Example 1: Simple Record/Playback

    EraseRP(0);         // Erase any previous recording in slot #0
    RecordRP(10,0,0);   // Record 5 seconds of speech
    Compress(2,0);      // Compress to the smallest storage (optional)
    PlayRP(0);          // Play it back

Voice Recording can be used in conjunction with PatGen. If a set of templates has been generated in the following way:

Example 2: Simultaneous PatGen + Record/Playback

    PatGen(BACKGROUND);                     // Run PatGen in the BACKGROUND
    RecordRP(0, 0, ctr);                    // Save the pattern as recording# ctr
    PutTemplate(UNKNOWN, ctr, templates);   // Save the pattern
    ctr++;                                  // Increment count of patterns and recordings

Then subsequent recognition can reference the actual recorded speech:

    RecogSD(ctr, templates);        // Recognize against an SD set
    index = GetRecogSDClass1();     // Get the index of the recognized word
    PlayRP(index);                  // Play back the original recording

TouchTones (DTMF)

The Touch Tone, or DTMF (Dual Tone Multi-Frequency) technology allows acoustic tone dialing. Voice Extreme™ programs can generate dial tones and tones corresponding to the digits on a 3X4 or 5X4 keypad. Configuration functions allow variation of the length of the tones and the silences between tones.

Example 1: DTMF Tone and Dial Tone Generation

    TTone(DIAL_TONE);       // Start with a dial tone
    for (j=0; j < 7; j++)
        TTone(j);           // Dial 012-3456

The stop condition for the Touch Tone technology works differently depending on whether the tone being generated is a dial tone or a regular tone. For a dial tone, the stop condition interrupts the tone, as is the normal case. For any other tone, the stop condition lengthens the tone. Thus in the example above, if we had previously specified:

    SetStopCondition(DTMF_SETUP, KEYPAD);
    StopOnAnyKeyPressed;

then pressing any key during the dial tone would stop it, but pressing any key during the phone number would cause the current tone to continue as long as the key remained pressed.

Music

To access the Music technology, the VE-C application must link with both a TUNES (music MIDI) file and a NOTES file located in the “music” folder. These generally have labels beginning with “VEmusic”. The files must have been generated in such a way that they both fit into a single 64Kb bank of the flash memory (see the “Voice Extreme Data Files” section). A configuration function allows variation in the quality of the music. For example:

Example 1: Music Synthesis

    extern NOTES VEmusicdwc;
    extern TUNES VEmusiclis;
    PlayMusic(0, &VEmusicdwc, &VEmusiclis);     // Play the 1st tune in the file

Other Voice Extreme™ Built-In functions

VE-C provides the developer with many other built-in functions that are described in detail in the “VE Built-In Functions” section. These routines allow the following:

52 P/N 80-0200-D © 2002 Sensory Inc.

Page 53: Voice Extreme™ Toolkit

Programmer’s Manual Voice Extreme™ Toolkit

Full control of two 8-bit general purpose IO ports, including the ability to wait or sleep until a specific event occurs Serial (RS-232) IO at a packet or a byte level Full support of a 4X5 Keypad, including the ability to wait for key presses or releases Delays in seconds or milliseconds, timed sleep and a real-time seconds counter Program resets, block memory copies, random number generation Access to application-specific information, such as version number, description and whether this is the first time an application has been executed (allowing special first-time initialization)

Debugging and Troubleshooting Tips

Debugging Functions and Macros

Built-in functions allow debug output to be spoken or sent over the RS-232 port to the internal Debugging Terminal. During program development it is often helpful to include statements such as:

    DebugH8(ctr);       // Speak the current value of ctr

or even

    DebugH4(1);         // Speak “1” if the program gets this far

Note that the default for such “GENERAL” debug output is SPEECH_OUTPUT.

Built-in functions also allow debug output of the results of certain technologies. Since these often involve multiple results, this output is often directed to the RS-232 port.

    SetDebug(RECOG_RESULT, RS232);
    Init232();                      // This call is necessary to enable the port
    …
    RecogSV(i, 3, size, password);
    DebugRecogSV();                 // Results may be viewed with HyperTerminal

Macros have been provided for using the colored LEDs on the Development Board. For instance:

    AllLedsOn;          // Turn on all LEDs when the program starts
    …
    GreenToggle;        // Toggle the Green LED each time through

Run-time Error Checking and Reporting

The first (boot) block of the flash image contains information such as a checksum of the command list and version number of the parser used to generate the program. At reset, the interpreter looks for any abnormalities in the boot block, such as a checksum error or an incompatible parser version number. If it finds any error, it doesn’t start the program, but instead waits for a valid flash image to be downloaded.

The following fatal error conditions can occur during program execution. Voice Extreme™ handles them by writing an error code into a known location, lighting an LED on the Development Board and then executing a tight loop that awaits bytes over the serial port. Voice Extreme™ responds to any character by sending an error code from the list below:

    Y (Yellow)  Internal Stack Underflow. Probably from too complex an expression.
    G (Green)   Illegal Function Code or Operator. Probably indicates that the command list has been corrupted.
    R (Red)     Command List Error: attempt to execute a command outside the command list. Might indicate that the command list has been corrupted or just that the program has finished executing all of its statements. Note that the “hello.vec” example program would end up in this state.
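A host program that talks to the board over the serial port could translate the returned error byte as sketched below. This is ordinary C for the PC side, not VE-C. The function name is made up for illustration, and the sketch assumes the byte sent back is the letter shown in the list above.

```c
#include <stddef.h>

/*
 * Map a fatal-error byte received from the board to a short description.
 * Only the three codes listed in the manual are known; any other byte
 * returns NULL so the caller can treat it as unrecognized.
 */
const char *ve_fatal_error_text(char code)
{
    switch (code) {
    case 'Y': return "Internal stack underflow";
    case 'G': return "Illegal function code or operator";
    case 'R': return "Command list error";
    default:  return NULL;
    }
}
```

Returning NULL for anything else keeps line noise from being mistaken for a documented error.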

Error Messages from the Parser

The Build process calls a series of programs to parse the source file, assemble it, link it with data files and add checksums into the final flash image. Any one of these programs can potentially put out error messages, but the messages most often seen are generated by the parser, in response to a syntax error in the Voice Extreme™ application program. All of these messages contain a line number, which should help pinpoint the location of the problem, but the Voice Extreme™ programmer should be aware of the following:

• Often a single error can result in a number of error messages, since it may take the parser a line or two to “recover” from the error. Often fixing the first problem noted will eliminate multiple error messages.
• Sometimes the line number may not accurately reflect the location of the problem. For example, if the program ends with a mismatched “{”, this may not show up until the last line of the program, but actually results from a missing “}” somewhere earlier in the file.
• In cases where it is hard to locate the source of a syntax error, it is sometimes helpful to comment out large sections of the code with /*…*/ style comments to try to isolate the area of the program where the problem is located.
• The parser issues a few warnings, such as “Initializer is larger than object; info might be lost”. The application program may be used with these warnings, but it is generally good coding practice to try to fix them.
• Although many of the parser error messages are intended to be self-explanatory (e.g. “Unterminated string”, “break not inside of loop”), other errors describe conditions internal to the parser (e.g. “Allocation failure”, “null Type pointer”). These messages often include the name of the routine inside the parser where the error occurred. If you are unable to eliminate these errors by fixing a primary error, as described above, please note the exact error message, save the program that caused the error and send this information to a Sensory applications engineer.

Voice Extreme™ Data Files

Sentence Table Format

A sentence table allows sentences to be made up of utterances from one or more speech tables. A sentence table is a form of assembly language code. The Voice Extreme™ editor will allow you to create a sentence table file. Unlike a Voice Extreme™ source file which is built by pressing the button, a sentence table is assembled by pressing the button on the editor toolbar. Sentence tables use the “.VEA” extension and assembled sentence table files use the “.VEO” extension. The assembler will assemble the .VEA source file and create a .VEO object file. Sentence tables typically contain a label starting with “SN”. The “\sample\speech” folder contains a few source files as well as corresponding object files. There are seven main components that are required to create a sentence table file.

1. Define the speech file(s) as an external component. Example:

    extern VPsddemo
    extern VPsensopow
    extern VPhelloworld

2. Define the sentence table label as a public label, allowing the Voice Extreme™ program to access the sentence table. Example:

    public SNtest

3. Include the sentence table definition. This is required because most of the definitions and macros used in a sentence table file are defined in the sentable.inc file. Example:

    include "sentable.inc"

4. Name the sentence label segment and define the label. Example:

    _SNtest segment "CDATA"     ; Name the segment beginning with “_”
    SNtest:                     ; Sentence Table starts with Speech Table List


5. Define the label, add the number of speech files and list the speech file labels. A sentence table supports up to 30 speech files. The number of speech files should be preceded by a “db”. Each speech file label should be preceded by a “dt”. Example:

    db 3                        ; 3=number of speech files
    dt VPsddemo                 ; 0=sddemo phrases
    dt VPsensopow               ; 1="Sensory powered"
    dt VPhelloworld             ; 2="Hello World"

Note: Comments are preceded with a semicolon.

6. Define the number of sentences to be created and create the actual sentence strings. Example:

    db 7                        ; M=number of sentences, maximum=255
    SpeechTable 0               ; use speech table 0=VPsddemo
    db 21, EOM                  ; 1: “Beep” is word 21 in VPsddemo
                                ; EOM = End of sentence token
    SpeechTable 0
    db 21, 21, EOM              ; 2: “Beep” “Beep”
    SpeechTable 0               ; use speech table 0=VPsddemo
    db 13, MSIL, 34, 1, EOM     ; 3: “say” MSIL “word” “one”. MSIL=75mS silence
    SpeechTable 0
    db 13, MSIL, 34, 2, EOM     ; 4: “say” MSIL “word” “two”
    SpeechTable 0
    db 13, MSIL, 34, 3, EOM     ; 5: “say” MSIL “word” “three”
    SpeechTable 2               ; use speech table 2=VPhelloworld
    db 0, EOM                   ; 6: “Hello World”
    SpeechTable 1               ; use speech table 1=VPsensopow
    db 0, EOM                   ; 7: “Sensory Powered”

7. Finally, all sentence tables should be ended as follows:

    end                         ; This signifies the end of the file
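The structure of a single sentence (a Speech Table token, one or more indices, optional further tokens, then EOM) can be checked with a small host-side model such as the one below. This is a sketch only, written in ordinary C rather than VE-C or the sentence table assembly language. The token byte values are hypothetical, since the real ones live in sentable.inc and are not listed here, and the sketch assumes the SpeechTable macro emits a token byte followed by the table number.

```c
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL token values, chosen only for this sketch. The real
 * values come from sentable.inc and are not given in this manual. */
enum {
    TOK_SPEECH_TABLE = 0xFD,  /* assumed: followed by one table-number byte */
    TOK_MSIL         = 0xFE,  /* embedded silence */
    TOK_EOM          = 0xFF   /* end of sentence */
};

/*
 * Walk one sentence and count the utterances it would speak.
 * Returns the count, or -1 if the sentence does not start with a
 * SpeechTable token, selects a table >= num_tables, or lacks an EOM.
 */
int count_utterances(const uint8_t *s, size_t len, uint8_t num_tables)
{
    size_t i = 0;
    int count = 0;

    if (s == NULL || len == 0 || s[0] != TOK_SPEECH_TABLE)
        return -1;
    while (i < len) {
        uint8_t b = s[i++];
        if (b == TOK_EOM)
            return count;
        if (b == TOK_SPEECH_TABLE) {
            if (i >= len || s[i] >= num_tables)
                return -1;
            i++;                 /* skip the table number */
        } else if (b != TOK_MSIL) {
            count++;             /* any other byte is an utterance index */
        }
    }
    return -1;                   /* ran off the end without an EOM */
}
```

Under these assumptions, sentence 3 in the example above (“say”, MSIL, “word”, “one”) counts as three utterances, because the silence token is skipped.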

Creating a Sentence Table File

Using the editor, here’s how to create a Sentence Table File and save it:

1. Go to "File", "New" then “Standard Sentence Table document”, or press the button.
2. An "untitled.vea" document will be created.
3. Edit the document.
4. Save it by pressing the button on the editor toolbar, using “vedemo” as a file name (the extension will be added automatically). The file “vedemo.vea” will be created. Enter the following text:

    ;------------------------------------------------------------
    ; File: vedemoa.vea
    ; Purpose: Sample of a VE Sentence Table
    ; Copyright: (c) 2002 by Sensory, Inc., All Rights Reserved
    ;------------------------------------------------------------
    extern VPsidemo3            ; Include externs for each speech table referenced
    extern VPsensopow
    public SNvedemo             ; Define a public name for the sentence table
    include "sentable.inc"      ; Include definitions of the sentence tokens
    ;------------------------------------------------------------
    _SNvedemo segment "CDATA"   ; Name the segment beginning with “_”
                                ; NOTE: CDATA allows 24-bit addresses
                                ; defined by “dt”
    SNvedemo:                   ; Sentence Table starts with Speech Table List
    db 2                        ; N=number of speech files, maximum=30
    dt VPsidemo3                ; 0=sidemo3 phrases
    dt VPsensopow               ; 1="Sensory powered"
    ; The actual sentences follow, preceded by the total number of sentences.
    ; Each sentence starts with a Speech Table token, followed by one or more
    ; indices into that table,
    ; optionally followed by further Speech Table tokens and indices, and ends
    ; with an EOM token.
    ; SIL tokens can be used to embed silence into the sentence.
    ;------------------------------------------------------------
    ; Sentence Strings
    db 2                        ; M=number of sentences, maximum=255
    s1:                         ; NOTE: Sentence 1 is only one utterance
    SpeechTable 0               ; s1 is a label, it can be omitted
    db 27                       ; “Beep” is word 27 in VPsidemo3
    db EOM                      ; End of sentence token
    s2:                         ; NOTE: Sentence 2 is two utterances
    SpeechTable 1               ; "Sensory Powered" from VPsensopow
    db 0
    SpeechTable 0               ; followed by a "beep" from VPsidemo3
    db 27
    db EOM
    end                         ; This signifies the end of the file

5. Assemble the sentence table by pressing the button (editor toolbar). If there are any errors, make corrections and re-assemble. This generates the object file “vedemo.veo”. Please save a copy of “sentable.inc” in your project folder.

The sentence table can now be used in your project.

Creating a simple program to play these sentences

1. Open the “hello.vec” file (or a new file) and enter the following text:

    extern SENTENCES SNvedemo;          // Start of the sentence table

    // Sentence table messages.
    #define SEN_BEEP 1                  // Index of the first sentence
    #define SEN_SENSO_POW_BEEP 2        // These index numbers are defined

    #include <ve.veh>                   // Standard VE definitions

    main()
    {
        SenTalk(SEN_BEEP, & SNvedemo);              // “Beep”
        DelaySeconds(1);                            // Delay 1 Second
        SenTalk(SEN_SENSO_POW_BEEP, & SNvedemo);    // “Sensory Powered” “Beep”
    }

2. Include the speech file and the sentence table in your project

Open the project window by clicking the “Project” pull down menu and selecting the “Project Window”, or just press F11.


Add the speech files “sidemo.ves” and “sensopow.ves” from the sensory\samples\data\speech by pressing the button, and add the sentence table “vedemo.veo” by pressing the button. Save the project by pressing the button.

3. Build and download the program

Build the project by pressing the button, or just press F9. After a successful build, download the “hello.veb” by pressing the button and pressing the “DOWNLOAD” button on the Voice Extreme development board.

After you download the program into your Voice Extreme Toolkit the unit will automatically restart.

Speech File

Speech files for Voice Extreme™ can be produced by Sensory linguists, or developers can create them using the Sensory Quick Synthesis™ application. Refer to separate documentation for details of vocabulary development using Quick Synthesis™. The end result of the Quick Synthesis™ process will be two files:

• A Speech file, “filename.ves”.
• A VE-C include file, “filename.veh”.

After creating Quick Synthesis™ files, add them to your program as follows:

• Add the speech file to your project.
• Add the line “#include <filename.veh>” to your program. This will allow the use of symbolic names for the individual phrases in the speech file.

Weights file

Currently SI weights files must be created by Sensory linguists. Contact Sensory for assistance.

Music file

Contact Sensory for assistance in creating music “tunes” files.


Voice Extreme™ Language: VE-C Built-in Functions

Technology Configuration

The following configuration functions provide control over subsequent technology operation by specifying the output device, the jumpout conditions and the amount and “voice” of any debug output. If a “Set” function has not been invoked for a particular variable, then the technology uses the VE-C default value. Most of the symbols shown in this section are defined in <ve.veh>.

The SetOutput Function

Usage:
    BOOL SetOutput(UINT8 technology, UINT8 device)

Description:
    Selects the output device to be used for the sound output technologies.

Arguments:
    Technology          Device
    0 TALK_SETUP        0 NONE
    1 PLAY_SETUP        1 DAC
    2 DTMF_SETUP        2 PWM
    3 MUSIC_SETUP       3 BOTH

Returns:
    TRUE if function call successful
    FALSE if any input argument invalid

Example:
    // This will set TALK_SETUP and PLAY_SETUP for PWM output and DTMF_SETUP
    // for DAC output
    SetOutput(TALK_SETUP, PWM);
    SetOutput(PLAY_SETUP, PWM);
    SetOutput(DTMF_SETUP, DAC);

Notes:
    The VE-C default is “BOTH” for all output technologies.

See also:
    Talk, SenTalk, TTones

The SetDebug Function

Usage:
    BOOL SetDebug(UINT8 type, UINT8 voice)

Description:
    Selects the destination (“voice”) used for each type of debug output.

Arguments:
    Type                                            Voice
    0 RECOG_RESULT (results from Recog…, Prior)     0 NONE
    1 SILENCE (plus max power, AGC)                 1 SPEECH_OUTPUT
    2 PATGEN_RESULT                                 2 RS232
    3 GENERAL (DebugH4, D8, etc.)

Returns:
    TRUE if function call successful
    FALSE if any input argument invalid

Example:
    // This will set all debug information to speech output
    SetDebug(RECOG_RESULT, SPEECH_OUTPUT);
    SetDebug(SILENCE, SPEECH_OUTPUT);
    SetDebug(PATGEN_RESULT, SPEECH_OUTPUT);
    // SetDebug(GENERAL, SPEECH_OUTPUT); not needed, already defaults to speech.

Notes:
    1. The VE-C default is “NONE” for all types except GENERAL; “SPEECH_OUTPUT” for GENERAL.
    2. The SPEECH_OUTPUT option outputs to both DAC and PWM and is not interruptible.

See also:
    DebugH4, DebugH8, DebugH16, DebugH24, DebugD8, DebugD16, DebugD100, DebugPatGen, DebugRecog, DebugRecogSD, DebugRecogSV

The SetStopCondition Function

Usage:
    BOOL SetStopCondition(UINT8 technology, UINT8 handler)

Description:
    Selects the stop (jumpout) condition for a given technology.

Arguments:
    Technology                                  Handler
    0 TALK_SETUP                                0 NONE
    1 PLAY_SETUP                                1 IO
    2 DTMF_SETUP                                2 KEYPAD
    3 MUSIC_SETUP
    4 PATGEN_SETUP = CL_SETUP = WS_SETUP
    5 RECORD_SETUP

Returns:
    TRUE if function call successful
    FALSE if any input argument invalid

Example:
    // This will allow Patgen (including CL and WS) to be interrupted by
    // P0.0 going high
    SetStopCondition(PATGEN_SETUP, IO);
    SetIOStopCondition(0, 0x01, 1);     // stop when P0.0 == 1

Notes:
    1) The VE-C default for all technologies is “NONE”, meaning non-interruptible.
    2) Debug Speech (see the SetDebug function) is always non-interruptible.

See also:
    SetIOStopCondition, SetKeypadStopCondition

The SetIOStopCondition Function

Usage:
    BOOL SetIOStopCondition(UINT8 port, UINT8 bits, UINT8 states)

Description:
    Used in conjunction with SetStopCondition to further specify a stop condition based on an IO event by specifying which bit(s) in which port need to reach which state(s) to cause an abort.

Arguments:
    port = 0 for P0, or 1 for P1
    bits = an 8-bit mask with 1’s in the bit positions which must be tested
    states = an 8-bit mask of the states the bits must reach (0=low, 1=high)

Returns:
    TRUE if function call successful
    FALSE if any input argument invalid

Example:
    // This will allow Patgen (including CL and WS) to be interrupted
    // by P0.0 going high
    SetStopCondition(PATGEN_SETUP, IO);
    SetIOStopCondition(0, 0x01, 1);     // stop when P0.0 == 1

Notes:
    1) The VE-C default for all technologies is “NONE”, meaning non-interruptible.
    2) Debug Speech (see SetDebug) is always non-interruptible.

See also:
    SetDebug, SetStopCondition
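The relationship between the bits and states masks can be pictured with the following host-side C model. It is an illustration, not the interpreter's actual code. It assumes that every pin selected in bits must be at its requested level at the same moment, which the description above does not spell out, and the function name is invented.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the IO stop test: 'bits' selects the pins to watch and
 * 'states' gives the level each watched pin must reach (0=low, 1=high).
 * ASSUMPTION: all selected pins must match at once; pins outside
 * 'bits' are ignored. With bits == 0 nothing is watched and this
 * model reports a match.
 */
bool io_stop_condition_met(uint8_t port_value, uint8_t bits, uint8_t states)
{
    return (port_value & bits) == (states & bits);
}
```

With bits = 0x01 and states = 1, as in the example above, the model reports a match whenever bit 0 of the port reads high, regardless of the other seven pins.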

The SetKeypadStopCondition Function

Usage:
    BOOL SetKeypadStopCondition(UINT8 key, UINT8 state)

Description:
    Used in conjunction with SetStopCondition to further specify a stop condition based on a keypad event by specifying which key needs to reach which state to cause an abort.

Arguments:
    Key                 State
    0-9 (0-9)           0 RELEASE (Abort will happen if key is released)
    10 (*)              1 PRESS (Abort will happen if key is pressed)
    11 (#)
    12-19 (A-H)
    255 (ANY_KEY)

    key = 0-19 or 255 (ANY_KEY); state = 0 for RELEASE, 1 for PRESS.
    ANY_KEY, RELEASE means that the abort will happen if no keys are pressed.

Returns:
    TRUE if function call successful
    FALSE if any input argument invalid

Example:
    // This will allow Patgen (including CL and WS) to be interrupted
    // by any keypad keypress
    SetStopCondition(PATGEN_SETUP, KEYPAD);
    SetKeypadStopCondition(ANY_KEY, PRESS);     // stop when any key is pressed

Notes:
    1) The VE-C default for all technologies is “NONE”, meaning non-interruptible.
    2) Debug Speech (see SetDebug) is always non-interruptible.

See also:
    SetDebug, SetStopCondition
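The key numbering can be summarized with a small lookup, shown here as host-side C for illustration. The function name is hypothetical, and code 11 for “#” is inferred from its position between 10 (*) and 12 (A).

```c
#include <stdint.h>

#define KEY_ANY 255   /* value the manual gives for ANY_KEY */

/*
 * Translate a keypad key code into the character printed on the key.
 * Returns '?' for ANY_KEY, which names no single key, and 0 for codes
 * outside the documented ranges.
 */
char keypad_label(uint8_t key)
{
    if (key <= 9)
        return (char)('0' + key);
    if (key == 10)
        return '*';
    if (key == 11)          /* inferred code for '#' */
        return '#';
    if (key <= 19)
        return (char)('A' + (key - 12));
    if (key == KEY_ANY)
        return '?';
    return 0;
}
```

A table like this is mainly useful when logging which key caused a stop condition.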

Speech Synthesis

The Talk Function

Usage:
    void Talk(UINT8 messageNumber, SPEECH *speechData)

Description:
    Speaks the utterance at index messageNumber in the speechData vocabulary.

Arguments:
    messageNumber = The 0-based index number of the utterance to be spoken
    speechData = A pointer to a Speech File

Returns:
    void

Example:
    // This will emit a short <beep>
    extern SPEECH VPsidemo3;
    #define MSG_BEEP 27
    Talk(MSG_BEEP, &VPsidemo3);

Notes:
    Utterances are numbered starting with 0.

See also:
    SetOutput, SetStopCondition

The SenTalk Function

Usage:
    SINT8 SenTalk(UINT8 sentenceNumber, SENTENCES *sentenceTable)

Description:
    Speaks the sentence at index sentenceNumber in sentenceTable.

Arguments:
    sentenceNumber = The 1-based index number of the sentence to be spoken
    sentenceTable = A pointer to a Sentence Table

Returns:
    0 = Sentence is spoken OK.
    1 = SenTalk function is interrupted.
    2 = sentenceNumber is 0 or > number of sentences in the table.
    3 = Any other error is detected in the sentence table.

Example:
    // This will emit a short <beep>
    #include <speech\cldemoa.veh>
    SenTalk(SEN_BEEP, &SNcldemo);       // Give a welcome beep

Notes:
    1) Sentences are numbered starting with 1.

See also:
    SetOutput, SetStopCondition

Pattern Generation

Pattern Generation can run by itself, which is the STANDARD mode, and is the usual first step in training or recognizing speech. It can also run in conjunction with Voice Recording, so that the actual speech is saved and can be replayed. In this case PatGen runs in a BACKGROUND mode and is started before the recording. If RP_THRESH is specified, PatGen also runs in the BACKGROUND and provides threshold information so the recording can be post-processed to remove initial silence and glitches, but no pattern is actually generated.

The PatGen Function

Usage:
    SINT8 PatGen(UINT8 runHow)

Description:
    Generates a pattern for SD/SV recognition or training.

Arguments:
    runHow
    0 STANDARD
    1 BACKGROUND
    3 RP_THRESH

Returns:
    0 = OK
    1 = no data (time out)
    2 = too long
    3 = too noisy
    4 = too soft
    5 = too loud
    6 = too soon
    -1 = interrupted

Examples:
    // Example 1 - ‘PatGen’ function in STANDARD mode
    PatGen(STANDARD);               // Listen for a word in the SD set
    RecogSD(ctr, templates);        // and try to recognize it

    // Example 2 - ‘PatGen’ function in BACKGROUND mode
    PatGen(BACKGROUND);             // Simultaneous PatGen and Record
    RecordRP(0, NO_THRESH, 0);      // Record the speech
    if (!GetPatGenResult())         // Now check for any errors from PatGen
        RecogSD(ctr, templates);    // and recognize if everything went well

    // Example 3 - ‘PatGen’ function in RP_THRESH mode
    PatGen(RP_THRESH);              // PatGen thresholding only for Record
    RecordRP(0, FULL_THRESH, 0);    // Record the speech while removing all silences
    if (!GetPatGenResult())         // Now check for any errors from PatGen
        RecogSD(ctr, templates);    // and recognize if everything went well

Notes:
    1. PatGen can be run in one of three ways:
       • It can run by itself, which is the STANDARD mode, and is the usual first step in training or recognizing speech.
       • It can also run in conjunction with the Record/Playback technology, so that the actual speech is saved and can be replayed. In this case PatGen runs in a BACKGROUND mode and is started before the recording.
       • If RP_THRESH is specified, PatGen also runs in the BACKGROUND and provides threshold information so the recording can be post-processed to remove initial silence and glitches, but no pattern is actually generated.
    2. PatGen automatically performs a silence measurement, which takes up to 150 milliseconds, before the start of pattern generation.
    3. Certain errors (too soft, too loud and too soon) can be suppressed with the SetPatGenNoErrors function.
    4. Patterns are always recorded into the UNKNOWN buffer.

See also:
    SetStopCondition, DebugPatGen, SetPatGenSepSil, SetPatGenPreSil, SetPatGenNoErrors, RecordRP.
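When logging PatGen results on a host, the numeric codes listed under Returns can be given names as sketched below in ordinary C. The enumerator and function names are invented for this illustration and are not symbols from <ve.veh>. Only the numbers, and the statement that “too soft”, “too loud” and “too soon” are suppressible, come from the text above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Result codes as listed under "Returns" for PatGen. The enumerator
 * names are made up for this sketch; only the numbers are documented. */
enum patgen_result {
    PG_INTERRUPTED = -1,
    PG_OK          = 0,
    PG_NO_DATA     = 1,
    PG_TOO_LONG    = 2,
    PG_TOO_NOISY   = 3,
    PG_TOO_SOFT    = 4,
    PG_TOO_LOUD    = 5,
    PG_TOO_SOON    = 6
};

/* Short text for a result code, or NULL if the code is not documented. */
const char *patgen_result_text(int code)
{
    switch (code) {
    case PG_INTERRUPTED: return "interrupted";
    case PG_OK:          return "OK";
    case PG_NO_DATA:     return "no data (time out)";
    case PG_TOO_LONG:    return "too long";
    case PG_TOO_NOISY:   return "too noisy";
    case PG_TOO_SOFT:    return "too soft";
    case PG_TOO_LOUD:    return "too loud";
    case PG_TOO_SOON:    return "too soon";
    default:             return NULL;
    }
}

/* True for the errors the manual says SetPatGenNoErrors can suppress. */
bool patgen_error_is_suppressible(int code)
{
    return code == PG_TOO_SOFT || code == PG_TOO_LOUD || code == PG_TOO_SOON;
}
```

Separating the suppressible errors makes it easy to decide whether a retry prompt is worthwhile or whether the condition (for example “too noisy”) needs different handling.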

The PatGenW Function

Usage:
    SINT8 PatGenW(UINT8 runHow, WEIGHTS *weightTable)

Description:
    Generates a pattern for SI recognition.

Arguments:
    runHow              weightTable
    0 STANDARD          A pointer to a speaker independent weights table.
    1 BACKGROUND
    3 RP_THRESH

Returns:
    0 = OK
    1 = no data (time out)
    2 = too long
    3 = too noisy
    4 = too soft
    5 = too loud
    6 = too soon
    -1 = interrupted

Examples:
    // Example 1 - ‘PatGenW’ function in STANDARD mode
    PatGenW(STANDARD, &WTSI6);          // Listen for a word in the SI6 set
    Recog(&WTSI6);                      // and try to recognize it

    // Example 2 - ‘PatGenW’ function in BACKGROUND mode
    PatGenW(BACKGROUND, &WTDigits);     // Simultaneous PatGen and Record
    RecordRP(0, NO_THRESH, 0);          // Record the speech
    if (!GetPatGenResult())             // Now check for any errors from PatGen
        Recog(&WTDigits);               // and recognize if everything went well

    // Example 3 - ‘PatGenW’ function in RP_THRESH mode
    PatGenW(RP_THRESH, &WTDigits);      // PatGen thresholding only for Record
    RecordRP(0, FULL_THRESH, 0);        // Record the speech while removing silences
    if (!GetPatGenResult())             // Now check for any errors from PatGen
        Recog(&WTDigits);               // and recognize if everything went well

Notes:
    1. PatGenW can be run in one of three ways:
       • It can run by itself, which is the STANDARD mode, and is the usual first step in training or recognizing speech.
       • It can also run in conjunction with the Record/Playback technology, so that the actual speech is saved and can be replayed. In this case PatGenW runs in a BACKGROUND mode and is started before the recording.
       • If RP_THRESH is specified, PatGenW also runs in the BACKGROUND and provides threshold information so the recording can be post-processed to remove initial silence and glitches, but no pattern is actually generated.
    2. PatGenW automatically performs a silence measurement, which takes up to 150 milliseconds, before the start of pattern generation.
    3. Certain errors (too soft, too loud and too soon) can be suppressed with the SetPatGenNoErrors function.
    4. Patterns are always recorded into the UNKNOWN buffer.

See also:
    SetStopCondition, DebugPatGen, SetPatGenSepSil, SetPatGenPreSil, SetPatGenNoErrors, RecordRP.

The GetPatGenResult Function

Usage:
    SINT8 GetPatGenResult(void)

Description:
    Returns the result from the most recent call to the PatGen, PatGenW, PatGenWS, CLPatgen or CLPatGenW functions.

Arguments:
    void

Returns:
    0 = OK
    1 = no data (time out)
    2 = too long
    3 = too noisy
    4 = too soft
    5 = too loud
    6 = too soon
    -1 = interrupted

Example:
    PatGenW(BACKGROUND, &WTDigits);     // Simultaneous PatGen and Record
    RecordRP(0, NO_THRESH, 0);          // Record the speech
    if (!GetPatGenResult())             // Now check for any errors from PatGen
        Recog(&WTDigits);               // and recognize if everything went well

Notes:
    This function is intended for use when PatGen, PatGenW, PatGenWS, CLPatgen or CLPatGenW have been run in BACKGROUND or RP_THRESH mode.

See also:
    SetStopCondition, DebugPatGen, SetPatGenSepSil, SetPatGenPreSil, SetPatGenNoErrors.

The DebugPatGen Function

Usage:
    void DebugPatGen(void)

Description:
    Outputs debug information from the most recent call to the PatGen, PatGenW, PatGenWS, CLPatgen or CLPatGenW functions.

Arguments:
    void

Returns:
    void

Example:
    // This sets up to output the PatGen result and silence info.
    SetDebug(SILENCE, SPEECH_OUTPUT);
    SetDebug(PATGEN_RESULT, SPEECH_OUTPUT);
    PatGen(STANDARD);               // Listen for a word in the SD set
    DebugPatGen();                  // output debug information

Notes:
    1) Debug output can be “spoken” to the PWM and DAC output or sent as serial data to the RS-232 port, as selected by the SetDebug function.
    2) This function will output the PATGEN_RESULT and SILENCE information.
    3) Debug output only occurs if the most recent PatGen, PatGenW, PatGenWS, CLPatgen or CLPatGenW result was non-zero.

See also:
    SetDebug, PatGen, PatGenW, PatGenWS, CLPatgen, CLPatGenW.

The SetPatGenMaxWords Function

Usage:

    BOOL SetPatGenMaxWords(UINT8 maxWords)

Description:
    Controls the maximum number of words allowed by PatGen, PatGenW and PatGenWS.

Arguments:
    maxWords = The maximum number of words allowed (1, 2 or 3).

Returns:
    TRUE if function call successful
    FALSE if maxWords is less than 1 or greater than 3

Example:
    SetPatGenMaxWords(1);           // only listen for 1 word max.
    PatGen(STANDARD);               // Listen for a word in the SD set

Notes:
    1) Setting maxWords to 1 improves the apparent response time for single word applications.
    2) The VE-C default is SetPatGenMaxWords(2).

See also:
    SetPatGenSepSil

The SetPatGenSepSil Function

Usage:
    BOOL SetPatGenSepSil(UINT8 sepSil)

Description:
    Used in conjunction with SetPatGenMaxWords to control the maximum amount of word separation when more than one word is recorded in PatGen, PatGenW or PatGenWS.

Arguments:
    sepSil = The maximum amount of separation measured in 12.5 millisecond blocks (range 1 to 255).

Returns:
    TRUE if function call successful
    FALSE if sepSil is outside the range 1 to 255

Example:
    SetPatGenMaxWords(2);           // listen for 1 or 2 words.
    SetPatGenSepSil(80);            // allow up to 1 second between words
    PatGen(STANDARD);               // Listen for a word in the SD set

Notes:
    1) If SetPatGenMaxWords(1) is in effect, this function has no effect.
    2) The VE-C default is SetPatGenSepSil(40), or approximately 500 milliseconds.

See also:
    SetPatGenMaxWords

The SetPatGenPreSil Function

Usage:
    BOOL SetPatGenPreSil(UINT16 preSil)

Description:
    Controls the amount of time before a PatGen, PatGenW or PatGenWS no data (timeout) result. If nothing is spoken before the preSil timeout duration, PatGen, PatGenW and PatGenWS will return a value of 1 (no data).

Arguments:
    preSil = The timeout period measured in 12.5 millisecond blocks (range 1 to 65535).

Returns:
    TRUE if function call successful
    FALSE if preSil is outside the range 1 to 65535

Example:
    SetPatGenPreSil(800);           // allow up to 10 seconds for timeout
    PatGen(STANDARD);               // Listen for a word in the SD set

Notes:
    The VE-C default is SetPatGenPreSil(256), or approximately 3.2 seconds.
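Because SetPatGenSepSil and SetPatGenPreSil both take their argument in 12.5 millisecond blocks, it can be convenient to compute the argument from a duration in milliseconds. The helper functions below are a host-side C sketch with invented names. They are not part of VE-C, and they clamp to the documented ranges (1 to 255 and 1 to 65535) rather than reporting an error.

```c
#include <stdint.h>

/* Clamp v to the inclusive range [lo, hi]. */
static uint32_t clamp_u32(uint32_t v, uint32_t lo, uint32_t hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Convert milliseconds to 12.5 ms blocks, rounding to nearest.
 * 12.5 ms = 25/2 ms, so blocks = ms * 2 / 25. The input is capped
 * first so the multiplication cannot overflow. */
static uint32_t ms_to_blocks(uint32_t ms)
{
    if (ms > 1000000u)
        ms = 1000000u;
    return (ms * 2u + 12u) / 25u;
}

/* Argument for SetPatGenSepSil: valid range 1..255. */
uint8_t sep_sil_blocks(uint32_t ms)
{
    return (uint8_t)clamp_u32(ms_to_blocks(ms), 1u, 255u);
}

/* Argument for SetPatGenPreSil: valid range 1..65535. */
uint16_t pre_sil_blocks(uint32_t ms)
{
    return (uint16_t)clamp_u32(ms_to_blocks(ms), 1u, 65535u);
}

/* Inverse: duration in tenths of a millisecond, keeping the .5 exact. */
uint32_t blocks_to_tenths_ms(uint32_t blocks)
{
    return blocks * 125u;
}
```

For the values used in this manual, 1000 ms gives 80 blocks, 500 ms gives 40 blocks, 10 seconds gives 800 blocks and 3.2 seconds gives 256 blocks. The longest separation SetPatGenSepSil can express is 255 blocks, or 3187.5 ms.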

The SetPatGenNoErrors Function

Usage:
    BOOL SetPatGenNoErrors(BOOL OnOff)

Description:
    Controls whether errors due to “too soft”, “too loud” or “too soon” are ignored by PatGen, PatGenW and PatGenWS.

Arguments:
    OnOff = TRUE (ignore errors) or FALSE (allow errors)

Returns:
    TRUE if function call successful
    FALSE if the input argument is invalid

Example:
    SetPatGenNoErrors(TRUE);        // Ignore some errors
    PatGen(STANDARD);               // Listen for a word in the SD set

Notes:
    1) If OnOff is TRUE, and an ignorable error occurs, then PatGen, PatGenW and PatGenWS continue listening until the user says another word or until the function times out.
    2) The VE-C default is SetPatGenNoErrors(FALSE).

See also:
    SetPatGenPreSil

Speaker Independent Recognition

The following functions return multiple results that may be obtained by calling accessor functions (GetRecog…) or spoken with the DebugRecog function. These results are only valid until the next call to a Recog/Prior/Train function.

The Recog Function

Usage:
    UINT8 Recog(WEIGHTS *weights)

Description:
    Performs speaker independent recognition against a given weights set.

Arguments:
    *weights = a pointer to a Speaker Independent Weight Set (.VEW) file

Returns:
    Index of best match: (range = 0 to (number of words in weight set - 1))

Example:
    PatGenW(STANDARD, &WTSI6);      // Listen for a word in the SI6 set
    Recog(&WTSI6);                  // and try to recognize it

Notes:
    1) The sum of all the scores for all the members of an SI set will always equal 100 (minus rounding errors).
    2) Recog returns multiple results that may be obtained by calling the following accessor functions: GetRecogMatch1, GetRecogLevel1, GetRecogMatch2, GetRecogLevel2, GetRecogMatch3, GetRecogLevel3 and GetRecogSetSize, or spoken with the DebugRecog function.
    3) The result from Recog is the same as from the accessor function GetRecogMatch1.

See also:
    GetRecogMatch1, GetRecogLevel1, GetRecogMatch2, GetRecogLevel2, GetRecogMatch3, GetRecogLevel3, GetRecogSetSize, DebugRecog

The Prior Function

Usage:
    UINT8 Prior(UINT8 favorite, BOOL emphasize)

Description:
    Postprocesses the results of the most recent Recog to emphasize or remove a specific favorite answer.

Arguments:
    favorite = The index number to be emphasized or removed from consideration in the results of the previous Recog call.
    emphasize = TRUE (emphasize) or FALSE (remove)

Returns:
    Index of new best match: (range = 0 to (number of words in weight set - 1))

Example:
    // Example 1 - Give emphasis to a particular subset of answers
    Recog(&WTDigits);               // recognize against the digits set
    Prior(1, TRUE);                 // Give priority to digits 1, 2 and 3
    Prior(2, TRUE);
    Prior(3, TRUE);

    // Example 2 - Remove a particular subset of answers
    Recog(&WTDigits);               // recognize against the digits set
    Prior(0, FALSE);                // Remove ‘0’ as a possible answer

Notes:
    1) The sum of all the scores for all the members of an SI set will always equal 100 (minus rounding errors).
    2) Prior(n, TRUE) works by doubling the probability score of index ‘n’ at the expense of the rest of the members in the same SI set.
    3) Prior(n, FALSE) works by “zero”-ing out the probability score of index ‘n’ and redistributing its percentage points to the rest of the members in the SI set.
    4) Use Prior when there is a single correct or preferred answer, or a single answer that should be removed from consideration.
    5) The result from Prior is the same as from the accessor function GetRecogMatch1.

See also:
    Recog, GetRecogMatch1
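The effect described in Notes 2 and 3 can be illustrated with a host-side C model of the score adjustment. This is a sketch of the idea only, not Sensory's implementation. It assumes the remaining scores are rescaled in proportion to their current values so that the total stays at 100, which the notes do not specify, and the function name is invented.

```c
#include <stddef.h>

/*
 * Model of Prior() on an array of scores that sum to 100.
 * emphasize != 0: double scores[fav] (capped at 100).
 * emphasize == 0: set scores[fav] to 0.
 * ASSUMPTION: the other scores are rescaled in proportion to their
 * current values so the total stays at 100. If every other score is
 * already 0 they are left unchanged.
 * Returns the index of the new best score, or -1 on bad input.
 */
int prior_model(double *scores, size_t n, size_t fav, int emphasize)
{
    double rest = 0.0, target, new_fav;
    size_t i, best = 0;

    if (scores == NULL || n == 0 || fav >= n)
        return -1;
    for (i = 0; i < n; i++)
        if (i != fav)
            rest += scores[i];

    new_fav = emphasize ? scores[fav] * 2.0 : 0.0;
    if (new_fav > 100.0)
        new_fav = 100.0;
    target = 100.0 - new_fav;        /* what the others must now sum to */

    for (i = 0; i < n; i++) {
        if (i == fav)
            scores[i] = new_fav;
        else if (rest > 0.0)
            scores[i] *= target / rest;
    }
    for (i = 1; i < n; i++)
        if (scores[i] > scores[best])
            best = i;
    return (int)best;
}
```

Under this model, scores of 50, 30 and 20 with index 1 emphasized become roughly 28.6, 60 and 11.4, so index 1 becomes the best match. Removing index 0 from the same starting scores gives 0, 60 and 40.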

The GetRecogMatch1 Function

Usage:
    UINT8 GetRecogMatch1(void)

Description:
    Returns the index number of the best match for the most recent Recog or Prior function call.

Arguments:
    void

Returns:
    Index of best match: (range = 0 to (number of words in weight set - 1))

Example:
    uint8 bestMatch;
    Recog(&WTSI6);                      // recognize against the SI6 set
    bestMatch = GetRecogMatch1();       // which was it?

Notes:
    1) The GetRecogLevel1 function returns the confidence score for the companion GetRecogMatch1 function result.
    2) The result from GetRecogMatch1 is only valid until the next call to a Recog, Prior or Train function.

See also:
    Recog, Prior, GetRecogLevel1

The GetRecogLevel1 Function

Usage:

UINT8 GetRecogLevel1(void)

Description:
Returns the confidence level of the best match for the most recent Recog or Prior function call.

Arguments:
void

Returns:
Confidence score in the range 0-100, with 100 being the highest.

Example:
Recog(&WTSI6);                  // recognize against the SI6 set
if (GetRecogLevel1() > 90) {    // there was a successful match…
    …
}

Notes:
1) The GetRecogLevel1 function returns the confidence score for the companion GetRecogMatch1 function result.
2) The result from GetRecogLevel1 is only valid until the next call to a Recog, Prior or Train function.

See also:
Recog, Prior, GetRecogMatch1

The GetRecogMatch2 Function

Usage:

UINT8 GetRecogMatch2(void)

Description:
Returns the index number of the 2nd best match for the most recent Recog or Prior function call.

Arguments:
void

Returns:
Index of 2nd best match: (range = 0 to (number of words in weight set – 1)), or 255 if there is no 2nd best match (i.e. GetRecogLevel2() == 0).

Example:
uint8 bestMatch2;
Recog(&WTSI6);                   // recognize against the SI6 set
bestMatch2 = GetRecogMatch2();   // which was the 2nd best match?

Notes:
1) The GetRecogLevel2 function returns the confidence score for the companion GetRecogMatch2 function result.
2) The result from GetRecogMatch2 is only valid until the next call to a Recog, Prior or Train function.

See also:
Recog, Prior, GetRecogLevel2

The GetRecogLevel2 Function

Usage:

UINT8 GetRecogLevel2(void)

Description:
Returns the confidence level of the 2nd best match for the most recent Recog or Prior function call.

Arguments:
void

Returns:
Confidence score in the range 0-100, with 100 being the highest.

Example:
Recog(&WTSI6);                  // recognize against the SI6 set
if (GetRecogLevel2() > 90) {    // there was a good 2nd best match…
    …
}

Notes:
1) The GetRecogLevel2 function returns the confidence score for the companion GetRecogMatch2 function result.
2) The result from GetRecogLevel2 is only valid until the next call to a Recog, Prior or Train function.

See also:
Recog, Prior, GetRecogMatch2

The GetRecogMatch3 Function

Usage:

UINT8 GetRecogMatch3(void)

Description:
Returns the index number of the 3rd best match for the most recent Recog or Prior function call.

Arguments:
void

Returns:
Index of 3rd best match: (range = 0 to (number of words in weight set – 1)), or 255 if there is no 3rd best match (i.e. GetRecogLevel3() == 0).

Example:
uint8 bestMatch3;
Recog(&WTSI6);                   // recognize against the SI6 set
bestMatch3 = GetRecogMatch3();   // which was the 3rd best match?

Notes:
1) The GetRecogLevel3 function returns the confidence score for the companion GetRecogMatch3 function result.
2) The result from GetRecogMatch3 is only valid until the next call to a Recog, Prior or Train function.

See also:
Recog, Prior, GetRecogLevel3

The GetRecogLevel3 Function

Usage:

UINT8 GetRecogLevel3(void)

Description:
Returns the confidence level of the 3rd best match for the most recent Recog or Prior function call.

Arguments:
void

Returns:
Confidence score in the range 0-100, with 100 being the highest.

Example:
Recog(&WTSI6);                  // recognize against the SI6 set
if (GetRecogLevel3() > 90) {    // there was a good 3rd best match…
    …
}

Notes:
1) The GetRecogLevel3 function returns the confidence score for the companion GetRecogMatch3 function result.
2) The result from GetRecogLevel3 is only valid until the next call to a Recog, Prior or Train function.

See also:
Recog, Prior, GetRecogMatch3


The GetRecogSetSize Function

Usage:
UINT8 GetRecogSetSize(void)

Description:
Returns the number of members in the SI recognition set for the most recent Recog or Prior function call.

Arguments:
void

Returns:
Number of members in the SI recognition set. (range 1-N)

Example:
uint8 match;
match = Recog(&WTSI6);                   // recognize against the SI6 set
if (match == GetRecogSetSize() - 1) {
    …   // The last index number in the SI6 set was matched
}

Notes:
1) The result from GetRecogSetSize is only valid until the next call to a Recog, Prior or Train function.
2) SI weight sets usually have a NOTA or REJECT category as the last member.

See also:
Recog, Prior
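
The SI accessor functions are typically combined into one accept/reject decision. The following is a minimal sketch in standard C, written as a pure function so it can be tried on a host PC. The function name, the enum and both thresholds are hypothetical, and it assumes the weight set follows the usual convention of a NOTA or REJECT category as its last member.

```c
typedef unsigned char UINT8;

typedef enum {
    SI_ACCEPT,            /* best match is usable */
    SI_REJECT_NOTA,       /* best match is the NOTA/REJECT member */
    SI_REJECT_LOW,        /* best match is not confident enough */
    SI_REJECT_AMBIGUOUS   /* 2nd best match is too close to the best */
} SiDecision;

/* On the device the arguments would come from the accessors after Recog:
 *   Recog(&WTSI6);
 *   d = classify_si_result(GetRecogMatch1(), GetRecogLevel1(),
 *                          GetRecogLevel2(), GetRecogSetSize(), 90, 20);
 * minLevel and minMargin are application choices, not values from the manual. */
static SiDecision classify_si_result(UINT8 match1, UINT8 level1, UINT8 level2,
                                     UINT8 setSize, UINT8 minLevel, UINT8 minMargin)
{
    /* ASSUMPTION: the last member of the set is the NOTA/REJECT category */
    if (setSize > 0 && match1 == (UINT8)(setSize - 1)) {
        return SI_REJECT_NOTA;
    }
    if (level1 < minLevel) {
        return SI_REJECT_LOW;
    }
    /* require a clear gap between the best and 2nd best confidence */
    if (level1 < level2 || (UINT8)(level1 - level2) < minMargin) {
        return SI_REJECT_AMBIGUOUS;
    }
    return SI_ACCEPT;
}
```

Because the scores of a set sum to about 100, a high minLevel already implies a gap to the 2nd best match; the separate margin check matters mostly when minLevel is set low.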

The DebugRecog Function

Usage:
void DebugRecog(void)

Description:
Outputs debug information from the most recent call to Recog.

Arguments:
void

Returns:
void

Example:
// This sets up to output the Recog debug results to the speaker.
SetDebug(RECOG_RESULT, SPEECH_OUTPUT);   // output to the speaker
Recog(&WTSI6);                           // recognize against the SI6 set
DebugRecog();                            // output debug information

Notes:
1) Debug output can be “spoken” to the PWM and DAC output or sent as serial data to the RS-232 port by the SetDebug function.
2) This function will output the RECOG_RESULT and SILENCE information.
3) DebugRecog output = Match1, Level1, Match2, Level2, Match3 and Level3

See also:
SetDebug, Recog

Speaker Dependent Recognition

Speaker Dependent Recognition and Speaker Verification both use the PutTemplate, GetTemplate and MaskTemplate functions described below.

The PutTemplate Function

Usage:
UINT8 PutTemplate(UINT8 sourceTemplate, UINT8 index, TEMPLATE *baseTemplate)

Description:
Copies a PatGen type pattern from an internal Voice Extreme IC buffer to a TEMPLATE array in flash memory.

Arguments:
sourceTemplate = One of two internal template buffers. (UNKNOWN or KNOWN)
index = The array element of baseTemplate where the pattern will be copied
baseTemplate = the TEMPLATE array in flash memory

Returns:
0 if function call successful
1 if sourceTemplate is invalid (i.e. (sourceTemplate != UNKNOWN) && (sourceTemplate != KNOWN))

Example:
// This records twice for SD training.
// NOTE: error handling has been left out
// of this example for ease of reading.
TEMPLATE tempTemplates[1];                // create an array for temp. storage
PatGen(STANDARD);                         // say it once
PutTemplate(UNKNOWN, 0, tempTemplates);   // save it
PatGen(STANDARD);                         // say it again
GetTemplate(KNOWN, 0, tempTemplates);     // get the 1st utterance
TrainSD(UNKNOWN, KNOWN, UNKNOWN);         // average them together

Notes:
1) Normally only patterns from PatGen, CLPatGen and PatGenWS are saved in the flash memory.
2) Use the GetTemplate function to retrieve patterns from the flash memory.

See also:
PatGen, CLPatGen, PatGenWS, GetTemplate, TrainSD, TrainSV

The GetTemplate Function

Usage:
UINT8 GetTemplate(UINT8 destTemplate, UINT8 index, TEMPLATE *baseTemplate)

Description:
Copies a PatGen type pattern from a TEMPLATE array in flash memory to an internal Voice Extreme IC buffer.

Arguments:
destTemplate = One of two internal template buffers (UNKNOWN or KNOWN)
index = The array element of baseTemplate from which the pattern will be copied
baseTemplate = the TEMPLATE array in flash memory

Returns:
0 if function call successful
1 if destTemplate is invalid (i.e. (destTemplate != UNKNOWN) && (destTemplate != KNOWN))

Example:
// This records twice for SD training.
// NOTE: error handling has been left out
// of this example for ease of reading.
TEMPLATE tempTemplates[1];                // create an array for temp. storage
PatGen(STANDARD);                         // say it once
PutTemplate(UNKNOWN, 0, tempTemplates);   // save it
PatGen(STANDARD);                         // say it again
GetTemplate(KNOWN, 0, tempTemplates);     // get the 1st utterance
TrainSD(UNKNOWN, KNOWN, UNKNOWN);         // average them together

Notes:
1) Normally only patterns from PatGen, CLPatGen and PatGenWS are saved in the flash memory.
2) Use the PutTemplate function to copy patterns into the flash memory.

See also:
PatGen, CLPatGen, PatGenWS, PutTemplate, TrainSD, TrainSV

The MaskTemplate Function

Usage:
void MaskTemplate(BOOL enable, UINT8 index, TEMPLATE *baseTemplate)

Description:
Disables or enables a PatGen type pattern stored in a TEMPLATE array in flash memory.

Arguments:
enable = FALSE (disable) or TRUE (enable) the pattern
index = The array element of baseTemplate that holds the pattern to be disabled or enabled.
baseTemplate = the TEMPLATE array in flash memory

Returns:
void

Example:
// This will remove the first pattern in an array before
// calling RecogSD.
// NOTE: This example assumes that the first pattern is valid.
#define PATCOUNT 5
TEMPLATE myTemplates[PATCOUNT];          // create array for SD recognition
…
MaskTemplate(FALSE, 0, myTemplates);     // mask out pattern #0
RecogSD(PATCOUNT, myTemplates);          // Now recognize
MaskTemplate(TRUE, 0, myTemplates);      // restore pattern #0

Notes:
1) Normally only patterns from PatGen, CLPatGen and PatGenWS are saved in the flash memory.
2) MaskTemplate works by overwriting the first byte of the Pattern in the flash memory. Normally this first byte is 0x10.
3) It is possible to enable a pattern which does not exist. In this case the pattern will still not be recognized during RecogSD.

See also:
RecogSD, RecogSV

The SetSDPerformance Function

Usage:

BOOL SetSDPerformance(UINT8 performanceLevel)

Description:
Controls the tradeoff between recognition speed and accuracy for RecogSD.

Arguments:
performanceLevel = The performance setting (1, 2, 3, 4 or 5).

Returns:
TRUE if function call successful
FALSE if performanceLevel is out of range (performanceLevel < 1 or performanceLevel > 5)

Example:
#define PATCOUNT 5
TEMPLATE myTemplates[PATCOUNT];      // create array for SD recognition
SetSDPerformance(5);                 // Set for most strict setting
RecogSD(PATCOUNT, myTemplates);      // Now recognize

Notes:
1) Higher numbers give better accuracy but have a stricter tolerance in the way the word must be spoken and take more time to recognize.
2) The VE-C default is SetSDPerformance(3)

See also:
RecogSD

The TrainSD Function

Usage:
UINT8 TrainSD(UINT8 srcTemplateA, UINT8 srcTemplateB, UINT8 dstTemplate)

Description:
Compares two patterns recorded with PatGen or PatGenWS and averages them into a third template suitable for use with RecogSD or Wordspot.

Arguments:
srcTemplateA, srcTemplateB and dstTemplate = One of two internal template buffers (UNKNOWN or KNOWN)

Returns:
0 – The patterns are similar enough to each other to be successfully averaged
1 – The patterns are too dissimilar

Example:
// This records twice for SD training.
// NOTE: error handling has been left
// out of this example for ease of reading.
TEMPLATE tempTemplates[1];                // create an array for temp. storage
PatGen(STANDARD);                         // say it once
PutTemplate(UNKNOWN, 0, tempTemplates);   // save it
PatGen(STANDARD);                         // say it again
GetTemplate(KNOWN, 0, tempTemplates);     // get the 1st utterance
TrainSD(UNKNOWN, KNOWN, UNKNOWN);         // average them together

Notes:
1) In most cases, the proper way to call this function is: TrainSD(UNKNOWN, KNOWN, UNKNOWN);
2) TrainSD returns an additional result that may be obtained by calling the following accessor function: GetTrainSDScore

See also:
PatGen, PutTemplate, GetTemplate, GetTrainSDScore

The GetTrainSDScore Function

Usage:

UINT8 GetTrainSDScore(void)

Description:
Returns the comparison score from the most recent call to TrainSD.

Arguments:
void

Returns:
The comparison score (range 0 to 255)

Example:
// This example averages two patterns and speaks the
// comparison score.
TrainSD(UNKNOWN, KNOWN, UNKNOWN);   // average them together
DebugH8(GetTrainSDScore());         // speak the debug score

Notes:
1) The lower the score, the better the match between the two trained patterns. In theory 0x00 is a perfect match and 0xFF is the worst possible match. In practice, the effective range is smaller.
2) The maximum threshold score for acceptance in TrainSD is about 0x96.

See also:
TrainSD
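
The training examples in this section leave out error handling. The following is a minimal sketch, in standard C for a host PC, of the same two-utterance sequence with the documented return values checked at each step. All function bodies here are hypothetical stand-ins that only mimic the documented return codes; the TEMPLATE layout, the constant values, the PatGen return type, the final PutTemplate of the averaged pattern and the helper name train_word are assumptions.

```c
typedef unsigned char UINT8;
typedef struct { UINT8 data[1]; } TEMPLATE;   /* placeholder layout (assumption) */

enum { UNKNOWN = 0, KNOWN = 1 };   /* values assumed for illustration */
enum { STANDARD = 0 };

/* ---- hypothetical host-side stand-ins for the VE-C functions ---- */
static UINT8 mockTrainResult = 0;      /* what TrainSD will return */
static UINT8 mockTrainScore = 0x20;    /* what GetTrainSDScore will return */

static UINT8 PatGen(UINT8 mode) { (void)mode; return 0; }
static UINT8 PutTemplate(UINT8 src, UINT8 index, TEMPLATE *base)
{
    (void)index; (void)base;
    return (src == UNKNOWN || src == KNOWN) ? 0 : 1;
}
static UINT8 GetTemplate(UINT8 dst, UINT8 index, TEMPLATE *base)
{
    (void)index; (void)base;
    return (dst == UNKNOWN || dst == KNOWN) ? 0 : 1;
}
static UINT8 TrainSD(UINT8 a, UINT8 b, UINT8 dst)
{
    (void)a; (void)b; (void)dst;
    return mockTrainResult;
}
static UINT8 GetTrainSDScore(void) { return mockTrainScore; }

/* Records a word twice, averages the two patterns and stores the result
 * in store[slot]. Returns 0 on success or the number of the failing step. */
static int train_word(TEMPLATE *store, UINT8 slot, UINT8 *scoreOut)
{
    TEMPLATE temp[1];

    (void)PatGen(STANDARD);                      /* say it once (result not checked here) */
    if (PutTemplate(UNKNOWN, 0, temp) != 0) {
        return 1;                                /* could not save 1st utterance */
    }
    (void)PatGen(STANDARD);                      /* say it again */
    if (GetTemplate(KNOWN, 0, temp) != 0) {
        return 2;                                /* could not reload 1st utterance */
    }
    if (TrainSD(UNKNOWN, KNOWN, UNKNOWN) != 0) {
        *scoreOut = GetTrainSDScore();           /* report how far apart they were */
        return 3;                                /* utterances too dissimilar */
    }
    *scoreOut = GetTrainSDScore();
    return (PutTemplate(UNKNOWN, slot, store) != 0) ? 4 : 0;
}
```

A caller would usually prompt the user to repeat the word when step 3 fails, since that is the case in which the two utterances did not match.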

The RecogSD Function

Usage:
UINT8 RecogSD(UINT8 numPatterns, TEMPLATE *baseTemplate)

Description:
Performs speaker dependent recognition.

Arguments:
numPatterns = the number of contiguous patterns in the flash memory TEMPLATE array to compare.
baseTemplate = the TEMPLATE array in flash memory.

Returns:
0 – Successful SD recognition.
1 – Failed SD recognition.
2 – Uncertain SD recognition (two or more equally good matches).

Example:
#define PATCOUNT 5
TEMPLATE myTemplates[PATCOUNT];      // create array for SD recognition
RecogSD(PATCOUNT, myTemplates);      // Now recognize

Notes:
1) Use the SetSDPerformance function to control the strictness of the recognition.
2) The UNKNOWN internal template buffer is always used during SD recognition.
3) The numPatterns count should include any patterns disabled with the MaskTemplate function, but these disabled patterns will not be compared.
4) RecogSD returns multiple results that may be obtained by calling the following accessor functions: GetRecogSDResult, GetRecogSDClass1, GetRecogSDScore1, GetRecogSDClass2, GetRecogSDScore2, GetRecogSDDiff and GetRecogSDSetSize, or spoken with the DebugRecogSD function.
5) The result from RecogSD is the same as from the accessor function GetRecogSDResult.

See also:
SetSDPerformance, MaskTemplate, GetRecogSDResult, GetRecogSDClass1, GetRecogSDScore1, GetRecogSDClass2, GetRecogSDScore2, GetRecogSDDiff, GetRecogSDSetSize, DebugRecogSD

The GetRecogSDResult Function

Usage:
UINT8 GetRecogSDResult(void)

Description:
Returns the result for the most recent RecogSD function call.

Arguments:
void

Returns:
0 – Successful SD recognition.
1 – Failed SD recognition.
2 – Uncertain SD recognition (two or more equally good matches).

Example:
#define PATCOUNT 5
TEMPLATE myTemplates[PATCOUNT];      // create array for SD recognition
RecogSD(PATCOUNT, myTemplates);      // Now recognize
DebugH8(GetRecogSDResult());         // output the result from RecogSD

Notes:
The result from GetRecogSDResult is only valid until the next call to a RecogSD or TrainSD function.

See also:
RecogSD, TrainSD

The GetRecogSDClass1 Function

Usage:
UINT8 GetRecogSDClass1(void)

Description:
Returns the index number of the best match for the most recent RecogSD function call.

Arguments:
void

Returns:
Index of best match: (range = 0 to numPatterns-1) or if RecogSD failed, then returns numPatterns.

Example:
#define PATCOUNT 5
TEMPLATE myTemplates[PATCOUNT];      // create array for SD recognition
RecogSD(PATCOUNT, myTemplates);      // Now recognize
DebugH8(GetRecogSDClass1());         // output best match from RecogSD

Notes:
1) The GetRecogSDScore1 function returns the confidence score for the companion GetRecogSDClass1 function result.
2) The result from GetRecogSDClass1 is only valid until the next call to a RecogSD or TrainSD function.

See also:
RecogSD, TrainSD, GetRecogSDScore1

The GetRecogSDScore1 Function

Usage:

UINT8 GetRecogSDScore1(void)

Description:
Returns the best recognition score from the most recent call to RecogSD.

Arguments:
void

Returns:
The best recognition score (range 0 to 255)

Example:
#define PATCOUNT 5
TEMPLATE myTemplates[PATCOUNT];      // create array for SD recognition
RecogSD(PATCOUNT, myTemplates);      // Now recognize
DebugH8(GetRecogSDScore1());         // output best score from RecogSD

Notes:
1) The GetRecogSDScore1 function returns the confidence score for the companion GetRecogSDClass1 function result.
2) The lower the score, the better the recognition match. In theory 0x00 is a perfect match and 0xFF is the worst possible match. In practice, the effective range is smaller.
3) The maximum threshold score for acceptance in RecogSD is about 0x96, but will vary depending on how the SetSDPerformance function has been called.
4) The result from GetRecogSDScore1 is only valid until the next call to a RecogSD or TrainSD function.

See also:
RecogSD, TrainSD, GetRecogSDClass1, SetSDPerformance

The GetRecogSDClass2 Function

Usage:

UINT8 GetRecogSDClass2(void)

Description:
Returns the index number of the 2nd best match for the most recent RecogSD function call.

Arguments:
void

Returns:
Index of 2nd best match: (range = 0 to numPatterns-1) or if RecogSD failed, then returns numPatterns.

Example:
#define PATCOUNT 5
TEMPLATE myTemplates[PATCOUNT];      // create array for SD recognition
RecogSD(PATCOUNT, myTemplates);      // Now recognize
DebugH8(GetRecogSDClass2());         // output 2nd best from RecogSD

Notes:
1) The GetRecogSDScore2 function returns the confidence score for the companion GetRecogSDClass2 function result.
2) The result from GetRecogSDClass2 is only valid until the next call to a RecogSD or TrainSD function.

See also:
RecogSD, TrainSD, GetRecogSDScore2

The GetRecogSDScore2 Function

Usage:

UINT8 GetRecogSDScore2(void)

Description:
Returns the 2nd best recognition score from the most recent call to RecogSD.

Arguments:
void

Returns:
The 2nd best recognition score (range 0 to 255)

Example:
#define PATCOUNT 5
TEMPLATE myTemplates[PATCOUNT];      // create array for SD recognition
RecogSD(PATCOUNT, myTemplates);      // Now recognize
DebugH8(GetRecogSDScore2());         // output 2nd best from RecogSD

Notes:
1) The GetRecogSDScore2 function returns the confidence score for the companion GetRecogSDClass2 function result.
2) The lower the score, the better the recognition match. In theory 0x00 is a perfect match and 0xFF is the worst possible match. In practice, the effective range is smaller.
3) The maximum threshold score for acceptance in RecogSD is about 0x96, but will vary depending on how the SetSDPerformance function has been called.
4) The result from GetRecogSDScore2 is only valid until the next call to a RecogSD or TrainSD function.

See also:
RecogSD, TrainSD, GetRecogSDClass2, SetSDPerformance

The GetRecogSDDiff Function

Usage:

UINT8 GetRecogSDDiff(void)

Description:
Returns the difference between the two best scores from the most recent call to RecogSD.

Arguments:
void

Returns:
The difference in recognition scores (range 0 to 255)

Example:
#define PATCOUNT 5
TEMPLATE myTemplates[PATCOUNT];      // create array for SD recognition
RecogSD(PATCOUNT, myTemplates);      // Now recognize
DebugH8(GetRecogSDDiff());           // output score diff. from RecogSD

Notes:
1) This function is equivalent to: (GetRecogSDScore2() - GetRecogSDScore1());
2) This function is most useful when RecogSD returns an uncertain response.
3) The result from GetRecogSDDiff is only valid until the next call to a RecogSD or TrainSD function.

See also:
RecogSD, TrainSD

The GetRecogSDSetSize Function

Usage:

UINT8 GetRecogSDSetSize(void)

Description:
Returns the number of patterns argument (numPatterns) from the most recent call to RecogSD.

Arguments:
void

Returns:
The number of patterns.

Example:
#define PATCOUNT 5
TEMPLATE myTemplates[PATCOUNT];      // create array for SD recognition
RecogSD(PATCOUNT, myTemplates);      // Now recognize
DebugH8(GetRecogSDSetSize());        // How many patterns?

Notes:
The result from GetRecogSDSetSize is only valid until the next call to a RecogSD or TrainSD function.

See also:
RecogSD, TrainSD
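
Taken together, the RecogSD result codes and accessors suggest a simple way to act on a recognition. The following is a minimal sketch in standard C, written as a pure function so it can be tried on a host PC. The function name pick_sd_match and the minDiff threshold are hypothetical; accepting an uncertain result when the two best scores are far enough apart is an application policy, not something the manual prescribes.

```c
typedef unsigned char UINT8;

/* Decides which pattern index to act on, or -1 for none.
 * On the device the arguments would be gathered in order:
 *   result = RecogSD(PATCOUNT, myTemplates);
 *   class1 = GetRecogSDClass1();
 *   diff   = GetRecogSDDiff();
 *   choice = pick_sd_match(result, class1, diff, 8);
 * minDiff (8 above) is an application choice to be tuned by experiment. */
static int pick_sd_match(UINT8 result, UINT8 class1, UINT8 diff, UINT8 minDiff)
{
    if (result == 0) {
        return class1;            /* 0: successful SD recognition */
    }
    if (result == 2 && diff >= minDiff) {
        return class1;            /* 2: uncertain, but best score is clearly ahead */
    }
    return -1;                    /* 1: failed, or uncertain and too close to call */
}
```

Calling RecogSD in its own statement before reading the accessors matters: C does not define the order in which function arguments are evaluated, so the calls should not be nested inside one argument list.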

The DebugRecogSD Function

Usage:
void DebugRecogSD(void)

Description:
Outputs debug information from the most recent call to RecogSD.

Arguments:
void

Returns:
void

Example:
// This sets up to output the RecogSD debug results to the speaker.
SetDebug(RECOG_RESULT, SPEECH_OUTPUT);   // output to the speaker
RecogSD(ctr, myTemplates);               // recognize against the SD set
DebugRecogSD();                          // output debug information

Notes:
1) Debug output can be “spoken” to the PWM and DAC output or sent as serial data to the RS-232 port by the SetDebug function.
2) This function will output the RECOG_RESULT and SILENCE information.
3) DebugRecogSD output = Result, Class1, Score1, Class2, Score2, Diff

See also:
SetDebug, RecogSD

Speaker Verification

The SetSVSecurityLevel Function

Usage:
BOOL SetSVSecurityLevel(UINT8 level)

Description:
Controls the tradeoff between false accepts (FA) and false rejects (FR) for RecogSV.

Arguments:
level = The security setting (0, 1, 2, 3, 4 or 5).

Returns:
TRUE if function call successful
FALSE if level is out of range (level > 5)

Example:
#define PWDCOUNT 1
TEMPLATE myPassword[PWDCOUNT];       // create array for SV recognition
SetSVSecurityLevel(5);               // Set for most strict setting
RecogSV(1, 1, 1, myPassword);        // Now recognize

Notes:
1) Higher numbers make RecogSV acceptance more difficult.
2) ‘0’ is a special value used with Continuous Listening only, which forces the size parameter of RecogSV to 1 and causes RecogSV to use an extra-low security level more appropriate for CL applications.
3) The VE-C default is SetSVSecurityLevel(3)

See also:
RecogSV

The TrainSV Function

Usage:

UINT8 TrainSV(UINT8 srcTemplateA, UINT8 srcTemplateB, UINT8 dstTemplate)

Description:
Compares two patterns recorded with PatGen and averages them into a third template suitable for use with RecogSV.

Arguments:
srcTemplateA, srcTemplateB and dstTemplate = One of two internal template buffers (UNKNOWN or KNOWN)

Returns:
0 – The patterns are similar enough to each other to be successfully averaged.
1 – The patterns are too dissimilar.

Example:
// This records twice for SV training.
// NOTE: error handling has been left out of this example for
// ease of reading.
TEMPLATE tempTemplates[1];                // create an array for temp. storage
PatGen(STANDARD);                         // say it once
PutTemplate(UNKNOWN, 0, tempTemplates);   // save it
PatGen(STANDARD);                         // say it again
GetTemplate(KNOWN, 0, tempTemplates);     // get the 1st utterance
TrainSV(UNKNOWN, KNOWN, UNKNOWN);         // average them together

Notes:
1) In most cases, the proper way to call this function is: TrainSV(UNKNOWN, KNOWN, UNKNOWN);
2) TrainSV returns an additional result that may be obtained by calling the following accessor function: GetTrainSVScore

See also:
PatGen, PutTemplate, GetTemplate, GetTrainSVScore

The GetTrainSVScore Function

Usage:

UINT8 GetTrainSVScore(void)

Description:
Returns the comparison score from the most recent call to TrainSV.

Arguments:
void

Returns:
The comparison score (range 0 to 255)

Example:
// This example averages two patterns and speaks the comparison score.
TrainSV(UNKNOWN, KNOWN, UNKNOWN);   // average them together
DebugH8(GetTrainSVScore());         // speak the debug score

Notes:
1) The lower the score, the better the match between the two trained patterns. In theory 0x00 is a perfect match and 0xFF is the worst possible match. In practice, the effective range is much smaller.
2) The maximum threshold score for acceptance in TrainSV is about 0x80.

See also:
TrainSV

The RecogSV Function

Usage:

SINT8 RecogSV(UINT8 element, UINT8 size, UINT8 classes, TEMPLATE *baseTemplate)

Description:
Performs speaker verification type recognition.

Arguments:
element = The word number in a multi-password sequence, or ‘1’ in a single password sequence (range = 1 to size)
size = The number of passwords in the sequence (range = 1 to 4)
classes = Controls whether the passwords must be spoken in a strict sequence, or may be spoken in any order.
baseTemplate = The TEMPLATE array in flash memory.

Returns:
0 – Successful SV recognition.
1 – Failed SV recognition.
-1 – Undecided SV recognition (more passwords needed).

Example:
#define PWDCOUNT 1
TEMPLATE myPassword[PWDCOUNT];       // create array for SV recognition
RecogSV(1, 1, 1, myPassword);        // Now recognize

Notes:
1) Use the SetSVSecurityLevel function to control the strictness of the password recognition.
2) The UNKNOWN internal template buffer is always used during SV recognition.
3) If classes = 1 then the passwords must be spoken in strict sequence. If classes = size then the passwords may be spoken in any order.
4) RecogSV returns multiple results that may be obtained by calling the following accessor functions: GetRecogSVResult, GetRecogSVWordResult, GetRecogSVClass and GetRecogSVScore, or spoken with the DebugRecogSV function.
5) The result from RecogSV is the same as from the accessor function GetRecogSVResult.

See also:
SetSVSecurityLevel, GetRecogSVResult, GetRecogSVWordResult, GetRecogSVClass, GetRecogSVScore, DebugRecogSV
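
The element and size arguments and the -1 (undecided) return value imply a loop over the words of a multi-password sequence. The following is a minimal sketch in standard C for a host PC. PatGen and RecogSV here are hypothetical stand-ins driven by a scripted table; the helper name verify_sequence, the PatGen return type, and the policy of treating a still-undecided result after the last word as a rejection are all assumptions.

```c
typedef signed char SINT8;
typedef unsigned char UINT8;
typedef struct { UINT8 data[1]; } TEMPLATE;   /* placeholder layout (assumption) */

enum { STANDARD = 0 };   /* value assumed for illustration */

/* ---- hypothetical host-side stand-ins ---- */
static SINT8 mockResults[4];   /* scripted RecogSV result for each element */
static UINT8 mockCalls = 0;    /* how many times RecogSV was called */

static UINT8 PatGen(UINT8 mode) { (void)mode; return 0; }
static SINT8 RecogSV(UINT8 element, UINT8 size, UINT8 classes, TEMPLATE *base)
{
    (void)size; (void)classes; (void)base;
    mockCalls++;
    return mockResults[element - 1];
}

/* Asks for up to 'size' passwords (1 to 4).
 * Returns 0 if the speaker was verified, 1 otherwise. */
static SINT8 verify_sequence(UINT8 size, UINT8 classes, TEMPLATE *passwords)
{
    UINT8 element;
    SINT8 result = 1;

    for (element = 1; element <= size; element++) {
        (void)PatGen(STANDARD);               /* record the next spoken password */
        result = RecogSV(element, size, classes, passwords);
        if (result != -1) {
            break;                            /* decided: 0 = accepted, 1 = rejected */
        }
    }
    /* ASSUMPTION: still undecided after the last word counts as a rejection */
    return (result == 0) ? 0 : 1;
}
```

Passing classes = 1 asks for the passwords in strict sequence and classes = size allows any order, as described in note 3.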

The GetRecogSVResult Function

Usage:
SINT8 GetRecogSVResult(void)

Description:
Returns the result for the most recent RecogSV function call.

Arguments:
void

Returns:
0 – Successful SV recognition.
1 – Failed SV recognition.
-1 – Undecided SV recognition (more passwords needed).

Example:
#define PWDCOUNT 1
TEMPLATE myPassword[PWDCOUNT];       // create array for SV recognition
RecogSV(1, 1, 1, myPassword);        // Now recognize
DebugH8(GetRecogSVResult());         // output the result from RecogSV

Notes:
The result from GetRecogSVResult is only valid until the next call to a RecogSV or TrainSV function.

See also:
RecogSV, TrainSV

The GetRecogSVWordResult Function

Usage:
UINT8 GetRecogSVWordResult(void)

Description:
Returns the current word result for the most recent RecogSV function call.

Arguments:
void

Returns:
0 – Successful SV recognition.
1 – Failed SV recognition.
-1 – Undecided SV recognition (more passwords needed).

Example:
#define PWDCOUNT 1
TEMPLATE myPassword[PWDCOUNT];       // create array for SV recognition
RecogSV(1, 1, 1, myPassword);        // Now recognize
DebugH8(GetRecogSVWordResult());     // output the word result from RecogSV

Notes:
The result from GetRecogSVWordResult is only valid until the next call to a RecogSV or TrainSV function.

See also:
RecogSV, TrainSV

The GetRecogSVClass Function

Usage:
UINT8 GetRecogSVClass(void)

Description:
Returns the index number of the best match for the most recent RecogSV function call.

Arguments:
void

Returns:
Index of best match: (range = 0 to classes-1) or if RecogSV failed, then returns classes.

Example:
#define PWDCOUNT 1
TEMPLATE myPassword[PWDCOUNT];       // create array for SV recognition
RecogSV(1, 1, 1, myPassword);        // Now recognize
DebugH8(GetRecogSVClass());          // output the result from RecogSV

Notes:
1) The GetRecogSVScore function returns the confidence score for the companion GetRecogSVClass function result.
2) The result from GetRecogSVClass is only valid until the next call to a RecogSV or TrainSV function.

See also:
RecogSV, TrainSV, GetRecogSVScore

The GetRecogSVScore Function

Usage:

UINT8 GetRecogSVScore(void)

Description:
Returns the best recognition score from the most recent call to RecogSV.

Arguments:
void

Returns:
The best recognition score (range 0 to 255)

Example:
#define PWDCOUNT 1
TEMPLATE myPassword[PWDCOUNT];       // create array for SV recognition
RecogSV(1, 1, 1, myPassword);        // Now recognize
DebugH8(GetRecogSVScore());          // output the result from RecogSV

Notes:
1) The GetRecogSVScore function returns the confidence score for the companion GetRecogSVClass function result.
2) The lower the score, the better the recognition match. In theory 0x00 is a perfect match and 0xFF is the worst possible match. In practice, the effective range is smaller.
3) The maximum threshold score for acceptance in RecogSV is about 0x96, but will vary depending on how the SetSVSecurityLevel function has been called.
4) The result from GetRecogSVScore is only valid until the next call to a RecogSV or TrainSV function.

See also:
RecogSV, TrainSV, GetRecogSVClass, SetSVSecurityLevel

The GetRecogSVSetSize Function

Usage:

UINT8 GetRecogSVSetSize(void)

Description:
Returns the set size argument (size) from the most recent call to RecogSV.

Arguments:
void

Returns:
The set size.

Example:
#define PWDCOUNT 1
TEMPLATE myPassword[PWDCOUNT];       // create array for SV recognition
RecogSV(1, 1, 1, myPassword);        // Now recognize
DebugH8(GetRecogSVSetSize());        // output the result from RecogSV

Notes:
The result from GetRecogSVSetSize is only valid until the next call to a RecogSV or TrainSV function.

See also:
RecogSV, TrainSV

The DebugRecogSV Function

Usage:
void DebugRecogSV(void)

Description:
Outputs debug information from the most recent call to RecogSV.

Arguments:
void

Returns:
void

Example:
// This sets up to output the RecogSV debug results to the speaker.
SetDebug(RECOG_RESULT, SPEECH_OUTPUT);   // output to the speaker
RecogSV(1, 1, 1, myPassword);            // Now recognize
DebugRecogSV();                          // output debug information

Notes:
1) Debug output can be “spoken” to the PWM and DAC output or sent as serial data to the RS-232 port by the SetDebug function.
2) This function will output the RECOG_RESULT and SILENCE information.
3) DebugRecogSV output = Result, WordResult, Class, Score

See also:
SetDebug, RecogSV

Continuous Listening

The CLPatGen Function

Usage:
SINT8 CLPatGen(UINT8 word)

Description:
Generates a pattern for SD/SV recognition using continuous listening.

Arguments:
word = the word number (0-N) in a multi word CL sequence, or ‘0’ in a single word CL sequence

Returns:
0 = OK
1 = no data (time out)
2 = too long
3 = too noisy
4 = too soft
5 = too loud
6 = too soon
-1 = interrupted

Examples:
// Example 1 – ‘CLPatGen’ listening for a single CL trigger word
do {
    CLPatGen(0);                         // Listen for the CL gateway word
until (CheckDuration(SHORT) == 0);
RecogSD(1, triggerTemplates);            // and try to recognize it

// Example 2 – multi word ‘CLPatGen’ listening for a command word
ctr = 0;
do {
    do {
        CLPatGen(ctr);                   // Listen for a CL trigger word
    until (CheckDuration(VARIABLE) == 0);
    if (RecogSD(1, &trigTemplates[ctr]) == 0) {   // and try to recognize it
        ctr += 1;
    }
until (ctr == 2);

// Example 3 – ‘CLPatGen’ listening for a command word
do {
    CLPatGen(0);                         // Listen for a CL command word
until (CheckDuration(VARIABLE) == 0);
RecogSD(numCommands, cmdTemplates);      // and try to recognize it

Notes:
1) CLPatGen automatically performs a silence measurement, which takes up to 150 milliseconds, before the start of pattern generation.
2) Patterns are always recorded into the UNKNOWN buffer.
3) Use the DebugPatGen function to output debug information from CLPatGen.

See also:
CheckDuration, DebugPatGen

The CLPatGenW Function Usage:

SINT8 CLPatGenW(INT8 word, WEIGHTS *weightsTable) Description:

Generates a pattern for SI recognition using continuous listening. Arguments:

word = the word number (0-N) in a multi word CL sequence, or ‘0’ in a single word CL sequence weightsTable = A pointer to a speaker independent weights table.

Returns: 0 = OK 1 = no data (time out) 2 = too long 3 = too noisy 4 = too soft 5 = too loud 6 = too soon -1 = interrupted

EXAMPLES:
// Example 1 – ‘CLPatGenW’ listening for a single CL trigger word
do {
    CLPatGenW(0, WTTrigger);                // Listen for the CL gateway word
until (CheckDuration(SHORT) == 0);
Recog(WTTrigger);                           // and try to recognize it

// Example 2 – multi word ‘CLPatGenW’ listening for a trigger word
ctr = 0;
do {
    do {
        CLPatGenW(ctr, WTTrigger[ctr]);     // Listen for a CL trigger word
    until (CheckDuration(VARIABLE) == 0);
    if (Recog(WTTrigger[ctr]) == 0) {       // and try to recognize it
        ctr += 1;
    }
until (ctr == 2);

// Example 3 – ‘CLPatGenW’ listening for a command word
do {
    CLPatGenW(0, WTCommands);               // Listen for a CL command word
until (CheckDuration(VARIABLE) == 0);
Recog(WTCommands);                          // and try to recognize it

Notes:
1) CLPatGenW automatically performs a silence measurement, which takes up to 150 milliseconds, before the start of pattern generation.
2) Patterns are always recorded into the UNKNOWN buffer.
3) Use the DebugPatGen function to output debug information from CLPatGenW.

See also: CheckDuration, DebugPatGen
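The return codes shared by CLPatGen and CLPatGenW are easy to turn into readable text when logging results on a host computer. The following is a minimal sketch in standard C (not VE-C). The helper name patgen_status_name is hypothetical; only the code-to-meaning pairs are taken from the Returns lists above.

```c
#include <stdint.h>

/* Map a CLPatGen/CLPatGenW return code to the meaning listed in the
   manual. patgen_status_name is a hypothetical helper, not a VE-C call. */
const char *patgen_status_name(int8_t code)
{
    switch (code) {
    case 0:  return "OK";
    case 1:  return "no data (time out)";
    case 2:  return "too long";
    case 3:  return "too noisy";
    case 4:  return "too soft";
    case 5:  return "too loud";
    case 6:  return "too soon";
    case -1: return "interrupted";
    default: return "unknown";   /* code not documented in the manual */
    }
}
```

Any value outside the documented set is reported as "unknown" rather than guessed at.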

The CheckDuration Function

Usage:
UINT8 CheckDuration(UINT8 duration)

Description:
Checks the duration of the pattern most recently recorded with CLPatGen or CLPatGenW to see if it is reasonable.

Arguments:
duration = 0 (SHORT), 1 (MEDIUM), 2 (LONG) or 3 (VARIABLE/UNKNOWN)

Returns:
0 = the pattern is of a reasonable duration
1 = the pattern is too short or too long

Example:
do {
    CLPatGenW(0, WTTrigger);    // Listen for the CL gateway word
until (CheckDuration(SHORT) == 0);
Recog(WTTrigger);               // and try to recognize it

Notes:
1) It is not mandatory to call CheckDuration after each call to CLPatGen or CLPatGenW, but it is highly recommended.
2) Use the VARIABLE/UNKNOWN duration when multiple word durations are possible.

See also: CLPatGen, CLPatGenW

The SetCLPerformance Function

Usage:
BOOL SetCLPerformance(UINT8 performanceLevel)

Description:
Controls the tradeoff between recognition speed and accuracy for CLPatGen and CLPatGenW.

Arguments:
performanceLevel = the performance setting (1, 2 or 3)

Returns:
TRUE if the function call is successful
FALSE if performanceLevel is outside the range 1 to 3

Example:
SetCLPerformance(1);            // Set for fastest response
do {
    CLPatGenW(0, WTTrigger);    // Listen for the CL gateway word
until (CheckDuration(SHORT) == 0);
Recog(WTTrigger);               // and try to recognize it

Notes:
1) Higher numbers give better accuracy, but they impose a stricter tolerance on the way the word must be spoken and take more time to recognize.
2) The VE-C default is SetCLPerformance(3).

See also: CLPatGen, CLPatGenW

The SetCLPreSil Function

Usage:
BOOL SetCLPreSil(UINT16 period)

Description:
Controls how long CLPatGen or CLPatGenW waits before returning a no data (time out) result. If nothing is spoken before the timeout period elapses, CLPatGen and CLPatGenW return a value of 1 (no data).

Arguments:
period = the timeout period, measured in 12.5 millisecond blocks (range 1 to 65535)

Returns:
TRUE if the function call is successful
FALSE if period is outside the range 1 to 65535

Example:
SetCLPreSil(4800);  // allow up to 1 minute for timeout
CLPatGen(0);        // Listen for the CL gateway word

Notes:
The VE-C default is SetCLPreSil(65535), the longest timeout a UINT16 period allows: 65535 × 12.5 ms ≈ 819 seconds, or roughly 13.7 minutes.

See also: CLPatGen, CLPatGenW
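Because period is expressed in 12.5 millisecond blocks, it is convenient to compute it from a timeout in milliseconds. The sketch below is standard C written for illustration, not part of the VE-C library; cl_presil_blocks_from_ms is a hypothetical name, and clamping to the 1 to 65535 range is an assumption based on the UINT16 argument type.

```c
#include <stdint.h>

/* Convert a timeout in milliseconds to 12.5 ms blocks for SetCLPreSil.
   Hypothetical helper; the result is rounded to the nearest block and
   clamped to 1..65535 so that it always fits a UINT16 period. */
uint16_t cl_presil_blocks_from_ms(uint32_t ms)
{
    /* ms / 12.5 == ms * 2 / 25; adding 12 rounds to the nearest block. */
    uint64_t blocks = ((uint64_t)ms * 2u + 12u) / 25u;

    if (blocks < 1u) {
        blocks = 1u;
    }
    if (blocks > 65535u) {
        blocks = 65535u;
    }
    return (uint16_t)blocks;
}
```

A one-minute timeout (60000 ms) gives 4800 blocks, matching the value used in the SetCLPreSil example.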

WordSpot

The PatGenWS Function

Usage:
SINT8 PatGenWS(UINT8 runHow)

Description:
Generates a pattern for Wordspot training.

Arguments:
runHow = 0 (STANDARD), 1 (BACKGROUND) or 3 (RP_THRESH)

Returns:
0 = OK
1 = no data (time out)
2 = too long
3 = too noisy
4 = too soft
5 = too loud
6 = too soon
-1 = interrupted

Examples:
// This records twice for SD training.
// NOTE: error handling has been left out of this example for
// ease of reading.
TEMPLATE tempTemplates[1];                  // create an array for temp. storage
PatGenWS(STANDARD);                         // say it once
PutTemplate(UNKNOWN, 0, tempTemplates);     // save it
PatGenWS(STANDARD);                         // say it again
GetTemplate(KNOWN, 0, tempTemplates);       // get the 1st utterance
TrainSD(UNKNOWN, KNOWN, UNKNOWN);           // average them together

Notes:
1) PatGenWS can be run in one of three ways:
   a. It can run by itself, which is the STANDARD mode, and is the usual first step in training or recognizing speech.
   b. It can also run in conjunction with the Record/Playback technology, so that the actual speech is saved and can be replayed. In this case PatGenWS runs in BACKGROUND mode and is started before the recording.
   c. If RP_THRESH is specified, PatGenWS also runs in the BACKGROUND and provides threshold information so the recording can be post-processed to remove initial silence and glitches, but no pattern is actually generated.
2) PatGenWS automatically performs a silence measurement, which takes up to 150 milliseconds, before the start of pattern generation.
3) Certain errors (too soft, too loud and too soon) can be suppressed with the SetPatGenNoErrors function.
4) Patterns are always recorded into the UNKNOWN buffer.

See also: SetStopCondition, DebugPatGen, SetPatGenSepSil, SetPatGenPreSil, SetPatGenNoErrors, RecordRP

The WordSpot Function

Usage:
SINT8 WordSpot(UINT8 timeout, UINT8 index, TEMPLATE *baseTemplate)

Description:
Performs speaker-dependent recognition with wordspotting.

Arguments:
timeout = controls whether WordSpot will time out or listen indefinitely
index = the array element of baseTemplate being compared
baseTemplate = the TEMPLATE array in flash memory

Returns:
0 = successful Wordspot recognition
1 = failed Wordspot recognition
-1 = interrupted

Example:
#define PATCOUNT 1
TEMPLATE myTemplates[PATCOUNT]; // create array for WS recognition
WordSpot(0, 0, myTemplates);    // Now recognize

Notes:
1) Use the SetWSPerformance function to control the strictness of the recognition.
2) Only a single word can be compared against during WordSpot.

See also: SetWSPerformance

The SetWSPerformance Function

Usage:
BOOL SetWSPerformance(UINT8 performanceLevel)

Description:
Controls the tradeoff between recognition speed and accuracy for WordSpot.

Arguments:
performanceLevel = the performance setting (1, 2, 3 or 4)

Returns:
TRUE if the function call is successful
FALSE if performanceLevel is outside the range 1 to 4

Example:
#define PATCOUNT 1
TEMPLATE myTemplates[PATCOUNT]; // create array for WS recognition
SetWSPerformance(4);            // most strict wordspotting.
WordSpot(0, 0, myTemplates);    // Now recognize

Notes:
1) Higher numbers give better accuracy and minimize false triggering, but they impose a stricter tolerance on the way the word must be spoken and take more time to recognize.
2) The VE-C default is SetWSPerformance(2).

See also: RecogSD


Record and Play

The RecordRP Function

Usage:
SINT8 RecordRP(UINT8 maxTime, UINT8 threshType, UINT8 recordingNumber)

Description:
Records digital audio and stores it in the flash memory.

Arguments:
maxTime = the number of half-second intervals to record. If maxTime = 0, recording continues until:
   A) RecordRP is stopped by a jumpout condition,
   B) the end of memory is reached, or
   C) pattern generation stops, if PatGen, PatGenW or PatGenWS is running simultaneously.
threshType = the type of threshold to be used: NO_THRESH, TRIM_THRESH (threshold initial silence only) or FULL_THRESH (threshold all silences)
recordingNumber = a user-defined recording number associated with this recording

Returns:
0 = OK
1 = recordingNumber already exists
-1 = no memory is available
-2 = illegal threshold

Example:
EraseRP(1);                 // erase any old recording #1, if present
RecordRP(6, NO_THRESH, 1);  // rec. for 3 sec, with no thresholding.
PostRP(1);                  // Post-process the recording
CompressRP(2,1);            // Compress the recording to 2-bit level
PlayRP(1);                  // Play it normal speed…
PlayFastRP(1);              // and play it fast (raise the pitch).

Notes:
1) Recording numbers do not have to be in any order. Any value from 0 to 255 may be used.
2) After recording with RecordRP, it is usually a good idea to post-process the recording using PostRP.
3) The SetStopCondition function sets the conditions under which the recording may be halted by an external I/O event.

See also:
PatGen, PatGenW, PatGenWS, CLPatGen, CLPatGenW, PostRP, SetStopCondition

The PlayRP Function

Usage:

UINT8 PlayRP(UINT8 recordingNumber)

Description:
Plays back digital audio stored in the flash memory.

Arguments:
recordingNumber = a user-defined recording number associated with this recording

Returns:
0 = OK
1 = recordingNumber does not exist

Example:
EraseRP(1);                 // erase any old recording #1, if present
RecordRP(6, NO_THRESH, 1);  // rec. for 3 sec, with no thresholding.
PostRP(1);                  // Post-process the recording
CompressRP(2,1);            // Compress the recording to 2-bit level
PlayRP(1);                  // Play it normal speed…
PlayFastRP(1);              // and play it fast (raise the pitch).

Notes:
1) Recording numbers do not have to be in any order. Any value from 0 to 255 may be used.
2) Use PlayFastRP to play back the recording at an accelerated rate.
3) Use the SetOutput function to control whether output is sent to the DAC or PWM or both.
4) The SetStopCondition function sets the conditions under which the playback may be halted by an external I/O event.

See also:
RecordRP, PlayFastRP, SetOutput, SetStopCondition

The PlayFastRP Function

Usage:

UINT8 PlayFastRP(UINT8 recordingNumber)

Description:
Plays back digital audio stored in the flash memory at an accelerated rate.

Arguments:
recordingNumber = a user-defined recording number associated with this recording

Returns:
0 = OK
1 = recordingNumber does not exist

Notes:
1) Recording numbers do not have to be in any order. Any value from 0 to 255 may be used.
2) Use PlayRP to play back the recording at the normal rate.
3) Use the SetOutput function to control whether output is sent to the DAC or PWM or both.
4) The SetStopCondition function sets the conditions under which the playback may be halted by an external I/O event.

See also:
RecordRP, PlayRP, SetOutput, SetStopCondition

The PostRP Function

Usage:

SINT8 PostRP(UINT8 recordingNumber)

Description:
Post-processes speech from a previous RecordRP to trim it and adjust its gain.

Arguments:
recordingNumber = a user-defined recording number associated with this recording

Returns:
0 = OK
1 = recordingNumber does not exist
-1 = no memory is available

Example:
EraseRP(1);                 // erase any old recording #1, if present
RecordRP(6, NO_THRESH, 1);  // rec. for 3 sec, with no thresholding.
PostRP(1);                  // Post-process the recording
CompressRP(2,1);            // Compress the recording to 2-bit level
PlayRP(1);                  // Play it normal speed…
PlayFastRP(1);              // and play it fast (raise the pitch).

Notes:
Recording numbers do not have to be in any order. Any value from 0 to 255 may be used.

See also: RecordRP

The CompressRP Function

Usage:
SINT8 CompressRP(UINT8 destLevel, UINT8 recordingNumber)

Description:
Compresses speech from a previous RecordRP to use less flash memory.

Arguments:
destLevel = the desired level of compression (2 or 3)
recordingNumber = a user-defined recording number associated with this recording

Returns:
0 = OK
1 = recordingNumber does not exist
2 = bad destLevel
-1 = no memory is available

Example:
EraseRP(1);                 // erase any old recording #1, if present
RecordRP(6, NO_THRESH, 1);  // rec. for 3 sec, with no thresholding.
PostRP(1);                  // Post-process the recording
CompressRP(2,1);            // Compress the recording to 2-bit level
PlayRP(1);                  // Play it normal speed…
PlayFastRP(1);              // and play it fast (raise the pitch).

Notes:
1) Recording numbers do not have to be in any order. Any value from 0 to 255 may be used.
2) Compression takes about as long as playing the recording.
3) If there is not enough memory available to construct the compressed image, the original recording is retained.

See also:
RecordRP

The EraseRP Function

Usage:

UINT8 EraseRP(UINT8 recordingNumber)

Description:
Erases speech from the flash memory.

Arguments:
recordingNumber = a user-defined recording number associated with this recording

Returns:
0 = OK
1 = recordingNumber does not exist

Example:
EraseRP(1);                 // erase any old recording #1, if present
RecordRP(6, NO_THRESH, 1);  // rec. for 3 sec, with no thresholding.
PostRP(1);                  // Post-process the recording
CompressRP(2,1);            // Compress the recording to 2-bit level
PlayRP(1);                  // Play it normal speed…
PlayFastRP(1);              // and play it fast (raise the pitch).

Notes:
1) Recording numbers do not have to be in any order. Any value from 0 to 255 may be used.
2) Compression takes about as long as playing the recording.
3) If there is not enough memory available to construct the compressed image, the original recording is retained.

See also:
RecordRP

The GetAvailableMemory Function

Usage:

UINT8 GetAvailableMemory(void)

Description:
Returns the amount of free memory available for recording in the flash memory.

Arguments:
void

Returns:
The number of 1024-byte clusters of flash memory available for recordings.

Example:
uint8 f;
f = (GetAvailableMemory() / 2);     // how much memory is left?
EraseRP(1);                         // erase any old recording #1, if present
RecordRP(f, NO_THRESH, 1);          // rec. for ‘f’ half-second intervals

Notes:
1) Each cluster can hold approximately 1/4 second of 4-bit speech.
2) The maximum value which can be returned is 255, which equals just over 60 seconds of speech.

See also: EraseRP, RecordRP
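The example divides the cluster count by two because each cluster holds about a quarter second of 4-bit speech, while RecordRP counts maxTime in half-second intervals. The arithmetic can be sketched in standard C as follows; both function names are hypothetical and the quarter-second figure is the approximate value given in the notes.

```c
#include <stdint.h>

/* Each cluster holds roughly 1/4 second of 4-bit speech (per the notes),
   and RecordRP's maxTime counts half-second intervals, so two clusters
   are needed per interval. Hypothetical helpers for illustration only. */
uint8_t clusters_to_record_intervals(uint8_t clusters)
{
    return (uint8_t)(clusters / 2u);
}

/* Approximate recording capacity in milliseconds (250 ms per cluster). */
uint16_t clusters_to_milliseconds(uint8_t clusters)
{
    return (uint16_t)(clusters * 250u);
}
```

With the maximum return value of 255 clusters this gives 127 half-second intervals, or about 63.75 seconds, consistent with the "just over 60 seconds" figure.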

DTMF

The TTone Function

Usage:
UINT8 TTone(UINT8 toneNumber)

Description:
Generates a single DTMF tone.

Arguments:
toneNumber = the tone to be generated (range 0 to 15), or 255 to generate a dial tone

Returns:
0 = OK
1 = toneNumber is invalid

Example:
uint8 phoneNumber[7] = {5, 5, 5, 1, 2, 3, 4};
uint8 index;
SetTToneDur(100);               // Output for 1 second…
SetTToneSil(100);               // with a 1 sec. delay…
TTone(255);                     // a short dial tone
SetTToneDur(10);                // Output for .1 second
for (index = 0; index < 7; index++)
    TTone(phoneNumber[index]);  // output a phone number

Notes:
1) Use the SetOutput function to control whether DTMF output is sent to the DAC or PWM or both.
2) The SetStopCondition function sets the conditions under which the dial tone may be halted or the DTMF tones extended by an external I/O event.

See also:
SetOutput, SetStopCondition

The SetTToneDur Function

Usage:

BOOL SetTToneDur(UINT8 duration)

Description:
Specifies the tone duration for subsequent calls to TTone.

Arguments:
duration = the tone duration in 10 millisecond increments (range 1 to 255)

Returns:
TRUE if the function call is successful
FALSE if duration = 0

Example:
uint8 phoneNumber[7] = {5, 5, 5, 1, 2, 3, 4};
uint8 index;
SetTToneDur(100);               // Output for 1 second…
SetTToneSil(100);               // with a 1 sec. delay…
TTone(255);                     // a short dial tone
SetTToneDur(10);                // Output for .1 second
for (index = 0; index < 7; index++)
    TTone(phoneNumber[index]);  // output a phone number

Notes:
1) The VE-C default is SetTToneDur(10), or approximately 100 milliseconds.
2) The SetStopCondition function sets the conditions under which the dial tone may be halted or the DTMF tones extended by an external I/O event.

See also:
TTone, SetStopCondition, SetTToneSil

The SetTToneSil Function

Usage:

BOOL SetTToneSil(UINT8 duration)

Description:
Specifies the silence duration following DTMF output for subsequent calls to TTone.

Arguments:
duration = the silence duration in 10 millisecond increments (range 1 to 255)

Returns:
TRUE if the function call is successful
FALSE if duration = 0

Example:
uint8 phoneNumber[7] = {5, 5, 5, 1, 2, 3, 4};
uint8 index;
SetTToneDur(100);               // Output for 1 second…
SetTToneSil(100);               // with a 1 sec. delay…
TTone(255);                     // a short dial tone
SetTToneDur(10);                // Output for .1 second
for (index = 0; index < 7; index++)
    TTone(phoneNumber[index]);  // output a phone number

Notes:
The VE-C default is SetTToneSil(9), or approximately 90 milliseconds.

See also: TTone, SetTToneDur
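Both duration settings are expressed in 10 millisecond increments, so the time needed to dial a number follows directly from the two values. The standard C sketch below illustrates the arithmetic; the helper names are hypothetical, and it assumes that one tone period plus one silence period elapses per digit, as the descriptions of the two functions suggest.

```c
#include <stdint.h>

/* Convert milliseconds to the 10 ms units used by SetTToneDur and
   SetTToneSil, rounded to nearest and clamped to the valid range 1..255.
   Hypothetical helper, not part of the VE-C library. */
uint8_t ttone_units_from_ms(uint32_t ms)
{
    uint32_t units = (ms + 5u) / 10u;

    if (units < 1u) {
        units = 1u;
    }
    if (units > 255u) {
        units = 255u;
    }
    return (uint8_t)units;
}

/* Estimated time in milliseconds to output `digits` tones, assuming each
   tone is followed by one silence period (an assumption, see lead-in). */
uint32_t dial_time_ms(uint32_t digits, uint8_t toneUnits, uint8_t silenceUnits)
{
    return digits * ((uint32_t)toneUnits + (uint32_t)silenceUnits) * 10u;
}
```

Under these assumptions, seven digits at the default settings (10 and 9 units) take about 1.33 seconds.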

Music

The PlayMusic Function

Usage:
SINT8 PlayMusic(UINT8 tune, NOTEDATA *noteData, TUNEDATA *tuneData)

Description:
Plays MIDI type music.

Arguments:
tune = the index number of the song to be played
noteData = an instrumentation array (.VEM file)
tuneData = a song array (.VEM file)

Returns:
0 = OK
1 = noteData and tuneData are not loaded in the same 64K bank

Example:
extern NOTES VEmusicdwc;
extern TUNES VEmusiclis;
SetMusicFilter(0);                          // minimize "wobbles"
PlayMusic(0, &VEmusicdwc, &VEmusiclis);

Notes:
1) Use the SetOutput function to control whether music output is sent to the DAC or PWM or both.
2) The SetStopCondition function sets the conditions under which the music may be halted by an external I/O event.

See also:
SetOutput, SetStopCondition

The SetMusicFilter Function

Usage:

BOOL SetMusicFilter(UINT8 filter)

Description:
Controls the filter used by subsequent calls to PlayMusic.

Arguments:
filter = the filter to be used (range 0 to 3)

Returns:
TRUE if the function call is successful
FALSE if filter > 3

Example:
extern NOTES VEmusicdwc;
extern TUNES VEmusiclis;
SetMusicFilter(0);                          // minimize "wobbles"
PlayMusic(0, &VEmusicdwc, &VEmusiclis);     // play the tune
SetMusicFilter(3);                          // Get richest sound
PlayMusic(0, &VEmusicdwc, &VEmusiclis);     // play the tune

Notes:
1) Lower values of filter produce sound which is “tinnier”, but reduce the “wobbles” sometimes audible in long notes.
2) The VE-C default is SetMusicFilter(2).

See also: PlayMusic

Serial (RS-232) Communication

Serial communication at the application level in Voice Extreme™ is always performed at 9600 baud, 8 data bits, no parity, 1 stop bit (the download operation runs at 115Kbaud). Voice Extreme™ is not normally configured for RS-232 communications, so the Init232 function must be called before any other serial communication function.

The Init232 Function

Usage:
void Init232(void)

Description:
Initializes the serial communication software drivers and I/O hardware.

Arguments:
void

Returns:
void

Example:
Init232();
PutByte232(0x21);   // output a ‘!’
Idle232();

Notes:
Init232 must be called before using other RS-232 functions (including debugging, if debug output has been directed to RS-232 using the SetDebug function).

See also: SetDebug, Idle232
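The fixed line settings determine how long a transfer takes. With one start bit, eight data bits, no parity and one stop bit, each byte occupies 10 bit times, so 9600 baud carries at most 960 bytes per second. The standard C sketch below estimates transmission time on that basis; serial_tx_time_us is a hypothetical helper, and the figure ignores any packet overhead or gaps between bytes.

```c
#include <stdint.h>

/* Estimate the time, in microseconds, to send nbytes at 9600 baud with
   8N1 framing (1 start + 8 data + 1 stop = 10 bits per byte).
   Hypothetical helper; the result is rounded up to a whole microsecond. */
uint32_t serial_tx_time_us(uint32_t nbytes)
{
    const uint64_t bits_per_byte = 10u;
    const uint64_t baud = 9600u;
    uint64_t bits = (uint64_t)nbytes * bits_per_byte;

    return (uint32_t)((bits * 1000000u + baud - 1u) / baud);
}
```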

The Idle232 Function

Usage:
void Idle232(void)

Description:
Disables serial communication software drivers and I/O hardware.

Arguments:
void

Returns:
void

Example:
Init232();
PutByte232(0x21);   // output a ‘!’
Idle232();

Notes:
On the Development Board, it is recommended to call Idle232 before calls to PatGen, PatGenW, PatGenWS, CLPatGen, and CLPatGenW to minimize system noise.

See also: Init232

The following functions provide a relatively high-speed communication method for communicating with a HOST computer. Refer to the “Serial Packets Overview” section for details of the packet format and protocol.

The GetPacket Function

Usage:
SINT8 GetPacket(UINT8 MAX_SIZE, CHAR *dataBuffer)

Description:
Receives a serial packet from an external serial source.

Arguments:
MAX_SIZE = the number of data bytes in the packet
dataBuffer = the buffer where the incoming data bytes will be placed. Note that dataBuffer must be in RAM, not flash memory.

Returns:
The length of the data received, or:
-1 = transmission error
-2 = dataBuffer is not in RAM

Example:
Init232();
GetPacket(10, myBuffer);    // get packet from host computer
Idle232();

Notes:
1) This function provides a relatively high-speed communication method for communicating with a HOST computer.
2) Refer to the “Serial Packets Overview” section for details of the packet format and protocol.

See also: Init232, SendPacket

The SendPacket Function

Usage:
SINT8 SendPacket(UINT8 bufferSize, CHAR *dataBuffer)

Description:
Transmits a serial packet to an external serial destination.

Arguments:
bufferSize = the number of data bytes in the packet
dataBuffer = the buffer holding the outgoing data bytes. Note that dataBuffer must be in RAM, not flash memory.

Returns:
The length of the data sent, or:
-1 = transmission error
-2 = dataBuffer is not in RAM

Example:
Init232();
SendPacket(10, myBuffer);   // send packet to host computer
Idle232();

Notes:
1) This function provides a relatively high-speed communication method for communicating with a HOST computer.
2) Refer to the “Serial Packets Overview” section for details of the packet format and protocol.

See also: Init232, GetPacket


The PutByte232 Function

Usage:
UINT8 PutByte232(INT8 value)

Description:
Sends one character to the serial port.

Arguments:
value = the character to be transmitted

Returns:
0 = OK
1 = transmission error

Example:
Init232();
PutByte232(0x21);   // output a ‘!’
Idle232();

Notes:
This function provides byte-level control of the serial transmission line.

See also: Init232, WaitByte232, WaitByteTimeout232, SendPacket

The WaitByte232 Function

Usage:
UINT8 WaitByte232(void)

Description:
Waits indefinitely to receive one character from the serial port.

Arguments:
void

Returns:
The character received.

Example:
Init232();
b = WaitByte232();  // get a byte
Idle232();

Notes:
This function provides byte-level control of the serial transmission line.

See also: Init232, PutByte232, WaitByteTimeout232, GetPacket

The WaitByteTimeout232 Function

Usage:
UINT16 WaitByteTimeout232(void)

Description:
Waits one second to receive one character from the serial port.

Arguments:
void

Returns:
The character received in bits [7:0], or 0x80XX if timeout.

Example:
Init232();
b = WaitByteTimeout232();   // get a byte
Idle232();
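Since the 16-bit result packs the received character into bits [7:0] and signals a timeout as 0x80XX, the caller has to separate the two before using the byte. The standard C sketch below shows one way to do that; the helper names are hypothetical, and only bit 15 is interpreted because the manual does not define the other status bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when bit 15 of a WaitByteTimeout232 result is set (0x80XX),
   which the manual documents as a timeout. Hypothetical helper. */
bool wbt_timed_out(uint16_t result)
{
    return (result & 0x8000u) != 0u;
}

/* The received character, carried in bits [7:0] of the result.
   Only meaningful when wbt_timed_out() is false. */
uint8_t wbt_character(uint16_t result)
{
    return (uint8_t)(result & 0x00FFu);
}
```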

Notes:
1) This function provides byte-level control of the serial transmission line.
2) This function returns the character in bits [7:0], and status information in bits [15:8]. Bit 15 set indicates timeout.

See also:
Init232, PutByte232, WaitByte232, GetPacket

The WriteString Function

Usage:
UINT8 WriteString232(CHAR *string)

Description:
Sends a string of characters to the output port.

Arguments:
string = a character array

Returns:
0 = OK
1 = transmission error

Example:
#pragma VE_APP_TEXT ‘Hello, world/n’
Init232();
WriteString(GetApplicationText());  // Output ‘Hello world’
Idle232();

Notes:
1) This function provides byte-level control of the serial transmission line.
2) Sends characters from string, up to, but not including, the terminating NULL character (00h or \0).

See also: Init232, PutByte232, WaitByte232, GetPacket

Debug Output

The following functions can be used to output the value of VE variables or constants.

The DebugXXX Functions

Usage:
void DebugH4(UINT8 value)
void DebugH8(UINT8 value)
void DebugH16(UINT16 value)
void DebugH24(UINT24 value)
void DebugD8(UINT8 value)
void DebugD16(UINT16 word)
void DebugD100(UINT8 value)

Description:
Output the values of VE-C variables or constants.

Arguments:
DebugH4 – value = least significant nibble output as a single hex digit (0-F)
DebugH8 – value = output as two hex digits
DebugH16 – value = output as four hex digits
DebugH24 – value = output as six hex digits
DebugD8 – value = output as three decimal digits
DebugD16 – value = output as five decimal digits
DebugD100 – value = output as two decimal digits for values < 100, as 100 if value = 100, and as a beep (or no RS-232 output) if value > 100

Returns:
void

Example:
Init232();
b = WaitByteTimeout232();   // get a byte
DebugH8(b);                 // output its value
Idle232();

Notes:
1) The SetDebug function for class “GENERAL” specifies whether the output from these functions is spoken or sent to the RS-232 port.
2) The Init232 function must have been called if output is directed to RS-232. When output is directed to RS-232, each burst is followed by a space.
3) Serial communications at the application level in VE are always performed at 9600 baud, 8 data bits, no parity, 1 stop bit (the Flash Load operation runs at 115Kbaud).
4) If debug output is to be spoken, then both PWM and DAC output will be used.

See also:

SetDebug, Init232
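When checking RS-232 debug output on a host, it can help to reproduce the digit counts listed for the DebugXXX functions. The standard C sketch below does so for four of them. It is an illustration only: the function names are hypothetical, and zero padding and uppercase hex digits are assumptions, since the manual states only how many digits each function outputs.

```c
#include <stdint.h>
#include <stdio.h>

/* Host-side formatting that mirrors the documented digit counts.
   Zero padding and uppercase hex are assumptions, not documented facts.
   Each buffer must hold the digits plus a terminating NUL. */
void format_h8(uint8_t value, char out[3])
{
    snprintf(out, 3, "%02X", (unsigned)value);   /* two hex digits */
}

void format_h16(uint16_t value, char out[5])
{
    snprintf(out, 5, "%04X", (unsigned)value);   /* four hex digits */
}

void format_d8(uint8_t value, char out[4])
{
    snprintf(out, 4, "%03u", (unsigned)value);   /* three decimal digits */
}

void format_d16(uint16_t value, char out[6])
{
    snprintf(out, 6, "%05u", (unsigned)value);   /* five decimal digits */
}
```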

I/O

The following functions allow full control of the two 8-bit General Purpose I/O ports. Note that Voice Extreme™ requires certain I/O resources as described in the “Interfaces” section. Changing critical I/O pins will cause Voice Extreme™ to crash. Following a RESET, the I/O lines are initially configured per the following table:

I/O Line        Initial Configuration
P0.0 - P0.6     Weak-pullup input (~50K)
P0.7            High output
P1.0 - P1.2     Strong-pullup input (~5K)
P1.3            High output
P1.4            Weak-pullup input (~50K)
P1.5 - P1.6     High output
P1.7            Low output

The ConfigurePortX Functions

Usage:
void ConfigurePort0(UINT8 controlA, UINT8 controlB)
void ConfigurePort1(UINT8 controlA, UINT8 controlB)

Description:
Allows configuration of all pins in a single I/O port.

Arguments:
controlA and controlB = control the function of the specified port:

controlB    controlA    Function
0           0           input, weak (~50K) pullup
0           1           input, strong (~5K) pullup
1           0           input, no pullup
1           1           output

Returns:
void

Example:
ConfigurePort1(0xFF, 0xFF); // configure all P1 pins as outputs

Notes:
Voice Extreme™ requires certain I/O resources as described in the “Interfaces” section. Changing critical I/O pins will cause Voice Extreme™ to crash.

See also: ConfigureIO, ReadPortX, ReadOutputPortX, WritePortX
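The controlA and controlB bytes can be derived from a per-pin list of function codes, using the same 0 to 3 codes that ConfigureIO accepts: the low bit of each code goes to controlA and the high bit to controlB. The standard C sketch below illustrates this. The type and function names are hypothetical, and it assumes that bit n of each control byte corresponds to pin n of the port, as the ConfigurePort1(0xFC, 0xFC) example in the SleepIO section implies.

```c
#include <stdint.h>

/* Pair of control bytes for ConfigurePort0/ConfigurePort1.
   Hypothetical type for illustration only. */
typedef struct {
    uint8_t controlA;
    uint8_t controlB;
} PortControl;

/* Build the control bytes from eight per-pin function codes:
   0 = input, weak pullup     (controlB 0, controlA 0)
   1 = input, strong pullup   (controlB 0, controlA 1)
   2 = input, no pullup       (controlB 1, controlA 0)
   3 = output                 (controlB 1, controlA 1)
   functions[n] describes pin n (assumed bit-to-pin mapping). */
PortControl port_control_from_pins(const uint8_t functions[8])
{
    PortControl pc = {0, 0};
    int bit;

    for (bit = 0; bit < 8; bit++) {
        uint8_t fn = (uint8_t)(functions[bit] & 0x03u);

        if (fn & 0x01u) {
            pc.controlA |= (uint8_t)(1u << bit);
        }
        if (fn & 0x02u) {
            pc.controlB |= (uint8_t)(1u << bit);
        }
    }
    return pc;
}
```

Setting pins 0 and 1 to weak-pullup inputs and the rest to outputs yields 0xFC for both bytes.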

The ConfigureIO Function

Usage:
UINT8 ConfigureIO(INT8 port, INT8 bit, INT8 function)

Description:
Allows configuration of a single I/O pin.

Arguments:
port = 0 or 1
bit = 0-7 (0 is the least significant bit)
function = 0 input, weak (~50K) pullup
           1 input, strong (~5K) pullup
           2 input, no pullup
           3 output

Returns:
0 = OK
1 = one or more function arguments invalid

Example:
ConfigureIO(1, 0, 3);   // configure P1.0 pin as an output

Notes:
Voice Extreme™ requires certain I/O resources as described in the “Interfaces” section. Changing critical I/O pins will cause Voice Extreme™ to crash.

See also: ConfigurePortX, ReadPortX, ReadOutputPortX, WritePortX

The ReadPortX Functions

Usage:
UINT8 ReadPort0(void)
UINT8 ReadPort1(void)

Description:
Reads the value of the input pins of the specified I/O port.

Arguments:
void

Returns:
The value of the input pins of the I/O port.

Example:
p = ReadPort0();    // Read the P0 input pins
Init232();
DebugH8(p);         // output its value
Idle232();

See also: ConfigurePortX, ConfigureIO, ReadOutputPortX, WritePortX

The ReadOutputPortX Functions

Usage:
UINT8 ReadOutputPort0(void)
UINT8 ReadOutputPort1(void)

Description:
Reads the value last output to the specified I/O port.

Arguments:
void

Returns:
The value of the I/O port.

Example:
p = ReadOutputPort0();  // Read the last value output to P0
Init232();
DebugH8(p);             // output its value
Idle232();

See also: ConfigurePortX, ConfigureIO, ReadPortX, WritePortX

The WritePortX Functions

Usage:
void WritePort0(UINT8 value)
void WritePort1(UINT8 value)

Description:
Writes to the output pins of the specified I/O port.

Arguments:
value = the value to be written

Returns:
void

Example:
ConfigurePort1(0xFF, 0xFF); // configure all P1 pins as outputs
WritePort1(0xFF);           // and output an 0xFF

Notes:
Generally 1’s should be written to any input pins in the port.

See also: ConfigurePortX, ConfigureIO, ReadPortX, ReadOutputPortX

The SleepIO Function

Usage:
UINT8 SleepIO(UINT8 port, UINT8 bits, UINT8 states)

Description:
Places the hardware into a low power sleep mode until a specific I/O event happens.

Arguments:
port = 0 or 1
bits = an 8-bit mask with 1’s in the bit positions which will be tested
states = an 8-bit mask of the states the bits must reach (0=low, 1=high)

Returns:
0 = the hardware has reawakened
1 = there was an error in the input parameters, including attempting to wait on a pin that is configured as an output

Example:
ConfigurePort1(0xFC, 0xFC); // configure P1.0 and P1.1 pins as inputs
SleepIO(1, 0x03, 0x02);     // sleep until P1.0 goes low OR P1.1 goes high

Notes:
1) All I/O pins to be considered must be in the same port (P0 or P1).
2) Multiple I/O pins are ‘OR’ed together. In other words, if two or more I/O pins are specified in bits, then wakeup will occur when any of them reaches the level given for it in states.

See also:
ConfigurePortX, ConfigureIO, WaitForIO, SleepT2

The WaitForIO Function

Usage:

UINT8 WaitForIO(UINT8 port, UINT8 bits, UINT8 states)

Description:
Waits until a specific I/O event happens.

Arguments:
port = 0 or 1
bits = an 8-bit mask with 1’s in the bit positions which will be tested
states = an 8-bit mask of the states the bits must reach (0=low, 1=high)

Returns:
0 = the wait is done
1 = there was an error in the input parameters, including attempting to wait on a pin that is configured as an output

Example:
ConfigurePort1(0xFC, 0xFC); // configure P1.0 and P1.1 pins as inputs
WaitForIO(1, 0x03, 0x02);   // wait until P1.0 goes low OR P1.1 goes high

Notes:
1) All I/O pins to be considered must be in the same port (P0 or P1).
2) Multiple I/O pins are ‘OR’ed together. In other words, if two or more I/O pins are specified in bits, then the wait ends when any of them reaches the level given for it in states.

See also:

ConfigurePortX, ConfigureIO, SleepIO
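The bits and states arguments of SleepIO and WaitForIO describe an OR condition: the event fires as soon as any tested pin is at the level requested for it. The standard C sketch below models that test on a sampled port value. It is a host-side illustration with a hypothetical function name, not the firmware's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true when at least one pin selected in `bits` currently has the
   level requested for it in `states` (0 = low, 1 = high).
   `pins` is the sampled port value. Hypothetical model of the documented
   OR behaviour of SleepIO and WaitForIO. */
bool io_event_reached(uint8_t pins, uint8_t bits, uint8_t states)
{
    /* A bit is 1 in `matching` where the pin level equals the wanted level. */
    uint8_t matching = (uint8_t)~(pins ^ states);

    return (uint8_t)(matching & bits) != 0u;
}
```

For the documented call with bits = 0x03 and states = 0x02, the condition is false only while P1.0 is high and P1.1 is low.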

Keypad Functions

These functions deal with a 4x5 matrix keypad but can be used with a standard 4x3 keypad. Refer to the “Interfaces” section for the required I/O mapping for the keypad. The following routines allow the application to specify a key using the following codes:

Key     Key Code
0-9     0-9
*       10
#       11
A-H     12-19

Note that there are no DTMF tones for keys E-H.

The ScanKeypad Function

Usage:

SINT8 ScanKeypad(void)

Description:
Scans an external 4x5 keypad one time.

Arguments:
void

Returns:
The keypad code detected (range 0-19), or
-1 = (NO_KEY) if no keys were detected

Example:
do {
    k = ScanKeypad();   // wait for any key to be pressed
until (k != NO_KEY);

Notes:
1) Keypad scanning is done from top to bottom, left to right.
2) If multiple keys are depressed, only the first detected key code will be returned.
3) No key debouncing is done.

See also: WaitForKeypadPress, WaitForKeypadRelease
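The key code table at the start of the Keypad Functions section can be expressed as a small lookup from the character printed on a key to its code. The standard C sketch below follows that table; keypad_code_from_char is a hypothetical helper name, and returning -1 for an unlisted character is a choice made for this example.

```c
#include <stdint.h>

/* Translate a key label to its keypad code per the table:
   '0'-'9' -> 0-9, '*' -> 10, '#' -> 11, 'A'-'H' -> 12-19.
   Returns -1 for any other character. Hypothetical helper. */
int8_t keypad_code_from_char(char key)
{
    if (key >= '0' && key <= '9') {
        return (int8_t)(key - '0');
    }
    if (key == '*') {
        return 10;
    }
    if (key == '#') {
        return 11;
    }
    if (key >= 'A' && key <= 'H') {
        return (int8_t)(12 + (key - 'A'));
    }
    return -1;
}
```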

The WaitForKeypadPress Function

Usage:
SINT8 WaitForKeypadPress(UINT8 key, UINT8 debounceCount)

Description:
Waits until a keypad button has been pressed for a specified time.

Arguments:
key = the key to scan for (range 0 to 19), or 255 (ANY_KEY)
debounceCount = how many consecutive scans the key must be pressed

Returns:
The keypad code detected (range 0-19), or
-1 = key is out of range (key > 19)

Example:
k = WaitForKeypadPress(0xFF, 20);   // wait for any key to be pressed
WaitForKeypadRelease(0xFF, 5);      // wait for release

Notes:
1) Keypad scanning is done from top to bottom, left to right.
2) If multiple keys are depressed, only the first detected key code will be returned.

See also: ScanKeypad, WaitForKeypadRelease

The WaitForKeypadRelease Function

Usage:
SINT8 WaitForKeypadRelease(UINT8 key, UINT8 debounceCount)

Description:
Waits until a keypad button has been released for a specified number of consecutive scans.

Arguments:
key = The key to scan for (range = 0 to 19), or 255 = (ANY_KEY)
debounceCount = How many consecutive scans the key must be released.

Returns:
0 = success
-1 = if (key > 19) and key is not ANY_KEY

Example:
k = WaitForKeypadPress(0xFF, 20); // wait for any key to be pressed
WaitForKeypadRelease(0xFF, 5); // wait for release

Notes:
1) Keypad scanning is done from top to bottom, left to right.
2) If multiple keys are depressed, only the first detected key code will be returned.

See also: ScanKeypad, WaitForKeypadPress

Timing Functions

The DelayXXX Functions

Usage:
void DelayMilliSeconds(UINT16 milliseconds)
void DelaySeconds(UINT16 seconds)

Description:
Delays program execution for a specified time.

Arguments:
DelayMilliSeconds – milliseconds = The number of milliseconds to delay
DelaySeconds – seconds = The number of seconds to delay

Returns:
void

Example:
DelaySeconds(3); // delay for 3 seconds

The ReadTime Function

Usage:
UINT16 ReadTime(void)

Description:
Returns the time since the last system reset.

Arguments:
void

Returns:
The time, in seconds.

Example:
DebugD16(ReadTime()); // How long has the program been running?

The SetCrystalTimer2 Function

Usage:
void SetCrystalTimer2(void)

Description:
Allows the application program to specify that the seconds counter (OSC2) is configured with a crystal rather than with the standard RC setup.

Arguments:
void

Returns:
void

Example:
SetCrystalTimer2(); // use an external crystal for the T2 timer.

See also: SleepT2

The SleepT2 Function

Usage:
UINT8 SleepT2(UINT16 seconds)

Description:
Places the hardware into a low power sleep mode for a specified amount of time.

Arguments:
seconds = The number of seconds to sleep.

Returns:
0 = the hardware has reawakened
1 = there was an error in the input parameter

Example:
SleepT2(5); // sleep for 5 seconds

Notes:
1) The hardware actually awakens once a second. Each time, the VE-C decrements its sleep seconds counter and returns to sleep unless the counter has expired.
2) The VE-C does not sleep if seconds = 0.

See also: SetCrystalTimer2


Utility Functions

The CopyMemory Function

Usage:
UINT8 CopyMemory(UINT16 numberOfBytes, void *source, void *dest)

Description:
Copies a block of memory from one place to another.

Arguments:
numberOfBytes = The number of bytes to copy.
source = The place to copy from.
dest = The place to copy to.

Returns:
0 = success
1 = part of either the source or dest block is outside the user variable area.

Example:
TEMPLATE srcTemplates[10]; // create 2 arrays for SD recognition
TEMPLATE dstTemplates[10];
CopyMemory(128, &srcTemplates[1], dstTemplates);

Notes:
source and dest may be addresses in either VE-C User RAM or User Flash memory.

See also: EraseFlash, FillMemory

The EraseFlash Function

Usage:
UINT8 EraseFlash(UINT16 numberOfBytes, void *start)

Description:
Erases a block of User Flash memory.

Arguments:
numberOfBytes = The number of bytes to erase.
start = The place to start erasure from.

Returns:
0 = success
1 = part of the user flash block is outside the user variable area.

Example:
TEMPLATE myTemplates[10]; // create an array for SD recognition
EraseFlash(128, &myTemplates[9]);

Notes:
1) start must refer to an address in VE-C User Flash memory.
2) Erasure is done by writing ‘0xFF’s into the User Flash memory block.

See also: CopyMemory, FillMemory

The FillMemory Function

Usage:
UINT8 FillMemory(UINT16 numberOfBytes, UINT8 value, void *dest)

Description:
Fills a block of memory with a specified constant.

Arguments:
numberOfBytes = The number of bytes to fill.
value = The byte value to fill with.
dest = The place to start filling.

Returns:
0 = success
1 = part of the dest block is outside the user variable area.

Example:
TEMPLATE myTemplates[10];
FillMemory(128, 0xFF, &myTemplates[9]);

Notes:
dest may be an address in either VE-C User RAM or User Flash memory.

See also: CopyMemory, EraseFlash

The GetApplicationText Function

Usage:
UINT24* GetApplicationText(void)

Description:
Returns a pointer to the application text string.

Arguments:
void

Returns:
A pointer to the application text string stored in the Boot Block of the program.

Example:
#pragma VE_APP_TEXT 'Hello, world\n'
Init232();
WriteString(GetApplicationText()); // Output 'Hello, world'
Idle232();

Notes:
The text pointed to by this function is declared by a #pragma VE_APP_TEXT line in the source code.

See also: WriteString232

The GetFirstTime Function

Usage:
BOOL GetFirstTime(void)

Description:
Returns TRUE the first time an application program is run and FALSE all other times.

Arguments:
void

Returns:
TRUE the first time this function is called and FALSE all other times.

Example:
if (GetFirstTime()) {
    … // do initialization stuff
}

Notes:
This function allows the application program to initiate specialized first-time actions (such as zeroing a directory structure or training templates).

See also: ResetSystem

The GetVersion Function

Usage:
UINT8 GetVersion(UINT8 type)

Description:
Returns one of four version numbers from the application.

Arguments:
type = The version number to be returned:
0 Application Version
1 Boot Block Version
2 Command Format Version
3 Parser Version

Returns:
The version number

Example:
Init232();
WriteString(GetVersion(0)); // Output application version
Idle232();

Notes:
If type is > 3, it is treated as if it were 0 and the Application Version is returned.

The Random Function

Usage:
UINT8 Random(UINT8 seed)

Description:
Returns a pseudo-random 8-bit number.

Arguments:
seed = A seeding number

Returns:
A pseudo-random number.

Example:
UINT8 seed;
seed = Random(seed);
Init232();
WriteString(seed); // Output a random number
Idle232();

Notes:
A pseudo-random sequence can be created by setting seed to the value returned on the last call; however, note that in such a sequence the least significant bit just alternates between 0 and 1.
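An alternating least significant bit is typical of linear congruential generators whose modulus is a power of two. The manual does not say which algorithm Random uses, so the sketch below is only an assumption-laden illustration: lcg8_next is a hypothetical 8-bit generator (multiplier 5 and increment 3 chosen purely for the example) that shows the same behavior, and lcg8_two_bits shows one common workaround, which is to take the upper bits when a small random value is needed.

```c
#include <stdint.h>

/* Hypothetical 8-bit linear congruential step: next = 5*seed + 3 (mod 256).
 * This is NOT the algorithm behind Random(); it only reproduces the
 * alternating low bit that the note describes. */
uint8_t lcg8_next(uint8_t seed)
{
    return (uint8_t)(5u * seed + 3u);   /* the cast keeps the low 8 bits */
}

/* Workaround sketch: build a small value (0..3) from the two upper bits,
 * which have a much longer period than bit 0 in this kind of generator. */
uint8_t lcg8_two_bits(uint8_t value)
{
    return (uint8_t)(value >> 6);
}
```

Starting from 0, this generator yields 3, 18, 93, 212: odd, even, odd, even. An application that only tested bit 0 of each value would see a strictly repeating pattern.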

The ResetSystem Function

Usage:
void ResetSystem(BOOL firstTime)

Description:
Performs a soft reset of the application program.

Arguments:
firstTime = Controls whether the GetFirstTime function is also affected.

Returns:
void (program restarts)

Example:
ResetSystem(FALSE); // reset app, but not the GetFirstTime variable

Notes:
If firstTime is non-zero, the reset also reinitializes the global system variable, so the program behaves as though it were being executed for the first time.

See also: GetFirstTime

The SetLEDOutput Function

Usage:
BOOL SetLEDOutput(BOOL onOff)

Description:
Enables or disables the error output normally sent to the Development Board LEDs.

Arguments:
onOff = TRUE enables LED output; FALSE disables LED output.

Returns:
TRUE = success
FALSE = onOff not equal to 0 or 1.

Example:
SetLEDOutput(FALSE); // disable error output to the LEDs

Notes:
The VE-C default = SetLEDOutput(TRUE)


The SetMicDistance Function

Usage:
BOOL SetMicDistance(UINT8 Distance)

Description:
Affects the internal amplifier gain setting. Choose the value most appropriate for the application.

Arguments:
Distance
1 = HEADSET – the microphone is an inch or so from the mouth.
2 = ARMS_LENGTH – normal nearby operation.
3 = FAR_MIC – the microphone is about 10 feet away.

Returns:
0 = fail
1 = okay

Example:
SetMicDistance(HEADSET);

Notes:
The VE-C default is ARMS_LENGTH


Serial Packets Overview

To simplify Host/Slave applications, Voice Extreme™ supports packet communication. In this protocol, all data is transmitted in 8-bit bytes, and all messages are sent in packets. The packet format conveys error checking and byte synchronization information. A packet always starts with a “sync field” (one or more bytes of 0xFF), followed by a length byte, then one or more data bytes, then a checksum byte. The length byte specifies the number of bytes to follow, including the checksum. The checksum is the 8-bit, modulo-256 sum of the length byte and all the data bytes. For example, the data 01, 02, 03, 04 would be sent as follows:

Sample Data Packet

Byte  Value  Notes
0     0xFF   Sync field, 8 or more consecutive 1 bits
1     0x05   Packet length, count of bytes to follow
2     0x01   First data byte
3     0x02   Second data byte
4     0x03   Third data byte
5     0x04   Fourth data byte
6     0x0F   Checksum (5+1+2+3+4)
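The framing rules above can be sketched in portable C, for example for host-side code. This is an illustration of the packet layout only, not the VE-C SendPacket built-in; the function build_packet and its parameters are hypothetical. The sketch also assumes the length byte should never be 0xFF, since a receiver that skips 0xFF bytes as sync would not recognize it as a length.

```c
#include <stddef.h>
#include <stdint.h>

#define PACKET_SYNC 0xFF   /* sync byte value from the packet description */

/* Hypothetical helper: frames `len` data bytes into `out` as
 * sync, length, data, checksum. Returns the number of bytes written,
 * or 0 if the arguments do not fit. */
size_t build_packet(const uint8_t *data, size_t len, uint8_t *out, size_t out_size)
{
    size_t i;
    uint8_t length_byte;
    uint8_t sum;

    /* Assumption: keep the length byte at 0xFE or below so it cannot be
     * mistaken for another sync byte. */
    if (len < 1 || len > 253 || out_size < len + 3)
        return 0;

    length_byte = (uint8_t)(len + 1);   /* data bytes plus the checksum byte */
    sum = length_byte;                  /* checksum covers the length byte... */

    out[0] = PACKET_SYNC;
    out[1] = length_byte;
    for (i = 0; i < len; i++) {
        out[2 + i] = data[i];
        sum = (uint8_t)(sum + data[i]); /* ...and every data byte, modulo 256 */
    }
    out[2 + len] = sum;
    return len + 3;
}
```

For the data 01, 02, 03, 04 this produces the seven bytes FF 05 01 02 03 04 0F, matching the sample packet.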

Serial Packets Implementation

The VE-C application requests a HOST packet by calling the GetPacket built-in function. The packet is received and stored in its entirety, and the application proceeds to interpret the individual data bytes as needed. Similarly, the SendPacket function packages and transmits an entire packet to the HOST. Packet/byte synchronization is accomplished at the byte level as follows:

1) The receiver receives bytes until one or more 0xFF bytes (all bits are one) are received.
2) The receiver then waits for the first non-FF byte and uses this as the packet length byte.
3) Once the start of a packet is found, the receiver accepts (length) bytes and performs the checksum calculation (it verifies that the sum of the length byte and all the data bytes is equal to the checksum byte). Note that modulo-256 arithmetic is used; the carry is discarded during the checksum calculation.

When using the packet protocol, all packet communication is initiated by the HOST; the Voice Extreme™ never sends packet data unless requested by the HOST. Accordingly, all commands are in one of the following formats:

1) A request from the HOST to the Voice Extreme™. A request could consist of a command byte and possible parameters or data.
2) A response from the Voice Extreme™ to the HOST. A response could consist of a status byte and possible data.

After a command is issued to the Voice Extreme™, the HOST must wait until the response is received from the Voice Extreme™. The HOST cannot interrupt a command.
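The receive-side synchronization and checksum steps described above can be sketched as follows. This is a minimal illustration in portable C that works on a buffer of already-received bytes; it is not the GetPacket built-in, and the function parse_packet and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical receiver sketch. On success copies the data bytes to `data`,
 * stores their count in `*data_len`, and returns 1. Returns 0 if no
 * complete, valid packet is found. `data` may be partly written on failure. */
int parse_packet(const uint8_t *rx, size_t rx_len,
                 uint8_t *data, size_t data_size, size_t *data_len)
{
    size_t i = 0;
    size_t n, j;
    uint8_t length_byte, sum;

    while (i < rx_len && rx[i] != 0xFF)   /* discard bytes until a sync byte */
        i++;
    if (i == rx_len)
        return 0;                          /* no sync byte seen */
    while (i < rx_len && rx[i] == 0xFF)   /* skip the whole sync field */
        i++;
    if (i == rx_len)
        return 0;                          /* nothing after the sync field */

    length_byte = rx[i++];                 /* first non-FF byte is the length */
    if (length_byte < 2)
        return 0;                          /* need one data byte plus checksum */
    if (rx_len - i < length_byte)
        return 0;                          /* packet is incomplete */

    n = (size_t)length_byte - 1;           /* number of data bytes */
    if (n > data_size)
        return 0;                          /* caller's buffer is too small */

    sum = length_byte;
    for (j = 0; j < n; j++) {
        data[j] = rx[i + j];
        sum = (uint8_t)(sum + rx[i + j]);  /* carry is discarded (modulo 256) */
    }
    if (sum != rx[i + n])
        return 0;                          /* checksum mismatch */

    *data_len = n;
    return 1;
}
```

Fed the bytes 00 FF FF 05 01 02 03 04 0F, the sketch skips the stray leading byte and both sync bytes, then returns the four data bytes. Changing the final byte makes the checksum test fail and the function return 0.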


Disclaimer

WARNING

This kit is intended for use by consumers experienced with building electronic kits. As with any electronic kit, caution should be exercised during assembly, and all connections should be double-checked to ensure that they are clean, safe and properly soldered before applying any power source.

Important Disclaimer

To the fullest extent permitted by applicable law, Sensory, Inc. expressly disclaims the implied warranty of fitness for a particular purpose. Customer should understand that Sensory does not make any representation that products purchased will suit customer’s particular purpose. Customer must rely on customer’s own skill or judgment in selecting suitable products for customer. To the extent warranty is applicable, such warranty shall be limited to 90 days from the date of purchase.

Voice Extreme™ Toolkit Limited Warranty

The Voice Extreme™ Toolkit is warranted against defects and workmanship for a period of 90 days from the date of product purchase. Sensory, Inc. will, at its option, either repair or replace a product that proves to be defective either upon receipt or through normal usage. If a Sensory Speech Recognition Kit product has become obsolete or is no longer in production and deemed irreparable, Sensory will, at its option, provide an equivalent product or system for a nominal fee. Sensory, Inc. warrants this Speech Recognition Kit product, when properly installed and used, will execute its programmed instructions. However, Sensory, Inc. does not warrant that the operation of the Product, its firmware and software will be uninterrupted or totally error free. The Product must be returned to Sensory, Inc. for warranty service within the warranty period to the following address: 1991 Russell Ave., Santa Clara, CA 95054. The Buyer will pay all shipping and other charges or assessments for the return of the Product to Sensory, Inc.

Limitation of Warranty

The foregoing warranty shall not apply to defects resulting from maintenance performed by anyone other than Sensory, Inc., modifications made by Buyer or any third party, Buyer supplied software or interfacing, misuse, abuse, accident, mishandling, operation outside the environmental specifications for the Product, or improper setup or maintenance.

Limitation of Liability

Sensory's liability shall be limited to the repair or replacement of defective products in accordance with the Voice Extreme™ Toolkit Limited Warranty. Sensory shall not be liable for any incidental, special or consequential damages for breach of any warranty, expressed or implied, directly or indirectly arising out of Sensory's sale of merchandise, including any failure to deliver any merchandise, or arising out of customer's installation or use, whether proper or improper, of the product, separately or in combination with other equipment, or from any other cause. Products sold by Sensory are not authorized for use as critical components in life support devices or systems.

Exclusive Remedies

The remedies provided herein are Sensory’s sole liability and Buyer’s sole and exclusive remedies for breach of warranty. Sensory shall not be liable for any special, incidental, consequential, direct or indirect damages, whether based on contract, tort, or any legal theory. The foregoing warranty is in lieu of any and all other warranties, whether express, implied, or statutory, including but not limited to warranties of merchantability and suitability for a particular purpose.

IMPORTANT NOTICES

Sensory reserves the right to make changes to or to discontinue any product or service identified in this publication at any time without notice in order to improve design and supply the best possible product. Sensory does not assume responsibility for use of any circuitry other than circuitry entirely embodied in a Sensory product. Information contained herein is provided gratuitously and without liability to any user. Reasonable efforts have been made to verify the accuracy of this information but no guarantee whatsoever is given as to the accuracy or as to its applicability to particular uses. Applications described in this data sheet are for illustrative purposes only, and Sensory makes no warranties or representations that the RSC series of products will be suitable for such applications. In every instance, it must be the responsibility of the user to determine the suitability of the products for each application. Sensory products are not authorized for use as critical components in life support devices or systems. Sensory conveys no license or title, either expressed or implied, under any patent, copyright, or mask work right to the RSC series of products, and Sensory makes no warranties or representations that the RSC series of products are free from patent, copyright, or mask work right infringement, unless otherwise specified. Nothing contained herein shall be construed as a recommendation to use any product in violation of existing patents or other rights of third parties. The sale of any Sensory product is subject to all Sensory Terms and Conditions of Sales and Sales Policies.


SENSORY Software End User License Agreement

Important: this software end user license agreement ("EULA") is a legal agreement between you and Sensory. Read it carefully before completing the installation process and using the software. It provides a license to use the software and contains warranty information and liability disclaimers. By installing and using the software, you are confirming your acceptance of the software and agreeing to become bound by the terms of this agreement. If you do not agree to be bound by these terms, then select the "cancel" button and do not install the software.

1. Definitions

(a) "Sensory" means Sensory, Inc. and its suppliers and licensors, if any.

(b) "Not For Resale (NFR) Version" means a version of the Software, so identified, to be used to review and evaluate the Software, only.

(c) "Software" means the software program supplied by Sensory herewith, which may also include documentation, associated media, printed materials, and online and electronic documentation.

2. License

This EULA allows you to:

(a) Install and use the Software on a single computer; OR install and store the Software on a storage device, such as a network server, used only to run or install the Software on your other computers over an internal network, provided you have a license for each separate computer on which the Software is installed or run from the storage device. A license for the Software may not be shared or used concurrently on different computers.

(b) Make one copy of the Software in machine-readable form solely for backup purposes. You must reproduce on any such copy all copyright notices and any other proprietary legends on the original copy of the Software.

3. License Restrictions

(a) Other than as set forth in Section 2, you may not make or distribute copies of the Software, or electronically transfer the Software from one computer to another or over a network.

(b) You may not decompile, reverse engineer, disassemble, or otherwise reduce the Software to a human-perceivable form.

(c) You may not sell, rent, lease, transfer or sublicense the Software.

(d) You may not modify the Software or create derivative works based upon the Software.

(e) You may not export the Software into any country prohibited by the United States Export Administration Act and the regulations thereunder.

(f) In the event that you fail to comply with this EULA, Sensory may terminate the license and you must destroy all copies of the Software.

4. Upgrades

If this copy of the Software is an upgrade from an earlier version of the Software, it is provided to you on a license exchange basis. You agree by your installation and use of this copy of the Software to voluntarily terminate your earlier EULA and that you will not continue to use the earlier version of the Software or transfer it to another person or entity.

5. Ownership

The foregoing license gives you limited rights to use the Software. Sensory and its suppliers retain all right, title and interest, including all copyrights, in and to the Software and all copies thereof. All rights not specifically granted in this EULA, including Federal and International Copyrights, are reserved by Sensory and its suppliers.

6. Limited warranty and disclaimer

(a) Limited warranty. Sensory warrants that, for a period of ninety (90) days from the date of delivery (as evidenced by a copy of your receipt): (i) when used with a recommended hardware configuration, the software will perform in substantial conformance with the documentation supplied with the software; and (ii) that the physical media on which the software is furnished will be free from defects in materials and workmanship under normal use.

(b) No other warranty. Except as set forth in the foregoing limited warranty, Sensory and its suppliers disclaim all other warranties, either express or implied, or otherwise, including the warranties of merchantability and fitness for a particular purpose. Also, there is no warranty of noninfringement, title or quiet enjoyment. If applicable law implies any warranties with respect to the software, all such warranties are limited in duration to ninety (90) days from the date of delivery. No oral or written information or advice given by Sensory, its dealers, distributors, agents or employees shall create a warranty or in any way increase the scope of this warranty.

(c) Some states (USA only) do not allow the exclusion of implied warranties, so the above exclusion may not apply to you. This warranty gives you specific legal rights and you may also have other legal rights that vary from state to state.

7. Exclusive Remedy

Your exclusive remedy under Section 6 is to return the Software to the place you acquired it, with a copy of your receipt and a description of the problem. Sensory will use reasonable commercial efforts to supply you with a replacement copy of the Software that substantially conforms to the documentation, provide a replacement for defective media, or refund to you your purchase price for the Software, at its option. Sensory shall have no responsibility if the Software has been altered in any way, if the media has been damaged by accident, abuse or misapplication, or if the failure arises out of use of the Software with other than a recommended hardware configuration.

8. Limitation of liability

(a) Neither Sensory nor its suppliers shall be liable to you or any third party for any indirect, special, incidental or consequential damages (including damages for loss of business, loss of profits, business interruption or the like), arising out of the use or inability to use the software or this EULA based on any theory of liability including breach of contract, breach of warranty, tort (including negligence), product liability or otherwise, even if Sensory or its representatives have been advised of the possibility of such damages and even if a remedy set forth herein is found to have failed of its essential purpose.

(b) Sensory’s total liability to you for actual damages for any cause whatsoever will be limited to the greater of $500 U.S. dollars or the amount paid by you for the software that caused such damage.

(c) (USA only) Some states do not allow the limitation or exclusion of liability for incidental or consequential damages, so the above limitation or exclusion may not apply to you and you may also have other legal rights that vary from state to state.

9. Basis of Bargain

The Limited Warranty, Exclusive Remedies and Limited Liability set forth above are fundamental elements of the basis of the agreement between Sensory and you. Sensory would not be able to provide the Software on an economic basis without such limitations.

10. U.S. GOVERNMENT RESTRICTED RIGHTS LEGEND

This Software and the documentation are provided with "RESTRICTED RIGHTS". Use, duplication, or disclosure by the U.S. Government is subject to restrictions as set forth in this EULA and as provided in DFARS 227.7202-1(a) and 227.7202-3(a) (1995), DFARS 252.227-7013 (c)(1)(ii)(OCT 1988), FAR 12.212(a)(1995), FAR 52.227-19, or FAR 52.227-14, as applicable. Manufacturer: Sensory, Inc., 1991 Russell Ave. Santa Clara, CA 95054.

11. Consumer End Users Only (outside of the USA)

The limitations or exclusions of warranties and liability contained in this EULA do not affect or prejudice the statutory rights of a consumer, i.e., a person acquiring goods otherwise than in the course of a business.

12. General Provisions

This EULA shall be governed by the internal laws of the State of California. This EULA contains the complete agreement between the parties with respect to the subject matter hereof, and supersedes all prior or contemporaneous agreements or understandings, whether oral or written. All questions concerning this EULA shall be directed to: Sensory, Inc., 1991 Russell Ave. Santa Clara, CA 95054, attention: General Counsel.


The Interactive Speech™ Product Line

The Interactive Speech line of ICs and software was developed to “bring life to products” through advanced speech recognition and audio technology. The Interactive Speech Product Line was designed for consumer telephony products and cost-sensitive consumer electronic applications such as home electronics, personal security, and personal communication. The product line includes award-winning RSC series general-purpose microcontrollers and tools, SC series of speech microcontrollers, plus a line of easy-to-implement chips that can be pin-configured or controlled by an external host microcontroller. Sensory’s software technologies run on a variety of microcontrollers and DSPs.

RSC Microcontrollers and Tools

The RSC product line contains low-cost 8-bit speech-optimized microcontrollers designed for use in consumer electronics. All members of the RSC family are fully integrated and include A/D, pre-amplifier, D/A, ROM, and RAM circuitry. The RSC family can perform a full range of speech/audio functions including speech recognition, speaker verification, speech and music synthesis, and voice record/playback. The family is supported by a complete suite of evaluation tools and development kits.

SC Microcontrollers and Tools

The SC-6x product line features the highest quality speech synthesis ICs at the lowest data rate in the industry. The line includes a 12.32 MIPS processor for high-quality low data-rate speech compression and MIDI music synthesis, with plenty of power left over for other processor and control functions. Members of the SC-6x line can store as much as 37 minutes of speech on chip and include as many as 64 I/O pins for external interfacing. Integrating this broad range of features onto a single chip enables developers to create products with high quality, long duration speech at very competitive price points.

Application Specific Standard Products (ASSPs)

Voice Direct™ 364 provides inexpensive speaker-dependent speech recognition and speech synthesis. This easy-to-use, pin-configurable chip requires no custom programming and can recognize up to 60 trained words in slave mode, and 15 words in stand-alone mode. Ideal for speaker-dependent command and control of household consumer products, Voice Direct 364 is part of a complete product line that includes the IC, module, and Voice Direct 364 Speech Recognition Kit.

Voice Extreme™ simplifies the creation of fully custom speech-enabled products by offering developers the capability of programming the chip in a high-level C-like language. Program code, speech data, and even record and playback information can be stored on a single off-chip Flash memory. Based on Sensory's RSC-364 speech processor, Voice Extreme includes a highly efficient on-chip code interpreter, and is supported by a comprehensive suite of low-cost development tools.

Software and Technology

Voice Activation™ micro footprint software provides advanced speech technology on a variety of microcontroller and DSP platforms. A flexible design with a broad range of technologies allows manufacturers to easily integrate speech functionality into consumer electronic products.

Fluent Speech™ small footprint software recognizes up to 50,000 words; offers Animated Speech with the ability to automate enunciation and articulation; performs text-to-speech synthesis in either male or female voices; provides noise and echo cancellation, performs Wordspotting for natural language usage; offers telephone barge-in; and provides continuous digit recognition.

Important notices

Reasonable efforts have been made to verify the accuracy of information contained herein, however no guarantee can be made of accuracy or applicability. Sensory reserves the right to change any specification or description contained herein.

1991 Russell Ave., Santa Clara, CA 95054
Tel: (408) 327-9000 Fax: (408) 727-4748
www.sensoryinc.com

© 2002 SENSORY, INC. ALL RIGHTS RESERVED. Sensory is registered by the U.S. Patent and Trademark Office. All other trademarks or registered trademarks are the property of their respective owners.