Speech Interface to Virtual Reality Applications
![Page 1: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/1.jpg)
Speech Interface to Virtual Reality Applications
Reporter: Chun-Feng Liao
Authors: Wauchope, K., S. Everett, D. Tate, T. Maney; M. Cernak, A. Sannier
![Page 2: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/2.jpg)
References
This report discusses two implementations of a speech interface to virtual reality applications.
M. Cernak, A. Sannier, "Command Speech Interface to Virtual Reality Applications," Technical Report, Virtual Reality Applications Center, Iowa State University of Science and Technology, June 2002.
Wauchope, K., S. Everett, D. Tate, T. Maney, "Speech-Interactive Virtual Environments for Ship Familiarization," 2nd International EuroConference on Computer and IT Applications in the Maritime Industries (COMPIT '03), Hamburg, Germany, May 14-17, 2003, pp. 70-83.
![Page 3: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/3.jpg)
![Page 4: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/4.jpg)
Agenda
• Introduction
• Paper I
• Paper II
• Conclusion
• System design
• Discussion
![Page 5: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/5.jpg)
Introduction
• Both papers were recently published (2002, 2003).
• Both address the technical details of speech-VR integration.
• The second paper takes a more modern approach.
• Both use a similar architecture (which is also similar to ours).
• Example: the authors chose a VRML + Java Speech API platform, encountered several difficult problems such as Java security constraints, and were forced to use "browser as an application" instead of "browser as an applet".
![Page 6: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/6.jpg)
Paper I
M. Cernak, A. Sannier, "Command Speech Interface to Virtual Reality Applications," Technical Report, Virtual Reality Applications Center, Iowa State University of Science and Technology, June 2002.
![Page 7: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/7.jpg)
Purposes of this paper
• Describe an approach to controlling VR applications using a multimodal command speech interface (CSI) based on dialog modeling.
• The interface is used to improve the usability of VRAC's C6.
VRAC: Virtual Reality Applications Center
C6 is a virtual reality system developed by VRAC.
![Page 8: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/8.jpg)
Multimodal Interaction
U: MoleBio
S: Yes
U: (targets atom 512 with the mouse)
U: Go there!
S: OK (go to atom number 512).
U: User, S: System
Command addressing is used to trigger the system to start recording the user's voice for recognition.
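The exchange above can be sketched in code. This is only a minimal illustration, assuming a toy dialog manager that waits for the address word and then resolves "go there" against the last mouse target. The class names, the replies "Where?" and "Sorry?", and the one-command-per-address rule are assumptions for illustration, not details taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PointerState:
    """Last object targeted with the mouse (hypothetical structure)."""
    target_id: Optional[int] = None


class CommandSpeechInterface:
    """Toy dialog manager: address word first, then one spoken command."""

    def __init__(self, address_word: str, pointer: PointerState) -> None:
        self.address_word = address_word.lower()
        self.pointer = pointer
        self.listening = False  # True only after the address word was heard

    def hear(self, utterance: str) -> str:
        # Normalize case and drop trailing punctuation such as " !".
        text = utterance.strip().lower().rstrip("!.?").strip()
        if not self.listening:
            if text == self.address_word:
                self.listening = True
                return "Yes"
            return ""  # speech not addressed to the system is ignored
        self.listening = False  # assumed: one command per address
        if text == "go there":
            if self.pointer.target_id is None:
                return "Where?"  # deictic command without a pointer target
            return f"OK (goto atom number {self.pointer.target_id})"
        return "Sorry?"
```

Usage: create `CommandSpeechInterface("MoleBio", pointer)`, pass "MoleBio" to `hear`, set `pointer.target_id = 512`, then pass "Go there!". The point of the sketch is that the spoken command carries the verb while the pointer supplies the object.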
![Page 9: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/9.jpg)
System Architecture
Dialog Management and Speech facilities VR System
![Page 10: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/10.jpg)
System Architecture
• VR: VRAC's C6
• TTS: Festival
• SR: CSLU Toolkit
• Platform: Windows OS on a PII 400
![Page 11: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/11.jpg)
Three Main Components (1)
Speech synthesis (TTS): Festival.
![Page 12: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/12.jpg)
Three Main Components (2)
CSLU Toolkit: dialog modeling, speech recognition, and natural language processing.
The CSLU Toolkit is implemented in C and Tcl/Tk and was developed at OGI (Oregon Graduate Institute).
CSLU: Center for Spoken Language Understanding
![Page 13: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/13.jpg)
![Page 14: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/14.jpg)
Three Main Components (3)
Communication bridge to the VR application, used to integrate CSLU (speech) and C6 (VR).
![Page 15: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/15.jpg)
How to Integrate CSLU and C6
Initial attempt: CORBA
• C6 supports CORBA.
• Tried to use "Combat" as a Tcl extension acting as the CORBA client, but failed.
• Tried to use "Tcl Blend":
- Tcl -> Java -> CORBA -> C6 (efficiency problems)
• Result: use a TCP socket.
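A TCP bridge of this kind can be sketched as follows. This is a minimal sketch under stated assumptions: the newline-terminated text protocol, the "OK" acknowledgement, and the command string are hypothetical, and a small local stub stands in for the VR side (the real system used Tcl on the speech side and C6 on the VR side).

```python
import socket
import threading
from typing import List, Tuple


def read_line(sock: socket.socket) -> str:
    """Read bytes until a newline; return the line without the terminator."""
    data = bytearray()
    while not data.endswith(b"\n"):
        chunk = sock.recv(1024)
        if not chunk:  # peer closed the connection
            break
        data.extend(chunk)
    return data.decode("utf-8").strip()


def start_vr_stub(received: List[str]) -> Tuple[threading.Thread, int]:
    """Stand-in for the VR side: accept one connection, read one command."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve() -> None:
        conn, _ = server.accept()
        with conn:
            received.append(read_line(conn))
            conn.sendall(b"OK\n")  # hypothetical acknowledgement
        server.close()

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    return thread, port


def send_command(port: int, command: str) -> str:
    """Speech side: send one recognized command and wait for the reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        sock.sendall((command + "\n").encode("utf-8"))
        return read_line(sock)
```

Usage: call `start_vr_stub(received)` to obtain a port, then `send_command(port, "goto atom 512")`. A plain socket keeps both sides language-neutral, which is presumably why it worked where the CORBA bindings for Tcl did not.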
![Page 16: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/16.jpg)
Natural Language Processing
Instead of using standard JSGF, the authors defined a custom grammar and wrote a dedicated parser to evaluate it.
The custom grammar is very similar to JSGF.
We will not discuss the custom grammar in detail here.
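To give a feel for what evaluating such a grammar involves, here is a small sketch. It does not reproduce the authors' grammar or parser: the rule, its JSGF-like notation in the comment, and the list-of-tuples encoding are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical rule, written in JSGF-like notation:
#   <command> = (go | move) [to] (there | home);
# Encoded as a sequence of (alternatives, optional) elements.
Rule = List[Tuple[List[str], bool]]

COMMAND_RULE: Rule = [
    (["go", "move"], False),
    (["to"], True),
    (["there", "home"], False),
]


def matches(rule: Rule, words: List[str]) -> bool:
    """Return True if the word sequence is accepted by the rule."""
    if not rule:
        return not words  # accept only if every word was consumed
    (alternatives, optional), rest = rule[0], rule[1:]
    # Try to consume the next word with this element.
    if words and words[0] in alternatives and matches(rest, words[1:]):
        return True
    # Otherwise the element may be skipped only if it is optional.
    return optional and matches(rest, words)


def accepts(utterance: str) -> bool:
    """Tokenize a recognized utterance and check it against the rule."""
    return matches(COMMAND_RULE, utterance.lower().split())
```

Usage: `accepts("go there")` and `accepts("Move to home")` are accepted, while `accepts("go to")` is rejected because the required final element is missing.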
![Page 17: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/17.jpg)
CSI Test Environment
A RAD (GUI) tool that helps developers quickly build the dialog flow.
![Page 18: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/18.jpg)
Paper I Conclusion
• The major advantage of this system is quick deployment.
• The problematic area is that the speech recognition accuracy (provided by CSLU) was poor.
• The US Navy has also developed a speech interface to a VR system; the authors plan to improve interaction with VR along the lines of that method.
![Page 19: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/19.jpg)
Future Work
Change TTS and SR to IBM ViaVoice.
• It supports JSAPI (Java Speech API).
• Java makes it easier to communicate with C6 via CORBA.
![Page 20: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/20.jpg)
![Page 21: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/21.jpg)
Paper II
Wauchope, K., S. Everett, D. Tate, T. Maney, "Speech-Interactive Virtual Environments for Ship Familiarization," 2nd International EuroConference on Computer and IT Applications in the Maritime Industries (COMPIT '03), Hamburg, Germany, May 14-17, 2003, pp. 70-83.
![Page 22: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/22.jpg)
Introduction
This paper introduces two systems that help crew members newly aboard US Navy ships become familiar with their environment quickly.
User: Tell me where Room 101 is!
![Page 23: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/23.jpg)
Motivation
• Architects of US Navy ships rely heavily on CAD tools to design ship models.
• CAD files can be converted to a 3D model format with little effort.
• According to the authors' previous research, this virtual environment did shorten crews' learning time.
![Page 24: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/24.jpg)
Systems introduced
Two systems:
• MSFT (Multimodal Ship Familiarization Tool)
• ISFS (Interactive Ship Familiarization System)
ISFS is a recent transition of MSFT.
![Page 25: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/25.jpg)
System Architecture: MSFT
Components run as separate processes.
![Page 26: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/26.jpg)
MSFT
The VE viewer component and the speech interface run as two separate processes.
Speech interface: an all-IBM solution:
• ViaVoice
• IBM's SMAPI
• IBM's SRCL grammar
Platform: PIII 500 MHz
![Page 27: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/27.jpg)
ISFS
A recent transition of MSFT.
Uses VRML as the 3D modeling language.
Uses JSAPI as the interface to the speech engine.
• ViaVoice fully supports JSAPI.
• VRML supports Java as a scripting language.
The rest of the structure is identical to the MSFT system.
Platform: Xeon 2.0 GHz -> needs more computing power!
![Page 28: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/28.jpg)
Why Choose a Standalone VRML Browser?
• Security limitations (details discussed later).
• VM limitations (details discussed later).
• It provides opportunities to customize the interface to the VRML browser.
• In my personal experience, the system usually becomes unstable when the speech engine works with a VRML plug-in via EAI's Java interface.
![Page 29: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/29.jpg)
Security Limitations
• The JRE imposes security limitations on Java applets.
• JSAPI was unable to establish a connection with the speech engine unless the security settings were explicitly reconfigured.
![Page 30: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/30.jpg)
Limited VM
Most VRML browsers' EAI implementations are built on ActiveX, so they only support Microsoft's old VM, which lacks most modern Java features.
• Example: this may force us to use Java AWT instead of Swing, which provides a better GUI.
![Page 31: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/31.jpg)
Providing GUI as VUI Fallback
GUI provides a fallback in case the speech recognizer is having trouble accurately transcribing the user’s voice.
GUI is adjusted dynamically to provide one-to-one correspondence to VUI .
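One way to obtain this one-to-one correspondence is to derive both the active phrases and the visible buttons from a single command table. The sketch below assumes that design; the class names, phrases, labels, and replies are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Command:
    phrase: str                # what the user may say (VUI)
    label: str                 # what the matching button shows (GUI)
    action: Callable[[], str]  # shared handler for both modalities


class DualInterface:
    """Keeps the speech vocabulary and the buttons in sync."""

    def __init__(self) -> None:
        self._by_phrase: Dict[str, Command] = {}
        self._by_label: Dict[str, Command] = {}

    def set_context(self, commands: List[Command]) -> None:
        # Rebuild both views from the same list so they cannot drift apart.
        self._by_phrase = {c.phrase.lower(): c for c in commands}
        self._by_label = {c.label: c for c in commands}

    def active_phrases(self) -> List[str]:
        return sorted(self._by_phrase)

    def button_labels(self) -> List[str]:
        return sorted(self._by_label)

    def on_speech(self, utterance: str) -> str:
        command = self._by_phrase.get(utterance.strip().lower())
        return command.action() if command else "not recognized"

    def on_click(self, label: str) -> str:
        return self._by_label[label].action()
```

Usage: after `set_context([...])`, saying a phrase through `on_speech` and clicking the matching button through `on_click` run the same action, so the button is always available when recognition fails.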
![Page 32: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/32.jpg)
Paper II Conclusion
• The speech interface is needed because the GUI and the VE viewer both rely on direct manipulation and keep the user's hands too busy.
• As HCI becomes increasingly multimodal, care must be taken to integrate the modalities in a natural manner.
![Page 33: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/33.jpg)
Future Work
• VRML models are closer to an object-oriented, tree-structured representation, which is hard to represent in an RDBMS.
• Some way must be found to store model data easily and efficiently.
Personal thought: use an XML database.
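To illustrate why a tree-shaped store fits this data, the sketch below serializes a small scene tree to XML and queries it by path. It uses plain XML from the Python standard library rather than an actual XML database, and the node types, names, and nesting are made-up example data.

```python
import xml.etree.ElementTree as ET

# Hypothetical scene tree: a ship with decks that contain rooms.
SCENE = {
    "type": "Ship", "name": "demo", "children": [
        {"type": "Deck", "name": "Deck 1", "children": [
            {"type": "Room", "name": "Room 101"},
            {"type": "Room", "name": "Room 102"},
        ]},
        {"type": "Deck", "name": "Deck 2", "children": [
            {"type": "Room", "name": "Room 201"},
        ]},
    ],
}


def to_xml(node: dict) -> ET.Element:
    """Map each scene node to an element; nesting mirrors the tree."""
    elem = ET.Element(node["type"], name=node["name"])
    for child in node.get("children", []):
        elem.append(to_xml(child))
    return elem


# Round trip: serialize to text, then parse it back.
xml_text = ET.tostring(to_xml(SCENE), encoding="unicode")
restored = ET.fromstring(xml_text)

# Path query: all rooms on Deck 1, with no join tables needed.
rooms_on_deck_1 = [
    room.get("name")
    for room in restored.findall("./Deck[@name='Deck 1']/Room")
]
```

The hierarchy survives the round trip unchanged, whereas a relational schema would need parent-key columns and recursive queries to express the same containment.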
![Page 34: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/34.jpg)
Switchable!
Discussions
![Page 35: Speech Interface to Virtual Reality Applications](https://reader034.fdocuments.in/reader034/viewer/2022051001/56814eee550346895dbc7bf1/html5/thumbnails/35.jpg)
Q & A