Speaking to Computers
![Page 1: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/1.jpg)
Speaking to Computers
Alex Acero
Manager, Speech Research Group
Microsoft Research
[email protected]
Feb 14th 2003
![Page 2: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/2.jpg)
Talk Outline
Role of speech technology in devices
• Telephony
• Smartphones and PDAs
• Multimodality in User Interface
![Page 3: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/3.jpg)
The Promise of Speech Technology
![Page 4: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/4.jpg)
Role of Speech in Different Devices
[Figure: devices plotted by ease of text input (keyboard/pen) and ease of GUI (screen/pointer), each axis running from Low to High. Devices shown: Phone, Screen Phone, PDA, Tablet PC, PC, Car, Internet TV.]
![Page 5: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/5.jpg)
A Roadmap for Speech
[Figure: the same devices (Phone, Screen Phone, PDA, Tablet PC, PC, Car, Internet TV) plotted by ease of text input (keyboard/pen) and ease of GUI (screen/pointer), each axis running from Low to High. Regions marked: Speech-Only Telephony, Dictation, Multimodal Command/Control.]
![Page 6: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/6.jpg)
Speech Technology
Meeting / Voicemail Transcription
Market Opportunity
Mobile Devices / Cars
Telephony / Call Center
Accessibility
Desktop Dictation
Desktop Command & Control
Technology Readiness
Customer Need
Poor Alternative
![Page 7: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/7.jpg)
The Business Value of Speech for Call Centers
Customer Focus
Less Time/Call
Efficient Agents
Less Time in Queue
Increased System Usage
Customer Retention
$5/call to $.20/call
Reduced Call Time
Fewer Agents
New Revenue Opportunities
Up-Sell/Cross-Sell
![Page 8: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/8.jpg)
Call Center Examples
Amtrak
• 61% Increase in Satisfaction
• 75% Increase in Automation Rate
• 90% Increase in Ticket Sales
Thrifty Car Rental
• 40% increase in CSR productivity
• $1 million first year savings
Merrill Lynch
• Automation rates from 82% to 90%
• First Year Savings $6.3M
![Page 9: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/9.jpg)
The Business Value of Speech for Operators
[Chart: Data Revenue and Voice Revenue, in US$M, by year from 2000 to 2007. The revenue axis runs from 0 to 35,000.]
The mobile operators need to make money from value-added services!
![Page 10: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/10.jpg)
If you still doubt speech is good for the call center…
![Page 11: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/11.jpg)
Why Speech at Microsoft?
Natural UI, or the combination of speech recognition, natural language understanding, automatic learning... Those are the key technologies that will have the most impact over the next 15 years.
Bill Gates, Microsoft Chairman
![Page 12: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/12.jpg)
Microsoft Speech Server & SDK
Visual Studio + ASP.NET + SALT
Multiple Devices
Call center + multimodal solution
Unifies web & call center
Reduces TCO
![Page 13: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/13.jpg)
Speech in Mobile Devices
Microsoft Smartphone & PocketPC Phones
• Rich Client
• 3% to 16% of WW mobile phone market
Smartphones
• Thin Client
• 11% to 25% of WW mobile phone market
Cellular Phones
• No Client
• 86% to 59% of WW mobile phone market
SOURCE: Gartner, IDC, Microsoft
2004 2007
![Page 14: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/14.jpg)
Thin Client Devices Over Voice Channel
[Diagram components:]
• Web Server
• MS Speech Server
• PSTN
• SMS Messages
• Voice Only Apps
![Page 15: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/15.jpg)
Rich Client Devices Over Data Channel
[Diagram components:]
• Web Server
• MS Speech Server
• Grammars
• Prompts
• ASP.NET Dialogs
• Speech Engine Services
• Telephony App Services
• SMS Push for Browser Launch
![Page 16: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/16.jpg)
Microsoft Voice Command
• Pocket PC voice-enabled applications: Voice Dialer, Contacts, Calendar, Media Player
• No connectivity necessary (100% embedded)
• No training needed (speaker-independent)
• Continuous speech recognition
“Call John at home”
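As a rough illustration of how a command such as “Call John at home” might be mapped to an action once it has been recognized, here is a minimal Python sketch. It is not how Microsoft Voice Command is implemented: it works on already-recognized text rather than audio, and the contact list, phone numbers, the `CALL_PATTERN` grammar, and the `parse_call_command` helper are all hypothetical names introduced for this example.

```python
import re
from typing import Optional

# Hypothetical contact list; names and numbers are made up for illustration.
CONTACTS = {
    "john": {"home": "555-0100", "mobile": "555-0101"},
    "mary": {"work": "555-0102"},
}

# Toy command grammar (an assumption, not the product's grammar):
#   call <name> [at|on <location>]
CALL_PATTERN = re.compile(
    r"^call\s+(?P<name>[a-z]+)(?:\s+(?:at|on)\s+(?P<location>[a-z]+))?$"
)


def parse_call_command(utterance: str) -> Optional[dict]:
    """Map recognized text to a dial action, or None if it does not match."""
    match = CALL_PATTERN.match(utterance.strip().lower())
    if match is None:
        return None  # Not a "call" command.

    name = match.group("name")
    numbers = CONTACTS.get(name)
    if numbers is None:
        return None  # Unknown contact.

    location = match.group("location")
    if location is None:
        # No location spoken: fall back to the first number on file.
        location = next(iter(numbers))
    if location not in numbers:
        return None  # Contact has no number for that location.

    return {
        "action": "dial",
        "contact": name,
        "location": location,
        "number": numbers[location],
    }


if __name__ == "__main__":
    print(parse_call_command("Call John at home"))
```

Constraining the input to a small grammar like this is one common way to keep command recognition tractable on a device with no server connection; utterances outside the grammar simply return `None` here.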
![Page 17: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/17.jpg)
Multimodal Interactive Pad (MIPAD)
![Page 18: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/18.jpg)
Multimodal Map
![Page 19: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/19.jpg)
Current Speech User Interfaces
• Need improved speech user interfaces
  • Even no errors and fast processing are not sufficient
  • But errors occur: better error correction needed
• Social issues:
  • Microphones can’t tether user
  • Users more comfortable talking to phones, cars
  • Talking to computers not likely in meetings or cubicles
![Page 20: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/20.jpg)
The Future of Natural User Interfaces
![Page 21: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/21.jpg)
End User Needs
Technology, Research
Software Scenarios
Bridging The Gap
![Page 22: Speaking to Computers](https://reader031.fdocuments.in/reader031/viewer/2022013101/568137e3550346895d9f901f/html5/thumbnails/22.jpg)
Thank You!
http://research.microsoft.com/srg