lti
Shaping Spoken Input in User-Initiative Systems
Stefanie Tomko and Roni Rosenfeld
Language Technologies Institute, School of Computer Science, Carnegie Mellon University
Presented by: Thomas Kevin Harris
outline
• introduction
  – related work
• method
• results
  – user perceptions
• discussion
introduction
• (all other things being equal) spoken dialog systems perform best when users speak within the grammar that the system understands
  – make the grammar more accepting?
  – get users to speak within the smaller grammar?
• ok, how do we do this?
  – this study is a preliminary step
related work
• shaping is pretty ubiquitous
  – users adapt to features of system prompts
    • formality (Ringle & Halstead-Nussloch 1989)
    • length (Zoltan-Ford 1991)
    • vocabulary (Brennan 1996; Gustafson et al. 1997)
  – users also simplify input under higher WER conditions (Shriberg et al. 1992)
foundation: speech graffiti
• a structured, subset language interaction protocol for interacting with simple machines
• user input
  – slot+value pairs: theater is the Galleria Six
  – what-questions: what are the movies?
• system output
  – terse restatement of value: Galleria six
• user initiative
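The two user-input forms can be told apart mechanically. The following is a minimal Python sketch, not the Speech Graffiti grammar itself (which these slides do not specify): the slot list `SLOTS`, the function name `parse_utterance`, and the two regular expressions are assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Hypothetical slot names, taken from the example utterances in these slides.
SLOTS = ("theater", "movie", "airline")

def parse_utterance(text: str) -> Optional[Tuple[str, str, Optional[str]]]:
    """Classify an utterance as a what-question or a slot+value pair.

    Returns ("query", topic, None), ("pair", slot, value), or None when the
    utterance fits neither form (an assumption: such input would be rejected).
    """
    t = text.strip().rstrip("?.")
    # what-questions: "what are the movies?"
    m = re.fullmatch(r"what (?:is|are) (?:the )?(.+)", t, flags=re.IGNORECASE)
    if m:
        return ("query", m.group(1), None)
    # slot+value pairs, with or without "is": "theater is the Galleria Six"
    for slot in SLOTS:
        m = re.fullmatch(rf"{slot}(?: is)? (.+)", t, flags=re.IGNORECASE)
        if m:
            return ("pair", slot, m.group(1))
    return None
```

Anything that fits neither pattern, such as "could you tell me…", returns `None` in this sketch.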
previous speech graffiti studies
• shown to be an effective interaction style
  – compared to a natural language interface
    • higher user satisfaction
    • lower task completion times
    • similar task completion rates
• but… users can have difficulty learning & speaking the subset language
• when users tried NL with the speech graffiti system, their utterances were simpler than NL addressed to an NL system
initial questions about shaping
• how can different instructions influence user input?
• how will users shape input in response to
  – rejection of conversational, NL speech
  – speech graffiti-style, terse, value-only confirmations
• how will this work in a user-initiative environment?
approach: wizard-of-oz study
wizard-of-oz study
• 18 participants, mostly CMU students
  – most with non-technical backgrounds
  – most had used ASR systems before but not regularly
• interact with a telephone information system providing movie schedules & airline flight data
• 10 tasks, e.g.
– a friend told you that Miracle was pretty good. where is this movie playing?
– a friend has told you that she's flying to San Francisco on United flight 500. when will she get there?
instructions
• 3 conditions: short - medium - long
welcome: Welcome to the InfoLine.
instruction-short: The system you are talking to only understands very simple English, so please speak to it as simply as you can.
instruction-medium: The system you are talking to only understands very simple English, so please speak to it as simply as you can. It will understand you best if you tell it only one idea at a time.
instruction-long: The system you are talking to only understands very simple English, so please speak to it as simply as you can. It will understand you best if you tell it only one idea at a time. This system understands only keywords, and not the structure of sentences.
example: For instance, you might say "movie The Lord of the Rings," or "airline is United," or "what are show times?"
prompt: You can now start speaking whenever you're ready.
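The three instruction conditions are cumulative: each longer condition appends one sentence to the previous one. A minimal Python sketch of how the introduction could be assembled per condition follows. The strings are taken from this slide; the function name `build_intro` and the assumption that the parts are played in slide order (welcome, instruction, example, prompt) are illustrative, not confirmed by the source.

```python
WELCOME = "Welcome to the InfoLine."
# Instruction sentences, in the order they are added across conditions.
INSTRUCTION_PARTS = [
    "The system you are talking to only understands very simple English, "
    "so please speak to it as simply as you can.",
    "It will understand you best if you tell it only one idea at a time.",
    "This system understands only keywords, and not the structure of sentences.",
]
EXAMPLE = ('For instance, you might say "movie The Lord of the Rings," '
           'or "airline is United," or "what are show times?"')
PROMPT = "You can now start speaking whenever you're ready."

# Number of instruction sentences used in each condition.
CONDITIONS = {"short": 1, "medium": 2, "long": 3}

def build_intro(condition: str) -> str:
    """Assemble the full introduction for one instruction condition."""
    n = CONDITIONS[condition]
    # Assumed playback order: welcome, instruction, example, prompt.
    return " ".join([WELCOME, *INSTRUCTION_PARTS[:n], EXAMPLE, PROMPT])
```

Only the instruction portion differs between conditions; the welcome, example, and prompt strings are shared.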
wizard rules
• reject:
  – non-task conversational words
    • could you tell me…
  – task-based non-content items
    • what movies are showing in West Mifflin?
  – task-based OOV words
    • films; earliest flight
• rejection messages
  – excuse me?
  – I'm sorry, I didn't understand that.
  – [replay instruction & example strings from introduction]
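In the study a human wizard applied these rules, but the policy can be approximated in code. The Python sketch below makes several assumptions: the phrase lists contain only fragments lifted from the examples on this slide and are far from complete, the names `rejection_reason` and `rejection_message` are hypothetical, and stepping through the three rejection messages in listed order on consecutive rejections is an assumed policy.

```python
import re
from typing import Optional

# Hypothetical, incomplete phrase lists built from the slide's examples.
REJECT_PATTERNS = {
    "conversational": ["could you tell me"],
    "non_content": ["are showing in"],
    "oov": ["films", "earliest"],
}

INTRO_REPLAY = "[replay instruction & example strings from introduction]"
REJECTION_MESSAGES = [
    "excuse me?",
    "I'm sorry, I didn't understand that.",
    INTRO_REPLAY,
]

def rejection_reason(utterance: str) -> Optional[str]:
    """Return the first matching rejection category, or None to accept."""
    text = utterance.lower()
    for reason, phrases in REJECT_PATTERNS.items():
        for phrase in phrases:
            # Whole-word match so that e.g. "films" does not match inside another word.
            if re.search(rf"\b{re.escape(phrase)}\b", text):
                return reason
    return None

def rejection_message(consecutive_rejections: int) -> str:
    """Pick a message for the n-th consecutive rejection (assumed escalation)."""
    if consecutive_rejections < 1:
        raise ValueError("consecutive_rejections must be at least 1")
    index = min(consecutive_rejections, len(REJECTION_MESSAGES)) - 1
    return REJECTION_MESSAGES[index]
```

Because the lists are so small, this sketch would accept many utterances that the wizard rejected; it illustrates the shape of the policy rather than reproducing the wizard's decisions.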
sample user interaction
TASK A:
user: where is Miracle movie playing?
wizard: excuse me?
u: Miracle movie
w: Miracle
u: where is it playing?
w: excuse me?
u: which theaters?
w: 3 theaters: Destinta Theatres Plaza 22, Loews Waterfront Theatre, Northway Mall Cinemas 8
TASK B:
u: United flight 500 to San Francisco, what time does it arrive?
w: United, to San Francisco, flight 500
u: Arrival time
w: arriving at 12:33 pm
results
• overall completion rate: 96.1%
• users generally used value-only input
• no effect of instruction length on number of utterances per session
• longer instruction length → shorter user utterances
  – due to extra content?
instruction condition    mean # words per utterance
short                    4.49
medium                   3.36
long                     2.98
rejected input
• on average, about 22% of users' utterances were rejected
  – no effect of instruction condition
  – users only repeated input verbatim in 7 cases
sequential rejection instance    # of occurrences    rejected input shaped after this point
1st                              123                 50%
2nd                              52                  75%
3rd                              15                  84%
4th                              3                   85%
rejection messages labeled in the figure: excuse me? / I'm sorry… / the system you are talking to…
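One quantity that can be derived from the occurrence counts alone is how often an n-th consecutive rejection was followed by yet another rejection. The Python sketch below computes that ratio; it assumes each count is the number of times that rejection depth was reached, and it is not the same measure as the "shaped" percentages reported on the slide. The function name `continuation_rates` is hypothetical.

```python
from typing import Dict

# Occurrence counts of the 1st through 4th sequential rejection, from the slide.
REJECTION_COUNTS: Dict[int, int] = {1: 123, 2: 52, 3: 15, 4: 3}

def continuation_rates(counts: Dict[int, int]) -> Dict[int, float]:
    """Share of n-th rejections that were followed by an (n+1)-th rejection."""
    return {
        n: counts[n + 1] / counts[n]
        for n in sorted(counts)
        if n + 1 in counts
    }

if __name__ == "__main__":
    for depth, rate in continuation_rates(REJECTION_COUNTS).items():
        print(f"after rejection {depth}: {rate:.1%} rejected again")
```

Under that reading, the share of rejections followed by another rejection falls at each step, which is in line with the slide's observation that more input was shaped after each successive rejection.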
user perceptions
• participants clearly aware of limited style
• participants mentioned
  – simplification
  – minimization
  – keywords (key words)
• these ideas parallel the instruction conditions
  – speak simply
  – one idea at a time
  – use keywords
• but comments did not match condition
discussion (1)
• how can different instructions influence user input?
  – more explanation of simplification → shorter utterances
• how will users shape input in response to
  – rejection of conversational, NL speech
    • 50% of rejected utts shaped after one rejection
    • 75% shaped after two rejections
  – speech graffiti-style, terse, value-only confirmations
    • most shaped user input mimicked this style
19
lti
discussion (2)
• how will this work in a user-initiative environment?
  – 96% task completion rate without explicit system prompts
• future work
  – shape input more precisely
    • shape with slot+value confirmation, to avoid ambiguity?
    • shape to specific acoustically distinct vocabulary?
  – can input be shaped even if some NL is handled?