Lti Shaping Spoken Input in User-Initiative Systems Stefanie Tomko and Roni Rosenfeld Language...

lti

Shaping Spoken Input in User-Initiative Systems

Stefanie Tomko and Roni Rosenfeld

Language Technologies Institute, School of Computer Science, Carnegie Mellon University

Presented by: Thomas Kevin Harris


outline

• introduction
  – related work
• method
• results
  – user perceptions
• discussion


introduction

• (all other things being equal) spoken dialog systems perform best when users speak within the grammar that the system understands
  – make grammar more accepting?
  – get users to speak within the smaller grammar?
• ok, how do we do this?
  – this study is a preliminary step


related work

• shaping is pretty ubiquitous
  – users adapt to features of system prompts
    • formality (Ringle & Halstead-Nussloch 1989)
    • length (Zoltan-Ford 1991)
    • vocabulary (Brennan 1996; Gustafson et al. 1997)
  – users also simplify input under higher WER conditions (Shriberg et al. 1992)


foundation: speech graffiti

• a structured, subset language interaction protocol for interacting with simple machines

• user input
  – slot+value pairs: theater is the Galleria Six
  – what-questions: what are the movies?
• system output
  – terse restatement of value: Galleria Six

• user initiative
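The input and output forms above can be illustrated with a minimal sketch. It is not the actual Speech Graffiti grammar: the slot names in `KNOWN_SLOTS`, the regular expressions, and the functions `interpret` and `confirm` are all assumptions made for illustration.

```python
import re

# Hypothetical slot names; the slide does not list the real vocabulary.
KNOWN_SLOTS = {"theater", "movie", "airline"}


def interpret(utterance):
    """Classify an utterance as a slot+value pair, a what-question, or neither."""
    text = utterance.strip().rstrip("?.!").strip()

    # what-question, e.g. "what are the movies?"
    question = re.fullmatch(r"what (?:is|are) the (\w+)", text, flags=re.IGNORECASE)
    if question:
        slot = question.group(1).lower()
        if slot not in KNOWN_SLOTS and slot.endswith("s"):
            slot = slot[:-1]  # crude plural handling: "movies" -> "movie"
        return ("query", slot) if slot in KNOWN_SLOTS else ("reject",)

    # slot+value pair, e.g. "theater is the Galleria Six"
    pair = re.fullmatch(r"(\w+) is (?:the )?(.+)", text, flags=re.IGNORECASE)
    if pair and pair.group(1).lower() in KNOWN_SLOTS:
        return ("set", pair.group(1).lower(), pair.group(2))

    return ("reject",)


def confirm(result):
    """Terse, value-only restatement for an accepted slot+value pair."""
    return result[2] if result[0] == "set" else None


print(confirm(interpret("theater is the Galleria Six")))
print(interpret("what are the movies?"))
```

In this sketch the confirmation echoes only the value, which mirrors the terse restatement style described on the slide.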


previous speech graffiti studies

• shown to be an effective interaction style
  – compared to a natural language interface
    • higher user satisfaction
    • lower task completion times
    • similar task completion rates

• but… users can have difficulty learning & speaking subset language

• when users tried NL with speech graffiti system, their utterances were simpler than NL to an NL system


initial questions about shaping

• how can different instructions influence user input?
• how will users shape input in response to
  – rejection of conversational, NL speech
  – speech graffiti-style, terse, value-only confirmations

• how will this work in a user-initiative environment?

wizard-of-oz study


wizard-of-oz study

• 18 participants, mostly CMU students
  – most with non-technical backgrounds
  – most had used ASR systems before but not regularly

• interact with a telephone information system providing movie schedules & airline flight data

• 10 tasks, e.g.

– a friend told you that Miracle was pretty good. where is this movie playing?

– a friend has told you that she's flying to San Francisco on United flight 500. when will she get there?


instructions

• 3 conditions: short - medium - long

welcome: Welcome to the InfoLine.

instruction-short: The system you are talking to only understands very simple English, so please speak to it as simply as you can.

instruction-medium: The system you are talking to only understands very simple English, so please speak to it as simply as you can. It will understand you best if you tell it only one idea at a time.

instruction-long: The system you are talking to only understands very simple English, so please speak to it as simply as you can. It will understand you best if you tell it only one idea at a time. This system understands only keywords, and not the structure of sentences.

example: For instance, you might say "movie The Lord of the Rings," or "airline is United," or "what are show times?"

prompt: You can now start speaking whenever you're ready.


wizard rules

• reject:
  – non-task conversational words
    • could you tell me…
  – task-based non-content items
    • what movies are showing in West Mifflin?
  – task-based OOV words
    • films; earliest flight
• rejection messages
  – excuse me?
  – I'm sorry, I didn't understand that.
  – [replay instruction & example strings from introduction]
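As a rough illustration of the wizard rules above, the sketch below rejects any utterance containing a word outside a small whitelist and picks a rejection message by how many rejections have occurred in a row. Both choices are assumptions: the study used a human wizard applying the rules by judgment, the `VOCABULARY` set is hypothetical, and the escalation order of the messages is inferred rather than stated.

```python
import re

# Hypothetical in-grammar vocabulary, just large enough for the examples below.
VOCABULARY = {"miracle", "movie", "which", "theaters", "airline", "united", "flight", "500"}

# Messages from the slide; the third stands in for the replayed introduction.
REJECTION_MESSAGES = [
    "Excuse me?",
    "I'm sorry, I didn't understand that.",
    "[replay instruction & example strings from introduction]",
]


def is_rejected(utterance):
    """Reject if any word is outside the vocabulary (an assumed stand-in for the wizard)."""
    words = re.findall(r"[a-z0-9']+", utterance.lower())
    return not words or any(word not in VOCABULARY for word in words)


def rejection_message(consecutive_rejections):
    """Pick a message, assuming escalation with each consecutive rejection (1-based)."""
    if consecutive_rejections < 1:
        raise ValueError("consecutive_rejections must be at least 1")
    index = min(consecutive_rejections, len(REJECTION_MESSAGES)) - 1
    return REJECTION_MESSAGES[index]


print(is_rejected("where is Miracle movie playing?"))
print(is_rejected("Miracle movie"))
print(rejection_message(2))
```

A strict whitelist is harsher than the wizard appears to have been, so this should be read only as a way to make the rule categories concrete.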


sample user interaction

TASK A:
user: where is Miracle movie playing?
wizard: excuse me?
u: Miracle movie
w: Miracle
u: where is it playing?
w: excuse me?
u: which theaters?
w: 3 theaters: Destinta Theatres Plaza 22, Loews Waterfront Theatre, Northway Mall Cinemas 8

TASK B:
u: United flight 500 to San Francisco, what time does it arrive?
w: United, to San Francisco, flight 500
u: Arrival time
w: arriving at 12:33 pm


results

• overall completion rate: 96.1%
• users generally used value-only
• no effect of instruction length on number of utterances per session
• longer instruction length → shorter user utterances
  – due to extra content?

instruction condition    mean # words per utterance
short                    4.49
medium                   3.36
long                     2.98
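To put the mean utterance lengths above on a common scale, the short sketch below computes the relative reduction against the short-instruction condition. The numbers come from the table; the function name `relative_reduction` is an illustrative choice, and nothing here says anything about statistical significance.

```python
# Mean words per utterance by instruction condition, as reported in the table.
MEAN_WORDS = {"short": 4.49, "medium": 3.36, "long": 2.98}


def relative_reduction(baseline, value):
    """Fraction by which `value` falls below `baseline`."""
    return (baseline - value) / baseline


for condition in ("medium", "long"):
    drop = relative_reduction(MEAN_WORDS["short"], MEAN_WORDS[condition])
    print(f"{condition}: {drop:.1%} fewer words per utterance than short")
```

By this calculation, utterances were roughly a quarter shorter in the medium condition and roughly a third shorter in the long condition than in the short condition.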


rejected input

• on average, about 22% of users' utterances were rejected
  – no effect of instruction condition
  – users only repeated input verbatim in 7 cases

sequential rejection instance    # of occurrences    rejected input shaped after this point
1st                              123                 50%
2nd                              52                  75%
3rd                              15                  84%
4th                              3                   85%

rejection messages: excuse me? / I'm sorry… / the system you are talking to…
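The rejection figures above can be unpacked with a little arithmetic. This sketch assumes the "shaped after this point" percentages are cumulative, a reading supported by the discussion slide's "50% after one rejection, 75% after two"; the variable names are illustrative.

```python
# Figures from the table: 1st through 4th sequential rejection.
OCCURRENCES = [123, 52, 15, 3]
CUMULATIVE_SHAPED = [50, 75, 84, 85]  # percent, assumed cumulative

# Total number of rejections across all sequential positions.
total_rejections = sum(OCCURRENCES)

# Percentage points of shaping gained at each successive rejection.
gains = [CUMULATIVE_SHAPED[0]] + [
    later - earlier
    for earlier, later in zip(CUMULATIVE_SHAPED, CUMULATIVE_SHAPED[1:])
]

print(total_rejections)
print(gains)
```

Under that reading, most of the shaping happens after the first two rejections, with only small gains after the third and fourth.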


user perceptions

• participants clearly aware of limited style
• participants mentioned
  – simplification
  – minimization
  – keywords (key words)
• these ideas parallel the instruction conditions
  – speak simply
  – one idea at a time
  – use keywords

• but comments did not match condition


discussion (1)

• how can different instructions influence user input?
  – more explanation of simplification → shorter utterances
• how will users shape input in response to
  – rejection of conversational, NL speech
    • 50% of rejected utts shaped after one rejection
    • 75% shaped after two rejections
  – speech graffiti-style, terse, value-only confirmations
    • most shaped user input mimicked this style


discussion (2)

• how will this work in a user-initiative environment?
  – 96% task completion rate without explicit system prompts
• future work
  – shape input more precisely
    • shape with slot+value confirmation, to avoid ambiguity?
    • shape to specific acoustically distinct vocabulary?
  – can input be shaped even if some NL is handled?
