Non-Native Users in the Let’s Go!! Spoken Dialogue System: Dealing with Linguistic Mismatch Antoine Raux & Maxine Eskenazi Language Technologies Institute Carnegie Mellon University

Non-Native Users in the Let’s Go!! Spoken Dialogue System:

Dealing with Linguistic Mismatch

Antoine Raux & Maxine Eskenazi

Language Technologies Institute

Carnegie Mellon University

Background

Speech-enabled systems use models of the user’s language

Such models are tailored for native speech

Great loss of performance for non-native users who don’t follow typical native patterns

Previous Work on Non-Native Speech Recognition

Assumes knowledge about/data from a specific non-native population

Often based on read speech

Focuses on acoustic mismatch:

• Acoustic adaptation

• Multilingual acoustic models

Linguistic Particularities of Non-Native Speakers

Non-native speakers might use different lexical and syntactic constructs

Non-native speakers are in a dynamic process of L2 acquisition

Outline of the Talk

Baseline system and data collection

Study of non-native/native mismatch and effect of additional non-native data

Adaptive lexical entrainment

The CMU Let’s Go!! System: Bus Schedule Information for the Pittsburgh Area

ASR: Sphinx II

Parsing: Phoenix

Dialogue Management: RavenClaw

Speech Synthesis: Festival

Hub: Galaxy

NLG: Rosetta

Data Collection

Baseline system accessible since February 2003

Experiments with scenarios

Publicized the phone number inside CMU in Fall 2003

Data Collection Web Page

Data

Directed experiments: 134 calls

• 17 non-native speakers (5 from India, 7 from Japan, 5 others)

Spontaneous: 30 calls

Total: 1768 utterances

Evaluation Data:

• Non-Native: 449 utterances

• Native: 452 utterances

Speech Recognition Baseline

Acoustic Models:

• semi-continuous HMMs (codebook size: 256)

• 4000 tied states

• trained on CMU Communicator data

Language Model:

• class-based backoff 3-gram

• trained on 3074 utterances from native calls
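
To make the "class-based" part of the language model concrete, here is a minimal sketch of how utterances might be mapped to class tokens before counting trigrams. The class names and word lists are hypothetical (the slides do not list the system's actual classes), and backoff smoothing is deliberately left out.

```python
from collections import Counter

# Hypothetical word classes, for illustration only.
WORD_CLASSES = {
    "forbes": "[STREET]",
    "murray": "[STREET]",
    "craig": "[STREET]",
    "61c": "[ROUTE]",
}

def to_class_tokens(utterance):
    """Lowercase, split on whitespace, and map known words to their class token."""
    return [WORD_CLASSES.get(word, word) for word in utterance.lower().split()]

def trigram_counts(utterances):
    """Count trigrams over class-mapped tokens, padding each utterance with boundary markers."""
    counts = Counter()
    for utterance in utterances:
        tokens = ["<s>", "<s>"] + to_class_tokens(utterance) + ["</s>"]
        for i in range(len(tokens) - 2):
            counts[tuple(tokens[i:i + 3])] += 1
    return counts

counts = trigram_counts(["i am at forbes", "i am at murray"])
print(counts[("am", "at", "[STREET]")])  # prints 2
```

Because both street names collapse to the same class token, the two utterances contribute to the same trigram, which is the usual motivation for class-based models when training data is small.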

Speech Recognition Results

Word Error Rate:

• Native: 20.4%

• Non-Native: 52.0%

Causes of discrepancy:

• Acoustic mismatch (accent)

• Linguistic mismatch (word choice, syntax)
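
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below assumes whitespace tokenization and unit costs for substitutions, insertions and deletions; the function name is illustrative and this is not the scoring tool used for the figures on this slide.

```python
def word_error_rate(reference, hypothesis):
    """Return the WER (in percent) of one hypothesis against one reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dist[i - 1][j] + 1    # reference word missing from hypothesis
            insertion = dist[i][j - 1] + 1   # extra word in hypothesis
            dist[i][j] = min(substitution, deletion, insertion)
    return 100.0 * dist[len(ref)][len(hyp)] / len(ref)
```

Corpus-level figures are normally computed by summing errors and reference words over all utterances rather than averaging per-utterance rates.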

Language Model Performance

[Three bar charts comparing the Native and Non-Native sets: Perplexity; OOV Rate (% tokens); Rate of utterances with OOV (% utterances)]

Evaluation on transcripts. Initial model: 3074 native utterances
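
The two OOV measures named on this slide can be sketched as follows, assuming the vocabulary is simply every word seen in the training transcripts and that tokens are lowercased and split on whitespace. The function name and the toy utterances are illustrative only.

```python
def oov_stats(train_utterances, test_utterances):
    """Return (% of test tokens that are OOV, % of test utterances containing an OOV)."""
    vocabulary = {word for utt in train_utterances for word in utt.lower().split()}
    total_tokens = 0
    oov_tokens = 0
    utterances_with_oov = 0
    for utt in test_utterances:
        words = utt.lower().split()
        missing = [word for word in words if word not in vocabulary]
        total_tokens += len(words)
        oov_tokens += len(missing)
        utterances_with_oov += bool(missing)
    return (100.0 * oov_tokens / total_tokens,
            100.0 * utterances_with_oov / len(test_utterances))

train = ["when is the next bus", "i want to arrive at the airport"]
test = ["when is the next bus", "i want to reach the airport"]
token_rate, utterance_rate = oov_stats(train, test)
```

In this toy case only "reach" is out of vocabulary, so one of eleven tokens and one of two utterances are affected. The utterance-level rate is often the more telling number, since a single OOV word can derail recognition of the whole utterance.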

Language Model Performance

Adding non-native data: 3074 native + 1308 non-native utterances

[Three bar charts comparing the initial (native) model and the mixed model on the Native and Non-Native sets: Perplexity; OOV Rate (% tokens); Rate of utterances with OOV (% utterances)]
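
As a rough illustration of how perplexity responds to added training data, the sketch below uses an add-one smoothed bigram model. This is a deliberately simpler model than the class-based backoff 3-gram described in the talk, and the function names and the `<unk>` handling are assumptions made for the example.

```python
import math
from collections import Counter

def train_bigram(utterances):
    """Collect context counts, bigram counts and the vocabulary of predictable words."""
    contexts, bigrams = Counter(), Counter()
    for utt in utterances:
        tokens = ["<s>"] + utt.lower().split() + ["</s>"]
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    vocabulary = {word for (_, word) in bigrams} | {"<unk>"}
    return contexts, bigrams, vocabulary

def perplexity(model, utterances):
    """Perplexity of the utterances under the add-one smoothed bigram model."""
    contexts, bigrams, vocabulary = model
    log_prob, n_predictions = 0.0, 0
    for utt in utterances:
        words = [w if w in vocabulary else "<unk>" for w in utt.lower().split()]
        tokens = ["<s>"] + words + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            p = (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocabulary))
            log_prob += math.log(p)
            n_predictions += 1
    return math.exp(-log_prob / n_predictions)

native_model = train_bigram(["when is the next bus", "i want to arrive at the airport"])
mixed_model = train_bigram(["when is the next bus", "i want to arrive at the airport",
                            "what is the next bus", "i want to reach the airport"])
```

Comparing `perplexity(native_model, ...)` with `perplexity(mixed_model, ...)` on held-out non-native transcripts mirrors the native versus mixed comparison on this slide. Since unknown words are mapped to `<unk>`, perplexities from models with different vocabularies are only loosely comparable.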

Natural Language Understanding

Grammar manually written incrementally, as the system was being developed

Initially built with native speakers in mind

Phoenix: robust parser (less sensitive to non-standard expressions)

Grammar Coverage

[Two bar charts comparing the Native and Non-Native sets: Parse Word Coverage (% words not covered by parse); Parse Utterance Coverage (% utterances not fully parsed)]

Initial grammar:

• Manually written for native utterances

Grammar Coverage

[Two bar charts comparing the Native and Non-Native sets: Parse Word Coverage (% words not covered by parse); Parse Utterance Coverage (% utterances not fully parsed)]

Grammar designed to accept some non-native patterns:

• “reach” = “arrive”

• “What is the next bus?” = “When is the next bus?”
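
The two coverage measures can be illustrated with a toy phrase matcher that, loosely like a robust parser, skips words it cannot match. The phrase sets below are hypothetical stand-ins and not the actual Phoenix grammar; the greedy longest-match strategy is also an assumption of this sketch.

```python
# Hypothetical phrase inventories, for illustration only.
NATIVE_PHRASES = {
    ("when", "is", "the", "next", "bus"),
    ("i", "want", "to", "arrive", "at"),
    ("the", "airport"),
}
EXTENDED_PHRASES = NATIVE_PHRASES | {
    ("what", "is", "the", "next", "bus"),   # accepted as "when is the next bus"
    ("i", "want", "to", "reach"),           # "reach" accepted as "arrive"
}

def uncovered_words(tokens, phrases):
    """Greedy left-to-right longest match; return the tokens no phrase covers."""
    uncovered, i = [], 0
    while i < len(tokens):
        match_len = max((len(p) for p in phrases
                         if tuple(tokens[i:i + len(p)]) == p), default=0)
        if match_len:
            i += match_len
        else:
            uncovered.append(tokens[i])
            i += 1
    return uncovered

def coverage(utterances, phrases):
    """Return (% words not covered by parse, % utterances not fully parsed)."""
    total_words = missed_words = not_fully_parsed = 0
    for utt in utterances:
        tokens = utt.lower().split()
        missed = uncovered_words(tokens, phrases)
        total_words += len(tokens)
        missed_words += len(missed)
        not_fully_parsed += bool(missed)
    return (100.0 * missed_words / total_words,
            100.0 * not_fully_parsed / len(utterances))
```

Calling `coverage(["what is the next bus", "i want to reach the airport"], NATIVE_PHRASES)` and then the same with `EXTENDED_PHRASES` shows how adding a few non-native variants moves both measures.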

Relative Improvement due to Additional Data

[Bar chart: relative improvement (%) on the Native Set and the Non-Native Set for five metrics: % OOV, % utterances with OOV, Perplexity, Word Coverage, Utterance Coverage]
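
All five metrics in this chart are lower-is-better quantities, so relative improvement is presumably the reduction expressed as a percentage of the initial value. A minimal sketch under that assumption follows; the numbers in the usage line are made up and are not results from the talk.

```python
def relative_improvement(before, after):
    """Percent reduction of a lower-is-better metric, relative to its initial value."""
    if before == 0:
        raise ValueError("initial value must be non-zero")
    return 100.0 * (before - after) / before

# Hypothetical example: a metric dropping from 4.0 to 3.0 is a 25% relative improvement.
print(relative_improvement(4.0, 3.0))  # prints 25.0
```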

Effect of Additional Data on Speech Recognition

[Bar chart: word error rate (%) on the Native Set and the Non-Native Set, for the Native Model and the Mixed Model]

Adaptive Lexical Entrainment

“If you can’t adapt the system, adapt the user”

System should use the same expressions it expects from the user

But non-native speakers might not master all target expressions

Use expressions that are close to the non-native speaker’s language

Use prosody to stress incorrect words

Adaptive Lexical Entrainment: Example

I want to go the airport

Did you mean: I want to go TO the airport?

Adaptive Lexical Entrainment: Algorithm

[Diagram: Target Prompts and ASR Hypothesis → DP-based Alignment → Prompt Selection → Emphasis → Confirmation Prompt]

ASR hypothesis: I want to go the airport

Target prompts:

• I’d like to go to the airport

• I want to go to the airport

Confirmation prompt: Did you mean: I want to go to the airport?
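
The pipeline on these slides (DP-based alignment, prompt selection, emphasis) can be sketched as below. This is an illustrative reading of the diagram, not the authors' implementation: unit edit costs, selection of the target with the smallest word-level edit distance, and upper-casing as a stand-in for prosodic emphasis are all assumptions, as are the function names.

```python
def align(hyp, target):
    """Word-level edit-distance alignment.

    Returns (distance, differs), where differs[j] is True when target word j
    has no matching word in the hypothesis.
    """
    n, m = len(hyp), len(target)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + (hyp[i - 1] != target[j - 1]),
                cost[i - 1][j] + 1,   # hypothesis word absent from the target
                cost[i][j - 1] + 1,   # target word absent from the hypothesis
            )
    # Backtrace from the bottom-right corner to find which target words differ.
    differs = [False] * m
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                cost[i][j] == cost[i - 1][j - 1] + (hyp[i - 1] != target[j - 1])):
            differs[j - 1] = hyp[i - 1] != target[j - 1]
            i, j = i - 1, j - 1
        elif j > 0 and cost[i][j] == cost[i][j - 1] + 1:
            differs[j - 1] = True
            j -= 1
        else:
            i -= 1
    return cost[n][m], differs

def confirmation_prompt(asr_hypothesis, target_prompts):
    """Pick the closest target prompt and emphasize the words the user did not say."""
    hyp = asr_hypothesis.lower().split()
    best = None
    for prompt in target_prompts:
        words = prompt.split()
        distance, differs = align(hyp, [w.lower() for w in words])
        if best is None or distance < best[0]:
            best = (distance, words, differs)
    distance, words, differs = best
    if distance == 0:
        return None  # perfect match: no entrainment prompt is needed
    emphasized = [w.upper() if d else w for w, d in zip(words, differs)]
    return "Did you mean: " + " ".join(emphasized) + "?"

prompts = ["I'd like to go to the airport", "I want to go to the airport"]
print(confirmation_prompt("I want to go the airport", prompts))
# prints "Did you mean: I want to go TO the airport?"
```

In a real system the emphasized words would be passed to the synthesizer with prosodic markup rather than upper-cased.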

Adaptive Lexical Entrainment: Study

24 dialogues, 12.4 turns/dialogue, 2.5 entrainment prompts/dialogue

Trigger entrainment whenever best target does not perfectly match ASR

• Too simple? 40.7% false positive

64.4% prompts are triggered by misrecognitions

Conclusion

LM and Grammar perform significantly worse on non-native speech

Including non-native data reduces the gap in linguistic metrics but not WER

Future work:

• Improve Speech Recognition

• More data for LM

• Acoustic/Lexical adaptation

• Integrate adaptive lexical entrainment in the dialog management framework

Example Dialogue

S: Welcome to the Let’s Go!! bus information system. What can I do for you?

U: Excuse me. I’m in the place which uh where the Forbes Avenue and Murray Avenue are crossing and I want to go to the place mm Forbes where the Forbes Avenue and Craig Street are crossing.

S: I heard “SIX”. Is that correct?

U: Which bus uh should I take? Uh sorry uh maybe 61C bus I have… What time the next 61C bus will come?

S: The time is currently 5:37 PM. Where are you leaving from?

Example Dialogue

S: Welcome to the Let’s Go!! bus information system. What can I do for you?

U: I want to go to downtown now I’m at Fifth and Bigelow.

S: Sorry, did you say “I want to go to downtown”?

U: Which number bus can I take and when will the next bus come here?

S: ... What can I do for you?