2013 Speech TEK - Alphanumeric Recognition Discussion

© 2002 – 2012 Versay Solutions, LLC. All rights reserved. Alphanumeric Speech Recognition. SpeechTek, August 19, 2013. Crispin Reedy.

This morning's discussion on Alphanumeric Reco was great. Here are the slides for anyone who is interested. Thanks to all for sharing their experiences!

The Problem With Alphanumerics

“The fault, dear Brutus, is not in our stars, but in ourselves”

-- Julius Caesar, Act I, scene ii

The Need

• Account Numbers

• Policy Numbers

• Spelling out names and addresses

• Special cases

– VIN, Canadian Postal Code

• And more…

Methods for Addressing

• Project Tactics

• Limit the grammar

– Constraint List

– N-Best + Back-End Data Validation

• Confirmation

• Prefiller

Project Tactics

• Can you avoid it?

– Phone number / SSN / Zip / DOB?

• Set expectations

– Not always easy!

• Describe the problem

• What tools do you have available?

– Constraints / patterns?

– Back-end data source available?

• Can you run a proof of concept / experiment?

Constraints and Patterns

• Does the number have any known pattern that can be used to limit possible values (and thereby improve recognition)?

– For example:

• First character is always A

• First three characters are always numbers

• Last characters are always C, G or T.

• If the answer is “no,” consider doing your own analysis.

– Even if you don’t think there is a pattern, there may be one.
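
One way to "do your own analysis" is to profile a sample of known-good numbers position by position. The sketch below is only an illustration: the function names and the sample IDs are hypothetical, and a real analysis would run over a large export of valid values.

```python
from collections import defaultdict

def position_profile(ids):
    """Map each character position to the set of characters seen there."""
    profile = defaultdict(set)
    for value in ids:
        for pos, ch in enumerate(value.upper()):
            profile[pos].add(ch)
    return dict(profile)

def describe(profile):
    """Summarize each position as a fixed character, digits, letters, or mixed."""
    summary = []
    for pos in sorted(profile):
        chars = profile[pos]
        if len(chars) == 1:
            summary.append(f"fixed {next(iter(chars))}")
        elif all(c.isdigit() for c in chars):
            summary.append("digit")
        elif all(c.isalpha() for c in chars):
            summary.append("letter in " + "".join(sorted(chars)))
        else:
            summary.append("mixed")
    return summary

# Hypothetical sample IDs, invented for illustration only.
sample_ids = ["A123XC", "A907QG", "A455MT", "A018BC"]
print(describe(position_profile(sample_ids)))
```

With this sample, the first position comes out as a fixed "A" and the last as one of C, G or T, which is the kind of pattern that can then be used to limit the grammar.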

Applying Constraints

• Writing grammar specifically for the pattern

– How complicated is it?

• Applying a constraint list.

– How big is it?
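
As a rough sketch of writing a grammar for a pattern, assuming a recognizer that accepts W3C SRGS XML grammars, one `<one-of>` choice can be generated per character position. Everything specific here is hypothetical: the function name `build_pattern_grammar`, the rule id, and the sample pattern. A production grammar would also need spoken forms (for example "zero" and "oh") and semantic tags, which are left out.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_NS = "http://www.w3.org/XML/1998/namespace"

def build_pattern_grammar(positions, root_id="account_number"):
    """Return SRGS XML with one <one-of> choice per character position."""
    ET.register_namespace("", SRGS_NS)  # emit SRGS as the default namespace
    grammar = ET.Element(
        f"{{{SRGS_NS}}}grammar",
        {"version": "1.0", "mode": "voice", "root": root_id,
         f"{{{XML_NS}}}lang": "en-US"},
    )
    rule = ET.SubElement(grammar, f"{{{SRGS_NS}}}rule",
                         {"id": root_id, "scope": "public"})
    for allowed in positions:
        one_of = ET.SubElement(rule, f"{{{SRGS_NS}}}one-of")
        for ch in allowed:
            ET.SubElement(one_of, f"{{{SRGS_NS}}}item").text = ch
    return ET.tostring(grammar, encoding="unicode")

# Hypothetical pattern: "A", then three digits, then C, G or T.
DIGITS = "0123456789"
pattern = ["A", DIGITS, DIGITS, DIGITS, "CGT"]
grammar_xml = build_pattern_grammar(pattern)
print(grammar_xml)
```

The more positions that can be pinned down this way, the smaller the search space becomes. When the pattern gets too complicated to express per position, a constraint list of valid values may be the simpler tool.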

Using nBest + Back-End Data

• Collect using an unconstrained grammar

• Set your recognizer to return an nBest list.

• Use a webservice / back end data dip to determine which ones are “real.”

• Confirm the first “real” one on the list

– Throw out the ones that are not real.

• If the caller says no, confirm the second “real” one on the list.

– Potentially collect again after that.
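
The steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `is_real` stands in for whatever web service or data dip is available, `confirm` stands in for the confirmation dialog, and the account numbers are made up.

```python
def pick_candidates(nbest, is_real, max_confirmations=2):
    """Return up to max_confirmations back-end-valid hypotheses, in recognizer order."""
    seen = set()
    real = []
    for hyp in nbest:
        if hyp in seen:
            continue  # skip duplicate hypotheses
        seen.add(hyp)
        if is_real(hyp):
            real.append(hyp)  # hypotheses that are not "real" are thrown out
        if len(real) == max_confirmations:
            break
    return real

def collect(nbest, is_real, confirm):
    """Confirm the surviving hypotheses in order; None means collect again."""
    for hyp in pick_candidates(nbest, is_real):
        if confirm(hyp):
            return hyp
    return None

# Hypothetical back-end data and n-best list, for illustration only.
known_accounts = {"BZ390", "BC390"}
nbest = ["DZ390", "BZ390", "BC390", "VZ390"]

# Simulate a caller who rejects the first offer and accepts the second.
result = collect(nbest, known_accounts.__contains__, confirm=lambda h: h == "BC390")
print(result)
```

Capping the number of confirmations at two follows the slide; after that the caller is probably better served by a fresh collection than by a third guess.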

Confirmation Strategy

• PROTIP: Phonemes that are difficult for the recognizer to hear … are also difficult for humans to hear when they are spoken back.

• Confirm using letter names for easily confusable alphanumerics.

– “You said 8, 2, 7 G as in George, B as in Boy, 9. Is that right?”
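
A confirmation prompt of this kind could be assembled along these lines. The letter-name table is an assumption: only "G as in George" and "B as in Boy" come from the slide, the other entries are placeholders, and which letters count as confusable should come from your own tuning data.

```python
# Hypothetical table of letter names for easily confused letters.
# Only the entries for G and B appear in the original slide.
LETTER_NAMES = {
    "B": "Boy",
    "D": "David",
    "G": "George",
    "P": "Peter",
    "T": "Tom",
}

def confirmation_prompt(value):
    """Read back each character, adding a letter name for confusable letters."""
    parts = []
    for ch in value.upper():
        name = LETTER_NAMES.get(ch)
        parts.append(f"{ch} as in {name}" if name else ch)
    return "You said " + ", ".join(parts) + ". Is that right?"

print(confirmation_prompt("827GB9"))
```

Digits and letters without an entry in the table are read back as-is, so the prompt only gets longer where the extra words are likely to help.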

What About Letter Names?

• Yes, with caveats:

– Do you have a special domain that would allow you to teach the caller letter names?

– Letter names invented by the caller will be quite variable.

• Some of the “oddballs” will never be recognized.

– If letter names are used during confirmation, and the utterance is re-collected, the caller may tend to use those letter names during the second collection.

• So add them to the grammar.

What About Letter Names?

• Yes, because:

– Longer utterances such as “B as in Boy” are less likely to be falsely accepted for one another than shorter utterances such as “G” or “T.”

• Make them separate rules so they can be weighted
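
One possible shape for "separate rules so they can be weighted," again assuming an SRGS XML grammar, is a plain-letter rule, a letter-name rule, and a third rule that chooses between them with a `weight` on each alternative. The rule ids, the function name, and the weight values below are hypothetical; real weights would come from tuning. The elements are un-namespaced fragments meant to be placed inside a `<grammar>` element.

```python
import xml.etree.ElementTree as ET

def letter_rules(letter_names, plain_weight=1.0, named_weight=0.5):
    """Build three <rule> fragments: plain letters, letter names, weighted choice."""
    plain = ET.Element("rule", {"id": "plain_letter"})
    plain_choice = ET.SubElement(plain, "one-of")
    named = ET.Element("rule", {"id": "named_letter"})
    named_choice = ET.SubElement(named, "one-of")
    for letter, word in sorted(letter_names.items()):
        ET.SubElement(plain_choice, "item").text = letter
        ET.SubElement(named_choice, "item").text = f"{letter} as in {word}"

    # The combined rule references each sub-rule with its own weight.
    either = ET.Element("rule", {"id": "letter"})
    choice = ET.SubElement(either, "one-of")
    for rule_id, weight in (("plain_letter", plain_weight),
                            ("named_letter", named_weight)):
        item = ET.SubElement(choice, "item", {"weight": str(weight)})
        ET.SubElement(item, "ruleref", {"uri": f"#{rule_id}"})
    return [plain, named, either]

# Letter names taken from the confirmation example; weights are placeholders.
rules = letter_rules({"B": "Boy", "G": "George"})
for rule in rules:
    print(ET.tostring(rule, encoding="unicode"))
```

Keeping the two forms in separate rules means the balance between them can be adjusted in one place without rewriting the letter lists.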

Using Prefiller

• “The account number is… B Z 3 9 0”

– Noticeable improvement in recognition of first letter

– Callers may offer it spontaneously

– Consider teaching the caller to say the prefiller

• Especially if you have repeat callers

Other Suggestions

• Look at speech recognition parameters that are not directly related to alphanumerics

– Are callers calling from a very noisy environment?

• Adjust overall speech threshold

– Timing of utterance collection?

• Listen to recordings of utterances to make sure everything is getting collected

Specific Cases

• VIN

– Has a specific pattern, but it differs for each manufacturer

– 17 characters: nobody will want to re-enter it if you get it wrong.
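
One constraint that does not depend on the manufacturer is the check digit in position 9, which is required for vehicles sold in North America (VINs from other markets may not carry a valid one). As a sketch, it can be used to discard n-best hypotheses that cannot be real before any back-end lookup. The function name is hypothetical; the transliteration values and weights follow the published check-digit scheme.

```python
# Letter-to-number transliteration; I, O and Q are not valid VIN characters.
TRANSLITERATION = {
    **{str(d): d for d in range(10)},
    **dict(zip("ABCDEFGH", range(1, 9))),
    **dict(zip("JKLMN", range(1, 6))),
    "P": 7,
    "R": 9,
    **dict(zip("STUVWXYZ", range(2, 10))),
}
# Position weights; the check digit itself (position 9) has weight 0.
WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]

def vin_check_digit_ok(vin):
    """Return True if a 17-character VIN has a consistent check digit."""
    vin = vin.upper()
    if len(vin) != 17 or any(ch not in TRANSLITERATION for ch in vin):
        return False  # wrong length or an invalid character
    total = sum(TRANSLITERATION[ch] * w for ch, w in zip(vin, WEIGHTS))
    remainder = total % 11
    expected = "X" if remainder == 10 else str(remainder)
    return vin[8] == expected

print(vin_check_digit_ok("1M8GDM9AXKP042788"))
```

A hypothesis that fails this test can be dropped from the n-best list without bothering the caller, which matters when nobody wants to say all 17 characters twice.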

But which way is “the best?”

IT DEPENDS!