Micai 13 contextualized practical speech

Practical Speech Recognition for Contextualized Service Robots

Departamento de Ciencias de la Computación

Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas

Universidad Nacional Autónoma de México

http://golem.iimas.unam.mx/

Ivan Meza, Caleb Rascón and Luis Pineda

Grupo Golem

Service robots

● Our future butlers

● They are task oriented

○ Clean up a room

○ Play a game

● Interaction with spoken language

● They work in noisy environments

● Microphone is not close to the speaker

● Poor speech recognition

Proposal

● Improve the system in four aspects

● Contextualized recogniser

● Prompting strategies

● Recovery strategies

● Audio calibration

I. Contextualized recognition

● Use specific language models for the given expectations

■ YES: yes, okay, all right

■ NO: no, don’t, do not

■ NAVIGATE: go to the kitchen, go to the living room, go to the bedroom
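Using one language model per expectation can be sketched as a lookup from the active expectations to a small set of accepted phrases. This is only an illustration: it assumes each language model can be reduced to a phrase set, and the names `LANGUAGE_MODELS` and `recognise_in_context` are hypothetical, not taken from the robot's software.

```python
# Hypothetical sketch: each expectation maps to the only phrases the
# recogniser may return while that expectation is active.
LANGUAGE_MODELS = {
    "YES": {"yes", "okay", "all right"},
    "NO": {"no", "don't", "do not"},
    "NAVIGATE": {
        "go to the kitchen",
        "go to the living room",
        "go to the bedroom",
    },
}


def recognise_in_context(hypotheses, expectations):
    """Return (expectation, phrase) for the first hypothesis that fits
    one of the active expectations, or None if nothing fits."""
    for phrase in hypotheses:
        normalised = phrase.strip().lower()
        for expectation in expectations:
            if normalised in LANGUAGE_MODELS[expectation]:
                return expectation, normalised
    return None
```

With only YES and NO active, a hypothesis such as "go to the kitchen" is rejected, which is the point of restricting the recogniser to the current context.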

ASR module

II. Prompting strategies

● Let the user know when to speak

■ Beep sound

● Speaker volume monitor

■ Could you speak louder or softer
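A volume monitor of this kind could be sketched as a level check on the captured audio. The thresholds below are hypothetical values chosen for illustration, and the function names are assumptions; the slides do not say how the robot decides that speech is too soft or too loud.

```python
import math

# Hypothetical thresholds in dB relative to full scale (dBFS).
TOO_SOFT_DBFS = -45.0
TOO_LOUD_DBFS = -10.0


def level_dbfs(samples):
    """RMS level of samples in the range [-1.0, 1.0], in dBFS."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def volume_prompt(samples):
    """Return a prompt for the user, or None if the level is acceptable."""
    level = level_dbfs(samples)
    if level < TOO_SOFT_DBFS:
        return "Could you speak louder?"
    if level > TOO_LOUD_DBFS:
        return "Could you speak softer?"
    return None
```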

III. Recovery strategy

● Let the user know when something went wrong

■ Could you repeat?

■ I can’t hear you well, could you repeat?

■ Sorry, I’m a little deaf

IV. Calibration of audio setting

● Hardware

■ 1 directional microphone

■ 1 USB interface with 4 channels

■ 2 speakers

● Calibration of SNR in situ

■ Background noise: -58 dB

■ SNR set to 20 dB
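Since an SNR in dB is the speech level minus the noise level, the two calibration figures imply a minimum speech level. The sketch below works that out; treating the result as an acceptance threshold is an assumption, and the function names are hypothetical.

```python
NOISE_FLOOR_DB = -58.0  # background noise level reported on the slide
TARGET_SNR_DB = 20.0    # SNR setting reported on the slide


def required_speech_level_db(noise_db, snr_db):
    """Speech level needed to reach snr_db above noise_db."""
    return noise_db + snr_db


def meets_snr(speech_db, noise_db=NOISE_FLOOR_DB, snr_db=TARGET_SNR_DB):
    """True if speech at speech_db is at least snr_db above the noise."""
    return speech_db - noise_db >= snr_db


print(required_speech_level_db(NOISE_FLOOR_DB, TARGET_SNR_DB))  # -38.0
```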

Corpus evaluation

● Logs from the robot performing RoboCup tasks

■ 2 years of interactions in lab and competition

■ 1,439 utterances

■ 2,472 tokens

■ 120 types

■ 11 tasks

■ 9 of 11 tasks are contextualized

■ 14 language models

Contextualized recognition

● We measure WER (lower is better)

● With a unique LM for all tasks: 53.84%

● With task-based LM: 28.28%

● With contextualized: 23.42%

17.2% relative error reduction
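The 17.2% figure is the drop from the task-based WER to the contextualized WER, taken relative to the task-based WER. The sketch below reproduces that arithmetic and shows one common way to compute WER (word-level edit distance divided by reference length); the evaluation tooling actually used is not stated in the slides, so both function names are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so
    # far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_error_reduction(baseline_wer, improved_wer):
    """Fraction of the baseline error removed by the improved system."""
    return (baseline_wer - improved_wer) / baseline_wer


# Task-based LM (28.28%) versus contextualized LM (23.42%).
print(round(100 * relative_error_reduction(28.28, 23.42), 1))  # 17.2
```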

Beep sound

● 79 utterances were recorded without the beep sound

■ Without beeps 55.86%

■ With beeps 39.75%

■ With beeps full 53.72%

30% to 4% relative error reduction

Usage of SoundLoc System

● We measure usage

■ 174 times could have been triggered

■ 21 soft speech

■ 4 louder

14.36% of the time

Recovery strategy

● We measure usage

■ 504 times could have been triggered

■ 85 times activated

16.87% of the time

Conclusions

● Each of these strategies improves performance by a small amount

● Together they allow practical speech recognition on a service robot

Thank you

● Questions?
