Uses for Automatic Speech Recognition with Diverse English Speakers
2002 American Speech-Language-Hearing Association Annual Convention
Atlanta, Georgia World Congress Center, Room: A314, Saturday, Nov 23 2002 4:30PM – 5:30PM
Presenters/Authors: Kathleen Eilers Crandall, Ph.D., Paula M. Brown, Ph.D., Donna E. Gustina, and Stephen S. Campbell
National Technical Institute for the Deaf, Rochester Institute of Technology
Seminar – Presenters
Kathleen Eilers Crandall, Ph.D., Department of English, National Technical Institute for the Deaf, Rochester Institute of Technology
Paula M. Brown, Ph.D., CCC-SLP, Department of Speech and Language, National Technical Institute for the Deaf, Rochester Institute of Technology
The Glossograph
• Fay wrote about an experimental mechanical device used to transcribe human speech, and said,
• “… it is not unreasonable to hope that some instrument will yet be contrived …”
Fay, E.A. (1883). The glossograph. American Annals of the Deaf, 28, 67-69.
Sci-Fi or Reality?
“The pen was an archaic instrument, seldom used even for signatures... Apart from very short notes, it was usual to dictate everything into the speak-write…” (Nineteen Eighty-Four. Orwell, 1949)
Two Projects
• Teacher use of ASR:
  – English Classroom/Lab Project
• Student use of ASR:
  – Speech Project
Funded by a grant from the Parsons Foundation of California
English Classroom/Lab Project
Purpose
Investigate direct use of ASR by classroom teacher to learn:
• Is acceptable recognition level attained?
• Under what conditions?
  – Style of speaking
  – Communication mode
  – Language complexity
Related Work
Use of ASR by an intermediary
• Intermediary, a ‘captionist,’ re-speaks professor’s words into a computer
• Intermediary summarizes professor’s words into a computer (‘interpreted speech’)
• Intermediary may use C-print (a shorthand typing system) in combination with ASR http://cprint.rit.edu/
Related Work
Use of ASR by the primary speaker
• iCommunicator™ http://www.myicommunicator.com/product_info.html
• Liberated Learning Environment http://www.liberatedlearning.com (St. Mary’s University, Halifax, Nova Scotia)
Speech Project
Speech Project Intent
• Can ASR become better than a naïve listener?
• Can ASR serve as an effective and motivating feedback system?
Speech Project How ASR Is Used Educationally
Visual displays provide feedback regarding speech production
• Natural way of learning
• Expect feedback to reflect accuracy
  – Assumption: if you don’t get the right picture, you were wrong
English Classroom/Lab Project
Teacher -- Students
• Teacher -- Speaker
  – Native speaker of American English
  – User of ASL as a second language
  – Trained the ASR equipment
• Students -- Readers
  – Young adult college students who are deaf or hard-of-hearing
  – Reading and writing skills at the lowest quartile of entering students
  – Enrolled in basic level English language reading and writing courses
English Classroom/Lab Project
Evaluation Procedures
• ASR Software:
  – Dragon NaturallySpeaking
  – IBM ViaVoice
  – Microsoft Office
• Speaking styles:
  – Spontaneous conversation
  – Dictation-like speech
• Communication modes:
  – Speaking
  – Simultaneously speaking and signing
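The slides report recognition accuracy as a percentage but do not say how it was scored. A minimal sketch of one common scoring method, assuming word-level accuracy defined as 1 minus the word error rate (substitutions, insertions, and deletions counted by edit distance against a reference transcript). The function names and the sample sentences are hypothetical and are not taken from the project.

```python
def word_errors(reference, hypothesis):
    """Minimum number of word substitutions, insertions, and deletions
    needed to turn the hypothesis (ASR output) into the reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Word accuracy = 1 - (word errors / reference length).
    Can drop below 0 when the hypothesis has many extra words."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_errors(reference, hypothesis) / ref_len


# Hypothetical example: one substituted word out of seven.
spoken = "please open your books to page ten"
recognized = "please open your box to page ten"
print(f"{word_accuracy(spoken, recognized):.1%}")
```

Scoring each speaking style and communication mode against the same reference text would make the conditions directly comparable under this assumed measure.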
English Classroom/Lab
[Figure: classroom/lab layout showing the teacher station, control system, Smart Board & LCD projector, and student stations]
English Classroom/Lab Project
Accuracy Needs
• Vary by population and message predictability
  – New vs. known information
  – Fluent readers vs. language learners
  – Reading for pleasure vs. reading to master new information
• CLOZE research and prediction of missing information
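For readers unfamiliar with the cloze procedure mentioned above: words are deleted from a passage and the reader tries to supply them, which indicates how much missing (or misrecognized) text a reader can predict from context. A minimal sketch, assuming the common convention of deleting every fifth word; the slides do not state which deletion scheme the cited research used, and the function name and sample sentence are hypothetical.

```python
def make_cloze(text, n=5, blank="_____"):
    """Replace every n-th word with a blank.
    Returns the blanked passage and the list of deleted words."""
    if n < 1:
        raise ValueError("n must be at least 1")
    words = text.split()
    answers = []
    # Indices n-1, 2n-1, ... are the n-th, 2n-th, ... words.
    for i in range(n - 1, len(words), n):
        answers.append(words[i])
        words[i] = blank
    return " ".join(words), answers


passage, answers = make_cloze(
    "the teacher speaks and the computer shows the words on the screen"
)
print(passage)
print(answers)
```

A fluent reader will usually restore function words such as those deleted here; a language learner reading new information may not, which is the reason accuracy needs vary by population.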
English Classroom/Lab Project
Results: ASR Software
[Chart: recognition accuracy (axis 75% to 100%) for Dragon, ViaVoice, and XP, comparing Conversation and Dictation]
English Classroom/Lab Project
Results: Communication Mode
[Chart: recognition accuracy (axis 80% to 98%) for Simultaneous Communication and Speech Only, comparing Conversation and Dictation]
English Classroom/Lab Project
Results: Language Complexity
[Chart: recognition accuracy (axis 82% to 98%) for text below 7th grade level and above 7th grade level, comparing Conversation and Dictation]
English Classroom/Lab Project
Correcting Text
• Error correction
  – What to correct
  – When to correct
  – How to correct
Multitasking Demands
• Normal tasks for speaker/teacher
  – Formulating ideas relevant to topic
  – Attending to learning needs of students
  – Meeting lipreading and sign language needs
• Added tasks for speaker/teacher
  – Speaking to produce readable ASR text
  – Monitoring text
  – Making corrections
Speech Project
Training Sequence
• Read a paragraph
• Correct and train recognition errors
• Reread paragraph
• Correct and train recognition errors
• Create transfer paragraph or spontaneous speech
• Correct and train recognition errors
Recognition Accuracy
[Chart: recognition accuracy (axis 0% to 100%) for three speakers labeled M Intel, F semi-intel, and F quasi-intel]
Improvement Across Sessions
[Chart: recognition accuracy (axis 0% to 80%) at time 1 through time 5]
Improvement Within Session
[Chart: recognition accuracy (axis 65% to 95%) for Reading 1, Reading 2, Reading 3, and Spontaneous Speech]
Speech Project
Improvement Evaluated
• Improvement across sessions
• Improvement within a session
  – Improvement with speaker training
  – Improvement with ASR training
Recommendations, Discussion, Questions
Grammatical Correctness
• Is ASR accuracy affected by the grammatical correctness of the user’s speech?
• Student written responses spoken as written: Accuracy 93.8%
• Student written responses spoken after correction: Accuracy 94.3%
Style of Speaking
1. Style of speaking that more closely resembles dictation approaches a usable accuracy rate.
2. Lowering the complexity does not improve accuracy.
Conditions of Use
Direct use of ASR by a language teacher -- Useful only under very controlled conditions.
• Illustrating the generation of written language
• Demonstrating the use of notes and outlines to produce written text
• Translating selected sign language utterances into English text during discussions
ASR: Classroom Use
[Figure: screenshots of the prepared outline, the student’s screen, and the teacher’s screen]
Considerations
• Training
  – Critical to reach over 90% accuracy
  – Training with conversation
• Corrections
  – Familiarity with strategies
  – Dictate, Spell, Right click
• Equipment
  – Microphone headsets - design, comfort, and size
  – Demand on computer processor
  – Effect of optional settings
Language Processing
Teaching/Learning Issues:
• Does ASR promote the learning of reading and writing for Deaf and Hard-of-Hearing students?
• How do students process this information?
• Do students attend to multiple inputs?
• Can teachers attend to this many tasks effectively?
More Questions
• Who is at fault?
  – Speaker or ASR receiver?
• Acceptability of input
  – Various voices
  – Nontypical speakers
• User friendliness
  – Want immediate use
Presenters
Kathleen Eilers Crandall, Ph.D.
Department of English
National Technical Institute for the Deaf
Rochester Institute of Technology
Lyndon Baines Johnson Building - 2264
Phone: (585) 475-5111
Fax: (585) 475-6500
Email: [email protected]
Web: http://www.rit.edu/~kecncp
Paula M. Brown, Ph.D., CCC-SLP
Department of Speech and Language
National Technical Institute for the Deaf
Rochester Institute of Technology
Lyndon Baines Johnson Building - 3851
Phone: (585) 475-6593 V/TDD
Fax: (585) 475-6500
Email: [email protected]
Web: http://www.rit.edu/~462www/