Speech recognition in MUMIS Eric Sanders (KUN) March 2003.
Speech recognition in MUMIS
Eric Sanders (KUN)
March 2003
People involved at KUN
Helmer Strik
Judith Kessens
Mirjam Wester
Janienke Sturm
Eric Sanders
Febe de Wet
Paul Tielen
Overview
Speech data
Baseline recognition
Adding data
Noise robustness
Word types
Conclusions
Examples of Data
Dutch: “op _t ogenblik wordt in dit stadion de opstelling voorgelezen”
English: “and they wanna make the change before the corner”
German: “und die beiden Tore die die Hollaender bekommen hat haben”
All examples are from the match Yugoslavia – The Netherlands
Speech Data
All data

| Language | Dutch | English | German |
| --- | --- | --- | --- |
| # matches | 6 | 3 | 21 |
| # words | 40,296 | 34,684 | 127,265 |
Speech Data
Test data (# words)

| Match | Dutch | English | German |
| --- | --- | --- | --- |
| Yugoslavia – The Netherlands | 5,922 | 10,188 | 3,998 |
| England – Germany | 5,798 | 13,488 | 7,280 |
Baseline recognition
Phone models (PMs):
- trained on the other test match
Lexicon (Lex):
- based on the other test set
- match specific words added
Language model (LM):
- category LM
- based on the other test match
- match specific words added
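The slide does not say how the category LM is built. One common reading is a class-based n-gram, in which match specific words such as player names are replaced by a category token before counting, so that names from a new match can be added to the category without re-estimating the n-gram counts. The following is a minimal sketch under that assumption only; the `<PLAYER>` token, the placeholder names `player_a` and `player_b`, the toy sentences and all function names are hypothetical and do not come from the talk.

```python
from collections import Counter


def to_categories(words, categories):
    """Replace every word that belongs to a category by its category token."""
    return [categories.get(w, w) for w in words]


def train_bigram_counts(sentences, categories):
    """Count history unigrams and bigrams over category-mapped sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + to_categories(sentence.split(), categories) + ["</s>"]
        unigrams.update(tokens[:-1])                  # every token that acts as a history
        bigrams.update(zip(tokens[:-1], tokens[1:]))  # (history, next token) pairs
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams, categories):
    """Unsmoothed category-level estimate of P(word | prev)."""
    prev_c = categories.get(prev, prev)
    word_c = categories.get(word, word)
    if unigrams[prev_c] == 0:
        return 0.0
    return bigrams[(prev_c, word_c)] / unigrams[prev_c]


# Hypothetical toy data: only player_a occurs in the training text.
categories = {"player_a": "<PLAYER>", "player_b": "<PLAYER>"}
train = ["pass from player_a", "shot by player_a"]
unigrams, bigrams = train_bigram_counts(train, categories)

# player_b was never seen, but shares the statistics of its category.
print(bigram_prob("from", "player_b", unigrams, bigrams, categories))
```

The sketch stops at the category level. A full class-based model would also multiply by the probability of the word within its category, and would apply smoothing.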
Baseline recognition
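[Figure: bar chart of WER (%) for Dutch, German and English on the two test matches, YugNL and EngGer; the y-axis runs from 78 to 94. Bar labels in extraction order: 83.28, 84.91, 86.84, 93.16, 85.71, 85.21.]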
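WER (word error rate) is the measure used in all charts of this talk. The slides do not show how it was computed; the usual definition is the minimum number of word substitutions, deletions and insertions needed to turn the recognised word sequence into the reference transcription, divided by the number of reference words. A minimal sketch of that standard definition follows. The function name and the example hypothesis are assumptions for illustration, not recogniser output from MUMIS.

```python
def word_error_rate(reference, hypothesis):
    """Return the WER in percent between two whitespace-tokenised strings."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Reference taken from the English example; the hypothesis is invented.
reference = "and they wanna make the change before the corner"
hypothesis = "and they want to make a change before corner"
print(round(word_error_rate(reference, hypothesis), 1))
```

In this invented example there are 4 errors (2 substitutions, 1 insertion, 1 deletion) against 9 reference words, so the WER is about 44.4%. Because insertions count as errors, a WER can exceed 100%.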
Adding Data
Extra training data:
- Dutch = 4 matches
- German = 19 matches
- English = 1 match
Adding training data to train the lexicon and the language models (phone models trained on 1 match)
Adding Data (German)
[Figure: line plot of WER (%) (y-axis, 75 to 95) against the number of words used to train the LM (x-axis, 0 to 300,000). Curves: Yug-NL with the lexicon from 1 match, 7 matches and 19 matches; Eng-Ger with the lexicon from 7 matches and 19 matches.]
Noise Robustness
Dutch English German
Noise Robustness
[Figure: plot of WER (%) (y-axis, 0 to 100) against SNR (dB) (x-axis, 0 to 30). Data series: YugNL_NL, EngGer_NL, YugNL_ENG, EngGer_ENG, YugNL_GER (A), YugNL_GER (B), Eng-Ger_GER.]
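The slides do not say how the SNR values were estimated. As a reminder of what the axis measures, SNR in dB is ten times the base-10 logarithm of the ratio between signal power and noise power. The sketch below assumes that separate sample segments for speech and for background noise are available, which is a simplification: in commentary recordings the "speech" segments also contain stadium noise. The function name and the sample values are hypothetical.

```python
import math


def snr_db(speech_samples, noise_samples):
    """Estimate the SNR in dB from the mean power of two sample segments."""

    def mean_power(samples):
        if not samples:
            raise ValueError("segment must not be empty")
        return sum(s * s for s in samples) / len(samples)

    p_speech = mean_power(speech_samples)
    p_noise = mean_power(noise_samples)
    if p_noise == 0:
        return math.inf    # no measurable noise
    if p_speech == 0:
        return -math.inf   # no measurable speech
    return 10.0 * math.log10(p_speech / p_noise)


# Amplitude ratio 10:1 gives a power ratio of 100:1, i.e. 20 dB.
print(snr_db([10, -10, 10, -10], [1, -1, 1, -1]))
```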
Noise Robustness
Possible solutions:
- Matching acoustic properties of train and test material
- Training SNR dependent phone models
- Applying noise robust feature extraction: Histogram Normalisation & FTNR
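Histogram Normalisation is named on this slide but not described. In its common formulation, each feature dimension of the test material is transformed so that its cumulative distribution matches a reference distribution estimated on the training material. The sketch below shows that idea as plain quantile mapping on a single feature dimension. It is an illustration under that assumption, not the implementation used at KUN; the function name and the toy values are hypothetical. FTNR is not sketched because the slides give no details about it.

```python
from bisect import bisect_right


def histogram_normalise(values, reference):
    """Map each value to the reference value at the same empirical quantile."""
    if not values or not reference:
        raise ValueError("both sequences must be non-empty")
    sorted_values = sorted(values)
    sorted_reference = sorted(reference)
    n, m = len(sorted_values), len(sorted_reference)

    normalised = []
    for v in values:
        # Rank of v in its own distribution: number of values <= v (1..n).
        rank = bisect_right(sorted_values, v)
        # Reference index at the same quantile: ceil(rank * m / n) - 1,
        # computed with integers to avoid floating-point rounding.
        index = (rank * m + n - 1) // n - 1
        normalised.append(sorted_reference[index])
    return normalised


# Toy example: noisy features on a different scale than the reference.
print(histogram_normalise([10, 30, 20], [1, 2, 3]))  # prints [1, 3, 2]
```

The mapping preserves the order of the values within the utterance while forcing their distribution onto the reference one, which is why it can compensate for distortions that shift or compress the feature range.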
Noise Robustness
YUG-NL, very noisy
[Figure: bar chart of WER (%) (y-axis, 66 to 82) for three conditions (Semi-clean, Noisy, Very noisy), comparing Baseline, HN, and HN + FTNR.]
Word Types
Not all words are equally important for an information retrieval task
Categories:
- function words (prepositions, pronouns)
- application specific words (player names)
- other content words
WERs for different categories
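The slides do not state how errors were assigned to categories. One straightforward way is to align the reference and the recognised words with a minimum edit distance alignment and to count, per category, the reference words that were substituted or deleted. The sketch below follows that assumption; insertions are ignored because they have no reference word to attribute them to. The word lists, the placeholder player names and the function names are hypothetical.

```python
from collections import Counter


def align(ref, hyp):
    """Return (ref_word, hyp_word) pairs of a minimum edit distance alignment.

    A deleted reference word is paired with None; an inserted hypothesis
    word is paired with None on the reference side.
    """
    n, m = len(ref), len(hyp)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i
    for j in range(1, m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)

    # Trace back from the end of both sequences to recover the alignment.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        cost = 0 if i > 0 and j > 0 and ref[i - 1] == hyp[j - 1] else 1
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + cost:
            pairs.append((ref[i - 1], hyp[j - 1]))   # match or substitution
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            pairs.append((ref[i - 1], None))         # deletion
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))         # insertion
            j -= 1
    pairs.reverse()
    return pairs


def category_error_rates(ref, hyp, category_of):
    """Percentage of reference words per category that were not recognised."""
    totals, errors = Counter(), Counter()
    for ref_word, hyp_word in align(ref, hyp):
        if ref_word is None:
            continue  # insertion: not attributed to any category
        category = category_of(ref_word)
        totals[category] += 1
        if ref_word != hyp_word:
            errors[category] += 1
    return {c: 100.0 * errors[c] / totals[c] for c in totals}


# Hypothetical word lists for the three categories on the slide.
FUNCTION_WORDS = {"the", "from", "to"}
PLAYER_NAMES = {"player_a", "player_b"}


def category_of(word):
    if word in PLAYER_NAMES:
        return "player"
    return "function" if word in FUNCTION_WORDS else "content"


ref = "the pass from player_a to player_b".split()
hyp = "a pass from player_a to".split()
print(category_error_rates(ref, hyp, category_of))
```

In this toy case one of three function words is substituted and one of two player names is deleted, so the function-word rate is about 33.3%, the player-name rate is 50% and the content-word rate is 0%.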
Word Types
[Figure: bar chart of WER (%) (y-axis, 0 to 100) per language (NL, Ger, Eng) for the two test matches YugNL and EngGer, broken down by word category: all, content words, function words, player names.]
Conclusions
SNR values explain the WERs to a large extent
More data is not necessarily better
Applying noise robust features leads to best results
Overall WERs are very high, but application specific words are recognised relatively well
The end