INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the...
![Page 1: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/1.jpg)
![Page 2: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/2.jpg)
SIBILANT SPEECH DETECTION IN NOISE
BY: HOSEIN BITARAF
SUPERVISOR: DR. NASERSHARIF
![Page 3: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/3.jpg)
INTRODUCTION
Sibilant speech is aperiodic. The sibilants are the fricatives /s/, /ʃ/, /z/ and /ʒ/ and the
affricates /tʃ/ and /dʒ/. We present a sibilant detection algorithm
that is robust to high levels of noise.
![Page 4: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/4.jpg)
Gaussian model for the noisy speech signal
$X_{k,i}$ = power, $k$ = frequency, $i$ = time frame, $\mu_{k,i}$ = mean power
![Page 5: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/5.jpg)
PSD for /ʃ/
![Page 6: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/6.jpg)
Log-likelihood
$\mu_{k,N_1} = \mu_{k,N_2} = a_k$
$\mu_{k,S} = a_k + b_k$
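The log-likelihood expression itself is not in the transcript. As a sketch only, assume the power in each bin is Gaussian with a fixed variance $\sigma_k^2$ (this variance model is an assumption, not something the slides state). With the frames split into a leading noise segment $N_1$, a sibilant segment $S$ and a trailing noise segment $N_2$, the log-likelihood would then take the form:

```latex
\ln L(a_k, b_k) = -\frac{1}{2\sigma_k^2}
  \left[ \sum_{i \in N_1 \cup N_2} \left( X_{k,i} - a_k \right)^2
       + \sum_{i \in S} \left( X_{k,i} - a_k - b_k \right)^2 \right]
  + \text{const}
```

Under this reading, $a_k$ plays the role of the noise mean power and $b_k$ the extra power contributed by the sibilant.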
![Page 7: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/7.jpg)
Maximizing the log-likelihood
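74% of sibilants last between 60 and 130 ms. With t measured from the centre of the sibilant:
|t| < 30 ms: high probability of being inside the sibilant.
|t| > 65 ms: high probability of being outside the sibilant.
The weighting reduces the contribution of the transition region, 30 ms < |t| < 65 ms.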
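The three regions above can be sketched as a per-frame weighting followed by a weighted mean for each region. This is a simplified illustration, not the slides' method: it gives the transition region zero weight instead of tapering it, and it assumes equal-variance Gaussian power, in which case the weighted means are the maximum-likelihood estimates. The names `region_weights` and `estimate_means` are hypothetical.

```python
def region_weights(t_ms):
    """Return (sibilant_weight, noise_weight) for a frame at offset t_ms.

    Hard-threshold simplification: the transition region gets zero weight.
    """
    t = abs(t_ms)
    if t < 30.0:
        return 1.0, 0.0   # very likely inside the sibilant
    if t > 65.0:
        return 0.0, 1.0   # very likely outside the sibilant
    return 0.0, 0.0       # transition region, ignored here


def estimate_means(powers, offsets_ms):
    """Estimate noise mean a and sibilant excess b for one frequency bin."""
    sib_sum = sib_w = noise_sum = noise_w = 0.0
    for x, t in zip(powers, offsets_ms):
        ws, wn = region_weights(t)
        sib_sum += ws * x
        sib_w += ws
        noise_sum += wn * x
        noise_w += wn
    if sib_w == 0.0 or noise_w == 0.0:
        raise ValueError("window must cover both sibilant and noise regions")
    a = noise_sum / noise_w        # mean power outside the sibilant
    b = sib_sum / sib_w - a        # extra mean power inside the sibilant
    return a, b


# Toy data: power 5.0 near the centre, 1.0 elsewhere, one frame every 10 ms.
offsets = list(range(-100, 101, 10))
powers = [5.0 if abs(t) < 30 else 1.0 for t in offsets]
print(estimate_means(powers, offsets))  # (1.0, 4.0)
```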
![Page 8: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/8.jpg)
Maximizing the log-likelihood
![Page 9: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/9.jpg)
Maximizing the log-likelihood
![Page 10: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/10.jpg)
Maximizing the log-likelihood
![Page 11: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/11.jpg)
Estimate noise and sibilant
![Page 12: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/12.jpg)
Estimated sibilant mean power
![Page 13: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/13.jpg)
Maximum filter
W = 30
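A maximum filter replaces each sample with the largest value in a sliding window around it. The sketch below is a plain reference version with a centred window that shrinks at the edges. The slide does not give the unit of W (frames, samples or milliseconds), so `width` is simply a number of samples here, and the function name is hypothetical.

```python
def maximum_filter(values, width=30):
    """Centred sliding maximum over `width` samples."""
    if width < 1:
        raise ValueError("width must be at least 1")
    left = (width - 1) // 2       # samples taken before the centre
    right = width - 1 - left      # samples taken after the centre
    return [
        max(values[max(0, i - left): i + right + 1])
        for i in range(len(values))
    ]


print(maximum_filter([1, 3, 2, 0, 0, 5], width=3))  # [3, 3, 3, 2, 5, 5]
```

This direct form costs O(n · width); for long signals a deque-based sliding maximum would give the same output in linear time.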
![Page 14: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/14.jpg)
Normalization
To make the estimate independent of the overall speech level
![Page 15: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/15.jpg)
Gaussian Mixture Model
Each frame is evaluated with two Gaussian mixture models (GMMs):
one trained on non-sibilant speech and the other on sibilant speech.
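A two-model decision of this kind can be sketched as a log-likelihood ratio test. The code assumes diagonal covariances and a threshold of zero, neither of which the slides specify, and the toy parameters are made up for illustration. A GMM is represented as a list of `(weight, mean, variance)` triples.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) triples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)
    # log-sum-exp, shifted by the largest term for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def is_sibilant(x, sibilant_gmm, non_sibilant_gmm, threshold=0.0):
    """Classify a frame by comparing the two models' log-likelihoods."""
    llr = gmm_log_likelihood(x, sibilant_gmm) - gmm_log_likelihood(x, non_sibilant_gmm)
    return llr > threshold


# Made-up 2-D models, for illustration only.
sibilant_gmm = [(1.0, [5.0, 5.0], [1.0, 1.0])]
non_sibilant_gmm = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])]
print(is_sibilant([4.5, 5.2], sibilant_gmm, non_sibilant_gmm))
print(is_sibilant([0.3, 0.8], sibilant_gmm, non_sibilant_gmm))
```

Raising or lowering `threshold` trades misses against false alarms.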
![Page 16: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/16.jpg)
EXPERIMENTS
Filter from 1.5 kHz to 8 kHz. The weighting functions used were three
Hamming windows.
![Page 17: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/17.jpg)
GMMs
The input for the GMMs was a 14-component vector
containing the estimated sibilant power spectrum from
1.5 kHz to 8 kHz every 500 Hz
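Sampling from 1.5 kHz to 8 kHz in 500 Hz steps gives exactly 14 points, which matches the stated vector length. The sketch below checks that count and picks the nearest spectrum bin for each point. Nearest-bin sampling is an assumption (the slides do not say whether values are sampled or averaged over bands), and the function names are hypothetical.

```python
def band_centres_hz(start_hz=1500, stop_hz=8000, step_hz=500):
    """Frequencies at which the sibilant power spectrum is sampled."""
    return list(range(start_hz, stop_hz + 1, step_hz))


def feature_vector(spectrum, bin_spacing_hz):
    """Take the spectrum value at the bin nearest each centre frequency.

    `spectrum[k]` is assumed to hold the power at frequency k * bin_spacing_hz.
    """
    return [spectrum[round(f / bin_spacing_hz)] for f in band_centres_hz()]


centres = band_centres_hz()
print(len(centres), centres[0], centres[-1])  # 14 1500 8000
```

For example, a 512-point FFT at a 16 kHz sampling rate has bins 31.25 Hz apart, so the 14 features would come from bins 48, 64, and so on up to 256.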
![Page 18: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/18.jpg)
Result
White Gaussian noise was added to the speech files.
It is more difficult to detect sibilants in white noise than in other typical stationary noise.
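Adding white Gaussian noise at a chosen SNR can be sketched as follows. The SNR here is defined from the mean power of the whole signal, which is an assumption: the slides do not say how the speech level was measured. The function name and the `seed` parameter are hypothetical.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=0):
    """Return signal plus white Gaussian noise at the requested SNR in dB."""
    if not signal:
        raise ValueError("signal must not be empty")
    rng = random.Random(seed)                      # seeded for repeatability
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)                 # noise standard deviation
    return [s + rng.gauss(0.0, sigma) for s in signal]


# One second of a 440 Hz tone at 16 kHz, mixed with noise at 10 dB SNR.
tone = [math.sin(2.0 * math.pi * 440.0 * n / 16000.0) for n in range(16000)]
noisy = add_white_noise(tone, snr_db=10.0)
```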
![Page 19: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/19.jpg)
Result
Pmiss = miss probability
Pfa = false alarm probability
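These two rates can be computed from per-frame labels as in the sketch below. It uses the usual definitions: misses divided by true sibilant frames, and false alarms divided by true non-sibilant frames. Scoring per frame is an assumption, since the slides do not state the scoring unit, and the function name is hypothetical.

```python
def error_rates(reference, detected):
    """Return (p_miss, p_fa) from two equal-length boolean sequences."""
    if len(reference) != len(detected):
        raise ValueError("sequences must have the same length")
    misses = sum(1 for r, d in zip(reference, detected) if r and not d)
    false_alarms = sum(1 for r, d in zip(reference, detected) if d and not r)
    positives = sum(1 for r in reference if r)
    negatives = len(reference) - positives
    p_miss = misses / positives if positives else 0.0
    p_fa = false_alarms / negatives if negatives else 0.0
    return p_miss, p_fa


reference = [True, True, True, True, False, False, False, False, False, False]
detected = [True, True, True, False, True, False, False, False, False, False]
print(error_rates(reference, detected))  # one miss in 4, one false alarm in 6
```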
![Page 20: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/20.jpg)
Result
![Page 21: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/21.jpg)
Result
![Page 22: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/22.jpg)
CONCLUSIONS
We have presented a sibilant detection algorithm for noisy speech.
It combines a sibilant mean power estimation stage with the likelihood ratio of two GMMs. Tested on TIMIT: 80% classification accuracy for positive
SNRs.
![Page 23: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/23.jpg)
Future Work
Classification accuracy could possibly be improved further by applying temporal constraints to the classification decisions.
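The slides do not say what form a temporal constraint would take. One simple possibility, shown purely as an illustration, is a sliding majority vote over the per-frame decisions, which removes isolated detections and fills short gaps. The function name and the default width are hypothetical.

```python
def majority_smooth(decisions, width=5):
    """Smooth boolean frame decisions with a centred majority vote."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd number")
    half = width // 2
    smoothed = []
    for i in range(len(decisions)):
        window = decisions[max(0, i - half): i + half + 1]
        # True only if strictly more than half of the window is True
        smoothed.append(2 * sum(window) > len(window))
    return smoothed


raw = [False, False, True, False, False, True, True, True,
       False, True, True, False, False]
print(majority_smooth(raw, width=3))
```

In this toy input the isolated detection at index 2 is dropped and the one-frame gap at index 8 is filled.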
![Page 24: INTRODUCTION Sibilant speech is aperiodic. the fricatives /s/, / ʃ /, /z/ and / Ʒ / and the affricatives /t ʃ / and /d Ʒ / we present a sibilant.](https://reader035.fdocuments.in/reader035/viewer/2022062717/56649e315503460f94b22891/html5/thumbnails/24.jpg)
Thank you