Post on 02-Jun-2018
8/10/2019 ZCR Based Identification of Voiced Unvoiced and Silent Parts of Speech Signal in Presence of Background Noise
ZCR Based Identification of Voiced, Unvoiced and Silent Parts of Speech Signal in Presence of Background Noise
Presented by
Sivaranjan Goswami, B. Tech. 4th Year
Department of Electronics and Communication Engineering
Don Bosco College of Engineering and Technology
Assam Don Bosco University
Guwahati, Assam (India)
Contact: sivgos@gmail.com
Outline of Presentation
Introduction and a Brief Overview
Speech Signal
Experimental Details
Proposed Algorithms
Experimental Results
Discussion and Bibliography
Introduction (1 of 2)
The identification of voiced, unvoiced and silent parts of a speech signal is an important step in speech processing.
When the background is quiet, it can be achieved easily by estimating the short-time zero-crossing rate and the short-time average magnitude.
In the presence of background noise, however, it is a challenging task.
Introduction (2 of 2)
A simple algorithm is designed based on short-time zero-crossing rate (ZCR) and short-time average magnitude to identify the voiced, unvoiced and silent frames of speech in a quiet background.
The algorithm is then improved to serve the same purpose in the presence of real background noise.
The second algorithm is found to reduce the errors of the first algorithm by approximately 60%.
A Brief Overview
The first algorithm relies entirely on the short-time zero-crossing rate (ZCR) and the short-time average magnitude to identify the voiced, unvoiced and silent frames of speech.
The modified algorithm first processes 1 second of background noise alone at the beginning of the recording and creates a reference of the background noise.
The noise reference is then used to separate voiced or unvoiced frames from frames containing only noise.
Speech Signal
Human Speech Production System
Types of Excitation
Voiced
Unvoiced
Mixed
Plosive
Whisper
Silent
Types of Excitation
Voiced: high amplitude, low frequency (low ZCR), quasi-periodic pulses
Unvoiced: random signal with low amplitude and high ZCR
Mixed
Plosive
Whisper
Silent
Only voiced and unvoiced excitations are of interest here.
Experimental Details
Calculation of Zero-Crossing Rate (ZCR)
The ZCR of a signal within a short time interval t is found using the equation:
$$\mathrm{ZCR}_{\text{average}} = \frac{N}{2t} \qquad (1)$$
where N is the number of times the polarity of the signal changes during t.
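Equation 1 can be illustrated with a minimal Python sketch. The function name `average_zcr`, the 8 kHz sampling rate and the 500 Hz test tone are assumptions chosen for illustration; they do not come from the presentation. Because a sinusoid changes polarity twice per cycle, N/(2t) approximates its frequency in Hz.

```python
import numpy as np

def average_zcr(frame, duration_s):
    """Return N / (2 t), where N counts polarity changes in a frame lasting t seconds."""
    signs = np.signbit(np.asarray(frame, dtype=float))
    n_changes = int(np.count_nonzero(signs[1:] != signs[:-1]))
    return n_changes / (2.0 * duration_s)

# Hypothetical test frame: 20 ms of a 500 Hz tone sampled at 8 kHz.
fs = 8000
t = 0.020
n = np.arange(int(round(fs * t)))
frame = np.sin(2 * np.pi * 500 * (n + 0.5) / fs)  # half-sample offset avoids exact zeros

zcr = average_zcr(frame, t)
print(f"Estimated ZCR: {zcr:.1f} Hz")  # close to the 500 Hz tone frequency
```

The estimate is slightly below the true tone frequency because crossings that fall exactly on the frame edges are not counted.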
Decision of Voiced and Unvoiced Speech
For every time frame, the average ZCR, f, is calculated, and the power, x, corresponding to the frequency f is calculated using the Fourier transform. The result is then subjected to the threshold conditions given in relations 2 and 3:
Unvoiced: $f_N \ge a$ and $|x_N| \le b$ (2)
Voiced: $f_N \le c$ and $|x_N| \ge d$ (3)
where the subscript N denotes a normalized value and a, b, c, d are user-defined threshold values between 0 and 1.
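The threshold test of relations 2 and 3 can be sketched as a small Python function. The inequality directions are an assumption, inferred from the earlier description of voiced speech as high amplitude with low ZCR and unvoiced speech as low amplitude with high ZCR. The function name `classify_frame` and the default threshold values are hypothetical and would need tuning on real recordings.

```python
def classify_frame(f_n, x_n, a=0.5, b=0.2, c=0.3, d=0.5):
    """Label one frame from its normalized ZCR (f_n) and normalized power (x_n).

    a, b, c, d are user-defined thresholds in [0, 1]; the defaults are
    illustrative placeholders, not values from the presentation.
    """
    if f_n >= a and abs(x_n) <= b:   # relation 2: high ZCR, low power (direction assumed)
        return "unvoiced"
    if f_n <= c and abs(x_n) >= d:   # relation 3: low ZCR, high power (direction assumed)
        return "voiced"
    return "neither"

print(classify_frame(0.9, 0.05))  # high ZCR, weak power
print(classify_frame(0.1, 0.80))  # low ZCR, strong power
print(classify_frame(0.1, 0.05))  # satisfies neither relation
```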
Proposed Algorithms
Algorithm for Quiet Background
1. Start.
2. For each 20 ms frame: calculate the ZCR, calculate the power using the Fourier transform, and store the ZCR and power in memory. Repeat until all frames are considered.
3. For each frame: normalize the ZCR and power, apply relations 2 and 3 to decide voiced/unvoiced, and mark the frame as silent if it is neither voiced nor unvoiced. Repeat until all frames are considered.
4. Display the result.
5. End.
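The two loops of the quiet-background algorithm can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: normalization divides by the maximum over all frames, the power is read from the DFT bin nearest to the ZCR frequency, the inequality directions in relations 2 and 3 are inferred from the description of voiced and unvoiced speech, and the function names and threshold defaults are hypothetical.

```python
import numpy as np

def frame_features(signal, fs, frame_s=0.020):
    """First loop: ZCR (Hz) and DFT power at the ZCR frequency for every frame."""
    signal = np.asarray(signal, dtype=float)
    size = int(round(fs * frame_s))
    freqs = np.fft.rfftfreq(size, d=1.0 / fs)
    zcrs, powers = [], []
    for start in range(0, len(signal) - size + 1, size):
        frame = signal[start:start + size]
        signs = np.signbit(frame)
        f = np.count_nonzero(signs[1:] != signs[:-1]) / (2.0 * frame_s)
        spectrum = np.abs(np.fft.rfft(frame)) ** 2
        zcrs.append(f)
        powers.append(spectrum[np.argmin(np.abs(freqs - f))])  # bin nearest to f
    return np.array(zcrs), np.array(powers)

def label_frames(zcrs, powers, a=0.5, b=0.2, c=0.3, d=0.5):
    """Second loop: normalize, apply relations 2 and 3, otherwise mark silent."""
    f_n = zcrs / max(zcrs.max(), 1e-12)      # assumed normalization: divide by maximum
    x_n = powers / max(powers.max(), 1e-12)
    labels = []
    for f, x in zip(f_n, x_n):
        if f >= a and x <= b:
            labels.append("unvoiced")
        elif f <= c and x >= d:
            labels.append("voiced")
        else:
            labels.append("silent")
    return labels

# Synthetic three-frame signal: strong 200 Hz tone, weak noise, digital silence.
fs = 8000
n = np.arange(160)
rng = np.random.default_rng(0)
signal = np.concatenate([
    np.cos(2 * np.pi * 200 * n / fs),
    0.05 * rng.standard_normal(160),
    np.zeros(160),
])
labels = label_frames(*frame_features(signal, fs))
print(labels)
```

Both loops run over the whole recording, since normalization needs the maximum ZCR and power of all frames before any frame can be labeled.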
Assumptions for Background Noise
Algorithm 1 is modified for a noisy background under the following assumptions:
1. The first 1 second of the signal contains only background noise.
2. The frequency of the noise source is different from the vocal tract frequency or ZCR.
3. The human voice has the dominant amplitude, since the mouth is closer to the microphone than the noise source.
ZCR of Voiced Speech is Independent of Noise
As shown in the figure, the ZCR of voiced speech is independent of noise under assumption 3.
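A small numerical check can make this claim concrete. The sketch below is not the figure from the slide. It assumes a strong 200 Hz tone as a stand-in for voiced speech and weak white noise as the background, so that assumption 3 (dominant voice amplitude) holds. All names and values are hypothetical.

```python
import numpy as np

def average_zcr(frame, duration_s):
    """N / (2 t), with N the number of polarity changes in the frame."""
    signs = np.signbit(np.asarray(frame, dtype=float))
    return np.count_nonzero(signs[1:] != signs[:-1]) / (2.0 * duration_s)

fs, t = 8000, 0.020
n = np.arange(int(round(fs * t)))
rng = np.random.default_rng(0)

voiced = np.cos(2 * np.pi * 200 * n / fs)    # strong, low-frequency stand-in for voiced speech
noise = 0.01 * rng.standard_normal(n.size)   # weak broadband background noise

zcr_clean = average_zcr(voiced, t)
zcr_noisy = average_zcr(voiced + noise, t)
zcr_noise_only = average_zcr(noise, t)

print(zcr_clean, zcr_noisy, zcr_noise_only)
```

When the tone dominates, the noise can only shift a zero crossing by about one sample, so the ZCR of the mixture stays near that of the clean tone, while the noise alone has a much higher ZCR.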
Distinguishing Noise and Unvoiced Speech
When Algorithm 1 is applied to speech with background noise, many of the silent frames are also marked as unvoiced because of their similar amplitude and ZCR.
The modified algorithm resolves this problem under assumptions 1 and 2.
Since the first 1 second of the recording is pure background noise, a noise reference can be created from the ZCR information of that first second.
Algorithm for Creating the Noise Reference
1. Start.
2. For each 20 ms frame of the noise-only segment: calculate the ZCR and store it in the noise reference vector. Repeat until all frames are considered.
3. Delete redundant time frames with repeated ZCRs to reduce the size of the noise reference vector.
4. End.
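The noise reference can be sketched in Python as the set of distinct frame ZCRs found in the leading noise-only second. The function name `noise_reference` and its parameters are assumptions. `np.unique` is used here as one way to delete repeated ZCRs; it also sorts the values, which the flowchart does not require.

```python
import numpy as np

def noise_reference(signal, fs, frame_s=0.020, noise_s=1.0):
    """Distinct ZCR values (Hz) of the 20 ms frames in the leading noise-only segment."""
    signal = np.asarray(signal, dtype=float)
    size = int(round(fs * frame_s))
    lead = signal[: int(round(fs * noise_s))]   # assumption 1: first second is noise only
    zcrs = []
    for start in range(0, len(lead) - size + 1, size):
        signs = np.signbit(lead[start:start + size])
        zcrs.append(np.count_nonzero(signs[1:] != signs[:-1]) / (2.0 * frame_s))
    return np.unique(zcrs)                      # drop repeated ZCRs

# Hypothetical recording: 2 seconds of white noise at 8 kHz.
fs = 8000
rng = np.random.default_rng(1)
recording = 0.05 * rng.standard_normal(2 * fs)
ref = noise_reference(recording, fs)
print(len(ref), "distinct noise ZCR values")
```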
Modified Algorithm for Noisy Background
1. Start.
2. For each 20 ms frame: calculate the ZCR, calculate the power using the Fourier transform, and store the ZCR and power in memory. Repeat until all frames are considered.
3. For each frame: normalize the ZCR and power, then:
   - check whether the frame is marked voiced by relation 3;
   - check whether the ZCR is close to any ZCR in the noise reference, and if so, mark the frame as silent;
   - otherwise, apply relation 2 to decide unvoiced/silent;
   - update the noise reference.
   Repeat until all frames are considered.
4. Display the result.
5. End.
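The per-frame decision of the modified algorithm can be sketched in Python as below. Several points are assumptions because the flowchart does not fix them: the voiced test is applied first, "close to" is taken as an absolute tolerance `tol` in Hz on the unnormalized ZCR, the inequality directions of the voiced and unvoiced relations are inferred from the description of the two excitation types, and the noise-reference update step is left out because the slide does not say how it is performed. The function name and defaults are hypothetical.

```python
import numpy as np

def label_frames_noisy(zcrs, powers, noise_ref,
                       a=0.5, b=0.2, c=0.3, d=0.5, tol=50.0):
    """Label frames as voiced, unvoiced or silent using a noise ZCR reference."""
    zcrs = np.asarray(zcrs, dtype=float)
    powers = np.asarray(powers, dtype=float)
    noise_ref = np.asarray(noise_ref, dtype=float)
    f_n = zcrs / max(zcrs.max(), 1e-12)        # assumed normalization: divide by maximum
    x_n = powers / max(powers.max(), 1e-12)
    labels = []
    for f_hz, f, x in zip(zcrs, f_n, x_n):
        if f <= c and x >= d:                  # voiced relation (checked first, assumed)
            labels.append("voiced")
        elif noise_ref.size and np.min(np.abs(noise_ref - f_hz)) <= tol:
            labels.append("silent")            # ZCR matches the background noise
        elif f >= a and x <= b:                # unvoiced relation
            labels.append("unvoiced")
        else:
            labels.append("silent")
    return labels

# Hypothetical per-frame features: ZCR in Hz and spectral power.
zcrs = [200.0, 2000.0, 3000.0]
powers = [6400.0, 1.0, 1.0]
with_ref = label_frames_noisy(zcrs, powers, noise_ref=[1980.0, 2050.0])
without_ref = label_frames_noisy(zcrs, powers, noise_ref=[])
print(with_ref)
print(without_ref)
```

With an empty reference the second frame falls through to the unvoiced test, which reproduces the kind of error the first algorithm makes on noise-only frames.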
Experimental Results
Case 1: Quiet Background
For a quiet background, the first algorithm and the modified algorithm give similar results.
[Figure panels: First Algorithm, Modified Algorithm]
Case 2: Additive White Gaussian Noise (AWGN)
[Figure panels: First Algorithm, Modified Algorithm]
In this case, the first algorithm gives poor results. The second algorithm improves the result, but the accuracy is still poor, since AWGN has a uniform power spectral density.
Case 3: Real Noise
[Figure panels: First Algorithm, Modified Algorithm]
In this case, the first algorithm gives poor results; the second algorithm improves the result, since most of the assumptions are satisfied.
Comparison of the Two Algorithms
Table: Percentage of Silent Frames Marked Unvoiced
Background    | First Algorithm | Modified Algorithm
No Noise      | 0%              | 0%
AWGN          | 80%             | 30%
Natural Noise | 58%             | 23%
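The relative error reduction implied by the table can be checked with a few lines of Python. The percentages are taken from the table; the dictionary layout and function name are illustrative only.

```python
# Percentage of silent frames marked unvoiced: (first algorithm, modified algorithm).
rates = {"AWGN": (80.0, 30.0), "Natural Noise": (58.0, 23.0)}

def relative_reduction(first, modified):
    """Percentage by which the modified algorithm reduces the first algorithm's errors."""
    return 100.0 * (first - modified) / first

reductions = {name: relative_reduction(*pair) for name, pair in rates.items()}
for name, value in reductions.items():
    print(f"{name}: {value:.1f}% fewer errors")
```

Both cases come out at a little over 60%, which matches the approximate 60% reduction stated in the introduction.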
Discussion and Bibliography
Discussion
Advantages:
Simple to implement
Accuracy is high
The information is found to be useful in speech enhancement.
Drawbacks:
The first 1 second must contain only background noise.
The algorithm involves two loops, so it needs further modification before it can be implemented in real time.
It may not give accurate results if the noise contains human voice, because the noise will then also contain voiced and unvoiced parts.
Bibliography (1 of 2)
1. Bachu R.G., Kopparthi S., Adapa B., Barkana B.D. Separation of Voiced and Unvoiced using Zero crossing rate and Energy of the Speech Signal. Electrical Engineering Department, School of Engineering, University of Bridgeport. Available at http://audio-fingerprint.googlecode.com/svn-history/r62/trunk/referencias/ASEE12008 0044 paper.pdf
2. Thierry Dutoit. A (Short) Introduction to Speech Processing. Available at http://tcts.fpms.ac.be/cours/1005-07-08/speech/icme2002 intro.pdf
3. John R. Deller, Jr., John H. L. Hansen and John G. Proakis. Discrete-Time Processing of Speech Signals. John Wiley and Sons, Inc., New York.
4. Douglas, S.C. Chapter 18, Introduction to Adaptive Filters, in Digital Signal Processing Handbook, Ed. Vijay K. Madisetti and Douglas B. Williams. Boca Raton: CRC Press LLC, 1999. Available at http://www.dsp-book.narod.ru/DSPMW/18.PDF
5. S. Ghaemmaghami, M. Deriche, and B. Boashash. A new approach to pitch and voicing detection through spectrum periodicity measurement. 1997 IEEE TENCON - Speech and Image Technologies for Computing and Telecommunications, pp. 743-746.
To download full paper:
https://gauhati.academia.edu/SivaranjanGoswami