IIT Bombay, ICSCI 2004, Hyderabad, India, 12-15 Feb '04
ICSCI 2004, Hyderabad, India, 12-15 Feb’ 04
USE OF HARMONIC PLUS NOISE MODEL FOR REDUCTION OF SELF LEAKAGE IN ELECTROLARYNGEAL SPEECH
Parveen K. Lehana¹, Prem C. Pandey², Santosh S. Pratapwar², Rockey Gupta¹
¹University of Jammu, India; ²IIT Bombay, India
ABSTRACT
An artificial larynx is an assistive device that provides excitation to the vocal tract as a substitute for a dysfunctional or removed larynx. The speech generated by an electrolarynx, an external vibrator held against the neck tissue, is not natural and is often unintelligible because of the improper shape of the excitation pulses and the presence of background noise caused by sound leakage from the vibrator. The objective of this paper is to enhance the intelligibility of electrolaryngeal speech by reducing the background noise using the harmonic plus noise model (HNM). The alaryngeal speech and the leakage signal are analyzed using HNM, and the average harmonic spectrum of the leakage noise is subtracted from the harmonic magnitude spectrum of the noisy speech in each frame. HNM synthesis is carried out retaining the original phase spectra. Investigations show that the output is more natural and intelligible than both the input speech signal and the enhanced signal obtained from spectral subtraction without HNM analysis and synthesis.
PRESENTATION OVERVIEW
• Introduction
• HNM analysis / synthesis
• Spectral subtraction with HNM
• Methodology
• Results
• Conclusion & future plan
INTRODUCTION (1/5)
NATURAL SPEECH PRODUCTION
Glottal excitation to vocal tract
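INTRODUCTION (2/5)
If the excitation and vocal tract transfer functions are

$$e(t) = \sum_{k=1}^{K(t)} a_k(t)\,\exp\big(j\phi_k(t)\big), \qquad H(f;t) = G(f;t)\,\exp\big[j\psi(f;t)\big]$$

then output speech is

$$s(t) = \sum_{k=1}^{K(t)} a_k(t)\,G\big(f_k(t);t\big)\,e^{\,j\left(\phi_k(t) + \psi(f_k(t);t)\right)}$$

and can be simplified to

$$s(t) = \sum_{k=1}^{K(t)} A_k(t)\,e^{\,j\theta_k(t)}$$

where

$$A_k(t) = a_k(t)\,G\big(f_k(t);t\big) \quad \& \quad \theta_k(t) = \phi_k(t) + \psi\big(f_k(t);t\big)$$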
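The sum-of-sinusoids form above can be illustrated with a short numerical sketch. The following Python fragment is only an illustration under stated assumptions: the function names `vocal_tract_response` and `synthesize`, the single-resonance transfer function, the 1/k excitation amplitudes, and the constant pitch are all hypothetical choices made for the example and are not taken from the slides.

```python
import cmath
import math


def vocal_tract_response(f_hz):
    """Hypothetical H(f) = G(f) exp(j psi(f)): one resonance near 500 Hz.

    This shape is an assumption for illustration only.
    """
    x = (f_hz - 500.0) / 200.0
    gain = 1.0 / (1.0 + x * x)   # magnitude G(f)
    psi = -math.atan(x)          # phase psi(f)
    return gain, psi


def synthesize(f0_hz, num_harmonics, duration_s, fs_hz):
    """Real part of s(t) = sum_k A_k exp(j theta_k(t)) for a constant pitch f0."""
    n_samples = round(duration_s * fs_hz)
    samples = []
    for n in range(n_samples):
        t = n / fs_hz
        total = 0j
        for k in range(1, num_harmonics + 1):
            f_k = k * f0_hz
            a_k = 1.0 / k                       # assumed excitation amplitude
            phi_k = 2.0 * math.pi * f_k * t     # excitation phase phi_k(t)
            gain, psi = vocal_tract_response(f_k)
            amp = a_k * gain                    # A_k = a_k G(f_k)
            theta = phi_k + psi                 # theta_k = phi_k + psi(f_k)
            total += amp * cmath.exp(1j * theta)
        samples.append(total.real)
    return samples
```

Calling `synthesize(100.0, 10, 0.01, 8000)` returns 80 samples of a periodic waveform whose harmonic amplitudes follow the assumed vocal tract gain.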
INTRODUCTION (3/5)
External electronic larynx (transcervical electrolarynx)
Excitation to vocal tract from external vibrator (creates background noise)
INTRODUCTION (4/5)
External electronic larynx (transcervical electrolarynx)
Leakage path:
- back side of membrane/plate
- improper tissue coupling
INTRODUCTION (5/5)
RESEARCH OBJECTIVE
The objective of this paper is to enhance the intelligibility of electrolaryngeal speech by reducing the background noise using the harmonic plus noise model (HNM).
HNM ANALYSIS / SYNTHESIS (1/3)
HARMONIC PLUS NOISE MODEL (Stylianou, 1995; 2001)
Speech signal divided into:
• harmonic part
• noise part

Harmonic part

$$s'(t) = \mathrm{Re}\sum_{l=0}^{L(t)} a_l(t)\,\exp\left\{ j\left[\int_0^t l\,\omega_0(\sigma)\,d\sigma + \theta_l\right]\right\}$$

Noise part

$$n'(t) = w(t)\,\big[h(\tau;t) * b(t)\big]$$

Parameters:
• Max. voiced frequency
• V/UV & pitch
• Harm. ampl. & phases
• Noise parameters
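As a rough illustration of how the two HNM components combine, the sketch below synthesizes one frame as a sum of harmonics plus envelope-modulated filtered white noise. It is a minimal sketch under simplifying assumptions, not the implementation used in the paper: pitch and amplitudes are held constant over the frame, a fixed FIR filter stands in for the time-varying filter h, and the names `harmonic_part`, `noise_part`, `triangular_envelope`, and `hnm_frame` are hypothetical.

```python
import math
import random


def harmonic_part(amps, phases, f0_hz, n_samples, fs_hz):
    """Sum of harmonics l = 0..L with constant f0: a_l cos(l w0 n + theta_l)."""
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    return [
        sum(a * math.cos(l * w0 * n + th)
            for l, (a, th) in enumerate(zip(amps, phases)))
        for n in range(n_samples)
    ]


def triangular_envelope(u):
    """Assumed energy envelope w over the frame, u in [0, 1)."""
    return 1.0 - abs(2.0 * u - 1.0)


def noise_part(fir_taps, envelope, n_samples, seed=0):
    """w(t) [h * b(t)] with b white Gaussian noise and h a fixed FIR filter."""
    rng = random.Random(seed)
    b = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    filtered = []
    for n in range(n_samples):
        acc = 0.0
        for i, h in enumerate(fir_taps):
            if n - i >= 0:
                acc += h * b[n - i]   # direct-form convolution
        filtered.append(acc)
    return [envelope(n / n_samples) * y for n, y in enumerate(filtered)]


def hnm_frame(amps, phases, f0_hz, fir_taps, n_samples, fs_hz, seed=0):
    """One synthesized frame: harmonic part plus noise part."""
    harm = harmonic_part(amps, phases, f0_hz, n_samples, fs_hz)
    noise = noise_part(fir_taps, triangular_envelope, n_samples, seed)
    return [h + n for h, n in zip(harm, noise)]
```

Here `amps[0]` and `phases[0]` belong to the l = 0 term, matching the lower summation limit of the harmonic part.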
ANALYSIS / SYNTHESIS WITH HNM (2/3)
ANALYSIS
ANALYSIS / SYNTHESIS WITH HNM (3/3)
SYNTHESIS
SPECTRAL SUBTRACTION WITH HNM
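$$x(n) = e(n) * h_v(n) + e(n) * h_l(n)$$

Taking DFT:

$$X_n(e^{j\omega}) = E_n(e^{j\omega})\,\big[H_{vn}(e^{j\omega}) + H_{ln}(e^{j\omega})\big]$$

Assumption: $h_v(n)$ & $h_l(n)$ uncorrelated

$$|X_n(e^{j\omega})|^2 = |E_n(e^{j\omega})|^2\,\big[\,|H_{vn}(e^{j\omega})|^2 + |H_{ln}(e^{j\omega})|^2\,\big]$$

During non-speech segment, the speech term $s(n) = e(n) * h_v(n) = 0$:

$$|X_n(e^{j\omega})|^2 = |L_n(e^{j\omega})|^2 = |E_n(e^{j\omega})|^2\,|H_{ln}(e^{j\omega})|^2$$

$|L(e^{j\omega})|^2$: averaged over many segments

$$|Y_n(k)|^{\gamma} = |X_n(k)|^{\gamma} - \alpha\,|L(k)|^{\gamma}$$

$$|Y'_n(k)|^{\gamma} = \begin{cases} |Y_n(k)|^{\gamma} & \text{if } |Y_n(k)|^{\gamma} > \beta\,|L(k)|^{\gamma} \\ \beta\,|L(k)|^{\gamma} & \text{otherwise} \end{cases}$$

($\alpha$: subtraction factor, $\beta$: spectral floor factor, $\gamma$: exponent factor)
Here n is frame index and k is harmonic index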
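The subtraction rule with spectral flooring can be written as a small function operating on one frame of harmonic magnitudes. This is a hedged sketch rather than the authors' code: the name `spectral_subtract` and the list-based interface are assumptions, and the function returns magnitudes (the 1/γ root of the floored result) so they can be used directly for resynthesis.

```python
def spectral_subtract(x_mag, noise_mag, alpha, beta, gamma):
    """Apply |Y|^g = |X|^g - alpha |L|^g with floor beta |L|^g per harmonic k.

    x_mag:     magnitudes |X_n(k)| of the noisy frame
    noise_mag: averaged leakage magnitudes |L(k)|
    Returns the enhanced magnitudes |Y'_n(k)|.
    """
    enhanced = []
    for x, noise in zip(x_mag, noise_mag):
        y_pow = x ** gamma - alpha * noise ** gamma   # subtraction
        floor = beta * noise ** gamma                 # spectral floor
        if y_pow <= floor:
            y_pow = floor
        enhanced.append(y_pow ** (1.0 / gamma))
    return enhanced
```

For example, with `alpha=2`, `beta=0.001`, `gamma=1`, a harmonic of magnitude 1.0 against a leakage estimate of 0.2 is reduced to 0.6, while a harmonic of magnitude 0.1 falls below the floor and is set to 0.0002.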
METHODOLOGY
STEPS FOR HNM BASED SPECTRAL SUBTRACTION
• Non-speech segments analyzed
• Average harmonic spectrum obtained
• Noisy speech analyzed and average harmonic spectrum of noise subtracted
• Resynthesis with noisy speech phase spectra
For comparison, spectral subtraction using DFT derived magnitude is also carried out.
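The HNM based steps listed above can be sketched end to end, assuming the HNM analysis stage has already produced per-frame harmonic magnitudes and phases. Everything below is illustrative: the function names are hypothetical, all frames are assumed to have the same number of harmonics and a constant pitch, and only the harmonic part is resynthesized.

```python
import math


def average_spectrum(noise_frames):
    """Mean magnitude per harmonic index k over the non-speech frames."""
    count = len(noise_frames)
    return [sum(frame[k] for frame in noise_frames) / count
            for k in range(len(noise_frames[0]))]


def enhance_frame(mags, noise_avg, alpha, beta, gamma):
    """Subtract the averaged leakage spectrum from one noisy frame, with flooring."""
    out = []
    for x, noise in zip(mags, noise_avg):
        floor = beta * noise ** gamma
        y_pow = max(x ** gamma - alpha * noise ** gamma, floor)
        out.append(y_pow ** (1.0 / gamma))
    return out


def resynthesize(mags, phases, f0_hz, n_samples, fs_hz):
    """Harmonic resynthesis: enhanced magnitudes with the noisy-speech phases.

    Index 0 of mags/phases is taken as the first harmonic (k = 1).
    """
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    return [
        sum(a * math.cos((k + 1) * w0 * n + th)
            for k, (a, th) in enumerate(zip(mags, phases)))
        for n in range(n_samples)
    ]
```

A typical use would call `average_spectrum` once on the non-speech frames, then `enhance_frame` and `resynthesize` for each noisy speech frame, passing that frame's original phases unchanged.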
RESULTS (1/2)
• Both DFT derived and HNM based spectral subtraction significantly reduce the background noise
• Both require empirical selection of the parameters
• DFT derived spectral subtraction more effective during non-speech
• HNM based spectral subtraction more effective during speech, with less musical noise and enhanced formant structure
• Saving in parameters and processing time in HNM based spectral subtraction
RESULTS (2/2)
a) Recorded speech signal
b) Processed (DFT derived) (α = 2, β = 0.001, and γ = 1)
c) Processed (HNM derived) (α = 1, β = 0.1, and γ = 1)
CONCLUSION
The HNM based method provides effective subtraction of noise during speech and hence can be used for improving the intelligibility of electrolaryngeal speech.
FUTURE PLAN
• QBNE combined with HNM based spectral subtraction
• Phase resynthesis from enhanced magnitude spectrum
• Effect of artificial jitter in pitch on speech quality