Speech Coding Submitted To: Dr. Mohab Mangoud Submitted By: Nidal Ismail.
Outline
1. Introduction
• Overview of Speech Coding
• Properties of a Speech Coder
• Modeling the Speech Production System
• Linear Prediction
2. Different Coding Techniques
• Waveform Coders
• Parametric Coders
• Hybrid Coders
• Coding Standards
3. PCM & DPCM
4. Linear Predictive Coding
5. Conclusion
6. References
1. Introduction
Block Diagram of a speech coding system
Sampling frequency = 8 kHz
Number of bits per sample = 8
Bit rate = 8 bits × 8 kHz = 64 kbps
Overview of Speech Coding
Properties of a Speech Coder
• Low bit-rate
• High speech quality
• Robustness across different speakers / languages
• Robustness in the presence of channel errors
• Good performance on non-speech signals
• Low memory size and low computational complexity
• Low coding delay
Modeling the Speech Production System
Autocorrelation values for the signal frames. Left: Unvoiced. Right: Voiced.
• The signal from a source is filtered by a time-varying filter with resonant properties similar to those of the vocal tract.
• The gain controls Av and AN determine the intensity of the voiced and unvoiced excitation.
• The higher formants roll off at -12 dB/octave (due to the nature of our speech organs).
Linear Prediction
Linear prediction as system identification.
Linear prediction is a practical method of spectrum estimation in which the power spectral density (PSD) is captured using a few coefficients.
These linear prediction coefficients can be used to construct the synthesis filter.
Predicted Signal
Prediction error
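The following is a minimal sketch of how the prediction coefficients, the predicted signal, and the prediction error could be computed for one frame. It assumes the autocorrelation method with the Levinson-Durbin recursion, and it assumes the sign convention in which the predicted sample is a weighted sum of past samples; the slides do not state which method or convention was used. All function and variable names are illustrative.

```python
import math

def autocorrelation(frame, max_lag):
    """Return R[0..max_lag] for one frame (no windowing, for brevity)."""
    n = len(frame)
    return [sum(frame[i] * frame[i - lag] for i in range(lag, n))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients c[1..order].

    Assumed convention: predicted s_hat[n] = sum_k c[k] * s[n - k].
    Returns (c, err); c[0] is unused and err is the minimum
    prediction-error energy of the autocorrelation method.
    """
    c = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(c[k] * r[m - k] for k in range(1, m))
        k_m = acc / err                      # reflection coefficient
        new_c = c[:]
        new_c[m] = k_m
        for k in range(1, m):
            new_c[k] = c[k] - k_m * c[m - k]
        c = new_c
        err *= (1.0 - k_m * k_m)
    return c, err

def predict(frame, c):
    """Return (predicted signal, prediction error), with zero history before the frame."""
    order = len(c) - 1
    predicted, error = [], []
    for n in range(len(frame)):
        s_hat = sum(c[k] * frame[n - k]
                    for k in range(1, order + 1) if n - k >= 0)
        predicted.append(s_hat)
        error.append(frame[n] - s_hat)
    return predicted, error

# Demo: a pure sinusoid is almost perfectly predictable from two past samples.
omega = 2 * math.pi * 0.05
frame = [math.sin(omega * n) for n in range(240)]
r = autocorrelation(frame, 2)
c, min_error = levinson_durbin(r, 2)
predicted, error = predict(frame, c)
error_energy = sum(e * e for e in error)
print("coefficients:", c[1:])
print("error energy / signal energy:", error_energy / r[0])
```

For the sinusoid the two coefficients come out close to 2cos(ω) and -1, and the prediction error carries only a small fraction of the signal energy, which is why a few coefficients are enough to describe the frame.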
2. Different Coding Techniques
Waveform Coders
• The original shape of the signal waveform is preserved.
• These coders can be applied to any signal source.
• They are better suited to high bit-rate coding, since performance drops sharply as the bit-rate decreases.
• In practice, these coders work best at a bit-rate of 32 kbps and higher.
• Examples of this class include various kinds of pulse code modulation (PCM) and adaptive differential PCM (ADPCM).
Parametric Coders
• The speech signal is generated from a model that is controlled by a set of parameters.
• The parameters are estimated from the input speech signal.
• No attempt is made to preserve the original shape of the waveform.
• The accuracy and sophistication of the model determine the quality.
• The most successful model is based on linear prediction. In this approach, the human speech production mechanism is summarized by a time-varying filter whose coefficients are found using the linear prediction analysis procedure.
• This class of coders works well at low bit-rates, in the range of 2 to 5 kbps.
• Example coders of this class include linear prediction coding (LPC) and mixed excitation linear prediction (MELP).
Hybrid Coders
• A hybrid coder combines the strengths of a waveform coder with those of a parametric coder.
• As in waveform coders, an attempt is made to match the decoded signal to the original signal in the time domain.
• This class dominates the medium bit-rate coders, with the code-excited linear prediction (CELP) algorithm and its variants as the most prominent representatives.
• A hybrid coder tends to behave like a waveform coder at high bit-rates and like a parametric coder at low bit-rates, with fair to good quality at medium bit-rates.
3. PCM & DPCM
Pulse Code Modulation
Invented 1926, deployed 1962.
Basic idea: assign a smaller quantization step size to small-amplitude regions and a larger quantization step size to large-amplitude regions (non-uniform quantization).
Two types of nonlinear compressing functions:
• Mu-law, adopted by North American telecommunications systems
• A-law, adopted by European telecommunications systems
Mu-law (A-law) compresses the signal to 8 bits/sample, or 64 kbps (without a compandor, we would need 12 bits/sample).
μ-law (Mu-law)
A is the peak input magnitude and μ is a constant that controls the degree of compression.
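The μ-law equation itself did not survive in this transcript. As a hedged illustration, the sketch below uses the standard continuous μ-law compression curve, y = A · ln(1 + μ|x|/A) / ln(1 + μ) · sgn(x), which matches the symbols described above. The value μ = 255 and the simple uniform 8-bit quantizer are assumptions for the demo; this is not the segmented encoding table of any particular telephony standard, and all names are illustrative.

```python
import math

MU = 255.0  # assumed value; commonly quoted for North American telephony

def mu_law_compress(x, peak=1.0, mu=MU):
    """Map x in [-peak, peak] through the logarithmic mu-law curve."""
    sign = -1.0 if x < 0 else 1.0
    return sign * peak * math.log1p(mu * abs(x) / peak) / math.log1p(mu)

def mu_law_expand(y, peak=1.0, mu=MU):
    """Inverse of mu_law_compress."""
    sign = -1.0 if y < 0 else 1.0
    return sign * (peak / mu) * math.expm1(abs(y) / peak * math.log1p(mu))

def encode_8bit(x):
    """Compress, then quantize uniformly to one of 256 codes (0..255)."""
    y = mu_law_compress(x)
    return round((y + 1.0) / 2.0 * 255)

def decode_8bit(code):
    """Map a code back to the compressed domain, then expand."""
    y = code / 255 * 2.0 - 1.0
    return mu_law_expand(y)

# Demo: a small-amplitude sample survives the 8-bit round trip with a small error.
small = 0.01
restored = decode_8bit(encode_8bit(small))
print("input:", small, "restored:", restored)
```

Because the curve is steep near zero, uniform steps in the compressed domain correspond to fine steps for small amplitudes and coarse steps for large ones, which is the non-uniform quantization described above.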
A-law
Ao is a constant that controls the degree of compression.
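The A-law equation is also missing from this transcript. The sketch below uses the standard two-piece A-law curve (linear near zero, logarithmic above the breakpoint |x|/A = 1/Ao) as an assumption about what the slide showed. The value Ao = 87.6 is a commonly quoted choice and is likewise an assumption; function names are illustrative.

```python
import math

A0 = 87.6  # assumed compression constant (Ao in the text)

def a_law_compress(x, peak=1.0, a0=A0):
    """Linear segment for small inputs, logarithmic segment otherwise."""
    sign = -1.0 if x < 0 else 1.0
    r = abs(x) / peak
    denom = 1.0 + math.log(a0)
    if r <= 1.0 / a0:
        y = a0 * r / denom
    else:
        y = (1.0 + math.log(a0 * r)) / denom
    return sign * peak * y

def a_law_expand(y, peak=1.0, a0=A0):
    """Inverse of a_law_compress."""
    sign = -1.0 if y < 0 else 1.0
    v = abs(y) / peak
    denom = 1.0 + math.log(a0)
    if v <= 1.0 / denom:
        r = v * denom / a0
    else:
        r = math.exp(v * denom - 1.0) / a0
    return sign * peak * r

# Demo: round trip through both segments of the curve.
for x in (0.005, 0.5, -0.8):
    print(x, a_law_expand(a_law_compress(x)))
```

The two segments meet at the breakpoint, so the curve is continuous; compared with μ-law, the linear segment gives slightly different behavior for very small amplitudes.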
Differential Pulse Code Modulation
Since speech signals vary slowly, adjacent samples are highly correlated, and this temporal redundancy can be reduced by prediction.
Quantizing the prediction-error signal
The quantizer indices i[n] are fed to the quantizer's decoder to obtain the quantized prediction error, which is added to the prediction xp[n] to form the quantized input.
DPCM encoder (top) and decoder (bottom)
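The encoder and decoder loops can be sketched as follows. This is a minimal illustration, assuming a fixed first-order predictor xp[n] = a · xq[n-1] and a uniform quantizer; the predictor order, the coefficient a = 0.9, the step size, and the 4-bit code are all assumptions made for the demo and are not taken from the slides.

```python
import math

def quantize(e, step, bits):
    """Uniform quantizer: return the integer index i[n], clipped to the code range."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return max(lo, min(hi, round(e / step)))

def dequantize(i, step):
    """Quantizer's decoder: index back to quantized prediction error."""
    return i * step

def dpcm_encode(x, step=0.05, bits=4, a=0.9):
    """Return the index sequence i[n]; the encoder tracks the decoder's state."""
    indices, xq_prev = [], 0.0
    for sample in x:
        xp = a * xq_prev                      # prediction from quantized past
        i = quantize(sample - xp, step, bits)
        indices.append(i)
        xq_prev = xp + dequantize(i, step)    # quantized input, as the decoder sees it
    return indices

def dpcm_decode(indices, step=0.05, a=0.9):
    """Rebuild the quantized signal from the indices alone."""
    out, xq_prev = [], 0.0
    for i in indices:
        xq_prev = a * xq_prev + dequantize(i, step)
        out.append(xq_prev)
    return out

# Demo: a slowly varying sinusoid coded with 4 bits per sample.
x = [0.8 * math.sin(2 * math.pi * 0.01 * n) for n in range(200)]
codes = dpcm_encode(x)
xq = dpcm_decode(codes)
max_error = max(abs(s - q) for s, q in zip(x, xq))
print("max reconstruction error:", max_error)
```

Predicting from the quantized past rather than from the original samples keeps the encoder and decoder in step, so the reconstruction error equals the quantization error of the prediction residual and does not accumulate.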
PCM quantized Signal (left) and Quantization error (right)
DPCM quantized Signal (left) and Quantization error (right)
Comparison between PCM and DPCM
DPCM used half the bit rate of PCM and still achieved a higher SNR.
4. Linear Predictive Coding
Linear prediction coding relies on a highly simplified model for speech production
The LPC model of speech production
Parameters of the model are estimated from the speech samples
These parameters include:
• Voicing: whether the frame is voiced or unvoiced.
• Gain: mainly related to the energy level of the frame.
• Filter coefficients: specify the response of the synthesis filter.
• Pitch period: in the case of voiced frames, the time length between consecutive excitation impulses.
By carefully allocating bits for each parameter so as to minimize distortion, an impressive compression ratio can be achieved.
For instance, the 2.4 kbps bit-rate of the FS1015 coder is 53.3 times lower than the 128 kbps bit-rate of 16-bit PCM at 8 kHz sampling.
Estimating the parameters is the responsibility of the encoder.
The decoder takes the estimated parameters and uses the speech production model to synthesize speech
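A minimal sketch of the decoder side, assuming the simplest form of the model: an impulse train (voiced) or white noise (unvoiced) is scaled by the gain and passed through an all-pole synthesis filter. The sign convention s[n] = gain · u[n] + Σ c[k] · s[n-k], the frame length, and all names are assumptions for illustration; filter state is not carried across frames here, which a real decoder would do.

```python
import random

def make_excitation(voiced, pitch_period, length, seed=0):
    """Impulse train for voiced frames, Gaussian noise for unvoiced frames."""
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(length)]
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]

def synthesize_frame(voiced, gain, coeffs, pitch_period=None, length=180, seed=0):
    """Run the excitation through the all-pole synthesis filter.

    coeffs[k-1] multiplies the output delayed by k samples (assumed convention).
    pitch_period must be given for voiced frames.
    """
    excitation = make_excitation(voiced, pitch_period, length, seed)
    out = []
    for n in range(length):
        s = gain * excitation[n]
        for k, c in enumerate(coeffs, start=1):
            if n - k >= 0:
                s += c * out[n - k]
        out.append(s)
    return out

# Demo: one voiced and one unvoiced frame with a toy first-order filter.
voiced_frame = synthesize_frame(True, gain=2.0, coeffs=[0.5], pitch_period=60)
unvoiced_frame = synthesize_frame(False, gain=0.1, coeffs=[0.5])
print(voiced_frame[:4])
print(unvoiced_frame[:4])
```

Only the voicing flag, gain, filter coefficients, and pitch period are needed per frame, which is what makes the very low bit-rate possible.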
The voicing detector is a key element of successful coding. Its purpose is to classify a given frame as voiced or unvoiced.
Measurements that a voicing detector relies on to accomplish its task:
• Energy (or magnitude sum function)
• Zero crossing rate
• Prediction gain
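These measurements can be sketched per frame as below. The definitions are common textbook forms and are assumptions here, since the slide equations are not in the transcript; in particular, the prediction gain uses only a first-order predictor to keep the example short. No decision thresholds are shown because the slides do not give any. The test signals (a 100 Hz sinusoid and Gaussian noise at 8 kHz) are stand-ins for voiced and unvoiced frames.

```python
import math
import random

def energy(frame):
    return sum(s * s for s in frame)

def magnitude_sum(frame):
    return sum(abs(s) for s in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def prediction_gain_db(frame):
    """10*log10(signal energy / prediction-error energy), first-order predictor."""
    r0 = sum(s * s for s in frame)
    r1 = sum(a * b for a, b in zip(frame, frame[1:]))
    c1 = r1 / r0
    err = sum((b - c1 * a) ** 2 for a, b in zip(frame, frame[1:]))
    sig = sum(b * b for b in frame[1:])
    return 10.0 * math.log10(sig / err)

# Demo frames of 180 samples at an assumed 8 kHz sampling rate.
voiced_like = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(180)]
rng = random.Random(0)
unvoiced_like = [rng.gauss(0.0, 0.1) for _ in range(180)]

for name, frame in (("voiced-like", voiced_like), ("unvoiced-like", unvoiced_like)):
    print(name, energy(frame), magnitude_sum(frame),
          zero_crossing_rate(frame), prediction_gain_db(frame))
```

The periodic frame shows a low zero crossing rate and a high prediction gain, while the noise frame shows the opposite, which is the contrast a voicing detector exploits.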
Top left: A speech waveform. Top right: Magnitude sum function. Bottom left: Zero crossing rate. Bottom right: Prediction gain.
Bit rate: 2.4 kbps
Samples per frame: 180
Frame size: 22.5 ms (44.44 frames/s)
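The frame figures follow from the 8 kHz sampling rate given in the introduction. The short check below works through the arithmetic, including the bit budget per frame and the compression ratio relative to 16-bit PCM; the constant names are illustrative.

```python
SAMPLE_RATE_HZ = 8000       # sampling frequency from the introduction
SAMPLES_PER_FRAME = 180
BIT_RATE_BPS = 2400         # 2.4 kbps

frame_ms = 1000 * SAMPLES_PER_FRAME / SAMPLE_RATE_HZ    # frame duration in ms
frames_per_s = SAMPLE_RATE_HZ / SAMPLES_PER_FRAME       # frames per second
bits_per_frame = BIT_RATE_BPS / frames_per_s            # bits available per frame

pcm16_bps = 16 * SAMPLE_RATE_HZ                         # 16-bit PCM at 8 kHz
compression_ratio = pcm16_bps / BIT_RATE_BPS

print(frame_ms, round(frames_per_s, 2), round(bits_per_frame, 2),
      round(compression_ratio, 1))
```

All model parameters for one frame (voicing, gain, pitch period, filter coefficients) therefore have to fit in about 54 bits.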
Speech Coder Standard
• FS1015 LPC10: 10 coefficients
• FS1016 CELP: Code Excited
• MELP: Mixed Excitation
• IS-54 VSELP: Vector Sum Excited
• IS-96 QCELP: Qualcomm Code Excited
• G.728 LD-CELP: Low-Delay Code-Excited
• G.729 CS-ACELP: Conjugate-Structure Algebraic-Code-Excited
5. Conclusion
An overview of speech coding was presented, with a brief explanation of the speech production model. The properties of different coding techniques were also compared. For wireline transmission, PCM and DPCM were covered. Linear prediction coding, which is a basis for modern wireless systems, was also introduced.