Context and Motivation · • A uniform linear quantizer is called Pulse Code Modulation(PCM). •...


Digital Speech and Audio Processing E. Nemer UCI Spring 2008 - 1


Context and Motivation

• What : Find an efficient representation of speech so that it can be transmitted with a minimum bandwidth, depending on the desired quality.

• How : Exploit the redundancy of the speech waveform.

• Applications :

– Telephony, PBX

– Wireless/Cellular Telephony

– Internet Telephony

– Speech Storage (Automated call-centers)

– Text-to-speech (machine generated speech)

Types of Coders

Speech Coders

– Waveform Coders

» Time Domain : PCM, ADPCM

» Frequency Domain : Sub-band coders, Adaptive transform coder

– Vocoders

» Linear Predictive Coder

» Formant Coders

• Waveform based coders : Preserve the signal waveform, not the speech.

– Pulse Code Modulation (PCM)

– Differential PCM (DPCM)

– Adaptive DPCM (ADPCM)

• Model based coders : Preserve speech, not waveform.

– LPC10(e) Federal Standard 1015

– Mixed Excitation Linear Prediction (MELP)

• Hybrid coders

– Code Excited Linear Prediction (CELP)

– Vector Sum Excited Linear Prediction (VSELP)

Quantization

• Amplitude quantizing: Mapping samples of a continuous amplitude waveform to a finite set of amplitudes.

[Figure: a continuous signal and the corresponding discrete signal; vertical axis: quantized values.]


Uniform Quantizer

• A uniform linear quantizer is called Pulse Code Modulation (PCM).

• Pulse code modulation (PCM): Encoding the quantized signals into a digital word (PCM word or codeword).

– Each quantized sample is digitally encoded into an l-bit codeword, where L is the number of quantization levels and $l = \log_2 L$.

Quantization example

[Figure: amplitude x(t) versus time t, showing the sampled values x(nTs), the quantized values xq(nTs), the decision boundaries and the quantization levels. Ts: sampling time.]

PCM codeword   Quant. level
111             3.1867
110             2.2762
101             1.3657
100             0.4552
011            -0.4552
010            -1.3657
001            -2.2762
000            -3.1867

PCM sequence: 110 110 111 110 100 010 011 100 100 011
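The codeword table above can be reproduced with a small uniform quantizer. This is only a sketch: the step size 0.9105 is inferred from the spacing of the listed levels, and the function name `pcm_encode` is hypothetical, not part of the slides.

```python
def pcm_encode(x, step=0.9105, bits=3):
    """Uniform mid-riser quantizer: return (codeword, reconstruction level).

    The step size is an assumption read off the level spacing in the table.
    """
    levels = 2 ** bits
    # Cell index: floor(x / step) shifted so the lowest cell has index 0.
    index = int(x // step) + levels // 2
    # Inputs outside the dynamic range saturate at the outermost codewords.
    index = max(0, min(levels - 1, index))
    # Reconstruction level sits at the center of the cell.
    level = (index - (levels - 1) / 2) * step
    return format(index, f"0{bits}b"), level


codeword, level = pcm_encode(2.0)   # falls in the cell reconstructed at about 2.2762
```

A sample of 2.0 volts maps to codeword 110, and any input above the top boundary is clipped to 111.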

Quantization Error

• Quantizing error: The difference between the output and input of a quantizer

$e(t) = \hat{x}(t) - x(t)$

[Figure: Process of quantizing noise. The input x(t) passes through the AGC and the quantizer y = q(x) to give $\hat{x}(t)$; equivalently, the error $e(t) = \hat{x}(t) - x(t)$ is added to x(t).]

Quantization error …

• Quantizing error:

– Granular or linear errors happen for inputs within the dynamic range of quantizer

– Saturation errors happen for inputs outside the dynamic range of quantizer

» Saturation errors are larger than linear errors

» Saturation errors can be avoided by proper tuning of AGC

• Quantization noise variance:

$\sigma_q^2 = E\{[x - q(x)]^2\} = \sigma_{Lin}^2 + \sigma_{Sat}^2$

[Figure: quantizer transfer characteristic (value of output signal versus value of input signal) together with the quantizing error (output − input).]

Quantization error

[Figure: three plots of amplitude (in quantization levels) versus time (0 to 1 ms), with amplitude ranges of roughly ±3, ±15 and ±250 levels.]

Quantization error

• The choice between a “mid-tread” and a “mid-riser” quantizer design is significant when large quantizing steps are used.

– Mid-tread has zero output unless the analog input exceeds the voltage step size, so background noise is suppressed, but it produces worse quantizing error at low voice levels.

– Mid-riser produces worse idle channel noise by increasing the minuscule background room noise or circuit noise, but has less average quantizing noise at low signal levels.

• Quantizing error can be characterized as an equivalent additive quantizing “noise”

[Figure: quantizer output code value versus analog voltage for the mid-tread and mid-riser designs.]

Quantization error

– The quantization noise is characterized as a realization of a stationary random process q in which each of the random variables q(n) has a uniform pdf:

$-\frac{\Delta}{2} \le q(n) \le \frac{\Delta}{2}$, with $pdf(q) = \frac{1}{\Delta}$

$\sigma_q^2 = E\{[q(n)]^2\} = \int_{-\infty}^{\infty} q^2 \cdot pdf(q) \, dq$

» Where the step size of the quantizer is $\Delta = \frac{A_{max}}{2^B}$ (B: number of bits)

Quantization Error

– $A_{max}$ : maximum swing of signal, with $\Delta = \frac{A_{max}}{2^B}$.

– The mean square value of the quantization error is :

$E[q^2(n)] = \int_{-\Delta/2}^{\Delta/2} q^2(n) \cdot \frac{1}{\Delta} \, dq = \frac{1}{\Delta} \cdot \frac{q^3(n)}{3} \Big|_{-\Delta/2}^{\Delta/2} = \frac{\Delta^2}{12} = \frac{A_{max}^2}{2^{2B} \times 12}$

– For the case of $A_{max} = 1$, the mean square value of the quantization noise is in dB :

$10 \log_{10} \frac{\Delta^2}{12} = 10 \log_{10} \frac{2^{-2B}}{12} = -6B - 10.8$ dB.
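The $\Delta^2/12$ result can be checked numerically. The sketch below assumes a mid-riser quantizer and an input spread evenly over the full swing $A_{max} = 1$; the helper names are illustrative only.

```python
import math


def midriser(x, step):
    """Reconstruct x at the center of the uniform cell that contains it."""
    return (math.floor(x / step) + 0.5) * step


def noise_power(bits, a_max=1.0, points_per_cell=100):
    """Mean squared error for inputs spread evenly over [-a_max/2, a_max/2)."""
    levels = 2 ** bits
    step = a_max / levels
    n = levels * points_per_cell
    total = 0.0
    for k in range(n):
        # Offset by half a grid point so no sample lands on a cell boundary.
        x = -a_max / 2 + (k + 0.5) * a_max / n
        total += (midriser(x, step) - x) ** 2
    return total / n


bits = 8
measured = noise_power(bits)
predicted = (1.0 / 2 ** bits) ** 2 / 12        # Delta^2 / 12 with A_max = 1
measured_db = 10 * math.log10(measured)
predicted_db = -6.02 * bits - 10.79            # the dB rule from the slide
```

With 8 bits both routes give a noise level close to −59 dB relative to the unit swing.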

Quantization SNR

When the quantized sample is expressed in binary form,

$L = 2^B$ ; $B = \log_2 L$, where B is the number of bits per sample.

$\Delta = \frac{A_{max}}{2^B} = \frac{2 m_{max}}{2^B}$

$\sigma_Q^2 = \frac{\Delta^2}{12} = \frac{1}{3} m_{max}^2 \, 2^{-2B}$

Let P denote the average power of m(t):

$(SNR)_o = \frac{P}{\sigma_Q^2} = \left( \frac{3P}{m_{max}^2} \right) 2^{2B}$

$(SNR)_{dB} = 10 \log_{10} \left[ \left( \frac{3P}{m_{max}^2} \right) 2^{2B} \right] = 6B + 10 \log_{10} \frac{3P}{m_{max}^2}$

6 dB per bit
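The 6 dB per bit rule can be tried on a test tone. This sketch assumes a full-scale sine (so $P = m_{max}^2/2$) and a mid-riser quantizer; the tone frequency and sample count are arbitrary choices made for the illustration.

```python
import math


def quantize(x, bits, m_max=1.0):
    """Uniform mid-riser quantizer over [-m_max, m_max) with saturation."""
    levels = 2 ** bits
    step = 2 * m_max / levels
    index = math.floor(x / step)
    index = max(-(levels // 2), min(levels // 2 - 1, index))
    return (index + 0.5) * step


def measured_snr_db(bits, n=20000):
    """Signal-to-quantization-noise ratio measured on a full-scale sine."""
    signal = noise = 0.0
    for k in range(n):
        x = math.sin(2 * math.pi * 0.1234567 * k)
        err = quantize(x, bits) - x
        signal += x * x
        noise += err * err
    return 10 * math.log10(signal / noise)


def predicted_snr_db(bits, power=0.5, m_max=1.0):
    """10 log10[(3P / m_max^2) 2^(2B)] from the slide."""
    return 10 * math.log10(3 * power / m_max ** 2 * 2 ** (2 * bits))
```

For a sine the constant term $10 \log_{10}(3P/m_{max}^2)$ is about 1.76 dB, and each extra bit adds about 6.02 dB.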


How many bits?

• 16 bits resolution is more than is needed for telephone purposes.

– the voice waveform has already been band-limited to ~3.5kHz bandwidth

– Filter imperfections add about -30 dB noise

– Carbon microphone is not high-fidelity

– Extra bits cost more in hardware and precision of design and manufacture, and in transmission cost.

• Empirical listener testing indicates about 12-13 bits of uniform resolution is adequate

– No perception of degradation in telephone voice quality

• Logarithmically compressed (“companded”) steps at low level permit equivalent quality with even fewer bits

Types of Quantization

– Uniform (linear) quantizing:

• No assumption about amplitude statistics and correlation properties of the input.

• Robust to small changes in input statistics, but not finely tuned to a specific set of input parameters

• Simply implemented

– Non-uniform quantizing:

• Using the input statistics to tune quantizer parameters

• Larger SNR than uniform quantizing with same number of levels

• Non-uniform intervals in the dynamic range with same quantization noise variance

• Commonly used for speech


Statistics of Speech Signals

• In speech, weak signals are more frequent than strong ones.

• Using equal step sizes (uniform quantizer) gives low $(S/N)_q$ for weak signals and high $(S/N)_q$ for strong signals.

– Thus, adjusting the step size of the quantizer by taking into account the speech statistics improves the SNR for the input range.

[Figure: probability density function versus normalized magnitude of speech signal.]

Non-Uniform Quantizer

[Figure: uniform and non-uniform transfer characteristics (output signal versus input signal), each shown with its error characteristic versus input signal.]

Uniform vs Non-Linear Quantizing

Non-uniform quantization

• It is done by uniformly quantizing the “compressed” signal.

• At the receiver, an inverse compression characteristic, called “expansion”, is employed to avoid signal distortion.

compression + expansion = companding

[Figure: Transmitter: x(t) is compressed, y = C(x), giving y(t), which is quantized to $\hat{y}(t)$ and sent over the channel. Receiver: $\hat{y}(t)$ is expanded to give $\hat{x}(t)$.]


μ Law/A Law

• The μ-law algorithm (μ-law) is a companding algorithm, primarily used in the digital telecommunication systems of North America and Japan. Its purpose is to reduce the dynamic range of an audio signal. In the analog domain, this can increase the signal to noise ratio achieved during transmission, and in the digital domain, it can reduce the quantization error (hence increasing signal to quantization noise ratio).

• The A-law algorithm is used in the rest of the world.

• The A-law algorithm provides a slightly larger dynamic range than the μ-law at the cost of worse proportional distortion for small signals. By convention, A-law is used for an international connection if at least one country uses it.
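The compress, quantize, expand chain can be sketched with the textbook continuous μ-law curve, $F(x) = \mathrm{sgn}(x) \ln(1 + \mu |x|) / \ln(1 + \mu)$ with μ = 255. This is an illustration only: the formula is the standard one rather than something stated on these slides, deployed codecs use a segmented approximation of it, and the function names are hypothetical.

```python
import math

MU = 255.0  # value commonly quoted for North American and Japanese telephony


def mu_compress(x, mu=MU):
    """Continuous mu-law compressor for x in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_expand(y, mu=MU):
    """Inverse of mu_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


def quantize_uniform(y, bits=8):
    """Uniform mid-riser quantizer over [-1, 1) with saturation."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = max(-(levels // 2), min(levels // 2 - 1, math.floor(y / step)))
    return (index + 0.5) * step


def companded(x, bits=8):
    """Compress, quantize uniformly, then expand."""
    return mu_expand(quantize_uniform(mu_compress(x), bits))


weak = 0.01
uniform_error = abs(quantize_uniform(weak) - weak)
companded_error = abs(companded(weak) - weak)
```

For a weak input such as 0.01 the companded path gives a much smaller error than the same 8 bits used uniformly, which is the point of the compression curve.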

μ Law

A Law

μ Law/A Law

[Figure: μ-law and A-law characteristics plotted against |x|.]

A Companding Law (Europe - ITU)

[Figure: A-law companding characteristic mapping an 11 bit input to an 8 bit output.]

Companding in music recording

• Recall the human ear’s “masking” phenomenon

– A noise signal is not perceived as objectionable unless it is sufficiently large in relation to a desired sound present simultaneously

– Small noises are objectionable in a quiet library

– The same small noise is imperceptible at a rock concert!

• This principle is the basis of noise reduction systems like the Dolby™ system for sound recording

– The recording audio level is automatically increased for soft passages

– The playback level is automatically reduced, to match, via an auxiliary control signal, so the desired signal has the original loudness. In the Dolby system, this is typically a low frequency control signal.

– Therefore, noise added by the recording medium (e.g., magnetic tape “hiss”) is not noticeable during “soft” music intervals

– Dolby systems treat different audio frequency bands separately (high frequency is noisiest in magnetic tape), and use different types of auxiliary signals (Dolby B, C, etc.)

G.711

• The most commonplace codec

– Used in circuit-switched telephone network

– PCM, Pulse-Code Modulation

• If uniform quantization

– 12 bits * 8 k/sec = 96 kbps

• Non-uniform quantization

– 64 kbps DS0 rate

– mu-law

» North America

– A-law

» Other countries, a little friendlier to lower signal levels

– An MOS of about 4.3

Differential PCM

• Basic idea

– Since speech signals are slowly varying, it is possible to eliminate the temporal redundancy by prediction

– For many natural signals, the difference between successive samples quantizes better than samples themselves

– Even better, predict the current sample from the past one(s) and transmit the error of the prediction to the decoder on the other side.

• Linear prediction

– Fixed: the same predictor is used again and again

– Adaptive: predictor is adjusted on-the-fly

First-order Prediction

- Encoding: x1 x2 … … xN

$e_1 = x_1$

$e_n = x_n - x_{n-1}$, n = 2,…,N

- Decoding: e1 e2 … … eN

$x_1 = e_1$

$x_n = e_n + x_{n-1}$, n = 2,…,N

[Figure: DPCM loop. Encoder: $x_{n-1}$, obtained from $x_n$ through a delay D, is subtracted from $x_n$ to give $e_n$. Decoder: $e_n$ is added to the delayed output $x_{n-1}$ to give $x_n$.]
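Without a quantizer, the encoding and decoding recursions above are exact inverses. A minimal sketch, assuming a non-empty list of samples and using illustrative function names:

```python
def dpcm_encode(x):
    """e_1 = x_1, then e_n = x_n - x_(n-1)."""
    return [x[0]] + [x[n] - x[n - 1] for n in range(1, len(x))]


def dpcm_decode(e):
    """x_1 = e_1, then x_n = e_n + x_(n-1)."""
    x = [e[0]]
    for n in range(1, len(e)):
        x.append(e[n] + x[n - 1])
    return x


samples = [90, 92, 91, 93, 93, 95]
residual = dpcm_encode(samples)       # small values after the first sample
restored = dpcm_decode(residual)      # identical to the input
```

The residual has a much smaller range than the samples themselves, which is what makes it cheaper to quantize.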

Open-loop DPCM

[Figure: Encoder: $e_n = x_n - x_{n-1}$ (delay D), followed by the quantizer Q, which outputs $\hat{e}_n$. Decoder: $\hat{x}_n = \hat{e}_n + \hat{x}_{n-1}$ (delay D).]

Note:

• Prediction is based on the past unquantized sample

• But quantization is located outside the DPCM loop
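One consequence of keeping the quantizer outside the loop is that the decoder accumulates quantization errors. The sketch below illustrates this under two assumptions that are not on this slide: the quantizer rounds to the nearest multiple of 3, and the first sample is sent unquantized.

```python
def q3(value):
    """Assumed quantizer for illustration: nearest multiple of 3."""
    return 3 * round(value / 3)


def open_loop_dpcm(x):
    """Return (quantized residuals, decoder output) for open-loop DPCM."""
    # Encoder: differences of the ORIGINAL samples, quantized afterwards.
    e_hat = [x[0]] + [q3(x[n] - x[n - 1]) for n in range(1, len(x))]
    # Decoder: running sum of the quantized differences.
    x_hat = [e_hat[0]]
    for n in range(1, len(e_hat)):
        x_hat.append(e_hat[n] + x_hat[n - 1])
    return e_hat, x_hat


samples = [90, 92, 91, 93, 93, 95]
e_hat, x_hat = open_loop_dpcm(samples)
drift = [r - s for r, s in zip(x_hat, samples)]   # reconstruction error per sample
```

On this short input the reconstruction error grows from 0 to 4, because each quantization error is carried forward by the decoder's accumulator.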

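DPCM

Bring the quantizer into the ‘prediction loop’

[Figure: Coder: the prediction $\hat{x}(n-1)$ is subtracted from x(n) to give e(n); the quantizer outputs $\hat{e}(n)$, which is sent over the communication channel and also added back to the prediction to form $\hat{x}(n)$, the input of the predictor. Decoder: the received $\tilde{e}(n)$ is added to the predictor output $\hat{x}(n-1)$ to give $\hat{x}(n)$.]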

Numerical Example

x(n):            90 92 91 93 93 95 …
e(n):            90  2 -2  3  0  2
$\hat{e}(n)$:    90  3 -3  3  0  3
$\hat{x}(n)$:    90 93 90 93 93 96 …

Quantizer Q: $Q(x) = \left[ \frac{x}{3} \right] \cdot 3$, where [·] rounds to the nearest integer.

[Figure: closed-loop coder. The subtractor forms e(n) = a − b from the input a = x(n) and the prediction b; the adder forms $\hat{x}(n)$ = a + b from a = $\hat{e}(n)$ and the prediction b.]
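The four rows of this example can be regenerated with a few lines of code. The sketch assumes the first sample passes through the same quantizer (90 is already a multiple of 3) and uses hypothetical function names.

```python
def q3(value):
    """Q(x) = [x / 3] * 3 with [.] taken as rounding to the nearest integer."""
    return 3 * round(value / 3)


def closed_loop_dpcm(x):
    """Return (e, e_hat, x_hat) with the quantizer inside the prediction loop."""
    e = [x[0]]
    e_hat = [q3(x[0])]
    x_hat = [e_hat[0]]
    for n in range(1, len(x)):
        e.append(x[n] - x_hat[n - 1])          # predict from the RECONSTRUCTED sample
        e_hat.append(q3(e[n]))                 # quantize the residual
        x_hat.append(e_hat[n] + x_hat[n - 1])  # same update the decoder performs
    return e, e_hat, x_hat


samples = [90, 92, 91, 93, 93, 95]
e, e_hat, x_hat = closed_loop_dpcm(samples)
```

Because the encoder predicts from the same reconstructed values the decoder has, the error stays within one quantizer cell instead of accumulating.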

DPCM

A: $e_n = x_n - \hat{x}_{n-1}$

B: $\hat{x}_n = \hat{e}_n + \hat{x}_{n-1}$

Subtracting B from A: $x_n - \hat{x}_n = e_n - \hat{e}_n$

The distortion due to quantization of the prediction residue $e_n$ is identical to the distortion introduced to the original sample $x_n$

[Figure: coder loop with points A (subtractor output, before the quantizer) and B (adder output feeding the predictor) marked.]

Higher Order Prediction

- Encoding

initialize: $e_n = x_n$, n = 1,…,k

prediction: $e_n = x_n - \sum_{i=1}^{k} a_i x_{n-i}$, n = k+1,…,N

- Decoding

initialize: $x_n = e_n$, n = 1,…,k

prediction: $x_n = e_n + \sum_{i=1}^{k} a_i x_{n-i}$, n = k+1,…,N

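DPCM

Prediction of the current sample from past estimated ones:

$\tilde{x}_n = \sum_{i=1}^{k} a_i \hat{x}_{n-i}$

Prediction error:

$e_n = x_n - \tilde{x}_n$

Est of current sample = predicted + error

[Figure: Coder: the prediction $\tilde{x}(n)$ is subtracted from x(n) to give e(n); the quantizer outputs $\hat{e}(n)$, which is sent over the communication channel and added back to $\tilde{x}(n)$ to form the estimate that feeds the predictor. Decoder: the received residual is added to the predictor output to rebuild the estimate.]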

Linear predictor coefficients

minimize $MSE = \sum_{n=1}^{N} e^2(n) = \sum_{n=1}^{N} \left[ x(n) - \sum_{k=1}^{K} a_k x(n-k) \right]^2$

$$\begin{bmatrix} R_n(0) & R_n(1) & \cdots & R_n(K-1) \\ R_n(1) & R_n(0) & \cdots & R_n(K-2) \\ \vdots & \vdots & \ddots & \vdots \\ R_n(K-1) & R_n(K-2) & \cdots & R_n(0) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_K \end{bmatrix} = \begin{bmatrix} R_n(1) \\ R_n(2) \\ \vdots \\ R_n(K) \end{bmatrix}$$

Note that in fixed prediction, auto-correlation is calculated over the whole segment of speech (NOT short-time features)
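The Toeplitz system above can be solved in several ways. The sketch below uses the Levinson-Durbin recursion, which is a standard solver for this structure but is a choice made here, not one named on the slide. The function names and the tiny test sequence are illustrative.

```python
def autocorrelation(x, max_lag):
    """R(lag) = sum over n of x(n) x(n - lag), for lag = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i - lag] for i in range(lag, n))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve sum_j a_j R(|i - j|) = R(i), i = 1..order, for the a_j.

    Returns (coefficients [a_1..a_order], final prediction error energy).
    """
    a = [0.0] * (order + 1)      # a[0] is unused
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] - sum(a[j] * r[m - j] for j in range(1, m))
        k = acc / err            # reflection coefficient of stage m
        new_a = a[:]
        new_a[m] = k
        for j in range(1, m):
            new_a[j] = a[j] - k * a[m - j]
        a = new_a
        err *= (1 - k * k)
    return a[1:], err


samples = [90, 92, 91, 93, 93, 95]
r = autocorrelation(samples, 2)
coeffs, err = levinson_durbin(r, 2)
```

In a fixed predictor the autocorrelation would be computed once over the whole speech segment, as the note above says, and the resulting coefficients reused for every sample.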

Adaptive DPCM

• Forward adaptation

– The prediction parameters are estimated from the current speech data, which is available only at the transmitter. The quantized prediction coefficients are transmitted to the decoder as side information.

• Backward adaptation

– The parameters are estimated from past data, which is available at both transmitter and receiver, thus there is no need for side information (no overhead), but the operation is susceptible to transmission errors.

Forward / Backward Adaptation

Forward adaptive prediction | Backward adaptive prediction
Asymmetric complexity allocation (encoder>decoder) | Symmetric complexity allocation (encoder=decoder)
Overhead non-negligible | No overhead
Robust to errors | Sensitive to errors
More suitable for low-bit rate coding | More suitable for high-bit rate coding

Adaptive DPCM with Forward Adaptation

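[Figure: Encoder: the output of the adaptive predictor (order p) is subtracted from the speech input and the difference feeds the adaptive quantizer; the predictor adaptation and step size adaptation blocks produce the side info. Decoder: Q-1 followed by the adaptive predictor loop, with predictor adaptation and step size adaptation driven by the received side info, produces the speech output.]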

Adaptive DPCM with Backward Adaptation

Short and long-term ADPCM

If we wish to model the short and long term prediction nature of speech, we can use a predictor of very large order P:

$P(z) = \sum_{k=1}^{P} a_k z^{-k}$

But modeling the pitch periodicity of speech as well as the short term redundancy would require a very large order, P = 50 to 100.

Instead, the predictor is split into 2 portions, one modeling the short term redundancy of speech, and one modeling the long term redundancy, due to pitch periodicity. The long term predictor can be a single coefficient filter of the form :

$A_L(z) = \beta \cdot z^{-M}$

Short and long-term ADPCM

β is a scaling factor that relates to the degree of periodicity of the waveform and M is the estimated pitch period (in samples). The predictor time response is a single impulse delayed by M samples. The synthesis filter is of the form :

$S_L(z) = \frac{1}{1 - A_L(z)} = \frac{1}{1 - \beta \cdot z^{-M}}$

[Figure: frequency response of the synthesis filter, |H(f)| versus frequency from 0 to 4000 Hz.]

Peaks are spaced by 1/M in normalized frequency (Fs/M in Hz)

Width of the peaks is a function of β, which can be estimated as

$\beta = \frac{E[x(n) \, x(n-M)]}{E[x^2(n-M)]}$
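The estimate of β can be computed with sample averages in place of the expectations. In the sketch below, the lag search that picks M (taking the lag with the largest β) is a simplification assumed for illustration, and the synthetic signal with a 40-sample period is made up for the demo.

```python
import math


def estimate_beta(x, lag):
    """beta = sum x(n) x(n - lag) / sum x(n - lag)^2 over the available samples."""
    num = sum(x[n] * x[n - lag] for n in range(lag, len(x)))
    den = sum(x[n - lag] ** 2 for n in range(lag, len(x)))
    return num / den


def estimate_pitch(x, min_lag, max_lag):
    """Assumed simple search: return (lag, beta) for the lag with the largest beta."""
    best = max(range(min_lag, max_lag + 1), key=lambda m: estimate_beta(x, m))
    return best, estimate_beta(x, best)


# Synthetic periodic signal with a period of 40 samples (two harmonics).
signal = [math.sin(2 * math.pi * n / 40) + 0.5 * math.sin(4 * math.pi * n / 40)
          for n in range(400)]
lag, beta = estimate_pitch(signal, 20, 60)
```

For a perfectly periodic input β comes out at essentially 1. Less periodic segments would give smaller values and therefore wider peaks in the synthesis filter response.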

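Short and long-term ADPCM

[Figure: Encoder: the long-term prediction $A_L(z)$ and the short-term prediction A(z) are subtracted from s(n), and the residual is quantized (Q) to give $e_q(n)$. Decoder: Q-1 recovers $e_q(n)$, and the A(z) and $A_L(z)$ prediction loops add their outputs back to produce the speech output.]

$A_L(z) = \beta \cdot z^{-M}$

$A(z) = \sum_{k=1}^{P} a_k z^{-k}$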

Higher-order LT predictor

The true pitch period is unlikely to be an exact multiple of 1/Fs. Thus, a predictor of multiple orders can be used to better synthesize the pitch periods.

$A_L(z) = \beta_1 \cdot z^{-M+1} + \beta_2 \cdot z^{-M} + \beta_3 \cdot z^{-M-1}$

Another way to deal with the varying degree of voicing across the spectrum (lower spectrum is more harmonically pronounced than the higher) is to treat separate bands separately. This allows the pitch predictor in different bands to have different β : high values lead to narrow bandwidth for the lower frequencies and lower values for the less periodic higher frequencies.

[Figure: the signal is split into a Hi band and a Low band, each with its own LT prediction.]


ITU-T G.726 - Adaptive Differential Pulse Code Modulation (ADPCM)

[Figure: block diagrams of the G.726 encoder and decoder.]

ITU-T G.722 7 kHz Audio Coding within 64 kbit/s

Simultaneous speech and data transmission with data rate BD = 8 or 16 kbit/s possible, B + BD = 64 kbit/s

Overall signal delay 1.5 ms

ADPCM (G.726 like) coding in both subbands with w = 4, 5 or 6 (32, 40, 48 kbit/s) in the lower subband and w = 2 (16 kbit/s) in the higher subband

ITU waveform coders

Demo: http://www-lns.tf.uni-kiel.de/demo/demo_speech.htm

– G.711 (64 kbps)

– G.722 (48 kbps)

– G.726 (32 kbps)

Delta Modulation : (DM)

• Predictor : one-step delay function, so the prediction of u(n) is the previous reconstructed sample $\tilde{u}(n-1)$

• Quantizer : 1-bit quantizer

$e(n) = u(n) - \tilde{u}(n-1)$

$\tilde{e}(n) = Q_{1\text{-bit}}[e(n)]$

Delta Modulation : (DM)

• Primary Limitation of DM

– Slope overload : large jump region

» Max. slope = (step size) × (sampling freq.)

– Granularity Noise : almost constant region

– Instability to channel noise

[Figure: input u(n) and its staircase approximation $\tilde{u}(n)$.]

DM:

[Figure: Coder: $e(n) = u(n) - \tilde{u}(n-1)$ passes through the 1-bit quantizer to give $\tilde{e}(n)$; an integrator built around a unit delay (Ts) accumulates $\tilde{e}(n)$ into $\tilde{u}(n)$. Decoder: the same integrator with a unit delay (Ts) rebuilds $\tilde{u}(n)$ from $\tilde{e}(n)$.]
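The coder and decoder loops can be sketched in a few lines. This is an illustration under assumptions: the reconstruction starts at zero, a zero error is coded as a 1, and the integrator adds or subtracts one fixed step per sample. Function names are hypothetical.

```python
import math


def dm_encode(u, step):
    """Return one bit per sample: 1 if the input is at or above the reconstruction."""
    bits = []
    recon = 0.0                      # previous reconstructed sample
    for sample in u:
        bit = 1 if sample - recon >= 0 else 0    # 1-bit quantizer on e(n)
        recon += step if bit else -step          # integrator update
        bits.append(bit)
    return bits


def dm_decode(bits, step):
    """Rebuild the staircase approximation from the bit stream."""
    out = []
    recon = 0.0
    for bit in bits:
        recon += step if bit else -step
        out.append(recon)
    return out


# Slowly varying input: slope per sample (about 0.016) stays below the step.
slow = [0.5 * math.sin(2 * math.pi * n / 200) for n in range(400)]
tracked = dm_decode(dm_encode(slow, 0.05), 0.05)

# Sudden jump: the staircase can only climb one step per sample (slope overload).
jump = dm_decode(dm_encode([1.0] * 5, 0.05), 0.05)
```

The slow sine is followed to within one step, while the jump shows the reconstruction lagging far behind the input.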

DM

Step size (and sampling frequency) effect :

(i) slope overload

(ii) granular noise

DM – step size conditions

• The choice of step size is crucial to successful performance in DM. Since the output magnitude can change only by Δ each sample interval T, then Δ must be large enough to accommodate rapid changes.

$\frac{\Delta}{T} \ge \max \left| \frac{dx(t)}{dt} \right| \approx \max \left| \frac{x(n) - x(n-1)}{T} \right|$

For Sinusoidal Signals $x(t) = A \cdot \cos(2 \pi f_0 t)$

$\frac{\Delta}{T} \ge \max \left| \frac{dx(t)}{dt} \right| = 2 \pi f_0 A$

$A \le \frac{\Delta}{2 \pi f_0 T} = \frac{\Delta \cdot f_s}{2 \pi f_0}$

DM – step size example

– Q. Consider a speech signal with maximum frequency of 3.4 kHz and maximum amplitude of 1 volt. This speech signal is applied to a delta modulator whose bit rate is set at 60 kbit/sec. What is an appropriate step size for the modulator?

– Bandwidth of the signal = 3.4 kHz

– Maximum amplitude = 1 volt

– Bit rate = 60 kbit/sec

– Sampling rate = 60 k samples/sec

– Step size: $\Delta \ge 2 \pi f_0 A T_s$ = 0.356 volts
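The arithmetic behind the 0.356 volt answer, written out as a quick check. The variable names are illustrative.

```python
import math

f0 = 3400.0        # maximum signal frequency in Hz
amplitude = 1.0    # maximum amplitude in volts
fs = 60000.0       # DM sends one bit per sample, so 60 kbit/s means 60 k samples/s

# Slope overload is avoided when step >= 2 * pi * f0 * A * Ts, with Ts = 1 / fs.
min_step = 2 * math.pi * f0 * amplitude / fs
print(f"minimum step size = {min_step:.3f} V")   # prints: minimum step size = 0.356 V
```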

Adaptive DM:

[Figure: adaptive DM coder. An adaptive function uses the stored values $E_k$, $\Delta_k$ and $\Delta_{min}$ to produce the next step size $\Delta_{k+1}$; the signals shown are $X_{k+1}$, $E_{k+1}$, $s_{k+1}$ and, after a unit delay, $X_k$.]

Variable Step Size

– Input signal is varying fast : step size is increased

– Input signal is varying slowly : step size is reduced
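The slide does not specify the adaptive function, so the sketch below assumes one simple rule purely for illustration: if two consecutive bits are equal (the input is running away from the staircase) the step is multiplied by a factor, otherwise it is divided by it, and it is kept between a minimum and a maximum. All names and parameter values are hypothetical.

```python
def next_step(step, bit, prev_bit, step_min, step_max, factor):
    """Assumed adaptation rule: grow on repeated bits, shrink on alternating bits."""
    if prev_bit is None:
        return step
    step = step * factor if bit == prev_bit else step / factor
    return min(step_max, max(step_min, step))


def adm_encode(u, step_min=0.01, step_max=1.0, factor=1.5):
    """Return the bit stream of the adaptive delta modulator."""
    bits = []
    recon, step, prev_bit = 0.0, step_min, None
    for sample in u:
        bit = 1 if sample >= recon else 0
        step = next_step(step, bit, prev_bit, step_min, step_max, factor)
        recon += step if bit else -step
        bits.append(bit)
        prev_bit = bit
    return bits


def adm_decode(bits, step_min=0.01, step_max=1.0, factor=1.5):
    """Rebuild the signal; the step sizes are re-derived from the bits alone."""
    out, steps = [], []
    recon, step, prev_bit = 0.0, step_min, None
    for bit in bits:
        step = next_step(step, bit, prev_bit, step_min, step_max, factor)
        recon += step if bit else -step
        out.append(recon)
        steps.append(step)
        prev_bit = bit
    return out, steps


fast_out, fast_steps = adm_decode(adm_encode([1.0] * 8))   # sudden jump
idle_out, idle_steps = adm_decode(adm_encode([0.0] * 6))   # constant input
```

With this rule the step grows on the jump and stays at its minimum on the constant input. Since the adaptation depends only on the transmitted bits, the decoder needs no side information.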