When to Code WHEN NOT TO CODE James D. (jj) Johnston Chief Scientist DTS, Inc.

Information on this slide is the confidential property of DTS. Any unauthorized copying is strictly prohibited. Copyright 2009 DTS, Inc.

What is a coder, or a codec

• A coder or codec is a signal processing method that reduces bit rate.

• There are two (broadly speaking) kinds of codec:

– Source Coders

– Perceptual Coders


What is a Source Coder

• A source coder uses a signal model of the source to reduce the bit rate

– It does this by reducing redundancy, and/or by adding noise

– A source coder’s success depends entirely on its ability to model the source.

• Not all sources meet the same model

• Not all coders can handle disparate models

• The more constrained the source, the more efficient the codec

– But you’d better not have the wrong source, then.


Some points about Source Coders

• A source coder is highly dependent on its source model, adaptive or fixed

– The stronger the model, the more compression

– The more general the model, the less the reduction in bit rate (in general)

• When a source coder does not match the actual source

– Compression fails

• Bit rate rises (or)

• Quality falls

– When a source coder is extremely specific, it may fail completely on some source material

• Adaptive codecs (most are) generally adapt to the most energetic signal presented as input

– If noise is louder, it will code the noise better than the speech!
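A rough numerical sketch of that last point, under assumptions that are not in the slides: the "coder" is reduced to a first-order linear predictor, speech is modeled as a signal whose adjacent samples have correlation 0.9, and the noise is white and independent of the speech. All names and numbers here are illustrative only.

```python
def predictor_coefficient(speech_power, noise_power,
                          speech_corr=0.9, noise_corr=0.0):
    """Optimal first-order predictor coefficient a = r(1)/r(0) for the mix.

    For independent signals the autocorrelations add, so the coefficient
    is a power-weighted average of the two lag-1 correlations.
    """
    r0 = speech_power + noise_power
    r1 = speech_power * speech_corr + noise_power * noise_corr
    return r1 / r0


def speech_residual_power(a, speech_power=1.0, speech_corr=0.9):
    """Power left in the speech component after predicting with coefficient a.

    E[(s[n] - a*s[n-1])^2] = P * (1 - 2*a*rho + a^2)
    """
    return speech_power * (1.0 - 2.0 * a * speech_corr + a * a)


# Speech power is fixed at 1.0; the noise power is swept (assumed values).
for noise_power in (0.0, 1.0, 10.0):
    a = predictor_coefficient(1.0, noise_power)
    print(f"noise power {noise_power:5.1f}: a = {a:.3f}, "
          f"speech residual = {speech_residual_power(a):.3f}")
```

As the noise power grows, the adapted coefficient drifts toward the value that suits the noise, and the predictor removes less and less of the redundancy in the speech.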


Low Rate Voice Coders

• The way that a low rate voice coder reduces the bit rate is by having a very specific speech model.

– Such codecs fail miserably on music

– Such codecs fail miserably when exposed to too much background noise

• In short, if you use a very-low-rate voice codec, YOU MUST MAKE SURE THAT IT GETS CLEAN VOICE INPUT

– We all hear this on our cell phones on a daily basis


What about Special Codecs

• It is possible to make a codec work somewhat better on noise at a cost to speech performance

• Unfortunately, this cuts both ways.


What’s the point for Emergency Services?

1. Don’t use an extremely low rate speech codec

2. Really! Don’t use a low-rate speech codec

3. See Rule #2!

• You have no control over background noise, and the background noise in an emergency situation is likely to be both non-speech-like and intense.


Digital Transmission in the Emergency Services

• Analog transmission gets noisy when you get to the limits of communication.

– The noise gives you warning that you are losing contact.

• Digital transmission JUST GOES AWAY

– What’s more, with time-domain redundancy, it will “stay away” for a while

• If you use digital transmission, make sure that the user will be alerted to loss of communication.
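One way to alert the user is a watchdog on valid received frames. The sketch below is only an illustration: the class name, the 0.5 s timeout, and the idea that each frame carries a pass/fail error check are assumptions, not something specified in these slides.

```python
class LinkWatchdog:
    """Flags loss of contact when no valid frame has arrived recently.

    `timeout_s` is a hypothetical threshold; a real system would tune it
    against its frame rate and error-concealment depth.
    """

    def __init__(self, timeout_s=0.5):
        self.timeout_s = timeout_s
        self.last_good = None  # time of the most recent valid frame

    def frame_received(self, now, valid):
        """Record a frame arrival; only frames that pass error checks count."""
        if valid:
            self.last_good = now

    def link_lost(self, now):
        """True if no valid frame ever arrived or the last one is too old."""
        if self.last_good is None:
            return True
        return (now - self.last_good) > self.timeout_s


wd = LinkWatchdog(timeout_s=0.5)
wd.frame_received(now=0.00, valid=True)
wd.frame_received(now=0.02, valid=True)
wd.frame_received(now=0.04, valid=False)  # corrupted frame, ignored
for t in (0.1, 0.4, 0.6):
    status = "ALERT: link lost" if wd.link_lost(t) else "link ok"
    print(f"t={t:.1f}s: {status}")
```

The alert replaces the warning that rising analog noise used to give for free, so the timeout should be short enough that the user learns of the loss before acting on a message that never arrived.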


What about perceptual codecs?

• A perceptual codec attempts to discover what the most audible signal elements are, and sends them

– It uses a human hearing model that assumes normal listening levels.

• Do I need to say any more?

– OK, it uses a human hearing model that assumes normal listening levels.


Implications for Emergency Services

• Perceptual coders must make assumptions about the listening level

• Perceptual coders must also make assumptions about what the “message” is at low rates

• The combination of lots of noise, loud noise, and speech is not a good combination.


The Ear vs. Noise

• At high levels, the ear’s ability to separate frequencies in a signal is badly degraded

– This makes speech even harder to understand

– Making it LOUDER may make the result worse

• A situation where the listener is in a high noise environment, the talker has lots of noise, and a low-rate codec is in use is very, very likely to create communications problems.

• BE CAREFUL!


Now what?

• HIGH rate speech codecs (say mu-law at 56 kb/s) have no model to speak of, and are more robust.

• Straight PCM is probably an excessive rate for mobile radio use

• The tradeoff between rate, delay, time redundancy, and “the cliff” must be carefully examined.
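To make the "no model to speak of" point concrete, here is a minimal sketch of mu-law companding. It uses the continuous companding curve rather than the segment tables of a real telephone codec, and it assumes 8000 samples per second at 7 bits per sample to arrive at the 56 kb/s figure; the slides do not state those parameters.

```python
import math

MU = 255.0  # mu value used by G.711 mu-law


def mulaw_compress(x):
    """Map x in [-1, 1] to [-1, 1] with logarithmic companding."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y):
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def quantize(y, bits):
    """Uniform mid-rise quantizer on [-1, 1] with 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(levels - 1, int((y + 1.0) / step))
    return -1.0 + (index + 0.5) * step


def codec(x, bits=7):
    """Compress, quantize, expand one sample; no signal model is involved."""
    return mulaw_expand(quantize(mulaw_compress(x), bits))


SAMPLE_RATE_HZ = 8000   # assumed telephone-band sampling rate
BITS_PER_SAMPLE = 7     # assumed; 8 bits would give 64 kb/s
print("bit rate:", SAMPLE_RATE_HZ * BITS_PER_SAMPLE / 1000, "kb/s")

for x in (0.001, 0.01, 0.1, 0.9):
    y = codec(x)
    print(f"x={x:<6} decoded={y:.5f} relative error={abs(y - x) / x:.1%}")
```

Each sample is coded on its own, with roughly constant relative error over a wide range of levels, so speech, noise, and music are all treated alike. That is the robustness being paid for with the higher rate.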


So now what?

GET THE CLEANEST SPEECH SIGNAL YOU CAN

A. Close-talking mike

B. Good AGC

C. Heavily filtered bandwidth, 500-4000 Hz or so

i. Yes, this eliminates fricatives; that’s what Alpha, Baker … is for.

D. Use as much noise cancelling as you can
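As an illustration of item C, the sketch below builds a rough 500-4000 Hz band-pass from two second-order Butterworth sections designed with the bilinear transform. The 16 kHz sample rate, the filter order, and all function names are assumptions for the example; a fielded system would likely use steeper filters.

```python
import cmath
import math


def biquad(kind, cutoff_hz, fs_hz, q=1 / math.sqrt(2)):
    """Second-order low-pass or high-pass section (bilinear-transform design).

    Returns normalized (b0, b1, b2, a1, a2); q = 1/sqrt(2) gives a
    Butterworth response.
    """
    w0 = 2.0 * math.pi * cutoff_hz / fs_hz
    cos_w0, alpha = math.cos(w0), math.sin(w0) / (2.0 * q)
    if kind == "lowpass":
        b = ((1 - cos_w0) / 2, 1 - cos_w0, (1 - cos_w0) / 2)
    elif kind == "highpass":
        b = ((1 + cos_w0) / 2, -(1 + cos_w0), (1 + cos_w0) / 2)
    else:
        raise ValueError(kind)
    a0 = 1 + alpha
    return (b[0] / a0, b[1] / a0, b[2] / a0,
            -2 * cos_w0 / a0, (1 - alpha) / a0)


def run(section, samples):
    """Filter a sequence with one biquad (direct form I)."""
    b0, b1, b2, a1, a2 = section
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1, y2, y1 = x1, x, y1, y
        out.append(y)
    return out


def gain(section, freq_hz, fs_hz):
    """Magnitude response of one section at freq_hz."""
    b0, b1, b2, a1, a2 = section
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / fs_hz)
    return abs((b0 + b1 * z1 + b2 * z1 * z1) / (1 + a1 * z1 + a2 * z1 * z1))


FS = 16000  # assumed sample rate; must exceed 8 kHz for a 4 kHz upper edge
HPF = biquad("highpass", 500, FS)
LPF = biquad("lowpass", 4000, FS)


def speech_band(samples):
    """Cascade the two sections: roughly a 500-4000 Hz band-pass."""
    return run(LPF, run(HPF, samples))


for f in (100, 1500, 7000):
    print(f"{f:5d} Hz: gain {gain(HPF, f, FS) * gain(LPF, f, FS):.3f}")
```

The printed gains show the mid-band passing almost unchanged while low-frequency rumble and high-frequency hiss are strongly attenuated before the signal ever reaches the codec.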


Then

• Carefully test any codecs for

– Tandeming properties in noise

– DRT/Articulation performance in noise

– Overload or “lockup” issues in noise


Finally:

BE Careful OUT THERE!
