Audio Coding Quantization and Coding Methods - TU Ilmenau

Prof. Dr.-Ing. K. Brandenburg, [email protected] Prof. Dr.-Ing. G. Schuller, [email protected] Page 1 Audio Coding Quantization and Coding Methods Prof. Dr.-Ing. Karlheinz Brandenburg [email protected]


Page 2: Audio Coding Quantization and Coding Methods - TU Ilmenau


Quantization and Coding Methods (1)

Page 3: Audio Coding Quantization and Coding Methods - TU Ilmenau

Quantization and Coding Methods (2)

• Objective: "good" representation of spectral data
  – Compactness (low bit rate)
  – Smallest possible perceptible distortion (high subjective quality)
• Overview:
  – Quantization
  – Noiseless Coding
  – Joint Quantization / Coding Techniques
  – Encoding Strategies

Page 4: Audio Coding Quantization and Coding Methods - TU Ilmenau

Quantization (1)

• Basics:
  – Data reduction by removing irrelevance
  – Explicit control of quantization distortion according to time/frequency-dependent masking threshold (perceptual coder)
  – High variability in local SNR (e.g. 0 dB ... > 30 dB)
• Most popular case: Scalar quantization

Page 5: Audio Coding Quantization and Coding Methods - TU Ilmenau

Quantization (2)

• Simpler: Uniform quantization
  – MPEG-1/2 Layers I and II, ATRAC, AC-3
  – Average distortion independent of the size of the coefficients
  – More precise control of quantization noise

[Figure: linear quantizer characteristic, $f(x) = x$]
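The claim that the distortion of a uniform quantizer does not depend on the coefficient size can be checked with a small sketch. The function names and the step size below are hypothetical and only serve the illustration; no particular codec is modelled.

```python
def quantize_uniform(x, step):
    """Return the integer index of x for a uniform (mid-tread) quantizer."""
    return round(x / step)


def dequantize_uniform(index, step):
    """Map an index back to its reconstruction value."""
    return index * step


step = 0.5  # hypothetical step size
for x in (0.3, 3.3, 30.3, 300.3):
    x_hat = dequantize_uniform(quantize_uniform(x, step), step)
    # The error never exceeds step / 2, however large x is.
    print(f"x = {x:6.1f}   |error| = {abs(x - x_hat):.3f}")
```

The error bound of half a step holds for small and large inputs alike, which is what makes the noise level easy to control per band.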

Page 6: Audio Coding Quantization and Coding Methods - TU Ilmenau

Quantization (3)

• More sophisticated: Non-uniform quantization
  – MPEG-1/2 Layer III, MPEG-2/4 AAC
  – More distortion for larger coefficients (subband signal x)

$S_i = \mathrm{round}\left( \left( \frac{x_i}{\Delta_q} \right)^{0.75} \right)$

Page 7: Audio Coding Quantization and Coding Methods - TU Ilmenau

Quantization (4)

Non-uniform quantization:

• Encoder: compression of $x / \Delta_q$ with $f(x) = x^{0.75}$
• Decoder: expanding with $f(x) = x^{1/0.75}$

[Figure: encoder and decoder characteristics and the resulting step size after decoding]
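A minimal sketch of the compression/expansion pair, assuming the power law with exponent 0.75 from the slides. The separate sign handling and the function names are assumptions made for this illustration; the slides only show the magnitude characteristic.

```python
def encode(x, step):
    """Compress |x / step| with exponent 0.75, then round; the sign is kept."""
    sign = -1 if x < 0 else 1
    return sign * round((abs(x) / step) ** 0.75)


def decode(index, step):
    """Expand |index| with exponent 1/0.75 and rescale by the step size."""
    sign = -1 if index < 0 else 1
    return sign * (abs(index) ** (1 / 0.75)) * step


def effective_step(index, step):
    """Distance between neighbouring reconstruction levels after decoding."""
    return decode(index + 1, step) - decode(index, step)


for index in (1, 10, 100):
    print(index, round(effective_step(index, 1.0), 2))
```

The effective step size grows with the index, so larger coefficients receive more quantization noise, as stated on the previous slide.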

Page 8: Audio Coding Quantization and Coding Methods - TU Ilmenau

Quantization (5)

• Approach #1:
  – Group of spectral coefficients (subband signal x) is normalized by means of a common multiplier ("scalefactor")
  – Side information: scalefactor and quantizer resolution (bits/sample)
  – "Block companding" / "block floating point"
  – MPEG-1/2 Layers I and II, ATRAC, AC-3
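Block companding can be sketched as follows. This is a simplified illustration, not the procedure of any standard: the scalefactor is kept as an unquantized float (real codecs pick it from a table), and the function names are hypothetical.

```python
def block_compand(block, bits):
    """Quantize a block with one shared scalefactor and `bits` bits per sample."""
    scalefactor = max(abs(v) for v in block) or 1.0  # common multiplier
    levels = 2 ** (bits - 1) - 1                     # symmetric integer range
    indices = [round(v / scalefactor * levels) for v in block]
    return scalefactor, indices


def block_expand(scalefactor, indices, bits):
    """Invert block_compand up to the quantization error."""
    levels = 2 ** (bits - 1) - 1
    return [i / levels * scalefactor for i in indices]


block = [0.02, -0.5, 0.25, 0.1]  # hypothetical subband samples
scalefactor, indices = block_compand(block, bits=4)
restored = block_expand(scalefactor, indices, bits=4)
print(scalefactor, indices, restored)
```

Only the scalefactor and the number of bits per sample have to be sent as side information; the indices use a fixed word length.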

Page 9: Audio Coding Quantization and Coding Methods - TU Ilmenau

Quantization (6)

• Approach #2:
  – Group of spectral coefficients is scaled ("scalefactor") and subsequently quantized by a fixed quantizer
  – Side information: scalefactor
  – Used with entropy coding
  – Quantization precision controlled by scalefactor
  – MPEG-1/2 Layer III, MPEG-2/4 AAC

Page 10: Audio Coding Quantization and Coding Methods - TU Ilmenau

Scalefactor Bands and Masking Threshold

• Shaping of the noise according to psycho-acoustic model
• Several subbands in each scalefactor band, with same step size
• Approximation of masking threshold by a step function → fewer bits for parameterization

[Figure: signal and masking threshold over frequency f; each step of the step function is one scalefactor band with its scalefactor, spanning several subbands]

Page 11: Audio Coding Quantization and Coding Methods - TU Ilmenau

Effect of Non-uniform Quantization in each Scalefactor Band

• Additional noise shaping with non-uniform quantizer ($x^{0.75}$): quantization noise follows signal x
• Approximation of masking threshold by a step function → fewer bits for parameterization

[Figure: signal and masking threshold over frequency f; each step of the step function is one scalefactor band with its scalefactor, spanning several subbands]

Page 12: Audio Coding Quantization and Coding Methods - TU Ilmenau

Polynomial Approximation of Masking Threshold

• A different way to approximate the masking threshold: use a polynomial approximation instead of the step function of the scalefactor bands.
• Goal: approximate the masking threshold as closely as possible to reduce bit demand.
• Used in the Ultra Low Delay Coder (predictive coder, no subbands). The approximation is the frequency response of a linear filter.

[Figure: signal, masking threshold and polynomial approximation (example: 12 coefficients)]

Page 13: Audio Coding Quantization and Coding Methods - TU Ilmenau

Noiseless Coding (1)

• Even more efficient: higher gain by using entropy coding
  – Huffman coding (MPEG-1/2 Layer III, MPEG-2/4 AAC)
  – Arithmetic coding (MPEG-4 version 2 BSAC)

[Figure: scalefactor bands 0 to 10 grouped into sections, with codebook numbers 3, 8, 0, 6, 2 assigned to the sections; choose "best" Huffman code from codebook]

Page 14: Audio Coding Quantization and Coding Methods - TU Ilmenau

Noiseless Coding (2)

• Reason for compression gain with entropy coding: shorter codewords for the more likely symbols (e.g. 0)
• Typical probability distribution of audio samples: Laplace distribution, $p \sim e^{-|x|/c}$

[Figure: histogram of the quantized values $x_q$; 0 has the highest probability and gets the shortest codeword]
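The effect can be reproduced with a small Huffman construction over a truncated Laplace-shaped distribution. The scale c, the value range and the helper name `huffman_lengths` are assumptions chosen for this sketch; the standardized codebooks are fixed tables and are not derived this way at run time.

```python
import heapq
import math


def huffman_lengths(probs):
    """Return {symbol: codeword length} for a Huffman code over `probs`."""
    heap = [(p, n, [s]) for n, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(probs, 0)
    tie = len(heap)  # unique counter so symbol lists are never compared
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            lengths[s] += 1  # every merge adds one bit to the merged symbols
        heapq.heappush(heap, (p1 + p2, tie, syms1 + syms2))
        tie += 1
    return lengths


c = 1.5  # hypothetical scale of the Laplace distribution
weights = {q: math.exp(-abs(q) / c) for q in range(-6, 7)}
total = sum(weights.values())
probs = {q: w / total for q, w in weights.items()}

lengths = huffman_lengths(probs)
avg_bits = sum(probs[q] * lengths[q] for q in probs)
fixed_bits = math.ceil(math.log2(len(probs)))  # fixed-length alternative
print(lengths[0], avg_bits, fixed_bits)
```

The value 0 receives the shortest codeword, and the average length stays below the fixed word length needed for the same value range.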

Page 15: Audio Coding Quantization and Coding Methods - TU Ilmenau

Noiseless Coding (3)

• Example for state-of-the-art coding kernel (MPEG-2/4 AAC)
  – Multi-dimensional (2- or 4-dim.) entropy coding → exploiting joint statistics of vector components
  – Huffman coding, several codebooks
  – Sign information part of codebook or separate; escape mechanism for large quantized values
  – Choice of Huffman codebook for arbitrary groups of scalefactor bands ("sections")
  – Can obtain less than 1 bit per sample with 2- or 4-dimensional vectors
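Why vectors are needed to get below 1 bit per sample: a scalar Huffman code spends at least one bit on every value, while a code over groups of values can share one short codeword among several zeros. The sketch below assumes independent samples with a hypothetical distribution; it therefore shows only the grouping gain, not the additional gain from joint statistics.

```python
import heapq
import math
from itertools import product


def huffman_lengths(probs):
    """Return {symbol: codeword length} for a Huffman code over `probs`."""
    heap = [(p, n, [s]) for n, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(probs, 0)
    tie = len(heap)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie, syms1 + syms2))
        tie += 1
    return lengths


# Hypothetical source: quantized values are mostly zero.
p = {0: 0.9, 1: 0.05, -1: 0.05}

scalar = huffman_lengths(p)
bits_scalar = sum(p[s] * scalar[s] for s in p)

dim = 4  # code four values with one codeword
p_vec = {v: math.prod(p[s] for s in v) for v in product(p, repeat=dim)}
vec = huffman_lengths(p_vec)
bits_vec = sum(p_vec[v] * vec[v] for v in p_vec) / dim

print(bits_scalar, bits_vec)  # bits per sample: scalar vs. 4-dimensional
```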

Page 16: Audio Coding Quantization and Coding Methods - TU Ilmenau

MPEG Audio Layer-3: Huffman Code Tables (1)

Page 17: Audio Coding Quantization and Coding Methods - TU Ilmenau

MPEG Audio Layer-3: Huffman Code Tables (2)

Page 18: Audio Coding Quantization and Coding Methods - TU Ilmenau

MPEG Audio Layer-3: Huffman Code Tables (3)

Page 19: Audio Coding Quantization and Coding Methods - TU Ilmenau

MPEG Audio Layer-3: Huffman Code Tables (4)

Page 20: Audio Coding Quantization and Coding Methods - TU Ilmenau

Joint Quantization / Coding Techniques (1)

• Vector quantization
  – Excellent coding efficiency at very low rates (<< 1 bit/sample), but
  – Perceptual control of distortion difficult → control only total distortion for "group"
• Application range:
  – Used for intermediate quality / very low bit rate coding, e.g. MPEG-4 TwinVQ

Page 21: Audio Coding Quantization and Coding Methods - TU Ilmenau

Joint Quantization / Coding Techniques (2)

[Figure: quantization cells and resulting quantization error for 2 subbands; axes x1 (subband value 1) and x2 (subband value 2), with the quantization step size marked]

• Denser packing of quantization "spheres" → reduced average quantization errors
• Also possible for higher dimensions → even higher gain

Page 22: Audio Coding Quantization and Coding Methods - TU Ilmenau

Joint Quantization / Coding Techniques (3)

[Figure: block diagram of interleaved vector quantization: input vector → interleaving → sub-vectors (interleaved) → one weighted VQ per sub-vector, controlled by perceptual weights → index values]
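The block diagram can be read as two steps: split the input vector into interleaved sub-vectors, then search a codebook for each sub-vector with a perceptually weighted distance. The sketch below follows that reading; the tiny codebook, the weights and the function names are hypothetical and untrained, and the details of MPEG-4 TwinVQ are not reproduced.

```python
def interleave(vector, n_sub):
    """Split `vector` into `n_sub` sub-vectors by taking every n_sub-th element."""
    return [vector[k::n_sub] for k in range(n_sub)]


def weighted_vq(sub, weights, codebook):
    """Return the index of the codeword with the smallest weighted squared error."""
    def dist(code):
        return sum(w * (x - c) ** 2 for x, c, w in zip(sub, code, weights))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # hypothetical
spectrum = [0.9, 0.1, 0.2, 0.8]   # hypothetical input vector
weights = [1.0, 1.0, 4.0, 4.0]    # hypothetical perceptual weights

subs = interleave(spectrum, 2)    # [[0.9, 0.2], [0.1, 0.8]]
w_subs = interleave(weights, 2)
indices = [weighted_vq(s, w, codebook) for s, w in zip(subs, w_subs)]
print(indices)
```

Only the codebook indices are transmitted; the weights steer where the total distortion of each group is allowed to fall.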

Page 23: Audio Coding Quantization and Coding Methods - TU Ilmenau

Encoding Strategies (1)

• Today's coding schemes provide large degree of flexibility
  – Quantization noise profile over frequency
  – Trade-off audio bandwidth vs. overall distortion
  – Bit rate & coding mode
  – Usage of optional tools (Temporal Noise Shaping, prediction)
• Encoding strategy: "intelligent" part of encoding; determines quality
• Arena for specific know-how ("secrets of audio coding")

Page 24: Audio Coding Quantization and Coding Methods - TU Ilmenau

Encoding Strategies (2)

• Signal-to-Noise Ratio (SNR)
  – Quadratic distortion metric
  – Does not allow prediction about quality!
• Noise-to-Mask Ratio (NMR)
  – Ratio of distortion with respect to masking threshold
  – Determines audibility of distortions
• Signal-to-Mask Ratio (SMR)
  – Relation between signal and masking threshold
  – Gives indication of bit demand

[Figure: levels in dB over coder bands for signal energy, masking threshold and distortion (noise), with SNR, SMR and NMR marked as level differences]
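Expressed in dB, the three measures are level differences between the same three quantities, so NMR = SMR − SNR. A short numeric sketch with hypothetical band energies:

```python
import math


def db(ratio):
    """Convert an energy ratio to decibels."""
    return 10.0 * math.log10(ratio)


signal_energy = 1000.0   # hypothetical band energies (linear scale)
mask_threshold = 10.0
noise_energy = 2.5

snr = db(signal_energy / noise_energy)    # signal vs. distortion
smr = db(signal_energy / mask_threshold)  # signal vs. masking threshold
nmr = db(noise_energy / mask_threshold)   # distortion vs. masking threshold

print(snr, smr, nmr)
```

A negative NMR means the noise lies below the masking threshold in that band.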

Page 25: Audio Coding Quantization and Coding Methods - TU Ilmenau

Constant Quality Coding

• Goal: coding with a pre-selected constant quality
• Procedure:
  – Determine time- and frequency-dependent masking threshold
  – Adjust quantizer step size to meet target distortion
  – Constant quality → variable bit rate
• Applications:
  – Storage media / transmission channels supporting variable rate, e.g. storage on digital media; music on internet

Page 26: Audio Coding Quantization and Coding Methods - TU Ilmenau

Constant Rate Coding (1)

• Paradigm 1: "Bit Allocation"
  – Used typically in coding schemes with block companding
  – Direct translation between bit rate and local SNR available

[Flowchart]
  1. Set initial SNR for each band
  2. Calc. bits/sample
  3. #used_bits ≤ #available_bits?
     No → adjust SNR, back to step 2
     Yes → perform quantization
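A minimal sketch of such a bit allocation loop. It assumes the common rule of thumb of about 6.02 dB of SNR per bit for a uniform quantizer, starts with the SNR of each band set to its SMR, and lowers all targets by the same amount when the budget is exceeded. Real encoders adjust the bands more selectively; the function names and numbers are hypothetical.

```python
import math

DB_PER_BIT = 6.02  # rule of thumb for a uniform quantizer


def bits_for_snr(snr_db):
    """Bits/sample needed to reach at least the requested SNR (0 if none needed)."""
    return max(0, math.ceil(snr_db / DB_PER_BIT))


def allocate_bits(smr_db, samples_per_band, available_bits, step_db=1.0):
    """Lower the target SNR of all bands until the bit budget is met."""
    target = list(smr_db)  # initial SNR: noise exactly at the masking threshold
    while True:
        bits = [bits_for_snr(t) for t in target]
        used = sum(b * n for b, n in zip(bits, samples_per_band))
        if used <= available_bits:
            return bits, used
        target = [t - step_db for t in target]  # adjust SNR and try again


bits, used = allocate_bits([20.0, 12.0, 3.0, -5.0], [12, 12, 12, 12], 60)
print(bits, used)
```

The loop always terminates for a non-negative budget, because the targets eventually reach 0 dB and then no bits are requested.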

Page 27: Audio Coding Quantization and Coding Methods - TU Ilmenau

Constant Rate Coding (2)

• Paradigm 2: "Noise Allocation"
  – Used in codecs with entropy encoding

[Flowchart, input: #available_bits]
  1. Set equal step size for each band initially
  2. Rate loop: change global step size until #used_bits == #available_bits
  3. Coding result satisfactory?
     Yes → output quantization info / step sizes
     No → step 4
  4. Better result possible?
     Yes → adjust step size for deficient bands, back to step 2
     No → output quantization info / step sizes
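The two nested loops can be sketched as below. Several parts are assumptions made for this illustration: the bit count is a crude stand-in for Huffman coding, the rate loop stops as soon as the frame fits (≤ instead of ==), the quantizer is uniform, and all names and constants are hypothetical.

```python
def count_bits(indices):
    """Stand-in for a Huffman bit count: larger magnitudes cost more bits."""
    return sum(1 if q == 0 else 2 * abs(q).bit_length() + 1 for q in indices)


def noise_allocation(bands, thresholds, available_bits, max_outer=20):
    """bands: lists of coefficients; thresholds: allowed noise energy per band."""
    if available_bits < sum(len(b) for b in bands):
        raise ValueError("budget too small even for an all-zero frame")
    gains = [1.0] * len(bands)  # per-band refinement, equal at the start
    best = None
    for _ in range(max_outer):
        # Rate loop: coarsen the global step size until the frame fits.
        global_step = 0.01
        while True:
            steps = [global_step / g for g in gains]
            q = [[round(x / s) for x in band] for band, s in zip(bands, steps)]
            used = sum(count_bits(b) for b in q)
            if used <= available_bits:
                break
            global_step *= 1.1
        # Distortion check: which bands exceed their allowed noise?
        noise = [sum((x - i * s) ** 2 for x, i in zip(band, qb))
                 for band, qb, s in zip(bands, q, steps)]
        deficient = [k for k, (n, t) in enumerate(zip(noise, thresholds)) if n > t]
        if best is None or len(deficient) < len(best[2]):
            best = (steps, used, deficient)
        if not deficient or len(deficient) == len(bands):
            break  # satisfactory, or no better result possible
        for k in deficient:
            gains[k] *= 1.25  # finer step size for deficient bands
    return best


bands = [[4.0, -3.0, 2.5, 1.0], [0.5, -0.4, 0.3, 0.2], [0.05, 0.02, -0.03, 0.01]]
steps, used, deficient = noise_allocation(bands, [0.5, 0.01, 0.01], 40)
print(steps, used, deficient)
```

The function returns the step sizes of the attempt with the fewest bands above their threshold; whether any bands remain deficient depends on the budget.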

Page 28: Audio Coding Quantization and Coding Methods - TU Ilmenau

Constrained Variable Bit Rate Coding (Bit Reservoir)

• Constant quality vs. constant bit rate
  – Constant quality → variable bit rate (VBR)
  – Constant bit rate → variable quality
  – How to achieve both?
• Compromise:
  – Bit reservoir = constrained VBR coding
  – Limit accumulated deviation of bit consumption from average ("bit reservoir size")
  – Additional delay
  – Used in MPEG-1/2 Layer III, MPEG-2/4 AAC

[Figure: plot over frame number (frames 100 to 600), vertical axis from -300 to 700]
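The bit reservoir idea can be sketched as a running account of saved bits. This is a simplified model with hypothetical numbers and names: frames that need fewer bits than the average save the difference, frames that need more may spend the savings, and the savings are capped at the reservoir size.

```python
def run_bit_reservoir(demands, mean_bits, reservoir_size):
    """Grant each frame's bit demand as far as the reservoir allows."""
    reservoir = 0  # unused bits saved from earlier frames
    granted = []
    for demand in demands:
        budget = mean_bits + reservoir  # most this frame may spend
        # Spend at least enough that the reservoir cannot overflow.
        minimum = max(0, budget - reservoir_size)
        bits = min(max(demand, minimum), budget)
        reservoir = budget - bits
        granted.append(bits)
    return granted


demands = [60, 70, 80, 250, 120, 90]  # hypothetical per-frame bit demand
granted = run_bit_reservoir(demands, mean_bits=100, reservoir_size=150)
print(granted)
```

In this run the demanding fourth frame gets more than the average because the first three frames saved bits, while the accumulated consumption never exceeds the average rate.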

Page 29: Audio Coding Quantization and Coding Methods - TU Ilmenau

Quantization / Coding: Codec Overview

Codec                  Filterbank channels       Quantization   Coding
MPEG-1/2 Layer I/II    32                        uniform        block companding
MPEG-1/2 Layer III     576 / 192                 non-uniform    Huffman coding
MPEG-2 AAC             1024 / 128                non-uniform    Huffman coding
MPEG-4 TwinVQ          1024 / 128 (960 / 120)    VQ             VQ
AC-3                   256 / 128                 uniform        block companding
ATRAC                  96 .. 512                 uniform        block companding