Post on 02-Jun-2018
8/10/2019 Introduction to coding of audivisual contents
1/28
CAV: Fall 2014 1. Introduction
ETSETB TelecomBCN, Universitat Politècnica de Catalunya, UPC-BarcelonaTech
Coding of Audiovisual Contents
1. Introduction
Veu Processing Group, Image and Video Processing Group
Unit structure
Motivating multimedia compression
The ITU601 standard
How does compression work?
Redundancy and irrelevance
Lossless and lossy compression
Compression efficiency
Quality measures
Coding systems
Importance of standards
Motivating multimedia compression (I)
Why do we need to compress multimedia content?
Uncompressed multimedia (graphics, audio and video) data requires considerable storage capacity and transmission bandwidth.
Despite rapid progress in mass storage density, processor speeds, and digital communication system performance, demand for data storage capacity and data transmission bandwidth continues to outstrip the capabilities of available technologies.
The continuous growth of data-intensive, multimedia-based web applications has not only sustained the need for more efficient ways to encode signals and images but has made compression of such signals central to storage and communication technology.
Motivating multimedia compression (II)
Some data (2013)
More than 500 thousand million digital photos are taken each year.
3 days of content are uploaded every minute to YouTube.
600 years of YouTube content are shared through Facebook links every day.
Every second, on average, 58 photos are uploaded to Instagram, 15 photos are uploaded to Flickr, 26 photos/videos are shared on Twitter and more than 2000 photos are uploaded to Facebook (though most of them are not public).
[Figure: "What a difference 8 years makes. Saint Peter's Square in 2005 vs 2013 (#NBCPope)". Source: http://instagram.com/p/W2FCksR9-e/#]
An example: ITU601
In 1982, the CCIR defined a standard for encoding interlaced analogue video signals in digital form mainly for studio applications.
The current name of the standard is ITU-R BT.601.
The color space is YCBCR:
It enables compatibility between color and black-and-white systems.
It allows a reduction of the chroma resolution thanks to the lower sensitivity of the HVS to color information.
The R and B signals are selected to create the chroma signals (CB, CR), since G presents the largest correlation with Y.
$$
\begin{bmatrix} E_Y \\ E_{C_R} \\ E_{C_B} \end{bmatrix} =
\begin{bmatrix} 0.299 & 0.587 & 0.114 \\ 0.500 & -0.419 & -0.081 \\ -0.169 & -0.331 & 0.500 \end{bmatrix}
\begin{bmatrix} E_R \\ E_G \\ E_B \end{bmatrix}
$$
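The color transform of this slide can be sketched in a few lines of Python. This is only an illustration: it assumes the usual signs of the ITU601 matrix (negative G and B weights in E_CR, negative R and G weights in E_CB) and inputs normalized to [0, 1]; the function name `rgb_to_ycbcr` is hypothetical.

```python
def rgb_to_ycbcr(e_r, e_g, e_b):
    """Map normalized R, G, B values in [0, 1] to (E_Y, E_CB, E_CR).

    Uses the rounded ITU601 coefficients; signs are an assumption of this sketch.
    """
    e_y = 0.299 * e_r + 0.587 * e_g + 0.114 * e_b
    e_cr = 0.500 * e_r - 0.419 * e_g - 0.081 * e_b
    e_cb = -0.169 * e_r - 0.331 * e_g + 0.500 * e_b
    return e_y, e_cb, e_cr


# White carries luminance only: both chroma components vanish.
white = rgb_to_ycbcr(1.0, 1.0, 1.0)
# Pure red drives E_CR to the top of its [-0.5, 0.5] range.
red = rgb_to_ycbcr(1.0, 0.0, 0.0)
print(white, red)
```

A gray input (R = G = B) always yields zero chroma, which is what makes the representation compatible with black-and-white receivers.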
Digital video: Sampling (I)
The previous signals EY, ECB, ECR present a bandwidth of approximately 6 MHz:
Antialiasing filters have a passband up to 5.75 MHz (ITU601).
The sampling frequency is 13.5 MHz (ITU601), which is a multiple of the line frequency in the most common standards: 625 (PAL) or 525 (NTSC) lines.
Both standards have 720 samples per active line.
Video signal sampling:
An orthogonal sampling pattern is used.
Chroma signals can be sampled at lower resolution than the luminance signal.
Digital video: Sampling (II)
The most common sampling formats differ in how the chroma is sampled:
4:4:4 Format: Also used for the RGB format.
4:2:2 Format: Useful for most studio applications.
4:2:0 Format: Outside the scope of the ITU601 standard.
[Figure: sampling grids of the 4:4:4, 4:2:2 and 4:2:0 formats; legend: luminance and chroma samples]
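As a rough illustration of what the three formats cost, the sketch below counts the average number of samples per pixel implied by a J:a:b label, using the common reading of that notation (a block J pixels wide and 2 rows high, with a chroma samples in the first row and b in the second). The function name is hypothetical.

```python
def samples_per_pixel(fmt):
    """Average number of samples (Y + CB + CR) per pixel for a 'J:a:b' label."""
    j, a, b = (int(v) for v in fmt.split(":"))
    luma = 2 * j              # luma samples in a J x 2 block
    chroma = 2 * (a + b)      # (a + b) samples for each of CB and CR
    return (luma + chroma) / (2 * j)


for fmt in ("4:4:4", "4:2:2", "4:2:0"):
    # With 8-bit samples this gives the average number of bits per pixel.
    print(fmt, samples_per_pixel(fmt), 8 * samples_per_pixel(fmt))
```

Under these assumptions, 4:2:2 needs two thirds of the samples of 4:4:4, and 4:2:0 needs half.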
Digital video: Quantization (I)
The EY, ECB, ECR components do not have the same dynamic range:
Signal EY: [0, 1] volts
Signal ECB: [-0.5, 0.5] volts
Signal ECR: [-0.5, 0.5] volts
Every signal is quantized using 8 (10) bits, that is, 256 (1024) levels, so that:
Values 0 and 255 are reserved for synchronization data.
A protection interval (to allow for overshoot and undershoot) is left in the upper and lower levels of the dynamic range.
$$Y = 219\,E_Y + 16$$
$$C_R = 224\,E_{C_R} + 128 = 160\,(E_R - E_Y) + 128$$
$$C_B = 224\,E_{C_B} + 128 = 126\,(E_B - E_Y) + 128$$
The corresponding level number is the nearest integer value, and the digital equivalents are termed Y, CB and CR.
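A minimal sketch of the 8-bit level assignment, assuming the scaling Y = 219 E_Y + 16 and C = 224 E_C + 128 and rounding to the nearest integer. The helper names are hypothetical.

```python
import math


def nearest_int(value):
    """Round half up, so the result does not depend on Python's banker's rounding."""
    return math.floor(value + 0.5)


def quantize_601(e_y, e_cb, e_cr):
    """Map E_Y in [0, 1] and E_CB, E_CR in [-0.5, 0.5] to 8-bit levels."""
    y = nearest_int(219 * e_y + 16)
    cb = nearest_int(224 * e_cb + 128)
    cr = nearest_int(224 * e_cr + 128)
    return y, cb, cr


print(quantize_601(0.0, 0.0, 0.0))    # black: lowest luma level, neutral chroma
print(quantize_601(1.0, 0.5, -0.5))   # extremes of each dynamic range
```

The extremes map to 16..235 for luma and 16..240 for chroma, which leaves the protection interval and keeps levels 0 and 255 unused.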
Bit rates (II)
The most common version is 4:2:2 at 10 bits, leading to 270 Mbit/s:
Standardized as SMPTE 259M.
Lower bit rates can be achieved by removing all irrelevant information (without blanking intervals: buffered video):
Video is sampled at 13.5 MHz.
It is stored in a memory.
Only active lines are read, at minimum speed.
For instance, in 4:2:2 format (625 lines, 50 Hz), there are 720 samples per line, 576 active lines and 25 frames per second. As every sample is quantized with 8 bits, the final bit rate is:
$$720 \times 576 \times 25 \times 8 \times \left(1 + \tfrac{1}{2} + \tfrac{1}{2}\right) = 165.89\ \text{Mbit/s}$$
(For 525 lines, 60 Hz and 10 bits: $720 \times 480 \times 30 \times 10 \times \left(1 + \tfrac{1}{2} + \tfrac{1}{2}\right) = 207.36$ Mbit/s.)
Although the bit rate is reduced, the figures are still very high: suitable only for studio applications. Broadcasting requires compression systems.
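The bit-rate arithmetic of this slide can be checked with a short script. It is a sketch under the slide's own figures (720 samples per active line, 4:2:2 chroma at half the luma rate); the function name and parameter layout are hypothetical.

```python
def active_bit_rate(samples_per_line, active_lines, frames_per_s, bits,
                    chroma_factors=(0.5, 0.5)):
    """Bit rate (bit/s) of the active picture only, without blanking intervals."""
    luma = samples_per_line * active_lines * frames_per_s * bits
    return luma * (1 + sum(chroma_factors))


# Full 4:2:2 stream at 10 bits, including blanking: 13.5 MHz luma plus two 6.75 MHz chroma.
full_rate = 13.5e6 * 10 * (1 + 0.5 + 0.5)
rate_625 = active_bit_rate(720, 576, 25, 8)    # 625 lines, 50 Hz, 8 bits
rate_525 = active_bit_rate(720, 480, 30, 10)   # 525 lines, 60 Hz, 10 bits
print(full_rate / 1e6, rate_625 / 1e6, rate_525 / 1e6)
```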
How does compression work?
Two fundamental components of information compression are redundancy and irrelevance reduction.
Redundancy reduction aims at removing duplication from the signal source (audio / image / video / text). In general, three types of redundancy can be identified:
Temporal redundancy, or correlation between adjacent samples in audio or adjacent frames in a sequence of images (in video applications).
Spatial redundancy, or correlation between neighboring samples (pixels) in an image.
Spectral redundancy, or correlation between different spectral bands or color components.
Redundancy / Relevancy
Two fundamental components of information compression are redundancy and irrelevance reduction.
Irrelevance reduction omits parts of the signal that will not be noticed by the signal receiver:
The human auditory system (HAS), or
The human visual system (HVS).
Current coding systems exploit both redundancy and irrelevance reduction, introducing losses in the decoded signal.
Temporal Redundancy: Video
In video signals, there is usually a high temporal correlation (similarity) between frames that are captured at around the same time, especially if the temporal sampling rate (the frame rate) is high.
[Figure: Table tennis sequence, frames #40, #41 and #42]
Temporal Redundancy: Speech
In speech signals, there is usually a high temporal correlation (similarity) between consecutive (or close) samples that can be appreciated in the signal itself, its (estimated) autocorrelation or its (estimated) spectral density.
[Figure panels: speech signal (voiced and unvoiced parts); autocorrelation of the voiced signal; spectral density of the voiced signal]
Temporal Redundancy: Audio
This temporal redundancy can be clearly observed as well when comparing consecutive samples in an audio signal.
Strong correlation found in voiced sounds of speech and many sounds from musical instruments
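To make the notion of temporal redundancy concrete, the following sketch compares the correlation between consecutive samples of a synthetic tone and of white noise. The 440 Hz tone, the 8 kHz sampling rate and the fixed random seed are assumptions chosen for illustration; they stand in for a real recording.

```python
import math
import random


def lag1_correlation(x):
    """Pearson correlation between each sample and the next one."""
    a, b = x[:-1], x[1:]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


fs = 8000                                    # sampling rate in Hz (assumed)
tone = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]
rng = random.Random(0)                       # fixed seed for repeatability
noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]

print(lag1_correlation(tone))    # close to 1: consecutive samples are very similar
print(lag1_correlation(noise))   # close to 0: no temporal redundancy to exploit
```

A predictor can exploit the first kind of signal; for the second, knowing the previous sample says almost nothing about the next one.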
Spatial Redundancy: Image
In still images, there is usually a high spatial correlation between pixels (samples) that are close to each other, i.e. the values of neighboring samples are often very similar.
[Figure: gray-level image (8 bpp) and joint histogram of two horizontally adjacent pixels; axes: amplitude of current pixel, amplitude of adjacent pixel]
Inter channel redundancy: Color Image
In images, there is usually a high correlation in the spectral domain between color image components
Statistical dependence between R,G,B is stronger
Inter channel redundancy: Stereo audio signals
For some stereo audio signals , there is a strong correlation between channels
Example of recorded sounds coming from the front (as in live concerts)
Frequency redundancy: Image
In images, there is usually a high correlation between the various frequency bands of the signal
[Figure panels: first level, second level, third level]
Irrelevance
For audio signals as well as image and video data to be consumed by humans, there is no need to represent more than the information that can be perceived in ..., loudness, brightness, ...
The required resolution might depend on the audio / image content (masking).
Irrelevancy can be application dependent:
In an image, only a specific area might be relevant for some task, e.g. in medical imaging.
Exploiting limitations of HAS (I)
The human ear can only hear sounds louder than a frequency-dependent threshold (20 Hz to 20 kHz).
Masking: some signal components are irrelevant to human perception.
Depends on spectral characteristics, temporal characteristics and intensity.
It can occur simultaneously with the masking signal (frequency masking).
Loud sounds mask soft ones when the louder sound happens at the same time.
Exploiting limitations of HAS (II)
The human ear can only hear sounds louder than a frequency-dependent threshold (20 Hz to 20 kHz).
Masking: some signal components are irrelevant to human perception.
Depends on intensity and on spectral as well as temporal characteristics.
It can occur before or after the masking signal (temporal masking).
The auditory system needs a longer integration time for quieter sounds.
Signals which are not audible do not need to be transmitted.
Perceptually lossless: assign the number of bits so that the quantization noise is not audible.
Exploiting limitations of HVS
The human visual system has much lower acuity for color hue and saturation than for brightness.
Use a color transform to facilitate exploiting that property:
$$
\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} =
\begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.169 & -0.331 & 0.500 \\ 0.500 & -0.419 & -0.081 \end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
$$
Cb and Cr are often subsampled 2:1 relative to Y.
[Figure: RGB components; luminance component (Y) and chrominance components (Cb, Cr)]
Lossless and lossy compression
Lossless compression
The reconstructed data at the output of the decoder is a perfect copy of the original data.
Only statistical redundancy is exploited.
Lossless compression of audio, image and video gives only a moderate amount of compression.
Useful for file compression (rar, zip, arj), medical or satellite images.
Lossy compression
The decompressed data is not identical to the source data.
Both statistical and visual/psychoacoustic redundancy are exploited.
Much higher compression ratios can be achieved at the expense of a loss of visual/audio quality.
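The defining property of lossless compression, a bit-exact reconstruction, can be illustrated with Python's standard `zlib` module (a general-purpose DEFLATE coder, used here only as a stand-in for the file compressors mentioned above). The synthetic "image" with constant rows is an assumption chosen to be highly redundant.

```python
import zlib

# Synthetic 256x256 8-bit image: every row has a single gray level.
data = bytes(row for row in range(256) for _ in range(256))

compressed = zlib.compress(data, 9)      # exploits statistical redundancy only
restored = zlib.decompress(compressed)

print(len(data), len(compressed))
print(restored == data)                  # lossless: perfect copy of the original
```

Natural images are far less redundant than this toy example, which is why lossless coding of real audiovisual material gives only moderate gains.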
Performance
How to evaluate the performance of a compression algorithm?
Performance can be based on several factors:
Algorithm complexity
Memory requirements
Run time
Robustness
Compression efficiency
Quality measures: objective measures and subjective measures
Efficiency (Measures of compression)
If b and b' are the number of binary digits (or information-carrying units) used in two representations of the same information:
Compression ratio:
$$C = \frac{b}{b'}$$
Example: grayscale image of size 256x256 at 1 byte per pixel. If the compressed image occupies 16384 bytes, the compression ratio is 4:
$$C = \frac{256 \times 256 \times 8}{16384 \times 8} = \frac{65536}{16384} = 4$$
Compressed bit rate: number of binary digits (bits) used to represent each sample, in bpp for images, or number of bits used per second, in b/s for audio/video.
Example: for the previous image, the compressed bit rate is 2 bpp (bits per pixel):
$$\frac{16384 \times 8}{256 \times 256} = \frac{131072}{65536} = 2\ \text{bpp}$$
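The two efficiency figures of the example can be reproduced with the sketch below; the function names are hypothetical.

```python
def compression_ratio(bits_original, bits_compressed):
    """C = b / b': how many times smaller the compressed representation is."""
    return bits_original / bits_compressed


def bits_per_sample(bits_compressed, n_samples):
    """Compressed bit rate per sample (bpp when the samples are pixels)."""
    return bits_compressed / n_samples


width, height = 256, 256
original_bits = width * height * 8       # 1 byte per pixel
compressed_bits = 16384 * 8              # 16384 bytes after compression

ratio = compression_ratio(original_bits, compressed_bits)
bpp = bits_per_sample(compressed_bits, width * height)
print(ratio, bpp)
```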
Typical compressed bit rates (I)
Audio (MP3)
32 Kb/s: MW (AM) quality
96 Kb/s: FM quality
128-160 Kb/s: Standard bitrate quality
192 Kb/s: DAB (Digital Audio Broadcasting) quality
224-320 Kb/s: Near CD quality
Other audio formats
800 b/s: Minimum necessary for recognizable speech
8 Kb/s: Telephone quality
0.5-1 Mb/s: Lossless audio
1.411 Mb/s: PCM sound format of Compact Disc Digital Audio
Typical compressed bit rates (II)
Color image
Lossless: 3 bpp
Lossy quality: usable: 0.25 bpp; moderate: 0.5 bpp; high: 1 bpp
Video (MPEG2)
16 Kb/s: Videophone quality (minimum necessary for a consumer-acceptable "talking head" picture)
128-384 Kb/s: Business-oriented videoconferencing system quality
1.25 Mb/s: VCD quality
5 Mb/s: DVD quality
15 Mb/s: HDTV quality
36 Mb/s: HD DVD quality
54 Mb/s: Blu-ray Disc quality
Quality measures: Speech (I)
Speech coders are evaluated in terms of bit rate, complexity, delay, robustness and output quality:
Objective measures exist for computing bit rate, complexity and delay performance.
For quality or robustness only subjective measures can be applied.
Mean Opinion Score (MOS):
The MOS is generated by averaging the results of a set of standard, subjective tests in which a number of listeners rate the perceived audio quality of test sentences read aloud by both male and female speakers over the communications medium being tested.
A listener is required to give each sentence a rating.
The whole evaluation process is very expensive.
Quality measures: Speech (II)
Mean Opinion Score (MOS): The rating scheme is as follows:
MOS  Quality    Impairment
5    Excellent  Imperceptible
4    Good       Perceptible but not annoying
3    Fair       Slightly annoying
2    Poor       Annoying
1    Bad        Very annoying
A common value for usual standards is around 4.0. See http://en.wikipedia.org/wiki/Mean_opinion_score for typical values.
Quality measures: Audio (I)
Audio coders are evaluated in terms of bit rate, complexity, delay, robustness and output quality:
Objective measures exist for computing bit rate, complexity and delay performance.
For quality or robustness only subjective measures can be applied.
In current perceptual audio coding, decoded waveforms can be very different from the originals, yet still sound quite similar.
Objective measures
Classical objective measures, such as Signal-to-Noise Ratio (SNR) or Total Harmonic Distortion (THD), are inadequate.
Segmental SNR may be used for some classes of coders.
Quality measures: Audio (II)
Subjective measures
Subjective listening tests are the most reliable tool for assessing quality (..., ITU-R BS.1116, ITU-R BS.1387, ...).
Listening tests are influenced by:
Playback level and background noise
Loudspeaker and headphone distortions
Listening room acoustics
Listeners' expertise
Quality measures: Image and Video (I)
Objective measures
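In the formulas below, $X_i$ are the original samples, $\hat{X}_i$ the reconstructed ones and $N$ the number of samples.
Mean Square Error (MSE)
$$MSE = \frac{1}{N} \sum_{i=0}^{N-1} \left( X_i - \hat{X}_i \right)^2$$
Mean Absolute Difference (MAD)
$$MAD = \frac{1}{N} \sum_{i=0}^{N-1} \left| X_i - \hat{X}_i \right|$$
Max Error
$$MaxError = \max_i \left| X_i - \hat{X}_i \right|$$
Peak Signal-to-Noise Ratio (PSNR)
PSNR is useful to compare images with different dynamic ranges:
$$PSNR = 10 \log_{10} \frac{M^2}{MSE}$$
M = maximum peak-to-peak value of the representation.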
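The four objective measures can be written directly from their definitions. The sketch below works on flat lists of samples and assumes an 8-bit representation (M = 255) by default; the function names and the toy data are illustrative only.

```python
import math


def mse(original, decoded):
    """Mean square error between two equally long sample sequences."""
    return sum((x - y) ** 2 for x, y in zip(original, decoded)) / len(original)


def mad(original, decoded):
    """Mean absolute difference."""
    return sum(abs(x - y) for x, y in zip(original, decoded)) / len(original)


def max_error(original, decoded):
    """Largest absolute sample error."""
    return max(abs(x - y) for x, y in zip(original, decoded))


def psnr(original, decoded, peak=255):
    """PSNR in dB; 'peak' is the maximum peak-to-peak value M (255 for 8 bits)."""
    error = mse(original, decoded)
    return math.inf if error == 0 else 10 * math.log10(peak ** 2 / error)


original = [10, 20, 30, 40]
decoded = [12, 18, 30, 44]
print(mse(original, decoded), mad(original, decoded),
      max_error(original, decoded), psnr(original, decoded))
```

Identical signals give an MSE of zero, for which the PSNR is reported here as infinity.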
Quality measures: Image and Video (II)
Subjective measures
Group of observers
ITU-R Standard BT.500
Time consuming, expensive, off-line
Model the fundamental properties of the Human Visual System
Several attempts, no universal solution yet
Lower levels of processing (eye imaging system, retina sampling, spatial masking, etc.) can be modeled, but not the higher levels in the brain
Commercial systems, like PQA500 (Tektronix)
Quality measures: Image and Video (III)
Example of subjective quality measure : Absolute rating scale of the Television Allocation Study Organization
R. Gonzalez and R. Woods, Digital Image Processing, Prentice-Hall, 2008.
Quality measures: Subjective vs. Objective (I)
[Figure: four versions of the same image]
Original
Compressed and reconstructed, very good quality: MSE = 5.17
Compressed and reconstructed, noticeable degradation: MSE = 15.67
Missing visual information and with artifacts: MSE = 14.17
Quality measures: Subjective vs. Objective (II)
Researchers are continuously looking for objective measures that better reflect the behavior of the HVS.
For instance, the Structural Similarity Index (SSIM).
[Figure labels: reference image, hypersphere, best SSIM, worst SSIM]
Z. Wang and A.C. Bovik, "Mean Square Error: Love it or Leave it?", IEEE Signal Processing Magazine, pp. 98-117, January 2009.
A typical coding system (I)
[Block diagram: x -> Source Encoder -> Channel Encoder -> Channel -> Channel Decoder -> Source Decoder -> x̂. The two encoders form the Encoder; the two decoders form the Decoder.]
Two different pairs of blocks: source encoder/decoder and channel encoder/decoder.
The source encoder converts the source output to a binary sequence and the channel encoder processes the binary sequence for transmission over the channel. The channel decoder recreates the incoming binary sequence (hopefully reliably) and the source decoder recreates the source input.
A typical coding system (II)
[Block diagram: x -> Source Encoder -> Channel Encoder -> Channel -> Channel Decoder -> Source Decoder -> x̂]
The main objective of the source encoder is to reduce the amount of data (average number of bits) necessary to represent the information in the signal. This data compression is mainly carried out by exploiting the redundancy and irrelevancy in the signal information.
The main objective of the channel encoder is to add redundancy to the output of the source encoder to enhance the reliability of the transmission.
Typical source codec
[Block diagram]
Source Encoder: x -> Perceptual Analysis -> Transform or Prediction -> Quantization -> Entropy Encoder
Then: Channel Encoder -> Channel -> Channel Decoder
Source Decoder: Entropy Decoder -> Inverse Quantization -> Inverse Transform or Prediction -> x̂
Side information: Transform or Prediction Model, Quantization Table, Coding Table
Source encoder (I)
Perceptual Analysis
[Block diagram of the source encoder: Perceptual Analysis, Transform or Prediction (T/P selection), Quantization (quantization table), Entropy Encoder (coding table)]
This block applies the knowledge about human perception to the signal and systems to be used.
Non-perceptible parts of the signal should not be considered.
Depending on the signal to be coded, better models can be used: in the audio case, the impact of the perceptual analysis on the final coding system is paramount.
In some cases, the knowledge about human perception cannot be fully exploited in the coding process.
Typically, the result of the perceptual analysis affects the kind of signal model as well as the quantization of the parameters of such a model.
Source encoder (II)
Transform or Prediction
This block takes the correlated set of samples describing the signal and modifies the signal representation into a format designed to reduce its redundancy.
Data in the new format has to present a suitable structure in order to be useful for coding purposes.
The operation is generally reversible, that is, the transform step does not introduce losses in the description.
... represent the signal.
Coding algorithms can be grouped into two main classes: techniques that modify the original samples relying on a prediction model (e.g. block matching) and techniques that represent the signal in a different domain and code the transformed coefficients (e.g. DCT).
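As a small illustration of the prediction class, the sketch below uses the simplest possible predictor, the previous sample (a basic DPCM scheme, chosen here for brevity and not taken from the slides). It shows the two properties mentioned above: the step is reversible, and the output is much less spread out than the input. The slowly varying test signal is an assumption.

```python
import math
from statistics import pvariance


def dpcm_encode(samples):
    """Replace each sample by its difference from the previous one (first prediction is 0)."""
    residual, previous = [], 0
    for s in samples:
        residual.append(s - previous)
        previous = s
    return residual


def dpcm_decode(residual):
    """Invert dpcm_encode by accumulating the prediction errors."""
    samples, previous = [], 0
    for r in residual:
        previous += r
        samples.append(previous)
    return samples


# Slowly varying integer signal: neighbouring samples are strongly correlated.
signal = [round(100 + 50 * math.sin(2 * math.pi * n / 64)) for n in range(256)]
residual = dpcm_encode(signal)

print(dpcm_decode(residual) == signal)                 # reversible: no loss introduced
print(pvariance(signal), pvariance(residual[1:]))      # prediction errors are much smaller
```

The first residual is excluded from the variance because it carries the initial sample value rather than a prediction error.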
Source encoder (III)
Quantization
This block reduces the accuracy of the transform or prediction output in accordance with a pre-established fidelity criterion.
The goal is to keep irrelevant information out of the compressed representation.
The quantization represents a range of values of a given transformed sample by a single value in the range. The set of output values are the quantization indices. This step is responsible for the losses in the system and, therefore, determines the achieved compression (here we are assuming scalar quantization; grouping a set of transformed samples into a vector and performing vector quantization is possible as well).
This operation is irreversible, so it must be omitted when error-free (lossless) compression is desired.
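A minimal sketch of scalar quantization, assuming a uniform quantizer with a fixed step size (one of several possible designs): every value within half a step of a reconstruction level is mapped to the same index, so the original value cannot be recovered exactly. The function names and sample values are illustrative.

```python
import math


def quantize(value, step):
    """Return the quantization index: the nearest multiple of 'step'."""
    return math.floor(value / step + 0.5)


def dequantize(index, step):
    """Reconstruction value associated with an index."""
    return index * step


step = 2.0
values = [-3.2, -0.4, 0.9, 7.3]
indices = [quantize(v, step) for v in values]
reconstructed = [dequantize(i, step) for i in indices]

print(indices)
print(reconstructed)
# The error never exceeds half a step, but it cannot be undone.
print(max(abs(v - r) for v, r in zip(values, reconstructed)))
```

A larger step gives fewer distinct indices (fewer bits) and a larger error, which is the trade-off that determines the achieved compression.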
Source encoder (IV)
Entropy encoder
This block produces the final bit stream.
The entropy encoder generates a fixed or variable length code to represent the quantization output and maps the output in accordance with the code.
In many cases a variable length code is used: the shortest code words are assigned to the most frequently occurring quantization output values, thus minimizing coding redundancy.
This operation is reversible.
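One classical way to build such a variable length code is Huffman coding, used below purely as an example (the slide does not name a specific method). The sketch builds a code table from symbol counts and compares the average code length with the source entropy; function names and the toy symbol sequence are assumptions.

```python
import heapq
import math
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code {symbol: bit string} from a sequence of symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:                      # degenerate source: one symbol only
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, partial code table).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, table0 = heapq.heappop(heap)   # two least frequent subtrees
        w1, _, table1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in table0.items()}
        merged.update({s: "1" + c for s, c in table1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def entropy_bits(symbols):
    """Empirical entropy in bits per symbol."""
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in Counter(symbols).values())


indices = list("aaaabbcd")                    # stand-in for quantization indices
code = huffman_code(indices)
average_length = sum(len(code[s]) for s in indices) / len(indices)
print(code)
print(average_length, entropy_bits(indices))
```

The most frequent symbol gets the shortest code word. For this particular source the probabilities are powers of 1/2, so the average length equals the entropy; in general it is slightly above it.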
Complete system assessment
Rate/Distortion Analysis (I)
The way to analyze the performance of a coding system is to evaluate the quality of the decoded signal when varying the encoding rate.
In the case of audio signals, since good models for the HAS exist, the concept of quality refers to perceived audio quality.
In the case of image and video signals, the concept of quality commonly refers to a measure of distortion in the decoded signal (for instance, MSE or PSNR).
[Figure: rate/distortion curve, distortion D versus rate R]
Complete system assessment
Rate/Distortion Analysis (II)
A Rate/Distortion analysis requires the implementation of the complete encoding system, which sometimes is not available.
The various parts of the system are commonly assessed using other (local) measures:
Transform / Prediction: decrease in correlation of the output samples.
Quantization: decrease in entropy of the output symbols.
Those are suboptimal measures, and usually the global analysis is necessary since the different blocks can be closely related:
For instance, entropy coders that are closely adapted to a given source.
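A toy rate/distortion sweep can be sketched as follows. It is not a complete codec: the "rate" is approximated by the empirical entropy of the quantization indices (what an ideal entropy coder would approach) and the "distortion" is the MSE after uniform quantization. The Gaussian test source, the seed and the list of step sizes are assumptions.

```python
import math
import random
from collections import Counter


def quantize(value, step):
    """Uniform scalar quantizer: index of the nearest multiple of 'step'."""
    return math.floor(value / step + 0.5)


def entropy_bits(symbols):
    """Empirical entropy in bits per symbol."""
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in Counter(symbols).values())


def rd_point(samples, step):
    """Return (rate in bits/sample, distortion as MSE) for one step size."""
    indices = [quantize(x, step) for x in samples]
    decoded = [i * step for i in indices]
    rate = entropy_bits(indices)
    distortion = sum((x - y) ** 2 for x, y in zip(samples, decoded)) / len(samples)
    return rate, distortion


rng = random.Random(0)                               # fixed seed for repeatability
samples = [rng.gauss(0.0, 1.0) for _ in range(5000)]
curve = [(step, *rd_point(samples, step)) for step in (0.25, 0.5, 1.0, 2.0, 4.0)]

for step, rate, distortion in curve:
    print(f"step={step:<5} R={rate:.2f} bits/sample  D={distortion:.4f}")
```

Coarser quantization moves along the curve towards lower rate and higher distortion; comparing two coding systems amounts to comparing such curves.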
Source decoder
The decoder contains three blocks that perform, in reverse order, the inverse operations of the encoder: entropy decoder, inverse quantization and inverse transform (or prediction) block.
[Block diagram of the source decoder: Entropy Decoder -> Inverse Quantization -> Inverse Transform or Prediction]
Channel encoder and decoder
The channel encoder and decoder play an important role when the channel is noisy or prone to error.
They are designed to reduce the impact of channel noise by inserting a controlled form of redundancy into the source encoded data.
As the output of the source encoder contains little redundancy, it would be highly sensitive to transmission noise without the addition of this controlled redundancy.
[Block diagram: x -> Source Encoder -> Channel Encoder -> Channel -> Channel Decoder -> Source Decoder -> x̂]
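The idea of controlled redundancy can be illustrated with the simplest channel code, a 3-fold repetition code with majority decoding. This code is chosen only because it fits in a few lines; practical systems use far more efficient codes, and the function names are hypothetical.

```python
def encode_repetition(bits, n=3):
    """Channel encoder: transmit every bit n times."""
    return [b for b in bits for _ in range(n)]


def decode_repetition(coded, n=3):
    """Channel decoder: majority vote inside each block of n received bits."""
    blocks = [coded[i:i + n] for i in range(0, len(coded), n)]
    return [1 if sum(block) > n // 2 else 0 for block in blocks]


message = [1, 0, 1, 1]
sent = encode_repetition(message)

# Simulate a noisy channel that flips one bit in each of the first three blocks.
received = list(sent)
for position in (0, 4, 8):
    received[position] ^= 1

print(sent)
print(received)
print(decode_repetition(received) == message)   # single errors per block are corrected
```

The price is a threefold increase in transmitted bits, the opposite of what the source encoder does, which is why the two stages are designed with different goals.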
Standardization and Audiovisual Coding Standards
What do standards standardize?
To correctly decode the bitstream produced by the coding system, the encoder has to share some information with the decoder:
How the different codewords have been stored in the bitstream (syntax definition), and
Which actions are associated with their different decoded values (decoder definition).
What is left for commercial competition?
Manufacturers still have a margin for innovation and competition:
Codec performance: parameters, coding tables, actual implementation, ...
Added functionalities: interface, adaptation, complexity, quality, price, ...
Importance of standardization
Standardization allows interconnection of equipment from different manufacturers.
Standards should arrive early in the market, to avoid de facto standards.
Standards ISO/MPEG
ISO/IEC MPEG has published audiovisual standards in the last 10 years to foster the development of the digital environment.
MPEG-1 (1992, ISO/IEC 11172): Coding of video and associated audio for Digital Storage Media (DSM) at up to 1.5 Mbit/s in ...
MPEG-2 (1994, ISO/IEC 13818; 1996 Emmy for technical excellence): Generic video and audio coding for TV broadcasting of Standard Definition (SDTV) and High Definition (HDTV) signals.
MPEG-4 (1998, ISO/IEC 14496; 2003 JVT (ISO/ITU-T) AVC): Coding of Audio Visual Objects (AVO) oriented to manipulation (e.g. use of a compositor in the decoder).
MPEG-7 (2001, ISO/IEC 15938): Interface for the description of multimedia content, in order to facilitate search, retrieval and access of audiovisual contents.
MPEG-21 (2003, ISO/IEC 21000): Structure for the development of multimedia, management of the adaptation of content to different media and devices (repurposing) and digital rights management for content creation (applications: collections, multimedia libraries...).
Course contents
[Block diagram of the complete codec]
Source Encoder: x -> Perceptual Analysis -> Transform or Prediction -> Quantization -> Entropy Encoder
Then: Channel Encoder -> Channel -> Channel Decoder
Source Decoder: Entropy Decoder -> Inverse Quantization -> Inverse Transform or Prediction -> x̂
Side information: Transform or Prediction Model, Quantization Table, Coding Table
Unit 2: Entropy Coding
Unit 3: Speech Coding
Unit 4: Audio Coding
Unit 5: Image Coding
Unit 6: Video Coding
Communication courses: channel encoder and decoder