Post on 19-Jul-2020
www.site.uottawa.ca/~elsaddik
www.el-saddik.com
04_Compression © elsaddik
Multimedia Communications
Multimedia Technologies & Applications
Prof. Dr. Abdulmotaleb El Saddik
Multimedia Communications Research Laboratory
School of Information Technology and Engineering
University of Ottawa
Ottawa, Ontario, Canada
elsaddik @ site.uottawa.ca
abed @ mcrlab.uottawa.ca
Content
1. Motivation
2. Requirements - General
3. Fundamentals - Categories
4. Source Coding
5. Entropy Coding
6. Hybrid Coding: Basic Encoding Steps
7. JPEG
8. H.261 and related ITU Standards
9. MPEG-1
10. MPEG-2
11. MPEG-4
12. Wavelets
13. Fractal Image Compression
14. Basic Audio and Speech Coding Schemes
15. Conclusion
Video Compression
In video streams, there are two types of redundancy that can be exploited:
- Spatial redundancy
- Temporal redundancy
Recall that spatial redundancy is what JPEG and other still-image algorithms use.
There are two groups of video compression products:
- Based purely on spatial redundancy
- Based on both spatial and temporal redundancy
Spatial-Redundancy-Only Video Compression
- Called Motion JPEG
- Compresses each frame individually, without reference to any other frame in the sequence
  - Thus does not exploit inter-frame redundancy
- Audio is not supported in an integrated fashion
- Motion JPEG hardware (chips, boards) for near-real-time compression/decompression is available, but storage on and retrieval from a hard disk still takes a second or more
  - High-quality video requires fast SCSI disks or caching of short video sequences in large memory buffers
JPEG for full-motion video
- Advantages:
  - Loss of a frame does not affect other frames
  - Lower encoding complexity and delay
  - Easier editing
- Disadvantages:
  - Network-based JPEG applications are unlikely, since it is bandwidth-intensive
    - Typical rate for studio-quality TV: 10 to 20 Mbps
In short, lower data rates (that is, higher compression ratios) are needed.
Spatial and temporal redundancy video compression – MPEG
We have seen with JPEG how spatial redundancy can be exploited. In addition to spatial redundancy, MPEG exploits the fact that successive frames in a sequence are similar to each other. This is known as temporal redundancy.
A few definitions are required here:
- Macroblock
  - A 16x16 pixel block, composed of four 8x8 luminance blocks and two colour-difference blocks
- Motion vector
  - Indicates the spatial translation of a macroblock between two frames
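To make the idea of a motion vector concrete, here is a minimal sketch of exhaustive block matching. MPEG does not prescribe how motion vectors are found, so the sum-of-absolute-differences cost, the search range and the function names below are illustrative assumptions, not part of the standard. The vector is expressed as the offset from the block in the current frame to its best match in the reference frame.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def find_motion_vector(reference, current, x, y, size=16, search=7):
    """Exhaustively search `reference` for the block that best matches the
    block at (x, y) in `current`; returns ((dx, dy), cost)."""
    height, width = len(reference), len(reference[0])
    best = (0, 0)
    best_cost = sad(current, reference, x, y, x, y, size)  # zero displacement
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = x + dx, y + dy
            # Skip candidate blocks that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(current, reference, x, y, rx, ry, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost
```

Frames are plain lists of rows of luminance values here. Real encoders typically use faster, non-exhaustive search strategies.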
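Macroblocks
[Figure: a macroblock consists of four luminance (Y) blocks, numbered 0 to 3, one C_B block (4) and one C_R block (5)]
$$Y = 0.2125\,r + 0.7154\,g + 0.0721\,b$$
$$c_b = b - Y, \qquad c_r = r - Y$$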
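The colour conversion on this slide (Y = 0.2125 r + 0.7154 g + 0.0721 b, c_b = b − Y, c_r = r − Y) can be sketched in a few lines. The function name and the assumption that r, g, b are floats in the range 0 to 1 are illustrative. Practical codecs additionally scale and offset the colour-difference signals, which is not shown here.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB sample to luminance and two colour-difference values,
    using the coefficients given on the slide."""
    y = 0.2125 * r + 0.7154 * g + 0.0721 * b
    cb = b - y  # blue colour difference
    cr = r - y  # red colour difference
    return y, cb, cr
```

For any grey value (r = g = b) both colour differences come out as zero, because the three luminance weights sum to 1.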
Macroblocks
Motion Vectors: Basis image
Motion Vectors: 2nd Image with motion
Motion Vectors: Difference without motion compensation
Motion Vectors: Difference with motion compensation
Motion estimation for different frames
[Figure: a B-frame between an I-frame and a P-frame; some of its areas are available from the earlier frame (I), others from the later frame (P)]
MPEG
- Moving Picture Experts Group (MPEG)
  - ISO/IEC working group(s)
  - ISO/IEC JTC1/SC29/WG11
  - ISO IS 11172 since 3/93
- Coding of combined:
  - video and audio information
- Starting point: MPEG-1
  - Audio/video at about 1.5 Mbit/s
  - Based on experience with JPEG and H.261
- Follow-up standards
  - MPEG-2
  - MPEG-4
  - MPEG-7
  - MPEG-21
MPEG
- MPEG
  - allows coding comparison across multiple frames and can therefore yield compression ratios of 50:1 to 200:1
  - MPEG chips
    - provide VHS quality at 1.2 to 1.5 Mbps and 200:1
    - can also give 50:1 and broadcast video quality at 6 Mbps
- The algorithm is asymmetrical:
  - compression is more complex than decompression
MPEG - Video: Processing Step
Four types of frames:
- I-frames (intra-coded frames):
  - Real-time demands on decoding, and sometimes on encoding too
  - Compression of I-frames is the lowest in MPEG
  - I-frames are the points for random access in MPEG streams
  - Coded and decoded like JPEG
  - Structured in 8x8 blocks, within macroblocks of 16x16, that are DCT coded, quantized and entropy coded
MPEG - Video: Processing Step
- P-frames (predictive-coded frames):
  - Require about 1/3 of the data of I-frames
  - Reference previous I- or P-frames
  - A motion vector is calculated
    - MPEG does not define how to determine the motion vector
    - The difference between similar macroblocks is DCT coded
  - DC and AC coefficients are run-length coded
- B-frames (bi-directionally predictive-coded frames):
  - Reference previous and subsequent (I or P) frames
  - One or two motion vectors are encoded
  - Interpolation between matching macroblocks is allowed (both directions)
- D-frames (DC-coded frames):
  - Only the DC coefficients are DCT coded
  - For fast forward and rewind
MPEG Video-frame sequence
- I frame: Intra frame
- P frame: Predicted frame
- B frame: Bidirectionally interpolated frame
Display order:
Type:   I  B  B  P  B  B  P  B  B  I
Number: 1  2  3  4  5  6  7  8  9  10
The MPEG coded sequence is transmitted in a different order:
Type:   I  P  B  B  P  B  B  I  B  B
Number: 1  4  2  3  7  5  6  10 8  9
- The sequence is defined by the application
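The reordering on this slide follows from the fact that a B-frame can only be decoded once both of its reference frames have arrived. A minimal sketch, assuming each B-frame references the nearest I- or P-frame on either side (the function name is illustrative): hold B-frames back until the next I- or P-frame has been sent.

```python
def transmission_order(display_types):
    """Reorder frames from display order to transmission order.

    `display_types` is a string or list such as "IBBPBBPBBI".
    Returns a list of (display_number, frame_type) tuples.
    """
    order = []
    pending_b = []  # B-frames waiting for their later reference frame
    for number, kind in enumerate(display_types, start=1):
        if kind == "B":
            pending_b.append((number, kind))
        else:
            # An I- or P-frame is sent first, then the B-frames that need it.
            order.append((number, kind))
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # trailing B-frames without a later reference
    return order


for number, kind in transmission_order("IBBPBBPBBI"):
    print(number, kind)
```

Applied to the display sequence I B B P B B P B B I, this yields the frame numbers 1 4 2 3 7 5 6 10 8 9, the same order as shown on the slide.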
MPEG in a Nutshell
- I-frames are self-contained but less compressed than P- and B-frames.
- B-frames are the most compressed frames.
Typical sequences of frames are:
- I BBB P BBB I…
- I BB P BB P BB I…
- I BB P BB P BB P BB I...
MPEG Video-Coding Procedure
[Figure, I frame: Video in → Colour space converter (RGB->YUV) → FDCT → Quantization → Entropy encoder → Compressed data]
[Figure, P / B frame: Video in → Colour space converter (RGB->YUV) → difference against the reference frame selected by the motion estimator → error terms → FDCT → Entropy encoder → Compressed data]
MPEG Encoder: One possible implementation
[Figure: encoder block diagram. Blocks: Frame recorder, DCT, Quantize, Variable-length coder, Transmit buffer, Prediction encoder, De-quantize, Inverse DCT, Motion predictor, Reference frame, Rate controller. Signals: IN, OUT, Scale factor, Buffer fullness, Prediction, Motion vectors, DC]
MPEG- Audio Coding
- Sampling compatible with the encoding of CD-DA and DAT:
  - Sampling rates:
    - 32 kHz, 44.1 kHz, 48 kHz
  - Sampling precision:
    - 16 bit/sample
- Audio channels:
  - Mono (single, 1 channel)
  - Stereo (2 channels)
    - dual channel mode (independent, e.g., bilingual)
    - optional: joint stereo (exploits redundancy and irrelevancy)
MPEG Audio
- Application example: DAB (Digital Audio Broadcasting)
- Uses MPEG Layer 2 (compression also known as "MUSICAM")
  - Masking pattern adapted Universal Subband Integrated Coding And Multiplexing
- Delays for VLSI implementation:
  - max. 30 ms encoding
  - max. 10 ms decoding
- Software codec delays vary for different layers, implementations and computers (a rule of thumb may be 50/100/150 ms for Layer 1/2/3, which makes MP3 rather inappropriate for real-time conversation)
MPEG-Audio Coding
- An FFT is applied to the audio, and the spectrum is split into 32 non-interleaved sub-bands
  - for each sub-band, the amplitude of the audio signal is calculated
  - the noise level is determined simultaneously with the FFT, using a "psychoacoustic" model
    - Rough quantization at a high noise level and fine quantization at a low noise level
[Figure: Uncompressed audio → Sub-band coding (32 sub-bands) → Quantization → Entropy coding → Compressed audio; the psychoacoustic model controls the quantization]
MPEG- Audio Coding
- Defines 3 layers of quality, with different encoder/decoder complexity
  - "higher layer" means "more complex" and "can handle lower layers"
- Data rates
  - 14 fixed data rates per layer, between 32 kbps and 448 kbps
    - In steps of 16 kbit/s
  - Layer 1: max. 448 kbit/s (ca. 1:4 compression, e.g. used as PASC in DCC)
  - Layer 2: max. 384 kbit/s (ca. 1:6-8, common, e.g. as MUSICAM in DAB)
  - Layer 3: max. 320 kbit/s (ca. 1:10-12, the famous MP3)
  - Higher data rates are allowed for the modes:
    - "stereo"
    - "joint stereo"
    - "dual channel"
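The approximate compression ratios quoted for the three layers can be checked against uncompressed CD-quality stereo (44.1 kHz, 16 bit, 2 channels). In the sketch below, the operating bit rates of 384, 192 and 128 kbit/s are assumptions chosen as plausible stereo settings. They are not the per-layer maxima from the slide, and the helper names are illustrative.

```python
def pcm_bitrate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio in bit/s."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(uncompressed_bps, compressed_bps):
    """How many times smaller the compressed stream is."""
    return uncompressed_bps / compressed_bps


CD_STEREO_BPS = pcm_bitrate(44_100, 16, 2)  # 1,411,200 bit/s

# Assumed operating points for a stereo signal, in bit/s (not from the slide).
ASSUMED_RATES_BPS = {"Layer 1": 384_000, "Layer 2": 192_000, "Layer 3": 128_000}

for layer, rate in ASSUMED_RATES_BPS.items():
    ratio = compression_ratio(CD_STEREO_BPS, rate)
    print(f"{layer}: about 1:{ratio:.1f}")
```

Under these assumptions the ratios land close to the "ca. 1:4", "1:6-8" and "1:10-12" figures given above.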
MPEG- Audio and Video Data Streams
Audio data stream layers
1. Frames
2. Audio access units
3. Slots (4 bytes in Layer 1 (low complexity), 1 byte in Layers 2 and 3)
Video data stream layers
1. Video sequence layer
2. Group of pictures layer
3. Single picture layer
4. Slice layer
5. Macroblock layer
6. Block layer
MPEG Layers
MPEG Layers
- Each picture is divided into m horizontal slices
- Each slice contains n macroblocks
- Each macroblock consists of 16x16 pixels, 256 pixels in total
- Each block is composed of 8x8 pixels, 64 pixels in total
[Figure: Picture → Slice → Macroblock → Block]
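As a small worked example of this hierarchy, the sketch below counts the macroblocks and 8x8 luminance blocks in a picture. The function name and the 352 x 288 example size are illustrative assumptions. Picture dimensions that are not multiples of 16 are rounded up to whole macroblocks here.

```python
import math


def picture_layout(width, height, macroblock=16, block=8):
    """Count macroblocks and luminance blocks for one picture."""
    mb_per_row = math.ceil(width / macroblock)
    mb_rows = math.ceil(height / macroblock)
    luma_blocks_per_mb = (macroblock // block) ** 2  # 4 for 16x16 / 8x8
    macroblocks = mb_per_row * mb_rows
    return {
        "macroblocks_per_row": mb_per_row,
        "macroblock_rows": mb_rows,
        "macroblocks": macroblocks,
        "pixels_per_macroblock": macroblock * macroblock,
        "pixels_per_block": block * block,
        "luma_blocks": macroblocks * luma_blocks_per_mb,
    }


print(picture_layout(352, 288))
```

For a 352 x 288 picture this gives 22 x 18 = 396 macroblocks. How these are grouped into the m slices is left to the encoder and is not modelled here.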
MPEG - Follow-up
- MPEG-2:
  - Higher data rates for high-quality audio/video
  - Multiple layers and profiles
  - Studio-quality TV and CD-quality audio channels; 4 to 6 Mbps typically
- MPEG-3
  - Initially intended for HDTV
  - MPEG-2 was scaled up to subsume MPEG-3
- MPEG-4:
  - Initially, lower data rates for, e.g., mobile communication
  - Then: focus on coding and additional functionalities based on image contents
  - Video conferencing at very low bit rates: 4.8 to 64 kbps, at 10 fps
- MPEG-7 (EC = "experimental core" status):
  - Content description
  - Basis for search and retrieval
  - See section on databases
- MPEG-21 (upcoming):
  - Framework for multimedia business, delivery... what's missing?
  - maybe eCommerce focus --> e.g., security, watermarking?
MPEG 2
- From MPEG-1 to MPEG-2
  - Improvement in quality
    - from VCR to TV to HDTV
- No CD-ROM based constraints
  - higher data rates
    - MPEG-1: about 1.5 Mbit/s
    - MPEG-2: 2-100 Mbit/s
- Prominent role for digital TV in DVB (Digital Video Broadcasting)
  - commercial MPEG-2 realizations available
MPEG 2
- An international standard (1994)
- CBR (constant bit rate) and VBR (variable bit rate) video
- Picture quality higher than that of current NTSC, PAL and SECAM broadcast systems
- Compression to bit rates in the range of:
  - 60 Mbps for HDTV
  - 15 Mbps for NTSC, PAL and SECAM
  - 4-15 Mbps for TV signals conforming to CCIR 601
- MPEG-2 consists of five profiles (Simple (does not support B-frames), Main, Next, ...), each having four levels:
  - High level, Type 1: 1152 lines per frame, 1920 pixels per line, 60 fps -> 60 Mbps
MPEG-2 Video Profiles and Levels
Profiles:
- Simple Profile: no B-frames, not scalable
- Main Profile: B-frames, not scalable
- SNR Scalable Profile: B-frames, SNR scalable
- Spatially Scalable Profile: B-frames, SNR scalable or spatially scalable
- High Profile: B-frames, SNR scalable or spatially scalable
Levels and maximum bit rates per profile:
- High Level (1920 pixels/line, 1152 lines): Main 80 Mbps; High 100 Mbps
- High-1440 Level (1440 pixels/line, 1152 lines): Main 60 Mbps; Spatially Scalable 60 Mbps; High 80 Mbps
- Main Level (720 pixels/line, 576 lines): Simple 15 Mbps; Main 15 Mbps; SNR Scalable 15 Mbps; High 20 Mbps
- Low Level (352 pixels/line, 288 lines): Main 4 Mbps; SNR Scalable 4 Mbps
Signal-to-noise (SNR) scaling: noise introduced by quantization errors and block structures
MPEG 2 Audio
(two modest) extension to MPEG-1 audio: 1. "low sample rate extension" LSE: v 1/2 of all MPEG-1 rates: 16, 22.05, 24kHzv quantization down to 8 bits/sample
2. "multichannel extension": more channels, i.e. up to v 5 full bandwidth channels (surround system)
• left and right front• center (in front)• left and right back
v "multilingual extension": 7 more, i.e. up to 12 channels (multiple languages, commentary)
Ø Backward compatibility with MPEG-1 audiov Only three MPEG-2 audio codecs will not provide
backward compatibility ( in the range of 256- 448 kbps)
MPEG-2 System Definition
- Steps
  - Audio and video are combined into a "Packetized Elementary Stream" (PES)
  - PES are combined into a "Program Stream" or a "Transport Stream"
- Program Stream
  - Error-free environment
  - Packets of variable length
  - One single stream with one timing reference
- Transport Stream
  - Designed for "noisy" (lossy) media channels
  - Multiplex of various programs with one or more time bases
  - Packets of 188 bytes
- Conversion between Program and Transport Streams is possible
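The fixed 188-byte packet size makes a Transport Stream easy to cut into packets. The sketch below splits a byte string into packets and reads the packet identifier (PID) from each header. The sync byte value 0x47 and the 13-bit PID in the 4-byte header come from the MPEG-2 Systems specification rather than from these slides, and the function names are illustrative.

```python
TS_PACKET_SIZE = 188  # bytes, as stated on the slide
SYNC_BYTE = 0x47      # first byte of every transport packet (from the spec)


def split_ts_packets(data):
    """Split a byte string into 188-byte transport packets and check sync."""
    if len(data) % TS_PACKET_SIZE != 0:
        raise ValueError("length is not a multiple of 188 bytes")
    packets = [data[i:i + TS_PACKET_SIZE]
               for i in range(0, len(data), TS_PACKET_SIZE)]
    for index, packet in enumerate(packets):
        if packet[0] != SYNC_BYTE:
            raise ValueError(f"packet {index} does not start with 0x47")
    return packets


def packet_pid(packet):
    """Return the 13-bit PID: low 5 bits of byte 1 followed by byte 2."""
    return ((packet[1] & 0x1F) << 8) | packet[2]
```

A demultiplexer would use the PID to sort packets into the individual programs and elementary streams of the multiplex.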
MPEG 2 Elementary Streams
[Figure: the audio source feeds the audio encoder and the video source feeds the video encoder; the MPEG-2 encoded audio and video pass through the audio and video packetized units into the MPEG-2 system multiplexer and encoder, which uses time sync information from the system clock and outputs the MPEG-2 stream]
MPEG 2 Streams
[Figure: an ISO 11172 stream as a sequence of packs (Pack 1, Pack 2, ...); each pack begins with a pack header, a system header also appears, and the packs carry video packets and audio packets; label: 188 bytes]
MPEG 4
- MPEG-4 (ISO 14496) originally:
  - Targeted at systems with very scarce resources
  - To support applications like
    - Mobile communication
    - Videophone and e-mail
  - Max. data rates and dimensions (roughly):
    - VLBV ("Very Low Bit-rate Video")
    - Between 4800 and 64000 bits/s
    - 176 columns x 144 lines x 10 frames/s
    - Largely covered by H.263 (QCIF)
- Therefore re-orientation:
  - Goal: to provide enhanced functionality
  - to allow for analysis and manipulation of image contents
MPEG 4
MPEG-4: Schedule for Standardization
- 1993: Work started
- 1997: Committee Draft
- 1998: Final Committee Draft
- 1998: Draft International Standard
- 1999-2000: International Standard
- Again
  - Started from the original goal of providing an audio-visual coding standard for very-low-bit-rate channels (e.g., for mobile applications)
  - Evolved into a complex tool kit
  - MPEG-4 changes the MPEG-2 information production and consumption paradigm through the way audio and video information is represented
  - Deals with audio and video no longer as packaged "bitstreams" produced by encoding, but as "audio-visual objects" (AVOs)
MPEG 4 - Technical information
- Objects are organized in a hierarchical fashion.
- Each object has its own description element.
  - Allows handling of the object
- One or more primitive media objects can be combined.
- Techniques from the Virtual Reality Modeling Language (VRML).
[Figure: primitive media objects (Voice, Background, Image) and a compound media object (Talking person)]
Video objects
- Divide the video into components
  - Person and background
- Camera position information
MPEG 4 -- Media streams
- One or more media streams
- Descriptors for the objects and the stream
[Figure: Media stream → Decompression → Composition, with the Scene description as a further input]
Scene description
- Grouping of the objects
  - Directed acyclic graph
- Positioning the objects
  - Special attributes
[Figure: graph with the root node Scene and child nodes such as Room and Person]
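A scene description of this kind can be pictured as a small graph of named nodes, each carrying positioning attributes. The sketch below is purely illustrative: the class, the `position` attribute and the node names are assumptions and do not reproduce the actual MPEG-4 scene description format. It uses a tree, the simplest case of a directed acyclic graph.

```python
from dataclasses import dataclass, field


@dataclass
class SceneNode:
    name: str
    position: tuple = (0.0, 0.0)  # hypothetical offset relative to the parent
    children: list = field(default_factory=list)


def absolute_positions(node, origin=(0.0, 0.0)):
    """Walk the graph and resolve each node's position in scene coordinates."""
    x = origin[0] + node.position[0]
    y = origin[1] + node.position[1]
    result = {node.name: (x, y)}
    for child in node.children:
        result.update(absolute_positions(child, (x, y)))
    return result


scene = SceneNode("Scene", children=[
    SceneNode("Room"),
    SceneNode("Person", (2.0, 1.0), [
        SceneNode("Voice"),
        SceneNode("Image", (0.5, 0.0)),
    ]),
])
print(absolute_positions(scene))
```

Moving the "Person" node moves its voice and image with it, which is the practical benefit of grouping objects in a hierarchy.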
‘New or Improved’ MPEG4 Functionalities
- Content-Based Scalability
- Content-Based Manipulation and Bitstream Editing
- Content-Based Multimedia Data Access Tools
- Hybrid Natural and Synthetic Data Coding
- Coding of Multiple Concurrent Data Streams
- Improved Coding Efficiency
- Robustness in Error-Prone Environments
- Improved Temporal Random Access
Content-Based Scalability
- MPEG4 provides the ability to achieve scalability with fine granularity in content, spatial resolution, temporal resolution, quality and complexity.
- Content scalability may imply the existence of a prioritization of the objects in the scene. Combining more than one kind of scalability may yield interesting scene representations, where the more relevant objects are represented with higher spatial-temporal resolution.
- Example uses:
  - user selection of the decoded quality of individual objects in the scene;
  - database browsing at different scales, resolutions, and qualities.
Content-Based Manipulation and Bitstream Editing
- MPEG4 provides a syntax and coding schemes to support content-based manipulation and bitstream editing without the need for transcoding.
- This means the user should be able to access one specific object in the scene/bitstream and perhaps change some of its characteristics.
- Example uses:
  - home movie production and editing;
  - interactive home shopping;
  - insertion of a sign language interpreter or subtitles.
Content-Based Multimedia Data Access Tools
- MPEG4 shall provide efficient data access and organisation based on the audio-visual content
  - Access tools may be
    - indexing, hyperlinking, querying, browsing, uploading, downloading, and deleting.
- Example uses:
  - content-based retrieval of information from on-line libraries and travel information databases
Hybrid Natural and Synthetic Data Coding
- MPEG4 supports efficient methods for combining synthetic scenes with natural scenes (e.g. text and graphics overlays), the ability to code and manipulate natural and synthetic audio and video data, and decoder-controllable methods of mixing synthetic data with ordinary video and audio, allowing for interactivity.
- Harmonious integration of natural and synthetic audio-visual objects.
- A first step towards the integration of all types of audio-visual information.
- Example uses:
  - virtual reality applications;
  - animations and synthetic audio (e.g. MIDI) can be mixed with ordinary audio and video in a game;
  - graphics can be rendered from different viewpoints.
Coding of Multiple Concurrent Data Streams
- The ability to efficiently code multiple views/soundtracks of a scene, as well as sufficient synchronisation between the resulting elementary streams.
- For stereoscopic and multiview video applications, MPEG4 shall include the ability to exploit redundancy in multiple views of the same scene, also permitting solutions that allow compatibility with normal (mono) video. This functionality should provide efficient representations of 3D natural objects, provided a sufficient number of views is available. Again, this may require a complex analysis process. It is expected that this functionality could substantially benefit applications such as virtual reality, where almost only synthetic objects have been used until now.
- Example uses:
  - multimedia entertainment, e.g. virtual reality games, 3D movies;
  - training and flight simulations;
  - multimedia presentations and education.
Improved Coding Efficiency
- The growth of mobile networks creates a strong need for improved coding efficiency.
- MPEG4 is required to provide subjectively better audio-visual quality than existing or other emerging standards (such as H.263) at comparable bit rates.
- The results of the MPEG4 video subjective tests, held in November 1995, showed however that, in terms of coding efficiency, the available coding standards still perform very well in comparison with most of the other coding techniques proposed.
- Example uses:
  - efficient transmission of audio-visual data on low-bandwidth channels;
  - efficient storage of audio-visual data on limited-capacity media, such as chip cards.
Robustness in Error-Prone Environments
- Universal accessibility implies access to applications over a variety of wireless and wired networks and storage media.
- MPEG4 shall provide an error robustness capability, particularly for low-bit-rate applications under severe error conditions.
- The idea is not to replace the error control techniques implemented by the network, but to provide resilience against the residual errors, e.g. through selective forward error correction, error containment or error concealment.
- Example uses:
  - transmitting from a database over a wireless network;
  - communicating with a mobile terminal;
  - gathering audio-visual data from a remote location.
Improved Temporal Random Access
- MPEG4 shall provide efficient methods to randomly access, within a limited time and with fine resolution, parts of an audio-visual sequence. This includes ‘conventional’ random access at very low bit rates.
- Example uses:
  - audio-visual data can be randomly accessed from a remote terminal over limited-capacity media;
  - a ‘fast forward’ can be performed on a single audio-visual object in the sequence.
MPEG7
- Increasing availability of multimedia content
- Increasing creation of multimedia content
- Increasing use of multimedia content by machines
- The need for searching, categorizing, describing, managing and filtering
→ Great need for a standard description
- MPEG-7 proposes such a standard
- MPEG-7 does not deal with implementation
  - (Great for a Master's thesis)
MPEG 21
- MPEG-21 Multimedia Framework
  - The vision for MPEG-21 is to define a multimedia framework to enable transparent and augmented use of multimedia resources across a wide range of networks and devices used by different communities
MPEG 21
Seven architectural ‘elements’ in the Multimedia Framework:
1. Digital Item Declaration
2. Digital Items Representation
3. Digital Item Identification and Description
4. Content Management and Usage
5. Intellectual Property Management and Protection
6. Terminals and Networks
7. Event Reporting
MPEG 21
[Figure: User A and User B are linked by a Transaction/Use/Relationship and exchange Content and Authorization/Value Exchange, supported by the framework elements below]
- Digital Item Declaration (example: item, resource)
- Identification and Description (example: unique identifier)
- Content Representation (example: natural and synthetic)
- IPMP (example: encryption, authentication, watermarking)
- Terminals & Networks (example: resource management (QoS))
- Content Management and Usage (example: storage management, personalization)
- Event Reporting (event reporting metrics and interfaces)
Event reporting, by creating metrics and interfaces, further describes specific interactions
MPEG relations
[Figure: relations between MPEG-1, MPEG-2, MPEG-4 and MPEG-7]
Standards for Narrow-Band Videoconferencing
- H.320:
  - Standard for videoconferencing over ISDN lines
- H.324:
  - Standard for videoconferencing over POTS (Plain Old Telephone Service)
- H.32x's umbrella specification structure:
  - H.324: G.723, H.263, H.245, H.223, V.34
  - H.320: G.722, H.261, H.242, H.221
H.261 and related ITU Standards
- Video codec for audiovisual services at p x 64 kbit/s
  - ("p-times-sixty-four", where p means "multiples of")
  - ITU/CCITT standard from 1990
    - ITU = International Telecommunication Union
    - CCITT = Consultative Committee for International Telegraph and Telephone
  - For ISDN
  - With p = 1, ..., 30
- Technical issues:
  - Real-time encoding/decoding
  - Max. signal delay of 150 ms
  - Constant data rate
  - Implementation in hardware (main goal) and software
H.261 – Resolution Format
- Unlike JPEG, H.261 defines a very precise image format
  - Image components:
    - Luminance signal (Y)
    - Two color difference signals (Cb, Cr)
  - Subsampling according to CCIR 601 (4:1:1)
    - ITU-R 601 (formerly CCIR 601) designates a "raw" digital video format with 704 x 480 pixels
    - CCIR = International Radio Consultative Committee
Two resolution formats are specified:
- Optional
  - Common Intermediate Format (CIF) resolution
    - Y: 352 x 288 pixels
    - At 29.97 frames/s approx. 36.46 Mbps (uncompressed), i.e. ~570 x 64 kbps
- Mandatory
  - Quarter Common Intermediate Format (QCIF) resolution (half of the CIF resolution in each dimension)
    - Y: 176 x 144 pixels
    - At 29.97 frames/s approx. 9.115 Mbps (uncompressed)
- All H.261 implementations must be able to encode and decode QCIF; CIF is optional
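The uncompressed rates quoted for CIF and QCIF can be reproduced with a short calculation. The sketch assumes an average of 12 bits per pixel, that is 8 bits of luminance plus two colour-difference signals each subsampled to a quarter of the luminance samples. That figure is an inference that happens to match the slide's numbers, not something the slide states, and the function name is illustrative.

```python
def uncompressed_bitrate(width, height, fps, bits_per_pixel=12):
    """Raw video bit rate in bit/s for the given luminance resolution."""
    return width * height * bits_per_pixel * fps


cif = uncompressed_bitrate(352, 288, 29.97)
qcif = uncompressed_bitrate(176, 144, 29.97)

print(f"CIF:  {cif / 1e6:.2f} Mbps, about {cif / 64_000:.0f} x 64 kbps")
print(f"QCIF: {qcif / 1e6:.3f} Mbps")
```

QCIF halves the resolution in both directions, so its raw bit rate is a quarter of the CIF rate.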
H.261 (p x 64) Video Compression
- DCT-based compression algorithm, like JPEG, with
  - differential PCM (DPCM) with motion estimation for interframe coding and
  - variable word-length entropy coding (such as Huffman)
- Very high compression ratios for full-color, real-time motion video transmission
- Combines intraframe and interframe coding
- Optimized for applications such as
  - video-conferencing, which are not motion-intensive
- Limited motion search and estimation strategies
- Compression ratios from 100:1 to 2,000:1
- Covers the entire ISDN channel capacity (p x 64 kbps, p = 1, 2, ..., 30)
  - for p = 1 or 2: videophone, desktop video-conferencing applications
  - for p = 6 or higher: more complex pictures are transmitted; good for group video-conferencing
H.261
- Intraframe coding takes no advantage of redundancy between frames.
  - Intraframe coding yields the "reference frame" f0
  - each 8x8 block is transformed by DCT
  - DCT with the same quantization factor for all AC values
  - this factor may be adjusted by the loopback filter
  - intraframes are rare (bandwidth! main application: videophone)
- Interframe coding (corresponds to P-frames of MPEG) → motion estimation
  - interframes: f1, f2, f3, ... relative to f0 (differential encoding)
  - Search for a similar macroblock (16x16) in the previous image
  - The position of this macroblock defines the motion vector
  - The search range is up to the implementation:
    - max. ± 15 pixels
    - but: the motion vector may also always be 0 ("bad" software encoder)
    - e.g. H.261 also allows a simple implementation that considers only the differences between macroblocks located at the same position, and thus a zero motion vector
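The difference between a "simple" encoder with a zero motion vector and one that uses a real motion vector shows up in the size of the residual that remains to be DCT coded. The sketch below computes that residual for one block. The function names, the list-of-rows frame representation and the small test frames are illustrative assumptions.

```python
def residual_block(reference, current, x, y, motion_vector=(0, 0), size=16):
    """Difference between the block at (x, y) in `current` and the block
    displaced by `motion_vector` in `reference`."""
    dx, dy = motion_vector
    return [
        [current[y + j][x + i] - reference[y + dy + j][x + dx + i]
         for i in range(size)]
        for j in range(size)
    ]


def residual_energy(block):
    """Sum of absolute residual values, a rough proxy for coding cost."""
    return sum(abs(value) for row in block for value in row)


# A bright 4x4 square that moves by (+2, +1) pixels between two frames.
reference = [[0] * 32 for _ in range(32)]
current = [[0] * 32 for _ in range(32)]
for j in range(4):
    for i in range(4):
        reference[10 + j][10 + i] = 200
        current[11 + j][12 + i] = 200

zero_mv = residual_block(reference, current, 8, 8, (0, 0), size=8)
with_mv = residual_block(reference, current, 8, 8, (-2, -1), size=8)
print(residual_energy(zero_mv), residual_energy(with_mv))
```

With the matching vector the residual vanishes, whereas the zero vector leaves a residual wherever the square has moved. This is why a fixed zero vector is the hallmark of a "bad" encoder.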
Main Differences between H.261 and H.263
- Extension to H.261
- Max. bit rate: H.263 approx. 2.5 x H.261; the lowest bit rates are suitable for modems
- Base-level differences (always ON)
  - No filter for HF noise in the feedback loop
  - Motion vectors produced with 1/2-pixel resolution
  - Picture format for sub-QCIF (128x96)
  - Huffman tables designed specifically for low bit rates
  - JPEG is the still-picture mode
- Optional-level differences (negotiated)
  - Unlimited search space for the motion vector → a fast encoder can do better
  - Syntax-based arithmetic coding
  - Advanced prediction mode
  - PB-frames (2 combined pictures: 1 B- and 1 P-frame)
Main Differences between H.261 and H.263
- N.B. H.261 is fully contained within H.263
[Figure: H.261 drawn as a subset of H.263]
Source Image Formats
Format                                            Pixels        H.261 Encoder/Decoder   H.263 Encoder/Decoder
SQCIF (Sub Quarter Common Intermediate Format)    128 x 96      optional                required
QCIF (Quarter Common Intermediate Format)         176 x 144     required                required
CIF (Common Intermediate Format)                  352 x 288     optional                optional
4CIF (4 times Common Intermediate Format)         704 x 576     not defined             optional
16CIF (16 times Common Intermediate Format)       1408 x 1152   not defined             optional
Conclusion
JPEG
- Very general format with good compression ratio
- SW and HW for baseline mode available
H.261/H.263
- Established standard from the telecom world
- Hardware realization preferable
MPEG-1, MPEG-2, MPEG-4, MPEG-7
- MPEG-2 with data rates between 2 and 100 Mbps
- MPEG-4, MPEG-7: object coding, content description
Proprietary systems: QuickTime, DVI, CD-I, ...
- Products that make use of other standards
- Migration towards use of the standards
Encoding Rates of Various Standards
Standard                              Data Rate         Compression
JPEG (for video)                      10-20 Mbps        7-27 times
MPEG-1                                1.2-2.0 Mbps      100 times
H.261                                 64 kbps-2 Mbps    24 times
DVI                                   1.2-1.5 Mbps      160 times
CD-I                                  1.2-1.5 Mbps      100 times
MPEG-2                                4-60 Mbps         30-100 times
CCIR 723                              32-45 Mbps        3-5 times
CCIR 601/D-1                          140-270 Mbps      Reference
PictureTel SG3                        0.1-1.5 Mbps      100 times
Software compression (small window)   ~2 Mbps           6 times
N.B. For JPEG, 640 x 480 x 24-bit colour at 15 fps was assumed.