Overview: image and video coding...


Bernd Girod: EE398B Image Communication II Video Coding Standards no. 1

Overview: Video Coding Standards

• Video coding standards: the applications and the common structure
• Relevant standards organizations
• ITU-T Rec. H.261
• ITU-T Rec. H.263
• ISO/IEC MPEG-1
• ISO/IEC MPEG-2
• ISO/IEC MPEG-4
• Recent progress: H.264/JVT

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 2

Major Applications of Video Compression

Application | Bit-rate | Standard
Digital television broadcasting | 2 ... 6 Mbps (10 ... 20 Mbps for HD) | MPEG-2
DVD video | 6 ... 8 Mbps | MPEG-2
Internet video streaming | 20 ... 200 kbps | Proprietary, similar to H.263, MPEG-4, or H.26L/JVT
Videoconferencing, videotelephony | 20 ... 320 kbps | H.261, H.263
Video over 3G wireless | 20 ... 100 kbps | H.263, MPEG-4

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 3

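Motion-compensated Hybrid Coding (H.261, MPEG-1, MPEG-2, H.263, MPEG-4, H.26L/JVT)

[Block diagram: the input minus the prediction feeds the transform/quantizer; an embedded decoder (dequantizer/inverse transform plus motion-compensated predictor with an intra/inter switch, where intra selects a zero prediction) reconstructs the reference; a motion estimator and a coder control steer the loop. Control data, quantized transform coefficients, and motion data go to the entropy coder.]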
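The key property of the hybrid loop is that the encoder predicts from its own reconstruction, not from the original input, so encoder and decoder stay in step. The following is a minimal sketch of that idea only: it works on 1-D "frames", uses zero motion and no transform, and every function name in it is hypothetical.

```python
def quantize(residual, step):
    # Uniform quantizer: index of the nearest multiple of `step`.
    return [round(r / step) for r in residual]

def reconstruct(prediction, levels, step):
    # Shared by encoder and decoder: prediction plus dequantized residual.
    return [p + level * step for p, level in zip(prediction, levels)]

def encode(frames, step):
    reference = [0] * len(frames[0])      # nothing decoded yet
    bitstream, encoder_recon = [], []
    for frame in frames:
        residual = [x - p for x, p in zip(frame, reference)]
        levels = quantize(residual, step)
        # Embedded decoder: the next prediction uses the reconstruction.
        reference = reconstruct(reference, levels, step)
        bitstream.append(levels)
        encoder_recon.append(reference)
    return bitstream, encoder_recon

def decode(bitstream, step):
    reference = [0] * len(bitstream[0])
    decoded = []
    for levels in bitstream:
        reference = reconstruct(reference, levels, step)
        decoded.append(reference)
    return decoded
```

Because both sides run the same `reconstruct` step, the quantization error of one frame does not accumulate into the next.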

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 4

Video Standards: Hierarchical Syntax I

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 5

Video Standards: Hierarchical Syntax II

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 6

International Telecommunication Union (ITU)

• Formed in 1934 by merger of the International Telegraph Convention of 1865 and the International Radiotelegraph Convention of 1906
• Several “committees,” among them
  • CCITT (International Telephone and Telegraph Consultative Committee), 1956-1992
  • CCIR (International Radio Consultative Committee), 1927-1992
• Reform in 1992
  • CCITT -> ITU-T
  • CCIR -> ITU-R
• Any recommendation must be agreed upon unanimously by all its member states

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 7

ITU organization with subgroups relevant for video

[Organization chart]
ITU
  ITU-R
  ITU-T
    SG1, SG2, … Study Group 16 - Multimedia
      WP1 – Modems and interfaces: V.34, V.25ter
      WP2 – Systems: H.320 – ISDN, H.323 – LAN, H.324 – PSTN, T.120 – Data
      WP3 – Coding: G.7xx – Audio, H.261 – Video, H.263 – Video
  ITU-D

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 8

IEC and ISO

• IEC – International Electrotechnical Commission
  • Founded in 1906 to establish international standards for all electrical technologies
  • Private, non-profit company under Swiss law
• ISO – International Organization for Standardization
  • Established in 1947 “to facilitate the international coordination and unification of industrial standards”
  • Private, non-profit company under Swiss law
  • Non-governmental organization; it cooperates with the United Nations but is not a UN agency
• Joint ISO/IEC Technical Committee 1 (JTC 1)
  • Jointly addresses all computer-related activities
  • Accounts for about 30% of all ISO and IEC standards

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 9

ISO/IEC organization with subgroups relevant for video

[Organization chart]
IEC and ISO
  JTC1
    SC29
      AGM, RA
      WG1: JBIG, JPEG
      WG12: MHEG-5, MHEG-6
      WG11: Requirements, Systems, Description, Video, Audio, SNHC, Tests, Implementation Studies, Liaisons

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 10

Requirements for a successful video coding standard

• Interoperability: should assure that encoders and decoders from different manufacturers work together seamlessly.
• Innovation: should perform significantly better than the previous standard.
• Competition: should be flexible enough to allow competition between manufacturers based on technical merit. Only standardize bit-stream syntax and reference decoder.
• Independence from transmission and storage media: should be flexible enough to be used for a range of applications.
• Forward compatibility: should decode bit-streams from the prior standard.
• Backward compatibility: prior-generation decoders should be able to partially decode new bit-streams.

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 11

Standard Development Process

[Figure: Overview of H.261 Standardization Process. Timeline 1985 to 1990 by subject: n x 384 (n = 1-5), m x 64 (m = 1, 2), and p x 64 (p = 1-30). Each track passes through competition, convergence, verification, and optimization phases, tracked by reference models RM 1 to RM 8, and ends in Rec. H.261.]

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 12

ITU-T Rec. H.261

• International standard for ISDN picture phones and for video conferencing systems (1990)
• Image format: CIF (352 x 288 Y samples) or QCIF (176 x 144 Y samples), frame rate 7.5 ... 30 fps
• Bit-rate: multiple of 64 kbps (= ISDN channel), typically 128 kbps including audio
• Picture quality: acceptable at 128 kbps with limited motion in the scene
• Stand-alone videoconferencing system or desk-top videoconferencing system, integrated with PC

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 13

Image formats

[Figure: relative sizes of the picture formats ITU-R 601, CIF (352 x 288), QCIF (176 x 144), and sub-QCIF]

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 14

H.261 macroblocks

• Macroblock (MB) of 16x16 pixels
• Sampling format: 4:2:0
• An MB consists of 4 luminance and 2 chrominance blocks

[Figure: 16x16 luminance samples split into four 8x8 blocks numbered 0 to 3, plus one block of 8x8 Cb samples (4) and one block of 8x8 Cr samples (5)]
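As a quick check on these numbers, the macroblock grid and sample counts for the two H.261 picture formats can be worked out directly. This is only an arithmetic sketch; the function and dictionary names are hypothetical.

```python
FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}  # luminance width x height

def macroblock_layout(width, height, mb_size=16):
    if width % mb_size or height % mb_size:
        raise ValueError("picture size must be a multiple of the macroblock size")
    columns = width // mb_size
    rows = height // mb_size
    luma_samples = mb_size * mb_size            # four 8x8 blocks
    chroma_samples = 2 * (mb_size // 2) ** 2    # one 8x8 Cb and one 8x8 Cr block (4:2:0)
    return {
        "columns": columns,
        "rows": rows,
        "macroblocks": columns * rows,
        "samples_per_macroblock": luma_samples + chroma_samples,
    }

for name, (width, height) in FORMATS.items():
    print(name, macroblock_layout(width, height))
```

CIF comes out as 22 x 18 = 396 macroblocks and QCIF as 11 x 9 = 99, each macroblock carrying 6 x 64 = 384 samples.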

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 15

H.261 motion compensated prediction

• Integer-pel accuracy
• One displacement vector per macroblock
• Maximum displacement vector range +/-16 horizontally and vertically
• Adaptive loop filter, separable filters in 1-D horizontal and vertical
  • impulse response: [¼, ½, ¼]
• Differential encoding of motion vectors
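The separable loop filter can be sketched as two 1-D passes with taps [¼, ½, ¼] over one 8x8 prediction block. The edge handling (block-edge samples pass through unfiltered in that direction) and the round-half-up at the output reflect my understanding of H.261 and should be treated as assumptions; the function name is hypothetical.

```python
def loop_filter_8x8(block):
    """Separable [1/4, 1/2, 1/4] low-pass filter applied inside one block."""
    n = len(block)

    def taps(i):
        # Integer taps scaled by 4; edge samples use (0, 1, 0) -> (0, 4, 0).
        return (0, 4, 0) if i in (0, n - 1) else (1, 2, 1)

    # Horizontal pass: results are scaled by 4.
    horizontal = []
    for row in block:
        out = []
        for x in range(n):
            a, b, c = taps(x)
            left = row[x - 1] if x > 0 else 0
            right = row[x + 1] if x < n - 1 else 0
            out.append(a * left + b * row[x] + c * right)
        horizontal.append(out)

    # Vertical pass: results are scaled by 16, then rounded once.
    result = []
    for y in range(n):
        a, b, c = taps(y)
        out = []
        for x in range(n):
            up = horizontal[y - 1][x] if y > 0 else 0
            down = horizontal[y + 1][x] if y < n - 1 else 0
            total = a * up + b * horizontal[y][x] + c * down
            out.append((total + 8) // 16)   # divide by 16, round half up
        result.append(out)
    return result
```

Keeping full precision until the final division avoids rounding twice, once per pass.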

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 16

H.261 residual coding

• 8x8 DCT
• Quantization
  • A uniform quantizer (∆=8) for intra-mode DC coefficients
  • A uniform threshold quantizer (∆=2,4,…,62) for AC coefficients in intra-mode and all coefficients in inter-mode
• Zigzag scan
• Run-level coding for entropy coding
  • (zero-run, value) symbols
  • zero-run: the number of coefficients quantized to zero since the last nonzero coefficient
  • value: the amplitude of the current nonzero coefficient
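The scan and run-level steps above can be sketched as follows. The threshold quantizer is shown as plain truncation toward zero, which produces a dead zone around zero; this is a simplification, not the exact H.261 reconstruction rule, and all function names are hypothetical.

```python
def threshold_quantize(coefficient, step):
    # Truncation toward zero: values with |c| < step map to level 0 (dead zone).
    sign = -1 if coefficient < 0 else 1
    return sign * (abs(coefficient) // step)

def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zigzag scan order."""
    def key(position):
        row, col = position
        diagonal = row + col
        # Odd anti-diagonals are walked downwards, even ones upwards.
        return (diagonal, row if diagonal % 2 else col)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def run_level_encode(levels):
    """Turn a block of quantized levels into (zero-run, value) symbols."""
    symbols = []
    run = 0
    for row, col in zigzag_order(len(levels)):
        value = levels[row][col]
        if value == 0:
            run += 1
        else:
            symbols.append((run, value))
            run = 0
    # Trailing zeros produce no symbol; a real bitstream signals end of block.
    return symbols
```

Because the zigzag scan visits low frequencies first, the nonzero levels cluster at the start and the long tail of zeros costs nothing beyond the end-of-block signal.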

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 17

H.261 Macroblock Types (VLC Table)

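Prediction | MQUANT | MVD | CBP | TCOEFF | VLC
Intra | - | - | - | X | 0001
Intra | X | - | - | X | 0000 001
Inter | - | - | X | X | 1
Inter | X | - | X | X | 0000 1
Inter+MC | - | X | - | - | 0000 0000 1
Inter+MC | - | X | X | X | 0000 0001
Inter+MC | X | X | X | X | 0000 0000 01
Inter+MC+FIL | - | X | - | - | 001
Inter+MC+FIL | - | X | X | X | 01
Inter+MC+FIL | X | X | X | X | 0000 01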
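Since no codeword in the macroblock-type table is a prefix of another, a decoder can read bits one at a time until the accumulated string matches an entry. The sketch below assumes the bitstream is given as a string of "0"/"1" characters; the assignment of the X marks to the MQUANT, MVD, CBP, and TCOEFF columns follows my reading of the table, and the names `MTYPE_TABLE` and `decode_mtype` are hypothetical.

```python
# codeword -> (prediction type, fields that follow in the macroblock)
MTYPE_TABLE = {
    "0001": ("Intra", ("TCOEFF",)),
    "0000001": ("Intra", ("MQUANT", "TCOEFF")),
    "1": ("Inter", ("CBP", "TCOEFF")),
    "00001": ("Inter", ("MQUANT", "CBP", "TCOEFF")),
    "000000001": ("Inter+MC", ("MVD",)),
    "00000001": ("Inter+MC", ("MVD", "CBP", "TCOEFF")),
    "0000000001": ("Inter+MC", ("MQUANT", "MVD", "CBP", "TCOEFF")),
    "001": ("Inter+MC+FIL", ("MVD",)),
    "01": ("Inter+MC+FIL", ("MVD", "CBP", "TCOEFF")),
    "000001": ("Inter+MC+FIL", ("MQUANT", "MVD", "CBP", "TCOEFF")),
}
MAX_CODE_LENGTH = max(len(code) for code in MTYPE_TABLE)

def decode_mtype(bits, pos=0):
    """Return (prediction, fields, next_position) for the codeword at `pos`."""
    code = ""
    while len(code) < MAX_CODE_LENGTH and pos + len(code) < len(bits):
        code += bits[pos + len(code)]
        if code in MTYPE_TABLE:
            prediction, fields = MTYPE_TABLE[code]
            return prediction, fields, pos + len(code)
    raise ValueError("no valid MTYPE codeword at this position")
```

The shortest codeword goes to Inter with coded coefficients, which is consistent with that being the most frequent macroblock type in typical videoconferencing material.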

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 18

ITU-T Rec. H.263

• International standard for picture phones over analog subscriber lines (1995)
• Image format usually CIF, QCIF or Sub-QCIF, frame rate usually below 10 fps
• Bit-rate: arbitrary, typically 20 kbps for PSTN
• Picture quality: with new options as good as H.261 (at half rate)
• Software-only PC video phone or TV set-top box
• Widely used as compression engine for Internet video streaming
• H.263 is also the compression core of the MPEG-4 standard

Example: 8x8 ViaTV phone VC 105

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 19

H.261 vs. H.263

• Improved motion compensation
  • H.261 (1990): integer-pel accuracy, loop filter, 1 motion vector per MB
  • H.263 (1995): half-pel accuracy, no loop filter, 1 motion vector per MB
• Improved 3-D VLC for DCT coefficients, (last, run, level)
• Reduced overhead
• Support for more picture formats
• Optional features defined in annexes
  • Unrestricted motion vectors (Annex D)
  • Syntax-based arithmetic coding (Annex E)
  • Advanced prediction mode (Annex F)
    • Overlapped block motion compensation (OBMC)
    • Switch between 1 or 4 motion vectors per MB
  • PB pictures (Annex G)
• More optional features in H.263++ (H.263 as of 2001)
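The 3-D (last, run, level) events fold the end-of-block signal into each symbol: `last` is 1 only for the final nonzero coefficient of the block. A minimal sketch of forming these events from an already zigzag-scanned list of quantized levels (the function name is hypothetical, and the VLC lookup itself is not shown):

```python
def to_last_run_level(scanned):
    """Convert scanned levels into (last, run, level) events."""
    nonzero_positions = [i for i, value in enumerate(scanned) if value != 0]
    if not nonzero_positions:
        return []                      # nothing to code for this block
    final = nonzero_positions[-1]
    events = []
    run = 0
    for i, value in enumerate(scanned[: final + 1]):
        if value == 0:
            run += 1
        else:
            events.append((1 if i == final else 0, run, value))
            run = 0
    return events

print(to_last_run_level([5, 0, -2, 0, 0, 3, 0, 0]))
```

Compared with 2-D (run, level) symbols plus a separate end-of-block code, no extra codeword is spent on terminating the block.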

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 20

Performance of H.263 and H.261

[Plot: PSNR [dB] (axis 24 to 34) versus rate [kbps] (32, 64, 128), five curves]

1) H.263
2) H.263 w/o options
3) H.261
4) H.263 w/o options, integer-pel ME
5) H.261 w/o loop filter

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 21

Performance of H.263 SAC mode

[Plot: PSNR [dB] (axis 26 to 36) versus rate [kbps] (0 to 128), two curves]

1) SAC-mode
2) w/o options

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 22

H.263: overlapped block motion compensation (OBMC)

[Figure: the current 8x8 luminance block within a macroblock, surrounded by four remote luminance blocks (above, below, left, and right)]

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 23

H.263: OBMC weights

Weights for the MV of the current luminance block:

4 5 5 5 5 5 5 4
5 5 5 5 5 5 5 5
5 5 6 6 6 6 5 5
5 5 6 6 6 6 5 5
5 5 6 6 6 6 5 5
5 5 6 6 6 6 5 5
5 5 5 5 5 5 5 5
4 5 5 5 5 5 5 4

Weights for the remote MV of the top/bottom luminance block:

2 2 2 2 2 2 2 2
1 1 2 2 2 2 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 2 2 2 2 1 1
2 2 2 2 2 2 2 2

Weights for the remote MV of the left/right luminance block:

2 1 1 1 1 1 1 2
2 2 1 1 1 1 2 2
2 2 1 1 1 1 2 2
2 2 1 1 1 1 2 2
2 2 1 1 1 1 2 2
2 2 1 1 1 1 2 2
2 2 1 1 1 1 2 2
2 1 1 1 1 1 1 2
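The three weight matrices sum to 8 at every sample position, so OBMC is a per-sample weighted average of three predictions. The sketch below assumes the three 8x8 predictions have already been fetched (one with the current block's vector, one with the top/bottom neighbour's, one with the left/right neighbour's) and that the result is rounded as (sum + 4) / 8; the function name is hypothetical.

```python
H_CURRENT = [
    [4, 5, 5, 5, 5, 5, 5, 4],
    [5, 5, 5, 5, 5, 5, 5, 5],
    [5, 5, 6, 6, 6, 6, 5, 5],
    [5, 5, 6, 6, 6, 6, 5, 5],
    [5, 5, 6, 6, 6, 6, 5, 5],
    [5, 5, 6, 6, 6, 6, 5, 5],
    [5, 5, 5, 5, 5, 5, 5, 5],
    [4, 5, 5, 5, 5, 5, 5, 4],
]
H_TOP_BOTTOM = [
    [2, 2, 2, 2, 2, 2, 2, 2],
    [1, 1, 2, 2, 2, 2, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 2, 2, 2, 2, 1, 1],
    [2, 2, 2, 2, 2, 2, 2, 2],
]
H_LEFT_RIGHT = [
    [2, 1, 1, 1, 1, 1, 1, 2],
    [2, 2, 1, 1, 1, 1, 2, 2],
    [2, 2, 1, 1, 1, 1, 2, 2],
    [2, 2, 1, 1, 1, 1, 2, 2],
    [2, 2, 1, 1, 1, 1, 2, 2],
    [2, 2, 1, 1, 1, 1, 2, 2],
    [2, 2, 1, 1, 1, 1, 2, 2],
    [2, 1, 1, 1, 1, 1, 1, 2],
]

def obmc_blend(pred_current, pred_top_bottom, pred_left_right):
    """Weighted average of three 8x8 predictions, weights summing to 8."""
    blended = []
    for y in range(8):
        row = []
        for x in range(8):
            total = (pred_current[y][x] * H_CURRENT[y][x]
                     + pred_top_bottom[y][x] * H_TOP_BOTTOM[y][x]
                     + pred_left_right[y][x] * H_LEFT_RIGHT[y][x])
            row.append((total + 4) // 8)   # divide by 8 with rounding
        blended.append(row)
    return blended
```

Samples near the block centre lean on the block's own vector (weight 6 of 8), while samples near the borders give more weight to the neighbours' vectors, which smooths block-edge discontinuities in the prediction.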

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 24

Performance of H.263 AP mode

[Plot: PSNR [dB] (axis 26 to 36) versus rate [kbps] (0 to 128), two curves]

1) AP-mode
2) w/o options

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 25

H.263: PB pictures (Annex G)

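[Figure: picture sequence P, B, P. The second P-picture is forward predicted from the first; the B-picture between them is bidirectionally predicted. The B-picture and the following P-picture together form one PB picture.]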

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 26

Performance of H.263 PB-mode

[Plot: PSNR [dB] (axis 26 to 36) versus rate [kbps] (0 to 128)]

1) w/o options (6.25 fps)
2) w/o options (12.5 fps)
3) PB-mode (12.5 fps): a) P-frames, b) B-frames

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 27

Visual Communication Systems: H.320/H.323/H.324

Network | System | Video | Audio | Mux | Control
PSTN | H.324 | H.261/3 | G.723.1 | H.223 | H.245
N-ISDN | H.320 | H.261 | G.7xx | H.221 | H.242
B-ISDN/ATM | H.321 | H.261 | G.7xx | H.221 | Q.2931
B-ISDN/ATM | H.310 | H.261/2 | G.7xx/MPEG | H.222.0/1 | H.245
QoS LAN | H.322 | H.261/3 | G.7xx | H.221 | H.242
Non-QoS LAN | H.323 | H.261 | G.7xx | H.225.0 | H.245

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 28

H.324 Multimedia Terminals

[Block diagram, with the scope of H.324 marked:]
• Video I/O equipment <-> Video codec H.261/H.263
• Audio I/O equipment <-> Audio codec G.723, with receive path delay
• User data applications <-> Data protocols V.14, LAPM, etc.
• System control <-> H.245 procedures over control protocol SRP/LAPM
• Mux/demux H.223 <-> Modem V.34/V.8 (modem control V.25ter) <-> PSTN network

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 29

Overview: Video Coding Standards

• Video coding standards: the applications and the common structure
• Relevant standards organizations
• ITU-T Rec. H.261
• ITU-T Rec. H.263
• ISO/IEC MPEG-1
• ISO/IEC MPEG-2
• ISO/IEC MPEG-4
• Recent progress: H.26L/JVT

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 30

ISO/IEC MPEG

• MPEG-1 Standard (1991) (ISO/IEC 11172)
  • Target bit-rate about 1.5 Mbit/s
  • Typical image format CIF, no interlace
  • Frame rate 24 ... 30 fps
  • Main application: video storage for multimedia (e.g., on CD-ROM)
• MPEG-2 Standard (1994) (ISO/IEC 13818)
  • Extension for interlace, optimized for TV resolution (NTSC: 704 x 480 pixels)
  • Image quality similar to NTSC, PAL, SECAM at 4 - 8 Mbit/s
  • HDTV at 20 Mbit/s
• MPEG-4 Standard (1999) (ISO/IEC 14496)
  • Object-based coding
  • Wide range of applications, with choices of interactivity, scalability, error resilience, etc.

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 31

MPEG-1/2: GOP Structure

"Group of Pictures" = “GOP“, GOP structure is very flexible

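[Figure: pictures along the time axis in display order: I-picture, two B-pictures, P-picture, three B-pictures, P-picture. The numbers under the pictures give the coding order: 1 3 4 2 6 7 8 5]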
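The coding-order numbers 1 3 4 2 6 7 8 5 follow from one rule: each I- or P-picture must be coded before the B-pictures that precede it in display order, because those B-pictures reference it. A minimal sketch of that reordering, with hypothetical function names:

```python
def coding_order(display_types):
    """Display indices of the pictures, listed in the order they are coded."""
    order = []
    waiting_b = []
    for index, picture_type in enumerate(display_types):
        if picture_type == "B":
            waiting_b.append(index)      # needs the next anchor first
        else:
            order.append(index)          # I or P: anchor picture
            order.extend(waiting_b)      # B-pictures between the two anchors
            waiting_b = []
    order.extend(waiting_b)              # trailing B-pictures, if any
    return order

def coding_rank(display_types):
    """For each picture in display order, its 1-based position in coding order."""
    ranks = [0] * len(display_types)
    for position, index in enumerate(coding_order(display_types), start=1):
        ranks[index] = position
    return ranks

print(coding_rank("IBBPBBBP"))
```

For the display sequence I B B P B B B P this reproduces the numbering shown on the slide.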

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 32

MPEG-1 Encoder

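[Block diagram. Forward path: video in, pre-processing, picture reordering, subtraction of the prediction, DCT, weighting, quantization, VLC, video multiplex, buffer, bitstream. Reconstruction loop: inverse quantization, inverse weighting, inverse DCT, addition of the prediction, picture store 1 and picture store 2, motion compensation. The prediction selector has inputs labelled 1/2 and zero. Motion vectors, macroblock info, and start codes enter the video multiplex.]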

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 33

MPEG-1: coding of I-pictures

• I-pictures: intraframe coded
• 8x8 DCT
• Arbitrary weighting matrix for coefficients
• Differential coding of DC coefficients
• Uniform quantization
• Zig-zag scan, run-level coding
• Entropy coding
• Unfortunately, not quite JPEG

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 34

MPEG-1: coding of P-pictures

• Motion-compensated prediction from an encoded I-picture or P-picture (DPCM)
• Half-pel accuracy of motion compensation, bilinear interpolation
• One displacement vector per macroblock
• Differential coding of displacement vectors
• Coding of prediction error with 8x8 DCT, uniform threshold quantization, zig-zag scan as in I-pictures
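Half-pel prediction with bilinear interpolation averages the two or four integer-position samples that surround the requested position. The sketch below takes the motion vector in half-pel units and rounds to the nearest integer; the rounding rule and the function name are assumptions, and the caller is responsible for keeping all accesses inside the reference picture.

```python
def predict_half_pel(reference, top, left, mv_x, mv_y, size=16):
    """Fetch a size x size prediction block; mv_x, mv_y are in half-pel units."""
    x0 = left + (mv_x >> 1)      # integer part (floor, also for negative vectors)
    y0 = top + (mv_y >> 1)
    fx = mv_x & 1                # 1 if the vector has a horizontal half-pel part
    fy = mv_y & 1                # 1 if the vector has a vertical half-pel part
    block = []
    for y in range(size):
        row = []
        for x in range(size):
            a = reference[y0 + y][x0 + x]
            b = reference[y0 + y][x0 + x + fx]
            c = reference[y0 + y + fy][x0 + x]
            d = reference[y0 + y + fy][x0 + x + fx]
            # With fx = fy = 0 this returns a; with one half-pel part it
            # averages two samples; with both it averages four.
            row.append((a + b + c + d + 2) // 4)
        block.append(row)
    return block
```

Writing all three cases as one four-sample average keeps the code short; with a single half-pel component the expression reduces to (a + b + 1) // 2.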

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 35

MPEG-1: coding of B-pictures

• Motion-compensated prediction from two consecutive P- or I-pictures, either
  • only forward prediction (1 vector/macroblock), or
  • only backward prediction (1 vector/macroblock), or
  • average of forward and backward prediction = interpolation (2 vectors/macroblock)
• Half-pel accuracy of motion compensation, bilinear interpolation
• Coding of prediction error with 8x8 DCT, uniform quantization, zig-zag scan as in I-pictures
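The three B-picture prediction options can be compared per macroblock once the forward and backward motion-compensated blocks are available. How an encoder chooses among them is not standardized; the sketch below assumes a simple sum-of-absolute-differences (SAD) criterion, and all names are hypothetical.

```python
def sad(block_a, block_b):
    # Sum of absolute differences between two equally sized blocks.
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def interpolate(forward, backward):
    # Average of forward and backward prediction, rounded to nearest.
    return [[(f + b + 1) // 2 for f, b in zip(row_f, row_b)]
            for row_f, row_b in zip(forward, backward)]

def choose_b_mode(current, forward, backward):
    """Pick the prediction mode with the smallest SAD for this block."""
    candidates = {
        "forward": forward,
        "backward": backward,
        "interpolated": interpolate(forward, backward),
    }
    mode = min(candidates, key=lambda name: sad(current, candidates[name]))
    return mode, candidates[mode]
```

Interpolation tends to win when the block lies between its two references in appearance, for example during a fade or when noise in the two references averages out.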

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 36

MPEG-2 vs. MPEG-1

• Efficiently compress interlaced digital video at broadcast quality
  • Field/frame pictures
  • Chroma sampling
  • New prediction modes
  • Field/frame DCT
  • Additional scan patterns for DCT coefficients
  • Motion compensation with blocks of size 16x8 pels
• Improved coding efficiency by different quantization, VLC tables
• Various scalability modes

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 37

Coding of Interlaced Video (1)

Frame and field picture structures

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 38

Coding of Interlaced Video (2)

Field prediction for field pictures

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 39

Coding of Interlaced Video (3)

Field prediction for frame pictures

[Figure: field prediction for a macroblock of a frame picture; dimension labels 16 x 16 and 16 x 8]

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 40

Coding of Interlaced Video (4)

Dual prime for P pictures

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 41

Coding of Interlaced Video (5)

• Field/frame DCT
• Alternate scan
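For field DCT in a frame picture, the 16 luminance lines of a macroblock are regrouped so that each 8x8 block contains lines of one field only. The slide's figure is not available in this transcript, so the following is a sketch of that regrouping as I understand it, with a hypothetical function name.

```python
def field_dct_blocks(luma16):
    """Split a 16x16 luminance macroblock into four 8x8 blocks by field."""
    top_field = luma16[0::2]       # lines 0, 2, ..., 14
    bottom_field = luma16[1::2]    # lines 1, 3, ..., 15
    lines = top_field + bottom_field
    # Blocks 0 and 1 hold the top field, blocks 2 and 3 the bottom field.
    return [[row[col:col + 8] for row in lines[start:start + 8]]
            for start in (0, 8) for col in (0, 8)]
```

With moving interlaced content, adjacent frame lines come from different time instants, so grouping lines by field usually gives the DCT smoother blocks to work on.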

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 42

MPEG-4

• Support highly interactive multimedia applications as well as traditional applications
• Advanced functionalities: interactivity, scalability, error resilience…
• Coding of natural and synthetic audio and video, as well as graphics
• Enable the multiplexing of audiovisual objects and composition in a scene

Example applications:
• Video on LANs, Internet video
• Wireless video
• Video database
• Interactive home shopping
• Video e-mail, home movies
• Virtual reality games, flight simulation, multi-viewpoint training

[Figure: MPEG-4 at the intersection of ‘TV/film’ (AV-data), ‘Computer’ (interactivity), and ‘Telecom’ (wireless)]

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 43

MPEG-4: Scene with audiovisual objects

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 44

MPEG-4: Video coding

• Basic video coding
  • Definition of Video Object (VO), Video Object Layer (VOL), Video Object Plane (VOP)
  • Improved coding efficiency vs. MPEG-1/2
    • Based on H.263 baseline
    • Global motion compensation
    • Sprite
    • Quarter pixel motion compensation

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 45

MPEG-4: Video coding

• Object-based video coding
  • Binary shape coding
  • Greyscale shape coding
  • Padding for block-based DCT of texture
  • Shape-adaptive DCT
• DWT for still texture coding
• Mesh animation, face and body animation

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 46

Shape Adaptive DCT

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 47

Video compression progress

• Intraframe coding: only spatial correlation exploited
  DCT [Ahmed, Natarajan, Rao 1974], JPEG [1992]
• Conditional replenishment, DPCM, scalar quantization
  H.120 [1984]
• Frame difference coding
  H.120 Version 2 [1988]
• Motion compensation: integer-pel accurate displacements
  H.261 [1991]
• Half-pel accurate motion compensation
  MPEG-1 [1993], MPEG-2/H.262 [1994]
• Variable block-size motion compensation
  H.263 [1996], MPEG-4 [1999]

Complexity increases down the list.

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 48

Video compression progress

[Plot: PSNR [dB] (axis 24 to 40) versus bit rate [kbps] (0 to 600) for Foreman, 10 Hz, QCIF, 100 frames encoded. Curves: intraframe DCT coding (JPEG); conditional replenishment (H.120); frame difference coding (H.120 1988); integer-pel motion compensation (H.261 1991); half-pel motion compensation (MPEG-1 1993); variable block size motion compensation (H.263 1996). Annotated bit-rate savings: ~40 %, ~20 %, ~30 %.]

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 49

Video compression progress

[Plot: PSNR [dB] (axis 28 to 42) versus bit rate [kbps] (0 to 350) for Mother & Daughter, 10 Hz, QCIF, 100 frames encoded. Curves: intraframe DCT coding (JPEG); conditional replenishment (H.120); variable block size motion compensation (H.263 1996). Annotated bit-rate savings: ~70 %, ~60 %.]

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 50

Video compression progress

[Plot: PSNR [dB] (axis 24 to 40) versus bit rate [kbps] (0 to 1400) for Mobile & Calendar, 10 Hz, QCIF, 100 frames encoded. Curves: intraframe DCT coding (JPEG); integer-pel motion compensation (H.261 1991); variable block size motion compensation (H.263 1996). Annotated bit-rate savings: ~40 %, ~35 %.]

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 51

H.26L/JVT: Motion Compensation Accuracy

[Block diagram: the hybrid coder (coder control, transform/quantizer, dequantizer/inverse transform, motion-compensated predictor with intra/inter switch, motion estimator, entropy coding) with the motion-compensated predictor highlighted.]

• Motion vector accuracy: 1/4 pel (QCIF) or 1/8 pel (CIF)
• Seven macroblock partitioning modes for motion compensation
  • Mode 1: 1 block (0)
  • Modes 2 and 3: 2 blocks (0, 1)
  • Mode 4: 4 blocks (0 to 3)
  • Modes 5 and 6: 8 blocks (0 to 7)
  • Mode 7: 16 blocks (0 to 15)

[courtesy T. Wiegand]

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 52

H.26L/JVT: Multiple Reference Frames

[Block diagram: the hybrid coder (coder control, transform/quantizer, dequantizer/inverse transform, motion-compensated predictor with intra/inter switch, motion estimator, entropy coding) with multiple reference frames for motion compensation feeding the predictor.]

[courtesy T. Wiegand]

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 53

H.26L/JVT: Residual Coding

• Transform
  • 4x4 integer transform approximating a DCT
  • Expanded to 8x8 for chroma by 2x2 DC transform
• Intra coding structure
  • Directional spatial prediction for intra mode
  • Expanded to 16x16 for luma intra by 4x4 DC transform
• Quantization
  • Two inverse scan patterns
  • Logarithmic step size control
  • Smaller step size for chroma (per H.263 Annex T)
• Deblocking filter (in loop)
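A 4x4 integer transform that approximates the DCT can be computed exactly with integer arithmetic. The sketch below uses the core matrix of the final H.264 standard as an illustration; the H.26L draft discussed on this slide may have used different integer coefficients, so treat the matrix as an assumption. The normalization that a real codec folds into quantization is left out, and the function names are hypothetical.

```python
CORE = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

def matmul(a, b):
    # Plain 4x4 (or any compatible size) integer matrix product.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(column) for column in zip(*m)]

def forward_transform(block):
    """Y = C * X * C^T, computed without any rounding."""
    return matmul(matmul(CORE, block), transpose(CORE))

flat = [[3] * 4 for _ in range(4)]
print(forward_transform(flat))   # only the DC coefficient is nonzero
```

The rows of the matrix are mutually orthogonal but not of equal norm, which is why the scaling has to be compensated elsewhere; in exchange, encoder and decoder get bit-exact results with no floating-point mismatch.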

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 54

H.26L/JVT: Entropy Coding (1)

Universal Variable Length Code (UVLC)

[Block diagram: the hybrid coder (coder control, transform/quantizer, dequantizer/inverse transform, motion-compensated predictor with intra/inter switch, motion estimator) with the entropy coding block highlighted; its inputs are control data, quantized transform coefficients, and motion data.]

[courtesy T. Wiegand]
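A universal variable length code uses one regular code structure for all syntax elements instead of many hand-tuned tables. As an illustration, the sketch below implements the unsigned Exp-Golomb code of the final H.264 standard; the UVLC in the H.26L draft has the same codeword lengths as far as I know, but its bit arrangement may differ, so this is an assumption rather than the draft's exact code. Bit strings are represented as Python strings of "0" and "1", and the function names are hypothetical.

```python
def ue_encode(code_number):
    """Exp-Golomb codeword for a non-negative integer."""
    if code_number < 0:
        raise ValueError("code number must be non-negative")
    binary = bin(code_number + 1)[2:]          # binary of n + 1, no prefix
    return "0" * (len(binary) - 1) + binary    # leading zeros give the length

def ue_decode(bits, pos=0):
    """Decode one codeword starting at `pos`; return (value, next_position)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    value = int(bits[pos + zeros:end], 2) - 1
    return value, end

for n in range(5):
    print(n, ue_encode(n))
```

Small code numbers get short codewords (1 bit for 0, 3 bits for 1 and 2, 5 bits for 3 to 6), so the mapping from syntax element values to code numbers decides how well the code fits each element.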

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 55

H.26L/JVT: Entropy Coding (2)

Context-based adaptive binary arithmetic codes (CABAC)

[Block diagram: context modeling -> binarization -> adaptive binary arithmetic coder (probability estimation and coding engine), with a feedback path labelled "update probability estimation"]

• Context modeling: chooses a model conditioned on past observations
• Binarization: maps non-binary symbols to a binary sequence
• Adaptive binary arithmetic coder: uses the provided model for the actual encoding and updates the model

[courtesy T. Wiegand]
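The stages named above can be illustrated without writing a full arithmetic coder. The sketch below binarizes symbols with a unary code, selects a context by bin position, keeps an adaptive probability estimate per context, and reports the ideal code length (the sum of -log2 p over all bins) as a stand-in for the coding engine. The binarization, the context rule, and the count-based estimator are all simplifying assumptions, not the actual CABAC design, and every name is hypothetical.

```python
import math

def unary_binarize(value):
    # Maps a non-negative integer to `value` ones followed by a zero.
    return [1] * value + [0]

class AdaptiveBitModel:
    """Probability estimate for one context, learned from past bins."""
    def __init__(self):
        self.counts = [1, 1]               # start from equal probabilities

    def probability(self, bit):
        return self.counts[bit] / sum(self.counts)

    def update(self, bit):
        self.counts[bit] += 1

def ideal_code_length(symbols, max_context=2):
    """Bits an ideal arithmetic coder would need under the adaptive models."""
    models = {}
    total_bits = 0.0
    for value in symbols:
        for position, bit in enumerate(unary_binarize(value)):
            context = min(position, max_context)        # context modeling
            model = models.setdefault(context, AdaptiveBitModel())
            total_bits += -math.log2(model.probability(bit))
            model.update(bit)                           # adapt after coding
    return total_bits

print(ideal_code_length([0] * 50))
```

For a strongly skewed source the adaptive estimate quickly pushes the cost per bin far below one bit, which a code with whole-bit codewords such as the UVLC cannot do.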

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 56

Comparison of H.26L to MPEG-4

• MPEG-4: Advanced Simple Profile (ASP)
  • Motion compensation: 1/4 pel
  • Global motion compensation
• H.26L
  • Motion compensation: 1/4 pel (QCIF), 1/8 pel (CIF)
  • Using CABAC entropy coding
  • 5 reference frames in 7 of 8 cases (News: 17 / 25)
• Both
  • Sequence structure IBBPBBP...
  • QP(B) = QP(P) + 2 (step size: +25%)
  • Search range: 32x32 around 16x16 predictor
  • Well-known D+λR optimization techniques
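The D+λR technique mentioned for both codecs picks, among several coding options, the one that minimizes the Lagrangian cost J = D + λR, where D is the distortion and R the rate in bits. A minimal sketch of such a decision follows; the candidate modes and their distortion and rate figures are made-up illustration values, and the function name is hypothetical.

```python
def choose_mode(candidates, lagrange_multiplier):
    """candidates maps a mode name to (distortion, rate_in_bits)."""
    def cost(name):
        distortion, rate = candidates[name]
        return distortion + lagrange_multiplier * rate
    return min(candidates, key=cost)

# Hypothetical measurements for one macroblock.
candidates = {
    "16x16": (900, 20),    # few bits, larger prediction error
    "8x8": (500, 60),
    "4x4": (420, 140),     # many motion vectors, small error
}

for lam in (0.5, 5, 20):
    print(lam, choose_mode(candidates, lam))
```

A small λ favours low distortion and selects the finest partition, while a large λ makes bits expensive and selects the coarsest one; tying λ to the quantizer step size keeps the decision consistent across the rate range.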

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 57

RD Curves: Foreman (QCIF, 10 Hz)

[Plot: average PSNR(Y) [dB] (axis 26 to 39) versus bit-rate [kbit/s] (0 to 128). Curves: MPEG-4 and H.26L; annotation: >30%. Source: ITU-T VCEG]

Bernd Girod: EE398B Image Communication II Video Coding Standards no. 58

RD Curves: Flowergarden (CIF, 30 Hz)

[Plot: average PSNR(Y) [dB] (axis 22 to 38) versus bit-rate [kbit/s] (0 to 2304). Curves: MPEG-4 and H.26L; annotation: >30%. Source: ITU-T VCEG]
