MPEG Video (Part 2)

CS 294-9 :: Fall 2003 MPEG Video (Part 2) Ketan Mayer-Patel


Page 1: MPEG Video (Part 2)

CS 294-9 :: Fall 2003

MPEG Video (Part 2)

Ketan Mayer-Patel

Page 2: MPEG Video (Part 2)

Last Time
• Overall MPEG bitstream organization.
• I-Frames
• Examples of many encoding techniques:
  – Subsampling (chrominance planes)
  – Transform Coding (DCT, zig-zag)
  – Run-length Encoding (AC coeffs)
  – Predictive Encoding (DC coeffs)
  – Entropy Encoding (Huffman encoding)
  – Quantization (All coefficients)

Page 3: MPEG Video (Part 2)

This Time
• P and B frames
  – Motion compensation.
    • Search techniques
    • The problem with error measurements
  – Skipped macroblocks
• Quantization control
  – Variable bitrate vs. Constant bitrate
• DCT Artifacts
  – Spider noise
  – Blockiness

Page 4: MPEG Video (Part 2)

P-Frames
• Two types of macroblocks in P-Frames:
  – I-Macroblocks.
    • Just like macroblocks in an I-Frame
    • DC term is differentially encoded from DC predictor
      – DC predictor is simply last coded DC term.
      – Predictor reset at slice boundaries.
      – Encoded as DC size followed by that many bits.
    • AC terms
      – RLE’d as (run,value) pairs. Huffman encoded.
  – P-Macroblocks

Page 5: MPEG Video (Part 2)

P-Macroblocks

Fields of a P-Macroblock, in bitstream order:
• Macroblock Address Increment (variable)
• Macroblock Type (1-6 bits)
• Q Scale (5 bits)
• Motion Vector (variable)
• Block Pattern (3-9 bits)
• Luminance Blocks
• U Block
• V Block

Macroblock Type determines if Q Scale, Motion Vector, or Block Pattern exist.
One or all of the blocks may be absent in a P-Macroblock.

Page 6: MPEG Video (Part 2)

Address Increment
• Each macroblock has an address.
  – MB_WIDTH = width of luminance / 16
  – MB_ROW = row # of upper left pixel / 16
  – MB_COL = col. # of upper left pixel / 16
  – MB_ADDR = MB_ROW * MB_WIDTH + MB_COL
• Decoder maintains PREV_MBADDR.
  – Set to -1 at beginning of picture.
  – Set to (SLICE_ROW*MB_WIDTH-1) at slice header.
• MB address increment added to PREV_MBADDR provides current macroblock address.
  – PREV_MBADDR set to current macroblock address.
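
The address bookkeeping on this slide can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function names `mb_address` and `decode_addresses` are made up for the example and do not come from any MPEG library.

```python
MB_SIZE = 16  # a macroblock covers 16x16 luminance pixels

def mb_address(row_px, col_px, luma_width):
    """MB_ADDR of the macroblock whose upper left pixel is (row_px, col_px)."""
    mb_width = luma_width // MB_SIZE
    mb_row = row_px // MB_SIZE
    mb_col = col_px // MB_SIZE
    return mb_row * mb_width + mb_col

def decode_addresses(slice_row, increments, luma_width):
    """Turn the address increments of one slice into macroblock addresses."""
    mb_width = luma_width // MB_SIZE
    prev_mbaddr = slice_row * mb_width - 1  # value set at the slice header
    addresses = []
    for increment in increments:
        prev_mbaddr += increment  # current address becomes the new PREV_MBADDR
        addresses.append(prev_mbaddr)
    return addresses
```

For a 352-pixel-wide picture (22 macroblocks per row), a slice starting on macroblock row 1 with increments 1, 1, 3 yields addresses 22, 23, 26.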

Page 7: MPEG Video (Part 2)

Address Increment Coding
• Address increment coded using Huffman code.
  – 33 codes for values (1-33).
    • 1 is smallest (1-bit)
    • 33 is largest (11-bits)
  – 1 code for ESCAPE
    • ESCAPE means add 33 to address increment code that follows.
    • ESCAPEs can be chained allowing any positive value to be encoded as an address increment.
• This occurs for I-Frames as well.
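
As a rough sketch of the ESCAPE rule, the helper below splits an increment into a number of ESCAPE codes plus one final value in 1-33. It covers only the arithmetic, not the Huffman bit patterns, and `split_increment` and `join_increment` are hypothetical names chosen for this example.

```python
ESCAPE_STEP = 33  # each ESCAPE adds 33 to the code value that follows

def split_increment(increment):
    """Return (number of ESCAPE codes, final code value in 1..33)."""
    if increment < 1:
        raise ValueError("address increment must be positive")
    escapes, remainder = divmod(increment - 1, ESCAPE_STEP)
    return escapes, remainder + 1

def join_increment(escapes, code_value):
    """Decoder side: rebuild the increment from chained ESCAPEs and one code."""
    return escapes * ESCAPE_STEP + code_value
```

An increment of 33 needs no ESCAPE, 34 is one ESCAPE followed by the code for 1, and 100 is three ESCAPEs followed by the code for 1.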

Page 8: MPEG Video (Part 2)

MB Type
• Huffman coded.
  – 7 possible codes (1 - 6 bits)
• Determine the following:
  – Intra or non-intra.
  – Q scale specified or not.
  – Motion vector exists or not.
  – Block pattern exists or not.
• Not all combinations are possible.
• Not all possible combinations are feasible.

Page 9: MPEG Video (Part 2)

Quantization Scale
• 5 bits.
• Zero is illegal.
• Encoded as 1-31 which results in q-scale values of (2-62).
  – Odd values impossible to encode.
• Decoder maintains current q-scale.
  – If not specified, current q-scale used.
  – If specified, current q-scale replaced.

Page 10: MPEG Video (Part 2)

Motion Vector
• Two components:
  – Horizontal and vertical offsets.
  – Offset is from upper left pixel of macroblock.
  – Positive values indicate right and down.
  – Negative values indicate left and up.
  – Offsets are specified in half pixels.
• Motion vector is used to define a predictive base for the current macroblock from the reference picture.

Page 11: MPEG Video (Part 2)

Motion Vector Illustrated

[Figure: a macroblock in the P-Frame and its prediction base in the previously decoded I- or P-Frame.]

Prediction base does not have to be macroblock aligned.
If predictive base is half-pixel aligned, bilinear interpolation is used.
Whatever luminance pixels are picked out, corresponding chrominance pixels used to form chrominance prediction.
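
To make the half-pixel case concrete, here is a minimal sketch of fetching a luminance prediction block for a motion vector given in half-pixel units. It assumes the reference picture is a list of rows of integers and that the caller keeps the block inside the picture (Python would silently wrap negative indices). The name `predict_block` is made up for this example, and the rounding shown is one common integer form of bilinear averaging.

```python
def predict_block(ref, top, left, mv_x_half, mv_y_half, size=16):
    """Prediction block for the macroblock at (top, left).

    mv_x_half and mv_y_half are offsets in half pixels; positive means
    right and down, negative means left and up.
    """
    x0 = left + (mv_x_half >> 1)  # whole-pixel part (floors for negatives)
    y0 = top + (mv_y_half >> 1)
    half_x = mv_x_half & 1        # 1 if the vector lands between two columns
    half_y = mv_y_half & 1        # 1 if the vector lands between two rows
    block = []
    for r in range(size):
        row = []
        for c in range(size):
            y, x = y0 + r, x0 + c
            # Average the 1, 2, or 4 neighbouring pixels with rounding.
            total = (ref[y][x] + ref[y][x + half_x]
                     + ref[y + half_y][x] + ref[y + half_y][x + half_x])
            row.append((total + 2) // 4)
        block.append(row)
    return block
```

With both half-pixel flags at zero the four terms are the same pixel, so the function returns the reference pixels unchanged.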

Page 12: MPEG Video (Part 2)

Motion Vector Encoding
• If no motion vector is present, then motion vector is understood to be (0,0).
• Horiz. component followed by vertical.
• Decoder maintains motion vector predictor.
  – Set to 0,0 at beginning of picture or slice or whenever an I-macroblock is encountered.
  – Difference between predictor and value is Huffman encoded.
• Actually a bit more complicated than this.
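
The simplified predictor scheme described here can be sketched as follows. As the slide says, the real coding is more complicated; this example only shows the differencing and the reset on an I-macroblock. The input format (the string "I" for an I-macroblock, a `(dx, dy)` tuple for a coded vector) and the name `encode_mv_differences` are assumptions made for illustration.

```python
def encode_mv_differences(items):
    """Differences that would be entropy coded for one slice.

    items is a list in coding order: "I" marks an I-macroblock,
    a (dx, dy) tuple is the motion vector of a P-macroblock.
    """
    predictor = (0, 0)  # reset at the beginning of the picture or slice
    differences = []
    for item in items:
        if item == "I":
            predictor = (0, 0)  # an I-macroblock resets the predictor
            continue
        dx, dy = item
        differences.append((dx - predictor[0], dy - predictor[1]))
        predictor = (dx, dy)
    return differences
```

Because neighbouring vectors tend to be similar, most differences are small: the sequence (4,2), (5,2) produces (4,2) then (1,0).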

Page 13: MPEG Video (Part 2)

Predictive Base
• P-Macroblocks always specify a predictive base:
  – Either motion vector picks out an area, or
  – No motion vector implies 0,0 (i.e., predictive base is same macroblock in reference frame).

Page 14: MPEG Video (Part 2)

Block Pattern
• The goal of motion compensation is to find predictive base that matches most closely with macroblock.
  – If match is really good, then no appreciable difference will need to be encoded at all.
• Block pattern indicates which blocks have enough error to warrant coding.
• Absence of block pattern indicates no blocks needed coding.

Page 15: MPEG Video (Part 2)

Block
• Difference between pixels in prediction and macroblock is encoded as block:
  – 9-bit input values
  – Still produces 12-bit coefficients
  – Sometimes called error blocks.
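
The 9-bit figure follows from subtracting two 8-bit pixels: the difference ranges over -255..255, which needs a sign bit on top of 8 magnitude bits. A small sketch, with hypothetical helper names `error_block` and `fits_signed`:

```python
def error_block(current, prediction):
    """Per-pixel difference between the macroblock and its prediction."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]

def fits_signed(value, bits):
    """True if value is representable as a two's complement number of that width."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1
```

The extreme differences 255 and -255 fit in 9 signed bits but not in 8.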

Page 16: MPEG Video (Part 2)

Error Block Encoding
• Different quantization matrix is used.
  – Default is “16” in all coefficient positions.
  – Error blocks have lots of high frequency info.
  – No good perceptual correlation between frequencies of error coding and artifacts.
• DC no longer specially treated.
  – No differential encoding from predictor.
• All terms are zig-zag RLE’d and then (run,value) pairs Huffman encoded.

Page 17: MPEG Video (Part 2)

P-Frame Review
• Macroblocks are either I-macroblocks or P-macroblocks.
• I-macroblocks just like macroblocks in I-frame.
• P-macroblocks define predictive base and encode the difference.

Page 18: MPEG Video (Part 2)

Skipped Macroblocks
• If P-macroblock has (0,0) motion vector and no appreciable difference to encode, then can be skipped altogether.
• Skipped macroblocks are detected when the address increment of the next coded macroblock is greater than 1.
• First macroblock and last macroblock of slice must not be skipped.
• Last slice must include lower right macroblock.
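
A decoder can recover which macroblocks were skipped purely from the address increment: every address between the previous coded macroblock and the current one was skipped. A minimal sketch, with `skipped_addresses` as a made-up name:

```python
def skipped_addresses(prev_mbaddr, increment):
    """Addresses skipped between the previous and the current coded macroblock."""
    return list(range(prev_mbaddr + 1, prev_mbaddr + increment))
```

With the previous macroblock at address 23, an increment of 1 means nothing was skipped, while an increment of 3 means addresses 24 and 25 were skipped and the current macroblock is 26.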

Page 19: MPEG Video (Part 2)

Decoder State Updates
• DC predictors are reset whenever a P-macroblock or skipped macroblock is encountered.
• Motion vector predictors reset whenever I-macroblock is encountered.

Page 20: MPEG Video (Part 2)

B-Frames
• B-frames have 4 macroblock types:
  – I-macroblocks
  – P-macroblocks
    • Predictive base specified from previous reference frame.
  – B-macroblocks
    • Predictive base specified from subsequent reference frame.
  – Bi-macroblocks.
    • Predictive base specified from both reference frames.

Page 21: MPEG Video (Part 2)

Skipped Macroblocks
• Handled slightly differently than P-frames.
• Skipped macroblock implies:
  – Same macroblock type as last encoded macroblock (i.e., P-, B-, or Bi-).
  – Motion vectors same as previous encoded macroblock.
    • Compare to (0,0) assumption in P-frame.
    • Also means that predictors not reset.
  – Can’t skip macroblock following an I-macroblock.
• Other state changes as per P-frames.

Page 22: MPEG Video (Part 2)

Motion Compensation
• Provides most of MPEG’s compression.
• Relies on temporal coherence.
• Finding a good motion vector essentially a search problem.
• Evaluating “goodness” of a motion vector can be a bit tricky.
• MC is what makes MPEG asymmetric.
  – Harder to encode than to decode.

Page 23: MPEG Video (Part 2)

Exhaustive Search
• The most obvious and easiest solution.
• Encoding time related to size of search window.
• Although time consuming, also embarrassingly parallel.
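
A bare-bones exhaustive search might look like the sketch below. It works on whole pixels only, uses the sum of absolute differences as the error measure (an arbitrary choice for the example), and represents pictures as lists of rows. The names `sad` and `exhaustive_search` are hypothetical.

```python
def sad(cur, ref, top, left, size):
    """Sum of absolute differences between cur and the ref block at (top, left)."""
    return sum(abs(cur[r][c] - ref[top + r][left + c])
               for r in range(size) for c in range(size))

def exhaustive_search(cur, ref, top, left, size, radius):
    """Try every offset within +/- radius pixels; return (cost, dx, dy)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block falls outside the reference picture
            cost = sad(cur, ref, y, x, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

The work grows with the square of the radius, since (2 * radius + 1) squared candidates are evaluated. Each candidate is independent of the others, which is why the search parallelizes so easily.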

Page 24: MPEG Video (Part 2)

Logarithmic Search
• Evaluate the search window with an even sampling of motion vectors.
• Take best and reevaluate in region of the motion vector with denser sampling.
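
One possible reading of this coarse-to-fine idea is sketched below: sample a 3x3 grid of offsets around the current best vector, then halve the grid spacing and repeat. This is an illustrative variant, not necessarily the exact scheme the lecture had in mind. It assumes whole-pixel vectors, a sum-of-absolute-differences cost, and that the starting block lies inside the reference picture. The names are made up.

```python
def sad(cur, ref, top, left, size):
    """Sum of absolute differences between cur and the ref block at (top, left)."""
    return sum(abs(cur[r][c] - ref[top + r][left + c])
               for r in range(size) for c in range(size))

def logarithmic_search(cur, ref, top, left, size, step):
    """Coarse-to-fine search; returns (cost, dx, dy)."""
    height, width = len(ref), len(ref[0])
    best_dx = best_dy = 0
    best_cost = sad(cur, ref, top, left, size)
    while step >= 1:
        center_dx, center_dy = best_dx, best_dy  # fixed for this round
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                cand_dx, cand_dy = center_dx + dx, center_dy + dy
                y, x = top + cand_dy, left + cand_dx
                if y < 0 or x < 0 or y + size > height or x + size > width:
                    continue
                cost = sad(cur, ref, y, x, size)
                if cost < best_cost:
                    best_cost, best_dx, best_dy = cost, cand_dx, cand_dy
        step //= 2  # denser sampling around the best vector so far
    return best_cost, best_dx, best_dy
```

Far fewer candidates are evaluated than in an exhaustive search, but the result can be a local minimum rather than the best vector in the window.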

Page 25: MPEG Video (Part 2)

Predictive Search
• Motion vectors differentially encoded for a reason.
  – Tend to be correlated from one macroblock to the next.
• Use previous macroblock’s motion vector as centering point for search.
• Or, use motion vector from same block in previous frame as center of search.
• My research is looking at using depth and other spatial info to guide encoding.

Page 26: MPEG Video (Part 2)

Error Measurements
• Regardless of search algorithm, need to determine which motion vector is best.
• Simple measures:
  – Mean Squared Error
  – Mean Absolute Error
  – Minimum Difference Variance
• Fundamental problem is no good correlation between any simple metric and perceptual quality.
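
The simple measures can be written down directly. The sketch below treats blocks as lists of rows. Note that "difference variance" is interpreted here as the variance of the per-pixel differences, which is an assumption, since the slide does not define it. All function names are made up for the example.

```python
def _differences(a, b):
    """Flat list of per-pixel differences between two equally sized blocks."""
    return [x - y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]

def mean_squared_error(a, b):
    diffs = _differences(a, b)
    return sum(d * d for d in diffs) / len(diffs)

def mean_absolute_error(a, b):
    diffs = _differences(a, b)
    return sum(abs(d) for d in diffs) / len(diffs)

def difference_variance(a, b):
    # Assumed meaning: variance of the differences around their own mean.
    diffs = _differences(a, b)
    mean = sum(diffs) / len(diffs)
    return sum((d - mean) ** 2 for d in diffs) / len(diffs)
```

The measures can disagree: a candidate that differs from the macroblock by a constant brightness offset has a large mean squared error but zero difference variance.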

Page 27: MPEG Video (Part 2)

VBR vs. CBR
• Two ways to handle bitrate:
  – Variable Bit Rate (VBR)
    • Allows compressed bitrate to vary
  – Constant Bit Rate (CBR)
    • Bitrate constant over some averaging window.
• MPEG buffer model.
  – Optional (don’t have to use it).
  – The sequence header provides parameters to a buffer model that can describe bitrate behavior.

Page 28: MPEG Video (Part 2)

VBR Q-scale adjustments
• In general, VBR used to maintain quality.
• Q scale is adjusted to provide maximum compression given quality limit.
• Need some metric for quality.
  – Same issues for judging perceptual quality crop up here.
• Common solution: q scale statically set for I-, P-, and B-frames.
  – A variation on this is differentiating among macroblock types.

Page 29: MPEG Video (Part 2)

CBR Q-scale adjustments
• To achieve CBR, q-scale used to control bitrate.
  – Higher q-scale provides better compression at the expense of quality.
  – Lower q-scale provides better quality at the expense of compression.
• Algorithms for controlling how q-scale is adjusted can get pretty complicated.
• Common solution is to have target I, P, and B frame sizes and then adjust q-scale as macroblocks are encoded to hit the target.
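
A deliberately naive version of the "adjust as you go" idea is sketched below. After each macroblock it compares the bits spent so far with a proportional share of the frame's target, then nudges the 5-bit q-scale code up or down by one. The proportional budget, the step size of one, and the name `adjust_q_scale` are all assumptions for illustration. Real rate-control algorithms are considerably more elaborate.

```python
Q_MIN, Q_MAX = 1, 31  # legal range of the 5-bit q-scale code (zero is illegal)

def adjust_q_scale(q_scale, bits_used, target_bits, mbs_done, mbs_total):
    """Return the q-scale code to use for the next macroblock."""
    expected = target_bits * mbs_done / mbs_total  # proportional budget so far
    if bits_used > expected:
        q_scale += 1  # over budget: quantize more coarsely
    elif bits_used < expected:
        q_scale -= 1  # under budget: spend bits on quality
    return max(Q_MIN, min(Q_MAX, q_scale))
```

Halfway through a frame with a 1000-bit target, having used 600 bits raises the code from 10 to 11, while having used 400 bits lowers it to 9.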