MPEG-1 Video (Part 1)


Page 1: MPEG-1 Video (Part 1)

CS 294-9 :: Fall 2003

MPEG-1 Video (Part 1)

Ketan Mayer-Patel

Page 2: MPEG-1 Video (Part 1)

Encoding Techniques

• Subsampling
• Transform Coding
• Run-length Encoding
• Predictive Encoding
• Entropy Encoding
• Quantization

Page 3: MPEG-1 Video (Part 1)

Bitstream Organization

• Seq. Header: Width, Height, Frame Rate, Buffer Control
• GOP Header: Time Code
• Picture Header: Temporal Ref, Picture Type, Motion Vector Parameters
• Picture Data
• Seq. End Code

All headers begin with 23 zero bits followed by 9 bits (a single 1 bit, then an 8-bit code) that indicate the header type. The encoding process will never produce 23 consecutive zero bits anywhere else.
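Header detection as described above can be sketched in a few lines. This is a minimal illustration, assuming headers are byte-aligned so the 23 zero bits plus the 1 bit appear as the bytes 00 00 01; the function name `find_start_codes` and the name table are made up for this example, and only a handful of code values are listed.

```python
START_CODE_PREFIX = b"\x00\x00\x01"  # 23 zero bits followed by a single 1 bit

# A few MPEG-1 video code values (the 8 bits that follow the prefix).
START_CODE_NAMES = {
    0x00: "picture",
    0xB3: "sequence_header",
    0xB7: "sequence_end",
    0xB8: "group_of_pictures",
}


def find_start_codes(data: bytes):
    """Return (byte_offset, name) for every header prefix found in data."""
    found = []
    pos = data.find(START_CODE_PREFIX)
    while pos != -1 and pos + 3 < len(data):
        code = data[pos + 3]
        if 0x01 <= code <= 0xAF:
            name = "slice"
        else:
            name = START_CODE_NAMES.get(code, "other")
        found.append((pos, name))
        # Continue searching after the 4 bytes just consumed.
        pos = data.find(START_CODE_PREFIX, pos + 4)
    return found


# Hypothetical byte stream: the payload bytes between headers are arbitrary.
stream = (b"\x00\x00\x01\xb3" + b"\x16\x00\xf0"
          + b"\x00\x00\x01\xb8" + b"\xaa"
          + b"\x00\x00\x01\x00" + b"\xbb"
          + b"\x00\x00\x01\x01" + b"\xcc\xdd"
          + b"\x00\x00\x01\xb7")
headers = find_start_codes(stream)
```

Because the encoder never emits 23 consecutive zero bits inside coded data, a plain byte search like this cannot mistake payload for a header.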

Page 4: MPEG-1 Video (Part 1)

Frame Types

• 3 Frame Types: I, P, B
  – I: All information for frame present.
  – P: Predictively encoded from previous I or P.
  – B: Predictively encoded from previous I or P and next I or P.

Example sequence in display order: I B B P B B P B B P B B P B B I

Page 5: MPEG-1 Video (Part 1)

Frame Order

• Predictive relationships create an obvious problem: B-frames depend on the future.
• Obvious solution: send the frames out of order.

Frame type:         I  B  B  P  B  B  P  B  B  P  B  B  P  B  B  I
Display position:   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
Transmit position:  1  3  4  2  6  7  5  9 10  8 12 13 11 15 16 14
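The reordering can be sketched as follows: every I or P frame is sent ahead of the B-frames that precede it in display order, since those B-frames need it as their "next" reference. The function name `transmit_order` is hypothetical, and appending trailing B-frames that have no following reference is an assumption of this sketch.

```python
def transmit_order(frame_types):
    """Map a display-order string of frame types to the order frames are sent.

    Returns 1-based display positions, listed in transmission order.
    """
    order = []
    pending_b = []  # B-frames waiting for their future reference frame
    for position, frame_type in enumerate(frame_types, start=1):
        if frame_type == "B":
            pending_b.append(position)
        else:
            # An I or P frame goes out first, then the B-frames it unblocks.
            order.append(position)
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # assumption: leftover B-frames are sent last
    return order


sent = transmit_order("IBBPBBPBBPBBPBBI")
```

For the 16-frame pattern used here, the frame displayed 4th is transmitted 2nd, and the closing I-frame (displayed 16th) is transmitted 14th.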

Page 6: MPEG-1 Video (Part 1)

Source Input

• Before we describe how I-frames are encoded, we should describe our input.
• 3 planes of Y, U, V
  – 8 bits per pixel in each plane.
  – Y range [0,255].
  – U and V range [-128,127].
• Planes are all of the same size.
• Pixels colocated between frames.

Page 7: MPEG-1 Video (Part 1)

Chrominance Subsampling

• First step: downsize chrominance.
• 4:2:0 (with chrominance samples centered)
• Requires bilinear interpolation.
• U and V biased by 128 to put in range [0,255]
• Compression Ratio: 2:1 (three full-size planes become one full-size plane plus two quarter-size planes)
• Wow, doing well already.
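A minimal sketch of this step, assuming even plane dimensions and taking the centered sample as the rounded average of each 2x2 neighborhood (which is what bilinear interpolation gives at the center point). The function name `subsample_420` and the rounding rule are choices made for this illustration.

```python
def subsample_420(plane):
    """Halve a signed chroma plane in both directions and bias by 128.

    plane is a list of rows with values in [-128, 127]; the result has
    values in [0, 255] and one quarter as many samples.
    """
    height, width = len(plane), len(plane[0])
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            # (total + 2) // 4 rounds the average to the nearest integer.
            row.append((total + 2) // 4 + 128)
        out.append(row)
    return out


# Sample count for one 16x16 region before and after subsampling.
before = 3 * 16 * 16
after = 16 * 16 + 2 * (8 * 8)
small = subsample_420([[0, 4], [8, 12]])
```

The sample counts work out to 768 before and 384 after, which is where the 2:1 ratio comes from.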

Page 8: MPEG-1 Video (Part 1)

Subsampling In General

• Severe loss of data.
• Exploits imperceptibility of data loss.
  – In this case: humans are not as sensitive to color.
• What if we were using images as input to a feature extractor?
  – Depending on what was being extracted, subsampling might not be such a good idea.
• Compression gain is directly related to subsampling factors.

Page 9: MPEG-1 Video (Part 1)

Macroblocks

• Y is cut into 8x8 tiled pixel regions.
• U and V are cut into 8x8 tiled pixel regions.
• Macroblock defined as 4 Y tiles that form a 16x16 pixel region and the associated U and V tiles.
• Macroblocks organized in row order fashion from top to bottom.
• Compression gain: none.
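The tiling can be sketched as below. The helper names `tile` and `macroblock` are hypothetical, and the sketch assumes the chroma planes have already been subsampled to half size in each direction, so one 8x8 U tile and one 8x8 V tile cover the same area as the four Y tiles.

```python
def tile(plane, top, left, size=8):
    """Cut a size x size region out of a plane stored as a list of rows."""
    return [row[left:left + size] for row in plane[top:top + size]]


def macroblock(y, u, v, mb_row, mb_col):
    """Return the six 8x8 blocks of one macroblock: four Y, then U, then V."""
    top, left = 16 * mb_row, 16 * mb_col
    luma = [tile(y, top + dy, left + dx) for dy in (0, 8) for dx in (0, 8)]
    chroma_top, chroma_left = 8 * mb_row, 8 * mb_col
    return luma + [tile(u, chroma_top, chroma_left),
                   tile(v, chroma_top, chroma_left)]


# Synthetic 32x32 luma plane and 16x16 chroma planes (2x2 macroblocks).
y_plane = [[r * 32 + c for c in range(32)] for r in range(32)]
u_plane = [[r * 16 + c for c in range(16)] for r in range(16)]
v_plane = [[-(r * 16 + c) for c in range(16)] for r in range(16)]
blocks = macroblock(y_plane, u_plane, v_plane, mb_row=1, mb_col=1)
```

Nothing is discarded here; the data is only regrouped, which is why the step has no compression gain.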

Page 10: MPEG-1 Video (Part 1)

Discrete Cosine Transform

• Each tile (aka block) in a macroblock is transformed with a 2D DCT.
• DCT is an orthonormal, separable, frequency basis much like a Fourier transform.
• 1-D case: 8 pixel values are transformed into 8 DCT coefficients.
• 2-D case: apply 1-D transform to all of the rows and then apply 1-D transform to all of the columns.
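The row-then-column procedure can be sketched directly. This is a straightforward floating-point version for illustration, not one of the fast fixed-point algorithms; the function names `dct_1d` and `dct_2d` are made up here, and the scale factors are the usual orthonormal ones.

```python
import math


def dct_1d(samples):
    """Orthonormal 1-D DCT of a list of samples (8 values for MPEG-1)."""
    n = len(samples)
    out = []
    for u in range(n):
        scale = math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
        total = sum(samples[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    for x in range(n))
        out.append(scale * total)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major


flat = [[100] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
```

A constant block puts all of its energy into the first coefficient: a block of 100s yields a value of 800 at position (0, 0) and values of essentially zero everywhere else.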

Page 11: MPEG-1 Video (Part 1)

DCT Basis Functions

The 8 input samples s(0), ..., s(7) are transformed into the 8 coefficients S(0), ..., S(7). S(0) is the DC term; S(1) through S(7) are the AC terms.

\[ S(0) = \frac{1}{2\sqrt{2}} \sum_{x=0}^{7} s(x) \cos\left( \frac{(2x+1)\pi}{16} \cdot 0 \right) \]

\[ S(u) = \frac{1}{2} \sum_{x=0}^{7} s(x) \cos\left( \frac{(2x+1)\pi}{16} \cdot u \right), \qquad u = 1, \ldots, 7 \]

Page 12: MPEG-1 Video (Part 1)

DCT Properties

• 8-bit pixel values produce 12-bit signed coefficient values.
• Fast algorithms exist for computation.
  – 13 multiplies and 29 additions
  – Fixed point integer math.
• Good perceptual properties.
  – Losing higher freq. results in a bit of blurring.
  – Ringing fairly minimal.

Page 13: MPEG-1 Video (Part 1)

Transform Coding Properties

• No loss of data
  – Except for numerical errors
• No compression either.
• Used to rearrange the data into a form that makes another coding technique more effective.

Page 14: MPEG-1 Video (Part 1)

Coefficient Quantization

• Each block is now 64 coefficients instead of 64 pixel values.
• Each coefficient quantized independently.
  – Allows larger quantization factors to be used with higher frequency coefficients.
• Quantization is controlled by two parameters:
  – Quantization table.
    • Set in the sequence header.
    • Two different tables, one each for intra and non-intra blocks.
  – Quantization factor.
    • Can be set on a per macroblock basis. Used to scale the table.
    • Can take values from 2 to 62 (twice the 5-bit Q Scale, which ranges from 1 to 31).
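The interaction of the table and the per-macroblock factor can be sketched as below. This is a simplified illustration, not the standard's exact arithmetic: the table is a made-up one whose entries merely grow with frequency, the step size is assumed to be `table entry * qscale / 8`, and all function names are hypothetical.

```python
def illustrative_table():
    """Hypothetical 8x8 table: larger entries for higher frequencies."""
    return [[8 + 4 * (i + j) for j in range(8)] for i in range(8)]


def round_half_away(x):
    """Round to the nearest integer, with halves going away from zero."""
    return int(x + 0.5) if x >= 0 else -int(-x + 0.5)


def quantize(coeffs, table, qscale):
    """Divide each coefficient by its own step size and round."""
    return [[round_half_away(8 * coeffs[i][j] / (table[i][j] * qscale))
             for j in range(8)] for i in range(8)]


def dequantize(levels, table, qscale):
    """Multiply each level back by its step size (the lossy inverse)."""
    return [[levels[i][j] * table[i][j] * qscale / 8
             for j in range(8)] for i in range(8)]


table = illustrative_table()
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[7][7] = 100, -50, 10
levels = quantize(block, table, qscale=4)
restored = dequantize(levels, table, qscale=4)
```

With these inputs the small high-frequency coefficient at (7, 7) quantizes to zero and is lost, while -50 at (0, 1) comes back as -48.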

Page 15: MPEG-1 Video (Part 1)

Quantization Properties

• Data loss relative to quantization step.
• Compression in two ways:
  – Smaller range to represent.
    • In our case 12-bit signed values turn into 9-bit signed values.
  – Creates runs of the same number.
    • In our case: runs of zeroes.

Page 16: MPEG-1 Video (Part 1)

Run Length Encoding

• High quantization step size for higher frequency components results in lots of zero coefficients.
• Run Length Encoding provides better representation.
  – Convert 2D matrix into 1D ordering of coefficients.
  – Reorganize as (run, value) pairs.
  – Run specifies number of zeroes to insert in the ordering before value appears.
  – Special marker that indicates nothing left but zeroes.
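The (run, value) reorganization can be sketched as follows, operating on coefficients that have already been put into a 1-D ordering. The names `run_length_encode`, `run_length_decode`, and the `EOB` string marker are placeholders chosen for this example.

```python
EOB = "EOB"  # placeholder for the "nothing left but zeroes" marker


def run_length_encode(coeffs):
    """Turn a 1-D list of coefficients into (run, value) pairs plus EOB."""
    pairs = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    pairs.append(EOB)  # any trailing zeroes are implied by the marker
    return pairs


def run_length_decode(pairs, length):
    """Rebuild the 1-D list; length is the number of coefficients expected."""
    coeffs = []
    for item in pairs:
        if item == EOB:
            break
        run, value = item
        coeffs.extend([0] * run)
        coeffs.append(value)
    coeffs.extend([0] * (length - len(coeffs)))
    return coeffs


original = [5, 0, 0, -2, 1, 0, 0, 0]
encoded = run_length_encode(original)
decoded = run_length_decode(encoded, len(original))
```

Decoding returns exactly the input list, so the step itself loses no data.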

Page 17: MPEG-1 Video (Part 1)

Zig-Zag ordering

• In order to group as many of the zeroes together, zig-zag ordering used.
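One way to generate the zig-zag ordering is to walk the anti-diagonals of the block, alternating direction on each one. This sketch assumes the conventional scan that starts at the top-left (lowest frequency) and steps right first; `zigzag_order` and `zigzag_scan` are hypothetical names.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    positions = [(i, j) for i in range(n) for j in range(n)]

    def key(pos):
        i, j = pos
        diagonal = i + j
        # Odd diagonals run top-right to bottom-left (row increasing);
        # even diagonals run the other way (row decreasing).
        return (diagonal, i if diagonal % 2 else -i)

    return sorted(positions, key=key)


def zigzag_scan(block):
    """Flatten a square block into a 1-D list following the zig-zag order."""
    return [block[i][j] for i, j in zigzag_order(len(block))]


order = zigzag_order()
sample = [[r * 8 + c for c in range(8)] for r in range(8)]
flat = zigzag_scan(sample)
```

Low-frequency positions come first, so the zeroes produced by coarse quantization of the high frequencies end up together at the tail of the list.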

Page 18: MPEG-1 Video (Part 1)

RLE Properties

• Compression related to avg. size of run.
• No data loss.

Page 19: MPEG-1 Video (Part 1)

DC Term Encoding

• At this stage, each block in our macroblock is represented as a set of RLE’d DCT coefficients.
• DC term is always coded even if it is zero.
  – Coded as difference between last DC term and current DC term.
  – Blocks are ordered within a macroblock.
• Why code the difference?
  – Avg. pixel value of one block is likely to be correlated to that of a nearby block.

Page 20: MPEG-1 Video (Part 1)

DC Term Encoding Cont’d

• Now DC term is expressed as difference from previous DC term (DC_DIFF).
• Encoded as two parts:
  – Size of difference, i.e., the number of bits needed to represent it (roughly log2 of |DC_DIFF|).
  – Size number of bits that provide the value.
• Size is encoded as a Huffman code.
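The two-part representation can be sketched as below. The Huffman code for the size is left out; only the size and the value bits are computed. The mapping of negative differences onto the lower half of the `size`-bit range is the convention assumed here, and the function names are hypothetical.

```python
def dc_size_and_bits(dc_diff):
    """Split a DC difference into (size, bit string of length size)."""
    size = abs(dc_diff).bit_length()
    if size == 0:
        return 0, ""  # a zero difference needs no value bits
    if dc_diff > 0:
        value = dc_diff
    else:
        # Assumed convention: negatives occupy codes 0 .. 2**(size-1) - 1.
        value = dc_diff + (1 << size) - 1
    return size, format(value, "0{}b".format(size))


def dc_from_size_and_bits(size, bits):
    """Inverse of dc_size_and_bits."""
    if size == 0:
        return 0
    value = int(bits, 2)
    if value >= 1 << (size - 1):
        return value  # leading 1 bit: positive difference
    return value - (1 << size) + 1


examples = {d: dc_size_and_bits(d) for d in (0, 1, -1, 5, -5, 255, -255)}
```

Small differences, which are the common case when neighboring blocks are correlated, get a small size and few value bits.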

Page 21: MPEG-1 Video (Part 1)

Differential Encoding

• Useful when values being encoded are well correlated.
• Distribution of differences is expected to not be uniform.
• No compression per se, but increases the efficiency of entropy encoding techniques (e.g., Huffman coding).

Page 22: MPEG-1 Video (Part 1)

AC Term Encoding

• AC terms are given as (run, value) pairs.
• Encoded in one of two ways:
  – Huffman code for (run, abs(value)) followed by single bit for sign of value.
  – Special Huffman code indicating ESCAPE, followed by 6 bits for run and either 8 or 16 bits for value.
    • 6 bits for run simply encode 0 through 63.
    • First 8 bits of value, read as a signed number, put value at -128 to 127.
    • If first 8 bits are -128, next 8 bits provide codes for -128 through -255.
    • If first 8 bits are 0, next 8 bits provide codes for 128 through 255.
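The fixed-length part of the ESCAPE form can be sketched as follows. The ESCAPE Huffman code itself is omitted, and the sketch assumes that the short 8-bit form carries -127 through 127 directly, with the bit patterns for -128 and 0 reserved as prefixes for the 16-bit form. The function names are hypothetical.

```python
def encode_escape(run, value):
    """Return the bits that follow the ESCAPE code: 6-bit run, then value."""
    if not 0 <= run <= 63:
        raise ValueError("run must fit in 6 bits")
    if value == 0 or not -255 <= value <= 255:
        raise ValueError("value must be a non-zero 9-bit signed quantity")
    run_bits = format(run, "06b")
    if -127 <= value <= 127:
        return run_bits + format(value & 0xFF, "08b")  # two's complement
    if value > 0:
        return run_bits + "00000000" + format(value, "08b")      # 128..255
    return run_bits + "10000000" + format(value + 256, "08b")    # -255..-128


def decode_escape(bits):
    """Inverse of encode_escape; returns (run, value, bits_consumed)."""
    run = int(bits[:6], 2)
    first = int(bits[6:14], 2)
    if first == 0:
        return run, int(bits[14:22], 2), 22
    if first == 128:
        return run, int(bits[14:22], 2) - 256, 22
    value = first - 256 if first > 127 else first
    return run, value, 14


short_form = encode_escape(3, -127)
long_form = encode_escape(63, 200)
```

The short form costs 14 bits after the ESCAPE code and the long form 22, which is acceptable because these combinations are rare.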

Page 23: MPEG-1 Video (Part 1)

Entropy Coding

• Huffman codes are a form of entropy encoding.
• Relies on uneven distribution of values to be encoded.
• Length of code associated with a value is inversely related to its weight in the distribution.
  – The more likely the value is to occur, the shorter the code length relative to all the other codes.
• No data loss.
• Compression depends on distribution.
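The relationship between weight and code length can be seen by building a Huffman code for a small made-up distribution. MPEG-1 itself uses fixed code tables defined in the standard rather than computing them per stream, so this is only an illustration of the principle; `huffman_code_lengths` and the sample weights are hypothetical.

```python
import heapq


def huffman_code_lengths(weights):
    """Return {symbol: code length in bits} for a dict of symbol weights."""
    # Each heap entry: (total weight, tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(weights.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


weights = {"a": 8, "b": 4, "c": 2, "d": 2}
lengths = huffman_code_lengths(weights)
total_bits = sum(weights[s] * lengths[s] for s in weights)
```

Here the most frequent symbol gets a 1-bit code and the rarest get 3 bits, for 28 bits in total versus 32 bits with a fixed 2-bit code. A uniform distribution would show no gain at all.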

Page 24: MPEG-1 Video (Part 1)

Stepping Back A Bit

• Picture Header
• Picture Data: Row Major Scan of Encoded Macroblocks
  – Macroblock Address Increment (1-bit)
  – Macroblock Type (1 or 2 bits)
  – Q Scale (5 bits)
  – Luminance Blocks, U Block, V Block; each block contains:
    • DC Size (2-7 bits)
    • DC Bits (0-8 bits)
    • First Non-zero AC Coeff. (variable bit length)
    • ...
    • Last Non-zero AC Coeff. (variable bit length)
    • EOB (2 bits)

Page 25: MPEG-1 Video (Part 1)

Slices

• One last level of organization.
• Macroblocks grouped into slices.
  – Typically, one row of macroblocks in one slice.
  – Other groupings also possible.
• Slice starts with a slice header.
  – Contains qscale and indicates the row in which the slice starts.
• Decoder state is reset.
  – DC predictors for Y, U, and V set to 1024.
  – Prev. macroblock address set to address of first macroblock in slice row (may not be first macroblock in slice).
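The effect of the predictor reset on differential DC coding can be sketched as below. The sketch tracks a single predictor for one plane, whereas a real decoder keeps one each for Y, U, and V; `dc_differences` and the sample DC values are made up for illustration.

```python
DC_RESET = 1024  # predictor value at the start of every slice


def dc_differences(slices):
    """Turn per-slice lists of DC terms into per-slice lists of differences."""
    result = []
    for dc_terms in slices:
        predictor = DC_RESET  # reset: nothing carries over between slices
        diffs = []
        for dc in dc_terms:
            diffs.append(dc - predictor)
            predictor = dc
        result.append(diffs)
    return result


def dc_terms_from_differences(slice_diffs):
    """Inverse operation, as a decoder would perform it."""
    result = []
    for diffs in slice_diffs:
        predictor = DC_RESET
        terms = []
        for d in diffs:
            predictor += d
            terms.append(predictor)
        result.append(terms)
    return result


two_slices = [[1030, 1032, 1020], [1000, 1004]]
diffs = dc_differences(two_slices)
```

Because each slice starts from a known state, a decoder that loses data in one slice can pick up again at the next slice header.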

Page 26: MPEG-1 Video (Part 1)

I-Frame Review

• All macroblocks are intra-coded.
• Blocks DCT’d and quantized to produce coefficients.
• DC terms encoded differentially.
• AC terms encoded with entropy codes associated with (run, value) pairs.
  – Escape code with fixed-length encoding for seldom used possibilities.
• In general, compression ratio is 10:1 to 20:1