CSc 461/561 Multimedia Systems Part B: 2. Lossy Compression


Summary

(1) Why is lossy compression possible?
(2) Distortion measure
(3) Quantization
(4) Transformation
(5) Introduction to JPEG-Part I
(6) Introduction to MPEG-Part I

1. Why is lossy compression possible?

– some information matters more than other information to human perception

– keep the important part and discard the rest

[Figure: the original image next to three lossy versions, labeled with compression ratios 12.3, 7.7, and 33.9]

2. Distortion measure

• Rate
– # of bits per source symbol

• Distortion
– one measure: mean square error (MSE)
– x: original value; y: reconstructed value
– $\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}(x_i - y_i)^2$

• Rate vs distortion
– lower rate, higher distortion

[Figure: rate vs distortion plot (axes: Rate, Distortion) with two marked points, A and B]
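The MSE definition on this slide can be sketched in a few lines of Python. The function name `mse` and the sample values are illustrative assumptions, not part of the course material.

```python
def mse(original, reconstructed):
    """Mean square error between two equal-length sequences of numbers."""
    if len(original) != len(reconstructed):
        raise ValueError("sequences must have the same length")
    if len(original) == 0:
        raise ValueError("sequences must be non-empty")
    n = len(original)
    # Average of the squared differences (x_i - y_i)^2.
    return sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / n


# Hypothetical sample values, chosen only for illustration.
x = [10, 20, 30, 40]
y = [12, 18, 30, 44]
print(mse(x, y))  # (4 + 4 + 0 + 16) / 4 = 6.0
```

A perfect reconstruction gives an MSE of 0; coarser reconstructions push the value up, which is the "lower rate, higher distortion" trade-off.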

3. Quantization (1)

• Quantization (recall audio A/D)
– use a discrete value to represent a range of values
– information loss!

• The smaller the range, the less the distortion
– granular distortion

• Quantization steps
– uniform: all ranges have the same size
– non-uniform: otherwise

3. Uniform quantization (2)

• Quantization step: uniform
• Two constructions: midrise, midtread

[Figure: Uniform Midrise Quantizer; input axis marked at ±∆, ±2∆, ±3∆; reconstruction levels ±0.5∆, ±1.5∆, ±2.5∆, ±3.5∆]

[Figure: Uniform Midtread Quantizer; input axis marked at ±0.5∆, ±1.5∆, ±2.5∆; reconstruction levels ±∆, ±2∆, ±3∆ (and 0 at the origin)]
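The two constructions can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the quantizers are left unbounded (no clipping at the outermost level), whereas a real n-bit quantizer would saturate.

```python
import math


def midrise(x, delta):
    """Midrise: zero is a decision boundary, so outputs are odd multiples of delta/2."""
    return (math.floor(x / delta) + 0.5) * delta


def midtread(x, delta):
    """Midtread: zero is an output level, so outputs are integer multiples of delta."""
    return math.floor(x / delta + 0.5) * delta


# With delta = 1.0, a small positive input lands on different levels.
print(midrise(0.2, 1.0))   # 0.5
print(midtread(0.2, 1.0))  # 0.0
```

The midtread form maps small inputs around zero to exactly zero, while the midrise form never outputs zero.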

3. Signal-to-quantization-noise ratio (3)

• Quantization
– n bits; $2^n$ steps for $[-X_{max}, X_{max}]$
– step size: $\Delta = 2X_{max} / 2^n$
– granular distortion: $\sigma_q^2 = \int_{-\Delta/2}^{\Delta/2} (x-0)^2 \frac{1}{\Delta}\,dx = \frac{1}{12}\Delta^2$

• SQNR in dB
– $10\log_{10}(\text{signal energy}/\text{noise energy}) = 10\log_{10}\frac{(2X_{max})^2/12}{\Delta^2/12} = 20n\log_{10}2$

• One more bit adds 6 dB to SQNR
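The 6 dB per bit rule can be checked numerically. The sketch below assumes the input is spread evenly over [-Xmax, Xmax] (a dense, evenly spaced grid stands in for a uniformly distributed signal) and uses a midrise quantizer; the function name `sqnr_db` and its parameters are illustrative assumptions.

```python
import math


def sqnr_db(n_bits, x_max=1.0, samples=1 << 16):
    """Measured SQNR in dB of an n-bit uniform midrise quantizer on [-x_max, x_max]."""
    delta = 2 * x_max / 2 ** n_bits          # step size: 2*Xmax / 2^n
    grid_step = 2 * x_max / samples
    signal = 0.0
    noise = 0.0
    for k in range(samples):
        x = -x_max + (k + 0.5) * grid_step   # evenly spaced input value
        y = (math.floor(x / delta) + 0.5) * delta  # midrise reconstruction
        signal += x * x
        noise += (x - y) ** 2
    return 10 * math.log10(signal / noise)


for n in (4, 8, 9):
    # The measured value should sit close to 20 * n * log10(2), about 6.02 * n dB.
    print(n, sqnr_db(n), 20 * n * math.log10(2))
```

Going from 8 to 9 bits halves the step size, which quarters the noise energy and so adds about 6 dB.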

3. Non-uniform quantization (4)

• Recall µ-law or A-law voice compander
• How to choose quantization steps?
– give every interval the same probability: $\int_{x_i}^{x_{i+1}} f(x)\,dx = 1/2^n$

[Figure: density f(x) versus x with one interval [x_i, x_{i+1}] marked; panels: Uniform, Non-uniform]
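Choosing the steps so that each of the 2^n intervals holds probability 1/2^n amounts to reading the boundaries off the inverse CDF. The sketch below assumes an exponential density f(x) = rate * exp(-rate * x) for x >= 0, picked only because its CDF inverts in closed form; the slide does not name a specific density, and the function name is hypothetical.

```python
import math


def equal_probability_boundaries(n_bits, rate=1.0):
    """Boundaries x_0 .. x_{2^n} such that each interval has probability 1/2^n.

    Assumption: exponential density f(x) = rate * exp(-rate * x), x >= 0.
    """
    levels = 2 ** n_bits
    boundaries = []
    for i in range(levels + 1):
        p = i / levels                       # cumulative probability at x_i
        if p == 1.0:
            boundaries.append(math.inf)      # the last interval is open-ended
        else:
            # Inverse CDF of the exponential distribution.
            boundaries.append(math.log(1.0 / (1.0 - p)) / rate)
    return boundaries


print(equal_probability_boundaries(2))
```

The intervals come out narrow where f(x) is large and wide where it is small, which is the point of the non-uniform design.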

3. Non-uniform quantization: more (5)

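• How to represent a range?
– pick $y_i$ so that it splits the interval's probability in half: $\int_{x_i}^{y_i} f(x)\,dx = 1/2^{n+1}$
– when uniform: $y_i = (x_i + x_{i+1})/2$

[Figure: density f(x) versus x with the representative y_i marked inside [x_i, x_{i+1}]; panels: Uniform, Non-uniform]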
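The rule for the representative value places y_i where half of the interval's probability, 1/2^(n+1), has accumulated since x_i. A minimal sketch, again assuming an exponential density f(x) = rate * exp(-rate * x) purely for illustration (all names here are hypothetical):

```python
import math


def exp_inverse_cdf(p, rate=1.0):
    """Inverse CDF of the assumed exponential density, for 0 <= p < 1."""
    return math.log(1.0 / (1.0 - p)) / rate


def representative_levels(n_bits, rate=1.0):
    """y_i such that the probability between x_i and y_i is 1/2^(n+1)."""
    levels = 2 ** n_bits
    # x_i sits at cumulative probability i/2^n, so y_i sits at (i + 0.5)/2^n.
    return [exp_inverse_cdf((i + 0.5) / levels, rate) for i in range(levels)]


def uniform_representative(x_lo, x_hi):
    """When f(x) is uniform on the interval, the rule reduces to the midpoint."""
    return (x_lo + x_hi) / 2


print(representative_levels(1))          # one level per half of the probability mass
print(uniform_representative(2.0, 4.0))  # 3.0
```

For a decreasing density the representative falls below the interval midpoint, toward the side where values are more likely.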

4. Transformation (1)

• Transformation
– represent information in another space
  • identify and remove correlation (i.e., redundancy) that is hard to remove in the original space
  • information loss (when coefficients are quantized or discarded)!
– e.g., time/space => frequency (FFT)

• Inverse transformation
– represent the information back in the original space

4. Discrete Cosine Transform (2)

• Recall: a wave is a sum of many waves
• “Any signal can be expressed as a sum of multiple signals that are sine or cosine waveforms at various amplitudes and frequencies.”
• Cosine transform: using cosine waveforms
• DCT: integer indexes
– widely used in image compression (e.g., JPEG)

4. DCT: more (3)

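• 2-D DCT (8x8)
– $F(u,v) = \frac{1}{4} C(u) C(v) \sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$

• Inverse 2-D DCT (IDCT)
– $f(x,y) = \frac{1}{4} \sum_{u=0}^{7}\sum_{v=0}^{7} C(u) C(v) F(u,v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$

• $C(k) = 1/\sqrt{2}$ when $k = 0$; $C(k) = 1$ otherwise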
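A direct, unoptimized sketch of the 8x8 transform pair may help make the definition concrete. It assumes the common JPEG-style normalization (a factor of 1/4 and C(0) = 1/sqrt(2)); the function names are hypothetical, and practical codecs use fast factorizations rather than this four-level loop.

```python
import math

N = 8  # block size


def c(k):
    """Normalization factor: 1/sqrt(2) for index 0, 1 otherwise."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def basis(i, k):
    """Cosine basis value for spatial index i and frequency index k."""
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct2(block):
    """Forward 2-D DCT of an 8x8 block (list of 8 lists of 8 numbers)."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += block[x][y] * basis(x, u) * basis(y, v)
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out


def idct2(coeffs):
    """Inverse 2-D DCT: recovers the spatial block from its coefficients."""
    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    s += c(u) * c(v) * coeffs[u][v] * basis(x, u) * basis(y, v)
            out[x][y] = 0.25 * s
    return out


# A flat block has all of its energy in the DC coefficient F(0,0).
flat = [[100] * N for _ in range(N)]
print(round(dct2(flat)[0][0], 6))  # 800.0
```

Without quantization the pair is invertible up to floating-point rounding; the loss in DCT-based coding comes from quantizing the coefficients afterwards.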

4. DCT: examples (4)

[Figure: original values of an 8x8 block (in spatial domain) and the corresponding DCT coefficients (in frequency domain), with the DC component marked]

5. Introduction to JPEG-Part I (1)

• Joint Photographic Experts Group (JPEG)
– ISO standard (1992)
– widely used (.jpeg, .jpe, .jpg; compression ratio: 10~20)

• The family of JPEGs
– lossless JPEG: prediction-based compression
– lossy JPEG: DCT-based compression
– M-JPEG: motion JPEG
– JPEG2000: discrete wavelet transform; new!


JPEG compression guidelines

– Brightness vs color sensitivity
  • RGB => YUV/YIQ
  • chroma subsampling (4:2:0)

– Spatial correlation among nearby pixels
  • slice an image into 8x8 blocks (bad for text)

– Remove redundancy in frequency domain
  • discrete cosine transform (DCT)
  • coarse quantization for high-frequency coefficients
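Two of these guidelines can be sketched in code: the color conversion with chroma subsampling, and coarser quantization at higher frequencies. The sketch makes several assumptions. It uses the JFIF-style full-range YCbCr conversion (the slide only says YUV/YIQ), it averages 2x2 neighborhoods for 4:2:0 (one of several possible filters), and its step-size rule `base_step + slope * (u + v)` is a hypothetical stand-in, not the JPEG quantization table.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to luma (Y) and chroma (Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks.

    Assumes the plane has even width and height.
    """
    h, w = len(plane), len(plane[0])
    return [
        [(plane[i][j] + plane[i][j + 1] + plane[i + 1][j] + plane[i + 1][j + 1]) / 4
         for j in range(0, w, 2)]
        for i in range(0, h, 2)
    ]


def quantize_coefficients(coeffs, base_step=4.0, slope=2.0):
    """Quantize DCT coefficients with a step that grows with frequency (u + v).

    The step rule is illustrative only; real JPEG uses standardized tables.
    """
    return [
        [round(coeffs[u][v] / (base_step + slope * (u + v)))
         for v in range(len(coeffs[u]))]
        for u in range(len(coeffs))
    ]


print(rgb_to_ycbcr(255, 255, 255))     # white: full luma, neutral chroma
print(subsample_420([[1, 3], [5, 7]]))  # [[4.0]]
```

Only the chroma planes are subsampled; the luma plane keeps full resolution because the eye is more sensitive to brightness than to color.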


• Sequential mode

• Progressive mode
– low quality first, then differential data added
  • DC first, then AC; or MSB first, then LSB

• Hierarchical mode
– lowest resolution first and then higher resolutions

• Lossless mode
– prediction and entropy encoding


• We will revisit the topic later.


6. Introduction to MPEG-Part I (1)

• MPEG-1 (1991): VCD (VCR+CD quality)
– 352x240, 1.2 Mbps video CBR, 256 Kbps audio
– progressive scan only (1x CD-ROM)

• MPEG-1 video compression
– similar to H.261, with a few differences
  • more formats, flexible slices, quantization table
– I-frame: JPEG-like compression
– P-frame: prediction-based
– B-frame: bi-directional prediction

6. Introduction to MPEG-Part I (2) MPEG-1: more

• Bi-directional search
– search both previous and next frames for similar macro-blocks

• MPEG-1 GOP (group of pictures)
– I-frame, P-frame, B-frame
  • display order: IBBPBBPBBPBBPBBI (M=3, N=15)
  • coding order: IPBBPBBPBBPBBIBB; timestamps
– D-frame: for search through the video, DC only

[Figure: frames 1 to 9 in display order: I B B P B B P B B]
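The reordering from display order to coding order follows from one rule: a B-frame can only be decoded after the anchor frame (I or P) that follows it in display order. A minimal sketch of that rule, with a hypothetical function name and frames represented as single characters:

```python
def display_to_coding_order(display):
    """Reorder a frame-type string so each anchor precedes the B-frames that need it."""
    coding = []
    pending_b = []                 # B-frames waiting for their next anchor
    for frame in display:
        if frame == "B":
            pending_b.append(frame)
        else:
            # An I- or P-frame is emitted first, then the B-frames it unblocks.
            coding.append(frame)
            coding.extend(pending_b)
            pending_b = []
    coding.extend(pending_b)       # trailing B-frames with no following anchor
    return "".join(coding)


print(display_to_coding_order("IBBPBBPBBPBBPBBI"))  # IPBBPBBPBBPBBIBB
```

Because the coding order differs from the display order, each frame carries timestamps so the decoder can put it back in place.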

6. Introduction to MPEG-Part I (3) MPEG-2

• MPEG-2 (1994): DVD, HDTV, etc.
– also adopted as ITU-T H.262
– many video formats and data rates; better audio
  • profiles: simple (4:2:0, I/P), main (+B), SNR (+variable quality), spatial (+variable resolution), high (+4:2:2)
  • levels: low (352x288), main (720x576), high 1440 (1440x1152), high (1920x1152)
– support interlaced video (broadcasting!)

6. Introduction to MPEG-Part I (4) MPEG-2 scalability

• Layered encoding
– base layer: independent for basic quality
– enhancement layer: dependent on the base layer

• E.g., SNR scalability
– base: low SQNR (coarse quantization)
– enhancement: high SQNR (fine quantization of the difference between the actual signal and the base layer)

• E.g., spatial scalability
– base: low resolution; enhancement: high resolution
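SNR scalability can be illustrated with two quantizers: a coarse one for the base layer and a fine one applied to what the base layer missed. This is a simplified sketch on plain sample values, whereas MPEG-2 applies the idea to DCT coefficients; the step sizes and function names are assumptions made for illustration.

```python
def quantize(value, step):
    """Uniform quantizer: snap a value to the nearest multiple of step."""
    return step * round(value / step)


def encode_layers(samples, coarse_step=16, fine_step=2):
    """Return (base, enhancement): coarse values plus finely quantized residuals."""
    base = [quantize(x, coarse_step) for x in samples]
    # The enhancement layer codes the difference between the actual value and the base.
    enhancement = [quantize(x - b, fine_step) for x, b in zip(samples, base)]
    return base, enhancement


def decode(base, enhancement=None):
    """Decode the base layer alone, or base plus enhancement when available."""
    if enhancement is None:
        return list(base)
    return [b + e for b, e in zip(base, enhancement)]


samples = [3, 37, 100, 201, 250]   # hypothetical input values
base, enhancement = encode_layers(samples)
print(decode(base))                # basic quality: error up to coarse_step / 2
print(decode(base, enhancement))   # higher quality: error up to fine_step / 2
```

A receiver that gets only the base layer still decodes a usable signal; one that also gets the enhancement layer reduces the quantization error.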

6. Introduction to MPEG-Part I (5) MPEG-4

• MPEG-4 (1999): content-based, object-oriented
– based on H.263, initially for low bit-rate apps
– video sequence: a collection of media objects
  • objects: still image, moving object, audio, etc.
  • how to decompose is NOT specified (left to the encoder)
– VOP: video object plane
  • GOV (group of VOPs): I-VOP, P-VOP, B-VOP
  • VOP is divided into many macro-blocks
– motion estimation: bounding box; padding

6. Introduction to MPEG-Part I (5) MPEG-4: object oriented

6. Introduction to MPEG-Part I (6) MPEG-4: more

• Fine granularity scalability
– spatial scalability
– temporal scalability
– quality scalability

• MPEG-4 audio
– general audio (2~64 Kbps)
– speech (2~4 Kbps: HVXC; 4~24 Kbps: CELP)
– synthesized (e.g., MIDI, TTS)

6. Introduction to MPEG-Part I (7)

• We will revisit the topic later.
