EE465: Introduction to Digital Image Processing Binary Image Compression 4 The art of modeling image...

EE465: Introduction to Digital Image Processing

Binary Image Compression

The art of modeling image source
– Image pixels are NOT independent events

Run length coding of binary and graphic images
– Applications: BMP, TIF/TIFF

Lempel-Ziv coding*
– How can the idea of building a dictionary achieve data compression?


From Theory to Practice

So far, we have discussed the problem of data compression under the assumption that the source distribution is known (i.e., the probabilities of all possible events are given).

In practice, the source distribution (probability model) is unknown and, because of the inherent dependency among source data, can only be approximated, e.g., by relative frequency counting.


An Image Example

Binary image of size 100×100 (approximately 1000 dark pixels)


A Bad Model

Each pixel in the image is observed as an independent event

All pixels can be characterized by a single discrete random variable – binary Bernoulli source

We can estimate the probabilities by relative frequency counting


Binary Bernoulli distribution (P(X=black)=0.1)

Synthesized Image by the Bad Model
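
The bad model can be sketched in a few lines of Python. This is only an illustration: the function names and the toy rectangle image are assumptions, not part of the lecture. It estimates P(X=black) by relative frequency counting, computes the entropy of the resulting Bernoulli source, and synthesizes an image by drawing every pixel independently.

```python
import math
import random

def estimate_p_black(image):
    """Relative frequency of black pixels (1 = black, 0 = white)."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)

def bernoulli_entropy(p):
    """Entropy in bits per pixel of a memoryless binary source with P(black) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def synthesize(p, height, width, seed=0):
    """Draw every pixel independently with P(black) = p (the 'bad model')."""
    rng = random.Random(seed)
    return [[1 if rng.random() < p else 0 for _ in range(width)]
            for _ in range(height)]

# Hypothetical 100x100 test image: one solid 25x40 black rectangle (1000 dark pixels).
image = [[1 if 10 <= r < 35 and 20 <= c < 60 else 0 for c in range(100)]
         for r in range(100)]
p = estimate_p_black(image)
fake = synthesize(p, 100, 100)
print(p, round(bernoulli_entropy(p), 3))  # 0.1 0.469
```

The synthesized image has roughly the same fraction of black pixels as the original but none of its structure: the rectangle dissolves into scattered dots.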


Why does It Fail?

Roughly speaking
– Pixels in an image are not independent events (source is not memoryless but spatially correlated)
– Pixels in an image do not observe the same probability model (source is not stationary but spatially varying)

Fundamentally speaking
– Pixels are the projection of meaningful objects in the real world (e.g., characters, lady, flowers, cameraman, etc.)
– Our sensory processing system has learned to understand meaningful patterns in sensory input through evolution


Similar Examples in 1D

Scenario I: think of a paragraph of English text. The letters are NOT independent due to semantic structures. For example, the probability of seeing a “u” is typically small; however, if a “q” appears, the probability that the next letter is “u” is large.

Scenario II: think of a piece of human speech. It consists of silent and voiced segments. The silent segments can be modeled by white Gaussian noise, while the voiced segments cannot (e.g., pitches).
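
Scenario I can be checked by counting. The sketch below compares the unconditional relative frequency of “u” with its frequency right after a “q”. The sample sentence is an assumption made up for illustration and is deliberately heavy in “q”; a real estimate would need a large corpus.

```python
from collections import Counter

# Hypothetical sample text, chosen only to make the effect visible.
text = "the quick queen quietly quit the quiz"
letters = [ch for ch in text if ch.isalpha()]

# Unconditional relative frequency of 'u' among all letters.
p_u = letters.count("u") / len(letters)

# Relative frequency of 'u' given that the previous character was 'q'.
pairs = Counter(zip(text, text[1:]))
q_total = sum(n for (first, _), n in pairs.items() if first == "q")
p_u_given_q = pairs[("q", "u")] / q_total

print(round(p_u, 3), p_u_given_q)
```

A memoryless model would code every “u” with the small unconditional probability, even in positions where it is almost certain.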


Data Compression in Practice

[Block diagram: discrete source X → source modeling → Y → entropy coding → binary bit stream; probability estimation supplies P(Y) to the entropy coder]

The art of data compression is the art of source modeling

Probabilities can be estimated by counting relative frequencies either online or offline
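
The difference between offline and online counting can be sketched as follows. The sketch makes two assumptions that are not in the slides: an ideal entropy coder that spends -log2(p) bits per symbol, and online counts initialized to 1 for every symbol so that no probability is ever zero. The function names are illustrative.

```python
import math
from collections import Counter

def offline_code_length(symbols):
    """Two-pass scheme: count the whole sequence first, then code with fixed probabilities."""
    counts = Counter(symbols)
    n = len(symbols)
    return sum(-math.log2(counts[s] / n) for s in symbols)

def online_code_length(symbols, alphabet):
    """One-pass scheme: code each symbol with counts from the causal past, then update."""
    counts = {a: 1 for a in alphabet}  # assumed initialization
    total_bits = 0.0
    for s in symbols:
        total_bits += -math.log2(counts[s] / sum(counts.values()))
        counts[s] += 1
    return total_bits

x = "0000000010"
print(round(offline_code_length(x), 2), round(online_code_length(x, "01"), 2))
```

The offline figure does not include the cost of sending the counts to the decoder. The online scheme needs no such side information because the decoder can repeat the same counting.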


Source Modeling Techniques

Transformation
– Transform the source into an equivalent yet more convenient representation

Prediction
– Predict the future based on the causal past

Pattern matching
– Identify and represent repeated patterns


Non-image Examples

Transformation in audio coding (MP3)
– Audio samples are transformed into the frequency domain

Prediction in speech coding (CELP)
– Human vocal tract can be approximated by an auto-regressive model

Pattern matching in text compression
– Search for repeated patterns in texts


How to Build a Good Model

Study the source characteristics
– The origin of data: computer-generated, recorded, scanned …
– It is about linguistics, physics, material science and so on …

Choose or invent the appropriate tool (modeling technique)

Q: Why is pattern matching suitable for texts, but not for speech or audio?


Image Modeling

Binary images
– Scanned documents (e.g., FAX) or computer-generated (e.g., line drawing in WORD)

Graphic images
– Windows icons, web banners, cartoons

Photographic images
– Acquired by digital cameras

Others: fingerprints, CT images, astronomical images …


Lossless Image Compression

No information loss, i.e., the decoded image is mathematically identical to the original image
– For some sensitive data such as document or medical images, information loss is simply unbearable
– For others such as photographic images, we only care about the subjective quality of decoded images (not the fidelity to the original)


Run Length Coding (RLC)

What is run length?

Run length is defined as the length of a run of consecutive identical symbols

Examples

Coin-flip: HHHHH T HHHHHHH → run lengths 5, 1, 7

Random walk: SSSS EEEE NNNN WWWW → run lengths 4, 4, 4, 4


Run Length Coding (Cont’d)

[Block diagram: discrete source X → transformation by run-length counting → Y → entropy coding → binary bit stream; P(Y) is supplied to the entropy coder]

Y is the sequence of run-lengths from which X can be recovered losslessly


RLC of 1D Binary Source

X: 0000 111 000000 11111 00

Y: 4 3 6 5 2

(need extra 1 bit to denote what the starting symbol is)

Properties

- “0” run-lengths (red) and “1” run-lengths (green) alternate

- run-lengths are positive integers

Y → Huffman coding → compressed data
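
The run-length transformation of a binary source and its inverse can be sketched as below. The function names are assumptions made for illustration, and the Huffman coding stage that would follow is left out.

```python
from itertools import groupby

def rlc_encode(bits):
    """Return (starting symbol, run-lengths) for a non-empty string of '0'/'1'."""
    runs = [len(list(group)) for _, group in groupby(bits)]
    return bits[0], runs

def rlc_decode(start, runs):
    """Rebuild the string: runs alternate between the two symbols, beginning with `start`."""
    out, symbol = [], start
    for run in runs:
        out.append(symbol * run)
        symbol = "1" if symbol == "0" else "0"
    return "".join(out)

x = "0000" + "111" + "000000" + "11111" + "00"   # the source X from the example
start, y = rlc_encode(x)
print(start, y)  # 0 [4, 3, 6, 5, 2]
assert rlc_decode(start, y) == x                 # X is recovered losslessly
```

The starting symbol returned by `rlc_encode` is the one extra bit mentioned in the example.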


Variation of 1D Binary RLC

When P(x=0) is close to 1, we can record run-length of dominant symbol (“0”) only

Example

X: 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 …

run-length: 5 7 4 0 8

Properties

- all coded symbols are “0” run-lengths

- run-length is a nonnegative integer
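
A minimal sketch of this variation follows: only the number of zeros in front of each “1” is recorded. The function names and the separate `tail` count (zeros after the last “1”) are assumptions added so that any input can be decoded; the slide example ends with a “1” and therefore has an empty tail.

```python
def zero_runs_encode(bits):
    """Return (runs, tail): runs[i] zeros precede the i-th '1'; `tail` zeros follow the last '1'."""
    runs, count = [], 0
    for b in bits:
        if b == "0":
            count += 1
        else:
            runs.append(count)   # may be 0 when two '1's are adjacent
            count = 0
    return runs, count

def zero_runs_decode(runs, tail):
    """Inverse of zero_runs_encode."""
    return "".join("0" * r + "1" for r in runs) + "0" * tail

# The example sequence, written run by run.
x = "0" * 5 + "1" + "0" * 7 + "1" + "0" * 4 + "1" + "1" + "0" * 8 + "1"
runs, tail = zero_runs_encode(x)
print(runs, tail)  # [5, 7, 4, 0, 8] 0
assert zero_runs_decode(runs, tail) == x
```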


Modeling Run-length

geometric source: P(X=k)=(1/2)^k, k=1,2,…

Run-length k:  1    2    3    4     5     …

Probability:   1/2  1/4  1/8  1/16  1/32  …


Golomb Codes

k    codeword
1    0
2    10
3    110
4    1110
5    11110
6    111110
7    1111110
8    11111110
…    …

Optimal VLC for geometric source: P(X=k)=(1/2)^k, k=1,2,…

[Code tree: at each node, branch 0 ends the codeword and branch 1 continues to the next node]
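
The codewords listed above form a unary code: k-1 ones followed by a terminating zero. The sketch below (illustrative function names; the infinite sums are truncated at an assumed K = 60 terms) checks numerically that its average length matches the entropy of the geometric source P(X=k)=(1/2)^k.

```python
import math

def unary_encode(k):
    """Codeword for run-length k >= 1: (k - 1) ones followed by a terminating zero."""
    return "1" * (k - 1) + "0"

def unary_decode(bitstream):
    """Split a concatenation of codewords back into run-lengths."""
    runs, ones = [], 0
    for b in bitstream:
        if b == "1":
            ones += 1
        else:
            runs.append(ones + 1)
            ones = 0
    return runs

K = 60  # truncation point; the neglected tail is on the order of 2**-K
probs = {k: 0.5 ** k for k in range(1, K + 1)}
entropy = -sum(p * math.log2(p) for p in probs.values())
avg_len = sum(p * len(unary_encode(k)) for k, p in probs.items())

print([unary_encode(k) for k in range(1, 5)])  # ['0', '10', '110', '1110']
print(round(entropy, 6), round(avg_len, 6))    # 2.0 2.0
```

Each codeword has length k = -log2 P(X=k), so no bits are wasted for this particular distribution. For other geometric parameters the unary code is in general no longer optimal.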


From 1D to 2D

white run-length

black run-length

Question: what distinguishes 2D from 1D coding?

Answer: inter-line dependency of run-lengths


Relative Address Coding (RAC)*

previous line: 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0

current line:  0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 1 1 0 0

Figure annotations: 7; d1=1; d2=-2; NS, run=2 (NS – New Start)

A variation of RAC was adopted by CCITT for fax transmission


Image Example

CCITT test image No. 1

Size: 1728×2376

Raw data (1 bps)

filesize of ccitt1.pbm: 513216 bytes

filesize of ccitt1.tif: 37588 bytes

Compression ratio = 513216/37588 ≈ 13.65
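
The arithmetic behind these figures can be verified directly. This short check assumes the raw size counts pixel data only, at 1 bit per pixel, with no file header.

```python
width, height = 1728, 2376
raw_bytes = width * height // 8     # 1 bit per pixel, 8 pixels per byte
tif_bytes = 37588                   # compressed file size quoted above
ratio = raw_bytes / tif_bytes
print(raw_bytes, round(ratio, 2))   # 513216 13.65
```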


Graphic Images (Cartoons)

Observations:
- dominant background color (e.g., white)
- objects only contain a few other colors

Total 12 different colors (black, white, red, green, blue, yellow …)


Palette-based Image Representation

index   color
0       white
1       black
2       red
3       green
4       blue
5       yellow
…       …

Any (R,G,B) 24-bit color can be represented by its index in the palette.
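
A palette lookup can be sketched as follows. The RGB triples assigned to each color name are assumptions (the slides give only the names), and only the six listed entries are included.

```python
# Assumed RGB values for the six named palette entries.
palette = [
    (255, 255, 255),  # 0 white
    (0, 0, 0),        # 1 black
    (255, 0, 0),      # 2 red
    (0, 255, 0),      # 3 green
    (0, 0, 255),      # 4 blue
    (255, 255, 0),    # 5 yellow
]
index_of = {rgb: i for i, rgb in enumerate(palette)}

def to_indices(pixels):
    """Replace each 24-bit (R, G, B) pixel by its palette index."""
    return [index_of[rgb] for rgb in pixels]

def to_rgb(indices):
    """Inverse mapping: look each index up in the palette."""
    return [palette[i] for i in indices]

pixels = [(255, 255, 255), (0, 255, 0), (0, 255, 0), (255, 0, 0)]
indices = to_indices(pixels)
bits_per_index = max(1, (len(palette) - 1).bit_length())
print(indices, bits_per_index)  # [0, 3, 3, 2] 3
assert to_rgb(indices) == pixels
```

With a fixed-length index, each pixel costs a few bits instead of 24, before any run-length coding is applied.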


RLC of 1D M-ary Source

Basic idea

Record not only run-lengths but also color indexes

Example

color sequence: WWWWWGGGWWWWWWWRRRR…

index sequence: 0 0 0 0 0 3 3 3 0 0 0 0 0 0 0 2 2 2 2 1 0 0 0 0 0 0 0 0 …

run-length:  5 3 7 4 1 8

color-index: 0 3 0 2 1 0

(color, run) representation: (0,5) (3,3) (0,7) (2,4) (1,1) (0,8) …

Note: run-length is a positive integer
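
The (color, run) representation can be sketched with `itertools.groupby` from the Python standard library. The function names are illustrative assumptions.

```python
from itertools import groupby

def mary_rlc_encode(indices):
    """Return a list of (color, run) pairs, one per run of identical indexes."""
    return [(color, len(list(group))) for color, group in groupby(indices)]

def mary_rlc_decode(pairs):
    """Expand (color, run) pairs back into the index sequence."""
    return [color for color, run in pairs for _ in range(run)]

# The example index sequence, written run by run.
x = [0] * 5 + [3] * 3 + [0] * 7 + [2] * 4 + [1] + [0] * 8
pairs = mary_rlc_encode(x)
print(pairs)  # [(0, 5), (3, 3), (0, 7), (2, 4), (1, 1), (0, 8)]
assert mary_rlc_decode(pairs) == x
```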


Variation of 1D M-ary RLC

When P(x=0) is close to 1, we can record run-length of dominant symbol (“0”) only

Example

X: 0 0 0 0 0 1 0 0 0 0 0 0 0 4 0 0 0 0 3 2 0 0 0 0 0 0 0 0 1 …

run:   5 7 4 0 8

level: 1 4 3 2 1

(run, level) representation: (5,1) (7,4) (4,3) (0,2) (8,1) …

Properties
- “0” run-lengths only
- run-length is a nonnegative integer
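
The (run, level) representation can be sketched in the same way. The function names and the separate `tail` count for zeros after the last nonzero symbol are assumptions made so that every input is decodable.

```python
def run_level_encode(indices):
    """Return (pairs, tail): each pair is (zeros before a nonzero symbol, that symbol)."""
    pairs, run = [], 0
    for v in indices:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    return pairs, run

def run_level_decode(pairs, tail):
    """Inverse of run_level_encode."""
    out = []
    for run, level in pairs:
        out.extend([0] * run + [level])
    return out + [0] * tail

# The example sequence, written run by run.
x = [0] * 5 + [1] + [0] * 7 + [4] + [0] * 4 + [3] + [2] + [0] * 8 + [1]
pairs, tail = run_level_encode(x)
print(pairs, tail)  # [(5, 1), (7, 4), (4, 3), (0, 2), (8, 1)] 0
assert run_level_decode(pairs, tail) == x
```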


Image Example

Raw data: party8.ppm, 526×286, 451308 bytes

Compressed: party8.bmp, 17094 bytes

Compression ratio = 451308/17094 ≈ 26.4


History of Lempel-Ziv Coding

Invented by Lempel and Ziv in 1977

Numerous variations and improvements since then

Widely used in different applications
– Unix system: compress command
– Winzip software (LZW algorithm)
– TIF/TIFF image format
– Dial-up modem (to speed up the transmission)


Dictionary-based Coding

Use a dictionary
– Think about the evolution of an English dictionary
• It is structured: if any random combination of letters formed a word, the dictionary would not exist
• It is dynamic: more and more words are put into the dictionary as time moves on
– Data compression is similar in the sense that redundancy reveals itself as patterns, just like English words in a dictionary


Toy Example

I took a walk in town one day
And met a cat along the way.
What do you think that cat did say?
Meow, Meow, Meow

I took a walk in town one day
And met a pig along the way.
What do you think that pig did say?
Oink, Oink, Oink

I took a walk in town one day
And met a cow along the way.
What do you think that cow did say?
Moo, Moo, Moo

- from “Wee Sing for Baby”

entry   pattern
1       I took a walk in town one day
2       And met a
3       What do you think that
4       did say?
5       along the way
6       cat
7       meow
…       …


Basic Ideas

Build up the dictionary on-the-fly (based on causal past such that decoder can duplicate the process)

Achieve the goal of compression by replacing a repeated string by a reference to an earlier occurrence

Unlike VLC (fixed-to-variable), LZ parsing goes the other way (variable-to-fixed)


Lempel-Ziv Parsing

Initialization:
– D={all single-length symbols}
– L is the smallest integer such that all codewords whose length is smaller than L have appeared in D (L=1 at the beginning)

Iterations: w ← next parsed block of L symbols
– Rule A: If w∈D, then represent w by its entry in D and update D by adding a new entry with w concatenated with its next input symbol
– Rule B: If w∉D, then represent the first L-1 symbols in w by their entry in D and update D by adding a new entry with w


Example of Parsing a Binary Stream

input: 0 1 1 0 0 1 1 1 0 1 1 … (parsed into variable-length blocks)

w: 0, 11, 10, 00, 01, 110, 011, …

rule: A, B, B, B, A, B, A

output: 1, 2, 2, 1, 3, 4, 7, … (entries in D, fixed-length)

Dictionary

entry   pattern   L
1       0         L=1
2       1         L=1
3       01        L=2
4       11        L=2
5       10        L=2
6       00        L=2
7       011       L=3
8       110       L=3

Illustration:
Step 1: w=0, Rule A, output 1, add 01 to D, L=2
Step 2: w=11, Rule B, output 2, add 11 to D
Step 3: w=10, Rule B, output 2, add 10 to D
Step 4: w=00, Rule B, output 1, add 00 to D
Step 5: w=01, Rule A, output 3, add 011 to D, L=3
Step 6: w=110, Rule B, output 4, add 110 to D
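
Rules A and B can be sketched as below. This is one possible reading, not a definitive implementation: it takes L to be the shortest length for which some word is still missing from D, so every block of L-1 symbols is guaranteed to have an entry. Under that assumption the code reproduces the output sequence and the dictionary of the binary example above, although the block lengths and rule labels of individual steps can differ from the illustration (for instance, the first block is read as 01 and handled by Rule B, with the same output and the same new entry). The function name and the handling of the end of the input are also assumptions.

```python
from itertools import product

def lz_parse(bits, alphabet="01"):
    """Parse a string with Rules A and B; return (output entries, dictionary)."""
    D = {s: i + 1 for i, s in enumerate(alphabet)}   # entries are numbered from 1

    def block_length():
        # Assumed reading of L: shortest length with a word still missing from D.
        L = 1
        while all("".join(p) in D for p in product(alphabet, repeat=L)):
            L += 1
        return L

    out, pos = [], 0
    while pos < len(bits):
        w = bits[pos:pos + block_length()]
        if w in D:                                   # Rule A (also a short final block)
            out.append(D[w])
            nxt = bits[pos + len(w):pos + len(w) + 1]
            if nxt:                                  # nothing to add at the end of input
                D.setdefault(w + nxt, len(D) + 1)    # add w + next input symbol
            pos += len(w)
        else:                                        # Rule B
            out.append(D[w[:-1]])                    # entry of the first L-1 symbols
            D[w] = len(D) + 1                        # add w itself
            pos += len(w) - 1
    return out, D

out, D = lz_parse("01100111011")
print(out)  # [1, 2, 2, 1, 3, 4, 7]
print(D)
```

The decoder can rebuild the same dictionary from the causal past, which is why no dictionary has to be transmitted. A matching decoder is omitted here.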


Binary Image Compression Summary

Theoretical aspect
– Shannon’s source entropy formula tells us the lower bound for coding a memoryless discrete source
– To achieve the source entropy, we need to use variable length codes (i.e., long codeword assigned to small-probability event and vice versa)
– Huffman’s algorithm (generating the optimal prefix codes) offers a systematic solution

Practical aspect
– Data in the real world contains memory (dependency, correlation …)
– We don’t know the probability distribution function of the source
– Tricky point: the goal of image compression (processing) is not to make your algorithm work for one image, but for a class of images


The Art of Source Modeling

How do I know whether a model matches or not?

Hypothesis Model: Bernoulli distribution (P(X=black)=0.1)

Observation Data


A Matched Example

Synthesis

Model


Good Models for Binary Images

[Block diagram: discrete source X → transformation by run-length counting → Y → Huffman coding → binary bit stream; P(Y) is supplied to the Huffman coder]

Y is the sequence of run-lengths from which X can be recovered losslessly

[Block diagram: the same pipeline with transformation by Lempel-Ziv parsing]

Y is the sequence of dictionary entries from which X can be recovered losslessly


Remaining Questions

How to choose different compression techniques, say RLC vs. LZC?
– Look at the source statistics
– For instance, compare binary image (good for RLC) vs. English texts (good for LZC)

What is the best binary image compression technique?
– It is called JBIG2, an enhanced version of JBIG

What is the entropy of binary image source?
– We don’t know and the question itself is questionable
– A more relevant question would be: what is the entropy for a probabilistic model which we believe generates the source?
