Image Processing: Image Compression

Introduction
Compression & Decompression Steps
Information theory: Entropy
Information Coding
Binary Image Compression
Lossless Gray Level Image Compression
Lossy Compression of Gray Level Images

© 2004 Rolf Ingold, University of Fribourg

Definition of image compression

Image compression is data compression applied to digital images
Its objective is to reduce the quantity of information (number of bytes) used to represent an image:

needs less storage
allows faster transmission

The principle consists of reducing redundancies:
due to data correlation (data redundancy)
due to the use of non-optimal codes (coding redundancy)

Compression: the original image is transformed and encoded into a compressed file
Decompression: the compressed file is decoded and the original image is reconstructed


Two groups of compression methods

Lossless compression methods: all information is preserved
the transformation is reversible
suitable for binary and indexed color images

Lossy compression methods: some information is lost
the transformation is not reversible
compression artifacts are introduced
suitable for natural gray-level and color images such as photos
the best image quality at a given compression rate is the main goal


Compression steps

Transformation performs data decorrelation to reduce data redundancy
Quantization performs approximation by restricting the set of possible values
Encoding assigns optimal codes to eliminate coding redundancy

[Diagram: original image → transformation → quantization → encoding → compressed image; transformation and encoding preserve information, quantization causes the loss of information]


Decompression steps

Decoding restores the values representing the data in the transformed space
Reconstruction recomputes the data of the original image space

For some applications, special requirements may be requested:
progressive display, to preview a low quality image while downloading (stepwise) a better quality version

[Diagram: compressed image → decoding → reconstruction → original image]


Encoding

Goal: encode a sequence of symbols (or values) while minimizing the length of the data

assign a binary code to each symbol
the encoding must be reversible (can be decoded unambiguously)

Types of encoding schemes used:
entropy encoding
  Huffman coding
  range coding
  arithmetic coding (encumbered by patents!)
dictionary based encoding


Information entropy

Entropy is a basic concept of information theory
Formal definition (by Shannon): the entropy of a discrete random event x, with possible states 1..n, is defined as

$$H(x) = -\sum_{i=1}^{n} p(i)\,\log_2 p(i)$$

Entropy is a lower bound on the average number of bits used to represent one event:
entropy is maximal, and equal to log2 n, if all states have the same probability
entropy decreases as the states' probabilities become more unequal
entropy is minimal, and equal to 0, if only one state can occur
example: a Bernoulli trial with two states


Entropy example

We consider an image with 8 gray levels Z in the range 0..7 with known distribution PZ; the table below computes its entropy

Z    PZ      -log2(PZ)   -PZ log2(PZ)
0    0.19    2.396       0.455
1    0.25    2.000       0.500
2    0.21    2.252       0.473
3    0.16    2.644       0.423
4    0.08    3.644       0.292
5    0.06    4.059       0.244
6    0.03    5.059       0.152
7    0.02    5.644       0.113
Entropy:                 2.652
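The entropy column can be verified in a few lines; a minimal Java sketch with the distribution hard-coded from the table:

public class EntropyDemo {
    public static void main(String[] args) {
        // gray-level probabilities P(Z) for Z = 0..7, taken from the table above
        double[] p = {0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02};
        double h = 0.0;
        for (double pi : p)
            h -= pi * Math.log(pi) / Math.log(2);   // -sum p(i) log2 p(i)
        // prints H = 2.651 bits/pixel (the table's 2.652 comes from per-row rounding)
        System.out.printf("H = %.3f bits/pixel%n", h);
    }
}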


Information coding principles

An encoder translates a sequence of symbols (describing the states of an event sequence) into a string of bits
The decoder interprets this string of bits to reconstruct the sequence of symbols
Compression is achieved by using codes of variable lengths:

short codes are used for frequent symbols
long codes are used for rare symbols

Two principles should be observed:
decoding must be unambiguous: this is the case for prefix-free codes (no code is a prefix of another one)
each bit string must be usable: this is achieved if each bit string either has a code as prefix, or is the prefix of a code

Horn addresses of binary trees meet both requirements


Information coding example

Evaluation of a variable length code (code 2), compared with a fixed 3-bit code (code 1), used to represent the gray levels of the previous example, for an image of 100 pixels:

Z   FreqZ   code 1   l1   FreqZ·l1   code 2   l2   FreqZ·l2
0   19      000      3    57         11       2    38
1   25      001      3    75         01       2    50
2   21      010      3    63         10       2    42
3   16      011      3    48         001      3    48
4   8       100      3    24         0001     4    32
5   6       101      3    18         00001    5    30
6   3       110      3    9          000001   6    18
7   2       111      3    6          000000   6    12
size:                     300                      270
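Both totals are weighted sums of code lengths; a minimal Java sketch with the table's values hard-coded:

public class CodeSizeDemo {
    public static void main(String[] args) {
        int[] freq = {19, 25, 21, 16, 8, 6, 3, 2};    // frequencies per 100 pixels
        int[] l1   = {3, 3, 3, 3, 3, 3, 3, 3};        // fixed-length code
        int[] l2   = {2, 2, 2, 3, 4, 5, 6, 6};        // variable-length code
        int size1 = 0, size2 = 0;
        for (int z = 0; z < freq.length; z++) {
            size1 += freq[z] * l1[z];                  // sum of FreqZ * l1
            size2 += freq[z] * l2[z];                  // sum of FreqZ * l2
        }
        System.out.println(size1 + " vs " + size2);    // prints 300 vs 270
    }
}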


Huffman codes

Huffman codes are optimal prefix-free codes associated to symbols
Huffman codes can be built by the following algorithm, using a forest of binary trees and a min-heap:

import java.util.Comparator;
import java.util.PriorityQueue;

// min-heap ordered by node frequency
PriorityQueue<Node> heap = new PriorityQueue<>(Comparator.comparingDouble(Node::getFreq));
for (Symbol s : symbols)
    heap.add(new Node(s, freq(s), null, null));          // one leaf per symbol
while (heap.size() > 1) {
    Node node1 = heap.poll();                            // extract the two least
    Node node2 = heap.poll();                            // frequent subtrees
    heap.add(new Node(null, node1.getFreq() + node2.getFreq(), node1, node2));
}
Node result = heap.poll();                               // root of the Huffman tree

the code of each symbol corresponds to its Horn address in the resulting tree
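To read the codes off the resulting tree, a recursive walk collects the Horn address of each leaf; a sketch in the same style, assuming the hypothetical Node class exposes getSymbol/getLeft/getRight accessors:

// append 0 when descending left, 1 when descending right;
// at a leaf, the accumulated path is the symbol's Huffman code
// usage: collectCodes(result, "", new java.util.HashMap<>())
static void collectCodes(Node node, String path, java.util.Map<Symbol, String> codes) {
    if (node.getSymbol() != null) {        // leaves carry a symbol, internal nodes null
        codes.put(node.getSymbol(), path);
        return;
    }
    collectCodes(node.getLeft(), path + "0", codes);
    collectCodes(node.getRight(), path + "1", codes);
}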


Huffman codes example

The Huffman code associated with the previous example is built with the binary tree below:

z0 → 11, z1 → 01, z2 → 10, z3 → 001, z4 → 0001, z5 → 00001, z6 → 000001, z7 → 000000

[Figure: Huffman tree built from the leaf probabilities p0=0.19, p1=0.25, p2=0.21, p3=0.16, p4=0.08, p5=0.06, p6=0.03, p7=0.02; repeatedly merging the two lowest-weight subtrees creates internal nodes with weights 0.05, 0.11, 0.19, 0.35, 0.40, 0.60 and 1.00, and labeling the two branches of each node 0 and 1 yields the codes above]


Arithmetic coding

Arithmetic coding is an optimal entropy encoding technique in which the entire message is encoded as a single number between 0 and 1 (carried with very high precision)
Example:

consider 4 symbols with the following probabilities: A (50%), B (20%), C (20%), D (10%)
each symbol corresponds to an interval: A (0.0-0.5), B (0.5-0.7), C (0.7-0.9), D (0.9-1.0)
the message ACBADA is encoded as
0.0+0.5*(0.7+0.2*(0.5+0.2*(0.0+0.5*(0.9+0.1*(0.0+0.5*(0)))))) = 0.409

Many companies own patents, in the US and other countries, on algorithms used for implementing an arithmetic encoder and decoder!
Range coding is an equivalent technique, believed not to be covered by any company's patents!
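The nested expression is simply an interval being narrowed once per symbol; a minimal Java sketch with the example's model hard-coded reproduces the value 0.409:

import java.util.Map;

public class ArithmeticDemo {
    public static void main(String[] args) {
        // start and width of each symbol's sub-interval, from the example
        Map<Character, double[]> model = Map.of(
                'A', new double[]{0.0, 0.5},
                'B', new double[]{0.5, 0.2},
                'C', new double[]{0.7, 0.2},
                'D', new double[]{0.9, 0.1});
        double low = 0.0, width = 1.0;
        for (char c : "ACBADA".toCharArray()) {
            double[] iv = model.get(c);
            low += width * iv[0];    // narrow the current interval
            width *= iv[1];          // to the symbol's sub-range
        }
        // any number in [low, low + width) identifies the message
        System.out.printf("[%.4f, %.4f)%n", low, low + width);  // [0.4090, 0.4095)
    }
}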


Dictionary based coders

Dictionary based coders have initially been developed to compress text
Performance is similar to entropy based coders
Principles:

sequences of symbols are encoded according to entries in a dictionary
dictionaries may be static or dynamic
to avoid transmitting the dictionary, it is possible to automatically build a dictionary of previously seen strings (as sketched below)

The following coders belong to this family:
LZW: used by the Unix compress utility and GIF files, but protected by several patents
LZ77, LZ78: ancestors of LZW, free of patents
DEFLATE: combining LZ77 and Huffman codes, used for the ZIP and gzip file formats
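A minimal LZW-style encoder shows how the dictionary of previously seen strings is grown on the fly; this is an illustrative sketch, not the exact (patented) variant used by compress or GIF:

import java.util.*;

public class LzwDemo {
    static List<Integer> encode(String input) {
        Map<String, Integer> dict = new HashMap<>();
        for (int i = 0; i < 256; i++)
            dict.put(String.valueOf((char) i), i);   // single-character entries
        List<Integer> out = new ArrayList<>();
        String w = "";
        for (char c : input.toCharArray()) {
            String wc = w + c;
            if (dict.containsKey(wc)) {
                w = wc;                          // keep extending the match
            } else {
                out.add(dict.get(w));            // emit code of longest known prefix
                dict.put(wc, dict.size());       // register the new string
                w = String.valueOf(c);
            }
        }
        if (!w.isEmpty()) out.add(dict.get(w));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(encode("ABABABA"));   // prints [65, 66, 256, 258]
    }
}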


Limitation of entropy encoding

Entropy encoding is suitable for removing coding redundancy
Images generally don't have much coding redundancy:

entropy encoding alone does not compress image data significantly!

The role of preliminary transformations is to convert data redundancy into coding redundancy, which can then be eliminated by entropy coding


Binary Image Compression

Types of transformations used for binary images:
run length encoding
scan-line differences
quadtrees


Run-Length Encoding (RLE)

RLE is a very simple form of lossless data compression
The lengths of horizontal runs (sequences of identical pixel values) are stored, each as a single number
Example:

Original image using 120 bits

RLE representation: 5, 3, 16, 4, 4, 9, 3, 4, 4, 4, 9, 3, 4, 5, 4, 8, 3, 4, 6, 14, 4
using 21x5 = 105 bits (assuming five bits are used for each number)
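Extracting the run lengths of one binary scanline takes a few lines; the sketch below assumes the common convention that each row starts with a (possibly empty) run of white (0) pixels:

import java.util.ArrayList;
import java.util.List;

public class RleDemo {
    static List<Integer> runLengths(int[] row) {
        List<Integer> runs = new ArrayList<>();
        int current = 0, length = 0;       // first run counts white pixels
        for (int p : row) {
            if (p == current) {
                length++;
            } else {
                runs.add(length);          // close the current run
                current = p;
                length = 1;
            }
        }
        runs.add(length);                  // close the final run
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(runLengths(new int[]{0, 0, 0, 0, 0, 1, 1, 1, 0, 0}));
        // prints [5, 3, 2]
    }
}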


2D-Run-Length Encoding

Principle: apply RLE to scan-line differences
Example:

size of the compressed encoding: 13 + 18 + 11 + 13 + 17 = 72 bits

code table (reconstructed from the slide):
modes:           ABS → 0    REL → 1    FL → 0000
REL offsets:     0 → 0    -1 → 10    +1 → 110
ABS run lengths: 0 → 000, 1 → 001, 2 → 010, 3 → 011, 4 → 100, 5 → 1010, 6 → 1011, lengths above 6 → 11 followed by the code of (length - 6)

ABS, 5, 3, ABS, FL                 0 1010 011 0 0000
REL, -1, 0, ABS, 9, 3, ABS, FL     1 10 0 0 11011 011 0 0000
REL, 0, 0, REL, 0, 0, ABS, FL      1 0 0 1 0 0 0 0000
REL, +1, +1, REL, 0, 0, ABS, FL    1 10 10 1 0 0 0 0000
ABS, 6, 14, ABS, FL                0 1011 1111010 0 0000


Quadtree Data Structure

A quadtree is a data structure used to recursively partition an image by subdividing it into four quadrants

the process stops when a node corresponds to a homogeneous square

[Figure: an image partitioned into homogeneous squares labeled A..S, shown at successive subdivision levels, and the corresponding quadtree with internal nodes numbered 0..5]


Quadtree compression

Quadtrees can be used for lossless data compression:
the quadtree is represented by the list of its nodes in pre-order, which is sufficient to rebuild the tree unambiguously

Example: original image using 64 bits

quadtree representation: XXWWXWWWBBXWBBWBXWWXBBWWB
using 25x2 = 50 bits (assuming two bits are used for each symbol: X = subdivided node, W = white leaf, B = black leaf)
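Serializing the quadtree in pre-order is a single recursive method; a sketch with a hypothetical QuadNode record (children are null for leaves):

// hypothetical node: a leaf stores its color ('W' or 'B'), an internal node its four quadrants
record QuadNode(QuadNode nw, QuadNode ne, QuadNode sw, QuadNode se, char color) {
    boolean isLeaf() { return nw == null; }

    // pre-order: 'X' for a subdivided node, then its four quadrants
    void serialize(StringBuilder out) {
        if (isLeaf()) {
            out.append(color);
        } else {
            out.append('X');
            nw.serialize(out);
            ne.serialize(out);
            sw.serialize(out);
            se.serialize(out);
        }
    }
}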


Lossless Gray Level Image Compression

Lossless gray-level image compression can be performed:
by applying binary compression to a bit-plane decomposition
by decorrelating adjacent pixels and using entropy coding

Performance is limited!


Bit-plane Decomposition

An image with 256 gray levels can be represented by 8 bit-planes a0, a1, ..., a7

the higher planes (a7, ...) are more compressible than the lower ones

A better compression is achieved with the planes of the Gray code, defined as

$$g_i = a_i \oplus a_{i+1} \quad (g_7 = a_7)$$
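For 8-bit pixel values, the per-bit definition collapses to a single XOR of the value with itself shifted right by one; a minimal sketch:

public class GrayCodeDemo {
    // g_i = a_i XOR a_{i+1} for all bit positions at once (g7 = a7)
    static int toGray(int a) {
        return a ^ (a >>> 1);
    }

    public static void main(String[] args) {
        // 127 and 128 differ in all 8 bit-planes, their Gray codes in only one
        System.out.println(Integer.toBinaryString(toGray(127)));  // 1000000
        System.out.println(Integer.toBinaryString(toGray(128)));  // 11000000
    }
}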


Entropy Encoding for Gray-Level images

Usually, gray-level images don't have much coding redundancy

Example: the 256 gray levels of the 65'536 pixels of the Lena picture have an entropy of 7.394 bits/pixel (corresponding to 60'572 bytes)

using Huffman codes, the picture can be compressed into 60'825 bytes (plus 481 bytes for the codebook)

[Figure: gray-level histogram of the Lena picture]


Gray-Level Image Transformations

Transformations of gray-level images use:
pixel differences rather than absolute values

  based on scan lines
  based on quadtrees

global transformations
  Karhunen-Loève or Hotelling transform (also known as PCA)
  Fourier transform, Walsh-Hadamard transform, discrete cosine transform (DCT), etc.
  Haar and other wavelets


Data Correlation for Adjacent Pixels

In gray level images, the values of adjacent pixels are highly correlated
Example (from the Lena picture):

first plot: distribution of (z1, z2) with z1 = f(2x, y), z2 = f(2x+1, y)
second plot: distribution of (z1, z2-z1)

the distribution of z2-z1 is much narrower than the distribution of z2!

[Figure: the two scatter plots; the (z1, z2) cloud spreads over the full gray-level range, while the (z1, z2-z1) cloud is concentrated in a narrow band around 0]


Decorrelation by Adjacent Pixel Differences

Replacing pixel values by pixel differences:
narrows the distribution
reduces the entropy

Example: the pixel differences of the Lena picture have an entropy of 5.277 bits/pixel (needing globally 43'321 bytes)

[Figure: histogram of the pixel differences, concentrated around 0]
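The decorrelation itself is a single pass over the image; a sketch that replaces each pixel, except a row's first, by the difference to its left neighbor (one plausible convention, since the slide does not fix one):

public class DiffDemo {
    static int[][] decorrelate(int[][] img) {
        int[][] out = new int[img.length][];
        for (int y = 0; y < img.length; y++) {
            out[y] = img[y].clone();                    // keep the first column
            for (int x = 1; x < img[y].length; x++)
                out[y][x] = img[y][x] - img[y][x - 1];  // difference to the left
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] img = {{100, 102, 101, 99}, {100, 101, 103, 104}};
        for (int[] row : decorrelate(img))
            System.out.println(java.util.Arrays.toString(row));
        // prints [100, 2, -1, -2] and [100, 1, 2, 1]: small, low-entropy values
    }
}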


Decorrelation by Quadtrees

Pixel differences can be generalized to two dimensions by considering quadtrees:

each node is characterized by the mean value of all its leaves (representing pixels)
and stores the gray-level difference to its father

Example: for the Lena picture, the quadtree of differences has an entropy of 4.710


Lossy Compression of Gray-Level Images

Lossy compression consists of:
transforming the spatial data in order to reduce data correlation
quantizing the values in the transformed space
encoding the data to eliminate coding redundancy

Types of transformations used:
Karhunen-Loève Transform (KLT)
Fourier Transform, Walsh-Hadamard Transform and Discrete Cosine Transform (DCT)
Wavelets, such as the Haar Transform...


Karhunen-Loève Transform

The Karhunen-Loève transform (KLT) is the linear transform that optimally decorrelates the data
The KLT is calculated as follows:

estimate the mean vector and the covariance matrix

$$\mathbf{m} = E\{\mathbf{x}\} = \frac{1}{n}\sum_{k=1}^{n}\mathbf{x}_k$$

$$\mathbf{C} = E\{(\mathbf{x}-\mathbf{m})(\mathbf{x}-\mathbf{m})^t\} = \frac{1}{n}\sum_{k=1}^{n}\mathbf{x}_k\mathbf{x}_k^t - \mathbf{m}\mathbf{m}^t$$

find the eigenvectors (i.e. an orthonormal matrix A) such that

$$\mathbf{A}^t\,\mathbf{C}\,\mathbf{A} = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_d)$$

the transformation y = A^t (x - m) is the Karhunen-Loève transform
The reverse transform is obtained by x = A y + m


Karhunen-Loève Transform (cont.)

The Karhunen-Loève transform can be applied to blocks of adjacent pixels
The Karhunen-Loève transform can be understood as a linear transform over a set of data dependent basis functions

these functions are ordered according to decreasing eigenvalues

Example: the 64 8x8 basis functions of the Lena picture are listed below with their index

[Figure: the 64 8x8 KLT basis functions, indexed 1, 2, 3, 4, 5, ..., 15, 16, 17, ..., 62, 63, 64]


Quantization of the KLT data

The KLT is a reversible transform: the original image can be rebuilt from the KLT coefficients
For lossy compression, the KLT coefficients are quantized, using two complementary principles:

the set of possible values is restricted to multiples of a quantization interval q:

$$\hat{y} = q \cdot \mathrm{round}(y / q)$$

only a subset of the KLT coefficients is kept (the others are set to 0)

Entropy coding is then used for:
the KLT basis encoding
the KLT coefficients
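Uniform quantization and its lossy inverse fit in two lines; a minimal sketch:

public class QuantizeDemo {
    public static void main(String[] args) {
        double q = 16.0;                    // quantization interval
        double y = 137.4;                   // some KLT coefficient
        long index = Math.round(y / q);     // the value that gets entropy coded
        double yHat = q * index;            // reconstruction: 144.0
        System.out.println(index + " -> " + yHat);
        // the rounding error |y - yHat| <= q/2 is the information lost
    }
}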


Example of KLT compression

Original image compared to:
1/4 of the KLT coefficients quantized with multiples of 16 (size = 3'834 bytes, compression rate ≈ 1:17)
1/8 of the KLT coefficients quantized with multiples of 16 (size = 2'152 bytes, compression rate ≈ 1:30)


Distortion

Two measures are used to characterize distortion:

mean square error (MSE), defined as

$$MSE = E[(x - \hat{x})^2]$$

peak signal-to-noise ratio (PSNR), defined as

$$PSNR = 10\,\log_{10}\frac{m^2}{E[(x - \hat{x})^2]}$$

where m represents the peak value (255 for an 8-bit gray level image)
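Both measures are straightforward to compute from an image pair; a minimal sketch on a toy signal:

public class DistortionDemo {
    public static void main(String[] args) {
        int[] original = {100, 102, 101, 99};
        int[] decoded  = {100, 104, 100, 99};
        double mse = 0.0;
        for (int i = 0; i < original.length; i++) {
            double d = original[i] - decoded[i];
            mse += d * d;                          // accumulate squared errors
        }
        mse /= original.length;                    // MSE = 1.25 here
        double m = 255.0;                          // peak value for 8-bit pixels
        double psnr = 10.0 * Math.log10(m * m / mse);
        System.out.printf("MSE = %.2f, PSNR = %.1f dB%n", mse, psnr);  // 47.2 dB
    }
}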