Lossy Image Compression Using Wavelets

8/2/2019 Lossy Image Compression Using Wavelets

1/21

Journal of Intelligent and Robotic Systems 28: 3959, 2000.

2000 Kluwer Academic Publishers. Printed in the Netherlands.39

Lossy Image Compression Using Wavelets

NICK D. PANAGIOTACOPULOS, KEN FRIESEN and SUKIT LERTSUNTIVITDepartment of Electrical Engineering, College of Engineering, California State University,

Long Beach, CA 90840, U.S.A.

(Received: 15 June 1998; accepted: 19 July 1999)

Abstract. In this paper, we report the results of the application of transform coding image data

compression techniques using Daubechies and Coifman wavelets. More specifically, D2, D4, D8,

D16, and C6, C12 wavelets were used. The results from these wavelets were compared with those of

discrete cosine transform. They clearly demonstrated the superiority of the wavelet-based techniques

both in compression ratios and image quality, as well as in computational speed. Two quantiza-

tion methods were used: non-uniform scalar quantization and pseudo-quantization. Both producedsatisfactory results (8688% compression ratio, and acceptable image quality).

Key words: wavelet, wavelet function, multiresolution analysis (MRA), filter banks, lossy compres-

sion, scaling functions, dilation equation, pseudo-quantization.

1. Introduction

The aim of image compression is to reduce the size of an image with or without

loss of information (lossy or lossless compression). One of the most commonly

used lossy compression methods is that of transform coding using one of the many

image transforms available. In a previous paper [4], we examined the performanceof the three most common transform methods: the Hadamard, the Walsh, and

the discrete cosine transform [9]. The idea here is to transform the image into

a new domain, where the image is now represented (approximated) by a set of

decorrelated coefficients. Clearly, the larger the number of coefficients, the more

accurate the approximation is. However, in order to achieve (lossy) compression,

the coefficients containing less important information are removed while those with

the most important information are kept. At this point, we can inspect qualita-

tively (visually) and quantitatively (signal to noise ratio) the compressed image

by reconstructing it via the inverse transformation. It was observed that by using

the appropriate transform method (choose the best basis) only, we can achieve a

satisfactory compression ratio but not good enough for most practical applications.

In order to achieve higher compression ratios, The transformed image was subject

to a scalar, non-uniform quantization (which removes unimportant coefficients)

followed by the application of a combined variable length and run-length coding

process. In summary, the compression process consists of three steps: transfor-

mation, quantization, and encoding, while decompression (reconstruction) is the


2/21

40 N. D. PANAGIOTACOPULOS ET AL.

inverse process. In [4], we demonstrated that from all of these transform methods,

the discrete cosine transform method produced the best compression ratio, the best

image quality (higher signal to noise ratio), and the smallest mean square error. The

Walsh and Hadamard transforms were next in performance. In the same study [4],

we examined the performance of two discrete Wavelet transform methods, namely:

the classical Haar wavelet and the Daubechies (D4) wavelet. The quantization and

encoding used for both methods were the same as the ones used for the other three

methods. We found that the D4 wavelet outperformed the discrete cosine transform

method by obtaining a better compression ratio, a higher signal to noise ratio and

higher computational speed, while the Haar resulted in a very high compression

ratio but a low signal to noise ratio. Similar observations were made when higher

order Daubechies and Coifman wavelets were used.

2. Theoretical Basis

Image compression consists of two processes: the compression process (encoding),and the decompression (decoding) process. The compression process requires three

steps: transformation, quantization, and encoding, while the decompression is the

inverse process and requires: decoding, dequantization, and inverse transformation.

Next, we discuss the theoretical foundation upon which this research is based.

2.1. THE TRANSFORMATION

Let us consider an (N N ) imagef (x,y) in D = {a x b, c y d}

and a double sequence of square integral orthonomal functions:

{m,n(x,y)} in D m, n N,where N = {1, 2, . . .} is the set of natural numbers. Then we can approximatef (x,y) by

f (x,y) =N

m=1

Nn=1

am,nm,n(x,y),

where

am,n=

b

ad

c

f(x,y)m,n(x,y) dx dy.

The above expression for f (x,y) is known as generalized Fourier series in 2D and

am,n are the generalized Fourier coefficients.

The functions m,n(x,y) are known as basis functions in 2D and can be properly

chosen (i.e., cosine, or other orthonomal functions). Clearly, the image f (x,y) is


3/21

LOSSY IMAGE COMPRESSION USING WAVELETS 41

now represented by the coefficients am,n m, n N = {1, 2, . . . , N }. If N is alarge integer number, then the number of coefficients am,n, needed to represent the

image, is (impractically) large but the representation of f (x,y) is very accurate.

However, in lossy image compression, we want to approximate f (x,y) using fewer

coefficients as long as the quality of the compressed image is acceptable. By an

acceptable image we mean an image which is visually undistinguishable from the

original image. That is, the acceptability is a function of a user-specified threshold

error e (if e 0 the compression is lossless). Therefore, we want to find anapproximation such that the error e does not exceed the allowable error level for

a good image quality. More precisely, the problem is: given the basis functions

m,n(x,y) and a desired error e, find a cruder approximation fe(x,y) for f (x,y)

using a smaller number of coefficients am,n m, n N = {1, 2, . . . , M }, whereM < N. That is, we are looking for

fe(x,y) =

M

m=1

M

n=1

am,nm,n(x,y)

such that the (L2) normf (x,y) fe(x,y)2 efor M < N. From the expressions for f (x,y),f e(x,y), and we obtainf (x,y) fe(x,y)22 =

Nu=m+1

Nl=n+1

a2k,l .

This indicates that the square of the overall L2 error is just the sum of squares of

all the coefficients we choose to leave out. We assume that the coefficients am,nm, n N = {1, 2, . . . , M } are in descending order of their magnitude accordingto the information content.

Furthermore, in order to make the computer implementation easier, let us as-sume that the functions m,n(x,y) are separable am,n m, n N. That is,

m,n(x,y) = m(x)n(y) in D m, n N.Then we can rewrite the expression for f (x,y) and am,n as follows:

f (x,y) =N

m=1

N

n=1am,nn(y)

m(x),

am,n =b

a

dc

f(x,y)n(y) dy

m(x) dx.

In summary [3], the compression is achieved by computing the coefficients am,n

of the image, sorting them in their descending order of decreasing magnitude suchthat

Nu=m+1

Nl=n+1

(am,n)2 e2 for M < N, and for a given error e.


4/21


3. Filter Banks [2, 7, 8]

In the transformation section, we presented the general theoretical basis of the

classical compression methods. Next, we discuss in detail an efficient version of

this technique known as a multiresolution decomposition/reconstruction approach.The Haar (D2), the Daubechies (D4, D6, D8, D10, D12, and D16) and the Coif-

man (C6 and C12) wavelets act as multiple branch filter banks upon the original

image transforming it into a sequence of independent base images preserving the

decorrelation property of the DCT. We can calculate the number of branches (nb)

of the filter back by the formula:

nb = log2 N log2(order of filter) 1.For example, for a 256 256 image the D2 wavelet (WT) is an 8 branch filter bankand the 2D D4 WT is a 7 branch filter bank. (The 2D D4 WT has 4 tap weights,

while the 2D D2 WT has 2 tap weights.) Each branch of the filter bank down

samples the input image by a factor of 2 which results in the output subimagehaving its dimensions reduced by a factor of 2. The filter bank stops when the

dimensions of the output subimage equal half the tap weights of the filter.

The original image contains superimposed signals in all frequency bands. The

first branch of the filter bank separates the original image into high-frequency

and low-frequency subimages which are orthogonal, hence independent. The low-

frequency subimage still contains superimposed signals in multi-frequency bands.

The second branch of the filter bank operation separates the low-frequency compo-

nents and high-frequency components of the correlated subimage output from the

first branch into orthogonal low-frequency and high-frequency subimages. This

successive decomposition of the correlated low-frequency output subimages from

each branch of the filter bank continues through the last branch of the filter bank.

The result is a sequence of independent subimages each containing 2D signals in a

distinct frequency range.

3.1. TWO -DIMENSIONAL MULTIRESOLUTION ANALYSIS (MR A)

DECOMPOSITION

Both Daubechies and Coifman WT generate an MRA decomposition of the Hilbert

space L2(R2), the space of finite energy 2D signals. A signal f (x,y) is contained

in L2(R2) if the following condition holds:

f (x,y)

2 dx dy < .

Any finite image is contained in L2(R2) since the above integral always exists for

functions defined in a 2D finite interval {[a, b][c, d]}. Any 2D wavelet constitutesan MRA decomposition of L2(R2) if it consists of a pair of basis functions for


5/21


L2(R2): ((x,y),(x,y)) that have special algebraic properties and as a result

impose a special algebraic structure upon the space L2(R2). The basis function

(x,y) is called scaling function and the basis function (x,y) is called wavelet

function. Before presenting a detailed description of the 2D MRA generated by

Daubechies and Coifman WT, we note that both (x,y) and (x,y) are separable

functions in the variables x, y. That is, (x,y) = 1(x)2(y), where 1(x) and2(y) are 1D scaling functions that generate 1D MRA decompositions of the space

of 1D finite energy signals: L2(R); and (x,y) = 1(x)2(y), where 1(x) and2(y) are 1D wavelet functions corresponding to the scaling functions 1(x) and

2(y), respectively. The properties of the 2D MRA decomposition for the space

L2(R2) will be defined as the Cartesian product of 2 independent 1D MRA de-

compositions one acting on the x-axis, the other on the y-axis. The key function

generating an MRA decomposition is the scaling function 1(x). A scaling function

is the solution to the following dilation equation:

1(x) = k

ak1(2x k),

where the coefficients ak have the property ak , ak21 = 2(k 1), where () isthe Kronecker delta, which makes the set of integer translates of 1(x): 1(x k)orthogonal. This property is expressed as follows

1(x k),1(x 1) = (k 1).

As such the integer translates of the scaling function form a basis for a closed

subspace V0. If 1(x) is scaled by a factor of 2, the integer translates of 1(2x),

1(2x k), also form a basis for another subspaces ofL2(R) V1 which contains V0:

V0 V1.If 1(x) is scaled by the factor 2

2, then the integer translations of the function

1(22x), 1(2

2x k), form a basis for another closed subspace of L2(R) V2 thatproperly contains both the subspaces V0 and V1:

V0 V1 V2.The set of integer translates of1(x) scaled by a positive power of 2 (2

j ) forms the

basis for a closed subspace of L2(R) Vj which properly contain all the subspaces

V of a lower order:

V0

V1

Vj

.

If we scale continuously the function 1(x) by increasing powers of 2, a ladder

of subspaces is generated whose infinite union comprises the space L2(R):

L2(R) = closjZ

Vj .


6/21


Hence, any finite energy 1D signal f(x) can be expressed as a series expansion of

the basis functions of these component spaces:

f(x)

= j k cj kj k(x),where the coefficient cj k is given by:

cj k =

f(x)j k(x) dx; j,k (x) = 1

2j x k

.

Since V0 is a closed subspace in a Hilbert space which is also contained in V1, an-

other closed subspace in a Hilbert space, there must exist another closed subspace

denoted by W0 which is the orthogonal complement to subspace V0 such that the

following relationship between the subspaces V0, W0, and V1 hold:

V1 = V0 W0.

That is, the subspace V1 has, as a basis, the union of bases for the 2 subspaces onthe right-hand side and the bases for the respective subspaces are orthogonal. This

relationship is known as the decomposition of subspace V1. The subspace W0 is

generated by integer of the wavelet function 1(x),1(x k), corresponding tothe scaling function 1(x). Since W0 is orthogonal to V0 for all j, k Z,

1(x j ) , 1(x k) = 0.

Since the integer 1(2x j ) are also a basis for the closed subspace V1 that is con-tained in a closed subspace there must exist a closet subspace that is the orthogonal

complement of V1, W1, and the closed subspace V2 is a direct sum of the closed

subspace V1 and its orthogonal complement subspace W1:

V2 = V1 W1.If one substitutes the decomposition of subspace V1 in the above expression one

obtains:

V2 = V0 W0 W1.Since the closed subspace V2 is contained in the closed subspace V3, there exists

an orthogonal complement subspace to V2, W2, such that V3 = V2 W2. Then,V3 = V0 W0 W1 W2.

The decomposition relation

Vj=

Vj

1

Wj

1

can be extended to all orders of V subspaces. If one substitutes the direct sum

decomposition for all Vj in

L2(R) = closjZ

Vj ,


7/21


then L2(R) has an alternate subspace composition:

L2(R) = clos

jZ

Wj

.

This is known as the direct sum composition for space L2(R). This expression

is more refined than the composition of V spaces since each of the generating

subspaces is orthogonal. This is the special algebraic structure for the Hilbert

space L2(R) created by an MRA basis: it produces 2 sets of generating subspaces

as defined in the two previous relations, the latter being composed of mutually

independent (orthogonal) generating subspaces. The WT in the space L2(R) is the

transformation obtained from the two previous generating expressions; that is, the

WT of any 1D finite energy signal f(x) is the conversion of its series expansion

relative to the subspaces

{jZ} Vj :

f(x) = j

k

cj kj k(x)

to a series expansion relative to the subspaces Wj , j Z:

f(x) =

j

k

dj kj k(x).

We now extrapolate the definition of a 1D MRA decomposition to the 2D Hilbert

space L2(R2) generated by separable wavelet functions. Since the scaling function

(x,y) is defined by 1(x)2(y), where 1(x) and 2(y) are 1D scaling functions

in L2(R), the 2D wavelet function (x,y) is defined as 1(x)2(y), where 1(x)

and 2(y) are the 1D wavelet functions corresponding to the scaling functions

1(x) and 2(y), respectively. The subspace V(2)

0 is defined as the Cartesian prod-

uct space V(x)0 V(y)0 of the two 1D subspaces which, for the sake of notationalconvenience, will be referred to as V

(x)0 V

(y)

0 . The subspace V(x)

0 is generated by the

basis 1(x k), and V(y)0 is generated by the basis 2(y k). The space W(2)0 is theCartesian product W

(x)0 W

(y)

0 where W(x)

0 is generated by the basis 1(x k), andW

(y)

0 is generated by the basis 2(y k). The decomposition of the space V(x)1 V(y)1is given by:

V(x)

1 V(y)

1 =

V(x)

0 W(x)0

V(y)

0 W(y)0

=

V(x)

0 V(y)

0 V(x)0 W(y)0 W(x)0 V(y)0 W(x)0 W(y)0

.

Each of these 2D subspaces is orthogonal. The subspace V(x)

0 V(y)

0 contains only

low-frequency components in both variables x and y, the subspaces V(x)0 W(y)0 con-

tain low-frequency components in the variable x and high-frequency components

in the variable y, the subspace W(x)0 V

(y)

0 contains high-frequency components in

the variable x and low-frequency components in the variable y, and the subspace

W(x)0 W

(y)

0 contains high-frequency components in both variables x and y.


8/21


3.1.1. Decomposition Procedure

In our case, we are wavelet transforming a 256 256 image. By its dimensionthe original image is contained in the L2(R2) subspace V

(x)8 V

(y)

8 . The WT of this

image is the transformation of subspace V(x)8 V(y)8 onto the most refined directsum subspace possible for the WT used. In the case of the 2D Haar WT this

transformation is:

V(x)

8 V(y)

8

V(x)

0 V(y)

0 V(x)0 W(y)0 W(x)0 V(y)0

0j,k8

W(x)j W

(y)

k

since 0 is the stopping order for the 2D Haar WT. For the 2D D4 WT, the transfor-

mation is:

V(x)

8 V(y)

8

V(x)

1 V(y)

1 V(x)1 W(y)1 W(x)1 V(y)1

1j,k8

W(x)j W

(y)

k

since 1 is the stopping order for the 2D D4 WT. As described above, the targetsubspace for either transformation is composed of mutually orthogonal subspaces.

Each of these respective decomposition procedures takes place in a series of levels;

8 levels for the 2D Haar WT and 7 levels for the 2D D4 WT. Since both wavelet

operators are separable, each level in turn is composed of 2 distinct steps:

(1) decomposition of the subspace V(x)

k onto the orthogonal subspace V(x)

k1W(x)k1,and

(2) decomposition of the subspace V(y)

k onto the orthogonal subspace V(y)

k1W(y)k1for 1 k 8 (2D Haar WT) and 0 k 8 (2D D4 WT).

The first step in each transformation level is accomplished by applying the

1D wavelet operator to each row of a subimage representing the subspace V(x)

k .

The second step in each transformation level is accomplished by applying the 1D

wavelet operator to each column of the subimage that is the output of the first step

in the level decomposition. This second step decomposes the subspace V(y)

k . We

illustrate the first two levels of the decomposition:

Level 1. The first step of the first level operates on the subspace V(x)

8 in the

product subspace V(x)

8 V(y)

8 and creates an intermediary subspace:

V(x)

8 V(y)

8

V(x)

7 W(x)7

V(y)

8 = V(x)7 V(y)8 W(x)7 V(y)8 .This decomposition produces two orthogonal subspaces labelled L: V

(x)7 V

(y)

8 and

H: W(x)7 V

(y)

8 . Labels L and H represent low- and high-frequencies, respectively

(Figure 1).

The second step of level 1 decomposition operates on the subspace V(y)

8con-

tained in the product subspace (V(x)

7 W(x)7 )V(y)8 creating the orthogonal subspace:

V(x)

7 W(x)7

V(y)

8

V(x)

7 W(x)7

V(y)

7 W(y)7

= V(x)7 V(y)7 W(x)7 V(y)7 V(x)7 W(y)7 W(x)7 W(y)7 .


9/21


Figure 1. 2D MRA decomposition (level 1, x-decomposition).

Figure 2. 2D MRA decomposition (level 1, y-decomposition).

This resultant subspace is composed of mutually orthogonal subspaces containing

either low-frequency components in x and y, high-frequency components in x

and y, or mixtures of low-frequency components in x and high-frequency com-

ponents in y or high-frequency components in x and low-frequency components

in y. This decomposition produces four subspaces with the following labelling

(Figure 2):

LL: V(x)

7 V(y)

7 , HH: W(x)7 W

(y)7 , HL: W

(x)7 V

(y)7 , LH: V

(x)7 W

(y)7 .

Level 2. In this level, we deal with the LL subspace (V(x)

7 V(y)

7 ) which cor-

responds to 128 128 image in the upper left-hand corner of the 256 256decomposed image. In the case ofV

(x)7 , this decomposition is

V(x)

7 V(y)

7

V(x)

6 W(x)6

V(y)

7 = V(x)6 V(y)7 W(x)6 V(y)7 .

In the V(y)

7 case, we have

V(y)

7 = V(x)

6 W(x)6 V(y)

6 W(y)6 = V(x)6 V(y)6 W(x)6 V(y)6 W(7)6 V(x)6 + W(x)6 W(y)6 .This decomposition produces four orthogonal subspaces labelled (Figure 3):

LLLL: V(x)

6 V(y)

6 , LLHL: V(x)

6 W(y)

6 , LLHL: W(x)6 V

(y)

6 , LLHH: W(x)6 W

(y)

6 .


10/21


Figure 3. 2D MRA decomposition (level 2).

3.1.2. Two-dimensional MRA Reconstruction

Image reconstruction is the inverse of image decomposition. The reconstruction

process is the inverse WT of the subspace

V(x) V(y) V(x) W(y) W(y) {j, k7} W(x)j W

(y)

k onto the subspace V

(x)8 V

(y)

8 ( = 0 for the 2D Haar WT, and = 1 for the 2DDaubechies and Coifman WT), where all the component 2D signals are super-

imposed over all frequency ranges. Since image reconstruction is the inverse of

image decomposition, all the steps of the decomposition process are inverted. The

step involving x-decomposition: V(x)

k (V(x)k1 W(x)k1) becomesV

(x)

k1 W(x)k1 V(x)k , and the step involving y-decomposition .

V(y)

k (V(y)k1 W(y)k1) becomes (V(y)k1 W(y)k1) V(y)k . The (x,y)-directionI of the two-step procedure in each level of decomposition is inverted: (1) y-

reconstruction and then (2) x-reconstruction.

4. Generation of Daubechies and Coifman Wavelet Coefficients

4.1. DAUBECHIES WAVELETS [1, 2, 8]

The coefficients for the Daubechies filter banks of order N must satisfy the Nth

order dilation equation for the scaling function which is the basis for subspace V0:

(x) =N1k=0

ck (2x k)

and also simultaneously satisfy the orthogonality requirements for the scaling func-

tion:

N1k=0

ckck2m = (m).


11/21


These two requirements can be simultaneously satisfied if, by solving the matrix

equation for an ((N2)(N2)) matrix operating upon the (N2) dimensionalvector

= M,where = [(1),(2) , . . . , ( N 1)]T, the vector comprised of Daubechies scal-ing functions evaluated at all the nonzero integer values in the interval [0, 1, 2, . . . ,N 1] which is the support interval for the Daubechies scaling function of or-der N. M is an orthogonal matrix whose eigenvector is with an eigenvalue of 1.

Therefore, the expression of each element of the vector as a linear sum of the

matrix product produces the dilation equation for (x) and the orthogonality ofM

will satisfy

N1

k=0ckck2m = (m).

The order of the coefficients within the matrix M is [c2jk], 1 j, k N 1,where j is the row index and k is the column index. In the case ofN = 4, the (22)matrix M is comprised of the sequence of 4th order scaling function coefficients

arranged in the row column sequence 2j k, 1 j, k 3j . Their arrangementin the matrix M, given by the index 2j k, is the row indicator in the range of1 j 2, and k is the column indicator of 1 k 2:

M =

c1 c0c3 c2

and is [(1),(2)]T. The solution of this matrix equation produces the sequence:

c0 =1 + 3

2, c1 =

1 + 332

,

c2 =1

3

2, c3 =

1

3

2.

These elements of matrix M are the coefficients for the D4 scaling function which

satisfies both requirements. Since the previous condition of (x) makes them or-

thogonal, the coefficients for the Daubechies wavelet function can be generated by

applying the conversion formula, valid for all orthogonal wavelets, that converts

the scaling function sequence {ck} into the coefficient sequence {dk} for the dilationequation for the wavelet function (x):

(x) =N1k=0

dk(2x k).

The conversion formula is: dk = (1)kc[(1 k) mod N]. For the case N = 4, thewavelet coefficient sequence is: d0 = c3, d1 = c2, d2 = c1, and d3 = c0.


12/21


4.2. COIFMAN WAVELETS [2]

The necessary and sufficient condition for a function to be MRA orthogonal wave-

lets is that the Fourier transform of the scaling function

(x) =N1k=0

ck (2x k)

satisfy the following equation in the frequency domain:

(2 ) = m0()(),where

m0( ) =1

2

k

ckei k,

and

{ck

}is the coefficient sequence for the dilation equation. The function m0( )

must also satisfy the added requirement for the orthogonality of the scaling func-tion (x) = kZ ck (2x k) which is met if it also satisfies the condition:m0( )2 + |m0( + )2 = 1 for all R.This property for the low-pass filter m0( ) is not restricted only to Coifman wave-

lets, but must apply to the low-pass filter component in the Fourier transform of

the dilation equation for the scaling function in all MRA orthogonal wavelets. The

Coifman wavelet differs from other MRA wavelets in that the low-pass filter m0( )

not only meets these requirements but also has the added property:

(k)

{2

} =

dk

dk m0()() =

0 for k

=0, 1, . . . , L .

This produces a zero moment of order L for the scaling function as well as for the

wavelet function, i.e.,

xk (x) dx = 0 for 0 k L.

The zero moment of order L for the Fourier transform of the scaling function ()

ensures that the functional approximations of both the Coifman wavelet and Coif-

man scaling function would be highly accurate, since the coefficients for the scaling

function series expansion for f(x) and the coefficients for the wavelet expansion

off(x) are given by:

f, =

f (x)(x) dx

and

f, =

f (x)(x) dx.


13/21


Iff(x) in both integrals is replaced by its Taylor series expansion, then one obtains

the expression for the coefficients of the scaling function series expansion:

f, =

Mk=0

f(k)(0)xk (x) dx

and for the coefficients of the wavelet series expansion:

f, =

Mk=0

f(k)(0)xk (x) dx.

If (x) and (x) both have zero moments of order L, then the coefficients of

the series expansions for both the wavelet and the scaling function will only have

nonzero components of terms in the Taylor series for f(x) that are of a higher order

than L.

This means that the wavelet coefficients in a finite series approximation of f(x)

will be centered only around higher moments in the Taylor series for f(x) and

thus have a rapid convergence. All orthogonal wavelets with a continuous deriv-

ative through order L have L-order zero moments; however, only the Coifman

wavelet has this property for the scaling function as well. It means that the series

approximations for the low frequency components of a signal will have as rapid

a convergence as those for the higher frequency components which have wavelet

function expansions.

EXAMPLE. Figure 4 shows the original image and its corresponding histogram.

Figure 5 shows the transform image for all levels, achieved by the application

of the D16 transform on the original image. Here we can observe the leptosis of

this histogram in comparison with the histogram of the original image.

Figure 4. Original 512 512 image of Aliki and Casey and its histogram.


14/21


Figure 5. D16 transformed image and its histogram.

5. Image Quantization

Image quantization is the conversion of elements of the transformed image into adata representation that requires less data storage than the original image. Quanti-

zation is achieved by the following two step process: (1) normalization and (2) inte-

ger conversion. Since all transform methods concentrate most of the image energy

in the low-frequency regions (Figure 3), most of the data in the high-frequency re-

gions have values that are negligible with respect to the values of the low-frequency

regions. These negligible values represent details of the original image. They are

of such fine resolution that the human visual system cannot distinguish (the differ-

ence) when one compares the perfectly reconstructed image and the reconstructed

image in which only transformed values of significant magnitude were retained.

Normalization converts all negligible values to 0 and simultaneously reduces

the magnitude of the next level of lower high-frequencies while preserving the

magnitude of the lowest frequency components (see LLLL in Figure 3). Thus,

normalization is achieved by dividing each element in the transformed image by a

value known as a quantizer. It is important to select a quantizer that will reduce

Figure 6. Quantized image and its histogram.


15/21


all negligible values to 0. The second step in quantization is the conversion of

these normalized values to integer values. This process is achieved by the following

operation

fq (x,y) = int

f (x,y)

q(x,y)

,

where f (x,y) is the transformed coefficient, q(x,y) is the quantizer, and fq (x,y)

is the normalized value converted to an integer. Integer conversion changes the

normalized data to a set of integer values that can be encoded in such a way as to

minimize their storage requirements. The quantizer must be adjusted to different

frequency regions of the transformed image. In the low-frequency region, the quan-

tizer must have relatively low value and, in the high-frequency region, the quantizer

must have a relatively high value. This quantizer adaptation is accomplished by

the use of non-uniform quantization. Non-uniform quantization utilizes two pa-

rameters: (1) the shifting size, and (2) the step size. The first parameter is usedto adjust the value of the quantizer for a given step size to different regions of

the transformed image. The second parameter is used to determine the range of

quantization. The minimum shifting size of 1 corresponding to the step size of 1

allows the quantizer to be adjusted for each element in the transformed image. In

this research, the effect of the shifting and the step size changes on the storage

requirements was examined. A highly localized quantization (small shifting size)

mainly concentrates the nonzero elements in a low-frequency region (upper left-

hand corner), but also increases the detail loss of quantization. A less localized

quantization (larger shifting size) concentrates the nonzero elements less, but it

reduces its detail loss of quantization. In this study, the step size is fixed at 1, while

shifting sizes as large as 128 and as small as 4 are used.

Continuation from the previous example: Figure 6 shows the effect of the quan-

tization on the transformed image. We can observe here the shifting of the his-

togram towards left.

6. Partial Decomposition and Pseudo-quantization

If we use any order of an MRA wavelet and decompose an image only through 2

levels, we use only the coefficients of the low-pass filter to reconstruct an image

that is extremely close to the original, very low MSE, and very high SNR. Thus,

by saving only a quarter of the coefficients in a 2-level decomposition, we can

reproduce the original image with a very small error. This, however, will give us

only a compression ratio of = 4 : 1 which is far too low to be of any practicaluse. If, on the other hand, we complete all the levels of decomposition and retain

only the low-pass coefficients in our image reconstruction, we will have achieved a

phenomenal compression ratio, but too much detail will have been lost to produce

an acceptable reconstruction.


16/21


The high MSE and low SNR, resulting from the reconstruction of the image

from only a few low-pass coefficients, show that as one proceeds into all levels of

decomposition, the high-pass components become increasingly important. A ques-

tion arises: can one perform a partial decomposition, saving only a relatively small

number of low-pass coefficients, discard all of the high-pass coefficients at the ini-

tial levels of decomposition, and save only a relatively few high-pass coefficients at

the lower levels of the decomposition that contain significant information, perform

a coding scheme (see Section 7) to reduce storage of the reduced level coefficients;

thus achieving excellent compression ratios and high quality image reconstruction?

We experimented with this procedure using all the wavelets under consider-

ation. We decomposed the image into 3 levels. For the 512 512 image, thismeant our subimage containing the low-pass coefficients was the 64 64 imagein the upper left-hand corner. For the 384 384 image, the subimage of low-passcoefficients was the 4848 image in the upper left-hand corner. We then computedthe mean and the standard deviation for the remaining portion of the decomposed

image. Then we thresholded the high-pass region using a 1-SD threshold; anycoefficient in this region whose value exceeded 1 SD was retained, while all the

others were zeroed out. Next we applied the same coding procedure to encode

the low-pass coefficients and the nonzero thresholded high-pass coefficients. The

reconstruction consisted of decoding the nonzero coefficients and the application

of 3-level reconstruction.

7. Image Encoding [5, 6]

Image encoding is the final step in the image compression process where integer

values of the quantized image are converted to binary symbols that require, on theaverage, less storage than the integer representation. Typically, zero and positive

integers less than 255 require only 1 byte of storage, while all other integer values

require at least 2 bytes of storage. Binary symbols are composed of sequences of

0 and 1 each requiring only a single bit of storage. Thus, the bit requirements for

the storage of a binary symbol equal the number of 0s and 1s that compose it.

The object of encoding is to assign short binary symbols to integer values with a

relatively high probability (frequency of occurrence) and longer binary symbols to

integer values with a relatively low probability.

There are a number of different encoding schemes available. This study deals

with the variable length coding (VLC) and the run length coding (RLC) methods.

The VLC method assigns binary symbols to the integer values the length of which

is a function of its probability value. The RLC methods condenses streams of 0

values to special symbols called tokens whose binary representation requires less

average storage than the assignment of a single bit to each of the consecutive 0s.

The VLC method is applied to the nonzero values, while the RLC method is applied

to the 0 values. In this study, a combination of these two methods was used; taking


17/21


Figure 7. Reconstructed image and its histogram.

advantage of the quantization effect of concentrating nonzero integer values in a

small region in the upper left-hand corner of the image.

In the case of the VLC method, a histogram analysis was first performed on

the quantized image in order to determine the probability of all integer values. For

the elements with the highest probability, a smaller binary symbol was used. The

elements with the third, fourth, fifth, and sixth highest probabilities were assigned

the binary symbols: 00, 01, 10, and 11, etc. For the values of the lowest probabil-

ities, longer binary symbols were assigned. However, the average number of bits

per binary symbol was always equal to the total entropy of the encoded image.

Thus, the long binary symbols assigned to the integers with a low probability did

not significantly increase the average storage requirements. Therefore, the RLC

method reduces the number of 0s. If, for example, the following pattern of integers

occurs: 1324000000000345, then the token (0 : 9) is assigned to 9 consecutive 0s.The resulting representation is 1324(0 : 9)345. The token (0 : 9) will be convertedto a binary symbol using the VLC method. When quantized integers and tokens for

the sequences of 0s have been encoded, the data can be stored in a binary stream in

which each binary symbol is distinguished from its adjacent symbols by a special

prefix code. The quantized image can be recovered without loss by reversing this

coding operation.

Continuation from the previous example: Figure 7 shows the decompressed

(reconstructed) image using the entire sequence of decoding, dequantization, and

inverse transformation. We observe here that the histogram of the reconstructed

image closely resembles the histogram of the original image.

8. ImageComparison Methods [7 9]

The fidelity of an image after reconstruction is an important aspect of image com-

pression. Compression methods are lossless product images that are an exact replica

of the original image. Lossy compression methods that remove visual redundan-

cies produce decompressed images that have lost visual information during the


18/21


compression process. One measurement of the reconstructed image fidelity is that

of the mean square error, MSE, between the reconstructed image fr (x,y) and the

original image f (x,y). The smaller the MSE, the closer the compressed image is

to the original image. The signal to noise ratio (SNR) of the reconstructed image is

also an important index and it is computed using

SNR =M1

y=0N1

x=0 (f(x,y))2M1

y=0N1

x=0 (f(x,y) fr (x,y))2.

The closer the reconstructed image is to the original image, the higher the signal

to the noise ratio will be. However, the main difficulty with using the MSE as a

measure of the image quality is that in many instances these values do not match the

quality that is perceived by the human visual system. In addition to MSE and SNR,

the compression ratio (CR) is computed. CR is the measure of the compressed

image size versus the original image size and its % is

Compression ratio (%)

= size of original imagel (byte) size of compressed image (byte)size of original image (byte)

100.

There are several statistical measures that give quantitative information about an

image. These statistical measures include the mean value, the maximum value, the

minimum value, the entropy, and the standard deviation.

9. Results

Tables IIII show, in a summarized fashion, the results obtained from all the image

compression measures used in this study. The entropy of the 512 512 inputimage was 7.316, while the entropy of the 384 384 image was 7.564 and theshifting/step ratio used in the non-uniform scalar quantization was 4/1.

Table I. Comparison of the compressed result of 512 512 image

WT Entropy of the Compressed CR CR Entropy of MSE SNR

transformed size (byte) % the inverse

image image

DCT 0.24 34345 7.63

:1 86.90 7.48 61.21 220.85

D2 0.17 34093 7.69 : 1 87.00 7.65 95.61 141.38D4 0.32 31818 8.24 : 1 87.86 7.66 65.80 205.43D8 0.35 31321 8.37 : 1 88.05 7.67 57.30 235.91D16 0.80 31816 8.24 : 1 87.87 7.68 55.23 244.77


19/21


Table II. Comparison of the compressed result of 512 512 image with pseudo-uantization



image image

D2-SQ 2.34 37041 7.07 : 1 85.87 7.57 178.55 75.71D4-SQ 2.10 36515 7.18 : 1 86.07 7.63 125.91 107.36D8-SQ 1.57 36444 7.19 : 1 86.10 7.63 107.62 125.60D16-SQ 1.76 36652 7.15 : 1 86.02 7.64 103.27 130.89

Table III. Comparison of the compressed result of 384 384 image



image image

D6 0.5416 20912 7.05 : 1 85.82 7.67 57.59 232.05D12 0.5750 20910 7.05 : 1 85.82 7.68 52.59 254.11C6 0.4153 21035 7.01 : 1 85.74 7.67 59.39 225.01C12 0.5174 20637 7.15 : 1 86.01 7.67 58.02 230.34

10. Discussion and Conclusion

10.1. EVALUATION OF THE COMPLETE DECOMPOSITION, QUANTIZATION,

AND ENCODING

The evaluation of DCT and wavelet transforms was based on: (1) Compressionratio, (2) Quality of the reconstructed image (MSE and SNR), and (3), to a lesser

extent, computational speed. As to the compression ratio, D2, D4, D8, and D16

had a better compression ratio than that of DCT. Therefore, with respect to CR, the

Daubechies wavelets were superior performers than DCT. More specifically, D8

and D16 had the best compression ratios (88 + % of the original image).Regarding the quality of the reconstructed image, D2 and D4 had higher MSE

values than DCT, D2: 95.61 and D4 65.80, whereas the MSE for the DCT was

61.21. The difference between the DCT and D4 is small +4.59, whereas the dif-ference between the DCT and D2 is considerable: +34.40, in fact, the MSE for D2indicates a low quality reconstruction. For D8 the MSE was better: 3.91 and D16it was

5.98. All these MSE values are in the range of good quality reconstruction.

Concerning the transforms which were applied to the 384 384 image, thecomparative results in the compression ratio, low MSE, and high SNR between

D6, D12, C6, and C12 wavelets are identical to that of the transforms applied to the

512512 image. Again, D12 had the best performance in both the lowest MSE andthe highest SNR. C12 had the best performance with respect to CR. However, in all


20/21


the three categories the results were so similar that none of the transforms could be

categorically listed as superior. Because of the added L-order zero moment to the

scaling function for the Coifman wavelets, it was expected that they would have a

decided advantage to D6 and D12, but the performance of both D6 and D12 was

as good as that of the Coifman wavelets.

10.2. EVALUATION OF THE PARTIAL DECOMPOSITION AND

PSEUDO-QUANTIZATION

The compression ratio of the four wavelets applied in this study was comparable to

the compression ratios obtained when the complete decomposition, quantization,

and encoding were applied. Limiting the decomposition to 3 levels, thresholding

at 1 SD of the high-pass portions of the 3-level transformed image, together with

encoding produced compression ratios that were 85+

%. However, by other more

important criteria, low MSE and high SNR, this method produced decidedly poorer

results. The MSE ranged from 103.27, D16, 178.55, D2; and the SNR ranged from

75.71, D2, to 130.89, D16. The values of these parameters are not in the range of

acceptable quality. The decision to zero out all high-pass coefficients that had a

magnitude of less than 1 SD probably caused us to loose significant detail. If the

quantization had been applied to the coefficients in the high-pass regions rather

than simple thresholding, the MSE and SNR values conceivably could be much

better. An interesting outcome of this study occurred as a result of tabulating the

image entropy in each step of the compression and reconstruction process. Tables

IIII show the entropy levels at all the levels of the compression and reconstruction

process, the Daubechies wavelets had higher entropy levels than did the DCT.

Lower entropy constraints that nevertheless yield high MSE and SNR values arean important asset to a transform in data compression applications computational

time: the code for all Daubechies wavelets executed considerably faster than the

code for the DCT, at least by several orders of magnitude.

References

1. Chui, C. K.: An Introduction to Wavelets, Academic Press, New York, 1992.

2. Daubechies, I.: Ten Lectures on Wavelets, SIAM, Philadelphia, PA, 1992.

3. Devore, R., Jeweth, B., and Lucier, B.: Image compression through wavelet transform coding,

IEEE Trans. Inform. Theory (1992), 719746.

4. Friesen, K., Panagiotacopulos, N., Lertsuntivit, S., and Lee, J.: Wavelet image compression:

A comparative study, in: Advanced and Intelligent Systems: Concept, Tools, and Applications,

Kluwer Academic Publishers, Dordrecht, 1998, pp. 265276.

5. Jain, A. K.: Fundamentals of Digital Image Processing, Prentice-Hall, Englewood Cliffs, NJ,

1989.

6. Nelson, M. and Gailly, J. L.: The Data Compression Book, M&T Books, New York, 1996.


21/21


7. Vetterli, M. and Kovacevic, J.: Wavelet and Subband Coding, Prentice-Hall, Englewood Cliffs,

NJ, 1995.

8. Walter, G. G.: Wavelets and Other Orthogonal System with Applications, CRC Press, Boca

Raton, FL, 1994.

9. Weeks, A. R.: Fundamentals of Electronic Image Processing, SPIE Press, Bellingham, 1996.

Lossy Image Compression Using Wavelets

Documents

Transcript of Lossy Image Compression Using Wavelets