Image Compression With Haar Discrete Wavelet Transform

Cory Cox

ME 535: Computational Techniques in Mech. Eng.

Figure 1 : An example of the 2D discrete wavelet transform that is used in JPEG2000.

Source: http://en.wikipedia.org/wiki/File:Jpeg2000_2-level_wavelet_transform-lichtenstein.png

Intro

Importance Of Image Compression

In 2010 Google started incorporating web page loading times into its search ranking

algorithm. Its reasoning behind the change was: “Faster sites create happy users

[...] Like us, our users place a lot of value in speed – that’s why we’ve decided to take site speed

into account in our search rankings.” As the internet becomes a more feature rich and graphic

intensive platform, image compression becomes more and more important, especially since

mobile phones and other devices are joining the fray along with more traditional and powerful

laptops and desktop computers. To provide a satisfactory internet experience it’s necessary to

compress images in order to optimize page loading times. Since search engine optimization is

one of the most widely studied marketing strategies it naturally follows that image compression

has become a vital component of web site design, especially since Google’s search ranking incorporates

loading times.

To further impress the importance of image compression we can look at some data from a study

performed by the mobile web developer Trilibis. [i] Trilibis reviewed 155 major websites and

measured the image weight relative to the total weight of each site. The following bar graph

shows just how much images contribute to the loading time for web pages.

Figure 2 : Plot of weight of web pages versus the loading time

Obviously image weight is a significant contributor to page weight, making up more than half of

the total weight in most cases. Now to illustrate how significant image compression is in

reducing the weight of a web page let’s look at Trilibis’ plot of weight savings after image

compression across different devices.

Figure 3: Page weight savings from image compression

Image compression is therefore a valuable tool for improving web page load times. It’s also

useful in many other applications such as storing image files on memory cards or hard drives.

Now let’s look at one method for image compression, the Haar discrete wavelet transform

approach.

Haar Discrete Wavelet Transform Method

To begin, let’s assume that we’re working with a grayscale image. This means that each pixel is

represented with an integer value between zero (black) and 255 (white). Our goal is to save only

the most relevant pixel information with fewer values (smaller file size) while allowing the entire

image to be accurately reconstructed using only those relevant pixels. For a simple example let’s

first look at a single row vector; later we’ll move on to a matrix.

Say we have the following row of pixel information that we want to save.

We can represent these numbers in a variety of ways. As an example of what not to do, we could

save the vector as a series of paired averages, taking the first two pixel values and averaging

them, then the next two pixel values and so on. We’d get something like this:

That would give a relatively good representation of the original data, but it would be impossible

to accurately reconstruct the original data from the 4 so-called “approximation coefficients”

above. To improve this method we could save some more values that give us some idea of how

we could use the approximation coefficients to reconstruct the original data. Let’s save 4 more

values, bringing our total number of saved values to 8. This is the same number as we began

with! Why not just save those original 8 values then? We’ll get to that in a second. Let’s say that

we save the following 8 values:

The first 4 values are the values we originally picked to save: the averages of neighboring pairs.

Various sources call those values the “sums” or the “approximation coefficients”. The last 4

values give us the distance from each average to the surrounding points. Some sources denote

these values as the “detail coefficients” or the “differences”. For example, the approximation

coefficient 5 is the average of the first two original points, and the distance to the surrounding

points is given by the detail coefficient 1, so we know the original values were 4 and 6. [ii]
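
The averaging and differencing step described above can be sketched in a few lines of Python (used here for illustration; the report itself works in Matlab). The function names and the sample row are hypothetical: only the first pair (4, 6) comes from the text. The sketch stores a signed half-difference rather than an unsigned distance, so the sign records which neighbor is larger.

```python
def haar_step(row):
    """Split a row of even length into pairwise averages and half-differences."""
    averages = [(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]
    details = [(row[i] - row[i + 1]) / 2 for i in range(0, len(row), 2)]
    return averages, details


def haar_unstep(averages, details):
    """Rebuild the row: each pair is (average + detail, average - detail)."""
    row = []
    for a, d in zip(averages, details):
        row.extend([a + d, a - d])
    return row


# Hypothetical 8-pixel row; only the first pair (4, 6) is taken from the text.
row = [4, 6, 10, 12, 8, 6, 5, 5]
averages, details = haar_step(row)
restored = haar_unstep(averages, details)
```

For the first pair the average is 5 and the signed detail is -1, so the pair is recovered as 5 + (-1) = 4 and 5 - (-1) = 6.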

The reason that we choose to save these 8 values is that they can accurately reconstruct the

original data plus they give us an idea of the rate of change of the data in an area. When the

distance to a surrounding point is small we know that the data at that location is all relatively

similar and not much changes as the location changes. This can be visualized as an area of an

image where the colors are relatively similar. When the distance to a surrounding point is large

we know that the data at that location is very different from the data surrounding it. This

corresponds to an area of an image where colors are changing at some kind of edge.

The fact remains that we started with 8 values and we are claiming that we can accomplish

image compression by saving 8 values, which is not intuitive. The reason that this method is

effective is that the differences, if small, can be approximated as zero and then discarded. By

iterating this process on matrices the Haar discrete wavelet transform focuses the energy of the

matrix in the upper left hand corner, leaving mostly zero values or near zero values elsewhere.

Let’s look at the procedure for Haar wavelet transforms (HWT) for matrices more in depth.

Say we have an image matrix, A, which stores grayscale pixel data for an image using integer

values between 0 and 255 :

If we look at the first row of matrix A and follow the procedure we outlined above then we’ll

start by splitting the row up into pairs:

Now we’ll find the approximation coefficients, or the sums, which are the averages of each of the

pairs.

We also need to find the detail coefficients, or the differences, which are the distances from each

average to the corresponding points on either side of it. We can calculate these more explicitly as

half of the difference of each pair.

If we combine the approximation coefficients and the detail coefficients into one row then we’ve

found the first iteration of the Haar discrete wavelet transform of the first row of the matrix.

If we repeat that process on the approximation coefficients from the above row (just the first 4

values) then we’ll start with the following pairs:

We need to find the approximation and detail coefficients the way we did before:

And we’ll combine those into a row vector with the 1st iteration detail coefficients.

We repeat this once more for the first row and then perform the same operation for all of the rest

of the rows, as well as all of the columns. The resulting matrix is shown below:
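
As a rough sketch of the procedure just described (every row fully transformed, then every column), the following Python code repeats the averaging and differencing on a shrinking prefix of each row. The names `haar_1d` and `haar_2d` are hypothetical, and the sketch assumes a square matrix whose side is a power of two.

```python
def haar_1d(values):
    """Full Haar transform of a list whose length is a power of two."""
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        avgs = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        dets = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:n] = avgs + dets      # averages first, then this pass's details
        n = half                   # next pass only touches the averages
    return out


def haar_2d(matrix):
    """Transform every row, then every column of the row-transformed matrix."""
    rows = [haar_1d(r) for r in matrix]
    cols = [haar_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


flat = haar_2d([[9, 9, 9, 9] for _ in range(4)])
```

A perfectly uniform image is the extreme case: the only non-zero coefficient is the upper left entry, which holds the overall average.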

Most of the energy has been concentrated into the upper left hand entry and the rest of the entries

are either zero or relatively close to zero. This is a result of neighboring pixels in images

generally being relatively similar to each other in terms of color or grayscale intensity, with the

exception of pixels that define edges and outlines of shapes. If we count the number of entries

with a value of zero we’ll notice that there are 16 zeros in the above matrix.

Now let’s compress the image represented by the matrix above by picking a cutoff value such

that any coefficient with an absolute value less than that cutoff value is set to zero. Let’s pick a

cutoff value of 0.25 and set all entries in the above matrix that have an absolute value of less than

0.25 equal to zero. This results in the matrix pictured below:
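
The cutoff step can be sketched as below. The coefficient values are hypothetical and are not the matrix from the text; only the cutoff of 0.25 and the strict "less than" comparison come from the description above.

```python
def threshold(matrix, cutoff):
    """Zero every entry whose absolute value is strictly below the cutoff."""
    return [[0 if abs(v) < cutoff else v for v in row] for row in matrix]


def count_zeros(matrix):
    """Number of entries that are exactly zero."""
    return sum(1 for row in matrix for v in row if v == 0)


# Hypothetical transformed coefficients, not the matrix from the text.
coeffs = [[120.5, 0.2, -0.1, 0.0],
          [3.0, -0.24, 0.25, 0.0],
          [-0.3, 0.05, 0.0, 1.5],
          [0.0, 0.1, -2.0, 0.2]]
compressed = threshold(coeffs, 0.25)
```

Note that an entry exactly equal to the cutoff (0.25 here) is kept, because the rule only discards values strictly below it.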

If we count the number of entries with a value of zero we find that there are now 37 such entries

as opposed to 16 zero entries previously.

To calculate the compression ratio we take the number of non-zero entries in the transformed matrix

and divide by the number of non-zero entries in the thresholded (compressed) matrix.
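
A minimal sketch of that ratio follows; the function names are hypothetical. The last line plugs in the zero counts quoted above under the assumption that the matrix is 8 x 8 (64 entries), which the eight-entry rows suggest but the text does not state.

```python
def nonzero_count(matrix):
    """Number of entries that are not exactly zero."""
    return sum(1 for row in matrix for v in row if v != 0)


def compression_ratio(transformed, compressed):
    """Non-zero entries before thresholding divided by non-zero entries after."""
    kept = nonzero_count(compressed)
    if kept == 0:
        raise ValueError("compressed matrix has no non-zero entries")
    return nonzero_count(transformed) / kept


# Counts quoted in the text, ASSUMING (not stated) that the matrix is 8 x 8.
ratio_from_text = (64 - 16) / (64 - 37)   # 48 / 27
```

Under that assumption the ratio is 48 / 27, or roughly 1.78.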

Pixel Coding and Usage of Masks

To further optimize the compression of an image we can use different numbers of bits to store

the pixel information from each section of the matrix. As an example we’ll again reference the

compressed matrix, A, that we worked with in the above section. Most of the energy of the

matrix is contained in the upper left hand corner so we should use more bits to store that

information and we can use fewer bits for the sections of the matrix where the entries are mostly

zeros or close to zero.

Compression Analysis

We’ll be looking at a few different criteria for assessing the overall success of image compression.

1. Compression Ratio

2. Mean Square Error

3. Peak Signal to Noise Ratio

The compression ratio is the most obvious quantitative measurement of the success of image

compression. As described above, it is a way to compare the amount of significant information

contained in the original image matrix to the amount of significant information contained in the

compressed image matrix. This can be found simply by comparing the file size of the original

image to the file size of the compressed image. An image that originally has a file size of 5 MB

that is compressed to have a file size of 1 MB would have a compression ratio of 5:1, for

example.

The mean square error is less of a compression evaluation than it is a quality evaluation. It is a

way to directly compare the accuracy of a compressed and reconstructed image to the original

image in terms of individual pixel values.
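
For reference, the standard definition of the mean square error, written with the symbols used in this section and assumed to match the formula the text refers to, is:

```latex
\mathrm{MSE} = \frac{1}{m\,n} \sum_{x=1}^{m} \sum_{y=1}^{n} \left[ I(x, y) - I'(x, y) \right]^{2}
```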

In the above formula the dimensions of the image are denoted by m and n and I is the intensity of

the individual grayscale pixel values. I(x, y) are the pixel values for the original image and

I’(x, y) are the pixel values for the compressed and reconstructed image.

Whereas the mean square error measures the average squared error per pixel, the peak signal

to noise ratio compares that error to the largest possible pixel value. In terms of image compression the signal is the

original image and the noise is the error that occurs as a result of the compression and

reconstruction. The peak signal to noise ratio equation is given below, in terms of the mean

square error:
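
A common form for 8-bit grayscale images, where the peak pixel value is 255, is shown here; it is the standard definition and is assumed to be the one intended:

```latex
\mathrm{PSNR} = 10 \log_{10}\!\left( \frac{255^{2}}{\mathrm{MSE}} \right) \ \text{dB}
```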

So generally a better image compression will result in lower mean square error and a higher peak

signal to noise ratio. [iii]
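
Both quality measures can be computed with a short sketch like the one below. It assumes the standard definitions (mean of squared pixel differences, and 10 log10 of peak squared over MSE with a peak of 255 for 8-bit images); the function names are hypothetical.

```python
import math


def mse(original, reconstructed):
    """Mean of the squared pixel differences over an m x n image."""
    m, n = len(original), len(original[0])
    total = sum((original[x][y] - reconstructed[x][y]) ** 2
                for x in range(m) for y in range(n))
    return total / (m * n)


def psnr(original, reconstructed, peak=255):
    """Peak signal to noise ratio in dB; infinite when the images are identical."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


original = [[10, 20], [30, 40]]
reconstructed = [[12, 20], [30, 36]]
```

In this toy example the squared errors are 4, 0, 0 and 16, so the MSE is 20 / 4 = 5.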

Procedure

The procedure that we’ll follow for compressing the image under study using a Haar discrete

wavelet transform is as follows:

Compressing the Image

1. Start with grayscale image of size 256 x 256

To begin with we’ll pick a picture in full color and crop it to the appropriate size. The starting

image is shown below in Figure 4.

Figure 4

We’ll convert it to a grayscale picture using Matlab. Unfortunately I don’t have access to the

Image Processing Toolbox so there will be a fair amount of coding in the name of finding a

workaround for some of the functions that come standard in the Image Processing Toolbox. The

reason we convert to a grayscale picture is to obtain a simple intensity map which is much easier

to work with. The grayscale image is seen below in Figure 5.
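
The report does not say which conversion was used. One common choice, shown here in Python purely as an assumption, is a weighted sum of the red, green and blue channels using the ITU-R BT.601 luma weights. The function name `to_grayscale` is hypothetical.

```python
def to_grayscale(rgb_image):
    """Map each (R, G, B) pixel to a single 0-255 intensity.

    The weights 0.299 / 0.587 / 0.114 are an assumed choice (BT.601 luma),
    not something stated in the report.
    """
    gray = []
    for row in rgb_image:
        gray.append([min(255, max(0, round(0.299 * r + 0.587 * g + 0.114 * b)))
                     for r, g, b in row])
    return gray


sample = [[(255, 255, 255), (0, 0, 0)],
          [(255, 0, 0), (0, 255, 0)]]
gray = to_grayscale(sample)
```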

Figure 5 : The grayscale conversion

2. Scan a row of the image at a time, finding the sums/differences between neighboring

entries in the image matrix

This is easily accomplished using a “for” loop in Matlab.

3. Split the image matrix into a left side and a right side, storing the sums or approximation

coefficients in one half and the differences or detail coefficients in the other half.

Figure 6: After one iteration of row sums

In Figure 6 above we’ve split the original grayscale image seen in Figure 5 into a left half and

right half. For each entry in the original grayscale image matrix we calculated the sum and

difference between neighboring entries, which we use to generate Figure 6, which contains the

sums of the consecutive entries on one half and the differences between consecutive entries on

the other half. Now we’ll do the same for the columns of our original grayscale image.
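
Steps 2 and 3 might look like the following loop, written in Python rather than the Matlab used for the report. It is a sketch that assumes the "sums" are pairwise averages and the "differences" are half-differences, as in the method section; `row_pass` is a hypothetical name.

```python
def row_pass(image):
    """One pass over every row: averages go in the left half, differences in the right."""
    width = len(image[0])
    half = width // 2
    out = []
    for row in image:
        new_row = [0.0] * width
        for i in range(half):
            a, b = row[2 * i], row[2 * i + 1]
            new_row[i] = (a + b) / 2          # approximation coefficient
            new_row[half + i] = (a - b) / 2   # detail coefficient
        out.append(new_row)
    return out


result = row_pass([[4, 6, 10, 12],
                   [8, 8, 2, 0]])
```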

4. Scan the image matrix by columns, finding the sums/differences between neighboring

entries

Once again, this is very easily done in Matlab with a simple “for” loop.

5. Split the matrix into a top half and bottom half, storing the sums in one half and the

differences in the other half.

Figure 7 : After one iteration of row sums and column sums

Here we’ve taken the sums and differences and stored the sums in the upper half of the new

image matrix and the differences in the lower half of the new image matrix.
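
The column scan in steps 4 and 5 is the same operation applied down each column. A sketch under the same assumptions (averages and half-differences, hypothetical function name `column_pass`):

```python
def column_pass(image):
    """One pass over every column: averages go in the top half, differences in the bottom."""
    height, width = len(image), len(image[0])
    half = height // 2
    out = [[0.0] * width for _ in range(height)]
    for c in range(width):
        for i in range(half):
            top, bottom = image[2 * i][c], image[2 * i + 1][c]
            out[i][c] = (top + bottom) / 2          # approximation coefficient
            out[half + i][c] = (top - bottom) / 2   # detail coefficient
    return out


result = column_pass([[4, 6],
                      [8, 2],
                      [1, 3],
                      [5, 7]])
```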

6. Repeat steps 2-5 for the smaller matrix where the sums of the column scan and the row

scan overlap. In our case we’ll repeat 4 times to obtain an image matrix where all of the row

and column sums are concentrated in the upper left hand corner in a 16 x 16 sub-matrix.

In Figures 8 and 9 below we’ve shown the resulting sums after the second iteration and after the

fourth iteration, at which point the row and column sums are concentrated into a 16 x 16 area in

the upper left hand corner.

Figure 8 : Second iteration

Figure 9 : After 4 iterations
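
Putting the steps together, the repeated row and column passes on a shrinking upper left block could be sketched as follows. This is an illustrative Python version, not the author's Matlab code; `haar_level` and `haar_compress` are hypothetical names, and the image is assumed to be square with a side divisible by 2 raised to the number of levels.

```python
def haar_level(image, size):
    """One row pass then one column pass on the top-left size x size block, in place."""
    half = size // 2
    for r in range(size):
        row = image[r][:size]
        image[r][:size] = ([(row[2 * i] + row[2 * i + 1]) / 2 for i in range(half)] +
                           [(row[2 * i] - row[2 * i + 1]) / 2 for i in range(half)])
    for c in range(size):
        col = [image[r][c] for r in range(size)]
        new = ([(col[2 * i] + col[2 * i + 1]) / 2 for i in range(half)] +
               [(col[2 * i] - col[2 * i + 1]) / 2 for i in range(half)])
        for r in range(size):
            image[r][c] = new[r]


def haar_compress(image, levels):
    """Apply `levels` passes; return the coefficients and the side of the final sums block."""
    out = [list(map(float, row)) for row in image]
    size = len(out)
    for _ in range(levels):
        haar_level(out, size)
        size //= 2          # the next pass only touches the block of sums
    return out, size


# Synthetic 256 x 256 test pattern standing in for the grayscale photo.
pattern = [[(r + c) % 256 for c in range(256)] for r in range(256)]
coeffs, block = haar_compress(pattern, 4)
```

With a 256 x 256 input and 4 levels, the sums end up in a 16 x 16 block, matching the description above.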

Decompressing the Image

To decompress the image and see what kind of errors are present after the compression and

restoration we follow a similar process but in reverse. The steps are so similar to the compression

process that we won’t go over each step in detail.

1. Reverse the steps back to the original size

2. Reverse the sums/differences for each column of the matrix

3. Reverse the sums/differences for each row of the matrix

4. Repeat steps 2 and 3 for successively bigger matrices until we’re back at the original 256

x 256 image
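
The reversal can be sketched as below, again in Python with hypothetical names. It assumes the forward transform stored averages followed by half-differences, so each pair is rebuilt as (average + difference, average - difference), and it undoes the column pass before the row pass, starting from the smallest block that was transformed.

```python
def unpair(values):
    """Invert one pass: [averages | differences] back to the original values."""
    half = len(values) // 2
    out = []
    for a, d in zip(values[:half], values[half:]):
        out.extend([a + d, a - d])
    return out


def inverse_level(coeffs, size):
    """Undo the column pass, then the row pass, on the top-left size x size block, in place."""
    for c in range(size):
        col = unpair([coeffs[r][c] for r in range(size)])
        for r in range(size):
            coeffs[r][c] = col[r]
    for r in range(size):
        coeffs[r][:size] = unpair(coeffs[r][:size])


def haar_decompress(coeffs, levels):
    """Reverse `levels` passes, working outward to the full image."""
    out = [list(row) for row in coeffs]
    size = len(out) >> (levels - 1)   # smallest block the forward transform touched
    for _ in range(levels):
        inverse_level(out, size)
        size *= 2
    return out


restored = haar_decompress([[5, 1], [0, -2]], 1)
```

For a 256 x 256 image compressed with 4 levels, the loop starts at the 32 x 32 block and ends at the full 256 x 256 matrix.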

The results of the decompression are shown below in Figure 10. Note that it is slightly more

pixelated than the original grayscale image in Figure 5.

Figure 10: Decompressed and reconstructed image showing signs of pixelation

Analysis

The point of image compression is obviously to reduce the file size of an image by eliminating

redundant pixels and areas. However it is also important that the image can be decompressed and

reconstructed successfully while minimizing the errors in the image. There are a few ways to

analyze the compressive capabilities and the quality of the compression, as previously discussed

in the method description. Examining the compression ratio is the most obvious way to assess

the compressive qualities and calculating the mean square error between the pixels of the

compressed image and the original image is a good way to analyze the quality of the

compression. In addition to those two analyses we will also look at the peak signal to noise ratio

which is a way to relate the power of the maximum signal in the image to the power of the noise

that corrupts the image’s fidelity. In the instance of image compression the signal is the original

image and the noise is the error that compression causes.

Let’s look first at the compression ratio, using a mask that allocates 8 bits to the highest energy

16 x 16 matrix in the upper left hand corner of the compressed image, 6 bits to the 32 x 32 matrix

that surrounds the upper left hand corner, 4 bits to the 64 x 64 matrix that makes up the next

level, 2 bits to the 128 x 128 matrix and 0 bits to the rest of the 256 x 256 matrix. This mask seems to give

a good mix of compression and quality. For this mask we see the following compression ratio:

Original Image Size    Compressed File Size    Compression Ratio

48,469 bytes           4,023 bytes             12:1
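
The bit budget implied by this mask can be tallied with a short sketch. How the mask was turned into stored values is not described in the report, so the code below only counts bits per coefficient; `bits_for` and the `levels` tuple layout are hypothetical.

```python
def bits_for(row, col, levels=((16, 8), (32, 6), (64, 4), (128, 2), (256, 0))):
    """Bits assigned to the coefficient at (row, col): the smallest enclosing block wins."""
    for block, bits in levels:
        if row < block and col < block:
            return bits
    return 0


mask = [[bits_for(r, c) for c in range(256)] for r in range(256)]
total_bits = sum(sum(row) for row in mask)
raw_bits = 256 * 256 * 8          # uncompressed 8-bit grayscale, for comparison
```

The mask adds up to 43,520 bits (5,440 bytes). Compared with raw 8-bit storage that is roughly 12:1, which is in line with the ratio in the table, although the table compares file sizes rather than raw bit counts.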

Now let’s look at the mean square error for several different masks:

And finally we’ll examine the corresponding peak signal to noise ratio (since it’s a function of

mean square error):

Obviously these plots show that as you use more bits to code each pixel you reduce the

cumulative error and decrease the amount of noise relative to the peak signal, which is indicative

of higher quality compression. We can get an idea of how our compression ratio of 12:1

compares to other compression ratios by doing a quick online search. It looks like a lot of online

image optimizers give the user the option to compress images at a ratio of anywhere from 1:1 to

99:1. So our compression trends towards the higher quality, less compression end of the

spectrum, as opposed to higher compression rates which sacrifice the quality of the reconstructed

image.

We can compare the mean square error results we got to some other studies that have been done

on image compression to see how our results look comparatively. Let’s look at a plot of mean

square error for several different video compression techniques.

The video compression software uses 5 different coding techniques on 59 different frames from a

video clip of football footage and finds the MSE for each frame. The MSE for the frames ranges

from around 150 to about 450. This makes sense compared to our results, especially if we were

using fewer bits per pixel (maybe 2 or 3) to compress each image. In general the MSE is on the

same order of magnitude.

Next we’ll look at a study where the researcher was varying the bits per pixel being used for

several different images.

Once again we see that the order of magnitude of the mean square error is on par with our image

compression results. They get slightly better results using 1 bit per pixel than we did, but other than

that the results seem quite similar.

References

[i] Gesenhue, Amy. "Study: Load Times For 69% Of Responsive Design Mobile Sites Deemed "Unacceptable""

Marketing Land. MarketingLand, 22 Apr. 2014. Web. 19 May 2014.

[ii] Khoury, Joseph. "Application to Image Compression." Application to Image Compression. University of Ottawa,

n.d. Web. 3 June 2014.

[iii] Kumar, Satish. "An Introduction to Image Compression." An Introduction to Image Compression. DebugMode, 22

Oct. 2001. Web. 1 June 2014.

Bibliography

"Discrete Wavelet Transform." Wikipedia. Wikimedia Foundation, 06 June 2014. Web. 08 June 2014.

Emery, Ashley. "Wavelets." ME 535 Course Website. University of Washington, n.d. Web. 4 May 2014.

Gesenhue, Amy. "Study: Load Times For 69% Of Responsive Design Mobile Sites Deemed "Unacceptable""

Marketing Land. MarketingLand, 22 Apr. 2014. Web. 19 May 2014.

Husen. "Haar Wavelet Image Compression." Ohio State Mathematics. Ohio State University, Winter 2010. Web. 27

May 2014.

"Image Compression." Wikipedia. Wikimedia Foundation, 06 June 2014. Web. 3 June 2014.

Khoury, Joseph. "Application to Image Compression." Application to Image Compression. University of Ottawa,

n.d. Web. 3 June 2014.

Kumar, Satish. "An Introduction to Image Compression." An Introduction to Image Compression. DebugMode, 22

Oct. 2001. Web. 1 June 2014.
