
Feature Preserving Image Compression: A Survey

Osslan Osiris Vergara Villegas, Raúl Pinto Elías and Vianey Guadalupe Cruz Sánchez Centro Nacional de Investigación y Desarrollo Tecnológico (cenidet)

Interior Internado Palmira S/N, Col. Palmira, Cuernavaca Morelos México {osslan, rpinto, vianey}@cenidet.edu.mx

Abstract

With the increasing use of the Internet and wireless mobile devices, digital information needs to be sent and received efficiently at low bit rates in order to exploit the available bandwidth. At low bit rates it is almost impossible to avoid introducing errors or artifacts in images. To address this problem, researchers are trying to design and build lossy image coders that preserve the important features of images. With these approaches, the features of an image that are most important for perception and recognition are preserved even at low bit rates. In this paper a review of works that propose feature preserving image compression (FPIC) algorithms is presented. Finally, a new methodology to obtain FPIC is presented.

1. Introduction

The great growth in the use of the Internet and mobile communication devices has generated a revolution in the way human beings communicate and exchange information. Efficient delivery of digital information (e.g., images) on those devices is imperative, and different methods to achieve it have been proposed.

A digital image requires a large amount of storage space and bandwidth for transmission; in mobile devices this is a problem because the available space and bandwidth can be exhausted or saturated rapidly. A possible solution is to find a representation that uses less information to represent digital images; from this necessity image compression emerged in the field of video and digital images.

Image compression addresses the problem of reducing the amount of data required to represent a digital image and is achieved by removing redundant information from the image [1]. An attractive scheme is lossy image compression, which saves a large amount of storage space but sacrifices some image quality.

A typical image compression system consists of three stages [2]: a) a reversible discrete transform, b) quantization and c) entropy coding. To decode the image, the inverse process is applied. Figure 1 shows the general framework for lossy image compression.

Fig. 1. General scheme of lossy image compression.
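As an illustration of this pipeline, the following sketch (assuming the NumPy and PyWavelets packages; the wavelet choice and quantization step are arbitrary) applies a DWT, uniform scalar quantization and the inverse process. The entropy coding stage is omitted because it is lossless and does not change the reconstructed values.

```python
import numpy as np
import pywt

def lossy_roundtrip(img, wavelet="bior4.4", level=3, q_step=16.0):
    """Toy version of Fig. 1: transform -> quantize -> dequantize -> inverse.
    A real coder would entropy code the integer indices produced here."""
    coeffs = pywt.wavedec2(img.astype(float), wavelet, level=level)
    arr, slices = pywt.coeffs_to_array(coeffs)          # all subbands in one array
    indices = np.round(arr / q_step)                    # uniform scalar quantizer
    rec = pywt.array_to_coeffs(indices * q_step, slices, output_format="wavedec2")
    return pywt.waverec2(rec, wavelet)

# A larger q_step means fewer distinct indices (higher compression) and more distortion.
```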

In digital images there are interesting parts, represented by shapes, contours, edges, etc., that carry important information about the image [3]. This is the main reason for creating image compression frameworks in which, even at low bit rates, the important information is still preserved so that objects can be recognized successfully. The design of a lossy feature preserving image compression (FPIC) scheme becomes imperative in areas such as the medical and textile industries, in which there are laws and restrictions on the use of the original images [2].

In this paper we present a summary of the most important works that address the feature preserving image compression process; then we propose a methodology to design a feature preserving coder; finally, a brief comment on the future and perspectives of this research field is given.

2. Image Compression

An image can be analyzed as a composition of three main parts: edges, textures and details associated with edges. In order to design a lossy compression coder with feature preservation it is important, first, to identify the image features to preserve and, second, to sacrifice fidelity or quality in the other image regions [3].


In image compression frameworks the quality of the reconstructed images is an important issue to measure, by both objective and subjective means. Image quality has two aspects: fidelity and intelligibility. The former describes how the reconstructed image differs from the original one, and the latter describes the ability of the image to offer information to people [4].

Image compression coders have traditionally been evaluated on the basis of minimizing an objective distortion measure at a given level of compression. However, a lower objective measure does not always mean better quality in the compressed image. The problem is that such measures fail to capture the quality of the important features needed for good recognition and perception of the reconstructed images.
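For reference, the most common objective measures are the mean squared error (MSE) and the peak signal-to-noise ratio (PSNR) derived from it; a minimal NumPy sketch for 8-bit images is given below. As noted above, a good PSNR does not guarantee that edges or other perceptually important features survive.

```python
import numpy as np

def mse(original, reconstructed):
    diff = original.astype(float) - reconstructed.astype(float)
    return float(np.mean(diff ** 2))

def psnr(original, reconstructed, peak=255.0):
    """PSNR in dB; peak=255 assumes 8-bit images."""
    m = mse(original, reconstructed)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)
```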

The main errors or artifacts produced by compression schemes at low bit rates are blurring and ringing. Blurring results from discarding most of the high frequency details of the image and cannot be avoided at very low bit rates; however, one can try to allocate bits in such a way that the resulting blurring is not very annoying to human observers. Ringing is related to the Gibbs phenomenon and occurs in the vicinity of sharp edges. The amount of ringing depends on the transform in use and also on the bit allocation; at extremely low bit rates, however, ringing usually cannot be avoided [5].
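Both artifacts can be reproduced on a one-dimensional step edge: reconstructing the step from only the coarse approximation coefficients of a smooth wavelet smears it (blurring) and typically introduces small over- and undershoots next to it (ringing). The sketch below, using PyWavelets with an arbitrarily chosen 'db4' wavelet, is only meant to make the effect visible.

```python
import numpy as np
import pywt

signal = np.zeros(256)
signal[128:] = 1.0                                   # ideal step edge

coeffs = pywt.wavedec(signal, "db4", level=5)
coeffs = [coeffs[0]] + [np.zeros_like(c) for c in coeffs[1:]]  # drop all detail bands
approx = pywt.waverec(coeffs, "db4")

# Inspect approx around sample 128: the transition is smeared (blurring) and
# the values typically move slightly outside [0, 1] near the edge (ringing).
print(approx[120:136].round(3))
```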

A goal of image compression coders is therefore to reduce these artifacts in order to preserve image features and obtain good quality reconstructed images at low bit rates.

3. The First Steps of Feature Preserving Image Compression

In the literature there are few works on image compression with feature preservation, especially edge preservation. The first steps, dating from the 1980s, are compression schemes using so-called Region of Interest (ROI) coders. In this type of coder more of the bit budget is spent on the ROIs and less on the other parts of the image, which implies hybrid coders in which a different scheme is used for each image region [6].
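The bit-budget argument behind ROI coding can be stated in a few lines: quantize finely inside a region-of-interest mask and coarsely outside it. The following spatial-domain sketch is only an illustration of this idea, not any particular published coder.

```python
import numpy as np

def roi_quantize(img, roi_mask, q_roi=4.0, q_background=32.0):
    """Spend more of the bit budget inside the ROI by using a finer
    quantization step there and a coarser one elsewhere."""
    step = np.where(roi_mask, q_roi, q_background)
    return np.round(img.astype(float) / step) * step

# usage (hypothetical): mask = np.zeros(gray.shape, bool); mask[100:200, 120:220] = True
# approx = roi_quantize(gray, mask)   # fine detail survives only inside the mask
```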

The work proposed in [7] is an Edge Preserving Image Compression (EPIC) algorithm that combines lossy edge compression techniques with Dynamic Associative Neural Networks (DANN). To obtain high compression, the distortion rates can be specified by the user within the adaptive compression system, which is well suited to parallel implementation. The DANN training base is improved using a variance classifier in order to control the convergence speed of a bank of neural networks and thus obtain high compression rates for simple patterns. The system allows progressive image transmission by varying the number of quantization levels used to represent the compressed patterns. The images used are magnetic resonance images, with the goal of preserving the edges of tumors and lesions. Figure 2 shows the EPIC scheme.

Fig. 2. Edge Preserving Neural Network based Image compressor proposed by [7].

Over the years, progressive transmission coding schemes emerged to cover the need to prioritize (give importance to) the coefficients to be coded and to produce a progressive bit stream. These types of coders allow images to be obtained at different qualities and are the basis of most current image coders. The most important progressive coders are the Embedded Zerotree Wavelet (EZW) [8] and Set Partitioning In Hierarchical Trees (SPIHT) [9].

The Discrete Wavelet Transform (DWT) is the transform most often used as the basis for feature preserving image compression. With the DWT an image is decorrelated and its energy is concentrated in a small set of coefficients [6]. A problem with the DWT is that it cannot capture image singularities such as edges or contours, but this can be addressed by exploiting the edge information in the wavelet domain during the quantization stage of the coder.
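The energy compaction property is easy to verify numerically: after a DWT, a small fraction of the coefficients usually carries almost all of the image energy. A sketch using PyWavelets (the 1% fraction and the wavelet are arbitrary choices):

```python
import numpy as np
import pywt

def energy_in_largest(img, wavelet="db2", level=4, fraction=0.01):
    """Share of total energy held by the `fraction` largest DWT coefficients."""
    coeffs = pywt.wavedec2(img.astype(float), wavelet, level=level)
    arr, _ = pywt.coeffs_to_array(coeffs)
    mags = np.sort(np.abs(arr).ravel())[::-1]
    k = max(1, int(fraction * mags.size))
    return float(np.sum(mags[:k] ** 2) / np.sum(mags ** 2))

# For natural images this ratio is typically close to 1, which is what makes
# aggressive quantization of the remaining coefficients inexpensive.
```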

A Region Based Discrete Wavelet Transform (RBDWT) coder is presented in [10]; this work is frequently cited in the feature preserving compression literature. The main idea is to apply a segmentation process that divides the image into regions described by their contours and textures, which are coded separately. The main difference between the DWT and the RBDWT is that in the latter the subbands are divided into regions. The transmission of the image information takes place in two stages: first the contours, also called the segmentation information, and then the texture, which corresponds to the segment contents. This is a natural approach, because contours and textures coincide with psychological concepts of vision. The information obtained from segmentation can be coded with chain codes and the texture information with the coefficients of a 2-D polynomial fitted to the segmented data. Figure 3 shows an example of the RBDWT on the subbands of the cameraman image.

Fig. 3. RBDWT subbands as a combination of the DWT and region-based coding [10].
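As a concrete illustration of the contour stream in [10], a region boundary can be represented with a Freeman chain code, i.e. the sequence of directions between successive boundary points; the texture stream would then be, for example, a low-order 2-D polynomial fitted to the region pixels. The sketch below covers only the chain code part and assumes the boundary points are already ordered.

```python
# 8-connected Freeman directions: 0 = east, numbered counter-clockwise.
DIRS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
        (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code(boundary):
    """Encode an ordered list of (x, y) boundary points as Freeman directions."""
    return [DIRS[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(boundary, boundary[1:])]

# usage: chain_code([(0, 0), (1, 0), (1, 1), (0, 1)]) -> [0, 2, 4]
```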

The work of [11] proposes a still image compression algorithm for very low bit rate applications that represents an image in terms of its binary edge map, mean information and the intensity information on both sides of the edges. The algorithm works well for images with a limited amount of texture, such as facial images. These techniques fail to provide acceptable quality at very low bit rates because they do not take into account the properties of human visual perception; figure 4 shows the scheme of this encoder.

Fig. 4. Block diagram of the encoder proposed by [11].

After applying different tests, researchers discovered that in order to build efficient ROI coders the first and most important requirement is a robust digital image processing module that can describe an image in terms of features such as edges and textures. In ROI methods an image is segmented to obtain a good representation of the image contours; the points are then removed from the regions and coded separately, and after contour coding, texture coding is performed [12].

A different approach is to enhance the features of an image in some way [13]. In this type of coder the perceptual quality of the image is the main goal of the compression scheme; the coefficients obtained from the transform are therefore multiplied, before encoding, by weights obtained from the frequency sensitivity function of the human visual system. The coder proposed in [13] is an extension of the Perceptual Subband Image Coder (PIC) that allows local adaptation of the quantizer step size according to the amount of masking available for each transform coefficient. The coder is efficient at discriminating between image components that are and are not detected by the human observer.
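The weighting step can be sketched as follows: each detail subband is scaled by a visual-sensitivity weight before quantization, so a uniform quantizer effectively uses a finer step where the eye is more sensitive. The weights used here are placeholders, not the actual PIC values.

```python
import pywt

def weight_subbands(img, wavelet="bior4.4", level=3, weights=None):
    """Scale each detail level by a (placeholder) perceptual weight before
    quantization; coarser, lower-frequency levels keep larger weights."""
    coeffs = pywt.wavedec2(img.astype(float), wavelet, level=level)
    out = [coeffs[0]]
    for i, band in enumerate(coeffs[1:]):            # coarsest detail level first
        w = 1.0 / (2 ** i) if weights is None else weights[i]
        out.append(tuple(w * sub for sub in band))
    return out
```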

The work of [14] considers highly lossy compression of noisy ATR (Automatic Target Recognition) and ATM (Airborne Thematic Mapper) images. The goal is that after the compression process the features (edges) of the resulting images are preserved, so that the shape and location of objects can still be recognized. When lossy wavelet compression is applied to very noisy images, some image features are lost, because edges represent high frequencies and can be removed by the nature of the process. To solve this, a feature preserving noise elimination method based on Total Variation (TV) is proposed, followed by wavelet compression with selected thresholds in order to obtain a high compression rate while preserving edges.
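A generic total-variation denoiser can be approximated with a few iterations of gradient descent on the classical ROF energy; the sketch below uses a smoothed TV term and arbitrary parameters, and is a stand-in for, not a reproduction of, the nonlinear PDE scheme of [14].

```python
import numpy as np

def tv_denoise(img, lam=0.1, step=0.05, iters=200, eps=1e-2):
    """Gradient descent on 0.5*||u - f||^2 + lam * sum sqrt(|grad u|^2 + eps),
    with periodic boundary handling via np.roll."""
    f = img.astype(float)
    u = f.copy()
    for _ in range(iters):
        ux = np.roll(u, -1, axis=1) - u                      # forward differences
        uy = np.roll(u, -1, axis=0) - u
        mag = np.sqrt(ux ** 2 + uy ** 2 + eps)
        px, py = ux / mag, uy / mag
        div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
        u -= step * ((u - f) - lam * div)                    # energy gradient step
    return u

# Edges (large, consistent gradients) are kept while small noisy fluctuations are
# flattened, which is why TV denoising is applied before lossy wavelet compression.
```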

Finally, in [5] an edge based wavelet transform for image compression is proposed. First, the dominant edges of an image are detected and coded as side information. The idea behind this coding technique is to overcome the ringing problem by separating the image at positions where sharp edges are located. The Discrete Wavelet Transform (DWT) is then carried out in such a way that no filtering is performed across the previously detected edges, after which the image can be decoded. The major drawback is that the coder needs side information.

4. Recent Research Works in Feature Preserving Image Compression

With the arrival of the year 2000, compression technologies became even more important due to the widespread use of the wireless Internet and mobile devices, in which the early perception and recognition of an image using a few bits is imperative, and new methods for FPIC emerged.

The work presented in [15] is an edge preserving compression scheme based on a wavelet transform and iterative constrained least squares regularization. The reconstruction of the image is treated as an image restoration process. The edge information is used as a priori knowledge for the subsequent reconstruction, and the spatial characteristics of wavelet coded images are used to enhance the restoration performance. A vector quantization based edge bit-plane coding scheme is proposed to balance the amount of information carried by the image data and by the corresponding edges. Figure 5 shows the scheme of this coder.

Fig. 5. The schematic diagram of the coder proposed by [15].

The work of [3] presents a feature preserving system in which the features of the image (edges, textures and edge-associated details) are extracted, separated and coded with different methods. Figure 6 shows the block diagram of this model. The important edges are coded using line graphics techniques, the textures using a wavelet based zerotree approach, and the edge-associated details are bit-plane coded.

Fig. 6. Block diagram of the feature preserving image coder proposed by [3].

The appearance of JPEG 2000 [16] represents the boom of image compression standards, because the coding system is optimized not only for efficiency but also for scalability and interoperability in network and mobile environments. JPEG 2000 addresses areas where current standards fail to produce the best quality or performance and provides capabilities to markets that currently do not use compression. One of the most important features provided by JPEG 2000 is ROI coding, which allows users to define important parts of the image that are coded and transmitted with better quality and less distortion than the rest of the image, in addition to the standard's good performance at low bit rates. Figure 7 shows the diagram of the JPEG 2000 coder.

Fig. 7. Block diagram of JPEG 2000: a) encoder, b) decoder.
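The general mechanism behind wavelet-domain ROI coding is to scale up the coefficients that contribute to the ROI so that an embedded, significance-ordered coder transmits them first. The sketch below illustrates this idea only; in particular, the per-level mask is obtained by naive downsampling, whereas a real coder such as JPEG 2000 must account for the filter support when mapping the ROI into the wavelet domain.

```python
import numpy as np
import pywt

def roi_upshift(img, roi_mask, wavelet="bior4.4", level=3, shift_bits=4):
    """Boost coefficients whose spatial location falls inside the ROI so that
    an embedded coder emits them in earlier bit planes."""
    coeffs = pywt.wavedec2(img.astype(float), wavelet, level=level)
    boost_value = float(1 << shift_bits)
    out = [coeffs[0]]
    for i, band in enumerate(coeffs[1:]):             # coarsest detail level first
        factor = 2 ** (level - i)                     # decimation at this level
        m = roi_mask[::factor, ::factor]
        scaled = []
        for sub in band:
            mm = np.zeros(sub.shape, bool)
            mm[:m.shape[0], :m.shape[1]] = m[:sub.shape[0], :sub.shape[1]]
            scaled.append(sub * np.where(mm, boost_value, 1.0))
        out.append(tuple(scaled))
    return out
```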

The DWT has the problem that it adapts well to point singularities but not to the two-dimensional singularities found in images, such as lines or curves [3]. A solution to this problem is to use multidirectional decompositions in order to detect image singularities in a better way. The Discrete Contourlet Transform (DCT) provides a multiscale and directional decomposition of an image using a combination of a modified Laplacian pyramid and a Directional Filter Bank (DFB) [17]. The Discrete Contourlet Transform is also called the Pyramidal Directional Filter Bank (PDFB). The PDFB allows a different number of directions at each scale/resolution while nearly achieving critical sampling. The DFB is designed to capture the high frequency components (representing directionality), while the Laplacian pyramid provides the subband decomposition that avoids leaking low frequencies into the directional subbands, so that directional information can be captured efficiently.
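The multiscale half of the PDFB can be sketched as one Laplacian pyramid level: a low-pass, downsampled approximation plus a bandpass residual, where the residual is what a directional filter bank would then split by orientation. The sketch uses a Gaussian low-pass filter as a simple stand-in for the actual pyramid filters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def laplacian_level(img, sigma=1.0):
    """One Laplacian pyramid level: (coarse approximation, bandpass residual).
    The residual keeps the high frequencies that the directional filter bank
    of a contourlet decomposition would split by orientation."""
    img = img.astype(float)
    coarse = gaussian_filter(img, sigma)[::2, ::2]            # low-pass + downsample
    predicted = zoom(coarse, 2, order=1)[:img.shape[0], :img.shape[1]]
    return coarse, img - predicted
```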

The work in [18] presents a framework based on neural networks (NN) to find an optimal wavelet kernel that can be used in specific image processing tasks. It uses a linear convolution NN to obtain a wavelet that minimizes errors and maximizes compression efficiency for an image or for a defined image pattern, such as microcalcifications in mammograms or bone in computed tomography (CT) head images. The spectrum of the wavelet filter is analyzed in order to perform feature preserving compression and to act as a wavelet function. In summary, the compression scheme consists of: 1) the development of a unified method to facilitate multichannel decomposition in wavelet filtering, 2) the design of a cost function consisting of the MSE and an imposed entropy reduction term to seek an optimal wavelet kernel in the convolution neural network, and 3) the conversion of the kernel suggested by the neural network into a filter constrained by the wavelet requirements.

In [2] the authors propose a methodology to preserve edges using a vector of features of interest, with SPIHT used for the quantization stage. When an image is analyzed in order to find its details, they appear in the wavelet domain as coefficients with large magnitude, and SPIHT or EZW can be used to select the coefficients by significance in order to code them and generate an embedded bit stream. The scheme combines the tree-based representation of the wavelet coefficients with the implicit transmission of the image features that need to be accentuated or preserved. The main advantage of this method is that it exploits the zerotree data structure to eliminate the need to send additional information to the decoder about the selected features. The proposed scheme is shown in figure 8.

Fig. 8. Compression framework proposed by [2].
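The significance ordering that [2] relies on can be pictured with a bare-bones successive-approximation loop: coefficients are refined from the most significant magnitude bit downward, so whatever has been amplified (the selected features) becomes accurate in the earliest passes. This is only the ordering idea, not the zerotree bookkeeping of EZW or SPIHT.

```python
import numpy as np

def successive_approximations(coeffs, n_planes=6):
    """Yield progressively refined versions of `coeffs`, as an embedded bit
    stream would: large-magnitude coefficients are resolved first.
    (Decoders usually add half the current step as a midpoint estimate;
    that refinement is omitted here for brevity.)"""
    c = np.asarray(coeffs, dtype=float)
    t = 2.0 ** np.ceil(np.log2(np.max(np.abs(c)) + 1e-12))   # starting threshold
    for _ in range(n_planes):
        yield np.sign(c) * t * np.floor(np.abs(c) / t)
        t /= 2.0

# usage: for approx in successive_approximations(wavelet_coeff_array): ...
```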

The work presented in [19] describes two progressive image coders designed to improve the visual clarity of image edges in a progressive code stream. The algorithms capture the locations of edges with an edge detection step, then encode the edges and transmit them to the decoder as part of the image header. The first algorithm is an edge enhancing image coder (EEIC): the decoder uses the edge information to enhance the edges in the blurred low bit rate image decoded from the SPIHT stream, as shown in figure 9.

Fig. 9. Block diagram of edge enhancing image coder proposed by [19].
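The decoder-side enhancement can be sketched as unsharp masking restricted to the neighborhood of the transmitted edge locations; this is a simplified stand-in for the enhancement step of an EEIC-style decoder, with arbitrary parameters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, binary_dilation

def enhance_at_edges(decoded, edge_map, amount=1.0, sigma=1.5):
    """Sharpen the blurred decoded image only where the transmitted edge map
    indicates an edge (unsharp masking on a dilated edge mask)."""
    img = decoded.astype(float)
    detail = img - gaussian_filter(img, sigma)            # high-frequency residue
    mask = binary_dilation(edge_map.astype(bool), iterations=2)
    return img + amount * detail * mask
```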

The second coder improves on the first by using the information contained in the edge packet when performing the wavelet transform: the edges are removed from the image in the forward transform and reinserted by the decoder in the inverse transform; the scheme is similar to the one shown in figure 2.

Finally, [20] presents a set of filter design axioms in the spatial domain which ensure that certain image features are preserved from scale to scale. For the authors, feature preservation means that the location, strength and shape of features are unchanged after the application of a general filter; natural differences occur due to the change in resolution. They define a spatial domain framework for the design and analysis of wavelet transforms, including a set of axioms that can be used to design multiscale filter banks that preserve certain characteristics of the data. The framework is used for the design of Optimal Total Variation Diminishing (OTVD) wavelets targeted at computational fluid dynamics data sets.

5. Conclusions and Further Work

In this paper we have given an overview of the work on feature preserving image compression. The design of these compression coders is very important for several industries and devices. It is clear that objective measures are not capable of measuring the visual distinctness of features in digital imagery.

Based on the research reviewed, we propose a methodology for robust feature preserving image compression (FPIC), shown in figure 10.

Fig. 10. The proposed model for FPIC.

The first stage selects an image to compress and determines the information to extract and to preserve in the subsequent coding (texture, edges, shapes, corners, etc.). In the second stage an image processing module for feature extraction is implemented; the result is a data structure with the information obtained. In the third stage the data structure is mapped to the transform domain; the transformation is made with a DWT or a contourlet transform (DCT).

Image compression is performed in the fourth stage. Here we can use the EZW or SPIHT coder, spending more bits on the important information (determined from the data structure) and fewer bits on the other image parts; the resulting information is coded using arithmetic coding. Image decompression is performed in the fifth stage. Finally, a pattern recognition stage is implemented in order to demonstrate that the reconstructed images are better in the important parts than images decompressed without feature preservation.
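A minimal sketch of how stages one to five could fit together is given below, with placeholder components (a Sobel edge map as the feature extractor and fixed quantization steps); the actual feature modules, the embedded coder and the arithmetic coding stage are left open by the methodology.

```python
import numpy as np
import pywt
from scipy.ndimage import sobel

def extract_features(img, threshold=40.0):
    """Stage 2 placeholder: an edge map from the Sobel gradient magnitude; the
    real module may also describe textures, shapes or corners."""
    g = np.hypot(sobel(img.astype(float), 0), sobel(img.astype(float), 1))
    return g > threshold

def fpic_encode(img, wavelet="bior4.4", level=3, q_feat=4.0, q_other=32.0):
    """Stages 1-4 placeholder: transform, then quantize finely where the feature
    map is active and coarsely elsewhere (an EZW/SPIHT pass plus arithmetic
    coding of the indices would follow in practice)."""
    features = extract_features(img)
    coeffs = pywt.wavedec2(img.astype(float), wavelet, level=level)
    out = [coeffs[0]]
    for i, band in enumerate(coeffs[1:]):             # coarsest detail level first
        m = features[::2 ** (level - i), ::2 ** (level - i)]
        quantized = []
        for sub in band:
            mm = np.zeros(sub.shape, bool)
            mm[:m.shape[0], :m.shape[1]] = m[:sub.shape[0], :sub.shape[1]]
            step = np.where(mm, q_feat, q_other)
            quantized.append(np.round(sub / step) * step)
        out.append(tuple(quantized))
    return out

def fpic_decode(coeffs, wavelet="bior4.4"):
    """Stage 5: inverse transform of the dequantized coefficients."""
    return pywt.waverec2(coeffs, wavelet)
```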

There is a lot of work to do in this research field. One of the most important tasks is to preserve other image features; another is to create an error measure that can report how well the image features are preserved, accompanied and supported by a computer vision or pattern recognition strategy. The suggested path towards high-compression image coding is to represent an image in terms of textured regions surrounded by contours, in such a way that the regions correspond, as faithfully as possible, to the objects of the scene. With these approaches, users benefit from being able to understand the images early, even at very low bit rates.

6. References

[1] Gonzalez R. C. and Woods R. E., Digital Image Processing, Addison Wesley / Díaz de Santos, U.S.A., 2000.

[2] Namuduri K. R. and Ramaswamy V. N., “Feature Preserving Image Compression”, Pattern Recognition Letters, vol. 24, no. 15, pp. 2767 – 2776, November 2003.

[3] Schilling D. and Cosman P., “Feature-Preserving Image Coding for Very Low Bit Rates”, Proceedings of the IEEE Data Compression Conference (DCC), Snowbird, Utah, U.S.A., vol. 1, pp. 103 – 112, March 2001.

[4] Saffor A., Bin Ramli R. A., Hoong Ng K. and Dowsett D., “Objective and Subjective Evaluation of Compressed Computed Tomography (CT) Images”, The Internet Journal of Radiology, vol. 2, no. 2, June 2002.

[5] Mertins A., “Image compression via edge-based wavelet transform”, Optical Engineering, vol. 38, no. 6, pp. 991 – 1000, June 1999.

[6] Kunt M., Ikonomopoulos A. and Kocher M., “Second-Generation Image Coding Techniques,” Proceedings of the IEEE, vol. 73, no. 4, pp. 549 – 574, April 1985.

[7] Chee T. W., “Edge Preserving Image Compression for Magnetic Resonance Images Using DANN - Based Neural Networks”, M. S. dissertation, University of Miami, August 1993.

[8] Shapiro J. M., “Embedded Image Coding Using Zerotrees of Wavelet Coefficients”, IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3445 – 3463, December 1993.

[9] Said A. and Pearlman W., “A New Fast and Efficient Image Codec Based on Set Partitioning in Hierarchical Trees”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 243 - 250, June 1996.

[10] Barnard H. J., “Image and Video Coding Using a Wavelet Decomposition”, Ph. D. dissertation, Delft University of Technology, Department of Electrical Engineering, Information Theory Group, The Netherlands, May 1994.

[11] Desai U. Y., Mizuki M. M., Masaki I. and Horn B. K. P., “Edge and Mean Based Image Compression”, Technical Report 1584, Massachusetts Institute of Technology Artificial Intelligence Laboratory, U.S.A., November 1996.

[12] Yu T., Lin N., Liu J.C. and Chan A., “A Region-of-Interest Based Transmission Protocol for Wavelet-Compressed Medical Images,” SPIE Wavelet Applications Conference, vol. 3078, pp. 56 – 64, Orlando, Florida, U.S.A., April 1997.

[13] Hontsch I. S. and Karam L. J., “APIC: Adaptive Perceptual Image Coding Based on Subband Decomposition with Locally Adaptive Perceptual Weighting”, Proceedings of the IEEE International Conference on Image Processing (ICIP), vol. 1, pp. 37 – 40, Washington, DC, U.S.A., October 1997.

[14] Chan T. F. and Zhou H. M., “Feature Preserving Lossy Compression Using Nonlinear PDEs”, Proceedings of the IEEE Data Compression Conference (DCC), pp. 529, Snowbird, Utah, U.S.A., March 1998.

[15] Hong S. - W. and Bao P., “An Edge-Preserving Subband Coding Model Based on Non-Adaptive and Adaptive Regularization”, Image and Vision Computing, vol. 18, no. 8, pp. 573 – 582, May 2000.

[16] Skodras A., Christopoulos C. and Ebrahimi T., “The JPEG 2000 Still Image Compression Standard,” IEEE Signal Processing Magazine, vol. 18, no. 5, pp. 36 – 58, September 2001.

[17] Do M. N. and Vetterli M., “Contourlets: A Directional Multiresolution Image Representation”, Proceedings of the IEEE International Conference on Image Processing (ICIP), vol. 1, Rochester, New York, U.S.A., pp. 357 – 360, September 2002.

[18] Lo S.-C. B., Li H. and Freedman M. T., “Optimization of Wavelet Decomposition for Image Compression and Feature Preservation”, IEEE Transactions on Medical Imaging, vol. 22, no. 9, pp. 1141 – 1151, September 2003.

[19] Schilling D. and Cosman P., “Preserving Step Edges in Low Bit Rate Progressive Image Compression”, IEEE Transactions on Image Processing, vol. 12, no. 12, pp. 1473 - 1484, December 2003.

[20] Craciun G., Jiang M., Thompson D. and Machiraju R., “Spatial Domain Wavelet Design for Feature Preservation in Computational Data Sets”, IEEE Transactions on Visualization and Computer Graphics, vol. 11, no. 2, pp. 149 – 159, April 2005.
