A novel binary-image compression scheme

A Fast Lossless Compression Scheme for Digital Map Images Using Color Separation, by Saif Zahir and Arber Borici, Department of Computer Science, University of Northern British Columbia. ICASSP 2010.

description

This is an ICASSP 2010 presentation of a novel binary and discrete-color image compression scheme.

Transcript of A novel binary-image compression scheme

Page 1: A novel binary-image compression scheme

A Fast Lossless Compression Scheme for Digital Map Images Using Color Separation

by Saif Zahir and Arber Borici

Department of Computer Science, University of Northern British Columbia

ICASSP 2010

Page 2: A novel binary-image compression scheme

Huffman and Arithmetic Coding

The Huffman method assigns shorter codes to symbols with higher probabilities
- Optimal symbol-by-symbol prefix code
- Low efficiency with skewed probabilities
- Higher efficiency on groups of symbols

The Arithmetic method performs better by encoding a sequence of symbols
- Higher complexity
- Less efficient with larger alphabets
- More strongly affected by inaccurate probability estimates than Huffman coding
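The two Huffman points about skewed probabilities and symbol groups can be illustrated with a small sketch. The source probabilities (0.9 and 0.1) are made up for illustration, and the helper names are not from the presentation. The sketch computes Huffman code lengths with a heap and compares the average bits per symbol when coding single symbols versus pairs.

```python
import heapq
from itertools import count, product
from math import log2


def huffman_code_lengths(probs):
    """Return {symbol: code length in bits} for a Huffman code over probs."""
    tie = count()  # tie-breaker so the heap never compares symbol lists
    heap = [(p, next(tie), [s]) for s, p in probs.items()]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probs}
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for s in group1 + group2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, next(tie), group1 + group2))
    return lengths


def average_length(probs):
    """Expected Huffman code length in bits per coded unit."""
    lengths = huffman_code_lengths(probs)
    return sum(p * lengths[s] for s, p in probs.items())


def entropy(probs):
    """Shannon entropy in bits per coded unit."""
    return -sum(p * log2(p) for p in probs.values() if p > 0)


# Hypothetical skewed binary source (values chosen for illustration only).
single = {"a": 0.9, "b": 0.1}
# Same source coded two symbols at a time (independent symbols assumed).
pairs = {x + y: single[x] * single[y] for x, y in product(single, repeat=2)}

bits_single = average_length(single)    # bits per symbol, one symbol per code
bits_pairs = average_length(pairs) / 2  # bits per symbol, two symbols per code
source_entropy = entropy(single)        # lower bound in bits per symbol
```

With these values, single-symbol coding cannot go below 1 bit per symbol even though the entropy is under 0.5 bits, while coding pairs brings the average closer to the entropy. This is the same reason the proposed method applies Huffman coding to whole 8x8 blocks rather than to individual pixels.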

Page 3: A novel binary-image compression scheme

Motivation and Objectives
- Propose a new lossless compression method for discrete-color images
- Construct a universal Huffman-based codebook by studying the entropy of a system of randomly chosen binary images
- Introduce an additional helper module, the RCRC algorithm
- Low-complexity and high-efficiency method
- Design a coding method that works regardless of the nature of a binary image

Page 4: A novel binary-image compression scheme

The Proposed Method

Three components:

1. Preprocessing
   - Perform color separation into binary layers
   - Make layer dimensions divisible by 8
2. A universal Huffman codebook
   - Designed by studying a relatively large sample of randomly chosen binary images
   - Huffman coding applied on 8x8 blocks of a binary image
3. The Row-Column Reduction Coding
   - Attempts to compress 8x8 blocks which are not found in the codebook
   - Checks for identity between rows and columns
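A minimal sketch of the preprocessing step follows. It rests on assumptions the slides do not state: the image is a 2-D grid of integer color labels, every label (including the background) gets its own layer, and padding is done with zeros on the right and bottom edges. The function names are illustrative only.

```python
def separate_colors(image):
    """Split a 2-D grid of color labels into one binary layer per color."""
    colors = sorted({pixel for row in image for pixel in row})
    return {
        color: [[1 if pixel == color else 0 for pixel in row] for row in image]
        for color in colors
    }


def pad_to_multiple_of_8(layer, fill=0):
    """Pad a binary layer (right and bottom) so both dimensions divide by 8.

    Zero padding on these two edges is an assumption, not the authors' rule.
    """
    height, width = len(layer), len(layer[0])
    new_width = -(-width // 8) * 8    # round up to the next multiple of 8
    new_height = -(-height // 8) * 8
    padded = [row + [fill] * (new_width - width) for row in layer]
    padded += [[fill] * new_width for _ in range(new_height - height)]
    return padded


# Hypothetical 3x10 map with three color labels (0 = background).
image = [
    [0, 0, 1, 1, 0, 0, 2, 2, 0, 0],
    [0, 1, 1, 0, 0, 2, 2, 0, 0, 0],
    [1, 1, 0, 0, 2, 2, 0, 0, 0, 0],
]
layers = {
    color: pad_to_multiple_of_8(layer)
    for color, layer in separate_colors(image).items()
}
```

Each resulting layer here is 8 rows by 16 columns, so it splits cleanly into 8x8 blocks. A decoder would also need the original dimensions to discard the padding, which this sketch does not store.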

Page 5: A novel binary-image compression scheme

General Diagram of the Method

Page 6: A novel binary-image compression scheme

The Codebook Construction
- Build a system of binary images
  - Images contain no noise
  - The system is unbiased, i.e. images are randomly selected
- Perform a frequency analysis on 8x8 blocks of the system images
  - Identify blocks that occur more than once
  - Determine the system entropy
  - Build correct Huffman codes for the most frequent blocks
- The resulting codebook is a fixed-to-variable dictionary containing 6952 entries
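The frequency analysis can be sketched as below. This is an illustration under assumptions, not the authors' procedure: images are lists of 0/1 rows with dimensions divisible by 8, probabilities are normalized over the repeated blocks only, and the toy image is invented for the example.

```python
from collections import Counter
from math import log2


def blocks_8x8(image):
    """Yield each 8x8 block of a binary image as a tuple of 8 row tuples."""
    for top in range(0, len(image), 8):
        for left in range(0, len(image[0]), 8):
            yield tuple(tuple(row[left:left + 8]) for row in image[top:top + 8])


def block_statistics(images):
    """Count 8x8 blocks over all images and keep those seen more than once.

    Returns (probabilities, entropy_in_bits_per_block). Normalizing over the
    repeated blocks only is an assumption made for this sketch.
    """
    counts = Counter(block for image in images for block in blocks_8x8(image))
    repeated = {block: n for block, n in counts.items() if n > 1}
    total = sum(repeated.values())
    probabilities = {block: n / total for block, n in repeated.items()}
    entropy = -sum(p * log2(p) for p in probabilities.values())
    return probabilities, entropy


# Toy 8x40 image: two white blocks, two black blocks, one striped block.
row = [0] * 16 + [1] * 16 + [1, 0, 1, 0, 1, 0, 1, 0]
toy_image = [list(row) for _ in range(8)]
probabilities, entropy = block_statistics([toy_image])
```

In the toy image the striped block occurs only once and is dropped, leaving two equally likely blocks and an entropy of 1 bit per block. Huffman codes would then be built from `probabilities`.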

Page 7: A novel binary-image compression scheme

How is the codebook employed?
- Search the codebook for each 8x8 block of a given source image
  - The X-by-Y image is partitioned into XY/64 8x8 blocks
- If the block exists in the codebook, compress using the corresponding Huffman code

An example of the codebook structure:

Page 8: A novel binary-image compression scheme

The first three codebook entries:

Page 9: A novel binary-image compression scheme

The Codebook Entropy
- The entropy of our codebook is 4.08 bits per 8x8 block
- Thus, the compression limit is (64 - 4.08)/64 = 93.63%
- The average Huffman code length is 4.094 bits
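The arithmetic behind these figures can be checked in a few lines. The constants are the values reported on the slide, and the variable names are illustrative. The figures describe blocks that are found in the codebook, before any per-block prefix bits are added.

```python
BLOCK_BITS = 8 * 8      # raw size of one 8x8 binary block
ENTROPY_BITS = 4.08     # codebook entropy reported on the slide
AVG_CODE_BITS = 4.094   # average Huffman code length reported on the slide

# Best possible saving if every block cost exactly the entropy.
compression_limit = (BLOCK_BITS - ENTROPY_BITS) / BLOCK_BITS

# Saving actually reached with the average Huffman code length.
huffman_compression = (BLOCK_BITS - AVG_CODE_BITS) / BLOCK_BITS

# Gap between the Huffman code and the entropy, in bits per block.
redundancy = AVG_CODE_BITS - ENTROPY_BITS
```

The limit evaluates to 59.92/64 = 0.93625, and the Huffman code falls short of the entropy by only about 0.014 bits per block.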

Page 10: A novel binary-image compression scheme

Row-Column Reduction Coding (RCRC)
- Operates on 8x8 blocks
- Uses two reference vectors
  - The Row Reference Vector (RRV) is a column vector
  - The Column Reference Vector (CRV) is a row vector
- Checks whether two consecutive row vectors are identical
  - If rows are identical, one is eliminated and the block is reduced by one row
  - If not, the next two consecutive row vectors are compared
  - The row reduction operation continues until the end of the block is reached

Page 11: A novel binary-image compression scheme

RCRC (cont.)
- The column reduction operation is similar, and elimination operations are stored in the CRV
- The output of RCRC is a bit stream containing the RRV, CRV, and the reduced block (RB)
  - Concatenated as S = RRV + CRV + RB
  - Minimum length of S = 17 bits (8-bit RRV + 8-bit CRV + 1-bit RB)
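A minimal sketch of an RCRC encoder, written from the description above rather than from the authors' code. It assumes three conventions the slides do not spell out: a 1 in a reference vector marks a kept row or column and a 0 marks an eliminated one, column reduction runs on the row-reduced block, and the reduced block is written in row-major order.

```python
def reduce_rows(rows):
    """Drop every row that is identical to the row just before it.

    Comparing against the last kept row is equivalent to comparing
    consecutive rows, because equality is transitive.
    Returns (reference_vector, kept_rows); 1 = kept, 0 = eliminated
    (assumed convention).
    """
    reference, kept = [], []
    for row in rows:
        if kept and row == kept[-1]:
            reference.append(0)
        else:
            reference.append(1)
            kept.append(row)
    return reference, kept


def rcrc_encode(block):
    """Encode an 8x8 binary block as the bit string RRV + CRV + RB."""
    rrv, row_reduced = reduce_rows([tuple(row) for row in block])
    # Column reduction is row reduction on the transposed, row-reduced block.
    crv, kept_columns = reduce_rows(list(zip(*row_reduced)))
    reduced_block = list(zip(*kept_columns))  # transpose back to rows
    bits = rrv + crv + [bit for row in reduced_block for bit in row]
    return "".join(str(bit) for bit in bits)


uniform = [[0] * 8 for _ in range(8)]
checkerboard = [[(r + c) % 2 for c in range(8)] for r in range(8)]
shortest = rcrc_encode(uniform)       # 8 + 8 + 1 = 17 bits
longest = rcrc_encode(checkerboard)   # 8 + 8 + 64 = 80 bits
```

A uniform block reaches the 17-bit minimum, while a checkerboard has no identical neighbors and grows to 80 bits, which is why RCRC output is only worth keeping when it is shorter than the raw 64-bit block.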

Page 12: A novel binary-image compression scheme

RCRC Example:

Row 1 eliminates row 2

Row 3 eliminates all other rows

Column elimination is performed on the row-reduced block:

The compressed bit stream for this example is: 10100000100000101011
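The example block itself is a figure that is not part of this transcript, but the bit stream can be unpacked with a small decoder sketch. It assumes that a 1 in the RRV or CRV marks a kept row or column, that a 0 repeats the previous one, and that the reduced block is stored in row-major order. Under those assumptions the stream splits into RRV = 10100000, CRV = 10000010 and a 2x2 reduced block 1011, which agrees with "Row 1 eliminates row 2" and "Row 3 eliminates all other rows".

```python
def rcrc_decode(stream):
    """Rebuild an 8x8 block from an RCRC bit string RRV + CRV + RB."""
    bits = [int(ch) for ch in stream]
    rrv, crv, rb = bits[:8], bits[8:16], bits[16:]
    kept_columns = sum(crv)
    if len(rb) != sum(rrv) * kept_columns:
        raise ValueError("reduced block size does not match RRV and CRV")
    reduced_rows = [rb[i:i + kept_columns] for i in range(0, len(rb), kept_columns)]

    def expand(items, reference):
        # A 1 takes the next kept item; a 0 repeats the previous one.
        out, index = [], -1
        for flag in reference:
            index += flag
            out.append(items[index])
        return out

    rows = [expand(row, crv) for row in reduced_rows]  # restore columns
    return expand(rows, rrv)                           # restore rows


block = rcrc_decode("10100000100000101011")
for row in block:
    print("".join(map(str, row)))
```

Under these conventions the decoded block has two rows of 11111100 followed by six rows of 11111111. The 20-bit stream replaces 64 raw bits.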

Page 13: A novel binary-image compression scheme

The Coding Process

Summarized in the following table:

Case      Block encoding bits             Description
Case 1a   ‘11’                            For the block with the shortest Huffman code in the codebook
Case 1b   ‘00’ + 5 bits + Huffman code    For other blocks found in the codebook
Case 2    ‘01’ + RRV + CRV + RB           For blocks compressed by RCRC
Case 3    ‘10’ + 64 bits                  For uncompressed blocks
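The case selection can be sketched as a single dispatch function. Several details here are assumptions, because the table does not give them: the 5 bits in Case 1b are taken to hold the length of the Huffman code, RCRC output is accepted only when it is shorter than 64 bits, blocks are represented as 64-character bit strings, and the codebook and RCRC stand-ins are toy placeholders.

```python
def encode_block(block_bits, codebook, shortest_block, rcrc_encode):
    """Return the encoded bit string for one 8x8 block (64-character string).

    codebook: hypothetical dict mapping block strings to Huffman code strings.
    shortest_block: the codebook entry that has the shortest Huffman code.
    rcrc_encode: callable returning RRV + CRV + RB for a block string.
    """
    if block_bits == shortest_block:
        return "11"                                      # Case 1a
    if block_bits in codebook:
        code = codebook[block_bits]
        # Assumption: the 5 bits store the Huffman code length.
        return "00" + format(len(code), "05b") + code    # Case 1b
    reduced = rcrc_encode(block_bits)
    if len(reduced) < 64:                                # assumed acceptance rule
        return "01" + reduced                            # Case 2
    return "10" + block_bits                             # Case 3


# Toy two-entry codebook; the real one is reported to hold 6952 entries.
white, black = "0" * 64, "1" * 64
toy_codebook = {white: "0", black: "10"}


def no_reduction(block):
    """Stand-in for RCRC on a block it cannot shrink (80-bit output)."""
    return "1" * 16 + block


def good_reduction(block):
    """Stand-in for RCRC on a block it shrinks to 20 bits."""
    return "10100000100000101011"


checker = "01" * 32
case_1a = encode_block(white, toy_codebook, white, no_reduction)
case_1b = encode_block(black, toy_codebook, white, no_reduction)
case_2 = encode_block(checker, toy_codebook, white, good_reduction)
case_3 = encode_block(checker, toy_codebook, white, no_reduction)
```

The two-bit prefix lets a decoder tell the four cases apart before it reads the payload, and the most common block costs only those two bits.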

Page 14: A novel binary-image compression scheme

Time Complexity
- Analytical Time Complexity
  - The codebook contains fixed entries
  - RCRC is executed on fixed, 8x8 blocks
  - The variable input is the source image size
  - Time complexity is O(XY), where X and Y are the image dimensions
- Empirical Metric
  - Average run time: less than 1 s for various binary images
  - Worst case: block B is not in the dictionary and is not compressed by RCRC. However, such blocks are rare (about 5% on average)

Page 15: A novel binary-image compression scheme

Preliminary Results
- Tested on several topographic maps
- Average compression of 0.036 bpp
- Very large image dimensions

Image   Original Size (KB)   Compressed Size (KB)   Compression Ratio (bpp)
1       10,960               210.77                 0.019
2       220,979              7,489.27               0.034
3       173,769              6,626.27               0.038
4       2,562                206.69                 0.081
Total   409,270              14,533.12              0.036

Results reported in the literature for a similar class of images range from 0.18 to 0.22 bpp (Franti et al., 2002).
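The last column of the table is numerically consistent with compressed size divided by original size. The slides do not say how the original size is measured, so reading the ratio as bits per pixel is taken from the column header and not derived here. A short check using the table's own numbers:

```python
# (original KB, compressed KB, reported ratio) copied from the results table.
results = {
    "1": (10_960, 210.77, 0.019),
    "2": (220_979, 7_489.27, 0.034),
    "3": (173_769, 6_626.27, 0.038),
    "4": (2_562, 206.69, 0.081),
    "Total": (409_270, 14_533.12, 0.036),
}

# Recompute each ratio from the two size columns.
ratios = {
    name: compressed / original
    for name, (original, compressed, _) in results.items()
}

for name, ratio in ratios.items():
    print(f"{name:>5}: {ratio:.4f} (reported {results[name][2]})")
```

Each recomputed value agrees with the reported figure to within rounding at three decimal places.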

Page 16: A novel binary-image compression scheme

Selected Set of Test Images:

Source: UNBC GIS Lab: Maps of British Columbia

Page 17: A novel binary-image compression scheme

The Four Layers of Map 1 (4 different colors)
- Layer 1: Contour Lines
- Layer 2: Lakes
- Layer 3: Rivers
- Layer 4: Roads