Compression project presentation


Page 1: Compression project presentation

(LZ78-Based) LZW Data Compression Algorithm

ZEESHAN SAJID 14222
MOHSIN ALI 11949
MUHAMMAD FAIZAN 12510

Page 2: Compression project presentation

Table of contents

Introduction
LZ78 Basic Algorithm
LZW Compression
LZW Decompression
Application and Implementation
Conclusion

Page 3: Compression project presentation

Data compression

In computer science and information theory, data compression, source coding, or bit-rate reduction involves encoding information using fewer bits than the original representation. Compression can be either lossy or lossless.

Lossless Compression
Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression.

Lossy Compression
Lossy compression reduces bits by identifying marginally important information and removing it.

The process of reducing the size of a data file is popularly referred to as data compression, although its formal name is source coding (coding done at the source of the data, before it is stored or transmitted).

Page 4: Compression project presentation

LZ78 Algorithm

Pos     1  2  3  4  5  6  7  8  9
Char    A  B  B  C  B  C  A  B  A

Table 1: The encoding process

Step    Pos    Dictionary    Output
1       1      A             (0,A)
2       2      B             (0,B)
3       3      BC            (2,C)
4       5      BCA           (3,A)
5       8      BA            (2,A)
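The encoding process in Table 1 can be reproduced with a short Python sketch. This is only an illustration under assumptions: the function name lz78_encode, the 1-based dictionary indices, and the handling of a trailing phrase (emitted with an empty character) are choices made here, not something the slides specify.

```python
def lz78_encode(text):
    """Encode text as a list of (index, char) pairs in the style of LZ78."""
    dictionary = {}   # phrase -> 1-based dictionary index
    output = []
    phrase = ""       # longest dictionary match found so far
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch  # keep extending the current match
        else:
            # Emit (index of matched prefix, next character); 0 means "no prefix".
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[phrase + ch] = len(dictionary) + 1
            phrase = ""
    if phrase:
        # Assumption: input ended inside a known phrase; emit its index alone.
        output.append((dictionary[phrase], ""))
    return output


print(lz78_encode("ABBCBCABA"))
# [(0, 'A'), (0, 'B'), (2, 'C'), (3, 'A'), (2, 'A')]
```

Each pair matches one row of the Output column: the index points at an earlier dictionary entry and the character extends it.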

Page 5: Compression project presentation

Introduction to LZW

Static coding schemes require some knowledge about the data before encoding takes place.

Universal coding schemes, like LZW, do not require advance knowledge and can build such knowledge on the fly.

LZW is one of the most widely used techniques for general-purpose data compression due to its simplicity and versatility.

It is the basis of many PC utilities that claim to "double the capacity of your hard drive."

LZW compression uses a code table, with 4096 as a common choice for the number of table entries.

Page 6: Compression project presentation

LZW Algorithm

LZW is a "dictionary"-based compression algorithm. This means that instead of tabulating character counts and building trees (as for Huffman encoding), LZW encodes data by referencing a dictionary. Thus, to encode a substring, only a single code number, corresponding to that substring's index in the dictionary, needs to be written to the output file.

LZW (Lempel-Ziv-Welch) is one of the most widely used techniques for general-purpose data compression due to its simplicity and versatility. Typically, you can expect LZW to compress text, executable code, and similar data files to about one-half their original size. LZW also performs well when presented with extremely redundant data files, such as tabulated numbers, computer source code, and acquired signals. Compression ratios of 5:1 are common for these cases. LZW is the basis of several personal computer utilities that claim to "double the capacity of your hard drive."

Page 7: Compression project presentation

Introduction to LZW (cont'd)

Codes 0-255 in the code table are always assigned to represent single bytes from the input file.

When encoding begins, the code table contains only the first 256 entries, with the remainder of the table being blank.

Compression is achieved by using codes 256 through 4095 to represent sequences of bytes.

As the encoding continues, LZW identifies repeated sequences in the data, and adds them to the code table.

Decoding is achieved by taking each code from the compressed file, and translating it through the code table to find what character or characters it represents.
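As a rough illustration of the starting state described above, the sketch below builds the initial 256-entry table and works out how many codes remain for byte sequences. The names TABLE_SIZE and initial_code_table are hypothetical, and the fixed code width is an assumption: the slides do not say how codes are packed into the output.

```python
TABLE_SIZE = 4096  # the common table size mentioned in the slides


def initial_code_table():
    """Return the starting table: codes 0-255 map to the single bytes 0-255."""
    return {code: bytes([code]) for code in range(256)}


table = initial_code_table()

# Assumption: codes are written at a fixed width large enough for code 4095.
code_width = (TABLE_SIZE - 1).bit_length()

# Codes 256 through 4095 are still blank and available for sequences.
free_codes = TABLE_SIZE - len(table)

print(table[65], code_width, free_codes)
# b'A' 12 3840
```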

Page 8: Compression project presentation

LZW Encoding Algorithm

Initialize table with single character strings
P = first input character
WHILE not end of input stream
    C = next input character
    IF P + C is in the string table
        P = P + C
    ELSE
        output the code for P
        add P + C to the string table
        P = C
END WHILE
output code for P
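The encoding pseudocode translates almost line for line into Python. The following is a minimal sketch, not a production encoder: the function name lzw_compress is hypothetical, the table is allowed to grow without the 4096-entry limit, and the input is assumed to contain only characters with code points 0 to 255.

```python
def lzw_compress(text):
    """Return the list of LZW codes for text (sketch, unbounded table)."""
    # Codes 0-255 represent the single characters.
    table = {chr(i): i for i in range(256)}
    if not text:
        return []
    next_code = 256
    codes = []
    p = text[0]                      # P = first input character
    for c in text[1:]:               # C = next input character
        if p + c in table:
            p = p + c                # extend the current match
        else:
            codes.append(table[p])   # output the code for P
            table[p + c] = next_code # add P + C to the string table
            next_code += 1
            p = c
    codes.append(table[p])           # output code for the final P
    return codes


print(lzw_compress("BABAABAAA"))
# [66, 65, 256, 257, 65, 260]
```

A real implementation would also stop adding entries (or reset the table) once all codes are used, and would pack the codes into bits rather than returning a list of integers.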

Page 9: Compression project presentation

Example 1: Compression using LZW

Example 1: Use the LZW algorithm to compress the string

BABAABAAA

Page 10: Compression project presentation

Example 1: LZW Compression Step 1

BABAABAAA    P = A    C = empty

STRING TABLE           ENCODER OUTPUT
String    Codeword     Representing    Output code
BA        256          B               66

Page 11: Compression project presentation

Example 1: LZW Compression Step 2

BABAABAAA    P = B    C = empty

STRING TABLE           ENCODER OUTPUT
String    Codeword     Representing    Output code
BA        256          B               66
AB        257          A               65

Page 12: Compression project presentation

Example 1: LZW Compression Step 3

BABAABAAA    P = A    C = empty

STRING TABLE           ENCODER OUTPUT
String    Codeword     Representing    Output code
BA        256          B               66
AB        257          A               65
BAA       258          BA              256

Page 13: Compression project presentation

Example 1: LZW Compression Step 4

BABAABAAA    P = A    C = empty

STRING TABLE           ENCODER OUTPUT
String    Codeword     Representing    Output code
BA        256          B               66
AB        257          A               65
BAA       258          BA              256
ABA       259          AB              257

Page 14: Compression project presentation

Example 1: LZW Compression Step 5

BABAABAAA    P = A    C = A

STRING TABLE           ENCODER OUTPUT
String    Codeword     Representing    Output code
BA        256          B               66
AB        257          A               65
BAA       258          BA              256
ABA       259          AB              257
AA        260          A               65

Page 15: Compression project presentation

Example 1: LZW Compression Step 6

BABAABAAA    P = AA    C = empty

STRING TABLE           ENCODER OUTPUT
String    Codeword     Representing    Output code
BA        256          B               66
AB        257          A               65
BAA       258          BA              256
ABA       259          AB              257
AA        260          A               65
                       AA              260
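The six steps of Example 1 can be checked with a small tracing sketch that records one row per encoder output, in the same column order as the tables in this example. The function name lzw_trace and the use of None for "no new table entry" are choices made here for illustration, and the input is assumed to be non-empty.

```python
def lzw_trace(text):
    """Return rows of (new string, codeword, representing, output code)."""
    table = {chr(i): i for i in range(256)}
    next_code = 256
    rows = []
    p = text[0]  # assumes a non-empty input
    for c in text[1:]:
        if p + c in table:
            p += c
            continue
        table[p + c] = next_code
        rows.append((p + c, next_code, p, table[p]))
        next_code += 1
        p = c
    # The final output flushes P and adds nothing to the string table.
    rows.append((None, None, p, table[p]))
    return rows


for row in lzw_trace("BABAABAAA"):
    print(row)
# ('BA', 256, 'B', 66)
# ('AB', 257, 'A', 65)
# ('BAA', 258, 'BA', 256)
# ('ABA', 259, 'AB', 257)
# ('AA', 260, 'A', 65)
# (None, None, 'AA', 260)
```

Reading the last column top to bottom gives the compressed sequence 66, 65, 256, 257, 65, 260.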

Page 16: Compression project presentation

LZW Decompression

The LZW decompressor creates the same string table during decompression.

It starts with the first 256 table entries initialized to single characters.

The string table is updated for each code in the input stream, except the first one.

Decoding is achieved by reading codes and translating them through the code table as it is being built.

Page 17: Compression project presentation

LZW Decompression Algorithm

Initialize table with single character strings
OLD = first input code
output translation of OLD
C = first character of translation of OLD
WHILE not end of input stream
    NEW = next input code
    IF NEW is not in the string table
        S = translation of OLD
        S = S + C
    ELSE
        S = translation of NEW
    output S
    C = first character of S
    add translation of OLD + C to the string table
    OLD = NEW
END WHILE
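A minimal Python sketch of the decompression algorithm follows. It is an illustration under assumptions: the name lzw_decompress is hypothetical, the table is unbounded, and the code sequence is assumed to be well formed. When NEW is not yet in the table, the sketch takes the first character of OLD's translation directly instead of keeping a separate C variable, which gives the same string.

```python
def lzw_decompress(codes):
    """Rebuild the original string from a list of LZW codes (sketch)."""
    # Codes 0-255 represent the single characters.
    table = {i: chr(i) for i in range(256)}
    if not codes:
        return ""
    next_code = 256
    old = codes[0]
    out = [table[old]]                   # output translation of OLD
    for new in codes[1:]:
        if new in table:
            s = table[new]
        else:
            # NEW was created by the encoder in this very step, so it must be
            # OLD's string followed by that string's own first character.
            s = table[old] + table[old][0]
        out.append(s)
        table[next_code] = table[old] + s[0]  # add OLD + C to the table
        next_code += 1
        old = new
    return "".join(out)


print(lzw_decompress([66, 65, 256, 257, 65, 260]))
# BABAABAAA
```

The "not in the table" branch is the only subtle part: it is triggered by the last code, 260, in the sequence above.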

Page 18: Compression project presentation

Example 2: LZW Decompression 1

Example 2: Use LZW to decompress the output sequence of Example 1:

<66><65><256><257><65><260>.

Page 19: Compression project presentation

Example 2: LZW Decompression Step 1

<66><65><256><257><65><260>    Old = 65    New = 65    S = A    C = A

STRING TABLE           DECODER OUTPUT
String    Codeword     String
                       B
BA        256          A

Page 20: Compression project presentation

Example 2: LZW Decompression Step 2

<66><65><256><257><65><260>    Old = 256    New = 256    S = BA    C = B

STRING TABLE           DECODER OUTPUT
String    Codeword     String
                       B
BA        256          A
AB        257          BA

Page 21: Compression project presentation

Example 2: LZW Decompression Step 3

<66><65><256><257><65><260>    Old = 257    New = 257    S = AB    C = A

STRING TABLE           DECODER OUTPUT
String    Codeword     String
                       B
BA        256          A
AB        257          BA
BAA       258          AB

Page 22: Compression project presentation

Example 2: LZW Decompression Step 4

<66><65><256><257><65><260>    Old = 65    New = 65    S = A    C = A

STRING TABLE           DECODER OUTPUT
String    Codeword     String
                       B
BA        256          A
AB        257          BA
BAA       258          AB
ABA       259          A

Page 23: Compression project presentation

Example 2: LZW Decompression Step 5

<66><65><256><257><65><260>    Old = 260    New = 260    S = AA    C = A

STRING TABLE           DECODER OUTPUT
String    Codeword     String
                       B
BA        256          A
AB        257          BA
BAA       258          AB
ABA       259          A
AA        260          AA
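The decoding steps of Example 2 can be replayed with a tracing sketch that records, for each code read, the string added to the table, its codeword, and the string written to the output. The function name lzw_decode_trace and the use of None for the first row (which adds no table entry) are assumptions made for this illustration; the code list is assumed to be non-empty and well formed.

```python
def lzw_decode_trace(codes):
    """Return rows of (new string, codeword, output string) per input code."""
    table = {i: chr(i) for i in range(256)}
    next_code = 256
    old = codes[0]
    rows = [(None, None, table[old])]  # the first code only produces output
    for new in codes[1:]:
        if new in table:
            s = table[new]
        else:
            # Code not yet in the table: OLD's string plus its first character.
            s = table[old] + table[old][0]
        table[next_code] = table[old] + s[0]
        rows.append((table[next_code], next_code, s))
        next_code += 1
        old = new
    return rows


rows = lzw_decode_trace([66, 65, 256, 257, 65, 260])
for row in rows:
    print(row)
print("".join(r[2] for r in rows))
# (None, None, 'B')
# ('BA', 256, 'A')
# ('AB', 257, 'BA')
# ('BAA', 258, 'AB')
# ('ABA', 259, 'A')
# ('AA', 260, 'AA')
# BABAABAAA
```

The decoder ends up with the same string table the encoder built in Example 1, without that table ever being transmitted.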

Page 24: Compression project presentation

Our Application and Implementation

Page 25: Compression project presentation
Page 26: Compression project presentation
Page 27: Compression project presentation

GUI of EXE File

Page 28: Compression project presentation