LZRW3 Data Compression Core
![Page 1: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/1.jpg)
Part A Project: Mid Presentation
By: Netanel Yamin & Shahar Zuta
Advisor: Moshe Porian
Dual-semester project, November 2012
![Page 2: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/2.jpg)
Contents
Project Overview
Project goals
Requirements
Architecture
Micro architecture
Problems & solutions
Conclusions
Testability
Methodology
Schedule
![Page 3: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/3.jpg)
Algorithm overview
INPUT FILE: literal items ONLY.
The LZRW3 COMPRESSOR turns the input file into the OUTPUT FILE, which is made of GROUPS OF ITEMS (literal/copy).
A copy item consists of two bytes that represent from 3 to 18 bytes. A literal item consists of one byte which represents itself.
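The two item types can be sketched in Python as follows. This is only an illustration: the bit layout (a 12-bit field followed by a 4-bit field holding the length minus 3) is an assumption chosen so that two bytes cover lengths 3 to 18, and the function names are hypothetical. The slide itself only states the item sizes.

```python
MIN_COPY_LENGTH = 3    # shortest run a copy item can represent
MAX_COPY_LENGTH = 18   # longest run a copy item can represent


def pack_copy_item(index, length):
    """Pack a copy item into two bytes (assumed layout: 12-bit index, 4-bit length - 3)."""
    if not 0 <= index < 4096:
        raise ValueError("index must fit in 12 bits")
    if not MIN_COPY_LENGTH <= length <= MAX_COPY_LENGTH:
        raise ValueError("length must be between 3 and 18")
    word = (index << 4) | (length - MIN_COPY_LENGTH)
    return bytes([word >> 8, word & 0xFF])


def unpack_copy_item(two_bytes):
    """Recover (index, length) from a two-byte copy item."""
    word = (two_bytes[0] << 8) | two_bytes[1]
    return word >> 4, (word & 0xF) + MIN_COPY_LENGTH


def pack_literal_item(byte_value):
    """A literal item is the byte itself."""
    return bytes([byte_value])
```

With this layout the 4-bit field has 16 values, which is exactly the number of lengths between 3 and 18.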
![Page 4: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/4.jpg)
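Mechanism demonstration
[Figure: the HASH FUNCTION maps three input bytes to a hash table INDEX in the range 0 to 4095; the rows used in the demonstration are labelled XXX, ZZZ, YYY and UUU.]
INPUT FILE: Expression_compress_ion
"E x p" is hashed to index UUU, which stores offset value 0. Output: Exp as literal items (L.I).
"r e s" is hashed to index XXX, which stores offset value 3. Output: res as literal items (L.I).
NOTE: The next 3 bytes should be "x p r", then "p r e" and only then "r e s"; we didn't demonstrate all the actions, for simplicity.
"L.I" stands for "Literal Item".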
![Page 5: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/5.jpg)
Mechanism demonstration (continued)
INPUT FILE: Expression_compress_ion
"s i o" is hashed to index ZZZ, which stores offset value 6; "n _ c" is hashed to index YYY, which stores offset value 9.
Offset values in the hash table: 0, 3, 6, 9.
Output: Exp L.I, res L.I, sio L.I, n_c L.I
![Page 6: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/6.jpg)
Mechanism demonstration (continued)
INPUT FILE: Expression_compress_ion
"o m p" is hashed and offset value 12 is stored in the hash table.
Offset values in the hash table: 0, 3, 6, 9, 12.
Output: Exp L.I, res L.I, sio L.I, n_c L.I, omp L.I
![Page 7: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/7.jpg)
Mechanism demonstration (continued)
INPUT FILE: Expression_compress_ion
"r e s" is hashed to index XXX again. That row already holds offset value 3, so a copy item is produced and the row is updated to offset value 15.
Offset values in the hash table: 0, 15, 6, 9, 12.
Output: Exp L.I, res L.I, sio L.I, n_c L.I, omp L.I, XXX C.I
"C.I" stands for "Copy Item".
![Page 8: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/8.jpg)
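Flow chart
START: hash 3 bytes and look up Hash table [index].
Index empty field?
yes: enter offset, O.F. - Literal item, FWD 1 byte.
no: get offset and check "Same 3 bytes?".
Same 3 bytes?
no: enter offset, O.F. - Literal item, FWD 1 byte.
yes: Length++ while there are more same bytes, then O.F. - Copy item, FWD 3 + Length bytes.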
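The flow chart can be followed in software with a short Python sketch. It is a simplified model, not the project's VHDL: it emits a list of items instead of packed groups, it represents a copy item as (table index, total length), and it caps a match at 18 bytes as stated in the algorithm overview. The function names are hypothetical.

```python
HASH_TABLE_SIZE = 4096
MIN_MATCH = 3
MAX_MATCH = 18


def table_index(data, pos):
    """Hash the three bytes starting at pos into a table index (0 to 4095)."""
    key = (data[pos] << 8) ^ (data[pos + 1] << 4) ^ data[pos + 2]
    return ((40543 * key) >> 4) & 0xFFF


def compress_items(data):
    """Return a list of ("literal", byte) and ("copy", index, length) items."""
    table = [None] * HASH_TABLE_SIZE   # None models an empty field
    items = []
    pos = 0
    while pos < len(data):
        if pos + MIN_MATCH > len(data):
            # Fewer than three bytes left: nothing to hash, emit a literal.
            items.append(("literal", data[pos]))
            pos += 1
            continue
        index = table_index(data, pos)
        candidate = table[index]       # get offset
        table[index] = pos             # enter offset
        length = 0
        if candidate is not None:
            # Count how many bytes match, up to the copy item limit.
            while (length < MAX_MATCH and pos + length < len(data)
                   and data[candidate + length] == data[pos + length]):
                length += 1
        if length >= MIN_MATCH:
            items.append(("copy", index, length))
            pos += length              # forward over the whole match
        else:
            items.append(("literal", data[pos]))
            pos += 1                   # forward 1 byte
    return items
```

For example, `compress_items(b"a" * 10)` yields one literal item followed by one copy item of length 9, because the second trigram finds the first one in the table.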
![Page 9: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/9.jpg)
Project Goals
Implementation of the LZRW3 data compression algorithm.
Implementing strong debugging capabilities via a GUI.
![Page 10: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/10.jpg)
Requirements
VHDL implementation.
DE2 development board that features an Altera Cyclone II FPGA.
FPGA - Host communication via UART protocol.
Use internal memory on the FPGA, no interface to external memory.
Adapted to data templates of 2 KByte to 32 KByte.
High performance: data transfer of 1 Gbps.
![Page 11: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/11.jpg)
Requirements
VHDL implementation.
XUPV5 development board that features a Xilinx Virtex-5 FPGA.
FPGA - Host communication via UART protocol.
Use internal memory on the FPGA, no interface to external memory.
Adapted to data templates of 2 KByte to 32 KByte.
High performance: data transfer of 1 Gbps.
![Page 12: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/12.jpg)
Architecture
[Block diagram, XILINX VIRTEX 5 ON XUVP505 BOARD. Blocks: GUI, UART (Rx PATH), INPUT BLOCK memory, LZRW3 COMPRESSOR CORE, COMPRESSED FILE memory, UART (Tx PATH).]
![Page 13: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/13.jpg)
![Page 14: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/14.jpg)
LZRW3 COMPRESSOR CORE: interface signals
Lzrw3_go
Lzrw3_mode
data_input_byte (7..0)
data_input_valid
data_input_taken
clk
Lzrw3_busy
Lzrw3_done
Lzrw3_output_group_size (4..0)
data_output_valid
data_output_taken
data_output_last
reset
data_output_bytes(13..0)
End_of_file
![Page 15: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/15.jpg)
![Page 16: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/16.jpg)
![Page 17: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/17.jpg)
STAGE 1 – three bytes buffer
3 BYTES BUFFER
enable
reset
New_byte(7..0)
clk
Newer_byte(7..0)
Mid_byte(7..0)
Older_byte(7..0)
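A behavioural Python sketch of this stage is given below. It assumes the buffer is a plain three-stage shift register that moves on every enabled clock (Older_byte takes Mid_byte, Mid_byte takes Newer_byte, Newer_byte takes New_byte); the slide lists only the ports, so this behaviour and the class name are assumptions.

```python
class ThreeByteBuffer:
    """Software model of a three-byte shift register (assumed behaviour)."""

    def __init__(self):
        self.reset()

    def reset(self):
        """Clear all three byte registers, as a reset would."""
        self.newer_byte = 0
        self.mid_byte = 0
        self.older_byte = 0

    def clock(self, new_byte, enable=True):
        """Model one rising clock edge; shift only when enable is set."""
        if enable:
            self.older_byte = self.mid_byte
            self.mid_byte = self.newer_byte
            self.newer_byte = new_byte & 0xFF
        return self.older_byte, self.mid_byte, self.newer_byte
```

After clocking in the bytes of "Exp", the outputs hold E, x, p; one more clock with "r" gives x, p, r, which is the sliding window the hash stage needs.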
![Page 18: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/18.jpg)
![Page 19: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/19.jpg)
STAGE 2- hash function
enable
HASH FUNCTION
middle_byte(7..0)
clk
Table_index(11..0)
older_byte(7..0)
Newer_byte(7..0)
reset
![Page 20: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/20.jpg)
TABLE INDEX = (((40543 * (((*(PTR)) << 8) ^ ((*((PTR)+1)) << 4) ^ (*((PTR)+2)))) >> 4) & 0xFFF)
PTR points to the first byte. TABLE INDEX range: 0 to 4095.
Bit alignment of the three bytes before the XOR, with a = *(PTR), b = *((PTR)+1), c = *((PTR)+2):
a << 8: aaaa,aaaa,0000,0000
b << 4: 0000,bbbb,bbbb,0000
c: 0000,0000,cccc,cccc
XOR result (16 bits): a7 a6 a5 a4, a3^b7 a2^b6 a1^b5 a0^b4, b3^c7 b2^c6 b1^c5 b0^c4, c3 c2 c1 c0
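The expression translates directly into Python. This is a minimal sketch for checking values by hand; the function and argument names are hypothetical, with older_byte standing for the byte PTR points to.

```python
HASH_MULTIPLIER = 40543


def table_index(older_byte, middle_byte, newer_byte):
    """Compute the 12-bit hash table index for three consecutive bytes."""
    # The shifted bytes overlap by four bits, so the XOR result is 16 bits wide.
    key = (older_byte << 8) ^ (middle_byte << 4) ^ newer_byte
    # Multiply, drop the four low bits, keep the next twelve.
    return ((HASH_MULTIPLIER * key) >> 4) & 0xFFF


print(table_index(0, 0, 1))    # 2533
print(table_index(0, 1, 0))    # 3679
print(table_index(0, 0, 16))   # 3679
```

The last two calls return the same index because the middle byte shifted left by 4 overlaps the upper half of the newest byte, so different byte triples can produce the same 16-bit key.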
![Page 21: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/21.jpg)
STAGE 2- RTL view
![Page 22: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/22.jpg)
STAGE 3 – hash table
enable
HASH TABLE
Data_out_valid
Table_index(0..11)
clk
Offset(19..0)
Current_offset(19..0)
Offset counter
reset
clear
![Page 23: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/23.jpg)
[Figure: hash table memory of 4096 rows, 21 bits per row including the valid bits. The offset counter drives Current_offset into DATA_IN, INDEX drives ADDRESS, and the outputs are Offset and Data_out_valid.]
![Page 24: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/24.jpg)
STAGE 4 – input file memory
![Page 25: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/25.jpg)
Stage 4 implementation
The input file memory should supply three bytes at the same time.
![Page 26: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/26.jpg)
How to choose the bank when a byte arrives?
$$\text{Bank \#} = \text{current\_offset} \bmod 3$$
$$\text{Address in bank} = \left\lfloor \frac{\text{current\_offset}}{3} \right\rfloor$$
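In software the bank selection is a single division with remainder. The sketch below assumes integer (floor) division for the address; the function name is hypothetical.

```python
NUM_BANKS = 3


def bank_and_address(current_offset):
    """Return (bank number, address in bank) for a byte at current_offset."""
    return current_offset % NUM_BANKS, current_offset // NUM_BANKS


# Consecutive offsets rotate through the banks; the address grows every third byte.
for offset in range(6):
    print(offset, bank_and_address(offset))
```

Three consecutive offsets always land in three different banks, which is what lets the memory supply three bytes at the same time. In hardware, dividing by 3 is the costly part of this mapping.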
![Page 27: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/27.jpg)
SOLUTION
Instead of counting in stage 3 and dividing in stage 4, we increment the offset by one only after three clock cycles.
In this configuration we expand the offset by 2 bits (tagging) to select the bank the data needs to be written into.
The hash table size is now 4096 x (19+2).
1001010101001110011 10
19 bits 2 bits
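The idea can be modelled with a counter that keeps the bank address and a 2-bit tag side by side, so no divider is needed. The sketch assumes the tag counts 0, 1, 2 and then wraps while the address increments; the class name and the exact tag encoding are assumptions.

```python
class TaggedOffsetCounter:
    """Counter that tracks (address, tag) directly instead of dividing by 3."""

    def __init__(self):
        self.address = 0   # address inside a bank (the 19-bit part)
        self.tag = 0       # bank selector (the 2-bit part), values 0 to 2

    def step(self):
        """Return the current (address, tag) and advance by one byte."""
        current = (self.address, self.tag)
        if self.tag == 2:
            # Third byte written: wrap the tag and move to the next address.
            self.tag = 0
            self.address += 1
        else:
            self.tag += 1
        return current

    @staticmethod
    def pack(address, tag):
        """Concatenate address and tag as drawn on the slide (tag in the low bits)."""
        return (address << 2) | tag
```

Stepping the counter n times gives the same pairs as computing `n // 3` and `n % 3`, but it only uses an increment and a compare.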
![Page 28: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/28.jpg)
Solution costs (mem units)
Memory usage at stage 3 from synplify_pro: $20\ \text{bit} \times 4096 = 81920\ \text{bit} = 80\ \text{Kbit}$, which takes 3 RAM blocks of 36 Kbit (108 Kbit).
LUT usage: same as before.
![Page 29: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/29.jpg)
Back to stage 4
![Page 30: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/30.jpg)
[Figure: stage 4 block diagram. Units: input file memory banks (Bank 0, 1, 2 addresses), comparator, counter, Tentative next address, Addresses alignment. Signals: Continue, offset, TAG, INDEX, Comprison_valid, Compare_success, Compare_success_P, Offset_tag, Tentative_tag, Tentative_taken, Item_length_p, Offset_valid, Older_byte_P, clk.]
The TAG indicates the order of the bytes in the banks.
![Page 31: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/31.jpg)
[Figure: the same stage 4 block diagram at the next step of the example, with the same units and signals.]
![Page 32: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/32.jpg)
![Page 33: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/33.jpg)
Problem (1)
In stage 4, we first implemented the counter that counts the number of successful comparisons inside the comparator, which is built as an asynchronous process. It passed simulation but was not synthesizable.
![Page 34: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/34.jpg)
Solution (1)
We changed the architecture of the units so that the counter is implemented in a synchronous unit. It receives a signal from the asynchronous comparator when a comparison is successful and responds accordingly.
![Page 35: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/35.jpg)
Problem (2)
In stage 4, in order to compare the current 3 bytes in the pipe with three bytes from the RAM memory, we need to extract three consecutive bytes from different addresses in one clock period.
![Page 36: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/36.jpg)
Solution (2)
We split the single memory we had into 3 RAM memory banks that hold consecutive addresses, so when we want to extract 3 consecutive bytes from the memory we extract one byte from each bank.
![Page 37: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/37.jpg)
Problem (3)
In stage 4, the current pipe bytes that arrive at the comparator are arranged in their arrival order, but the three bytes read from the banks are not necessarily arranged in the right order.
![Page 38: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/38.jpg)
Reading configurations
1. SAME ADDRESSES
![Page 39: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/39.jpg)
Reading configurations
2. DIFFERENT ADDRESS
![Page 40: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/40.jpg)
Reading configurations
3. DIFFERENT ADDRESS # 2
![Page 41: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/41.jpg)
Solution (3)
We used the TAG that represents the addresses of the extracted bytes to determine which extracted byte will be compared with which current piped byte.
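The bank split and the TAG-based ordering can be sketched together in Python. The model assumes byte number n of the input is stored in bank n mod 3 at address n div 3, and that the TAG of a stored offset is the bank of its first byte; both function names are hypothetical.

```python
def fill_banks(data):
    """Distribute the input bytes over three banks, one byte per bank in turn."""
    banks = [[], [], []]
    for offset, byte in enumerate(data):
        banks[offset % 3].append(byte)
    return banks


def read_three_bytes(banks, address, tag):
    """Read one byte from each bank and return them in input order.

    tag selects the bank that holds the first byte. With tag 0 all three
    banks use the same address; otherwise the banks that wrapped around
    use address + 1.
    """
    ordered = []
    for step in range(3):
        bank = (tag + step) % 3
        row = address + (tag + step) // 3
        ordered.append(banks[bank][row])
    return bytes(ordered)


banks = fill_banks(b"Expression_compress_ion")
print(read_three_bytes(banks, 1, 0))   # b'res': same address in all banks
print(read_three_bytes(banks, 1, 1))   # b'ess': bank 0 reads the next address
```

The three values of the tag correspond to the three reading configurations: one where all banks share an address and two where one or two banks read from the following address.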
![Page 42: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/42.jpg)
Problem (4)
In stage 4, the RAM memory banks need to receive the address to read on the next clock before the end of the current clock.
![Page 43: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/43.jpg)
Solution (4)
We created two units that hold the two possible next addresses (the tentative address unit and the address align unit).
![Page 44: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/44.jpg)
Conclusions
Writing code for synthesis is different from writing code for simulation.
In an asynchronous implementation all the signals need to be in the sensitivity list.
Reset should not pass through any logic.
Think hardware when writing VHDL code for synthesis.
Keep it simple to achieve more flexibility.
![Page 45: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/45.jpg)
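Testability
[Figure: a random input generator produces the input file (labelled 2048). The same three bytes A, B, C are fed to the synthesisable hash function block and to an unsynthesisable simulation function. The testbench asserts the comparison of the two results and reports to the console.]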
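The same comparison scheme can be sketched in Python. Here `hardware_style_table_index` is a hypothetical stand-in for the synthesisable block (it keeps only the 16 low bits of the product, as a fixed-width multiplier might), `reference_table_index` plays the role of the simulation function, and the seed and vector count are arbitrary choices.

```python
import random


def reference_table_index(a, b, c):
    """Golden model: the hash expression written as on the hash function slide."""
    return ((40543 * ((a << 8) ^ (b << 4) ^ c)) >> 4) & 0xFFF


def hardware_style_table_index(a, b, c):
    """Stand-in for the block under test: truncate the product to 16 bits first."""
    key = (a << 8) ^ (b << 4) ^ c
    product_low = (40543 * key) & 0xFFFF
    return product_low >> 4


def run_random_comparison(num_vectors, seed=0):
    """Feed the same random bytes to both models and collect any mismatches."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(num_vectors):
        a, b, c = (rng.randrange(256) for _ in range(3))
        expected = reference_table_index(a, b, c)
        actual = hardware_style_table_index(a, b, c)
        if actual != expected:
            mismatches.append((a, b, c, expected, actual))
    return mismatches


mismatches = run_random_comparison(2048)
print("mismatches:", len(mismatches))
```

The two functions agree because the table index uses only bits 4 to 15 of the product, so discarding the higher bits early does not change the result.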
![Page 46: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/46.jpg)
Methodology
Stage data flow review.
Writing VHDL code.
Writing VHDL testbench.
Code review and debugging.
Synthesis check (Synplify): check RTL view, check CLK constraints.
Commit SVN folders and update data flow if needed.
Next stage data flow review.
Simulation & debugging.
![Page 47: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/47.jpg)
Schedule 1/2
Date | Goals
24/4/2012 – 1/5/2012 | Project characterization & algorithm interpreting
2/5/2012 | Characterization presentation
2/5/2012 – 16/5/2012 | Full characterization of all blocks
17/5/2012 – 1/7/2012 | System blocks VHDL design
1/7/2012 – 27/7/2012 | Work on project paused for exams
29/7/2012 – 11/11/2012 | System blocks VHDL design (cont.); writing a simulation testbench for every unit
![Page 48: LZRW3 Data Compression Core](https://reader035.fdocuments.in/reader035/viewer/2022062314/5681362f550346895d9daac6/html5/thumbnails/48.jpg)
Schedule 2/2
Date | Goals
12/11/2012 | Mid presentation
13/11/2012 – 19/12/2012 | System blocks VHDL design (cont.); writing a simulation testbench for every unit
20/1/2012 | Part A final: core simulation vs. golden model
21/1/2012 – 15/2/2012 | Assemble all units and FPGA synthesis
16/2/2012 – 28/2/2012 | GUI implementation
1/3/2012 – 10/3/2012 | Final overall tests & debug
11/3/2012 – 31/3/2012 | Editing and finishing project portfolio
1/4/2012 | Final presentation