B Eng Final Year Project Presentation
Transcript of B Eng Final Year Project Presentation
Parallel architecture for image compression
Introduction | Algorithm | Architecture | Results | Conclusion | Q&A
A parallel architecture for image compression
(2004)
An FYP presentation by:
Jesu Joseph, Shibu Menon
Introduction (Jesu Joseph)
Algorithm (Shibu Menon)
Architecture (Shibu Menon)
Results (Jesu Joseph)
Conclusion (Jesu Joseph)
Introduction
Why image compression?
• Low-color displays
• Real-time video compression and streaming
• Save storage
• Applications include:
Digital camera displays
Device-to-device video streaming
Devices with limited storage
Artificial neural networks
• Inspired by the human brain
• Human learning is:
1. Time dependent
2. Quality-of-brain dependent
3. Complexity-of-input dependent
• PC processing:
INPUT >> SOFTWARE >> HARDWARE >> OUTPUT
Artificial neural networks
Self-learning
Self-improvement
Self-correction
Ease of upgrade
Our Project
• Use neural network techniques to design and implement a stand-alone chip for image compression
• Self-learning
• Self-improvement
• Ease of upgrade
Our Project
Stage 1 – Algorithm:
• Studied the theoretical algorithm
• Optimized the algorithm for real-time performance
• Optimized the algorithm for ease of implementation
Stage 2 – Architecture:
• Block-level architecture design
• Component-level hardware design
• Hardware coding (Verilog)
• Design generation
Our Project
Stage 3 – Testing:
• Simulation of individual modules with test-benches
• Data verification of individual modules
• Result verification
Stage 4 – Implementation:
• Synthesis
• FPGA testing
Basics of image compression
(R,G,B) = (20,48,206)₁₀
= (14,30,CE)₁₆
= (00010100, 00110000, 11001110)₂
Normal format
• 24 bits per pixel
• 16,777,216 (16 million) possible colors
• 512×512 image size = 512 × 512 × 3 bytes ≈ 800 kB
16-color format
• Bits per pixel in 16-color format = 4 bits
• Size of the image in 16-color format = 512 × 512 × 0.5 bytes ≈ 130 kB
• About 6:1 image size compression and bandwidth improvement
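The arithmetic above can be checked in a few lines (a back-of-the-envelope sketch; the variable names are ours):

```python
# Sizes for a 512x512 RGB image, as on the slide.
width, height = 512, 512

raw_bytes = width * height * 3      # 24 bits per pixel = 3 bytes
quant_bytes = width * height // 2   # 4 bits per pixel (16 colors)

print(raw_bytes // 1024)            # 768 KiB, the slide's ~800 kB
print(quant_bytes // 1024)          # 128 KiB, the slide's ~130 kB
print(raw_bytes / quant_bytes)      # 6.0, i.e. about 6:1
```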
Our design
About 170 different colors → Color chooser → 16 colors, each represented by a 4-bit number → Encoder → Compressed image
Codebook:
0000 → (00,00,00)
0001 → (10,10,A3)
0010 → (39,0A,9D)
0011 → (40,68,90)
… → …
1111 → (FF,FF,FF)
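The chooser/encoder path can be sketched in software as a nearest-entry lookup. The codebook below reuses the illustrative values from the slide (padded with white, and shortened to 5 entries instead of 16); `nearest`, `encode` and `decode` are our hypothetical names, not the hardware's:

```python
# Toy codebook: (R, G, B) entries; the real design holds 16 of these.
codebook = [(0x00, 0x00, 0x00), (0x10, 0x10, 0xA3),
            (0x39, 0x0A, 0x9D), (0x40, 0x68, 0x90),
            (0xFF, 0xFF, 0xFF)]

def nearest(pixel):
    # Manhattan distance, as the modified algorithm uses.
    return min(range(len(codebook)),
               key=lambda i: sum(abs(w - x) for w, x in zip(codebook[i], pixel)))

def encode(pixels):
    return [nearest(p) for p in pixels]   # 4-bit codes per pixel

def decode(codes):
    return [codebook[c] for c in codes]   # plain LUT lookup

codes = encode([(0x12, 0x0F, 0xA0), (0xF0, 0xF0, 0xF0)])
print(codes)           # [1, 4]
print(decode(codes))
```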
Our design
Input pixels → Learn the image → Create the codebook → Improve the codebook → Encode the image
Codebook:
0000 → (00,00,00)
0001 → (10,10,A3)
0010 → (39,0A,9D)
0011 → (40,68,90)
… → …
1111 → (FF,FF,FF)
Compressed image → Decode the image
Algorithm (overview)
• Kohonen algorithm: what the algorithm does, its main functions, the algorithm steps
• Modifications made, and why these modifications
• 7-bit and 8-bit algorithms; advantage of the 7-bit algorithm
Overall algorithm
The Kohonen algorithm maps the neurons to a lookup table (LUT):
Neuron 1, Weight 1 → Address 1, Data 1
Neuron 2, Weight 2 → Address 2, Data 2
… → …
Neuron 16, Weight 16 → Address 16, Data 16
IMAGE → Learning → pixel-by-pixel Encoding → Compressed image
Decompression uses the LUT (plus the MSB plane in 7-bit mode).
Kohonen algorithm
• Assumed: neuron weights (w) denote the position of a neuron in 3-D space
• Serial presentation of training vectors (x)
• Time-dependent learning rate α(t)
• Learning count t
• The 3-D space axes represent R, G and B
(Diagram: neurons with weights w1, w2 and an input training vector x plotted on the RED, GREEN and BLUE axes)
Steps
STEP 1: Find the closest neuron (neuron c):
||X(t) − Wc(t)|| = min{||X(t) − Wi(t)||}
STEP 2: Update the weight of the winning neuron and of the neurons in the topological neighborhood:
Wi(t+1) = Wi(t) + α(t)·{X(t) − Wi(t)}, for i ∈ Nc(t) (neighborhood)
Iterate STEP 1 and STEP 2.
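In software, one learning iteration looks roughly like this (a minimal NumPy sketch; the fixed `radius` neighborhood is a simplification of Nc(t), and all values are illustrative):

```python
import numpy as np

# One Kohonen iteration for color quantization: find the closest neuron,
# then pull it and its topological neighborhood toward the training vector.
rng = np.random.default_rng(0)
W = rng.uniform(0, 255, size=(16, 3))   # 16 neuron weights in RGB space

def kohonen_step(W, x, alpha=0.5, radius=64.0):
    d = np.linalg.norm(W - x, axis=1)                  # ||X(t) - W_i(t)||
    c = int(np.argmin(d))                              # winning neuron c
    nbr = np.linalg.norm(W - W[c], axis=1) <= radius   # simplified N_c(t)
    W[nbr] += alpha * (x - W[nbr])                     # W_i += alpha (X - W_i)
    return c

x = np.array([200.0, 30.0, 30.0])        # one training pixel
before = np.linalg.norm(W - x, axis=1).min()
c = kohonen_step(W, x)
after = np.linalg.norm(W[c] - x)
print(c, before, after)                  # the winner moved closer to x
```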
Algorithm modification
Why change? Computational expense, hardware complexity, and efficiency (the 7-bit implementation).
Avoid multiplication and recursive logic blocks.
Trade-off between time, efficiency and complexity.
Modifications are discussed where relevant.
Modified algorithm
STEP 1: Training vector X (Xr, Xg, Xb) is input to all neurons (N). Neurons are initialized with weight vectors w (gray-scale initialization).
STEP 2: Each neuron calculates its weight difference from the input vector: Manhattan distance Σ |Wij − Xj|, for the i-th neuron (i = 1…N), j ∈ {r, g, b}.
STEP 3: The neuron with the minimum Manhattan distance is chosen; this neuron is denoted the Winner.
STEP 4: Neurons in the topological neighborhood are chosen; these neurons are denoted Neighbors.
STEP 5: Update the neuron weights:
Wi(t+1) = Wi(t) + α(t)·{X(t) − Wi(t)}, with learning rate α ∈ {1/2, 1/4, 1/8, 1/16, …}
Repeat for the next input vector; stop after a fixed number of iterations.
Modifications:
• Gray-scale initialization (r = g = b).
• Manhattan distance is used instead of Euclidean distance; it denotes the absolute distance of the neuron from the input vector.
• The minimum-distance neuron can be chosen using binary/recursive searching.
• Neighborhood: the usual Kohonen algorithm uses a shrinking function f(t), e.g. d = d0(1 − t/T); here it is modified to an expanding sphere.
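A software sketch of the modified step on integers, the way the hardware avoids multipliers: Manhattan distance for the winner search, and a power-of-two learning rate applied as an arithmetic shift. The weights and pixel below are made-up values:

```python
def manhattan(w, x):
    # Sum of absolute channel differences: |Wr-Xr| + |Wg-Xg| + |Wb-Xb|
    return sum(abs(wi - xi) for wi, xi in zip(w, x))

def update(w, x, shift=1):
    # alpha = 1/2**shift, applied with a right shift instead of a multiply.
    return tuple(wi + ((xi - wi) >> shift) for wi, xi in zip(w, x))

weights = [(0, 0, 0), (8, 8, 8), (16, 16, 16), (120, 120, 120)]
x = (100, 110, 120)

winner = min(range(len(weights)), key=lambda i: manhattan(weights[i], x))
weights[winner] = update(weights[winner], x, shift=1)   # alpha = 1/2
print(winner, weights[winner])                          # 3 (110, 115, 120)
```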
7-Bit Algorithm
7-bit vs. 8-bit algorithm
(e.g. the 8-bit values 0000_0001 … 1111_1111 map into 7-bit values such as 000_0001)
8-bit algorithm:
• Neuron weight components are 8 bits each (R = 8, G = 8, B = 8)
• Input vector components are 8 bits each
• Image reconstruction is a simple matter of looking up the pixel values from the lookup table
• Requires storage of only the LUT
7-bit algorithm:
• Neuron weight components are 7 bits each
• Input vector components are 7 bits each (need for conversion)
• Image reconstruction is complex and involves looking up the MSB plane
• Requires storage of the MSB plane and the LUT
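Our reading of the 7-bit mode's pixel handling, as a sketch: the MSB of each 8-bit channel is peeled off into a separate MSB plane (placing the pixel in one of eight octants), the network learns on the remaining 7 bits, and reconstruction recombines the two. Function names are ours, not the hardware's:

```python
def split(pixel):
    # Peel off the top bit of each 8-bit channel.
    msb = tuple(c >> 7 for c in pixel)       # one bit per channel (MSB plane)
    low7 = tuple(c & 0x7F for c in pixel)    # 7-bit value the network sees
    return msb, low7

def recombine(msb, low7):
    # Reconstruction: reattach the stored MSB plane to the LUT output.
    return tuple((m << 7) | l for m, l in zip(msb, low7))

msb, low7 = split((0x14, 0x30, 0xCE))
print(msb, low7)                             # (0, 0, 1) (0x14, 0x30, 0x4E)
```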
Architecture
• System architecture
• Network structure
• Neuron structure
• Global controller
• Architectural novelty
• Algorithm-to-architecture translation
Algorithm-to-architecture translation
(Diagram: neurons 1–16 arranged around the GLOBAL CONTROLLER on a shared broadcast bus)
Broadcast architecture
• The central controller broadcasts control signals to the neurons.
• Neurons have the ability to take control of the global bus.
• Arbitration eliminates contention for the central bus.
• Hardware efficient, since a rich interconnection would mean an infeasible number of I/O pins.
• Expansion of the neural network is simplified.
• Only one feedback signal from the network to the controller.
Neuron structure
• All arithmetic computations use the same block, ARITHMETIC_UNIT.
• The variable learning rate is implemented using a shift register.
Algorithm to architecture

Registers used in the architecture (widths in bits):
• wR(7), wG(7), wB(7) – weight vectors
• T1(9) – distance from the winning neuron
• T2(9), T2R(7), T2G(7), T2B(7), T2RC(1), T2GC(1), T2BC(1) – distance from the input pixel
• FC(18) – frequency counter
• Neuron address

Step 1: FC initialization
• The global controller (GC) fills in FC for all neurons.
• The frequency counter starts from a threshold value and is decremented for the winning neuron. A neuron is disabled when FC = 0.
GC: ini_freq asserted with the data on DATA[8:0].

Step 2: Weights initialization
• 2 cycles per neuron: address broadcast, then the data.
GC:
Cycle 1: Address on DATA[8:0] with add_cyc; the particular neuron is selected.
Cycle 2: The WR value (= WG = WB) on DATA[8:0] together with MEM_ADD[2:0]; the data is read by the RAM.
Initialization (gray scale, hex):
Add | WR | WG | WB
0 | 00 | 00 | 00
1 | 08 | 08 | 08
2 | 10 | 10 | 10
… | … | … | …
15 | 78 | 78 | 78

Step 3: Manhattan distance calculation
• The arithmetic unit of each neuron calculates |WR−XR| + |WG−XG| + |WB−XB| and stores it in register T2.
GC:
1. WR address on RAM_ADD[2:0] with mem_rd asserted.
2. First input's red value on DATA[8:0]; assert st_cal.
3. When the first posedge(S0) is detected, read T2R into the memory and T2RC into the flag register. Repeat 1–2 with the green value.
4. When the second posedge(S0) is detected, read T2G into the memory and T2GC into the flag register. Repeat 1–2 with the blue value.
5. When the third posedge(S0) is detected, read T2B into the memory and T2BC into the flag register.
6. When the next negedge(S0) is detected, read T2 into the memory using data_en.

Step 4: Find the minimum Manhattan distance
• Any number less than 512 can be guessed in 10 steps using binary searching, based on the "<" or ">" relation between the guessed number and the original number.
• The neuron with the least distance has its Wnr flag set.
GC:
1. The T2 value is written to MEM_OUT[8:0].
2. The number 512 is broadcast to the negative terminal of the differentiator.
3. If a high S1 is detected, the least Manhattan distance lies below 512.
4. Steps 1–3 are repeated with successive values (256, 128, …) depending on S1 after each stage.
5. At the 10th cycle, min(T2) is on DATA[8:0] and the Wnr flag is loaded. The winning neuron(s) have Wnr set to 1.

Step 5: FC update and winner weight broadcast
• The frequency counter is decremented and the winning neuron takes control of the bus.
• The distance from the winner is calculated in each neuron.
GC:
1. dec_freq is asserted and the winning neuron decrements its frequency counter.
2. nrn_ctrl is asserted with the WR MEM_ADD[2:0]. The winning neuron broadcasts its WR value on DATA[8:0]; all neurons, including the winner, put their WR value on MEM_OUT[8:0].
3. GC asserts st_cal. This is repeated with WG and WB.
4. The accumulated value is written to T1 using data_en, MEM_ADD[2:0] and mem_en.

Step 6: Winner updating
• The learning rate for the winner is based on the value in the frequency count register.
• The red value is updated as WR = WR + (T2RC × T2R × α(f)), where α(f) is the learning rate as a function of frequency.
• This step is repeated for the green and blue values.
• Only the neuron with the Wnr flag set updates these values.
• The learning rate is implemented using a shift register, e.g. α = 0.875 = 1/2 + 1/4 + 1/8.

Step 7: Neighbor determination and updating
• All neurons have a T1 value stored.
• The GC broadcasts neighborhood-size values; neurons with T1 < neighborhood size fall in the neighborhood and have their Nbr flag set.
• Neighbor neurons are updated using Wr = Wr + (T2RC)(T2R/16).
• The neighborhood size increases progressively.

Step 8: Iterate for the next input pixel
• Stop after a fixed number of iterations.
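Step 4's winner search can be modeled in software: the controller halves a threshold repeatedly, steering on a single feedback bit that says whether any neuron's distance lies below the current guess. A behavioral sketch, not the RTL:

```python
def find_min(distances):
    # All distances are < 512 (9-bit values), so 10 halvings suffice.
    lo, hi = 0, 512
    for _ in range(10):
        mid = (lo + hi) // 2
        # Models the single feedback line: any neuron with distance
        # below the broadcast guess pulls it high.
        if any(d < mid for d in distances):
            hi = mid
        else:
            lo = mid
    return lo    # equals min(distances)

print(find_min([307, 42, 499, 42, 180]))   # 42
```

Note that the loop narrows a half-open interval [lo, hi) containing the minimum; after nine halvings the interval has width 1, so `lo` is exactly the minimum, matching the slide's "10 steps" budget.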
Architectural novelties
Implementation of 7-bit learning: 7-bit learning, mapping of pixels to an octal space, and encoding the MSB plane with the image are new theoretical ideas that we implemented in hardware. This mode should theoretically create images of better quality than those encoded in 8-bit mode, because the neurons are more closely packed in a smaller space, producing a better response from the structure to each pixel.
Implementation of both the 8-bit and the 7-bit learning algorithm: The same hardware can process an image in both 7-bit and 8-bit modes; a single push-button switch on the FPGA board sets the mode for a cycle. This is useful because certain images give better output in 7-bit mode than in 8-bit, or vice versa, and the two can be compared in later studies. It also keeps future upgrades in mind: a module could be added that calculates the mean-square error for both 7-bit and 8-bit images and selects the better one.
Architectural novelties
Integration of the encoding hardware with the learning hardware: This integration ensures faster compression and reduces the hardware overhead. It keeps in mind the future practical application of the hardware for real-time video compression, rather than just stand-alone images.
Implementation of the variable learning rate: A variable learning rate (using rates of 1/2, 1/4, … etc.) is a novel feature of this architecture. It ensures better updating of neighbors based on their distance from the winning neuron, rather than a fixed update. Neighbors are updated based on 5 ranges of distance from the winner, with the update amounts derived from theoretical calculations.
Architectural novelties
Implementation of the learning rate depending on the frequency count: The frequency count value is calculated so that all neurons get an equal chance of being the winner. At the same time, the algorithm ensures that a neuron that has won most often is not updated as much as the less lucky ones. This is not seen in other similar algorithms.
Implementation of neighbor updating: Neighbor updating together with winner updating is another novel feature of our algorithm. It complicates the design, but the output quality is considerably improved compared to other architectures.
Architectural novelties
Hardware features:
• Absolutely no redundant hardware.
• All computations are done using the same hardware blocks (shift register and arithmetic unit).
• All control features are vested in the global controller.
Results
Testing strategy
Data verification | Result verification
Testing strategy – data verification
Data verification:
• Modelsim SE 5.5a
• 4 random input pixels, 1 loop, 16 neurons, 7/8 bits
• Signals for each module viewed & verified
• Major verifications:
• Correct winner selection
• GC state transitions
• Correct neighbor/winner update
• Correct number of encoded/decoded pixels
Test-bench hierarchy in Modelsim: TEST.V (with TEST_INPUT) drives TOP.V, the top-level synthesizable module, which instantiates NEURON_ARRAY, GLOBAL_CONTROLLER and ENCODER.
Testing strategy – result verification
Flow: input.tiff → C program (with the Config.dat configuration file) → TOP.V, the top-level synthesizable module (NEURON_ARRAY, GLOBAL_CONTROLLER, ENCODER) → Encoded_image.dat → Decode.v → Decoded_image.dat → Output.tiff
Testing strategy – result verification
1. 16 neurons, 1 loop, 7-bit
2. 16 neurons, 1 loop, 8-bit
3. 16 neurons, 5 loops, 7-bit
4. 16 neurons, 5 loops, 8-bit
5. 16 neurons, 10 loops, 7-bit
6. 16 neurons, 10 loops, 8-bit
7. 32 neurons, 1 loop, 7-bit
8. 32 neurons, 1 loop, 8-bit
9. 32 neurons, 5 loops, 7-bit
10. 32 neurons, 5 loops, 8-bit
11. 32 neurons, 10 loops, 7-bit
12. 32 neurons, 10 loops, 8-bit
Testing strategy – result verification
Mean square error = Σ [(Ro − Rc)² + (Go − Gc)² + (Bo − Bc)²]
2963.244129    2780.932903
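The reported figures come from a computation of this shape (a sketch with made-up pixel values; the slide's Σ is shown here averaged over the pixels):

```python
def mse(original, reconstructed):
    # Sum of squared channel errors over all pixels, then averaged.
    total = sum((ro - rc) ** 2 + (go - gc) ** 2 + (bo - bc) ** 2
                for (ro, go, bo), (rc, gc, bc) in zip(original, reconstructed))
    return total / len(original)

orig = [(10, 20, 30), (200, 100, 50)]
recon = [(12, 20, 28), (198, 104, 50)]
print(mse(orig, recon))   # 14.0
```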
Screen captures
Synthesis
Synthesis flow (Xilinx ISE Series 4.1i): the Verilog code, constraints and technology libraries go into the synthesis tool, which produces a prototype model and a schematic-optimized net-list (with in-signal and out-signal files).
Synthesis
==================
Chip top-optimized
==================
Summary Information:
--------------------
Type: Optimized implementation
Source: top, up to date
Status: 0 errors, 0 warnings, 0 messages
Export: exported after last optimization
Chip create time: 0.000000 s
Chip optimize time: 598.734000 s
FSM synthesis: ONEHOT

Target Information:
-------------------
Vendor: Xilinx
Family: VIRTEX
Device: V800HQ240
Speed: -4

Chip Parameters:
----------------
Optimize for: Speed
Optimization effort: Low
Frequency: 50 MHz
Is module: No
Keep io pads: No
Number of flip-flops: 3129
Number of latches: 0
FPGA Implementation
Used the following:
• Board – XESS Corporation XSV
• FPGA – Xilinx Virtex V800HQ240
• CPLD – Xilinx XC95108
• Memory – Winbond AS7C4096, 2 banks of 512K × 16 bits
• 1 on-board DIP switch, 2 push buttons, 9 bar-graph LEDs
FPGA Implementation
Upload configuration files and image to the on-board memory
Upload the FPGA bit file to the CPLD
BAR LED 1 glows - FPGA is configured
Press Push Button 1 (START) to start the learning process
BAR LED 2 glows – 2 loops completed
BAR LED 3 glows – 4 loops completed
BAR LED 4 glows – 6 loops completed
BAR LED 5 glows – 10 loops completed
BAR LED 6 glows – Encoding completed
Download the image and convert it to tiff format
Conclusion
1. The 7-bit process is better than the 8-bit process.
2. Suitable for real-time encoding and streaming of video images (about 12 seconds at 5 MHz).
3. Use of the frequency count register gives better images.
4. The more the loops, the better the image (8-bit, beyond 5 loops), similar to human learning.
Recommendations
1. The algorithm can be modified to improve learning time.
2. Real-time video compression with 2 parallel learning chips.
3. Both 7-bit and 8-bit in the same hardware.
4. MSB plane compression.
Q & A