Student : Andrey Kuyel Supervised by Mony Orbach Spring 2011 Final Presentation
High speed digital systems laboratory
High-Throughput FFT
Technion - Israel Institute of Technology
Department of Electrical Engineering
Presentation overview
•Project motivation and goals
•Theory study
•FFT 16/32 core definitions
•Encountered problems
•Selecting the optimal algorithm
•FFT core design and development
•Validation and verification
•Xilinx development board demo
Project goals
•The project goal is to design and implement on an FPGA device an FFT capable of high-rate data processing (rates up to 10 MSamp/sec*).
•The design will be written in VHDL and tested on a Xilinx development board.
•The project combines aspects of signal processing, logic design and high-rate data processing.
* 5 MSamp/sec for each of the I and Q components.
FFT - Theoretical overview
The DFT of a length-N vector x is defined as:
X[k] = sum_{n=0}^{N-1} x[n] * exp(-j*2*pi*k*n/N), k = 0, 1, ..., N-1
The time complexity of the direct DFT is: O(N^2)
The FFT algorithm (first published by J. W. Cooley and John Tukey in 1965) reduces the time complexity of the DFT to: O(N log N)
This algorithm is called "the Cooley–Tukey radix-2 FFT algorithm". It is one of the most common FFT algorithms.
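The relation between the DFT definition and the radix-2 algorithm can be illustrated with a minimal Python sketch. It is not the project's VHDL implementation; the function names `dft` and `fft_radix2` and the test vector are hypothetical and chosen only for illustration.

```python
import cmath

def dft(x):
    """Direct evaluation of the DFT definition: O(N^2) complex operations."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT: O(N log N) operations.

    Assumes len(x) is a power of two.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])   # DFT of the even-indexed samples
    odd = fft_radix2(x[1::2])    # DFT of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]   # twiddle factor times odd part
        out[k] = even[k] + t             # radix-2 butterfly, upper output
        out[k + n // 2] = even[k] - t    # radix-2 butterfly, lower output
    return out

# Arbitrary 16-point complex test vector (illustrative values only).
samples = [complex(i % 5 - 2, (3 * i) % 7 - 3) for i in range(16)]
max_diff = max(abs(a - b) for a, b in zip(dft(samples), fft_radix2(samples)))
print(max_diff < 1e-9)
```

Both functions compute the same transform; only the number of operations differs.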
Radix 4 algorithm
The FFT (N=16) radix-2 data flow
The FFT (N=8) radix-2 data flow
Studying and examining different parallel FFT algorithms
Sixteen-point radix-4 decimation-in-time algorithm
Length-16, decimation-in-frequency, in-order input, radix-4 FFT
FFT core features
The FFT core will have the following features:
•Real and imaginary inputs: 8 bits wide each.
•Real and imaginary outputs: 20 bits wide each, with the 12 MSBs for the integer part and the 8 LSBs for the fractional part.
•Drop-in module for Virtex-6 (xc6vlx240T)
•Forward complex FFT
•Transform sizes N = 16/32
•Arithmetic type: fixed-point
•Truncation after the butterfly
•Natural input/output order
•Input data rate: 10 MSamp/sec (total rate of the real and imaginary parts of the data)
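The 20-bit output format with 8 fractional bits can be illustrated with a short Python sketch. It assumes a signed two's complement word whose sign bit is counted among the 12 integer bits, and truncation by dropping LSBs (rounding toward minus infinity). The slides do not state either detail, and the helper names `to_fixed` and `to_float` are hypothetical.

```python
import math

TOTAL_BITS = 20   # output word width, from the feature list
FRAC_BITS = 8     # fractional bits, from the feature list

def to_fixed(value):
    """Truncate a real value to a signed 20-bit word with 8 fractional bits.

    Assumption: two's complement, truncation toward minus infinity
    (what dropping LSBs does in two's complement hardware).
    """
    raw = math.floor(value * (1 << FRAC_BITS))
    lo = -(1 << (TOTAL_BITS - 1))
    hi = (1 << (TOTAL_BITS - 1)) - 1
    if not lo <= raw <= hi:
        raise OverflowError(f"{value} does not fit in {TOTAL_BITS} bits")
    return raw

def to_float(raw):
    """Convert the integer word back to its real value."""
    return raw / (1 << FRAC_BITS)

resolution = to_float(1)                           # smallest step: 2**-8
value_range = (to_float(-(1 << 19)), to_float((1 << 19) - 1))
print(resolution, value_range)
print(to_fixed(1.5), to_fixed(-0.3), to_float(to_fixed(-0.3)))
```

Under these assumptions the representable range is -2048 to just under 2048 in steps of 1/256, and truncation always moves a value downward by less than one step.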
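FFT core general schematics
[Figure: block diagram of the 16-point complex parallel FFT core. Data inputs: 16 real-part inputs [7:0] (x0_re ... x15_re) and 16 imaginary-part inputs [7:0] (y0_im ... y15_im). Data outputs: 16 FFT real outputs in 20q8 format (fx0_re ... fx15_re) and 16 FFT imaginary outputs in 20q8 format (fy0_im ... fy15_im). Control and status signals: Clock, Start, rst, Done, Edone.]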
Selected FFT 16/32 core algorithm (Minimal DSP slices utilization)
Sixteen-point radix-4 decimation-in-time algorithm
Basic butterfly computation in a radix-4 FFT algorithm
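The radix-4 butterfly and a sixteen-point radix-4 decimation-in-time FFT built from it can be sketched in Python as follows. This is a floating-point illustration under stated assumptions, not the core's VHDL data path; the names `mul_neg_j`, `butterfly4`, `fft_radix4` and `dft` are hypothetical.

```python
import cmath

def mul_neg_j(z):
    """Multiply by -j using only a swap and a sign change (no multiplier)."""
    return complex(z.imag, -z.real)

def butterfly4(a, b, c, d):
    """Radix-4 butterfly on four inputs that are already twiddled.

    y0 = a + b + c + d        y1 = a - j*b - c + j*d
    y2 = a - b + c - d        y3 = a + j*b - c - j*d
    """
    s0, s1 = a + c, a - c
    s2, s3 = b + d, mul_neg_j(b - d)
    return s0 + s2, s1 + s3, s0 - s2, s1 - s3

def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    q = n // 4
    subs = [fft_radix4(x[r::4]) for r in range(4)]   # four interleaved sub-DFTs
    out = [0j] * n
    for k in range(q):
        # Twiddle factors W_N^(r*k) for r = 0..3 (the r = 0 factor is 1).
        tw = [cmath.exp(-2j * cmath.pi * r * k / n) * subs[r][k] for r in range(4)]
        y = butterfly4(*tw)
        for m in range(4):
            out[k + m * q] = y[m]
    return out

def dft(x):
    """Direct DFT, used only as a reference for the check below."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

x16 = [complex((7 * i) % 11 - 5, (5 * i) % 9 - 4) for i in range(16)]
err4 = max(abs(a - b) for a, b in zip(dft(x16), fft_radix4(x16)))
print(err4 < 1e-9)
```

Inside the butterfly only additions, subtractions and a multiplication by -j are needed, and the latter is a swap of real and imaginary parts with a sign change. General complex multiplications appear only in the twiddle step, which is plausibly why a radix-4 structure helps to keep the DSP slice count low.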
XC6VLX240T FPGA utilization
FFT size | Maximal frequency | DSP slices utilization
16 points | 383 MHz (12 GSamp/sec) | 27
32 points | 335 MHz (21 GSamp/sec) | 102 = 27*2 + 16*3
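The throughput figures in parentheses are consistent with one reading of the table: the parallel core accepts a full N-point complex frame on every clock cycle, and the real and imaginary components are counted as separate samples (as in the 10 MSamp/sec project goal). That reading is an assumption, not something the slides state. The short Python check below, with the hypothetical helper `throughput_gsamples`, reproduces the numbers under it.

```python
def throughput_gsamples(clock_mhz, n_points, components=2):
    """Samples per second in units of 1e9.

    Assumption: one N-point frame per clock, real and imaginary
    parts counted as separate samples (components=2).
    """
    return clock_mhz * 1e6 * n_points * components / 1e9

rate_16 = throughput_gsamples(383, 16)   # about 12 GSamp/sec
rate_32 = throughput_gsamples(335, 32)   # about 21 GSamp/sec
dsp_32 = 27 * 2 + 16 * 3                 # DSP slice count quoted for 32 points
print(rate_16, rate_32, dsp_32)
```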
Debugging and verification
•RTL-level Matlab model of the FFT core, giving signal values at each pipeline stage
•Xilinx simulator
•Xilinx development board verification using ChipScope
•Quantization error estimation against the Matlab double-precision FFT
•Validation of operation at maximal frequency
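The quantization error estimation step can be sketched in Python in place of the Matlab model. The sketch runs the same radix-2 FFT twice, once in double precision and once with every butterfly output truncated to 8 fractional bits, and compares the results. The radix-2 structure, the unquantized twiddle factors, the random 8-bit stimulus and the names `trunc8` and `fft` are all assumptions for illustration; the actual core may differ on each point.

```python
import cmath
import math
import random

FRAC_BITS = 8   # fractional bits kept after each butterfly

def trunc8(z):
    """Truncate real and imaginary parts to 8 fractional bits (toward -inf)."""
    scale = 1 << FRAC_BITS
    return complex(math.floor(z.real * scale) / scale,
                   math.floor(z.imag * scale) / scale)

def fft(x, quantize=lambda z: z):
    """Radix-2 DIT FFT; `quantize` is applied to every butterfly output."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], quantize)
    odd = fft(x[1::2], quantize)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = quantize(even[k] + t)
        out[k + n // 2] = quantize(even[k] - t)
    return out

random.seed(0)   # fixed seed so the stimulus is repeatable
stimulus = [complex(random.randint(-128, 127), random.randint(-128, 127))
            for _ in range(16)]            # 8-bit signed I and Q inputs

reference = fft(stimulus)                  # double-precision reference
fixed = fft(stimulus, quantize=trunc8)     # truncated after each butterfly
max_err = max(abs(a - b) for a, b in zip(reference, fixed))
print(max_err)
```

Each truncation loses less than 1/256 per component, and the loss can double through each later stage, so for 16 points the error in this model stays well below 0.1.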
Xilinx development board design validation
[Figure: block diagram of the validation design. Blocks: stimulus ROM, input data control logic, data path, FFT 16 points, FFT results memory, output data control logic, data path, PLL frequency multiplier, ChipScope to PC. Clock labels: input clock; increased clock to all modules.]
Matlab results comparison
Verification of results between the Matlab fft function and the 32-point FFT core running at 320 MHz on the Xilinx development board
FFT 16/32 core design validation and error estimation
Imaginary part: Matlab fft vs. FFT core
Quantization error estimation
FFT 16/32 core Xilinx development board demo
[Figure: block diagram of the demo design. Labels: FFT 32/16 core; real data; imag data; transform real data; transform imag data; 4 different signals bank A; 4 different signals bank B; wrap around; error estimation; PLL; operational FFT clock; input clock.]