FFT Chip design Midterm presentation -...

FFT Chip design Final presentation

Supervisor:

Leon Polishuk

Students :

Andrey Kual

Asher Pilai

28/02/2011

Project goal

• Design an efficient FFT (Fast Fourier Transform) IP core.

Project requirement

• Design a 1024-length FFT.

• Input and output coefficient vectors are represented in fixed-point precision. By default each coefficient is 32 bits long: 16 integer bits and 16 fractional bits (parametrically configurable).

• The design will be implemented in VHDL, with a reasonable trade-off between performance and area consumption.
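The default 32-bit format with 16 integer and 16 fractional bits can be illustrated with a minimal C sketch. The type name `q16_16` and the helper names are hypothetical, and truncation after the multiply is an assumption; the slides do not state the rounding mode used in the core.

```c
#include <stdint.h>

/* Assumed format: signed Q16.16 (16 integer bits, 16 fractional bits). */
#define FRAC_BITS 16

typedef int32_t q16_16;

/* Convert a double to Q16.16, rounding to the nearest representable value. */
static q16_16 q_from_double(double v)
{
    double scaled = v * (double)(1 << FRAC_BITS);
    return (q16_16)(scaled + (v >= 0.0 ? 0.5 : -0.5));
}

/* Convert a Q16.16 value back to a double. */
static double q_to_double(q16_16 q)
{
    return (double)q / (double)(1 << FRAC_BITS);
}

/* Multiply two Q16.16 values: the 64-bit product carries 32 fractional
 * bits, so it is shifted right by FRAC_BITS. The low bits are dropped
 * (assumes an arithmetic right shift for negative values, as on
 * mainstream compilers). No overflow check is performed. */
static q16_16 q_mul(q16_16 a, q16_16 b)
{
    int64_t p = (int64_t)a * (int64_t)b;
    return (q16_16)(p >> FRAC_BITS);
}
```

For example, `q_mul(q_from_double(1.5), q_from_double(2.0))` yields the Q16.16 encoding of 3.0.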

Theoretical overview – The FFT algorithm

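• The DFT (N-length vector) definition:

\[ X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{nk}, \quad 0 \le k \le N-1, \quad W_N = \exp\left(-\frac{2\pi i}{N}\right) \]

• The FFT (N-length vector) definition:

\[ X[k] = Y[k] + W_N^{k}\, Z[k], \quad 0 \le k \le \frac{N}{2}-1 \]

\[ X\left[k+\frac{N}{2}\right] = Y[k] - W_N^{k}\, Z[k], \quad 0 \le k \le \frac{N}{2}-1 \]

Where:

\[ Y[k] = \sum_{n=0}^{N/2-1} x_{\mathrm{even}}[n]\, W_{N/2}^{nk}, \qquad Z[k] = \sum_{n=0}^{N/2-1} x_{\mathrm{odd}}[n]\, W_{N/2}^{nk} \]

with \( x_{\mathrm{even}}[n] = x[2n] \) and \( x_{\mathrm{odd}}[n] = x[2n+1] \).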

Theoretical overview – FFT vs. DFT time complexity

• This algorithm has better time complexity than the direct DFT algorithm: \(N \log N\) vs. \(N^2\) complex multiplications.
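To put numbers on the comparison for the 1024-point case, the following sketch counts complex multiplications. The helper names are hypothetical. The FFT count assumes a radix-2 structure with N/2 butterflies per stage and one complex multiplication per butterfly, which is of the same order as N log N.

```c
#include <stdint.h>

/* Complex multiplications for a direct N-point DFT: N * N. */
static uint64_t dft_mults(uint64_t n)
{
    return n * n;
}

/* log2 of a power of two, computed by repeated halving. */
static unsigned log2_pow2(uint64_t n)
{
    unsigned m = 0;
    while (n > 1) {
        n >>= 1;
        m++;
    }
    return m;
}

/* Radix-2 FFT: log2(N) stages, N/2 butterflies per stage,
 * one complex multiplication per butterfly (assumption). */
static uint64_t fft_mults(uint64_t n)
{
    return (n / 2) * log2_pow2(n);
}
```

For N = 1024 this gives 1,048,576 multiplications for the direct DFT against 5,120 for the radix-2 FFT.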

Development stages

• Verification of the FFT C code for correctness and accuracy.

• Implementation of the C algorithm as hardware: designing the FSM and data path, implementation in VHDL.

• Mapping all parts of the FFT together.

• Debug and verification: comparing VHDL simulation results to the C dump.

• Verifying the accuracy of the algorithm by comparing it to the MATLAB and C (rounded) equivalents.

Radix-2 “Butterfly” compute unit
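A single radix-2 butterfly on fixed-point data might look like the sketch below. It assumes Q16.16 samples and a twiddle given as c + i*s in the same format; the names `cq`, `qmul` and `butterfly` are hypothetical, and overflow handling and rounding, which a hardware unit would have to define, are left out.

```c
#include <stdint.h>

/* Assumed complex sample type: real and imaginary parts in Q16.16. */
typedef struct {
    int32_t re;
    int32_t im;
} cq;

/* Q16.16 multiply with a 64-bit intermediate; low bits are dropped. */
static int32_t qmul(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> 16);
}

/* One radix-2 decimation-in-time butterfly, in place:
 *   t    = w * bot
 *   bot' = top - t
 *   top' = top + t                                      */
static void butterfly(cq *top, cq *bot, cq w)
{
    int32_t t_re = qmul(w.re, bot->re) - qmul(w.im, bot->im);
    int32_t t_im = qmul(w.im, bot->re) + qmul(w.re, bot->im);

    bot->re = top->re - t_re;
    bot->im = top->im - t_im;
    top->re = top->re + t_re;
    top->im = top->im + t_im;
}
```

The four real multiplications and the add/subtract pairs correspond to the `t1`, `t2` updates in the C reference code of the algorithm.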

Radix-2 time distribution (due to memory access limitations)

Data flow diagram for N=8: a decimation-in-time radix-2 FFT

Data flow diagram for N=16: a decimation-in-time radix-2 FFT

Twiddles symmetry

• Due to the symmetry of cosine/sine, we store only a quarter of a period in the ROM.

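FFT algorithm

    /* FFT butterfly stages: x[] holds the real parts, y[] the imaginary parts */
    n1 = 0;
    n2 = 1;

    for (i = 0; i < m; i++)              /* m = log2(1024) = 10 stages */
    {
        n1 = n2;                         /* distance between the two butterfly inputs */
        n2 = n2 + n2;                    /* sub-DFT length at this stage */
        e = -6.283185307179586 / n2;     /* twiddle angle step: -2*pi/n2 */
        a = 0.0;

        for (j = 0; j < n1; j++)         /* one pass per distinct twiddle */
        {
            c = cos(a);
            s = sin(a);
            a = a + e;

            for (k = j; k < n; k = k + n2)   /* butterflies sharing this twiddle */
            {
                t1 = c*x[k+n1] - s*y[k+n1];
                t2 = s*x[k+n1] + c*y[k+n1];
                x[k+n1] = x[k] - t1;
                y[k+n1] = y[k] - t2;
                x[k] = x[k] + t1;
                y[k] = y[k] + t2;
            }
        }
    }

    return;
    }                                    /* end of the FFT routine */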
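The loop nest above is a decimation-in-time form, which in the usual in-place formulation expects its input in bit-reversed order. The slides do not show that step, so the following is only a sketch of one common way to do it; the function name `bit_reverse_permute` and its signature are assumptions, not part of the presented design.

```c
#include <stddef.h>

/* Reorder x[] (real) and y[] (imaginary) so that element i ends up at
 * the index obtained by reversing the m low bits of i, where n == 1 << m. */
static void bit_reverse_permute(double *x, double *y, size_t n, unsigned m)
{
    for (size_t i = 0; i < n; i++) {
        /* Build the m-bit reversal of i. */
        size_t r = 0;
        for (unsigned b = 0; b < m; b++) {
            if (i & ((size_t)1 << b))
                r |= (size_t)1 << (m - 1 - b);
        }

        /* Swap each pair once. */
        if (r > i) {
            double tx = x[i];
            double ty = y[i];
            x[i] = x[r];
            y[i] = y[r];
            x[r] = tx;
            y[r] = ty;
        }
    }
}
```

For n = 8 the index order 0, 1, 2, 3, 4, 5, 6, 7 becomes 0, 4, 2, 6, 1, 5, 3, 7.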

General block structure

Controllers design

4 controllers:

• Top level controller (Data flow controller)

• 3 FFT algorithm controllers, one for each loop:

• The outer loop steps through the stages (each column in slide #9).

• The middle loop steps through "flights" (butterflies with the same twiddle from each short-length DFT at each stage)

• The inner loop steps through the individual butterflies.

Note: This ordering minimizes the number of fetches or computations of the twiddle-factor values.
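The three loop controllers can be modeled in software as an address generator. The sketch below walks the same stage / flight / butterfly order and counts how often a twiddle is fetched. The type and function names, the callback interface, and the indexing of twiddles into an N-entry table are assumptions made for illustration.

```c
#include <stddef.h>

/* Totals collected while walking the schedule. */
typedef struct {
    size_t butterflies;      /* butterfly operations issued */
    size_t twiddle_fetches;  /* one fetch per flight */
} fft_schedule_stats;

/* Hypothetical callback: receives the stage, the twiddle index k of
 * W_N^k, and the addresses of the two butterfly operands. */
typedef void (*butterfly_fn)(unsigned stage, size_t twiddle_index,
                             size_t top, size_t bottom, void *ctx);

static fft_schedule_stats walk_schedule(size_t n, unsigned m,
                                        butterfly_fn visit, void *ctx)
{
    fft_schedule_stats st = {0, 0};
    size_t n1;
    size_t n2 = 1;

    for (unsigned stage = 0; stage < m; stage++) {   /* outer: stages */
        n1 = n2;
        n2 = 2 * n2;
        for (size_t j = 0; j < n1; j++) {            /* middle: flights */
            size_t tw = j * (n / n2);                /* W_n2^j == W_N^(j*N/n2) */
            st.twiddle_fetches++;
            for (size_t k = j; k < n; k += n2) {     /* inner: butterflies */
                if (visit)
                    visit(stage, tw, k, k + n1, ctx);
                st.butterflies++;
            }
        }
    }
    return st;
}
```

With n = 1024 and m = 10 this schedule issues 5,120 butterflies but only 1,023 twiddle fetches, which is the saving the note above refers to.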

Twiddle ROM reduction

• The twiddles are almost symmetrical, so the twiddle ROM is reduced to a quarter of its original length.

• Number of twiddle coefficients: 1024/4 = 256

• The reduction in twiddle ROM size leads to FFT accuracy deterioration and symmetry distortion.
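One way to rebuild a full twiddle from a quarter-period table is sketched below. It assumes the ROM holds sin(2*pi*q/N) for q = 0 .. N/4-1 (256 entries for N = 1024); the function name and the use of `double` are illustrative choices. Because sin(pi/2) = 1 falls just outside a table of that length, the sketch special-cases it, which is one possible reading of why the symmetry is only "almost" exact.

```c
#include <stddef.h>

/* Rebuild W_N^k = cos(2*pi*k/N) - i*sin(2*pi*k/N) for 0 <= k < N/2
 * from quarter_sin[q] = sin(2*pi*q/N), 0 <= q < N/4 (assumed layout). */
static void twiddle_from_quarter(const double *quarter_sin, size_t n, size_t k,
                                 double *re, double *im)
{
    size_t quarter = n / 4;
    double c;
    double s;

    if (k < quarter) {
        /* First quadrant: sin is read directly,
         * cos(theta) = sin(pi/2 - theta). */
        s = quarter_sin[k];
        c = (k == 0) ? 1.0 : quarter_sin[quarter - k];
    } else {
        /* Second quadrant: theta = pi/2 + phi, so
         * sin(theta) = cos(phi) and cos(theta) = -sin(phi). */
        size_t phi = k - quarter;
        s = (phi == 0) ? 1.0 : quarter_sin[quarter - phi];
        c = -quarter_sin[phi];
    }

    *re = c;
    *im = -s;
}
```

In a fixed-point ROM the stored values are quantized, so the reconstructed twiddles inherit that quantization error in every quadrant.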

Debugging and Verification

Input signal used to compute the FFT: sin(2*pi*5*t)

[Figure: original signal in time domain; time axis 0 to 2, amplitude -1 to 1]

FFT of sine using FFT IP core (Real part)

[Figure: real part of the FFT IP core output; horizontal axis 0 to 1200, vertical axis -2 to 16]

Error between MATLAB FFT function and FFT IP core for sin(10*pi*t) (Real part)

[Figure: error plot; horizontal axis 0 to 1200, vertical axis 0 to 2.5 x 10^-3]

Error between MATLAB FFT function and FFT IP core for sin(10*pi*t)

[Figure: error plot; horizontal axis 0 to 1200, vertical axis 0 to 2.5 x 10^-3]

Error statistics for the FFT IP core vs. the MATLAB double-precision FFT

• Mean error: 4.51032e-04

• Real part max error: 0.002152

• Imaginary part max error: 0.002275
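Statistics of this kind can be gathered with a short comparison routine such as the sketch below. The slides do not say how the mean error was defined, so the mean absolute difference used here is an assumption, as are the type and function names. The routine would be called once for the real parts and once for the imaginary parts.

```c
#include <stddef.h>

/* Summary of the element-wise difference between two result vectors. */
typedef struct {
    double mean_abs;  /* mean absolute error (assumed definition) */
    double max_abs;   /* largest absolute error */
} error_stats;

/* Compare a reference vector (e.g. double-precision results) against
 * the values produced by the design under test. */
static error_stats compare_outputs(const double *ref, const double *dut, size_t n)
{
    error_stats st = {0.0, 0.0};

    for (size_t i = 0; i < n; i++) {
        double d = dut[i] - ref[i];
        if (d < 0.0)
            d = -d;
        st.mean_abs += d;
        if (d > st.max_abs)
            st.max_abs = d;
    }
    if (n > 0)
        st.mean_abs /= (double)n;
    return st;
}
```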

Open issues

• Replace register-based memories with pre-synthesized ones.

• Pass synthesis with the new memories.

• Pass place and route.

• Write project book.
