OUTLINE
Mathematics of different DSP algorithms
Generalization of DSP algorithms
Systolic array architecture
Logic block architecture & pipelining
Mapping different algorithms to hardware
Statistics
DSP TRANSFORMS
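DFT: $X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi nk/N}$
DCT: $X(k) = \alpha_k \sum_{n=0}^{N-1} x(n) \cos\!\left[\frac{\pi (2n+1) k}{2N}\right]$
DST: $X(k) = \sum_{n=0}^{N-1} x(n) \sin\!\left[\frac{\pi (n+1)(k+1)}{N+1}\right]$
DHT: $X(k) = \sum_{n=0}^{N-1} x(n) \left[\cos(2\pi nk/N) + \sin(2\pi nk/N)\right]$
where $\alpha_k = 1/\sqrt{N}$ for $k = 0$ and $\alpha_k = \sqrt{2/N}$ for $k = 1$ to $N-1$.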
MATH BACKGROUND
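Let the transform function be δ(n,k). Then
$X(k) = \sum_{n=0}^{N-1} x(n)\,\delta(n,k)$
where each term of the sum is the product of the input sample x(n) and the kernel value δ(n,k).
For k = 0:
$X(0) = \sum_{n=0}^{N-1} x(n)\,\delta(n,0)$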
For a 4 point transform:
X(0) = x(0)δ(0,0) + x(1)δ(1,0) + x(2)δ(2,0) + x(3)δ(3,0)
X(1) = x(0)δ(0,1) + x(1)δ(1,1) + x(2)δ(2,1) + x(3)δ(3,1)
X(2) = x(0)δ(0,2) + x(1)δ(1,2) + x(2)δ(2,2) + x(3)δ(3,2)
X(3) = x(0)δ(0,3) + x(1)δ(1,3) + x(2)δ(2,3) + x(3)δ(3,3)
Consider the DFT:
$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi nk/N}$
$= \sum_{n=0}^{N-1} x(n) \left[\cos(2\pi nk/N) - j \sin(2\pi nk/N)\right]$
$= \sum_{n=0}^{N-1} \left[x(n)\cos(2\pi nk/N) - j\, x(n)\sin(2\pi nk/N)\right]$
This is similar to the Hartley transform, except that the second term is multiplied by the coefficient -j:
$X_h(k) = \sum_{n=0}^{N-1} \left[x(n)\cos(2\pi nk/N) + x(n)\sin(2\pi nk/N)\right]$
Generalizing the transforms:
$X(k) = \sum_{n=0}^{N-1} x(n)\,\delta(n,k)$
where δ(n,k) is:
DFT: cos(2*pi*nk/N) - j*sin(2*pi*nk/N)
DCT: αk*cos[pi(2n+1)k/2N]
DST: sin[pi(n+1)(k+1)/(N+1)]
DHT: cos(2*pi*nk/N) + sin(2*pi*nk/N)
LUT ENTRIES
TRANSFORM   δ1(n,k)                           δ2(n,k)
DFT         cos(2*pi*nk/N)                    -j*sin(2*pi*nk/N)
DCT         αk*cos[pi(2n+1)k/2N]              0
DST         cos[pi/2 - pi(n+1)(k+1)/(N+1)]    0
DHT         cos(2*pi*nk/N)                    sin(2*pi*nk/N)
SAD         1                                 -1
δ(n,k) = δ1(n,k) + δ2(n,k)
SYSTOLIC ARRAY ARCHITECTURE
[Figure: 4 x 4 systolic array. Cell (n,k) holds δ(n,k); inputs x(0) to x(3) enter the rows and outputs X(0) to X(3) leave the columns.]
x(0)   δ(0,0)  δ(0,1)  δ(0,2)  δ(0,3)
x(1)   δ(1,0)  δ(1,1)  δ(1,2)  δ(1,3)
x(2)   δ(2,0)  δ(2,1)  δ(2,2)  δ(2,3)
x(3)   δ(3,0)  δ(3,1)  δ(3,2)  δ(3,3)
       X(0)    X(1)    X(2)    X(3)
δ(n,k) = δ1(n,k) + δ2(n,k)
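A systolic array of this kind can be simulated in software as a wavefront: cell (n,k) multiplies x(n) by δ(n,k) and adds the partial sum arriving from the cell above it. The sketch below assumes one particular dataflow (partial sums move down each column, and cell (n,k) fires on cycle n + k); the slides do not specify the timing, so the schedule and all names are assumptions.

```c
#define N 4  /* array dimension, matching the 4-point examples */

/* Simulates one pass of an N x N array (assumed dataflow).
   delta[n][k] is the coefficient stored in cell (n,k); x[n] feeds row n;
   X[k] is read from the bottom of column k. */
void systolic_transform(double delta[N][N], const double x[N], double X[N]) {
    double psum[N][N];  /* partial sum leaving cell (n,k) */
    for (int cycle = 0; cycle <= 2 * (N - 1); cycle++) {
        for (int n = 0; n < N; n++) {
            int k = cycle - n;  /* cell (n,k) fires when n + k == cycle */
            if (k < 0 || k >= N) {
                continue;
            }
            /* The cell above fired one cycle earlier, so its sum is ready. */
            double from_above = (n == 0) ? 0.0 : psum[n - 1][k];
            psum[n][k] = from_above + x[n] * delta[n][k];
        }
    }
    for (int k = 0; k < N; k++) {
        X[k] = psum[N - 1][k];  /* completed sum at the bottom of column k */
    }
}
```

Under this schedule every cell is busy on exactly one of the 2N - 1 cycles of a pass, and successive input vectors could be pipelined behind one another.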
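LOGIC BLOCK OF THE FPGA
[Figure: logic block. Labeled elements: a LUT addressed by k1 and k2, two multipliers (×), a +/- unit, an adder (+) with an input "From other LB", data inputs x(n) and x(n)/y(n), and three Mode control inputs.]
δ(n,k) = δ1(n,k1) + δ2(n,k2)
Mode = SAD selection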
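One possible behavioural reading of this logic block is sketched below. It is an assumption, not the deck's design: the second multiplier is taken to see x(n) in transform mode and y(n) in SAD mode, the +/- stage is modeled as a sum followed by an absolute value in SAD mode, and δ2 is treated as real-valued (as in the DHT). The names `lb_mode` and `logic_block` are hypothetical.

```c
#include <math.h>

typedef enum { MODE_TRANSFORM, MODE_SAD } lb_mode;

/* Behavioural model of one logic block (an assumed reading of the diagram).
   d1, d2        : LUT outputs delta1(n,k1) and delta2(n,k2)
   x, y          : data inputs; the second multiplier sees x in transform
                   mode and y in SAD mode
   from_other_lb : partial result arriving from a neighbouring block */
double logic_block(lb_mode mode, double d1, double d2,
                   double x, double y, double from_other_lb) {
    double second_in = (mode == MODE_SAD) ? y : x;
    double p1 = x * d1;          /* first multiplier */
    double p2 = second_in * d2;  /* second multiplier */
    double combined = p1 + p2;   /* +/- stage */
    if (mode == MODE_SAD) {
        combined = fabs(combined);  /* |x - y| when d1 = 1 and d2 = -1 */
    }
    return combined + from_other_lb;  /* final adder */
}
```

With the SAD LUT entries 1 and -1, the block yields |x - y| plus the incoming partial sum. With DHT entries it yields x*(δ1 + δ2) plus the incoming partial sum.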
WHAT'S NEW IN THIS?
Customized for transforms
C-code CAD: intuitive
Systolic array architecture: suited for transforms
Easier routing
Specific data path that supports transforms
Better performance
Better utilization of resources
DFT SOFTWARE CODE
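// DFT of a real input x[0..N-1]; Xre[i] holds X(i), Xim[i] holds X*(i)
#include <math.h>
void dft(const double *x, double *Xre, double *Xim, int N) {
    const double pi = 3.14159265358979323846;
    for (int i = 0; i < N; i++) {
        Xre[i] = 0.0; Xim[i] = 0.0;
        for (int j = 0; j < N; j++) {
            Xre[i] += x[j] * cos(2 * pi * i * j / N);   // δ1 term
            Xim[i] -= x[j] * sin(2 * pi * i * j / N);   // δ2 term: -j*sin
        }
    }
}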
DFT – HIGHER LEVEL
[Figure: mapping of the 4-point DFT onto eight logic blocks. Each block holds the δ1 and δ2 entries for two inputs and one output index k. Inputs x(0) to x(3) feed the array.]
Block for inputs x(0), x(1) and output index k:  δ1(0,k) δ1(1,k) / δ2(0,k) δ2(1,k)
Block for inputs x(2), x(3) and output index k:  δ1(2,k) δ1(3,k) / δ2(2,k) δ2(3,k)
Outputs for k = 0 to 3: X(k), X*(k)
DCT/DST SOFTWARE CODE
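// Unscaled DCT sum (αk not applied). For the DST, use sin(pi*(j+1)*(i+1)/(N+1)) instead.
#include <math.h>
void dct(const double *x, double *X, int N) {
    const double pi = 3.14159265358979323846;
    for (int i = 0; i < N; i++) {
        X[i] = 0.0;
        for (int j = 0; j < N; j++) {
            X[i] += x[j] * cos(pi * (2 * j + 1) * i / (2.0 * N));
        }
    }
}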
DCT/DST – HIGHER LEVEL
[Figure: mapping of the 4-point DCT/DST onto logic blocks, drawn as two identical arrays. Each block holds the δ entries for two inputs and one output index k.]
            k = 0           k = 1           k = 2           k = 3
x(0), x(1)  δ(0,0) δ(1,0)   δ(0,1) δ(1,1)   δ(0,2) δ(1,2)   δ(0,3) δ(1,3)
x(2), x(3)  δ(2,0) δ(3,0)   δ(2,1) δ(3,1)   δ(2,2) δ(3,2)   δ(2,3) δ(3,3)
            X(0)            X(1)            X(2)            X(3)
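DHT SOFTWARE CODE
// DHT of a real input x[0..N-1]
#include <math.h>
void dht(const double *x, double *X, int N) {
    const double pi = 3.14159265358979323846;
    for (int i = 0; i < N; i++) {
        X[i] = 0.0;
        for (int j = 0; j < N; j++) {
            double theta = 2 * pi * i * j / N;
            X[i] += x[j] * (cos(theta) + sin(theta));
        }
    }
}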
DHT – HIGHER LEVEL
[Figure: 4 x 4 array for the DHT. Cell (n,k) holds δ(n,k); inputs x(0) to x(3) enter the rows and outputs X(0) to X(3) leave the columns.]
x(0)   δ(0,0)  δ(0,1)  δ(0,2)  δ(0,3)
x(1)   δ(1,0)  δ(1,1)  δ(1,2)  δ(1,3)
x(2)   δ(2,0)  δ(2,1)  δ(2,2)  δ(2,3)
x(3)   δ(3,0)  δ(3,1)  δ(3,2)  δ(3,3)
       X(0)    X(1)    X(2)    X(3)
LOGIC BLOCK LEVEL
[Figure: logic block for DCT/DFT/DST. Labeled elements: LUT (L), two multipliers (×), three MUXes and two adders (+). The LUT receives θ1 and θ2 and outputs cos(θ1) and cos(θ2); the multipliers take x1 and x2 and produce x1cos(θ1) and x2cos(θ2). Labels at the adder stage: x1cos(θ1), x2cos(θ2), X+.]
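DHT – LOGIC BLOCK LEVEL
[Figure: logic block for DHT. Labeled elements: LUT (L), two multipliers (×), three MUXes and two adders (+). The LUT receives θ1 and 90-θ1 and outputs cos(θ1) and cos(90-θ1) = sin(θ1); both multipliers take x1 and produce x1cos(θ1) and x1cos(90-θ1). Labels at the adder stage: x1cos(θ1), x1cos(90-θ1), X+.]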
FPGA OR DSP
Sample rate > MHz? FPGA
Context switch? DSP/FPGA
Floating point? DSP
C code? DSP
FPGA
FUTURE WORK