IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-22, NO. 2, APRIL 1974
Fast Convolution Using Fermat Number Transforms with Applications to Digital Filtering

Abstract-The structure of transforms having the convolution property is developed. A particular transform is proposed that is defined on a finite ring of integers with arithmetic carried out modulo Fermat numbers. This Fermat number transform (FNT) is ideally suited to digital computation, requiring on the order of $N \log N$ additions, subtractions and bit shifts, but no multiplications. In addition to being efficient, the Fermat number transform implementation of convolution is exact, i.e., there is no roundoff error. There is a restriction on sequence length imposed by word length, but multidimensional techniques are discussed which overcome this limitation. Results of an implementation on the IBM 370/155 are presented and compared with the fast Fourier transform (FFT), showing a substantial improvement in efficiency and accuracy.
I. INTRODUCTION

The convolution property of certain transforms can be used to efficiently compute the cyclic convolution of two long discrete sequences. One such transform is the discrete Fourier transform (DFT) with the fast Fourier transform (FFT) implementation [1], which operates on sequences in the complex number field. To compute the DFT of a sequence of length $N$ requires on the order of $(N/2)\log_2(N/2)$ complex multiplications, which often take most of the computation time. In addition, the FFT implementation of cyclic convolution introduces significant amounts of roundoff error [2], thus deteriorating the signal-to-noise ratio. When working with digital machines, the data are available only with some finite precision and, therefore, without loss of generality, the data can be considered to be integers with some upper bound. In this digital domain, the complex number field of the continuous domain can be replaced with a finite field or, more generally, a finite ring of integers with addition and multiplication modulo some integer $F$. In this ring, using transforms similar to the DFT, the cyclic convolution can be performed very efficiently and without any roundoff error. One such transform presented here is a number theoretic transform which operates in the ring of integers modulo a Fermat number. Computation of this transform of length $N$, analogous to an FFT, requires on the order of $N \log_2 N$ additions, bit shifts, and subtractions, but no multiplication. This is considerably simpler than the FFT.
Knuth [3] has proposed the use of transforms in finite fields. Pollard [4] discussed transforms having the cyclic convolution property in a finite field and also gives the conditions for having transforms with the cyclic convolution property in a finite ring of integers. Good [5] also mentioned the use of transforms in a finite ring of integers. Schonhage and Strassen [6] defined transforms having the cyclic convolution property modulo a Fermat number and discussed their application to fast multiplication of very large integers. Knuth [7] elaborated on the work of Schonhage and Strassen. Nicholson [8] presented an algebraic theory of finite Fourier transforms in any ring and established fast FFT-type algorithms to compute these transforms. Rader [9], [10] proposed finite transforms in rings of integers modulo both Mersenne and Fermat numbers. He first proposed the application to digital signal processing, showed that the transforms could be calculated using only additions and bit shifting, showed the word-length constraint, and suggested two-dimensional transforms as a possible relaxation of that constraint. Agarwal and Burrus [11] defined Fermat transforms and also proposed their application for fast digital convolution.

Manuscript received July 16, 1973; revised November 19, 1973. This work was supported by NSF Grant GK-23697.
The authors are with the Department of Electrical Engineering, Rice University, Houston, Tex. 77001.
The purpose of this paper is to develop the general structure of transforms with the convolution property and then present the number theoretic transform modulo a Fermat number as a practical scheme for use with digital computation. The properties are then discussed for the implementation of digital filters and the results of a software implementation on an IBM 370/155 are given. It is believed that the presentation here is different from previous works and gives more insight into these transforms. One result of this approach allows the maximum length of sequences that can be convolved to be increased by a factor of 2 over what had previously been reported in [10].
The choice of terminology in a new area is always difficult. In this paper the general name, number-theoretic transform, is used for any transform using number-theoretic concepts in its definition. The two particular transforms with arithmetic modulo Mersenne and Fermat numbers are called Mersenne number transforms (MNT's) and Fermat number transforms (FNT's). The particular MNT's and FNT's with a basis function $\alpha = 2$ are called Rader transforms (RT's).
These transforms are truly digital transforms, taking into account the quantization in amplitude and the finite precision of digital signals. They bear the same relation to digital signals as the DFT does to discrete-time or sampled data signals and the Fourier or Laplace transforms do to continuous-time signals. In the same manner that the relation of discrete-time signals to continuous-time signals through sampling involves a possible folding or aliasing in the frequency domain, the relation of calculations with the DFT to calculations with the number theoretic transforms involves a possible folding of the amplitude that must be taken into account.
In Section II, the structures of transforms having the cyclic and the noncyclic convolution properties are developed. It is shown that what we call the DFT structure is the only structure which supports the cyclic convolution property and any transform having the DFT structure will have the cyclic convolution property. In Section III, the necessary and sufficient conditions for the existence of transforms in a ring of integers modulo an integer $F$ are established. In Section IV, FNT's are defined which, computationally, appear to be the best transforms for implementing convolution. In Section V, reference is given to a two-dimensional transform implementation for long one-dimensional convolution. In Section VI, a hardware implementation to compute the FNT is suggested. Comparison of the FNT with the FFT implementation of the DFT, to implement convolution, is made in Section VII. In Section VIII, a software implementation of the FNT on an IBM 370/155 is presented and timing comparisons are made with the FFT. For cyclic convolution lengths up to 256, FNT implementations are faster by a factor of 3 to 5 over the best FFT implementations which make use of the symmetry of the DFT for real data. For computers with slower multiplication, the advantages look even better. Finally, several generalizations and applications are presented.
II. THE STRUCTURE OF TRANSFORMS HAVING THE CONVOLUTION PROPERTY

In this section the structure of transforms having the cyclic convolution property, i.e., the transform of the cyclic convolution of two sequences is equal to the product of their transforms, is established. The class of transforms considered is the set of linear nonsingular (invertible) transforms which map a sequence of length $N$ to another sequence of length $N$. It is shown that what we call the DFT structure is the only structure which supports the cyclic convolution property and, moreover, any transform having the DFT structure will have the cyclic convolution property.
Let $x_n$ and $h_n$, $n = 0, 1, \ldots, N-1$, be two sequences which are to be cyclically convolved as given by

$$y_n = x_n * h_n = \sum_{k=0}^{N-1} x_k h_{n-k}, \qquad n = 0, 1, \ldots, N-1. \tag{1}$$

This definition assumes the sequences are periodically extended with period $N$ or the indices are evaluated modulo $N$. Here we will describe sequences as vectors of length $N$. Since only linear invertible transforms are considered, they can be represented by an $N \times N$ nonsingular matrix $T$ whose elements are $t_{k,m}$, $k, m = 0, 1, \ldots, N-1$. Let capital letters denote the transformed sequences, i.e.,

$$X = Tx, \qquad H = Th, \qquad Y = Ty. \tag{2}$$

Consider the conditions on the $t_{k,m}$'s so that

$$Y = X \otimes H \tag{3}$$

where $\otimes$ denotes term by term multiplication of vectors. Equations (2) and (1) are combined to give

$$Y_k = \sum_{n=0}^{N-1} t_{k,n}\, y_n = \sum_{n=0}^{N-1} t_{k,n} \sum_{m=0}^{N-1} x_m h_{n-m} = \sum_{m=0}^{N-1} \sum_{l=0}^{N-1} x_m h_l\, t_{k,m+l}$$

$$X_k = \sum_{m=0}^{N-1} t_{k,m}\, x_m$$

$$H_k = \sum_{l=0}^{N-1} t_{k,l}\, h_l, \qquad k = 0, 1, \ldots, N-1. \tag{4}$$
The individual terms in (3) can be rewritten as

$$Y_k = X_k H_k$$

which, using (4), becomes

$$\sum_{m=0}^{N-1} \sum_{l=0}^{N-1} x_m h_l\, t_{k,m+l} = \sum_{m=0}^{N-1} \sum_{l=0}^{N-1} x_m h_l\, t_{k,m}\, t_{k,l}. \tag{5}$$

Matching the multiples of $x_m h_l$ on both sides of (5) gives

$$t_{k,m+l} = t_{k,m}\, t_{k,l}, \qquad k, l, m = 0, 1, \ldots, N-1. \tag{6}$$

Repeated applications of (6) gives

$$t_{k,m} = t_{k,1}^{\,m}, \qquad k, m = 0, 1, \ldots, N-1. \tag{7}$$

And, since the convolution is cyclic, the indices in (6) are added modulo $N$. This gives

$$t_{k,m}^{\,N} = 1, \qquad k, m = 0, 1, \ldots, N-1. \tag{8}$$

Therefore, the $t_{k,m}$'s are $N$th roots of unity. For $T$ to be nonsingular, all the $t_{k,1}$'s should be distinct and, since there are only $N$ distinct $N$th roots of unity, the $t_{k,1}$'s must be these $N$ distinct roots. Without loss of generality $t_{1,1}$ can be taken as a root of order $N$, i.e., $N$ is the least positive integer such that $t_{1,1}^{\,N} = 1$. Therefore, all $t_{k,1}$'s can be written as some powers of $t_{1,1}$. Again without loss of generality, the rows of $T$ can be arranged so that

$$t_{k,1} = t_{1,1}^{\,k}. \tag{9}$$

Combining (7) and (9) and denoting $t_{1,1}$ by $\alpha$ gives the following:

$$t_{k,m} = \alpha^{km}, \qquad k, m = 0, 1, \ldots, N-1. \tag{10}$$

The structure thus established causes the transform to be orthogonal. The elements $\tilde{t}_{k,m}$ of $T^{-1}$ are given by

$$\tilde{t}_{k,m} = N^{-1} \alpha^{-km}, \qquad k, m = 0, 1, \ldots, N-1. \tag{11}$$

To prove this, consider
8/20/2019 1974 Fast Convolution Using Fermat Number Transforms With Applications to Digital Filtering
3/11
AGARWAL
AND
BURRUS:
FAST
C O N V O L U T I O N APPLICATIONS TO DIGITAL
FILTERING
89
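$$T\,T^{-1} = I$$

or

$$N^{-1} \sum_{n=0}^{N-1} \alpha^{kn} \alpha^{-ln} = \delta_{k,l} \tag{12}$$

where $\delta_{k,l}$ is the Kronecker delta function. Taking $k - l = p$, (12) can be rewritten as

$$N^{-1} \sum_{n=0}^{N-1} \alpha^{pn} = \begin{cases} 1 & \text{if } p \equiv 0 \pmod{N} \\ 0 & \text{otherwise.} \end{cases} \tag{13}$$

To prove the orthogonality, we have to prove (13). If $p \equiv 0 \pmod{N}$, then $\alpha^p = 1$ and the first part is thus established. If $p \not\equiv 0 \pmod{N}$, $\alpha^p \neq 1$ or $(\alpha^p - 1) \neq 0$. Multiplying the left side of (13) by $(\alpha^p - 1)$, we obtain

$$N^{-1} (\alpha^p - 1) \sum_{n=0}^{N-1} \alpha^{pn} = N^{-1} (\alpha^{pN} - 1) = 0.$$

Thus (13) is established.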
An examination of the preceding development reveals that the existence of an $N \times N$ transform having the cyclic convolution property depends only on the existence of an $\alpha$ that is a root of unity of order $N$, and the existence of $N^{-1}$. The structures of the transform matrix $T$ and its inverse are given by (10) and (11) respectively, and these are the only structures which support the cyclic convolution property. This structure is called the DFT structure. Because of the DFT structure, any transform having the cyclic convolution property also has a fast computational algorithm similar to the FFT if $N$ is highly composite [9]. In the complex number field the DFT, with $\alpha = e^{-j2\pi/N}$, is the only transform having the cyclic convolution property. But, in a finite field or, more generally, a ring we might also have transforms with the cyclic convolution property if there exists a root of unity of order $N$ and if $N^{-1}$ exists. In the next section we establish results for the special case of a finite ring of integers modulo an integer, which seem to be most appropriate for practical digital computation. Appendix A lists some of the properties of these transforms.
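The structure in (10) and (11) can be checked numerically in a small ring. The following is a minimal sketch, not the authors' implementation: the helper names are hypothetical, and the values $F = 17$, $N = 4$, $\alpha = 4$ are assumed only for illustration (4 has order 4 modulo 17). It builds $T$ and $T^{-1}$ and confirms that their product is the identity modulo $F$.

```python
def transform_matrix(alpha, N, F):
    """T[k][m] = alpha**(k*m) mod F, the DFT structure of (10)."""
    return [[pow(alpha, k * m, F) for m in range(N)] for k in range(N)]


def inverse_transform_matrix(alpha, N, F):
    """Inverse per (11): N**-1 * alpha**(-k*m) mod F."""
    n_inv = pow(N, -1, F)      # modular inverse; needs Python 3.8+
    a_inv = pow(alpha, -1, F)
    return [[n_inv * pow(a_inv, k * m, F) % F for m in range(N)]
            for k in range(N)]


def matmul_mod(A, B, F):
    """Matrix product with every entry reduced modulo F."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % F
             for j in range(n)] for i in range(n)]


# Illustrative values only: 4 has order 4 modulo 17.
F, N, alpha = 17, 4, 4
T = transform_matrix(alpha, N, F)
T_inv = inverse_transform_matrix(alpha, N, F)
identity = [[int(i == j) for j in range(N)] for i in range(N)]
print(matmul_mod(T, T_inv, F) == identity)
```

The check relies on both conditions named in the text: `pow(N, -1, F)` raises `ValueError` when $N^{-1}$ does not exist modulo $F$.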
Thus far we have discussed only the transforms having the cyclic convolution property. Noncyclic convolution can be implemented using cyclic convolution by padding the sequences with zeros [1]. If only noncyclic convolution is desired, a more general class of transforms is possible. Consider two sequences of lengths $L$ and $M$, respectively, that are noncyclically convolved giving an output sequence of length $N = L + M - 1$. First we pad the sequences with zeros to make them of length $N$ each. For the $N \times N$ transform $T$, the desired restrictions on the $t_{k,m}$'s are still given by (6), but now the indices are not added modulo $N$; this modifies (6) to be

$$t_{k,m+l} = t_{k,m}\, t_{k,l}, \qquad k = 0, 1, \ldots, N-1, \quad m = 0, 1, \ldots, M-1, \quad l = 0, 1, \ldots, L-1. \tag{14}$$

Repeated applications of (14) gives

$$t_{k,m} = t_{k,1}^{\,m}, \qquad k, m = 0, 1, \ldots, N-1. \tag{15}$$

This is the same as (7), but now we do not have (8), which implies cyclic convolution. Because of the absence of (8), the $t_{k,1}$'s are not restricted to $N$th roots of unity. The only restriction is the nonsingularity constraint, which requires all $t_{k,1}$'s to be distinct. A similar procedure is discussed by Knuth [3] in connection with the multiplication of large integers, which is almost analogous to noncyclic convolution. When the $t_{k,1}$'s are not restricted to $N$th roots of unity, in general there is no FFT-type algorithm to compute the transform or the inverse transform. As a matter of fact, the inverse transform does not have a simple structure. Because of these limitations we restrict ourselves to transforms having the cyclic convolution property.
III. TRANSFORMS IN THE RING OF INTEGERS MODULO AN INTEGER

In any practical situation, or when working with digital machines, the data are available only with some finite precision, and therefore, without loss of generality, the data can be considered to be integers with some upper bound. To compute convolution in this digital domain, operations in the complex number field of the continuous domain can be imitated with a finite field or, more generally, a finite ring of integers under additions and multiplications modulo some integer $F$; and an integer $\alpha$ of order $N$ replaces $e^{-j2\pi/N}$ used in a DFT. In this ring, when two integer sequences $x(n)$ and $h(n)$ are convolved, the output integer sequence $y(n)$ is congruent to the convolution of $x(n)$ and $h(n)$ modulo $F$. In the ring of integers modulo $F$, conventional integers can be unambiguously represented if their absolute value is less than $F/2$. If the input integer sequences $x(n)$ and $h(n)$ are so scaled that $|y(n)|$ never exceeds $F/2$, we would get the same results by implementing convolution in the ring of integers modulo $F$ as that obtained with normal arithmetic. This is similar to the overflow constraint in fixed-point digital machines.
In most digital filtering applications, $h(n)$ represents the impulse response and is known a priori; also the maximum magnitude of the input signal is usually known. In this situation, we can bound the peak output magnitude by

$$|y(n)| \le |x(n)|_{\max} \sum_{n=0}^{N-1} |h(n)|.$$

When we relate continuous-time signals to discrete-time signals through sampling, we obtain an aliasing effect in the frequency domain; all the frequencies are obtained modulo the sampling frequency. A similar effect occurs when the amplitude is discretized and convolution is performed in the integer domain modulo an integer $F$. We obtain aliasing in amplitude which can be avoided as noted before.
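The overflow condition can be illustrated with a short sketch. The function names below are hypothetical and the code only restates the bound above; it is not taken from the paper. It computes the peak output bound, checks it against $F/2$, and maps a residue back to a signed integer of magnitude below $F/2$.

```python
def peak_output_bound(x_max, h):
    """Bound on |y(n)|: |x|_max times the sum of |h(n)|."""
    return abs(x_max) * sum(abs(c) for c in h)


def modulus_is_safe(x_max, h, F):
    """True when the bound stays below F/2, so no amplitude aliasing occurs."""
    return 2 * peak_output_bound(x_max, h) < F


def to_signed(residue, F):
    """Map a residue in 0..F-1 to the signed integer of magnitude below F/2."""
    residue %= F
    return residue - F if residue > F // 2 else residue


# Illustrative values: |x| <= 2 and h = (1, 2, 0, 0) give a bound of 6 < 17/2.
print(peak_output_bound(2, [1, 2, 0, 0]), modulus_is_safe(2, [1, 2, 0, 0], 17))
print(to_signed(14, 17))
```

When the check fails, either the inputs must be rescaled or a larger modulus chosen.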
In order to have a computational advantage in implementing cyclic convolution, we will derive a transform having the cyclic convolution property in this ring. In addition, we would like this transform to have a fast computational algorithm. First, we will present results about the existence of such transforms, which were shown to depend on the existence of an $\alpha$ of order $N$ and the existence of $N^{-1}$ in the ring. The number theory necessary to understand the derivations is very basic and can be found in any elementary book on number theory [14].
Let

$$F = p_1^{r_1} p_2^{r_2} \cdots p_l^{r_l} \tag{16}$$

be the prime factorization of $F$. Since $\alpha$ is of order $N$, we have

$$\alpha^N \equiv 1 \pmod{F} \tag{17}$$

which implies

$$\alpha^N \equiv 1 \pmod{p_i^{r_i}}, \qquad i = 1, 2, \ldots, l. \tag{18}$$

If the transform matrix $T$ is to be nonsingular, no two rows of $T$ can be the same, which implies

$$\alpha^p \not\equiv \alpha^q \pmod{p_i^{r_i}}, \qquad i = 1, \ldots, l, \quad p \neq q, \quad 0 \le p, q \le N-1. \tag{19}$$

Equations (18) and (19) imply that $\alpha$ should be of order $N$ with respect to each prime power factor $p_i^{r_i}$ of $F$, i.e., $N$ is the least positive integer such that

$$\alpha^N \equiv 1 \pmod{p_i^{r_i}}, \qquad i = 1, 2, \ldots, l. \tag{20}$$

If $\alpha$ is of order less than $N$ with respect to any modulus $p_i^{r_i}$, then $T$ would be singular with respect to that modulus. Also, for $N^{-1}$ to exist, $N$ should be relatively prime to $F$, i.e., $p_i$ should not be a factor of $N$. These two considerations give Theorem 1, which is proven in Appendix B.
Theorem 1

An $N \times N$ transform $T$ having the cyclic convolution property in the ring of integers modulo an integer $F$ exists if and only if $N$ divides $O(F) \triangleq \gcd(p_1 - 1, p_2 - 1, \ldots, p_l - 1)$.
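Theorem 1 can be explored numerically. The sketch below is an illustration under stated assumptions, not part of the paper: the function names are hypothetical, and the prime factors are found by plain trial division, which is adequate only for small moduli such as $2^{32} + 1$. It computes $O(F)$ and tests whether a transform of a given length can exist.

```python
from functools import reduce
from math import gcd


def prime_factors(F):
    """Distinct prime factors of F by trial division (small F only)."""
    primes, p = [], 2
    while p * p <= F:
        if F % p == 0:
            primes.append(p)
            while F % p == 0:
                F //= p
        p += 1
    if F > 1:
        primes.append(F)
    return primes


def max_transform_length(F):
    """O(F) = gcd(p_1 - 1, ..., p_l - 1) over the prime factors of F."""
    return reduce(gcd, (p - 1 for p in prime_factors(F)))


def transform_exists(N, F):
    """Theorem 1: a length-N transform exists iff N divides O(F)."""
    return max_transform_length(F) % N == 0


print(max_transform_length(17))           # Fermat prime F_2
print(max_transform_length(2**32 + 1))    # composite Fermat number F_5
print(max_transform_length(2**16))        # a power of two as modulus
```

For the prime 17 the result is 16, for $2^{32} + 1 = 641 \times 6700417$ it is 128, and for a power of two it is 1, so an even modulus admits only the trivial length.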
IV. FERMAT NUMBER TRANSFORMS

For number theoretic transforms to be more attractive in comparison to the FFT for implementing convolution, they should be computationally efficient. There are three requirements that will be considered. First, $N$ should be highly composite (preferably a power of 2) for a fast FFT-type algorithm to exist. Second, since complex multiplications take most of the computational effort in calculating the FFT, it is important that the multiplication by powers of $\alpha$ be a simple operation. This is possible if the powers of $\alpha$ have very few bit binary representations; preferably $\alpha$ is also a power of two, where multiplication by a power of $\alpha$ reduces to a word shift. Third, in order to facilitate arithmetic modulo $F$, $F$ should also have a very few bit binary representation.
Now we shall make a systematic investigation of good choices for $F$, for which the maximum transform length $N$ is not too small. As noted above, we would like $F$ to have a very few bit binary representation. The first possibility is $F = 2^k$. It has a prime factor 2 and therefore, according to Theorem 1 of Section III, the maximum possible transform length is 1. For $2^k - 1$, let $k$ be a composite $PQ$, where $P$ is prime. Then $2^P - 1$ divides $2^{PQ} - 1$ and the maximum possible length of the transform will be governed by the length possible for $2^P - 1$. Therefore, only the primes $P$ need to be considered interesting. Numbers of this form are known as Mersenne numbers and Rader [10] has discussed convolution using Mersenne numbers in detail. For MNT's [10], it can be shown that transforms of length at least $2P$ exist and the corresponding $\alpha = -2$. Mersenne number transforms are not of much interest because $2P$ is not highly composite and, therefore, we do not have fast FFT-type computational algorithms to compute the transforms. For $2^k + 1$, say $k$ is odd. Then 3 divides $2^k + 1$ and the largest possible transform length is 2. Thus $k$ is even. Let $k$ be $s2^t$, where $s$ is an odd integer. Then $2^{2^t} + 1$ divides $2^{s2^t} + 1$ and the length of the possible transform will be governed by the length possible for $2^{2^t} + 1$. Therefore, integers of the form $2^{2^t} + 1$ are of interest. These numbers are known as Fermat numbers and will be discussed in detail in this paper. Fermat numbers seem to be optimum in the sense of having transforms whose length is interesting while the word size is moderate. Numbers of the form $2^{s2^t} + 1$ are also of limited interest and are discussed in Section IX. A systematic investigation of those $F$ which require more than two bit representation is difficult. Our preliminary investigation in that direction does not seem to be very encouraging.
In the light of the above discussion, it is proposed to use $F_t = 2^b + 1$, $b = 2^t$, $t$ being a positive integer, as good choices for $F$. $F_t$ is known as the $t$th Fermat number. Arithmetic modulo $F_t$ can be performed using $b$-bit arithmetic, as will be discussed in Section VI. The number of bits used for the signal and the filter coefficients decides the value of $b$ to be used so as not to cause error in the output due to overflow. The value of $b$ required is slightly more than double the number of bits used for the signal representation. One simple bound on the output was presented in the last section. Computationally, Fermat numbers seem to be the best choices for $F$; therefore, as far as is practicable, they should be used. If the value of $b$ obtained from the overflow consideration is not a power of 2, it should be increased to the next power of 2 in order that Fermat numbers can be used. Some other possibilities are discussed in Section IX. Since the arithmetic is done modulo a Fermat number, we call the associated transforms FNT's. Below we define FNT's and their inverse transforms for sequences of length $N = 2^m$:

$$X(k) = \sum_{n=0}^{N-1} x(n)\, \alpha^{nk} \pmod{F_t}, \qquad k = 0, 1, \ldots, N-1 \tag{21}$$

$$x(n) = 2^{-m} \sum_{k=0}^{N-1} X(k)\, \alpha^{-nk} \pmod{F_t}, \qquad n = 0, 1, \ldots, N-1$$

$$F_t = 2^b + 1, \qquad b = 2^t. \tag{22}$$

$\alpha$ is an integer of order $N$, i.e., $N$ is the least positive integer such that $\alpha^N \equiv 1 \pmod{F_t}$. Note that $\alpha^{-n} \equiv \alpha^{N-n}$ and $2^{-m} \equiv -2^{b-m} \pmod{F_t}$.
As discussed in Sections II and III, we can perform cyclic convolution of two integer sequences of length $N$, using the FNT, if certain conditions are satisfied. Now we discuss the possible values of $N$ for which an FNT exists, given a choice of $F_t$. As implied by Theorem 1 of Section III, for transforms (mod $F_t$) of length $N$ to exist, $N$ should divide $O(F_t)$.

Fermat numbers up to $F_4$ are all primes; therefore for these cases $O(F_t) = 2^b$, and we can have an FNT for any length $N = 2^m$, $m \le b$. For these Fermat primes the integer 3 is an $\alpha$ of order $N = 2^b$, giving the largest possible transform length. In all, $2^{b-1}$ integers are of order $2^b$; they can be obtained by taking odd powers of 3. If an integer $\alpha$ is of order $N$, then $\alpha^p \pmod{F_t}$ will be of order $N/p$, if $N/p$ is an integer. Therefore, using Fermat primes, an integer $\alpha$ of order $2^m$, $m \le b$, is given by $3^{2^{b-m}} \pmod{F_t}$. The integer 2 is of order $2b = 2^{t+1}$. If $\alpha$ is taken as 2 or a power of 2, all the powers of $\alpha$ would be some powers of 2 and, for these cases, as discussed in the beginning of this section, the FNT can be computed very efficiently and is called the RT.
There are no other known Fermat primes. For digital filtering applications, $F_5$ ($b = 32$) and $F_6$ ($b = 64$) seem to be most practical. Lucas [13] has proven that every prime factor of composite $F_t$ is of the form $K \cdot 2^{t+2} + 1$. Therefore, $2^{t+2}$ divides $O(F_t)$ for $t > 4$. In particular it can be verified that for $F_5$ and $F_6$, $O(F_t) = 2^{t+2}$. Therefore, for these choices of Fermat numbers, the maximum possible transform length is $2^{t+2} = 4b$. Also, we assert that the $\alpha$ given by (23) is of order $2^{t+2}$ mod $F_t$, $t \ge 2$:

$$\sqrt{2} \equiv 2^{2^{t-2}} \left( 2^{2^{t-1}} - 1 \right) \pmod{F_t}. \tag{23}$$

We denote this $\alpha$ as $\sqrt{2}$ because $\alpha^2 \equiv 2 \pmod{F_t}$. The proof that the $\alpha$ given by (23) is of order $2^{t+2}$ with respect to any factor of $F_t$ is given in Appendix C. Any odd power of $\sqrt{2}$ will also be of order $2^{t+2}$. By raising $\sqrt{2}$ to the $2^{t+2-m}$th power, we obtain an integer $\alpha$ of order $2^m$, $m \le t + 2$.
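Equation (23) is easy to check with exact integer arithmetic. The following sketch uses hypothetical helper names and a brute-force order search, assumed here only for illustration. It evaluates $\sqrt{2}$ modulo $F_t$, confirms that its square is 2, and measures its order.

```python
def fermat(t):
    """The t-th Fermat number F_t = 2**(2**t) + 1."""
    return 2 ** (2 ** t) + 1


def sqrt2(t):
    """The integer of (23): 2**(b/4) * (2**(b/2) - 1) mod F_t, b = 2**t, t >= 2."""
    b = 2 ** t
    return (2 ** (b // 4) * (2 ** (b // 2) - 1)) % fermat(t)


def order(alpha, F, limit):
    """Least n <= limit with alpha**n == 1 (mod F), or None if not found."""
    value = alpha % F
    for n in range(1, limit + 1):
        if value == 1:
            return n
        value = value * alpha % F
    return None


for t in (2, 5, 6):
    F = fermat(t)
    root = sqrt2(t)
    print(t, pow(root, 2, F), order(2, F, 1024), order(root, F, 1024))
```

For $t = 5$ the order of 2 is $2b = 64$ while the order of $\sqrt{2}$ is $4b = 128$, which is the factor-of-2 gain in transform length described in the text.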
Table I gives values of $N$ for the two most important values of $\alpha$ and also gives the maximum possible $N$ for the most practical values of $b$. The Fermat number transform has also been defined by Rader [10], using $\alpha = 2$, but his development does not indicate the possibility of using $\sqrt{2}$. Using the results of Section III, we have found the maximum possible length $N$ for which an FNT exists, for the most practical values of $b$, and have found the corresponding integer $\alpha$. Using $\alpha = \sqrt{2}$, the maximum possible lengths for these transforms have increased by a factor of 2, which makes them more useful. In the next section we discuss a two-dimensional convolution scheme whereby very long one-dimensional sequences can also be convolved using a two-dimensional FNT.

TABLE I
PARAMETERS FOR SEVERAL POSSIBLE IMPLEMENTATIONS OF FNT'S

t    b     F_t          N for α = 2 (a)    N for α = √2    N_max    α for N_max
3    8     2^8 + 1      16                 32              256      3
4    16    2^16 + 1     32                 64              65536    3
5    32    2^32 + 1     64                 128             128      √2
6    64    2^64 + 1     128                256             256      √2

(a) This case corresponds to the RT.
The basis functions for the FNT are integer exponentials mod $F_t$, in comparison to the complex exponentials for the DFT. These integer exponentials fold over after they cross $F_t/2$. Because of the nature of modulo arithmetic, the FNT coefficients do not seem to have any physical meaning. Although the signal for which the FNT is being taken may be very small, its FNT coefficients may lie anywhere between 0 and $F_t - 1$. As a matter of fact, during various stages of the computation each accumulation of the signal "overflows" many times. But still the end result of the convolution would be exact if the input signals are properly bounded. FNT's follow all the properties mentioned in Appendix A.
Example

To make the ideas of this section more clear, we now present an example. This example will illustrate several points: treatment of negative values in the data, the structure of the transform and the inverse transform matrix, negative powers of $\alpha$, frequent "overflow" during computation, meaninglessness of the transform values, and exactness of the final answer. This example will not demonstrate the efficient implementation of the RT using binary arithmetic. That will be illustrated in Section VI on implementation of the RT.

Consider two sequences $x = (2, -2, 1, 0)$ and $h = (1, 2, 0, 0)$, whose convolution is desired. From the overflow consideration, it is sufficient if we work modulo $F_2 = 17$. We want $N = 4$. For $F_2$ the integer 2 is of order 8, therefore $2^2 = 4$ is an $\alpha$ of order 4. The transformation matrix $T$ is given by

$$T = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 4 & -1 & -4 \\ 1 & -1 & 1 & -1 \\ 1 & -4 & -1 & 4 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 4 & 16 & 13 \\ 1 & 16 & 1 & 16 \\ 1 & 13 & 16 & 4 \end{bmatrix} \pmod{17}.$$
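Since $4^{-1} \equiv -4 \pmod{17}$, the inverse transformation matrix $T^{-1}$ is given by

$$T^{-1} = 4^{-1} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 4^{-1} & 4^{-2} & 4^{-3} \\ 1 & 4^{-2} & 4^{-4} & 4^{-6} \\ 1 & 4^{-3} & 4^{-6} & 4^{-9} \end{bmatrix} = -4 \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -4 & -1 & 4 \\ 1 & -1 & 1 & -1 \\ 1 & 4 & -1 & -4 \end{bmatrix} = \begin{bmatrix} 13 & 13 & 13 & 13 \\ 13 & 16 & 4 & 1 \\ 13 & 4 & 13 & 4 \\ 13 & 1 & 4 & 16 \end{bmatrix} \pmod{17}.$$

The transforms of $x$ and $h$ are given by

$$X = Tx = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 4 & 16 & 13 \\ 1 & 16 & 1 & 16 \\ 1 & 13 & 16 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ 15 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 10 \\ 5 \\ 9 \end{bmatrix} \pmod{17}.$$

Note that in $x$, $-2$ was represented by $-2 + 17 = 15$. Similarly,

$$H = (3, 9, 16, 10) \quad \text{and} \quad Y = X \otimes H = (3, 90, 80, 90) = (3, 5, 12, 5) \pmod{17}.$$

Taking the inverse transform of $Y$,

$$y = (2, 2, 14, 2) \pmod{17}.$$

According to our assumption, integers are supposed to lie between $-8$ and $8$. Therefore, 14 must be represented as $14 - 17 = -3$. This gives $y = (2, 2, -3, 2)$, which is the correct answer. Also, note that $y$ is a symmetric sequence, therefore $Y$ is also a symmetric sequence. Other than this, the transform values have no significance.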
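The example can be reproduced in a few lines. This is a minimal sketch with hypothetical function names; it evaluates the transform sums of (21) directly in $O(N^2)$ operations and so does not show the fast shift-and-add algorithm.

```python
def fnt(x, alpha, F):
    """Direct evaluation of X(k) = sum_n x(n) * alpha**(n*k) mod F."""
    N = len(x)
    return [sum(x[n] * pow(alpha, n * k, F) for n in range(N)) % F
            for k in range(N)]


def inverse_fnt(X, alpha, F):
    """Direct evaluation of x(n) = N**-1 * sum_k X(k) * alpha**(-n*k) mod F."""
    N = len(X)
    n_inv = pow(N, -1, F)       # needs Python 3.8+
    a_inv = pow(alpha, -1, F)
    return [n_inv * sum(X[k] * pow(a_inv, n * k, F) for k in range(N)) % F
            for n in range(N)]


def to_signed(residue, F):
    """Map a residue in 0..F-1 to the signed integer of magnitude below F/2."""
    return residue - F if residue > F // 2 else residue


F, alpha = 17, 4
x = [2, -2, 1, 0]
h = [1, 2, 0, 0]

X = fnt(x, alpha, F)
H = fnt(h, alpha, F)
Y = [(a * c) % F for a, c in zip(X, H)]      # term by term product
y = [to_signed(v, F) for v in inverse_fnt(Y, alpha, F)]
print(X, H, Y, y)
```

Python's `%` operator returns a nonnegative residue for a positive modulus, so the entry $-2$ in `x` is folded to 15 automatically, as in the text.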
V. TWO-DIMENSIONAL CONVOLUTION FOR CONVOLVING LONG SEQUENCES

Arithmetic modulo $F_t$ can be implemented using a $b = 2^t$ bit representation of integers, as will be discussed later in Section VI, with some provision for representing $2^b$. The maximum length of sequences which can be cyclically convolved using the FNT is $4b$, using $\alpha = \sqrt{2}$, as discussed above. Therefore, the length of sequences which can be convolved is proportional to the word length in bits. Thus, for long sequences, the word length requirement may be excessive. Rader [10] suggested using a two-dimensional convolution scheme to convolve long one-dimensional sequences. Agarwal and Burrus [11], [12] presented such a two-dimensional convolution scheme. Using this scheme, cyclic convolution of length $N = LM$ is implemented as a two-dimensional cyclic convolution of length $(2L - 1)$ by $M$ (or $2L$ by $M$). This two-dimensional cyclic convolution can be implemented using a two-dimensional FNT [11], [12] defined similar to the one-dimensional transform. Using this two-dimensional scheme, the word length required is proportional to the square root of the length of the sequences to be convolved, which would give for a maximum sequence length $8b^2$ rather than $4b$. If $M$ is taken as the maximum possible length $4b$, and $2L - 1$ is a small integer, then either direct convolution or another high-speed algorithm could be employed [12], [15] to compute convolution along the short dimension and the one-dimensional FNT could be used along the long dimension. Computationally this combination can be very efficient, as will be shown in the implementation in Section VIII. Table II lists the maximum lengths for two-dimensional transforms.

TABLE II
MAXIMUM ONE-DIMENSIONAL CYCLIC CONVOLUTION LENGTH USING TWO-DIMENSIONAL FNT OR RT

Word length b    N for α = 2    N for α = √2
16               512            2048
32               2048           8192
64               8192           32768
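The entries of Table II follow from simple arithmetic if both dimensions of the two-dimensional convolution use the maximum one-dimensional length. That reading is an assumption made for this sketch, as are the function names: the long dimension has length $M$ and the short dimension has length $2L = M$, so the one-dimensional length is $N = LM$.

```python
def one_dim_max_length(b, use_sqrt2):
    """Maximum 1-D FNT length: 2b with alpha = 2, 4b with alpha = sqrt(2)."""
    return 4 * b if use_sqrt2 else 2 * b


def two_dim_max_length(b, use_sqrt2):
    """N = L * M with M the maximum 1-D length and 2L = M (assumed reading)."""
    M = one_dim_max_length(b, use_sqrt2)
    L = M // 2
    return L * M


for b in (16, 32, 64):
    print(b, two_dim_max_length(b, False), two_dim_max_length(b, True))
```

With $\alpha = \sqrt{2}$ this gives $N = 2b \cdot 4b = 8b^2$, and with $\alpha = 2$ it gives $2b^2$.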
VI. A SUGGESTED HARDWARE IMPLEMENTATION

Since the structure of the FNT is similar to that of the DFT, and $N$ is a power of 2, a fast implementation of FNT's similar to the FFT with radix 2 exists. As a matter of fact, all we have to do is replace $W = e^{-j2\pi/N}$ by $\alpha$ in any FFT algorithm [1].

In computing the FNT, arithmetic is done modulo $2^b + 1$. In this arithmetic the only allowed integers are $0, 1, \ldots, 2^b$, and all integers whose absolute values do not exceed $2^{b-1}$ can be represented unambiguously. Negative integers are represented by adding $2^b + 1$ to them; this is similar to twos complement and ones complement representation of negative integers. Using a $b$-bit register, integers from 0 to $2^b - 1$ can be represented. The problem remains to represent $2^b$. If the data are uncorrelated, the probability that this number will appear after an arithmetical operation is approximately $2^{-b}$. For digital filtering applications, $b$ would typically be 32 or 64; in these cases the probability of occurrence of $2^b$ is extremely small. If occasional error in convolution is permissible for these cases, probably there is no need of any extra hardware to represent $2^b$. If the need exists, an extra bit could be used to represent $2^b$ at the expense of more complicated hardware. The following discussion is based on the $b$-bit representation of the integers.

Now, we discuss how various basic arithmetical operations can be performed modulo $F_t$.
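A. Negation

Let

$$A = \sum_{i=0}^{b-1} a_i 2^i, \qquad a_i = 0 \text{ or } 1.$$

Then, writing $\bar{a}_i$ for the complement of $a_i$ and using $\sum_{i=0}^{b-1} (a_i + \bar{a}_i) 2^i = 2^b - 1$,

$$-A = -\sum_{i=0}^{b-1} a_i 2^i = \sum_{i=0}^{b-1} \bar{a}_i 2^i - (2^b - 1)$$

$$\equiv \sum_{i=0}^{b-1} \bar{a}_i 2^i - (2^b - 1) + 2^b + 1 \pmod{F_t}$$

$$\equiv \sum_{i=0}^{b-1} \bar{a}_i 2^i + 2 \pmod{F_t}.$$

Thus to negate a number, we have to complement each bit and add 2 to the result. For example, (mod 17): $4 = 0100$; $-4 = 1011 + 10 = 1101 = 13$; $13 + 4 = 17$.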
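The complement-and-add-2 rule can be sketched as follows. The function name is hypothetical, and Python integers stand in for a $b$-bit register; the final reduction modulo $F_t$ covers the case where the result is $2^b + 1$ or needs the extra value $2^b$.

```python
def negate(a, b):
    """Negate a b-bit value modulo 2**b + 1: complement each bit, then add 2."""
    F = (1 << b) + 1
    mask = (1 << b) - 1          # b ones, used to complement every bit
    return ((a ^ mask) + 2) % F


print(negate(4, 4))              # 0100 -> 1011, plus 10 gives 1101
```

The input is assumed to fit in $b$ bits, i.e., to lie in the range 0 to $2^b - 1$.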
B. Addition

When we add two $b$-bit integers, we obtain a $b$-bit integer and possibly a carry bit. The carry bit represents $2^b \equiv -1 \pmod{F_t}$. To implement arithmetic modulo $2^b + 1$, we simply subtract the carry bit. Thus the hardware should be of the carry subtract type. For example, (mod 17): $10 + 9 = 19 = 2 \pmod{17}$:

      1010
    + 1001
    ------
    1 0011
    -    1    (subtract the carry bit)
    ------
      0010 = 2
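Carry-subtract addition can be sketched in the same style. The function name is hypothetical, and the closing reduction modulo $F_t$ handles the corner case in which the low part is zero and the carry is one.

```python
def add_mod_fermat(a, c, b):
    """Add two b-bit values modulo 2**b + 1 by subtracting the carry bit."""
    F = (1 << b) + 1
    mask = (1 << b) - 1
    total = a + c
    low = total & mask           # b-bit sum
    carry = total >> b           # carry bit, worth 2**b = -1 (mod F)
    return (low - carry) % F


print(add_mod_fermat(10, 9, 4))  # 1010 + 1001 = 1 0011, then 0011 - 1
```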
C. Subtraction

Subtraction is implemented as an addition by first negating the subtrahend and then adding them. Addition must be done according to B.
D. General Multiplication

When we multiply two $b$-bit integers, we get a $2b$-bit product. Let $C_L$ be the $b$-bit low-order part of it and $C_H$ be the $b$-bit high-order part of it; then

$$A \times B = C_L + C_H 2^b \equiv C_L - C_H \pmod{F_t}.$$

Thus, all we have to do is subtract the higher order register from the lower order register. The subtraction needs to be done according to C. For example, (mod 17): $13 \times 9 = 117 = 15 \pmod{17}$:

    117 = 0111 0101      (C_H = 0111, C_L = 0101)

      C_L   =   0101
     (-C_H) = + 1010
              ------
                1111 = 15
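The low-minus-high rule for a general product can be sketched as follows. The function name is hypothetical and a Python integer stands in for the $2b$-bit double register.

```python
def mul_mod_fermat(a, c, b):
    """Multiply modulo 2**b + 1: split the 2b-bit product, return C_L - C_H."""
    F = (1 << b) + 1
    mask = (1 << b) - 1
    product = a * c
    c_low = product & mask       # b-bit low-order part
    c_high = product >> b        # high-order part, weighted by 2**b = -1 (mod F)
    return (c_low - c_high) % F


print(mul_mod_fermat(13, 9, 4))  # 117 = 0111 0101, so 5 - 7 = -2 = 15 (mod 17)
```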
E. Multiplication by a Power of 2
If $\alpha$ is taken as 2 or a power of 2, RT's are obtained and the only multiplications involved in computing them are those by some powers of 2. These multiplications are particularly simple to implement in arithmetic modulo $F_t$. Suppose we need to multiply the contents of a register by $2^k$, $0 < k < b$. All we need to do is left-shift the contents of the register by $k$ bits and subtract the $k$ overflow bits. A convenient way to do this is to append the data register on the left with a register of zeros, left-shift the double register by $k$ positions, dropping the leading zeros, and then subtract the higher order register from the lower order register, as in D. If $k$ is outside the range $0 < k < b$, we make use of the fact that $2^b = -1 \bmod F_t$. Computation of the inverse transform requires multiplications by negative powers of 2, which can be converted to positive powers by the following relationship:
$$2^{-k} = -2^{b-k} \bmod F_t.$$
An alternate method to multiply by $2^{-k}$ is to load the data in the higher order register of a double register, filling the lower order register with zeros, then right-shift the double register by $k$ positions and subtract the low-order register from the high-order register mod $F_t$. For fast implementation of the bit shift operation, the shift amount $k$ should be expressed in binary form, and at each clock pulse shifting should be done by a power of 2.
For example, (mod 17): $11 \times 2^3 = 88 = 3 \bmod 17$.
    11 = 0000 1011
    Shift left 3 positions: 0101 1000, so C_H = 0101 and C_L = 1000.
       C_L  =   1000
    (-C_H)  = + 1100
              ------
              1 0100
            -      1
              ------
                0011 = 3
Similarly, $11 \times 2^{-3} = 11 \times (-2^{4-3}) = -22 = 12 \bmod 17$.
    11 = 1011 0000
    Shift right 3 positions: 0001 0110, so C_H = 0001 and C_L = 0110.
    C_H - C_L = 1100 = 12.
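The two shift procedures can be modelled as follows. This is a sketch under the assumption that a Python integer plays the role of the double register; the names `mul_pow2` and `mul_inv_pow2` are hypothetical.

```python
def mul_pow2(x: int, k: int, b: int) -> int:
    """x * 2**k mod (2**b + 1), 0 <= k < b, via a left shift of a double register."""
    mask = (1 << b) - 1
    modulus = (1 << b) + 1
    double = x << k                          # zeros appended on the left, then shifted
    return ((double & mask) - (double >> b)) % modulus   # C_L - C_H


def mul_inv_pow2(x: int, k: int, b: int) -> int:
    """x * 2**(-k) mod (2**b + 1), 0 < k <= b, via a right shift of a double register."""
    mask = (1 << b) - 1
    modulus = (1 << b) + 1
    double = (x << b) >> k                   # data loaded in the high register, then shifted
    return ((double >> b) - (double & mask)) % modulus   # C_H - C_L
```

With $b = 4$, `mul_pow2(11, 3, 4)` returns 3 and `mul_inv_pow2(11, 3, 4)` returns 12, matching the two examples.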
For implementation of the fast RT, unlike the FFT, we do not need to store the powers of $\alpha$. For serial arithmetic, we could have a register which stores the shift amount $k$, and as we go along the fast RT flow chart, we continually update the shift amount. All this is particularly simple to do in binary arithmetic. If $\alpha$ is taken as $\sqrt{2}$, only the odd powers of $\alpha$ require a two-bit representation. In the fast FNT algorithm, only one stage requires multiplications by odd powers of $\alpha$. Multiplication by an odd power of $\sqrt{2}$ would probably be best done by first multiplying by the just lower even power of $\sqrt{2}$, which is a power of 2, then multiplying by $\sqrt{2}$, which is a two-bit number. The resulting increase in computation effort is very small.
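The two-step multiplication by an odd power of $\sqrt{2}$ can be sketched as below. The sketch assumes the closed form $\sqrt{2} = 2^{b/4}(2^{b/2} - 1) = 2^{3b/4} - 2^{b/4} \bmod F_t$, which is how we read (23) (that equation lies outside this section); the function name is hypothetical and `%` stands in for the register-level reductions.

```python
def times_sqrt2_power(x: int, m: int, b: int) -> int:
    """x * (sqrt 2)**m mod (2**b + 1); b must be a multiple of 4."""
    modulus = (1 << b) + 1
    # Step 1: the just lower even power of sqrt(2) is the power of two 2**(m // 2).
    y = (x << (m // 2)) % modulus
    if m % 2:
        # Step 2: sqrt(2) = 2**(3b/4) - 2**(b/4), i.e. two shifts and one subtraction.
        y = ((y << (3 * b // 4)) - (y << (b // 4))) % modulus
    return y
```

For $b = 16$ the assumed root is $2^{12} - 2^{4} = 4080$, and $4080^2 = 2 \bmod 65537$, so the odd powers cost only one extra shift-and-subtract step.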
VII. COMPARISON WITH THE FFT
As noted in the previous section, computing the RT is a very simple operation on a binary machine. Now let us compare the complexity of the various basic operations involved in computing the RT vis-a-vis the FFT. If the two sequences $x(n)$ and $h(n)$ have $b_1$ and $b_2$ bit representations, respectively, and are of length $N$, then the output $y(n)$ would need no more than a $(b_1 + b_2 + \log_2 N)$ bit representation. To obtain the correct result,
$$b \geq b_1 + b_2 + \log_2 N.$$
In Section III we have given a better bound on the output. Roughly speaking, we need twice the number of bits to carry out the convolution using the RT as compared to the fixed-point FFT implementation of the convolution. But in the DFT, every data point is treated as a complex number and therefore requires two words, one for the real part and one for the imaginary part. Thus, in effect, the hardware requirement for the two transforms is about the same. Although for real data it is possible to make use of the symmetry properties of the DFT's, they require extra computation and for the purpose of comparison this will be ignored, even though we have taken it into account for our IBM 370/155 implementation to be discussed in the next section. Therefore we shall assume that, in the FFT implementation, each data point is represented by a $b/2$ bit real part and a $b/2$ bit imaginary part.
One $b/2$ bit complex addition is equivalent to two $b/2$ bit real additions, which are comparable to a $b$-bit addition modulo $F_t$. Thus, the complexity of addition/subtraction is the same in both the transforms. Similarly, it can be shown that a $b/2$ bit complex multiplication is comparable to a $b$-bit multiplication modulo $F_t$. Computation of the RT requires multiplications by powers of 2, which, implemented as bit shifts and subtractions, become much simpler operations compared to the complex multiplications required in the FFT implementation.
To compute a length $N$ fast RT, $N \log_2 N$ additions/subtractions and $(N/2) \log_2 (N/2)$ multiplications by some powers of 2 are required, which are implemented as bit shifts and subtractions. To compute the convolution using the FFT, most of the time is taken in computing the complex multiplications required to compute the transforms. A comparison with the RT reveals that these complex multiplications are replaced by bit shifts and subtractions, which are much faster operations. This results in considerable computational savings in the implementation of convolution. This fact has been verified on a general purpose machine (IBM 370/155). The computation required to multiply the two transforms is about the same for both the implementations. To convolve long sequences using the two-dimensional FNT or RT, the computation effort increases by, at the most, a factor of 2. Still, the RT implementation of convolution is much faster as compared to the FFT implementation.
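To make the comparison concrete, here is a small Python sketch of cyclic convolution through a radix-2 transform modulo $F_4 = 2^{16} + 1$ with $\alpha = 2$ and $N = 32$. It is an illustration under our own assumptions, not the authors' assembler code: the function names are hypothetical, the multiplications by powers of $\alpha$ are written with `*` and `%` rather than shifts, and the inputs are assumed small enough that every output value lies within $\pm F_4/2$.

```python
F4 = 2**16 + 1   # Fermat number F_4
ALPHA = 2        # 2**16 = -1 (mod F_4), so 2 has order N = 32


def fnt(seq, alpha, modulus):
    """Radix-2 transform F(k) = sum f(n) alpha**(n k); len(seq) is the order of alpha."""
    n = len(seq)
    if n == 1:
        return [seq[0] % modulus]
    alpha_sq = alpha * alpha % modulus
    even = fnt(seq[0::2], alpha_sq, modulus)
    odd = fnt(seq[1::2], alpha_sq, modulus)
    out = [0] * n
    w = 1                                   # running power alpha**k
    for k in range(n // 2):
        t = w * odd[k] % modulus
        out[k] = (even[k] + t) % modulus
        out[k + n // 2] = (even[k] - t) % modulus   # alpha**(n/2) = -1
        w = w * alpha % modulus
    return out


def cyclic_convolve_fnt(x, h, alpha=ALPHA, modulus=F4):
    """Exact cyclic convolution of two integer sequences of length N = order of alpha."""
    n = len(x)
    product = [a * b % modulus for a, b in zip(fnt(x, alpha, modulus), fnt(h, alpha, modulus))]
    alpha_inv = pow(alpha, n - 1, modulus)          # alpha**(-1), since alpha**n = 1
    n_inv = modulus - (modulus - 1) // n            # N**(-1) = -2**b / N (mod 2**b + 1)
    y = [v * n_inv % modulus for v in fnt(product, alpha_inv, modulus)]
    # Map residues back to signed integers.
    return [v - modulus if v > modulus // 2 else v for v in y]
```

Because all arithmetic is carried out in the ring of integers modulo $F_4$, the result agrees exactly with the directly computed cyclic convolution, with no roundoff.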
RT's have some additional advantages over the FFT. First, the FFT implementation requires storing all the powers of $W$, requiring a significant amount of storage, which may be an important factor for a small minicomputer or a special purpose hardware implementation. Second, a fixed-point FFT implementation introduces a significant amount of roundoff noise at the output, with the number of bits affected depending on the data [2]. This degrades the signal-to-noise ratio during the filtering operations. The FNT or RT implementation is error free; the only source of error is the input A/D quantization.
VIII. IMPLEMENTATION ON THE IBM 370/155
The word length of the IBM 360-370 series is 32 bits and, therefore, is well suited for the implementation of convolution using the FNT or RT modulo $F_5 = 2^{32} + 1$. It has a two's complement representation of negative integers, i.e., $-x$ is represented as $2^{32} - x$. We want negative integers to be represented as the complement modulo $2^{32} + 1$, i.e., $-x$ is to be represented by $2^{32} + 1 - x$, and to accomplish this 1 is added whenever a negative integer is encountered in the data. As noted before, on this machine, we cannot represent $-1$. If a $-1$ is encountered in the data, it is rounded either to 0 or to $-2$. This is equivalent to introducing some initial quantization noise. If the data are uncorrelated, the probability that $-1$ will appear after an arithmetical operation during computation of the transform is roughly $2^{-32}$ per operation. This error introduced during computation is rather serious, and probably the corresponding output block will be meaningless. Assuming $N = 64$, the probability that a particular block is erroneous is roughly [...]. For most filtering operations, this may be permissible.
TABLE III
CYCLIC CONVOLUTION TIMINGS FOR LENGTH N REAL SEQUENCES

       N      FFT (ms)    FNT or RT (ms)
      32         16            3.3
      64         31            7.4
     128         60           16.6 (a)
     256        123           40.0 (b)
     256        123           80.0 (c)
     512        243          168.0 (c)
    1024        530          340.0 (c)
    2048       1260          720.0 (c)

    (a) Using $\alpha = \sqrt{2}$.
    (b) Using 2 by 128 convolution.
    (c) Using two-dimensional RT.

Logical add (ALR) and subtract (SLR) instructions are used to add and subtract two integers modulo $F_5$. If a carry is detected after add, 1 is subtracted from the result, and if no carry is detected after subtract, 1 is added to the result. Multiplication by $2^k$ is done using the logical left shift operation (SLDL), and multiplication by $2^{-k}$ is done using the logical right shift operation (SRDL), $0 < k < 32$, as explained in Section VI. Similarly, multiplication of two integers modulo $F_5$ can also be done.
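The carry rules for add and subtract can be modelled on 32-bit words as in the sketch below. It is not IBM assembler: the function names are hypothetical, and we assume that "no carry after a logical subtract" corresponds to a borrow, i.e. to a negative true difference. Results congruent to $-1$ (that is, $2^{32}$) cannot be held in a 32-bit word, as noted in the text, and the sketch does not handle them.

```python
MASK32 = 0xFFFFFFFF
F5 = 2**32 + 1


def add_f5(x: int, y: int) -> int:
    """32-bit logical add followed by the carry correction."""
    total = x + y
    result, carry = total & MASK32, total >> 32
    if carry:                       # carry stands for 2**32 = -1 (mod F5)
        result = (result - 1) & MASK32
    return result


def sub_f5(x: int, y: int) -> int:
    """32-bit logical subtract followed by the borrow correction."""
    diff = x - y
    result = diff & MASK32          # what a 32-bit register would hold
    if diff < 0:                    # borrow: register holds diff + 2**32 = diff - 1 (mod F5)
        result = (result + 1) & MASK32
    return result
```

For instance, `sub_f5(3, 5)` returns $2^{32} - 1$, which is the representation of $-2$ modulo $F_5$.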
Two assembler subprograms were written to compute the fast RT and its inverse for any sequence length up to $2^6$, taking $\alpha$ as a power of 2. Two more subprograms were written to compute the fast FNT and its inverse for length 128 sequences, using $\alpha = \sqrt{2}$, given by (23). Out of the 7 stages of the fast algorithm, only one stage requires multiplication by odd powers of $\alpha$, which are implemented as suggested in Section VI.
To cyclically convolve sequences longer than 128, two-dimensional convolution schemes were used [12]. One such scheme was 2 by 128 convolution for length 256 sequences, discussed in Section X of [12]. Another program was written to convolve long one-dimensional sequences using two-dimensional RT's, as discussed in Section III of [12]. Timings for various implementations of convolution and their comparison with the FFT implementations are shown in Table III. To compute these timings it is assumed that transforms of the $h$ sequences have been precalculated. Thus, the timings are for computing the transforms, multiplications of the transforms, and their inverse transforms. For the FFT implementation, a very efficient algorithm [17] was used, which employed both radix 4 and radix 2 steps and took into account the fact that the data are real. For sequences up to length 256, improvements of the order of 3 to 5 were obtained over the FFT implementations. Since the hardware of the IBM 360/370 series is not designed to do arithmetic modulo Fermat numbers, extra branch instructions were necessary after add or subtract operations to take into account the carry bit. If the hardware is designed to do arithmetic modulo Fermat numbers, these instructions could be avoided, resulting in a significant reduction in the computation time. Copies of the assembler subprograms are available from the authors.
IX. GENERALIZATIONS AND OTHER APPLICATIONS
In this paper we have discussed convolution in the rings of integers modulo Fermat numbers. These numbers seem to be the best choice for implementation on digital computers. Nevertheless, any integer $F$ for which $O(F) > 1$ can be used, as discussed in Section IV. Rader [10] proposed the use of Mersenne numbers $M_p$ ($2^p - 1$, $p$ a prime). MNT's have $\alpha = -2$ and $N = 2p$, and therefore do not have an FFT-type fast computational algorithm.
In many situations the quantization and overflow considerations may require the computation to be done using 10 to 20-bit arithmetic. For this situation, working in the integer ring modulo $F_4$ ($2^{16} + 1$) would be insufficient and, at the same time, computing the FNT modulo $F_5$ ($2^{32} + 1$) would require an excessive number of bits. In this situation, we may perform convolution modulo $2^{24} + 1$. Also, many computers have a 24-bit word length. Transforms similar to FNT's exist for this situation. We could take $F = 2^{24} + 1$, and then $\alpha = 8$ gives $N = 16$, and $\alpha = 8^{1/2} = 2^7(2^{12} - 1)$ gives $N = 32$ as the maximum length of the one-dimensional transform.
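The claims for $F = 2^{24} + 1$ are easy to check numerically. The following sketch uses a hypothetical helper `order` that finds the multiplicative order by repeated multiplication; it assumes its argument is a unit modulo $F$.

```python
def order(a: int, modulus: int) -> int:
    """Smallest k > 0 with a**k = 1 (mod modulus); a must be a unit."""
    k, v = 1, a % modulus
    while v != 1:
        v = v * a % modulus
        k += 1
    return k


F = 2**24 + 1
root8 = 2**7 * (2**12 - 1)        # candidate square root of 8 modulo F

square_is_8 = root8 * root8 % F == 8
order_of_8 = order(8, F)          # transform length for alpha = 8
order_of_root8 = order(root8, F)  # transform length for alpha = 8**(1/2)
```

Here `order_of_8` is 16 and `order_of_root8` is 32, in agreement with the transform lengths quoted above.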
In general one could take $F = 2^{s2^t} + 1$, where $s$ is an odd integer. In that case, $\alpha = 2^s$ would give $N = 2^{t+1}$, and
$$\alpha = (2^s)^{1/2} = 2^{[(s-1)/2 + s2^{t-2}]}\,(2^{s2^{t-1}} - 1)$$
would give $N = 2^{t+2}$. In many situations it may be possible to have transforms of greater length than $2^{t+2}$. For example, taking $F = 2^{40} + 1$, the maximum possible transform length is 256, and, taking $F = 2^{80} + 1$, the maximum possible transform length is 1024, but the corresponding $\alpha$'s may not be simple.
As discussed earlier, the FNT implementation requires roughly twice the number of bits for the arithmetic. Most minicomputers have short word lengths and in that situation it may not be possible to efficiently use FNT's, but there is a solution to this problem. We can split the data bits into two halves and then convolve them:
$$x(n) = x_2(n) + x_1(n)2^k, \qquad |x_2(n)| < 2^k$$
$$h(n) = h_2(n) + h_1(n)2^k, \qquad |h_2(n)| < 2^k \tag{24}$$
$$y = x*h = x_1*h_1 \cdot 2^{2k} + (x_1*h_2 + x_2*h_1)2^k + x_2*h_2. \tag{25}$$
Now, since $x_1$, $h_1$, $x_2$, and $h_2$ have roughly half the number of bits, it should be possible to convolve them using approximately half the number of bits. If necessary, a more precise analysis of the above situation could be easily performed. In (25), the last term, in comparison to the first term, is very small and can be neglected. We need to take two transforms for $x$ and two transforms for $h$; the summation shown within the parentheses can be performed in the transform domain. Finally, we need to take two inverse transforms, one for $x_1*h_1$ and the other for $(x_1*h_2 + x_2*h_1)$.
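The decomposition in (24) and (25) can be illustrated with a short Python sketch. To keep the example self-contained, a direct cyclic convolution stands in for the short-word-length FNT; the function names are hypothetical, and the split uses a floor shift so that $0 \le x_2(n) < 2^k$.

```python
def cyclic_convolve(x, h):
    """Direct cyclic convolution (a stand-in for a short-word FNT)."""
    n = len(x)
    return [sum(x[m] * h[(i - m) % n] for m in range(n)) for i in range(n)]


def split(seq, k):
    """Split each sample as value = low + high * 2**k with 0 <= low < 2**k."""
    high = [v >> k for v in seq]
    low = [v - (hi << k) for v, hi in zip(seq, high)]
    return high, low


def split_convolve(x, h, k):
    """Return (exact, approximate) results of (25); the approximation drops x2*h2."""
    x1, x2 = split(x, k)
    h1, h2 = split(h, k)
    top = cyclic_convolve(x1, h1)
    middle = [a + b for a, b in zip(cyclic_convolve(x1, h2), cyclic_convolve(x2, h1))]
    bottom = cyclic_convolve(x2, h2)
    approx = [(t << (2 * k)) + (m << k) for t, m in zip(top, middle)]
    exact = [a + b for a, b in zip(approx, bottom)]
    return exact, approx
```

The exact sum reproduces the full-precision convolution, while the approximation differs from it by at most $N(2^k - 1)^2$ per output sample, which is the neglected last term of (25).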
FNT's can also be used for convolutions required in block recursive realizations of recursive digital filters [16]. For these realizations, convolutions form the main computational task. Since convolution with the FNT's can be performed with both efficiency and perfect accuracy, it should be very attractive for this application. For very high-order recursive filters this may reduce roundoff noise problems associated with these filters. Other appli-
APPENDIX A
PROPERTIES OF THE TRANSFORMS HAVING THE CYCLIC CONVOLUTION PROPERTY
Below we summarize the definition and some important properties of the transforms having the cyclic convolution property. It is assumed that all arithmetical operations are performed in a ring. Most of these properties are similar to the DFT properties [1].

A. Definition
Transform:
$$F(k) = \sum_{n=0}^{N-1} f(n)\alpha^{nk}, \qquad k = 0,1,\cdots,N-1, \tag{A1}$$
where $\alpha^N = 1$.
Inverse Transform:
$$f(n) = N^{-1}\sum_{k=0}^{N-1} F(k)\alpha^{-nk}, \qquad n = 0,1,\cdots,N-1. \tag{A2}$$

B. Periodicity
$F(k)$ can also be periodically extended, similar to the extension of $f(n)$:
$$f(n + N) = f(n)$$
$$F(k + N) = F(k). \tag{A5}$$

C. Orthogonality
The inner product of two basis functions is given by [see (12) and (13)]
$$\sum_{k=0}^{N-1} \alpha^{nk}\alpha^{-mk} = \begin{cases} N & \text{if } n = m \bmod N \\ 0 & \text{otherwise.} \end{cases}$$
This implies the orthogonality of the basis functions.

E. Symmetry Property
If the signal is symmetric, i.e.,
$$f(n) = f(-n) = f(N - n),$$
the transformed sequence is also symmetric, i.e.,
$$F(k) = F(-k) = F(N - k). \tag{A6}$$
If the signal is antisymmetric, i.e.,
$$f(n) = -f(-n) = -f(N - n),$$
the transformed sequence is also antisymmetric, i.e.,
$$F(k) = -F(-k) = -F(N - k).$$

G. Shift Theorem
If
$$T\{f(n)\} = F(k),$$
then
$$T\{f(n + m)\} = F(k)\alpha^{-mk}.$$

H. Fast Computational Algorithm
If $N$ can be factored as $N = r_1 r_2 \cdots r_m$, then a fast computational algorithm similar to the FFT can be used to compute the transform, which requires on the order of $N(r_1 + r_2 + \cdots + r_m)$ arithmetical operations.

I. Convolution Property
The transform of the cyclic convolution of two sequences is the product of their transforms, i.e., if $y = x*h$, then
$$Y(k) = X(k)H(k).$$

J. Parseval's Theorem
Let
$$X(k) = T\{x(n)\}, \qquad H(k) = T\{h(n)\}.$$
Then
$$N\sum_{n=0}^{N-1} x(n)h(n) = \sum_{k=0}^{N-1} X(k)H(-k)$$
and
$$N\sum_{n=0}^{N-1} x(n)h(-n) = \sum_{k=0}^{N-1} X(k)H(k).$$
For the particular case of $x(n) = h(n)$,
$$N\sum_{n=0}^{N-1} x^2(n) = \sum_{k=0}^{N-1} X(k)X(-k)$$
and
$$N\sum_{n=0}^{N-1} x(n)x(-n) = \sum_{k=0}^{N-1} X^2(k).$$
Note that in this case, the conventional Parseval's theorem
$$N\sum_{n=0}^{N-1} |x(n)|^2 = \sum_{k=0}^{N-1} |X(k)|^2$$
does not hold, because in a ring magnitude is undefined.

K. Stretch Property
The stretch property exists if the transform for the longer length sequence exists in that ring.

L. Sampling Property
The sampling property exists.
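Several of these properties can be checked numerically in a small ring. The sketch below works modulo $F_2 = 17$ with $\alpha = 2$ and $N = 8$ and evaluates the transform directly from its definition; the function names are hypothetical and the choice of ring is ours.

```python
F, ALPHA, N = 17, 2, 8    # 2**4 = -1 (mod 17), so alpha = 2 has order 8


def transform(f):
    """F(k) = sum_n f(n) alpha**(n k) mod F, straight from the definition."""
    return [sum(f[n] * pow(ALPHA, n * k, F) for n in range(N)) % F for k in range(N)]


def inner_product(n, m):
    """sum_k alpha**(n k) alpha**(-m k) mod F, using alpha**N = 1 for the negative exponent."""
    return sum(pow(ALPHA, ((n - m) * k) % N, F) for k in range(N)) % F


def parseval_sides(x, h):
    """Left and right sides of the two Parseval-type identities, reduced mod F."""
    X, H = transform(x), transform(h)
    lhs1 = N * sum(x[n] * h[n] for n in range(N)) % F
    rhs1 = sum(X[k] * H[-k % N] for k in range(N)) % F
    lhs2 = N * sum(x[n] * h[-n % N] for n in range(N)) % F
    rhs2 = sum(X[k] * H[k] for k in range(N)) % F
    return lhs1, rhs1, lhs2, rhs2
```

In this ring `inner_product(n, m)` is $N$ when $n = m$ and 0 otherwise, and the paired values returned by `parseval_sides` coincide.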
APPENDIX B
PROOF OF THEOREM 1
As shown before, the existence of such a transform depends on the existence of an $\alpha$ of order $N$ with respect to each prime power factor modulus $p_i^{r_i}$. According to Euler's Theorem [14], $N$, the order of $\alpha$, must divide $\varphi(p_i^{r_i})$, where $\varphi(p_i^{r_i})$ is Euler's $\varphi$ function given by
$$\varphi(p_i^{r_i}) = p_i^{r_i - 1}(p_i - 1). \tag{B1}$$
Thus, $N \mid \varphi(p_i^{r_i})$, and since $p_i$ is not a factor of $N$,
$$N \mid (p_i - 1), \qquad i = 1,2,\cdots,l$$
or
$$N \mid \gcd(p_1 - 1, p_2 - 1, \cdots, p_l - 1)$$
or
$$N \mid O(F). \tag{B2}$$
This proves the necessary part of the theorem. To prove sufficiency we have to show the existence of an $\alpha$ of order $N$ if $N \mid O(F)$. Note that (B2) rules out $F$ being an even integer because in that case $O(F) = 1$. If $N \mid O(F)$, then $N \mid (p_i - 1)$ or $N \mid \varphi(p_i^{r_i})$; therefore, one can find integers $\alpha_i$ of order $N$ modulo $p_i^{r_i}$ [14], $i = 1,2,\cdots,l$. Then, by the Chinese Remainder Theorem, one can find an $\alpha$ of order $N$ modulo $F = p_1^{r_1} p_2^{r_2} \cdots p_l^{r_l}$, such that
$$\alpha = \alpha_i \pmod{p_i^{r_i}}, \qquad i = 1,2,\cdots,l. \tag{B3}$$
This is our desired $\alpha$ of order $N$. Since $N$ and $p_i$ are relatively prime, $N^{-1}$ exists in this ring. This completes the proof of the theorem.
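For small moduli the theorem can be checked by brute force. The sketch below, with hypothetical function names, computes $O(F)$ as the greatest common divisor of the $p_i - 1$ by trial-division factoring, and separately enumerates every $N$ for which some $\alpha$ has order $N$ with respect to each prime power factor of $F$.

```python
from math import gcd


def prime_power_factors(F):
    """Return [(p, p**r), ...] for the prime power factors of F (trial division)."""
    factors, p = [], 2
    while p * p <= F:
        if F % p == 0:
            q = 1
            while F % p == 0:
                F //= p
                q *= p
            factors.append((p, q))
        p += 1
    if F > 1:
        factors.append((F, F))
    return factors


def O(F):
    """O(F) = gcd(p_1 - 1, ..., p_l - 1) over the prime factors of F."""
    g = 0
    for p, _ in prime_power_factors(F):
        g = gcd(g, p - 1)
    return g


def order(a, modulus):
    """Multiplicative order of the unit a modulo modulus."""
    k, v = 1, a % modulus
    while v != 1:
        v = v * a % modulus
        k += 1
    return k


def transform_lengths(F):
    """All N such that some alpha has order N modulo every prime power factor of F."""
    factors = prime_power_factors(F)
    lengths = set()
    for a in range(1, F):
        if gcd(a, F) == 1:
            orders = {order(a, q) for _, q in factors}
            if len(orders) == 1:
                lengths.add(orders.pop())
    return lengths
```

For example, `O(91)` is 6 and `transform_lengths(91)` is `{1, 2, 3, 6}`, the divisors of 6; likewise `O(2**24 + 1)` is 32, matching the maximum length quoted in Section IX.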
APPENDIX C
To prove that the $\alpha$ of (23) is of order $2^{t+2} \bmod F_t$, we have to prove that it is of order $2^{t+2}$ with respect to each prime power factor of $F_t$ (20). Let $p_i^{r_i}$ be a prime power factor of $F_t$; then
$$\alpha^{2^{t+2}} = 2^{2^{t+1}} = 1 \bmod F_t = 1 \bmod p_i^{r_i}. \tag{C1}$$
Let $N_i$ be the order of $\alpha$ with respect to $p_i^{r_i}$; then by Euler's theorem,
$$N_i \mid 2^{t+2} \tag{C2}$$
therefore $N_i$ must be a power of 2. Also,
$$\alpha^{2^{t+1}} = 2^{2^t} = -1 \bmod F_t = -1 \bmod p_i^{r_i}. \tag{C3}$$
From (C1) and (C3), it is obvious that
$$N_i = 2^{t+2},$$
completing the proof.
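The statement can be confirmed numerically for the first few Fermat numbers. The sketch assumes, as our reading of (23), that $\alpha = \sqrt{2} = 2^{b/4}(2^{b/2} - 1)$ with $b = 2^t$; the helper names are hypothetical.

```python
def sqrt2_mod_fermat(t: int) -> int:
    """Assumed closed form of sqrt(2) modulo F_t = 2**b + 1, b = 2**t, t >= 2."""
    b = 1 << t
    return (1 << (b // 4)) * ((1 << (b // 2)) - 1)


def order(a: int, modulus: int) -> int:
    """Multiplicative order of the unit a modulo modulus."""
    k, v = 1, a % modulus
    while v != 1:
        v = v * a % modulus
        k += 1
    return k


def check(t: int):
    """Return (alpha**2 mod F_t, alpha**(2**(t+1)) mod F_t, order of alpha, F_t)."""
    F_t = (1 << (1 << t)) + 1
    alpha = sqrt2_mod_fermat(t) % F_t
    return alpha * alpha % F_t, pow(alpha, 1 << (t + 1), F_t), order(alpha, F_t), F_t
```

For $t = 2, \ldots, 5$ the square is 2, the power in (C3) equals $F_t - 1$ (that is, $-1$), and the order is $2^{t+2}$, including for the composite $F_5$.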
ACKNOWLEDGMENT
The authors wish to thank C. M. Rader of Lincoln Laboratories, Cambridge, Mass., for his comments and discussions.
REFERENCES
[1] B. Gold and C. M. Rader, Digital Processing of Signals. New York: McGraw-Hill, 1969.
[2] A. V. Oppenheim and C. Weinstein, "Effects of finite register length in digital filtering and the fast Fourier transform," Proc. IEEE, vol. 60, pp. 957-976, Aug. 1972.
[3] D. E. Knuth, The Art of Computer Programming, vol. 2, Seminumerical Algorithms. Reading, Mass.: Addison-Wesley, 1969, p. 555 and pp. 259-262.
[4] J. M. Pollard, "The fast Fourier transform in a finite field," Math. Comput., vol. 25, pp. 365-374, Apr. 1971.
[5] I. J. Good, "The relation between two fast Fourier transforms," IEEE Trans. Comput., vol. C-20, pp. 310-317, Mar. 1971.
[6] A. Schonhage and V. Strassen, "Fast multiplication of large numbers," (in German) Comput., vol. 7, pp. 281-292, 1971.
[7] D. E. Knuth, "The art of computer programming-errata et addenda," Comput. Sci. Dep., Stanford University, Stanford, Calif., Rep. STAN-CS-71-194, pp. 21-26, Jan. 1971.
[8] P. J. Nicholson, "Algebraic theory of finite Fourier transforms," J. Comput. Syst. Sci., vol. 5, pp. 524-547, 1971.
[9] C. M. Rader, "The number theoretic DFT and exact discrete convolution," IEEE Arden House Workshop on Digital Signal Processing, Harriman, N. Y., Jan. 11, 1972.
[10] C. M. Rader, "Discrete convolution via Mersenne transforms," IEEE Trans. Comput., vol. C-21, pp. 1269-1273, Dec. 1972.
[11] R. C. Agarwal and C. S. Burrus, "Fast digital convolution using Fermat transforms," in Southwest IEEE Conf. Rec., Houston, Tex., Apr. 1973, pp. 538-543.
[12] R. C. Agarwal and C. S. Burrus, "Fast one-dimensional digital convolution by multidimensional techniques," IEEE Trans. Acoust., Speech, and Signal Processing, vol. ASSP-22, pp. 1-10, Feb. 1974.
[13] L. E. Dickson, History of the Theory of Numbers, vol. 1. Washington, D. C.: Carnegie Institute, 1919, p. 376.
[14] O. Ore, Number Theory and Its History. New York: McGraw-Hill, 1948.
[15] D. A. Pitassi, "Fast convolution using Walsh functions," in Proc. Conf. Applications of Walsh Functions, Washington, D. C., April 1971, pp. 130-133.
[16] C. S. Burrus, "Block realization of digital filters," IEEE Trans. Audio Electroacoust., vol. AU-20, pp. 230-235, Oct. 1972.
[17] R. C. Singleton, "An algorithm for computing the mixed radix fast Fourier transform," IEEE Trans. Audio Electroacoust., vol. AU-17, pp. 93-103, June 1969.