Jisha Ijera Paper

download Jisha Ijera Paper

of 5

Transcript of Jisha Ijera Paper

  • 7/29/2019 Jisha Ijera Paper

    1/5

    Jisha M. Vijayan, Arathy Iyer / International Journal of Engineering Research and

    Applications (IJERA) ISSN: 2248-9622 www.ijera.com

    Vol. 3, Issue 1, January -February 2013, pp.1671-1675

    1671 | P a g e

    Area Efficient Fast Block LMS Adaptive Filter Using Distributed

    Arithmetic

    Jisha M. Vijayan, Arathy IyerM.Tech VLSI (IV Sem) S.N.G.C.E., KadayiruppuErnakulam, India

    Asst. Professor, Dept. of ECE S.N.G.C.E., KadayiruppuErnakulum, India

    AbstractIn FBLMS algorithm, the filter weights are

    adapted in the frequency domain by using the FFT

    algorithm. The main hardware complexity of thesystem is due to hardware multipliers in

    FFT/IFFT block. Introduction of Distributed

    Algorithm eliminates the need of that multipliers

    by a mechanism that generates partial products

    and then sums the products together and resulting

    system will have high area efficiency. Using DA,

    FFT can be efficiently calculated by jointly

    employing the Good-Thomas and Rader

    algorithms. In the proposed FBLMS algorithm

    using DA, it is required to calculate only half of

    the conjugate symmetric coefficients, under thatcondition the hardware requirements for the

    proposed system is approximately half of that of

    existing one.

    KeywordsFBLMS, signed DA, Unsigned DA,Good-Thomas, Rader algorithm.

    I. INTRODUCTIONThe term estimator or filter is commonly

    used to refer to a system that is designed to extractinformation about a prescribed quantity of interestfrom noisy data. At the core of DSP applications is

    the digital filter. Digital filters are generally used for:Separation of signals that have been combined,Restoration of signals that have been distorted,

    Transform operations. In fast block LMS algorithmthe filter parameters are adapted in the frequencydomain by using the FFT algorithm.

    In this paper, a new DA based FBLMS is proposedto reduce area and power consumption. Distributedarithmetic (DA) is an efficient multiplication-freetechnique for calculating inner products. The

    multiplication operation is replaced by a mechanismthat generates partial products and then sums theproducts together. The key difference betweendistributed arithmetic and standard multiplication isin the way the partial products are generated andadded together. In the proposed architecture, usingthe DA based FFT (and IFFT) block FFT can be

    efficiently calculated by jointly employing the Good-Thomas and Rader algorithms. DA based FFT blockprovides only half of the conjugate symmetric outputs

    without calculating others and in DA based IFFTblock we require to feed only half of the conjugatesymmetric coefficients not all, which is not possible

    in existing FBLMS based adaptive filters. Since inthe proposed FBLMS algorithm based adaptive filter,

    we require to calculate only half of the conjugatesymmetric coefficients, under that condition thehardware requirements for our proposed system isapproximately half of that of existing one.

    II. DISTRIBUTED ARITHMETICDistributed arithmetic (DA) is first

    introduced by Croisier, and Zohar and furtherdeveloped by Peled and Liu more than three decadesago. The DA is a direct method for sum of productsoperations, partial products can pre-compute by

    difference equation and storing in look-up-table

    (LUT) contained in memory, input signals are usedfor addressing.

    Consider the following inner product of two N

    dimensional vectors c and x, where c is a knownconstant vector, x is the input sample vector, and y is

    the result.

    = , = 1=0 (1)= 00 + 11+. . .+ 1 1 (2)

    An unsigned DA system assumes that the variablex[n] is given by

    = 2 1=0 0,1 (3)where xb[n] denotes the b

    thbit ofx[n], i.e., the n

    th

    sample ofx. The inner product y can, therefore, berepresented as:

    =

    2

    1

    =0

    1

    =0(4)

    in more compact form

  • 7/29/2019 Jisha Ijera Paper

    2/5

    Jisha M. Vijayan, Arathy Iyer / International Journal of Engineering Research and

    Applications (IJERA) ISSN: 2248-9622 www.ijera.com

    Vol. 3, Issue 1, January -February 2013, pp.1671-1675

    1672 | P a g e

    = 2 , (5)1=0

    1=0

    =

    2

    1

    =0 (

    ,

    )

    1

    =0(6)

    In case of signed DA, we use modified equation inorder to process a signed twos complement number.We use, therefore, the following (B + 1)-bitrepresentation

    = 2 + 2 (7)1=0 the outcomeyis defined by

    = 2 , + 2 , 1

    =01

    =0 (8)

    III. PROPOSED SYSTEM

    S/Ps(n)

    BUFFER

    OF

    LENGTH

    N= 2M

    DA-FFT

    of length

    N

    DA-IFFT

    of length

    N

    Save

    last N/2

    data

    y(n)y(k)

    Conjugate

    Multiply by

    DA-FFT of

    length N

    Append

    zero block

    Delete last

    block

    DA-IFFT of

    length N

    Delay

    (k)

    (k+1)

    DA-FFT of

    length N

    Insert zero

    block

    P/S

    S/Pd(n)

    Desired

    response

    Old

    s

    New

    s

    y

    Discarded

    e(n)

    SH(k)

    e0

    E(k)

    Discarded

    0

    S (k)

    Figure 1. Proposed DA based FBLMS Adaptive filter

    FFT and IFFT are the blocks that include largenumber of multipliers. Introduction of DA eliminatesthe need of that multipliers by a mechanism thatgenerates partial products and then sums the productstogether and resulting system will have high areaefficiency.

    IV. INTRODUCTION TO FFTUSING DA

    Figure 2. FFT Blocks

    Using DA, FFT can be efficiently calculated byjointly employing the Good-Thomas and Raderalgorithms.

    A. Good-Thomas Index MappingGood-Thomas algorithm re-expresses the

    discrete Fourier transform (DFT) of a size N = N1N2

    as a two-dimensional N1N2DFT, but only for the casewhere Nland N2are relatively prime.The index mapping suggest by Good and Thomas forn is

    = 21 + 12 0 1 1 10 2 2 1 (9)and as index mapping forkresults

    = 22111 + 11122 0 1 1 10 2 2 1 (10)A = 12-point DFT can be computed accordingto following steps:1) Index transform of input sequence, according to

    equation for n.

    2) Computation of 2 DFTs of length 1 usingRader algorithm.

    3) Computation of 1 DFTs of length 2 usingRader algorithm.

    4) Index transform of input sequence, according toequation for k.

    To find the 14 point FFT block, Fig. 3 shows aschematic of a block processing system.

  • 7/29/2019 Jisha Ijera Paper

    3/5

    Jisha M. Vijayan, Arathy Iyer / International Journal of Engineering Research and

    Applications (IJERA) ISSN: 2248-9622 www.ijera.com

    Vol. 3, Issue 1, January -February 2013, pp.1671-1675

    1673 | P a g e

    Figure 3. Mapping of 14 point FFT using Good-

    Thomas Algorithm

    From Fig. 3, we realize that first stage has 2 DFTs

    each having 7-points and second stage has 7 DFTseach having of length 2. Here multiplication withtwiddle factors between the stages is not required.B.Rader Algorithm

    It is easily seen from the definition of theDFT that the transform of a length N real sequence

    x(n) has conjugate symmetry, i.e. X(N - K) = X*(K).This property facilitates to compute only half of thetransform, as the remaining half is redundant andneed not be calculated. Rader algorithm providesstraightforward way to compute only half of the

    conjugate symmetric outputs without calculating theothers. Algorithm presented here first decomposes theone dimensional DFT into a multidimensional DFTusing the index map proposed by Good. Next, amethod which is based on the index permutationproposed by Rader is used to convert the short DFTs

    into convolution. This method changes a prime lengthN DFT of real data into two convolutions of length

    (N- 1)/2.

    Now consider1 = 7 if the data are real we need tocalculate only half of the transform. Also, as Radershowed the zero frequency term must be calculatedseparately.

    (0) = 1=0 (11)In matrixform, we write

    123 = 1 2 3

    2 4 6

    3 6 2

    4 5 61 3 5

    5 1 4

    1

    2

    3456+ 000 (12)

    Replacing by(),123 =

    1 2 3

    2 3 13 1 2

    3 2 11 3 22 1 3

    12

    3

    456

    + 0

    00 (13)

    If real and imaginary parts of W matrix are separated,a simplification is possible. Consider first the real

    part using notation in matrix that k stands fors(2/7). The real part becomes12

    (3)

    = 1 2 32 3 13 1 2

    (1) + (6)(2) + (5)

    (3) +

    (4)

    + 00

    (0)

    (14)Using the notation of k for n(2/7) gives, for theimaginary part

    12(3) = 1 2 32 3 13 1 2

    (1) (6)(2) (5)(4) (3)(15)

    These are cyclic convolution relation. Since in ourproblem we always convolve with the same

    coefficients, arithmetic efficiency can be improved byprecalculating some of the intermediate results. Theseare stored in table in memory and simply addressed

    as needed. Using distributed arithmetic this can beimplemented efficiently.

    x 0

    x 2

    x 4

    x 5

    x(3)

    x 1

    x(13)

    x 11

    x(9)

    x(7)

    x 12

    x 10

    x(8)

    x(6)

    X(0)

    X(7)

    X(8)

    X(1)

    X(2)

    X(9)

    X(10)

    X(3)

    X(4)

    X(11)

    X(12)

    X(5)

    X(6)

    X(13)

    k1=0

    k1=1

    k1=2

    k1=3

    k1=4

    k1=5

    k1=6

    n2

    =1

    n2

    =0

  • 7/29/2019 Jisha Ijera Paper

    4/5

    Jisha M. Vijayan, Arathy Iyer / International Journal of Engineering Research and

    Applications (IJERA) ISSN: 2248-9622 www.ijera.com

    Vol. 3, Issue 1, January -February 2013, pp.1671-1675

    1674 | P a g e

    R1+

    x(1)

    x(6)

    R2+

    x(2)

    x(5)

    R3+

    x(3)

    x(4)

    R4ROM 1

    R1-

    x(1)

    x(6)

    R2-

    x(2)

    x(5)

    R3-

    x(3)

    x(4)

    R4ROM 2

    ADD_SUB

    R5

    R6

    R7

    +

    +

    +

    x(0)

    Xk(1)

    Xk(2)

    Xk(3)

    ADD_SUB

    R5

    R6

    R7

    Xl(1)

    Xl(2)

    Xl(3)

    + X(0)

    R0

    x0x1x2x3x4x5x6

    Figure 5. Architecture for FFT using DA.

    V. EXPERIMENTAL RESULTSVerilog codes are written for both FFT using

    DA and shift and add method and synthesized using

    Xilinx 13.2 version. Family of device was Spartan 3E

    and target device was XC3S500E. Fig. 6 shows thelogic utilization of both the architectures and Table 1shows Delay in ns. From these results, it is clear that

    clear that our proposed architecture based adaptivefilter is faster than that used by shift and add methodand Area utilization using DA is very much less than

    that by shift and add method. Thus the powerutilization is also less in case of FBLMS using DA

    Table 1. Delay Comparison

    Design Minimum Period in

    ns

    1616 DA Multiplier 35.2811616 SA Multiplier 36.296

    Figure 6. Comparison of resource utilization in

    FPGA

    VI. CONCLUSIONIn this paper, we have proposed a new area

    efficient FBLMS adaptive filter. FBLMS is the fastestand computationally efficient adaptive algorithm andFFT, the main computational block in FBLMS can be

    efficiently calculated by DA. The concept of DAinvolves for implementation of FFT block without anyhardware multiplier using LUT and adders. Due to

    reduced hardware complexity the proposed DA basedFBLMS adaptive filter is best suitable forimplementation of higher order filters in FPGA

    efficiently with minimum area requirement, lowpower dissipation.

    REFERENCES[1] S. Haykin, Adaptive Filter Theory, 4th ed., T.

    Kailath, Ed. Pearson Education, 2008.

    [2] U Meyer Baese, Digital Signal Processing Withfield Programmable Gate Arrays,3rd ed.

    [3] Sudhanshu Baghel , Rafiahamed Shaik, FPGAImplementation of Fast Block LMS AdaptiveFilter Using Distributed Arithmetic for High

    Throughput, p443-p447, 2011.

    [4] Arman Chahardahcherik, Yousef S. Kavian, Otto

    Strobel, and Ridha Rejeb, Implementing FFTAlgorithms on FPGA, IJCSNS InternationalJournal of Computer Science and NetworkSecurity, VOL.11 No.11, pp. 148 156,November 2011.

    [5] S. A. White, "Applications of distributedarithmetic to digital signal processing: A tutorial

    review,"IEEE ASSP Magazine, July 1989.[6] C. S. Burrus, "Index mappings for

    multidimensional formulation of the DFT andconvolution," IEEE Transactions On Acoustics,

    Speech, And Signal Processing, vol. 25, pp. 239-242, June 1977.

    [7]

    D.J. Allred, W. Huang, Y.Krishnan, H. Yoo, D.VAnderson, "An FPGA

    4292

    2576

    7113

    3402

    2102

    5639

    0

    1000

    2000

    3000

    4000

    5000

    6000

    7000

    8000

    Slices Slice flip flops LUTs

    FFT using shift and add FFT using DA

  • 7/29/2019 Jisha Ijera Paper

    5/5

    Jisha M. Vijayan, Arathy Iyer / International Journal of Engineering Research and

    Applications (IJERA) ISSN: 2248-9622 www.ijera.com

    Vol. 3, Issue 1, January -February 2013, pp.1671-1675

    1675 | P a g e

    implementation for a high throughput adaptivefilter using distributedarithmetic." 12th Annual IEEE Symposium onField-Programmable

    Custom Computing Machines, pp. 324 - 325,

    2004.

    [8] S. K. M. Gregory A. Clark and S. R. Parker,

    "Block implementation of adaptive digitalfilters," IEEE Transactions on Circuits andSystems, vol. 28, pp. 584592, 1981.

    [9] C. M. Rader, "Discrete fourier transforms whenthe number of data samples is prime," IEEEProceedings, vol. 56, pp. 1107-1108, June 1968.