FPGA Based Hybrid LMS Algorithm Design on Distributed Arithmetic


description

Filters play a major role in removing unwanted signal or noise from the original signal. Adaptive FIR filters in particular are attractive for many applications where the computational requirement must be minimized. This paper presents a novel pipelined design for a low-power, high-throughput, and low-area implementation of an adaptive filter based on distributed arithmetic (DA). A DA formulation that uses two separate blocks for the weight update and the filtering operation requires a larger area and is not suited to higher-order filters, and therefore reduces throughput. These issues are overcome by an efficient distributed formulation of adaptive filters. LMS adaptation performed on a sample-by-sample basis is replaced by a dynamic LUT update driven by the weight update scheme. Adder-based shift accumulation for inner product computation is replaced by conditional signed carry-save accumulation to reduce the sampling period and area density.

Yamuna P | B K Venu Gopal "FPGA Based Hybrid LMS Algorithm Design on Distributed Arithmetic" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-1, December 2018, URL: https://www.ijtsrd.com/papers/ijtsrd19018.pdf Paper URL: http://www.ijtsrd.com/engineering/electronics-and-communication-engineering/19018/fpga-based-hybrid-lms-algorithm-design-on-distributed-arithmetic/yamuna-p

Transcript of FPGA Based Hybrid LMS Algorithm Design on Distributed Arithmetic

Page 1: FPGA Based Hybrid LMS Algorithm Design on Distributed Arithmetic

International Journal of Trend in Scientific Research and Development (IJTSRD)
International Open Access Journal | www.ijtsrd.com
ISSN No: 2456-6470 | Volume - 3 | Issue - 1 | Nov - Dec 2018
@ IJTSRD | Available Online @ www.ijtsrd.com | Volume - 3 | Issue - 1 | Nov-Dec 2018 | Page: 558

FPGA Based Hybrid LMS Algorithm Design on Distributed Arithmetic

Yamuna P1, B K Venu Gopal2
1PG Scholar, 2Associate Professor
Department of Electronics and Communication Engineering,
University Visvesvaraya College of Engineering, Bengaluru, Karnataka, India

ABSTRACT
Filters play a major role in removing unwanted signal or noise from the original signal. Adaptive FIR filters in particular are attractive for many applications where the computational requirement must be minimized. This paper presents a novel pipelined design for a low-power, high-throughput, and low-area implementation of an adaptive filter based on distributed arithmetic (DA). A DA formulation that uses two separate blocks for the weight update and the filtering operation requires a larger area and is not suited to higher-order filters, and therefore reduces throughput. These issues are overcome by an efficient distributed formulation of adaptive filters. LMS adaptation performed on a sample-by-sample basis is replaced by a dynamic LUT update driven by the weight update scheme. Adder-based shift accumulation for inner product computation is replaced by conditional signed carry-save accumulation to reduce the sampling period and area density.

KEY WORDS: Distributed Arithmetic, LMS Adaptive Filter, VHDL, DA, Adaptive Filter, LMS algorithm, Carry-save adder, Noise Cancellation, Digital Signal Processing

1. INTRODUCTION
Adaptive digital filters find wide application in several advanced digital signal processing (DSP) areas, e.g., noise and echo cancellation, system identification, channel estimation, channel equalization and so forth [1]. The tapped-delay-line finite-impulse-response (FIR) filter whose weights are updated by the celebrated "Widrow Hoff" least-mean-square (LMS) algorithm [2] might be considered the most straightforward known adaptive filter. The LMS adaptive filter is prominent because of its low complexity, as well as its stability and attractive convergence performance [3]. Because of its several essential uses of current relevance and the growing constraints on area, time, and power complexity, efficient implementation of the LMS adaptive filter is still vital.

To implement the LMS algorithm, one needs to update the filter weights during each sampling period using the estimated error, which equals the difference between the current filter output and the desired response. The weights of the LMS adaptive filter are then refreshed. Here µ is the convergence factor or step size, which is usually assumed to be a positive number, and N is the number of weights used in the LMS adaptive filter.

A pipelined structure is proposed across the time-consuming combinational blocks of the structure. The conventional LMS algorithm does not support pipelined implementation because of its recursive behavior, so it is modified into a form called the delayed LMS (DLMS) algorithm. DLMS designs use a modified systolic architecture to reduce the adaptation delay, employing relatively large processing elements (PEs) to achieve a lower adaptation delay with a critical path of one MAC operation.

The direct-form configuration on the forward path of the FIR filter leads to a long critical path, because an inner product must be computed to obtain each filter output. Consequently, when the input signal has a high sampling rate, it becomes crucial to reduce the critical path of the structure so that it does not exceed the sampling period [4]. The distributed arithmetic based technique uses a multiplier-free structure, which increases the throughput.

2. FIR FILTER
The FIR filter is a circuit that filters a digital signal (samples of numbers) and provides output that is

Page 2: FPGA Based Hybrid LMS Algorithm Design on Distributed Arithmetic


another digital signal with attributes that depend on the response of the filter. The FIR filter is non-recursive. It takes a finite duration of non-zero input values and produces a finite duration of non-zero output values. These days, Field Programmable Gate Array (FPGA) technology is widely used in the digital signal processing area, because FPGA-based solutions can achieve high speeds owing to their parallel structure and configurable logic, thus providing a great amount of flexibility and high reliability throughout the design process and later maintenance. A number of architectures have been reported in the literature for memory-based implementation of DSP algorithms, including orthogonal transforms and digital filters [6]. FIR filters use addition to calculate their outputs, just as averaging does. The primitive elements used in the design of an FIR filter are delays, multipliers and adders. The FIR filter comprises a series of delays, multiplications and additions that create the time-domain output response. The impulse response of the FIR filter is the set of multiplication coefficients used. The phase of an FIR filter can be made exactly linear, which is its dominant feature. The filter is implemented sequentially using a multiplier and an adder with a feedback, as outlined in the high-level schematic in Fig. 1 [11].
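The delay, multiply and add structure described above can be sketched as a small software model. This is only an illustration and not the paper's hardware design; the function name `fir_filter` and the sample values are assumptions introduced here.

```python
def fir_filter(coeffs, signal):
    """Direct-form FIR filter: a tapped delay line, one multiply per tap, one sum."""
    delay_line = [0] * len(coeffs)           # hypothetical delay registers, newest first
    outputs = []
    for sample in signal:
        # Shift the delay line by one position and insert the new sample.
        delay_line = [sample] + delay_line[:-1]
        # Multiply each delayed sample by its coefficient and add the products.
        outputs.append(sum(c * d for c, d in zip(coeffs, delay_line)))
    return outputs


# Feeding a unit impulse returns the coefficients, i.e. the impulse response.
impulse_response = fir_filter([1, 2, 3], [1, 0, 0, 0])
```

Because the filter has no feedback from output to input, the response to an impulse dies out after as many samples as there are taps.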

Fig. 1: FIR Filter

The LMS algorithm updates the coefficients iteratively and feeds them to the FIR filter. The FIR filter then uses these coefficients c(n) together with the input reference signal x(n) to generate the output response y(n). The output y(n) is then subtracted from the desired signal d(n) to produce an error signal e(n), which is in turn used by the LMS algorithm to compute the next set of coefficients.

3. ADAPTIVE FIR FILTER
The idea behind a closed-loop adaptive filter is that a variable filter is adjusted until the error (the difference between the filter output and the desired signal) is minimized. Most adaptive filters are dynamic filters which change their attributes to achieve the desired output. An adaptive filter has various parameters which should be changed in order to maintain optimal filtering. Adaptive filtering has two types of algorithms: Least Mean Square (LMS) and Recursive Least Squares (RLS) optimal filtering. One of the applications of adaptive filters is noise cancellation. An adaptive filter has two vital properties: first, it can work successfully in an unknown environment; second, it can track an input signal with time-varying characteristics [7]. Because a DSP processor has a serial architecture, it cannot handle high-sampling-rate applications and is used only for extremely complex math-intensive tasks. An example of an adaptive filter system is shown in Fig. 3. The objective of the adaptive algorithm is to modify the coefficients of the adaptive filter, w[n], so as to minimize the error term e[n] in the mean-squared sense. The adaptive algorithm basically identifies a vector of coefficients w[n] that minimizes the following quadratic equation,

$$\xi[n] = E\{|e[n]|^2\} \qquad (3.1)$$

where e[n] = d[n] − y[n], d[n] is the output from the unknown system (desired signal), y[n] is the output from the adaptive filter, E{·} is the expected value, and ξ[n] is the mean squared error (MSE). Many methods exist to solve Eqn. 3.1, the most widely recognized being the method of steepest descent. This procedure continues until the adaptive algorithm reaches steady state, where the difference between the current solution vector and the optimal solution, or MSE, is at its minimum. In general, adaptive algorithms can be separated into four steps that are performed consecutively: filtering, computing the error, calculating the coefficient updates, and updating the coefficients. When split into this form, the essential difference [...] prediction and interference canceling.

Fig 3: Unknown system and adaptive filter have different lengths
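As a worked step connecting Eqn. 3.1 to a practical update rule, the steepest-descent recursion can be written out as follows. This is a standard textbook derivation rather than something stated in the paper; it assumes real-valued signals, and the input vector symbol $\mathbf{x}[n]$ is introduced here for illustration.

$$\mathbf{w}[n+1] = \mathbf{w}[n] - \mu \, \nabla_{\mathbf{w}} \, \xi[n]$$

Replacing the expectation in $\xi[n]$ by its instantaneous estimate $e^2[n]$, with $e[n] = d[n] - \mathbf{w}^T[n]\,\mathbf{x}[n]$, gives

$$\nabla_{\mathbf{w}} \, e^2[n] = -2\, e[n]\, \mathbf{x}[n] \quad\Rightarrow\quad \mathbf{w}[n+1] = \mathbf{w}[n] + 2\mu\, e[n]\, \mathbf{x}[n]$$

The factor 2 is commonly absorbed into the step size, which leaves an update that needs only the current error and the current input samples.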


Page 3: FPGA Based Hybrid LMS Algorithm Design on Distributed Arithmetic


Adaptive filtering has two basic operations: the filtering process and the adaptation process. A digital filter is used to generate the output signal from an input signal by the filtering process, whereas the adaptation process consists of an algorithm which minimizes a desired cost function by adjusting the coefficients of the filter.

4. LMS ADAPTIVE FILTER (EXISTING DESIGN)
The LMS algorithm is introduced here. LMS stands for Least-Mean-Square. This algorithm was created by Bernard Widrow during the 1960s, and was the first widely used adaptive algorithm. It is still widely used in adaptive digital signal processing and adaptive antenna arrays, primarily because of its simplicity, ease of implementation and good convergence properties.

The objective of the LMS algorithm is to deliver the minimum mean square error for the given conditions, that is, the weights that minimize the mean-squared error between a desired signal and the array's output. Loosely speaking, it attempts to maximize reception towards the desired signal (who or what the array is trying to communicate with) and limit reception from the interfering or undesirable signals, as in Fig 4.

Fig 4: Conventional LMS adaptive filter

Some data is required before the optimal weights can be resolved. The weights of the LMS adaptive filter during the nth iteration are updated according to the following equations:

$$\mathbf{w}_{n+1} = \mathbf{w}_n + \mu\, e_n\, \mathbf{x}_n$$
$$e_n = d_n - y_n$$
$$y_n = \mathbf{w}_n^T \mathbf{x}_n$$

where $e_n$ is the error and $d_n$ is the desired response calculated during the nth iteration, and µ is the convergence factor or step size. The weight vector $\mathbf{w}_n$ and input vector $\mathbf{x}_n$ are given by

$$\mathbf{w}_n = [w_n(0), w_n(1), \ldots, w_n(N-1)]^T, \qquad \mathbf{x}_n = [x(n), x(n-1), \ldots, x(n-N+1)]^T$$
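The three update equations can be exercised with a short floating-point sketch. It is a plain software model of the conventional LMS recursion, not the paper's DA-based hardware; the names `lms_step` and `identify`, the test sequence and the step size are assumptions chosen for illustration.

```python
def lms_step(weights, x_vec, desired, mu):
    """One LMS iteration: filter, compute the error, update the weights."""
    y = sum(w * x for w, x in zip(weights, x_vec))        # y_n = w_n^T x_n
    e = desired - y                                       # e_n = d_n - y_n
    new_weights = [w + mu * e * x for w, x in zip(weights, x_vec)]
    return new_weights, y, e


def identify(unknown, inputs, mu):
    """Adapt weights so the filter mimics a hypothetical unknown FIR system."""
    n_taps = len(unknown)
    weights = [0.0] * n_taps
    history = [0.0] * n_taps                              # newest sample first
    for x in inputs:
        history = [x] + history[:-1]
        desired = sum(u * h for u, h in zip(unknown, history))
        weights, _, _ = lms_step(weights, history, desired, mu)
    return weights


# Deterministic, zero-mean test input in [-1, 1] (an arbitrary choice).
inputs = [((7 * k) % 11 - 5) / 5 for k in range(2000)]
estimate = identify([0.5, -0.25], inputs, mu=0.1)
```

With noise-free data and a small step size, `estimate` approaches the coefficients of the unknown system; a larger µ converges faster but risks instability.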


The number of weights used in the LMS adaptive filter is denoted as N. It is well known that the LMS algorithm converges slowly for correlated inputs. Another situation that can arise in identification applications is when the coefficients of the model are time-varying. The adaptive algorithm should provide a mechanism to track the changes of the model. The LMS algorithm shown in Fig 4 has a very low computational complexity (number of additions, subtractions, divisions, multiplications per iteration) and memory load, which makes it very attractive for practical implementations. It is notable that the step size µ influences the performance of the adaptive filter. In spite of its low complexity, the LMS algorithm also has some drawbacks.

5. DISTRIBUTED ARITHMETIC (DA)
Distributed arithmetic (DA) is commonly used for signal processing algorithms where computing the inner product of two vectors accounts for most of the computational workload. This kind of computing profile describes a large part of signal processing algorithms, so the potential usage of distributed arithmetic is huge. The inner product is usually computed using multipliers and adders. Distributed Arithmetic, alongside Modulo Arithmetic, are computation algorithms that perform multiplication with look-up-table-based schemes. Both stirred some interest more than two decades ago but have languished since then. In fact, DA specifically targets the sum-of-products (sometimes referred to as the vector dot product) computation that covers a large number of the important DSP filtering and frequency transforming functions. Xilinx FPGA look-up tables are not familiar to DSP users, and the DA algorithm is popularly used to produce filter designs. The DA algorithm itself is extremely simple, yet its applications are extremely broad. The mathematics combines Boolean and ordinary algebra and requires no prior preparation, even for the logic designer.

DA fundamentally reduces the number of additions required for filtering. This reduction is especially noticeable for filters with high bit precision. The reduction in the computational workload is a result of storing the pre-computed partial sums of the filter coefficients in the memory table. Distributed arithmetic needs fewer arithmetic computing resources and no multipliers when compared with other alternatives.
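To make the role of the pre-computed partial sums concrete, the usual DA rearrangement of an inner product can be written out. This is a standard derivation under the assumption of B-bit two's-complement inputs; the symbols $B$, $x_{k,b}$ and $F(b)$ are introduced here and do not appear in the paper.

$$y = \sum_{k=0}^{N-1} w_k\, x_k, \qquad x_k = -x_{k,B-1}\, 2^{B-1} + \sum_{b=0}^{B-2} x_{k,b}\, 2^{b}$$

Substituting the bit expansion and exchanging the order of summation gives

$$y = -2^{B-1} F(B-1) + \sum_{b=0}^{B-2} 2^{b}\, F(b), \qquad F(b) = \sum_{k=0}^{N-1} w_k\, x_{k,b}$$

Each $x_{k,b}$ is a single bit, so $F(b)$ can take only $2^N$ distinct values. These values can be stored in a table addressed by the N bits $x_{0,b}, \ldots, x_{N-1,b}$, which leaves only shifts and additions at run time.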


Page 4: FPGA Based Hybrid LMS Algorithm Design on Distributed Arithmetic


This aspect of distributed arithmetic makes it a good fit for computing environments with limited computational resources, especially multipliers. These kinds of computing environments can be found on older field-programmable gate arrays (FPGAs) and low-end, low-cost FPGAs.

Fig 5: Distributed arithmetic

"Note that c_k for {0 ≤ k ≤ 2^N − 1} can be pre-computed and stored in a RAM-based LUT of 2^N words. However, instead of storing 2^N words in the LUT, we store (2^N − 1) words in a DA table of 2^N − 1 registers. An example of such a DA table for N = 4 is shown in Fig 5. It contains only 15 registers to store the pre-computed sums of input words. Seven new values of c_k are computed by seven adders in parallel, Fig 5 [11]. A_k is included in the sum when x_kb = 1, where b = 0 denotes the least significant bit. The content of the kth LUT location can be expressed as c_k = Σ_j A_j k_j, where k_j is the (j + 1)th bit of the N-bit binary representation of the integer k for 0 ≤ k ≤ 2^N − 1. A 4-tap (K = 4) implementation of the DA filter is shown in Figure 3.1. The table contains all 16 possible combination sums of the filter weights w0, w1, w2 and w3. The bank of shift registers in Fig 5 [11] stores 4 consecutive input samples (x[n − i], i = 0, ..., 3). The concatenation of the rightmost bits of the shift registers becomes the address of the memory table". At every clock cycle the shift register is shifted right.
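The table look-up described above can be modelled in software as a rough sketch. It assumes integer weights, two's-complement input samples of a fixed bit width, and a full table of 2^N words (the all-zero entry is simply stored rather than omitted); the names `build_lut` and `da_inner_product` are hypothetical.

```python
def build_lut(weights):
    """Entry k holds the sum of the weights whose index bit is set in k."""
    n = len(weights)
    return [sum(w for j, w in enumerate(weights) if (k >> j) & 1)
            for k in range(1 << n)]


def da_inner_product(lut, samples, bits):
    """Bit-serial inner product: one table look-up per input bit position."""
    acc = 0
    for b in range(bits):                         # LSB first
        address = 0
        for j, x in enumerate(samples):
            address |= ((x >> b) & 1) << j        # bit b of every sample forms the address
        term = lut[address] << b                  # weight the look-up by 2**b
        # The most significant bit of a two's-complement word has negative weight.
        acc += -term if b == bits - 1 else term
    return acc


lut = build_lut([3, -1, 4, 2])
result = da_inner_product(lut, [5, -3, 7, -8], bits=5)
```

No multiplication of a weight by a sample occurs anywhere in the loop; the only operations are table reads, shifts and additions, which is why the approach suits devices with few multipliers.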


6. DISTRIBUTED ARITHMETIC BASED APPROACH (PROPOSED DESIGN)

The basic block diagram of the proposed system for the design of an adaptive FIR filter using distributed arithmetic is the proposed DA-based LMS adaptive filter. The proposed structure of the DA-based adaptive filter of length N = 4 is shown in Fig 6.1 [11]. It comprises a four-point inner-product block and a weight-increment block, along with additional circuits for the calculation of the error value e(n) and the control word t for the barrel shifters. Compared with existing DA-based adaptive filter implementations, the proposed architecture achieves low complexity.

Fig 6.1: DA-based LMS adaptive filter of filter length N

To achieve this advantage, Offset Binary Coding (OBC) has been used in this filter design. The DA-based LMS adaptive filter design requires a 16-delay-element DA-table structure.

6.1. FOUR POINT INNER-PRODUCT BLOCK
The first part of the filtering process of the LMS adaptive filter, for each cycle, is the need to perform an inner product calculation. The distributed arithmetic (DA) procedure is bit-serial in nature. It is essentially a bit-level rearrangement of the multiply-and-accumulate operation. The basic DA is a computational algorithm that affords an efficient implementation of the weighted sum of products, or dot product. The four-point inner-product block shown in Fig. 6.2 includes a DA table consisting of an array of 15 registers, which stores the partial inner products y_l for {0 < l ≤ 15}, and a 16:1 multiplexer (MUX) to select the content of one of those registers. Bit slices of the weights A = {w_3l w_2l w_1l w_0l} for 0 ≤ l ≤ L − 1 are fed to the MUX as control in LSB-to-MSB order, and the carry-save accumulator is fed by the output of the MUX.


Page 5: FPGA Based Hybrid LMS Algorithm Design on Distributed Arithmetic


After L bit cycles, the carry-save accumulator has shift-accumulated all the partial inner products and produces a sum word and a carry word of size (L + 2) bits each. The carry and sum words are shift-added with an input carry “1” to generate the filter output, which is then subtracted from the desired output d(n) to obtain the error e(n).
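The idea of keeping the running total as a separate sum word and carry word can be sketched as follows. This is a minimal illustration that assumes Python's unbounded integers in place of the (L + 2)-bit words, and it does not model the input carry “1” or the shifting of the hardware accumulator. The names `carry_save_add` and `csa_accumulate` are hypothetical.

```python
def carry_save_add(s, c, x):
    """3:2 compression of three operands into a sum word and a carry word.

    No carry is propagated across bit positions; s + c + x equals
    sum_word + carry_word.
    """
    sum_word = s ^ c ^ x
    carry_word = ((s & c) | (s & x) | (c & x)) << 1
    return sum_word, carry_word


def csa_accumulate(terms):
    """Accumulate all terms in carry-save form.

    A single carry-propagate addition (sum_word + carry_word) is needed
    only once, after the last term has been absorbed.
    """
    s, c = 0, 0
    for t in terms:
        s, c = carry_save_add(s, c, t)
    return s, c
```

Because each accumulation step is only a layer of full adders, the per-cycle delay does not depend on the word length; the final addition of the two words is the only place where carries ripple.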

Fig 6.2 Four-point inner-product block

All bits of the error except the most significant one are ignored, so that the multiplication of the input xk by the error is executed as a right shift, where the number of shift locations is given by the number of leading zeros in the magnitude of the error. “To generate the control word t for the barrel shifter, the magnitude of the computed error is decoded by the logic. The convergence factor µ is taken as O(1/N). Also, we use µ as 2^−i/N, where i is a small integer”. The number of shifts t in that case is increased by i, and the input to the barrel shifters is pre-shifted by i locations accordingly to reduce the hardware complexity.

“DA is used to compute the inner (dot) product of a constant coefficient vector and a variable input vector in a single bit-serial operation”. This is the task that contributes most to the critical path. Let the inner product be given by

$$y(n) = \mathbf{w}^{T}(n)\,\mathbf{x}(n), \qquad e(n) = d(n) - y(n)$$

where the weight vector w(n) and input vector x(n) at the nth iteration are respectively given by

$$\mathbf{x}(n) = [x(n),\, x(n-1),\, \ldots,\, x(n-N+1)]^{T}, \qquad \mathbf{w}(n) = [w_0(n),\, w_1(n),\, \ldots,\, w_{N-1}(n)]^{T}$$
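The two relations above can be exercised with a floating-point reference model. The sketch below is an assumption-laden illustration rather than the proposed hardware: it adds the standard LMS weight recursion w(n+1) = w(n) + µ e(n) x(n), which is not written out in this section, uses µ = 2^−i/N as in the text, and identifies a hypothetical unknown filter `h` from noise-free data. The helper names `lms_step` and `identify` are made up for the example.

```python
import random


def lms_step(w, x, d, mu):
    """One LMS iteration: filter output, error, and updated weights."""
    y = sum(wk * xk for wk, xk in zip(w, x))        # y(n) = w^T(n) x(n)
    e = d - y                                       # e(n) = d(n) - y(n)
    w_next = [wk + mu * e * xk for wk, xk in zip(w, x)]
    return y, e, w_next


def identify(h, num_samples, i=0, seed=0):
    """Adapt an N-tap filter towards the (hypothetical) unknown system h."""
    N = len(h)
    mu = 2.0 ** (-i) / N                            # convergence factor
    rng = random.Random(seed)
    w = [0.0] * N
    x = [0.0] * N                                   # tapped delay line
    for _ in range(num_samples):
        x = [rng.uniform(-1.0, 1.0)] + x[:-1]       # shift in a new sample
        d = sum(hk * xk for hk, xk in zip(h, x))    # desired output
        _, _, w = lms_step(w, x, d, mu)
    return w
```

With inputs bounded by one in magnitude, µ = 1/N keeps µ‖x(n)‖² at or below one, so the weight error cannot grow from one iteration to the next in this noise-free setting.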

6.2. THE WEIGHT INCREMENT UNIT

It comprises four barrel shifters and four adder/subtractor cells, as shown in Fig. 6.3 [11]. The barrel shifter shifts the different input values xk for {k = 0, 1, . . ., N − 1} by the appropriate number of locations, determined by the location of the most significant one in the estimated error. The barrel shifter output is the increment to be added to or subtracted from the current weights, which realizes the corresponding multiplication as a shift operation. To update n weights, the weight-update block consists of n carry-save units. Each carry-save unit performs the multiplication of the shifted error value with the delayed input samples, along with the addition to the old weights; in the calculation of the weight-increment term, the addition of the old weight to the weight-increment term is merged with the multiplication. The sign bit of the error is used to control the adder/subtractor cells such that, when the sign bit is zero, the barrel-shifter output is added to, and when it is one, subtracted from, the content of the corresponding current value in the weight register [11]. Each MAC unit multiplies the shifted value of the error with the delayed input samples xn-m, followed by the additions.
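The shift-and-add/subtract behaviour described above can be modelled in a few lines. This is a hedged sketch, not the circuit of Fig. 6.3: samples, weights and the error are assumed to be fixed-point values held as plain integers, the error magnitude is assumed to fit in `mag_bits` bits, and the pre-shift by i is folded into the control word t. The names `leading_zeros` and `weight_increment_unit` are hypothetical.

```python
def leading_zeros(magnitude, mag_bits):
    """Leading zeros of a non-negative magnitude in a mag_bits-wide field."""
    return mag_bits - magnitude.bit_length()


def weight_increment_unit(weights, x, e, mag_bits, i=0):
    """Update the weights with a shift in place of a multiplication.

    weights, x : lists of N integers (fixed-point values stored as ints)
    e          : signed integer error, assumed to satisfy |e| < 2**mag_bits
    i          : extra shift locations contributed by the convergence factor
    """
    if e == 0:
        return list(weights)                      # nothing to add or subtract
    t = leading_zeros(abs(e), mag_bits) + i       # barrel-shifter control word
    sign_bit = 1 if e < 0 else 0
    updated = []
    for w_k, x_k in zip(weights, x):
        increment = x_k >> t                      # barrel shifter (arithmetic shift)
        # Sign bit 0: add the shifter output; sign bit 1: subtract it.
        updated.append(w_k - increment if sign_bit else w_k + increment)
    return updated
```

Only the position of the most significant one of the error influences the update, so an error of 5 and an error of 7 produce the same increments in this model; that coarseness is the price paid for removing the multiplier.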

Fig 6.3 The weight increment unit

The final outputs of the carry-save units constitute the updated weights, which serve as an input to the error computation block as well as to the weight-update block for the next iteration. To keep the critical path equal to one addition time, a pair of delays is introduced before and after the final addition of the weight-update block.

7. SIMULATION AND RESULTS

Simulation and synthesis are performed with XILINX software using VHDL programming. It can be seen that the power consumption has reduced to just below half of that in the existing design. This results from the reduced switching activity of the design based on the carry-save adder. An efficient pipelined architecture is thus obtained for a low-power and low-delay implementation of the DA-based adaptive filter.

Fig 7.1 Simulation output for DA based weighted LMS algorithm

The simulation results in Fig. 7.1 and Fig. 7.2 show the output of the DA based weighted LMS algorithm and the filtered, shifted outputs of the adaptive LMS filter algorithm, respectively.

Fig 7.2 Simulation output for adaptive LMS based filter algorithm

A carry-save accumulation scheme of signed partial inner products for the computation of the filter output has been implemented. From the synthesis results, it was found that the proposed design consumes less power than our previous DA based FIR adaptive filter. In future, the work can be applied to digital communication, signal processing applications, digital radio receivers, software radio receivers and echo cancellation.

DESIGN      Gate Count   Delay (ns)   Power (mW)
EXISTING    31,500       2.3          91
PROPOSED    40,000       1.70         67

Table 1: Result comparison of existing and proposed design in terms of area, delay and power

8. CONCLUSION

The design presented in this paper achieves significantly less adaptation delay and provides significant savings in area and power. The proposed work deals with the direct-form LMS algorithm along with the transposed form of the delayed least mean square (TDLMS) algorithm, where the power is significantly reduced and the performance is increased along with a decrease in area. From the synthesis results, it was found that the proposed design consumes less power than our previous DA-based FIR adaptive filter.

9. ACKNOWLEDGMENTS

Thanks for the base idea paper by Dr. S. A. White, “Applications of the distributed arithmetic to digital signal processing: A tutorial review,” IEEE ASSP Mag., vol. 6. We would like to thank our mentor Manikandan, who has been greatly supportive, provided steady guidance and offered help to make this a success.

10. REFERENCES

1. Apolinário Jr, José A., and Sergio L. Netto. “Introduction to Adaptive Filters.” In QRD-RLS Adaptive Filtering, pp. 1-27. Springer US, 2009.

2. B. Widrow and S. D. Stearns, Adaptive signal processing. Prentice Hall, Englewood Cliffs, NJ, 1985.

3. S. Haykin and B. Widrow, Least-mean-square adaptive filters. Wiley-Interscience, Hoboken, NJ, 2003.

4. Park, Sang Yoon, and Pramod Kumar Meher. “Low-power, high-throughput, and low-area adaptive FIR filter based on distributed arithmetic.” Circuits and Systems II: Express Briefs, IEEE Transactions on 60, no. 6 (2013): 346-350.

5. D. J. Allred, H. Yoo, V. Krishnan, W. Huang, and D. V. Anderson, “LMS adaptive filters using distributed arithmetic for high throughput,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 52, no. 7, pp. 1327–1337, Jul. 2005.

6. P. K. Meher, “LUT Optimization for Memory-Based Computation,” IEEE Trans on Circuits & Systems-II, pp. 285-289, April 2010.

7. Haykin, Simon S. Adaptive filter theory. Pearson Education India, pp.18, 1996.

8. A. Croisier, D. Esteban, M. Levilion, and V. Rizo, “Digital filter for PCM encoded signals,” US Patent 3,777,130, 1973.

9. S. Zohar, “New Hardware Realizations of Nonrecursive Digital Filters,” IEEE Transactions on Computers, vol. C-22, no. 4, pp. 328–338, 1973.

10. A. Peled and B. Liu, “A New Hardware Realization of Digital Filters,” IEEE Transactions on ASSP, vol. 22, no. 6, pp. 456–462, 1974.