DISP 2003 Lecture 5 – Part 2 Digital Filters 2 Implementation of FIR Filters IIR Filters...

DISP 2003Lecture 5 – Part 2

Digital Filters 2

Implementation of FIR FiltersIIR Filters Properties, Design and Implementation

Philippe Baudrenghien, AB-RF

20 March 2003 P Baudrenghien AB/RF 2

Basic Structure for FIRZ-1 Z-1 Z-1 Z-1 Z-1

x(n) x(n-1) x(n-2) x(n-3) x(n-N+1)

Xh0

Xh1

Xh2

Xh3

XhN-2

XhN-1

x(n-N+2)

SUM

y(n)

Tapped-delay line (N-1) delaysN multipliers1 adder (N inputs)

∑1-N

=k

*

0

k)- x(nh(k)=x(n)h(n)=y(n)


Structure for symmetric FIRTapped-delay line (N-1) delays

(N+1)/2 multipliers

(N-1)/2 adders (2 inputs)

1 adder (N+1)/2 inputs

Z-1 Z-1 Z-1 Z-1

x(n) x(n-1) x(n-2) x(n-3)

y(n)

Xh0 X

h1

x(n-N+1) x(n-N+2)

Z-1Z-1 Z-1 Z-1

x(n-N+3)

x(n-(N-1)/2)

Xh2 X X

h(N-1)/2

h(N-3)/2

We save on multipliers (~50 %)


An hardware example: LF3320 (1)

Used in the Feed-forward and Feed-back on the SPS 200 MHz cavities and in the SPS longitudinal damper.

83 MSPS

introduced in 1998


An hardware example: LF3320 (2)12 bits Data in

12 bits Coefficients

23 bits at multiplier output

32 bits after overall sum

16 bits output

RoundingLimitSelect

Delay line out for cascading

2x8 coeff.-> 16 (32) taps(x 256)


Better hardware under development …

165 MSPS

not available yet …


How about FPGA (CPLD) ?

Used in the SPS transverse damper. Foreseen in the SNS and LHC Low Level RF.

>200 MSPS!

introduced in 2002


IIR Design based on pole-zero plot• Idea: Deduce the location of

the poles and zeros from the desired frequency response.

• Example 3: Comb Filter want periodic resonances at 0,

0, 2 0, …, (N-1) 0

realized by a series of equispaced poles, on a circle of radius r, and at angles 0, 20,, 40, ,…, 20

r

unit circle z-plane

20

0

0

N=24

gain

0 0 0


Comb Filter used in accelerators (1)

• Transfer function (assuming that N.0= 1 )

• Poles

• Difference equation: y(n) = a y(n-N) + x(n)• Let 0 = 1/N = Frev/Fs with Frev the accelerator revolution

frequency -> Periodic Band-pass filter with Pass bands at multiple of the revolution frequency and Stop bands in between

Widely used in Low Level RF systems to filter out the periodic transient due to the passage of the beam in the RF cavity (transient beam loading).

NN- r=a with

z a -1

1 = H(z)

1-N,0,1,=kN

kπj2

Nk

N for e . a=p → a =p


Comb Filter (2)• Implementation

1 adder 1 multiplier 1 N samples delay line

• Examples: SPS LLRF Fs=40 MHz, N = 924. Implemented with FIFO

(IDT722x5 or CY7C42x5) and ALU (IDT7381 or L4C381) LHC LLRF Fs = 40 MHz (or 80 MHz), N = 3564 (or 7128).

Implemented with FPGA (Logic cells plus embedded memory)

x(n)

z-N

y(n)

+

+

a


IIR Design based on Analog Prototype• Idea: Transform an analog prototype (Butterworth,

Chebyshev, Elliptic) into a digital filter

• The transformation from s-plane to z-plane must Map the [–jFs , +jFs] portion of the imaginary axis (s-plane) on

the unit circle in the z-plane Preserve stability

• The analog features are kept Digital Butterworth are monotonic in both the Pass band and Stop

band Digital Chebyshev have ripple in the Pass band but are monotonic

in the Stop band (or vice versa) Digital Elliptic are equiripple in both Pass band and Stop band


Example 4: Elliptic LP IIR (1)LPF example (as before):

Pass band End Fpass = 0.1

Pass band Ripple Dpass=0.05 (5%=0.45 dB)

Stop band Start Fstop= 0.13

Stop band Attenuation Dstop = 0.1 (=20dB)

• Specifications can be achieved with a minimum fourth-order elliptic filter

• Transfer function is the ratio of two fourth-order polynomials

Numerator:

0.108106225554593

-0.225674608761028

0.312898868850319

-0.225674608761028

0.108106225554593

Denominator:

1.000000000000000

-2.846667053723339

3.458441598722129

-2.013976284146102

0.484098739290319

Only 10 coefficients needed!!!

4-3-2-1-0

-4-3-2-10

z a+z a +z a+z a+a

z b+z b +z b+z b+b=

)z(D

)z(N = H(z)

4321

4321


Example 4: Elliptic LP IIR (2)

• Achieved frequency response on log scale (blue trace)

• Very non-linear phase response (in green)

• Exact zeros in the stop band (Elliptic)


Example 4: Elliptic LP IIR (3)• Pole-Zero plot in the z-

plane• Poles inside the unit circle

(stability) at azimuth in the Pass band

• Zeros on the unit circle (elliptic) at azimuth in the Transition band and Stop band

2 poles very close to the unit circle (to get a steep transition band) -> caution !Pole-zero cancellation -> caution!


Example 4: Elliptic LP IIR (4)

• Impulse response lasts forever (IIR!)

• Comparison with FIR design:Significant reduction in

the computational complexity: 10 Multiply/Add compared to 31 (+)

Very sensitive to quantization effects. See next lecture (-)


Basic Structure for IIR•Tapped-delay line (N or M) delays•N + M + 1 multipliers•2 adders (N and M+1 inputs)•D/N structure (or poles/zeros structure) also called Direct Form II

x(n) y(n)

Z-1

Z-1

Z-1

Z-1

X

-a1

X

-a2

X

-a3

X

-aM

X

b1

X

b2

X

b3

X

bM

Z-1

X

b0

w(n)

w(n-1)

w(n-2)

w(n-3)

w(n-M)

w(n-N)

Assume:N>Ma0 = 1

X

-aN

Very sensitive to the effects of coefficients quantization if N or M are large!!! (next lecture)


Cascade of biquads• biquad= second-order filter (2 zeros and 2 poles)

• Elliptic biquad: zeros on the unit circle -> b0 = b2= 1 -> remains elliptic after coefficients quantization (see next lecture)

• Split the high-order filter into a cascade of biquads

2-1-0

-2-10

2-1-0

-2-10

N-2-1-0

-M-2-10

z a′+z a′+a′z b′+z b′+b′

•z a+z a+a

z b+z b+b=

z a+ +z a+z a+a

z b+ +z b+z b+b = H(z)

21

21

21

21

N21

M21

Z-1

X

-a2

X

b2

x(n)

Z-1

X

-a1

X

b1

X

b0

Z-1

X

-a’2

X

b'2

Z-1

X

-a’1

X

b'1

X

b'0

y(n)

A1

scalingbetweenbiquads

Much less sensitive to the effects of coefficients quantization (next lecture)


How about using a DSP ? (1)

{ Cascaded IIR Biquad Sections (Direct Form II or Transposed Direct Form I) w(n) = x(n) + a1*w(n-1) + a2*w(n-2) beware of signs here! y(n) = w(n) + b1*w(n-1) + b2*w(n-2) (single biquad structure) Each section consists of: b2,b1,a2,a1,w(n-1),w(n-2) Notice that coefs have been normalized such that b0=1.0

Calling Parameters --- cascaded_biquad --- l0, l1, l8 = 0 m1, m8 = 1 f8 = first x(n) r0 = number of biquad sections (filter order / 2) b0 = address of delay line buffer b8 = address of coefficient buffer --- cascaded_biquad_init --- l0 = 0 r0 = number of biquad sections (filter order / 2) b0 = address of delay line buffer

Returned Values f8 = last y(n) cascaded_biquad delay line zeroed cascaded_biquad_init

Register File Usage: f2, f3, f4, f8, f12 cascaded_biquad f2 cascaded_biquad_init

Data Address Generator Usage i0, b1, i1, i8, m1, m8 cascaded_biquad i0 cascaded_biquad_init

Benchmark 6 + 4*(sections) cycles cascaded_biquad 5 + 2*(sections) cycles cascaded_biquad_init (incl. non-delayed rts)

•Analog Devices assembly code for calling their cascaded_biquad subroutine (DSP 21XXX family)•biquad section•performances

per output sample we have (6 + 4 x nbr biquads) cycleswith the 10 ns instruction cycle of the AD2116x we get 2 (biquads) x 4 (cycles) x 10 ns = 80 ns + 60 ns per sample for our fourth order elliptic IIR. The throughput is on the order of 6 MSPS (optimistic!)but you have floating point arithmetics!


How about using a DSP ? (2)Memory Usage --- cascaded_biquad --- 10 words instructions in PM 4*(sections) coefficients in PM 2*(sections) delay line storage in DM --- cascaded_biquad_init --- 5 words instructions in PM 2*(sections) delay line storage in DM "cascade.asm" Analog Devices, Inc. DSP Applications P.O.Box 9106 Norwood, MA 02062 Christoph D. Cavigioli ... 25-Apr-1991 }.GLOBAL cascaded_biquad, cascaded_biquad_init;.EXTERN coefs, dline;

.SEGMENT /PM pm_code;cascaded_biquad_init: { --- call this once to set up initial conditions --- } f2=0; lcntr=r0, do clear until lce; {for each section, do:} dm(i0,1)=f2; {clear w`` storage}clear: dm(i0,1)=f2; {clear w` storage} rts;

{ --- comments for subroutine called cascaded_biquad --- } {TERMINOLOGY: w` = w(n-1), w`` = w(n-2), NEXT = "of next biquad section"} { #1 clear f12, rd w``, rd a2 loop prologue } { #2 for each section, do: } { #3 w`à2, 1st=x+0,else=y, rd w`, rd a1 loop body } { #4 wà1, x+w`à2, wr new w`, rd b2 loop body } { #5 w``b2, new w, rd NEXT w``, rd b1 loop body } { #6 w`b1, new w+(w``b2), wr new w, rd NEXT a2 loop body } { #7 calc last y after dropping out of loop loop epilogue }

cascaded_biquad: { --- call this for every sample to be filtered --- } b1=b0; f12=f12-f12, f2=dm(i0,m1), f4=pm(i8,m8); { #1 } lcntr=r0, do quads until lce; { #2 } f12=f2*f4, f8=f8+f12, f3=dm(i0,m1), f4=pm(i8,m8); { #3 } f12=f3*f4, f8=f8+f12, dm(i1,m1)=f3, f4=pm(i8,m8); { #4 } f12=f2*f4, f8=f8+f12, f2=dm(i0,m1), f4=pm(i8,m8); { #5 }quads: f12=f3*f4, f8=f8+f12, dm(i1,m1)=f8, f4=pm(i8,m8); { #6 } rts (db), f8=f8+f12; { #7 } nop; nop;.ENDSEG;

cascaded_biquad subroutineloop body 4 cycles/biquadplus 6 cycles overhead per sample

Be cautious: website quotes 20 ns per IIR biquad for AD21161. It is NOT wrong but …

DISP 2003 Lecture 5 – Part 2 Digital Filters 2 Implementation of FIR Filters IIR Filters...

Documents

Transcript of DISP 2003 Lecture 5 – Part 2 Digital Filters 2 Implementation of FIR Filters IIR Filters...