EE 445S Real-Time Digital Signal Processing Lab Fall 2011

Lab #3.1 Digital Filters

Debarati Kundu (with the help of Mr. Eric Wilbur, TI)

Outline
- Discrete-Time Convolution
- FIR Filter Design
- Convolution Using Circular Buffer
- FIR Filter Implementation
- FIR Filter Block Processing
- Code Optimization

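Discrete-Time Convolution
Represented by the following equation:

$$y[n] = x[n] * h[n] = \sum_{m=-\infty}^{\infty} x[m]\, h[n-m] = \sum_{m=-\infty}^{\infty} h[m]\, x[n-m]$$

Filter implementations will use the second version (hold h[n] in place and flip-and-slide x[n] about h[n]).

Z-transform of convolution:

$$Y(z) = X(z)\, H(z)$$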
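The second form of the convolution sum maps directly onto a pair of nested loops. The following is a minimal sketch, assuming a finite-length h[n] with nh coefficients and an input that is zero before index 0; the function name fir_direct is hypothetical and is not part of the lab code.

```c
#include <stddef.h>

/* Direct evaluation of y[n] = sum_{m=0}^{nh-1} h[m] * x[n-m].
 * Samples of x before index 0 are assumed to be zero.
 * fir_direct is a hypothetical name used only for this illustration. */
void fir_direct(const float *x, size_t nx,
                const float *h, size_t nh,
                float *y)
{
    for (size_t n = 0; n < nx; n++) {
        float acc = 0.0f;
        /* h stays in place; x is read backwards from the current sample */
        for (size_t m = 0; m < nh && m <= n; m++) {
            acc += h[m] * x[n - m];
        }
        y[n] = acc;
    }
}
```

For example, with x = {1, 2, 3, 4} and h = {1, 1}, each output is the sum of the current and previous input sample.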

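Discrete-Time Sinusoidal Response
- Input is a two-sided complex sinusoid: $x[n] = e^{j \omega n}$
- LTI system has impulse response h[n]
- Output:

$$y[n] = x[n] * h[n] = \sum_{m} h[m]\, e^{j \omega (n-m)} = e^{j \omega n} H(\omega), \qquad H(\omega) = \sum_{m} h[m]\, e^{-j \omega m}$$

- $H(\omega)$ is the frequency response of the LTI system
- Filters are stable, so $H(\omega) = H(z)\,|_{z = e^{j \omega}}$
- Multiplying by $H(\omega) = A(\omega)\, e^{j \theta(\omega)}$ causes a change in magnitude by $A(\omega)$ and a change in phase by $\theta(\omega)$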

FIR Filters Design & Implementation
- An FIR filter performs discrete-time convolution
- z^-1 indicates delay elements, and hence we need a buffer
- We shall implement FIR filters using circular buffers

FIR Filters Design & Implementation
Implementation
- Use the Filter Design & Analysis Tool (fdatool) to get the coefficients. Specifications are given in the task list.
- Use the convolve function (explained in subsequent slides) to implement the FIR filter with the coefficients from fdatool.

Convolution Using Circular Buffer
- Always choose the size of the circular buffer to be larger than N.
- Make sure that the size of the circular buffer is a power of 2.
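One common reason for choosing a power-of-2 buffer size is that index wraparound can then be done with a single bitwise AND instead of a compare or a modulus. The sketch below assumes a buffer size of 16; BUF_SIZE, BUF_MASK, circ_next and circ_prev are hypothetical names.

```c
#define BUF_SIZE 16u                 /* assumed size; must be a power of 2 */
#define BUF_MASK (BUF_SIZE - 1u)     /* low bits select a valid index */

/* Step an index forward by one position, wrapping at BUF_SIZE. */
unsigned circ_next(unsigned idx)
{
    return (idx + 1u) & BUF_MASK;
}

/* Step an index backward by one position, wrapping below zero.
 * Unsigned arithmetic wraps, so 0 - 1 masks to BUF_SIZE - 1. */
unsigned circ_prev(unsigned idx)
{
    return (idx - 1u) & BUF_MASK;
}
```

The mask only works when BUF_SIZE is a power of 2; for any other size a compare-and-reset, as used in the slides' convolution loop, is needed.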

Convolution Using Circular Buffer

    main()
    {
        int x_index = 0;
        int newest = 0;          /* index of the most recent sample in xcirc */
        int k;
        float y, xcirc[N];
        ...
        /* circularly increment newest (no % operator needed) */
        ++newest;
        if (newest == N) newest = 0;

        /* put new sample in delay line */
        xcirc[newest] = newsample;

        /* do convolution sum */
        y = 0;
        x_index = newest;
        for (k = 0; k < No_of_coeff; k++) {
            y += h[k] * xcirc[x_index];

            /* circularly decrement x_index */
            --x_index;
            if (x_index == -1) x_index = N - 1;
        }
        ...
    }
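The fragment above can be packaged as a per-sample function that keeps the delay line between calls. This is a minimal sketch, assuming a buffer length of 8 and a zero-initialized state; fir_state, fir_step and BUF_LEN are hypothetical names, and the caller must ensure the number of coefficients does not exceed BUF_LEN.

```c
#define BUF_LEN 8                /* assumed circular buffer length */

/* Filter state kept between calls (hypothetical type for illustration). */
typedef struct {
    float xcirc[BUF_LEN];        /* circular delay line */
    int   newest;                /* index of the most recent sample */
} fir_state;

/* Push one input sample and return one output sample.
 * Requires ncoef <= BUF_LEN and a zero-initialized state. */
float fir_step(fir_state *s, const float *h, int ncoef, float newsample)
{
    int k, x_index;
    float y = 0.0f;

    /* circularly increment newest, then store the new sample */
    if (++s->newest == BUF_LEN) s->newest = 0;
    s->xcirc[s->newest] = newsample;

    /* convolution sum, walking backwards through the delay line */
    x_index = s->newest;
    for (k = 0; k < ncoef; k++) {
        y += h[k] * s->xcirc[x_index];
        if (--x_index == -1) x_index = BUF_LEN - 1;
    }
    return y;
}
```

Feeding a unit impulse through this function returns the coefficients in order, which is a quick way to check that the indexing is correct.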

Block Processing using Ping-Pong Buffer
- This lab uses a double-buffered (PING/PONG), channel-sorted (L/R) buffering scheme.
- An FIR algorithm requires history to be preserved across calls to the algorithm.
- FIR_process() must first copy the history, then process the data.

Processing of the last data block (PONG) starts from the top of hist and runs down through data for DATA_SIZE items. This leaves the last ORDER-1 data items NOT processed. Therefore, the user must copy the history of the last processed buffer (PONG) to the new buffer (PING), then filter. Repeat the process for each new block.
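The copy-history-then-filter sequence can be sketched as follows. This is an illustration under stated assumptions, not the lab's FIR_process(): each buffer is assumed to hold ORDER-1 history samples followed by DATA_SIZE new samples, samples and coefficients are assumed to be 16-bit Q15 values, and the values chosen for ORDER and DATA_SIZE as well as the names HIST and fir_process_sketch are hypothetical.

```c
#include <string.h>

#define ORDER     4                  /* assumed number of coefficients */
#define DATA_SIZE 8                  /* assumed new samples per block */
#define HIST      (ORDER - 1)        /* history samples kept per block */

/* prev and cur each hold HIST + DATA_SIZE samples: history, then data.
 * New samples are expected at cur[HIST .. HIST + DATA_SIZE - 1].
 * With this indexing, h[ORDER-1] multiplies the newest sample. */
void fir_process_sketch(const short *prev, short *cur,
                        const short *h, short *out)
{
    int i, j;

    /* 1. copy the last HIST samples of the previous block into
     *    the history region at the top of the current block */
    memcpy(cur, prev + DATA_SIZE, HIST * sizeof(short));

    /* 2. filter DATA_SIZE outputs, starting from the top of the history */
    for (j = 0; j < DATA_SIZE; j++) {
        int sum = 0;
        for (i = 0; i < ORDER; i++)
            sum += cur[i + j] * h[i];
        out[j] = (short)(sum >> 15);     /* Q30 accumulator back to Q15 */
    }
}
```

On alternate calls the roles swap: PING is passed as prev while PONG is filtered, then PONG is passed as prev while PING is filtered, so the output is continuous across block boundaries.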

[Figure: system block diagram. Labels: AIC3106 audio codec (ADC, DAC), McASP, SR12, HWI isrAudio, rcvBufs, xmtBufs, SEM_post(), TSK (FIR or COPY), CLK, PRD1, PRD2, 100 ms, 500 ms, LED, SW8]

Code Optimization
Goals:
- A typical goal of any system's algorithm is to meet real-time.
- You might also want to approach or achieve "CPU Min" in order to maximize the number of channels processed.

CPU Min (the limit):
- The minimum number of cycles the algorithm takes based on architectural limits (e.g. data size, number of loads, math operations required).

Real-time vs. CPU Min:
- Often, meeting real-time only requires setting a few compiler options.
- However, achieving CPU Min often requires extensive knowledge of the architecture (harder, requires more time).

Debug vs. Optimized Benchmarks

    for (j = 0; j < nr; j++) {
        sum = 0;
        for (i = 0; i < nh; i++)
            sum += x[i + j] * h[i];
        r[j] = sum >> 15;
    }
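The shift by 15 in the benchmark loop is consistent with Q15 fixed-point arithmetic, where a 16-bit integer v represents the value v / 32768. Assuming x and h are Q15 values (the slide does not say so explicitly), each product is a Q30 number and the shift brings the result back to Q15. A minimal sketch with a hypothetical helper q15_mul:

```c
#include <stdint.h>

/* Multiply two Q15 values and return the Q15 product.
 * The 32-bit intermediate is in Q30; shifting right by 15 returns to Q15.
 * Notes: the result is truncated, not rounded; saturation is not handled
 * (-1.0 * -1.0 overflows); right-shifting a negative value is
 * implementation-defined in C, though most compilers shift arithmetically. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;   /* Q30 */
    return (int16_t)(product >> 15);             /* back to Q15 */
}
```

For example, 0.5 is 16384 in Q15, and 16384 * 16384 >> 15 gives 8192, which is 0.25 in Q15.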

- Debug: get your code LOGICALLY correct first (no optimization)
- Opt: increase performance using compiler options (easier)
- CPU Min: it depends; could require extensive time

Optimization            Machine Cycles
Debug (no opt, -g)      817K
Release (-o3, no -g)    18K
CPU Min                 6650

Debug (-g, NO opt): Get Code Logically Correct
- Provides the best debug environment, with full symbolic support, no code motion, and easy single-stepping
- Code is NOT optimized, i.e. very poor performance
- Create test vectors on FUNCTION boundaries (use the same vectors as the Opt environment)

Release (-o3, no -g): Increase Performance
- Higher levels of optimization result in code motion; functions become black boxes (hence the use of function-level test vectors)
- Optimizer can find errors in your code (use volatile)
- Highly optimized code (can reach CPU Min with some algorithms)
- Each level of optimization increases the optimizer's scope
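A typical case behind the "use volatile" advice is a flag that is written by an interrupt routine and polled by other code. Without volatile, an optimizing compiler may assume the flag never changes inside a polling loop and keep it in a register. The sketch below is only an illustration; buffer_ready, isr_audio_sketch and take_buffer are hypothetical names and are not taken from the lab code.

```c
#include <stdbool.h>

/* Written by an interrupt routine, read by the processing code.
 * volatile forces every access to go to memory. */
static volatile bool buffer_ready = false;

/* Stand-in for an interrupt service routine that signals new data. */
void isr_audio_sketch(void)
{
    buffer_ready = true;
}

/* Returns true once per signalled buffer, clearing the flag. */
bool take_buffer(void)
{
    if (!buffer_ready)
        return false;
    buffer_ready = false;
    return true;
}
```

Code like this often appears to work in a Debug build and fails only at higher optimization levels, which is why the optimizer is said to "find" the error.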

Levels of Optimization

Option        Level       Scope
-o0, -o1      LOCAL       single block
-o2           FUNCTION    across blocks
-o3           FILE        across functions
-pm -o3       PROGRAM     across files

[Figure: nested code blocks in FILE1.C and FILE2.C illustrating each scope]

DSPLIB
- Optimized DSP function library for C programmers using C62x/C67x and C64x devices
- These routines are typically used in computationally intensive real-time applications where optimal execution speed is critical.
- By using these routines, you can achieve execution speeds considerably faster than equivalent code written in standard ANSI C.
- These ready-to-use functions can also significantly shorten your development time.

The DSP library features:
- C-callable
- Hand-coded, assembly-optimized
- Tested against a C model and existing run-time-support functions

Adaptive filtering: DSP_firlms2
Correlation: DSP_autocor
FFT: DSP_bitrev_cplx, DSP_radix2, DSP_r4fft, DSP_fft, DSP_fft16x16r, DSP_fft16x16t, DSP_fft16x32, DSP_fft32x32, DSP_fft32x32s, DSP_ifft16x32, DSP_ifft32x32
Filters & convolution: DSP_fir_cplx, DSP_fir_gen, DSP_fir_r4, DSP_fir_r8, DSP_fir_sym, DSP_iir
Math: DSP_dotp_sqr, DSP_dotprod, DSP_maxval, DSP_maxidx, DSP_minval, DSP_mul32, DSP_neg32, DSP_recip16, DSP_vecsumsq, DSP_w_vec
Matrix: DSP_mat_mul, DSP_mat_trans
Miscellaneous: DSP_bexp, DSP_blk_eswap16, DSP_blk_eswap32, DSP_blk_eswap64, DSP_blk_move, DSP_fltoq15, DSP_minerror, DSP_q15tofl
