Which one? You have a vector, a[ ], of 50000 random integers, which can modern CPUs do faster and...

Page 1

Which one?

• You have a vector, a[ ], of 50000 random integers. Which can modern CPUs do faster, and why?

//find max of vector of random ints
max=0;
for (inda=0; inda<50000; inda++)
{
    if (a[inda] > max)
    {
        max = a[inda];
        index=inda;
    }
}

//find avg of vector of random ints
sum=0;
for (inda=0; inda<50000; inda++)
{
    sum = sum + a[inda];
}
avg = ((double) sum) / 50000;
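To experiment with the question, the two loops can be wrapped in functions and timed, for example with clock() from <time.h>. The following is only a sketch: the function names find_max_index and find_avg are hypothetical, the maximum search starts from a[0] rather than max=0 so that negative values also work, and the sum is accumulated in a long long so it cannot overflow.

```c
#include <stddef.h>

/* Hypothetical helper: index of the largest element; assumes n > 0.
 * The compare-and-update inside the loop is a data-dependent branch. */
size_t find_max_index(const int *a, size_t n)
{
    size_t index = 0;
    for (size_t i = 1; i < n; i++) {
        if (a[i] > a[index]) {
            index = i;
        }
    }
    return index;
}

/* Hypothetical helper: average of the vector; assumes n > 0.
 * The loop body has no branch, only an add per element. */
double find_avg(const int *a, size_t n)
{
    long long sum = 0;  /* wider than int so the running sum cannot overflow */
    for (size_t i = 0; i < n; i++) {
        sum += a[i];
    }
    return (double)sum / (double)n;
}
```

Calling each function many times on the same 50000-element array and comparing the elapsed times gives a measurement to set against your prediction.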

Page 2

Introduction to Digital Filtering

SMD077 – Computer Architecture

31-Oct-2001

Dennis M. Akos

Luleå University of Technology

Page 3

Motivation

• Digital filtering is the “application” or “algorithm” that will be used in the majority of the labs
• It is a very representative programmable-processor operation with wide-ranging real-world applications
• This is a computer architecture course (not a course in programming or signal processing!?!)
– Goal is to map algorithms to the hardware
    • Requires comprehensive understanding of the hardware, or architecture, itself
    • Compiler support does not exist, or is limited, for specialized hardware
– Few will be designing programmable processors (definitely an option), but many will be using programmable processors
• Labs will be based around a Finite Impulse Response (FIR) filter
• Basic understanding is achieved via time/frequency domain transforms

Page 4

What this lecture is and is not!

• This is not a comprehensive overview of digital filters
– “Gloss over” much of the mathematics and theory involved with design and implementation of filters
– Many good references are available
• It is a simple introduction to motivate/help you to better understand the upcoming labs

Page 5

Finite Impulse Response (FIR) Filter

• A digital filter operates on a stream, or vector, of data representing some continuous signal
– Sampled sinusoid
– Audio signal (compact disc)
• There are four basic filter implementations: lowpass, highpass, bandpass, and bandstop, as well as many different classes (FIR, IIR, …) and subclasses (Butterworth, Chebyshev, …)
• It is easiest to examine and consider the impact of different types of filters by their frequency domain characteristics
– Consider the “audio equalizer” analogy
– What is the frequency domain representation of the sinusoid? Of a sum of sinusoids?

[Block diagram: input sampled signal x[n] → FIR filter → output (filtered) sampled signal y[n]]

y[n] = \sum_{k=0}^{M} b_k \, x[n-k]
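The FIR sum, in which each output sample is a weighted sum of the current input and the M previous inputs, can be written directly as a pair of loops. This is a minimal sketch, not the lab implementation: the name fir_direct is hypothetical, and samples before x[0] are assumed to be zero.

```c
#include <stddef.h>

/*
 * Hypothetical direct evaluation of the FIR sum for n = 0 .. len-1:
 *     y[n] = b[0]*x[n] + b[1]*x[n-1] + ... + b[M]*x[n-M]
 * b holds M+1 coefficients. Assumption: x[n-k] is zero whenever n-k < 0.
 */
void fir_direct(const float *b, size_t M,
                const float *x, float *y, size_t len)
{
    for (size_t n = 0; n < len; n++) {
        float acc = 0.0f;
        /* k <= n skips the terms that would read before x[0] */
        for (size_t k = 0; k <= M && k <= n; k++) {
            acc += b[k] * x[n - k];
        }
        y[n] = acc;
    }
}
```

With b = {0.5, 0.5} (M = 1), each output is the average of the current and previous input, which is a very simple lowpass filter.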

Page 6

Example: 3rd Order FIR Filter Structure

• FIR filters can be of arbitrary order and are extendable to an indefinite number of elements
• Filter order trade-off
– Higher order results in sharper transitions between pass and stop bands
– Higher order is more computationally complex
• The bn’s are constants and completely define how the filter will act on the input (lowpass, highpass, …)

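[Figure: the input sampled signal x[n] passes through three delay elements, giving x[n - 1], x[n - 2], and x[n - 3]; the four taps are multiplied by the coefficients b0, b1, b2, and b3, and the products are summed to form the output (filtered) sampled signal y[n].]

Perfect structure for SIMD (Single-Instruction Multiple-Data) operations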
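The 3rd-order structure can be sketched in plain C with an explicit delay line. The type fir3_t and the function fir3_step are hypothetical names, and the sketch assumes b0 multiplies the newest sample and b3 the oldest. It uses no SIMD instructions; it only separates the four independent multiplies from the final additions to show which part of the work is data-parallel.

```c
/* Hypothetical state for a 3rd-order FIR filter. */
typedef struct {
    float b[4];      /* coefficients b0..b3 */
    float delay[3];  /* delay slots holding x[n-1], x[n-2], x[n-3] */
} fir3_t;

/* Consume one input sample x[n] and return one output sample y[n]. */
float fir3_step(fir3_t *f, float x)
{
    /* Four multiplies that do not depend on each other. */
    float p0 = f->b[0] * x;
    float p1 = f->b[1] * f->delay[0];
    float p2 = f->b[2] * f->delay[1];
    float p3 = f->b[3] * f->delay[2];

    /* Shift the delay line: each stored sample becomes one step older. */
    f->delay[2] = f->delay[1];
    f->delay[1] = f->delay[0];
    f->delay[0] = x;

    /* Sum the products (the adders in the structure). */
    return (p0 + p1) + (p2 + p3);
}
```

Feeding a single 1 followed by zeros returns b0, b1, b2, b3 in turn, which is a quick check that the taps are wired as intended.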

Page 7

FIR Filter Input & Output Sequences

• The input signal can be specified as a vector of the resulting samples
• Note that there can be a “transient” in the output until the filter has all delay slots filled
– Has implications for filtering short sequences
– Higher order filters will have a longer transient

[Figure: sampled input signal x[0], x[1], …, x[5] plotted against time, and the resulting output signal y[0], y[1], …, y[5] plotted against time, with the transient portion of the output marked.]
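One way to see what the transient means for short sequences is to keep only the outputs computed with every delay slot filled. The sketch below does that under the assumption that an order-M filter needs M earlier samples before its output is free of the transient. The name fir_steady_state is hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: write only the outputs for n = M .. len-1, that is,
 * those for which x[n-M] .. x[n] all exist. b holds M+1 coefficients.
 * Returns the number of outputs written: len - M, or 0 if len <= M.
 */
size_t fir_steady_state(const float *b, size_t M,
                        const float *x, size_t len, float *y)
{
    if (len <= M) {
        return 0;  /* sequence too short: every output is in the transient */
    }
    size_t count = 0;
    for (size_t n = M; n < len; n++) {
        float acc = 0.0f;
        for (size_t k = 0; k <= M; k++) {
            acc += b[k] * x[n - k];
        }
        y[count++] = acc;
    }
    return count;
}
```

For a 6-sample input and a 3rd-order filter, only 3 outputs remain; raising the order shrinks that number further, which is the cost of a longer transient.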

Page 8

FIR Filter Resulting Algorithm

/**************************************************************************
fir_filter - Perform fir filtering sample by sample on floats

Requires array of filter coefficients and pointer to history.
Returns one output sample for each input sample.

float fir_filter(float input,float *coef,int n,float *history)

    float input        new float input sample
    float *coef        pointer to filter coefficients
    int n              number of coefficients in filter
    float *history     history array pointer

Returns float value giving the current output.
*************************************************************************/

float fir_filter(float input,float *coef,int n,float *history)
{
    int i;
    float *hist_ptr,*hist1_ptr,*coef_ptr;
    float output;

    hist_ptr = history;
    hist1_ptr = hist_ptr;             /* use for history update */
    coef_ptr = coef + n - 1;          /* point to last coef */

    /* form output accumulation */
    output = *hist_ptr++ * (*coef_ptr--);
    for(i = 2 ; i < n ; i++) {
        *hist1_ptr++ = *hist_ptr;     /* update history array */
        output += (*hist_ptr++) * (*coef_ptr--);
    }
    output += input * (*coef_ptr);    /* input tap */
    *hist1_ptr = input;               /* last history */

    return(output);
}

from “C Algorithms for Real-Time DSP” by P. Embree
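The pointer arithmetic in fir_filter can be hard to follow, so here is an index-based sketch of the same update. It is not from the book: the name fir_filter_indexed is hypothetical, and it assumes, as the pointer version implies, that history holds the n - 1 previous inputs with the oldest first and that n is at least 2.

```c
/*
 * Hypothetical index-based equivalent of fir_filter.
 * Assumptions: n >= 2, and history[0] .. history[n-2] hold the previous
 * inputs, oldest first. coef[0] weights the new input and coef[n-1]
 * weights the oldest stored sample.
 */
float fir_filter_indexed(float input, const float *coef, int n, float *history)
{
    /* oldest sample times last coefficient */
    float output = history[0] * coef[n - 1];

    for (int i = 1; i < n - 1; i++) {
        history[i - 1] = history[i];            /* shift history toward slot 0 */
        output += history[i] * coef[n - 1 - i];
    }

    output += input * coef[0];                  /* input tap */
    history[n - 2] = input;                     /* newest sample goes last */

    return output;
}
```

Starting from a zeroed history and feeding an impulse (a 1 followed by zeros) returns the coefficients in order, coef[0] first.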