Post on 25-Sep-2020
An excellent way of looking at FIR optimization as a function of processor architecture
Assignment 3
Knowledge expected by midterm
Start with basic FIR filter
float FIR_Filter(float newValue, float *FIFO, float *coeffs, int numTaps)
The first three parameters arrive in registers R4, R8 and R12 – where does numTaps arrive?
Course exams – I WILL PROBABLY say – pretend numTaps comes in R16
How to handle in real life – write it in C++ first and see what the compiler does to handle this situation – then copy that
Careful – the compiler treats these situations differently, as "it knows more" in the second case:

float FIR_Filter_1(float newValue, float *FIFO, float *coeffs, int numTaps) {
}

And:

extern volatile float FIFO[ ];
extern volatile float coeffs[ ];

float FIR_Filter_2(float newValue, int numTaps) {
}

And these differently – and perhaps differently between debug and release modes:

extern volatile float FIFO[ ];
extern volatile float coeffs[ ];
#define numTaps 120

float FIR_Filter_2(float newValue) {
}

extern volatile float FIFO[ ];
extern volatile float coeffs[ ];
volatile int numTaps = 120;

float FIR_Filter_2(float newValue) {
}

extern volatile float FIFO[ ];
extern volatile float coeffs[ ];
int numTaps = 120;

float FIR_Filter_2(float newValue) {
}
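The difference can be seen in a small compilable sketch (the sizes and values below are made up for illustration): when the tap count arrives as a runtime parameter the compiler must keep the loop general, but when it is a #define the trip count is known at compile time and the loop can be fully unrolled; volatile on the arrays forces every element to be re-read either way.

```c
#define NUM_TAPS 4                     /* hypothetical tap count for illustration */

volatile float FIFO[NUM_TAPS]   = {1.0f, 2.0f, 3.0f, 4.0f};
volatile float coeffs[NUM_TAPS] = {0.5f, 0.5f, 0.5f, 0.5f};

/* Variant 1: tap count arrives at run time - trip count unknown to the compiler */
float FIR_sum_param(int numTaps) {
    float sum = 0.0f;
    for (int i = 0; i < numTaps; i++)
        sum += FIFO[i] * coeffs[i];    /* volatile: each element re-read from memory */
    return sum;
}

/* Variant 2: tap count fixed by #define - the compiler "knows more"
   and is free to unroll the loop completely */
float FIR_sum_define(void) {
    float sum = 0.0f;
    for (int i = 0; i < NUM_TAPS; i++)
        sum += FIFO[i] * coeffs[i];
    return sum;
}
```

Both return the same value; what differs is the code the compiler is allowed to generate, which is worth comparing between debug and release builds.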
Standard FIR filter from Lab 1
float FIR_Filter(float newValue, float *FIFO, float *coeffs, int numTaps) {
    for (int count = 1; count < numTaps; count++)
        FIFO[count - 1] = FIFO[count];
    float *FIFOpt = FIFO + numTaps - 1;    // Does C do pointer arithmetic? Yes
    *FIFOpt = newValue;
    float sum = 0.0f;
    for (int count = 0; count < numTaps; count++)
        sum = sum + *FIFOpt-- * *coeffs++;
    return sum;
}
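A self-contained, compilable version of the Lab 1 filter for checking at a desk (the 4-tap moving-average values in the test below are made up for illustration):

```c
/* Standard FIR: shift the FIFO, insert the new sample, then dot-product.
   coeffs[0] multiplies the newest sample because FIFOpt walks down
   while coeffs walks up. */
float FIR_Filter(float newValue, float *FIFO, float *coeffs, int numTaps) {
    for (int count = 1; count < numTaps; count++)
        FIFO[count - 1] = FIFO[count];      /* shift older samples down */
    float *FIFOpt = FIFO + numTaps - 1;     /* yes, C does pointer arithmetic */
    *FIFOpt = newValue;                     /* newest sample at the top */
    float sum = 0.0f;
    for (int count = 0; count < numTaps; count++)
        sum = sum + *FIFOpt-- * *coeffs++;
    return sum;
}
```

Fed the samples 1, 2, 3, 4 through a 4-tap moving average (all coefficients 0.25), the outputs are 0.25, 0.75, 1.5 and 2.5.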
Assume the processor architecture is von Neumann and can't do a data fetch, add or multiplication in the same cycle
Now – increase cycle time by 25% to do pt++ in same cycle as fetch – STEP 1
Now – increase cycle time by 25% to do pt++ in same cycle as fetch – STEP 1 – Change pipeline to allow 1 math op to occur during next fetch – STEP 2
UNROLL LOOP TO OPEN UP OTHER POSSIBLE PARALLEL INSTRUCTIONS
TOTALLY MEMORY / DAG 1 RESOURCE LIMITED
NEED TO CHANGE PROCESSOR ARCHITECTURE
Instead of 1 cycle mult + 1 cycle add
Use 2 cycle (pipelined) MACC instruction
Multiply / Accumulate
Does a 1 or 2 cycle MACC improve performance?
• FETCH MULT INSTRUCTION
• DO MULT – FETCH ADD INSTRUCTION
• DO ADD

Compared to 2 cycle MACC:
• FETCH MACC INSTRUCTION
• DO MULT
• DO ADD
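A toy counting model of why the MACC helps (an assumption for illustration, not exact SHARC timing): each tap needs one mult and one add, so with separate instructions the inner loop fetches two instruction words per tap, while a MACC fetches one – and on a von Neumann bus every instruction fetch competes with the data fetches.

```c
/* Instruction fetches needed by the inner loop, per this toy model */
int fetches_separate(int numTaps) { return 2 * numTaps; }  /* mult + add per tap */
int fetches_macc(int numTaps)     { return 1 * numTaps; }  /* one MACC per tap  */

/* MACC semantics in C terms: one instruction doing acc = acc + x * c */
float macc(float acc, float x, float c) { return acc + x * c; }
```

For a 120-tap filter that is 240 instruction fetches against 120; whether a 1-cycle MACC then beats a pipelined 2-cycle MACC depends on whether the two internal cycles can be overlapped with the next fetch.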
Assume a Harvard architecture with floating-point MACC (SHARC)
Harvard processor without the MACC
Colour each resource for an instruction
Take advantage (carefully) of parallel DM and PM operations to fetch instructions earlier
In principle 4 cycles faster for twice round the loop – but data dependencies conflict
You complete the analysis with separate Add and Mult instructions
Show the advantages of using a 2 cycle MACC instruction. Does a 1 cycle MACC offer any further advantage?
Move over to Super Harvard architecture with the instruction cache always in use. Start using the PM bus for data ops
• DON’T LOOK AT NEXT SLIDE UNTIL YOU HAVE TACKLED LAST SLIDE
Loop of size 10 for twice around the loop
Key resource – FETCH INSTR 8 / 10
Using the cache ONLY when instr / data conflict on the PM bus means you can have a smaller (cheaper) cache
Get more speed by UNROLLING THE LOOP 3 times and then thinking
Re-roll the loop and execute N-2 times
Next step – MOVE TO VLIW instruction set
WHERE INSTR ALLOWS MATH-OP, dm and pm fetch at the same time
DOES NOT HAVE TO WAIT
Next step – MOVE TO V-VLIW instruction set
WHERE INSTR ALLOWS + and *, dm and pm fetch at the same time
DOES NOT HAVE TO WAIT
IF USE V-VLIW INSTR (*, +, dm, pm)
then loop is 1 cycle
The FIR loop looks like this
• FETCH DATA1
• FETCH DATA2, DO MULT OF DATA1
• FETCH DATA3, DO MULT OF DATA2, ADD OF DATA1
• FETCH DATA4, DO MULT OF DATA3, ADD OF DATA2
• DO MULT OF DATA4, ADD OF DATA3
• ADD OF DATA4
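That schedule is software pipelining, and it can be emulated in C (a sketch with hypothetical names): a short prologue fills the pipe, a steady-state body runs numTaps - 2 times doing one fetch, one mult and one add per pass (the single V-VLIW line), and an epilogue drains it – the result matches the plain dot product.

```c
/* Plain dot product for reference */
float fir_reference(const float *fifo, const float *coeffs, int numTaps) {
    float sum = 0.0f;
    for (int i = 0; i < numTaps; i++)
        sum += fifo[i] * coeffs[i];
    return sum;
}

/* Software-pipelined version mirroring the slide's schedule (needs numTaps >= 2) */
float fir_pipelined(const float *fifo, const float *coeffs, int numTaps) {
    float data = fifo[0];                 /* FETCH DATA1                    */
    float prod = data * coeffs[0];        /* DO MULT OF DATA1 ...           */
    float next = fifo[1];                 /* ... FETCH DATA2 (same line)    */
    float sum = 0.0f;
    for (int i = 2; i < numTaps; i++) {   /* steady state: numTaps - 2 passes */
        sum += prod;                      /* ADD of data(i-2)               */
        prod = next * coeffs[i - 1];      /* MULT of data(i-1)              */
        next = fifo[i];                   /* FETCH data(i)                  */
    }
    sum += prod;                          /* epilogue: drain the pipe       */
    prod = next * coeffs[numTaps - 1];
    sum += prod;
    return sum;
}
```

Each pass of the steady-state loop touches three different data items at three different pipeline stages, which is exactly why one V-VLIW instruction per tap is enough.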
Lab 2
• Programming VLIW assembly code (single cycle FIR hardware loop)
• Does C++ automatically switch to this mode in release mode if we pass dm and pm memory array pointers?
• If not – how do we make C++ switch to this mode?
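On Analog Devices' SHARC toolchains, placement in the two memory spaces is exposed to C/C++ through dm and pm type qualifiers (an assumption based on the VisualDSP++/CCES compiler conventions – check your toolchain manual for the exact spelling and for which optimization settings actually emit the parallel-move loop):

```c
/* Hypothetical sketch - SHARC compiler extension, not portable C */
#define NUM_TAPS 120                /* hypothetical size */

float dm FIFO[NUM_TAPS];            /* data buffer on the DM bus     */
float pm coeffs[NUM_TAPS];          /* coefficients on the PM bus    */

float FIR_Filter(float newValue, float dm *fifo, float pm *c, int numTaps);
```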