Software and Hardware Circular Buffer Operations

Software and Hardware Circular Buffer Operations M. R. Smith, ECE University of Calgary Canada


Transcript of Software and Hardware Circular Buffer Operations

Page 1: Software and Hardware Circular Buffer Operations

Software and Hardware Circular Buffer Operations

M. R. Smith, ECE, University of Calgary

Canada

Page 2: Software and Hardware Circular Buffer Operations

04/19/23 Software Circular Buffer Issues, M. Smith, ECE, University of Calgary, Canada

Tackled today

Have moved the DCRemoval( ) over to the X Compute block

Circular Buffer Issues: DCRemoval( ), FIR( )

Coding a software circular buffer in C++ and TigerSHARC assembly code

Coding a hardware circular buffer

Where to next?

Page 3: Software and Hardware Circular Buffer Operations

DCRemoval( )

Not as complex as FIR, but with many of the same requirements

Easier to handle

You will use the same ideas when optimizing FIR over Labs 2 and 3

Two issues: speed and accuracy. Develop suitable tests for the C++ code and check that the various assembly language versions satisfy the same tests

Memory intensive

Addition intensive

Loops for main code

FIFO implemented as circular buffer

Page 4: Software and Hardware Circular Buffer Operations

Next stage in improving code speed: software circular buffers

Set up pointers to buffers      2
Insert values into buffers      4
SUM LOOP                        4 + N * 5
SHIFT LOOP                      1 (was 1 + 2 * log2 N)
Update outgoing parameters      6
Update FIFO                     3 + 6 * N
Function return                 2
---------------------------
Total                           23 + 11 N (was 22 + 11 N + 2 log2 N)

N = 128: instructions = 1430

1430 + 300 delay cycles = 1730 cycles

Page 5: Software and Hardware Circular Buffer Operations

DCRemoval( )

If there are N points in the circular buffer, then this approach of moving the data from one memory location to another requires:

N memory reads and N memory writes (possible data bus conflicts)

2N memory address calculations

FIFO implemented as circular buffer
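The cost of the memory-move approach can be seen in a minimal C++ sketch. The function name fifoInsertByShifting and the use of int samples are assumptions for illustration, not the code from the slides. Every stored sample is read and rewritten on each call, so the work grows roughly in proportion to N.

```cpp
#include <cstddef>

// Hypothetical helper (not from the slides): insert newValue into a FIFO of
// length n by physically moving every stored sample one slot toward index 0.
// The newest sample ends up latest in the array.
void fifoInsertByShifting(int* fifo, std::size_t n, int newValue) {
    if (n == 0) return;
    // Shift loop: n - 1 reads and n - 1 writes, each with its own address.
    for (std::size_t i = 0; i + 1 < n; ++i) {
        fifo[i] = fifo[i + 1];
    }
    fifo[n - 1] = newValue;  // one more write for the incoming sample
}
```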

Page 6: Software and Hardware Circular Buffer Operations

Alternative approach

Move pointers rather than memory values

In principle: 1 memory read, 1 memory write, a pointer addition, and a conditional equate
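A minimal C++ sketch of the pointer-moving idea, under the assumption that the slot being overwritten holds the oldest sample. The struct SoftCircularBuffer and the function circularInsert are hypothetical names, not taken from the slides. The comments mark the four operations listed above.

```cpp
#include <cstddef>

// Hypothetical software circular buffer over a caller-owned array.
struct SoftCircularBuffer {
    int* base;           // start of the FIFO array
    std::size_t length;  // number of slots
    int* next;           // "next-empty" slot, which holds the oldest sample
};

// Overwrite the oldest sample with newValue and return the replaced sample.
int circularInsert(SoftCircularBuffer& cb, int newValue) {
    int oldest = *cb.next;   // 1 memory read
    *cb.next = newValue;     // 1 memory write
    ++cb.next;               // pointer addition
    if (cb.next == cb.base + cb.length) {
        cb.next = cb.base;   // conditional equate: wrap to the start
    }
    return oldest;
}
```

The cost per insert no longer depends on N, but each call now pays for the wrap check.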

Page 7: Software and Hardware Circular Buffer Operations

Note: a software circular buffer is NOT necessarily more efficient than data moves

Watch out: my version of FIR uses a different sort of circular buffer

FIR FIFO: newest element earliest in array (matching the FIR equation)

DCRemoval FIFO: newest element latest in array, because that is the way I thought of it

Page 8: Software and Hardware Circular Buffer Operations

Note: a software circular buffer is NOT necessarily more efficient than data moves

Are we now spending more time moving and checking the software circular buffer pointers than moving the data?

SLOWER

FASTER

Page 9: Software and Hardware Circular Buffer Operations

On TigerSHARC

Since we can have multiple instructions on one line, then “perhaps”, if we can avoid pipeline delays, the software circular buffer is faster than memory moves

Pipeline delay

XR4 = R4 + R5;;

XR4 = R4 + R6;;

Second instruction needs result of first

No Pipeline delay

XR4 = R4 + R5;;

XR3 = R4 + R6;;

Second instruction DOES NOT need result of first

Page 10: Software and Hardware Circular Buffer Operations

Generate the tests for the software circular buffer routine
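One way such a test could look is sketched below in C++: feed the same samples to a shifting reference and to a circular-index version, and require identical outputs at every step. All function names here are hypothetical, and a running sum stands in for the actual DCRemoval( ) arithmetic, which the slides do not show.

```cpp
#include <cstddef>
#include <vector>

// Reference: keep the FIFO by shifting, then sum its contents.
int sumByShifting(std::vector<int>& fifo, int sample) {
    for (std::size_t i = 0; i + 1 < fifo.size(); ++i) fifo[i] = fifo[i + 1];
    fifo.back() = sample;
    int sum = 0;
    for (int v : fifo) sum += v;
    return sum;
}

// Version under test: keep the FIFO by moving a wrap-around index.
int sumByCircularIndex(std::vector<int>& fifo, std::size_t& next, int sample) {
    fifo[next] = sample;
    next = (next + 1 == fifo.size()) ? 0 : next + 1;
    int sum = 0;
    for (int v : fifo) sum += v;
    return sum;
}

// Feed both versions the same samples; true if every output agrees.
bool versionsAgree(const std::vector<int>& samples, std::size_t n) {
    if (n == 0) return true;  // nothing to compare for an empty FIFO
    std::vector<int> shifted(n, 0), circular(n, 0);
    std::size_t next = 0;
    for (int s : samples) {
        if (sumByShifting(shifted, s) != sumByCircularIndex(circular, next, s)) {
            return false;
        }
    }
    return true;
}
```

The same comparison can then be reused for each assembly language version by swapping in the routine under test.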

Page 11: Software and Hardware Circular Buffer Operations

New static pointers needed in Software circular buffer code

Page 12: Software and Hardware Circular Buffer Operations

New sets of register defines

Now using many of the TigerSHARC registers

Page 13: Software and Hardware Circular Buffer Operations

Code for storing a new value into the FIFO requires knowledge of the “next-empty” location

First, you must get the address where the static variable saved_next_pointer is stored

Second, you must access that address to get the actual pointer

Third, you must use the pointer value

This will be a problem in labs and exams with static variables stored in memory
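The three steps can be spelled out in C++, where the compiler normally hides the first two. This is a sketch under assumptions: the array name fifo, its length of 4, and the helper functions are hypothetical; only saved_next_pointer is named in the slides. The sketch stores a value and does not advance the pointer.

```cpp
// Hypothetical FIFO storage and static "next-empty" pointer.
static int fifo[4] = {0, 0, 0, 0};
static int* saved_next_pointer = fifo;

// Store a value at the "next-empty" location, one step per line.
void storeIntoFifo(int newValue) {
    int** addressOfStatic = &saved_next_pointer;  // 1: address of the static variable
    int* next = *addressOfStatic;                 // 2: read memory to get the actual pointer
    *next = newValue;                             // 3: use the pointer value
}

// Expose the array for inspection (helper for this example only).
const int* fifoContents() { return fifo; }
```

In assembly code each of these lines becomes at least one explicit instruction, which is where the extra level of indirection is easy to forget.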

Page 14: Software and Hardware Circular Buffer Operations

Adjustment of software circular buffer pointer must be done carefully

Get and update pointer

Check the pointer

Save corrected pointer
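The get, check and save sequence might look like this in C++. It is a sketch only: kLength, fifoArray, savedNext and insertAndAdvance are hypothetical names, and the FIFO length of 4 is an assumption for illustration.

```cpp
#include <cstddef>

static const std::size_t kLength = 4;   // hypothetical FIFO length
static int fifoArray[kLength] = {0};
static int* savedNext = fifoArray;      // static pointer kept between calls

// Store newValue, then advance the static pointer with wrap-around.
void insertAndAdvance(int newValue) {
    int* next = savedNext;              // get the pointer
    *next = newValue;
    ++next;                             // update the pointer
    if (next >= fifoArray + kLength) {  // check the pointer
        next = fifoArray;               // correct it by wrapping to the start
    }
    savedNext = next;                   // save the corrected pointer
}
```

Saving before the check would leave an out-of-range pointer in the static variable for the next call, which is why the order matters.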

Page 15: Software and Hardware Circular Buffer Operations

Next stage in improving code speed: software circular buffers

Set up pointers to buffers      2
Insert values into buffers      8 (was 4)
SUM LOOP                        4 + N * 5
SHIFT LOOP                      1 (was 1 + 2 * log2 N)
Update outgoing parameters      6
Update FIFO                     14 (was 3 + 6 * N)
Function return                 2
---------------------------
Total                           37 + 5 N (was 23 + 11 N)

N = 128: instructions = 677

677 + 360 delay cycles = 1011 cycles

Was 1430 + 300 delay cycles = 1730 cycles

Page 16: Software and Hardware Circular Buffer Operations

Next step: hardware circular buffer

Do exactly the same pointer calculations as with software circular buffers, but now the calculations are done behind the scenes, at high speed, using specialized pointer features

Only available with the J0, J1, J2 and J3 registers (on the older ADSP-21061, all pointer registers)

Jx: the pointer register

JBx: the BASE register, set to the start of the FIFO array

JLx: the length register, set to the length of the FIFO array

VERY BIG WARNING? Reset the length register to zero. On the older ADSP-21061 it was very important that the length register be reset to zero, otherwise all the other functions using this register would suddenly start using circular buffers by mistake. Still advisable, but a special syntax is now needed to cause circular buffer operations to occur
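What the hardware does with the three registers can be modelled in C++. This is a software model only, not TigerSHARC code: the struct and function names are hypothetical, addresses are plain indices, the step is assumed to be no larger than the length, and treating a zero length as "no wrapping" is an assumption of this model that mirrors the warning above.

```cpp
#include <cstddef>

// Software model only: the fields mimic the roles of Jx, JBx and JLx.
struct CircularPointerModel {
    std::size_t pointer;  // Jx: current address (modelled as an index)
    std::size_t base;     // JBx: start of the FIFO array
    std::size_t length;   // JLx: length of the FIFO array (0 = no wrapping here)
};

// Return the current address, then post-modify the pointer with wrap-around.
std::size_t postModify(CircularPointerModel& j, std::size_t step) {
    std::size_t current = j.pointer;
    j.pointer += step;
    if (j.length != 0 && j.pointer >= j.base + j.length) {
        j.pointer -= j.length;  // wrap back into [base, base + length)
    }
    return current;
}
```

The wrap test and correction are the same ones the software version performs explicitly; the hardware folds them into the address update.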

Page 17: Software and Hardware Circular Buffer Operations

Setting up the circular buffer functions

Remember all the tests to start with

Page 18: Software and Hardware Circular Buffer Operations

Store values into hardware FIFO

CB instruction ONLY works on POST-MODIFY operations

Page 19: Software and Hardware Circular Buffer Operations

Now perform the math operation using the circular buffer operation

MUST NOT DO XR2 = CB [J0 + i_J8];

Saves N cycles, as we no longer need to increment the index
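The access pattern can be sketched in C++ as a read followed by a post-modify of the pointer itself, with no separate index added to a fixed base. The function sumCircular is a hypothetical name, start is assumed to be less than n, and the explicit wrap test stands in for what the hardware circular buffer does without extra instructions.

```cpp
#include <cstddef>

// Sum all n samples of a circular FIFO, starting from slot `start`.
int sumCircular(const int* base, std::size_t n, std::size_t start) {
    const int* p = base + start;
    int sum = 0;
    for (std::size_t count = 0; count < n; ++count) {
        sum += *p;                    // read through the pointer as it stands
        ++p;                          // post-modify the pointer
        if (p == base + n) p = base;  // wrap (done by the hardware in the CB case)
    }
    return sum;
}
```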

Page 20: Software and Hardware Circular Buffer Operations

Update the static variables

Further special CB instructions

A few cycles saved here

Page 21: Software and Hardware Circular Buffer Operations

Next stage in improving code speed: hardware circular buffers

Set up pointers to buffers      2
Insert values into buffers      8 (was 4)
SUM LOOP                        3 + N * 4 (was 4 + N * 5)
SHIFT LOOP                      1 (was 1 + 2 * log2 N)
Update outgoing parameters      6
Update FIFO                     14 (was 3 + 6 * N)
Function return                 2
---------------------------
Total                           37 + 4 N (was 23 + 5 N)

N = 128: instructions = 549

549 + 300 delay cycles = 879 cycles

Delays are now >50% of useful time

Was 677 + 360 delay cycles = 1011 cycles

Page 22: Software and Hardware Circular Buffer Operations

Tackle the summation part of FIR

Exercise in using CB (Assignment 2)

Page 23: Software and Hardware Circular Buffer Operations

Place assembly code here

Page 24: Software and Hardware Circular Buffer Operations

The code is too slow because we are not taking advantage of the available resources

Bring in up to 128 bits (4 instructions) per cycle

Ability to bring in 4 32-bit values along the J data bus (data1) and 4 along the K bus (data2)

Perform address calculations in the J and K ALUs: single-cycle hardware circular buffers

Perform math operations on both the X and Y compute blocks

Background DMA activity

Off-load some of the processing to the second processor

Page 25: Software and Hardware Circular Buffer Operations

Tackled today

Have moved the DCRemoval( ) over to the X Compute block

Circular Buffer Issues: DCRemoval( ), FIR( )

Coding a software circular buffer in C++ and TigerSHARC assembly code

Coding a hardware circular buffer

Where to next?