Post on 04-Jan-2016
Ultra sound solution
Profiles and other optimizations
Research Team discussion -- RECAP
Ultrasound probe (20 MHz) sends signals into the body that reflect off moving blood cells in (Artery? Vein?)
The received ultrasound frequency is Doppler shifted compared to the transmitted frequency
Same as the sound when an ambulance goes by: higher if approaching, lower if receding
They get the positive frequencies (towards) on the left audio channel and negative frequencies (away) on the right audio channel.
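As a rough illustration of the Doppler shift described above, here is a minimal sketch of the standard pulse-echo relation df = 2 * f0 * v * cos(theta) / c. The function name, the 1540 m/s speed of sound in soft tissue, and the blood velocity used in the usage note are assumptions for illustration, not values from the slides.

```cpp
#include <cmath>

// Doppler shift for a pulse-echo probe: df = 2 * f0 * v * cos(theta) / c.
// Positive result = flow towards the probe, negative = flow away.
// soundSpeedMps defaults to 1540 m/s (ASSUMED typical soft-tissue value).
double dopplerShiftHz(double f0Hz, double velocityMps, double angleRad,
                      double soundSpeedMps = 1540.0) {
    return 2.0 * f0Hz * velocityMps * std::cos(angleRad) / soundSpeedMps;
}
```

With the 20 MHz probe and an assumed 0.5 m/s flow straight towards the probe, dopplerShiftHz(20e6, 0.5, 0.0) is about 13 kHz, which is why the shifted signal can be handled as audio.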
04/20/23.ENCM515 – Ultrasound ProblemCopyright smithmr@ucalgary.ca 2 / 33
Picture looks like this -- RECAP
Note that the display loses all direction information
Can I help them to output the maximum frequency?
Progress to this stage -- RECAP
Gather single input values into buffer as part of ISR
When buffer ready, launch a TTCOS processing task that executes outside the ISR -- MUST COMPLETE in time N * deltaT where N is buffer size and deltaT is sample time
Take values from a previously processed buffer and send to output inside ISR
Explored a number of pragmas for optimizing C++ code and looked at impact of C code
Only looked at simple copy program
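The gather / process / output split above can be modelled in plain C++ without any real interrupt. This is only a sketch under assumptions: the struct name, the member names and the two-buffer (ping-pong) arrangement are hypothetical and are not the TTCOS code from the course. The "task" is the simple copy program.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kBlock = 128;  // N, buffer size (assumed value)

// Hypothetical ping-pong model: the ISR fills in[fill] while the task
// works on the other buffer; the ISR outputs from a block processed earlier.
struct BlockIO {
    std::array<float, kBlock> in[2]{};
    std::array<float, kBlock> out[2]{};
    std::size_t fill = 0;      // buffer the ISR is currently filling
    std::size_t pos = 0;       // position inside that buffer
    bool blockReady = false;   // set by ISR, cleared by the task

    // Work done INSIDE the ISR: one sample in, one sample out.
    float isrSample(float x) {
        in[fill][pos] = x;
        const float y = out[fill][pos];  // value from a previously processed block
        if (++pos == kBlock) {
            pos = 0;
            fill ^= 1u;                  // swap buffers
            blockReady = true;           // task must finish within N * deltaT
        }
        return y;
    }

    // Work done OUTSIDE the ISR: here just the simple copy program.
    void processBlock() {
        const std::size_t done = fill ^ 1u;  // buffer that was just filled
        out[done] = in[done];
        blockReady = false;
    }
};
```

In this model the output lags the input by 2 * N samples: one block to gather, one block to process.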
Snap shot code execution -- RECAP
A/D DATA CAPTURE INSIDE ISR
D/A DATA OPERATION INSIDE ISR
DATA PROCESSING OUTSIDE ISR
MAJOR PROBLEM -- NEED TO RETRIEVE FROM END AND BEGINNING OF DATA BLOCK
Assignment 2 – FIR operations on BLOCKS is another example
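The "end and beginning" retrieval problem is the usual wrap-around read from a circular buffer. A minimal sketch, assuming a hypothetical helper lastSamples and the convention that writeIndex is the slot the next sample will be written to (neither is from the slides):

```cpp
#include <cstddef>
#include <vector>

// Copy the most recent `count` samples, oldest first, out of a circular
// buffer. Requires count <= ring.size(). The read may start near the END
// of the array and wrap around to the BEGINNING.
std::vector<float> lastSamples(const std::vector<float>& ring,
                               std::size_t writeIndex, std::size_t count) {
    const std::size_t size = ring.size();
    std::vector<float> out(count);
    const std::size_t start = (writeIndex + size - count) % size;
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = ring[(start + i) % size];  // modulo handles the wrap
    }
    return out;
}
```

The modulo on every access is simple but not fast; hardware circular addressing would avoid it.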
Approaches – two common -- RECAP
Store complex number (a + jb) data block in two sections
Real part in dm memory space and imaginary part in pm memory space
Where to store FIR coefficients to avoid data conflicts?
Store complex number data block in dm space in double size array
float normalArray[LENGTH]
float complexArray[2 * LENGTH]
Perhaps store FIR coefficients in pm space
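For the second approach, one possible layout of the double size array is interleaved (re, im, re, im, ...). The slides do not say which layout was used, so the interleaving and the function name below are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Convert LENGTH real samples into a 2 * LENGTH array holding
// (real, imaginary) pairs, with every imaginary part set to zero.
// ASSUMPTION: interleaved layout; a split layout is equally plausible.
std::vector<float> realToInterleaved(const std::vector<float>& normalArray) {
    std::vector<float> complexArray(2 * normalArray.size(), 0.0f);
    for (std::size_t i = 0; i < normalArray.size(); ++i) {
        complexArray[2 * i] = normalArray[i];  // real part; imaginary stays 0
    }
    return complexArray;
}
```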
Proposed algorithm – based on STFT – Short-time FT -- RECAP
Gather N points of data to circular buffer
Using the last 2N points, do the following on both channels
Convert 2N real points to 2N complex points
Perform 2N point FFT (complex numbers) with windowing
Calculate absolute value of FFT of 2N points
Calculate area under frequency curve (area)
Find index where X% of frequencies are below this frequency (max frequency)
This one frequency value is used as the best estimate of the maximum frequency over the last N points (put same value into all N points of output) (STFT limitation)
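The area and "X% below" steps can be sketched as a running sum over the magnitude spectrum. The function name and the use of a fraction (0.0 to 1.0) instead of a percentage are assumptions; the caller is assumed to pass only the part of the spectrum that should be searched.

```cpp
#include <cstddef>
#include <vector>

// Return the smallest index k such that the sum of magnitude[0..k]
// reaches `fraction` of the total area under the magnitude curve.
std::size_t maxFrequencyIndex(const std::vector<float>& magnitude,
                              float fraction) {
    double area = 0.0;
    for (float m : magnitude) area += m;      // area under frequency curve

    const double threshold = fraction * area;
    double running = 0.0;
    for (std::size_t k = 0; k < magnitude.size(); ++k) {
        running += magnitude[k];
        if (running >= threshold) return k;   // X% of the area is at or below k
    }
    return magnitude.empty() ? 0 : magnitude.size() - 1;
}
```

The returned index is a bin number; converting it to Hz needs the sample rate and FFT length, which this sketch leaves to the caller.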
Switch to Testing framework -- RECAP
Don’t test with TTCOS and real audio signals
Don’t know what expected values are
Testing framework can’t work with TTCOS anyway – CHECK errors take 1/30 s to print = about 1500 lost samples
Check circular buffer operation for data storage
Check circular buffer operation when converting 2N real points to 2N complex points (different buffer, not circular buffer operations)
Check Absolute calculation (complex to real)
Check area and frequency maximum calculations
Check FFT.
Turn on optimizer and ‘hope things work in real time’
Plan B – Find better FFT code, that compiler can more easily optimize (for loops)
Plan C -- SHARC has FFT Accelerator and there is sample code available
Revised code -- Error message strange
Errors and Defects -- RECAP
We all make mistakes – the key is when we find the mistake
If we find the mistake BEFORE we move away from this part of the problem, we call the mistake ‘an error’. Errors can be made in coding or design or life.
If we find the mistake AFTER we move away from this part of the problem, we call the mistake ‘a defect’. Defects can be made in coding or design or life.
Defects are more costly to correct than errors
Code for absolute of complex number
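The code shown on the original slide is not reproduced in this text. A minimal sketch of a complex-to-real absolute value step, assuming the interleaved (re, im) storage and a hypothetical function name:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// |a + jb| = sqrt(a*a + b*b) for each (re, im) pair.
// ASSUMPTION: input holds interleaved pairs, so output is half the length.
std::vector<float> complexMagnitude(const std::vector<float>& interleaved) {
    std::vector<float> magnitude(interleaved.size() / 2);
    for (std::size_t i = 0; i < magnitude.size(); ++i) {
        magnitude[i] = std::hypot(interleaved[2 * i], interleaved[2 * i + 1]);
    }
    return magnitude;
}
```

If only a relative comparison is needed, the squared magnitude avoids the square root, but that changes the area calculation and is a separate design decision.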
FFT code – legacy code
Typical problem with legacy code
Written by somebody else – how good or bad?
No tests available
Can be very difficult to write tests
Useful article and book
www.objectmentor.com/resources/articles/WorkingEffectivelyWithLegacyCode.pdf
Michael Feathers
Some obvious tests for FFT
Take any signal
signal = IFFT(FFT(signal)) and signal = FFT(IFFT(signal))
WARNING signal starts as a real value and ends up as a complex value of the form value + j (close but not equal to zero)
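The round-trip test can be sketched with a naive O(N^2) DFT standing in for the legacy FFT. Everything here is an assumption for illustration: the function names, the use of std::complex, and the choice of an unscaled inverse (which then needs a divide by N). Because the result comes back as value + j (tiny number), the comparison uses a tolerance rather than equality.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Cplx = std::complex<double>;

// Naive DFT, a stand-in for the FFT under test. inverse = true gives the
// UNSCALED inverse transform (no 1/N factor).
std::vector<Cplx> dft(const std::vector<Cplx>& x, bool inverse) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    const double sign = inverse ? 1.0 : -1.0;
    std::vector<Cplx> result(n);
    for (std::size_t k = 0; k < n; ++k) {
        Cplx sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                sign * 2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += x[t] * std::polar(1.0, angle);
        }
        result[k] = sum;
    }
    return result;
}

// Largest |IFFT(FFT(signal)) - signal|, with or without the 1/N scaling.
double roundTripError(const std::vector<double>& signal, bool divideByN) {
    const std::vector<Cplx> x(signal.begin(), signal.end());
    const std::vector<Cplx> y = dft(dft(x, false), true);
    const double scale = divideByN ? static_cast<double>(x.size()) : 1.0;
    double worst = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        worst = std::max(worst, std::abs(y[i] / scale - x[i]));
    }
    return worst;
}
```

A test would assert that roundTripError(signal, true) is below a small tolerance such as 1e-9; calling it with false shows how large the error is when the scaling is forgotten.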
Obvious tests for FFT
Based on linear algebra
If input is size N and is all ones (DC) then FFT results should have spike at frequency 0 with a size of N
If input is size N and signal is sin(2 * pi * f * t / N) then should have two spikes, one at f and the other at -f, each with a size of N / 2, since sin(wt) = (exp(jwt) - exp(-jwt)) / 2j
Compare to MATLAB results
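The DC and sine tests can be sketched against a naive DFT used as a reference. The helper names are hypothetical; in an unscaled N point transform the -f spike appears at index N - f.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Magnitude of a naive, unscaled N point DFT of a real signal
// (a reference to compare the FFT under test against).
std::vector<double> dftMagnitude(const std::vector<double>& x) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<double> magnitude(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                -2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += x[t] * std::polar(1.0, angle);
        }
        magnitude[k] = std::abs(sum);
    }
    return magnitude;
}

// Test signal sin(2 * pi * f * t / N) for t = 0 .. N-1.
std::vector<double> sineSignal(std::size_t n, std::size_t f) {
    const double pi = std::acos(-1.0);
    std::vector<double> x(n);
    for (std::size_t t = 0; t < n; ++t) {
        x[t] = std::sin(2.0 * pi * static_cast<double>(f * t) / static_cast<double>(n));
    }
    return x;
}
```

For N = 16 and f = 2 the expected spikes are at indices 2 and 14, each of size 8, with every other bin close to zero.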
Need a new RealToComplex( )
Note – framework indicates memory leak
delete complexDCSignal at end of TEST
First set of FFT tests
Common error: forgetting the 1 / N scaling in S = IFFT( FFT(S) ) / N;
Matlab does this automatically – we must do manually
TIME PENALTY
Make sure you rerun tests
For every 3 errors fixed, 1 introduced
SUSPECT CODE
KNOWN DEFECT
This should work with this new buffer and old buffer, but uses next buffer instead
Difficult to spot, even after release, as all the buffers will be very similar so results will be “Almost right” – a problem with a medical device, especially if you are the patient
Switch on the optimizing compiler
DOES NOT HELP
Where’s the timing info gone?
Have to go in and tell optimizing compiler to add debug info and do rebuild
Here is the only compiler warning
If more info available could go to SIMD
2 / 2 efficiency 100%
If’s take 5 / 7 of time
Detail – out of order execution
Performance hog – function in loop
Also software loop
3 cycles every time if not true
NOT PREDICTED
Compare code
BEFORE AFTER – SMALLER LOOP, NOT JUMP
Efficiency of FFT code
Don’t even want to look
Software bit reversing -- 30% of time
SHARC can do hardware bit-reversed addressing
Continually calculates sine and cosine coeffs
Okay if FFT done once – otherwise get an algorithm that uses table lookup
Those while loops
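The table lookup idea is to compute the sine and cosine coefficients once, at start-up, and index into the table on every later FFT. A minimal sketch, with a hypothetical function name and assuming a radix-2 FFT that needs W_N^k = exp(-j * 2 * pi * k / N) for k = 0 .. N/2 - 1:

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Build the twiddle factor table ONCE; each FFT call then reads table[k]
// instead of calling sin() and cos() inside its loops.
std::vector<std::complex<float>> makeTwiddleTable(std::size_t n) {
    const double pi = std::acos(-1.0);
    std::vector<std::complex<float>> table(n / 2);
    for (std::size_t k = 0; k < table.size(); ++k) {
        const double angle =
            -2.0 * pi * static_cast<double>(k) / static_cast<double>(n);
        table[k] = std::complex<float>(static_cast<float>(std::cos(angle)),
                                       static_cast<float>(std::sin(angle)));
    }
    return table;
}
```

The cost moves from every block to a one-time initialization, at the price of N / 2 complex values of memory.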
Next step
Do we need to worry? Is real-time performance there?
Add tested code into TTCOS and do a quick test
TTCOS will send out error message if not working fast enough
Look at using profiler to find out where the code is the slowest
Add to TTCOS Audio program
It links in Debug – will it run?
It runs
TTCOS is not complaining “no room for new tasks”
We are processing 128 points at 41000 Hz
So each task must finish in 128 / 41000 s, about 3.1 ms
Rounding down to 1 / 500 s or 2 ms gives a conservative figure
Or 2000 * 500 cycles at 500 MHz = 1,000,000 (2000 us at 500 cycles per us)
Every cycle can do many things
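The timing budget above is simple arithmetic and can be checked directly. The function names are hypothetical; the 128 points, 41000 Hz and 500 MHz figures are the ones quoted on the slide.

```cpp
// Time available to process one block of n samples at sample rate fsHz.
constexpr double blockPeriodSeconds(int n, double fsHz) {
    return n / fsHz;
}

// Processor cycles available in that time at clock rate clockHz.
constexpr double cyclesPerBlock(int n, double fsHz, double clockHz) {
    return blockPeriodSeconds(n, fsHz) * clockHz;
}
```

cyclesPerBlock(128, 41000.0, 500e6) comes to roughly 1.56 million cycles, so the 1,000,000 cycle figure is on the safe side.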
Let's turn profiling on
Turn on statistical profile
What is a profiler?
What is a statistical profiler?
Profile information – TOTALLY IMPRESSED
Actually works better than I expected (profiler and code) – If the code works
DEBUG OPTIMIZED
Get line by line information
Project
Need to add the amplitude modulating code
Do some more testing
Use Audacity as a simple oscilloscope
Fix the known defects
Must use only ½ the FFT data and not all
Using wrong ‘old buffer’
But I am impressed!!
Project is essentially there in about 1 week elapsed time