‘C’ for Microcontrollers, Just Being Efficient Lloyd Moore, President...
Agenda
Microcontroller Resources
Knowing Your Environment
Memory Usage
Code Structure
Interrupts
Math Tricks
Optimization
Disclaimer
Some microcontroller techniques necessarily trade one benefit for another, typically lower resource usage at the cost of maintainability
The point of this presentation is to show various techniques that can be used as needed
Use these suggestions when necessary
Feel free to suggest better solutions as we go along
Microcontroller Resources
EVERYTHING resides on one die inside one package: RAM, Flash, Processor, I/O
Cost is a MAJOR design consideration
  Typical costs are $0.25 to $25 each (in 1000s)
RAM: 16 BYTES to 32K bytes typical
Flash/ROM: 384 BYTES to 256K bytes
Clock speed: 4 MHz to 80 MHz typical
  Much lower for battery-saving modes (32 kHz)
Bus is 8, 16, or 32 bits wide (just like the old days)
Other Considerations
Specialized resources often present
  Counters, UART, USB PHY, LCD controller
Portability inside families is a big concern
  Across families, not so much
Typically no operating system present
  May have a hardware-centric API, or just raw registers!
No floating point hardware
  May have other math hardware (MAC, CRC)
No protected memory / MMU
  Do have specialized memory segments
Power Consumption
Microcontrollers typically used in battery operated devices
Power requirements can be EXTREMELY tight
  Energy harvesting applications
  Long-term battery installations (remote controls, hard-to-reach devices, etc.)
EVERY instruction executed consumes power, even if you have the time!
Know Your Environment
Traditionally we ignore hardware details
Need to tailor code to the hardware available
  Specialized hardware is MUCH more efficient
Compilers typically have extensions
  Interrupt: specifies code as being an ISR
  Memory model: may handle banked memory and/or simultaneous-access banks
  Multiple data pointers / address generators
Debugger may use some resources
Memory Usage
Use ‘const’ to put data into program memory
Alignment / padding issues
  Typically NOT an issue, non-aligned access is OK
Avoid dynamic memory allocation
  Takes extra space and processing time
  Memory fragmentation is a big issue
Use and reuse static buffers
  Reduces variable passing overhead
  Allows for smaller / faster code due to reduced indirection
  Does bring back overwrite bugs if not done carefully
Use the appropriate variable type
  Don’t use int and double for everything!!
  Affects processing time as well as storage
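A minimal sketch of the memory guidelines above, assuming a C99 compiler with <stdint.h>. The names g_Squares, g_Scratch, FillSquares, and GetScratch are made up for illustration. Whether 'const' data really lands in flash depends on the compiler and part; some toolchains need an extension keyword for that.

```c
#include <stdint.h>

/* Read-only table: 'const' lets many toolchains keep this in program memory
   (assumption: some compilers need an extra keyword to force this). */
static const uint8_t g_Squares[8] = { 0, 1, 4, 9, 16, 25, 36, 49 };

/* One statically allocated scratch buffer, reused instead of calling malloc(). */
#define SCRATCH_SIZE 8u
static uint8_t g_Scratch[SCRATCH_SIZE];

/* Copies up to SCRATCH_SIZE squares into the shared buffer.
   Uses 8-bit counters because the values never exceed 255. */
uint8_t FillSquares(uint8_t count)
{
    uint8_t i;

    if (count > SCRATCH_SIZE) {
        count = SCRATCH_SIZE;   /* clamp so the static buffer is never overrun */
    }
    for (i = 0; i < count; i++) {
        g_Scratch[i] = g_Squares[i];
    }
    return count;
}

/* Read-only view of the shared buffer for callers. */
const uint8_t *GetScratch(void)
{
    return g_Scratch;
}
```

The clamp in FillSquares is the "done carefully" part: a shared static buffer has a fixed size, so every writer has to respect it.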
Char vs. Int Increment on 8051
char cX;
cX++;
  000A 900000   MOV   DPTR,#cX
  000D E0       MOVX  A,@DPTR
  000E 04       INC   A
  000F F0       MOVX  @DPTR,A
6 bytes of flash, 4 instructions

int iX;
iX++;
  0000 900000   MOV   DPTR,#iX
  0003 E4       CLR   A
  0004 75F001   MOV   B,#01H
  0007 120000   LCALL ?C?IILDX
10 bytes of flash + subroutine overhead
Many more than 4 instructions once the LCALL is counted
Code Structure
Count down instead of up
  Saves a subtraction on all processors
  DJNZ-style instruction on some processors
Pointers vs. array notation
  Generally better using pointers
Bit shifting
  May not always generate what you think
  May or may not have barrel shifter hardware
  May or may not have logical vs. arithmetic shifts
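A minimal sketch of the first two points, assuming a C99 compiler. SumBytes is a hypothetical helper, not something from the slides. The loop tests its counter against zero and walks a pointer instead of indexing, which is the shape a DJNZ-style instruction can map onto; whether a given compiler takes advantage of it has to be checked in the listing.

```c
#include <stdint.h>

/* Sums 'len' bytes starting at pData.
   Counts down to zero and advances a pointer rather than indexing an array. */
uint16_t SumBytes(const uint8_t *pData, uint8_t len)
{
    uint16_t sum = 0;

    while (len != 0u) {                   /* compare against zero only */
        sum = (uint16_t)(sum + *pData);   /* read through the pointer */
        pData++;
        len--;
    }
    return sum;
}
```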
Shifting Example
cX = cX << 3;
  0006 33       RLC   A
  0007 33       RLC   A
  0008 33       RLC   A
  0009 54F8     ANL   A,#0F8H
Constants turn into separate statements

cA = 3;
cX = cX << cA;
  000B 900000   MOV   DPTR,#cA
  000E E0       MOVX  A,@DPTR
  000F FE       MOV   R6,A
  0010 EF       MOV   A,R7
  0011 A806     MOV   R0,AR6
  0013 08       INC   R0
  0014 8002     SJMP  ?C0005
  0016       ?C0004:
  0016 C3       CLR   C
  0017 33       RLC   A
  0018       ?C0005:
  0018 D8FC     DJNZ  R0,?C0004
Variables turn into loops

Both of these can be one instruction with a barrel shifter
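One common workaround for the variable-shift loop, shown here as a sketch and not taken from the slides: when the only need is a single-bit mask, a small const table can replace `1 << n`. The names g_BitMask, BitMaskShift, and BitMaskTable are hypothetical. Whether the table is actually faster depends on the part and compiler, so compare the listings.

```c
#include <stdint.h>

/* Precomputed single-bit masks, one per bit position of a byte. */
static const uint8_t g_BitMask[8] = {
    0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80
};

/* Variable shift: may compile to a loop when there is no barrel shifter. */
uint8_t BitMaskShift(uint8_t bit)
{
    return (uint8_t)(1u << (bit & 7u));
}

/* Table lookup: one indexed read, no shift loop. */
uint8_t BitMaskTable(uint8_t bit)
{
    return g_BitMask[bit & 7u];
}
```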
More Code Structure
Actual parameters typically passed in registers if available
  Keep function parameters to fewer than 3
  May also be passed on the stack or in a special parameter area
  May be more efficient to pass a pointer to a struct
Global variables
  While generally frowned upon for most code, can be very helpful here
  Typically ends up being a direct access
Read assembly code for critical areas
Know which optimizations are present
  Small compilers do not always have common optimizations
  Inline, loop unrolling, loop invariant, pointer conversion
Indexed Array vs Pointer on M8C
ucMode = g_Channels[uc_Channel].ucMode;
  01DC 52FC     mov   A,[X-4]
  01DE 5300     mov   [__r1],A
  01E0 5000     mov   A,0
  01E2 08       push  A
  01E3 5100     mov   A,[__r1]
  01E5 08       push  A
  01E6 5000     mov   A,0
  01E8 08       push  A
  01E9 5007     mov   A,7
  01EB 08       push  A
  01EC 7C0000   xcall __mul16
  01EF 38FC     add   SP,-4
  01F1 5F0000   mov   [__r1],[__rX]
  01F4 5F0000   mov   [__r0],[__rY]
  01F7 060000   add   [__r1],<_g_Channels
  01FA 0E0000   adc   [__r0],>_g_Channels
  01FD 3E00     mvi   A,[__r1]
  01FF 5403     mov   [X+3],A

ucMode = pChannel->ucMode;
  01ED 5201     mov   A,[X+1]
  01EF 5300     mov   [__r1],A
  01F1 3E00     mvi   A,[__r1]
  01F3 5405     mov   [X+5],A

Does the same thing
Saves 29 bytes of memory AND a call to a 16-bit multiplication routine!
Pointer version will be at least 4x faster to execute as well, maybe 10x
Most compilers are not this bad, but you do find some!
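A sketch of how the pointer form is typically set up in C, assuming a C99 compiler. g_Channels, ucMode, and pChannel follow the names in the listing; the Channel struct layout, NUM_CHANNELS, and the helper functions are hypothetical. The idea is to pay for the index-times-element-size calculation once, then reuse the pointer for every field access.

```c
#include <stdint.h>

#define NUM_CHANNELS 4u

/* Hypothetical channel record; the real layout is not given in the slides. */
typedef struct {
    uint8_t ucMode;
    uint8_t ucGain;
    uint8_t ucFlags;
} Channel;

static Channel g_Channels[NUM_CHANNELS];

/* Indexed form: a weak compiler may recompute the element address,
   including the multiply by sizeof(Channel), on every line. */
void ResetChannelIndexed(uint8_t ucChannel)
{
    g_Channels[ucChannel].ucMode  = 0;
    g_Channels[ucChannel].ucGain  = 1;
    g_Channels[ucChannel].ucFlags = 0;
}

/* Pointer form: the element address is computed once and reused. */
void ResetChannelPointer(uint8_t ucChannel)
{
    Channel *pChannel = &g_Channels[ucChannel];

    pChannel->ucMode  = 0;
    pChannel->ucGain  = 1;
    pChannel->ucFlags = 0;
}

/* Small accessor so callers can inspect the result. */
uint8_t GetChannelGain(uint8_t ucChannel)
{
    return g_Channels[ucChannel].ucGain;
}
```

Both functions leave the array in the same state; only the generated code differs, and only on compilers that do not perform this conversion themselves.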
Interrupts
Generally implemented as individual hardware vectors with a small amount of program memory at each location
ISR is what you get: no OS, no threads, no IST
  Can use a flag with the main loop to get IST behavior for less time-critical code
Also very common to use interrupts to simulate threads
  The interrupt itself takes the place of the WaitFor_XXX or signal
  Follows very naturally for hardware tasks and timers
Generally an “interrupt” keyword is provided
Interrupt Example
    // volatile: written in the ISR, read in the main loop
    static volatile unsigned char g_TimerTriggered;

    void main()
    {
        ConfigureTimer0();
        g_TimerTriggered = 0;
        GlobalEnableInterrupt();

        while(1)
        {
            if(g_TimerTriggered)
            {
                // Could also disable the timer interrupt here to avoid
                // a race condition resetting g_TimerTriggered
                g_TimerTriggered = 0;
                DoTimerTask();
            }
            // Can put optional sleep here, interrupts can wake up processor
        }
    }

    // C51-style syntax: 'interrupt 1' selects interrupt number 1,
    // 'using 2' selects register bank 2 for the ISR
    void Timer0ISR(void) interrupt 1 using 2
    {
        g_TimerTriggered = 1;
        // Can put other small, quick work here
    }
Switch Statement Implementation
Switch statements can be implemented in various ways
  Sequential compares
  In-line table lookup for case block
  Special function with lookup table
Specific implementation can also vary based on the case clauses
  Clean sequence (1, 2, 3, 4, 5)
  Gaps in sequence (1, 10, 30, 255)
  Ordering of sequence (5, 4, 1, 2, 3)
Knowing which method gets implemented is critical to optimizing!
Switch Statement Example
switch(cA)
{
    case 0:
        cX = 4;
        break;
    case 1:
        cX = 10;
        break;
    case 2:
        cX = 30;
        break;
    default:
        cX = 0;
        break;
}

  0006 900000   MOV   DPTR,#cA
  0009 E0       MOVX  A,@DPTR
  000A FF       MOV   R7,A
  000B EF       MOV   A,R7
  000C 120000   LCALL ?C?CCASE
  000F 0000     DW    ?C0003
  0011 00       DB    00H
  0012 0000     DW    ?C0002
  0014 01       DB    01H
  0015 0000     DW    ?C0004
  0017 02       DB    02H
  0018 0000     DW    00H
  001A 0000     DW    ?C0005
  001C       ?C0002:
  001C 900000   MOV   DPTR,#cX
  001F 7404     MOV   A,#04H
  0021 F0       MOVX  @DPTR,A
  0022 8015     SJMP  ?C0006
  ...More blocks follow for each case
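When a switch only maps a small, dense set of values to constants, as this one does, a const lookup table is one possible alternative. The sketch below assumes a C99 compiler; MapWithSwitch, MapWithTable, and g_Map are hypothetical names. It is not what the compiler in the listing generates, just a manual restructuring to compare against.

```c
#include <stdint.h>

/* Same mapping as the switch above: 0 -> 4, 1 -> 10, 2 -> 30, else 0. */
uint8_t MapWithSwitch(uint8_t cA)
{
    uint8_t cX;

    switch (cA) {
        case 0:  cX = 4;  break;
        case 1:  cX = 10; break;
        case 2:  cX = 30; break;
        default: cX = 0;  break;
    }
    return cX;
}

/* Table form: one bounds check plus one indexed read from const data. */
static const uint8_t g_Map[3] = { 4, 10, 30 };

uint8_t MapWithTable(uint8_t cA)
{
    return (cA < 3u) ? g_Map[cA] : 0u;
}
```

The table form only works because the case values form a clean sequence starting at zero; with gaps such as (1, 10, 30, 255) the table would mostly hold padding.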
Bit Variables
Some processors have special memory areas and op-codes for single bit storage
  Saves overhead of masking operations
Some key off bit-field notation, some need a keyword (frequently ‘bit’)
struct {
unsigned int foo : 1;
} flags;
unsigned int my_bit : 1;
bit my_bit;
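A portable sketch contrasting the two styles in standard C, assuming a C99 compiler. All names here (StatusBits, g_Status, g_StatusByte, the STATUS_ macros, and the helper functions) are hypothetical. The 'bit' keyword shown above is a compiler extension and is not used here so the example compiles anywhere.

```c
#include <stdint.h>

/* Bit-field form: the compiler generates the masking, or uses
   single-bit instructions on parts that have them. */
typedef struct {
    unsigned int bReady : 1;
    unsigned int bError : 1;
} StatusBits;

static StatusBits g_Status;

void SetErrorBitField(void)    { g_Status.bError = 1; }
void ClearErrorBitField(void)  { g_Status.bError = 0; }
uint8_t IsErrorBitField(void)  { return (uint8_t)g_Status.bError; }

/* Manual form: explicit masks on a plain byte. */
#define STATUS_READY 0x01u
#define STATUS_ERROR 0x02u

static uint8_t g_StatusByte;

void SetErrorMask(void)    { g_StatusByte |= STATUS_ERROR; }
void ClearErrorMask(void)  { g_StatusByte &= (uint8_t)~STATUS_ERROR; }
uint8_t IsErrorMask(void)  { return (uint8_t)((g_StatusByte & STATUS_ERROR) != 0u); }
```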
Math Tricks
Floating point math is VERY expensive on microcontrollers
  No hardware support
  Typically 32 bits for float, 64 bits for double
  Support provided by a BIG library
Can use fixed point math in many cases
  Basically the same as integer math, but with the binary point moved inside the integer
  An 8-bit binary number is really weighted as:
    2^7 + 2^6 + … + 2^2 + 2^1 + 2^0
  To make a fixed point number just adjust the exponents:
    2^6 + 2^5 + … + 2^1 + 2^0 + 2^-1   (note 2^-1 = 0.5)
  Assume an 8-bit value: range = [0, 255]
  Assume one bit after the binary point: XXXXXXX.X
  Range is now [0, 127.5]
  All the internal math stays the same so long as only fixed point numbers with the same binary point location are used together! (Addition and subtraction are unchanged; a product needs one rescaling shift.)
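A minimal sketch of the 8-bit format described above (7 integer bits, 1 fractional bit), assuming a C99 compiler. The type name fix7_1 and the helper functions are hypothetical. The stored value is the real value times 2, so 2.5 is stored as 5.

```c
#include <stdint.h>

/* 7 integer bits, 1 fractional bit: real value = raw / 2, range [0, 127.5]. */
typedef uint8_t fix7_1;

#define FIX_FRAC_BITS 1u

/* Converts a whole number (0..127) to fixed point. */
fix7_1 FixFromInt(uint8_t whole)
{
    return (fix7_1)(whole << FIX_FRAC_BITS);
}

/* Addition is plain integer addition when both operands share the format.
   Overflow past 127.5 is the caller's responsibility. */
fix7_1 FixAdd(fix7_1 a, fix7_1 b)
{
    return (fix7_1)(a + b);
}

/* A product carries two fractional bits, so shift one back out. */
fix7_1 FixMul(fix7_1 a, fix7_1 b)
{
    return (fix7_1)(((uint16_t)a * b) >> FIX_FRAC_BITS);
}

/* Integer part of the value. */
uint8_t FixWhole(fix7_1 x)
{
    return (uint8_t)(x >> FIX_FRAC_BITS);
}

/* 1 if the value has a trailing .5, else 0. */
uint8_t FixHasHalf(fix7_1 x)
{
    return (uint8_t)(x & 1u);
}
```

For example, 2.5 (raw 5) plus 3.0 (raw 6) gives raw 11, which reads back as whole part 5 with the half bit set, i.e. 5.5.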
More Math Tricks
You may not have multiply and/or divide ops!
Decomposing operations can help
  X * 5 = X * 4 + X
  (X * 4) can become 2 shift-left operations
Formulas should also be restructured for the math available:
  Y = ax^2 + bx + c : 1 Pow or Mult, 2 Mult, 2 Add
  Y = x(ax + b) + c : 2 Mult, 2 Add
Lookup tables can be great for limited-domain problems
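The two restructurings above, sketched in C under the assumption of a C99 compiler. TimesFive, PolyDirect, and PolyHorner are hypothetical names. The 32-bit types are used only to keep the host-side example free of overflow; on a small part the narrowest type that fits the data would be the better choice.

```c
#include <stdint.h>

/* X * 5 decomposed as (X * 4) + X, with the multiply by 4 done as a shift. */
uint16_t TimesFive(uint16_t x)
{
    return (uint16_t)((x << 2) + x);
}

/* Direct form: a*x*x + b*x + c uses three multiplies and two adds. */
int32_t PolyDirect(int32_t a, int32_t b, int32_t c, int32_t x)
{
    return a * x * x + b * x + c;
}

/* Restructured form: x*(a*x + b) + c uses two multiplies and two adds. */
int32_t PolyHorner(int32_t a, int32_t b, int32_t c, int32_t x)
{
    return x * (a * x + b) + c;
}
```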
Optimization
Step 0 – Before coding anything think about risk points and prototype unknowns!!!
Step 1 – Get it working!!
  Fast but wrong is of no use to anyone
  Optimization will typically reduce readability
Step 2 – Profile to know where to optimize
  Usually only one or two routines are critical
  You need to have specific performance metrics to target
Optimization
Step 3 – Let the tools do as much as they can
  Turn off debugging!
  Select the correct memory model
  Select the correct optimization level
Step 4 – Do it manually
  Read the generated code! Might be able to make a simple code or structure change.
Last – think about assembly coding
Summary
Microcontroller hardware is much simpler than most of us are used to
Be familiar with the hardware in your microcontroller
Be familiar with your compiler options and how it translates your code
For time or space critical code look at the assembly listing from time to time
Questions?