Optimised C/C++. Overview of DS General code Functions Mathematics.

Post on 05-Jan-2016

Optimised C/C++

Overview of DS

General code Functions Mathematics

General

CPU-based copies push existing data out of the cache, meaning that cache misses will occur after a transfer.

General

Before any code-level optimisations, consider:

Does it need to be optimised?

Is the algorithm sufficient?

Write simple, clear code.

General Code

Use native variable sizes, e.g. the DS is a 32-bit machine, so use ints for calculations.

Smaller variable sizes (e.g. char, short) have to be converted to 32-bit values before any calculations are carried out.

For example:

    // where char is signed, count can never reach 128 and this loop does not end
    char count;
    for (count = 0; count < 128; count++)

versus

    int count;
    for (count = 0; count < 128; count++)

The same applies to ordinary arithmetic:

    char a, b;
    char ans = a + b;   // a and b are promoted to int, then the sum is narrowed back to char

versus

    int a, b;
    int ans = a + b;
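As a rough sketch of the same idea in a complete function (the name sum_bytes and the sample data are made up for illustration), the stored data can stay 8 bits wide while the loop counter and running total use native ints:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the array keeps compact 8-bit elements, but the
// counter and the accumulator are plain ints, so nothing has to be
// narrowed back to 8 bits inside the loop.
int sum_bytes(const std::uint8_t* data, int n)
{
    int total = 0;
    for (int i = 0; i < n; ++i)
    {
        total += data[i];   // each element is widened to int as it is read
    }
    return total;
}
```

An int total also avoids the wrap-around an 8-bit total would suffer once the sum passes 255.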

General Code

Exploit locality of reference

When reading data, the CPU often reads chunks of surrounding data into a fast cache (e.g. 32 KB).

If the next accessed data is not within the cache (a miss), the CPU must fetch it from main memory and load it into the cache.

If this is done repeatedly (i.e. lots of cache misses), thrashing occurs.

This situation can be avoided by structuring your code appropriately. C stores the elements of a row next to each other, so the last array index should be the one that changes fastest.

E.g.

    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            ans[j][i] = a[j][i] + b[j][i];   // inner loop jumps a whole row each step
        }
    }

versus

    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            ans[i][j] = a[i][j] + b[i][j];   // inner loop walks consecutive addresses
        }
    }
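A self-contained version of the cache-friendly ordering might look like the sketch below. The size N = 4 and the function name add_matrices are assumptions chosen only to keep the example small.

```cpp
#include <cassert>

const int N = 4;   // hypothetical matrix size, for illustration only

// Adds two N x N matrices. The inner loop varies the last index, so it
// touches consecutive memory addresses within one row.
void add_matrices(int a[N][N], int b[N][N], int ans[N][N])
{
    for (int i = 0; i < N; ++i)
    {
        for (int j = 0; j < N; ++j)
        {
            ans[i][j] = a[i][j] + b[i][j];
        }
    }
}
```

With a matrix this small everything fits in cache anyway; the ordering starts to matter once a row is larger than a cache line and the whole matrix is larger than the cache.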

General Code

Global variables:

The compiler usually cannot keep a global variable in a register across function calls or pointer writes, because either might change it. It therefore has to be loaded from, and stored back to, memory on each use.

If using global variables in a tight game loop, consider storing their value in a local variable before use.

E.g.

    int global = 100;

    // all other code here

    int local = global;
    for (…) // time critical loop here
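A minimal sketch of this pattern, assuming a hypothetical global g_scale and a made-up function scale_sum:

```cpp
#include <cassert>

int g_scale = 3;   // hypothetical global, standing in for any game-wide setting

// Reads the global once into a local before the time-critical loop, so the
// loop body only works with a value the compiler is free to keep in a register.
int scale_sum(const int* values, int n)
{
    const int scale = g_scale;   // single read of the global
    int total = 0;
    for (int i = 0; i < n; ++i)
    {
        total += values[i] * scale;
    }
    return total;
}
```

If the loop also modified the value, the local would need to be written back to the global after the loop.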

General Code

Aliases:

Consider the code below:

    void func1(int *data)
    {
        int i;
        for (i = 0; i < 10; i++)
        {
            anyfunc(*data, i);
        }
    }

Even though data is not modified, the compiler does not know this (anyfunc could change the value through another pointer), so it has to read the value from memory on every access. This code can be improved as follows:

    void func1(int *data)
    {
        int val = *data;   // read once; val can stay in a register
        int i;
        for (i = 0; i < 10; i++)
        {
            anyfunc(val, i);
        }
    }
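To make the improved version runnable, the sketch below supplies a stand-in body for anyfunc. That body and the global g_total are assumptions for illustration; the slides do not say what anyfunc does.

```cpp
#include <cassert>

int g_total = 0;   // hypothetical global that the stand-in anyfunc updates

// Stand-in for anyfunc: because it writes to memory, the compiler could not
// rule out that it also changes *data if func1 dereferenced data in the loop.
void anyfunc(int value, int i)
{
    g_total += value * i;
}

void func1(const int* data)
{
    const int val = *data;   // read through the pointer exactly once
    for (int i = 0; i < 10; ++i)
    {
        anyfunc(val, i);
    }
}
```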

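General Code

General calculations:

Division: replace with reciprocal multiplication (e.g. ans = a / b vs. ans = a * (1.0f / b)). This only helps when the reciprocal is computed once and reused, and it needs a floating-point or fixed-point reciprocal: with ints, 1 / b is 0 for any b greater than 1.

Power of 2 calculations:

If any division or multiplication is by a power of 2, it can be replaced with a right or left shift.

E.g.

    ans = result * 8   =>   ans = result << 3
    ans = result / 8   =>   ans = result >> 3

If using modulo by n, where n is a power of 2, it can be replaced with a bitwise & by n-1.

E.g.

    ans = result % 8   =>   ans = result & 7

The shift and mask forms match / and % only for unsigned or non-negative values.

Even if a number is not a power of 2, the calculation can be subdivided into sub-calculations that are.

E.g.

    ans = result * 136   =>   ans = (result << 7) + (result << 3)

which is ans = (result * 128) + (result * 8). The parentheses are required because + binds more tightly than <<.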
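The shift and mask replacements can be written out as small functions, as in the sketch below. The function names are invented here, and unsigned 32-bit values are assumed so that each form gives exactly the same result as the *, / or % it replaces.

```cpp
#include <cassert>
#include <cstdint>

// Each function uses a shift or mask in place of the arithmetic operator
// named in its comment. Unsigned operands keep the results identical.
std::uint32_t times8(std::uint32_t x)   { return x << 3; }               // x * 8
std::uint32_t div8(std::uint32_t x)     { return x >> 3; }               // x / 8
std::uint32_t mod8(std::uint32_t x)     { return x & 7u; }               // x % 8
std::uint32_t times136(std::uint32_t x) { return (x << 7) + (x << 3); }  // x * 128 + x * 8
```

Modern compilers generally apply these rewrites themselves when the constant is known at compile time, so it is worth checking the generated code before doing this by hand.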

Loops

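Loop unrolling

Reducing the number of iterations a loop takes by increasing the number of instructions in the loop body. The iteration count must be a multiple of the unroll factor (here 128 = 32 x 4); otherwise the leftover iterations need handling separately.

E.g.

    for (i = 0; i < 128; i++)
        ans = ans * 8;

becomes

    for (i = 0; i < 32; i++) {
        ans = ans * 8;
        ans = ans * 8;
        ans = ans * 8;
        ans = ans * 8;
    }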
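When the iteration count is not known to be a multiple of four, one possible approach is a four-way unrolled main loop followed by a short clean-up loop. The function name sum_unrolled is made up for this sketch.

```cpp
#include <cassert>

// Sums n ints, processing four elements per iteration of the main loop.
// The second loop picks up the 0 to 3 elements left over when n is not a
// multiple of four.
int sum_unrolled(const int* data, int n)
{
    int total = 0;
    int i = 0;
    for (; i + 3 < n; i += 4)
    {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }
    for (; i < n; ++i)
    {
        total += data[i];
    }
    return total;
}
```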

Loops

Loop jamming

Combining adjacent loops which iterate over the same range of values.

E.g.

    for (i = 0; i < 128; i++)
        ans = ans * 8;
    for (i = 0; i < 128; i++)
        otherVal += someArray[i];

becomes

    for (i = 0; i < 128; i++) {
        ans = ans * 8;
        otherVal += someArray[i];
    }
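A more typical use of loop jamming is computing two results over the same array in a single pass. The sketch below assumes a hypothetical Stats struct and function sum_and_max; it requires at least one element.

```cpp
#include <cassert>

struct Stats   // hypothetical result type for this example
{
    int sum;
    int max;
};

// One pass over the data produces both the sum and the maximum, instead of
// one loop for each. Assumes n >= 1 so that data[0] exists.
Stats sum_and_max(const int* data, int n)
{
    Stats s = { 0, data[0] };
    for (int i = 0; i < n; ++i)
    {
        s.sum += data[i];
        if (data[i] > s.max)
        {
            s.max = data[i];
        }
    }
    return s;
}
```

Besides halving the loop overhead, each element is loaded once rather than twice.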

Loops

Loop inversion

Rewriting loops to run from n to 0, rather than 0 to n, so that the loop test is a comparison against zero.

E.g.

    for (i = 0; i < 128; i++)

becomes

    i = 128;
    while (i--) { }   // the body sees i = 127 down to 0

Loops

Simplifying loop conditions

Execution of a loop can be faster if the loop control conditions are simplified. For example:

    for (i = 0; i <= 100; i++)

versus

    for (i = 101; i--; )   // same 101 iterations, with i running from 100 down to 0
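The counting-down form only works as a drop-in replacement when the order of iteration does not matter. Summing an array is one such case; both function names below are invented for this sketch.

```cpp
#include <cassert>

// Conventional loop: compares i against n on every iteration.
int sum_forward(const int* data, int n)
{
    int total = 0;
    for (int i = 0; i < n; ++i)
    {
        total += data[i];
    }
    return total;
}

// Counting down: the loop test is just "is i non-zero", and the decrement
// happens as part of the test. The body sees i = n-1 down to 0.
int sum_countdown(const int* data, int n)
{
    int total = 0;
    for (int i = n; i--; )
    {
        total += data[i];
    }
    return total;
}
```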

Loops

Function looping

Used when functions are called inside loops. Rather than calling the function inside the loop, call the function once and rewrite the loop inside the function. This removes the overhead of a function call on every iteration.

E.g.

    for (i = 0; i < 1000; i++)
        doFunc();

becomes

    doFunc();

    void doFunc()
    {
        int i;
        for (i = 0; i < 1000; i++)
        {
            // do something
        }
    }
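As an illustration with some actual work in the loop, the sketch below moves a per-particle update into a single call that loops internally. The Particle struct and update_all function are assumptions, not taken from the slides.

```cpp
#include <cassert>

struct Particle   // hypothetical game object
{
    int x;
    int vx;
};

// Called once per frame; the loop lives inside the function, so there is one
// call in total rather than one call per particle.
void update_all(Particle* particles, int count)
{
    for (int i = 0; i < count; ++i)
    {
        particles[i].x += particles[i].vx;
    }
}
```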

Conditional Statements

Exploit lazy evaluation:

In if statements like (a<c && b<d), make sure the first condition is the one most likely to be false, as the rest will then not be evaluated.

In large if statements (|| conditions or long if/else if chains), make sure the first case is the one most likely to be true, as the remaining cases are then skipped.

Prefer switches to if statements:

E.g.

    if (val == 1)
        dostuff1();
    else if (val == 2)
        dostuff2();
    else if (val == 3)
        dostuff3();

versus

    switch (val)
    {
        case 1:
            dostuff1();
            break;
        case 2:
            dostuff2();
            break;
        case 3:
            dostuff3();
            break;
    }
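The lazy evaluation point can be checked with a small sketch that counts how often the second half of an && condition actually runs. The functions and the call counter here are hypothetical.

```cpp
#include <cassert>

int g_expensive_calls = 0;   // hypothetical counter, for demonstration only

// Stands in for a costly test; counts how many times it is evaluated.
bool expensive_check(int v)
{
    ++g_expensive_calls;
    return v % 7 == 0;
}

// The cheap comparison comes first. When it is false, && short-circuits and
// expensive_check is never called.
bool accept(int v)
{
    return v > 1000 && expensive_check(v);
}
```

Calling accept(5) leaves the counter untouched, while accept(1001) has to evaluate both halves.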

Conditional Statements

Binary Breakdown:

Structure if/else statements to consider blocks of statements, rather than just one statement at a time. For example:

    if (a == 1) { }
    else if (a == 2) { }
    else if (a == 3) { }
    else if (a == 4) { }
    else if (a == 5) { }
    else if (a == 6) { }
    else if (a == 7) { }
    else if (a == 8) { }

versus

    if (a <= 4)
    {
        if (a == 1) { }
        else if (a == 2) { }
        else if (a == 3) { }
        else if (a == 4) { }
    }
    else
    {
        if (a == 5) { }
        else if (a == 6) { }
        else if (a == 7) { }
        else if (a == 8) { }
    }

With eight cases, the worst case drops from eight comparisons to five.
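The breakdown can be taken further by halving the range at every level, as in a binary search. The sketch below uses a made-up points table for ranks 1 to 8; the function name and the values are assumptions for illustration.

```cpp
#include <cassert>

// Maps a rank from 1 to 8 to a score by halving the candidate range at each
// step. After the range check, any rank is resolved in three comparisons.
int points_for_rank(int rank)
{
    if (rank < 1 || rank > 8)
    {
        return 0;   // outside the table
    }
    if (rank <= 4)
    {
        if (rank <= 2)
        {
            return (rank == 1) ? 100 : 80;
        }
        return (rank == 3) ? 60 : 50;
    }
    if (rank <= 6)
    {
        return (rank == 5) ? 40 : 30;
    }
    return (rank == 7) ? 20 : 10;
}
```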

Function Design

Keep functions small and simple. This enables the compiler to perform optimisations easily.

Make the number of parameters being passed relative to the work done by the function.

When parameters are passed to a function, the first few are placed in fast registers (four word-sized arguments on the ARM CPUs in the DS) and any further ones are stored on the stack.

If a function does little work or is called often, the number of parameters passed to it should be minimised.

Promote the use of “leaf” functions. A leaf function is one which does not call any other functions. These can be optimised by a compiler, as it does not have to perform any stack management for calling other functions.

C++ Optimisation

Minimise use of virtual functions.

Minimise constructor overhead:

Constructors are called everywhere in C++, even when we don’t expect it.

Therefore, optimising constructor overhead can go a long way to speeding up code execution.

Mechanisms of doing this include:

Pass object parameters by reference (same goes for structs in C).

Prefer prefix (++x) to postfix (x++) operators.

Prefer two-phase to one-phase object creations.

Use constructor initialisation lists.

Define local variables in the innermost scope.

Prefer initialisation over assignment.
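For the first mechanism in the list, the sketch below counts copy-constructor calls to show what passing by value costs. The Sprite type and both functions are hypothetical.

```cpp
#include <cassert>

struct Sprite   // hypothetical object type that counts how often it is copied
{
    static int copies;
    int x;
    int y;

    Sprite(int x_, int y_) : x(x_), y(y_) {}
    Sprite(const Sprite& other) : x(other.x), y(other.y) { ++copies; }
};
int Sprite::copies = 0;

// Pass by value: the argument is copy-constructed for every call.
int x_by_value(Sprite s)
{
    return s.x;
}

// Pass by const reference: no copy is made; the function sees the caller's object.
int x_by_reference(const Sprite& s)
{
    return s.x;
}
```

For small built-in types such as int, passing by value remains the cheaper choice.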

C++ Optimisation

Prefer two-phase to one-phase object creations

E.g. do a minimal amount of work in the actual constructor, and use “create” methods to allocate memory etc.

Use constructor initialisation lists

When constructing objects that contain other objects, assigning the members within the constructor body results in lower performance than using an initialisation list, because each member is default-constructed first and then assigned.

E.g.

    myObj(string name, string job) : m_name(name), m_job(job)
    {
        // rest of ctor
    }

versus

    myObj(string name, string job)
    {
        m_name = name;
        m_job = job;
    }
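The difference between the two constructor styles can be made visible by counting calls on the member type. In this sketch, Name, WithInitList and WithAssignment are all invented for the purpose; Name stands in for a member such as a string.

```cpp
#include <cassert>

// Hypothetical member type that records how it is constructed and assigned.
struct Name
{
    static int default_ctors;
    static int copy_ctors;
    static int assignments;

    Name() { ++default_ctors; }
    Name(const Name&) { ++copy_ctors; }
    Name& operator=(const Name&) { ++assignments; return *this; }

    static void reset() { default_ctors = copy_ctors = assignments = 0; }
};
int Name::default_ctors = 0;
int Name::copy_ctors = 0;
int Name::assignments = 0;

struct WithInitList
{
    Name m_name;
    // The member is copy-constructed directly: one call.
    explicit WithInitList(const Name& name) : m_name(name) {}
};

struct WithAssignment
{
    Name m_name;
    // The member is default-constructed before the body runs, then assigned: two calls.
    explicit WithAssignment(const Name& name) { m_name = name; }
};
```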

C++ Optimisation

Define local objects in the innermost scope

E.g.

    int func(int param)
    {
        Obj A;   // constructed on every call, even when param != 1
        if (param == 1)
        {
            A.value += 10;
            // rest of if statement
        }
        // rest of function
    }

versus

    int func(int param)
    {
        if (param == 1)
        {
            Obj A;   // constructed only when needed
            A.value += 10;
        }
        // rest of function
    }

C++ Optimisation

Prefer initialisation over assignment

E.g.

    Obj B;
    // work done on B
    Obj A;
    A = B;   // requires 2 calls: the default constructor, then the assignment operator

vs.

    Obj B;
    // work done on B
    Obj A = B;   // requires 1 call: the copy constructor
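The call counts can be confirmed with a small instrumented class. The definition of Obj below is an assumption made for this sketch (the slides do not show one), and the two helper functions are hypothetical.

```cpp
#include <cassert>

// Hypothetical Obj that counts every constructor and assignment call.
struct Obj
{
    static int calls;
    int value;

    Obj() : value(0) { ++calls; }
    Obj(const Obj& other) : value(other.value) { ++calls; }
    Obj& operator=(const Obj& other) { value = other.value; ++calls; return *this; }
};
int Obj::calls = 0;

// Default-construct, then assign: returns the number of calls made for A.
int calls_with_assignment(const Obj& b)
{
    Obj::calls = 0;
    Obj a;
    a = b;
    return Obj::calls;
}

// Construct directly from b: returns the number of calls made for A.
int calls_with_initialisation(const Obj& b)
{
    Obj::calls = 0;
    Obj a = b;
    return Obj::calls;
}
```

For an object b, the first helper reports 2 calls and the second reports 1.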