CW208-3/4-2010/11, Chp 2: Lec 4. Performance & Optimisation techniques. Programming for Games Devices.

Agenda

- Introduction
- Measuring performance
  - Traceview
  - Caliper
- Optimising
  - High-level
  - Low-level

Code optimisations

Games fall broadly into one of two categories:

1) Input-driven games
   - Display the current state of gameplay, then wait for the player to make a move.
   - Examples include card games, puzzle games or turn-based strategy games.

2) Real-time games (or skill and action games)
   - Characterized by lots of screen movement.
   - Generally require rapid inputs from the player.

Optimisation is usually not necessary for input-driven games, although there are exceptions (examples?)

Real-time games are clearly good candidates for some optimisation.

Don't optimise if your game does not require it!

Will optimisation add to the game's 'fun' factor? If not, don't optimise.

Good rules of thumb:
- Monitor game performance as the code base grows!
- Integrate features into the full game early.
- Continually test the code base against the game's performance.
- There is often a trade-off between speed and increased memory usage.
- Do not optimise early in the dev cycle:
  - It makes code difficult to read/maintain.
  - Don't waste time optimising a game feature that gets binned.

Some of the pitfalls of optimising code are:
- Optimisation is a good way to introduce bugs.
- Some techniques decrease the portability of your code.
- Perhaps lots of effort for little reward.
- Optimisation is hard (the emulator might not 'emulate' the handset too accurately!).

High-level optimisations generally consider the efficiency of the architecture and algorithms used in a project

Low-level optimisations generally focus on low-level coding constructs

High-Level optimisations

There are two basic rules for writing efficient code:
1) Don't do work that you don't need to do.
2) Don't allocate memory if you can avoid it.

Let’s consider some examples of high-level optimisations for game projects

Scenario: our game performance is degrading and we need to consider simple ways of speeding things up

Solution #1: ignore collisions between certain game objects...

Trick is to avoid compromising the gameplay and making the game less ‘fun’ to play
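
One way to sketch Solution #1 is with collision categories: each object carries a category bit plus a mask of the categories it should be tested against, and pairs that do not match are skipped before any geometry test runs. The names below (GameObject, the CATEGORY_ constants, collides) are hypothetical and chosen only for illustration; they are not from the lecture.

```java
// Hypothetical sketch: skip collision tests between categories of object
// that the gameplay does not need (for example, bullet against bullet).
final class GameObject {
    static final int CATEGORY_PLAYER = 1;      // bit 0
    static final int CATEGORY_ENEMY  = 1 << 1; // bit 1
    static final int CATEGORY_BULLET = 1 << 2; // bit 2

    final int category;    // what this object is
    final int collideMask; // categories this object should be tested against
    final float x, y, radius;

    GameObject(int category, int collideMask, float x, float y, float radius) {
        this.category = category;
        this.collideMask = collideMask;
        this.x = x;
        this.y = y;
        this.radius = radius;
    }

    // Cheap bit test first; the geometry test only runs for relevant pairs.
    static boolean collides(GameObject a, GameObject b) {
        if ((a.collideMask & b.category) == 0 || (b.collideMask & a.category) == 0) {
            return false; // this pair is ignored by design
        }
        float dx = a.x - b.x;
        float dy = a.y - b.y;
        float r = a.radius + b.radius;
        return dx * dx + dy * dy <= r * r; // circle overlap test
    }
}
```

Which pairs can safely be ignored is a gameplay decision: two bullets passing through each other is rarely noticed, whereas a bullet passing through an enemy would be.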

HL optimisations: Object Pooling

Object creation is slow, and creating many short-lived objects leads to increased garbage collection and reduced available memory.

Solution #2: Use object pooling...a technique whereby a set (or pool) of objects is created and then handed out upon request.

For example, projectiles such as bullets might be good candidate objects for pooling….
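
A minimal sketch of a bullet pool follows, assuming a fixed pool size and a simple Bullet class. Both classes and all method names (obtain, release, available) are hypothetical. The pool creates every Bullet up front and hands the same instances out again after they are released.

```java
import java.util.ArrayDeque;

// Hypothetical pooled object: state is reset on reuse instead of re-created.
final class Bullet {
    float x, y, dx, dy;
    boolean active;

    void reset(float x, float y, float dx, float dy) {
        this.x = x;
        this.y = y;
        this.dx = dx;
        this.dy = dy;
        this.active = true;
    }
}

final class BulletPool {
    private final ArrayDeque<Bullet> free = new ArrayDeque<Bullet>();

    // All allocation happens here, once, before gameplay starts.
    BulletPool(int size) {
        for (int i = 0; i < size; i++) {
            free.push(new Bullet());
        }
    }

    // Hands out a pooled bullet, or null if the pool is exhausted.
    Bullet obtain(float x, float y, float dx, float dy) {
        Bullet b = free.poll();
        if (b != null) {
            b.reset(x, y, dx, dy);
        }
        return b;
    }

    // Returns a bullet to the pool so it can be handed out again.
    void release(Bullet b) {
        b.active = false;
        free.push(b);
    }

    int available() {
        return free.size();
    }
}
```

Returning null when the pool is empty is one possible policy; growing the pool or recycling the oldest bullet are alternatives with different memory trade-offs.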

Note: The efficiency of pooling objects vs. creating and disposing of objects is highly dependent on the size and complexity of the objects.

What problems do you think caching unneeded objects might have?

HL optimisations: Algorithms

It turns out our game project is using a bubble sort algorithm with a running time of O(n^2).

Solution #3: Consider alternative algorithms such as quick sort, merge sort, heap sort etc.

Low-level optimisations

Low-level optimisations relate mostly to coding techniques, which we now consider.

HL optimisations: Algorithms

Another simple piece of advice is to use the right algorithm for the task at hand.

For example, don’t bubble sort when you can mergesort or quicksort
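
As a rough illustration, the sketch below puts a hand-written bubble sort next to the standard library sort. The class and method names are hypothetical. java.util.Arrays.sort is a real library call, and for primitive arrays it typically runs in O(n log n) time.

```java
import java.util.Arrays;

final class SortDemo {
    // O(n^2) bubble sort, shown only for comparison.
    static void bubbleSort(int[] a) {
        for (int end = a.length - 1; end > 0; end--) {
            for (int i = 0; i < end; i++) {
                if (a[i] > a[i + 1]) {
                    int tmp = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = tmp;
                }
            }
        }
    }

    // Library sort: usually the better choice unless profiling says otherwise.
    static void librarySort(int[] a) {
        Arrays.sort(a);
    }
}
```

Both methods sort in place and give the same result; the difference only shows up in running time as the array grows.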

Low-level optimisations

Android apps will run on multiple hardware platforms:
- Different VMs run on different processors at different speeds.
- The emulator is not a reliable indicator of performance on the actual device.
- There are huge differences between devices with and without a JIT: the "best" code for a device with a JIT is not always the best code for a device without.
- Always test on the target device.

Allocating memory is always more expensive than not allocating memory.

For example, allocating objects in a UI loop will force periodic garbage collection, which degrades performance.
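
The sketch below contrasts an update method that allocates a temporary object on every call with one that reuses a preallocated scratch object. Vector2 and Player are hypothetical classes invented for this illustration.

```java
// Hypothetical helper type used as a temporary value.
final class Vector2 {
    float x, y;

    void set(float x, float y) {
        this.x = x;
        this.y = y;
    }
}

final class Player {
    float x, y;
    float speed = 2f;

    // Allocated once and reused on every update.
    private final Vector2 scratch = new Vector2();

    // Allocates one new Vector2 per call, which means one per frame in a game loop.
    void updateAllocating(float dirX, float dirY) {
        Vector2 step = new Vector2();
        step.set(dirX * speed, dirY * speed);
        x += step.x;
        y += step.y;
    }

    // Same result, but no allocation inside the loop.
    void updateReusing(float dirX, float dirY) {
        scratch.set(dirX * speed, dirY * speed);
        x += scratch.x;
        y += scratch.y;
    }
}
```

The reusing version is not thread-safe, because the scratch object is shared between calls; that is usually acceptable in a single-threaded game loop.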

Some examples of good practice follow:

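Extracting strings

When extracting strings from a set of input data, try to return a substring of the original data instead of creating a copy. You will create a new String object, but on the VMs discussed here it will share the char[] with the data. (This sharing is an implementation detail; newer Java runtimes, such as OpenJDK 7u6 and later, copy the characters instead.)

    String.substring(int start, int end)
    String.substring(int start)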
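
A small sketch of both overloads in use, assuming input records of the form key=value, optionally followed by a semicolon. The RecordParser class and the record format are hypothetical.

```java
final class RecordParser {
    // Returns the text between the first '=' and the following ';',
    // or everything after '=' when no ';' is present.
    static String valueOf(String record) {
        int start = record.indexOf('=') + 1;
        int end = record.indexOf(';', start);
        return end < 0
                ? record.substring(start)       // substring(int start)
                : record.substring(start, end); // substring(int start, int end)
    }
}
```

For example, RecordParser.valueOf("name=player1;score=4200") yields "player1". Whether the returned String shares storage with the input depends on the runtime.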

If you have a method returning a string, and you know that its result will always be appended to a StringBuffer anyway, change your signature and implementation so that the function does the append directly, instead of creating a short-lived temporary object.

For example...

Instead of:

    // Assuming buffer is a StringBuffer
    buffer.append( myMethod.getString() );

Refactor to:

    void append(StringBuffer s1) {
        // append the string inside the method
        s1.append( ... );
    }
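
A runnable version of the same idea, using a hypothetical Score class: getString() builds a temporary String for the caller to append, while appendTo() writes straight into the caller's StringBuffer.

```java
final class Score {
    private final String name;
    private final int points;

    Score(String name, int points) {
        this.name = name;
        this.points = points;
    }

    // Before: builds a temporary String that the caller appends and discards.
    String getString() {
        return name + ": " + points;
    }

    // After: appends directly into the caller's buffer, with no temporary String.
    void appendTo(StringBuffer out) {
        out.append(name).append(": ").append(points);
    }
}
```

The caller then writes score.appendTo(buffer) rather than buffer.append(score.getString()).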

Avoid multi-dimensional arrays; use single-dimensional arrays instead.

Also, an array of int is much better than an array of java.lang.Integer objects. The same applies to the other classes that wrap primitives (e.g. Float, Double etc.)
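
To make the primitive-versus-wrapper point concrete, here is a short sketch with a hypothetical ScoreTable class. The int[] version stores the values directly, whereas each Integer[] element is a reference to a separate object that has to be unboxed on every read.

```java
final class ScoreTable {
    // Primitive array: values are stored directly in the array.
    static int sum(int[] scores) {
        int total = 0;
        for (int i = 0; i < scores.length; i++) {
            total += scores[i];
        }
        return total;
    }

    // Wrapper array: every element is a reference to an Integer object,
    // and each read unboxes it back to an int.
    static int sum(Integer[] scores) {
        int total = 0;
        for (int i = 0; i < scores.length; i++) {
            total += scores[i];
        }
        return total;
    }
}
```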

Use one-dimensional arrays

Example:

    // Before
    int[][] world;   // a 4x4 table
    world[row][col] = 0;

    // After
    int[] world;     // a 1x16 table
    world[col * colOffset + row] = 0;   // assuming colOffset is the length of one column (4 here)

1D arrays consume less heap memory.
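
A self-contained sketch of a 2D grid stored in a single array, using a hypothetical Grid class. It uses row-major indexing (row * cols + col), whereas the slide's expression is column-major; either layout works as long as it is applied consistently.

```java
final class Grid {
    final int rows;
    final int cols;
    private final int[] cells; // rows * cols entries in one flat array

    Grid(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.cells = new int[rows * cols];
    }

    // Row-major mapping from (row, col) to an index in the flat array.
    private int index(int row, int col) {
        return row * cols + col;
    }

    int get(int row, int col) {
        return cells[index(row, col)];
    }

    void set(int row, int col, int value) {
        cells[index(row, col)] = value;
    }
}
```

Hiding the index arithmetic behind get and set keeps call sites readable, at the cost of a method call that may or may not be inlined on a given device.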

Low-level optimisations

JIT or no JIT?
- Without a JIT: caching field accesses is about 20% faster than repeatedly accessing the field.
- With a JIT: field access costs about the same as local access, so this is not a worthwhile optimisation (the same applies to final, static, and static final fields).
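
For reference, caching field accesses looks roughly like the sketch below, in which a hypothetical ParticleSystem copies its fields into locals before the loop. Whether it is measurably faster depends on the device, as noted above.

```java
final class ParticleSystem {
    private final float[] ys;
    private float fallSpeed;

    ParticleSystem(int count, float fallSpeed) {
        this.ys = new float[count];
        this.fallSpeed = fallSpeed;
    }

    // Reads the fields ys and fallSpeed on every iteration.
    void stepUncached(float dt) {
        for (int i = 0; i < ys.length; i++) {
            ys[i] += fallSpeed * dt;
        }
    }

    // Copies the fields into locals once, then works only with the locals.
    void stepCached(float dt) {
        final float[] localYs = ys;
        final float delta = fallSpeed * dt;
        final int n = localYs.length;
        for (int i = 0; i < n; i++) {
            localYs[i] += delta;
        }
    }

    float y(int i) {
        return ys[i];
    }
}
```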

Prefer Static Over Virtual
- If you don't need to access an object's fields, make your method static.
- Invocations will be about 15%-20% faster.
- It helps document the code, as such methods cannot alter the object's state (like const member functions in C++).
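
A brief sketch with a hypothetical Enemy class: the distance helper touches no instance fields, so it is declared static, while the method that needs the object's own position stays an instance method.

```java
final class Enemy {
    float x, y;

    // Uses no instance fields, so it can be static.
    static float distanceSquared(float x1, float y1, float x2, float y2) {
        float dx = x2 - x1;
        float dy = y2 - y1;
        return dx * dx + dy * dy;
    }

    // Needs this object's fields, so it remains an instance method.
    float distanceSquaredTo(float px, float py) {
        return distanceSquared(x, y, px, py);
    }
}
```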

Avoid internal calls to getter/setter methods

Class getter/setter methods are good practice

However, avoid the following:

    class Foo {
        private int mFooInt;

        int getFooInt() {
            return mFooInt;
        }

        void doSomethingWithFooInt() {
            int x = getFooInt();   // generates an expensive virtual method call
        }
    }

Instead rewrite as:

    void doSomethingWithFooInt() {
        int x = mFooInt;
    }

Without a JIT, direct field access is about 3x faster than invoking a trivial getter.

With the JIT (where direct field access is as cheap as accessing a local), direct field access is about 7x faster than invoking a trivial getter.

This is true in Froyo, but will improve in the future when the JIT inlines getter methods.

Use static final for constants

For example consider the following:

    class Foo {
        static int intVal = 42;
        static String strVal = "Hello, world!";
        ...
    }

The compiler generates a class initializer method, called <clinit>, that is executed when class Foo is first used.

<clinit> stores the value 42 into intVal and extracts a reference from the classfile string constant table for strVal.

When these values are referenced later on, they are accessed with field lookups.

Instead use:

    class Foo {
        static final int intVal = 42;
        static final String strVal = "Hello, world!";
        ...
    }

The constants go into static field initializers in the dex file so method <clinit> is no longer required.

References to intVal will use the integer value 42 directly, and accesses to strVal will use a relatively inexpensive "string constant" instruction instead of a field lookup.
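
A compact sketch of the distinction, using a hypothetical GameConfig class: the static final fields are compile-time constants, while the plain static field is an ordinary field that is read with a lookup.

```java
final class GameConfig {
    // Compile-time constants: the compiler can use the values directly at use sites.
    static final int MAX_BULLETS = 32;
    static final String TITLE = "Hello, world!";

    // Not a constant: it can change at run time, so it stays an ordinary static field.
    static int highScore = 0;

    // Uses the constant directly to size an array.
    static int[] newBulletSlots() {
        return new int[MAX_BULLETS];
    }
}
```

Only primitive types and String values initialised from constant expressions are treated this way; a static final field assigned from a method call is still set in the class initializer.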

Some further tips here: http://developer.android.com/guide/practices/design/performance.html

See the tips from the section 'Use Enhanced For Loop Syntax' onwards.
