Games at Bolton

OpenMP Techniques

Andrew Williams
http://www.bolton.ac.uk/staff/adw1

Some Technical Issues

Outside parallel regions, a single thread (the master thread) executes
On encountering a parallel construct, extra threads are created.
  – Each thread has a number
  – The master is always thread 0
Thread number is provided by: int omp_get_thread_num();
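As a rough illustration of thread numbering, the sketch below has every thread in a parallel region record its own number. The function name record_thread_numbers, the MAX_THREADS bound and the request for 4 threads are assumptions made for this example only; the fallback definition just lets the sketch build when OpenMP is switched off.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback so the sketch still builds without OpenMP: one thread, number 0. */
static int omp_get_thread_num(void) { return 0; }
#endif

#define MAX_THREADS 64   /* illustrative bound for this sketch, not an OpenMP limit */

/* Each thread t in the parallel region sets seen[t] = 1.
   Returns how many threads reported in. */
int record_thread_numbers(int seen[MAX_THREADS])
{
    int count = 0;
    for (int t = 0; t < MAX_THREADS; t++)
        seen[t] = 0;

    /* Ask for up to 4 threads; the runtime may provide fewer. */
    #pragma omp parallel num_threads(4)
    {
        int me = omp_get_thread_num();   /* declared inside the region, so private */
        if (me < MAX_THREADS)
            seen[me] = 1;                /* every thread writes a different slot */
    }

    /* Back outside the region: only the master thread runs this loop. */
    for (int t = 0; t < MAX_THREADS; t++)
        count += seen[t];
    return count;
}
```

Whatever the thread count turns out to be, seen[0] should be set, because the master is always thread 0.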

More Technical Issues

In parallel regions, each thread maintains its own stack
  – Therefore, threads can call functions without fear of interference from other threads
  – Threads can even call the same function
      • But watch out for static variables, which will be shared unless you arrange it otherwise!
      • And global variables must also be treated with caution
  – Programming with threads demands more discipline
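The sketch below contrasts a function that only uses stack variables with one that uses a static variable. All function and variable names are invented for illustration, and threadprivate is shown as just one possible way to "arrange it otherwise".

```c
/* Safe to call from many threads: everything it touches lives on the
   calling thread's own stack. */
int square_plus_one(int x)
{
    int result = x * x;      /* automatic variable: one copy per call */
    return result + 1;
}

/* NOT safe to call from several threads as written: the static counter is a
   single object shared by all threads, so the increment is a data race. */
int count_calls_unsafe(void)
{
    static int calls = 0;
    calls++;
    return calls;
}

/* One possible fix: give every thread its own copy of the variable. */
static int calls_per_thread = 0;
#pragma omp threadprivate(calls_per_thread)

int count_calls_per_thread(void)
{
    calls_per_thread++;      /* touches only the calling thread's copy */
    return calls_per_thread;
}

/* Fills out[i] = i*i + 1; all threads call the same function at once. */
void fill_squares(int *out, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        out[i] = square_plus_one(i);
}
```

With threadprivate, each thread counts only its own calls. If a single program-wide total is wanted instead, the counter has to stay shared and be protected in some other way.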

More Technical Issues

On entry to a parallel region, you must decide whether variables are shared or private
On entry to the next parallel region, you decide again
  – A variable may be private one time and shared the next
This is program design, and what your program needs depends on what it is doing.
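A minimal sketch of making that decision twice for the same variable: tmp is private in the first region and shared in the second. The function scale_then_offset and its arguments are hypothetical; default(none) is used here only so that every variable has to be classified explicitly.

```c
/* Multiplies every element by scale, then adds a fixed offset of 10. */
void scale_then_offset(double *data, int n, double scale)
{
    double tmp = 0.0;

    /* Region 1: tmp is private scratch space, one copy per thread. */
    #pragma omp parallel for default(none) shared(data, n, scale) private(tmp)
    for (int i = 0; i < n; i++) {
        tmp = data[i] * scale;
        data[i] = tmp;
    }

    tmp = 10.0;   /* set once, by the single thread running between regions */

    /* Region 2: the same variable is now shared; every thread only reads it. */
    #pragma omp parallel for default(none) shared(data, n, tmp)
    for (int i = 0; i < n; i++)
        data[i] = data[i] + tmp;
}
```

Sharing tmp in the first region would let threads overwrite each other's intermediate values. Sharing it in the second is harmless because no thread writes to it.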

More Technical Issues

As well as private and shared, there is another pseudo-category called a reduction
  – For the very common case where we are trying to calculate the sum (or similar accumulation) of some numbers
  – What happens if we run this over several threads?

    for(i=0; i<200000; i++) {
        tot = tot + arr[i];
    }

Assembly language

Time  One thread             Another thread
T1    Load tot (940, say)    Load tot (954, say)
T2    Add arr[82] (e.g. 2)   Add arr[85] (17)
T3    Oops: WAIT             Write tot (971)
T4    Write tot (942)

Can you see the problem?

Reduction - Solution

There are two ways to resolve this:
1. You can tell the compiler that the variable tot is the target of a reduction in the loop (we say that arr is being reduced)
  • The compiler will then do two things:
      – It will make a copy of tot for each thread
      – At the end of the parallel region it will add all the private tots into the real tot in the program
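Applied to the summing loop from the earlier slide, a reduction might look like the sketch below. The wrapper function sum_with_reduction and the long type for tot are assumptions for the example; the loop body is the one from the slides.

```c
/* Sums arr[0..n-1]. With reduction(+:tot), each thread accumulates into its
   own private copy of tot (which starts at 0), and the private copies are
   added into the original tot when the loop finishes. */
long sum_with_reduction(const int *arr, int n)
{
    long tot = 0;

    #pragma omp parallel for reduction(+:tot)
    for (int i = 0; i < n; i++) {
        tot = tot + arr[i];
    }
    return tot;
}
```

No thread ever touches another thread's copy inside the loop, so the load/add/write sequences can no longer interfere.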

Reduction - Solution

Two ways to resolve this (continued)
2. You can use critical sections
  • A critical section is one where only one thread may act at any one time
  • The compiler ensures this for you

Critical Sections (assembly language)

Time  One thread          Another thread
T1    WAIT – critical     Load tot (954, say)
T2    WAIT – critical     Add arr[85] (17)
T3    WAIT – critical     Write tot (971)
T4    Load tot (971)
T5    Add arr[82] (2)
T6    Write tot (973)
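The same summing loop protected by a critical section could be sketched as follows. As before, the wrapper function sum_with_critical is a name made up for this example.

```c
/* Sums arr[0..n-1]. tot stays shared, but the update is wrapped in a
   critical section, so only one thread at a time can perform the
   load / add / write sequence. */
long sum_with_critical(const int *arr, int n)
{
    long tot = 0;

    #pragma omp parallel for shared(tot)
    for (int i = 0; i < n; i++) {
        #pragma omp critical
        {
            tot = tot + arr[i];
        }
    }
    return tot;
}
```

Every iteration has to queue for the critical section here, so for a loop body this small the version is likely to run slower than the reduction; measuring both is the only way to be sure on a given machine.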

Approximating pi

There are several ways of approximating pi
This one is based on statistics and probability
  – What is the area of the blue circle?
  – What is the area of the red square?

[Figure: a blue circle of radius r = 1 inside a red square of side 2]

Approximating pi

Suppose you were to randomly scatter (a lot of) points around this diagram
You would expect the ratio of dots in the square (i.e. all those you draw) to the dots in the circle to be approximately 4 : pi
  – Because the area of the square is 4 and the area of the circle is pi

Approximating pi

We can do this very simply:
  – For (i = 0 to lotsandlots) begin
      • Randomly generate a pair of numbers in the range 0-2
      • Treating them as (x,y), calculate the distance to the centre of the circle
      • If (distance <= 1.0) inside++
  – End
  – Return 4 * inside/lotsandlots

Approximating pi

Or, to put it another way:

    for(i=0; i < ITERATIONS; i++) {
        /* random point (rx, ry) in the square */
        rx = (((double) rand() / (double) RAND_MAX) * RMAX + RMIN);
        ry = (((double) rand() / (double) RAND_MAX) * RMAX + RMIN);
        /* dist is the squared distance to the centre; since the radius is 1,
           comparing it with 1.0 gives the same answer without a sqrt() */
        dist = (rx-CENTREX)*(rx-CENTREX) + (ry-CENTREY)*(ry-CENTREY);
        if(dist <= 1.0) inside = inside + 1.0;
    }
    myPI = 4 * (inside / (double)ITERATIONS);

piByArea

The code extract from the previous page is contained in a project called piByArea, which is on the web page for the module
  – Your first task is to make an attempt at converting this to use OpenMP
  – You may use either a reduction clause or a critical section
      • Faster students might like to compare the speed of these approaches

LAB TASK – CONVERT piByArea TO OPENMP

#pragma omp parallel for ….. reduction(+:inside)
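One possible shape for the reduction version is sketched below; it is not the piByArea project code. The constant values (RMIN 0, RMAX 2, centre at (1,1)) are assumed from the "range 0-2" slide, and next_unit is a hypothetical per-thread generator substituted for rand(), because rand() is not guaranteed to be safe or fast when called from several threads at once.

```c
#include <stdint.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback so the sketch still builds without OpenMP. */
static int omp_get_thread_num(void) { return 0; }
#endif

/* Assumed values, taken from the "range 0-2" slide, not from piByArea. */
#define RMIN    0.0
#define RMAX    2.0
#define CENTREX 1.0
#define CENTREY 1.0

/* Hypothetical helper: a small xorshift generator whose state belongs to
   one thread. Returns a double in [0, 1). */
static double next_unit(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x / 4294967296.0;   /* divide by 2^32 */
}

double pi_by_area(long iterations)
{
    long inside = 0;

    /* Each thread gets a private copy of inside; the copies are summed
       into the original when the region ends. */
    #pragma omp parallel reduction(+:inside)
    {
        /* Private generator state, seeded differently on each thread. */
        uint32_t state = 2463534242u + 7919u * (uint32_t)omp_get_thread_num();

        #pragma omp for
        for (long i = 0; i < iterations; i++) {
            double rx = next_unit(&state) * RMAX + RMIN;
            double ry = next_unit(&state) * RMAX + RMIN;
            double dist = (rx - CENTREX) * (rx - CENTREX)
                        + (ry - CENTREY) * (ry - CENTREY);
            if (dist <= 1.0)
                inside++;
        }
    }
    return 4.0 * (double)inside / (double)iterations;
}
```

Calling pi_by_area(1000000) should give a value close to pi, though the exact digits depend on the number of threads, since each thread draws its own sequence of points.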