Games at Bolton: OpenMP Techniques. Andrew Williams.
Date posted: 20-Dec-2015
Games at Bolton
Some Technical Issues
Outside parallel regions, a single thread (the master thread) executes
On encountering a parallel construct, extra threads are created
  – Each thread has a number
  – The master is always thread 0
Thread number is provided by: int omp_get_thread_num();
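As a minimal sketch of thread numbering, the function below lets every thread in a parallel region report its own number. The function name report_threads and the seen array are hypothetical, chosen only for illustration; the #ifdef fallback is an assumption added so the sketch still builds when OpenMP is not enabled.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback (assumption): without OpenMP there is only the master thread. */
static int omp_get_thread_num(void) { return 0; }
#endif

/* Marks seen[t] = 1 for every thread number t that ran the region.
   Returns seen[0], which should be 1 because the master is thread 0. */
static int report_threads(int *seen, int max_threads)
{
    #pragma omp parallel
    {
        int id = omp_get_thread_num();   /* declared inside: one per thread */
        if (id < max_threads) {
            seen[id] = 1;                /* each thread writes its own slot */
        }
        printf("Hello from thread %d\n", id);
    }
    return seen[0];
}
```

The order of the printed lines is not fixed; it can change from run to run.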
More Technical Issues
In parallel regions, each thread maintains its own stack
  – Therefore, threads can call functions without fear of interference from other threads
  – Threads can even call the same function
      • But watch out for static variables, which will be shared unless you arrange it otherwise!
      • And global variables must also be treated with caution
  – Programming with threads demands more discipline
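The points above can be sketched as follows. Local variables in square_plus_one live on the calling thread's stack, so concurrent calls do not interfere. The counter is a file-scope static, which would normally be shared; the threadprivate directive is one possible way of "arranging it otherwise". All names here are hypothetical.

```c
/* A file-scope static would normally be one object shared by all threads.
   threadprivate asks OpenMP to give each thread its own copy instead. */
static int calls_per_thread = 0;
#pragma omp threadprivate(calls_per_thread)

/* tmp is a local, so it lives on the calling thread's own stack. */
static int square_plus_one(int x)
{
    int tmp = x * x;
    return tmp + 1;
}

/* Every thread calls the same function on different elements. */
static void fill_squares(int *out, int n)
{
    int i;
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        out[i] = square_plus_one(i);
        calls_per_thread++;      /* safe: updates this thread's copy only */
    }
}

/* Returns the count belonging to whichever thread calls it. */
static int calls_on_this_thread(void)
{
    return calls_per_thread;
}
```

Without the threadprivate line, the increment would be an unprotected update of a shared variable.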
More Technical Issues
On entry to a parallel region, you must decide whether variables are shared or private
On entry to the next parallel region, you decide again
  – A variable may be private one time and shared the next
This is program design, and what your program needs depends on what it is doing.
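A small sketch of that decision, using the private and shared clauses. The variable tmp is private in the first region (each thread needs its own scratch copy) and shared in the second (every thread only reads it). The function name and the offset parameter are hypothetical.

```c
/* Squares each element, then adds offset to each element. */
static void square_then_shift(double *arr, int n, double offset)
{
    int i;
    double tmp = 0.0;

    /* Region 1: tmp is scratch space, so each thread gets a private copy. */
    #pragma omp parallel for private(tmp) shared(arr)
    for (i = 0; i < n; i++) {
        tmp = arr[i];
        arr[i] = tmp * tmp;
    }

    /* Between regions only the master thread runs, so this is safe. */
    tmp = offset;

    /* Region 2: tmp is now shared; the threads read it but never write it. */
    #pragma omp parallel for shared(arr, tmp)
    for (i = 0; i < n; i++) {
        arr[i] = arr[i] + tmp;
    }
}
```

Had tmp been left shared in the first region, the threads would overwrite each other's scratch values.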
More Technical Issues
As well as private and shared, there is another pseudo-category called a reduction
  – For the very common case where we are trying to calculate the sum (or similar accumulation) of some numbers
  – What happens if we run this over several threads?
        for(i=0; i<200000; i++) {
            tot = tot + arr[i];
        }
Assembly language
Time  Thread A                Thread B
T1    Load tot (940, say)     Load tot (954, say)
T2    Add arr[82] (e.g. 2)    Add arr[85] (17)
T3    Oops: WAIT              Write tot (971)
T4    Write tot (942)
Can you see the problem?
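One way to see the problem is to replay the steps by hand. The sketch below is an ordinary sequential function that imitates two threads interleaving the load, add and write steps of tot = tot + x, so the result is the same on every run. It is a simulation, not real threading. The function name is hypothetical, and as a simplifying assumption both simulated threads load the same starting value.

```c
/* Imitates two threads running "tot = tot + x" with no protection.
   reg_a and reg_b stand for each thread's CPU register. */
static int lost_update_demo(int tot, int add_a, int add_b)
{
    int reg_a = tot;     /* thread A: load tot                      */
    int reg_b = tot;     /* thread B: load the same old value       */
    reg_a += add_a;      /* thread A: add its array element         */
    reg_b += add_b;      /* thread B: add its array element         */
    tot = reg_b;         /* thread B: write tot                     */
    tot = reg_a;         /* thread A: late write, overwrites B's    */
    return tot;          /* thread B's addition has been lost       */
}
```

Starting from 940 with additions of 2 and 17, the function returns 942 rather than the correct 959: one update has vanished.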
Reduction - Solution
There are two ways to resolve this:
1. You can tell the compiler that the variable tot is the target of a reduction in the loop (we say that arr is being reduced)
  • The compiler will then do two things:
      – It will make a copy of tot for each thread
      – At the end of the parallel region it will add all the private tots into the real tot in the program
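A minimal sketch of the first solution, applied to the summing loop shown earlier. The function wrapper and its name are hypothetical; the reduction(+:tot) clause is the standard OpenMP way of marking tot as the target of a sum reduction.

```c
/* Sums arr[0..n-1]. Each thread accumulates into its own private tot,
   and the private copies are added into the real tot when the loop ends. */
static long sum_with_reduction(const int *arr, int n)
{
    long tot = 0;
    int i;
    #pragma omp parallel for reduction(+:tot)
    for (i = 0; i < n; i++) {
        tot = tot + arr[i];
    }
    return tot;
}
```

The loop body is unchanged from the serial version; only the pragma line is new.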
Reduction - Solution
Two ways to resolve this (continued)
2. You can use critical sections
  • A critical section is one where only one thread may act at any one time
  • The compiler ensures this for you
Critical Sections (assembly language)
Time  Thread A                Thread B
T1    WAIT – critical         Load tot (954, say)
T2    WAIT – critical         Add arr[85] (17)
T3    WAIT – critical         Write tot (971)
T4    Load tot (971)
T5    Add arr[82] (2)
T6    Write tot (973)
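The same summing loop can be sketched with a critical section instead. The function name is hypothetical. Here tot stays shared, and the critical directive ensures that only one thread at a time performs the load, add and write steps.

```c
/* Sums arr[0..n-1] with a shared tot protected by a critical section. */
static long sum_with_critical(const int *arr, int n)
{
    long tot = 0;                /* shared by all threads in the loop */
    int i;
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        #pragma omp critical
        {
            tot = tot + arr[i];  /* one thread at a time in here */
        }
    }
    return tot;
}
```

Because every single addition has to wait its turn, this version usually runs slower than the reduction, although both give the same answer.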
Approximating pi
There are several ways of approximating pi
This one is based on statistics and probability
  – What is the area of the blue circle?
  – What is the area of the red square?
[Figure: a blue circle of radius r = 1 inside a red square of side 2]
Approximating pi
Suppose you were to randomly scatter (a lot of) points around this diagram
You would expect the ratio of dots in the square (i.e. all those you draw) to the dots in the circle to be approximately 4 : pi
  – Because the area of the square is 4 and the area of the circle is pi
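Written out as a formula, the argument looks like this. The symbols N_circle and N_square (the number of dots landing in the circle and in the whole square) are names introduced here for illustration only.

```latex
\frac{N_{\text{circle}}}{N_{\text{square}}}
  \approx \frac{\text{area of circle}}{\text{area of square}}
  = \frac{\pi r^{2}}{(2r)^{2}}
  = \frac{\pi}{4}
\quad\Longrightarrow\quad
\pi \approx 4 \cdot \frac{N_{\text{circle}}}{N_{\text{square}}}
```

With r = 1 the two areas are pi and 4, which gives the 4 : pi ratio of square to circle.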
Approximating pi
We can do this very simply:
  – For (i = 0 to lotsandlots) begin
      • Randomly generate a pair of numbers in the range 0-2
      • Treating them as (x,y), calculate the distance to the centre of the circle
      • If (distance <= 1.0) inside++
  – End
  – Return 4 * inside/lotsandlots
Approximating pi
Or, to put it another way:
    for (i = 0; i < ITERATIONS; i++) {
        /* pick a random point (rx, ry) in the square */
        rx = (((double) rand() / (double) RAND_MAX) * RMAX + RMIN);
        ry = (((double) rand() / (double) RAND_MAX) * RMAX + RMIN);
        /* squared distance to the centre; with r = 1 no square root is needed */
        dist = (rx-CENTREX)*(rx-CENTREX) + (ry-CENTREY)*(ry-CENTREY);
        if (dist <= 1.0) inside = inside + 1.0;
    }
    myPI = 4 * (inside / (double)ITERATIONS);
piByArea
The code extract from the previous page is contained in a project called piByArea, which is on the web-page for the module
  – Your first task is to make an attempt at converting this to use OpenMP
  – You may use either a reduction clause or a critical section
      • Faster students might like to compare the speed of these approaches
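For the speed comparison, one possible approach is to wrap each version in a timer. The sketch below times the array-summing loop from the earlier slides rather than piByArea itself, so the exercise is left open. It uses omp_get_wtime(), the OpenMP wall-clock timer. The function names are hypothetical, and the clock()-based fallback is an assumption added so the sketch builds without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
#include <time.h>
/* Fallback timer (assumption) for builds without OpenMP. */
static double omp_get_wtime(void) { return (double)clock() / CLOCKS_PER_SEC; }
#endif

/* Version 1: reduction clause. */
static long sum_reduction(const int *arr, int n)
{
    long tot = 0;
    int i;
    #pragma omp parallel for reduction(+:tot)
    for (i = 0; i < n; i++) {
        tot = tot + arr[i];
    }
    return tot;
}

/* Version 2: critical section around the shared update. */
static long sum_critical(const int *arr, int n)
{
    long tot = 0;
    int i;
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        #pragma omp critical
        {
            tot = tot + arr[i];
        }
    }
    return tot;
}

/* Runs sum_fn once, stores its answer in *result,
   and returns the elapsed wall-clock time in seconds. */
static double time_sum(long (*sum_fn)(const int *, int),
                       const int *arr, int n, long *result)
{
    double start = omp_get_wtime();
    *result = sum_fn(arr, n);
    return omp_get_wtime() - start;
}
```

Calling time_sum once with sum_reduction and once with sum_critical on the same array gives two times to compare; checking that both results match is a useful sanity test. A single run can be noisy, so repeating the measurement a few times is advisable.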