Introduction to CUDA Programming Scan Algorithm Explained Andreas Moshovos Winter 2009.

Post on 18-Dec-2015


Introduction to CUDA Programming: Scan Algorithm Explained

Andreas Moshovos, Winter 2009

Reading

• You are strongly encouraged to read the following, as it contains a more formal treatment of the algorithm plus an overview of various applications of scan.

– Guy E. Blelloch. “Prefix Sums and Their Applications”. In John H. Reif (Ed.), Synthesis of Parallel Algorithms, Morgan Kaufmann, 1990. http://www.cs.cmu.edu/afs/cs.cmu.edu/project/scandal/public/papers/CMU-CS-90-190.html

Two phases

• Up-Sweep

– Essentially a reduction

– Produces many partial results

• Down-Sweep

– Propagates the partial results to all relevant elements

Up-Sweep

• Just a reduction:

Input: 1 2 2 5 6 3 8 2 4 1 5 2 7 9 3 5

After step 1: 1 3 2 7 6 9 8 10 4 5 5 7 7 16 3 8

After step 2: 1 3 2 10 6 9 8 19 4 5 5 12 7 16 3 24

After step 3: 1 3 2 10 6 9 8 29 4 5 5 12 7 16 3 36

After step 4: 1 3 2 10 6 9 8 29 4 5 5 12 7 16 3 65
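The in-place reduction above can be modelled sequentially. The following is a minimal Python sketch, not CUDA code: the function name `up_sweep` and the `stride` variable are illustrative choices, and the array length is assumed to be a power of two. Each pass of the outer loop stands for one parallel step, since the additions inside it do not depend on each other.

```python
def up_sweep(values):
    """Return the array after the up-sweep, plus a snapshot after each step."""
    a = list(values)
    n = len(a)  # assumption: n is a power of two
    snapshots = []
    stride = 1
    while stride < n:
        # One parallel step: every addition at this level is independent.
        for right in range(2 * stride - 1, n, 2 * stride):
            left = right - stride
            a[right] += a[left]
        snapshots.append(list(a))
        stride *= 2
    return a, snapshots


data = [1, 2, 2, 5, 6, 3, 8, 2, 4, 1, 5, 2, 7, 9, 3, 5]
result, snapshots = up_sweep(data)
for step, snap in enumerate(snapshots, start=1):
    print(f"after step {step}: {snap}")
```

With 16 elements there are four steps, and the last element ends up holding the total, 65.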

Up-Sweep

• Now let’s view this as a tree

1 2 2 5 6 3 8 2 4 1 5 2 7 9 3 5

3 7 9 10 5 7 16 8

10 19 12 24

29 36

65

• Notice we only have these nodes left in our array; the rest were partial results:

1 3 2 10 6 9 8 29 4 5 5 12 7 16 3 65

Up-Sweep

• So, this is what’s left

– nodes without values don’t exist; they were partial results

1 2 6 8 4 5 7 3

3 9 5 16

10 12

29

65
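To connect this tree back to the array in memory, here is a small Python sketch that groups the up-swept array by tree level. The helper `node_level` is hypothetical; it relies on the observation, which holds for this in-place layout, that the level of the node stored at index i equals the number of trailing zero bits of i + 1.

```python
def node_level(index):
    """Tree level of the node stored at `index` (0 = leaf level)."""
    position = index + 1
    level = 0
    while position % 2 == 0:
        position //= 2
        level += 1
    return level


# Array after the up-sweep of the running example.
swept = [1, 3, 2, 10, 6, 9, 8, 29, 4, 5, 5, 12, 7, 16, 3, 65]

levels = {}
for i, value in enumerate(swept):
    levels.setdefault(node_level(i), []).append(value)

for level in sorted(levels):
    print(level, levels[level])
```

The printed rows match the rows of the tree above, from the leaves (level 0) up to the root (level 4).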

Down-Sweep

• For the second phase we need to think of:

– The edges in reverse

– The empty nodes as placeholders for partial results

1 2 6 8 4 5 7 3

3 9 5 16

10 12

29

65

Down-Sweep

• Now let’s view the tree as a collection of subtrees

– The root of each subtree, where it’s still present, contains the reduction of all subtree elements

• i.e., the sum of all subtree elements

1 2 6 8 4 5 7 3

3 9 5 16

10 12

29

65

Down-Sweep

• Let’s focus on the rightmost subtree:

1 2 6 8 4 5 7 3

3 9 5 16

10 12

29

65

Down-Sweep

• Before the last step of the down-sweep phase the yellow element will contain the sum (57) of all elements to the left of the subtree.

3

57

• The last step will take the following two actions:

– 3 + 57 = 60; this goes on the rightmost element

• This is the sum of all elements including 3 but excluding the rightmost one

– Overwrite 3 with 57

• This is the sum of all elements left of 3
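The two actions can be written as one small operation on a pair of array slots. This is a minimal Python sketch; the name `down_sweep_step` and its parameters are illustrative and do not come from the slides.

```python
def down_sweep_step(a, left, right):
    """One down-sweep action on a subtree whose root is stored at a[right].

    Before the call, a[right] holds the sum of everything left of the
    subtree and a[left] holds the sum of the subtree's left half.
    """
    left_half_sum = a[left]
    a[left] = a[right]                   # the move: the left slot takes the root value
    a[right] = a[right] + left_half_sum  # the addition: root value plus the left half


pair = [3, 57]  # the last two array slots just before the final step
down_sweep_step(pair, 0, 1)
print(pair)  # [57, 60]
```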

Down-Sweep

• In terms of the array stored in memory the aforementioned actions look like this:

Before: 3 57

After: 57 60

• Where:

– the dark arrows represent addition

– the red dotted arrow represents a move

Down-Sweep

• Let’s now focus on the rightmost subtree that contains the last four nodes:

– This will be processed at the step before the subtree we just discussed

7 3

16

Down-Sweep

• Before the second-to-last step of the down-sweep phase, the green element will contain the sum (41) of all elements to the left of the subtree.

7 3

16

41

Down-Sweep

• The actions that will be taken at this step are:

– 16 + 41 = 57 will be written as the root of the rightmost subtree

• As we saw before, this is the sum of all elements left of the rightmost subtree

– 41 will replace 16

• This is the sum of all elements left of the subtree rooted by 16

7 3

41 57

41

Down-Sweep

• In terms of the array stored in memory the aforementioned actions look like this:

• Where:

– the dark arrows represent addition

– the red dotted arrow represents a move

Before: 7 16 3 41

After: 7 41 3 57

Down-Sweep

• Now let’s go a step back and look at the complete right subtree (in green)

4 5 7 3

5 16

12

Down-Sweep

• Before this step the root node will contain the sum (29) of all elements of the left subtree

4 5 7 3

5 16

12

29

Down-Sweep

• As before, we’ll do two things:

– 29 + 12 = 41, and this becomes the root of the rightmost subtree

• This should be the sum of all elements to the left of that subtree for the next step (which we saw previously)

– 29 replaces 12

• Same reason: 29 is the sum of all elements left of the subtree rooted by what was 12.

4 5 7 3

5 16

29 41

29
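The three steps discussed so far can be replayed in execution order on the array in memory. The Python sketch below follows only the right edge of the tree; the steps for the other subtrees at each level are left out, and the starting array assumes index 7 has already been set to 0 by the very first step. The helper name `down_sweep_step` is illustrative.

```python
def down_sweep_step(a, left, right):
    """a[left] takes the root value; a[right] becomes root value + old a[left]."""
    a[left], a[right] = a[right], a[right] + a[left]


# State just before the step that processes the complete right subtree:
# index 15 holds 29, the sum of the whole left half.
a = [1, 3, 2, 10, 6, 9, 8, 0, 4, 5, 5, 12, 7, 16, 3, 29]

root = 15
trace = [a[root]]
for left in (11, 13, 14):  # left-subtree roots met while walking down the right edge
    down_sweep_step(a, left, root)
    trace.append(a[root])

print(trace)                       # [29, 41, 57, 60]
print(a[11], a[13], a[14], a[15])  # 29 41 57 60
```

The last array slot passes through 29, 41, 57 and 60, which are the values seen in the three subtrees above, read in execution order.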

Down-Sweep

• Let’s try to generalize what happens at every step of the down-sweep phase

• Let’s look at step 1:

– There is only one subtree, shown in purple

1 2 6 8 4 5 7 3

3 9 5 16

10 12

29

65

Down-Sweep

• Before we process this tree as described above, the root node must contain the sum of all elements to the left of the tree

– There are no such elements

– Hence the root must be 0

1 2 6 8 4 5 7 3

3 9 5 16

10 12

29

0

Down-Sweep

• Now repeat the steps we saw before:

– 29 + 0 = 29, and this becomes the root of the right subtree

– 29 gets replaced by 0

1 2 6 8 4 5 7 3

3 9 5 16

10 12

0 29

0

Down-Sweep

• In terms of the array stored in memory the aforementioned actions look like this:

• Where:

– the dark arrows represent addition

– the red dotted arrow represents a move

Before: 1 3 2 10 6 9 8 29 4 5 5 12 7 16 3 0

After: 1 3 2 10 6 9 8 0 4 5 5 12 7 16 3 29
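Putting both phases together gives an exclusive scan. The following is a sequential Python sketch of the procedure walked through above, not a CUDA kernel: the function name `exclusive_scan` is illustrative, the length is assumed to be a power of two, and each pass of a `while` loop stands for one parallel step. The result is compared against a plain running sum from `itertools.accumulate`.

```python
from itertools import accumulate


def exclusive_scan(values):
    """Exclusive prefix sum via up-sweep then down-sweep."""
    a = list(values)
    n = len(a)  # assumption: n is a power of two

    # Up-sweep: a reduction that leaves partial sums at the subtree roots.
    stride = 1
    while stride < n:
        for right in range(2 * stride - 1, n, 2 * stride):
            a[right] += a[right - stride]
        stride *= 2

    # Down-sweep: clear the root, then push "sum of everything to the left" down.
    a[n - 1] = 0
    stride = n // 2
    while stride >= 1:
        for right in range(2 * stride - 1, n, 2 * stride):
            left = right - stride
            a[left], a[right] = a[right], a[right] + a[left]
        stride //= 2
    return a


data = [1, 2, 2, 5, 6, 3, 8, 2, 4, 1, 5, 2, 7, 9, 3, 5]
scanned = exclusive_scan(data)
expected = [0] + list(accumulate(data))[:-1]
print(scanned)
print(scanned == expected)
```

In the scanned array the last two slots hold 57 and 60, the values derived for the rightmost subtree earlier.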