Parallel Algo c Sar

CS 515: Parallel Algorithms, Chandrima Sarkar and Atanu Roy, 2010-02-17


Page 1: Parallel Algo c Sar

CS 515: Parallel Algorithms

Chandrima Sarkar

Atanu Roy

2010-02-17

Page 2: Parallel Algo c Sar

Agenda

• Architecture

• Parallel Programming Languages

• Precedence Graph

• Elementary Parallel Algorithms

• Sorting

• Matrix Multiplication

Download: http://www.cs.montana.edu/~atanu.roy/Classes/CS515.html

Page 3: Parallel Algo c Sar

Architecture

Flynn’s Classification: S = single, M = multiple, I = instruction (stream), D = data (stream)

SISD SIMD

Page 4: Parallel Algo c Sar

Architecture

Flynn’s Classification: S = single, M = multiple, I = instruction (stream), D = data (stream)

MISD MIMD

Page 5: Parallel Algo c Sar

Static Inter-connection Network

Linear Array

Ring

Ring arranged to use short wires

Fully Connected Topology

Chordal ring

Page 6: Parallel Algo c Sar

Multidimensional Meshes and Torus

Tree

Page 7: Parallel Algo c Sar

Tree (contd.)

Fat Tree

Star

Page 8: Parallel Algo c Sar

Hypercube

[Figure: hypercubes of dimension 0-D, 1-D, 2-D, 3-D, 4-D and 5-D. The nodes of the 3-D cube are labeled 000, 001, 010, 011, 100, 101, 110 and 111.]
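The binary node labels in the hypercube figure suggest the usual addressing rule: two nodes are linked when their labels differ in exactly one bit. A minimal sketch of that rule follows; the function name `neighbors` is hypothetical and only serves the illustration.

```python
def neighbors(node, d):
    """Return the d neighbors of `node` in a d-dimensional hypercube.

    Assumption: nodes are numbered 0 .. 2**d - 1 and two nodes are
    adjacent exactly when their binary labels differ in one bit.
    """
    return [node ^ (1 << bit) for bit in range(d)]


if __name__ == "__main__":
    # List the neighbors of every node of the 3-D cube as bit strings.
    for node in range(8):
        labels = [format(m, "03b") for m in neighbors(node, 3)]
        print(format(node, "03b"), "->", labels)
```

Under this rule every node of a d-dimensional hypercube has d links, and flipping one bit at a time reaches any other node in at most d hops.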

Page 9: Parallel Algo c Sar

Parallel Programming Languages

Classification by control mechanism and by communication mechanism (shared memory or message-passing):

Control driven
    Shared memory: Fortran 90/HPF, C++, HEP PL/I, Ada, Concurrent Pascal, Modula-2, MultiLisp (MIMD), Lisp Connection Machine (SIMD)
    Message-passing: CSP, Ada, OCCAM (von Neumann language extension)

Data driven: VAL, ID, LAU, SISAL (data-flow languages)

Pattern driven: Concurrent Prolog (Shapiro), Actors

Demand driven (reduction language): FP

Page 10: Parallel Algo c Sar

Dijkstra’s High-Level Language Construct

• Degree of parallelism is static: Algol-68, CSP

A
parbegin
    C
    begin
        B
        parbegin
            D
            E
        parend
        G
    end
parend
H

Precedence Graph
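The parbegin/parend nesting above fixes a partial order: A runs first, then C runs concurrently with the sequence B, (D in parallel with E), G, and H runs last. Below is a minimal sketch of that structure with Python threads. The helpers `stmt`, `parbegin` and `run_program` are hypothetical names for this illustration, and each "statement" only records its own name.

```python
import threading


def run_program():
    """Run A parbegin C begin B parbegin D E parend G end parend H."""
    log = []
    lock = threading.Lock()

    def stmt(name):
        # Stand-in for a real statement: just record that it ran.
        with lock:
            log.append(name)

    def parbegin(*branches):
        # Start every branch concurrently; returning acts as "parend".
        threads = [threading.Thread(target=b) for b in branches]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    def sequential_branch():
        # begin B parbegin D E parend G end
        stmt("B")
        parbegin(lambda: stmt("D"), lambda: stmt("E"))
        stmt("G")

    stmt("A")
    parbegin(lambda: stmt("C"), sequential_branch)
    stmt("H")
    return log


if __name__ == "__main__":
    print(run_program())
```

The exact interleaving of C, D and E varies from run to run, but the precedence constraints (A before everything, B before D and E, D and E before G, everything before H) always hold.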

Page 11: Parallel Algo c Sar

Elementary Parallel Algorithms

Finding sum using a 2D mesh architecture
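One common way to sum on a 2D mesh is sketched below as a sequential simulation. It assumes that every row first folds its partial sums toward column 0 and that column 0 then folds toward the top-left processor; the slide's figure may use a different direction. The function name `mesh_sum` is hypothetical.

```python
def mesh_sum(grid):
    """Simulate summation on a rows x cols mesh; return (total, steps).

    Assumption: one value per processor, and in each step a processor
    may add the value held by one directly connected neighbor.
    """
    vals = [row[:] for row in grid]
    rows, cols = len(vals), len(vals[0])
    steps = 0
    # Phase 1: all rows work in parallel, folding right to left.
    for j in range(cols - 1, 0, -1):
        for i in range(rows):
            vals[i][j - 1] += vals[i][j]
        steps += 1
    # Phase 2: column 0 folds bottom to top.
    for i in range(rows - 1, 0, -1):
        vals[i - 1][0] += vals[i][0]
        steps += 1
    return vals[0][0], steps


if __name__ == "__main__":
    grid = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    print(mesh_sum(grid))
```

For a 4 x 4 mesh this takes 3 + 3 = 6 parallel steps, which illustrates why mesh summation of n values needs on the order of sqrt(n) steps rather than log n.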

Page 12: Parallel Algo c Sar

Finding sum of 16 values in a Shuffle Exchange SIMD Model
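A sequential simulation of summation on a shuffle-exchange network is sketched below. It assumes n = 2^k processors, a perfect shuffle that rotates the index bits left by one, and an exchange link between indices that differ only in the last bit. The names `shuffle` and `shuffle_exchange_sum` are hypothetical.

```python
def shuffle(i, k):
    """Perfect shuffle: rotate the k-bit index i left by one bit."""
    n = 1 << k
    return ((i << 1) | (i >> (k - 1))) & (n - 1)


def shuffle_exchange_sum(values):
    """Return the list held by the processors after k shuffle-exchange rounds."""
    n = len(values)
    k = n.bit_length() - 1
    assert n == 1 << k, "number of values must be a power of two"
    a = list(values)
    for _ in range(k):
        # Shuffle step: processor i sends its value to processor shuffle(i).
        shuffled = [0] * n
        for i in range(n):
            shuffled[shuffle(i, k)] = a[i]
        # Exchange step: each processor adds its exchange partner's value.
        a = [shuffled[i] + shuffled[i ^ 1] for i in range(n)]
    return a


if __name__ == "__main__":
    print(shuffle_exchange_sum(list(range(1, 17))))
```

After k = lg n rounds every processor holds the full sum, so 16 values need 4 rounds.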

Page 13: Parallel Algo c Sar

Parallel summation in a Hypercube SIMD Model
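Hypercube summation can be sketched as a sequential simulation: in each of the d rounds, the upper half of the still-active nodes sends its partial sum across one dimension to the lower half. The sketch assumes one value per node and n = 2^d nodes; the function name `hypercube_sum` is hypothetical.

```python
def hypercube_sum(values):
    """Simulate summation on a d-dimensional hypercube; node 0 gets the total."""
    n = len(values)
    d = n.bit_length() - 1
    assert n == 1 << d, "number of values must be a power of two"
    a = list(values)
    for dim in range(d - 1, -1, -1):
        half = 1 << dim
        # All nodes j < half receive in parallel from neighbor j + half,
        # whose label differs from j only in bit `dim`.
        for j in range(half):
            a[j] += a[j + half]
    return a[0]


if __name__ == "__main__":
    print(hypercube_sum([3, 1, 4, 1, 5, 9, 2, 6]))
```

Each round halves the number of active nodes, so the sum is available after d = lg n communication steps.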

Page 14: Parallel Algo c Sar

Broadcast in a Hypercube Algorithm 1

Algorithm 2
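The two broadcast algorithms on this slide are figures and are not reproduced here. As a point of reference only, the sketch below simulates one common scheme, which may correspond to neither of them: starting from node 0, every node that already holds the value forwards it across one new dimension per round. The function name `hypercube_broadcast` is hypothetical.

```python
def hypercube_broadcast(d, value):
    """Simulate a broadcast from node 0 in a d-dimensional hypercube.

    Returns (held, rounds): the value held by each node at the end and,
    per round, the list of (sender, receiver) pairs.
    """
    n = 1 << d
    held = [None] * n
    held[0] = value
    rounds = []
    for dim in range(d):
        # Nodes 0 .. 2**dim - 1 already hold the value; each one sends
        # it to the neighbor whose label differs in bit `dim`.
        sends = [(j, j | (1 << dim)) for j in range(1 << dim)]
        for src, dst in sends:
            held[dst] = held[src]
        rounds.append(sends)
    return held, rounds


if __name__ == "__main__":
    held, rounds = hypercube_broadcast(3, "msg")
    for r, sends in enumerate(rounds):
        print("round", r, sends)
```

The number of informed nodes doubles every round, so all 2^d nodes are reached after d rounds.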

Page 15: Parallel Algo c Sar

Odd Even Transposition Sort

• (1) p = n
• Input: 14 – 5 – 15 – 8 – 4 – 11 – 13 – 12

Each line shows the list after that phase's compare-exchanges, with the compared pairs grouped:

• odd-even 5 14 – 8 15 – 4 11 – 12 13
• even-odd 5 – 8 14 – 4 15 – 11 12 – 13
• odd-even 5 8 – 4 14 – 11 15 – 12 13
• even-odd 5 – 4 8 – 11 14 – 12 15 – 13
• odd-even 4 5 – 8 11 – 12 14 – 13 15
• even-odd 4 – 5 8 – 11 12 – 13 14 – 15
• odd-even 4 5 – 8 11 – 12 13 – 14 15
• even-odd 4 – 5 8 – 11 12 – 13 14 – 15
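A minimal sequential simulation of odd-even transposition sort for the p = n case is sketched below. It assumes one element per processor and n alternating phases, and it records the list after every phase. The function name and the trace format are hypothetical.

```python
def odd_even_transposition_sort(values):
    """Return (sorted_list, trace); trace holds (phase_name, list) pairs."""
    a = list(values)
    n = len(a)
    trace = []
    for phase in range(n):
        # Odd-even phases pair positions (1,2), (3,4), ... (1-based);
        # even-odd phases pair positions (2,3), (4,5), ...
        start = 0 if phase % 2 == 0 else 1
        for i in range(start, n - 1, 2):  # all pairs compare in parallel
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
        trace.append(("odd-even" if start == 0 else "even-odd", list(a)))
    return a, trace


if __name__ == "__main__":
    result, trace = odd_even_transposition_sort([14, 5, 15, 8, 4, 11, 13, 12])
    for name, state in trace:
        print(name, state)
```

With n processors each phase costs one parallel compare-exchange step, so the whole sort takes n steps.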

Page 16: Parallel Algo c Sar

Odd Even Transposition Sort (contd…)

• (2) p << n
• S = {12, 7, 2, 4, 1, 11, 9, 5, 6, 3, 10, 8}, p = 4

                        P1           P2            P3            P4

Input                   {12, 7, 2}   {4, 1, 11}    {9, 5, 6}     {3, 10, 8}

Local sort              {2, 7, 12}   {1, 4, 11}    {5, 6, 9}     {3, 8, 10}

Odd phase (P1-P2, P3-P4)  {1, 2, 4}    {7, 11, 12}   {3, 5, 6}     {8, 9, 10}

Even phase (P2-P3)      {1, 2, 4}    {3, 5, 6}     {7, 11, 12}   {8, 9, 10}

Odd phase (P1-P2, P3-P4)  {1, 2, 3}    {4, 5, 6}     {7, 8, 9}     {10, 11, 12}

Even phase (P2-P3)      {1, 2, 3}    {4, 5, 6}     {7, 8, 9}     {10, 11, 12}

Page 17: Parallel Algo c Sar

Pseudocode

• Proc MERGE-SPLIT(S)
    for i := 1 to p do in parallel
        QUICKSORT(Si)
    end for
    for j := 1 to ceil(p/2) do
        for each odd-numbered processor i do in parallel
            MERGE(Si, Si+1)
            SPLIT
        end for
        for each even-numbered processor i do in parallel
            MERGE(Si, Si+1)
            SPLIT
        end for
    end for
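The merge-split procedure can be sketched as a sequential simulation. The sketch assumes that the input length is a multiple of p, that each processor keeps a block of equal size, and that after a merge the lower-numbered processor keeps the smaller half. `sorted` stands in for QUICKSORT, and the function name `merge_split_sort` is hypothetical.

```python
import heapq


def merge_split_sort(values, p):
    """Return (blocks, trace); trace[0] is the state after the local sort."""
    size = len(values) // p
    assert size * p == len(values), "length must be a multiple of p"
    # Local sort on every processor (in parallel on a real machine).
    blocks = [sorted(values[i * size:(i + 1) * size]) for i in range(p)]
    trace = [[b[:] for b in blocks]]
    for phase in range(p):
        # Even-numbered phases pair (P1,P2), (P3,P4), ...;
        # odd-numbered phases pair (P2,P3), (P4,P5), ...
        start = 0 if phase % 2 == 0 else 1
        for i in range(start, p - 1, 2):
            merged = list(heapq.merge(blocks[i], blocks[i + 1]))  # MERGE
            blocks[i], blocks[i + 1] = merged[:size], merged[size:]  # SPLIT
        trace.append([b[:] for b in blocks])
    return blocks, trace


if __name__ == "__main__":
    s = [12, 7, 2, 4, 1, 11, 9, 5, 6, 3, 10, 8]
    blocks, trace = merge_split_sort(s, 4)
    for state in trace:
        print(state)
```

Run on the 12-element example with p = 4, the simulation performs one local sort followed by four merge-split phases.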

Page 18: Parallel Algo c Sar

2-D Mesh with Snake Order

Input : {23, 6, 1, 5, 11, 13, 55, 19, -3, 12, -5, -7, 9, 55, 28, -2}

Thompson and Kung (1977)
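The sketch below shows only the target layout, not the Thompson and Kung sorting algorithm itself. It assumes the usual snake (boustrophedon) convention, in which even-numbered rows run left to right and odd-numbered rows run right to left, and it places the already sorted input on a 4 x 4 mesh. The function name `snake_order` is hypothetical.

```python
def snake_order(sorted_values, cols):
    """Lay out sorted values row by row, reversing every second row."""
    rows = [sorted_values[i:i + cols]
            for i in range(0, len(sorted_values), cols)]
    return [row if r % 2 == 0 else row[::-1] for r, row in enumerate(rows)]


if __name__ == "__main__":
    data = [23, 6, 1, 5, 11, 13, 55, 19, -3, 12, -5, -7, 9, 55, 28, -2]
    for row in snake_order(sorted(data), 4):
        print(row)
```

Reading the mesh along the snake path then yields the values in nondecreasing order, which is the condition a mesh sort in snake order has to reach.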

Page 19: Parallel Algo c Sar

Snake Order (contd.)

Page 20: Parallel Algo c Sar

Bitonic Merge Sort

• Bitonic sequence: 1, 3, 7, 8, 6, 5, 4, 2

• Comparator

• Note: Batcher’s Bitonic Merge Sort compares elements whose indices differ by a single bit.
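The note about single-bit index differences can be seen in a compact sequential sketch of bitonic sort. It assumes that the input length is a power of two; the partner of position i is always `i ^ j` with j a power of two, so the two indices differ in exactly one bit. The function name `bitonic_sort` is hypothetical.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two with bitonic merges."""
    a = list(values)
    n = len(a)
    assert n & (n - 1) == 0, "length must be a power of two"
    k = 2
    while k <= n:          # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:      # distance between compared elements
            for i in range(n):  # all comparators of a stage run in parallel
                partner = i ^ j  # index differing from i in one bit
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a


if __name__ == "__main__":
    print(bitonic_sort([1, 3, 7, 8, 6, 5, 4, 2]))
```

For n = 2^k inputs there are k(k+1)/2 comparator stages, which is where the lg^2 n running time of the parallel version comes from.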

Page 21: Parallel Algo c Sar

Bitonic Merge Sort

Page 22: Parallel Algo c Sar

Shuffle-Exchange Network

Bitonic Mergesort on Shuffle-Exchange Network

• A list of $n = 2^k$ unsorted elements can be sorted in time $\Theta(\lg^2 n)$ with a network of $2^{k-1}[k(k-1) + 1]$ comparators using the shuffle-exchange network.

Page 23: Parallel Algo c Sar

Sorting Network

Page 24: Parallel Algo c Sar
Page 25: Parallel Algo c Sar

Odd Even Merging Network
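The network on this slide is a figure. As a reference point, the sketch below generates the comparators of Batcher's odd-even merging network for two sorted halves, using a widely known recursive formulation. It assumes that the total number of inputs is a power of two, and it may order the comparators differently from the figure. The names `odd_even_merge` and `apply_network` are hypothetical.

```python
def odd_even_merge(lo, hi, r=1):
    """Yield comparators (i, j) merging the sorted halves of lo..hi (inclusive)."""
    step = 2 * r
    if step < hi - lo:
        yield from odd_even_merge(lo, hi, step)      # even-indexed elements
        yield from odd_even_merge(lo + r, hi, step)  # odd-indexed elements
        for i in range(lo + r, hi - r, step):        # final clean-up column
            yield (i, i + r)
    else:
        yield (lo, lo + r)


def apply_network(comparators, values):
    """Run a list through the comparators; each one orders a pair of positions."""
    a = list(values)
    for i, j in comparators:
        if a[i] > a[j]:
            a[i], a[j] = a[j], a[i]
    return a


if __name__ == "__main__":
    network = list(odd_even_merge(0, 7))
    print(network)
    print(apply_network(network, [1, 4, 6, 9, 2, 3, 7, 8]))
```

For eight inputs this yields nine comparators: three on the even positions, three on the odd positions, and three that compare adjacent positions at the end.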

Page 26: Parallel Algo c Sar

Systolic Matrix Multiplication

1. Multiply $a_{i,k}$ by $b_{k,j}$

2. Add the result to $r_{i,j}$

3. Send $a_{i,k}$ to cell $c_{i+1,j}$

4. Send $b_{k,j}$ to cell $c_{i,j+1}$
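The four cell steps above can be sketched as a sequential simulation of an n x n systolic array. The sketch makes its own assumptions: cell (i, j) accumulates r[i][j], the values of A enter from the left edge and move along their row, the values of B enter from the top edge and move down their column, and row i and column j are delayed by i and j time steps. The slide's figure may index the cells differently. The function name `systolic_matmul` is hypothetical.

```python
def systolic_matmul(a, b):
    """Multiply two n x n matrices by simulating a systolic array."""
    n = len(a)
    r = [[0] * n for _ in range(n)]
    a_reg = [[0] * n for _ in range(n)]  # A value currently in each cell
    b_reg = [[0] * n for _ in range(n)]  # B value currently in each cell
    for t in range(3 * n - 2):
        # Pass A values one cell along each row; feed the left edge.
        for i in range(n):
            for j in range(n - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
            k = t - i  # row i is delayed by i time steps
            a_reg[i][0] = a[i][k] if 0 <= k < n else 0
        # Pass B values one cell down each column; feed the top edge.
        for j in range(n):
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
            k = t - j  # column j is delayed by j time steps
            b_reg[0][j] = b[k][j] if 0 <= k < n else 0
        # Every cell multiplies its two inputs and accumulates the product.
        for i in range(n):
            for j in range(n):
                r[i][j] += a_reg[i][j] * b_reg[i][j]
    return r


if __name__ == "__main__":
    print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

With this skewing, matching pairs of A and B values meet in the right cell at the right time, and the last product is accumulated at time step 3n - 3.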

Page 27: Parallel Algo c Sar
Page 28: Parallel Algo c Sar

Homework

• Show how the following 16 values would be sorted by Batcher’s Bitonic sort.

16, 7, 4, 12, 2, 10, 13, 9, 1, 8, 11, 3, 15, 6, 5, 14