Parallel CREW matrix multiplication
Contents
- Reminder: Array total on EREW-PRAM
- Reminder: How to multiply matrices
- CREW matrix vector multiplication
- CREW matrix matrix multiplication
Reminder: Array total on EREW-PRAM
begin P-Sum
Input: n = 2^k numbers stored in array A[1..n]
Output: S = ∑_{i=1}^{n} A[i]
Note: k = log_2 n
1. for i = 1..n in parallel do
       B[i] = A[i]
2. for h = 1..k do
       for 1 ≤ i ≤ n/2^h in parallel do
           B[i] = B[2i − 1] + B[2i]
3. S = B[1]
end P-Sum
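The P-Sum steps can be traced with a small sequential simulation. This is only a sketch: the function name `p_sum` is an assumption for illustration, and each pass of the inner loop stands in for one parallel step in which all processors read before any of them writes.

```python
def p_sum(a):
    """Simulate P-Sum on a list whose length n is a power of two."""
    n = len(a)
    k = n.bit_length() - 1              # k = log2(n) when n = 2^k
    assert n >= 1 and 2 ** k == n, "n must be a power of two"
    b = [None] + list(a)                # step 1: B[i] = A[i], 1-based (b[0] unused)
    for h in range(1, k + 1):           # step 2: k rounds
        old = b[:]                      # snapshot: all reads happen before writes
        for i in range(1, n // 2 ** h + 1):
            b[i] = old[2 * i - 1] + old[2 * i]
    return b[1]                         # step 3: S = B[1]
```

For example, `p_sum([3, 5, 7, 1, 6, 4, 2, 8])` goes through k = 3 rounds of pairwise additions.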
Matrix multiplication
Multiply row by column: A × B = C
c11 = a11 b11 + a12 b21
Matrix multiplication
What could be done in parallel?
Nobody knows T*(n), the optimal sequential running time, for matrix multiplication.
https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm
Sequential matrix vector product
begin Seq-Matrix-Vector-Mult
Input: Matrix A, n × n; vector B, n × 1
Output: Vector Y = A ∗ B
1. for i = 1, ..., n
2.     Y[i] = 0
3.     for j = 1, ..., n
4.         Y[i] = Y[i] + A[i, j] ∗ B[j]
end Seq-M-V-Mult

Sequential time analysis: T(n) = Θ(n^2).
Line 4 is executed n^2 times, at cost Θ(1) each time (2 operations: one multiplication and one addition).
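The count behind T(n) = Θ(n^2) can be checked with a minimal sketch of the sequential algorithm. The function name and the `ops` counter are assumptions added for illustration; the counter charges 2 arithmetic operations for each execution of line 4.

```python
def seq_matrix_vector_mult(a, b):
    """Return (Y, ops) where Y = A * B and ops counts arithmetic operations."""
    n = len(a)
    y = [0] * n
    ops = 0
    for i in range(n):
        for j in range(n):
            y[i] += a[i][j] * b[j]      # line 4: one multiplication, one addition
            ops += 2
    return y, ops
```

With n = 2 the counter reaches 2 · n^2 = 8.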
Matrix vector product on CREW-PRAM
begin P-Matrix-Vector-Mult
Input: Matrix A, n × n; vector B, n × 1
Output: Vector Y, n × 1, Y = A ∗ B
Note: n = 2^k
1. for i, j = 1..n in parallel do
       C[i, j] = A[i, j] ∗ B[j]
   (This step is a concurrent read of B[j] for all i.)
2. for i = 1..n in parallel do
       for h = 1..k do
           for 1 ≤ j ≤ n/2^h in parallel do
               C[i, j] = C[i, 2j − 1] + C[i, 2j]
3. for i = 1..n in parallel do
       Y[i] = C[i, 1]
end P-M-V-Mult-alg
Example
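Sequential:
| a11 a12 |   | b1 |   | a11 b1 + a12 b2 |
| a21 a22 | × | b2 | = | a21 b1 + a22 b2 |

Parallel:
| a11 a12 |   | b1 |    | a11 b1  a12 b2 |    | a11 b1 + a12 b2 |
| a21 a22 | × | b2 | →  | a21 b1  a22 b2 | →  | a21 b1 + a22 b2 |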
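The three steps of P-Matrix-Vector-Mult can be simulated sequentially as follows. This is a sketch under assumptions: the function name and the `steps` counter are illustrative, indices are 0-based, and building a fresh list in each round models a synchronous parallel step.

```python
def p_matrix_vector_mult(a, b):
    """Return (Y, steps): Y = A * B and the number of simulated parallel steps."""
    n = len(a)
    k = n.bit_length() - 1
    assert n >= 1 and 2 ** k == n, "n must be a power of two"
    # Step 1: all n^2 products at once (B[j] is read concurrently by every row i).
    c = [[a[i][j] * b[j] for j in range(n)] for i in range(n)]
    steps = 1
    # Step 2: k rounds of pairwise additions within each row.
    for h in range(1, k + 1):
        width = n // 2 ** h
        c = [[row[2 * j] + row[2 * j + 1] for j in range(width)] for row in c]
        steps += 1
    # Step 3: the row totals are in the first column.
    y = [row[0] for row in c]
    steps += 1
    return y, steps
```

The step counter comes out as k + 2, compared with the n^2 executions of line 4 in the sequential version.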
Matrix multiplication on CREW-PRAM
begin P-Matrix-Mult
Input: Matrices A, B, each n × n
Output: Matrix Y, n × n, Y = A ∗ B
Note: n = 2^k
1. for i, j, ℓ = 1..n in parallel do
       C[i, j, ℓ] = A[i, ℓ] ∗ B[ℓ, j]
2. for i, j = 1..n in parallel do
       for h = 1..k do
           for 1 ≤ ℓ ≤ n/2^h in parallel do
               C[i, j, ℓ] = C[i, j, 2ℓ − 1] + C[i, j, 2ℓ]
3. for i, j = 1..n in parallel do
       Y[i, j] = C[i, j, 1]
end P-Matrix-Mult
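P-Matrix-Mult extends the matrix vector scheme by one index. The sketch below simulates it sequentially; the function name is an assumption, indices are 0-based, and each list comprehension stands for one synchronous parallel step over the three-dimensional array C.

```python
def p_matrix_mult(a, b):
    """Simulate P-Matrix-Mult for n x n matrices, n a power of two."""
    n = len(a)
    k = n.bit_length() - 1
    assert n >= 1 and 2 ** k == n, "n must be a power of two"
    # Step 1: n^3 products, C[i][j][l] = A[i][l] * B[l][j].
    c = [[[a[i][l] * b[l][j] for l in range(n)] for j in range(n)]
         for i in range(n)]
    # Step 2: k rounds of pairwise additions along the third index.
    for h in range(1, k + 1):
        width = n // 2 ** h
        c = [[[cell[2 * l] + cell[2 * l + 1] for l in range(width)]
              for cell in row] for row in c]
    # Step 3: Y[i][j] is the first entry of each reduced cell.
    return [[cell[0] for cell in row] for row in c]
```

Step 1 uses n^3 independent products, which is why the simulation needs a three-dimensional array.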
Binary tree algorithms for EREW PRAM
Contents
- Array minimum
- Array membership
- Broadcast
- Binary-fan-in
- Complete EREW array membership algorithm
Array Minimum Sequential
Sequential Min-of-Array
Array A[1..n] of data entries
Min = A[1]
for i = 1..n do
    if A[i] < Min then Min = A[i]
end Seql-alg
Array Minimum EREW
Input: Array A[1..n] of data entries
Output: Minimum in A[1]
Simplifying assumption: n = 2^k

EREW Min-of-Array
for h = 1..k do
    for 1 ≤ i ≤ n/2^h in parallel do
        A[i] = min(A[2i − 1], A[2i])
Output Min = A[1]
end EREW Min

Example (n = 8, k = 3):
Index  1 2 3 4 5 6 7 8
Value  3 5 7 1 6 4 2 8
h = 1  3 1 4 2
h = 2  1 2
h = 3  1
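The rows of the example table can be reproduced with a short simulation of EREW Min-of-Array. This is a sketch, with the function name and the returned `levels` list as illustrative assumptions; the comprehension computes a whole round from the old values before writing, as a synchronous PRAM step would.

```python
def erew_min(a):
    """Return (minimum, levels) where levels[h-1] is A[1..n/2^h] after round h."""
    n = len(a)
    k = n.bit_length() - 1
    assert n >= 1 and 2 ** k == n, "n must be a power of two"
    a = list(a)                          # work on a copy, 0-based
    levels = []
    for h in range(1, k + 1):
        width = n // 2 ** h
        # Processor i reads two distinct cells and writes one: exclusive access.
        front = [min(a[2 * i], a[2 * i + 1]) for i in range(width)]
        a[:width] = front
        levels.append(front)
    return a[0], levels
```

Calling `erew_min([3, 5, 7, 1, 6, 4, 2, 8])` yields the minimum 1 and the levels `[3, 1, 4, 2]`, `[1, 2]`, `[1]`.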
Is-X-In-Array Sequential
Sequential Is-X-In-Array
Array L of data entries
Index = ∞
for i = 1..n do
    if X = L[i] then Index = i
end Seql-alg
Is-X-In-Array EREW
EREW Is-X-In-Array
Array Temp already initialized with value of X
Array L of data entries
for i = 1..n in parallel do
    if L[i] = Temp[i]
        then Temp[i] = i
        else Temp[i] = ∞
end EREW-alg

Note. Incomplete algorithm:
1) Initialization of Temp with X
2) How to collect the answer into variable Index
Initialization and collection are explained later.
Example
Figure: K. Berman and J. Paul, Algorithms: Sequential, Parallel and Distributed. Thomson (2005)
EREW broadcast
EREW Broadcast
Array A[1..n] of size n = 2^k
Input: X, the value to be broadcast to array A (fill array A with the value X)
Output: A[i] = X, i = 1, ..., n

A[1] = X
for i = 0..k − 1 do
begin
    for all j, 2^i + 1 ≤ j ≤ 2^(i+1) in parallel do
        A[j] = A[j − 2^i]
end-for-loop
end EREW-broadcast
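The doubling pattern of the broadcast can be seen in a minimal simulation. The function name is an assumption; the array is kept 1-based to match the pseudocode. In round i every target cell j copies from its own source cell j − 2^i, so no cell is read or written by two processors in the same round.

```python
def erew_broadcast(x, n):
    """Fill an array of size n = 2^k with x by repeated doubling."""
    k = n.bit_length() - 1
    assert n >= 1 and 2 ** k == n, "n must be a power of two"
    a = [None] * (n + 1)                 # 1-based, a[0] unused
    a[1] = x
    for i in range(k):                   # i = 0 .. k-1
        # Cells 2^i+1 .. 2^(i+1) copy from cells 1 .. 2^i, one source each.
        for j in range(2 ** i + 1, 2 ** (i + 1) + 1):
            a[j] = a[j - 2 ** i]
    return a[1:]
```

After round i the first 2^(i+1) cells hold X, so k rounds fill all n = 2^k cells.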
Example
Figure: K. Berman and J. Paul, Algorithms: Sequential, Parallel and Distributed. Thomson (2005)
EREW array Minimum by binary fan-in
Note: This is the same algorithm as EREW Min-of-Array. The technique is called binary fan-in.
EREW Min-Binary-fan-in
EREW Min-BFIn
for j = 1..k do
begin
    for all i, 1 ≤ i ≤ n/2^j in parallel do
        A[i] = min(A[2i − 1], A[2i])
end
Output Min = A[1]
end EREW Min-BFIn
Example
Figure: K. Berman and J. Paul, Algorithms: Sequential, Parallel and Distributed. Thomson (2005)
Complete EREW-is-X-in-array-A algorithm
1. EREW Broadcast: copy query item X to array Temp[1..n].
   This initializes the Temp array.
2. EREW Is-X-In-Array for array L[1..n].
   This marks the Temp entries where array L is equal to X.
3. EREW Min-Binary-fan-in with array Temp[1..n].
   This returns the smallest marked Temp entry.
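The three phases can be chained in one sequential simulation. This is a sketch under assumptions: the function name is illustrative, arrays are 0-based internally while the reported index is 1-based as in the pseudocode, and `math.inf` plays the role of ∞.

```python
import math

def erew_is_x_in_array(x, l):
    """Return the smallest 1-based index i with L[i] = X, or math.inf if absent."""
    n = len(l)
    k = n.bit_length() - 1
    assert n >= 1 and 2 ** k == n, "n must be a power of two"
    # Phase 1: EREW broadcast of X into Temp by doubling.
    temp = [None] * n
    temp[0] = x
    for i in range(k):
        for j in range(2 ** i, 2 ** (i + 1)):
            temp[j] = temp[j - 2 ** i]
    # Phase 2: processor j compares its own L[j] with its own Temp[j].
    temp = [j + 1 if l[j] == temp[j] else math.inf for j in range(n)]
    # Phase 3: binary fan-in minimum over Temp.
    for h in range(1, k + 1):
        width = n // 2 ** h
        temp[:width] = [min(temp[2 * i], temp[2 * i + 1]) for i in range(width)]
    return temp[0]
```

For instance, `erew_is_x_in_array(4, [3, 5, 7, 1, 6, 4, 2, 8])` finds the value 4 at index 6, and a value that does not occur gives `math.inf`. The broadcast is needed because without it all n processors would have to read the single cell holding X, which an EREW machine does not allow.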