1 Partitioning Loops with Variable Dependence Distances Yijun Yu and Erik D’Hollander Department...

1

Partitioning Loops with Variable Dependence Distances

Yijun Yu and Erik D’Hollander
Department of Electronics and Information Systems
University of Ghent, Belgium

2

Introduction

1. Overview
2. Dependence analysis: pseudo distance matrix (PDM)
3. Loop transformations: unimodular and partitioning
4. Results
5. Conclusion

3

1. Overview

• Loop with linear array subscripts
• Solve dependence equation
• Find all non-constant distances
• Create maximally covering grid and base-vectors
• Create the pseudo distance matrix (PDM), containing all base-vectors of the covering grid
• Find independent loops or independent partitions, based on the rank of PDM

4

Approach

[Flowchart]

Dependence analysis:
  Linear dependence equation (uniform or constant distance, variable or non-constant distance) → H = PDM

Loop transformation:
  rank(H) < loop depth?
    Y: non-full rank → unimodular transformation
    N: full rank → partitioning transformation
  det(H) > 1? (Y / N)

Loop parallelization

5

2. Dependence Analysis

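Loop with array references A[f(I)] = … and … = A[g(I)]:

L1: do I1 = -N,N
L2:   do I2 = -N,N
        A(4I1-I2+3, 2I1+I2-2) = …
        … = A(I1+I2-1, I1-I2+2)
      enddo
    enddo

Dependence equations f(i) = g(j), with i = (I1,I2) and j = (J1,J2):

4I1 - I2 + 3 = J1 + J2 - 1
2I1 + I2 - 2 = J1 - J2 + 2

In matrix form, iA + a = jB + b:

$$A = \begin{pmatrix} 4 & 2 \\ -1 & 1 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad a = (3,\ -2), \quad b = (-1,\ 2)$$

Examples of dependent iterations:

i          j          A[f(i)] = A[g(j)]    d = |j - i|
(1, -5)    (3, 10)    A[12, -5]            (2, 15)
(3, 0)     (9, 7)     A[15, 4]             (6, 7)
(-3, -3)   (-9, 4)    A[-6, -11]           (6, -7)
…

Here |·| appears to denote the lexicographically positive orientation of the vector: in the third row j - i = (-6, 7), which is listed as (6, -7).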
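The table rows can be reproduced by brute force. The following is a minimal sketch, assuming a small bound n in place of N and assuming that |·| means "flip the vector so its first non-zero entry is positive"; the function names are illustrative and do not come from the slides.

```python
from itertools import product

def f(i):
    """Subscript written by iteration i = (I1, I2)."""
    return (4 * i[0] - i[1] + 3, 2 * i[0] + i[1] - 2)

def g(j):
    """Subscript read by iteration j = (J1, J2)."""
    return (j[0] + j[1] - 1, j[0] - j[1] + 2)

def dependent_pairs(n):
    """All (i, j) in the square [-n, n] x [-n, n] with f(i) == g(j)."""
    space = list(product(range(-n, n + 1), repeat=2))
    readers = {}
    for j in space:
        readers.setdefault(g(j), []).append(j)
    return [(i, j) for i in space for j in readers.get(f(i), [])]

def distance(i, j):
    """j - i, flipped if needed so that it is lexicographically non-negative."""
    d = (j[0] - i[0], j[1] - i[1])
    return d if d > (0, 0) else (-d[0], -d[1])

if __name__ == "__main__":
    for i, j in dependent_pairs(10)[:5]:
        print(i, j, f(i), distance(i, j))
```

Collecting `distance(i, j)` over all pairs shows directly that the distances are not constant for this loop.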

6

The dependence distance

1. The linear dependence equation:

$$iA + a = jB + b$$

2. Using Banerjee’s unimodular transformation U to obtain an echelon matrix S, the equation tS = (b - a) is solved, yielding:

$$i = tU_l, \qquad j = tU_r$$

3. The distance between dependent iterations i, j is:

$$d = |j - i| = |t(U_r - U_l)| = |tF|$$

• $U_l$ and $U_r$ are the left and right halves of U
• t has a constant part $t_1$ and an unknown part $t_2$

7

The distance set

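1. From the dependence equation tS = (b - a), the solution vector t contains a constant and an arbitrary part:

$$t = (t_1,\ t_2), \quad \text{where } t_1 = \text{const},\ t_2 = \text{variable}$$

2. Matrix $F = U_r - U_l$ can be vertically separated into two sub-matrices:

$$d = |tF|, \qquad F = \begin{pmatrix} F' \\ F'' \end{pmatrix}$$

3. The distance set of the dependence equations is:

$$\{\, d = xR \mid d \succ 0,\ x_i \in \mathbb{Z} \,\}, \quad \text{where } R = \begin{pmatrix} t_1 F' \\ F'' \end{pmatrix} \text{ or } R = F''$$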
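As a cross-check on this construction, the example dependence equations 4I1-I2+3 = J1+J2-1 and 2I1+I2-2 = J1-J2+2 can also be solved by direct elimination. This is a worked sketch under that substitution, not the echelon reduction used in the slides; the free parameters are simply taken to be I1 and I2, and the distance is shown before any sign normalization.

```latex
\begin{aligned}
\text{sum of the two equations:}\quad
  & 6I_1 + 1 = 2J_1 + 1 && \Rightarrow\; J_1 = 3I_1 \\
\text{difference of the two equations:}\quad
  & 2I_1 - 2I_2 + 5 = 2J_2 - 3 && \Rightarrow\; J_2 = I_1 - I_2 + 4 \\
j - i &= (2I_1,\; I_1 - 2I_2 + 4) \\
      &= (0,4) + I_1\,(2,1) - I_2\,(0,2)
\end{aligned}
```

The result has one constant row, (0,4), and two rows with free integer coefficients, (2,1) and (0,2), which is the shape of R in step 3.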

8

Distances in the iteration space

• Iteration space (i1,i2) of loop 1 with dependence equations:
  4I1 - I2 + 3 = J1 + J2 - 1
  2I1 + I2 - 2 = J1 - J2 + 2
• The arrows (I1,I2) → (J1,J2) represent the distance vectors between dependent iterations.

[Figure: iteration space with axes i1 and i2]

9

Distance base vectors

1. The dependence distance is non-constant for the reference pair, e.g. (2,15), (6,7), (6,-7), as highlighted.

2. However, the distance set is covered by the grid generated by the base vectors (2,1) and (0,2).

3. For example:
   (2,15) = (2,1) + 7 (0,2)
   (6,7) = 3 (2,1) + 2 (0,2)
   (6,-7) = 3 (2,1) - 5 (0,2)

[Figure: iteration space with axes i1 and i2]
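The decompositions over the base vectors (2,1) and (0,2) can be checked mechanically. The helper below is a small sketch, assuming a triangular pair of base vectors (second vector has a zero first entry); its name and signature are illustrative only.

```python
def lattice_coefficients(d, b1=(2, 1), b2=(0, 2)):
    """Return integers (x1, x2) with d = x1*b1 + x2*b2, or None if d is off the grid.

    Assumes b1 = (p, q) with p != 0 and b2 = (0, r) with r != 0.
    """
    (p, q), (_, r) = b1, b2
    if d[0] % p != 0:
        return None
    x1 = d[0] // p
    rest = d[1] - x1 * q          # what the second base vector must supply
    if rest % r != 0:
        return None
    return x1, rest // r

if __name__ == "__main__":
    for d in [(2, 15), (6, 7), (6, -7), (1, 0)]:
        print(d, lattice_coefficients(d))
```

A vector such as (1, 0) is rejected because it does not lie on the grid, which is what leaves room for independent partitions.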

10

The largest base vectors

The distance set is the linear combination of the row vectors in R:

$$\{\, d = xR \mid d \succ 0,\ x_i \in \mathbb{Z} \,\}$$

A lattice L(R) is the group of vectors generated by all the integer linear combinations of the independent row vectors of a matrix R:

$$L(R) = \{\, xR \mid x_i \in \mathbb{Z} \,\}$$

We look for the smallest lattice L(R) (generating the largest grid) which covers the whole distance set:

$$\{\, d = xR \mid d \succ 0,\ x_i \in \mathbb{Z} \,\} \subseteq L(R)$$

In this way, possible spurious dependencies introduced by replacing the distance set with a lattice are minimized.

11

Pseudo Distance Matrix (PDM)

• A Hermite normal form HNF(R) is a full row rank matrix reduced from the echelon form of R by unimodular transformation:

$$H = \mathrm{HNF}(R)$$

• Therefore H generates the same lattice as R does, that is, the smallest lattice. In addition, the HNF rows are base vectors:

$$L(H) = L(R)$$

• H is called the pseudo distance matrix (PDM), because it generates the distance set from its row vectors.

• Since the row vectors of H are constant, the techniques developed for uniform (constant) distance dependence matrices can be applied.

12

Calculating the PDM

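1. Solving the linear dependence equations:

$$(i_1\ \ i_2)\begin{pmatrix} 4 & 2 \\ -1 & 1 \end{pmatrix} + (3\ \ {-2}) = (j_1\ \ j_2)\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} + ({-1}\ \ 2)$$

$$d = tF = (-4\ \ 4\ \ t_3\ \ t_4)\begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 2 & 1 \\ 0 & 2 \end{pmatrix}$$

2. Expressing the distance set:

$$d = xR = (1\ \ t_3\ \ t_4)\begin{pmatrix} 0 & 4 \\ 2 & 1 \\ 0 & 2 \end{pmatrix}$$

3. Finding the largest base vectors:

$$H = \mathrm{HNF}(R) = \mathrm{HNF}\begin{pmatrix} 0 & 4 \\ 2 & 1 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}$$

$$\mathrm{PDM} = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}$$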
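The reduction of R to its Hermite normal form can be sketched with integer row operations only. The routine below is a minimal illustration, not the authors' implementation; it assumes one common row-style convention (upper triangular, positive pivots, entries above a pivot reduced into [0, pivot)), and HNF conventions differ between texts.

```python
def hermite_normal_form(rows):
    """Row-style HNF of an integer matrix using unimodular row operations.

    Returns only the non-zero rows, so the result has full row rank.
    """
    M = [list(r) for r in rows]
    ncols = len(M[0]) if M else 0
    top = 0                                   # next pivot row
    for col in range(ncols):
        # Euclid on this column until at most one non-zero entry remains below `top`.
        while True:
            nz = [r for r in range(top, len(M)) if M[r][col] != 0]
            if len(nz) <= 1:
                break
            p = min(nz, key=lambda r: abs(M[r][col]))
            for r in nz:
                if r != p:
                    q = M[r][col] // M[p][col]
                    M[r] = [a - q * b for a, b in zip(M[r], M[p])]
        if not nz:
            continue                          # no pivot in this column
        M[top], M[nz[0]] = M[nz[0]], M[top]
        if M[top][col] < 0:
            M[top] = [-a for a in M[top]]
        for r in range(top):                  # reduce entries above the pivot
            q = M[r][col] // M[top][col]
            M[r] = [a - q * b for a, b in zip(M[r], M[top])]
        top += 1
    return M[:top]

if __name__ == "__main__":
    print(hermite_normal_form([[0, 4], [2, 1], [0, 2]]))
```

Applied to the three rows (0,4), (2,1), (0,2), the row (0,4) is absorbed as twice (0,2), leaving the two base vectors of the PDM.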

13

3. Loop transformations: unimodular and partitioning

Legality
Any transformation should be legal, i.e. preserve the execution order of dependent iterations.

Transformations depending on rank(H):
3.1 Unimodular transformation: non-full rank PDM
3.2 Partitioning transformation: full rank PDM
3.3 Combined approach

14

3.1 Unimodular transformation

• Given a non-full rank (r < m) pseudo distance matrix H, a unimodular matrix T can be constructed such that the first m-r columns of HT are zero.

• As a result, m-r outermost loops can be parallelized.
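For the simplest case of a single-row PDM, such a T can be built from Euclid's algorithm applied to columns. The sketch below is an assumption-laden illustration restricted to rank 1 (one row h); the function name is hypothetical and the general r-row case is not covered.

```python
def zeroing_transform(h):
    """Build a unimodular T (list of rows) with h*T = (0, ..., 0, g), g = +/- gcd(h).

    Only column additions and column swaps are used, so det(T) = +/- 1.
    """
    m = len(h)
    h = list(h)
    T = [[int(r == c) for c in range(m)] for r in range(m)]

    def add_col(src, dst, k):                 # column dst += k * column src
        h[dst] += k * h[src]
        for row in T:
            row[dst] += k * row[src]

    def swap_col(a, b):
        h[a], h[b] = h[b], h[a]
        for row in T:
            row[a], row[b] = row[b], row[a]

    for c in range(m - 1):
        # Fold column c into column c+1 until the entry in column c vanishes.
        while h[c] != 0:
            add_col(c, c + 1, -(h[c + 1] // h[c]))
            swap_col(c, c + 1)
    return T, h

if __name__ == "__main__":
    T, hT = zeroing_transform([2, 2])
    print(T, hT)
```

With h = (2,2) the leading column of hT becomes zero, so the loop that corresponds to that column carries no dependence.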

15

3.2 Partitioning transformation

• Given a full rank pseudo distance matrix H, the loop nest can be partitioned such that det(H) partitions are found.

• The partitioned parallelism is det(H).
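One way to picture the det(H) partitions is to label each iteration by its offset inside the grid. The following sketch assumes an upper-triangular 2x2 PDM with positive diagonal, such as the rows (2,1) and (0,2) of the running example; the labelling function is illustrative, not taken from the slides.

```python
def partition_label(i, H=((2, 1), (0, 2))):
    """Offset (o1, o2) of iteration i within the grid generated by the rows of H.

    Iterations with equal labels differ by a lattice vector and share a partition.
    """
    (h11, h12), (_, h22) = H
    o1 = i[0] % h11
    a = (i[0] - o1) // h11            # steps taken along the first base vector
    o2 = (i[1] - a * h12) % h22
    return o1, o2

if __name__ == "__main__":
    n = 4
    labels = {partition_label((i1, i2))
              for i1 in range(-n, n + 1) for i2 in range(-n, n + 1)}
    print(sorted(labels))             # one label per partition
```

Moving an iteration by any row of H leaves its label unchanged, so dependent iterations stay in the same partition while different labels can run in parallel.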

16

3.3 Combined approach

• After a unimodular transformation on a non-full rank PDM, the transformed PDM has a full rank sub-matrix, S.

• When det(S) > 1, additional parallelism can be found using the loop partitioning transformation.

17

4. Results (1) Non-full rank PDM

L1: do I1 = -N,N
L2:   do I2 = -N,N
        A(3I1+1, 2I1+I2-1) = …
        … = A(I1+3, I2+1)
      enddo
    enddo

PDM = (2,2) → (2,0) → (0,2)

$$T = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 1 \\ 1 & 0 \end{pmatrix}$$

L’1: doall J1 = -2N,2N
L’2:   do J2 = max(-N,-N-J1), min(N,N-J1)
         I1 = J2
         I2 = J1+J2
         A(3I1+1, 2I1+I2-1) = …
         … = A(I1+3, I2+1)
       enddo
     enddoall
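The transformed nest with J1 = I2 - I1 and J2 = I1 can be sanity-checked by enumeration. This is a rough sketch assuming a small bound n in place of N; `reader_of` solves the subscript equations 3I1+1 = J1+3 and 2I1+I2-1 = J2+1 by hand, and all names are illustrative.

```python
def transformed_iterations(n):
    """Walk the nest L'1/L'2 and map every (J1, J2) back to (I1, I2)."""
    seen = []
    for j1 in range(-2 * n, 2 * n + 1):                      # doall J1
        for j2 in range(max(-n, -n - j1), min(n, n - j1) + 1):
            seen.append((j2, j1 + j2))                       # I1 = J2, I2 = J1 + J2
    return seen

def reader_of(i):
    """Iteration that reads the element written by iteration i."""
    return (3 * i[0] - 2, 2 * i[0] + i[1] - 2)

def outer_index(i):
    """Value of the parallel loop variable J1 for iteration i."""
    return i[1] - i[0]

if __name__ == "__main__":
    n = 3
    seen = transformed_iterations(n)
    print(len(seen), len(set(seen)))
```

Because a writer and its reader always share the same J1, the dependences stay inside one iteration of the doall loop.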

18

NF-rank: Dependence graphs

[Figure: dependence graphs; axis labels j1, j2, i2]

19

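4. Results (2) Partitioning

L1: do I1 = -N,N
L2:   do I2 = -N,N
        A(4I1-I2+3, 2I1+I2-2) = …
        … = A(I1+I2-1, I1-I2+2)
      enddo
    enddo

$$\mathrm{PDM} = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, \quad \det = 4$$

$$\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} 1/2 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}$$

$$\begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$$

$$\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 1/2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \det = 1$$

L’1: doall Io1 = 0,1
L’2:   doall Io2 = 0,1
L’3:     do I1 = -N+mod(N+Io1,2), N-mod(N-Io1,2), 2
           Io’2 = Io2+(I1-Io1)/2
L’4:       do I2 = -N+mod(N+Io’2,2), N-mod(N-Io’2,2), 2
             A(4I1-I2+3, 2I1+I2-2) = …
             … = A(I1+I2-1, I1-I2+2)
           enddo
         enddo
       enddoall
     enddoall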
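The partitioned nest can be simulated to confirm that the four partitions together cover the iteration space without overlap. This is a sketch under two assumptions: a small bound n stands in for N, and mod is taken as Python's `%`, which is non-negative for a positive modulus (a Fortran MOD intrinsic can return negative values for negative arguments).

```python
from collections import Counter

def partitioned_visits(n):
    """Count how often the nest L'1..L'4 visits each iteration (I1, I2)."""
    visits = Counter()
    for io1 in (0, 1):                               # doall Io1
        for io2 in (0, 1):                           # doall Io2
            lo1 = -n + (n + io1) % 2
            hi1 = n - (n - io1) % 2
            for i1 in range(lo1, hi1 + 1, 2):
                io2p = io2 + (i1 - io1) // 2         # Io'2
                lo2 = -n + (n + io2p) % 2
                hi2 = n - (n - io2p) % 2
                for i2 in range(lo2, hi2 + 1, 2):
                    visits[(i1, i2)] += 1
    return visits

if __name__ == "__main__":
    v = partitioned_visits(3)
    print(len(v), set(v.values()))
```

Each (Io1, Io2) pair selects one of the det(H) = 4 partitions; inside a partition both loops advance with stride 2.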

20

F-rank partitioning: dependence graphs

21

4. Results (3) Combined

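PDM = (2,2) → (0,2) → (0,1)

$$(2\ \ 2)\begin{pmatrix} -1 & 1 \\ 1 & 0 \end{pmatrix} = (0\ \ 2), \quad \det = 2$$

$$(0\ \ 2)\begin{pmatrix} 1 & 0 \\ 0 & 1/2 \end{pmatrix} = (0\ \ 1), \quad \det = 1$$

L’1: doall J1 = -2N,2N
L’2:   do J2 = max(-N,-N-J1), min(N,N-J1)
         I1 = J2
         I2 = J1+J2
         A(3I1+1, 2I1+I2-1) = …
         … = A(I1+3, I2+1)
       enddo
     enddoall

L’’1: doall Jo2 = 0,1
L’’2:   doall J1 = -2N,2N
          p2 = max(-N,-N-J1)
          q2 = min(N,N-J1)
L’’3:     do J2 = p2+mod(Jo2-p2,2), q2-mod(q2-Jo2,2), 2
            I1 = J2
            I2 = J1+J2
            A(3I1+1, 2I1+I2-1) = …
            … = A(I1+3, I2+1)
          enddo
        enddoall
      enddoall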
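The combined nest (doall over Jo2 and J1, stride-2 loop over J2) can be simulated in the same spirit. This is a minimal sketch assuming a small bound n in place of N and Python's non-negative `%` for mod; the function name is illustrative.

```python
from collections import Counter

def combined_visits(n):
    """Count how often the nest L''1..L''3 visits each original iteration (I1, I2)."""
    visits = Counter()
    for jo2 in (0, 1):                               # doall Jo2
        for j1 in range(-2 * n, 2 * n + 1):          # doall J1
            p2 = max(-n, -n - j1)
            q2 = min(n, n - j1)
            lo = p2 + (jo2 - p2) % 2                 # first J2 >= p2 with parity Jo2
            hi = q2 - (q2 - jo2) % 2                 # last J2 <= q2 with parity Jo2
            for j2 in range(lo, hi + 1, 2):
                visits[(j2, j1 + j2)] += 1           # I1 = J2, I2 = J1 + J2
    return visits

if __name__ == "__main__":
    v = combined_visits(3)
    print(len(v), set(v.values()))
```

Splitting J2 by parity doubles the parallelism obtained from the doall over J1, matching the step from (0,2) to (0,1).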

22

F-rank submatrix dependence graph

[Figure: dependence graphs; axes j1 and j2]

23

5. Conclusion

• The distances between dependent iterations can be non-constant when the array subscripts are linear.

• A pseudo distance matrix (PDM) with the largest base vectors of the distance space is computed from the linear dependence equations.

• Parallelism can still be exploited for these loops with variable distances by the unimodular and partitioning transformations that are derived from the PDM.
