Parallel Graph Partitioning Using Simulated Annealing

Parallel and Distributed Computing I

Sadik Gokhan Caglar

Graph Partitioning Problem

Given a graph $G = (N, E)$ and an integer $p$,

find subsets $N_1, N_2, \ldots, N_p$ such that

1. $\bigcup_{i=1}^{p} N_i = N$ and $N_i \cap N_j = \emptyset$ for $i \neq j$

2. $W(i) \approx W / p$, $i = 1, 2, \ldots, p$, where $W(i)$ and $W$ are the sums of node weights in $N_i$ and $N$ respectively

3. The cut size is minimized
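Condition 2 can be checked mechanically. The following is a minimal sketch, assuming the partition is stored as an array `part` that maps each node to its subset index (which makes condition 1 hold by construction) and that "approximately equal" means within a relative tolerance `tol`. The function name, the array layout and the tolerance are illustrative assumptions, not part of the original slides.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if every subset weight W(i) lies within a relative tolerance
 * `tol` of W / p, and 0 otherwise.
 * part[v]   : subset index (0..p-1) of node v (hypothetical layout)
 * weight[v] : weight of node v
 * tol       : illustrative tolerance, e.g. 0.1 for 10 percent */
int is_balanced(const int *part, const double *weight, int n, int p, double tol)
{
    assert(n > 0 && p > 0);
    double *w = calloc((size_t)p, sizeof *w);
    if (w == NULL)
        return 0;

    double total = 0.0;
    for (int v = 0; v < n; v++) {
        assert(part[v] >= 0 && part[v] < p);
        w[part[v]] += weight[v];   /* accumulate W(i) */
        total += weight[v];        /* accumulate W    */
    }

    double target = total / p;     /* W / p */
    int ok = 1;
    for (int i = 0; i < p; i++) {
        double diff = w[i] - target;
        if (diff < 0.0)
            diff = -diff;
        if (diff > tol * target)
            ok = 0;
    }
    free(w);
    return ok;
}
```

For four unit-weight nodes and p = 2, the assignment {0, 0, 1, 1} passes with a 10 percent tolerance, while {0, 0, 0, 1} does not.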

A Partitioned Graph

A partitioned graph with an edge-cut of seven

Solutions To The Problem

• Geometric Algorithms: Use the geometric coordinates

– Recursive coordinate (or orthogonal) bisection

– Recursive circle bisection

• Structural Algorithms:

– Graph-Walking Algorithms

– Spectral Algorithms

• Refinement Algorithms:

– Kernighan-Lin Algorithm

– Simulated Annealing Algorithm

Solutions To The Problem

Multilevel technique:

• Coarsen

• Partition

• Refinement

Simulated Annealing

Implementation of SA

• Cost: The number of edges whose endpoints are in different sets

• Acceptance: The new cost is less than the old

• Rejection: The new cost is more than the old; a probabilistic calculation can change a rejection into an acceptance ($e^{cost/Temp}$)

• Equilibrium: The search at one temperature continues while Number of rejections < (10 * vertexsize of the graph * number of sets)
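For comparison, the textbook Metropolis rule is one common way to turn a rejection into an acceptance. It is given here only as an assumption: the slides do not define the expression beyond the exponential of cost over temperature, and the sign convention of the author's own routine is not stated. Here $\Delta$ is the new cost minus the old cost and $T$ is the temperature.

```latex
\[
P(\text{accept}) =
\begin{cases}
1, & \Delta < 0, \\
e^{-\Delta / T}, & \Delta \ge 0,
\end{cases}
\qquad
\Delta = \text{cost}_{\text{new}} - \text{cost}_{\text{old}} .
\]
```

Under this rule a move that worsens the cost by 1 is accepted with probability $e^{-1} \approx 0.368$ at $T = 1$, but only $e^{-5} \approx 0.0067$ at $T = 0.2$, so worse moves become rare as the system cools.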

Implementation of SA

• Frozen state: The temperature starts from 1, the cooling constant is 0.95, it is considered frozen at temperature 0.2

currentcost = cost(graph);
printf("The cost of the graph1 is %f \n", currentcost);
while (temp > 0.2)
{
    while (reject < (10 * graph.vertexsize * graph.setsize))
    {
        makenewgraph(graph, &newgraph);
        tempcost = cost(newgraph);
        if (tempcost < currentcost)
        {
            currentcost = tempcost;
            graphfree(&graph);
            graph = newgraph;
        }
        else
        {
            reject++;
            if (tempcost == currentcost)
                prob = e(1, temp);
            else
                prob = e((tempcost - currentcost), temp);
            prob2 = drand48();
            if (prob > prob2)
            {
                currentcost = tempcost;
                graphfree(&graph);
                graph = newgraph;
            }
            else
                graphfree(&newgraph);
        } //1st else
    } //reject
    temp = temp * coolconst;
    reject = 0;
    printf("cooled!!! temp = %f \n", temp);
    printf("currentcost %f\n", currentcost);
}
printf("The cost of the graph2 is %f \n", currentcost);
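The cooling parameters fix how many temperature levels the outer loop visits. The sketch below simply replays the temperature update from the loop above; the function name `cooling_steps` is a hypothetical helper introduced for illustration.

```c
#include <assert.h>

/* Counts the iterations of the outer annealing loop: the temperature is
 * multiplied by `coolconst` until it is no longer above `frozen`.
 * (Hypothetical helper; it mirrors only the temperature update.) */
int cooling_steps(double temp, double coolconst, double frozen)
{
    assert(temp > 0.0 && frozen > 0.0);
    assert(coolconst > 0.0 && coolconst < 1.0);
    int steps = 0;
    while (temp > frozen) {
        temp = temp * coolconst;
        steps++;
    }
    return steps;
}
```

With a starting temperature of 1, a cooling constant of 0.95 and a frozen threshold of 0.2, `cooling_steps(1.0, 0.95, 0.2)` returns 32, since 0.95^31 is about 0.204 and 0.95^32 is about 0.194. Each of those 32 levels runs until the rejection limit is reached.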

Input File Format

Data Structures

typedef struct Edge
{
    int v1;
    int v2;
} Edge;

typedef struct Set
{
    int size;
    int* vertex;
} Set;

typedef struct Graph
{
    int vertexsize;
    int edgesize;
    int setsize;
    struct Edge* edgelist;
    struct Set* setlist;
} Graph;
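A cost function over these structures could look like the sketch below. It is not the author's `cost` routine, which the slides do not show; it assumes vertices are numbered 0 to vertexsize - 1 and that every vertex appears in exactly one set. The name `cut_cost` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Edge { int v1; int v2; } Edge;
typedef struct Set { int size; int *vertex; } Set;
typedef struct Graph {
    int vertexsize;
    int edgesize;
    int setsize;
    struct Edge *edgelist;
    struct Set *setlist;
} Graph;

/* Number of edges whose endpoints are in different sets.
 * Assumes vertex ids are 0..vertexsize-1 and each vertex is in one set.
 * Returns -1.0 if the temporary lookup table cannot be allocated. */
double cut_cost(Graph g)
{
    assert(g.vertexsize > 0);
    int *setof = malloc((size_t)g.vertexsize * sizeof *setof);
    if (setof == NULL)
        return -1.0;

    /* Build a vertex -> set lookup table from the set lists. */
    for (int s = 0; s < g.setsize; s++)
        for (int k = 0; k < g.setlist[s].size; k++)
            setof[g.setlist[s].vertex[k]] = s;

    /* Count edges that cross between two different sets. */
    int cut = 0;
    for (int e = 0; e < g.edgesize; e++)
        if (setof[g.edgelist[e].v1] != setof[g.edgelist[e].v2])
            cut++;

    free(setof);
    return (double)cut;
}
```

For a four-vertex cycle 0-1-2-3-0 split into {0, 1} and {2, 3}, the edges 1-2 and 3-0 cross the partition, so the cost is 2. The lookup table makes one evaluation linear in the number of vertices plus the number of edges.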

Parallelization Approach 1

• Problem independent: an attempt to implement a general parallel simulated annealing

• Every process will generate a new graph and calculate the new cost

• The results will be sent to the root process

• The root process will choose the best result and broadcast it.

Parallelization Approach 1

• The array that the root process gathers:

0 – Acceptance (0 no, 1 yes, 2 probability)

1 – Cost

2 – The set number of the first vertex

3 – The set number of the second vertex

4 – The first vertex

5 – The second vertex

Parallelization Approach 1

• The array that the root process broadcasts:

0 – Temperature update

1 – Change done

2 – The set number of the first vertex

3 – The set number of the second vertex

4 – The first vertex

5 – The second vertex

6 – The cost of the new graph
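The root's selection step can be sketched without MPI as a scan over the gathered records. This is a serial illustration under assumptions: each process contributes six values in the field order of the gathered array, a record with acceptance flag 0 is never chosen, and records flagged 1 or 2 compete on cost alone. The slides do not spell out how flag 2 is treated, so that part is a guess, and `choose_best` is a hypothetical name.

```c
#include <assert.h>

#define RECORD_LEN 6   /* fields 0..5 of the gathered array */

/* Returns the rank whose record has the lowest cost (field 1) among the
 * records whose acceptance flag (field 0) is non-zero, or -1 when every
 * process reported a rejection.
 * `gathered` holds nprocs records back to back, as a gather would. */
int choose_best(const double *gathered, int nprocs)
{
    assert(nprocs > 0);
    int best = -1;
    for (int r = 0; r < nprocs; r++) {
        const double *rec = gathered + r * RECORD_LEN;
        if (rec[0] == 0.0)
            continue;                      /* rejected move */
        if (best < 0 || rec[1] < gathered[best * RECORD_LEN + 1])
            best = r;                      /* lower cost wins */
    }
    return best;
}
```

After the choice, the root would copy fields 2 to 5 of the winning record and its cost into the broadcast array so that every process applies the same vertex swap.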

Parallelization Approach 1

• The equilibrium function has changed from Number of rejections < (10 * vertexsize of the graph * number of sets) to Number of rejections < (10 * vertexsize of the graph * number of sets / number of processes)

• The rest of the program is the same; the data is not distributed
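The effect of the new limit is easy to see with numbers. The helper below is a hypothetical sketch of the formula only, and it assumes integer division, which the slides do not specify.

```c
#include <assert.h>

/* Rejection limit for one temperature level.
 * nprocs = 1 reproduces the sequential limit.
 * Integer division is an assumption made for this sketch. */
int rejection_limit(int vertexsize, int setsize, int nprocs)
{
    assert(vertexsize > 0 && setsize > 0 && nprocs > 0);
    return 10 * vertexsize * setsize / nprocs;
}
```

For a 100-vertex graph split into 2 sets, the sequential limit is 2000 rejections per temperature level, and with 4 processes it drops to 500.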

Parallelization Approach 2

• Problem dependent: works only for the graph partitioning problem.

• Most of the work in the graph partitioning problem is calculating the cost of the graph.

• This depends on the number of edges in the graph; the edges array can be scattered to the processes.

• Each process needs only the edges it holds to calculate its partial sum. This step is perfectly parallelizable.

Parallelization Approach 2

• After each process calculates its partial sum, an MPI_Reduce with the add operation computes the total sum.

• All the simulated annealing operations are done on the root process; the others only calculate their partial sums.
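The decomposition can be illustrated serially: split the edge array into contiguous chunks, count the cut edges in each chunk, and add the partial counts. In the sketch below the loop over `r` stands in for the processes and the running total stands in for the reduction. The vertex-to-set array `setof`, which every process would need a copy of, and both function names are assumptions made for illustration.

```c
#include <assert.h>

typedef struct Edge { int v1; int v2; } Edge;

/* Cut edges within one chunk of the edge array.
 * setof[v] is the set index of vertex v (assumed available everywhere). */
int partial_cut(const Edge *edges, int count, const int *setof)
{
    int cut = 0;
    for (int e = 0; e < count; e++)
        if (setof[edges[e].v1] != setof[edges[e].v2])
            cut++;
    return cut;
}

/* Serial stand-in for scatter + reduce: the edge array is split into
 * nprocs contiguous chunks and the partial sums are added up. */
int total_cut(const Edge *edges, int edgesize, const int *setof, int nprocs)
{
    assert(nprocs > 0 && edgesize >= 0);
    int total = 0;
    for (int r = 0; r < nprocs; r++) {
        int begin = (int)((long)edgesize * r / nprocs);
        int end = (int)((long)edgesize * (r + 1) / nprocs);
        total += partial_cut(edges + begin, end - begin, setof);
    }
    return total;
}
```

The total does not depend on the number of chunks, because each edge is counted in exactly one chunk. In an MPI version the final addition would correspond to a reduction with `MPI_SUM`.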

ParSA1 16 Nodes

Chart: Time and Speedup versus Processors. Data labels:

Time      Speedup
1.27      1
0.87      1.459770115
0.54      2.351851852
0.37      3.432432432
0.29      4.379310345
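The speedup labels on these charts are consistent with the usual definition, the one-processor time divided by the parallel time. A minimal sketch of that calculation follows; the function name is illustrative.

```c
#include <assert.h>

/* Speedup as the ratio of the one-processor time to the parallel time. */
double speedup(double t1, double tp)
{
    assert(t1 > 0.0 && tp > 0.0);
    return t1 / tp;
}
```

For the 16-node ParSA1 run, `speedup(1.27, 0.29)` gives about 4.379, matching the largest speedup label on that chart.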

ParSA1 100 Nodes

Chart: Time and Speedup versus Processors. Data labels:

Time      Speedup
16.01     1
8.53      1.876905041
4.6       3.480434783
2.57      6.229571984
1.6       10.00625

ParSA1 300 Nodes

Chart: Time and Speedup versus Processors. Data labels:

Time      Speedup
144.66    1
74.52     1.941223833
38.24     3.782949791
20.44     7.077299413
11.2      12.91607143

ParSA1 500 Nodes

Chart: Time and Speedup versus Processors. Data labels:

Time      Speedup
483.32    1
223.57    2.16182851
113.85    4.245234958
58.66     8.23934538
30.89     15.64648754

ParSA1 1000 Nodes

Chart: Time and Speedup versus Processors. Data labels:

Time      Speedup
1809.9    1
901.92    2.006718999
456.31    3.966382503
231.62    7.814092047
118.97    15.21307893

ParSA2 16 Nodes

Chart: Time and Speedup versus Processors. Data labels:

Time      Speedup
0.93      1
1.88      0.494680851
2.33      0.399141631
2.78      0.334532374

ParSA2 100 Nodes

Chart: Time and Speedup versus Processors. Data labels:

Time      Speedup
15.17     1
11.82     1.283417936
10.77     1.408542247
10.09     1.503468781
9.88      1.535425101

ParSA2 300 Nodes

Chart: Time and Speedup versus Processors. Data labels:

Time      Speedup
134.76    1
92.13     1.462715728
61.67     2.18517918
50.31     2.678592725
45.28     2.97614841

ParSA2 500 Nodes

Chart: Time and Speedup versus Processors. Data labels:

Time      Speedup
456.86    1
256.87    1.778565033
163.8     2.789133089
115.21    3.965454388
98.8      4.624089069

ParSA2 1000 Nodes

Chart: Time and Speedup versus Processors. Data labels:

Time      Speedup
1777.57   1
1232.36   1.442411308
677.44    2.623951937
420.12    4.231100638
335.9     5.291961893

ParSA2 10000 Nodes 40000 Edges

Chart: Time and Speedup versus Processors. Data labels:

Time      Speedup
23.32     1
17.12     1.362149533
11.24     2.074733096
7.47      3.121820616
4.69      4.97228145
3.66      6.371584699

ParSA2 10000 Nodes 80000 Edges

Chart: Time and Speedup versus Processors. Data labels:

Time      Speedup
45.75     1
29.96     1.527036048
18.62     2.457035446
10.76     4.251858736
7.39      6.190798376
4.7       9.734042553