
Department of Biomedical Informatics

Dynamic Load Balancing (Repartitioning)

& Matrix Partitioning

Ümit V. Çatalyürek

Associate Professor

Department of Biomedical Informatics

Department of Electrical & Computer Engineering

The Ohio State University

Workshop on Combinatorial Scientific Computing & Petascale Simulations 2008

June 10-13, 2008, Santa Fe, NM


OSU’s CSCAPES Contributions

• Load Balancing
  • Parallel Static Load Balancing
  • Parallel Dynamic Load Balancing
• Parallel Graph Coloring
  • Distance-1 coloring
  • Distance-2 coloring
  • talk by Bozdag Friday morning
• Parallel Matrix Partitioning
• Parallel Matrix Ordering

Umit Catalyurek "Dynamic Load Bal. & Matrix Partitioning"

CSCAPES Workshop, June 10, 2008


Roadmap

• Dynamic Load Balancing
  • Motivation
  • Background
    • Classification of Repartitioning Techniques
    • Graph and Hypergraph Approaches
  • New Hypergraph Model for Dynamic Load Balancing
  • Parallel Multilevel Hypergraph Partitioning with Fixed Vertices
  • Experimental Results & Summary
• Matrix Partitioning
  • 1D Hypergraph-based Methods: Row-wise and Column-wise
  • 2D Hypergraph-based Methods: Fine-grain, Jagged-Like, Checkerboard
  • Experimental Results & Summary


Partitioning and Load Balancing

• Goal: assign data to processors to
  • minimize application runtime
  • maximize utilization of computing resources
• Metrics:
  • minimize processor idle time (balance workloads)
  • keep inter-processor communication costs low
• Impacts performance of a wide range of simulations

[Figures: adaptive mesh refinement; contact detection; particle simulations; linear solvers & preconditioners (Ax = b)]


Dynamic Load Balancing/Repartitioning

• Applications with workload or locality that changes during simulation require dynamic load balancing (a.k.a. repartitioning)
  • Adaptive mesh refinement
  • Particle methods
  • Contact detection
• Repartitioning has additional cost:
  • Moving data from old to new decomposition

$T_{\text{execution}} = \#\text{iter} \times (T_{\text{computation}} + T_{\text{communication}}) + T_{\text{repart}} + T_{\text{migration}}$
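The execution-time model above can be evaluated directly to judge whether a repartition is worth its cost. The following is a minimal sketch; the function name and all timing values are hypothetical and chosen only for illustration.

```python
def total_execution_time(n_iter, t_comp, t_comm, t_repart, t_mig):
    """Time for one epoch: n_iter iterations plus one repartition and migration."""
    return n_iter * (t_comp + t_comm) + t_repart + t_mig


# Hypothetical per-iteration and one-off timings in seconds (not measured data).
# Option A: keep the old, imbalanced decomposition (no repartition, no migration).
keep_old = total_execution_time(n_iter=100, t_comp=1.0, t_comm=0.5,
                                t_repart=0.0, t_mig=0.0)
# Option B: repartition once, pay for it, then run with better balance.
rebalance = total_execution_time(n_iter=100, t_comp=0.8, t_comm=0.5,
                                 t_repart=2.0, t_mig=5.0)

repartition_pays_off = rebalance < keep_old
print(keep_old, rebalance, repartition_pays_off)
```

With few iterations between repartitions the one-off migration term dominates; with many iterations the per-iteration terms dominate, which is why #iter appears as a scaling factor in the model.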


Classification of Dynamic Load Balancing Approaches


Graph and Hypergraph Partitioning

| | Graphs | Hypergraphs |
|---|---|---|
| Community | Load balancing (highly successful for PDE problems) | VLSI, recently computational science |
| Model | Vertices = computation/data; Edge = relationship between computation/data (bi-directional) | Vertices = computation/data; Edge = dependency on data elements (multi-way) |
| Goal | Evenly distribute vertex weight while minimizing weight of cut edges | Evenly distribute vertex weight while minimizing cut size |
| Algorithms | Kernighan, Lin, Simon, Hendrickson, Leland, Kumar, Karypis, et al. | Kernighan, Schweikert, Fiduccia, Mattheyses, Sanchis, Alpert, Kahng, Hauck, Borriello, Çatalyürek, Aykanat, Karypis, et al. |
| Serial Partitioner | Chaco (SNL), Jostle (U. Greenwich), METIS (U. Minn.), Party (U. Paderborn), Scotch (U. Bordeaux) | hMETIS (Karypis), PaToH (Çatalyürek), Mondriaan (Bisseling) |
| Parallel Partitioner | ParMETIS (U. Minn.), PJostle (U. Greenwich) | Zoltan PHG (Sandia), Parkway (Trifunovic) |


Impact of Hypergraph Models (Where Graph is not Sufficient)

• Greater expressiveness → greater applicability
  • Structurally non-symmetric systems
    • circuits, biology
  • Rectangular systems
    • linear programming, least-squares methods
  • Non-homogeneous, highly connected topologies
    • circuits, nanotechnology, databases
  • Multiple models for different granularity partitioning
    • Owner compute, fine-grain, checkerboard/cartesian, Mondriaan
• Accurate communication model → lower application communication costs


[Figures: hypergraph example with vertices Vi, Vj, Vk, Vl, Vm, Vh and nets ni, nk, nl, nm, nh distributed over parts P1 to P4; Mondriaan partitioning, courtesy of Rob Bisseling]


Hypergraph Model

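• $\lambda_i$: number of parts that edge $e_i$ connects
• Cut $= \sum_{e_i \in E} (\lambda_i - 1) \cdot c_i$, where $c_i$ is the cost of edge $e_i$
• Cut = total communication volume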
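The cut metric on this slide charges each hyperedge its cost times one less than the number of parts it connects. A minimal sketch of that computation follows; the data layout (lists of pins, a dict from vertex to part) and the toy hypergraph are assumptions made for illustration, not the representation used by any particular partitioner.

```python
def connectivity_cut(edges, costs, part):
    """Sum of c_i * (lambda_i - 1) over all hyperedges.

    edges: list of hyperedges, each a list of vertex ids (its pins)
    costs: list of per-edge costs c_i, aligned with edges
    part:  dict mapping vertex id -> part id
    """
    total = 0
    for pins, cost in zip(edges, costs):
        lam = len({part[v] for v in pins})  # lambda_i: parts this edge connects
        total += cost * (lam - 1)
    return total


# Toy hypergraph with three hyperedges and a 2-way partition (hypothetical data).
edges = [["a", "b", "c"], ["c", "d"], ["a", "d"]]
costs = [2, 1, 3]
part = {"a": 0, "b": 0, "c": 1, "d": 1}

cut = connectivity_cut(edges, costs, part)
print(cut)
```

An edge whose pins all lie in one part contributes nothing, so only edges that span parts add to the communication volume.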


Hypergraph Repartitioning

• Start with application hypergraph
• Add
  • one partition vertex for each partition
  • migration edges connecting application vertices to their partition vertices
• Weight the hyperedges:
  • Migration edge weight = size of application objects (migration size)
  • Application edge weight = size of communication elements
  • Scale application edge weights by ≈ number of application communications between repartitions (#iter)
• Perform hypergraph partitioning with partition vertices “fixed”


$T_{\text{execution}} = \#\text{iter} \times (T_{\text{computation}} + T_{\text{communication}}) + T_{\text{repart}} + T_{\text{migration}}$
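The construction steps for hypergraph repartitioning can be sketched as follows. This is an illustrative sketch only and not the Zoltan implementation: the function names, the tuple labels used for partition vertices, and the toy data are all hypothetical. It shows why the weighted cut of the augmented hypergraph equals #iter times the communication volume plus the migration volume.

```python
def build_repartitioning_hypergraph(app_edges, comm_sizes, object_sizes,
                                    old_part, num_parts, n_iter):
    """Return (edges, weights, fixed) for the augmented hypergraph."""
    edges, weights = [], []
    # Application edges: communication size scaled by iterations between repartitions.
    for pins, size in zip(app_edges, comm_sizes):
        edges.append(list(pins))
        weights.append(n_iter * size)
    # One partition vertex per part, fixed to that part.
    fixed = {("part", p): p for p in range(num_parts)}
    # One migration edge per application vertex, weighted by the object's size.
    for v, p in old_part.items():
        edges.append([v, ("part", p)])
        weights.append(object_sizes[v])
    return edges, weights, fixed


def weighted_cut(edges, weights, assignment):
    """Connectivity-minus-one cut of a partition over the augmented hypergraph."""
    return sum(w * (len({assignment[v] for v in pins}) - 1)
               for pins, w in zip(edges, weights))


# Hypothetical application: three objects, two communication edges, two parts.
app_edges = [["a", "b"], ["b", "c"]]
comm_sizes = [1, 1]
object_sizes = {"a": 4, "b": 2, "c": 3}
old_part = {"a": 0, "b": 0, "c": 1}

edges, weights, fixed = build_repartitioning_hypergraph(
    app_edges, comm_sizes, object_sizes, old_part, num_parts=2, n_iter=10)

# Candidate 1: keep the old assignment. Candidate 2: move object "b" to part 1.
stay = {**old_part, **fixed}
move_b = {**{"a": 0, "b": 1, "c": 1}, **fixed}
cost_stay = weighted_cut(edges, weights, stay)
cost_move_b = weighted_cut(edges, weights, move_b)
print(cost_stay, cost_move_b)
```

Because the partition vertices are fixed, a migration edge is cut exactly when its application vertex leaves its old part, so moving an object is charged its migration size.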


Implementation of Hypergraph Repartitioning

• Implemented in Zoltan toolkit

• Based on parallel multilevel hypergraph partitioner with recursive bisection (IPDPS’06)

• Automatically construct augmented hypergraph

• … with added capability for handling “fixed vertices.”


Experimental Results

• Experiments on
  • OSU-RI cluster
    • 64 compute nodes connected with InfiniBand
    • Dual 2.4 GHz AMD Opteron processors with 8 GB RAM
  • Sandia-Thunderbird cluster
    • 4,480 compute nodes connected with InfiniBand
    • Dual 3.6 GHz Intel EM64T processors with 6 GB RAM
• Zoltan v3 (alpha) hypergraph partitioner & ParMETIS v3.1 graph partitioner
• Test problems:
  • 2DLipid: density functional theory; 4K x 4K; 5.6M nonzeros
  • Xyce: ASIC Stripped; 680K x 680K; 2.3M nonzeros
  • Cage14: DNA Electrophoresis; 1.5M x 1.5M; 27M nonzeros

[Figures: Xyce ASIC Stripped; Cage Electrophoresis]


Communication Volume

[Plots: communication volume for 2DLipid, Xyce, and Cage14]

• Hypergraph is better
• Zoltan-repart trades communication for migration to minimize total cost
• Scratch methods are comparable for large alpha (#iter)


Dynamic Graph: Partitioning Time on T-bird

[Plots: partitioning time for 2DLipid, Cage14, and Xyce]


Summary of Dynamic Load Balancing

• A novel hypergraph model for dynamic load balancing
  • Single hypergraph that incorporates both communication volume in the application and data migration cost
  • Performs better than or comparably to graph-based dynamic load balancing
• A parallel dynamic load balancing tool
  • Essential for peta-scale applications
  • Scales similarly to graph-based tools
• Future Work
  • There is always room for improvement: speed and/or quality
  • Direct k-way refinement


Matrix Partitioning

• Hypergraph Models for Sparse-Matrix Partitioning
  • 1D
    • row-wise
    • column-wise
  • 2D
    • Fine-grain
    • Jagged-like
    • Checkerboard
• Serial Tool: PaToH & Matlab interface
  • Matrix Partitioning
  • Partitioned Matrix Display
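As an illustration of the 1D row-wise case, the sketch below assumes the column-net hypergraph model commonly used for row-wise partitioning: rows are vertices, each column is a net connecting the rows that have a nonzero in it, and the connectivity-minus-one cut gives the communication volume of y = Ax. It further assumes a square matrix in which x_j is stored with row j. The function name and the sample matrix are hypothetical, and this is not the PaToH or Matlab interface.

```python
def rowwise_comm_volume(nonzeros, row_part):
    """Communication volume of y = A x under a row-wise partition.

    nonzeros: iterable of (i, j) positions of the nonzeros of a square matrix
    row_part: dict mapping row index -> part id (x_j is assumed to live with row j)
    """
    nets = {}
    for i, j in nonzeros:
        nets.setdefault(j, set()).add(i)  # column j is a net over the rows using x_j
    volume = 0
    for j, rows in nets.items():
        pins = rows | {j}  # include the owner of x_j
        parts = {row_part[i] for i in pins}
        volume += len(parts) - 1  # x_j is sent once to every other part that needs it
    return volume


# Hypothetical 4 x 4 sparsity pattern, rows 0-1 on part 0 and rows 2-3 on part 1.
nonzeros = [(0, 0), (0, 1), (1, 1), (2, 1), (2, 2), (3, 3), (3, 0)]
row_part = {0: 0, 1: 0, 2: 1, 3: 1}

volume = rowwise_comm_volume(nonzeros, row_part)
print(volume)
```

Here x_0 must be sent to the part holding row 3 and x_1 to the part holding row 2, so two words are communicated in total.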


1D Partitioning

• M x N matrices with K processors
• Worst case
  • Total Volume = (K-1) x N words or (K-1) x M words
  • Total Number Messages = K x (K-1)


2D Partitioning: Jagged-Like

• M x N matrices with K = P x Q processors
• Worst case
  • Total Volume = (K-P) x N + (Q-1) x M
  • Total Number Messages = K x (K-Q) + K x (Q-1) = K x (K-1)


2D Partitioning: Checkerboard

• M x N matrices with K = P x Q processors
• Worst case
  • Total Volume = (P-1) x N + (Q-1) x M
  • Number of Messages = P+Q-2 per processor, i.e., K x (P+Q-2) in total
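The worst-case bounds quoted on the 1D, jagged-like, and checkerboard slides can be tabulated side by side for a given matrix size and processor grid. The sketch below simply evaluates those formulas; the function name and the sample dimensions are hypothetical, and the checkerboard figure P+Q-2 is read here as a per-processor message count (an assumption of this sketch).

```python
def worst_case_bounds(M, N, P, Q):
    """Evaluate the worst-case bounds for an M x N matrix on K = P x Q processors."""
    K = P * Q
    return {
        "1D": {
            # (K-1) x N words or (K-1) x M words, depending on the 1D variant
            "volume": ((K - 1) * N, (K - 1) * M),
            "messages": K * (K - 1),
        },
        "jagged-like": {
            "volume": (K - P) * N + (Q - 1) * M,
            "messages": K * (K - Q) + K * (Q - 1),
        },
        "checkerboard": {
            "volume": (P - 1) * N + (Q - 1) * M,
            # P+Q-2 is treated as a per-processor count (assumption of this sketch)
            "messages_per_processor": P + Q - 2,
        },
    }


# Hypothetical example: 1000 x 1000 matrix on a 4 x 4 processor grid.
bounds = worst_case_bounds(M=1000, N=1000, P=4, Q=4)
for scheme, values in bounds.items():
    print(scheme, values)
```

For this example the checkerboard scheme bounds the volume at 6,000 words against 15,000 for the other two, which illustrates why 2D schemes are attractive when message count and volume bounds matter.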


[Figures: cage5 matrix, shown on three consecutive slides]

Experimental Results

• Tested 1,413 matrices (out of 1,877) from UFL Collection
  • #rows >= 500 and #columns >= 500
  • #non-zeros < 10,000,000
• K-way partitioning for K = 4, 16, 64 and 256
  • If 50 x K <= max {#rows, #columns}
• Partitioning instance = matrix & K
  • For each partitioning instance we run RW (row-wise), CW (column-wise), JL (jagged-like), CH (checkerboard), FG (fine-grain) methods
• Linux Cluster
  • 64 dual 2.4 GHz Opteron CPUs, 8 GB RAM


Experimental Results: Total Communication Volume

[Plots: performance profiles of total communication volume, one panel per instance class]

• All Instances (4040)
• Square Symmetric (2231)
• Square Non-symmetric (1102)
• Rectangular (707)
  • N>M (662)
  • M>N (45)
  • CW better than RW


Experimental Results: Total Number of Messages


Experimental Results: Execution Time


Summary of Matrix Partitioning

• Hypergraph models for Matrix Partitioning
  • Well... some are not new, but they have not been adopted by applications yet. Why? (Information dissemination problem? Tool?)
  • More hypergraph-based methods are being developed!
    • Corner-Model
    • Hybrid Mondriaan with Fine-Grain
• Matlab interface to PaToH for Matrix Partitioning
  • Currently supports: RW, CW, JL, CH, FG
  • Will be available soon
• Work in progress
  • Parallel Matrix Partitioning via Zoltan


Thanks

• Contact Info:
  • [email protected]
  • http://bmi.osu.edu/~umit
• Also:
  • http://www.cs.sandia.gov/Zoltan/
  • http://www.cscapes.org/
