Distributed Programming in Scala with APGAS
Philippe Suter, Olivier Tardieu, Josh Milthorpe
IBM Research
Picture by Simon Greig
APGAS - Context
• Model for concurrency + distribution in X10.
• X10, a general-purpose language:
– Developed at IBM Research for 10+ years.
– Focus/bias towards distributed HPC tasks.
– JVM + native back-ends (through Java & C++).
– Some X10 apps ran on >50K cores.
APGAS: Asynchronous Partitioned Global Address Space
See http://x10-lang.org and X10’15 @ PLDI (tomorrow).
APGAS in Scala
• Goal: expose the concurrent/distributed core of X10 as a library.
– In Java 8 and as a Scala DSL.
• This contribution:
– Introduction to programming with APGAS in Scala.
– Illustrated through two benchmarks:
  • K-means clustering
  • Unbalanced Tree Search (see paper)
– Contrasting the model with Akka (see paper).
– Preliminary experimental scaling results.
APGAS Primer
• Concurrent tasks run at distributed places.
• The environment exposes the available places.
def places: Seq[Place]
def here: Place
def asyncAt(p: Place)(body: => Unit): Unit
def async(body: => Unit): Unit
• Tasks can be remote or local.
• Tasks are asynchronous by default.
APGAS Primer
• The termination of tasks is controlled by the finish construct.
def finish(body: =>Unit) : Unit
• Blocks until enclosed tasks have completed, including all nested tasks, local or remote.
• Detecting distributed termination is challenging; finish is a powerful contribution of APGAS.
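The blocking behaviour of finish can be illustrated with a single-JVM toy model. The sketch below is not the APGAS library: the object name ToyFinish, the thread pool, and the use of java.util.concurrent.Phaser are all assumptions made for illustration, and places are ignored entirely. It only shows the idea that a finish scope waits for every task spawned inside it, including nested ones.

```scala
import java.util.concurrent.{Executors, Phaser, ThreadFactory}
import java.util.concurrent.atomic.AtomicInteger

// Toy, local-only model of finish/async (hypothetical, not the APGAS API).
object ToyFinish {
  // Daemon threads so the JVM can exit without an explicit shutdown.
  private val pool = Executors.newCachedThreadPool(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r)
      t.setDaemon(true)
      t
    }
  })

  // The finish scope that tasks spawned by the current thread report to.
  private val current = new ThreadLocal[Phaser]

  def finish(body: => Unit): Unit = {
    val outer = current.get()
    val scope = new Phaser(1) // one party: the finish body itself
    current.set(scope)
    try body
    finally {
      current.set(outer)
      scope.arriveAndAwaitAdvance() // blocks until every registered task arrived
    }
  }

  def async(body: => Unit): Unit = {
    val scope = current.get()
    require(scope != null, "async must be called inside finish")
    scope.register() // registered before async returns to the caller
    pool.execute(new Runnable {
      def run(): Unit = {
        current.set(scope) // nested asyncs join the same finish scope
        try body
        finally {
          current.remove()
          scope.arriveAndDeregister()
        }
      }
    })
  }

  // Four tasks, each spawning one nested task: eight increments in total.
  def demo(): Int = {
    val done = new AtomicInteger(0)
    finish {
      for (_ <- 1 to 4) async {
        async { done.incrementAndGet() } // nested task
        done.incrementAndGet()
      }
    }
    done.get()
  }
}
```

Because a parent task registers its children before it arrives at the scope itself, the scope cannot complete while any nested task is still pending. The distributed case is harder precisely because this bookkeeping has to span places.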
Hello World
finish {
  for (p <- places) {
    asyncAt(p) {
      println(s"Hello from $here.")
    }
  }
}
finish completes when all places have completed their task.
asyncAt returns immediately.
$> …
Hello from place(0).
Hello from place(3).
Hello from place(1).
Hello from place(2).
“Academic” Fibonacci
def fibonacci(i: Int): Long = {
  if (i <= 1) i
  else {
    var a, b = 0L
    finish {
      async { a = fibonacci(i - 2) }
      b = fibonacci(i - 1)
    }
    a + b
  }
}
finish guards a single async…
…but recursive invocations enclose many more.
finish completes exactly when the computation of all dependencies is complete.
Messages and Memory
• The default mechanism for transferring memory between places is to capture it in the closure of the body of asyncAt.
• APGAS lets the programmer define global symbols for memory local to places.
class Worker(…) extends PlaceLocal
Place-local Objects
• All instances of PlaceLocal resolve to objects that are place-specific.
class Worker(…) extends PlaceLocal
val w : Worker = PlaceLocal.forPlaces(places) { new Worker(…) }
for (p <- places) {
  asyncAt(p) {
    w.work()
  }
}
One distinct instance is created at each place.
Here, w resolves to the worker at place p.
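One way to picture place-local resolution is a handle that maps each place to its own instance. The following is a single-JVM sketch under stated assumptions: places are plain integers, at runs its body synchronously, and the names ToyPlaceLocal, Handle, and resolve are hypothetical. In the slides the resolution is implicit in w itself; here it is made explicit to show what happens.

```scala
import scala.util.DynamicVariable

// Toy model of place-local objects (hypothetical, not the APGAS API).
object ToyPlaceLocal {
  type Place = Int
  val places: Seq[Place] = 0 until 4

  private val current = new DynamicVariable[Place](0)
  def here: Place = current.value

  // Runs body as if at place p (synchronously, same JVM).
  def at[A](p: Place)(body: => A): A = current.withValue(p)(body)

  // Holds one instance per place; resolve() picks the one for `here`.
  final class Handle[T](instances: Map[Place, T]) {
    def resolve(): T = instances(here)
  }

  // Evaluates init once at each place, giving one distinct instance per place.
  def forPlaces[T](ps: Seq[Place])(init: => T): Handle[T] =
    new Handle(ps.map(p => p -> at(p)(init)).toMap)

  final class Worker {
    val home: Place = here // the place where this instance was created
    var jobs = 0
    def work(): Unit = jobs += 1
  }

  def demo(): Seq[(Place, Int)] = {
    val w = forPlaces(places) { new Worker }
    for (p <- places) at(p) { w.resolve().work() }
    at(2) { w.resolve().work() } // only the worker at place 2 is affected
    places.map(p => at(p) { val x = w.resolve(); (x.home, x.jobs) })
  }
}
```

The same handle w is used everywhere, yet each place sees and mutates only its own Worker.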
Global and Shared References
• For objects that cannot extend PlaceLocal, APGAS provides a wrapper (“pointer”).
trait GlobalRef[T] { def apply(): T }
• Shared references refer to an object at a particular place and can only be dereferenced there.
– Useful to “call back” from an asynchronous task.
trait SharedRef[T] { def apply(): T }
Global and Shared References
// at place p1
val largeArray : Array[Double] = …
val ref = SharedRef.make(largeArray)
asyncAt(p2) {
  …
  asyncAt(p1) {
    val array = ref()
    array(…) = …
  }
  …
}
Dereference at p1 resolves to largeArray.
largeArray is never captured, therefore never serialized.
Dereferencing ref() at p2 (outside the inner asyncAt) would be an error.
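The "only dereference at home" rule can be mimicked in a few lines. This is a sketch under assumptions, not the library's SharedRef: places are integers, at runs its body synchronously in the same JVM, and the object name ToySharedRef and the choice of IllegalStateException are made up for illustration.

```scala
import scala.util.DynamicVariable

// Toy model of a shared reference (hypothetical, not the APGAS API).
object ToySharedRef {
  type Place = Int

  private val current = new DynamicVariable[Place](0)
  def here: Place = current.value

  // Runs body as if at place p (synchronously, same JVM).
  def at[A](p: Place)(body: => A): A = current.withValue(p)(body)

  final class SharedRef[T](target: T) {
    val home: Place = here // the place where the reference was created

    def apply(): T = {
      if (here != home)
        throw new IllegalStateException(
          s"dereferenced at place $here, home is place $home")
      target
    }
  }

  // Mirrors the slide: create at p1, hop to p2, call back to p1.
  def demo(): (Double, Boolean) = {
    val p1 = 1
    val p2 = 2
    at(p1) {
      val largeArray = Array.fill(4)(0.0)
      val ref = new SharedRef(largeArray)
      val failedAtP2 = at(p2) {
        val failed =
          try { ref(); false }
          catch { case _: IllegalStateException => true }
        at(p1) {
          val array = ref() // back home: the dereference succeeds
          array(0) = 42.0
        }
        failed
      }
      (largeArray(0), failedAtP2)
    }
  }
}
```

In a real distributed run only the small reference would travel with the closure; this toy cannot show serialization, only the home-place check.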
Distributed K-means Clustering
• Goal: iteratively divide a set of points into K disjoint clusters.
• Distribute the points among workers.
• In each iteration:
– workers:
  • compute the new centroids for their own points.
  • communicate their view of the centroids to the master.
– the master:
  • aggregates all workers’ data and checks convergence.
Distributed K-Means: Memory
• Each worker needs to hold:
– Its set of points.
– Its local view of centroids.
• In addition, the master holds:
– The aggregated centroids.
• In our implementation, the workers write their results directly at the master’s place.
– This requires a synchronized data structure.
GlobalRef[WorkerData]
SharedRef[MasterData]
Distributed K-Means: Structure
while (!converged) {
  finish {
    for (p <- places) {
      asyncAt(p) {
        // compute new local centroids
        asyncAt(masterRef.home()) {
          // merge local centroids in master
        }
      }
    }
  }
}
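The data flow of this loop can be sketched sequentially, without any places or tasks. The code below is an illustrative assumption rather than the benchmark's implementation: points are one-dimensional, each "worker" is just a shard of the input, and the names ToyKMeans, Partial, localStep, and iterate are hypothetical. Convergence is tested by exact equality for simplicity.

```scala
// Sequential sketch of the worker/master split (hypothetical names throughout).
object ToyKMeans {
  // Per-cluster running sums: the worker's view that is sent to the master.
  final case class Partial(sums: Vector[Double], counts: Vector[Int]) {
    def merge(o: Partial): Partial =
      Partial(
        sums.zip(o.sums).map { case (a, b) => a + b },
        counts.zip(o.counts).map { case (a, b) => a + b })
  }

  // Worker step: assign each local point to its nearest centroid.
  def localStep(points: Seq[Double], centroids: Vector[Double]): Partial = {
    val sums = Array.fill(centroids.size)(0.0)
    val counts = Array.fill(centroids.size)(0)
    for (x <- points) {
      val k = centroids.indices.minBy(i => math.abs(x - centroids(i)))
      sums(k) += x
      counts(k) += 1
    }
    Partial(sums.toVector, counts.toVector)
  }

  // Master step: merge all partial results and recompute the centroids.
  def iterate(shards: Seq[Seq[Double]], centroids: Vector[Double]): Vector[Double] = {
    val total = shards.map(localStep(_, centroids)).reduce(_ merge _)
    centroids.indices.toVector.map { k =>
      // An empty cluster keeps its previous centroid.
      if (total.counts(k) == 0) centroids(k) else total.sums(k) / total.counts(k)
    }
  }

  // Repeats iterations until the centroids stop changing.
  def run(shards: Seq[Seq[Double]], init: Vector[Double]): Vector[Double] = {
    var centroids = init
    var converged = false
    while (!converged) {
      val next = iterate(shards, centroids)
      converged = next == centroids
      centroids = next
    }
    centroids
  }
}
```

In the distributed version, each localStep would run inside asyncAt(p) and each merge inside the nested asyncAt at the master's place, which is why the master's data structure must be safe under concurrent updates.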
Unbalanced Tree Search
• Counts nodes in a dynamically generated tree.
• Each node:
– Has an associated SHA1 hash.
– Has a number of children determined by a probabilistic law.
• Trees are unbalanced in an unpredictable but deterministic way.
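How a tree can be both unpredictable and deterministic is easiest to see in code. The sketch below is a simplified assumption, not the benchmark's generator: each child hash is the SHA-1 of its parent's hash plus the child index, and the child-count rule (first hash byte modulo 4, with a depth cutoff) is a made-up stand-in for the benchmark's probabilistic law. The name ToyUts is hypothetical.

```scala
import java.nio.ByteBuffer
import java.security.MessageDigest

// Sequential node counting over a hash-defined tree (hypothetical sketch).
object ToyUts {
  // A child's hash depends only on its parent's hash and its index.
  private def sha1(parent: Array[Byte], child: Int): Array[Byte] = {
    val md = MessageDigest.getInstance("SHA-1")
    md.update(parent)
    md.update(ByteBuffer.allocate(4).putInt(child).array())
    md.digest()
  }

  // Made-up law: 0 to 3 children chosen by the first byte of the hash.
  private def numChildren(hash: Array[Byte], depth: Int, maxDepth: Int): Int =
    if (depth >= maxDepth) 0 else (hash(0) & 0xff) % 4

  // Depth-first traversal with an explicit stack of (hash, depth) pairs.
  def countNodes(seed: Int, maxDepth: Int): Long = {
    val root = sha1(Array.emptyByteArray, seed)
    var stack = List((root, 0))
    var count = 0L
    while (stack.nonEmpty) {
      val (hash, depth) = stack.head
      stack = stack.tail
      count += 1
      for (i <- 0 until numChildren(hash, depth, maxDepth))
        stack = (sha1(hash, i), depth + 1) :: stack
    }
    count
  }
}
```

The shape of any subtree is fixed by its root hash but cannot be known without expanding it, which is what makes static load balancing ineffective and motivates dynamic work distribution.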
Unbalanced Tree Search
• Algorithm combines work-stealing and work-dealing among workers.
• Workers are modeled as state machines.
• Termination:
– in APGAS: a single, top-level finish.
– in Akka: requires a counting protocol.
APGAS Implementation
• APGAS implementation:
– ~2000 lines of Java 8
– ~200 lines of Scala (definitions, helpers, serialization)
• Tasks are scheduled using fork/join.
• Distribution is built on top of Hazelcast.
• Benchmarks are ~1200 lines of Scala:
– 1/3 APGAS, 1/3 Akka, 1/3 common.
Performance Evaluation
• For both benchmarks, we ran a fixed problem using 1, 2, 4, 8, 16, and 32 workers.
• Measured “unit of work” per second per worker.
• All experiments ran on a single 48-core machine.
– Akka benchmarks use akka-remote.
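The per-worker metric is a simple normalization, spelled out here as a sketch. The function name perWorker and the numbers in the usage note are hypothetical and are not measurements from the talk.

```scala
// Units of work per second per worker (hypothetical helper).
object Throughput {
  def perWorker(units: Double, seconds: Double, workers: Int): Double =
    units / seconds / workers
}
```

For example, Throughput.perWorker(320.0, 20.0, 32) gives 0.5. Under this normalization, perfect scaling on a fixed problem shows up as a flat curve as workers are added, while overhead from communication or contention shows up as a decline.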
Performance Evaluation
• Experiments are meant to:
– be a sanity check,
– provide evidence of scalability potential.
• Please do not interpret this as a claim that X is better than Y.
“Comparable performance and scalability for comparable complexity.”
K-Means
[Plot: iterations/second/worker (y-axis, 0.34 to 0.48) against number of workers (x-axis, 0 to 35); series: APGAS, Akka.]
Unbalanced Tree Search
[Plot: millions of nodes/second/worker (y-axis, 8.4 to 9.6) against number of workers (x-axis, 0 to 35); series: APGAS, Akka.]
Conclusion
• Made the APGAS programming model accessible to Scala programmers.
• Programming style is different, but a good fit for some problems.
• In particular, finish concisely solves hard distributed termination problems.
• Complexity is similar to equivalent Akka implementations.
• Promising preliminary scaling results.
Thank you!