University of Massachusetts, Amherst • Department of Computer Science
X10: IBM’s bid into parallel languages
Paul B Kohler, Kevin S Grimaldi
University of Massachusetts Amherst

Introduction
- A new language based on Java
- IBM's entry in the DARPA-funded PERCS project (Productive, Easy-to-use, Reliable Computing System)
- Built for NUCC (Non-Uniform Cluster Computing) systems, where different memory locations incur different access costs

Introduction (cont.)
- Will eventually be combined with new tools for Eclipse
- Goals
  - Safe
  - Analyzable
  - Scalable
  - Flexible

PGAS
- Past attempts at parallel languages have used the illusion of a single shared memory
- This does not represent the situation in NUCC
- Problems occur when we try to divide memory among processors
- X10 uses PGAS to reveal the non-uniformity and make the language scalable

PGAS (cont.)
- PGAS = Partitioned Global Address Space
- Memory is partitioned into places
- Mutable data is associated with a place and can only be read or changed by activities at that place
- Provided in X10 through the abstractions of places and activities

Places
- Contain a collection of resident mutable data objects and associated activities
- Places represent locality boundaries
  - Very efficient access to resident data
- Set of places remains fixed at runtime
- Places are virtual
  - Mapped to physical processors by the runtime
  - Runtime may transparently migrate places

Using Places
- Accessible via place.places
- First activity runs at place.FIRST_PLACE
- Iterate over places with next() and prev()
- here represents the current place

Activities
- Similar to Java threads
- Activities are associated with a place
- Activities never migrate between places
- An activity may only read or modify mutable data that is local to its place
- However, immutable data (i.e. final or value) may be accessed by any activity

Activities (cont.)
- Activities are GALS (Globally Asynchronous, Locally Synchronous)
- Local data accesses are synchronized
- Global data accesses are not synchronized by default
- Synchronization can be explicitly forced

Activities: Syntax
- It is very simple to spawn new activities:
    async (place) statement
- This runs the specified statement at the specified place
- Example:
    final int result;
    async (here.next()) { result = a + b; }
- This would add two numbers at the adjacent place and store the result (since result is final, it can be accessed by other places)
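
The async example above can be approximated in plain Java. The following is only a sketch, not X10 and not how the X10 runtime works: it assumes each place can be modeled as a single-threaded executor, and the names PlaceSketch, next, async and addAtNextPlace are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical Java analogue of X10 places; none of these names come from X10.
final class PlaceSketch {
    // Assumption: each "place" is modeled as its own single-threaded executor.
    private final ExecutorService[] places;

    PlaceSketch(int numPlaces) {
        places = new ExecutorService[numPlaces];
        for (int i = 0; i < numPlaces; i++) {
            places[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Analogue of here.next(): the place after p, wrapping around.
    int next(int p) {
        return (p + 1) % places.length;
    }

    // Analogue of async (place) statement: run the statement at that place.
    Future<?> async(int place, Runnable statement) {
        return places[place].submit(statement);
    }

    void shutdown() {
        for (ExecutorService e : places) {
            e.shutdown();
        }
    }

    // Adds a and b at the place after "here" and returns the sum.
    static int addAtNextPlace(int a, int b) {
        PlaceSketch rt = new PlaceSketch(4);
        int[] result = new int[1]; // holder: Java lambdas cannot assign captured locals
        try {
            int here = 0;
            // get() waits for the remote statement before result is read
            rt.async(rt.next(here), () -> result[0] = a + b).get();
            return result[0];
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            rt.shutdown();
        }
    }
}
```

Unlike the slide's one-liner, the Java version has to wait explicitly (get()) before reading the result; in X10 that waiting would be expressed with finish or a future.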

Type System
- X10 is strongly typed
- Unified type system
  - Everything is an object; there are no primitive types
  - The library supplies boolean, byte, short, char, int, long, float, double, complex, and String classes
- Borrows Java's single inheritance combined with interfaces

Reference vs. Value Types
- Two types of objects
  - Value types are immutable and can be freely copied
  - Reference types can contain mutable fields but cannot be migrated
- Value classes are declared with the value keyword instead of class
- Value classes can still contain fields that are of reference types
  - Allows them to refer to mutable data
  - Copying "bottoms out" on reference fields

Type System (cont.)
- Objects are either scalar or aggregate
  - Both value and reference types can be either scalar or aggregate
- Types consist of two parts
  - Data type: the set of values it can take
  - Place type: the place at which it resides
- No generics (yet)

Variables
- Variables must be initialized (they can never be observed without a value)
- final variables cannot be changed after initialization
  - Declared by using the final keyword and/or using a variable name that starts with a capital letter

Nullable Types
- Designers view the ability to hold a null value as orthogonal to value vs. reference type
- Either reference or value types can be preceded by nullable
  - Adds a null value to the type
  - Multiple nullables are collapsed (i.e. nullable nullable T = nullable T)
- Can cast between T and nullable T
  - (nullable T) v always succeeds
  - (T) null throws an exception if T is not nullable

Rooted exceptions
- What should happen when a thread/activity terminates abnormally?
- In Java it's unclear, since the spawning thread may have already terminated
- X10 uses a rooted exception model: all uncaught exceptions get passed to the calling activity
- A new blocking command, finish s, is introduced. It waits for all activities spawned in s to terminate before proceeding

Exceptions (cont.)
- finish allows exceptions to travel back towards the root activity and possibly be caught and handled along the way
- Example:
    try {
        finish async (here.next()) {
            throw new Exception();
        }
    } catch (Exception e) {}
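
To make the rooted model concrete, here is a rough Java analogue of finish. It is a sketch under stated assumptions: FinishSketch and its methods are hypothetical names, a thread pool stands in for spawned activities, and only the first child failure is rethrown, which simplifies whatever X10 itself does with multiple failures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical Java analogue of X10's finish; the names are illustrative only.
final class FinishSketch {
    // Runs every task, waits for all of them to terminate, then rethrows the
    // first failure in the caller, so the exception surfaces at the spawner.
    static void finish(List<Runnable> tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<?>> spawned = new ArrayList<>();
        for (Runnable t : tasks) {
            spawned.add(pool.submit(t));
        }
        RuntimeException failure = null;
        for (Future<?> f : spawned) {
            try {
                f.get(); // block until this child has terminated
            } catch (ExecutionException e) {
                if (failure == null) {
                    failure = new RuntimeException(e.getCause());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                if (failure == null) {
                    failure = new RuntimeException(e);
                }
            }
        }
        pool.shutdown();
        if (failure != null) {
            throw failure;
        }
    }

    // Mirrors the slide's example: the child throws, the parent catches.
    static boolean parentCatchesChildFailure() {
        List<Runnable> tasks = new ArrayList<>();
        tasks.add(() -> { throw new IllegalStateException("child failed"); });
        try {
            finish(tasks);
            return false;
        } catch (RuntimeException e) {
            return e.getCause() instanceof IllegalStateException;
        }
    }
}
```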

Arrays
- X10 features an array sub-language similar to ZPL
- Arrays have:
  - Regions
  - Distributions
- Arrays are operated on by:
  - for
  - foreach
  - ateach
  - And more!

Even more arrays
- Arrays may be value (immutable) or reference (mutable)
- The keyword unsafe allows arrays that will play nicely with Java code
- Arrays can run code as an initialization step

Arrays: Regions
- As in ZPL, a region is a set of index points
- Regions and distributions are first-class constructs
- Regions can be specified like this:
    [0:128, 0:256]
  This creates a two-dimensional region of roughly 128x256 points (the bounds are inclusive, so exactly 129 x 257)

Regions (cont.)
- Regions can be combined with operations such as union (||), intersection (&&), and set difference (-)
- Predefined region types can be constructed using factories:
    region R2 = region.factory.upperTriangular(25);
- In the future users may be able to define their own regions
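
One way to picture regions and their set operations is as plain sets of index points. The Java sketch below is illustrative only: RegionSketch and its methods are hypothetical, points are modeled as two-element lists, and inclusive bounds are assumed.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of 2-D regions as sets of (i, j) index points.
final class RegionSketch {
    // Rectangular region [rowLo:rowHi, colLo:colHi]; bounds assumed inclusive.
    static Set<List<Integer>> rect(int rowLo, int rowHi, int colLo, int colHi) {
        Set<List<Integer>> points = new HashSet<>();
        for (int i = rowLo; i <= rowHi; i++) {
            for (int j = colLo; j <= colHi; j++) {
                points.add(List.of(i, j));
            }
        }
        return points;
    }

    // Analogue of R1 || R2.
    static Set<List<Integer>> union(Set<List<Integer>> a, Set<List<Integer>> b) {
        Set<List<Integer>> out = new HashSet<>(a);
        out.addAll(b);
        return out;
    }

    // Analogue of R1 && R2.
    static Set<List<Integer>> intersection(Set<List<Integer>> a, Set<List<Integer>> b) {
        Set<List<Integer>> out = new HashSet<>(a);
        out.retainAll(b);
        return out;
    }

    // Analogue of R1 - R2.
    static Set<List<Integer>> difference(Set<List<Integer>> a, Set<List<Integer>> b) {
        Set<List<Integer>> out = new HashSet<>(a);
        out.removeAll(b);
        return out;
    }
}
```

A real implementation would presumably store rectangular regions by their bounds rather than enumerating points; the explicit set is used here only to keep the semantics obvious.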

Arrays: Distributions
- Every array has a distribution
- A distribution is a mapping of array elements to places
- Distributions are defined over a particular region
- Arrays are typed by their distribution

Distributions (cont.)
- Currently must use predefined distributions (unique, block, cyclic, etc.)
- Have set operations like regions
- Can be used as functions: for a point p and distribution d, d[p] is the place that p maps to (i.e. where the p'th element "lives")
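
Treating a distribution as a function from index to place can be sketched in a few lines of Java. This is an illustration for a one-dimensional region 0..n-1 only; DistributionSketch is a hypothetical name, and the exact chunking X10 uses for its block distribution is an assumption here.

```java
// Hypothetical 1-D distributions: each method answers "which place owns index i?"
final class DistributionSketch {
    // Cyclic: indices are dealt out round-robin, so i lives at i mod numPlaces.
    static int cyclic(int i, int numPlaces) {
        return i % numPlaces;
    }

    // Block: the n indices are cut into contiguous chunks of ceil(n / numPlaces),
    // one chunk per place (assumed chunking; the real rule may differ).
    static int block(int i, int n, int numPlaces) {
        int chunk = (n + numPlaces - 1) / numPlaces;
        return i / chunk;
    }
}
```

With 8 elements over 4 places, cyclic places indices 0 and 4 together at place 0, while block places indices 0 and 1 together at place 0.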

Subarrays
- Use various boolean operations on distributions to create subdistributions
- To get the portion of a block distribution that is located here:
    block([1:100]) && [1:100]->here
- a | D1 is the portion of array a corresponding to the subdistribution D1

Array construction
- Here is an example of array initialization:
    float [.] data = new[factory.cyclic([0:200, 50:250])]
        (point [i, j]) { return i + j; };
- [0:200, 50:250] specifies a roughly 200x200 region (201 x 201 points, since the bounds are inclusive)
- factory.cyclic(...) specifies a cyclic distribution over the region
- The code block initializes each element to the sum of its i, j coordinates
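
The initializer idea (run a function at every point of the region) can be imitated in Java. This sketch ignores distribution entirely and keeps everything in one local array; ArrayInitSketch and build are hypothetical names, and inclusive bounds are assumed.

```java
import java.util.function.IntBinaryOperator;

// Hypothetical analogue of an X10 array initializer over a 2-D region.
final class ArrayInitSketch {
    // Builds an array over [rowLo:rowHi, colLo:colHi] (assumed inclusive),
    // calling init(i, j) with the region coordinates of each point.
    static float[][] build(int rowLo, int rowHi, int colLo, int colHi,
                           IntBinaryOperator init) {
        float[][] data = new float[rowHi - rowLo + 1][colHi - colLo + 1];
        for (int i = rowLo; i <= rowHi; i++) {
            for (int j = colLo; j <= colHi; j++) {
                // Storage is zero-based, so shift by the lower bounds.
                data[i - rowLo][j - colLo] = init.applyAsInt(i, j);
            }
        }
        return data;
    }
}
```

Calling build(0, 200, 50, 250, (i, j) -> i + j) mirrors the slide's example, with the element at region point (0, 50) stored at data[0][0].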

Array iteration
- Once you have an array, what can you do with it?
- Array iterators: for, foreach, ateach
  - for: sequentially iterates over a supplied region. At each point it binds the point to a variable and executes the accompanying statement
  - foreach: as with for, but the operations are done in parallel; that is, it spawns a new activity for each point
  - ateach: takes a distribution instead of a region. Performs the operations in parallel, each at the place specified by the distribution

Iteration example
- Example:
    for (point p : A) {
        A[p] = A[p] * A[p];
    }
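
The difference between for and foreach can be illustrated in Java with a sequential loop versus a parallel stream. This is only an analogy: IterationSketch is a hypothetical name, and a parallel stream uses a shared worker pool rather than spawning one activity per point as the slides describe. ateach has no direct counterpart here because a single JVM has no notion of places.

```java
import java.util.stream.IntStream;

// Hypothetical Java analogues of X10's for and foreach over a 1-D array.
final class IterationSketch {
    // Like for: visit each index in order, one at a time.
    static int[] squareSequential(int[] a) {
        int[] out = a.clone();
        for (int p = 0; p < out.length; p++) {
            out[p] = out[p] * out[p];
        }
        return out;
    }

    // Like foreach: the per-index work may run in parallel. This is safe
    // because every index touches only its own element.
    static int[] squareParallel(int[] a) {
        int[] out = a.clone();
        IntStream.range(0, out.length).parallel()
                 .forEach(p -> out[p] = out[p] * out[p]);
        return out;
    }
}
```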

More array ops
- lift: takes a binary function and two arrays with the same distribution. Produces a new array formed by pointwise application of the function to the two arrays
- reduce: as in MPI, combines all elements with a binary function to produce a single value
- scan: creates a new array where the i'th element is the result of the reduction over the first i elements
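
For readers who have not met these operations, here is a sequential Java sketch of what each one computes. It says nothing about how X10 parallelizes them; ArrayOpsSketch is a hypothetical name, and the explicit identity argument is an assumption made to keep the code simple.

```java
import java.util.function.IntBinaryOperator;

// Hypothetical sequential versions of lift, reduce and scan on int arrays.
final class ArrayOpsSketch {
    // lift: apply f pointwise to two arrays of equal length.
    static int[] lift(IntBinaryOperator f, int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same shape");
        }
        int[] out = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = f.applyAsInt(a[i], b[i]);
        }
        return out;
    }

    // reduce: fold all elements into one value, starting from identity.
    static int reduce(IntBinaryOperator f, int identity, int[] a) {
        int acc = identity;
        for (int x : a) {
            acc = f.applyAsInt(acc, x);
        }
        return acc;
    }

    // scan: out[i] is the reduction of a[0..i] (an inclusive prefix reduction).
    static int[] scan(IntBinaryOperator f, int identity, int[] a) {
        int[] out = new int[a.length];
        int acc = identity;
        for (int i = 0; i < a.length; i++) {
            acc = f.applyAsInt(acc, a[i]);
            out[i] = acc;
        }
        return out;
    }
}
```

For example, with addition, reduce over {1, 2, 3, 4} gives 10 and scan gives {1, 3, 6, 10}.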

Atomic Blocks
- X10 allows you to define atomic blocks
- The contents of an atomic block are guaranteed to execute as a single atomic event
  - This holds only with respect to other activities in the same place
- While atomicity is guaranteed, how it is achieved is implementation-specific
- Syntax: atomic S
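
The closest everyday Java counterpart to atomic S is a synchronized block on a shared lock. The sketch below uses that as a stand-in; it is not a statement about how X10 implements atomicity, and AtomicSketch and its methods are hypothetical names.

```java
// Hypothetical Java stand-in for an X10 atomic block protecting a counter.
final class AtomicSketch {
    private final Object lock = new Object();
    private int count = 0;

    // Roughly: atomic { count++; }
    void increment() {
        synchronized (lock) {
            count++;
        }
    }

    // Several threads increment the same counter; the lock keeps every
    // read-modify-write indivisible, so no updates are lost.
    static int countWithThreads(int threads, int perThread) {
        AtomicSketch counter = new AtomicSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // join also makes the final count visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.count;
    }
}
```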

Conditional Atomic Blocks
- X10 also provides: when (cond) S
- This blocks until cond is true and then executes S atomically
- This allows the creation of a number of synchronization mechanisms
- Dangerous! If cond never becomes true, or if there is a cyclic wait, deadlock occurs
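
As an example of a synchronization mechanism built this way, here is a one-slot buffer in Java in which each method corresponds to a when block. It is a sketch using Java's wait/notifyAll as a stand-in for when; WhenSketch and its methods are hypothetical names.

```java
// Hypothetical one-slot buffer; each method imitates an X10 when block.
final class WhenSketch {
    private final Object lock = new Object();
    private Integer slot = null;

    // Roughly: when (slot == null) { slot = v; }
    void put(int v) throws InterruptedException {
        synchronized (lock) {
            while (slot != null) {
                lock.wait(); // block until the condition holds
            }
            slot = v;
            lock.notifyAll();
        }
    }

    // Roughly: when (slot != null) { v = slot; slot = null; }
    int take() throws InterruptedException {
        synchronized (lock) {
            while (slot == null) {
                lock.wait();
            }
            int v = slot;
            slot = null;
            lock.notifyAll();
            return v;
        }
    }

    // A producer puts 1 then 2; the caller takes both and packs them as 12.
    static int demo() {
        WhenSketch buf = new WhenSketch();
        Thread producer = new Thread(() -> {
            try {
                buf.put(1);
                buf.put(2); // blocks until the first value has been taken
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            int first = buf.take();
            int second = buf.take();
            producer.join();
            return first * 10 + second;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The deadlock risk from the slide is visible here too: calling take() with no producer would wait forever.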

Future and Force
- As discussed before, futures allow the asynchronous computation of a value that may be used in the future
- A future expression returns an object of type Future<T>
- force is a blocking call that waits for a particular future to finish and yields its value

Futures (cont.)
- Can only access final variables
  - This prevents side effects
- Syntax: future (p) e
- Example:
    Future<float> blah = future (here.next()) { sqrt(a*a + b*b) };
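
Java's CompletableFuture gives a feel for the future/force pair. In this sketch, supplyAsync plays the role of future (p) e and join() plays the role of force; there is no notion of a place, and FutureSketch and hypotenuse are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical Java analogue of X10's future and force.
final class FutureSketch {
    static double hypotenuse(final double a, final double b) {
        // Like future (p) { sqrt(a*a + b*b) }: start the computation asynchronously.
        // Only the (effectively) final values a and b are captured.
        CompletableFuture<Double> f =
            CompletableFuture.supplyAsync(() -> Math.sqrt(a * a + b * b));
        // Like force: block until the value is available, then return it.
        return f.join();
    }
}
```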

Clocks
- Act as barriers
  - Much more flexible
  - Guarantee no deadlock
- Dynamically associated with different sets of activities

Clock Semantics
- Activities register with zero or more clocks
  - Can register/unregister at any time
- Clocks are always in some phase
  - Do not advance until all currently registered activities quiesce
- Activities quiesce with the next operation
  - Indicates they are ready for all their clocks to advance
  - Suspends until all clocks have advanced
  - This makes deadlock impossible
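
Java's java.util.concurrent.Phaser behaves much like a single clock: parties register, and arriveAndAwaitAdvance() resembles next. The sketch below uses it to check the phase guarantee described above. ClockSketch and runPhases are hypothetical names, and only one clock is modeled, whereas an X10 activity may be registered with several.

```java
import java.util.concurrent.Phaser;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical Java analogue of one X10 clock shared by several activities.
final class ClockSketch {
    // Each worker marks its arrival in phase p, then waits on the clock.
    // Returns how many times a worker got past the barrier before every
    // worker had arrived; a correct barrier yields 0.
    static int runPhases(int workers, int phases) {
        Phaser clock = new Phaser(workers); // register all workers up front
        AtomicInteger[] arrived = new AtomicInteger[phases];
        for (int p = 0; p < phases; p++) {
            arrived[p] = new AtomicInteger();
        }
        AtomicInteger violations = new AtomicInteger();
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            threads[w] = new Thread(() -> {
                for (int p = 0; p < phases; p++) {
                    arrived[p].incrementAndGet();
                    clock.arriveAndAwaitAdvance(); // like next: quiesce and wait
                    if (arrived[p].get() != workers) {
                        violations.incrementAndGet();
                    }
                }
                clock.arriveAndDeregister(); // unregister when done
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return violations.get();
    }
}
```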

Status
- IBM has supposedly built a single-VM reference implementation
- Language still under heavy revision
- GPL'ed X10-XTC compiler available
  - Doesn't conform to the current language spec
  - Uses what will possibly be version 0.5
  - Speculatively contains support for operator overloading and generics
  - Currently very poor performance

Conclusion
- So is X10 the answer to all our parallel programming woes?
- In my opinion, probably not
  - Parallelism is still very explicit. There are still opportunities for deadlock, race conditions, etc.
  - Takes an "…and the kitchen sink" approach, which makes learning the syntax a chore
  - It's not FORTRAN. Will people bother to use it?