8/8/2019 Granger Parallel IPython
Hardware
Cheap, fast and widely available.
Our free lunch is over -> Single CPUs aren't getting much faster.
Transition to multi-CPU and multi-core CPU based machines.
Clusters and grids.
Software
Software development is labor intensive.
Development of parallel codes is very labor intensive.
Parallel programming tools and paradigms have not evolved much in the last 2 decades.
Complex algorithms
Lots of legacy code still used (BLAS, LAPACK, your own)
Need for high-performance
The code is always changing
Large amounts of data
Scientists love MATLAB, IDL, Mathematica
Collaborative development/execution
1. It is open source and accessible to everyone.
2. Can be used interactively (like MATLAB, Mathematica, IDL, etc.)
3. Simple, expressive syntax that is readable by human beings.
4. Powerful enough to use in large, complex applications.
5. Supports functional, object-oriented, generic and meta programming.
6. Extremely robust garbage collection.
7. Powerful built-in data-types and libraries.
8. Excellent tools for wrapping Fortran/C/C++/ObjC code (SWIG, F2PY, Pyrex, Boost, Weave, PyObjC).
9. High quality external libraries for visualization (MayaVi), plotting (matplotlib), numerical/scientific computing (NumPy/SciPy), networking (Twisted), etc.
10. Python bindings for major GUI toolkits (wx, Tk, GTK, Qt).
11. Cross platform.
IPython is an enhanced interactive Python shell
It is the de facto shell for scientific computing in Python.
Already comes with every major Linux distribution.
Capabilities:
Extensible syntax
GUI integration (wx, Qt, GTK, etc.)
Seamless system shell access
Object/namespace introspection
Command history/recall
Session logging
Embeddable
http://ipython.scipy.org
Pros:
Robust, optimized, standardized, portable, common
Existing parallel libraries (FFTW, BLACS, ScaLAPACK, ...)
Runs over Ethernet, InfiniBand, Myrinet.
Cons:
Trivial things are not trivial -> lots of boilerplate code.
Orthogonal to how scientists think and work.
Load balancing and fault tolerance are difficult to implement (even for simple cases).
Emphasis on compiled languages (C/C++/Fortran).
Non-interactive and non-collaborative.
Difficult to integrate into other computing environments (GUIs, visualization and plotting tools, Web based tools, etc.).
Labor intensive compile/execute/debug cycles.
Kernel = Network aware Python Instance
Python
- Objects
- Commands
Python instance that listens on a network port
Multi-threaded or multi-process with an execution queue
Uses Twisted -> asynchronous, non-blocking sockets
Multi-protocol aware
Custom control protocol
SSH, HTTP, ...
Can be started at any time using SSH, Xgrid, PBS, GridEngine, Condor, ...
Built-in GUI Integration (wx, Qt, Tk, GTK, Cocoa, ...)
Pass Python objects, commands, modules, I/O, ...
Auto-discovery using Bonjour/ZeroConf
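The execution-queue idea above can be sketched in a few lines. This is only an illustration under stated assumptions: `ToyKernel` and `submit` are hypothetical names, the network listener (Twisted in the slides) is left out entirely, and a single worker thread stands in for the kernel's execution engine.

```python
import queue
import threading


class ToyKernel:
    """Hypothetical stand-in for a kernel: one namespace plus a FIFO execution queue."""

    def __init__(self):
        self.namespace = {}           # objects persist here between commands
        self._queue = queue.Queue()   # pending commands, executed in arrival order
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, source):
        """Queue a command string; return an Event that is set once it has run."""
        done = threading.Event()
        self._queue.put((source, done))
        return done

    def _run(self):
        while True:
            source, done = self._queue.get()
            try:
                exec(source, self.namespace)
            except Exception as exc:  # keep the kernel alive after a bad command
                self.namespace["_last_error"] = repr(exc)
            finally:
                done.set()


kernel = ToyKernel()
kernel.submit("a = 10")
finished = kernel.submit("b = a * 2")   # runs after "a = 10" because the queue is FIFO
finished.wait(timeout=5)
print(kernel.namespace["b"])
```

Because commands are drained from a single queue, callers never need to coordinate ordering themselves; a real kernel would feed this queue from network connections rather than direct method calls.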
Lightweight object oriented user interface in regular Python
Additional syntax in IPython (enhanced Interactive Python)
Medium level of abstraction
Higher level than MPI
Doesn't assume a particular high-level model
Automatic synchronization of kernels (no barrier() calls)
Non-blocking and blocking modes
Clean handling of remote I/O
User's process can be transient; kernels are persistent
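The difference between blocking and non-blocking modes can be illustrated with the standard library alone. This is a sketch, not the system's own interface: a `ThreadPoolExecutor` stands in for a remote kernel, and `slow_square` is a hypothetical placeholder for remote work.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_square(x):
    """Hypothetical stand-in for work done on a remote kernel."""
    time.sleep(0.1)
    return x * x


executor = ThreadPoolExecutor(max_workers=2)

# Blocking mode: the call returns only when the result is ready.
blocking_result = executor.submit(slow_square, 3).result()

# Non-blocking mode: a handle comes back immediately, the result is collected later.
pending = executor.submit(slow_square, 4)
# ... the caller is free to do other work here ...
nonblocking_result = pending.result(timeout=5)

executor.shutdown()
print(blocking_result, nonblocking_result)
```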
Needed if system is used on an open network.
Start Kernels as user nobody
Firewall all but a few Gateway Kernels
Gateway Kernels can have SSL enabled for encrypted communications.
Authenticate users
Twisted has SSL/Authentication capabilities built-in.
Multiple users can connect simultaneously
Kernels started dynamically at any time
It is annoying to type ic.execute(...)
Use IPython's magic command system. Extended syntax!
%cmd args --> magic_cmd(args)
ic.block=True/False toggles I/O forwarding
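The `%cmd args --> magic_cmd(args)` rewriting can be sketched as a small dispatcher. This is not IPython's actual implementation; the magic name `%everywhere`, the `MAGICS` table, and `dispatch` are hypothetical and exist only to show the translation step.

```python
def magic_everywhere(args):
    """Hypothetical magic: pretend to run `args` on every kernel."""
    return f"execute on all kernels: {args}"


# Table mapping a magic name to its handler function (illustrative only).
MAGICS = {"everywhere": magic_everywhere}


def dispatch(line):
    """Rewrite '%cmd args' into a call to magic_cmd(args)."""
    if not line.startswith("%"):
        raise ValueError("not a magic command")
    name, _, args = line[1:].partition(" ")
    try:
        handler = MAGICS[name]
    except KeyError:
        raise ValueError(f"unknown magic: %{name}") from None
    return handler(args.strip())


result = dispatch("%everywhere a = 5")
print(result)
```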
push(): one way send to a kernel
pull(): one way recv from a kernel
Graceful error handling:
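A minimal in-process sketch of these three ideas follows. It assumes nothing about the real interface beyond the names `push`, `pull`, and `execute` used in the slides: the `ToyKernel` class, its signatures, and the shape of the returned status dictionary are all hypothetical.

```python
class ToyKernel:
    """Hypothetical in-process stand-in for a remote kernel."""

    def __init__(self):
        self.namespace = {}

    def push(self, **objects):
        """One-way send: copy objects into the kernel's namespace."""
        self.namespace.update(objects)

    def pull(self, name):
        """One-way receive: fetch an object from the kernel's namespace."""
        return self.namespace[name]

    def execute(self, source):
        """Run a command; report failures to the caller instead of raising."""
        try:
            exec(source, self.namespace)
            return {"ok": True, "error": None}
        except Exception as exc:
            return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


kernel = ToyKernel()
kernel.push(a=3, b=4)
kernel.execute("c = a * b")
value = kernel.pull("c")

# A failing command is reported, and the kernel stays usable afterwards.
failure = kernel.execute("d = 1 / 0")
print(value, failure["ok"], failure["error"])
```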
Again, it is annoying to type ic.push() and ic.pull()
Can also scatter lists/arrays
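Scattering can be pictured as splitting a sequence into one chunk per kernel. The sketch below assumes contiguous chunks whose sizes differ by at most one; the slides do not say which partitioning the system uses, and the function names `scatter` and `gather` are illustrative.

```python
def scatter(seq, n_kernels):
    """Split seq into n_kernels contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(seq), n_kernels)
    chunks, start = [], 0
    for i in range(n_kernels):
        size = base + (1 if i < extra else 0)  # the first `extra` chunks get one more item
        chunks.append(seq[start:start + size])
        start += size
    return chunks


def gather(chunks):
    """Concatenate the chunks back into a single list, preserving order."""
    return [item for chunk in chunks for item in chunk]


data = list(range(10))
chunks = scatter(data, 4)
print(chunks)
```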
Scatters the list/array to the kernels
Each kernel calls the function on the elements of the array
Results are gathered back to the local process
Took 13 lines of code to implement.
Parallel functions: instant trivial parallelization
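The scatter, apply, gather pattern behind a parallel function can be sketched with a decorator. This is not the 13-line implementation the slide refers to: the `parallel` decorator is hypothetical, and local threads stand in for remote kernels.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import wraps


def parallel(n_workers):
    """Turn f(x) into a function that maps f over a sequence using n_workers chunks."""
    def decorate(func):
        @wraps(func)
        def wrapper(seq):
            seq = list(seq)
            # Scatter: one strided slice per worker.
            chunks = [seq[i::n_workers] for i in range(n_workers)]
            with ThreadPoolExecutor(max_workers=n_workers) as pool:
                # Each worker applies func to every element of its chunk.
                partials = list(pool.map(lambda chunk: [func(x) for x in chunk], chunks))
            # Gather: undo the strided split so results keep the input order.
            results = [None] * len(seq)
            for i, part in enumerate(partials):
                results[i::n_workers] = part
            return results
        return wrapper
    return decorate


@parallel(n_workers=3)
def cube(x):
    return x ** 3


cubes = cube(range(7))
print(cubes)
```

The caller writes an ordinary single-element function and gets the parallel version by adding one line, which is the sense in which the parallelization is "instant".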
Distributed Memory Objects
Data parallel computations
Task Systems
Dynamically load balanced task system
Fault tolerant
Could allow tasks to be tightly coupled
Google's MapReduce
MapReduce is a high-level programming model for processing and generating large data sets on large clusters. Inspired by LISP's map and reduce.
Interactive implementation is possible.
GOAL: Make it easy to implement high level constructs
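As one example of a high-level construct built from simple pieces, here is a word-count sketch in the MapReduce style. It is an illustration only: threads stand in for kernels, and `map_words` and `map_reduce` are hypothetical names, not part of the system described in these slides.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_words(line):
    """Map step: emit (word, 1) pairs for one line of text."""
    return [(word.lower(), 1) for word in line.split()]


def map_reduce(lines, n_workers=2):
    """Run the map step in parallel, group by key, then reduce each group."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        mapped = list(pool.map(map_words, lines))
    # Shuffle: group the emitted values by key.
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    # Reduce: fold each group's values into a single count.
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}


counts = map_reduce(["the kernel runs", "the user connects", "The kernel persists"])
print(counts)
```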
In the middle of a parallel calculation, you can write a new Python module and load it into the running kernels
Can also reload() modified modules.
Can use to fix bugs during a calculation
Test new algorithms without restarting
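The write, load, fix, reload cycle can be tried in a single local interpreter. The sketch below is an assumption-laden illustration: it runs in one process rather than across kernels, the module name `hotfix_demo` is hypothetical, and it uses `importlib.reload`, which is where the `reload()` function lives in current Python.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True   # avoid stale cached bytecode in this small demo

workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)
module_path = Path(workdir) / "hotfix_demo.py"   # hypothetical module name

# Write a (buggy) module while the session is running, then import it.
module_path.write_text("def area(r):\n    return 3 * r * r\n")
importlib.invalidate_caches()
hotfix_demo = importlib.import_module("hotfix_demo")
before = hotfix_demo.area(1)

# Fix the bug on disk and reload without restarting the interpreter.
module_path.write_text("import math\n\ndef area(r):\n    return math.pi * r * r\n")
importlib.invalidate_caches()
hotfix_demo = importlib.reload(hotfix_demo)
after = hotfix_demo.area(1)
print(before, after)
```

Objects created before the reload keep referring to the old code, so in practice a fix takes effect only for calls that look the function up through the module again.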
Multiple users can connect to a cluster simultaneously.
Shared namespace and data, common execution queue
Basic chat facility
Separation of control and monitoring of kernels
Some users can monitor the kernels
Others can control them
Arbitrary configurations allowed
MPI is great at this, so let's use it
Not needed in many cases -> MPI is optional
Start kernels with mpiexec and call MPI_Init()
Could wrap other MPI-based libraries.
User can directly make calls to MPI through Python bindings.
A high level move() function:
Collaborative visualization/plotting/GUI control
Other network interfaces (web, ssh)
Notebook-like frontend (like Mathematica)
Integration into other cluster environments (PBS, Condor, GridEngine, Globus)
Scalability + Performance
Security
Full MPI integration
Other high-level parallel constructs
The system is open source (BSD) and is part of the IPython project:
http://ipython.scipy.org
IPython is the de facto shell for interactive scientific computing in Python and comes with every major Linux distribution.
The kernel will become the foundation of a new version of IPython.
The working prototype is publicly available on the IPython subversion repository:
svn co http://ipython.scipy.org/svn/ipython/ipython/branches/chainsaw ipython1
Python is a useful tool in scientific computation.
The future of parallel computing is interactive and collaborative.
Scientists want free, open source and extensible tools.
We don't have to give up the tools (Fortran/C/C++/MPI) we love.
Lots of work remains.