MOCCA - H2O-based CCA component framework for programming grids and metacomputing systems
Institute of Computer Science AGH
MOCCA - H2O-based CCA component framework for programming grids
and metacomputing systems
Maciej Malawski, Marian Bubak, Michał Placek, Daniel Harężlak, Dawid Kurzyniec, Vaidy Sunderam
Institute of Computer Science AGH, Kraków, Poland
Academic Computer Centre CYFRONET-AGH, Kraków, Poland
Distributed Computing Laboratory in the Dept. of Math and Computer Science, Emory University, Atlanta
Outline
• Motivation – programming grids
• Component approach and CCA as a component standard
• H2O resource sharing platform
• Design and implementation of MOCCA
• Applications and initial results
• Future research – GCM interoperability
Problem – how to program the Grid
• Many programming models:
– MPI
– Service Oriented Architectures, Web Services
– Tuple spaces, HLA
– Distributed Objects
– Custom protocols
– Components
– ...
Common Component Architecture
• Component standard for High Performance Computing
• Uses and provides ports described in SIDL
• Support for scientific data types (complex numbers, data arrays)
• Existing tightly coupled (CCAFFEINE) and loosely coupled, distributed (XCAT) frameworks
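The uses/provides port pattern can be illustrated with a small in-process sketch. It is not the CCA API: the method names setServices, addProvidesPort, registerUsesPort, and getPort are borrowed from the CCA specification with simplified signatures, while ToyServices, the connect helper, the two components, and the port name "greet" are hypothetical.

```python
class ToyServices:
    """Hypothetical stand-in for the services object a framework hands to one component."""

    def __init__(self):
        self.provided = {}  # provides-port name -> object implementing the port
        self.used = {}      # uses-port name -> connected provider, or None

    def addProvidesPort(self, port, name):
        self.provided[name] = port

    def registerUsesPort(self, name):
        self.used[name] = None

    def getPort(self, name):
        port = self.used.get(name)
        if port is None:
            raise LookupError("uses port %r is not connected" % name)
        return port


class Greeter:
    """Provider component: exposes itself as the 'greet' provides port."""

    def setServices(self, services):
        services.addProvidesPort(self, "greet")

    def hello(self, who):
        return "Hello, " + who


class Caller:
    """User component: declares a 'greet' uses port and calls through it."""

    def setServices(self, services):
        self.services = services
        services.registerUsesPort("greet")

    def run(self):
        return self.services.getPort("greet").hello("grid")


def connect(user_services, uses_name, provider_services, provides_name):
    """Framework-side wiring of a uses port to a provides port."""
    user_services.used[uses_name] = provider_services.provided[provides_name]


provider_services, user_services = ToyServices(), ToyServices()
greeter, caller = Greeter(), Caller()
greeter.setServices(provider_services)
caller.setServices(user_services)
connect(user_services, "greet", provider_services, "greet")
print(caller.run())
```

The components never reference each other directly; only the framework-side connect call knows which provider satisfies which uses port.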
Existing CCA Frameworks
• CCAFFEINE
– Tightly coupled
– Support for Babel
– MPI support
• XCAT
– Loosely coupled
– Globus-compatible
– Java-based
• DCA
– MPI based
– MxN problems
• SCIRun2
– Metacomponent model
• LegionCCA
– Based on Legion Metacomputing system
Requirements for a Metacomputing-oriented CCA Framework
• Facilitated deployment - provide easy mechanisms for creating components on distributed shared resources;
• Efficient communication - for both distributed and local components;
• Flexibility - allow flexible configuration of components and support various application scenarios;
• Native component support - i.e. components written in non-Java programming languages and compiled for a specific architecture;
• Interoperability with Grid standards (Web services).
Solution: H2O
H2O Resource sharing platform
• Providers own resources
• They independently share them over the network
– access control policies
• Clients discover, locate, and utilize resources
• Resources configurable via plugins
• Aggregation and reselling: cascading pairwise relationships
[Figure: providers and clients connected through the network]
H2O Component Model
• Nomenclature
– container = kernel
– component = pluglet
• Pluglet = remotely accessible object
– implements Pluglet interface, used by kernel to signal/trigger pluglet state changes
• Remote access: based on the RMI model
– Pluglets export functional remote interfaces
[Figure: clients call the functional interfaces (e.g. Hello) exported by pluglets running in a kernel]
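tutorial/step1/srv/Hello.java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Functional remote interface exported by the pluglet
public interface Hello extends Remote {
    String hello() throws RemoteException;
}

// Lifecycle interface used by the kernel to signal pluglet state changes
interface Pluglet {
    void init(RuntimeContext cxt);
    void start();
    void stop();
    void destroy();
}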
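How a kernel might drive the four Pluglet callbacks can be sketched as follows. This is a toy model, not H2O code: ToyKernel, ToyPluglet, deploy, and undeploy are hypothetical names, and the ordering init, start, stop, destroy is an assumption read off the method names; the real kernel may allow other transitions.

```python
class ToyPluglet:
    """Records the lifecycle callbacks it receives (hypothetical pluglet)."""

    def __init__(self):
        self.events = []

    def init(self, context):
        self.events.append("init")

    def start(self):
        self.events.append("start")

    def stop(self):
        self.events.append("stop")

    def destroy(self):
        self.events.append("destroy")


class ToyKernel:
    """Container that owns pluglets and signals their state changes."""

    def __init__(self):
        self.pluglets = {}

    def deploy(self, name, pluglet_class):
        pluglet = pluglet_class()
        pluglet.init({"name": name})  # assumed stand-in for RuntimeContext
        pluglet.start()
        self.pluglets[name] = pluglet
        return pluglet

    def undeploy(self, name):
        pluglet = self.pluglets.pop(name)
        pluglet.stop()
        pluglet.destroy()


kernel = ToyKernel()
hello = kernel.deploy("hello", ToyPluglet)
kernel.undeploy("hello")
print(hello.events)
```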
Example scenarios of H2O
1. Provider = deployer
e.g. resource = legacy application
2. Reseller = developer = deployer
e.g. computational service offered within a grid system
3. Client = deployer
e.g. client runs custom distributed application on shared resources
[Figure: one deployment diagram per scenario, showing a provider, a client, and, where applicable, a reseller or developer with a repository]
Registration and discovery (publish/find): e-mail, phone, JNDI, UDDI, LDAP, DNS, GIS, ...
RMIX Communication Substrate
• Extensible framework
• Remote Method Invocations paradigm
• Pluggable protocol providers
• Multiple protocols supported
– JRMPX, ONC-RPC, SOAP
• Request-Response and Asynchronous calls
• Combines simplicity, flexibility, and performance
[Figure: a Java service exported through RMIX via RMIX XSOAP (Web Services, SOAP clients), RMIX RPCX (ONC-RPC), RMIX JXTA (JXTA), and RMIX JRMPX (Java)]
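The idea of pluggable protocol providers behind one invocation interface, with both request-response and asynchronous calls, can be sketched in-process. Nothing here is RMIX code: ToyEndpoint, the two providers, and the labels "binary" and "text" are hypothetical, and Python's pickle and json modules merely stand in for real wire protocols.

```python
import json
import pickle
from concurrent.futures import ThreadPoolExecutor


class PickleProvider:
    """Hypothetical binary protocol provider."""
    name = "binary"

    def encode(self, obj):
        return pickle.dumps(obj)

    def decode(self, data):
        return pickle.loads(data)


class JsonProvider:
    """Hypothetical text protocol provider."""
    name = "text"

    def encode(self, obj):
        return json.dumps(obj).encode("utf-8")

    def decode(self, data):
        return json.loads(data.decode("utf-8"))


# Registry of available providers; adding a protocol means adding an entry.
PROVIDERS = {p.name: p for p in (PickleProvider(), JsonProvider())}


class ToyEndpoint:
    """Invokes methods on a target object through the selected provider."""

    def __init__(self, target, protocol):
        self.target = target
        self.provider = PROVIDERS[protocol]
        self.pool = ThreadPoolExecutor(max_workers=2)

    def invoke(self, method, *args):
        # Request-response: encode the request, decode it on the "server"
        # side, dispatch, then encode and decode the reply.
        request = self.provider.encode([method, list(args)])
        name, params = self.provider.decode(request)
        reply = self.provider.encode(getattr(self.target, name)(*params))
        return self.provider.decode(reply)

    def invoke_async(self, method, *args):
        # Asynchronous call: returns a Future immediately.
        return self.pool.submit(self.invoke, method, *args)


class HelloService:
    def hello(self, who):
        return "Hello, " + who


for protocol in ("binary", "text"):
    endpoint = ToyEndpoint(HelloService(), protocol)
    print(protocol, endpoint.invoke("hello", "RMIX"))
    print(protocol, endpoint.invoke_async("hello", "async").result())
```

The caller's code is identical for both protocols; only the provider chosen at construction time changes.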
RMIX: multiple protocols
• Protocol switching
• Protocol negotiation
• Various protocol stacks for different situations
– SOAP: interoperability
– SSL: security
– ARPC, custom (Myrinet, Quadrics): efficiency
[Figure: H2O kernels and a Harness kernel linked across the Internet and a firewall; links are labeled by the goal of the chosen protocol (security, efficiency)]
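Protocol negotiation can be pictured as choosing the first protocol in the caller's preference order that the other side also offers. The slides do not describe RMIX's actual negotiation algorithm, so the function below is only an assumed, minimal policy, and the protocol labels are hypothetical.

```python
def negotiate(client_preferences, server_offers):
    """Return the client's most preferred protocol that the server also offers."""
    offered = set(server_offers)
    for protocol in client_preferences:
        if protocol in offered:
            return protocol
    raise ValueError("no common protocol")


# A client on a fast local network prefers an efficient custom protocol,
# but a server reachable only across the Internet offers SSL and SOAP.
print(negotiate(["custom-myrinet", "ssl", "soap"], ["soap", "ssl"]))  # prints "ssl"
```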
RMIX over JXTA
• Fully operational RMI implementation running over a JXTA P2P network
• Methods can be invoked on remote objects located behind firewalls or NATs
• Our implementation of JXTA socket factories manages all the JXTA connectivity transparently from the user’s point of view
MOCCA Implementation in H2O
[Figure: MoccaMainBuilder invokes and manages, through the BuilderService, Builder pluglets in H2O kernels; each kernel hosts Component pluglets that wrap CCA components]
• Each component runs in a separate pluglet
– Facilitated deployment and security
• Thanks to H2O kernel security mechanisms, multiple components may run without interfering with each other
• Using RMIX for communication – efficiency, multiprotocol interoperability
• Flexibility and multiple scenarios – as in H2O
• MOCCA_Light: pure Java implementation - multilanguage components still need to be supported
Remote Port Call
[Figure: remote port call between two component pluglets]
User side: Component Pluglet with CCA Component, MOCCA Services, and Dynamic Port Proxy
Provider side: Component Pluglet with CCA Component
1. getPort
2. create
3. call
4. RMIX call
5. call
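One plausible reading of the five numbered steps is sketched below: the user component asks its services object for a port (1), the services object creates a dynamic proxy (2), the component calls the proxy (3), the proxy forwards the call over the transport (4), and the provider-side pluglet calls the real component (5). The class names mirror the figure labels, but the code is a hypothetical in-process model in which a plain function call replaces the RMIX call.

```python
class DynamicPortProxy:
    """Forwards any method call on the port to the remote side."""

    def __init__(self, transport, port_name):
        self._transport = transport
        self._port_name = port_name

    def __getattr__(self, method):
        # Reached only for names not found normally, i.e. the port's methods.
        def remote_call(*args):
            return self._transport(self._port_name, method, args)  # step 4
        return remote_call


class ProviderPluglet:
    """Provider side: dispatches incoming calls to the local component."""

    def __init__(self, ports):
        self.ports = ports

    def dispatch(self, port_name, method, args):
        return getattr(self.ports[port_name], method)(*args)  # step 5


class ToyMoccaServices:
    """User side: hands out proxies for connected uses ports."""

    def __init__(self, connections):
        # uses-port name -> (transport callable, remote provides-port name)
        self.connections = connections

    def getPort(self, name):  # step 1
        transport, remote_name = self.connections[name]
        return DynamicPortProxy(transport, remote_name)  # step 2


class PingPong:
    def ping(self, values):
        return list(values)


provider = ProviderPluglet({"PingPongProvidesPort": PingPong()})
services = ToyMoccaServices(
    {"PingPongUsesPort": (provider.dispatch, "PingPongProvidesPort")})
port = services.getPort("PingPongUsesPort")
print(port.ping([1.0, 2.0]))  # step 3
```

Because the proxy resolves method names at call time, no per-interface stub has to be generated for a new port type.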
How to use MOCCA (step by step)
• Implement component code extending CCA interfaces (cca.Port, cca.Component)
• Compile component classes into a JAR file
• Publish application JARs on an HTTP server
• Use the Java client API or write a Jython script to assemble the application from components
– Specify components and their connections
– Specify locations of the H2O kernels on which to instantiate components
• Running the script automatically deploys the necessary pluglets into H2O kernels and spawns the application
Example script
# Jython script. The MOCCA classes (MoccaMainBuilder, MoccaTypeMap,
# MoccaBuilderClient) must also be imported; their packages are not shown on the slide.
from java.net import URI

builder = MoccaMainBuilder()

# One builder pluglet per H2O kernel that will host components
uriKernel1 = URI.create("http://emily.mathcs.emory.edu:7800/")
uriKernel2 = URI.create("http://zeus10.cyf-kr.edu.pl:7800/")
userBuilderID = builder.addNewBuilder(uriKernel1, "MyBuilderPlugletA")
providerBuilderID = builder.addNewBuilder(uriKernel2, "MyBuilderPlugletB")

# Create the user (starter) component through the first builder
properties = MoccaTypeMap()
properties.putString("mocca.plugletclasspath",
    "http://emily.mathcs.emory.edu/mocca/mocca-samples.jar")
properties.putString("mocca.builderID", userBuilderID.getSerialization())
userID = builder.createInstance("My StarterComponent",
    "mocca.samples.pingpong.impl.MoccaStarterComponent", properties)

# Create the provider (ping-pong) component through the second builder
properties.putString("mocca.plugletclasspath",
    "http://emily.mathcs.emory.edu/mocca/mocca-samples.jar")
properties.putString("mocca.builderID", providerBuilderID.getSerialization())
providerID = builder.createInstance("MyPingComponent",
    "mocca.samples.pingpong.impl.PingPongComponent", properties)

# Connect the uses port to the provides port and start the application
connectionID = builder.connect(userID, "PingPongUsesPort", providerID, "PingPongProvidesPort")
MoccaBuilderClient.invokeGo(userID)
Automatic Flow Composer Example
[Figure: components FlowComposer, FlowOptimizer, ComponentRegistry, LinkEvaluator, and SiteEvaluator; arrows labeled Lookup, Compose, and Evaluate]
• Compose application graph from initial data (e.g. initial ports) or incomplete graph
• First implemented for XCAT framework
• Easy migration to MOCCA
• Modification of code required (xcat.Port)
• Similar performance for XCAT and MOCCA (exchange of text documents)
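Composing an application graph from initial data can be sketched as a breadth-first search over a component registry: start from one component, and for every uses port look up a component that provides a matching port. The registry contents and component names below are hypothetical, and picking the first provider in alphabetical order is a placeholder for the ranking that the optimizer and evaluators perform in the real composer.

```python
from collections import deque

# Hypothetical registry: component name -> port types it uses and provides.
REGISTRY = {
    "Starter":   {"uses": ["Solver"], "provides": []},
    "Annealer":  {"uses": ["Potential", "Store"], "provides": ["Solver"]},
    "Potential": {"uses": [], "provides": ["Potential"]},
    "Storeroom": {"uses": [], "provides": ["Store"]},
}


def compose(registry, start):
    """Return (user, port_type, provider) connections reachable from start."""
    connections = []
    placed = {start}
    queue = deque([start])
    while queue:
        user = queue.popleft()
        for port_type in registry[user]["uses"]:
            providers = sorted(name for name, ports in registry.items()
                               if port_type in ports["provides"])
            if not providers:
                raise LookupError("no provider for port type %r" % port_type)
            provider = providers[0]  # placeholder for a real evaluation step
            connections.append((user, port_type, provider))
            if provider not in placed:
                placed.add(provider)
                queue.append(provider)
    return connections


for connection in compose(REGISTRY, "Starter"):
    print(connection)
```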
Communication Intensive Application Benchmark
• Simplified scenario:
– 2 components
– Provides port: receive and send back an array of doubles (ping-pong)
• Tested on local Gigabit Ethernet and on transatlantic Internet between Atlanta and Kraków
• 2.4 GHz Linux machines
• Comparison with XCAT
Small Data Packets
Factors:
• SOAP header overhead in XCAT
• Connection pools in RMIX
Large Data Packets
• Encoding (binary vs. base64)
• CPU saturation on Gigabit LAN (serialization)
• Variance caused by Java garbage collection
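The encoding factor can be quantified with a short calculation: base64 represents every 3 bytes of binary data as 4 text characters, so a base64-encoded array is one third larger than its raw binary form before any envelope overhead is added. The sketch below only measures payload size; the array length of 3000 is an arbitrary choice, and the slides do not say exactly how each framework encodes arrays.

```python
import base64
import struct


def encoded_sizes(values):
    """Return (binary_bytes, base64_bytes) for a list of doubles."""
    binary = struct.pack("<%dd" % len(values), *values)  # 8 bytes per double
    text = base64.b64encode(binary)                      # 4 chars per 3 bytes
    return len(binary), len(text)


binary_size, base64_size = encoded_sizes([float(i) for i in range(3000)])
print(binary_size, base64_size)  # prints "24000 32000"
```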
Example: modeling gold clusters
• Clusters of atoms
– Very interesting intermediate forms between isolated atoms or molecules and the solid state
– Important for the technology of constructing nanoscale devices
• Modeling of clusters
– Several energy minimization methods, such as MDSA or L-BFGS
– Choosing an empirical potential
– Highly compute-intensive
– The optimal result depends on the number of possible iterations and initial configurations for each simulation run
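Energy minimization of a small cluster by simulated annealing can be sketched as follows. Two substitutions are made for illustration and are not taken from the slides: the Lennard-Jones pair potential in reduced units stands in for the empirical gold potential, and a plain Metropolis simulated annealing loop stands in for MDSA. All parameter values are arbitrary.

```python
import math
import random


def lennard_jones(positions):
    """Total pair energy in reduced units (epsilon = sigma = 1)."""
    energy = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            inv6 = 1.0 / r2 ** 3
            energy += 4.0 * (inv6 * inv6 - inv6)
    return energy


def anneal(positions, steps=2000, t_start=1.0, t_end=0.01, step_size=0.1, seed=0):
    """Return (best_positions, best_energy) found by Metropolis annealing."""
    rng = random.Random(seed)
    current = [list(p) for p in positions]
    e_current = lennard_jones(current)
    best, e_best = [list(p) for p in current], e_current
    for k in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (k / (steps - 1))
        i = rng.randrange(len(current))
        old = current[i]
        current[i] = [x + rng.uniform(-step_size, step_size) for x in old]
        e_new = lennard_jones(current)
        accept = e_new <= e_current or \
            rng.random() < math.exp((e_current - e_new) / temperature)
        if accept:
            e_current = e_new
            if e_new < e_best:
                best, e_best = [list(p) for p in current], e_new
        else:
            current[i] = old  # reject the move
    return best, e_best


start = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.5, 0.0], [0.0, 0.0, 1.5]]
_, energy = anneal(start)
print(lennard_jones(start), energy)
```

The result depends on the seed, the number of steps, and the starting configuration, which is why many independent runs from different initial configurations are worth distributing over many machines.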
Example – deployment
[Figure: deployment of the simulation's components on H2O kernels. Labels include Starter, User Input, Configuration Generator, Generator Control, Simulated Annealing, Annealing Control, Local Minimization, Gather, Molecule, Storeroom, and Output generator; the legend distinguishes components from H2O kernels]
Integration with existing Grid middleware
• A pool of computing resources may be created by submitting a number of H2O kernels on many Grid sites
• Application components may be deployed on the kernels belonging to the pool
[Figure: H2O kernels started on a standalone machine, a cluster, and Grid nodes (labels: SSH, PBS, LCG, Resource Broker) register with a name service (NS) via bind(); lookup() yields the user's virtual resource pool]
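The pool mechanism can be sketched as a name service in which each started kernel binds its address, after which the user looks the kernels up and assigns components to them. ToyNameService, the round-robin placement, the kernel URIs, and the component names are all hypothetical; the slides name only the bind() and lookup() operations.

```python
class ToyNameService:
    """Hypothetical registry of running kernels."""

    def __init__(self):
        self._entries = {}

    def bind(self, name, kernel_uri):
        self._entries[name] = kernel_uri

    def lookup(self, prefix=""):
        """Return the URIs of all kernels whose name starts with prefix."""
        return sorted(uri for name, uri in self._entries.items()
                      if name.startswith(prefix))


def place_round_robin(components, kernels):
    """Assign each component to a kernel, cycling through the pool."""
    return {c: kernels[i % len(kernels)] for i, c in enumerate(components)}


ns = ToyNameService()
# Each kernel binds itself once its job starts on a site.
ns.bind("pool/standalone", "http://kernel-a.example:7800/")
ns.bind("pool/cluster-1", "http://kernel-b.example:7800/")
ns.bind("other/test", "http://kernel-c.example:7800/")

pool = ns.lookup("pool/")
placement = place_round_robin(["Starter", "Annealer", "Storeroom"], pool)
print(placement)
```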
Multilanguage support: motivation
• Grids are heterogeneous
• Multiple programming languages – in single application
– Java for middleware
– C for system programming
– FORTRAN for computing
– Python for scripting
• Multiple protocols – in single application
– High speed local networks (Myrinet)
– TCP/SSL/TLS in WAN
– SOAP for loosely coupled message exchange
– Overlay P2P networks for traversing private network boundaries (NATs)
• Context: MOCCA component framework
Multilanguage Solution - Babel
• SIDL – Scientific Interface Definition Language
– Standard for CCA Components
– Supports arrays and complex types
– Focus on interfaces
• Babel: – SIDL parser
– Code generator
– Runtime library
• Intermediate Object Representation (IOR)
– Core of Babel object
– Array of function pointers
– Generated code in C
[Figure: Babel in the middle, connecting C, C++, f77, f90, Python, and Java on either side]
SIDL definition:
package example version 1.2 {
  class Hello {
    string hello(in string hello);
  }
}

Generated Java implementation skeleton:
// user defined non-static methods:
/**
 * Method: hello[]
 */
public java.lang.String hello_Impl (
  /*in*/ java.lang.String hello )
{
  // DO-NOT-DELETE splicer.begin(example.Hello.hello)
  // Insert-Code-Here {example.Hello.hello} (hello)
  return "Server says: " + hello;
  // DO-NOT-DELETE splicer.end(example.Hello.hello)
}

Generated C header:
/**
 * Method: hello[]
 */
char*
example_Hello_hello(
  /*in*/ example_Hello self,
  /*in*/ const char* hello);
Currently: Babel for Local Applications
• All Babel objects in one process
• Implemented in CCAFFEINE framework
• Existing multilanguage CCA components – see CCA tutorial
[Figure: a Java application calls Fortran and C++ native libraries through SIDL interfaces and the Babel IOR]
Our Solution
• Babel + RMIX
• Implementation of Babel RMI extensions
– generic mechanism of method invocation (reflection)
– Dynamic loading of communication library
– No need for code generation and compilation
[Figure: a Java application and Fortran and C++ native libraries, each behind SIDL and the Babel IOR, connected across the network through RMIX libraries]
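The point of a generic, reflection-based invocation mechanism is that the receiving side resolves the target method by name at run time, so no per-interface stub has to be generated and compiled. The sketch below shows that idea with Python's getattr; it is not the Babel RMI extension itself, the invoke function is hypothetical, and the Hello class simply mirrors the SIDL example.

```python
def invoke(target, method_name, args):
    """Resolve method_name on target at run time and call it with args."""
    if method_name.startswith("_"):
        raise AttributeError("method %r is not remotely callable" % method_name)
    method = getattr(target, method_name, None)
    if not callable(method):
        raise AttributeError("no method %r on target" % method_name)
    return method(*args)


class Hello:
    def hello(self, hello):
        return "Server says: " + hello


# The method name and arguments would normally arrive in a network message.
print(invoke(Hello(), "hello", ["hi"]))  # prints "Server says: hi"
```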
Beyond CCA?
• Supporting multiple component standards
– Goal: to enable loading of components written for different standards (e.g. CORBA CCM, other)
– Examples of similar solutions: CCAFFEINE supports "classic" and "Babel" components; SCIRun2 implements a meta-component model
• Using MOCCA as a promising platform for feasibility studies in various aspects of Grid components
– For experiments with advanced features
• Scheduling and load-balancing
• Fault-tolerance
• Semantic description and composition
– As a platform for higher-level grid services and tools
Fractal/GCM – CCA Interoperability?
• CCA as Fractal
– Adapter calls setServices() on a component
– Component registers ports on the adapter
– Component is ready for introspection and connection
• Fractal as CCA
– CCA framework creates component, adapter and invokes setServices()
– Adapter introspects the component and registers interfaces to the framework
– Adapter obtains references to external interfaces (getPort()) and binds them to the component
[Figure: two adapters with Services, one wrapping a CCA component and one wrapping a Fractal component; interface labels C, CC, and BC]
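The "CCA as Fractal" direction can be sketched as an adapter that plays the role of the CCA services object toward the component and exposes a Fractal-style binding view to the outside. This is a hypothetical in-process model: the method names setServices, addProvidesPort, registerUsesPort, and getPort follow CCA with simplified signatures, listFc, lookupFc, and bindFc are borrowed from Fractal's BindingController, and the component and port names are invented.

```python
class CcaAsFractalAdapter:
    """Wraps a CCA-style component and exposes Fractal-style binding calls."""

    def __init__(self, component):
        self.server_interfaces = {}  # provides ports registered by the component
        self.client_interfaces = {}  # uses ports, bound later from outside
        component.setServices(self)  # step 1: adapter calls setServices()

    # CCA side, called by the component (step 2: it registers its ports)
    def addProvidesPort(self, port, name):
        self.server_interfaces[name] = port

    def registerUsesPort(self, name):
        self.client_interfaces[name] = None

    def getPort(self, name):
        return self.client_interfaces[name]

    # Fractal side (step 3: ready for introspection and connection)
    def listFc(self):
        return sorted(self.client_interfaces)

    def lookupFc(self, name):
        return self.client_interfaces[name]

    def bindFc(self, name, server_interface):
        if name not in self.client_interfaces:
            raise KeyError(name)
        self.client_interfaces[name] = server_interface


class Doubler:
    """Hypothetical CCA-style component with one provides and one uses port."""

    def setServices(self, services):
        self.services = services
        services.addProvidesPort(self, "double")
        services.registerUsesPort("source")

    def value(self):
        return 2 * self.services.getPort("source")()


adapter = CcaAsFractalAdapter(Doubler())
print(adapter.listFc())               # uses ports visible as client interfaces
adapter.bindFc("source", lambda: 21)  # connect from the Fractal side
print(adapter.server_interfaces["double"].value())
```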
References
• Maciej Malawski, Dawid Kurzyniec, and Vaidy Sunderam. MOCCA – towards a distributed CCA framework for metacomputing. Accepted for: 10th International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS2005) at IPDPS’2005. http://mathcs.emory.edu/dcl/h2o/papers/h2o_hips05.pdf
• H2O Project homepage: http://www.mathcs.emory.edu/dcl/h2o/
• CCA Forum: http://www.cca-forum.org
– CCA Specification
– Tutorial
• MOCCA homepage: http://www.icsr.agh.edu.pl/mambo/mocca
– Download binary and source distribution
– README