Complexity Computational Environment,
integrating data and simulation on the Grid:
Multiscale computing
JPL, June 18 2003
Geoffrey Fox, Marlon Pierce
Community Grids Lab
Indiana University
[email protected]
http://academia.web.cern.ch/academia/lectures/grid/
http://www.grid2002.org
Grid Backdrop from CT Project
• Grid Computational Environment (GCE) for SERVOGrid based on Web services (WS)
• Job submission, job management, simple security (to be addressed), file processing
• Support as WS key simulation and pattern recognition codes (DISLOC*, SIMPLEX*, VC, PARK, GEOFEST*, DAHMM, PDPC)
– *Current
• Support databases and visualization
• Simple workflow, notification, metadata services
• Initial Schema for GEM specific (meta-)data
• Portlet based interfaces
• Extend to ACES (Japan, Australia) for distributed computers, software, databases, clients
• Collaboration and other useful portlets
• Can inherit Globus support from Alliance Portal, NMI efforts
AIST Additions
• Compatibility with Grid Services
• Use of OGSA-DAI XML and SQL database standards
– Including extensions for streaming (sensor) data
– Including extensions for integration with simulations
• Optimization for parallel simulations (e.g. parallel IO) (?)
• Better workflow, notification, metadata services
– openGIS/GML compatibility (fault etc. Schema)
– Semantic Grid
• Autonomic (Robust Reliable Resilient) services (?)
• Support multi-scale simulations and data assimilation
• ServoPSE Problem Solving Environments (?)
– GeoLanguage (ServoML specializing CCEML) integrating workflow and multi-scale support
– Interactive portlet based front end with Matlab and/or Mathematica style interface
SERVOGrid Caricature
[Figure labels: Database (two); Closely Coupled Compute Nodes; Analysis and Visualization; Repositories, Federated Databases; Sensor Nets, Streaming Data; Loosely Coupled Filters]
Sources of Grid Technology?
• Grids support distributed collaboratories or virtual organizations that support People, Computers, Observational Data and results of thought and data processing
• The Web and Web Services
– Most important for Information Grids as these are naturally service-based
• Distributed Objects (CORBA, Java/Jini, COM)
– Distributed Object same as a Service
• Globus, Legion, Condor, NetSolve, Ninf and other High Performance Computing activities
– Compute/File Grids that need to be made into services (Globus GT3) and integrated with Information Grids for Geocomplexity
• Peer-to-peer Networks
Taxonomy of Grid Functionalities
| Name of Grid Type | Description of Grid Functionality |
| --- | --- |
| Compute/File Grid or Data File Grid | Run multiple jobs with distributed compute and data resources (Global “UNIX Shell”) |
| Desktop Grid, e.g. SETI@Home | “Internet Computing” and “Cycle Scavenging” with secure sandbox on large numbers of untrusted computers |
| Information Grid or Data Service Grid | Grid service access to distributed information, data and knowledge repositories |
| Complexity or Hybrid Grid | Hybrid combination of Information and Compute/File Grid emphasizing integration of experimental data, filters and simulations: Data assimilation |
| Campus Grid | Grid supporting University community computing |
| Enterprise Grid | Grid supporting a company’s enterprise infrastructure |
Approach
• Build on e-Science methodology and Grid technology
• Geocomplexity (and Biocomplexity) applications with multi-scale models, scalable parallelism, data assimilation as key issues
– Data-driven models for earthquakes
• Use existing code/database technology (SQL/Fortran/C++) linked to “Application Web/OGSA services”
– XML specification of models, computational steering, scale supported at “Web Service” level as don’t need “high performance” here
– Allows use of Semantic Grid technology
• AIST builds on CT
[Figure labels: Typical codes; Application WS; WS linking to user and other WS (data sources)]
SERVOGrid (Complexity) Computing Model
[Figure: Grid Data Assimilation. Labels: HPC Simulation; several Data Filter boxes; Analysis, Control, Visualize; Other Grid and Web Services; OGSA-DAI Grid Services; Grid]
Distributed filters massage data for simulation
This type of Grid integrates with parallel computing
Multiple HPC facilities but only use one at a time
Many simultaneous data sources and sinks
Data Assimilation
• Data assimilation implies one is solving some optimization problem which might have Kalman Filter like structure
• As discussed by DAO at Earth Science meeting, one will become more and more dominated by the data (Nobs much larger than number of simulation points).
• Natural approach is to form for each local (position, time) patch the “important” data combinations so that optimization doesn’t waste time on large error or insensitive data.
• Data reduction done in natural distributed fashion NOT on HPC machine as distributed computing most cost effective if calculations essentially independent
– Filter functions must be transmitted from HPC machine
$$\min_{\text{Theoretical Unknowns}} \; \sum_{i=1}^{N_{obs}} \frac{\left(\text{Data}_i(\text{position}, \text{time}) - \text{Simulated\_Value}\right)^2}{\text{Error}_i^2}$$
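As a rough illustration of the objective above, the sketch below minimises the error-weighted sum of squares for a deliberately tiny case: a single unknown amplitude multiplying a known basis value at each observation. The one-parameter model, the function names and the numbers are assumptions made for illustration; they are not taken from the slides.

```python
def chi_squared(amplitude, observations):
    """Sum over observations of ((data - simulated) / error) ** 2."""
    return sum(
        ((data - amplitude * basis) / error) ** 2
        for data, basis, error in observations
    )


def best_amplitude(observations):
    """Closed-form minimiser of chi_squared for the one-parameter linear model."""
    numerator = sum(data * basis / error**2 for data, basis, error in observations)
    denominator = sum(basis**2 / error**2 for _, basis, error in observations)
    return numerator / denominator


# Made-up observations: (data value, model basis value at that position/time, error)
observations = [
    (2.1, 1.0, 0.1),
    (3.9, 2.0, 0.1),
    (9.0, 3.0, 5.0),  # large error, so this point barely influences the fit
]
fit = best_amplitude(observations)
```

The third observation disagrees with the other two but carries a large error, so it hardly moves the fitted value. This is the sense in which large-error data add little to the optimization.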
Distributed Filtering
[Figure: an HPC Machine and Distributed Machines serving geographically distributed sensor patches. The HPC machine sends the needed filter and receives filtered data; the Data Filter for each local patch (1, 2, ...) reduces Nobs(local patch) values to Nfiltered(local patch) values.]
Nobs(local patch) >> Nfiltered(local patch) ≈ Number_of_Unknowns(local patch)
In simplest approach, filtered data gotten by linear transformations on original data based on Singular Value Decomposition of Least squares matrix
Factorize matrix to product of local patches
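One way to read the SVD-based filter is sketched below for a single patch with a linear model, using NumPy. The matrix `A`, the sizes and the noise level are made up, and the split into `make_filter` (run where the model lives) and a matrix product (run next to the data) is an assumption about how the idea could be arranged, not the authors' implementation.

```python
import numpy as np


def make_filter(A):
    """Build the filter for one patch from the least-squares matrix A.

    Returns F (n_unknowns x n_obs), which compresses the raw data, and
    R (n_unknowns x n_unknowns), the reduced system solved afterwards.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    F = U.T              # linear transformation applied to the raw observations
    R = s[:, None] * Vt  # equals diag(s) @ Vt
    return F, R


rng = np.random.default_rng(0)
n_obs, n_unknowns = 1000, 3                  # N_obs >> number of unknowns
A = rng.normal(size=(n_obs, n_unknowns))     # made-up linear model for one patch
true_unknowns = np.array([1.0, -2.0, 0.5])
data = A @ true_unknowns + 0.01 * rng.normal(size=n_obs)

F, R = make_filter(A)                        # filter is sent out to the patch
filtered = F @ data                          # only n_unknowns numbers come back
unknowns_from_filtered = np.linalg.solve(R, filtered)
unknowns_from_all_data, *_ = np.linalg.lstsq(A, data, rcond=None)
```

For this linear case the three filtered numbers give the same least-squares solution as the full thousand observations, which is why filtering at the data source loses nothing.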
Grid Politics
• There is a Global Grid Forum meeting 3 times per year with about 700 attendees per meeting
– Exchange information and define standards for “everything” not done in W3C and OASIS
– e.g. Grid Service, Security, What is a Job, Database, Computer, How to build portals ….
• There is a large project called Globus developing software largely for “compute/file” Grids
• There are some 50 Grid projects (mainly in Europe and USA) developing software and applications as well as installing infrastructure
– Some are “deployment”: EDG NMI VDT …..
• There are related initiatives called CyberInfrastructure (NSF USA) and e-Science (UK)
• There is a proposed OMII (Open Middleware Infrastructure Institute) – an international Alliance of separately funded projects with common coordination
OGSA OGSI & Hosting Environments
• Start with Web Services in a hosting environment
• Add OGSI to get a Grid service and a component model
• Add OGSA to get Interoperable Grid “correcting” differences in base platform and adding key functionalities
[Figure: layered stack. Labels: Network; Hosting Environment for WS; OGSI on Web Services; Broadly applicable services: registry, authorization, monitoring, data access, etc.; Models for resources & other entities; More specialized services: data replication, workflow, etc.; Domain-specific services; Other models. Legend: OGSA Environment; Possibly OGSA; Not OGSA; Given to us from on high]
OGSI Open Grid Service Interface
• http://www.gridforum.org/ogsi-wg
• It is a “component model” for web services.
• It defines a set of behavior patterns that each OGSI service must exhibit.
• Every “Grid Service” portType extends a common base type.
– Defines an introspection model for the service
– You can query it (in a standard way) to discover
  • What methods/messages a port understands
  • What other port types the service provides
  • If the service is “stateful”, what the current state is
• A set of standard portTypes for
– Message subscription and notification
– Service collections
• Each service is identified by a URI called the “Grid Service Handle” (GSH)
• GSHs are bound dynamically to Grid Service References (GSRs, typically WSDL documents)
– A GSR may be transient. GSHs are fixed.
– Handle map services translate GSHs into GSRs.
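The handle and reference indirection can be pictured with a toy lookup table. The sketch below is not the OGSI interface: the class names, fields, handle string and endpoints are all hypothetical, and time is passed in explicitly so the example stays deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridServiceReference:
    """Transient binding information for a service (hypothetical fields)."""
    endpoint: str
    expires_at: float  # time after which this reference is no longer valid


class HandleMap:
    """Toy handle map: translates fixed handles (GSHs) into current references (GSRs)."""

    def __init__(self):
        self._bindings = {}

    def bind(self, handle, endpoint, expires_at):
        # Rebinding the same handle replaces the old reference; the handle never changes.
        self._bindings[handle] = GridServiceReference(endpoint, expires_at)

    def resolve(self, handle, now):
        reference = self._bindings.get(handle)
        if reference is None or now >= reference.expires_at:
            raise LookupError(f"no valid reference for {handle}")
        return reference


handle = "gsh://example.org/services/hypothetical-filter"  # made-up handle
handle_map = HandleMap()
handle_map.bind(handle, "http://host-a.example.org/filter?wsdl", expires_at=100.0)
first = handle_map.resolve(handle, now=10.0)
handle_map.bind(handle, "http://host-b.example.org/filter?wsdl", expires_at=200.0)
second = handle_map.resolve(handle, now=150.0)
```

A client that holds only the handle keeps working when the service moves from one host to another, because it asks the map again whenever its reference has expired.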
OGSA-DAI (Malcolm Atkinson, Edinburgh)
UK e-Science Grid Core Programme
Development of Data Access and Integration Services for OGSA
http://umbriel.dcs.gla.ac.uk/NeSC/general/projects/OGSA_DAI
- Access to XML Databases -
- Access to Relational Databases -
- Distributed Query Processing (DB Federation) -
- XML Schema Support for e-Science -
DAI Key Services
| Service | Abbreviation | Role |
| --- | --- | --- |
| GridDataService | GDS | Access to data & DB operations |
| GridDataServiceFactory | GDSF | Makes GDS & GDSF |
| GridDataServiceRegistry | GDSR | Discovery of GDS(F) & Data |
| GridDataTranslationService | GDTS | Translates or Transforms Data |
| GridDataTransportDepot | GDTD | Data transport with persistence |
• Integrated Structured Data Transport
• Relational & XML models supported
• Role-based Authorisation
• Binary structured files (later)
[Figure: three Clients use one Grid Data Service, which sits in front of a Relational database, a Directory / File system and an XML database]
Interface transparency: one GDS supports multiple database types
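The interface-transparency idea can be sketched with one service object that hides two kinds of store behind the same call. This uses only the Python standard library (`sqlite3` and `xml.etree.ElementTree`) and is not the OGSA-DAI API; the class, the `perform` method, the table and the data are invented for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


class DataServiceSketch:
    """One entry point in front of several kinds of data store."""

    def __init__(self):
        self._backends = {}

    def register(self, name, run_query):
        self._backends[name] = run_query

    def perform(self, name, expression):
        # Every backend returns the same shape: a list of dictionaries.
        return self._backends[name](expression)


def relational_backend(connection):
    def run_query(sql):
        cursor = connection.execute(sql)
        columns = [column[0] for column in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    return run_query


def xml_backend(root):
    def run_query(path):
        return [dict(element.attrib) for element in root.findall(path)]
    return run_query


# Made-up data in two different stores.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE faults (name TEXT, length_km REAL)")
connection.execute("INSERT INTO faults VALUES ('fault_a', 12.5)")
document = ET.fromstring("<faults><fault name='fault_b' length_km='40'/></faults>")

service = DataServiceSketch()
service.register("relational", relational_backend(connection))
service.register("xml", xml_backend(document))

relational_rows = service.perform("relational", "SELECT name, length_km FROM faults")
xml_rows = service.perform("xml", "./fault")
```

The client calls `perform` the same way in both cases and gets rows back in the same shape. Only the query expression differs by store type.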
Integration of Data and Filters
• One has the OGSA-DAI Data repository interface combined with WSDL of the (Perl, Fortran, Python …) filter
• User only sees WSDL not data syntax
• Some non-trivial issues as to where the filtering compute power is
– Microsoft says filter next to data
[Figure labels: DB; Filter; WSDL of Filter; OGSA-DAI Interface]
[Figure labels: InfoGrid; MultiScale; Parallel Computing; Experiments; GeoInformatics; Extended/Integrated VA+PARK+GEOFEST; Large System Simulations; General Complex Systems Simulations; Load Balancing Algorithms; Integrated CCE; Sensors/Satellites; Other Fields X-Complexity; Infrastructure; e-Science; Collaboration Grid; Computer Science; Modeling; Geology; Clusters; Grid; Visualization; Field; Complex Fluids; Stock Market; Grid Portals; Databases; BioComplexity]
SERVOGrid Complexity Computing Environment
[Figure labels: Users; CCE Control Portal Aggregation; Middle Tier with XML Interfaces; Application Service-1; Application Service-2; Application Service-3; Database Service; Sensor Service; Compute Service; Parallel Simulation Service; Visualization Service; XML Meta-data Service; Complexity Simulation Service; Database]
SERVOGrid Requirements
• Seamless Access to Data repositories and large scale computers
• Integration of multiple data sources including sensors, databases, file systems with analysis system
– Including filtered OGSA-DAI
• Rich meta-data generation and access with SERVOGrid specific Schema extending openGIS standards and using Semantic Grid
• Portals with component model for user interfaces and web control of all capabilities
• Collaboration to support world-wide work
• Basic Grid tools: workflow and notification
[Figure: layered Grid architecture. Labels: Portal such as “Jetspeed”; Grid Computing or Programming Environments; Application/User Framework supporting development and deployment of OGSI compliant AWS (Application Web Services); AWS (four boxes); Generic Application Services; OGSA Interoperability Layer (twice); “Sophisticated” System Services; Hosting Environment (twice); Resource Grid Services; “Core” Grid Resources; Database; Web Services; e.g. DAI compliant database]
Taxonomy of Grid Operational Style
| Name of Grid Style | Description of Grid Operational or Architectural Style |
| --- | --- |
| Semantic Grid | Integration of Grid and Semantic Web meta-data and ontology technologies |
| Peer-to-peer Grid | Grid built with peer-to-peer mechanisms |
| Lightweight Grid | Grid designed for rapid deployment and minimum life-cycle support costs |
| Collaboration Grid | Grid supporting collaborative tools like the Access Grid, whiteboard and shared applications. |
| R3 or Autonomic Grid | Fault tolerant and self-healing Grid; Robust Reliable Resilient (R3) |
Paradigms Protocols Platforms and Hosting
• We can start from the Web view where the basic Grid paradigm is
• Meta-data rich Web Services communicating via messages
• These have some basic support from some runtime such as .NET, Jini (pure Java), Apache Tomcat+Axis (Web Service toolkit), Enterprise JavaBeans, WebSphere (IBM) or GT3 (Globus Toolkit 3)
– These are the distributed equivalent of operating system functions as in UNIX Shell
• Called Hosting Environment or platform
Permeating Principles and Policies
• Meta-data rich Message-linked Web Services as the permeating paradigm
• “User” Component Model such as “Enterprise JavaBean (EJB)” or .NET.
• Service Management framework including a possible Factory mechanism
• High level Invocation Framework describing how you interact with system components.
– This could for example be used to allow the system to be built from either W3C or GGF style (OGSI) Web Services and to protect the user from changes in their specifications.
• Security is a service but the need for fine grain selective authorization encourages
• Policy context that sets the rules for each particular Grid.
– Currently OGSA supports policies for routing, security and resource use.
• The Grid Fabric or set of resources needs mechanisms to manage them. This includes automatic recording of meta-data and configuration of software.
• Quality of service (QoS) for the Network and this implies performance monitoring and bandwidth reservation services.
– Challenging as end-to-end and not just backbone QoS is needed.
• Messaging systems like MQSeries from IBM provide robustness from asynchronous delivery and can abstract destination and allow customization of content such as converting between different interface specifications.
• Messaging is built on transport mechanisms which can be used to support mechanisms to implement QoS and to virtualize ports
Virtualization
• The Grid could and sometimes does virtualize various concepts
• Location: URI (Universal Resource Identifier) virtualizes URL
• Replica management (caching) virtualizes file location generalized by GriPhyN virtual data concept
• Protocol: message transport and WSDL bindings virtualize transport protocol as a QoS request
• P2P or Publish-subscribe messaging virtualizes matching of source and destination services
• Semantic Grid virtualizes Knowledge as a meta-data query
• Brokering virtualizes resource allocation
• Virtualization implies references can be indirect
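The publish-subscribe item can be made concrete with a few lines of Python. The `Broker` class, topic strings and message contents below are hypothetical; the point is only that the publisher names a topic rather than a destination, so the matching of source and destination is done indirectly.

```python
from collections import defaultdict


class Broker:
    """Toy publish-subscribe broker: sources and destinations only share a topic name."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # The publisher never learns who, if anyone, receives the message.
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = Broker()
received_by_filter = []
received_by_viewer = []
broker.subscribe("sensor/patch-1", received_by_filter.append)
broker.subscribe("sensor/patch-1", received_by_viewer.append)

delivered = broker.publish("sensor/patch-1", {"displacement_mm": 3.2})
undelivered = broker.publish("sensor/patch-2", {"displacement_mm": 1.0})
```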
Interfaces and Functionality and Semantics I
• The Grid platform tries to minimize detail in protocols and maximize detail in interfaces to enhance scaling
• However rich meta-data and semantics are critical for correct and interesting operation
– Put as much semantic interpretation as you can into specific services
– Lack of Semantic interoperation is in fact main weakness of today’s Grids and Web services
• Everything becomes a service whether system or application level
• There are some very important “Global Services”
– Discovery (look up) and Registration of service metadata
– Workflow
– MetaSchedulers
Interfaces and Functionality and Semantics II
• There are many other generally important services
• OGSA-DAI The Database Service
• Portal Service linked to by WSRP (Web services for Remote Portals)
• Notification of events
• Job submission
• Provenance – interpret meta-data about history of data
• File Interfaces
• Sensor service – satellites …
• Visualization
• Basic brokering/scheduling
Categories of Worldwide Grid Services to be exploited by SERVOGrid
• 1) Types of Grid
– R3
– Lightweight
– P2P
– Federation and Interoperability
• 2) Core Infrastructure and Hosting Environment
– Service Management
– Component Model
– Service wrapper/Invocation
– Messaging
• 3) Security Services
– Certificate Authority
– Authentication
– Authorization
– Policy
• 4) Workflow Services and Programming Model
– Enactment Engines (Runtime)
– Languages and Programming
– Compiler
– Composition/Development
• 5) Notification Services
• 6) Metadata and Information Services
– Basic including Registry
– Semantically rich Services and meta-data
– Information Aggregation (events)
– Provenance
• 7) Information Grid Services
– OGSA-DAI/DAIT
– Integration with compute resources
– P2P and database models
• 8) Compute/File Grid Services
– Job Submission
– Job Planning Scheduling Management
– Access to Remote Files, Storage and Computers
– Replica (cache) Management
– Virtual Data
– Parallel Computing
• 9) Other services including
– Grid Shell
– Accounting
– Fabric Management
– Visualization Data-mining and Computational Steering
– Collaboration
• 10) Portals and Problem Solving Environments
• 11) Network Services
– Performance
– Reservation
– Operations
Two-level Programming I
• The paradigm implicitly assumes a two-level Programming Model
• We make a Service (same as a “distributed object” or “computer program” running on a remote computer) using conventional technologies
– C++ Java or Fortran Monte Carlo module
– Data streaming from a sensor or Satellite
– Specialized (JDBC) database access
• Such nuggets accept and produce data from users files and databases
• The Grid is built by coordinating such nuggets assuming we have solved problem of programming the nugget
[Figure labels: Nugget; Data]
Two-level Programming II
• The Grid is discussing the linkage and distribution of the nuggets, with the only addition being runtime interfaces to the Grid as opposed to UNIX data streams
• Familiar from use of UNIX Shell, PERL or Python scripts to produce real applications from core programs
• Such interpretative environments are the single processor analog of Grid Programming and this tends to be called workflow
• Workflow is the composition of multiple services (programs) together to make a new service
– Includes “Software Bus”, “Application Integration”, “Co-ordination Languages” etc.
[Figure: linked boxes labeled Nugget1, Nugget2, Nugget3, Nugget4]
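The statement that composing services yields a new service can be shown on a single processor with plain functions standing in for nuggets. Everything here is an illustrative assumption: real nuggets would be remote services, and `compose` and the three stand-in functions are invented names.

```python
from functools import reduce


def compose(*nuggets):
    """Chain nuggets left to right; the result is itself a nugget."""
    def workflow(data):
        return reduce(lambda value, nugget: nugget(value), nuggets, data)
    return workflow


# Three stand-in nuggets (made up for illustration).
def parse_readings(text):
    return [float(token) for token in text.split()]


def drop_outliers(values):
    return [value for value in values if abs(value) < 100.0]


def mean(values):
    return sum(values) / len(values)


clean = compose(parse_readings, drop_outliers)  # a new service built from two
summarize = compose(clean, mean)                # composed services compose again
cleaned = clean("1.0 2.0 999.0 3.0")
result = summarize("1.0 2.0 999.0 3.0")
```

Because `clean` has the same calling convention as the nuggets it was built from, it can be reused inside `summarize` without any special handling.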
Workflow
• Workflow has at least 4 parts
– “Programming Environment” – typically GUI to drag and drop services and their linkages (familiar from AVS etc. which was workflow for visualization)
– Language – from XML to extended Python
– Compiler – converting Language into executable
– Runtime controlling flow of information and notification events
• Can use Python, Mathematica, Matlab, JavaSpaces, IBM BPEL4WS, DoE CCA etc.
– Don’t think current systems are very near “what we will want” but expect much progress over next 3 years and plenty of systems to work with
• Metadata critical to tell you how to combine services in a sensible way – so workflow engines must interface with metadata service
Workflow GCEs and Problem Solving Environments (PSEs)
• There is some confusion between fields of workflow (Grid Computing Environments GCE) and PSEs
• To extent PSEs “just” allow manipulation of “nuggets”, they are indistinguishable from a domain specific GCE
• They are distinct if they support intra nugget operations such as
– Integration of mesh and simulation
– Closely coupled code linkage
– Generation of code from high level interface like Mathematica
• Even in latter case, a new generation of PSEs should be built with Grid architecture – e.g. message based – and using Grid services like metadata and notification
Importance of Metadata Service; how should this be implemented?
[Figure labels: Database; SERVOGrid Complexity Simulation Service; XML Meta-data Service; Jobs; Tools; SERVOPSE Programs using CCEML (SERVOML); MultiScale Ontologies; Job MetaData; Tool MetaData; Selected GeoInformatics Data; Complexity Scripts; Workflow]
Metadata Approaches
• Specialized services like UDDI and MDS (Globus)
– Nobody likes UDDI
– MDS uses LDAP
– RGMA is MDS with a relational database backend
• “By hand” as in current GEM Portal which is roughly same as using service stored SDE’s (Service Data Elements) as in OGSI
• Some new MDS coming from Globus GT3?
– Current MDS has both a Schema (insufficient for us) and a “database technology”
• Semantic Grid technologies
• Some basic XML database (Oracle, Xindice …)
• If “OGSA compliant” (not defined yet), then doesn’t matter that much
Workflow and SERVOGrid CCE
• SERVOGrid should use workflow technology to support both
– “code and data coupling” (DISLOC with SIMPLEX etc.)
– Multiscale features
• Implementing multiscale model requires
– building Web services for each model,
– describing each model with metadata,
– describing linkage of models (linkage of ports on web services)
– and describing when to use which scale model
• So workflow and multiscale depend on web services described by rich metadata
• This analysis isn’t correct if scales must be “tightly coupled” as current workflow won’t support this (CCA from DoE claims to address this but not clear if general)
– We should focus on multiscale models with loose “nugget” coupling
– Hopefully we will learn how to take same architecture, compile away inefficiencies and get high performance on tighter coupling than conventional distributed workflow
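A minimal sketch of "describing when to use which scale model" is a registry of model descriptions that a workflow consults before dispatching a run. The metadata fields, the model names and the rule of preferring the narrowest matching range are assumptions made for illustration; the slides do not specify a schema.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ModelDescription:
    """Metadata describing one model service (hypothetical fields)."""
    name: str
    min_length_km: float
    max_length_km: float
    run: Callable[[float], str]


def select_model(registry, length_km):
    """Pick the model whose declared range covers the scale, preferring the narrowest range."""
    candidates = [
        model for model in registry
        if model.min_length_km <= length_km < model.max_length_km
    ]
    if not candidates:
        raise LookupError(f"no model covers {length_km} km")
    return min(candidates, key=lambda model: model.max_length_km - model.min_length_km)


registry = [
    ModelDescription("fine_scale_model", 0.0, 10.0, lambda km: f"fine run at {km} km"),
    ModelDescription("coarse_scale_model", 5.0, 1000.0, lambda km: f"coarse run at {km} km"),
]
chosen_small = select_model(registry, 7.0)
chosen_large = select_model(registry, 200.0)
output = chosen_small.run(7.0)
```

Here the choice of model is driven entirely by metadata, so a new scale can be added by registering another description rather than by changing the workflow.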
Technologies under development at Indiana
• Portal Infrastructure and Portlets integrating with rest of Globus/OGSA-DAI Community
– Including job submission, management of modest meta-data and linkage to databases
– Should package as “application web service toolkit” and test on ACES world wide iSERVOGrid
• “Some” core portal Metadata (Semantic Grid) services
• Messaging system between Web services that is useful for
– “Service Management”/Autonomic Grids
– Security
– Notification service
• Collaboration infrastructure and portlets
Web Services as a Portlet
• Each Web Service naturally has a user interface specified as “just another port”
– Customizable for universal access
• This gives each Web Service a Portlet view specified (in XML as always) by WSRP (Web services for Remote Portals)
• So component model for resources “automatically” gives a component model for user interfaces
– When you build your application, you define portlet at same time
[Figure: Application or Content source exposed as a Web Service with WSDL and WSRP ports. Labels: Application as a WS; General Application Ports interface with other Web Services; User Face of Web Service; WSRP Ports define WS as a Portlet]
Web Services have other ports (Grid Service) to be OGSI compliant
Online Knowledge Center built from Portlets
• Web Services provide a component model for the middleware (see large “common component architecture” effort in Dept. of Energy)
• Should match each WSDL component with a corresponding user interface component
• Thus one “must use” a component model for the portal with again an XML specification (portalML) of portal component
[Figure: a set of UI Components]
Sample page with several portlets: proxy credential manager, submission, monitoring
[Screenshot captions: Provide information about application and host parameters; Select application to edit; Administer Grid Portal]