Building Science Gateways
Marlon Pierce
Community Grids Laboratory
Indiana University
What Is a Web Portal?
A Web container that aggregates content from multiple sources into a single display, often called a “Start Page.”
Typically consumes RSS/Atom news feeds.
More powerful versions these days support Flickr, calendars, games, etc. as gadgets or widgets.
Examples: iGoogle, Netvibes, My Yahoo!
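To make the aggregation idea concrete, here is a minimal sketch of how a start page might merge items from several RSS feeds into one list. The two feeds are made-up inline sample data standing in for remote sources, and `aggregate` is a hypothetical helper, not part of any portal product named above.

```python
import xml.etree.ElementTree as ET

# Two tiny RSS 2.0 documents standing in for remote feeds (made-up sample data).
FEED_A = """<rss version="2.0"><channel><title>Feed A</title>
<item><title>Job finished</title><link>http://example.org/a/1</link></item>
</channel></rss>"""
FEED_B = """<rss version="2.0"><channel><title>Feed B</title>
<item><title>New data set</title><link>http://example.org/b/1</link></item>
<item><title>Queue status</title><link>http://example.org/b/2</link></item>
</channel></rss>"""


def aggregate(feeds):
    """Merge the items of several RSS documents into one list of dicts."""
    items = []
    for xml_text in feeds:
        channel = ET.fromstring(xml_text).find("channel")
        source = channel.findtext("title")
        for item in channel.findall("item"):
            items.append({
                "source": source,
                "title": item.findtext("title"),
                "link": item.findtext("link"),
            })
    return items


# A single display built from multiple sources.
start_page = aggregate([FEED_A, FEED_B])
for entry in start_page:
    print(f"[{entry['source']}] {entry['title']} -> {entry['link']}")
```

A real portal would fetch the feeds over HTTP and would also need to handle Atom, which uses different element names.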
Grid Computing Overview
Grid computing software is designed to integrate large supercomputing facilities: TeraGrid, Open Science Grid, EGEE, etc. This is done via network services.
Key Service Components
  Authentication and authorization framework (MyProxy)
  Remote process access and control (GRAM, Condor)
  Remote file and I/O access (GridFTP)
Additional Services
  Information services, replica management, database federation, storage management, schedulers, etc.
Example Grid Software Stacks: CTSS and VDT
TeraGrid Supercomputing Resources (GPIR)
Science Portals and Gateways
Science Gateways adapt Web portal technology to build user interfaces to the Grid.
Science portals resemble standard portals, but must also:
  Support access to computing and storage resources.
  Allow users remote, Unix-like access to these resources.
  Provide access to science applications and data sets.
And we must provide value-added services as well as user interfaces.
[Figure: My 2002 “octopus” SOA diagram, from the archives. A browser interface connects over HTTP(S) to portlets and client stubs, which call WSDL-described services over SOAP/HTTP: a DB service (JDBC to a DB), job submission/monitoring and file services (operating and queuing systems), and a visualization service, spread across Host 1, Host 2, and Host 3.]
Terminology
Portlet: a standard Java component that generates HTML and can also act as a client to a remote service. It lives in a portal container. I will also use this term generically.
Web Service: a remotely invocable function on the Internet.
  SOAP: the XML message envelope for carrying commands over HTTP.
  WSDL: describes the service’s API in XML.
  REST: a variation of this approach.
Lots more info: http://grids.ucs.indiana.edu/ptliupages/presentations/I590WebService.ppt
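As an illustration of what “XML message envelope” means, the sketch below builds a SOAP 1.1 envelope with the Python standard library. The envelope namespace is the standard SOAP 1.1 one. The operation name `submitJob`, its parameters, and the `http://example.org/jobs` namespace are hypothetical and do not describe any service in this talk.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.org/jobs"                      # hypothetical service namespace


def build_envelope(operation, params):
    """Wrap one remote call (operation plus parameters) in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# The resulting string is what a client would POST over HTTP.
message = build_envelope("submitJob", {"executable": "/bin/ls", "host": "example.org"})
print(message)
```

In practice a WSDL file tells client tooling which operations and parameter types exist, so envelopes like this are generated rather than written by hand.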
But Why?
Three-tiered Service Oriented Architecture is the network equivalent of the famous Model-View-Controller design pattern.
  View: the user interface components.
  Controller: Web service middleware.
  Model: the backend resources.
Independence of tiers gives flexibility.
  Services can be reused with alternative user interfaces, such as workflow composers like Taverna.
  User interfaces can work with different service implementations.
Drawback: reliability and robustness are issues.
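The independence of tiers can be sketched in a few lines: user interfaces code against a service contract, so either side can be swapped. Everything here (`JobService`, the two implementations, the two views) is a hypothetical illustration, not code from any portal discussed in this talk.

```python
from abc import ABC, abstractmethod


class JobService(ABC):
    """Middle tier: the contract that user interfaces code against."""

    @abstractmethod
    def submit(self, executable: str) -> str:
        """Start a job and return its identifier."""


class LocalJobService(JobService):
    """One implementation; the list stands in for a backend resource."""

    def __init__(self):
        self.jobs = []

    def submit(self, executable):
        self.jobs.append(executable)
        return f"local-{len(self.jobs)}"


class QueueJobService(JobService):
    """A second implementation with the same contract."""

    def __init__(self):
        self.count = 0

    def submit(self, executable):
        self.count += 1
        return f"queued-{self.count}"


def html_view(service: JobService, executable: str) -> str:
    """A portlet-style user interface."""
    return f"<p>Submitted {executable} as {service.submit(executable)}</p>"


def text_view(service: JobService, executable: str) -> str:
    """An alternative user interface reusing the same service."""
    return f"{executable}: {service.submit(executable)}"


print(html_view(LocalJobService(), "/bin/ls"))
print(text_view(QueueJobService(), "/bin/ls"))
```

Each view works with either service, and each service works with either view, which is the flexibility the slide refers to.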
Two Approaches to the Middle Tier
[Figure: fat client versus thin client. Labels: Portal Client, Grid Client, Web Service, Grid Service, Backend Resource. Links: HTTP + SOAP, Grid Protocol (SOAP).]
Disloc output converted to KML and plotted.
GeoFEST Finite Element Modeling portlet and plotting tools
What’s In the Screenshots?
GeoFEST and Disloc Portlets
  Live on gf7.ucs.indiana.edu
  Manage the user’s display: Web forms, links to output, graphics.
  Save user session state persistently.
QuakeTables Fault DB Web Service
  Lives on gf2.ucs.indiana.edu
  Contains geometric fault models.
GeoFEST and Disloc Execution Web Services
  Live on gf19.ucs.indiana.edu
  Generate input files from fault models.
  Run and manage codes.
Best Practice for Scientific Web Services
There are many tools to choose from: .NET, Apache Axis, Sun WS, Ruby on Rails, etc.
Make them self-contained.
  If possible, generate input files within the service.
  Or have an input file generating service.
  Remember that they may be used by other people with other client tools.
Communicate data files with URLs.
Be very careful about exposing the state of the service.
  Don’t assume persistent connections.
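A minimal sketch of these practices, under stated assumptions: the service generates its own input file from plain parameters, keeps no per-client session, and replies with URLs rather than file contents. `run_model`, `BASE_URL`, and the parameter names are hypothetical, and the “model run” is a placeholder that only writes a summary line.

```python
import pathlib
import tempfile
import uuid

BASE_URL = "http://example.org/results"  # hypothetical URL prefix that maps to the work directory


def run_model(fault_params, workdir):
    """Self-contained call: build the input file, run, and reply with URLs only."""
    job_id = uuid.uuid4().hex
    job_dir = pathlib.Path(workdir) / job_id
    job_dir.mkdir(parents=True)

    # Generate the input file inside the service, so any client tool can call it.
    lines = [f"{name} {value}" for name, value in sorted(fault_params.items())]
    (job_dir / "input.txt").write_text("\n".join(lines) + "\n")

    # Placeholder for running the real code.
    (job_dir / "output.txt").write_text(f"processed {len(lines)} parameters\n")

    # The reply carries everything the client needs; nothing depends on a session
    # or a persistent connection.
    return {
        "job_id": job_id,
        "input_url": f"{BASE_URL}/{job_id}/input.txt",
        "output_url": f"{BASE_URL}/{job_id}/output.txt",
    }


with tempfile.TemporaryDirectory() as workdir:
    result = run_model({"depth_km": 10, "strike_deg": 45}, workdir)
    produced = (pathlib.Path(workdir) / result["job_id"] / "output.txt").read_text()

print(result["output_url"])
```

Returning URLs keeps large data files out of the XML messages and lets other tools (plotting services, workflow engines) fetch them directly.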
Components for Portals
Open Grid Computing Environments
Examples. See http://www.collab-ogce.org/
Components for Science Portals
OGCE is founded on the principle that portals should be built out of reusable parts.
Key standard in our first phase: the JSR 168 portlet specification.
Portlets can run in multiple containers: uPortal, Sakai, GridSphere, Liferay, etc.
This allows us to build Grid-specific components and deploy them alongside other goodies: Sakai collaboration tools, contributed portlets, etc.
Future: OpenSocial-compliant Google Gadgets
OGCE GPIR portlet can interoperate with TeraGrid and your own GPIR
services.
Manage TeraGrid MyProxy credentials with the OGCE
ProxyManager portlets.
OGCE file management client portlets interact with TeraGrid
GridFTP servers.
General-purpose batch and interactive job submission to GRAM and WS-GRAM is supported.
Dashboard Portlet
The dashboard portlet allows users to track jobs on the selected resource. The user can view either his own set of jobs or get information on all submitted jobs.
Queue forecasting portlets work with the NWS QBETS to predict wait times and deadlines.
PURSe portlets manage user requests for portal accounts and Grid credentials.
Condor and Condor-G
OGCE IFrame Portlet can be used to integrate external
sites.
Client Libraries for Grid Computing
Two Major Grid Client Efforts
The Java CoG Kit
  Supports several versions of Globus and SSH.
Condor-G
  Has a Web Service interface (BirdBath) and Java client libraries.
  Supports Globus (v2 and v4) and several other Grid middleware systems.
You can build either portlets or Web services with either of these.
  OGCE portlets primarily use the CoG Kit.
  We prefer Condor-G based Web services for long-running jobs.
[Figure: Java CoG Kit layer diagram. Applications (Nanomaterials, Bio-Informatics, Disaster Management, Portals) sit on the CoG Gridfaces Layer, the CoG Data and Task Management Layer, and the CoG Abstraction Layer, which has CoG providers for GT2, GT3(X), GT4 WS-RF, Condor, Unicore, SSH, and others. GridIDE provides development support.]
CoG Abstraction Layers
[Class diagram: Task, TaskHandler, Service, Specification, SecurityContext, ServiceContact.]
The class diagram is the same for all grid tasks (running jobs, modifying files, moving data).
Classes also abstract toolkit provider differences. You set these as parameters: GT2, GT4, etc.
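The sketch below illustrates the idea of one task model with the toolkit provider as a parameter. It borrows the class names from the diagram, but it is a simplified Python illustration written for this note and is not the Java CoG Kit API. The fields, the `register` method, and the stand-in backends are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class ServiceContact:
    host: str
    port: int


@dataclass
class SecurityContext:
    credential: str  # stand-in for a proxy credential


@dataclass
class Service:
    provider: str              # e.g. "GT2" or "GT4", chosen as a parameter
    contact: ServiceContact
    security: SecurityContext


@dataclass
class Task:
    kind: str                  # "job", "file_operation", or "file_transfer"
    specification: dict        # what to run, copy, or move
    service: Service
    status: str = "unsubmitted"


class TaskHandler:
    """Dispatches any task to the backend registered for its provider."""

    def __init__(self):
        self._backends = {}

    def register(self, provider, submit_fn):
        self._backends[provider] = submit_fn

    def submit(self, task):
        if task.service.provider not in self._backends:
            raise ValueError(f"no backend for provider {task.service.provider!r}")
        task.status = self._backends[task.service.provider](task)
        return task.status


handler = TaskHandler()
# Stand-in backends; real ones would speak each toolkit's protocol.
handler.register("GT2", lambda task: f"submitted via GT2 to {task.service.contact.host}")
handler.register("GT4", lambda task: f"submitted via GT4 to {task.service.contact.host}")

service = Service("GT4", ServiceContact("example.org", 8443), SecurityContext("proxy"))
job = Task("job", {"executable": "/bin/ls"}, service)
print(handler.submit(job))
```

Switching the same task to another toolkit only changes the `provider` string, which is the point of the abstraction.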
Coupling CoG Tasks
The CoG abstractions also simplify creating coupled tasks.
Tasks can be assembled into task graphs with dependencies.
  “Do Task B after successful Task A”
Graphs can be nested.
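The following is a minimal sketch of a task graph with “run B only after A succeeds” semantics and nesting. It uses Python’s standard `graphlib` module (Python 3.9 or later) and is not the CoG Kit’s own task graph API. `TaskGraph` and the task names are hypothetical.

```python
from graphlib import TopologicalSorter


class TaskGraph:
    """Runs tasks in dependency order; a task is skipped if a dependency failed."""

    def __init__(self):
        self.tasks = {}  # name -> callable returning True on success
        self.deps = {}   # name -> names that must succeed first

    def add(self, name, action, after=()):
        missing = set(after) - self.tasks.keys()
        if missing:
            raise ValueError(f"unknown dependencies: {sorted(missing)}")
        self.tasks[name] = action
        self.deps[name] = set(after)

    def run(self):
        results = {}
        for name in TopologicalSorter(self.deps).static_order():
            if all(results[dep] for dep in self.deps[name]):
                results[name] = bool(self.tasks[name]())
            else:
                results[name] = False  # skipped: a dependency did not succeed
        return all(results.values())

    # A graph is itself callable, so it can be added to another graph as one task.
    __call__ = run


log = []


def step(name):
    def action():
        log.append(name)
        return True
    return action


inner = TaskGraph()
inner.add("transfer_input", step("transfer_input"))
inner.add("run_code", step("run_code"), after=["transfer_input"])

outer = TaskGraph()
outer.add("get_credential", step("get_credential"))
outer.add("simulation", inner, after=["get_credential"])  # nested graph
outer.add("plot", step("plot"), after=["simulation"])

succeeded = outer.run()
print(succeeded, log)
```

Requiring dependencies to be added before the tasks that use them keeps the graph acyclic by construction.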
Problems with Grid Client Development
Grid portlets typically wrap each single Grid capability in a separate portlet.
  The problem is that Grid portlets need to combine these operations.
  Portlets are entire Web applications, so we need a component model for portlets: reusable portlet parts.
Even with the CoG Abstraction Layer, we must still do a lot of coding to build new applications.
To address these problems we have adopted JavaServer Faces (JSF).
  It provides several nice Model-View-Controller features.
  JSF provides an extensible framework (tag libraries) for making reusable components.
  The Apache JSF portlet bridge allows you to convert standalone JSF applications (development phase) into portlets (deployment phase).
GTLAB Example
<html>
  <body>
    <f:form>
      <o:submit id="test" action="next_page">
        <o:myproxy id="pr" hostname="gf1.ucs.indiana.edu" port="7512"
                   lifetime="2" username="mnacar" password="***" />
        <o:jobsubmit id="task" hostname="cobalt.ncsa.teragrid.org" provider="GT4"
                     executable="/bin/ls" stdout="tmp/result" stderr="tmp/error" />
      </o:submit>
    </f:form>
  </body>
</html>
Grid Tags          Associated Grid Beans   Features
<submit/>          ComponentBuilderBean    Creating components, job handlers, submitting jobs
<handler/>         MonitorBean             Handling monitoring page actions
<multitask/>       MultitaskBean           Constructing simple workflow
<dependency/>      MultitaskBean           Defining dependencies among sub jobs
<myproxy/>         MyproxyBean             Retrieving MyProxy credential
<fileoperation/>   FileOprationBean        Providing GridFTP operations
<jobsubmission/>   JobSubmitBean           Providing GRAM job submissions
<filetransfer/>    FileTransferBean        Providing GridFTP file transfer
(none)             ResourceBean            Describes common properties among all tags and beans; passes values given by standard visual JSF components
Managing Scientific Workflows
Scientific Workflows
Portal interfaces encode scientific use cases.
  If you have a rich set of services, it is a lot of work to make portlets for all possible use cases.
  And power users will always want something more.
  Example: our CICC project has dozens of chemical informatics Web services.
  http://www.chembiogrid.org/wiki
Workflow composers can simplify this.
  They allow users to encode and execute their own use cases.
Web Services and Workflows
Perform a similarity search on the NIH DTP Human Tumor data.
Filter the results based on pharmacokinetic properties (FILTER).
Convert to 3D (OMEGA).
Dock into a pre-defined protein (FRED).
Visualize (JMOL).
A Taverna workflow connects the remote services.
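The chain of steps above can be sketched as a linear pipeline in which each step consumes the previous step’s output. The functions below are placeholders with made-up behavior. They do not call FILTER, OMEGA, FRED, or any real CICC service, and `run_workflow` is a hypothetical helper rather than Taverna’s execution model.

```python
def similarity_search(query):
    """Placeholder for the similarity search service."""
    return [f"{query}-hit{i}" for i in range(4)]


def filter_hits(hits):
    """Placeholder for the property filter: keep the first two hits."""
    return hits[:2]


def convert_to_3d(hits):
    """Placeholder for 3D structure generation."""
    return [f"{hit}-3d" for hit in hits]


def dock(structures):
    """Placeholder for docking into a protein."""
    return [f"docked({structure})" for structure in structures]


def run_workflow(steps, data):
    """Feed the output of each step into the next one."""
    for step in steps:
        data = step(data)
    return data


workflow = [similarity_search, filter_hits, convert_to_3d, dock]
docked = run_workflow(workflow, "aspirin")
print(docked)
```

A workflow composer lets a user rearrange or replace entries in such a chain without anyone writing a new portlet for that use case.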
OGCE’s XBaya Workflow Composer
Future of Science Gateways
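Updating the Octopus
[Figure: the octopus diagram updated for Web 2.0. A browser interface connects over HTTP(S) to social gadgets with AJAX, which call REST services using RSS and JSON over HTTP: a DB service (JDBC to a DB), job submission/monitoring and file services (operating and queuing systems), and a visualization service, spread across Host 1, Host 2, and Host 3. One WSDL label remains.]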
Enterprise Approach                          Web 2.0 Approach
JSR 168 Portlets                             Gadgets, Widgets
Server-side integration and processing       AJAX, client-side integration and processing, JavaScript
SOAP                                         RSS, Atom, JSON
WSDL                                         REST (GET, PUT, DELETE, POST)
Portlet Containers                           OpenSocial Containers (Orkut, LinkedIn, Shindig); Facebook; StartPages
User Centric Gateways                        Social Networking Portals
Workflow managers (Taverna, Kepler, etc.)    Mash-ups
Grid computing: Globus, Condor, etc.         Cloud computing: Amazon WS Suite, Xen Virtualization
Semantic Web: RDF, OWL, ontologies           Microformats, folksonomies
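To illustrate the REST and JSON entries in the comparison, here is a small in-memory sketch of how the four HTTP verbs might map onto a “jobs” resource. There is no network code. `JobResource`, the `/jobs` path layout, and the status codes chosen are assumptions for illustration and do not describe an OGCE interface.

```python
import json


class JobResource:
    """Maps HTTP verbs onto job records and replies with (status, JSON text)."""

    def __init__(self):
        self._jobs = {}
        self._next_id = 1

    def handle(self, method, path, body=None):
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "jobs" or len(parts) > 2:
            return 404, json.dumps({"error": "not found"})

        if len(parts) == 1:                      # the collection: /jobs
            if method == "GET":                  # list job ids
                return 200, json.dumps(sorted(self._jobs))
            if method == "POST":                 # create a job
                job_id = str(self._next_id)
                self._next_id += 1
                self._jobs[job_id] = json.loads(body)
                return 201, json.dumps({"id": job_id})
            return 405, json.dumps({"error": "method not allowed"})

        job_id = parts[1]                        # one job: /jobs/<id>
        if method == "PUT":                      # create or replace
            self._jobs[job_id] = json.loads(body)
            return 200, json.dumps(self._jobs[job_id])
        if job_id not in self._jobs:
            return 404, json.dumps({"error": "not found"})
        if method == "GET":                      # read
            return 200, json.dumps(self._jobs[job_id])
        if method == "DELETE":                   # remove
            del self._jobs[job_id]
            return 204, ""
        return 405, json.dumps({"error": "method not allowed"})


api = JobResource()
status, reply = api.handle("POST", "/jobs", json.dumps({"executable": "/bin/ls"}))
job_id = json.loads(reply)["id"]
print(api.handle("GET", f"/jobs/{job_id}"))
```

Compared with SOAP, the operation is carried by the HTTP verb and URL rather than by an XML envelope, and the JSON reply can be consumed directly by JavaScript in a gadget.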
Microformats, KML, and GeoRSS feeds are used to deliver SAR data to multiple clients.
More Information
Contact me: [email protected]
See what I’m up to: http://communitygrids.blogspot.com/
OGCE software: http://collab-ogce.org/
QuakeSim: http://www.quakesim.org/
CICC: http://www.chembiogrid.org/wiki/
Lots of people worked on all of these.