GRADD: Scientific Workflows
Uploaded by aidan-rollins
Scientific Workflow: E. Science laboris
• Workflows are the new rock and roll of eScience
• Machinery for coordinating the execution of (scientific) services and linking together (scientific) resources.
• Era of service-oriented apps (SOA)
• Repetitive, mundane tasks made easier (data cleaning...)
• Facilitates sharing of science
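The idea of a workflow as machinery that coordinates services and links their outputs can be sketched in a few lines. This is only an illustration, not Trident code: `fetch_readings`, `clean`, `mean`, and `run_workflow` are hypothetical names, and the "services" are local functions standing in for remote ones.

```python
import math
from typing import Any, Callable, List

# Hypothetical stand-ins for remote scientific services.
def fetch_readings() -> List[float]:
    """Pretend data source; one reading is missing (NaN)."""
    return [12.1, 11.8, float("nan"), 12.4]

def clean(readings: List[float]) -> List[float]:
    """The mundane step a workflow automates: drop missing values."""
    return [r for r in readings if not math.isnan(r)]

def mean(readings: List[float]) -> float:
    return sum(readings) / len(readings)

def run_workflow(source: Callable[[], Any],
                 steps: List[Callable[[Any], Any]]) -> Any:
    """Call the source, then pass each result on to the next step."""
    value = source()
    for step in steps:
        value = step(value)
    return value

result = run_workflow(fetch_readings, [clean, mean])
print(round(result, 2))
```

Because the workflow is just data (a source plus a list of steps), it can be stored, shared, and rerun, which is the property the bullets above emphasize.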
Trident Scientific Workflow Workbench
• Visually program workflows, through a web browser
• Libraries of activities, workflows and services
  – Social annotations and search
• Abstract parallelism, for HPC & many core (CCR)
• Adaptive workflows, to detect and respond to events
• Automatic provenance capture, open provenance model
• Costing model: resources include time, power, and data transfer
• Integrated data storage and access
• Integrated visualization tools
• Fault tolerance, facilitate smart reruns, what-if analysis
• Factory scheduling of workflows
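One way to read "fault tolerance, facilitate smart reruns" is that a rerun should skip steps that already completed. The slides do not say how Trident does this; the sketch below is an assumption that uses an in-memory checkpoint dictionary and deterministic steps. `run_with_checkpoints` and `flaky_scale` are hypothetical names.

```python
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[Any], Any]]

def run_with_checkpoints(steps: List[Step], value: Any,
                         checkpoints: Dict[str, Any]) -> Tuple[Any, List[str]]:
    """Run named steps in order, reusing results saved by an earlier run.

    Returns the final value and the names of steps actually executed.
    """
    executed = []
    for name, func in steps:
        if name in checkpoints:
            value = checkpoints[name]   # completed earlier: reuse, do not rerun
            continue
        value = func(value)             # may raise; earlier checkpoints survive
        checkpoints[name] = value
        executed.append(name)
    return value, executed

attempts = {"count": 0}

def flaky_scale(x: int) -> int:
    """Fails on its first call to simulate a transient fault."""
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise RuntimeError("simulated transient failure")
    return x * 10

steps = [("load", lambda x: x + 1),
         ("scale", flaky_scale),
         ("report", lambda x: f"result={x}")]
checkpoints: Dict[str, Any] = {}

try:
    run_with_checkpoints(steps, 1, checkpoints)   # fails at "scale"
except RuntimeError:
    pass

result, executed = run_with_checkpoints(steps, 1, checkpoints)  # smart rerun
print(result, executed)
```

On the second run only the failed step and the steps after it execute; "load" is served from its checkpoint.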
Trident Implementation
Built on top of an industrial workflow engine
• Windows Workflow Foundation
  – Workflow in a general-purpose framework
  – Part of Microsoft’s .NET Framework 3.5
Trident Logical Architecture
[Architecture diagram. Labeled components, grouped as far as the labels allow:]
• Design: Workbench (Visual Workflow Designer), Service Registry, Workflow Packages, Scientific Workflows, domain-specific custom activities
• Windows Workflow Foundation
• Trident Runtime Services: Provenance, Fault Tolerance, HPC Scheduling Service, Monitoring Service
• Runtime: Workflow Monitor, Administration Console, WorkflowLauncher
• Community Portal: sharing and commenting on workflows, services, and data sources
• Data Access: Data Object Model (database-agnostic abstraction) over SQL Server, Cloud DB, and others
• Other labels: Archiving, Visualization, Registry, Runtime Admin Tools, Community Site
Activities: An Extensible Approach
[Layered diagram. Labels, with duplicated text removed:]
• Base Activity Library: out-of-box (OOB) activities and workflow types; general-purpose; basic workflow constructs
• Custom Activity Libraries: domain-specific activities built by extending activities (read from sensors, data pipelines, etc.); first-class citizens
• Domain-Specific Workflow Packages: built by composing activities (RosettaNet, CRM, biology, oceanography)
• Create / Extend / Compose activities
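The create / extend / compose pattern can be illustrated with a small class hierarchy. Trident activities are built on Windows Workflow Foundation in .NET; the Python below is only an analogy, and `Activity`, `ReadFromSensor`, `Scale`, and `Sequence` are hypothetical names, not Trident or WF types.

```python
from typing import Any, List

class Activity:
    """Hypothetical base activity: one unit of work."""
    def execute(self, data: Any = None) -> Any:
        raise NotImplementedError

class ReadFromSensor(Activity):
    """Extend: a domain-specific activity derived from the base class."""
    def __init__(self, samples: List[float]):
        self.samples = samples  # canned samples stand in for a real sensor

    def execute(self, data: Any = None) -> List[float]:
        return list(self.samples)

class Scale(Activity):
    """Extend: a general-purpose transformation."""
    def __init__(self, factor: float):
        self.factor = factor

    def execute(self, data: Any = None) -> List[float]:
        return [x * self.factor for x in data]

class Sequence(Activity):
    """Compose: runs child activities in order.

    A Sequence is itself an Activity, so composites can be nested and
    reused just like built-in activities.
    """
    def __init__(self, *children: Activity):
        self.children = children

    def execute(self, data: Any = None) -> Any:
        for child in self.children:
            data = child.execute(data)
        return data

pipeline = Sequence(ReadFromSensor([1.0, 2.0, 3.0]), Scale(2.0))
nested = Sequence(pipeline, Scale(0.5))   # a composite used as a building block
print(pipeline.execute(), nested.execute())
```

Treating the composite as just another activity is what lets custom and domain-specific pieces sit alongside the base library as equals.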
Trident Workflow Designer
• Visually compose, search and archive (share)
Workflow Execution Provenance
For a workflow management system, provenance identifies what activities were executed, parameters supplied at runtime, data passed between activities, intermediate results generated, etc.
• Explains how a workflow result was created, in enough detail to establish trust
• Provides a replication recipe
• Guides development of future experiments
Scientists routinely record the provenance of bench experiments in lab notebooks – this is essential for computational experiments as well.
Provenance in Trident
Enactment engine documents all steps linking original inputs with final result so execution can be verified, reproduced or rerun – provenance is a first class data product in Trident…
• Provenance capture is automatic and transparent
• Will persist provenance data for a fixed period of time
• Supports multiple levels of representation
• Storage provided by underlying system
• Interface to query and reason over provenance data
• Efficient storage representation and query performance
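"Automatic and transparent" capture means the activity author writes no logging code; the engine records each step as it runs. A minimal sketch of that idea, assuming a Python decorator plays the role of the enactment engine and an in-memory list plays the role of the provenance store (`record_provenance`, `provenance_log`, `calibrate`, and `total` are all hypothetical):

```python
import functools
from typing import Any, Callable, Dict, List

provenance_log: List[Dict[str, Any]] = []   # stand-in for a provenance store

def record_provenance(func: Callable) -> Callable:
    """Wrap an activity so each call is logged without changing its body."""
    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        result = func(*args, **kwargs)
        provenance_log.append({
            "activity": func.__name__,   # what was executed
            "inputs": args,              # data passed in
            "parameters": kwargs,        # parameters supplied at runtime
            "output": result,            # intermediate or final result
        })
        return result
    return wrapper

@record_provenance
def calibrate(values: List[float], offset: float = 0.0) -> List[float]:
    return [v + offset for v in values]

@record_provenance
def total(values: List[float]) -> float:
    return sum(values)

answer = total(calibrate([1.0, 2.0], offset=0.5))

# Query the log: trace the final result back through each step.
for entry in provenance_log:
    print(entry["activity"], entry["parameters"], "->", entry["output"])
```

The log links the original input to the final answer step by step, which is what makes a result verifiable and reproducible. A real system would also need persistent storage and a query interface, as the bullets above note.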
Applications and Scientists need a Curated Registry of Services
• Just having a workflow system isn’t enough
• and it’s not just about workflows...
• Note: Registry, not repository
  – Services are hosted elsewhere
Trident Registry
A Curated Registry of Services
(and…) Registry of Data Products
(and…) Registry of Provenance
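The registry / repository distinction can be made concrete: a registry holds descriptions and locations, never the service itself. The sketch below is an assumption about what such an entry might contain. `ServiceEntry`, `Registry`, the `curated` flag, and the example.org endpoints are all hypothetical and do not describe the Trident Registry schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ServiceEntry:
    name: str
    endpoint: str                      # where the service is hosted (elsewhere)
    tags: List[str] = field(default_factory=list)
    curated: bool = False              # set once a curator has approved it

class Registry:
    """Stores metadata about services; holds no service code or data."""
    def __init__(self) -> None:
        self._entries: Dict[str, ServiceEntry] = {}

    def register(self, entry: ServiceEntry) -> None:
        self._entries[entry.name] = entry

    def approve(self, name: str) -> None:
        self._entries[name].curated = True

    def search(self, tag: str) -> List[str]:
        """Return names of curated entries carrying the given tag."""
        return sorted(e.name for e in self._entries.values()
                      if e.curated and tag in e.tags)

registry = Registry()
registry.register(ServiceEntry("tide-model", "https://example.org/tide", ["oceanography"]))
registry.register(ServiceEntry("salinity", "https://example.org/salinity", ["oceanography"]))
registry.approve("tide-model")

print(registry.search("oceanography"))
```

Only the approved entry is returned by the search, which is one simple interpretation of "curated". The same structure could index data products or provenance records by pointing at where they live.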