Scientific Workflows & GEON


Transcript of Scientific Workflows & GEON

Page 1: Scientific Workflows & GEON

Efrat Jaeger – SDSC

Bertram Ludäscher – UC DAVIS

Krishna Sinha – Virginia Tech

Ashraf Memon – SDSC

Ghulam Memon – SDSC

Ilkay Altintas – SDSC

Kai Lin – SDSC

& many others esp. KEPLER community

San Diego Supercomputer Center

UC DAVIS Department of Computer Science

CYBERINFRASTRUCTURE FOR THE GEOSCIENCES

Scientific Workflows & GEON

Page 2: Scientific Workflows & GEON

GEON AHM May 5-6, 2005, San Diego


Scientific Workflows Pre-Cyberinfrastructure

• Data Federation & Grid “Plumbing”:
  – access, move, replicate, query … data (Data-Grid)
    • authenticate … SRB Sget/Sput … OPeNDAP, … Antelope/ORBs
  – schedule, launch, monitor jobs (Compute-Grid)
    • Globus, Condor, Nimrod, APST, …
• Data Integration:
  – Conceptual querying & integration, structure & semantics, e.g. mediation w/ SQL, XQuery + OWL (Semantics-enabled Mediator)
• Data Analysis, Mining, Knowledge Discovery:
  – manual/textbook (e.g. ternary diagrams), Excel, R, simulations, …
• Visualization:
  – 3-D (volume), 4-D (spatio-temporal), n-D (conceptual views) …

Problems:
• one-of-a-kind custom apps., detached (island) solutions
• workflows are hard to reproduce and maintain
• no/little workflow design, automation, reuse, documentation

Hence the need for an integrated scientific workflow environment

Page 3: Scientific Workflows & GEON


Page 4: Scientific Workflows & GEON


Analysis Workflow in KEPLER

• Scientific Workflow (SWF) design
• SWF automation
• Exploration & discovery mode (change parameters, data sets, etc. and rerun)
• SWF reuse, documentation, reproducibility
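The "change parameters and rerun" mode can be illustrated with a minimal sketch in plain Python. This is not the KEPLER API: the `Workflow` class, the `scale` and `threshold` steps, and the `factor` and `cutoff` parameters are all hypothetical names chosen for illustration. The point is only that a workflow captured as an explicit object can be re-executed with new settings while recording what was run.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical step signature: (current data, parameter dict) -> new data.
Step = Callable[[Any, Dict[str, Any]], Any]


class Workflow:
    """A linear chain of named steps. Illustration only, not the KEPLER API."""

    def __init__(self, steps: List[Tuple[str, Step]]) -> None:
        self.steps = steps
        # Simple run record: (step name, parameters used) for every executed step.
        self.log: List[Tuple[str, Dict[str, Any]]] = []

    def run(self, data: Any, **params: Any) -> Any:
        """Push `data` through every step in order, logging each execution."""
        for name, step in self.steps:
            data = step(data, params)
            self.log.append((name, dict(params)))
        return data


def scale(values: List[float], params: Dict[str, Any]) -> List[float]:
    """Multiply every value by the hypothetical `factor` parameter."""
    return [v * params["factor"] for v in values]


def threshold(values: List[float], params: Dict[str, Any]) -> List[float]:
    """Keep only values at or above the hypothetical `cutoff` parameter."""
    return [v for v in values if v >= params["cutoff"]]


wf = Workflow([("scale", scale), ("threshold", threshold)])

# Same workflow, same data set, two different parameter settings.
first = wf.run([1, 2, 3], factor=2, cutoff=3)
second = wf.run([1, 2, 3], factor=10, cutoff=15)
```

Because the design is stored once and the parameters are supplied per run, the log doubles as rudimentary documentation of each rerun.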

Page 5: Scientific Workflows & GEON


Some KEPLER Components (Actors)

Page 6: Scientific Workflows & GEON


KEPLER Team Work: GEON Dataset Generation & Registration

[Figure labels: contributors] Xiaowen (SDM); Edward et al. (Ptolemy); Yang (Ptolemy); Efrat (GEON); Ilkay (SDM); Matt, Chad, Dan et al. (SEEK)

[Figure label: component] SQL database access (JDBC)

% Makefile
$> ant run

Page 7: Scientific Workflows & GEON


KEPLER: an open source, cross-project collaboration

• Ilkay Altintas: SDM, Resurgence, NLADR, …
• Kim Baldridge: Resurgence, NMI
• Chad Berkley: SEEK
• Shawn Bowers: SEEK
• Terence Critchlow: SDM
• Tobin Fricke: ROADNet
• Jeffrey Grethe: BIRN
• Christopher H. Brooks: Ptolemy II
• Zhengang Cheng: SDM
• Dan Higgins: SEEK
• Efrat Jaeger: GEON
• Matt Jones: SEEK
• Werner Krebs: EOL
• Edward A. Lee: Ptolemy II
• Kai Lin: GEON
• Bertram Ludaescher: GEON, SDM, SEEK, BIRN, ROADNet
• Mark Miller: EOL
• Steve Mock: NMI
• Steve Neuendorffer: Ptolemy II
• Jing Tao: SEEK
• Mladen Vouk: SDM
• Xiaowen Xin: SDM
• Yang Zhao: Ptolemy II
• Bing Zhu: SEEK
• …

Ptolemy II

www.kepler-project.org

Your Logos & Names HERE!!!

Page 8: Scientific Workflows & GEON


Page 9: Scientific Workflows & GEON


Demonstration by Efrat Jaeger

Page 10: Scientific Workflows & GEON


Q & A

Page 11: Scientific Workflows & GEON


KEPLER: An Open Collaboration

• Initiated by members from NSF/ITR SEEK and DOE SDM/SPA; now several other projects (GEON, Ptolemy II, EOL, Resurgence/NMI, …)

• Open Source (BSD-style license)

• Intensive Communications:
  – Web-archived mailing lists
  – IRC (!)
  – Meetings, Hackathons

• Co-development:
  – via shared CVS repository
  – joining as a new co-developer (currently):
    • get a CVS account (read-only)
    • local development + contribution via existing KEPLER member
    • be voted “in” as a member/co-developer

Page 12: Scientific Workflows & GEON


Scientific Workflow (SWF) Design

• Support SWF design & reuse, via:
  – Structural data types
  – Semantic types
  – Associations (=constraints) between them
  – Type checking, inference, propagation

• Separation of concerns:
  – structure, semantics, WF orchestration, etc.
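How structural and semantic types might jointly constrain a connection between two workflow components can be sketched as follows. This is an assumption-laden toy in plain Python, not KEPLER's or Ptolemy II's type system: the `ONTOLOGY` table, the `Port` class, and the `can_connect` rule are hypothetical, and the rule used here (structural types must match exactly, the producer's concept must be the same as or more specific than the consumer's) is just one plausible reading of "type checking".

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical toy ontology: concept -> parent concept (None marks the root).
ONTOLOGY: Dict[str, Optional[str]] = {
    "Rock": None,
    "IgneousRock": "Rock",
    "Granite": "IgneousRock",
}


def is_subconcept(concept: str, ancestor: str) -> bool:
    """True if `concept` equals `ancestor` or lies below it in ONTOLOGY."""
    current: Optional[str] = concept
    while current is not None:
        if current == ancestor:
            return True
        current = ONTOLOGY.get(current)  # unknown concepts end the walk
    return False


@dataclass(frozen=True)
class Port:
    structural: str  # structural data type, e.g. "table" or "double"
    semantic: str    # semantic type: a concept name from ONTOLOGY


def can_connect(output: Port, input_: Port) -> bool:
    """Accept a link only if both the structural and the semantic check pass."""
    return (output.structural == input_.structural
            and is_subconcept(output.semantic, input_.semantic))


granite_out = Port("table", "Granite")
rock_in = Port("table", "Rock")

ok = can_connect(granite_out, rock_in)         # specific data into a general consumer
reversed_ok = can_connect(rock_in, granite_out)  # general data into a specific consumer
```

Keeping the two checks as separate fields mirrors the separation of concerns on the slide: the structural type can change (for example, table to XML) without touching the concept the data is about.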

Page 13: Scientific Workflows & GEON


Related Publications

Scientific Workflows

• Scientific Workflow Management and the Kepler System, B. Ludäscher, I. Altintas, C. Berkley, D. Higgins, E. Jaeger-Frank, M. Jones, E. Lee, J. Tao, Y. Zhao, Concurrency and Computation: Practice & Experience, Special Issue on Scientific Workflows, to appear, 2005.

• A Framework for the Design and Reuse of Grid Workflows, Ilkay Altintas, Adam Birnbaum, Kim Baldridge, Wibke Sudholt, Mark Miller, Celine Amoreira, Yohann Potier, and Bertram Ludaescher, Intl. Workshop on Scientific Applications on Grid Computing (SAG'04), LNCS 3458, Springer, 2005

• Kepler: An Extensible System for Design and Execution of Scientific Workflows, I. Altintas, C. Berkley, E. Jaeger, M. Jones, B. Ludäscher, S. Mock, 16th International Conference on Scientific and Statistical Database Management (SSDBM'04), 21-23 June 2004, Santorini Island, Greece.

• Kepler: Towards a Grid-Enabled System for Scientific Workflows, Ilkay Altintas, Chad Berkley, Efrat Jaeger, Matthew Jones, Bertram Ludäscher, Steve Mock, Workflow in Grid Systems (GGF10), Berlin, March 9th, 2004.

• An Ontology-Driven Framework for Data Transformation in Scientific Workflows, S. Bowers and B. Ludäscher, Intl. Workshop on Data Integration in the Life Sciences (DILS'04), March 25-26, 2004 Leipzig, Germany, LNCS 2994.

• A Web Service Composition and Deployment Framework for Scientific Workflows, I. Altintas, E. Jaeger, K. Lin, B. Ludaescher, A. Memon, In the 2nd Intl. Conference on Web Services (ICWS), San Diego, California, July 2004.

Page 14: Scientific Workflows & GEON


Data Integration

Knowledge Representation

Process Integration (Scientific Workflows)

Data Federation

EcoGrid

Source: B. Ludaescher, UC DAVIS, ECS-289 Scientific Data Management WQ’05