An Overview of Scientific Workflows: Domains & Applications

An Overview of Scientific Workflows: Domains & Applications Laboratoire Lorrain de Recherche en Informatique et ses Applications Presented by Khaled Gaaloul Environments COOperation


Transcript of An Overview of Scientific Workflows: Domains & Applications

Page 1: An Overview of Scientific Workflows: Domains & Applications

An Overview of Scientific Workflows: Domains & Applications

Laboratoire Lorrain de Recherche en Informatique et ses Applications

Presented by

Khaled Gaaloul

Environments COOperation

Page 2: An Overview of Scientific Workflows: Domains & Applications

Plan


I. Context & Problem Statement

II. State of the Art

III. In Progress

IV. Conclusion & Perspectives

Page 3: An Overview of Scientific Workflows: Domains & Applications

I. Context & Problem Statement

Page 4: An Overview of Scientific Workflows: Domains & Applications


Context: Scientific applications

Need for a workflow management system (WFMS) to orchestrate and optimize scientific endeavors

Collecting, generating, and analyzing large data flows

Need for mechanisms supporting interactions between heterogeneous applications


Page 5: An Overview of Scientific Workflows: Domains & Applications


Context: Scientific applications integration


[Figure: Dynamic Scheduling of a Scientific Process. Six steps (Step1 to Step6) linked by AND and XOR connectors and distributed across four laboratories (Labo.1 to Labo.4). Annotations: definition & specification of processes; data flow managing; process orchestration.]
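The scheduling idea on this slide (steps joined by AND and XOR connectors and spread across laboratories) can be sketched in a few lines. This is only a minimal illustration: the `Step` class, the `ready_steps` helper, the assignment of steps to laboratories, and the wiring between steps are all assumptions, because the edges of the original diagram cannot be recovered from the transcript.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    lab: str                                        # laboratory assumed to host the step
    preds: list[str] = field(default_factory=list)  # names of predecessor steps
    join: str = "AND"                               # "AND": all preds done; "XOR": any one done


def ready_steps(steps, done):
    """Return the names of the steps that may start, given the finished ones."""
    ready = []
    for step in steps.values():
        if step.name in done:
            continue
        finished = [p in done for p in step.preds]
        ok = all(finished) if step.join == "AND" else any(finished)
        if ok:
            ready.append(step.name)
    return ready


# Hypothetical wiring, chosen only to exercise both connector types.
steps = {
    "Step1": Step("Step1", "Labo.1"),
    "Step2": Step("Step2", "Labo.2", ["Step1"]),
    "Step3": Step("Step3", "Labo.3", ["Step1"]),
    "Step4": Step("Step4", "Labo.4", ["Step2", "Step3"], "AND"),
    "Step5": Step("Step5", "Labo.2", ["Step2"]),
    "Step6": Step("Step6", "Labo.3", ["Step4", "Step5"], "XOR"),
}

print(ready_steps(steps, set()))
print(ready_steps(steps, {"Step1", "Step2"}))
```

Calling `ready_steps` again after each step finishes is what makes the schedule dynamic: the set of runnable steps is recomputed from the current state instead of being fixed in advance.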

Page 6: An Overview of Scientific Workflows: Domains & Applications


Prerequisites for scientific applications

High degree of flexibility

High-performance resource distribution

Ad hoc workflow architecture: evolving and hierarchical

Data flow management:

- Automating data streaming

- Enriching the semantic level

- Documentation & reusability


Page 7: An Overview of Scientific Workflows: Domains & Applications


Problem: How to optimize and orchestrate the execution of scientific processes?

Problems in managing shared resources: heterogeneous environments, virtual organizations (VO), etc.

Evolving applications: non-determinism

Current approaches: lack of reusability and documentation; business-process oriented

Format evolution within data exchanges


Page 8: An Overview of Scientific Workflows: Domains & Applications


Problem: New requirements


[Figure: a process handled by "Designers", with steps Step1 to Step6, AND and XOR connectors, and three sub-processes (sub process 1, 2, and 3). Annotations: "To deal with heterogeneity" and "To deal with data exchange".]

Page 9: An Overview of Scientific Workflows: Domains & Applications

State of the Art

Page 10: An Overview of Scientific Workflows: Domains & Applications


Scientific workflow

Definition: the application of workflow technology to scientific endeavors, recognized as a valuable approach for assisting scientists in accessing and analyzing data.

Features:

- Support for large data flows;

- Dynamic environment;

- Incomplete workflow: partial definition;

- Ad hoc planning;

- Reusability, documentation, etc.


Page 11: An Overview of Scientific Workflows: Domains & Applications


Scientific Workflow

- Scientific domain: dedicated to data flow management

- More dynamic: workflow not predefined

- Traceability and documentation: enriching the semantic level within data exchanges

Business Workflow

- Business domain: dedicated to process management and optimization

- Many constraints: predefined workflow, satisfying end, execution constraints, etc.

- Lack of formalism: syntactic level


Page 12: An Overview of Scientific Workflows: Domains & Applications


Scientific Workflow Vs Business Workflow


Page 13: An Overview of Scientific Workflows: Domains & Applications


GRID (Globalization of Computing Resources and Data)

Solution for intensive computing

Virtual organization (VO)

- Includes different user communities

- Shares global resources (storage, processing)

- Strong impact on organizational structure, networks, and security

Page 14: An Overview of Scientific Workflows: Domains & Applications


GridFlow (1): GRID and Workflow?

GRID complexity

- Virtual organization

- Need for visualization, management, and simulation

WfMS as a Grid service

- Transparent access to one or more grids grouping heterogeneous machines

- Portals for users


Page 15: An Overview of Scientific Workflows: Domains & Applications


GridFlow (2): Architecture


Page 16: An Overview of Scientific Workflows: Domains & Applications


PBIO: or how to deal with format evolution?

Heterogeneous environment, ad hoc solutions

- Data exchanges and complex communication

- Format evolution: lack of standardization of data streaming

PBIO (Portable Binary Input/Output)

- Approach to deal with binary data in storage and transmission

- Record-oriented binary communication mechanism

- Data meta-representation

- Optimizing data storage/transmission

- Improving the communication between processes
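The combination of record-oriented binary transfer and data meta-representation listed above can be illustrated with a small sketch. It is not the PBIO API: it uses Python's standard `struct` module, and the field names, the `FMT_V1`/`FMT_V2` descriptions, and the `pack_record`/`unpack_record` helpers are hypothetical. The point it shows is that when the sender's field description travels with the data, a receiver can cope with a format that has evolved.

```python
import struct


def pack_record(fmt, record):
    """Encode a record as compact binary, following a field description."""
    layout = "<" + "".join(code for _, code in fmt)
    return struct.pack(layout, *(record[name] for name, _ in fmt))


def unpack_record(sender_fmt, payload, wanted, defaults=None):
    """Decode with the sender's description, then keep the receiver's fields."""
    layout = "<" + "".join(code for _, code in sender_fmt)
    values = struct.unpack(layout, payload)
    decoded = dict(zip((name for name, _ in sender_fmt), values))
    defaults = defaults or {}
    return {name: decoded.get(name, defaults.get(name)) for name in wanted}


# Meta-representation: (field name, struct type code). Both versions are invented.
FMT_V1 = [("node_id", "i"), ("pressure", "d")]
FMT_V2 = [("node_id", "i"), ("pressure", "d"), ("temperature", "d")]

# A newer sender talks to an older receiver: the extra field is ignored.
payload = pack_record(FMT_V2, {"node_id": 7, "pressure": 1.5, "temperature": 300.0})
old_view = unpack_record(FMT_V2, payload, wanted=["node_id", "pressure"])

# An older sender talks to a newer receiver: the missing field gets a default.
payload_v1 = pack_record(FMT_V1, {"node_id": 7, "pressure": 1.5})
new_view = unpack_record(
    FMT_V1, payload_v1,
    wanted=["node_id", "pressure", "temperature"],
    defaults={"temperature": 0.0},
)
print(old_view, new_view)
```

Matching fields by name rather than by position is the design choice that lets either side change its record layout without breaking the other.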


Page 17: An Overview of Scientific Workflows: Domains & Applications

In Progress

Page 18: An Overview of Scientific Workflows: Domains & Applications


Cooperative processes for scientific workflows

Cooperation between applications

- More flexible applications

- Working and communicating within the same virtual workspace

- Performing common tasks synchronously or asynchronously

BONITA: a flexible system for cooperative workflow

- Defines, specifies, executes, and coordinates different flows of work

- Based on the anticipation model

- Provides an interface for modeling and visualizing processes

- Managing flexible data


Page 19: An Overview of Scientific Workflows: Domains & Applications


Motivating Example: Digitization scenario


1- Original Model

2- Digitization

3- CAD+ Reconstruction & Modification

4- Simulation

5- Customer's Requirements

6th step

7- Prototyping

8- Prototype Lifting

9th step

10- Testing

11th step

Data flow: recovery of the inputs/outputs of the CAD step

Page 20: An Overview of Scientific Workflows: Domains & Applications


Deploying the scenario into Bonita

Enhance execution flexibility

Anticipation: process optimization


[Figure: process execution of the CAD, Simulation, and Customer's Requirements (CR) activities under a classic WFMS compared with BONITA. Activity states shown for BONITA: Anticipable, Anticipating, Executing.]
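The contrast between classic execution and anticipation can be sketched as follows. This is an illustration of the idea only, not Bonita's API: the `cad` generator, the model names, and the `run_classic` and `run_anticipating` helpers are hypothetical. The assumption is that the CAD activity publishes draft results before it ends, and that an anticipating successor may work on those drafts instead of waiting.

```python
def cad():
    """Hypothetical CAD activity that publishes drafts before its final model."""
    yield ("draft", "model-v1")
    yield ("draft", "model-v2")
    yield ("final", "model-v3")


def run_classic(producer):
    """Classic WFMS: the successor starts only once its predecessor has ended."""
    log = []
    results = list(producer())      # wait for the producer to terminate
    _, model = results[-1]
    log.append(("executing", model))
    return log


def run_anticipating(producer):
    """Anticipation: the successor starts early on intermediate results."""
    log = []
    for status, model in producer():
        state = "anticipating" if status == "draft" else "executing"
        log.append((state, model))
    return log


print(run_classic(cad))
print(run_anticipating(cad))
```

In the anticipating run, the simulation has already processed two drafts by the time the final model arrives, which is where the gain in execution flexibility is expected to come from.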

Page 21: An Overview of Scientific Workflows: Domains & Applications


Mapping Data-Intensive Science into BONITA

Considerable data flows

Goal: Optimize data streaming & enhance the data exchange mechanism

[Figure: architecture diagram. Labels: WF Engine, Data Management, WF Execution, Data Flow, Control execution, Services Call, PBIO framework, Messages Exchange, Data flow computing. Activities shown: CAD, Simulation, CR.]

Page 22: An Overview of Scientific Workflows: Domains & Applications


Discussions

Existing approach: Flow-Based Programming (FBP)

- A new/old approach to scientific application development

- Data flow vs. workflow: which one fits our needs?

- Is it possible to anticipate an activity with a partial result?

PBIO implementation

- Interaction with Bonita service calls

- Need for middleware such as Echo Event to support message exchange

- Portability of the PBIO approach to existing platforms
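For readers unfamiliar with Flow-Based Programming, the sketch below shows its core idea: independent components exchange data packets over bounded connections, so data starts flowing downstream before upstream work is complete. It uses only Python's standard `queue` and `threading` modules; the `component` and `run_network` helpers and the `END` marker are hypothetical names, and the sketch is not tied to Bonita, PBIO, or Echo.

```python
import queue
import threading

END = object()  # marker signalling the end of the packet stream


def component(fn, inbox, outbox):
    """Start a component that applies fn to every packet it receives."""
    def run():
        while True:
            item = inbox.get()
            if item is END:
                outbox.put(END)
                return
            outbox.put(fn(item))

    thread = threading.Thread(target=run)
    thread.start()
    return thread


def run_network(packets, *fns):
    """Chain components over bounded queues and collect the final output."""
    queues = [queue.Queue(maxsize=2) for _ in range(len(fns) + 1)]
    threads = [component(fn, queues[i], queues[i + 1]) for i, fn in enumerate(fns)]

    def feed():
        for packet in packets:
            queues[0].put(packet)
        queues[0].put(END)

    feeder = threading.Thread(target=feed)
    feeder.start()

    out = []
    while (item := queues[-1].get()) is not END:
        out.append(item)

    feeder.join()
    for thread in threads:
        thread.join()
    return out


print(run_network([1, 2, 3, 4, 5], lambda x: x * 2, lambda x: x + 1))
```

Because each connection holds at most two packets, the second component necessarily begins working while the first is still producing, which is the data-flow counterpart of anticipating an activity on a partial result.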


Page 23: An Overview of Scientific Workflows: Domains & Applications

Conclusion & Perspectives

Page 24: An Overview of Scientific Workflows: Domains & Applications

Conclusions:

Cooperative aspect for scientific applications

Combining strong concepts (GRID & workflows)

Developing a new middleware for scientific processes

Perspectives:

Application to the GRID: Bonita as a GRID service

Adding non-intrusive and user-friendly aspects

Collaboration with AURARYD on other scenarios (Volkswagen, BP)
