e-Science and Grid The VL-e approach

26
e-Science and Grid The VL-e approach L.O. (Bob) Hertzberger Computer Architecture and Parallel Systems Group Department of Computer Science Universiteit van Amsterdam [email protected]

description

e-Science and Grid The VL-e approach. L.O. (Bob) Hertzberger Computer Architecture and Parallel Systems Group Department of Computer Science Universiteit van Amsterdam [email protected]. Background information experimental sciences. Experiments become increasingly more complex - PowerPoint PPT Presentation

Transcript of e-Science and Grid The VL-e approach

Page 1: e-Science and Grid The VL-e approach

e-Science and GridThe VL-e approach

L.O. (Bob) Hertzberger

Computer Architecture and Parallel Systems GroupDepartment of Computer Science

Universiteit van Amsterdam

[email protected]

Page 2: e-Science and Grid The VL-e approach

Background informationexperimental sciences

• Experiments become increasingly more complex Driven by detector developments

Resolution increases Automation & robotization increases

• Results in an increase in amount and complexity of data

• Something has to be done to harness this development Virtualization of experimental resources: e-Science

Page 3: e-Science and Grid The VL-e approach

The Application data crisis

• Scientific experiments start to generate lots of data medical imaging (fMRI): ~ 1 GByte per measurement (day) Bio-informatics queries: 500 GByte per database Satellite world imagery: ~ 5 TByte/year Current particle physics: 1 PByte per year LHC physics (2007): 10-30 PByte per year

• Data is often very distributed

Page 4: e-Science and Grid The VL-e approach

Paradigm shift in Life science

• Past experiments where hypothesis drivenEvaluate hypothesisComplement existing knowledge

• Present experiments are data drivenDiscover knowledge from large amounts

of data Apply statistical techniques

Page 5: e-Science and Grid The VL-e approach

The what of e-Science• e-Science is the application domain “Science” of Grid

& Web More than only coping with data explosion A multi-disciplinary activity combining human expertise &

knowledge between: A particular domain scientist ICT scientist

• e-Science demands a different approach to experimentation because computer is integrated part of experiment

Consequence is a radical change in design for experimentation

• e-Science should apply and integrate Web/Grid methods where and whenever possible

Page 6: e-Science and Grid The VL-e approach

Grid and Web ServicesConvergence

Grid

Definition of Web Service Resource Framework(WSRF) makes explicit distinction between “service” and stateful entities acting upon service i.e. the “resources” Means that Grid and Web communities can move forward on a common base

WSRF

Started far apart in apps & tech

OGSIGT2

GT1

HTTPWSDL, WS-*

WSDL 2, WSDM

Have beenconverging

Ref: Foster

Web

Page 7: e-Science and Grid The VL-e approach

Grid service ‘offerings’• Capability to run programs and scripts on remote

sites on demand• Ability to exchange and replicate large bulk-data sets• Replica location services for files based on logical

names• Job monitoring using a distributed relational

information system• Resource brokering and transparent access to

remote facilities• Management of user groups, roles and access rights

Page 8: e-Science and Grid The VL-e approach

Relation to European Grid infrastructures

• Common European e-Infrastructure middleware (EGEE) for core grid services

• Based on successful EU DataGrid, CrossGrid, and LCG software suite

• Already deployed worldwide on a O(100) site production facility

• Support through EGEE Regional Operations Centre (SARA and NIKHEF)

EGEE: Enabling Grids for E-science in Europe (EU FP6)

Page 9: e-Science and Grid The VL-e approach

Levels of Grid abstraction

Computational Grid

Data Grid

Information Web/Grid

Semantic/Knowledge Web/Grid

Page 10: e-Science and Grid The VL-e approach

e-Science Objectives• It should enhance the scientific process by:• Stimulating collaboration by sharing data & information

Improve re-use of data & information• Combing data and information from different modalities

Sensor data & information fusion• Realize the combination of real life & (model based) simulation experiments

• It should result in:• Computer aided support for rapid prototyping of ideas

Stimulate the creativity process

• It should realize that by creating & applying: New computing methodologies and an infrastructure stimulating this

• We try to do this via the Virtual Lab for e-Science (VL-e) project

Page 11: e-Science and Grid The VL-e approach

Virtual Lab for e-Science research Philosophy

• Multidisciplinary research & development of related ICT infrastructure

• Generic application support Application cases are drivers for computer & computational

science and engineering research

Page 12: e-Science and Grid The VL-e approach

Grid/Web ServicesHarness multi-domain distributed resources

Managementof comm. & computing

VL-e Application Oriented Services

Food Informatics

Dutch Telescience

Medical Diagnosis &

Imaging

VL-e projectBio-

Informatics Data

Intensive Science/

HEP

Bio-Diversity

Page 13: e-Science and Grid The VL-e approach

Virtual Lab for e-Science research Philosophy

• Multidisciplinary research and development of related ICT infrastructure

• Generic application support Application cases are drivers for computer & computational

science and engineering research Problem solving partly generic and partly specific Re-use of components via generic solutions whenever

possible

Page 14: e-Science and Grid The VL-e approach

Grid/ Web ServicesHarness multi-domain distributed resources

Managementof comm. & computing

Managementof comm. & computing

Managementof comm. & computing

Potential Genericpart Potential Generic

partPotential Generic

part

ApplicationSpecific

Part

ApplicationSpecific

Part

ApplicationSpecific

Part

Virtual Laboratory Application Oriented Services

App

licat

ion

pull

Page 15: e-Science and Grid The VL-e approach

Generic e-Science aspects• Virtual Reality Visualization & user interfaces• Imaging • Modeling & Simulation

Interactive Problem Solving• Data & information management

Data modeling dynamic work flow management

• Content (knowledge) management Semantic aspects Meta data modeling

Ontologies

• Wrapper technology• Design for Experimentation

Page 16: e-Science and Grid The VL-e approach

Virtual Lab for e-Science research Philosophy

• Multidisciplinary research and development of related ICT infrastructure

• Generic application support Application cases are drivers for computer & computational

science and engineering research Problem solving partly generic and partly specific Re-use of components via generic solutions whenever possible

• Rationalization of experimental process among others the experimental pipeline Reproducible & comparable

Page 17: e-Science and Grid The VL-e approach

Issues for a reproducible scientific experiment

interpretation

Rationalization of the experiment and processes via protocols

processingprocessed data

conversion, filtering,analyses, simulation, …experiment

parameters/settings,algorithms,

intermediate results,…

Parameter settings,Calibrations,

Protocols…

software packages,algorithms

raw dataacquisition

sensors,amplifiers imaging devices,, …

presentationvisualization, animationinteractive exploration, …

MetadataMuch of this is lost when an experiment is completed.

Page 18: e-Science and Grid The VL-e approach

Scientific Workflow Management Systems in an e-Science environment• Functionalities:

Automating experiment routines;

Rapid prototyping of experimental computing systems;

Hiding integration details between resources;

Managing experiment lifecycle;

• Cross different layers of middleware for managing: Data; Computing; Information; Knowledge.

Generic Grid middleware

Data management

Computing tasks

Information

Knowledge

SWMS High level workflow services

Engine

User support

Domain specific Applications

e-Science framework

Grid infrastructure

Page 19: e-Science and Grid The VL-e approach

Virtual Lab for e-Science research Philosophy

• Multidisciplinary research and development of related ICT infrastructure

• Generic application support Application cases are drivers for computer & computational science and

engineering research Problem solving partly generic and partly specific Re-use of components via generic solutions whenever possible

• Rationalization of experimental process Reproducible & comparable

• Two research experimentation environments Proof of concept for application experimentation Rapid prototyping for computer & computational science experimentation

Page 20: e-Science and Grid The VL-e approach

The VL-e infrastructure

Grid Middleware

Surfnet

Application specificservice

Application Potential

Generic service &

Virtual Lab. services

Grid &

NetworkServices

Virtual Laboratory

VL-e Proof of Concept Environment

Telescience Medical Application Bio ASP

VL-e Experimental Environment

Virtual Lab.rapid prototyping

(interactive simulation)

Additional Grid Services

(OGSA services)

Network Service (lambda networking)

VL-e Certification Environment

Test & Cert.Compatibility

Test & Cert.Grid Middleware

Test & Cert.VL-software

Page 21: e-Science and Grid The VL-e approach

Infrastructure for Applications

• Applications are a driving force of the PoC

• Experience shows applications value stability

• Foster two-way interaction to make this happen

Page 22: e-Science and Grid The VL-e approach

VL-e PoC environment• Latest certified stable software environment of

core grid and VL-e services• Core infrastructure built around clusters and

storage at SARA and NIKHEF (‘production’ quality) Good basis for Tier-1

• Controlled extension to other platforms and distributions

• On the user end: install needed servers: user interface systems, storage elements for data disclosure, grid-secured DB access

• Focus on stability and scalability

Page 23: e-Science and Grid The VL-e approach

Hosted services for VL-e

• Key services and resources are offered centrally for all applications in VL-e

• Mass data and number crunching on the large resources at SARA

• Storage for data replication & distribution• Persistent ‘strategic’ storage on tape• Resource brokers, resource discovery, user

group management

Page 24: e-Science and Grid The VL-e approach

Why such a complex scheme?• “software is part of the infrastructure”• stability of core software needed to

develop the new scientific applications• enable distributed systems management

(who runs what version when?)

“the grid is one big error amplifier”“computers make mistakes like

humans, only much, much faster”

Page 25: e-Science and Grid The VL-e approach

Building a scalable infrastructure

With good code, stable releases & supportyou can build large working systems, useful to science

Page 26: e-Science and Grid The VL-e approach

Conclusions• e-Science is a lot more more than trying to cope with

data explosion alone• Implementation of e-Science systems requires further

rationalization and standardization of experimentation process

• e-Science success demands the realization of an environment allowing application driven experimentation & rapid dissemination of feed back of these new methods

• We try to do that via development of Proof of Concept• Good basis for HEP Tier-1