D-Grid and DEISA: Towards Sustainable Grid Infrastructures
Transcript of D-Grid and DEISA: Towards Sustainable Grid Infrastructures
Michael Gerndt, Technische Universität München
Technische Universität München
• Founded in 1868 by Ludwig II, King of Bavaria
• 3 campuses: München, Garching, Freising
• 12 faculties with 20,000 students, 480 professors, 5,000 employees
• 5 Nobel Prize winners and three winners who studied at TUM
• 26 Diplom study programs, 45 Bachelor and Master programs
Faculty of Informatics
• 19 Chairs
• 30 professors, 3000 students
• 340 first-year students in winter semester 2004/05
• Computer science: Diplom, Bachelor, Master
• Information systems: Bachelor, Master
• Bioinformatics: Bachelor, Master
• Applied informatics, computational science and engineering
Thanks
• Prof. Gentzsch: slides on D-Grid
• My students in Grid Computing: slides on DEISA
• Helmut Reiser from LRZ: discussion on the status of D-Grid and DEISA
e-Infrastructure
1. Resources: Networks with computing and data nodes, etc.
2. Development/support of standard middleware & grid services
3. Internationally agreed authentication, authorization, and auditing infrastructure
4. Discovery services and collaborative tools
5. Data provenance
6. Open access to data and publications via interoperable repositories
7. Remote access to large-scale facilities: Telescopes, LHC, ITER, ..
8. Application- and community-specific portals
9. Industrial collaboration
10. Service Centers for maintenance, support, training, utility, applications, etc.
Courtesy Tony Hey
Biomedical Scenario
• Bioinformatics scientists have to execute complex tasks
• There is a need to orchestrate these services in workflows
[Diagram: tools, computational power, and storage and data services exposed in a service-oriented architecture (SOA). Courtesy Livia Torterolo]
Gridified Scenario

[Diagram: the same tools, computational power, and storage and data services (SOA), now accessed through a grid layer.]
Grid technology leverages both the computational and the data-management resources, providing optimisation, scalability, reliability, fault tolerance, QoS, …
[Diagram: applications access the grid through a grid portal/gateway. Courtesy Livia Torterolo]
D-Grid e-Infrastructure
• Building a National e-Infrastructure for Research and Industry
01/2003: Pre-D-Grid working groups, recommendation to government
09/2005: D-Grid-1: early adopters, ‘Services for Science’
07/2007: D-Grid-2: new communities, ‘Service Grids’
12/2008: D-Grid-3: Service Grids for research and industry

D-Grid-1: 25 MEuro, > 100 organizations, > 200 researchers
D-Grid-2: 30 MEuro, > 100 additional organizations, > 200 additional researchers, plus industry
D-Grid-3: call 05/2008
• Important:
  • Sustainable production grid infrastructure after funding stops
  • Integration of new communities
  • Evaluating business models for grid services
Structure of D-Grid
[Diagram: Steering Committee (CG, DGI, leaders of the technical areas), the DGI Integration Project, the Community Grids, and an Advisory Board of external experts and industry, linked by informs/advises/reports/reviews relationships; the Steering Committee coordinates cooperation. D-Grid-1, -2, -3: 2005–2011.]
[Diagram: the Community Grids (AstroGrid, C3-Grid, HEP-Grid, IN-Grid, MediGrid, ONTOVERSE, WIKINGER, WISENT, TextGrid, Im Wissensnetz, ...) build on the Integration Project DGI-2, which provides business services (SLAs, SOA integration, virtualization), a user-friendly access layer with portals, and generic grid middleware and grid services.]
DGI Infrastructure Project
• Goals
  • Scalable, extensible, generic grid platform for the future
  • Long-term, sustainable, SLA-based grid operation
• Structure
  • WP 1: D-Grid basic software components
    – large storage, data interfaces, virtual organizations, management
  • WP 2: Develop, operate and support a robust core grid
    – resource description, monitoring, accounting, and billing
  • WP 3: Network (transport protocols, VPN) and security (AAI, CAs, firewalls)
  • WP 4: Business platform and sustainability
    – project management, communication and coordination
DGI Services, Available Dec 2006
• Sustainable grid operation environment with a set of core D-Grid middleware services
• Central registration and information management for all resources
• Packaged middleware components for gLite, Globus and Unicore and for data management systems SRB, dCache and OGSA-DAI
• D-Grid support infrastructure for new communities with installation and integration of new grid resources
• Help-Desk, Monitoring System and Central Information Portal
DGI Services, Dec 2006, cont.
• Tools for managing VOs based on VOMS and Shibboleth
• Prototype for Monitoring & Accounting for Grid resources, and first concept for a billing system
• Network and security support for Communities (firewalls in grids, alternative network protocols,...)
• DGI operates “Registration Authorities” issuing internationally accepted grid certificates of DFN & GridKa Karlsruhe
• Partners support new D-Grid members in building their own “Registration Authorities”
DGI Services, Dec 2006, cont.
• DGI will offer resources to other Communities, with access via gLite, Globus Toolkit 4, and UNICORE
• The portal framework GridSphere can be used by future users as a graphical user interface
• For administration and management of large scientific datasets, DGI will offer dCache for testing
• New users can use the D-Grid resources of the core grid infrastructure upon request
D-Grid Middleware

[Diagram: users reach distributed compute resources, data/software resources in D-Grid, and a distributed data archive through an application-development and user-access layer (GAT API, GridSphere, plug-ins). High-level grid services (scheduling and workflow management, monitoring, accounting/billing, user/VO management, data management, security) build on the basic grid services LCG/gLite, Globus 4.0.1, and UNICORE, all running over the network infrastructure.]
The DGI Infrastructure (10/2007)

[Map: 2,200 CPU cores, 800 TB disk, 1,400 TB tape across the participating sites.]
HEP-Grid: p-p collisions at the LHC at CERN

Image courtesy Harvey Newman, Caltech

[Diagram: the LHC tiered computing model. The online system streams ~PBytes/sec off the detector; ~100 MBytes/sec reaches the offline processor farm (~20 TIPS) at the CERN computer centre (Tier 0). Tier 1 regional centres (FermiLab ~4 TIPS, France, Italy, Germany) are connected at ~622 Mbits/sec (or air freight, deprecated). Tier 2 centres (~1 TIPS each, e.g. Caltech) connect at ~622 Mbits/sec, and institutes (~0.25 TIPS) with a physics data cache serve physicist workstations (Tier 4) at ~1 MBytes/sec. 1 TIPS is approximately 25,000 SpecInt95 equivalents.]

There is a “bunch crossing” every 25 nsec. There are 100 “triggers” per second, and each triggered event is ~1 MByte in size.

Physicists work on analysis “channels”. Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
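The trigger figures quoted above can be sanity-checked with a line of arithmetic. A minimal sketch; the ~1e7 seconds of effective beam time per year is an assumption of this sketch, not a number from the slides.

```python
# Back-of-the-envelope check of the quoted LHC trigger figures.
triggers_per_sec = 100        # "100 triggers per second"
event_size_mb = 1.0           # "each triggered event is ~1 MByte"

rate_mb_per_sec = triggers_per_sec * event_size_mb
print(rate_mb_per_sec)        # matches the ~100 MBytes/sec link into Tier 0

# Yearly raw volume, assuming ~1e7 seconds of effective beam time per year
# (an assumption of this sketch, not a figure from the slides).
seconds_per_year = 1e7
volume_pb = rate_mb_per_sec * seconds_per_year / 1e9   # MB -> PB
print(volume_pb)              # on the order of a petabyte per year
```

At these assumed figures, Tier 0 ingests roughly a petabyte of triggered data per year, which is why the tier model pushes caching outward toward the institutes.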
AstroGrid
MediGRID - Data flow in Life Science Grids
Grid-based Platform for Virtual Organisations in the Construction Industry
BIS-Grid: Grid Technologies for EAI
Grid-based Enterprise Information System
Use of Grid technologies to integrate distributed enterprise information systems
D-Grid: Towards a Sustainable Infrastructure for Science and Industry
• 3rd Call: Focus on service provisioning for science & industry
• Close collaboration with: Globus Project, EGEE, DEISA, CoreGRID, NextGRID, …
• Application- and user-driven, not infrastructure-driven => NEED
• Focus on implementation and production, not grid research, in a multi-technology environment (Globus, UNICORE, gLite, etc.)
• Government is (thinking of) changing policies for resource acquisition (HBFG!) to enable a service model
DEISA: Distributed European Infrastructure for Supercomputing Applications
What is DEISA?
• A Framework 6 project
• A consortium of leading supercomputing centres in Europe

What are the main goals?
• To deploy and operate a persistent, production-quality, distributed supercomputing environment with continental scope
• To enable scientific discovery across a broad spectrum of science and technology; scientific impact (enabling new science) is the only criterion for success
DEISA: Principal Project Partners

• IDRIS-CNRS, Paris (F): Prof. Victor Alessandrini (project director)
• LRZ, München (D): Dr. Horst-Dieter Steinhoefer
• RZG, Garching (D): Dr. Stefan Heinzel
• CINECA, Bologna (I): Dr. Sanzio Bassini
• EPCC, Edinburgh (GB): Dr. David Henty
• CSC, Espoo (FIN): Mr. Klaus Lindberg
• SARA, Amsterdam (NL): Dr. Axel Berg
• ECMWF, Reading (GB) (weather forecast): Mr. Walter Zwieflhofer
• FZJ, Jülich (D): Dr. Achim Streit
• BSC, Barcelona (ESP): Prof. Mateo Valero
• HLRS, Stuttgart (D): Prof. Michael Resch
[Timeline 2004–2008:
• DEISA deployed UNICORE, the enabler for transparent access to distributed resources; it allows high-performance data sharing at a continental scale as well as transparent job migration across similar platforms. A virtual dedicated 1 Gb/s internal network provided by GEANT.
• Evolving star-like configuration: 10 Gb/s Phase-2 network, allowing high-performance data transfer services across sites.
• Deployment of a co-scheduling service, synchronizing remote supercomputers.]
DEISA Architecture

[Diagram: Linux clusters, AIX distributed superclusters, and vector systems federated in one infrastructure.]

The DEISA environment incorporates different platforms and operating systems: IBM Linux on PowerPC, IBM AIX on Power4/5, SGI Linux on Itanium, and NEC vector systems.

Since 2007, the DEISA infrastructure's aggregated computing power is close to 190 Teraflops.

The GRID file system allows storage of data in heterogeneous environments, avoiding data redundancy.
DEISA Operation and Services
• Load balancing the computational workload across national borders
• Huge, demanding applications are run by reorganizing the global operation to allocate substantial resources at one site
• The application runs “as such”, with no modification. This strategy relies only on network bandwidth, which will keep improving in the years to come.
UNICORE Portal
Jobs may be attached to different target sites and systems
Dependencies between tasks / subjobs
UNICORE Infrastructure
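The task-dependency model sketched above can be pictured as a small directed acyclic graph of subjobs, each released only when its predecessors finish. A minimal sketch of the idea, not the actual UNICORE API; the job names and the trivial runner are hypothetical:

```python
# Minimal sketch of dependency-ordered job execution, UNICORE-workflow style:
# a subjob runs only after every subjob it depends on has completed.
# Job names and the run() body are hypothetical illustrations.
from graphlib import TopologicalSorter

# subjob -> set of subjobs it depends on
workflow = {
    "preprocess": set(),
    "simulate_siteA": {"preprocess"},
    "simulate_siteB": {"preprocess"},
    "collect": {"simulate_siteA", "simulate_siteB"},
}

def run(job):
    print(f"running {job}")

# static_order() yields each job after all of its dependencies.
order = list(TopologicalSorter(workflow).static_order())
for job in order:
    run(job)
```

In any valid ordering, "preprocess" comes first and "collect" last; the two simulation subjobs could also run in parallel on different target systems, which is the point of attaching them to different sites.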
DEISA – Extreme Computing Initiative
• Launched in May 2005 by the DEISA Consortium, as a way to enhance its impact on science and technology.
• Applications adapted to the current DEISA grid
• International collaborations involving scientific teams
• Workflow applications involving at least two platforms
• Coupled applications involving more than one platform
• The Applications Task Force (ATASKF) was created in April 2005.
• It is a team of leading experts in high-performance and grid computing whose major objective is to provide the consultancy needed to enable users' adoption of the DEISA research infrastructure.
Common Production Environment
• DEISA Common Production Environment (DCPE)
  • Defined and deployed on each computer integrated in the platform
• The DCPE includes:
  • shells (Bash and Tcsh),
  • compilers (C, C++, Fortran and Java),
  • libraries (for communication, data formatting, numerical analysis, etc.),
  • tools (debuggers, profilers, editors, batch and workflow managers, etc.),
  • and applications.
• Accessible via the module command
  • list, load and unload each component
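In practice, access through `module` looks like the following session; the component name shown is illustrative, not taken from the actual DCPE catalogue, and the available names vary per site.

```shell
# Show which components are currently loaded in this session
module list

# Show all components the site makes available
module avail

# Load a component into the environment, then remove it again
# (the name "ddt" is an illustrative example)
module load ddt
module unload ddt
```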
Monitoring: The Inca System
• The Inca system provides user-level grid monitoring
  • Periodic, automated testing of the software and services required to support persistent, reliable grid operation
  • Collect, archive, publish, and display data
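The pattern described above (run checks on a schedule, archive the results, publish them) can be sketched in a few lines. This is an illustrative skeleton, not Inca's actual reporter API; the check names and the in-memory archive are hypothetical stand-ins for real probes and Inca's central depot.

```python
# Illustrative Inca-style monitoring skeleton: run registered checks,
# timestamp each result, and archive it for later publishing.
# The check names and probe bodies are hypothetical stand-ins for real
# tests (middleware versions, batch-queue reachability, certificates, ...).
import time

checks = {
    "compiler_present": lambda: True,   # e.g. would probe for a compiler
    "batch_system_up": lambda: True,    # e.g. would ping the queue manager
}

archive = []   # stands in for Inca's central depot

def run_all():
    """Run every registered check once; archive a timestamped record each."""
    results = [{"check": name, "ok": probe(), "at": time.time()}
               for name, probe in checks.items()]
    archive.extend(results)
    return results

# One monitoring sweep; a real deployment would repeat this on a schedule.
results = run_all()
print(all(r["ok"] for r in results))
```

A display layer would then read the archive and render per-site status, which is the user-visible half of Inca's "collect, archive, publish, display" cycle.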
Potential Grid Inhibitors
• Sensitive data, sensitive applications (medical patient records)
• Different organizations have different ROI
• Accounting, who pays for what (sharing!)
• Security policies: consistent and enforced across the grid !
• Lack of standards prevents interoperability of components
• Current IT culture is not predisposed to sharing resources
• Not all applications are grid-ready or grid-enabled
• SLAs based on open source (liability?)
• “Static” licensing models don't embrace the grid
• Protection of intellectual property
• Legal issues (privacy, national laws, multi-country grids)
Lessons Learned and Recommendations
• Large infrastructure update cycles
  • During development and operation, the grid infrastructure should be modified and improved in large cycles only: all applications depend on this infrastructure!
• Funding required after project
  • Continuity, especially for the infrastructure part of grid projects, is important. Therefore, funding should be available after the project to guarantee services, support, and continuous improvement and adjustment to new developments.
• Interoperability
  • Use software components and standards from open-source and standards initiatives, especially in the infrastructure and application-middleware layers.
• Close collaboration of grid developers and users
  • Mandatory to best utilize grid services and to avoid application silos.
Lessons Learned and Recommendations, cont.
• Management board steering collaboration
  • For complex projects (infrastructure and application projects), a management board (consisting of the leaders of the different projects) should steer coordination and collaboration among the projects.
• Reduce re-invention of the wheel
  • New projects should utilize the general infrastructure and focus on an application or a specific service, to avoid complexity, re-inventing wheels, and building grid application silos.
• Participation of industry has to be industry-driven
  • A push from outside, even with government funding, is not promising. Success will come only from real needs, e.g. through existing collaborations between research and industry, as a first step.