Grids at NIKHEF 2004.07.14 1 Grid Computing @ NIKHEF David Groep NIKHEF PDP 2004.07.14 The (data)...


Page 1

Grid Computing @ NIKHEF

David Groep, NIKHEF PDP, 2004.07.14

• The (data) problem to solve

• beyond meta-computing: the Grid

• realizing the Grid at NIKHEF

• towards a national infrastructure

Page 2

A Glimpse of the Problem in HEP

Place event info on 3D map

Trace trajectories through hits

Assign type to each track

Find particles you want

Needle in a haystack!

This is the “relatively easy” case

Page 3

The HEP reality

Page 4

HEP Data Rates

40 MHz (40 TB/sec) → level 1 - special hardware

75 kHz (75 GB/sec) → level 2 - embedded processors

5 kHz (5 GB/sec) → level 3 - PCs

100 Hz (100 MB/sec) → data recording & offline analysis

• Reconstructing & analyzing 1 event takes about 90 s

• Maybe only a few out of a million are interesting. But we have to check them all!

• Analysis program needs lots of calibration, determined from inspecting the results of the first pass. Each event will be analyzed several times!

• Raw data rate: ~5 PByte/yr/expt.

• Total volume: ~20 PByte/yr

• Per major centre: ~2 PByte/yr

The ATLAS experiment
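The numbers on this slide can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function and variable names are made up here, decimal prefixes are assumed, and the CPU estimate assumes that the 90 s applies to a single CPU and that each event is processed exactly once (the slide notes that events are in fact analyzed several times, so this is a lower bound).

```python
# Rates and bandwidths as quoted on the slide, in trigger order.
# Each entry: (label, event rate in Hz, bandwidth in bytes per second).
STAGES = [
    ("into level 1", 40e6, 40e12),
    ("into level 2", 75e3, 75e9),
    ("into level 3", 5e3, 5e9),
    ("to data recording", 100.0, 100e6),
]

SECONDS_PER_EVENT = 90.0  # reconstruct & analyze one event (from the slide)


def event_size_bytes(rate_hz, bandwidth_bytes_per_s):
    """Event size implied by a quoted rate and bandwidth."""
    return bandwidth_bytes_per_s / rate_hz


def cpus_to_keep_up(rate_hz, seconds_per_event):
    """CPUs needed to process events as fast as they arrive (single pass)."""
    return rate_hz * seconds_per_event


# Every stage implies the same event size.
sizes = [event_size_bytes(rate, bw) for _, rate, bw in STAGES]

# Overall reduction in event rate from collisions to recorded events.
rejection = STAGES[0][1] / STAGES[-1][1]

# Assumption: one pass per event, one event at a time per CPU.
cpus = cpus_to_keep_up(STAGES[-1][1], SECONDS_PER_EVENT)

if __name__ == "__main__":
    for (label, _, _), size in zip(STAGES, sizes):
        print(f"{label}: {size / 1e6:.1f} MB per event")
    print(f"rate reduction: {rejection:.0f}x")
    print(f"CPUs for one pass at the recording rate: {cpus:.0f}")
```

Under these assumptions every trigger stage implies the same event size, and a single pass over the recorded stream already calls for thousands of CPUs.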

Page 5

Data handling and computation

[Diagram with boxes: detector; event filter (selection & reconstruction); raw data; event summary data; processed data; event reprocessing; event simulation; batch physics analysis; analysis objects (extracted by physics topic); interactive physics analysis]

Page 6

HEP is not unique in generating data

• LOFAR: 200 MHz, 12 bits, 25k antennas: 60 Tbit/s

• Envisat GOME: ~5 TByte/year

• Materials analysis (mass spectroscopy, &c): ~2 GByte/10 min

• fMRI, PET/MEG, …

LHC data volume necessitates ‘provenance’ and meta-data

information/data ratio even higher in other disciplines

both data and information ownership distributed

access rights for valuable data, add privacy for medical data
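The LOFAR figure follows directly from the three numbers quoted with it. A minimal sketch, assuming that 200 MHz is the sampling rate per antenna, that 25k means 25 000, and that prefixes are decimal; the yearly volume for materials analysis further assumes the instrument runs continuously, which the slide does not state. All names are illustrative.

```python
def aggregate_bit_rate(sample_rate_hz, bits_per_sample, n_antennas):
    """Total raw bit rate of an antenna array, in bits per second."""
    return sample_rate_hz * bits_per_sample * n_antennas


def yearly_volume_bytes(bytes_per_interval, interval_minutes):
    """Volume per year if data is produced around the clock (assumption)."""
    intervals_per_year = 365 * 24 * 60 / interval_minutes
    return bytes_per_interval * intervals_per_year


# LOFAR: 200 MHz, 12 bits, 25k antennas.
lofar_bits_per_s = aggregate_bit_rate(200e6, 12, 25_000)

# Materials analysis: ~2 GByte per 10 minutes.
materials_bytes_per_year = yearly_volume_bytes(2e9, 10)

if __name__ == "__main__":
    print(f"LOFAR: {lofar_bits_per_s / 1e12:.0f} Tbit/s")
    print(f"Materials analysis: {materials_bytes_per_year / 1e12:.0f} TByte/yr")
```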

Page 7

Beyond meta-computing: the Grid

How can the Grid help? Via resource accessibility and via sharing.

A grid integrates resources that

– are not owned or administered by one single organisation

– speak a common, open protocol … that is generic

– work as a coordinated, transparent system

And …

– can be used by many people from multiple organisations

– that work together in one Virtual Organisation

Page 8

Virtual Organisations

A set of individuals or organisations, not under single hierarchical control, temporarily joining forces to solve a particular problem at hand, bringing to the collaboration a subset of their resources, sharing those at their discretion and each under their own conditions.

• A VO is a temporary alliance of stakeholders

– Users

– Service providers

– Information providers
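As a data structure, the definition above might be modelled roughly as follows. This is a toy sketch only: the class names, the role-based sharing condition and the example names are all hypothetical and do not correspond to any actual Grid middleware.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    name: str
    owner: str  # the organisation that keeps control of the resource


@dataclass(frozen=True)
class Offer:
    """A resource shared with the VO under the provider's own condition."""
    resource: Resource
    allowed_roles: frozenset  # hypothetical condition: which roles may use it


@dataclass
class VirtualOrganisation:
    name: str
    members: dict = field(default_factory=dict)  # user -> role within the VO
    offers: list = field(default_factory=list)

    def join(self, user, role):
        self.members[user] = role

    def share(self, resource, allowed_roles):
        # Providers contribute only the subset of resources they choose.
        self.offers.append(Offer(resource, frozenset(allowed_roles)))

    def usable_by(self, user):
        """Names of the shared resources this user may access."""
        role = self.members.get(user)
        return [o.resource.name for o in self.offers if role in o.allowed_roles]


# Hypothetical example: two providers, each with its own sharing condition.
vo = VirtualOrganisation("example-vo")
vo.join("alice", "analyst")
vo.join("bob", "production")
vo.share(Resource("cluster-a", "site-1"), {"analyst", "production"})
vo.share(Resource("tape-store-b", "site-2"), {"production"})
```

Here `vo.usable_by("alice")` yields only the cluster, while a user who never joined the VO gets nothing: each provider sets its own conditions, and no single party controls all resources.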

Page 9

Common and open protocols

[Diagram: layered Grid architecture. Layers and labels shown:]

Applications

Application Toolkits: DUROC, MPICH-G2, Condor-G

Grid Services: GRAM, GridFTP, Information, Replica

Grid Security Infrastructure (GSI)

Grid Fabric: Farms, Supers, Desktops, TCP/IP, Apparatus

Other labels in the diagram: VLAM-G, DBs

Page 10

Standard protocols

• New Grid protocols based on popular Web Services: Web Services Resource Framework (WSRF)

• Grid adds concept of ‘stateful resources’, like grid-jobs, data elements & data bases, …

• Ensure adequate and flexible standards today via the Global Grid Forum

• Future developments taken up by industry

Page 11

Access in a coordinated way

• Transparent crossing of domain boundaries, satisfying constraints of

– site autonomy

– authenticity, integrity, confidentiality

• single sign-on to all services

• ways to address services collectively

• APIs at the application level

• every desktop, laptop, disk is part of the Grid

Page 12

Realization: projects at NIKHEF

• Virtual Lab for e-Science (BSIK) – 2004-2008

• Enabling Grids for e-Science in Europe (FP6) – 2004-2005/2007

• GigaPort NG Network (BSIK) – 2004-2008

• NL-Grid Infrastructure (NCF) – 2002-…

• EU DataGrid (FP5, finished) – 2001-2003

Page 13

Research threads

1. end-to-end operation for data-intensive sciences (DISc):

– data acquisition – ATLAS Level-3

– wide-area transport, on-line and near-line storage – LCG SC

– data cataloguing and meta-data – D0 SAM

– common API and application layer for DISc – EGEE App + VL-E

2. design scalable and generic Grids

– grid software scalability research, security

3. deployment and certification

– large-scale clusters, storage, networking

Page 14

End to End – the LCG Service Challenge

• 10 PByte per year exported from CERN (ready in 2006)

• Targets for end 2004:

1. SRM-SRM (disk) on 10 Gbps links between CERN, NIKHEF/SARA, Triumf, FZK, FNAL: 500 Mb/sec sustained for days

2. Reliable data transfer service

3. Mass storage system <-> mass storage system

   1. SRM v.1 at all sites

   2. disk-disk, disk-tape, tape-tape

4. Permanent service in operation

   • sustained load (mixed user and generated workload)

   • > 10 sites

   • key target is reliability

   • load level targets to be set

slide: Alan Silverman, CERN
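To put the 10 PByte per year in perspective, it can be converted into an average export rate and compared with a single 10 Gbps link. A rough sketch, assuming decimal prefixes, a 365-day year and a perfectly even export over the year (real transfers are bursty, so peak rates would be higher); the function names are illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_rate_bytes_per_s(bytes_per_year):
    """Average rate if the yearly volume were exported perfectly evenly."""
    return bytes_per_year / SECONDS_PER_YEAR


def link_fraction(rate_bytes_per_s, link_bits_per_s):
    """Share of a link's nominal capacity that a given byte rate occupies."""
    return rate_bytes_per_s * 8 / link_bits_per_s


# 10 PByte per year exported from CERN, compared with one 10 Gbps link.
avg = average_rate_bytes_per_s(10e15)
frac = link_fraction(avg, 10e9)

if __name__ == "__main__":
    print(f"average export rate: {avg / 1e6:.0f} MByte/s")
    print(f"share of one 10 Gbps link: {frac:.0%}")
```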

Page 15

Networking and security

• 2x10 Gbit/s Amsterdam-Chicago

• 1x10 Gbit/s Amsterdam-CERN

– ATLAS 3rd level trigger (distributed DACQ)

– protocol tuning and optimization

– monitoring and micro-metering

– LCG service challenge: sustained high-throughput

• collaboration with Cees de Laat (UvA AIR) + SURFnet

• ideal laboratory for our security thread (many domains)

Page 16

Building the Grid

• The Grid is not a magic source of power!

– Need to invest in storage, CPUs, networks

– LHC needs per major centre (assume 10 per expt.): ~3 PByte/yr, ~40 Gbit/s WAN, ~15 000 P4-class 2 GHz

– … more for a national multi-disciplinary facility

– Collaborative build-up of expertise: NIKHEF, SARA, NCF, UvA, VU, KNMI, ASTRON, AMOLF, ASCI, …

– Resources: NIKHEF resources + NCF’s NL-Grid initiative + …

Page 17

Resources today (the larger ones)

• 1.2 PByte near-line StorageTek

• 36 node IA32 cluster ‘matrix’

• 468 CPU IA64 + 1024 CPU MIPS

• multi-Gbit links to 100TByte cache

• 7 TByte cache

• 140 nodes IA32

• 1Gbit link SURFnet

• multiple links with SARA

only resources with either GridFTP or Grid job management

Page 18

A Facility for e-Science

• Many (science) applications with large data volumes:

– Life Sciences: micro-arrays (Utrecht, SILS Amsterdam)

– Medical imaging: functional MRI (AMC), MEG (VU)

– ‘omics’ and molecular characterization: sequencing (Erasmus), mass spectroscopy (AMOLF), electron microscopy (Delft, Utrecht)

Today such groups are not yet equipped to deal with their >1 TByte data sets; our DISc experience can help.

• Common need for multi-PByte storage

• ubiquitous networks for data exchange

• sufficient compute power, accessible from anywhere

Page 19

Common needs and solutions?

• VL-E Proof of Concept environment for e-Science

• grid services address the common needs (storage, computing, indexing)

• applications can rely on a stable infrastructure

• valuable experience as input to industry (mainly industrial research)

• can increasingly leverage emerging industry tools

the Grid will be a household term like the Web

by pushing on the PByte leading edge, TByte-sized storage will be an e-Science commodity

Page 20

NIKHEF PDP Team

in no particular order:

• End-to-end applications: Templon, Bos, Grijpink, Klous

• Security: Groep, Steenbakkers, Koeroo, Venekamp

• Facilities: Salomoni, Heubers, Damen, Kuipers, v.d. Akker, Harapan

• Scaling and certification: Groep, Starink

embedded in both the physics and the computing groups