GridPP & The Grid Who we are & what it is Tony Doyle.


GridPP & The Grid

Who we are & what it is

Tony Doyle

Page 3

Web: information sharing

• Invented at CERN by Tim Berners-Lee

[Figure: No. of Internet hosts (millions) vs. year]

• Agreed protocols: HTTP, HTML, URLs

• Anyone can access information and post their own

• Quickly crossed over into public use

[Photo: Tim Berners-Lee]

Page 4

@Home Projects

• Uses home PCs to run numerous calculations with dozens of variables.

• Distributed computing project, not a grid

• Other @home projects

– BBC Climate Change Experiment

– SETI@home

– FightAIDS@home

Peer To Peer Networks

[Figure: Peer-to-peer network]

• No centralised database of files

• Legal problems with sharing copyrighted material

• Security problems

Page 5

Grid: Resource Sharing

• Share more than information

• Data, computing power, applications

• Middleware handles everything

[Diagram: a single computer compared with the Grid]

Single computer: PROGRAMS (Word/Excel, Email/Web, Your Program, Games) run on the OPERATING SYSTEM, which manages Disks, CPU etc.

The Grid: Your Program runs on MIDDLEWARE, which manages the User Interface Machine, Resource Broker, CPU Clusters and Disk Server.

Page 6

Analogy with the Electricity Power Grid

• Power Stations: Computing and Data Centres

• Distribution Infrastructure: Fibre Optics of the Internet

• 'Standard Interface'

Page 7

The CERN LHC

4 Large Experiments

The world’s most powerful particle accelerator - 2007

Page 8

The Experiments

ALICE

- heavy ion collisions, to create quark-gluon plasmas

- 50,000 particles in each collision

LHCb

- to study the differences between matter and antimatter

- will detect over 100 million b and b-bar mesons each year

ATLAS

- General purpose

- Origin of mass

- Supersymmetry

- 2,000 scientists from 34 countries

CMS

- General purpose

- 1,800 scientists from over 150 institutes

“One Grid to Rule Them All”?

Page 9

Why do particle physicists need the Grid?

Example from LHC: starting from this event…

…we are looking for this “signature”

Selectivity: 1 in 10^13

Like looking for 1 person in a thousand world populations

Or for a needle in 20 million haystacks
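As a rough check of the first comparison, the sketch below multiplies an assumed world population of about 6.5 billion by one thousand and compares the result with the quoted selectivity. The population figure is an assumption for illustration and does not come from the slides.

```python
# Rough check of the "1 person in a thousand world populations" comparison.
# WORLD_POPULATION is an assumed mid-2000s value, not a number from the slides.
WORLD_POPULATION = 6.5e9
SELECTIVITY = 1e13  # the slide quotes a selectivity of 1 in 10^13


def people_in_populations(n_populations, population=WORLD_POPULATION):
    """Total head count across n_populations copies of the world."""
    return n_populations * population


total = people_in_populations(1000)
ratio = SELECTIVITY / total

print(f"1000 world populations = {total:.1e} people")
print(f"selectivity / head count = {ratio:.2f}")
```

Under that assumption the two numbers agree to within a factor of two, which is as close as an order-of-magnitude comparison needs to be.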

Page 10

Why do particle physicists need the Grid?

One year’s data from LHC would fill a stack of CDs 20 km high

[Figure: the CD stack compared with Concorde (15 km) and Mt. Blanc (4.8 km)]

• 100 million electronic channels

• 800 million proton-proton interactions per second

• 0.0002 Higgs per second

• 10 PBytes of data a year (10 million GBytes = 14 million CDs)
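The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes a standard 700 MB disc that is 1.2 mm thick; the slides give only the totals, so both constants are assumptions.

```python
# Back-of-envelope check of the data-volume figures quoted above.
# CD_CAPACITY_BYTES and CD_THICKNESS_M are assumptions (a standard 700 MB,
# 1.2 mm disc); the slides give only the totals.
PETABYTE = 10**15
GIGABYTE = 10**9
CD_CAPACITY_BYTES = 700 * 10**6
CD_THICKNESS_M = 1.2e-3

yearly_bytes = 10 * PETABYTE

gigabytes = yearly_bytes / GIGABYTE
cds = yearly_bytes / CD_CAPACITY_BYTES
stack_km = cds * CD_THICKNESS_M / 1000

# Fraction of proton-proton interactions that yield a Higgs, from the quoted
# rates of 0.0002 Higgs per second and 800 million interactions per second.
higgs_fraction = 0.0002 / 800e6

print(f"{gigabytes:.0f} GB per year")
print(f"{cds / 1e6:.1f} million CDs")
print(f"stack height about {stack_km:.0f} km")
print(f"one Higgs per {1 / higgs_fraction:.1e} interactions")
```

With bare discs the stack comes out a little under the quoted 20 km, so the slide's figure is best read as a rounded estimate.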

Page 11

Who else can use a Grid?

• Astronomers

• Healthcare Professionals

• Bioinformatics

• Digital curation: to create digital libraries and museums

[Image labels: Scanning; Remote consultancy; Optical; X ray; Digitize almost anything]

Page 12

Who are GridPP?

19 UK Universities, CCLRC (RAL & Daresbury)

Funded by PPARC

GridPP1 2001-2004 "From Web to Grid"

GridPP2 2004-2007 "From Prototype to Production"

Developed a working, highly functional Grid

Page 13

What Have We Done So Far

• Simulated 46 million molecules for medical research in 5 weeks, which would have taken over 80 years on a single PC

• Reached transfer speeds of 1 Gigabyte per second in high speed networking tests from CERN – a DVD every 5 seconds

• BaBar experiment has simulated 500 million particle physics collisions on the UK Grid

• UK’s #1 producer of data for LHCb, ATLAS and CMS
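Two of these figures lend themselves to a quick sanity check. The sketch below assumes 52.18 weeks per year and a 4.7 GB single-layer DVD; neither constant appears in the slides.

```python
# Rough arithmetic behind two of the figures above.
# WEEKS_PER_YEAR and DVD_BYTES (a 4.7 GB single-layer disc) are assumptions.
WEEKS_PER_YEAR = 52.18
DVD_BYTES = 4.7e9


def speedup(single_pc_years, grid_weeks):
    """How many times faster the Grid run was than a single PC."""
    return single_pc_years * WEEKS_PER_YEAR / grid_weeks


def seconds_per_dvd(rate_bytes_per_s):
    """Time to move one DVD's worth of data at the given rate."""
    return DVD_BYTES / rate_bytes_per_s


print(f"speed-up: about {speedup(80, 5):.0f}x")
print(f"one DVD every {seconds_per_dvd(1e9):.1f} s")
```

Since the slide says "over 80 years", the speed-up printed here is a lower bound.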

Page 14

Worldwide LHC Computing Grid

• GridPP is part of EGEE and LCG (currently the largest Grid in the world)

EGEE stats:

182 Sites

42 Countries

38,201 CPUs

9,145 TBytes Storage

Page 15

Tier Structure

[Diagram labels feeding Tier 0: Detector, Online system, Offline farm]

Tier 0: CERN computer centre

Tier 1: National centres (RAL, UK; France; Germany; Italy; USA)

Tier 2: Regional groups (ScotGrid, NorthGrid, SouthGrid, London)

Tier 3: Institutes (Glasgow, Edinburgh, Durham)

Page 16

UK Tier-1/A Centre

• High quality data services

• National and International Role

• UK focus for International Grid development

• 1000 Dual CPU

• 200 TB Disk

• 220 TB Tape (Capacity 1 PB)

Grid Operations Centre

Page 17

UK Tier-2 Centres

ScotGrid: Durham, Edinburgh, Glasgow

NorthGrid: Daresbury, Lancaster, Liverpool, Manchester, Sheffield

SouthGrid: Birmingham, Bristol, Cambridge, Oxford, RAL PPD, Warwick

London: Brunel, Imperial, QMUL, RHUL, UCL
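The four Tier-2 groupings map naturally onto a small lookup table. The sketch below is only an illustration of that structure; the names `TIER2_CENTRES` and `tier2_for` are hypothetical and not part of any GridPP software.

```python
# The Tier-2 groupings listed above as a plain dictionary.
# TIER2_CENTRES and tier2_for are illustrative names, not GridPP code.
TIER2_CENTRES = {
    "ScotGrid": ["Durham", "Edinburgh", "Glasgow"],
    "NorthGrid": ["Daresbury", "Lancaster", "Liverpool", "Manchester", "Sheffield"],
    "SouthGrid": ["Birmingham", "Bristol", "Cambridge", "Oxford", "RAL PPD", "Warwick"],
    "London": ["Brunel", "Imperial", "QMUL", "RHUL", "UCL"],
}


def tier2_for(institute):
    """Return the Tier-2 group an institute belongs to, or None if unlisted."""
    for group, members in TIER2_CENTRES.items():
        if institute in members:
            return group
    return None


total_sites = sum(len(members) for members in TIER2_CENTRES.values())
print(tier2_for("Glasgow"), total_sites)
```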

Page 18

What are the Grid challenges?

Must:

• share data between thousands of scientists with multiple interests

• link major and minor computer centres

• ensure all data accessible anywhere, anytime

• grow rapidly, yet remain reliable for more than a decade

• cope with different management policies of different centres

• ensure data security

• be up and running routinely by 2007

Page 19

Other Grids

• UK National Grid Service

– UK’s core production computational and data Grid

• EGEE (Europe)

– Enabling Grids for E-sciencE

• NorduGrid (Europe)

– Grid Research and Development collaboration

• Open Science Grid (USA)

– Science applications from HEP to biochemistry

Page 20

The Future

• Grow the LHC Grid

• Spread beyond science

– Healthcare, commercial uses, government, games

• Will it become part of everyday life?

Page 21

Further Info

http://www.gridpp.ac.uk

Page 22

Backups

Page 23

“UK contributes to EGEE's battle with malaria”

BioMed: Successes/Day 1107, Success % 77%

WISDOM (Wide In Silico Docking On Malaria)

The first biomedical data challenge for drug discovery, which ran on the EGEE grid production service from 11 July 2005 until 19 August 2005.

GridPP resources in the UK contributed ~100,000 kSI2k-hours from 9 sites

Number of Biomedical jobs processed by country

Normalised CPU hours contributed to the biomedical VO for UK sites, July-August 2005
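To put the UK contribution in perspective, the sketch below converts the quoted ~100,000 kSI2k-hours into an average sustained capacity over the run. The dates and totals are from the slide; spreading the work evenly over the whole period is a simplifying assumption.

```python
from datetime import date

# Dates and the ~100,000 kSI2k-hours figure are from the slide; treating the
# run as continuous over the whole period is a simplifying assumption.
START = date(2005, 7, 11)
END = date(2005, 8, 19)
UK_KSI2K_HOURS = 100_000
UK_SITES = 9


def average_ksi2k(total_ksi2k_hours, start, end):
    """Mean sustained capacity if the work were spread evenly over the run."""
    hours = (end - start).days * 24
    return total_ksi2k_hours / hours


mean_power = average_ksi2k(UK_KSI2K_HOURS, START, END)
print(f"run length: {(END - START).days} days")
print(f"UK average: {mean_power:.0f} kSI2k, {mean_power / UK_SITES:.0f} kSI2k per site")
```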

Page 24

Is GridPP a Grid?

1. Coordinates resources that are not subject to centralized control

2. … using standard, open, general-purpose protocols and interfaces

3. … to deliver nontrivial qualities of service

1. YES. This is why development and maintenance of LCG is important.

2. YES. VDT (Globus/Condor-G) + EGEE (gLite) ~meet this requirement.

3. YES. LHC experiments data challenges over the summer of 2004.

http://www-fp.mcs.anl.gov/~foster/Articles/WhatIsTheGrid.pdf

http://agenda.cern.ch/fullAgenda.php?ida=a042133

Page 25

Application Development

ATLAS, LHCb, CMS

BaBar (SLAC), SAMGrid (FermiLab), QCDGrid, PhenoGrid

Page 26

Middleware Development

Page 27

Middleware Development

Configuration Management

Storage Interfaces

Network Monitoring

Security

Information Services

Grid Data Management

Page 28

Requirement

Storage Element

Basic File Transfer

Reliable File Transfer

Catalogue Services

Data Management tools

Compute Element

Workload Management

VO Agents

VO Membership Services

DataBase Services

Posix-like I/O

Application Software Installation Tools

Job Monitoring

Reliable Messaging

Information System

15 Baseline Services for a functional Grid

We rely upon gLite components

This middleware builds upon VDT (Globus and Condor) and meets the requirements of all the basic scientific use cases:

1. Purple (amber) areas are (almost) agreed as part of the shared generic middleware stack by each of the application areas

2. Red are areas where generic middleware competes with application-specific software.

www.glite.org

gLite Middleware Stack

Page 29

2005 Metrics and Quality Assurance

Target                  Current status   Q2 2006 target values
Number of Users         ~1000            ≥ 3000
Number of sites         120              50
Number of CPU           ~12000           9500 at month 15
Number of Disciplines   6                ≥ 5
Multinational           24               ≥ 15 countries

Page 30

LCG Service Challenges

[Timeline 2005-2008: SC2, SC3, SC4, LHC Service Operation; cosmics, first beams, first physics, full physics run]

June05 - Technical Design Report

Sep05 - SC3 Service Phase

May06 – SC4 Service Phase

Sep06 – Initial LHC Service in stable operation

Apr07 – LHC Service commissioned

SC2 – Reliable data transfer (disk-network-disk) – 5 Tier-1s, aggregate 500 MB/sec sustained at CERN

SC3 – Reliable base service – most Tier-1s, some Tier-2s – basic experiment software chain – grid data throughput 500 MB/sec, including mass storage (~25% of the nominal final throughput for the proton period)

SC4 – All Tier-1s, major Tier-2s – capable of supporting full experiment software chain inc. analysis – sustain nominal final grid data throughput

LHC Service in Operation – September 2006 – ramp up to full operational capacity by April 2007 – capable of handling twice the nominal data throughput
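The service-challenge descriptions above imply the nominal throughput without stating it. The sketch below derives it from the SC3 target, treating "~25%" as exactly one quarter, which is an approximation.

```python
# Throughput targets implied by the service-challenge descriptions above.
# Treating "~25%" as exactly 0.25 is an approximation.
SC3_MB_PER_S = 500               # SC3 grid data throughput quoted on the slide
SC3_FRACTION_OF_NOMINAL = 0.25   # "~25% of the nominal final throughput"

nominal_mb_per_s = SC3_MB_PER_S / SC3_FRACTION_OF_NOMINAL
peak_mb_per_s = 2 * nominal_mb_per_s  # the service must handle twice nominal


def terabytes_per_day(mb_per_s):
    """Convert a sustained rate in MB/s to TB/day (decimal units)."""
    return mb_per_s * 86_400 / 1e6


print(f"nominal: {nominal_mb_per_s:.0f} MB/s = {terabytes_per_day(nominal_mb_per_s):.1f} TB/day")
print(f"twice nominal: {peak_mb_per_s:.0f} MB/s")
```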

Page 31

Status?: Exec2 Summary

• 2005 was the first full year of a Production Grid: the UK Tier-1 was the largest CPU provider on the LCG and by the end of the year the Tier-2s provided twice the CPU of the Tier-1.

• The Production Grid is considered to be functional and hence the focus is now on improving performance of the system, especially w.r.t. data storage and management.

• The GridPP2 Project is now approaching halfway and has met 40% of its original targets with 91% of the metrics within specification.

Page 32

Grid Overview

Aim: by 2008 (full year’s data taking)

- CPU ~100 MSI2k (100,000 CPUs)

- Storage ~80 PB

- Involving >100 institutes worldwide

- Build on complex middleware being developed in advanced Grid technology projects, both in Europe (gLite) and in the USA (VDT)

1. Prototype went live in September 2003 in 12 countries

2. Extensively tested by the LHC experiments in September 2004

Page 33

Some of the challenges for 2006

• File transfers

– Good initial progress

– But some way still to go with testing - stressing reliability, performance

– Can only be done with participation of experiments

– Distribution to other sites being planned

• Distributed VO services

– Plan agreed – T1 will sign off and then VO boxes may be deployed by T2s

– But still to deploy pilot services - ALICE, ATLAS, CMS, LHCb

• End-to-end testing of the T0-T1-T2 chain

– MC production, reconstruction, distribution

• Full Tier-1 work load testing

– Recording, reprocessing, ESD distribution, analysis, Tier-2 support

• Understanding the “Analysis Facility”

– batch analysis @ T1 and T2

– interactive analysis

• Startup scenarios

– Schedule is known at high level and defined for Service Challenges – testing time ahead (in many ways)

Page 34

Data Processing

ON-line

LEVEL-1 Trigger: hardwired processors (ASIC, FPGA), pipelined massive parallel

HIGH LEVEL Triggers: farms of processors

OFF-line

Reconstruction & ANALYSIS: TIER0/1/2 Centers

[Time axis (sec): 10^-9, 10^-6, 10^-3, 10^0, 10^3; marked 25 ns, 3 µs, ms, hour, year]

[Data axis: Giga, Tera, Petabit; 9 orders of magnitude]

Page 35

Getting Started

1. Get a digital certificate (Authentication – who you are): http://ca.grid-support.ac.uk/

2. Join a Virtual Organisation (VO) (Authorisation – what you are allowed to do). For LHC, join LCG and choose a VO: http://lcg-registrar.cern.ch/

3. Get access to a local User Interface Machine (UI) and copy your files and certificate there
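After copying a certificate to the User Interface machine, it can help to confirm the files are in place and that the private key is not readable by other users. The sketch below assumes the conventional Globus layout of `usercert.pem` and `userkey.pem` in a `.globus` directory; the slides do not name these paths, and `check_credentials` is a hypothetical helper, not part of any grid toolkit.

```python
import stat
from pathlib import Path


def check_credentials(globus_dir):
    """Return a list of problems found with the certificate files.

    Assumes the conventional file names usercert.pem and userkey.pem.
    """
    globus_dir = Path(globus_dir)
    cert = globus_dir / "usercert.pem"
    key = globus_dir / "userkey.pem"
    problems = []
    for path in (cert, key):
        if not path.is_file():
            problems.append(f"missing {path.name}")
    if key.is_file():
        mode = stat.S_IMODE(key.stat().st_mode)
        # The private key should not be accessible to group or others.
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            problems.append(f"{key.name} is accessible by group/others (mode {oct(mode)})")
    return problems


# Example: check the assumed default location in the home directory.
for problem in check_credentials(Path.home() / ".globus"):
    print(problem)
```

An empty result means both files exist and the key is private to its owner; it says nothing about whether the certificate is valid or registered with a VO.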

Page 36

Job Preparation

Prepare a file of Job Description Language (JDL):

############# athena.jdl #################
Executable = "athena.sh";
StdOutput = "athena.out";
StdError = "athena.err";
InputSandbox = {"athena.sh", "MyJobOptions.py", "MyAlg.cxx", "MyAlg.h", "MyAlg_entries.cxx", "MyAlg_load.cxx", "login_requirements", "requirements", "Makefile"};
OutputSandbox = {"athena.out", "athena.err", "ntuple.root", "histo.root", "CLIDDBout.txt"};
Requirements = Member("VO-atlas-release-10.0.4", other.GlueHostApplicationSoftwareRunTimeEnvironment);
################################################

Annotations:

• Executable: script to run

• InputSandbox: input files (job options, my C++ code)

• OutputSandbox: output files

• Requirements: choose ATLAS version
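A JDL file like the one above is simple enough to generate from a script when many similar jobs are needed. The following is a minimal sketch with shortened sandbox lists; `jdl_value` and `render_jdl` are hypothetical helpers written for this illustration and are not part of any grid middleware.

```python
def jdl_value(value):
    """Render a Python value as a JDL literal: a quoted string or a {list}."""
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(jdl_value(v) for v in value) + "}"
    return f'"{value}"'


def render_jdl(attributes, requirements=None):
    """Build JDL text from a dict of attributes plus a raw Requirements expression."""
    lines = [f"{name} = {jdl_value(value)};" for name, value in attributes.items()]
    if requirements is not None:
        # Requirements is an expression, so it is written out unquoted.
        lines.append(f"Requirements = {requirements};")
    return "\n".join(lines)


# Shortened version of the athena.jdl example from the slide.
job = {
    "Executable": "athena.sh",
    "StdOutput": "athena.out",
    "StdError": "athena.err",
    "InputSandbox": ["athena.sh", "MyJobOptions.py"],
    "OutputSandbox": ["athena.out", "athena.err", "ntuple.root"],
}
release = 'Member("VO-atlas-release-10.0.4", other.GlueHostApplicationSoftwareRunTimeEnvironment)'

print(render_jdl(job, release))
```

Keeping the Requirements expression as a raw string avoids having to model the expression syntax; only the attribute values are quoted.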

Page 37

Management: Mapping Grid Structures

I. Experiment Layer

II. Application Middleware

III. Grid Middleware

IV. Facilities and Fabrics

User Board: Requirements, Application Development, User feedback

Deployment Board: Tier1/Tier2, Testbeds, Rollout; Service specification & provision

PMB

[Diagram labels: Metadata, Workload, Network, Security, Info. Mon., Storage]

Page 38

GridPP Status?

GridPP status

(last night)

14 Sites

2,898 CPUs

124 TBytes storage