
1

LHCb on the Grid

A Tale of many Migrations

Raja Nandakumar


2

LHCb computing model

➟ CERN (Tier-0) is the hub of all activity
  ▓ Full copy at CERN of all raw data and DSTs
  ▓ All T1s have a full copy of DSTs
➟ Simulation at all possible sites (CERN, T1, T2)
  ▓ LHCb has used about 120 sites on 5 continents so far
➟ Reconstruction, Stripping and Analysis at T0 / T1 sites only
  ▓ Some analysis may be possible at “large” T2 sites in the future
➟ Almost all the computing (except for development / tests) will be run on the grid
  ▓ Large productions : production team
  ▓ Ganga (DIRAC) grid user interface (see the sketch below)
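
A minimal sketch of submitting an analysis job through the Ganga (DIRAC) interface mentioned above. It is meant to be typed inside a ganga session, where the GPI predefines Job, DaVinci and Dirac; the job name and options file are hypothetical, and attribute names can differ between Ganga versions.

    # Inside a `ganga` session; the GPI predefines Job, DaVinci, Dirac, ...
    j = Job(name='lhcb-analysis-example')       # hypothetical job name
    j.application = DaVinci()                   # LHCb analysis application plugin
    j.application.optsfile = 'MyAnalysis.py'    # hypothetical options file
    j.backend = Dirac()                         # hand the job to DIRAC for grid submission
    j.submit()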


3

LHCb storage on Tier-1

➟ LHCb storage primarily on the Tier-1s and CERN
➟ RAL dCache scheduled to stop in May 2008
➟ Scale of storage usage in Oct 2007
  ▓ Disk : ~75 TB
  ▓ Tape : ~45 TB
    » 104 tapes in total
  ▓ Required storage on CASTOR provided on demand by Tier-1
➟ CASTOR v2.1.4 stable by Sept 2007
  ▓ Great work done by Tier-1 in getting CASTOR operational
  ▓ A few delays from LHCb side in working on migration
    » CHEP ‘07, DIRAC3 development issues, etc.
  ▓ LHCb data migration began in Nov 2007


4

LHCb disk migration

➟ Disk migration began ~ 20 Nov 2007, completed ~ 20 Dec 2007
  ▓ A few minor issues / configuration problems, quickly solved
  ▓ Files with transfer errors were re-transferred
  ▓ All disk servers finally released in Feb 2008
➟ Files transferred using FTS between dCache and CASTOR (see the sketch below)
  ▓ DIRAC interface for LHCb, to automatically register files after transferring
  ▓ Bulk LFC cleanup after groups of files are transferred
➟ Peak rates of about 100 MB/s from dCache to CASTOR
  ▓ Also depended on the number of servers on each end
➟ Mostly smooth operations
  ▓ Problems : minor, human
➟ LHCb running off CASTOR since mid-Jan 2008
  ▓ A few worries, but mostly fine

[Plot: Disk migration]
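
A rough sketch of the transfer-and-register step described above, driving the gLite FTS and LFC command-line clients from Python. The endpoint, SURLs, LFN and the list of terminal FTS states are assumptions for illustration; this is not the actual DIRAC tooling.

    import subprocess
    import time

    # Hypothetical endpoint and file names, for illustration only.
    FTS_ENDPOINT = "https://fts.example.org:8443/glite-data-transfer-fts/services/FileTransfer"
    SOURCE_SURL  = "srm://dcache.example.org/pnfs/example.org/data/lhcb/file.dst"
    DEST_SURL    = "srm://castor.example.org/castor/example.org/grid/lhcb/file.dst"
    LFN          = "lfn:/grid/lhcb/production/file.dst"

    # Submit the copy to FTS; glite-transfer-submit prints a job ID.
    job_id = subprocess.check_output(
        ["glite-transfer-submit", "-s", FTS_ENDPOINT, SOURCE_SURL, DEST_SURL],
        text=True).strip()

    # Poll the FTS job until it reaches a terminal state
    # (state names are indicative; they vary between FTS versions).
    while True:
        state = subprocess.check_output(
            ["glite-transfer-status", "-s", FTS_ENDPOINT, job_id], text=True).strip()
        if state in ("Done", "Finished", "FinishedDirty", "Failed", "Canceled"):
            break
        time.sleep(60)

    # On success, register the CASTOR replica in the LFC under its LFN, so the
    # catalogue points at the new copy (bulk cleanup of the old dCache replicas
    # follows once a whole group of files is done).
    if state in ("Done", "Finished"):
        subprocess.check_call(["lcg-rf", "--vo", "lhcb", "-l", LFN, DEST_SURL])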


5

LHCb tape migration

➟ Tape migration first tried ~ 7 Dec 2007
➟ In full scale in Jan 2008
  ▓ Many iterations to get the current procedure
➟ Tape staging by hand
  ▓ FTS does not automatically stage files on tape
  ▓ Automatic staging of files not practical
    » Problems with coordination of tape staging (RAL) and FTS job submission (CERN)
    » Files getting wiped off dCache before transfer
➟ Now staging two tapes at a time
  ▓ Wait for FTS jobs to complete before staging the next tapes
  ▓ ~ 50 tapes to go still …
➟ Fine when it is running
  ▓ ~ 60 MB/s transfer rates
  ▓ ~ 4 hours to stage a tape
  ▓ ~ 2 hours to transfer it to CASTOR
  ▓ ~ 2 tapes a day

[Plot: Tape migration]
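
A quick back-of-the-envelope check of the rates above. The per-tape volume is inferred from the ~45 TB over 104 tapes quoted earlier, not stated on this slide.

    # Rough consistency check of the tape-migration figures.
    tb_per_tape   = 45.0 / 104    # assumed average tape fill, ~0.43 TB
    transfer_mb_s = 60.0          # quoted dCache -> CASTOR rate

    transfer_hours = tb_per_tape * 1e6 / transfer_mb_s / 3600
    print(f"~{transfer_hours:.1f} h to copy one tape")   # ~2 h, matching the slide

    # With ~4 h to stage and ~2 h to copy each tape, and the next pair staged
    # only after the FTS jobs finish, the quoted ~2 tapes a day is plausible.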


6

LHCb storage on Tier-1

➟ LHCb running off CASTOR now
  ▓ Ignore data only on dCache tape
  ▓ Transfers into CASTOR running fine
➟ Currently use srm-v1 for official production, srm-v2 used in CCRC08
  ▓ Critical service for LHCb
  ▓ File replication using gLite FTS
  ▓ TURLs retrieved via gfal for access via available site protocols (root, rfio, dcap, gsidcap); see the sketch after this list
  ▓ RDST output file upload to local Tier-1 SE via lcg-utils / gfal
  ▓ File removal using gfal
  ▓ Tier-0,1 Storage Elements providing SRM2 spaces:
    LHCb_RAW (T1D0), LHCb_RDST (T1D0), LHCb_M-DST (T1D1), LHCb_DST (T0D1), LHCb_FAILOVER (T0D1)
    » LHCb_FAILOVER used for temporary upload in case of destination unavailability
  ▓ Testing of srm-v2 is a key part of CCRC’08 for LHCb
    » To be used for all production jobs, as soon as DIRAC3 is in full production mode
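
A minimal sketch of the TURL-retrieval, upload and removal steps listed above, using the lcg-utils command-line tools rather than the gfal library bindings used in production. The SURLs, LFN and local file name are hypothetical, and option spellings may differ between lcg_util versions.

    import subprocess

    # Hypothetical SURLs on a Tier-1 SE, for illustration only.
    RAW_SURL  = "srm://srm.example.org/castor/example.org/grid/lhcb/data/run1234.raw"
    RDST_SURL = "srm://srm.example.org/castor/example.org/grid/lhcb/rdst/run1234.rdst"

    # Ask the SE for a transfer URL (TURL) using one of the supported
    # site access protocols (root, rfio, dcap or gsidcap).
    turl = subprocess.check_output(["lcg-gt", RAW_SURL, "rfio"], text=True).split()[0]
    print("open the file via", turl)

    # Upload an RDST output file to the local Tier-1 SE and register it
    # in the catalogue in one step (lcg-cr copies and registers).
    subprocess.check_call([
        "lcg-cr", "--vo", "lhcb",
        "-l", "lfn:/grid/lhcb/production/rdst/run1234.rdst",   # hypothetical LFN
        "-d", RDST_SURL,                                       # hypothetical destination
        "file:/tmp/run1234.rdst",                              # hypothetical local file
    ])

    # Remove a replica when it is no longer needed (illustrated on the same file).
    subprocess.check_call(["lcg-del", "--vo", "lhcb", RDST_SURL])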


7

LHCb on RAL-CASTOR

➟ A few issues running off CASTOR
  ▓ Hang when too many jobs run off data on a single server
    » A single server can currently support ~ 200 LSF job slots
    » A single job can have 1-3 files open on the server
    » Can easily have 200 jobs running
    » Currently kill all the file requests to restore the server (Bonny / Shaun / Chris)
  ▓ Need for more CASTOR monitoring tools
  ▓ Will be good to have rootd / xrootd also
  ▓ Pausing of jobs during downtime (works at CERN)
    » Jobs should pick up from where they were, when they are restarted
➟ A few non-CASTOR problems too
  ▓ Backplanes replacement / fire hazard
  ▓ Power down due to transformer shutdown
➟ RAL first to publish CASTOR storage to the information system (IS)
  ▓ Very, very useful
  ▓ Overall service : stable


8

LHCb on the Grid

➟ DIRAC is LHCb’s interface to the grid
  ▓ Written mostly in Python
  ▓ Pilot agent paradigm (see the sketch below)
    » Fine-grained visibility of the grid to the jobs and DIRAC3 servers
➟ DIRAC3
  ▓ Re-writing of DIRAC using 4 years of experience; main ideas / framework retained
  ▓ Many changes in algorithms and implementations
    » Security : authentication & logging for all operations
    » Separate out generic and LHCb-specific modules
    » Better designed to support more options: srm v2, gLite WMS, generic pilots, …
    » Job throttling, job prioritisation, generic pilots, …

[Diagram: DIRAC2 vs DIRAC3]
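
A highly simplified sketch of the pilot-agent idea: a pilot lands on a worker node, describes what it sees locally, asks a matcher service for a suitable payload and runs it. The matcher URL, site name and payload format are invented for illustration; this is not the DIRAC implementation.

    import json
    import subprocess
    import urllib.request

    MATCHER_URL = "https://dirac.example.org/matcher"   # hypothetical service

    def describe_worker_node():
        """What the pilot can see locally (the 'fine-grained visibility')."""
        return {
            "site": "LCG.Example.uk",    # hypothetical site name
            "cpu_power": 1000,           # e.g. from a local benchmark
            "local_disk_mb": 10000,
        }

    def request_payload(resources):
        """Ask the matcher for a job that fits this node (hypothetical API)."""
        req = urllib.request.Request(
            MATCHER_URL, data=json.dumps(resources).encode(),
            headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as reply:
            return json.load(reply)       # e.g. {"job_id": 42, "command": [...]}

    def run_pilot():
        payload = request_payload(describe_worker_node())
        if not payload:
            return                        # nothing matched: the pilot exits quietly
        # A real pilot wraps this step (JobWrapper, glexec identity switch,
        # status reporting, output upload, failover requests, ...).
        subprocess.run(payload["command"], check=False)

    if __name__ == "__main__":
        run_pilot()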


9

DIRAC

[Diagram: DIRAC workload management. Job JDL and input sandboxes arrive through the Job Receiver and are stored in the JobDB; the Data Optimizer checks the input data against the LFC (getReplicas) and fills the Task Queue. Agent Directors submit Pilot Jobs through the RB / gLite WMS to the site CEs; on the worker node the Pilot Agent presents the CE JDL to the Matcher, fetches a job and its sandbox, and a JobWrapper executes the user application (forked via glexec), uploading output data to the SE or putting failover requests on the VO-box. The Agent Monitor (with the WMSAdmin for proxy retrieval) checks pilots, and the Job Monitor tracks jobs, spanning the DIRAC services, the LCG services and the workload on the WN.]


10

LHCb and DIRAC3

➟ DIRAC3 status : still under development
  ▓ Major parts already running
  ▓ Used in CCRC08
  ▓ So far, successful testing of simulation and reconstruction workflows
➟ User analysis to be integrated with DIRAC3 once it is stable in production
➟ Bookkeeping in DIRAC3
  ▓ Bookkeeping not to be stand-alone: will be a set of fully integrated services
  ▓ New : effort from Ireland on the web / user interface of bookkeeping
  ▓ Critical for Monte Carlo analysis and, in future, for data analysis


11

DIRAC3 monitoring

➟ Note the authentication at top right
  ▓ Not needed for browsing the jobs
  ▓ Needed to perform actions


12

LHCb and CCRC08

➟ Planned tasks : test the LHCb computing model (storage classes summarised in the sketch below)
  ▓ Raw data distribution from pit to T0 centre
    » Use of rfcp into CASTOR from the pit - T1D0
  ▓ Raw data distribution from T0 to T1 centres
    » Use of FTS - T1D0
  ▓ Reconstruction of raw data at CERN & T1 centres
    » Production of rDST data - T1D0
    » Use of SRM 2.2
  ▓ Stripping of data at CERN & T1 centres
    » Input data: RAW & rDST - T1D0
    » Output data: DST - T1D1
    » Use of SRM 2.2
  ▓ Distribution of DST data to all other centres
    » Use of FTS - T0D1 (except CERN, T1D1)
➟ First three tasks successfully accomplished
  ▓ Tests of the stripping workflow within DIRAC3 ongoing
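
The same plan restated as a small lookup table, convenient when scripting checks of where each output should land. The keys and field names are descriptive only, not DIRAC configuration names.

    # CCRC'08 data-flow plan, restated from the bullets above.
    CCRC08_PLAN = {
        "pit -> T0 (RAW)":  {"tool": "rfcp into CASTOR", "storage": "T1D0"},
        "T0 -> T1 (RAW)":   {"tool": "FTS",              "storage": "T1D0"},
        "reconstruction":   {"output": "rDST",           "storage": "T1D0", "srm": "2.2"},
        "stripping":        {"input": "RAW & rDST",      "output": "DST", "storage": "T1D1", "srm": "2.2"},
        "DST distribution": {"tool": "FTS",              "storage": "T0D1 (T1D1 at CERN)"},
    }

    for step, spec in CCRC08_PLAN.items():
        print(f"{step:18s} {spec}")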


13

Data distribution

Site     CPU share (%)   RAW share (%)
CERN          14             100
GridKa        11              12
IN2P3         25              29
CNAF           9              11
NIKHEF        26              30
PIC            4               5
RAL           11              13

(CERN keeps a full copy of the raw data, hence its 100% RAW share; the Tier-1 RAW shares sum to 100%.)
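
The Tier-1 RAW shares appear to be just the Tier-1 CPU shares renormalised to 100% (with CERN additionally holding a full copy); a quick check of that reading:

    # Check: Tier-1 RAW shares vs CPU shares renormalised to 100%.
    cpu_share = {"GridKa": 11, "IN2P3": 25, "CNAF": 9, "NIKHEF": 26, "PIC": 4, "RAL": 11}
    raw_share = {"GridKa": 12, "IN2P3": 29, "CNAF": 11, "NIKHEF": 30, "PIC": 5, "RAL": 13}

    total_cpu = sum(cpu_share.values())     # 86
    for site, cpu in cpu_share.items():
        print(f"{site:7s} {100 * cpu / total_cpu:5.1f}   (table: {raw_share[site]})")
    # GridKa 12.8, IN2P3 29.1, CNAF 10.5, NIKHEF 30.2, PIC 4.7, RAL 12.8:
    # these match the table values to within rounding (adjusted to sum to 100).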


14

RAW data transfers

➟ Data transfers mimicked LHC data taking : 6 hours on, 6 hours off
  ▓ A few pauses for software upgrades
  ▓ Peak of 125 MB/s (Feb 12th); nominal rate : 70 MB/s
➟ Some Tier-1 problems seen
  ▓ 0 checksum mismatches
  ▓ No problems seen at RAL!

[Plot: Tier-1 transfers (S. Paterson, CCRC F2F meeting)]


15

FTS performance

➟ Histograms of the time (in minutes) between a file being Assigned and Transferred to the LHCb Tier-1s
  ▓ FTS submit / monitor / done cycle
➟ Most sites show stable behaviour

S. Paterson – CCRC F2F meeting


16

CCRC’08 LHCb issues

➟ Automatic job submission successfully demonstrated
  ▓ Problems setting up job workflows
  ▓ Had planned to run 23K jobs over 2 weeks
➟ CPU time underestimated for reconstruction
  ▓ Many jobs overshot wall time and were killed
  ▓ Systematic over all Tier-1s
➟ DIRAC3 servers downtime
  ▓ Lost connection to running jobs
    » Not possible to recover many jobs
  ▓ Configuration service unstable
    » Needed to be restarted regularly
  ▓ Need for backup / failover systems


17

CCRC’08 site issues

➟ Problems on dCache sites
  ▓ Configuration of timeouts on gsidcap ports
  ▓ dCache not releasing reserved space even if files are deleted
  ▓ Data transfer and access problems due to load on the pnfs server (IN2P3)
  ▓ Problem with Solaris servers at SARA
➟ CERN : AFS instabilities
➟ CNAF : low CASTOR LSF slots per server
➟ RAL : no problems

S. Paterson – CCRC F2F meeting


18

Upcoming schedule

➟ March
  ▓ Introduce stripping workflow
  ▓ Consider usage of the xrootd protocol
    » Helps with flickering data access
➟ April
  ▓ Migration of GridPP2+ to GridPP3
➟ May
  ▓ 4 weeks of running at nominal rate
  ▓ If possible, include analysis
    » Possibly using generic pilots, pending approval and deployment of glexec
    » Ganga (migrating to v5!) for analysis job submission
➟ Later
  ▓ Data taking …