
EGEE is a project funded by the European Union under contract IST-2003-508833

A first look on the gLite RB (and more)

Stefano Bagnasco, I.N.F.N. Torino

ALICE Software Week – CERN June 1, 2005


Collaboration Meeting ALICE-Italia – Cagliari, May 4, 2005

The test setup

Components shown in the setup diagram: gLite 1.1 RB+LB, gLite 1.1 UI, LFC clients

• Set up of a test gLite RB & CE in Torino
  • Test job submission, interaction with the ALICE file catalogue and, gradually, other pieces of the framework
  • Thanks to R. Brunetti, F. Nebiolo

• Tests of storage and data management components in Bari
  • dCache+SRM, FTS, DPM (coming soon)
  • To be integrated with the Torino setup to build a full testbed
  • Thanks to G. Donvito, N. Fioretti, F. Minafra


First results

• Jobs sent to the gLite RB: 1000, to LCG 2.4.0 on INFNGRID CEs:

• Not yet completed: 37

• Completed: 661 (68%)

• Aborted: 72 (8%)

• Error: 230 (24%)
  • AliRoot crash: 28
  • NFS crash: 143
  • WN disk space < 4GB: 55
  • Other (not investigated): 4

• Funny RB problem: 100 (1 bunch) job destination lost
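
The counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers from the slide; the variable names and the `shares` helper are illustrative. It suggests that the quoted percentages are close to shares of the 963 jobs that reached a final state, rather than of all 1000 submitted. That reading is an inference from the numbers, not something the slide states.

```python
# Counts copied from the slide; names and grouping are illustrative.
submitted = 1000
not_yet_completed = 37
outcomes = {"Completed": 661, "Aborted": 72, "Error": 230}
error_causes = {
    "AliRoot crash": 28,
    "NFS crash": 143,
    "WN disk space < 4GB": 55,
    "Other (not investigated)": 4,
}

# Jobs that reached a final state (everything except "not yet completed").
finished = sum(outcomes.values())


def shares(counts, total):
    """Return each count as a percentage of total."""
    return {name: 100.0 * n / total for name, n in counts.items()}


outcome_shares = shares(outcomes, finished)
error_shares = shares(error_causes, outcomes["Error"])

if __name__ == "__main__":
    print(f"finished: {finished} of {submitted} submitted")
    for name, pct in outcome_shares.items():
        print(f"{name}: {outcomes[name]} ({pct:.1f}% of finished jobs)")
    for name, pct in error_shares.items():
        print(f"  {name}: {error_causes[name]} ({pct:.1f}% of errors)")
```

The same breakdown shows that NFS crashes account for well over half of the jobs in the Error category.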


Some comments

• Problems & issues:
  • The gLite UI command does not interact correctly with the VOMS
    • Known problem, fixed but the fix did not get through to the release! (not even 1.1)
  • Submission to the gLite RB fails with certificates mapped to an SGM (software manager) account (Savannah bug #8616, fixed on Friday)
  • Some problems with the RB (e.g. missing location from status report), being investigated
  • Not all the “usual” C libraries are on the WN – had to ship them with AliRoot
  • DGAS (accounting system) is using these jobs to debug its first deployment
  • The infrastructure (just after the upgrade to 2.4.0) showed the same teething problems as last year, e.g.:
    • Hanging NFSs make the software area inaccessible (this is a nasty one – remember the “Black Hole Effect”!)
    • Communication problems between WN and RB
    • Problems with environment configuration on WNs

• The support responsiveness definitely improved
  • Problems generally solved within an hour of submitting the ticket


Next steps: the interface

• Registering to the AliEn Catalogue without AliEn
  • Will probably not be needed at all…

• Multi-thread submission (either direct or from the AliEn Task Queue)
  • Efficient use of dedicated RBs

• Testing the gLite RB's ability to query the AliEn Data Catalogue
  • The main “new” feature in the gLite RB

• Accessing files in a “standard” gLite SE (through DPM/SRM/AliEnSE/xrootd/FiReMan/whatever)
  • This should be much easier than last year
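
As a rough illustration of the multi-thread submission item above, the sketch below fans a list of job descriptions out over a small thread pool and collects successes and failures separately. It is not the interface planned in the talk: `submit_all` and `fake_submit` are hypothetical names, and `fake_submit` is only a stand-in for whatever call actually hands a job to the RB (directly or from the AliEn Task Queue).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def submit_all(jobs, submit, max_workers=8):
    """Submit every job concurrently.

    Returns two dicts keyed by job: results of successful submissions
    and the exceptions raised by failed ones.
    """
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the job it belongs to.
        futures = {pool.submit(submit, job): job for job in jobs}
        for future in as_completed(futures):
            job = futures[future]
            try:
                results[job] = future.result()
            except Exception as exc:  # keep going if one submission fails
                failures[job] = exc
    return results, failures


def fake_submit(jdl_path):
    """Hypothetical stand-in for a real submission; returns a made-up job id."""
    if jdl_path.endswith("bad.jdl"):
        raise RuntimeError("submission refused")
    return "job-for-" + jdl_path


if __name__ == "__main__":
    ok, failed = submit_all(["a.jdl", "b.jdl", "bad.jdl"], fake_submit, max_workers=2)
    print("submitted:", sorted(ok))
    print("failed:", sorted(failed))
```

Threads suit this pattern because each submission mostly waits on the network; collecting failures instead of raising lets one refused job leave the rest of the bunch unaffected.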