WLCG Middleware Support II Markus Schulz CERN-IT-GT May 2011.


Overview

• Status
• Problems?
• What do we need?

Situation on the Factory Floor

• Fundamentally not much changed since the last discussion
• The 3.5 Empires are alive and working
• OSG
  – Manages their releases independently of EMI
• NDGF
  – ARC still rules
• dCache
  – gLite releases include dCache
    • Very limited usage (none)
  – dCache (DESY) produce their own releases
    • These are used and are popular
    • Discussed at the WLCG T1 Service Coordination Meeting (WTSCM)

Situation on the Factory Floor

• gLite: gLite-3.2/3.1
  – gLite-3.1 (SL4)
    • On the way out
  – gLite-3.2 (SL5)
    • De facto coordination by Maria Alandes (CERN)
  – Patch prioritisation in WLCG T1 Service Coordination Meeting
• EGI
  – Coordinates gLite-3.2/3.1 Staged Rollout
• EMI
  – Prepares the first EMI-1 release (based on EPEL)
  – Many structural changes

What we (WLCG) assumed

• Many in WLCG assumed (naively) that EGI and EMI are some form of EGEE-x
  – Assumed that long-standing requests would stay on the work plan
  – Assumed that the privileged partnership between project, sites, experiments and WLCG would just happen (TMB/TCG)
  – Assumed that the established informal exchange of change requests between experiments and developers would continue and drive the project

Problems?

• Requirements:
• EMI and EGI did what they stated in their work plans
  – Defined and documented processes to gather requirements and change requests
    • Deliverable and milestone documents
    • Circulated and discussed within the projects
  – Implemented these processes
    • Captured some old requirements (difficult)
  – Gathered requirements (NGIs, etc.)
  – Prioritised them
  – Exchanged them between the projects
  – Wrote work plans for the next year(s)
• As a result EGI/EMI priority lists and WLCG expectations are not in good agreement
  – Resulting in discussions…
  – Example:
    • CREAM-CE HA
    • Passing of arguments to the batch system

What happened II

• For several components the direct interaction between sites, experiments and developers continued (in a twilight zone)
  – ATLAS catalogue work
  – Infosystem
  – Monitoring
  – Condor/CREAM etc.
  – FTS
  – Often not explicitly clear whether this is WLCG or EMI related

The Problems

• EMI and EGI are strategic projects
  – Planning over long periods
  – Strong processes
  – Long-term investment
    • EPEL, source RPMs, Debian Support
  – Less rooted in the past
• LCG will benefit from the strategic goals
• LCG frequently has tactical needs
  – Often discovered in production
    • Rarely whiteboard solutions
    • Iterative solutions
    • Sometimes treating the symptoms
    • Needs rapid reaction
• LCG didn’t go through a re-birth and has memory

WLCG Needs

• Understanding what information has been lost
  – Complete the official requirement list
  – Effort from both sides needed
• A more continuous interaction between projects and WLCG
  – The experiments use >80% of all resources
    • Privileged partnership?
  – Influence on the priorities of the development effort?
• WLCG needs to communicate needed middleware changes (also for relatively fast changes)
  – For core components
  – Has to contain rollout plan (including T2s)
  – Currently done for glexec and CREAM-CE via WLCG-MB
  – Documentation and dissemination needed

WLCG questions

• How to manage the middleware effort outside EMI?
  – Example: Support for extra batch systems
  – Several teams have resources that are not part of EMI
  – Do we need to formalize the direct communication between sites/users/developers?
  – How do we deliver the resulting changes?
• Is there a risk of conflict between EGI/NGI and WLCG?
  – T1s, T2s, who decides what versions have to be run?
• WLCG roadmap for SL6 migration.
  – A bit early…
• Do we have to take the LHC machine schedule into account?
  – Not very stable and computing doesn’t stop with the beam

Summary

How do we get the lost requirements back?

How do we establish a fast feedback loop?

How do we manage WLCG middleware work?