PDC’06 – production status and issues
Latchezar Betev
TF meeting – May 04, 2006

Running status
Central services – all OK, no intervention necessary
  With the exception of the ProxyServer – solution is being discussed (Andreas, Pablo, Predrag)
Site services – all OK (on running sites)
Running standard production jobs, old AliRoot
  Job duration – 8 hours
  Job output – CERN storage still firewalled: prevents us from storing data at CERN
Stable running since 25/04 – 9 days
Currently 15 sites (2 T1s, 13 T2s)

Site profiles
Average 520 jobs, max 1180 jobs

Site profiles (2)
Job statistics

Site profiles (CERN)
Periodic drops in the number of jobs accepted by LCG

Site profiles (T2s)
Uneven job acceptance, no method yet to track and enforce the ALICE resource share

Distribution of done jobs
Approximately 40 % T2 / 60 % T1
Muenster (Opteron) is boosting the T2 share, T1s are underrepresented

Issues
Storage at CERN – still unresolved
Monitoring and submitting jobs at sites:
  Sites typically advertise 0 free CPUs
  Current system is auto-calculating the number of jobs to submit – penalizing ALICE
  Have to go back to the AliEn system of deterministic values for number of CPUs and number of submitted job agents, irrespective of advertised resources
Job communication (Proxy) with the central services
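
The difference between the two submission policies named under "Monitoring and submitting jobs at sites" can be illustrated with a minimal sketch. It is not the AliEn implementation: the class SiteConfig, its fields max_jobs and max_queued, and both functions are hypothetical names chosen for illustration. It only assumes that the deterministic policy tops up job agents to fixed per-site limits, while the auto-calculated policy follows the number of free CPUs a site advertises.

```python
from dataclasses import dataclass


@dataclass
class SiteConfig:
    """Hypothetical per-site limits, fixed in advance rather than read from the site."""
    name: str
    max_jobs: int    # assumed cap on running + queued job agents at the site
    max_queued: int  # assumed cap on job agents waiting in the site queue


def agents_auto(advertised_free_cpus: int) -> int:
    """Auto-calculated policy: submit only as many agents as the site advertises free CPUs."""
    return max(advertised_free_cpus, 0)


def agents_deterministic(cfg: SiteConfig, running: int, queued: int) -> int:
    """Deterministic policy: top up to the configured limits, ignoring advertised resources."""
    room_total = cfg.max_jobs - (running + queued)
    room_queue = cfg.max_queued - queued
    return max(min(room_total, room_queue), 0)


# Illustrative numbers only, not taken from the production.
site = SiteConfig(name="T2_example", max_jobs=100, max_queued=20)
print(agents_auto(advertised_free_cpus=0))                # site advertises 0 free CPUs
print(agents_deterministic(site, running=50, queued=5))   # limits still leave room
```

With a site that advertises 0 free CPUs, the auto-calculated policy submits nothing, whereas the deterministic one keeps the queue filled up to the configured limits. This is the sense in which the current behaviour penalizes ALICE.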

Loss of connectivity with CS
Simultaneous occurrence at sites, correlated with ERROR_S, ERROR_IB

Issues (2)
Deployment of VO-boxes:
  Process is steadily ongoing, however not as fast as we would like it to be
  Mix of problems – some LCG, some AliEn services related
  Deployment experts are working around the clock
  Hopefully after the initial setup phase, further updates will be much faster
  Hope that gLite 3.0 is not going to change the rules completely
List of sites – to be discussed after this presentation