Ganga Status and Outlook
K. Harrison (University of Cambridge)
16th GridPP Meeting
Queen Mary, University of London, 27th-29th June 2006
http://cern.ch/ganga
28 June 2006 2/17
People/groups behind Ganga
• Ganga is an ATLAS/LHCb joint project to develop a Grid user interface
• Current core team:
– F.Brochu (Cambridge), U.Egede (Imperial), J.Elmsheuser (München), K.Harrison (Cambridge), H.C.Lee (ASCC), D.Liko (CERN), A.Maier (CERN), J.T.Moscicki (CERN), A.Muraru (Bucharest), A.Soroko (Oxford), C.L.Tan (Birmingham)
• Strong support from UK (PPARC/GridPP) and EU (EGEE/ARDA)
• Contributions past and present from many others
Ganga in sixty seconds
[Diagram: Ganga user interface for job definition and management, shown with the components it works with]
– Applications: LHCb applications, ATLAS applications, other applications
– Processing systems (backends): local batch systems, distributed (Grid) systems, experiment-specific workload-management systems
– Tools for data management: metadata catalogues, file catalogues, data storage and retrieval
– Ganga job archives: local repository, remote repository
– Ganga monitoring loop
• Ganga has built-in support for ATLAS and LHCb
• Component architecture allows customisation for other user groups
Ganga job abstraction
• A job in Ganga is constructed from a set of building blocks, not all required for every job
[Diagram: a Job and its building blocks]
– Application: what to run
– Backend: where to run
– Input Dataset: data read by application
– Output Dataset: data written by application
– Splitter: rule for dividing into subjobs
– Merger: rule for combining outputs
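The building-block idea can be illustrated with a minimal sketch. The class SketchJob, its field names and the function one_file_per_subjob are hypothetical names chosen for this example; they are not Ganga's own classes, and the splitting rule shown is only one possible choice.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class SketchJob:
    """Hypothetical stand-in for a job; not Ganga's real Job class."""
    application: str                                          # what to run
    backend: str                                              # where to run
    input_dataset: List[str] = field(default_factory=list)    # data read
    output_dataset: List[str] = field(default_factory=list)   # data written
    # Optional building blocks: a job may have neither.
    splitter: Optional[Callable[["SketchJob"], List["SketchJob"]]] = None
    merger: Optional[Callable[[List[str]], str]] = None

    def subjobs(self) -> List["SketchJob"]:
        """Apply the splitter if one is set; otherwise the job runs whole."""
        if self.splitter is None:
            return [self]
        return self.splitter(self)


def one_file_per_subjob(job: SketchJob) -> List[SketchJob]:
    """Example splitting rule (an assumption): one subjob per input file."""
    return [
        SketchJob(job.application, job.backend, input_dataset=[name])
        for name in job.input_dataset
    ]


job = SketchJob(
    application="DaVinci",
    backend="Dirac",
    input_dataset=["a.dst", "b.dst", "c.dst"],
    splitter=one_file_per_subjob,
)
parts = job.subjobs()
```

Here only the Application and Backend are mandatory, which mirrors the statement that not all building blocks are required for every job.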
Framework for plugin handling
• Ganga provides a framework for handling different types of Application, Backend, Dataset, Splitter and Merger, implemented as plugin classes
• Each plugin class has its own schema
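[Diagram: plugin class hierarchy]
– Base class: GangaObject
– Plugin interfaces: IApplication, IBackend, IDataset, ISplitter, IMerger
– Example plugins and schemas:
  Executable (IApplication): exe, env, args
  LCG (IBackend): CE, requirements, id, status, reason, actualCE, exitcode
– Schema items are marked as User or System in the diagram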
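One way to picture plugin classes that each carry their own schema is sketched below. The schema item names are those shown on the slide; everything else (GangaObjectSketch, ExecutableSketch, LCGSketch, the PLUGINS table, list_plugins_sketch and the default values) is a hypothetical illustration, not Ganga's implementation.

```python
import copy


class GangaObjectSketch:
    """Hypothetical base class: only attributes named in the schema exist."""
    schema = {}  # attribute name -> default value

    def __init__(self, **settings):
        # Start from a private copy of each default, then apply user settings.
        for name, default in self.schema.items():
            object.__setattr__(self, name, copy.deepcopy(default))
        for name, value in settings.items():
            setattr(self, name, value)

    def __setattr__(self, name, value):
        # Reject attributes that are not part of this plugin's schema.
        if name not in self.schema:
            raise AttributeError(
                f"{type(self).__name__} has no schema item {name!r}")
        object.__setattr__(self, name, value)


class ExecutableSketch(GangaObjectSketch):
    """Application plugin with the schema items listed on the slide."""
    schema = {"exe": "", "env": {}, "args": []}


class LCGSketch(GangaObjectSketch):
    """Backend plugin with the schema items listed on the slide."""
    schema = {"CE": "", "requirements": "", "id": None, "status": None,
              "reason": None, "actualCE": None, "exitcode": None}


# Hypothetical registry of plugins, grouped by category.
PLUGINS = {
    "application": {"Executable": ExecutableSketch},
    "backend": {"LCG": LCGSketch},
}


def list_plugins_sketch(category):
    """Return the names of the plugins registered for a category."""
    return sorted(PLUGINS[category])


app = ExecutableSketch(exe="/bin/echo", args=["hello"])
```

Keeping the schema as data on each class means a generic framework can validate, display or store any plugin without knowing its type in advance.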
Applications and backends
• Running of a particular Application on a given Backend is enabled by implementing an appropriate adapter component or Runtime Handler
– Can often use same Runtime Handler for several Backends: less coding
[Diagram: Applications and Backends, each marked as implemented or work in progress]
– Backends and workload-management systems shown: PBS, OSG, NorduGrid, Local, LSF, PANDA, US-ATLAS WMS, LHCb WMS
– Experiment-neutral Application: Executable
– ATLAS Applications: Athena (Simulation/Digitisation/Reconstruction/Analysis), AthenaMC (Production)
– LHCb Applications: Gauss/Boole/Brunel/DaVinci (Simulation/Digitisation/Reconstruction/Analysis)
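The adapter idea can be sketched as a lookup table keyed on (Application, Backend) pairs. The names HANDLERS, register_handler, shell_handler and prepare are hypothetical, and the choice of Local, LSF and PBS as the backends sharing one handler is an assumption made for illustration only.

```python
# Hypothetical table: (application name, backend name) -> handler function.
HANDLERS = {}


def register_handler(application, backends, handler):
    """Register one handler for an application on several backends."""
    for backend in backends:
        HANDLERS[(application, backend)] = handler


def shell_handler(app_config):
    """Turn application settings into a command line for a batch system."""
    return [app_config["exe"], *app_config.get("args", [])]


# One handler serves three batch-style backends: less coding.
register_handler("Executable", ["Local", "LSF", "PBS"], shell_handler)


def prepare(application, backend, app_config):
    """Find the handler for this pairing and let it prepare the job."""
    try:
        handler = HANDLERS[(application, backend)]
    except KeyError:
        raise LookupError(
            f"no runtime handler for {application} on {backend}") from None
    return handler(app_config)


command = prepare("Executable", "LSF", {"exe": "echo", "args": ["hi"]})
```

A pairing with no registered handler fails early with a clear error, instead of failing at submission time.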
Job repository
• Job repository provides for storage and retrieval of job representations
• User can choose to work with repository on local filesystem, or with repository on remote server that has certificate-based authentication
– Implementation makes use of AMGA database interface
[Diagram: AMGA interface for remote database; AMGA interface for local database]
• API for local and remote repositories is the same, with CVS-like possibilities for job commit, checkout and update
• Also have support for selections, bulk operations, and fast retrieval of summary data
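A common API with commit and checkout operations might look like the sketch below. RepositorySketch and InMemoryRepository are hypothetical names, jobs are represented as plain dictionaries for simplicity, and no AMGA calls are shown; a remote implementation would provide the same three methods.

```python
import copy
from abc import ABC, abstractmethod


class RepositorySketch(ABC):
    """Hypothetical common API for local and remote repositories."""

    @abstractmethod
    def commit(self, job_id, job):
        """Store the current representation of a job."""

    @abstractmethod
    def checkout(self, job_id):
        """Return a working copy of a stored job."""

    @abstractmethod
    def summary(self):
        """Return summary data (here: status only) for all jobs."""


class InMemoryRepository(RepositorySketch):
    """Toy local implementation that keeps job dictionaries in memory."""

    def __init__(self):
        self._jobs = {}

    def commit(self, job_id, job):
        # Store a copy, so later edits need an explicit commit.
        self._jobs[job_id] = copy.deepcopy(job)

    def checkout(self, job_id):
        return copy.deepcopy(self._jobs[job_id])

    def summary(self):
        return {job_id: job["status"]
                for job_id, job in sorted(self._jobs.items())}


repo = InMemoryRepository()
repo.commit(1, {"application": "DaVinci", "status": "new"})
working_copy = repo.checkout(1)
working_copy["status"] = "submitted"   # not visible until committed
before = repo.summary()
repo.commit(1, working_copy)
after = repo.summary()
```

Because user code depends only on the abstract interface, swapping a local repository for a remote one would not change the calling code.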
Job monitoring
• Job monitoring is multi-threaded
– Can set different refresh rate for different Backends
• Actions initiated in monitoring threads include updating of job status in repository, and output retrieval for completed jobs
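Multi-threaded monitoring with a per-backend refresh rate can be sketched with the standard threading module. The functions monitor and run_demo, the backend names and the intervals are all assumptions for this example; the poll here only counts calls, where a real one would query the backend.

```python
import threading
import time
from collections import Counter


def monitor(backend, interval, poll, stop):
    """Call poll(backend) every `interval` seconds until `stop` is set."""
    # Event.wait returns False on timeout and True once the event is set.
    while not stop.wait(interval):
        poll(backend)


def run_demo(refresh_rates, duration):
    """Run one monitoring thread per backend and count the polls made."""
    counts = Counter()
    lock = threading.Lock()
    stop = threading.Event()

    def poll(backend):
        # A real poll would update job status in the repository and
        # retrieve output for completed jobs; this one only counts.
        with lock:
            counts[backend] += 1

    threads = [
        threading.Thread(target=monitor, args=(name, interval, poll, stop))
        for name, interval in refresh_rates.items()
    ]
    for thread in threads:
        thread.start()
    time.sleep(duration)
    stop.set()
    for thread in threads:
        thread.join()
    return counts


# Refresh rates in seconds: a fast local backend and a slower Grid backend.
counts = run_demo({"Local": 0.01, "LCG": 0.1}, duration=0.5)
```

Waiting on the Event instead of calling time.sleep in the loop lets every thread stop promptly when monitoring is shut down.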
Ganga Command-Line Interface in Python (CLIP)
• CLIP provides interactive job definition and submission from an enhanced Python shell (IPython)
– Especially good for trying things out, and understanding how the system works
    # List the available application plug-ins
    list_plugins("application")
    # Create a DaVinci job to be submitted to DIRAC
    j = Job(application="DaVinci", backend="Dirac")
    # Set the job-options file
    j.application.optsfile = "myOpts.txt"
    # Submit the job
    j.submit()
    # Search for string in job's standard output
    !grep "Selected events" $j.outputdir/stdout
Ganga scripting
• From the command line, a script myScript.py can be executed in the Ganga environment using:
    ganga myScript.py
– Allows automation of repetitive tasks
• Scripts for basic tasks included in distribution
    # Create an Athena job to be submitted to LCG
    ganga make_job Athena LCG test.py
    # Edit test.py to set Athena properties, then submit job
    ganga submit test.py
    # Query status, triggering output retrieval if job is completed
    ganga query
Approach similar to the one traditionally used when submitting to a local batch system
Ganga Graphical User Interface (GUI)
• GUI consists of central monitoring panel and dockable windows
• Job definition based on mouse selections and field completion
• Highly configurable: choose what to display and how
[Screenshot of GUI, with labelled panels: Job details, Logical Folders, Job Monitoring, Log window, Job builder, Scriptor]
Bringing Ganga to the users
[Photos: CERN, September 2005; Cambridge, January 2006; Bologna, June 2006]
• Since July 2005, have had three Ganga tutorials for LHCb and two for ATLAS, in various locations
• Approach of GridPP-supported LHCb-UK Software Course (January 2006), with Ganga/Grid session integrated in more-general course, very successful
– Attract users who wouldn’t otherwise be considering the Grid
• Ganga tried out by 100+ people, with positive feedback
– “Very handy way to organise job submission” (ATLAS user)
– “Clever and nicely designed” (LHCb user)
• Small but growing group of people regularly using Ganga (also from a laptop)
Successes in distributed analysis
• Success of undergraduate project students in running LHCb analyses using the experiment’s distributed-analysis system reported in GridPP news item
• System is based on LCG (Grid infrastructure), DIRAC (workload management layer) and Ganga (user interface)
• Together, project students and others in LHCb-Cambridge processed more than 75 million simulated beauty events over three-month interval
• Fraction of jobs completing successfully averaged about 92%
• Extended periods with success rate >95%
Excellent demonstration that Ganga allows physics analyses to be run easily on the Grid by people with no knowledge of Grid technicalities
[Image caption: “Did he say 75 million?”]
Ganga beyond ATLAS and LHCb
• In EGEE, Ganga is used as submission engine and monitoring system for the DIANE job-distribution framework
• Ganga/DIANE combination adopted for a number of applications
• Use of Grid in search for drugs against avian flu widely reported
• About one eighth of jobs submitted using Ganga/DIANE
[Plot: Job statistics from Ganga]
• Geant 4 regression tests performed for major releases (twice per year)
– Search for differences in simulation results
• Ganga/DIANE adopted for running these tests on the Grid
– First use December 2005
• ITU Regional Radio Conference held in Geneva, May-June 2006
• Required real-time optimisation of evolving plan for sharing frequencies between 120 countries
– Maximise number of satisfied requests
– Minimise interference
• Ganga/DIANE used to run optimisation jobs on the Grid
Conference contributions: July 2005 - June 2006
• AHM2005 (Nottingham, UK, September 2005)
– Ganga user interface for job definition and management (K.Harrison)
– Distributed analysis in the ATLAS experiment (C.L.Tan)
• (Milano, Italy, September 2005)
– Ganga user interface for job definition and management (D.Liko/K.Harrison)
• (Mumbai, India, February 2006)
– Ganga: a Grid user interface (K.Harrison)
– Experience with distributed analysis in LHCb (U.Egede)
• ISGC2006 (Taipei, Taiwan, May 2006)
– Ganga: a job management and optimising tool for job submission to the Grid (A.Maier)
Conference contributions: coming attractions
• AHM2006 (Nottingham, UK, September 2006)
– Ganga: a Grid user interface for distributed analysis (A.Soroko)
– Distributed analysis in the ATLAS experiment (C.L.Tan)
• (Geneva, Switzerland, July 2006)
– Using Python in the Development of a Grid user interface for distributed data analysis (A.Soroko)
Conclusions
• Excellent progress with Ganga development since redesign (early 2005)
• Wealth of functionality has been implemented
– Support for Applications and Backends of interest to ATLAS and LHCb
  • Work in progress on ATLAS-specific Backends: PANDA and NorduGrid
– Possibilities for working at the command line, with scripts, and through a graphical interface
– Job monitoring, local/remote repository, job splitting, and more
• Work on data handling delayed because of uncertainties in the experiments, but is now one of the top priorities
• Several highly successful Ganga tutorials have been held: more to come
• Ganga has allowed high-statistics LHCb physics studies to be performed on the Grid by people with no knowledge of Grid technicalities
• Ganga used for a range of applications beyond ATLAS and LHCb