Ganga Status and Outlook K. Harrison (University of Cambridge) 16th GridPP Meeting Queen Mary, University of London, 27th-29th June 2006 http://cern.ch/ganga


Ganga Status and Outlook

K. Harrison (University of Cambridge)

16th GridPP Meeting, Queen Mary, University of London, 27th-29th June 2006

http://cern.ch/ganga


28 June 2006

People/groups behind Ganga

• Ganga is an ATLAS/LHCb joint project to develop a Grid user interface

• Current core team:
– F.Brochu (Cambridge), U.Egede (Imperial), J.Elmsheuser (München), K.Harrison (Cambridge), H.C.Lee (ASCC), D.Liko (CERN), A.Maier (CERN), J.T.Moscicki (CERN), A.Muraru (Bucharest), A.Soroko (Oxford), C.L.Tan (Birmingham)

• Strong support from UK (PPARC/GridPP) and EU (EGEE/ARDA)

• Contributions past and present from many others


Ganga in sixty seconds

Diagram components:
• User interface for job definition and management
• Applications
– LHCb applications
– ATLAS applications
– Other applications
• Processing systems (backends)
– Local batch systems
– Distributed (Grid) systems
– Experiment-specific workload-management systems
• Tools for data management
– Metadata catalogues
– File catalogues
– Data storage and retrieval
• Ganga job archives
– Local repository
– Remote repository
• Ganga monitoring loop

• Ganga has built-in support for ATLAS and LHCb
• Component architecture allows customisation for other user groups


Ganga job abstraction

• A job in Ganga is constructed from a set of building blocks, not all required for every job

Job building blocks:
– Application: what to run
– Backend: where to run
– Input Dataset: data read by application
– Output Dataset: data written by application
– Splitter: rule for dividing into subjobs
– Merger: rule for combining outputs
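The job abstraction can be pictured with a small Python sketch. This is not Ganga code: the `Job` dataclass, its field names and the `split_per_file` splitter are hypothetical stand-ins, chosen only to show that some building blocks are optional and that a splitter turns one job into several subjobs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Job:
    """Hypothetical stand-in for a Ganga job: a set of optional building blocks."""
    application: str                                      # what to run
    backend: str = "Local"                                # where to run
    inputdata: List[str] = field(default_factory=list)    # data read by application
    outputdata: List[str] = field(default_factory=list)   # data written by application
    splitter: Optional[Callable[["Job"], List["Job"]]] = None  # rule for dividing into subjobs
    merger: Optional[Callable[[List[str]], str]] = None        # rule for combining outputs

    def subjobs(self) -> List["Job"]:
        # Without a splitter, the job runs as a single unit.
        return self.splitter(self) if self.splitter else [self]


def split_per_file(job: Job) -> List[Job]:
    """Illustrative splitter: one subjob per input file."""
    return [Job(job.application, job.backend, inputdata=[f]) for f in job.inputdata]


# A minimal job needs only an application; everything else is optional.
simple = Job(application="Executable")

# A job with input data and a splitter expands into one subjob per file.
split = Job("DaVinci", "Dirac",
            inputdata=["a.dst", "b.dst", "c.dst"],
            splitter=split_per_file)

print(len(simple.subjobs()), len(split.subjobs()))
```

Keeping the splitter and merger as optional callables mirrors the statement that not every building block is required for every job.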


Framework for plugin handling

• Ganga provides a framework for handling different types of Application, Backend, Dataset, Splitter and Merger, implemented as plugin classes

• Each plugin class has its own schema

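Class hierarchy (diagram):
– Base class: GangaObject
– Plugin interfaces: IApplication, IBackend, IDataset, ISplitter, IMerger
– Example plugins and schemas:
  Executable: exe, env, args
  LCG: CE, requirements, id, status, reason, actualCE, exitcode
– Legend: schema items are marked as User or System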
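A minimal sketch of the idea that each plugin class carries its own schema. The class names and schema item names are taken from the diagram; the validation logic and the default values are assumptions made for illustration, not Ganga's actual implementation.

```python
import copy


class GangaObject:
    """Hypothetical base class: accepts only attributes named in the class schema."""
    schema = {}  # attribute name -> default value (defaults are assumed)

    def __init__(self, **kwargs):
        unknown = set(kwargs) - set(self.schema)
        if unknown:
            raise AttributeError(f"not in schema: {sorted(unknown)}")
        for name, default in self.schema.items():
            # Copy so that mutable defaults are not shared between instances.
            setattr(self, name, copy.deepcopy(kwargs.get(name, default)))


class IApplication(GangaObject):
    """Interface for application plugins."""


class IBackend(GangaObject):
    """Interface for backend plugins."""


class Executable(IApplication):
    schema = {"exe": "echo", "env": {}, "args": []}


class LCG(IBackend):
    schema = {"CE": "", "requirements": "", "id": None, "status": None,
              "reason": None, "actualCE": None, "exitcode": None}


# Registry of plugin classes by category, so that a framework can look them up by name.
PLUGINS = {
    "application": {"Executable": Executable},
    "backend": {"LCG": LCG},
}

app = PLUGINS["application"]["Executable"](exe="/bin/ls", args=["-l"])
print(app.exe, app.args, app.env)
```

Passing an attribute that is not in the schema, for example `LCG(queue="long")`, raises `AttributeError` in this sketch.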


Applications and backends

• Running of a particular Application on a given Backend is enabled by implementing an appropriate adapter component or Runtime Handler
– Can often use same Runtime Handler for several Backends: less coding

Applications (diagram):
– Experiment neutral: Executable
– ATLAS: Athena (Simulation/Digitisation/Reconstruction/Analysis), AthenaMC (Production)
– LHCb: Gauss/Boole/Brunel/DaVinci (Simulation/Digitisation/Reconstruction/Analysis)

Backend labels in diagram: Local, LSF, PBS, OSG, NorduGrid, PANDA, US-ATLAS WMS, LHCb WMS

Legend: Implemented / Work in progress
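One way to picture the Runtime Handler idea is a lookup table keyed by (application, backend) pairs, where a single handler is registered for several backends. The sketch below is an assumption about how such an adapter registry might look; `register`, `prepare`, `HANDLERS` and `batch_handler` are hypothetical names, not Ganga's API.

```python
from typing import Callable, Dict, Tuple

# (application name, backend name) -> handler that prepares the job for that backend
HANDLERS: Dict[Tuple[str, str], Callable[[dict], dict]] = {}


def register(application: str, *backends: str):
    """Register one handler for an application on one or more backends."""
    def decorator(func):
        for backend in backends:
            HANDLERS[(application, backend)] = func
        return func
    return decorator


@register("Executable", "Local", "LSF", "PBS")
def batch_handler(app_config: dict) -> dict:
    # The same handler serves several batch-style backends: less coding.
    return {"command": [app_config["exe"], *app_config.get("args", [])]}


def prepare(application: str, backend: str, app_config: dict) -> dict:
    """Find the handler for this pairing, or report that the pairing is unsupported."""
    try:
        handler = HANDLERS[(application, backend)]
    except KeyError:
        raise NotImplementedError(f"{application} cannot run on {backend}") from None
    return handler(app_config)


print(prepare("Executable", "PBS", {"exe": "echo", "args": ["hello"]}))
```

An application/backend pairing with no registered handler fails early with `NotImplementedError`, which corresponds to a combination that has not been enabled.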


Job repository

• Job repository provides for storage and retrieval of job representations
• User can choose to work with repository on local filesystem, or with repository on remote server that has certificate-based authentication
– Implementation makes use of AMGA database interface (AMGA interface for local database, AMGA interface for remote database)

• API for local and remote repositories is the same, with CVS-like possibilities for job commit, checkout and update

• Also have support for selections, bulk operations, and fast retrieval of summary data
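The point that local and remote repositories share one API can be sketched with an abstract interface and a single toy implementation. Everything here is hypothetical: the method names follow the commit/checkout wording above, jobs are plain dictionaries, and an in-memory store stands in for the AMGA-backed databases.

```python
import copy
from abc import ABC, abstractmethod
from typing import Dict, List


class JobRepository(ABC):
    """Hypothetical common interface for local and remote repositories."""

    @abstractmethod
    def commit(self, jobs: List[dict]) -> None:
        """Store the given job representations (bulk operation)."""

    @abstractmethod
    def checkout(self, ids: List[int]) -> List[dict]:
        """Retrieve working copies of the given jobs."""


class InMemoryRepository(JobRepository):
    """Toy implementation; a remote one would expose the same methods."""

    def __init__(self) -> None:
        self._store: Dict[int, dict] = {}

    def commit(self, jobs: List[dict]) -> None:
        for job in jobs:
            self._store[job["id"]] = copy.deepcopy(job)

    def checkout(self, ids: List[int]) -> List[dict]:
        # Copies, so that edits are private until the next commit.
        return [copy.deepcopy(self._store[i]) for i in ids]

    def select(self, **criteria) -> List[int]:
        """Return ids of jobs whose attributes match all criteria."""
        return sorted(i for i, job in self._store.items()
                      if all(job.get(k) == v for k, v in criteria.items()))

    def summary(self) -> Dict[int, str]:
        """Summary data without retrieving full job representations."""
        return {i: job["status"] for i, job in self._store.items()}


repo = InMemoryRepository()
repo.commit([{"id": 1, "status": "running"}, {"id": 2, "status": "completed"}])
working_copy = repo.checkout([1])[0]
working_copy["status"] = "completed"   # local change, not yet visible in the repository
print(repo.select(status="completed"), repo.summary())
repo.commit([working_copy])            # commit makes the change visible
print(repo.select(status="completed"))
```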


Job monitoring

• Job monitoring is multi-threaded
– Can set different refresh rate for different Backends
• Actions initiated in monitoring threads include updating of job status in repository, and output retrieval for completed jobs
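A rough sketch of multi-threaded monitoring with a per-backend refresh rate, using only Python's standard `threading` module. The function names, the `pollers` callables and the dictionary used as a repository are assumptions for illustration; the sketch covers status updates only, not output retrieval.

```python
import threading
from typing import Callable, Dict


def monitor(poll: Callable[[], Dict[int, str]], refresh: float,
            repository: Dict[int, str], lock: threading.Lock,
            stop: threading.Event) -> None:
    """Poll one backend, then wait `refresh` seconds; repeat until `stop` is set."""
    while True:
        for job_id, status in poll().items():
            with lock:                      # the repository is shared between threads
                repository[job_id] = status
        if stop.wait(refresh):              # returns True as soon as stop is set
            break


def start_monitoring(pollers, refresh_rates, repository):
    """Start one monitoring thread per backend, each with its own refresh rate."""
    lock, stop = threading.Lock(), threading.Event()
    threads = [
        threading.Thread(target=monitor, name=f"monitor-{name}", daemon=True,
                         args=(poll, refresh_rates[name], repository, lock, stop))
        for name, poll in pollers.items()
    ]
    for thread in threads:
        thread.start()
    return threads, stop


# Hypothetical pollers standing in for real backend status queries.
pollers = {"Local": lambda: {1: "completed"}, "LCG": lambda: {2: "running"}}
repository: Dict[int, str] = {}
threads, stop = start_monitoring(pollers, {"Local": 0.01, "LCG": 0.05}, repository)
stop.set()                                  # each thread polls at least once before exiting
for thread in threads:
    thread.join()
print(repository)
```

Using `Event.wait` as the sleep lets every thread stop promptly, whatever its refresh rate.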


Ganga Command-Line Interface in Python (CLIP)

• CLIP provides interactive job definition and submission from an enhanced Python shell (IPython)
– Especially good for trying things out, and understanding how the system works

# List the available application plug-ins
list_plugins("application")

# Create a DaVinci job to be submitted to DIRAC
j = Job(application="DaVinci", backend="Dirac")

# Set the job-options file
j.application.optsfile = "myOpts.txt"

# Submit the job
j.submit()

# Search for string in job's standard output
!grep "Selected events" $j.outputdir/stdout


Ganga scripting

• From the command line, a script myScript.py can be executed in the Ganga environment using:
ganga myScript.py
– Allows automation of repetitive tasks

• Scripts for basic tasks included in distribution

# Create an Athena job to be submitted to LCG
ganga make_job Athena LCG test.py

# Edit test.py to set Athena properties, then submit job
ganga submit test.py

# Query status, triggering output retrieval if job is completed
ganga query

Approach similar to the one traditionally used when submitting to a local batch system


Ganga Graphical User Interface (GUI)

• GUI consists of central monitoring panel and dockable windows

• Job definition based on mouse selections and field completion

• Highly configurable: choose what to display and how

GUI panels (screenshot labels): Job details, Logical Folders, Job Monitoring, Log window, Job builder, Scriptor


Bringing Ganga to the users

CERN, September 2005; Cambridge, January 2006; Bologna, June 2006

• Since July 2005, have had three Ganga tutorials for LHCb and two for ATLAS, in various locations

• Approach of GridPP-supported LHCb-UK Software Course (January 2006), with Ganga/Grid session integrated in more-general course, very successful
– Attracts users who wouldn't otherwise be considering the Grid

• Ganga tried out by 100+ people, with positive feedback
– "Very handy way to organise job submission" (ATLAS user)
– "Clever and nicely designed" (LHCb user)

• Small but growing group of people regularly using Ganga (also from a laptop)


Successes in distributed analysis

• Success of undergraduate project students in running LHCb analyses using the experiment’s distributed-analysis system reported in GridPP news item

• System is based on LCG (Grid infrastructure), DIRAC (workload management layer) and Ganga (user interface)
• Together, project students and others in LHCb-Cambridge processed more than 75 million simulated beauty events over three-month interval
• Fraction of jobs completing successfully averaged about 92%
• Extended periods with success rate >95%

Excellent demonstration that Ganga allows physics analyses to be run easily on the Grid by people with no knowledge of Grid technicalities

Did he say 75 million?


Ganga beyond ATLAS and LHCb

• In EGEE, Ganga is used as submission engine and monitoring system for the DIANE job-distribution framework

• Ganga/DIANE combination adopted for a number of applications

• Use of Grid in search for drugs against avian flu widely reported
• About one eighth of jobs submitted using Ganga/DIANE
[Figure: job statistics from Ganga]

• Geant 4 regression tests performed for major releases (twice per year)
– Search for differences in simulation results
• Ganga/DIANE adopted for running these tests on the Grid
– First use December 2005

• ITU Regional Radio Conference held in Geneva, May-June 2006
• Required real-time optimisation of evolving plan for sharing frequencies between 120 countries
– Maximise number of satisfied requests
– Minimise interference
• Ganga/DIANE used to run optimisation jobs on the Grid


Conference contributions: July 2005 - June 2006

• AHM2005 (Nottingham, UK, September 2005)
– Ganga user interface for job definition and management (K.Harrison)
– Distributed analysis in the ATLAS experiment (C.L.Tan)

• (Milano, Italy, September 2005)
– Ganga user interface for job definition and management (D.Liko/K.Harrison)

• (Mumbai, India, February 2006)
– Ganga: a Grid user interface (K.Harrison)
– Experience with distributed analysis in LHCb (U.Egede)

• ISGC2006 (Taipei, Taiwan, May 2006)
– Ganga: a job management and optimising tool for job submission to the Grid (A.Maier)


Conference contributions: coming attractions

• (Geneva, Switzerland, July 2006)
– Using Python in the Development of a Grid user interface for distributed data analysis (A.Soroko)

• AHM2006 (Nottingham, UK, September 2006)
– Ganga: a Grid user interface for distributed analysis (A.Soroko)
– Distributed analysis in the ATLAS experiment (C.L.Tan)


Conclusions

• Excellent progress with Ganga development since redesign (early 2005)

• Wealth of functionality has been implemented

– Support for Applications and Backends of interest to ATLAS and LHCb (work in progress on ATLAS-specific Backends: PANDA and NorduGrid)

– Possibilities for working at the command line, with scripts, and through a graphical interface

– Job monitoring, local/remote repository, job splitting, and more

• Work on data handling delayed because of uncertainties in the experiments, but is now one of the top priorities

• Several highly successful Ganga tutorials have been held: more to come

• Ganga has allowed high-statistics LHCb physics studies to be performed on the Grid by people with no knowledge of Grid technicalities

• Ganga used for a range of applications beyond ATLAS and LHCb