
The BTeV software infrastructure: a personal view

The BTeV software infrastructure: a personal point of view

Dario Menasce


Foreword

Building a complex software infrastructure like the one we think BTeV needs, given the many constraints imposed by the nature of the experiment, is clearly a daunting task. That said (an obvious statement), where do we start?

First, let’s produce a crude shopping list of requirements:

• Online

• Offline

• Services

• Mix

Trigger code: clearly in the online domain; must be lightweight, compact and efficient.

Monitors: very online-specialized.

Full reconstruction: clearly crossing the border between online and offline (Level I, II and III triggers).

Calibration: not so sure, certainly needed by Level I reconstruction.

Alignment: same as for calibration.

Data analysis: an express line to monitor data quality probably makes it a borderline issue.

Databases: again, calibration constants are needed as input by the trigger code.

Web tools: combine Online (remote control), Offline (data localization) and Grid (brokers of various nature).

Grid tools: THE service.

Simulation, Analysis: both offline classics.


Foreword

This is BTeV’s peculiarity: no distinct boundary between online and offline.

This already places important constraints on the overall design, which needs to be studied in depth before implementing code.

Code must be easily reusable in both the online and offline realms; an obvious example is the Level I trigger code, which must run on several kinds of processors (online) and in the simulation.

But even concentrating on just, let’s say, offline, one would like to try games like running several instances (or incarnations) of a given algorithm (e.g. a vertex fitter) and comparing them (efficiency, speed, …).

Of course a good OO design removes most of the hurdles of switching from one algorithm to the next, but it does so at an additional cost in CPU cycles that needs to be carefully evaluated.
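To make the "several incarnations of one algorithm" game concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Track` struct, the `VertexFitter` interface and the two toy fitters are hypothetical and are not taken from any existing BTeV code. Each call to `fit()` goes through a virtual function, which is exactly the kind of overhead that would have to be measured before using this pattern in trigger code.

```cpp
#include <chrono>
#include <memory>
#include <string>
#include <vector>

// Hypothetical minimal track: position along the beam axis and its uncertainty.
// The weighted fitter below assumes sigma > 0.
struct Track {
    double z;
    double sigma;
};

// Abstract interface: every vertex-fitter incarnation implements fit().
class VertexFitter {
public:
    virtual ~VertexFitter() = default;
    virtual std::string name() const = 0;
    virtual double fit(const std::vector<Track>& tracks) const = 0;
};

// Incarnation 1: plain average of the track z positions.
class MeanFitter : public VertexFitter {
public:
    std::string name() const override { return "mean"; }
    double fit(const std::vector<Track>& tracks) const override {
        if (tracks.empty()) return 0.0;
        double sum = 0.0;
        for (const Track& t : tracks) sum += t.z;
        return sum / static_cast<double>(tracks.size());
    }
};

// Incarnation 2: average weighted by 1/sigma^2.
class WeightedFitter : public VertexFitter {
public:
    std::string name() const override { return "weighted"; }
    double fit(const std::vector<Track>& tracks) const override {
        double num = 0.0;
        double den = 0.0;
        for (const Track& t : tracks) {
            const double w = 1.0 / (t.sigma * t.sigma);
            num += w * t.z;
            den += w;
        }
        return den > 0.0 ? num / den : 0.0;
    }
};

// Result of running one fitter: its name, the fitted z and the wall time spent.
struct FitReport {
    std::string name;
    double z;
    double microseconds;
};

// Runs every fitter on the same tracks so results and timings can be compared.
std::vector<FitReport> compareFitters(
        const std::vector<std::unique_ptr<VertexFitter>>& fitters,
        const std::vector<Track>& tracks) {
    std::vector<FitReport> reports;
    for (const auto& fitter : fitters) {
        const auto start = std::chrono::steady_clock::now();
        const double z = fitter->fit(tracks);
        const auto stop = std::chrono::steady_clock::now();
        const double us =
            std::chrono::duration<double, std::micro>(stop - start).count();
        reports.push_back({fitter->name(), z, us});
    }
    return reports;
}
```

Adding a third incarnation means writing one more class; `compareFitters` does not change. A single timed call is of course far too coarse for a real speed comparison, which would need many repetitions on realistic events.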

I would say: write hard code in C and infrastructure code in C++, but I leave in-depth discussion to the experts. Just keep in mind the prime directive.


Foreword

These considerations suggest, by and large, two possible approaches:

A: Develop each needed component as a standalone package; if needed, glue component functionalities together with suitable Perl and/or Python scripts. Each component (e.g. simulation) can be a small, self-contained framework.

B: Adopt a common framework from the beginning; with this approach every sub-component (possibly a framework in its own right) can communicate with any other one at will.

Pros and cons

A: pro - not much overall design is needed, development doesn’t require coordination and each component can be developed quickly
   con - interaction between, say, reconstruction and simulation can become tricky and cumbersome, even more so taking Grid issues into account

B: pro - could (maybe) solve the issues raised above. One such framework already exists and is in use (Gaudi). Stefano played a little with it and can comment.
   con - might turn out to be far too rigid a structure, unsuitable for BTeV’s needs


Foreword

Once more, all that said, where and how do we start?

My (debatable) opinion is: let’s start with simulation.

• it’s needed early on for almost everything
• a basic framework has already been implemented
• make it fully functional
• keep it simple (at least from a user’s perspective)
• embed it in a larger framework (but only if deemed necessary)
• make it Grid-compliant from the beginning (implies synergy with Grid developers)
• avoid unnecessary features and complications: if they’re needed, add them as plug-ins (keep components decoupled as much as possible, take advantage of the lessons learned by developing Pomone and The Beast)

As shown by Stefano, we already made a first tentative step towards this goal
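The plug-in idea in the list above can be sketched as a small registry that maps a name to a factory, so that the core of the simulation never has to name an optional feature explicitly. This is only an illustration under my own assumptions: `Plugin`, `PluginRegistry` and `EventDisplay` are hypothetical names and do not describe how the Beast is actually built.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical base class for optional simulation features.
class Plugin {
public:
    virtual ~Plugin() = default;
    virtual std::string describe() const = 0;
};

// Maps a plug-in name to a factory, so the core never names concrete classes.
class PluginRegistry {
public:
    using Factory = std::function<std::unique_ptr<Plugin>()>;

    // Returns false if a plug-in with this name is already registered.
    bool add(const std::string& name, Factory factory) {
        return factories_.emplace(name, std::move(factory)).second;
    }

    // Returns a new instance, or nullptr if the name is unknown.
    std::unique_ptr<Plugin> create(const std::string& name) const {
        const auto it = factories_.find(name);
        if (it == factories_.end()) return nullptr;
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};

// Example optional feature (hypothetical): an event display hook.
class EventDisplay : public Plugin {
public:
    std::string describe() const override { return "event display"; }
};
```

A feature that nobody registers simply does not exist as far as the core is concerned, which is the decoupling we are after.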


What’s next

We should start evaluating the Beast and decide whether it’s the tool we need or whether to discard it and try something else.

People have already started playing with it, but I think we need a formal procedure to reach consensus on one of the two possible outcomes. We propose to establish a committee to evaluate the pros and cons of our proposal and make a recommendation to the collaboration.

Once the committee has produced a report to the collaboration, we will have the possibility to look forward to the next steps and do some more work.

Let’s examine the two scenarios

1) The Beast proposal does not meet the requirements set forth by the appointed committee. Easy: start a new project (but the committee should perhaps provide new guidance in this case).

How can we reach such a decision?


What’s next

2) The Beast proposal indeed meets the requirements. Take the recommendations provided by the committee and use them to further improve the development of the simulation framework.

Once the simulation tool has reached maturity (which means it can provide fake data files to be used by an external reconstruction code), we will need to integrate it within a larger domain.

• this is the time to really evaluate whether we need an overseeing framework or remain content with the simulation as a stand-alone package. Important criteria for reaching a decision include a common geometry description of the detector shared by simulation and reconstruction (which requires some sort of geometry manager and a related input-output protocol for data files, like XML or even GDML), plus interaction with the Grid. A common framework could already provide tools to make this integration seamless, but this needs investigation and formal approval by the committee.
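The point about a common geometry description can be illustrated with a toy round trip: one program writes the detector description, another reads it back and obtains the same numbers. The sketch below deliberately uses a trivial line-based text format as a stand-in for XML or GDML, and the `Volume` struct and station names are invented for the example; names must not contain spaces in this toy format.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical description of one detector element.
struct Volume {
    std::string name;    // no spaces allowed in this toy format
    double zCm;          // position along the beam axis, in cm
    double thicknessCm;  // thickness along the beam axis, in cm
};

// Writer side (e.g. the simulation): one line per volume.
void writeGeometry(std::ostream& out, const std::vector<Volume>& volumes) {
    for (const Volume& v : volumes)
        out << v.name << ' ' << v.zCm << ' ' << v.thicknessCm << '\n';
}

// Reader side (e.g. the reconstruction): parses the same lines back.
std::vector<Volume> readGeometry(std::istream& in) {
    std::vector<Volume> volumes;
    Volume v;
    while (in >> v.name >> v.zCm >> v.thicknessCm) volumes.push_back(v);
    return volumes;
}
```

The only thing simulation and reconstruction share here is the file format and the `Volume` definition, which is the role a geometry manager would play at full scale.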


Who’s supposed to do what

To jump start this chain of events, we need to:

• establish the Beast evaluation committee: who’s appointing who? (Joel/Sheldon?)

• identify working groups to develop individual components (trigger, reconstruction code, grid…) (many are already in place and very active)

• establish a software steering committee (the same as the Beast evaluation one?). Its charge should be to coordinate and give guidance to the individual working groups, in order to develop a software environment that grows harmoniously and stays well focused on the short-, medium- and long-term needs of the experiment. This should be a synergistic effort between physicists and software professionals (under the strong imperative of the prime directive: take data). Again, chain of command?


How to formalize requirements

• Continue and improve the periodic simulation meetings, but start to produce requirements and milestones for the development of the infrastructure (another charge for the members of the steering committee)

• avoid, during this process, any unnecessary bureaucracy: do not produce tons of papers with unrealistic (too fine-grained) requirements, but keep focused on the overall picture. Individual software components need to interact as smoothly as possible; the key to success is not complexity but decoupling between functionalities, obtained by providing a set of well defined public interfaces (design-first strategy!).

From personal experience, successful software projects start as small, well-coordinated efforts that grow harmoniously. The bureaucratic approach (define requirements, write papers, follow requirements, periodically verify conformance) almost always collapses under its own weight if taken too literally. (I’m Italian, I know…)

but….


On a more technical side

To succeed in this enterprise we need to tame several other beasts

The names of the game are: C++, Geant4, ROOT, XML, GDML, Qt, Grid (Globus, Condor, GriPhyN…)

A key aspect we have to safeguard in order to achieve a successful result is the contribution to the simulation and analysis effort from people who, for historical reasons, have not had the chance to familiarize themselves with these modern tools. How do we keep these individuals involved?

• this is where a framework comes in handy: users will only deal with specific code implementations (a detector geometry, a track fitting algorithm), while the overall process will be driven by the embedding structure, with unnecessary complexity hidden away and dealt with by professionals

• learning ROOT (or Geant4) in principle just requires following the tutorial examples that come with these packages. Unfortunately this is NOT true unless you’re already somewhat familiar with C++. Without that familiarity, again from personal experience, the learning process takes substantially longer.


On a more technical side

I would therefore look very favorably on a process where novices are led to learn the very basics of C++ during specialized seminar sessions. This can be highly successful if the courses focus on real BTeV examples and not, as is customary, on abstract or irrelevant examples and templates. These should be rather short and very focused seminars on specific, practical issues.

I’ve been responsible for setting up this kind of seminar nationwide for INFN, and they were judged very useful in helping people overcome the much dreaded and feared Fortran-to-C++ transition. Once the C++ basics are understood, people can proficiently start exercising ROOT or Geant4 examples by themselves.

Still, writing truly efficient and well-designed C++ code requires long training, so my opinion is: leave the overall design to professionals and individual class method implementations to individual users (decouple software expertise from detector or analysis expertise).

Now, back to urgently needed decisions


Short term time schedule

We need the following components to complete the design of version 1.0 of the Beast:

First and foremost: a geometry manager

We would like to quickly evaluate GDML (Geometry Description Markup Language), a product developed at CERN for which an application interface already exists. We think the adoption of such a tool is very urgent to make quick progress towards releasing a first simulation prototype to users.

It already provides support for a rapidly growing list of subjects, like:



A second item in the priority list is the definition of the baseline geometrical values (from the TDR, schematics, technical drawings etc…) and where to put the threshold for description granularity (detectors only, detectors + support structures, everything down to nuts and bolts).

Also to be evaluated at this point is whether automatic tools to translate constants from technical drawings exist, whether they are useful, and whether we need or want them.

A third item is an event model: hits produced by a Geant4 simulation in each sub-detector will be collected at the end of an event and processed by some sort of event builder to be made persistent for subsequent processing (reconstruction).

Again, the formal design of this event model should be a task for the software steering committee.
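Pending that formal design, the flow described for the third item (hits collected per sub-detector, then handed to an event builder and made persistent) could look roughly like the sketch below. All names (`Hit`, `Event`, `EventBuilder`, `writeEvent`) and the one-line-per-hit text output are assumptions for illustration; no Geant4 class is used and the real persistent format is still to be chosen.

```cpp
#include <cstddef>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical hit: which channel fired and how much energy it recorded.
struct Hit {
    int channel;
    double energy;
};

// One event: hits grouped by the sub-detector that produced them.
struct Event {
    int number = 0;
    std::map<std::string, std::vector<Hit>> hitsByDetector;
};

// Collects hits while the event is simulated, then exposes the finished event.
class EventBuilder {
public:
    explicit EventBuilder(int eventNumber) { event_.number = eventNumber; }

    void addHit(const std::string& detector, const Hit& hit) {
        event_.hitsByDetector[detector].push_back(hit);
    }

    // Total number of hits over all sub-detectors.
    std::size_t hitCount() const {
        std::size_t n = 0;
        for (const auto& entry : event_.hitsByDetector) n += entry.second.size();
        return n;
    }

    const Event& event() const { return event_; }

private:
    Event event_;
};

// Toy persistency: one text line per hit, sub-detectors in alphabetical order.
void writeEvent(std::ostream& out, const Event& event) {
    for (const auto& entry : event.hitsByDetector)
        for (const Hit& hit : entry.second)
            out << event.number << ' ' << entry.first << ' '
                << hit.channel << ' ' << hit.energy << '\n';
}
```

Keeping the builder separate from the writer means the persistent format can change later without touching the code that collects hits.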

A fourth item is the design of the provenance model (Rob?). Data (objects) should always have pointers to classes that provide the means to reconstruct the source of a particular item. How do we store these pointers in a data file?
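One possible answer to the question above, offered only as a sketch and not as a decided design: a memory address is meaningless once written to disk, so each data object can store the index of its provenance record in a table that is saved alongside the data. The `Provenance`, `ProvenanceTable` and `Cluster` types below are hypothetical.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical provenance record: what produced a piece of data.
struct Provenance {
    std::string producer;  // name of the algorithm or generator
    std::string version;   // software version that ran
};

// Owns all provenance records of a file; records are addressed by index.
class ProvenanceTable {
public:
    // Stores a record and returns the index that identifies it.
    std::size_t add(const Provenance& record) {
        records_.push_back(record);
        return records_.size() - 1;
    }

    // Throws std::out_of_range if the index is unknown.
    const Provenance& at(std::size_t id) const { return records_.at(id); }

private:
    std::vector<Provenance> records_;
};

// A data object stores the index, not a memory address,
// so the link survives a write/read cycle.
struct Cluster {
    double charge;
    std::size_t provenanceId;
};

void writeCluster(std::ostream& out, const Cluster& c) {
    out << c.charge << ' ' << c.provenanceId << '\n';
}

bool readCluster(std::istream& in, Cluster& c) {
    return static_cast<bool>(in >> c.charge >> c.provenanceId);
}
```

The table itself would of course have to be written to the same file (or to a database) for the indices to remain meaningful; that part is left out here.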



A fifth item is a formal Geant4 validation process. By this I mean validation of adequate physics lists, multiple scattering models, particle decay descriptions, up-to-date branching fractions etc…

But also performance tuning (defining appropriate production cuts in different detectors and materials) and how much detail we need in each detector (such as: do we need accurate charge release models to study charge sharing?).

Again, users should never need to define their own physics list; this should be defined by the framework, but they should be allowed to switch specific effects (such as MCS) on and off in a coherent way.

A possible approach could be a GUI that prevents the user from doing foolish or inconsistent things, warns when other things are not ‘officially’ blessed (but can be overridden), and in general knows how to coherently generate a bunch of events.
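Whatever the front end, the checking logic behind such a GUI could be as simple as the sketch below: the physics list stays fixed, the user only flips switches, and a validation step separates hard errors (forbidden) from warnings (allowed, but not blessed). The switch names and both rules are invented for illustration; they are not validated physics requirements and no Geant4 call is involved.

```cpp
#include <string>
#include <vector>

// Hypothetical switches a user may toggle; the physics list itself stays fixed.
struct PhysicsSwitches {
    bool multipleScattering = true;
    bool detailedChargeRelease = false;
    bool chargeSharingStudy = false;
};

enum class Severity { Warning, Error };

struct Finding {
    Severity severity;
    std::string message;
};

// Checks a configuration before any event is generated.
// Both rules below are invented for illustration.
std::vector<Finding> validate(const PhysicsSwitches& s) {
    std::vector<Finding> findings;
    if (s.chargeSharingStudy && !s.detailedChargeRelease)
        findings.push_back({Severity::Error,
            "charge sharing study needs the detailed charge release model"});
    if (!s.multipleScattering)
        findings.push_back({Severity::Warning,
            "multiple scattering is off: not an officially blessed setup"});
    return findings;
}

// A run may start only if no finding is an error; warnings can be overridden.
bool mayRun(const std::vector<Finding>& findings) {
    for (const Finding& f : findings)
        if (f.severity == Severity::Error) return false;
    return true;
}
```

A GUI, a command-line tool or a batch script can all call the same `validate` function, so the consistency rules live in one place.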



A sixth item is the evaluation of Gaudi: a good approach could be to set up an exercise where fake data produced by the Beast are fed to a reconstruction code (just a Kalman filter to begin with) and the reconstructed data are made available for subsequent analysis.
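Stripped of any framework, the chain in such an exercise has three steps: produce fake data, reconstruct, analyse. The toy below shows only the shape of that chain. It uses neither Gaudi nor the Beast, the "fake data" are deterministic numbers rather than simulated hits, and the reconstruction is the simplest possible Kalman filter (a scalar, constant state with no process noise); all function names are hypothetical.

```cpp
#include <vector>

// Step 1 (stand-in for the simulation): fake measurements of one coordinate.
// An alternating +/-0.1 offset replaces random noise to keep the toy deterministic.
std::vector<double> makeFakeHits(double trueValue, int count) {
    std::vector<double> hits;
    for (int i = 0; i < count; ++i) {
        const double offset = (i % 2 == 0) ? 0.1 : -0.1;
        hits.push_back(trueValue + offset);
    }
    return hits;
}

// Current estimate of the state and its variance.
struct Estimate {
    double value;
    double variance;
};

// Step 2 (reconstruction): scalar Kalman filter for a constant state.
// Starts from a deliberately vague prior and updates once per measurement.
Estimate kalmanFit(const std::vector<double>& hits, double measurementVariance) {
    Estimate e{0.0, 1.0e6};
    for (double hit : hits) {
        const double gain = e.variance / (e.variance + measurementVariance);
        e.value += gain * (hit - e.value);   // move towards the measurement
        e.variance *= (1.0 - gain);          // uncertainty shrinks with each hit
    }
    return e;
}

// Step 3 (stand-in for the analysis): compare the fit with the generated truth.
double residual(const Estimate& e, double trueValue) {
    return e.value - trueValue;
}
```

In the real exercise each step would be a separate component exchanging data through files or through the framework, which is precisely what the evaluation is meant to probe.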

Finally, an important piece of the project is adequate documentation. The only way people can contribute significantly and efficiently with their own software components to a large project is if good (very good) documentation is provided.

To this end we have tried to verify how much and how well a project could be documented using doxygen: the issue here is to provide much more than just hyperlinked source code (a classical reference guide), namely a good user guide as well, with adequate and lengthy explanations of critical points.

Doxygen can be used to do just that in a very effective way, as shown by the Beast web pages we set up to document our project (this is still an attempt).
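For readers who have never seen doxygen markup, here is what documented code looks like. The `StraightLine` class is a made-up example, not part of the Beast; the commands used (`\brief`, `\param`, `\return` and the trailing `///<` member comments) are standard doxygen. Longer user-guide text normally goes into separate pages introduced with `\mainpage` or `\page`.

```cpp
/**
 * \brief A straight line x(z) = x0 + slope * z.
 *
 * Longer explanations go here: why the class exists, which units it
 * assumes (centimetres in this sketch) and how it is meant to be used.
 */
class StraightLine {
public:
    /**
     * \brief Builds the line from its intercept and slope.
     * \param x0    Position at z = 0, in cm.
     * \param slope Change of x per unit of z (dimensionless).
     */
    StraightLine(double x0, double slope) : x0_(x0), slope_(slope) {}

    /**
     * \brief Extrapolates the line to a given z.
     * \param z Position along the beam axis, in cm.
     * \return The x coordinate at that z, in cm.
     */
    double xAt(double z) const { return x0_ + slope_ * z; }

private:
    double x0_;     ///< Intercept at z = 0, in cm.
    double slope_;  ///< Slope dx/dz.
};
```

The comments cost little to write alongside the code, and doxygen turns them into the hyperlinked reference part of the documentation automatically.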



Completion of these five items could constitute a good milestone for the short term time schedule.

http://hal9000.mi.infn.it/~menasce/TheBeast


Medium term time schedule

As mentioned before, the Grid tools will be crucial to implement a truly distributed data model. The Grid, though, is not yet a mature field (is it?), so what needs to be done in the medium term is to track developments in the field, play with the existing, released tools, and keep a watchful eye on what comes next.

In any case, periodically exercise whatever framework has been chosen (if any) in the context of this still-developing toolkit, just to be sure no major incompatibility is introduced in the overall design.

Another item within this time frame is the DBMS: alignment, calibrations and many other things will be managed under the supervision of a database. The design of the database will require many ingredients not yet envisioned or insufficiently well defined: for now, be content to provide methods to deal with quantities that will presumably be handled by a database, even if they are trivially implemented.
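A trivially implemented method of this kind could look like the sketch below: client code asks an interface for a constant, and for now the answer comes from an in-memory table that a database-backed class can replace later. The class names and the `"ecal.gain"` key are hypothetical.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Interface the rest of the code uses to ask for constants.
class CalibrationSource {
public:
    virtual ~CalibrationSource() = default;
    // Returns the constant for a key, or throws std::out_of_range if unknown.
    virtual double constant(const std::string& key) const = 0;
};

// Trivial stand-in: constants kept in memory until a real database exists.
class InMemoryCalibration : public CalibrationSource {
public:
    void set(const std::string& key, double value) { values_[key] = value; }

    double constant(const std::string& key) const override {
        return values_.at(key);  // std::map::at throws std::out_of_range
    }

private:
    std::map<std::string, double> values_;
};

// Client code depends only on the interface, not on where constants live.
double calibratedEnergy(const CalibrationSource& source, double rawAdc) {
    return rawAdc * source.constant("ecal.gain");
}
```

When the database design is settled, only a new class implementing `CalibrationSource` is needed; `calibratedEnergy` and its callers stay as they are.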

Finally, an outlook on the long term time schedule.


Long term time schedule

It’s probably too early to make significant forecasts on a long term schedule; at least I will refrain from trying that now. I would be satisfied if we succeed, in the very short term, in setting up an efficient steering committee and demonstrating that our proposal is a workable approach (i.e. it produces useful results in a reasonable time and people are happy with it, plus it is considered a good starting point for a true simulation framework).


Conclusions

To begin a healthy production of useful and efficient software the BTeV collaboration needs:

1. To establish a proper steering committee to oversee and coordinate specific sub-groups of experts (both in the infrastructure domain and individual detector components). Limited bureaucracy!!...

2. To agree on a broad design for an infrastructure, whether a loose collection of components or a sophisticated framework encompassing almost all aspects of the computational environment of the experiment (from trigger to Grid)

3. A working proposal of a simulation framework has been put at the group’s disposal for evaluation. Decide whether it should be expanded to become BTeV’s simulation tool, or provide guidance on a novel approach. Do it quickly!

4. Define milestones and a sustainable schedule to accomplish the required tasks (identify sub-groups, people and responsibilities)

5. Make sure people in different sub-groups periodically talk to each other, in order to detect and avoid conflicts early on (periodic workshops?)
