
ATLAS Analysis Model

Introduction

• On Feb 11, 2008 the Analysis Model Forum published a report (D. Costanzo, I. Hinchliffe, S. Menke, ATL-GEN-INT-2008-001) describing the analysis model needed.

• This report lays out guidelines for the way analysis should be done in ATLAS.

• Although some things might not go as planned, I think it is very helpful to see the idea behind all the tools.

• Comments during this lecture are very welcome, since this is an evolving subject.

• Heavy-ion needs were not addressed in this report; they are very different from the proton-collision needs.

EDM

• This paragraph emphasizes the need for an event data model:

Data Structure

• RDO – Raw Data Object
  – Content: full information of the detector response.
  – Size: should be ~2 MB/evt.

• ESD – Event Summary Data
  – Content: the detailed output of the detector reconstruction.
  – Derivation: from RDO.
  – Purpose: should have sufficient information for particle identification and track re-fitting.
  – Size: should be ~500 kB/evt for real data. The current data size is ~20% larger.
  – Format: POOL file.

• AOD – Analysis Object Data
  – Content: summary of all the reconstructed objects.
  – Derivation: from ESD.
  – Purpose: provide sufficient information for common analyses.
  – Size: should be ~100 kB/evt for real data; however it is now ~200 kB/evt, where most of the data is trigger information. MC truth should take ~60 kB/evt, so the truth information in the AOD is not full (reduction according to ATL-SOFT-INT-2007-002).
  – Format: POOL file.

Data Structure

• DPD – Derived Physics Data
  – D1PD (primary DPD)
    • Content: different content for different communities, defined by the relevant community.
    • Derivation: from AOD (sometimes from ESD).
    • Size: should be small enough to copy to Tier-3 or off-grid disks, ~10 kB/evt.
    • Format: POOL file.
  – D2PD (secondary DPD)
    • Content: specific to a certain analysis (defined by the relevant group); derived information can be added.
    • Derivation: from D1PD and AOD.
    • Format: POOL file.
  – D3PD (tertiary DPD)
    • Content: should contain all the information needed to produce the final plots for publication.
    • Format: hbook/ntuple/POOL file/other.

• Tags
  – Content: predefined fields for quick event identification.
  – Size: should be ~1 kB/evt.
  – Format: database or ROOT files.
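To put these per-event sizes in perspective, here is a rough data-volume estimate in Python. The per-event sizes are the targets quoted above; the number of events per year is an illustrative assumption of mine, not a figure from the report.

```python
# Rough data-volume estimate from the per-event target sizes quoted above.
# EVENTS_PER_YEAR is an illustrative assumption, not a number from the report.
SIZES_KB_PER_EVENT = {
    "RDO":  2000,  # ~2 MB/evt
    "ESD":   500,  # target ~500 kB/evt
    "AOD":   100,  # target ~100 kB/evt (currently ~200 kB/evt)
    "D1PD":   10,  # ~10 kB/evt
    "TAG":     1,  # ~1 kB/evt
}

EVENTS_PER_YEAR = 2e9  # assumed, for illustration only

for tier, kb in SIZES_KB_PER_EVENT.items():
    total_tb = kb * 1e3 * EVENTS_PER_YEAR / 1e12  # kB -> bytes -> TB
    print(f"{tier:5s}: {kb:5d} kB/evt  ->  ~{total_tb:8.0f} TB/year")
```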

Terms

• Skimming – Removal of events

• Thinning – Removal of containers

• Slimming – Removal of objects from a container (all three operations are illustrated in the sketch below)
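The following toy sketch illustrates the three operations as defined above, using plain Python dicts and lists rather than the actual ATLAS EDM classes.

```python
# Toy illustration of skimming, thinning and slimming on a fake event model.
# An "event" is a dict of containers; each container is a list of objects.
events = [
    {"Electrons": [{"pt": 42.0}, {"pt": 8.0}], "Jets": [{"pt": 120.0}, {"pt": 25.0}]},
    {"Electrons": [],                          "Jets": [{"pt": 30.0}]},
]

# Skimming: remove whole events (keep only events with at least one electron).
skimmed = [evt for evt in events if evt["Electrons"]]

# Thinning: remove whole containers (drop the Jets container from every event).
thinned = [{key: cont for key, cont in evt.items() if key != "Jets"}
           for evt in skimmed]

# Slimming: remove objects from a container (keep only electrons with pt > 10).
slimmed = [dict(evt, Electrons=[el for el in evt["Electrons"] if el["pt"] > 10.0])
           for evt in thinned]

print(slimmed)  # -> [{'Electrons': [{'pt': 42.0}]}]
```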

Computing Model

[Data-flow diagram: the byte stream (BS) from the DAQ + trigger and the RDO go through reconstruction, producing ESD, AOD and TAGs; the AOD is the input to common analysis, which produces the DPDs.]

Frameworks

• Athena – analysis inside Athena: the analysis is done by writing algorithms and tools using the full Athena framework.

• Intermediate framework (“EventView”) – a collection of common tools to create DPDs.

• ARA – provides C++ and Python code to convert persistent data into transient data. It does not include the Athena services, so analyses that need database services (like geometry) can’t be done in ARA (for example, analyses that involve calorimeter cells or the full information of vertices and tracking). A minimal ARA-style reading sketch is shown below.
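As an illustration of the ARA workflow, here is a minimal Python sketch. It assumes the AthenaROOTAccess transientTree interface and a container key such as ElectronAODCollection; the exact module path and container keys depend on the release and the file contents, so treat this as a sketch rather than a verified recipe.

```python
# Sketch of reading an AOD with ARA (AthenaROOTAccess) from Python.
# Module path and container key are assumptions based on typical usage;
# they may differ in your release.
import ROOT
import AthenaROOTAccess.transientTree

f = ROOT.TFile.Open("AOD.pool.root")             # a local AOD POOL file
tt = AthenaROOTAccess.transientTree.makeTree(f)  # build the transient tree

for i in range(tt.GetEntries()):
    tt.GetEntry(i)
    # Access a reconstructed container directly as a transient object.
    for el in tt.ElectronAODCollection:
        print(el.pt(), el.eta())
```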

Recommendations in the report

• Official analyses must be done using validated tools only!

So work with Athena tools as much as you can, and add your private tools to Athena.

• Many recommendations were made. For completeness I copied all of them here, but I will talk about only a few of them.

Recommendations in the report

• Storage format of DPDs: only the D3PD can be an ntuple.

Use official tools for analysis. Put your tools in a public place.

Recommendations in the report

• Distribution of and access to DnPDs

• ARA

– CINT is not recommended.
– Python: two times faster than CINT.
– Compiled C++: two times faster than Python.

Recommendations in the report

• Code distribution and software infrastructure

• Event Data Model

Recommendations in the report

• EDM

Back-of-the-envelope calculation: reading 1M events takes ~15 min just for reading the information. Ntuples are ten times faster than that (see the arithmetic sketch below).
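The arithmetic behind that estimate, as I read it, is sketched below; the ~1 kHz read rate is simply what the quoted ~15 min for one million events implies.

```python
# The quoted ~15 min for reading 1M events corresponds to a read rate of
# roughly 1 kHz; at ten times that rate an ntuple would take ~1.5 min.
n_events = 1_000_000
read_time_s = 15 * 60                    # ~15 min quoted on the slide
ara_rate_hz = n_events / read_time_s     # ~1.1 kHz
ntuple_rate_hz = 10 * ara_rate_hz        # "ntuples are ten times faster"
ntuple_time_min = n_events / ntuple_rate_hz / 60
print(f"ARA rate ~{ara_rate_hz:.0f} Hz, ntuple time ~{ntuple_time_min:.1f} min")
```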

Recommendations in the report

• Primary DPD content

• Priorities and coordination of Primary DPD production

Recommendations in the report

• Primary DPD production

Recommendations in the report

• Toolkits or analysis frameworks

To my understanding, this means that you cannot build primary DPDs with EventView, but I’m not sure I understand it correctly.

Recommendations in the report

– EventView