Offline Week HLT Plans Thorsten Kollegger

Offline Week HLT Plans Thorsten Kollegger CERN | 26.03.2013

Page 1: Offline Week HLT Plans Thorsten Kollegger

Offline Week

HLT Plans

Thorsten Kollegger

CERN | 26.03.2013

Page 2: Offline Week HLT Plans Thorsten Kollegger

HLT Plans | Offline Week | 26.03.2013 | Thorsten Kollegger 2

HLT Status

A few words on the HLT operational status…

More details in the HLT presentation during the Workshop on Run 1 Conclusions and Run 2 Outlook:

https://indico.cern.ch/conferenceDisplay.py?confId=240668

Page 3: Offline Week HLT Plans Thorsten Kollegger

Run 1 Conclusions

Operation

• Run participation: > 97%

• EOR percentage: <≈ 5%

- dominated by hardware issues, can we get better?

Performance

• Full online TPC tracking for maximum TPC read-out rate demonstrated

• Efficiency comparable to offline tracking

• Limited by “online calibration”

Page 4: Offline Week HLT Plans Thorsten Kollegger

Tracking Performance

HLT GPU tracking – speed comparison

• It’s fast…

Page 5: Offline Week HLT Plans Thorsten Kollegger

Tracking Performance

HLT GPU tracking – physics performance

• Similar performance as offline…

• … if we have the same calibration (need online calibration)

Page 6: Offline Week HLT Plans Thorsten Kollegger

LS1 – HLT Farm Availability

Two use cases

• HLT development

• Offline usage

[Diagram labels: AliEn, CAF On Demand, QA, CernVM/FS, Cloud Agent, Cloud Gateway, HLT, Public Cloud, Private CERN Cloud]

Page 7: Offline Week HLT Plans Thorsten Kollegger

LS1 – HLT Farm Availability

From Technical Coordination

• No power starting week 22 for two months

• No cooling weeks 22-40

Implications

• Testing and development capabilities for HLT limited

• Offline usage of farm not feasible/worthwhile

Development of tools continuing in view of Run 2 downtimes and general use

Page 8: Offline Week HLT Plans Thorsten Kollegger

HLT Offline Plans LS1

?

Page 9: Offline Week HLT Plans Thorsten Kollegger

HLT Offline Plans LS1

Group responsible for HLT offline integration left HLT project

• Can maintain current software, but no development/restructuring

• Need to concentrate on core efforts, e.g. HLT Event Display only on best-effort basis

• Have been in this mode since ≈ summer 2012

What can still be done in/with offline?

Page 10: Offline Week HLT Plans Thorsten Kollegger

To be finished…

There are many HLT-related developments waiting to be finished/back-ported to offline

• Fast (TPC) transformations (calibration, > factor 10 speed-up)

• HLT tracks as offline reconstruction TPC seeds

• HLT tracking performance evaluation

A more general remark:

Need to converge as much as possible towards common code; very good experience working with TPC group

Page 11: Offline Week HLT Plans Thorsten Kollegger

Event Display

Event Display for nice pictures…

… but not only: it is also very useful for fast feedback to the shift crew, e.g. spotting displaced-vertex runs “by eye”

HLT can support this only on a best-effort basis:

• Can we converge towards a common event display?

Page 12: Offline Week HLT Plans Thorsten Kollegger

Start of Run (SOR)

Strong push from TC/RC to reduce SOR time to 10 s

HLT engage times currently around 100 s

• 25 s – Pendolino prepare calibration/HCDB

• <5 s – Start of HLT processes

• 60 s – Initialization of processes (CDB access, map preparation)

• <5 s – Subscriptions (Configured -> Running+Subscribed+....)

• <5 s – Network connections (Running... -> Connected)

Can AliROOT go from “off” to first event reco in 2 s? (incl. building geometry, updating calibrations…)
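
The engage-time breakdown above can be tallied in a few lines. This is only an illustrative sketch: the step names are shortened from the slide, the entries quoted as "<5 s" are taken at their 5 s upper bound (an assumption), and the helper names are hypothetical.

```python
# Engage-time budget as quoted on the slide, in seconds.
# Steps quoted as "<5 s" are entered at the 5 s upper bound (assumption).
ENGAGE_STEPS_S = {
    "Pendolino: prepare calibration/HCDB": 25,
    "Start of HLT processes": 5,
    "Initialization of processes": 60,
    "Subscriptions": 5,
    "Network connections": 5,
}
SOR_TARGET_S = 10  # SOR target pushed by TC/RC


def engage_total(steps):
    """Upper bound on the total engage time in seconds."""
    return sum(steps.values())


def largest_step(steps):
    """Return (name, seconds) of the step that dominates the budget."""
    name = max(steps, key=steps.get)
    return name, steps[name]


total = engage_total(ENGAGE_STEPS_S)
name, secs = largest_step(ENGAGE_STEPS_S)
print(f"total engage time: up to {total} s (target: {SOR_TARGET_S} s)")
print(f"largest step: {name}, {secs} s ({secs / total:.0%} of total)")
```

Under these assumptions the process initialization alone exceeds the 10 s target several times over, which is why the question to AliROOT focuses on start-up time.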

Page 13: Offline Week HLT Plans Thorsten Kollegger

LS1 – HLT Farm Replacement

Full replacement of computing farm beginning of 2014

Switch from H-RORC to C-RORC

• 12 DDLs/board (C-RORC) vs 2 DDLs/board (H-RORC) → reduced number of FEP nodes

• Test-Setup in CR2 working

Tracking/Compute node hardware to be decided

• move tracker from CUDA to OpenCL to gain vendor independence

Page 14: Offline Week HLT Plans Thorsten Kollegger

Run 2 Outlook

All detectors available in HLT

• Use requires development (not HLT core)

• If we want to do this, major software development effort by detectors needed

HLT “core” focused on improvement of online software

• Data transport and process control/monitoring framework

• HLT specific detector software

Page 15: Offline Week HLT Plans Thorsten Kollegger

Run 2 Outlook

Full TPC reconstruction available

• TPC cluster finding for data reduction

• TPC seeding

• Online calibration of TPC, replacement of CPass0?

• Additional data reduction?

Towards online calibration

• porting of DAs to HLT started, massive changes required (TPC, MUON)

• “CPass0” feasibility study with TPC

Need to know NOW computing requirements after LS1

Page 16: Offline Week HLT Plans Thorsten Kollegger

Online Calibration

There are two running modes in the HLT

• Guaranteed analysis of every event

-> main chain

• “best-effort” event delivery

-> monitoring chain

The choice of chain defines requirements on your code!

Page 17: Offline Week HLT Plans Thorsten Kollegger

Online Calibration Requirements

Stability

• Can your code run on >100M events? (memory consumption?)

• How well can you handle detector errors?

In offline you are to a large extent already shielded by online (DAQ/HLT)

• How stable is your code?

Online code changes require verification, only possible without beam
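
One way to picture the memory question: a calibration quantity accumulated over >100M events should use constant memory, not keep per-event data. The sketch below uses Welford's running mean/variance as a generic example; the class name and the choice of quantity are hypothetical and not taken from any HLT or detector code.

```python
class RunningStats:
    """Running mean and variance in O(1) memory (Welford's algorithm)."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def add(self, x):
        """Fold one per-event value into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance; defined as 0.0 for fewer than two values."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for value in (1.0, 2.0, 3.0, 4.0):  # stand-in for a per-event quantity
    stats.add(value)
print(stats.n, stats.mean, stats.variance)
```

The object holds three numbers regardless of how many events were processed, so memory consumption does not grow with the length of the run.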

Page 18: Offline Week HLT Plans Thorsten Kollegger

Online Calibration Requirements

Processing

• What are the computing resources you need to process 1000 Hz of minimum bias Pb+Pb collisions?

• Does your code allow for parallelization of event processing? How do you handle merging of process output?

• Input data format: ROOT adds huge overhead…
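
The merging question can be illustrated with a minimal sketch, assuming the process output is a histogram of some per-event value. Each worker fills its own partial histogram and the partial results are added at the end. The function names and the binning are hypothetical; this is not the HLT merging mechanism.

```python
from collections import Counter
from functools import reduce


def process_chunk(values, bin_width=1.0):
    """Fill a partial histogram (bin index -> count) from one worker's events."""
    hist = Counter()
    for value in values:
        hist[int(value // bin_width)] += 1
    return hist


def merge(a, b):
    """Merge two partial histograms by adding the counts bin by bin."""
    return a + b


# Two workers each see a different subset of the events.
partials = [process_chunk([0.2, 1.7]), process_chunk([1.1, 3.9])]
merged = reduce(merge, partials, Counter())
print(dict(merged))
```

Because the merge is a plain addition, it is associative and commutative: the merged result does not depend on how events were distributed over workers, which is the property that makes parallel event processing straightforward.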

An example why this matters: Your code takes 1 second to process 1 event

• No parallel code → 1 Hz

• Parallel code → 1000 Hz

However: 1000 cores → 100 additional nodes → >500 k€
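
The arithmetic behind this example, as a short sketch. The 10 cores per node and 5 k€ per node are not stated on the slide; they are the values implied by "1000 cores, 100 nodes, >500 k€" and are treated here as assumptions, with the cost as a lower bound.

```python
import math


def cores_needed(rate_hz, seconds_per_event):
    """Cores required to keep up with the event rate, one event per core at a time."""
    return math.ceil(rate_hz * seconds_per_event)


def nodes_needed(cores, cores_per_node):
    """Whole nodes required to provide the given number of cores."""
    return math.ceil(cores / cores_per_node)


RATE_HZ = 1000               # minimum bias Pb+Pb rate quoted on the slide
SECONDS_PER_EVENT = 1.0      # processing time per event in the slide's example
CORES_PER_NODE = 10          # assumption: implied by 1000 cores ~ 100 nodes
MIN_COST_PER_NODE_EUR = 5000  # assumption: implied by >500 k EUR for 100 nodes

cores = cores_needed(RATE_HZ, SECONDS_PER_EVENT)
nodes = nodes_needed(cores, CORES_PER_NODE)
print(f"{cores} cores, {nodes} nodes, at least {nodes * MIN_COST_PER_NODE_EUR} EUR")
```

Halving the per-event processing time halves the number of nodes, so optimizing the code can be worth far more than the hardware it replaces.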

Page 19: Offline Week HLT Plans Thorsten Kollegger

Online Calibration Requirements

Need to know NOW computing requirements after LS1

This defines the infrastructure, there are no fast changes possible

Reminder:

• All detectors available in HLT

Some limited computing power available for local processing

• Resources available for TPC tracking (and limited calibration)

Everything beyond requires additional resources…

Page 20: Offline Week HLT Plans Thorsten Kollegger

Run 3 Outlook

Developments towards the upgrade…

[Diagram labels: O2 (Online-Offline Facility), AliEn, HLT, DAQ, Reconstruction, Calibration, Re-reconstruction, Online Raw Data Store]

Page 21: Offline Week HLT Plans Thorsten Kollegger

Run 3 Outlook

… to make this a reality

Page 22: Offline Week HLT Plans Thorsten Kollegger

Backup

Page 23: Offline Week HLT Plans Thorsten Kollegger

Tracking Performance

HLT uses GPUs for TPC tracking

• Unique at LHC, other experiments following now with R&D

• Factor 4 in total tracking time: factor 3 fewer nodes in system