ALICE O2 project
B. von Haller, on behalf of the O2 project
19.05.2015, CERN
Overview
▶ O2 Project
  ▶ Upgrade of the offline and online computing
  ▶ Members of the HLT, DAQ and Offline projects
  ▶ Goal: build a unified computing system for after LS2
▶ Guided tour of the O2 TDR (submitted to the LHCC on April 20, 2015)
  ▶ Rationales
  ▶ General idea and architecture
  ▶ Computing needs
B. von Haller | O2 Project | 19.05.2015 2
Rationales
1. After LS2, the LHC will deliver minimum-bias Pb-Pb collisions at 50 kHz, ~100x the current rate
2. Running scenarios
   ▶ Goal: 13 nb−1 of Pb-Pb collisions (minimum bias)
3. Physics topics addressed by the ALICE upgrade
   ▶ Very small signal-to-noise ratio and large background
   ▶ Require very large statistics
   ▶ Triggering techniques very inefficient, if not impossible
Too much data to be stored: compress the data intelligently by processing it online
Readout

Read-out parameters:

Detector | Max read-out rate (kHz) | Data rate for Pb-Pb at 50 kHz (GB/s) | Average data size per interaction (MB)
TPC | 50 | 1012 (92.5%) | 20.7
ITS | 100 | 40 (3.6%) | 0.8
TRD | 90.9 | 20 (1.8%) | 0.5
MFT | 100 | 10 (0.9%) | 0.2
Other detectors | - | 11.269 (1.2%) | 0.25
Total | | 1093 | 22.4

Detector links and read-out boards:
Number of links: DDL1: 15, DDL2: 40, GBT: 7998
Number of boards: C-RORC: 13, CRU: 463

TPC: continuous read-out to cope with the 50 kHz interaction rate
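As a quick sanity check on the table above, the data rate at a 50 kHz interaction rate follows from the average event size (1 kHz x 1 MB = 1 GB/s). This is an approximation only; the TPC and TRD table entries deviate slightly because of continuous read-out and triggering effects:

```python
# Cross-check of the read-out table: data rate [GB/s] ~ 50 kHz x size [MB].
RATE_KHZ = 50.0

avg_size_mb = {"TPC": 20.7, "ITS": 0.8, "TRD": 0.5, "MFT": 0.2}
table_gbps = {"TPC": 1012.0, "ITS": 40.0, "TRD": 20.0, "MFT": 10.0}

for det, size in avg_size_mb.items():
    approx = RATE_KHZ * size  # kHz * MB = GB/s
    print(f"{det}: ~{approx:.0f} GB/s (table: {table_gbps[det]:.0f} GB/s)")
```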
O2 architecture (1)

[Figure: synchronous dataflow]
▶ Raw data input from the detector electronics (TPC, TRD, ...): detector data samples interleaved with synchronized heartbeat triggers; trigger and clock distributed to ITS, ...
▶ FLPs, O(100): local processing, with Data Reduction 0 (e.g. clustering), Calibration 0 on local data (i.e. partial detector), time slicing, buffering, local aggregation, QC and tagging
▶ Frame dispatch: partially compressed sub-timeframes sent to the EPNs
▶ EPNs, O(1000): global processing, with timeframe building (full timeframes), detector reconstruction (e.g. track finding), Data Reduction 1, Calibration 1 on full detectors (e.g. space-charge distortion) and QC
▶ Compressed timeframes to storage on O2/T0/T1; archive (CTF, AOD) on T0/T1
O2 architecture (2)

[Figure: synchronous and asynchronous dataflow with the Condition & Calibration Database (CCDB) and Quality Control]
▶ Synchronous: sub-timeframes, timeframes, compressed timeframes and QC data exchanged with the Quality Control system and the CCDB (CCDB objects)
▶ Compressed timeframes written to storage on O2/T0/T1; archive (CTF, AOD) on T0/T1
▶ Asynchronous, on O2/T0/T1, O(1) sites: Calibration 2, global reconstruction, event extraction, tagging, QC and AOD extraction, producing ESD and AOD
O2 architecture (3)

[Figure: asynchronous reconstruction passes and event extraction]
▶ O2/T0/T1: reconstruction passes and event extraction on compressed timeframes; archive (CTF, AOD) on T0/T1
▶ T2, O(10) sites: simulation, producing CTF and AOD
▶ Analysis Facilities, O(1): QC, reconstruction, event building and AOD extraction on ESD and AOD (ESD = Event Summary Data, AOD = Analysis Object Data); analysis output (histograms, trees) to storage
Computing Model

▶ O2 (1 facility): RAW -> CTF -> ESD -> AOD
▶ T0/T1 (1..n sites): CTF -> ESD -> AOD; CTF archiving
▶ T2/HPC (1..n sites): MC -> CTF -> ESD -> AOD
▶ AF (1..3 analysis facilities): AOD -> HISTO, TREE
O2 software design

▶ Message-based multi-processing
  ▶ Ease of development
  ▶ Easy to scale horizontally
  ▶ Possibility to extend with different hardware
  ▶ Multi-threading within processes possible
▶ ALFA: the ALICE-FAIR concurrency framework
  ▶ Provides the data transport layer
  ▶ ZeroMQ
  ▶ Arbitrary payloads
[Figure: software stack with libraries and tools, FairRoot and ALFA at the base, shared by CBM, ALICE O2 and PANDA]
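ALFA devices exchange arbitrary payloads as messages over ZeroMQ. The slide gives no code, so the following is a minimal stdlib-only analogy of the same pattern (one process per device, opaque payloads over a queue), not the actual ALFA API:

```python
import multiprocessing as mp

def flp(out_q, n_frames):
    """Toy FLP device: produces sub-timeframes as opaque payloads."""
    for i in range(n_frames):
        out_q.put(("subtimeframe", i, b"\x00" * 16))  # arbitrary payload
    out_q.put(None)  # end-of-stream marker

def epn(in_q, result_q):
    """Toy EPN device: aggregates the bytes of all received payloads."""
    total = 0
    while (msg := in_q.get()) is not None:
        _, _, payload = msg
        total += len(payload)
    result_q.put(total)

if __name__ == "__main__":
    q, res = mp.Queue(), mp.Queue()
    p1 = mp.Process(target=flp, args=(q, 10))
    p2 = mp.Process(target=epn, args=(q, res))
    p1.start(); p2.start()
    p1.join(); p2.join()
    print(res.get())  # 10 frames x 16 bytes = 160
```

Horizontal scaling then amounts to starting more consumer processes on the same queue, which mirrors the "ease to scale" bullet above.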
Physics software design: processing workflow

All FLPs (synchronous):
▶ Step 0: raw data
▶ Step 1: local processing, e.g. clusterization, calibration
EPN (synchronous, then asynchronous):
▶ Step 2: detector reconstruction, e.g. TPC & ITS track finding -> CTF
▶ Step 3: inter-detector matching procedures; final calibration, 2nd matching
▶ Step 4: final matching, PID, event extraction -> AOD
Technology survey (1): Comparison of GPU and CPU for the Fast Cluster Finder

Performance of the FPGA-based FastClusterFinder algorithm for DDL1 and DDL2, compared to the software implementation on a recent server PC. The FPGA is the selected platform in this case.
Technology survey (2): Comparison of CPU vs GPU for the HLT TPC CA Tracker

Tracking time of the HLT TPC Cellular Automaton tracker on a Nehalem CPU (6 cores) and an NVIDIA Fermi GPU. The GPU is the selected platform in this case.
Demonstrators: TPC CA Tracker

Verified the linear rise of the processing time of TPC track finding for data samples corresponding to a timeframe of 1 ms.
Computing requirements for processing

Total: ~100,000 CPU cores and 5,000 GPU chips.
▶ Track merging and fitting go together with track finding and can run on GPUs too
▶ Some components are being ported to GPU; the conversion factor is unknown
▶ Others could theoretically run on GPU
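A per-node time budget is not stated on the slide, but it follows from figures elsewhere in the talk (20 ms timeframes, 1500 EPNs, round-robin dispatch):

```python
# With 20 ms time frames dispatched round-robin over 1500 EPNs, each EPN
# receives a new frame every ~30 s, which is the time available to finish
# the synchronous processing of one frame.
T_TF_S = 0.020
N_EPN = 1500
budget_s = T_TF_S * N_EPN
print(f"~{budget_s:.0f} s per time frame and EPN")
```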
Data reduction: TPC

A data reduction factor of 20 for the TPC is feasible.
Data reduction: Global

Data rates for input to the O2 system and output to permanent storage for routine data taking with Pb-Pb at a 50 kHz interaction rate.
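A rough consistency check, combining the total input rate from the read-out table (1093 GB/s) with the overall online compression factor of 14 quoted in the conclusions:

```python
# ~1.1 TB/s detector input reduced online by a factor of 14.
input_gbps = 1093.0
factor = 14.0
print(f"~{input_gbps / factor:.0f} GB/s to permanent storage")
```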
Data types characteristics

▶ TF size is set by the duration of the time window (tTF)
▶ Fraction of data lost at the edges: 0.1/tTF(ms)
▶ For calibration and reconstruction: 20 ms - 100 ms
▶ Shorter is better for buffering and distribution: 20 ms (1000 interactions in Pb-Pb at 50 kHz)
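The trade-off in the bullets above can be quantified directly:

```python
# Fraction of data lost at the frame edges is ~0.1 ms / t_TF, and at a
# 50 kHz Pb-Pb rate a frame contains 50 interactions per ms.
for t_tf_ms in (20, 100):
    edge_loss = 0.1 / t_tf_ms
    n_interactions = 50 * t_tf_ms
    print(f"t_TF = {t_tf_ms} ms: edge loss {edge_loss:.1%}, "
          f"~{n_interactions} interactions")
```

At 20 ms the edge loss is 0.5% with the quoted 1000 interactions per frame; at 100 ms the loss drops to 0.1% at the cost of larger frames.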
Data type | Size (GB) | Tape copy
TF (Pb-Pb) | 10 | No
CTF (Pb-Pb) | 1.6 | Yes
ESD | 15% of CTF | No
AOD | 10% of CTF | Yes
MC | 100% of CTF | No
MCAOD | 30% of ESD | Yes
HISTO | 1% of ESD | No
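The relative entries in the table translate into absolute sizes per Pb-Pb timeframe as follows:

```python
# Absolute sizes implied by the relative entries in the table (Pb-Pb).
ctf_gb = 1.6
sizes_gb = {
    "ESD": 0.15 * ctf_gb,             # 15% of CTF
    "AOD": 0.10 * ctf_gb,             # 10% of CTF
    "MC": 1.00 * ctf_gb,              # 100% of CTF
    "MCAOD": 0.30 * (0.15 * ctf_gb),  # 30% of ESD
    "HISTO": 0.01 * (0.15 * ctf_gb),  # 1% of ESD
}
for name, gb in sizes_gb.items():
    print(f"{name}: {gb * 1000:.1f} MB")
```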
Data storage requirements

Number of simulated events and storage requirements; number of reconstructed collisions and storage requirements for the running scenarios. Total: ~55 PB.
O2 facility design (1)

[Figure: dataflow through the facility]
▶ Detectors -> 8100 read-out links -> 250 FLPs (2 CRUs per FLP)
▶ FLPs -> switching network: 1.2 TB/s (input: 250 ports, output: 1500 ports)
▶ Switching network -> 1500 EPNs (2 GPUs per EPN): 1500 x 60 MB/s
▶ Storage: 500 GB/s / 90 GB/s
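A back-of-the-envelope check of the facility figures above, relating the aggregate input rate to the per-FLP uplinks of network layout 2 (next slide):

```python
# 1.2 TB/s across 250 FLPs gives ~4.8 GB/s per FLP, which fits within
# the 4 x 10 Gbit/s uplinks per FLP (4 x 10 / 8 = 5 GByte/s).
per_flp_gbyte_s = 1.2 * 1000 / 250
uplink_gbyte_s = 4 * 10 / 8  # Gbit/s -> GByte/s
print(per_flp_gbyte_s, uplink_gbyte_s)
assert per_flp_gbyte_s <= uplink_gbyte_s
```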
O2 facility design (2)

Network layout 2: 4 independent EPN subfarms
[Figure: FLP1 ... FLP256, each connected with 4 x 10 Gb/s links, one per network subfarm; subfarm 1: EPNs 1-375, subfarm 2: EPNs 376-750, subfarm 3: EPNs 751-1125, subfarm 4: EPNs 1126-1500; each EPN attached at 10 Gb/s]
O2 facility design (3)

Network layout 3: Super-EPNs
[Figure: 250 FLPs connected at 10 Gb/s to a layer of Super-EPNs (SEPNs); SEPNs interconnected at 40/56 Gb/s (10 x 40/56 Gb/s and 2 x 40/56 Gb/s links); EPNs 1-1500 attached at 10 Gb/s behind the SEPNs]
O2 facility design (4): Simulation - link speed

Left, network layout 2: link speed on the FLPs and EPNs for a network layout with 4 EPN subfarms, for 100 parallel transfers from the FLPs.
Right, network layout 3: link speed on the FLPs and Super-EPNs (configuration based on an InfiniBand network at 56 Gb/s).
O2 facility design (6): Simulation - system scalability

Latency of the timeframes for different interaction rates using layout 2 (left) and layout 3 (right). Layout 2 is cheaper but scales up to 90 kHz only.
O2 facility – Power and cooling
Schedule

Detector milestones:
▶ 6/15: today
▶ 1/17: ITS half-layer test
▶ 4/17: TPC read-out test
▶ 9/18: ITS surface test
▶ 7/19: data taking, cosmics with core detectors
▶ 7/19: TPC pre-commissioning on surface
▶ 11/19: TPC RCUs installation in CR1
▶ 1/20: TPC commissioning in cavern
▶ 6/20: end of commissioning

O2 milestones:
▶ 1/17: O2 system v1 - 1 CRU, 1 FLP, basic data processing, control, logging, QC, monitoring
▶ 4/18: O2 system v2 - 1 detector (e.g. ITS) full read-out capability
▶ 9/18: 10% of the data processing and storage HW installed
▶ 11/19: 90% of the data processing and storage HW installed
▶ 2/20: full system ready
Conclusion

▶ O2 is a new project with very ambitious requirements
  ▶ > 1 TB/s detector input, ~100x more than today
  ▶ Online synchronous compression factor of 14
▶ Major paradigm change with combined offline and online computing
  ▶ 1 framework
  ▶ 1 facility
▶ Challenging schedule
▶ TDR submitted; draft available here: https://cds.cern.ch/record/2011297

Thank you for your attention.
Introduction - Chapter 1

[Figure: overall dataflow. Continuous and triggered streams of raw data from the detector electronics; readout, data aggregation and local data processing; synchronous global data processing on compressed sub-timeframes; asynchronous data processing and event extraction producing reconstructed events; data storage and archival of compressed timeframes]

Rationales (as above): after LS2 the LHC will deliver minimum-bias Pb-Pb at 50 kHz, ~100x more data than today; the physics topics addressed by the ALICE upgrade have a very small signal-to-noise ratio and large background, need large statistics, and make triggering very inefficient if not impossible; the goal is 13 nb−1 of minimum-bias Pb-Pb collisions. Too much data to be stored: compress the data intelligently by processing it online.
O2 software design (3) - Chapter 7: Data Format

[Figure: sub-timeframe layout on FLP #125]
▶ Data link view: each link (link #1 ... #N) carries a sequence of header + payload pairs; detector data are interleaved with trigger heartbeat events (e.g. HB #376453, HB #376454) that delimit the time window (frame length); other triggers, continuous and triggered streams, coexist
▶ Memory view: payload blocks are stored in the FLP memory buffer at arbitrary addresses and referenced by a sub-timeframe descriptor through Multiple Data Headers (MDH)
▶ Blocks are identified as FLPid_DDLid_counter (e.g. 125_1_0, 125_2_2); the descriptor groups correlated events and single events
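The block naming scheme in the figure is FLPid_DDLid_counter, e.g. "125_1_0" for FLP 125, link 1, counter 0. A minimal sketch of generating and parsing such identifiers (the helper names are illustrative, not part of the O2 data model):

```python
# Illustrative helpers for the FLPid_DDLid_counter labels in the figure.
def block_id(flp: int, link: int, counter: int) -> str:
    """Build a sub-timeframe block label like '125_1_0'."""
    return f"{flp}_{link}_{counter}"

def parse_block_id(s: str) -> tuple:
    """Split a block label back into its (flp, link, counter) parts."""
    flp, link, counter = (int(x) for x in s.split("_"))
    return flp, link, counter

print(block_id(125, 2, 17))       # -> 125_2_17
print(parse_block_id("125_1_0"))  # -> (125, 1, 0)
```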
O2 software design (2) - Chapter 7: Control, Configuration and Monitoring

▶ Facility control, configuration and monitoring
▶ CCM will combine the control of data taking and of the asynchronous data processing
▶ 140,000 commands to 70,000 processes (peak)
▶ 600 kHz of monitoring data
[Figure: CCM exchanges commands/configuration and status/monitoring data with the LHC, the trigger, DCS, the Grid (jobs) and the O2 processes]
O2 software design (4) - Chapter 7: DCS

▶ Dedicated FLP for DCS
▶ An O2 process retrieves the conditions data and inserts them into DCS data frames
▶ The required DCS data are thus embedded in the data stream
▶ They are available for reconstruction and calibration after the frame building
Physics programme and data taking scenarios - Chapter 2

ALICE running scenarios:

Year | System | √sNN (TeV) | Lint | Ncollisions
2020 | pp | 14 | 0.4 pb−1 | 2.7·10^10
2020 | Pb-Pb | 5.5 | 2.85 nb−1 | 2.3·10^10
2021 | pp | 14 | 0.4 pb−1 | 2.7·10^10
2021 | Pb-Pb | 5.5 | 2.85 nb−1 | 2.3·10^10
2022 | pp | 14 | 0.4 pb−1 | 2.7·10^10
2022 | pp | 5.5 | 6 pb−1 | 4·10^11
2025 | pp | 14 | 0.4 pb−1 | 2.7·10^10
2025 | Pb-Pb | 5.5 | 2.85 nb−1 | 2.3·10^10
2026 | pp | 14 | 0.4 pb−1 | 2.7·10^10
2026 | Pb-Pb | 5.5 | 1.4 nb−1 | 1.1·10^10
2026 | p-Pb | 8.8 | 50 nb−1 | 10^11
2027 | pp | 14 | 0.4 pb−1 | 2.7·10^10
2027 | Pb-Pb | 5.5 | 2.85 nb−1 | 2.3·10^10
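The collision counts in the table follow from N = Lint x sigma_inel. The cross-section values below are assumptions for this check (they are not given on the slide): ~68 mb for inelastic pp at 14 TeV and ~8 b for hadronic Pb-Pb at 5.5 TeV:

```python
# Cross-check of the scenario table: N_collisions = L_int x sigma_inel.
n_pp = 0.4 * (68e-3 * 1e12)  # L = 0.4 pb^-1, sigma in pb (1 b = 1e12 pb)
n_pbpb = 2.85 * (8 * 1e9)    # L = 2.85 nb^-1, sigma in nb (1 b = 1e9 nb)
print(f"pp: {n_pp:.1e} collisions, Pb-Pb: {n_pbpb:.1e} collisions")
```

Both values reproduce the table entries (2.7·10^10 and 2.3·10^10) to the precision quoted.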
Requirements (1): Input rates

Detector | Max read-out rate (kHz) | Data rate for Pb-Pb at 50 kHz (GB/s) | Average data size per interaction (MB)
ACO | 100 | 0.014 | 0.00028
CTP | 200 | 0.02 | 0.0004
EMC | 50 | 4 | 0.08
FIT | 50 | 0.115 | 0.023
HMP | 2.5 | 0.06 | 0.024
ITS | 100 | 40 (3.6%) | 0.8
MCH | 100 | 2.2 | 0.04
MFT | 100 | 10 (0.9%) | 0.2
MID | 100 | 0.3 | 0.006
PHS | 50 | 2 | 0.04
TOF | 200 | 2.5 | 0.05
TPC | 50 | 1012 (92.5%) | 20.7
TRD | 90.9 | 20 (1.8%) | 0.5
ZDC | 100 | 0.06 | 0.0012
Total | | 1093 | 22.4
Requirements (2): Read-out

Number of links (DDL1, DDL2 or GBT) and read-out boards (C-RORC or CRU) per detector:

Detector | Links | Boards
ACO | 1 | 1
EMC | 20 | 4
FIT | 2 | 1
HMP | 14 | 3
ITS | 495 | 23
MCH | 480 | 20
MFT | - | -
MID | 2 | 2
PHS | 16 | 3
TOF | 72 | 3
TPC | 5904 | 360
TRD | 1044 | 54
ZDC | 1 | 1
CTP | 2 | 1
Total | 8053 (15 DDL1 + 40 DDL2 + 7998 GBT) | 476 (13 C-RORC + 463 CRU)
Project organisation (1) - Chapter 11

Total: 112 FTE for the period 2015-19, compatible with the 120 FTE pledged by the institutes.

Task | Institutes | Human resources (FTE)
Architecture | CERN, FIAS, GSI, IRI | 2
Tools, procedures and software process | CERN, IPNO, JU, LIPI, WRCP | 2
Data flow, detector read-out | CALTECH, CERN, FESB, FIAS, IRI, LIPI, WRCP | 12
Computing platforms | CERN, FIAS, IRI, JU, KISTI, KMUTT, KU, ORNL | 12
Software framework and data model | CERN, IPNO, GSI, LBNL | 14
Calibration | JU, WSU | 16
Reconstruction | CERN, FESB, GSI, IPHC, LIPI, LPC, SUBATECH, UH, WSU | 16
Physics simulation | CERN, CU, IPHC, IPNO, LBNL, ORNL, UH, UTK | 14
Data quality monitoring and visualization | CERN, ISS, JU, WUT | 6
Control, configuration, monitoring and logging | ASCR, CALTECH, CERN, CU, KMUTT, IRI | 10
O2 facility hardware procurement, installation | CERN, FIAS, IRI, GSI | 8
O2 facility and grid/cloud operations | CERN, KISTI | M&O
O2 facility design (7) - Chapter 10

▶ Demonstrators, e.g.
  ▶ Existing HLT TPC algorithms interfaced to the new ALFA framework
  ▶ HLT development cluster infrastructure with ~40 nodes, 30 of them with GPU hardware
  ▶ FLP and EPN data distribution and transport devices
▶ Verified in the prototype
  ▶ TPC reconstruction topology using 2011 Pb-Pb data
  ▶ FLP-EPN data transport network with 36 FLPs and 28 EPNs
  ▶ Reproduced the performance of HLT TPC processing in ALFA
  ▶ Verified the linear rise of the processing time of TPC track finding for data samples corresponding to a timeframe of 1 ms
▶ Ongoing work
Project organisation (3) - Chapter 11

Milestones relative to the framework and the facility at P2:
▶ Q1 2017: version 1 (A) - 1 CRU + QC (e.g. ITS half-layer test)
▶ Q2 2018: version 2 (B) - 1 detector full read-out (e.g. ITS or TPC surface test)
▶ Q4 2019: P2 installation and commissioning (C) - all FLPs, 10% of the EPNs
▶ Q2 2020: production - full deployment
TDR Schedule

▶ 5/2/2015 - 18/2/2015: comments on the TDR by the O2 project members
▶ 19/2/2015 - 1/3/2015: TDR editing
▶ 23/2/2015 - 27/2/2015: proof-reading (Frank)
▶ 2/3/2015 - 15/3/2015: comments on the TDR by the whole ALICE Collaboration
▶ 17/3/2015: ALICE internal review
▶ 18/3/2015 - 5/4/2015: modifications by the authors
▶ 6/4/2015 - 19/4/2015: final editing of the TDR before submission
▶ 20/4/2015: submission of the TDR to the LHCC
▶ 20/4/2015 - 31/5/2015: LHCC review
▶ 1/6/2015 - 4/6/2015: LHC Committee
▶ 2/6/2015: presentation of the TDR to the LHCC

▶ Prof. Borut Paul Kersevan, ATLAS (former computing coordinator)
▶ Tonko Ljubicic, BNL, STAR (Online project leader)
▶ Niko Neufeld, CERN, LHCb (Online)
O2 facility design (5): Simulation - bisection data traffic

Bisection data traffic in the system for one of the 4 EPN subfarms of layout 2 (left) and for the whole layout 3 (right).