PROOF system for parallel MPD event processing
description
Transcript of PROOF system for parallel MPD event processing
![Page 1: PROOF system for parallel MPD event processing](https://reader036.fdocuments.in/reader036/viewer/2022062423/56814908550346895db640cf/html5/thumbnails/1.jpg)
LOGO
PROOF system for parallel MPD event processing
Gertsenberger K. V.
Joint Institute for Nuclear Research, Dubna
![Page 2: PROOF system for parallel MPD event processing](https://reader036.fdocuments.in/reader036/viewer/2022062423/56814908550346895db640cf/html5/thumbnails/2.jpg)
NICA scheme
Gertsenberger K.V. 2
![Page 3: PROOF system for parallel MPD event processing](https://reader036.fdocuments.in/reader036/viewer/2022062423/56814908550346895db640cf/html5/thumbnails/3.jpg)
Multipurpose Detector (MPD)
The software MPDRoot is developed for the MPD event simulation, reconstruction of experimental or simulated data and following physical analysis of heavy ion collisions registered by the MultiPurpose Detector at the NICA collider.
3Gertsenberger K.V.
![Page 4: PROOF system for parallel MPD event processing](https://reader036.fdocuments.in/reader036/viewer/2022062423/56814908550346895db640cf/html5/thumbnails/4.jpg)
high interaction rate (up to 6 KHz) high particle multiplicity, about 1000 charged particles for the
central collision at the NICA energyone event reconstruction takes tens of seconds in
MPDRoot now, 1M events – months large data stream from the MPD:
is estimated at 5 to 10 PB of raw data per year
1m simulated events ~ 50 TBMPD event data can be processed concurrently the ability to use multicore / multiprocessor machines,
computing clusters and, subsequently, GRID system
4Gertsenberger K.V.
Prerequisites of the parallel processing
![Page 5: PROOF system for parallel MPD event processing](https://reader036.fdocuments.in/reader036/viewer/2022062423/56814908550346895db640cf/html5/thumbnails/5.jpg)
Current NICA cluster in LHEP
5Gertsenberger K.V.
![Page 6: PROOF system for parallel MPD event processing](https://reader036.fdocuments.in/reader036/viewer/2022062423/56814908550346895db640cf/html5/thumbnails/6.jpg)
Data storage on the NICA cluster
6Gertsenberger K.V.
Distributed file system GlusterFS
it aggregates existing file systems in a common distributed file system
automatic replication works as background process
background self-checking service restores corrupted files in case of hardware or software failure
![Page 7: PROOF system for parallel MPD event processing](https://reader036.fdocuments.in/reader036/viewer/2022062423/56814908550346895db640cf/html5/thumbnails/7.jpg)
Parallel MPD event processing
PROOF serverparallel data processing in ROOT macros on the parallel architectures
concurrent eventprocessing
MPD-schedulerscheduling system for the task distribution to parallelize data processing on the cluster nodes
7Gertsenberger K.V.
![Page 8: PROOF system for parallel MPD event processing](https://reader036.fdocuments.in/reader036/viewer/2022062423/56814908550346895db640cf/html5/thumbnails/8.jpg)
Parallel data processing with PROOF
PROOF (Parallel ROOT Facility) is a part of the ROOT software, no additional installations
PROOF uses data independent parallelism based on the lack of correlation for MPD events good scalability
Parallelization for three parallel architectures:
1. PROOF-Lite parallelizes the data processing on one multiprocessor/multicores machine
2. PROOF parallelizes processing on heterogeneous computing cluster
3. Parallel data processing in GRID system
Transparency: the same program code can execute both sequentially and concurrently
8Gertsenberger K.V.
![Page 9: PROOF system for parallel MPD event processing](https://reader036.fdocuments.in/reader036/viewer/2022062423/56814908550346895db640cf/html5/thumbnails/9.jpg)
Using PROOF in MPDRoot The last parameter of the reconstruction: run_type (default, “local”).
Speedup on the user multicore machine:
$ root reco.C(“evetest.root”, “mpddst.root”, 0, 1000, “proof”)
parallel processing of 1000 events with thread count being equal logical processor count
$ root reco.C(“evetest.root”, “mpddst.root”, 0, 500, “proof:workers=3”)
parallel processing of 500 events with three concurrent threads
Speedup on the NICA cluster:$ root reco.C(“evetest.root”, “mpddst.root”, 0, 1000, “proof:[email protected]:21001”)
parallel processing of 1000 events on all cluster’s cores of the PoD farm
$ root reco.C(“evetest.root”, …, 0, 500, “proof:[email protected]:21001:workers=15”)
parallel processing of 500 events on the PoD cluster with 15 workers
XRootD files support
9Gertsenberger K.V.
![Page 10: PROOF system for parallel MPD event processing](https://reader036.fdocuments.in/reader036/viewer/2022062423/56814908550346895db640cf/html5/thumbnails/10.jpg)
The speedup of the reconstruction on 4-cores machine
10Gertsenberger K.V.
![Page 11: PROOF system for parallel MPD event processing](https://reader036.fdocuments.in/reader036/viewer/2022062423/56814908550346895db640cf/html5/thumbnails/11.jpg)
PROOF on the NICA cluster
11Gertsenberger K.V.
proof proof proof
proof
proof = master serverproof = slave node
*.root
GlusterFS
Proof On Demand Cluster
(10) (10) (14)
$ root reco.C(“evetest.root”,”mpddst.root”, 0, 3, “proof:[email protected]:21001”)
event count
evetest.root event №1 event №2
mpddst.root
event №0
![Page 12: PROOF system for parallel MPD event processing](https://reader036.fdocuments.in/reader036/viewer/2022062423/56814908550346895db640cf/html5/thumbnails/12.jpg)
The speedup of the reconstruction on the NICA cluster
12Gertsenberger K.V.
![Page 13: PROOF system for parallel MPD event processing](https://reader036.fdocuments.in/reader036/viewer/2022062423/56814908550346895db640cf/html5/thumbnails/13.jpg)
The description of PROOF system on mpd.jinr.ru
13Gertsenberger K.V.
![Page 14: PROOF system for parallel MPD event processing](https://reader036.fdocuments.in/reader036/viewer/2022062423/56814908550346895db640cf/html5/thumbnails/14.jpg)
Conclusions The distributed NICA cluster was deployed on LHEP farm for the
NICA/MPD experiment (Fairsoft, ROOT/PROOF, MPDRoot, Gluster). 128 cores
The data storage was organized with the GlusterFS distributed file system: /nica/mpd[1-8]. 10 TB
PROOF On Demand cluster containing nc10 (with POD server), nc11 and nc13 machines with 34 processor cores was implemented to parallelize event data processing for the MPD experiment. PROOF support was added to the reconstruction macro.
The web site mpd.jinr.ru in section Computing – NICA cluster – PROOF parallelize presents the manual for the PROOF system.
14Gertsenberger K.V.