Analysis of the ROOT Persistence I/O Memory Footprint in LHCb

Ivan Valenčík

Supervisor: Markus Frank

19th September 2012

Goals of the Project

• Investigate the influence of the ROOT I/O mechanism parameters for writing LHCb event data.

• Analyze the performance for reading back these data.

• Come to an optimization decision based on gathered data.

• Create a method that would make measurement easy to perform in the future.

ROOT I/O Persistency

• A conversion service that allows us to persist event data to ROOT files.

• We can optimize various parameters:
  – basket size
  – splitting level
  – tree branch buffer

• We are interested in multiple features:
  – memory usage
  – run duration
  – resulting file size
  – behavior when more streams are present

Performance Monitoring Service

• A service for monitoring

• We can observe usage of these system resources and observables:
  – virtual memory size (vsize)
  – resident set size (rss)
  – processor time scheduled in user mode
  – processor time scheduled in kernel mode
  – elapsed time
  – file sizes
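The observables listed above can be sampled from within a Python process. The following is a minimal sketch, not the monitoring service itself: it assumes a Linux system (it reads `/proc/self/statm`), and the function names `snapshot` and `file_size_mb` are hypothetical.

```python
import os
import resource
import time


def snapshot():
    """Sample the resource usage of the current process (Linux only)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    # /proc/self/statm reports sizes in pages:
    # field 0 is the total program size, field 1 the resident set size.
    with open("/proc/self/statm") as statm:
        fields = statm.read().split()
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "vsize_mb": int(fields[0]) * page_size / 2**20,
        "rss_mb": int(fields[1]) * page_size / 2**20,
        "user_s": usage.ru_utime,     # CPU time scheduled in user mode
        "kernel_s": usage.ru_stime,   # CPU time scheduled in kernel mode
        "clock_s": time.monotonic(),  # subtract two snapshots for elapsed time
    }


def file_size_mb(path):
    """Size of an output file in MB."""
    return os.path.getsize(path) / 2**20
```

Taking one snapshot before and one after a job and subtracting the `clock_s`, `user_s` and `kernel_s` fields gives the elapsed and CPU times for that job.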

Performance Monitoring Service

• It can be added to any service using RootCnv for data persistence just by adding one script to the execution line.

• Example:

  python `which gaudirun.py` stripping.py RootPerfMon.py

Questions to be Answered

• What is the effect of adding more streams?

• How much memory does one stream cost?

• What is the effect of changing basket size?

• What effect do the split level and buffer size of the branch have on the resource usage?

• Technique used: performed a series of sweeps over optimization parameters.
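A parameter sweep of this kind can be sketched as a Cartesian product over the settings. The grid values below are the ones that appear in the result tables of this talk; the dictionary keys and the function name `sweep_points` are hypothetical and are not the option names of the conversion service.

```python
from itertools import product

# Grid values taken from the result tables of this talk.
BASKET_SIZES_MB = (2, 20, 40)
SPLIT_LEVELS = (0, 99)
BUFFER_SIZES_KB = (2, 32)


def sweep_points():
    """Yield one configuration per point of the parameter grid."""
    for basket_mb, split, buffer_kb in product(
        BASKET_SIZES_MB, SPLIT_LEVELS, BUFFER_SIZES_KB
    ):
        # Key names are illustrative only.
        yield {
            "basket_size_mb": basket_mb,
            "split_level": split,
            "buffer_size_kb": buffer_kb,
        }


points = list(sweep_points())
for point in points:
    print(point)
```

Each point would correspond to one benchmark run whose resource usage is recorded.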

Effect of the Fraction of the Copied Data

• A real stripping job has 14 streams that keep up to 14% of the reconstructed data.

• It is not necessary to copy all data to reach maximum memory usage.

• Memory usage depends only marginally on the fraction of the copied data.

• In the benchmark we copy 10% of 10 000 events in each stream and read back all copied events.

Memory Usage and Runtime of 1 Copying Stream

1 stream        split level        0         0         99        99
                buffer size        2 kB      32 kB     2 kB      32 kB

basket size 2 MB
                vsize (MB)         660       660       662       681
                +FSR (MB)          660       688       665       708
                time (s)           373       395       790       773
                file size (MB)     84.0      83.7      96.9      96.8

basket size 20 MB
                vsize (MB)         687       685       687       702
                +FSR (MB)          687       712       687       726
                time (s)           322       486       384       405
                file size (MB)     80.4      80.4      84.9      84.9

basket size 40 MB
                vsize (MB)         709       714       709       727
                +FSR (MB)          709       735       709       747
                time (s)           342       343       401       400
                file size (MB)     80.2      80.2      84.4      84.5
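To make the cost of splitting easier to read off, the table values can be compared pairwise. The sketch below is an illustration, with the numbers copied from the table above and a hypothetical helper name `split_penalty`. It computes the ratio of split level 99 to split level 0 for run time and file size at a fixed basket and buffer size.

```python
# (basket MB, split level, buffer kB) -> (vsize MB, time s, file size MB)
# Values copied from the 1-stream copying table.
WRITE = {
    (2, 0, 2): (660, 373, 84.0),
    (2, 0, 32): (660, 395, 83.7),
    (2, 99, 2): (662, 790, 96.9),
    (2, 99, 32): (681, 773, 96.8),
    (20, 0, 2): (687, 322, 80.4),
    (20, 0, 32): (685, 486, 80.4),
    (20, 99, 2): (687, 384, 84.9),
    (20, 99, 32): (702, 405, 84.9),
    (40, 0, 2): (709, 342, 80.2),
    (40, 0, 32): (714, 343, 80.2),
    (40, 99, 2): (709, 401, 84.4),
    (40, 99, 32): (727, 400, 84.5),
}


def split_penalty(basket_mb, buffer_kb):
    """Return (time ratio, file size ratio) of split level 99 over 0."""
    _, time_0, size_0 = WRITE[(basket_mb, 0, buffer_kb)]
    _, time_99, size_99 = WRITE[(basket_mb, 99, buffer_kb)]
    return time_99 / time_0, size_99 / size_0


for basket_mb in (2, 20, 40):
    for buffer_kb in (2, 32):
        time_ratio, size_ratio = split_penalty(basket_mb, buffer_kb)
        print(f"basket {basket_mb} MB, buffer {buffer_kb} kB: "
              f"time x{time_ratio:.2f}, file size x{size_ratio:.2f}")
```

For the 2 MB basket with a 2 kB buffer, this gives roughly 2.1 times the run time and 1.15 times the file size when splitting is enabled.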

Memory Usage and Runtime of 1 Reading Stream

1 stream        split level        0         0         99        99
                buffer size        2 kB      32 kB     2 kB      32 kB

basket size 2 MB
                vsize (MB)         463       463       465       467
                time (s)           23        27        28        30

basket size 20 MB
                vsize (MB)         489       491       490       485
                time (s)           25        36        27        24

basket size 40 MB
                vsize (MB)         507       510       513       507
                time (s)           20        23        26        28
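One way to see which parameter drives the memory needed for reading is to compare the spread of vsize within a basket size (over split level and buffer size) with the change across basket sizes. The sketch below is an illustration using the numbers from the table above; the variable names are hypothetical.

```python
# basket size (MB) -> vsize (MB) for the four split level / buffer size columns
READ_VSIZE_MB = {
    2: (463, 463, 465, 467),
    20: (489, 491, 490, 485),
    40: (507, 510, 513, 507),
}

# Mean vsize per basket size, averaged over split level and buffer size.
mean_vsize = {
    basket: sum(values) / len(values)
    for basket, values in READ_VSIZE_MB.items()
}

# Largest variation caused by split level and buffer size at a fixed basket size.
max_spread_within = max(max(v) - min(v) for v in READ_VSIZE_MB.values())

# Variation caused by going from the smallest to the largest basket size.
spread_across = mean_vsize[40] - mean_vsize[2]

print(mean_vsize)
print(max_spread_within, spread_across)
```

The variation within a basket size stays at a few MB, while the mean vsize grows by about 45 MB between the 2 MB and 40 MB baskets.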

Cost of 1 Stream

• 1 write stream requires 19 MB when a small basket size and no splitting are used.

• 1 write stream requires more than 60 MB when a big basket size and splitting are used.

• This gives a potential saving of 40 MB per stream.

• Processing time and output file sizes for a small basket size are slightly smaller than those obtained with the current settings.
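As a rough illustration of what the per-stream figure could mean for a full job, the sketch below multiplies the quoted 40 MB saving by the 14 streams of a real stripping job. It assumes that the per-stream cost simply adds up across streams, which is an assumption of this example rather than a measured result; the function name `estimated_saving_mb` is hypothetical.

```python
STREAMS_IN_STRIPPING_JOB = 14  # number of streams quoted for a real stripping job
SAVING_PER_STREAM_MB = 40      # potential saving per stream quoted above


def estimated_saving_mb(streams, per_stream_mb=SAVING_PER_STREAM_MB):
    """Total saving under the assumption of additive per-stream cost."""
    return streams * per_stream_mb


total_mb = estimated_saving_mb(STREAMS_IN_STRIPPING_JOB)
print(f"{total_mb} MB for {STREAMS_IN_STRIPPING_JOB} streams")
```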

Findings

• Splitting branches requires a lot of memory and processing time, and makes output files bigger.

• Giving ROOT more memory does not decrease the processing time significantly.

• More memory for ROOT makes output files a little smaller.

• Current settings are bad for writing FSR.

• Basket size has the biggest effect on memory needed in reading. Big sizes are unnecessary.

Summary

• Gained insight into the Gaudi framework and the ROOT conversion service.

• Implemented a performance monitoring service.

• Created a test environment simulating stripping.

• Explored the parameter space of the optimization parameters and performed an analysis of the memory footprint, which indicates a possible gain of 40 MB per output stream.