User Community Report
Transcript of User Community Report
Dimitri Bourilkov, University of Florida
UltraLight Visit to NSF
Arlington, VA, January 4, 2006
D.Bourilkov, User Community Report
Physics Analysis User Group: Motivation and Mission
• Establish a community of physicists, early adopters and users:
  • first within UltraLight (expert users)
  • later outside users
• This community uses the system being developed, e.g.:
  • starts actual physics analysis efforts exploiting the test-bed
  • provides a user perspective on the problems being solved
• Organizes early adoption of the system
• Identifies the most valuable features of the system from the users' perspective, to be released early in production (or at a useful level of functionality)
• This is "where the rubber meets the road" and will provide rapid user feedback to the development team
Physics Analysis User Group
Physics Analysis User Group
• Evolving dialog with the applications WG on:
  • the scope (what is most valuable for physics analysis)
  • priorities for implementing features
  • composition and timing of releases, aligned with the milestones of the experiments
• Develops, in collaboration with the applications group, a suite of functional tests, which can be used for:
  • measuring the progress of the project
  • educating new users and lowering the threshold for adopting the system
  • demonstrating the UltraLight services in action in education/outreach workshops
Physics Analysis User Group
• Studies in depth the software frameworks of HEP applications (e.g. ORCA/COBRA or the new software framework for CMS, ATHENA for ATLAS), the data and metadata models, and the steps to best integrate the systems
• Maintains close contact with the people in charge of software development in the experiments and responds to their requirements and needs
• Provides expert help with synchronization and integration between UltraLight and the software systems of the experiments
Physics Analysis User Group
• Contributes to ATLAS/CMS Physics preparation milestones (UL members are already active in LHC physics and several analyses are officially recognized in CMS)
• In the longer term, enables physics analysis and LHC physics research
• In the shorter term, actively involved in SC|05 activities, culminating in the Bandwidth Challenge in Seattle in November 2005
• Prepared a tutorial on data analysis with ROOT for the E & O workshop in Miami, June 2005
CMS Data Samples for Testing
• For initial testing, generated seven samples of Z′ events with masses from 0.2 to 5 TeV, fully simulated with OSCAR and reconstructed with ORCA on local Tier2 resources at UF; in total 42k events, ~2 GB in ROOT trees (ExRootAnalysis format); used for SC|05
• Additional data for different channels: QCD background, top events, bosons + jets, SUSY points; ~35 GB, same format
• In addition, ~100k single- and di-muon events were generated over the summer on the FNAL LPC Tier1 resources
Prototype CMS Analysis
• Developed stand-alone C++ code to analyze the ExRootAnalysis trees:
  • lightweight, no external dependencies besides ROOT; used for the iGrid2005 and SC|05 demos and by users at UF for analysis and CMS production validation
  • some parts of the information, e.g. trigger bits, are harder to access than in the CMS framework (need to load ORCA libraries)
Visualization
Before detector simulation: PYTHIA 4-vectors, viewed with CMKINViewer (DB)
After reconstruction: COJAC (Julian Bunn)
Collaborative Community Tools: CAVES / CODESH Projects
• Concentrate on the interactions between scientists collaborating over extended periods of time
• Seamlessly log, exchange and reproduce results and the corresponding methods, algorithms and programs
• Automatic and complete logging and reuse of work/analysis sessions: collect all user activities on the command line plus the code of all executed programs
• Extend the power of users working / performing analyses in their habitual way, giving them virtual logbook capabilities
• CAVES is used in normal analysis sessions with ROOT
• CODESH is a UNIX shell with virtual logbook capabilities
• Build functioning collaboration suites - close to users!
• Teams formed: CODESH: DB & Vaibhav Khandelwal; CAVES: DB & Sanket Totala
Choice of Scenarios
Case 1: Simple
User 1: does some analysis and produces a result with tag analX_user1.
User 2: browses all current tags in the repository and fetches the session stored with tag analX_user1.

Case 2: Complex
User 1: does some analysis and produces a result with tag analX_user1.
User 2: browses all current tags in the repository and fetches the session stored with tag analX_user1.
User 2: modifies the program obtained from user 1's session and stores it, along with a new result, under tag analX_user2_mod_code.
User 1: browses the repository, finds that his program was modified, and decides to extract that session using the tag analX_user2_mod_code.
This scenario can be extended to an arbitrary number of steps and users in a working group, or groups in a collaboration.
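The scenarios above can be sketched in a few lines of Python. The "repository" here is just an in-memory dict (the actual CAVES/CODESH prototypes back it with CVS), and all function and payload names are hypothetical, chosen only to mirror the scenario text:

```python
# Minimal illustrative sketch of the tag-based sharing scenarios above.
repository = {}

def store_session(tag, session):
    """Log a complete analysis session under a unique tag."""
    repository[tag] = dict(session)

def browse_tags():
    """List all tags currently in the repository."""
    return sorted(repository)

def fetch_session(tag):
    """Reproduce a stored session from its tag."""
    return dict(repository[tag])

# Case 2 (complex): user 1 stores a session; user 2 fetches it,
# modifies the program and stores it under a new tag; user 1 then
# extracts the modified session.
store_session("analX_user1", {"program": "cuts_v1.C", "result": "hist1.root"})
session = fetch_session("analX_user1")
session["program"] = "cuts_v1_mod.C"      # user 2's modification
store_session("analX_user2_mod_code", session)
print(browse_tags())  # ['analX_user1', 'analX_user2_mod_code']
```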
CAVES / CODESH Architectures
Scalable and Distributed
First prototypes use popular tools: Python, ROOT and CVS; e.g. all ROOT commands and CAVES commands, or all UNIX shell commands and CODESH commands, are available
Working Releases - CODESH
• Virtual log-book for “shell” sessions
• Parts can be local (private) or shared
• Tracks environment variables, aliases, invoked program code, etc. during a session
• Reproduces complete working sessions
• Complex CMS ORCA example operational
• All CMS data generation for the community group done at the LPC is stored in CODESH, so the knowledge remains available
Working Releases - CAVES
Higgs → W+W−
Large Scale Data Transfers
• Network aspect: the Bandwidth*Delay Product (BDP); we have to use TCP windows matching it in the kernel AND in the application
• On a local connection with 1 GbE and RTT 0.19 ms, to fill the pipe we need around 2*BDP:
  2*BDP = 2 * 1 Gb/s * 0.00019 s = ~48 KBytes
  Or, for a 10 Gb/s LAN: 2*BDP = ~480 KBytes
• Now on the WAN: from Florida to Caltech the RTT is 115 ms, so to fill the pipe at 1 Gb/s we need
  2*BDP = 2 * 1 Gb/s * 0.115 s = ~28.8 MBytes
• User aspect: are the servers on both ends capable of matching these rates for useful disk-to-disk transfers? Tune the kernels, get the highest possible disk read/write speed, etc. Tables turned: the WAN now outperforms disk speeds!
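The 2*BDP arithmetic above can be sanity-checked in a few lines (an illustrative sketch; the helper name is mine, not from the slides):

```python
def two_bdp_bytes(rate_bps, rtt_s):
    """Twice the bandwidth*delay product, in bytes (rate in bits/s, RTT in s)."""
    return 2 * rate_bps * rtt_s / 8

# 1 GbE LAN, RTT 0.19 ms -> ~47.5 KB (the slide rounds to ~48 KBytes)
print(two_bdp_bytes(1e9, 0.19e-3))
# 10 Gb/s LAN, same RTT -> ~475 KB (~480 KBytes)
print(two_bdp_bytes(10e9, 0.19e-3))
# Florida-Caltech WAN: 1 Gb/s, RTT 115 ms -> ~28.75 MB (~28.8 MBytes)
print(two_bdp_bytes(1e9, 115e-3))
```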
bbcp Tests
bbcp was selected as the starting tool for WAN tests:
• supports multiple streams, highly tunable (window size etc.), peer-to-peer design
• well supported by Andy Hanushevsky from SLAC
• used successfully in BaBar
• I used it in 2002 for CMS production, for massive data transfers from Florida to CERN; the only limits observed at the time were disk writing speed (on the LAN) and the network (on the WAN)
• Starting point Florida → Caltech: < 0.5 MB/s on the WAN, very poor performance
Evolution of Tests Leading to SC|05
• End points in Florida (uflight1) and Caltech (nw1): AMD Opterons over UL network
• Tuning of kernels and bbcp window sizes – coordinated iterative procedure
• Current status (for file sizes ~2 GB):
  • 6-6.5 Gb/s with iperf
  • up to 6 Gb/s memory to memory
  • 2.2 Gb/s ramdisk → remote disk write
    > the speed was the same writing to a SCSI disk (supposedly less than 80 MB/s) or to a RAID array, so de facto it always goes first to the memory cache (the Caltech node has 16 GB RAM)
• Used successfully with up to 8 bbcp processes in parallel from Florida to the show floor in Seattle; CPU load still OK
bbcp Examples: Florida → Caltech
    [bourilkov@uflight1 data]$ iperf -i 5 -c 192.84.86.66 -t 60
    ------------------------------------------------------------
    Client connecting to 192.84.86.66, TCP port 5001
    TCP window size: 256 MByte (default)
    ------------------------------------------------------------
    [ 3] local 192.84.86.179 port 33221 connected with 192.84.86.66 port 5001
    [ 3]  0.0- 5.0 sec  2.73 GBytes  4.68 Gbits/sec
    [ 3]  5.0-10.0 sec  3.73 GBytes  6.41 Gbits/sec
    [ 3] 10.0-15.0 sec  3.73 GBytes  6.40 Gbits/sec
    [ 3] 15.0-20.0 sec  3.73 GBytes  6.40 Gbits/sec

    bbcp -s 8 -f -V -P 10 -w 10m big2.root [email protected]:/dev/null
    bbcp: uflight1.ultralight.org kernel using a send window size of 20971584 not 10485792
    bbcp: Sink I/O buffers (245760K) > 25% of available free memory (231836K); copy may be slow
    bbcp: Creating /dev/null/big2.root
    Source cpu=5.654 mem=0K pflt=0 swap=0
    File /dev/null/big2.root created; 1826311140 bytes at 432995.1 KB/s
    24 buffers used with 0 reorders; peaking at 0.
    Target cpu=3.768 mem=0K pflt=0 swap=0
    1 file copied at effectively 260594.2 KB/s

    bbcp -s 8 -f -V -P 10 -w 10m big2.root [email protected]:dimitri
    bbcp: uflight1.ultralight.org kernel using a send window size of 20971584 not 10485792
    bbcp: Creating ./dimitri/big2.root
    Source cpu=5.455 mem=0K pflt=0 swap=0
    File ./dimitri/big2.root created; 1826311140 bytes at 279678.1 KB/s
    24 buffers used with 0 reorders; peaking at 0.
    Target cpu=10.065 mem=0K pflt=0 swap=0
    1 file copied at effectively 150063.7 KB/s
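A quick unit conversion shows how the effective bbcp rates quoted above line up with the Gb/s figures elsewhere in this report (a sketch; it assumes bbcp reports KB as 1024 bytes):

```python
def kbs_to_gbps(rate_kb_s):
    """Convert a bbcp KB/s figure to Gb/s, assuming 1 KB = 1024 bytes."""
    return rate_kb_s * 1024 * 8 / 1e9

# Effective end-to-end rates from the listings above:
print(round(kbs_to_gbps(260594.2), 2))  # /dev/null sink: ~2.13 Gb/s
print(round(kbs_to_gbps(150063.7), 2))  # real disk sink: ~1.23 Gb/s
```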
Data Transfers and Analysis
• CMS service challenges
• PhEDEx, the CMS system for data transfers Tier0 → Tier1 → Tier2:
  • gain expertise with the system
  • provide user feedback
• Integrate storage/transfer (SRM/dCache/PhEDEx) with the network
• Analysis of data from the cosmic runs in collaboration with the FNAL Tier1 (muons, calorimetry)
Outlook on Data Transfers
• The UltraLight network already performs very well
• The hard problem from the user perspective is now to match it with servers capable of sustained rates for large files > 20 GB (when the memory caches are exhausted); fast disk writes are key (RAID arrays)
• To fill 10 Gb/s pipes we need several pairs (3-4) of servers
• In ramdisk tests we achieved 1.2 GB/s on read and 0.3 GB/s on write (cp, dd, bbcp)
• Next step: disk-to-disk transfers between Florida, Caltech, Michigan, FNAL, BNL, CERN
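The "several pairs (3-4)" estimate above follows from dividing the pipe capacity by the sustainable per-pair disk-to-disk rate (a sketch; the 2.5-3.4 Gb/s per-pair range is my assumption, consistent with the ~2.2 Gb/s single-pair result plus further tuning):

```python
import math

def pairs_to_fill(pipe_gbps, per_pair_gbps):
    """Server pairs needed to saturate a pipe at a given per-pair rate."""
    return math.ceil(pipe_gbps / per_pair_gbps)

# For a 10 Gb/s pipe, assumed per-pair rates of 2.5-3.4 Gb/s give 3-4 pairs:
print(pairs_to_fill(10, 2.5))  # 4
print(pairs_to_fill(10, 3.4))  # 3
```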
UltraLight Analysis Environment
• Interact closely with the application group on integration of UltraLight services into the experiments' software environments:
  • Clarens web-services-oriented framework
  • MCPS job submission
  • Grid Analysis Environment, etc.

See the talk by Frank van Lingen (Application group)
Align with ATLAS/CMS/OSG Milestones
• ATLAS/CMS software stacks are complex and still developing
  > integration work is challenging and constantly evolving
• Data and Service Challenges 2006
  > exercise computing services together with the LCG + centers
  > system scale: 50% of a single experiment's needs in 2007
• Computing, Software, Analysis (CSA) Challenges 2006
  > ensure readiness of software + computing systems for data
  > 10M's of events through the entire system (incl. Tier2)
  > extensive needs for Tier2-to-Tier2 data exchanges; collaboration with DISUN
Outlook
• Dedicated groups of people (expert users) for data transfer and analysis tasks are available
• Excellent collaboration with the networking and application groups
• Team for developing collaboration tools formed
• Explore commonalities and increase the participation of ATLAS at the analysis stage
• SC|05 was a great success, laying a solid foundation for the next steps
• We are actively involved in LHC physics preparations, e.g. the CMS Physics TDR
• The Physics Analysis group will play a key role in achieving successful integration of UltraLight applications in the experiments' analysis environments
Backup slides
Linux Kernel Tunings
• Edit /etc/sysctl.conf to add the following lines:
> net.core.rmem_default = 268435456
> net.core.wmem_default = 268435456
> net.core.rmem_max = 268435456
> net.core.wmem_max = 268435456
> net.core.optmem_max = 268435456
> net.core.netdev_max_backlog = 300000
> net.ipv4.tcp_low_latency = 1
> net.ipv4.tcp_timestamps = 0
> net.ipv4.tcp_sack = 0
> net.ipv4.tcp_rmem = 268435456 268435456 268435456
> net.ipv4.tcp_wmem = 268435456 268435456 268435456
> net.ipv4.tcp_mem = 268435456 268435456 268435456
• Apply the changes on the fly by executing: sysctl -p /etc/sysctl.conf
• Sizes of ~256 MB (268435456 bytes) worked best (bigger ones were not helpful)
bbcp Examples: Caltech → Florida
    [uldemo@nw1 dimitri]$ iperf -s -w 256m -i 5 -p 5001 -l 8960
    ------------------------------------------------------------
    Server listening on TCP port 5001
    TCP window size: 512 MByte (WARNING: requested 256 MByte)
    ------------------------------------------------------------
    [ 4] local 192.84.86.66 port 5001 connected with 192.84.86.179 port 33221
    [ 4]  0.0- 5.0 sec  2.72 GBytes  4.68 Gbits/sec
    [ 4]  5.0-10.0 sec  3.73 GBytes  6.41 Gbits/sec
    [ 4] 10.0-15.0 sec  3.73 GBytes  6.40 Gbits/sec
    [ 4] 15.0-20.0 sec  3.73 GBytes  6.40 Gbits/sec
    [ 4] 20.0-25.0 sec  3.73 GBytes  6.40 Gbits/sec

    bbcp -s 8 -f -V -P 10 -w 10m big2.root [email protected]:/dev/null
    bbcp: Sink I/O buffers (245760K) > 25% of available free memory (853312K); copy may be slow
    bbcp: Source I/O buffers (245760K) > 25% of available free memory (839628K); copy may be slow
    bbcp: nw1.caltech.edu kernel using a send window size of 20971584 not 10485792
    bbcp: Creating /dev/null/big2.root
    Source cpu=5.962 mem=0K pflt=0 swap=0
    File /dev/null/big2.root created; 1826311140 bytes at 470086.2 KB/s
    24 buffers used with 0 reorders; peaking at 0.
    Target cpu=4.053 mem=0K pflt=0 swap=0
    1 file copied at effectively 263793.4 KB/s