XRootD Monitoring Report - indico.fnal.gov
XRootD Monitoring Report
A. Beche, D. Giordano
Outline
Talk 1: XRootD Monitoring Dashboard
- Context
- Dataflow and deployment model
- Database: storage & aggregation
- User interface & use cases
- Open issues & future work
- Summary
Talk 2: Beyond XRootD monitoring
- HTTP/WebDAV integration
- Integration in the WLCG Transfers Dashboard
10 – April - 14 A.Beche – Federated Workshop 2
XRootD federation monitoring
[Plot: Number of sites reporting (y-axis: # sites, 0 to 45)]
Activity started during summer 2012
- 4 sites for FAX, 11 for AAA
Monitoring data increased accordingly:

       July 2012    March 2014
AAA    606k         43M
FAX    15k          22M
Why monitoring?
- Understand data flows to estimate data traffic
- Provide information for efficient operations
- Identify access patterns and propose data placement strategies
XRootD monitoring dataflow
[Diagram. Components: Federation, GLED Collector, ActiveMQ, Consumer, Raw and Stats tables (stats computed every 10 minutes), WEB API, Dashboard UI, External applications. Link labels: UDP, stomp, real time, asynchronous.]
GLED Deployment model
[Plot: EOS monitoring data rate (Hz, axis 0 to 200)]
[Plot: Federation monitoring data rate (Hz, axis 0 to 20)]
AMQ @ CERN: shared cluster, 5 nodes
GLED collectors:
- AAA: UCSD (16 Hz)
- EOS: CERN (10 Hz)
- EOS: CERN (150 Hz)
- FAX US: SLAC (9 Hz)
- FAX EU: CERN (1 site)
Consolidated dataflow
Two uses of this raw data:
- Dashboard monitoring
- XRootD popularity
Both now share the same database:
- Storage optimization
- Consistency guaranteed
Database
- AAA: ~300 GB, ~1B records
- FAX: ~600 GB, ~2B records
- Daily insert: 2 GB / 6M rows
Storage:
- Raw, statistics, metadata
- Tables daily partitioned, no global indexes
[Plot: Database usage growth* (GB, axis 0 to 700)]
* Indexes excluded
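As a quick consistency check on the figures above, the average record size can be derived from each pair of numbers. This is only a back-of-the-envelope sketch: it assumes decimal gigabytes and treats the rounded slide values ("~300 GB", "~1B records") as exact.

```python
GB = 10**9  # assumption: decimal gigabytes


def bytes_per_record(size_gb, records):
    """Average size of one record, in bytes."""
    return size_gb * GB / records


aaa = bytes_per_record(300, 1e9)   # AAA: ~300 GB for ~1B records
fax = bytes_per_record(600, 2e9)   # FAX: ~600 GB for ~2B records
daily = bytes_per_record(2, 6e6)   # daily insert: 2 GB for 6M rows

print(f"AAA:   {aaa:.0f} bytes/record")
print(f"FAX:   {fax:.0f} bytes/record")
print(f"Daily: {daily:.0f} bytes/row")
```

Under these assumptions the three figures agree to within rounding, at roughly 300 bytes per record.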
Database
Raw data aggregation:
- Done using PL/SQL procedures
- Events are unordered
- Stateless: full re-computation of touched bins each time
- Compute stats from raw data in 10 min bins
- Aggregate 10 min stats in daily bins
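The stateless scheme can be illustrated outside the database. The sketch below is in Python rather than PL/SQL and is not the production procedure: the function names, the in-memory lists and the (timestamp, bytes) event shape are all assumptions made for the example. It shows why recomputing every touched bin from the raw data copes with unordered events.

```python
from collections import defaultdict

BIN_SECONDS = 600     # 10-minute bins
DAY_SECONDS = 86400   # daily bins


def bin_start(ts, width=BIN_SECONDS):
    """Round a Unix timestamp down to the start of its bin."""
    return ts - ts % width


def recompute_touched_bins(raw_events, new_events, stats):
    """Store new raw events, then rebuild every 10-minute bin they touch.

    raw_events: list of (timestamp, nbytes) already stored (hypothetical shape)
    new_events: list of (timestamp, nbytes) that just arrived, in any order
    stats:      dict bin_start -> (transfers, nbytes), updated in place
    """
    raw_events.extend(new_events)
    touched = {bin_start(ts) for ts, _ in new_events}
    # Rebuild each touched bin from the raw data instead of incrementing
    # counters, so a late event cannot leave a bin in a stale state.
    for b in touched:
        in_bin = [n for ts, n in raw_events if bin_start(ts) == b]
        stats[b] = (len(in_bin), sum(in_bin))
    return touched


def daily_from_ten_minute(stats):
    """Aggregate the 10-minute statistics into daily bins."""
    daily = defaultdict(lambda: [0, 0])
    for b, (transfers, nbytes) in stats.items():
        day = bin_start(b, DAY_SECONDS)
        daily[day][0] += transfers
        daily[day][1] += nbytes
    return {day: tuple(totals) for day, totals in daily.items()}
```

Because each call rebuilds whole bins, replaying the same batch of statistics work is harmless as long as the raw events themselves are stored only once.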
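Aggregation methods
[Figure: transfers drawn on a timeline from 2pm to 7pm]

Easy method (hourly bins):
             2-3pm   3-4pm   4-5pm      5-6pm        6-7pm
Transfers    1       0       0          2            1
Bytes        10      0       0          15           20

Adopted method (easy-method count in parentheses):
             2-3pm   3-4pm   4-5pm      5-6pm        6-7pm
Transfers    1 (1)   1 (0)   2 (0)      3 (2)        1 (1)
Bytes        8       1       14 (9+6)   15 (1+9+5)   5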
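The difference between the two methods can be sketched in a few lines of Python. The slide does not say how bytes are divided between bins, so the proportional-to-overlap split below is an assumption, as are the function names and the (start, end, bytes) tuple shape; times are seconds from the start of the first bin.

```python
HOUR = 3600  # bin width in seconds


def easy_method(transfers, n_bins, width=HOUR):
    """Attribute each transfer entirely to the bin in which it ends."""
    counts = [0] * n_bins
    volume = [0.0] * n_bins
    for _start, end, nbytes in transfers:
        i = min(int(end // width), n_bins - 1)
        counts[i] += 1
        volume[i] += nbytes
    return counts, volume


def overlap_method(transfers, n_bins, width=HOUR):
    """Count a transfer in every bin it overlaps and split its bytes
    in proportion to the time spent in each bin (assumed rule)."""
    counts = [0] * n_bins
    volume = [0.0] * n_bins
    for start, end, nbytes in transfers:
        duration = end - start
        if duration <= 0:
            # Instantaneous transfer: nothing to split, use the end bin.
            i = min(int(end // width), n_bins - 1)
            counts[i] += 1
            volume[i] += nbytes
            continue
        for i in range(n_bins):
            overlap = min(end, (i + 1) * width) - max(start, i * width)
            if overlap > 0:
                counts[i] += 1
                volume[i] += nbytes * overlap / duration
    return counts, volume


# One 20-byte transfer running from 0h30 to 2h30 across three hourly bins.
transfers = [(1800, 9000, 20)]
easy = easy_method(transfers, 3)
adopted = overlap_method(transfers, 3)
print(easy)
print(adopted)
```

With the easy method the whole transfer lands in the last bin; with the overlap method it is visible in all three bins and the total number of bytes is preserved.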
Visualization Interface
Pre-defined set of views
Use case example: understand site access patterns
1. Which sites are reading from FNAL?
2. Zoom to a specific site to understand which users are reading
3. Understand which files are read by a user
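The three steps amount to successive group-by queries with progressively narrower filters. The sketch below mimics that drill-down over in-memory records; the field names (server_site, client_site, user, file, bytes), the helper name and the sample data are all hypothetical and do not reflect the dashboard's actual schema or API.

```python
from collections import Counter


def bytes_by(records, key, **filters):
    """Sum bytes per value of `key` over the records matching all filters."""
    totals = Counter()
    for r in records:
        if all(r[k] == v for k, v in filters.items()):
            totals[r[key]] += r["bytes"]
    return totals.most_common()


# Made-up sample records for illustration only.
records = [
    {"server_site": "FNAL", "client_site": "SiteA", "user": "alice",
     "file": "/store/a.root", "bytes": 30},
    {"server_site": "FNAL", "client_site": "SiteA", "user": "bob",
     "file": "/store/b.root", "bytes": 10},
    {"server_site": "FNAL", "client_site": "SiteB", "user": "carol",
     "file": "/store/a.root", "bytes": 5},
    {"server_site": "Other", "client_site": "SiteA", "user": "alice",
     "file": "/store/c.root", "bytes": 99},
]

# 1. Which sites are reading from FNAL?
step1 = bytes_by(records, "client_site", server_site="FNAL")
# 2. Which users are reading from one of those sites?
step2 = bytes_by(records, "user", server_site="FNAL", client_site="SiteA")
# 3. Which files does one of those users read?
step3 = bytes_by(records, "file", server_site="FNAL",
                 client_site="SiteA", user="alice")
print(step1, step2, step3, sep="\n")
```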
Data popularity
XRootD monitoring provides information about file access patterns:
- Including non-official collections (i.e. user files)
- Helps to simplify the usage of disk resources and make it more efficient
Popularity data analytics are built on this information:
- Already adopted for CMS-EOS, will be extended to the full AAA
Archive recommendation for CMS-EOS
- Helps to manage the disk space of EOS, including user space
- No central bookkeeping system
Unused files: created more than 4 months ago, with no access in the last 3 months:
- ~500 TB of space occupied and not used, i.e. 30% of the total for these areas
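The selection rule for unused files can be written as a simple filter. This is a sketch of the rule as stated on the slide, not the actual popularity service: the tuple layout, the function name, the file paths and the approximation of 4 and 3 months as 120 and 90 days are all assumptions.

```python
from datetime import datetime, timedelta


def unused_files(files, now, min_age_days=120, idle_days=90):
    """Select files created more than min_age_days ago that have not been
    accessed in the last idle_days.

    files: iterable of (path, size, created, last_access); last_access is
           None when no access was ever recorded (hypothetical layout).
    """
    created_before = now - timedelta(days=min_age_days)
    accessed_after = now - timedelta(days=idle_days)
    return [
        f for f in files
        if f[2] < created_before and (f[3] is None or f[3] < accessed_after)
    ]


# Made-up catalogue entries for illustration only.
now = datetime(2014, 4, 10)
files = [
    ("/eos/old_unused", 100, datetime(2013, 1, 1), datetime(2013, 6, 1)),
    ("/eos/old_used", 200, datetime(2013, 1, 1), datetime(2014, 4, 1)),
    ("/eos/new", 300, datetime(2014, 3, 1), None),
    ("/eos/never_read", 400, datetime(2013, 5, 1), None),
]
candidates = unused_files(files, now)
reclaimable = sum(size for _path, size, _created, _last in candidates)
print([path for path, *_ in candidates], reclaimable)
```

Recently created files are excluded even when they have never been read, which avoids flagging data that has simply not been used yet.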
Open issues
Missing servers:
- dCache sites
Servers should provide their site name:
- CMS: only 5 sites: anon, BUDAPEST, Hephy-Vienna, T2_US_USCD, UKI-LT2-Brunel
- Naming convention is not coherent
- ATLAS: GLED RPM to be deployed
GLED Collector improvements:
- Reliability of the service: recovery time can be long due to the time difference; GLED should be operated as a production service
- Scalability: to be fixed soon with automatic reconnection
Future work
Strong requirement from ATLAS to understand efficiency:
- Need the concept of error / failure
- How could the XRootD server be instrumented to report it?
European GLED collector is up and running:
- Only 1 pilot site is reporting to it (CNAF)
- Should we keep it?
Data mining activity (not started yet):
- Almost 2 years of raw data (1 TB)
Data Mining
Extract further knowledge from the data…
- Detect inefficiencies
- Propose deletion strategies
- Define data placement
… by
- Understanding access patterns and data usage
- Correlating data traffic and data access performance
Possibility to automate some operations
Application usage
[Plots: application usage for FAX (axis ticks 10, 20) and AAA (axis ticks 15, 30)]
Summary
Monitoring federations is a challenge:
- High rate of traffic & information
- Challenge met by data aggregation and scalable technologies
Dashboard is not actively used:
- Fewer than 10 daily users (FAX), fewer than 15 (AAA)
- Are any functionalities missing?
Improvement work is ongoing
New requests are coming
XRootD monitoring is one piece of the entire data transfers puzzle:
- See next talk
Beyond XRootD monitoring
A. Beche, D. Giordano
HTTP Federation is coming
HTTP protocol will be used in the future:
- XRootD servers can be accessed over HTTP
- See Fabrizio’s presentation on xrdhttp
Two kinds of access:
- Pure HTTP access (through Apache)
- HTTP gate to XRootD server
They can’t be monitored in the same way
Monitoring XRootD access protocol
XRootD 4 will now report the user protocol:
- The whole monitoring chain needs to be updated
- Dashboard DB and UI are fully ready
[Plot legend: HTTP, XRootD]
HTTP/WebDAV federation monitoring
[Diagram built up over five slides. Components: JOBs, XRootD Federation, HTTP Federation, Sites serving data through XRootD, XrdHTTP and Apache in front of an SE, GLED collector, ActiveMQ. The last slide adds a "?" to the picture.]
How to compare data from different applications?
Data transfers & accesses monitoring tools
[Diagram: three separate dashboards (WLCG, FAX, AAA), each with its own WEB API / UI. Data sources shown: FTS, FAX, AAA, EOS.]
WLCG Transfers Dashboard federated approach
[Diagram: the FTS, FAX and AAA dashboards each keep their own WEB API / UI (data sources shown: FTS, FAX, AAA, EOS). A common WLCG Transfers Dashboard API / UI sits on top of them.]
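One way to picture the federated approach is a thin layer that forwards the same query to each underlying dashboard and merges the answers. The sketch below is illustrative only: the backends are in-memory stand-ins, and the function names, the `vo` parameter and the row fields are assumptions rather than the real dashboard APIs.

```python
def make_backend(rows):
    """Build a stand-in for one dashboard's Web API from in-memory rows."""
    def query(vo=None):
        return [r for r in rows if vo is None or r["vo"] == vo]
    return query


def federated_query(backends, **params):
    """Send the same query to every backend and tag each returned row
    with the name of the backend it came from."""
    merged = []
    for name, query in backends.items():
        for row in query(**params):
            merged.append({"source": name, **row})
    return merged


# Made-up numbers for illustration only.
backends = {
    "FTS": make_backend([{"vo": "atlas", "bytes": 100},
                         {"vo": "cms", "bytes": 50}]),
    "FAX": make_backend([{"vo": "atlas", "bytes": 20}]),
    "AAA": make_backend([{"vo": "cms", "bytes": 30}]),
}

cms_rows = federated_query(backends, vo="cms")
cms_total = sum(r["bytes"] for r in cms_rows)
print(cms_rows, cms_total)
```

Adding another technology then means registering one more backend, and cross-technology totals fall out of the merged rows.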
Some plots
[Plots: FTS and XRootD transfers for ATLAS, CMS, LHCb and ALICE]
Summary
A lot of effort has been put into the XRootD monitoring workflow and dashboard in the last 2 years:
- Reliable system achieved
- Lots of use cases covered
HTTP monitoring has already started:
- Will require a lot of effort to reach the XRootD monitoring level
New WLCG Transfers Dashboard architecture:
- Highly extensible system
- Cross-VO or cross-technology analysis
Credits
Andreeva Julia, Cons Lionel, Giordano Domenico, Saiz Pablo, Tadel Matevz, Tuckett David, Vukotic Ilija
The AAA and FAX deployment team
…
Useful links
AAA Dashboard: http://dashb-cms-xrootd-transfers.cern.ch
FAX Dashboard: http://dashb-atlas-xrootd-transfers.cern.ch
CHEP materials:
- https://indico.cern.ch/abstractDisplay.py?abstractId=101&confId=214784
- https://indico.cern.ch/getFile.py/access?contribId=94&sessionId=6&resId=0&materialId=slides&confId=214784
- https://indico.cern.ch/getFile.py/access?contribId=265&sessionId=6&resId=1&materialId=slides&confId=214784
Xbrowse framework: https://twiki.cern.ch/twiki/bin/view/ArdaGrid/XbrowseFramework
Thanks for your attention