Data Transfer Efficiency - leave no byte unchurned
Jens Jensen
Rutherford Appleton Laboratory
GridPP26, U Sussex, March 2011
Background
• GridPP’s data grid
– Distributed Storage Elements
– Data movers (FTS, PhEDEx et al)
– Catalogues (usu. replica)
• e-Infrastructure (aka cyberinfrastructure)
• (Presentation at ISGC)
The Data Grid
• WLCG is primarily a data grid
– Computation can (in principle) be redone
• Jobs go to where data is
– Moving a job is quicker than moving data
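The claim that moving a job is quicker than moving data can be illustrated with back-of-envelope arithmetic. The sketch below is only an illustration: the dataset size, job size and link speed are hypothetical numbers chosen for the example (none come from the talk), and the model ignores latency, protocol overhead and contention.

```python
def transfer_seconds(size_bytes: float, bandwidth_bits_per_s: float) -> float:
    """Idealised time to push size_bytes over a link (no latency, no overhead)."""
    return size_bytes * 8 / bandwidth_bits_per_s


# Hypothetical figures, for illustration only.
DATASET_BYTES = 1e12   # 1 TB of input data
JOB_BYTES = 10e6       # 10 MB job sandbox (scripts, config)
LINK_BPS = 1e9         # 1 Gbit/s WAN link

data_time = transfer_seconds(DATASET_BYTES, LINK_BPS)
job_time = transfer_seconds(JOB_BYTES, LINK_BPS)

print(f"move the data: {data_time:.0f} s")
print(f"move the job:  {job_time:.2f} s")
```

Under these assumed numbers the dataset takes hours to ship while the job takes a fraction of a second, which is the reason for sending jobs to the data.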
Postmature non-optimisation is the root of some evil
• The role of infrastructure code
– Scientist as a programmer
– “Bad” code moves up the stack?
– “Bad” code improves over time?
• Doofers stay in prod’n
Efficiencaciousness Goals
Service
• Availability
• Performance
• Grows as needed
• Robust (no SPoF?)
People
• (Effective) support
• Training
• Expertise
• Availability of…
Approaches
• Philosophy
– Get it done – WLCG
– Get it done right – EGI?
– Do It Perfectly The First Time…
• Evolutionary (control system) vs revolutionary
– Proactive vs reactive
Efficiencaciousness Issues
• Failures
– Sites – BDII, network
– Elements – storage
– Components – disk servers
• Timeouts
• DDoS
Efficiencaciousness Issues
• Overall effort
– Funded, contributed, external
• Availability of expertise
– Single Point of Knowledge
• Decoherence
• 2nd Law of Thermodynamics
• Learning from incidents
Efficiencaciousness Issues
• Primary communication
– Sites
– Users: large VOs, small VOs, single users
– PMB
• Secondary
– WLCG
– NGS
Efficiencaciousness Issues
• Sites
– There Is Always A Bottleneck Somewhere
– Site dependent
– Usage dependent
• Information
– Freshness
– Accuracy (“spped is substute fo accurcy”)
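One way to read "There Is Always A Bottleneck Somewhere" is that the end-to-end rate of a transfer is bounded by its slowest stage, so fixing one bottleneck only exposes the next. The sketch below assumes a simple serial pipeline; the stage names and rates are hypothetical and stand in for whatever a given site would actually measure.

```python
from typing import Dict, Tuple


def find_bottleneck(stage_rates_mb_s: Dict[str, float]) -> Tuple[str, float]:
    """Return (slowest stage, its rate): a serial pipeline runs no faster than that."""
    name = min(stage_rates_mb_s, key=stage_rates_mb_s.get)
    return name, stage_rates_mb_s[name]


# Hypothetical per-stage rates in MB/s for one site, for illustration only.
site = {
    "disk server": 400.0,
    "site LAN": 1250.0,
    "WAN link": 120.0,
    "remote SE": 300.0,
}

before = find_bottleneck(site)

# Upgrading the WAN link does not remove the bottleneck, it moves it.
upgraded = dict(site, **{"WAN link": 1000.0})
after = find_bottleneck(upgraded)

print("before upgrade:", before)
print("after upgrade: ", after)
```

Which stage is slowest depends on the site and on the usage pattern, which is why the same tuning advice does not transfer cleanly between sites.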
Efficiencaciousness Issues
• Usage patterns
– C.f. Wahid’s talk yesterday
– WAN vs LAN (WN) traffic
• Technology
– In the narrow sense (drives, controllers)
– And the wider sense: dist’d filesystems
• Support: Upstream (EGI), Fabric
Efficiencaciousness Issues
• Overheads
– Complexity of use of stack (see next)
– Infrastructure is complex
– But Complexity Has To Go Somewhere
• Time-to-production
– Testing, troubleshooting, monitoring, tweaking, tuning
Expt
• DDM et al
Data movers
• FTS
• Catalogues
Data control
• SRM
• SE GRIS
Transport
• WAN: GridFTP
• LAN: RFIO, DCAP, …
Network
• Routers, switches, firewalls, OPN
Fabric
• HDD, SSD, tapes
• Network cards, disk/RAID controllers
With apologies to the OSI stack
Graeme’s talk
• “Get the best out of what we can afford to buy”
• Proactive sites better
• Standards are good
E[GM]I involvement
• EMI data roadmap
– Support for dCache, DPM, StoRM
– Support for standards (NFS4, CDMI)
• But then
– StoRM=INFN, dCache=DESY, DPM=CERN
The Cloud View
• Supplement resources with on-demand
• Agile
• CDMI is superset of SRM
– But using ReST+JSON, not SOAP
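To make the ReST+JSON point concrete, the sketch below builds (but does not send) a CDMI-style request to create a data object: a plain HTTP PUT on a resource path with a small JSON body, rather than a SOAP envelope. The media type and version header follow my reading of the SNIA CDMI specification and should be checked against the version in use; the function name, path and version string are hypothetical choices for this example.

```python
import json


def cdmi_create_object_request(path: str, value: str, spec_version: str = "1.0.2") -> dict:
    """Describe a CDMI-style 'create data object' call as plain data (nothing is sent)."""
    headers = {
        # Header and media-type names as understood from the CDMI spec (assumption).
        "Content-Type": "application/cdmi-object",
        "Accept": "application/cdmi-object",
        "X-CDMI-Specification-Version": spec_version,
    }
    body = json.dumps({"mimetype": "text/plain", "metadata": {}, "value": value})
    return {"method": "PUT", "path": path, "headers": headers, "body": body}


# Hypothetical container and object names.
request = cdmi_create_object_request("/mycontainer/hello.txt", "hello grid")
print(request["method"], request["path"])
print(request["body"])
```

The whole operation is a verb, a path, a few headers and a JSON document, which any HTTP client can produce without a SOAP toolkit.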
(Open) Standards
• Standards promote interoperation and stability
• Interoperation
• Multiple (independent) implementations
– Both Java and (C or C++)
The Case for Non-HEP Data
• Benefit from non-HEP data
– Outreachy stuff
– Benefit to society (eg saving lives)
• NGI interop (at compute)
• Others…