What is GridFTP?
- High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks
- Based on the FTP protocol; defines extensions for high-performance operation and security
- We supply a reference implementation:
  - Server
  - Client tools (globus-url-copy)
  - Development libraries
- Multiple independent implementations can interoperate
  - Fermilab and U. Virginia have home-grown servers that work with ours
GridFTP
- Two-channel protocol, like FTP
- Control channel
  - Communication link (TCP) over which commands and responses flow
  - Low bandwidth; encrypted and integrity protected by default
- Data channel
  - Communication link(s) over which the actual data of interest flows
  - High bandwidth; authenticated by default; encryption and integrity protection optional
Performance
[Figure: disk transfer between Urbana, IL and San Diego, CA. Throughput (Gbit/s) versus degree of striping, with one curve each for 1, 2, 4, 8, 16, and 32 streams.]
GridFTP over UDT
- GridFTP uses XIO for network I/O operations
- XIO presents a POSIX-like interface to many different protocol implementations
[Diagram: two driver stacks. Default GridFTP: GSI over TCP. GridFTP over UDT: GSI over UDT.]
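The idea of swapping the transport under an unchanged upper layer can be sketched in a few lines of Python. This is a toy model only: `Transport`, `TaggingLayer`, `build_stack`, and `describe` are hypothetical names, not the XIO API, and the "gsi" layer here merely tags data instead of securing it.

```python
class Transport:
    """Bottom of a stack: records what would go on the wire."""

    def __init__(self, name):
        self.name = name
        self.sent = []

    def write(self, data: bytes) -> int:
        self.sent.append(data)
        return len(data)


class TaggingLayer:
    """Stand-in for a security driver: marks each write, then passes it down."""

    def __init__(self, name, lower):
        self.name = name
        self.lower = lower

    def write(self, data: bytes) -> int:
        self.lower.write(b"[" + self.name.encode() + b"]" + data)
        return len(data)


def build_stack(transport_name):
    # The caller only picks the transport; the upper layer is unchanged.
    return TaggingLayer("gsi", Transport(transport_name))


def describe(stack):
    """Walk the stack from top to bottom and join the layer names."""
    names = []
    layer = stack
    while layer is not None:
        names.append(layer.name)
        layer = getattr(layer, "lower", None)
    return " over ".join(names)


default_stack = build_stack("tcp")
udt_stack = build_stack("udt")
written = udt_stack.write(b"data")
print(describe(default_stack), "|", describe(udt_stack))
```

The caller uses the same `write` call on both stacks, which is the property the slide attributes to the POSIX-like interface.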
Reliable File Transfer Service (RFT)
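- GridFTP client
- WSRF-compliant, fault-tolerant service
[Diagram: an RFT client exchanges SOAP messages with the RFT service and optionally receives notifications. The RFT service has a persistent store and a control channel (CC) to each of two GridFTP servers; a data channel (DC) connects the two servers.]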
Architecture Overview
[Diagram: two virtual organizations, VO 1 and VO 2. Labels: RFT Client, RFT Service, GridFTP Client A, GridFTP Client B, GridFTP Server, GridFTP Striped Server.]
GridFTP Service
- On-demand transfer service
  - When a connection is formed, resources are dedicated
  - GridFTP might say “not now”
  - Not a queuing service
- Transfer data as fast as possible
  - Maximize resource usage
  - Without overheating!
What GridFTP Does
- Fast data transfer service
  - Cluster-to-cluster copy tool
- Intra-cluster broadcast tool
  - Multicast transfers
- Scalable
  - Need more throughput? Add more stripes
RFT Service
- Orchestrates transfers on the client’s behalf
  - Third-party transfers
  - Interacts with many GridFTP servers
- Sees a bigger picture
  - VO level
- Queues requests
  - RFT should not say no
- Retries requests on failure
- Optimizes its workload
What RFT Does
- Reliable service
  - DB backend
  - Recovers from GridFTP and RFT service failures
- Batch requests
- Lightweight sessions
  - Submit a request
  - Wait for notifications
    - Started, finished, failed, etc.
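The submit-then-wait pattern can be sketched as a small in-memory queue that always accepts a request, retries on failure, and records notifications. This is a minimal illustration, not the RFT interface: `ToyTransferQueue`, `flaky_transfer`, the retry count, and the example URLs are all hypothetical, and the DB-backed persistence that lets RFT recover from its own failures is not modeled.

```python
from collections import deque


class ToyTransferQueue:
    """Accepts every request, runs them in order, retries failed transfers."""

    def __init__(self, transfer_fn, max_attempts=3):
        self.transfer_fn = transfer_fn      # callable(src, dst); raises OSError on failure
        self.max_attempts = max_attempts    # assumed retry limit
        self.pending = deque()
        self.events = []                    # (request_id, state) notifications

    def submit(self, request_id, src, dst):
        # Never refuses: the request simply waits its turn.
        self.pending.append((request_id, src, dst, 0))

    def run(self):
        while self.pending:
            request_id, src, dst, attempts = self.pending.popleft()
            if attempts == 0:
                self.events.append((request_id, "started"))
            try:
                self.transfer_fn(src, dst)
            except OSError:
                attempts += 1
                if attempts < self.max_attempts:
                    # Put the request back at the end of the queue.
                    self.pending.append((request_id, src, dst, attempts))
                else:
                    self.events.append((request_id, "failed"))
            else:
                self.events.append((request_id, "finished"))


calls = []


def flaky_transfer(src, dst):
    """Fails on the first call only, to exercise the retry path."""
    calls.append((src, dst))
    if len(calls) == 1:
        raise OSError("server busy")


queue = ToyTransferQueue(flaky_transfer)
queue.submit("req-1", "gsiftp://src.example.org/data", "gsiftp://dst.example.org/data")
queue.run()
print(queue.events)
```

The client's side of the session stays light: it submits once and reads the event list, while the queue absorbs the rejected first attempt.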
GridFTP: On Demand Service
- Resources are limited
  - Data transfers are heavyweight operations
  - Sometimes hardware is too busy
    - Adding another transfer can cause thrashing
    - Collective system throughput goes down
  - GridFTP might say “no”
- Transfer requests happen immediately
  - We do not queue or delay transfers
  - An established session means an active transfer
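A minimal sketch of this accept-or-reject behavior, assuming a fixed session limit: the server either admits a session at once or refuses it, and never holds a request for later. `OnDemandServer` and its methods are hypothetical names for illustration, not part of any GridFTP implementation.

```python
class OnDemandServer:
    """Admits a session immediately or rejects it; there is no waiting list."""

    def __init__(self, max_sessions):
        self.max_sessions = max_sessions
        self.active = set()

    def connect(self, session_id):
        """Return True if the session is accepted, False if the server says no."""
        if len(self.active) >= self.max_sessions:
            return False
        self.active.add(session_id)
        return True

    def disconnect(self, session_id):
        self.active.discard(session_id)


server = OnDemandServer(max_sessions=2)
results = [server.connect(s) for s in ("a", "b", "c")]  # third one is refused
server.disconnect("a")
retry = server.connect("c")                              # capacity is free again
print(results, retry)
```

The rejected client is responsible for trying again; nothing on the server remembers the refused request.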
Why Doesn’t GridFTP Queue?
- A GridFTP session is heavyweight
  - Idle sessions consume resources
- Backward-compatible protocol
- Sometimes less is more
  - Goal: maximize the collective throughput
    - Sum of all active transfer rates
  - Too many transfers cause thrashing
    - Results in lower collective throughput
  - Avoid overheating system resources
- It is in the system’s best interest
  - We know what’s good for you
GridFTP Session Resources
- Even for an idle session
  - Active TCP control channel
    - Part of the RFC 959 (FTP) protocol: a session is defined by a TCP connection
  - Fork/setuid process
    - Robustness
    - File system/OS permissions
  - OS buffer space
    - Data channels require large TCP OS buffers
- Active transfers
  - Lots of memory, network, and disk I/O
  - Avoid partitions that are too small
If GridFTP Always Said Yes
- OOM: the out-of-memory handler
  - The OS provisions TCP buffers optimistically
  - Random processes will be killed
  - Meltdown
- Shared FS overuse
  - Pushing the I/O throughput beyond optimal
  - Causing OOM on IOD machines
- Shares of bandwidth too small
  - 1 million transfers at 500 b/s each? Or 10 transfers at 100 Mb/s each?
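To make the bandwidth-share comparison concrete, the short calculation below estimates how long one file takes at each per-transfer rate. The 1 GB file size is an assumption chosen for illustration, and protocol overhead is ignored.

```python
def transfer_seconds(size_bytes, rate_bits_per_s):
    """Time to move size_bytes at a steady rate, ignoring protocol overhead."""
    return size_bytes * 8 / rate_bits_per_s


FILE_BYTES = 10**9  # assumed 1 GB file, not a figure from the slides

slow = transfer_seconds(FILE_BYTES, 500)           # one of a million 500 b/s shares
fast = transfer_seconds(FILE_BYTES, 100 * 10**6)   # one of ten 100 Mb/s shares

print(f"at 500 b/s:  {slow / 86400:.0f} days")
print(f"at 100 Mb/s: {fast:.0f} seconds")
```

Under these assumptions a single file needs roughly 185 days at 500 b/s and 80 seconds at 100 Mb/s, which is why admitting fewer transfers at once serves everyone better.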
Simultaneous Sessions
- Goal: collective throughput
  - The entire server’s bytes transferred / time
  - Not the number of transfers at once
- Only reasons for more than one connection
  - Provide an interactive service for many
  - One session does not use all of the local resources
    - The remote side is the bottleneck
  - Hide control messaging overhead in another session’s data transfer payload
Remote Bottleneck
Allow more than one simultaneous transfer to use all resources
[Diagram label: 10 Gb/s]
But We Want Queuing!
- May I offer you something in an RFT?
  - RFT says yes
  - Server-side retries
- Lightweight sessions
  - GridFTP does the heavy lifting
  - Queues up requests for pending transfers
  - Notification upon completion
- Scalability
  - Manages and optimizes access to GridFTP servers
RFT Session Interactions
[Diagram: a client sends a request to RFT and receives a notification. Other labels: GridFTP src, GridFTP dst.]
Scalability
- GridFTP
  - Connection rejection is a feature
    - It SHOULD say no
  - Intended to scale to system transfer rates
    - Not beyond them
  - To scale up
    - Add more nodes as stripes (dynamic backends)
    - Use faster NICs
- RFT
  - Intended to scale to memory
  - It should not say no
GridFTP Broke My Cluster!
- GridFTP will push hardware as hard as it is allowed
  - But not harder
- sudo rm -rf /
  - Did sudo break the FS?
- ssh -l root host1 fork.bomb
  - Did sshd take down the host?
- globus-url-copy -tcp-bs 100GB <src> <dst>
  - Did GridFTP break the cluster?
Resource Protection
- Limits need to be in place to protect resources
  - Knowing it is OK to say ‘no’ is step 1
- What will the hardware allow?
  - How fast are my disks?
  - How fast is my NIC?
  - How fast can I send data while using the NetFS?
  - How many WAN transfers can I support with system memory?
  - How many simultaneous transfers are reasonable to sustain?
Fast Transfer Resources
- CPU
  - Packet switching
- Memory
  - OS buffers (BWDP)
  - User-space buffers
  - WAN needs much more
- System bus
- Disk
  - Shared FS? (net also)
- Network
  - Router and LAN
[Diagram labels: TCP buffers, CPU, bus]
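BWDP here is the bandwidth-delay product: the amount of data in flight during one round trip, which is what the OS buffers must hold to keep a link full. The sketch below computes it for one link speed at two round-trip times. The 10 Gb/s rate and both RTT values are assumptions picked for illustration, and `bwdp_bytes` is a hypothetical helper name.

```python
def bwdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes for a given link rate and round-trip time."""
    # Divide by 1000 to convert ms to s and by 8 to convert bits to bytes.
    return bandwidth_bits_per_s * rtt_ms // 8000


LINK = 10 * 10**9               # assumed 10 Gb/s link

wan = bwdp_bytes(LINK, 60)      # assumed 60 ms wide-area round trip
lan = bwdp_bytes(LINK, 1)       # assumed 1 ms local round trip

print(f"WAN buffer: {wan / 10**6:.2f} MB, LAN buffer: {lan / 10**6:.2f} MB")
```

With these numbers the wide-area path needs 75 MB of buffer against 1.25 MB locally, a factor of 60, which illustrates why WAN transfers need much more memory.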
Cluster Components
- Disk
  - Shared I/O servers
- Net
  - Backplane bandwidth
- Systems
  - CPU/memory
  - Are IODs and GridFTP servers co-located?
[Diagram labels: shared I/O servers, GridFTP backends, GridFTP frontends]
Connection Caps
- As a function of system memory
  - Cap = |mem| / (2MB + avg(BWDP))
  - Never more than |mem| / 4MB

    # xinetd entry: at most 20 concurrent server instances
    service gsiftp
    {
        instances   = 20
        socket_type = stream
        wait        = no
        env         += GLOBUS_LOCATION=…
        env         += LD_LIBRARY_PATH=…
        server      = /usr/local/globus-4.0.1/sbin/globus-gridftp-server
        server_args = -i -p 2811
        disable     = no
    }

    % globus-gridftp-server -connections-max 20
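The memory-based cap can be worked through in a few lines. This sketch applies the slide's formula and its upper bound together. The function name, the use of 2**20 bytes for "MB", and the example figures (8 GiB of memory, 16 MiB average BWDP) are assumptions for illustration.

```python
MB = 2**20  # treating the slide's "MB" as 2**20 bytes (assumption)


def memory_connection_cap(mem_bytes, avg_bwdp_bytes):
    """Cap = |mem| / (2MB + avg(BWDP)), never more than |mem| / 4MB."""
    by_formula = mem_bytes // (2 * MB + avg_bwdp_bytes)
    ceiling = mem_bytes // (4 * MB)
    return min(by_formula, ceiling)


mem = 8 * 2**30                                    # assumed 8 GiB of system memory

wan_cap = memory_connection_cap(mem, 16 * MB)      # assumed 16 MiB average BWDP
lan_cap = memory_connection_cap(mem, 1 * MB)       # small BWDP: the ceiling applies

print(wan_cap, lan_cap)
```

When the average BWDP is small the formula alone would allow more sessions than the `|mem| / 4MB` ceiling, so the ceiling is what limits the second case.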
Connection Caps
- As a function of system bandwidth
  - Cap = min(FS.BW, Net.BW) / (target average transfer rate)
- As a function of my gut
  - 20 - 50
  - Best guess based on personal experience
  - Typically this is where collective BW plateaus
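The bandwidth-based cap is a single division once the slower of the two paths is known. The sketch below uses assumed figures (an 8 Gb/s file system, a 10 Gb/s network, and a 200 Mb/s target per transfer); none of these numbers come from the slides, and `bandwidth_connection_cap` is a hypothetical helper name.

```python
def bandwidth_connection_cap(fs_bw, net_bw, target_rate):
    """Cap = min(FS.BW, Net.BW) / target average transfer rate (same units throughout)."""
    return min(fs_bw, net_bw) // target_rate


fs_bw = 8 * 10**9        # assumed file system bandwidth, bits/s
net_bw = 10 * 10**9      # assumed network bandwidth, bits/s
target = 200 * 10**6     # assumed target average rate per transfer, bits/s

cap = bandwidth_connection_cap(fs_bw, net_bw, target)
print(cap)
```

With these inputs the file system is the limiting path and the cap works out to 40, which happens to fall inside the 20 to 50 range quoted as a rule of thumb.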
System Buffer Limits
- Limit the amount of OS space per connection
- Auto-tuning
  - 16MB - 64MB

    % sysctl -w net.core.rmem_max=<value>
    % sysctl -w net.core.wmem_max=<value>
    % cat /proc/sys/net/ipv4/tcp_wmem
    4096 16384 4194304
    % cat /proc/sys/net/ipv4/tcp_rmem
    4096 16384 4194304
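On Linux, the three numbers in `tcp_wmem` and `tcp_rmem` are the minimum, default, and maximum per-socket buffer sizes in bytes. A small parser makes the auto-tuning ceiling easy to compare with the 16MB - 64MB range above. `TcpMem` and `parse_tcp_mem` are hypothetical helper names, and the input string is the sample output shown on the slide.

```python
from typing import NamedTuple


class TcpMem(NamedTuple):
    minimum: int
    default: int
    maximum: int


def parse_tcp_mem(text):
    """Parse the 'min default max' triple from tcp_wmem or tcp_rmem."""
    low, default, high = (int(field) for field in text.split())
    return TcpMem(low, default, high)


limits = parse_tcp_mem("4096 16384 4194304")
max_mib = limits.maximum / 2**20

print(f"auto-tuning ceiling: {max_mib:.0f} MiB")
```

The sample values cap auto-tuning at 4 MiB per socket, below the 16MB - 64MB range, so a host with these settings would need its limits raised before it could use buffers of that size.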
GFork Memory Manager
- Dynamically rations memory
  - 10% of the allowed connections get 90% of the memory
  - Remaining sessions get half of available memory
- Allows for high connection limits
  - |mem| / 2MB
Conclusions
- GridFTP is an on-demand service
  - OK to say no
- RFT is a VO-level queuing service
  - Please use it
- http://www.gridftp.org
- [email protected]
What is OGSA-DAI?
- OGSA-DAI executes workflows
- OGSA-DAI is not just for data access; it also does data updates, transformations, and delivery
The Globus Replica Location Service
- Distributed registry
  - Records the locations of data copies
  - Allows replica discovery
- RLS maintains mappings between logical identifiers and target names
- Must perform and scale well: support hundreds of millions of objects and hundreds of clients
- Mature and stable component of the Globus Toolkit
[Diagram: a layer of Replica Location Indexes (RLIs) above a layer of Local Replica Catalogs (LRCs).]
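The two-layer lookup can be illustrated with a toy in-memory model: each catalog maps logical names to target names, and an index records which catalogs know a given logical name. This is a sketch under stated assumptions, not the RLS API. The class and method names, the way the index learns about catalogs, and the example names and URLs are all hypothetical.

```python
from collections import defaultdict


class LocalReplicaCatalog:
    """Maps logical identifiers to the target names of local copies."""

    def __init__(self, name):
        self.name = name
        self._targets = defaultdict(set)

    def add(self, logical_name, target_name):
        self._targets[logical_name].add(target_name)

    def lookup(self, logical_name):
        return set(self._targets.get(logical_name, ()))

    def logical_names(self):
        return set(self._targets)


class ReplicaLocationIndex:
    """Records which catalogs hold mappings for each logical name."""

    def __init__(self):
        self._catalogs = defaultdict(list)

    def update_from(self, catalog):
        # Assumed update path: the index pulls the catalog's logical names.
        for logical_name in catalog.logical_names():
            if catalog not in self._catalogs[logical_name]:
                self._catalogs[logical_name].append(catalog)

    def find_replicas(self, logical_name):
        found = set()
        for catalog in self._catalogs.get(logical_name, ()):
            found |= catalog.lookup(logical_name)
        return found


site_a = LocalReplicaCatalog("site-a")
site_b = LocalReplicaCatalog("site-b")
site_a.add("lfn://run42/events", "gsiftp://a.example.org/store/events")
site_b.add("lfn://run42/events", "gsiftp://b.example.org/cache/events")

index = ReplicaLocationIndex()
index.update_from(site_a)
index.update_from(site_b)

replicas = index.find_replicas("lfn://run42/events")
print(sorted(replicas))
```

A client asks the index first and only then touches the catalogs that actually hold the name, which is what lets the registry stay distributed as the number of objects grows.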
Approach: Combine Pegasus Workflow Management with Globus Data Replication Service
[Diagram: workflow planner (Pegasus), data placement service (Globus DRS), compute cluster, and storage elements, connected by arrows labeled workflow tasks, jobs, staging request, setup transfers, and data transfer.]
Examples of Placement Policies