Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 ·...

21
Globus Toolkit Managed GridFTP Raj Kettimuthu Argonne National Laboratory and The University of Chicago

Transcript of Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 ·...

Page 1: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

Globus Toolkit

Managed GridFTP Raj Kettimuthu Argonne National Laboratory and The University of Chicago

Page 2: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

•  High-performance, secure data transfer protocol optimized for high-bandwidth wide-area networks

•  Based on FTP protocol - defines extensions for high-performance operation and security

•  Multiple independent implementations can interoperate –  Fermi Lab and U. Virginia have home grown

servers that work with ours

GridFTP

2

Page 3: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

GridFTP

•  Two channel protocol like FTP •  Control Channel

–  Communication link (TCP) over which commands and responses flow

–  Low bandwidth; encrypted and integrity protected by default

•  Data Channel –  Communication link(s) over which the

actual data of interest flows –  High Bandwidth; authenticated by

default; encryption and integrity protection optional

3

Page 4: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

•  Parallel TCP streams, optimal TCP buffer •  Non TCP protocol such as UDT •  Cluster-to-cluster data movement

•  SSH, Grid Security Infrastructure (GSI) •  Transfer checkpointing

Globus GridFTP

4

Page 5: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

GridFTP Servers Around the World

5

Created by Tim Pinkawa (Northern Illinois University) using MaxMind's GeoIP technology (http://www.maxmind.com/app/ip-locate).

Page 6: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

GridFTP Usage

6

Page 7: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

GridFTP: On Demand Service

  Transfer requests happen immediately   We do not queue, or delay transfers   An established session means an active

transfer   Transfer data as fast as possible   Resources are limited

  Data transfers are heavy weight operations   Sometimes hardware is too busy

  Adding another transfer can cause thrashing   Collective system throughput goes down

7

Page 8: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

Why Doesn’t GridFTP Queue

  Backward compatibility with legacy FTP   Even for an idle session

  Active TCP control channel   Part of the 959 protocol.   A session is defined by a TCP connection

  Fork/setuid process   Robustness   File system/OS permissions

  OS buffer space   Data channels require large TCP OS buffers

8

Page 9: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

If GridFTP Always Said Yes

  OOM: the out of memory handle   OS optimistic provision of TCP buffers   Random processes will be killed   Meltdown

  Shared FS overuse   Pushing the I/O throughput beyond optimal   Causing OOM on IOD machines

  Shares of bandwidth too small   1 Million transfers at 500b/s each?   OR 10 transfers at 100Mb/s each

9

Page 10: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

Simultaneous Sessions

  Goal: Collective throughput   Entire servers bytes transferred / time

  Not the number of transfers at once

  Only reasons for more than 1 connection   Provide an interactive service for many   One session does not use all of the local

resource   The remote side is the bottleneck

  Hide control messaging overhead in another sessions data transfer payload

10

Page 11: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

Resource Protection

  GridFTP   Connection rejection is a feature

  It SHOULD say no   Intended to scale to system transfer rates   To scale up add more data nodes

  Limits need to be in place to protect   Knowing it is ok to say ‘no’ is step 1

11

Page 12: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

Connection Caps

  As a function of system memory   Cap = |mem| / (2MB +

avg(BWDP))‏   Never more than

|mem| / 4MB   As a function of system

bandwidth   Cap = min(FS.BW,

Net.BW) / (Target average transfer rate)‏

service gsiftp { instances = 20 socket_type = stream wait = no

env += GLOBUS_LOCATION=… env += LD_LIBRARY_PATH=… server = /usr/local/globus-5.0.3/sbin/globus-gridftp-server server_args = -i -p 2811

disable = no }

% globus-gridftp-server –connection-max 20

12

Page 13: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

Resource Control

  To access the server   A user must be authenticated   Have read and write permissions and   Respect the total connection limit

  But beyond these requirements, there is no management or control   A user can hold a connection open

indefinitely   Move an unlimited number of files (barring

disk space or system quota constraints).   A more flexible management is needed

limit, prioritize and control 13

Page 14: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org 14

Server  Host  

Inetd/  Xinted  

GridFTP  Server    

Instance  

Fork  

GridFTP  Server    

Instance  

GridFTP    Server    Instance  

Client  

Inherited    

Links  

Control  Channel  Connec;ons  

Client  

Client  

Inetd/Xinted

Page 15: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org 15

Server  Host  

GFork  Server  

Master  Plugin  

GridFTP  Server    

Instance  

Fork  

GridFTP  Server    

Instance  

GridFTP    Server    Instance  

State    Sharing      Link  

Client  

Inherited    

Links  

Control  Channel  Connec;ons  

Client  

Client  

Globus Fork (GFork)

Page 16: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

GFork Memory Manager

  Dynamically rations memory   10% of the allowed connections get 90% of

the memory   Remaining session get half of available

memory   Allows for high connection limits

  |mem| / 2MB   This limitation is different from the original

connection limitation   Based on amount of used memory, not a

static value based on total system memory.

16

Page 17: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org 17

Performance with Memory Management

Performance without Memory Management

Memory Management

Page 18: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org 18

Control

Client

DC

Data

Control

IPC

Data Data Data

Data Data Data Data

DC

DC

DC

IPC

10x Gbit/s

1 Gbit/s 1 Gbit/s 1 Gbit/s 1 Gbit/s

Striped Data Movement

Page 19: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org 19

   

Control node

GFork

Server

GridFTP

Plugin

Frontend

Instance

Fork

Lookup available backend

Backend

Instance

Data Mover

Fork

Registration

Control Connection GFork

Server

GridFTP

Plugin

Dynamic Data Movers

Page 20: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

Future work

  This is a start   More sophisticated resource management

capabilities are needed   Better than best-effort service   Service guarantees

20

Page 21: Globus Toolkit - Argonne National Laboratorykettimut/talks/Managed_GridFTP.pdf · 2011-05-25 · • High-performance, secure data transfer protocol optimized for high-bandwidth wide-area

www.globustoolkit.org

Questions?

21