Page 1:

Wide Area Data Replication for Scientific Collaborations

Ann Chervenak, Robert Schuler, Carl Kesselman, USC Information Sciences Institute

Scott Koranda, Univa Corporation

Brian Moe, University of Wisconsin Milwaukee

Page 2:

Motivation

Scientific application domains spend considerable effort managing large amounts of experimental and simulation data.

These domains have developed customized, higher-level Grid data management services. Examples:
• Laser Interferometer Gravitational Wave Observatory (LIGO): Lightweight Data Replicator system
• High Energy Physics projects: EGEE system, gLite, LHC Computing Grid (LCG) middleware
• Portal-based coordination of services (e.g., Earth System Grid)

Page 3:

Motivation (cont.)

Data management functionality varies by application, but these systems share several requirements:
• Publish and replicate large datasets (millions of files)
• Register data replicas in catalogs and discover them
• Perform metadata-based discovery of datasets
• May require the ability to validate the correctness of replicas
• In general, data updates and replica consistency services are not required (i.e., accesses are read-only)

These systems provide production data management services to individual scientific domains. Each project spends considerable resources to design, implement, and maintain its data management system, which typically cannot be re-used by other applications.

Page 4:

Motivation (cont.)

Long-term goals:
• Generalize the functionality provided by these data management systems
• Provide a suite of application-independent services

This paper describes one higher-level data management service: the Data Replication Service (DRS).
• DRS functionality is based on the publication capability of the LIGO Lightweight Data Replicator (LDR) system
• DRS ensures that a set of files exists on a storage site: it replicates files as needed and registers them in catalogs
• DRS builds on lower-level Grid services, including the Globus Reliable File Transfer (RFT) service and the Replica Location Service (RLS)

Page 5:

Outline

• Description of the LDR data publication capability
• Generalization of this functionality: define the characteristics of an application-independent Data Replication Service (DRS)
• DRS design
• DRS implementation in the GT4 environment
• Evaluation of DRS performance in a wide area Grid
• Related work
• Future work

Page 6:

A Data-Intensive Application Example: The LIGO Project

The Laser Interferometer Gravitational Wave Observatory (LIGO) collaboration:
• Seeks to measure gravitational waves predicted by Einstein
• Collects experimental datasets at two LIGO instrument sites, in Louisiana and Washington State
• Replicates these datasets at other LIGO sites
• Scientists analyze the data and publish their results, which may also be replicated
• Currently LIGO stores more than 40 million files across ten locations

Page 7:

The Lightweight Data Replicator

LIGO scientists developed the Lightweight Data Replicator (LDR) system for data management. It is built on top of standard Grid data services:
• the Globus Replica Location Service
• the GridFTP data transport protocol

LDR provides a rich set of data management functionality, including:
• a pull-based model for replicating necessary files to a LIGO site
• efficient data transfer among LIGO sites
• a distributed metadata service architecture
• an interface to local storage systems
• a validation component that verifies that files on a storage system are correctly registered in a local RLS catalog

Page 8:

LIGO Data Publication and Replication

Two types of data publishing:

1. Detectors at Livingston and Hanford produce data sets
• Approximately a terabyte per day during LIGO experimental runs (a rough arithmetic check of these rates appears below)
• Each detector produces a file every 16 seconds
• Files range in size from 1 to 100 megabytes
• Data sets are copied to the main repository at Caltech, which stores them in a tape-based mass storage system
• LIGO sites can acquire copies from Caltech or from one another

2. Scientists also publish new or derived data sets as they perform analysis on existing data sets
• E.g., data filtering or calibration may create new files
• These new files may also be replicated at LIGO sites
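A rough consistency check of these figures (my own arithmetic, not from the slides): one file every 16 seconds is 5,400 files per detector per day, so the quoted terabyte per day implies files toward the upper end of the 1-100 megabyte range.

```python
# Rough consistency check (not from the slides): implied daily data volume,
# assuming decimal megabytes and two detectors producing files continuously.
seconds_per_day = 24 * 60 * 60
files_per_detector_per_day = seconds_per_day // 16            # 5400 files
for avg_file_mb in (1, 50, 100):
    daily_tb = 2 * files_per_detector_per_day * avg_file_mb * 1e6 / 1e12
    print(f"avg {avg_file_mb} MB/file -> {daily_tb:.2f} TB/day from both detectors")
# An average near 90-100 MB per file gives roughly the quoted terabyte per day.
```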

Page 9:

Some Terminology

A logical file name (LFN) is a unique identifier for the contents of a file.
• Typically, a scientific collaboration defines and manages the logical namespace
• This guarantees uniqueness of logical names within that organization

A physical file name (PFN) is the location of a copy of the file on a storage system.
• The physical namespace is managed by the file system or storage system
• (A small illustrative mapping between the two namespaces follows below.)

The LIGO environment currently contains:
• More than six million unique logical files
• More than 40 million physical files stored at ten sites
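To make the terminology concrete, here is a tiny illustrative example (hypothetical names and URLs, not actual LIGO data) of what the RLS catalogs record for the two namespaces:

```python
# A Local Replica Catalog (LRC) maps each logical file name (LFN) to the
# physical file names (PFNs) of the copies stored at that site.
local_replica_catalog = {
    "ligo-frame-0001": [
        "gsiftp://storage.example-site.edu/ligo/frames/ligo-frame-0001",
    ],
}

# A Replica Location Index (RLI) holds only summaries: which LRCs claim to
# have mappings for a given LFN, not the PFNs themselves.
replica_location_index = {
    "ligo-frame-0001": {"lrc.example-site.edu", "lrc.other-site.edu"},
}

def lookup(lfn):
    """Resolve an LFN to local PFNs if present, else report catalogs to ask."""
    if lfn in local_replica_catalog:
        return ("local", local_replica_catalog[lfn])
    return ("remote", replica_location_index.get(lfn, set()))

print(lookup("ligo-frame-0001"))
```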

Page 10:

Components at Each LDR Site

• Local storage system
• GridFTP server for file transfer
• Metadata Catalog: associations between logical file names and metadata attributes
• Replica Location Service:
  • Local Replica Catalog (LRC): stores mappings from logical names to storage locations
  • Replica Location Index (RLI): collects state summaries from LRCs
• Scheduler and transfer daemons, with a prioritized queue of requested files

[Slide diagram: the site storage system and GridFTP server, the Metadata Catalog and Local Replica Catalog backed by a MySQL database, the Replica Location Index, and the scheduler and transfer daemons working from a prioritized list of requested files.]

Page 11:

LDR Data Publishing

A scheduling daemon runs at each LDR site:
• Queries the site's metadata catalog to identify logical files with specified metadata attributes
• Checks the RLS Local Replica Catalog to determine whether copies of those files already exist locally
• If not, puts the logical file names on a priority-based scheduling queue

A transfer daemon also runs at each site:
• Checks the queue and initiates data transfers in priority order
• Queries the RLS Replica Location Index to find sites where the desired files exist
• Randomly selects a source file from among the available replicas
• Uses the GridFTP transport protocol to transfer the file to the local site
• Registers the newly copied file in the RLS Local Replica Catalog

(A minimal sketch of this pull-based loop follows below.)
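To make the publication loop concrete, here is a minimal Python sketch of the two daemons described above. It is only an illustration: the catalogs are plain dictionaries and the transfer callable stands in for GridFTP; none of this is the actual LDR code or a Globus client API.

```python
import random
from queue import PriorityQueue

def scheduling_daemon(desired_lfns, local_replica_catalog, queue):
    """Queue (priority, lfn) pairs for desired files with no local replica.

    desired_lfns: iterable of (priority, lfn) from a metadata catalog query.
    local_replica_catalog: dict mapping lfn -> list of local PFNs.
    """
    for priority, lfn in desired_lfns:
        if not local_replica_catalog.get(lfn):
            queue.put((priority, lfn))          # lower number = higher priority

def transfer_daemon(queue, replica_location_index, remote_catalogs,
                    local_replica_catalog, transfer, local_root):
    """Pull queued files from a randomly chosen remote replica, then register them.

    replica_location_index: dict mapping lfn -> list of sites holding it.
    remote_catalogs: dict mapping site -> {lfn: [pfn, ...]}.
    transfer: callable (source_pfn, dest_pfn), standing in for a GridFTP copy.
    """
    while not queue.empty():
        _, lfn = queue.get()
        sites = replica_location_index.get(lfn, [])
        candidates = [pfn for s in sites for pfn in remote_catalogs[s].get(lfn, [])]
        if not candidates:
            continue                            # no known replica anywhere
        source_pfn = random.choice(candidates)  # random replica selection, as in LDR
        dest_pfn = f"file://{local_root}/{lfn}"
        transfer(source_pfn, dest_pfn)          # GridFTP in the real system
        local_replica_catalog.setdefault(lfn, []).append(dest_pfn)  # register copy
```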

Page 12:

Generalizing the LDR Publication Scheme

We want to provide a similar capability that is:
• Independent of the LIGO infrastructure
• Useful for a variety of application domains

Capabilities include:
• An interface to specify which files are required at the local site
• Use of the Globus RLS to discover whether replicas exist locally and where they exist in the Grid
• Use of a selection algorithm to choose among available replicas (a pluggable selection sketch follows below)
• Use of the Globus Reliable File Transfer service and the GridFTP data transport protocol to copy data to the local site
• Use of the Globus RLS to register new replicas
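The replica-selection step can be treated as a pluggable policy. The sketch below shows the idea with hypothetical selector functions: the random strategy mirrors LDR's behavior, while the preferred-site variant is purely illustrative and not something the paper describes.

```python
import random
from typing import Callable, List

# A replica selector maps an LFN and its candidate PFNs to one chosen PFN.
ReplicaSelector = Callable[[str, List[str]], str]

def random_selector(lfn: str, candidates: List[str]) -> str:
    """LDR-style selection: pick a source replica uniformly at random."""
    return random.choice(candidates)

def make_preferred_site_selector(preferred_hosts: List[str]) -> ReplicaSelector:
    """Illustrative alternative: prefer replicas on given hosts, else fall back."""
    def select(lfn: str, candidates: List[str]) -> str:
        for host in preferred_hosts:
            for pfn in candidates:
                if host in pfn:
                    return pfn
        return random.choice(candidates)
    return select

# Example use (hypothetical hosts and URLs):
# selector = make_preferred_site_selector(["storage.example-site.edu"])
# pfn = selector("lfn-001",
#                ["gsiftp://storage.example-site.edu/data/lfn-001",
#                 "gsiftp://storage.other-site.edu/data/lfn-001"])
```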

Page 13:

Relationship to Other Globus Services

At the requesting site, deploy:

WS-RF services:
• Data Replication Service
• Delegation Service
• Reliable File Transfer Service

Pre-WS-RF components:
• Replica Location Service (Local Replica Catalog, Replica Location Index)
• GridFTP Server

[Slide diagram: the local site's Web service container hosting the Data Replication Service (with a Replicator resource), the Reliable File Transfer Service (with an RFT resource), and the Delegation Service (with a delegated credential), alongside the Local Replica Catalog, Replica Location Index, and GridFTP server.]

Page 14:

DRS Functionality

Initiating a DRS request involves the following steps:
• Create a delegated credential
• Create a Replicator resource
• Monitor the Replicator resource
• Discover replicas of the desired files in RLS and select among them
• Transfer data to the local site with the Reliable File Transfer Service
• Register the new replicas in RLS catalogs
• Allow client inspection of DRS results
• Destroy the Replicator resource

DRS is implemented in Globus Toolkit Version 4 and complies with the Web Services Resource Framework (WS-RF). A sketch of this client-side workflow follows below.
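A rough sketch of that client-side sequence, assuming hypothetical Python wrappers (`delegation_client`, `drs_client`) around the WS-RF operations rather than the real GT4 client libraries:

```python
import time

def run_drs_request(delegation_client, drs_client, requests, lifetime_s=3600,
                    poll_interval_s=10):
    """Sketch of the DRS client-side workflow; the two client objects are
    hypothetical wrappers for the WS-RF operations named on this slide.

    requests: list of (logical_file_name, destination_url) pairs.
    """
    # 1. Delegate a proxy credential and get back its endpoint reference (EPR).
    credential_epr = delegation_client.delegate(lifetime=lifetime_s)

    # 2. Create a Replicator resource, passing the credential EPR and the
    #    replication requests; set its termination time.
    replicator_epr = drs_client.create_replicator(
        requests=requests,
        credential_epr=credential_epr,
        termination_time=time.time() + lifetime_s)

    # 3. Poll the "Status" resource property until the request finishes
    #    (a subscription to RP change notifications would also work).
    while drs_client.get_resource_property(replicator_epr, "Status") != "Finished":
        time.sleep(poll_interval_s)

    # 4. Inspect per-file results, then destroy the resource explicitly
    #    rather than waiting for the termination time to expire.
    results = drs_client.get_resource_property(replicator_epr, "Result")
    drs_client.destroy(replicator_epr)
    return results
```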

Page 15:

WSRF in a Nutshell

• Service state management: Resource, Resource Property (RP)
• State identification: Endpoint Reference (EPR)
• State interfaces: GetRP, QueryRPs, GetMultipleRPs, SetRP
• Lifetime interfaces: SetTerminationTime, ImmediateDestruction
• Notification interfaces: Subscribe, Notify
• ServiceGroups

[Slide diagram: a client holding an EPR invokes the service's GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, and Destroy operations against a resource and its resource properties.]

(A minimal in-memory sketch of this resource pattern follows below.)
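The pattern can be illustrated with a small in-memory stand-in. The classes below are only a conceptual sketch of resources, resource properties, and lifetimes; they are not the WSRF specification or the GT4 implementation.

```python
import time
import uuid

class Resource:
    """Minimal stand-in for a WS-RF resource: named resource properties (RPs)
    plus a termination time."""
    def __init__(self, properties, lifetime_s):
        self.properties = dict(properties)
        self.termination_time = time.time() + lifetime_s

class ResourceHome:
    """Creates resources and hands out opaque keys playing the role of EPRs."""
    def __init__(self):
        self._resources = {}

    def create(self, properties, lifetime_s=3600):
        epr = str(uuid.uuid4())                       # stand-in for an endpoint reference
        self._resources[epr] = Resource(properties, lifetime_s)
        return epr

    def get_rp(self, epr, name):                      # GetResourceProperty
        return self._resources[epr].properties[name]

    def set_rp(self, epr, name, value):               # SetResourceProperty
        self._resources[epr].properties[name] = value

    def set_termination_time(self, epr, when):        # SetTerminationTime
        self._resources[epr].termination_time = when

    def destroy(self, epr):                           # ImmediateDestruction
        del self._resources[epr]

    def sweep_expired(self):
        """Destroy resources whose termination time has passed."""
        now = time.time()
        for epr in [e for e, r in self._resources.items() if r.termination_time < now]:
            del self._resources[epr]
```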

Page 16:

Create Delegated Credential

[Slide diagram, repeated on the following slides: a client interacting with the local site's Web service container, which hosts the Delegation, Data Replication, and RFT services, alongside the RLS Replica Index and Replica Catalogs, GridFTP servers, and the MDS Index.]

• Initialize the user proxy certificate
• Create the delegated credential resource and set its termination time
• The credential EPR is returned to the client

Page 17:

Create Replicator Resource

• Create the Replicator resource, passing the delegated credential EPR, and set its termination time
• The Replicator EPR is returned to the client
• The Replicator accesses the delegated credential resource

Page 18:

Monitor Replicator Resource

• The client periodically polls the Replicator RPs via GetRP or GetMultipleRPs
• The Replicator resource is added to the MDS Information service Index
• Subscriptions are registered for ResourceProperty changes to the "Status" RP and the "Stage" RP
• Conditions may trigger alerts or other actions (Trigger service not pictured)

(A polling and subscription sketch follows below.)
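A small sketch of the two monitoring styles, polling a resource property versus reacting to change notifications, using hypothetical callables and a toy subscription class rather than real GT4 client calls:

```python
import time

def poll_until(get_rp, epr, rp_name, target_values, interval_s=10, timeout_s=3600):
    """Poll one resource property (GetRP-style) until it reaches a target value.

    get_rp is a hypothetical callable (epr, rp_name) -> value standing in for
    the WS-RF GetResourceProperty operation.
    """
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        value = get_rp(epr, rp_name)
        if value in target_values:
            return value
        time.sleep(interval_s)
    raise TimeoutError(f"{rp_name} did not reach {target_values} in time")

class RPSubscription:
    """Toy stand-in for WS-RF notification: callbacks fire when an RP changes."""
    def __init__(self):
        self._subscribers = {}          # rp_name -> list of callbacks

    def subscribe(self, rp_name, callback):
        self._subscribers.setdefault(rp_name, []).append(callback)

    def notify(self, rp_name, new_value):
        for callback in self._subscribers.get(rp_name, []):
            callback(rp_name, new_value)

# Example: react when the Replicator's "Stage" RP changes.
# subs = RPSubscription()
# subs.subscribe("Stage", lambda rp, v: print(f"{rp} is now {v}"))
# subs.notify("Stage", "transfer")
```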

Page 19:

Query Replica Information

• Notification that the "Stage" RP value has changed to "discover"
• The Replicator queries the RLS Replica Index to find catalogs that contain the desired replica information
• The Replicator queries the RLS Replica Catalog(s) to retrieve mappings from logical names to target names (URLs)

(A discovery sketch follows below.)
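A minimal sketch of this two-step discovery, with the index modeled as plain Python data and a hypothetical `catalog_lookup` callable standing in for a query against a remote Replica Catalog:

```python
def discover_replicas(lfns, replica_index, catalog_lookup):
    """Sketch of the two-step RLS discovery performed in the "discover" stage.

    replica_index: dict mapping lfn -> list of catalog identifiers (the index view).
    catalog_lookup: hypothetical callable (catalog_id, lfn) -> list of PFN URLs,
    standing in for a query against a remote Replica Catalog.
    Returns a dict lfn -> list of candidate source URLs.
    """
    candidates = {}
    for lfn in lfns:
        urls = []
        for catalog_id in replica_index.get(lfn, []):      # step 1: query the index
            urls.extend(catalog_lookup(catalog_id, lfn))   # step 2: query each catalog
        candidates[lfn] = urls
    return candidates

# Example with toy data (hypothetical names and URLs):
# index = {"lfn-001": ["lrc.anl.example.org"]}
# lrcs = {("lrc.anl.example.org", "lfn-001"): ["gsiftp://host.anl.example.org/data/lfn-001"]}
# print(discover_replicas(["lfn-001"], index, lambda c, l: lrcs.get((c, l), [])))
```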

Page 20:

Transfer Data

• Notification that the "Stage" RP value has changed to "transfer"
• The Replicator creates an RFT Transfer resource, passing the credential EPR and setting its termination time; the Transfer resource EPR is returned
• The Transfer resource accesses the delegated credential resource
• RFT sets up the GridFTP server transfer of the file(s)
• Data is transferred between the GridFTP server sites
• The Replicator periodically polls the "ResultStatus" RP via GetRP; when it is "Done", it gets state information for each file transfer

(A transfer-stage sketch follows below.)
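A sketch of the transfer stage, again assuming a hypothetical `rft_client` wrapper for the RFT WS-RF operations (create a Transfer resource with the delegated credential, poll its "ResultStatus" property, then collect per-file state):

```python
import time

def run_transfer_stage(rft_client, credential_epr, transfer_pairs,
                       lifetime_s=3600, poll_interval_s=10):
    """Sketch of the "transfer" stage; rft_client is a hypothetical wrapper
    around the RFT WS-RF operations, not the GT4 client API.

    transfer_pairs: list of (source_url, destination_url) GridFTP URLs.
    Returns per-file transfer state once the "ResultStatus" RP reads "Done".
    """
    # Create the RFT Transfer resource with the delegated credential EPR.
    transfer_epr = rft_client.create_transfer(
        transfers=transfer_pairs,
        credential_epr=credential_epr,
        termination_time=time.time() + lifetime_s)

    # Poll the "ResultStatus" resource property until the request completes.
    while rft_client.get_resource_property(transfer_epr, "ResultStatus") != "Done":
        time.sleep(poll_interval_s)

    # Retrieve state information for each file transfer in the request.
    return rft_client.get_transfer_status(transfer_epr)
```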

Page 21:

Register Replica Information

• Notification that the "Stage" RP value has changed to "register"
• The Replicator registers the new file mappings in the RLS Replica Catalog
• The RLS Replica Catalog sends an update of the new replica mappings to the Replica Index

(A registration sketch follows below.)
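A minimal sketch of the registration step, with the catalog and index modeled as in-memory structures; in the real system the catalog sends its update of new mappings to the index asynchronously rather than as an immediate call.

```python
def register_replicas(new_mappings, local_replica_catalog, replica_index,
                      catalog_id="local-lrc"):
    """Sketch of the "register" stage with in-memory stand-ins for RLS state.

    new_mappings: dict lfn -> destination PFN just written by the transfer stage.
    local_replica_catalog: dict lfn -> list of PFNs (the site's Replica Catalog).
    replica_index: dict lfn -> set of catalog identifiers (the Replica Index view).
    """
    for lfn, pfn in new_mappings.items():
        # The Replicator adds the lfn -> pfn mapping to the local Replica Catalog.
        local_replica_catalog.setdefault(lfn, []).append(pfn)
        # The catalog's update makes the index aware that this catalog now
        # holds a replica of lfn.
        replica_index.setdefault(lfn, set()).add(catalog_id)
```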

Page 22:

Client Inspection of State

• Notification that the "Status" RP value has changed to "Finished"
• The client inspects the Replicator's state information for each replication in the request

Page 23:

Resource Termination

• The termination times (set by the client) eventually expire
• The resources are destroyed (Credential, Transfer, Replicator)

Page 24:

Performance Measurements: Wide Area Testing

The destination for the pull-based transfers is located in Los Angeles:
• Dual-processor, 1.1 GHz Pentium III workstation with 1.5 gigabytes of memory and 1 Gbit Ethernet
• Runs a GT4 container and deploys services including RFT and DRS, as well as GridFTP and RLS

The remote site where the desired data files are stored is located at Argonne National Laboratory in Illinois:
• Dual-processor, 3 GHz Intel Xeon workstation with 2 gigabytes of memory and 1.1 terabytes of disk
• Runs a GT4 container as well as GridFTP and RLS services

Page 25:

DRS Operations Measured

• Create the DRS Replicator resource
• Discover source files for replication using the local RLS Replica Location Index and remote RLS Local Replica Catalogs
• Initiate a Reliable File Transfer operation by creating an RFT resource
• Perform the RFT data transfer(s)
• Register the new replicas in the RLS Local Replica Catalog

Page 26:

Experiment 1: Replicate 10 Files of Size 1 Gigabyte

Component of Operation        Time (milliseconds)
Create Replicator Resource                  317.0
Discover Files in RLS                       449.0
Create RFT Resource                         808.6
Transfer Using RFT                      1186796.0
Register Replicas in RLS                   3720.8

• Data transfer time dominates
• Wide area data transfer rate of 67.4 Mbits/sec

Page 27:

Experiment 2: Replicate 1000 Files of Size 10 Megabytes

Component of Operation        Time (milliseconds)
Create Replicator Resource                 1561.0
Discover Files in RLS                         9.8
Create RFT Resource                        1286.6
Transfer Using RFT                       963456.0
Register Replicas in RLS                  11278.2

• Time to create the Replicator and RFT resources is larger, since state must be stored for 1000 outstanding transfers
• Data transfer time still dominates
• Wide area data transfer rate of 85 Mbits/sec

(The arithmetic behind these rates is sketched below.)
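As a sanity check (my arithmetic, not from the slides), the short sketch below recomputes the quoted transfer rates from the table values, assuming decimal file sizes; the Experiment 2 figure comes out slightly below the reported 85 Mbits/sec, which is within the uncertainty of the unit convention.

```python
# Back-of-the-envelope check of the reported wide-area transfer rates,
# assuming decimal units (1 GB = 1e9 bytes, 1 MB = 1e6 bytes).

def rate_mbits_per_sec(total_bytes, transfer_ms):
    return (total_bytes * 8) / (transfer_ms / 1000.0) / 1e6

# Experiment 1: 10 files of 1 gigabyte, transferred in 1186796 ms.
print(rate_mbits_per_sec(10 * 1e9, 1186796.0))    # ~67.4 Mbits/sec

# Experiment 2: 1000 files of 10 megabytes, transferred in 963456 ms.
print(rate_mbits_per_sec(1000 * 10e6, 963456.0))  # ~83 Mbits/sec (reported as 85)
```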

Page 28:

Future Work

We will continue performance testing of DRS:
• Increasing the size of the files being transferred
• Increasing the number of files per DRS request

We will add and refine DRS functionality as it is used by applications:
• E.g., add a push-based replication capability

We plan to develop a suite of general, configurable, composable, high-level data management services.