1 Peer-to-Peer Approaches to Grid Resource Discovery Ann
Chervenak University of Southern California Information Sciences
Institute Joint work with Shishir Bharathi, Min Cai
Slide 2
2 Resource Discovery In Grids Applications running in wide area
distributed environments or Grids need information about resources
and services Resource information has an impact on Scheduling
decisions Replica selection Planning for workflow execution (e.g.,
Pegasus) Resource discovery services need to be Highly available,
fault tolerant, reliable Highly scalable Provide flexible query
interfaces Typical Grid resource discovery services are organized
as hierarchies of index services Applying P2P techniques offers
promise of self-organization, self-healing, improved
scalability
Slide 3
3 Outline Resource Discovery in Grids: current approaches
Peer-to-Peer Resource Discovery Applying P2P techniques to Grid
Resource Discovery Unstructured P2P Information Service Joint work
with Shishir Bharathi Structured P2P Replica Location Service Joint
work with Min Cai Summary and Future Work
Slide 4
4 Currently, most services for resource discovery in Grids are
query-based index services One or more indexes aggregate
information about resources Often distributed using a hierarchical
structure Service provides a front end query interface Database
back end stores resource information Service responds to queries by
identifying resources that match desired properties Scalable Hold
large amounts of resource information Support high query rates
Often specialized for particular types of resource information
and/or queries Specialized query APIs, resource schemas, etc.
Typical Grid Resource Discovery
Slide 5
5 Index-based Discovery Services Globus Monitoring and
Discovery System (MDS) Hierarchical, distributed Index Service
Aggregates information about resources (CPUs, storage systems,
etc.) Answers queries for resources with specified properties
Typically close to resources, whose information may change
frequently (e.g., index co-located with a cluster)
Slide 6
6 Globus Replica Location Service (RLS) Hierarchical,
distributed index Provides mappings from logical names for data to
physical locations of replicas Metadata Catalog Service (MCS)
Centralized database of metadata attributes associated with data
items Answers queries for resources with specified metadata
characteristics Storage Resource Broker MCAT Centralized (or
partitioned) catalog with metadata, replica location and resource
information Index-based Discovery Services (Cont.)
Slide 7
7 Grids are growing larger Increasing number of resources and
resource discovery service instances Organization of resource
discovery services is challenging Creation/maintenance of efficient
hierarchies Avoiding hotspots, eliminating update cycles Much of
the configuration and maintenance of these services is done
manually Few capabilities for self-configuration or self-healing
Limits scalability Makes services complex to deploy and maintain
Goal: Use peer-to-peer techniques to make services more
self-configuring, reliable and scalable Challenges for Resource
Discovery Services
Slide 8
8 Service instances create an overlay network Queries and
responses forwarded/routed in the overlay Structured overlay Chord,
Pastry, etc. Distributed Hash Table (DHT) based Effective when
storing/retrieving {key, value} pairs Strong bounds on performance Unstructured
overlay Gnutella, KaZaA Effective when querying on attributes Not
DHT based Flooding algorithms Hybrid approaches also possible Good
scalability, self-organization, reliability, self-healing
Peer-to-Peer Systems
Slide 9
9 Structured P2P Networks Maintain a structured overlay network
among peers and use message routing Basic functionality: lookup
(key), which returns the identity of the node storing the object
with that key Often based on Distributed Hash Table (DHT) Objects
are associated with a key that can be produced by hashing the
object name Nodes have identifiers that share the same space as
keys Each node is responsible for storing a range of keys and
corresponding objects Nodes maintain an overlay network, with each
node having several other nodes as neighbors When a lookup (key)
request is issued from one node, the lookup message is routed
through the overlay network to the node responsible for the
key
Slide 10
10 Structured P2P Networks (cont.) Different DHT systems
construct a variety of overlay networks and employ different
routing algorithms They can guarantee to finish a lookup operation
in O(log N) or O(d·N^(1/d)) hops Each node maintains only
information about O(log N) or d neighbors for an N-node network (where
d is the dimension of the hypercube organization of the network) So
DHT systems provide good scalability as well as fault tolerance DHT
systems include Pastry, Chord, CAN and Koorde
Slide 11
11 Example: Chord Structured P2P System Chord algorithm
proposed by Stoica, et al. Chord uses a one-dimensional circular
identifier space modulo 2^m for both node identifiers and
object keys Every node in Chord is assigned a unique m-bit
identifier by hashing its IP address and port number All nodes
self-organize into a ring topology based on their node identifiers
in the circular space Each object is also assigned a unique m-bit
identifier called its object key Object keys are assigned to nodes
by using consistent hashing Key k is assigned to the first node
whose identifier is equal to or follows the identifier of k in the
circular space This node stores the object with key k and is called
its successor node
Slide 12
12 An Example of Chord Network N4 N8 N20 N24 N40 N48 N54 N60
Key18 Key31 Key52
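The successor rule illustrated above can be sketched in a few lines of Python (a minimal sketch; the node IDs and keys are the examples from this slide, on a 64-identifier circle):

```python
# Sketch of Chord's consistent-hashing rule: each key is stored on its
# successor -- the first node whose ID is equal to or follows the key
# on the identifier circle (wrapping around at the top).

def successor(key, node_ids):
    """Return the node responsible for `key` on the identifier circle."""
    nodes = sorted(node_ids)
    for n in nodes:
        if n >= key:
            return n
    return nodes[0]  # wrap around the circle

nodes = [4, 8, 20, 24, 40, 48, 54, 60]
print(successor(18, nodes))  # Key18 -> N20
print(successor(31, nodes))  # Key31 -> N40
print(successor(52, nodes))  # Key52 -> N54
```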
Slide 13
13 Each Chord node maintains two sets of neighbors: its
successors and its fingers Successor nodes immediately follow the
node in the identifier space Finger nodes are spaced exponentially
around the identifier space Each node has a constant number of
successors and at most m fingers The i-th finger for the node with
identity n is the first node that succeeds n by at least 2^(i-1) on
the identifier circle, where 1 ≤ i ≤ m
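The finger rule can be sketched as follows (an illustrative sketch reusing the 64-identifier ring and node IDs from the example slide, with m = 6):

```python
def finger_targets(n, m):
    """Identifier targets of node n's fingers: n + 2^(i-1) mod 2^m, 1 <= i <= m."""
    return [(n + 2 ** (i - 1)) % 2 ** m for i in range(1, m + 1)]

def finger_table(n, node_ids, m):
    """The i-th finger is the successor of n + 2^(i-1) on the circle."""
    nodes = sorted(node_ids)
    return [next((x for x in nodes if x >= t), nodes[0])
            for t in finger_targets(n, m)]

nodes = [4, 8, 20, 24, 40, 48, 54, 60]
# Node N8's finger targets are 9, 10, 12, 16, 24, 40; the fingers are
# their successors -- exponentially spaced around the identifier space.
print(finger_table(8, nodes, 6))  # [20, 20, 20, 20, 24, 40]
```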
15 Chord (cont.) When node n wants to lookup the object with
key k, it will route a lookup request to the successor node of key
k If the successor node is far away from n, node n forwards the
request to the finger node whose identifier most immediately
precedes the successor node of key k By repeating this process, the
request gets closer and closer to the successor node Eventually,
the successor node receives the lookup request for the object with
key k, finds the object locally and sends the result back to node n
Each hop from one node to the next node covers at least half the
identifier space (clockwise) between that node and the successor
node of key k Number of routing hops for a lookup is O(log N) for a
Chord network with N nodes Insertion time also O(log N) Each node
maintains pointers to O(log N) neighbors
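The greedy finger-based routing described above can be simulated with global knowledge of the ring (a real Chord node knows only its own fingers and successors; this sketch recomputes finger tables on the fly for brevity, on the 64-identifier example ring):

```python
def successor(key, nodes):
    """First node whose ID is equal to or follows the key on the circle."""
    nodes = sorted(nodes)
    return next((n for n in nodes if n >= key), nodes[0])

def between(x, a, b, size):
    """True if x lies strictly in the circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

def route(start, key, nodes, m):
    """Return the list of hops a lookup for `key` takes from `start`."""
    size = 2 ** m
    dest = successor(key, nodes)
    hops, n = [start], start
    while n != dest:
        # fingers of n: successors of n + 2^i for i = 0 .. m-1
        fingers = {successor((n + 2 ** i) % size, nodes) for i in range(m)}
        preceding = [f for f in fingers if between(f, n, key, size)]
        if preceding:
            # forward to the finger most immediately preceding the key
            n = max(preceding, key=lambda f: (f - n) % size)
        else:
            n = dest  # the successor pointer covers the final hop
        hops.append(n)
    return hops

nodes = [4, 8, 20, 24, 40, 48, 54, 60]
print(route(8, 52, nodes, 6))  # [8, 40, 48, 54] -- each hop halves the distance
```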
17 Unstructured P2P Systems An unstructured P2P network usually
does not impose any constraints on links between nodes in the
system Choice of neighbors to peer with is less restrictive and is
often probabilistic or randomized Unstructured overlays do not
create associations between nodes or links in the system and the
information stored in those nodes Do not require that information
adhere to a particular format or be tied to the structure of the
overlay Unlike DHT systems that store {key, value} pairs and hash on the key
Information is usually stored only at the node where it was
generated or replicated in a probabilistic manner Query-response
pathways are also not well defined Queries are propagated in the
system using flooding based algorithms Responses are routed back on
the same path as the queries
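The flooding behavior above can be sketched with a toy overlay (a minimal sketch; the graph and time-to-live values are illustrative, and duplicate messages are suppressed with a shared `seen` set standing in for message-ID tracking):

```python
def flood(graph, start, ttl, matches, seen=None):
    """Naive TTL-bounded flooding: return the nodes whose content matched.
    `graph` maps node -> neighbor list; `matches` maps node -> bool."""
    if seen is None:
        seen = set()
    seen.add(start)
    hits = [start] if matches[start] else []
    if ttl > 0:
        for nb in graph[start]:
            if nb not in seen:
                hits += flood(graph, nb, ttl - 1, matches, seen)
    return hits

graph = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C']}
matches = {'A': False, 'B': False, 'C': False, 'D': True}
print(flood(graph, 'A', 2, matches))  # ['D'] -- reached within the TTL
print(flood(graph, 'A', 1, matches))  # []    -- TTL too small; no guarantee
```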
Slide 18
18 Unstructured P2P Networks (cont.) Cannot provide guarantees
on query performance Don't know the number of hops taken by a query
message to reach a node that can answer the query Unlike structured
overlays Cannot guarantee that results will be returned if they
exist in the network The time-to-live field in the message dictates
how far the message travels in the network Message may not reach
all nodes Applications must be capable of dealing with these issues
as they do with other failure modes
Slide 19
19 Unstructured P2P Networks (cont.) Examples of unstructured
P2P systems: Napster, Gnutella and Kazaa Successful in internet
file sharing applications Allow peers to host content, discover
content on other peers, and download that content Popular in the
Internet community despite known disadvantages: Vulnerability of
central indexes in Napster High network loads imposed by Gnutella's
flooding algorithms Optimizations of unstructured systems have been
developed based on file and query distributions and on the use of
replication and caching
Slide 20
20 Outline Resource Discovery in Grids: current approaches
Peer-to-Peer Resource Discovery Applying P2P techniques to Grid
Resource Discovery Unstructured P2P Information Service Joint work
with Shishir Bharathi Structured P2P Replica Location Service Joint
work with Min Cai Summary and Future Work
Slide 21
21 P2P technologies successful in internet file sharing
applications Gnutella, Kazaa, Napster, etc. Allow peers to host
content, discover content and download Grid resource discovery
services have similar requirements Would like to use peer-to-peer
technologies for resource discovery in Grids Improved
configuration, scalability, etc. Convergence of P2P and Grid has
been predicted But not yet a reality Applying Peer-to-Peer
Techniques to Grid Resource Discovery
Slide 22
22 Performance May require multiple network hops to resolve a
query Some P2P overlays distribute resource information widely
(e.g., structured overlays) Still want to make use of specialized
Grid indexes Security issues Access to resources and information
about resources may need to be controlled Need a security model
that allows us to use P2P safely Practical Issues Has taken several
years to make Grid discovery services scalable, stable To support
greater scalability, need further improvements in simple and
dynamic deployment Challenges in Applying P2P Technologies to
Grids
Slide 23
23 Explore organizing an existing grid information service (GT4
Index Service ) as a peer-to-peer system Background Grid
Information Services GT4 Index Service P2P Index Service Design
Issues and design choices Optimizations Implementation Experimental
results Design of a Scalable Peer-to-Peer Information System Using
the GT4 Index Service, Shishir Bharathi and Ann Chervenak, CCGrid
2007 Conference, May 2007 P2P Grid Information Service
Slide 24
24 Collect information about resources skynet.isi.edu has 96
nodes GridFTP runs on port 2811 Avg. load on skynet-login.isi.edu
is 2.55 Aggregate Information from multiple types of resources and
services Queried by schedulers, workflow planners, clients that
need information about current resource state Process different
types of queries What port does GridFTP run on? One response
expected What servers have load < 3.0? Expect multiple
responses, information-gathering step Grid Information Services
Slide 25
25 Globus Toolkit Version 4 Index Service Part of the
Monitoring and Discovery System (MDS) Aggregates information about
resources, responds to queries Issues: Designing efficient
hierarchies, avoiding hot spots, avoiding update cycles, scaling
with amount of information Dealing with multiple administrative
domains Organization of GT4 Indexes
Slide 26
26 WS-RF service, part of the Monitoring and Discovery System
(MDS) Information Providers generate resource information in XML
format E.g. Hawkeye, Ganglia, GridFTP Aggregator sources aggregate
information Index Service publishes aggregated information in a
single Resource Property document Processes XPath queries and
returns matching XML content GT4 Index Service
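An XPath query against an aggregated resource-property document might look like the following (an illustrative sketch using Python's limited XPath support; the XML below is invented for the example and is not the actual MDS schema):

```python
import xml.etree.ElementTree as ET

# Hypothetical aggregated resource-property document.
doc = """
<ResourceProperties>
  <Service name="GridFTP" port="2811"/>
  <Host name="skynet-login.isi.edu" load="2.55"/>
</ResourceProperties>
"""
root = ET.fromstring(doc)

# An XPath-style query: which port does GridFTP run on?
port = root.find(".//Service[@name='GridFTP']").get("port")
print(port)  # 2811
```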
Slide 27
27 Modified GT4 Index Services to create P2P Index Service P2P
Indexes organize themselves into an unstructured overlay Each P2P
Index can be accessed both via the overlay and as a standalone GT4
Index Service Design of P2P GT4 Index Service
Slide 28
28 Overlay used only for self-organization and for forwarding
and routing of queries and responses Continue to use specialized
MDS4 Index Services that aggregate resource information Resource
information is not stored or updated via the overlay Each P2P Index
acquires resource information via an out-of-band mechanism
Policies may dictate when and how indexes are updated Resource
information is not replicated via the overlay May change quickly,
so replication is not effective Separate storing of information
from querying for information Design of P2P GT4 Index Service
(cont.)
Slide 29
29 Grid services typically impose a strict security model
Client and server go through mutual authentication and
authorization P2P systems impose a less strict security model
Access to information and access to resources via the same overlay
Separation of access to information from access to resources User
authenticates at any node and queries for information Trusted at VO
level to access information. e.g. Find compute resource that can
execute job User accesses resource directly and not through the
overlay Involves mutual authentication and authorization Trusted at
individual site level to access resources. e.g. Submit job to
resource directly Design Issue: Security Model
Slide 30
30 Our choice: Unstructured Overlays Easy overlay management
Suitable for storing arbitrary content Support VO defined
topologies Previous work mostly in file-sharing applications Why
not Structured Overlays? Well researched in context of information
services. However Not ideal for storing XML resource information
Policy restrictions may prevent nodes from storing/indexing
information generated at other nodes Design Issue: Choice of
Overlay
Slide 31
31 Unstructured overlay + no replication + flooding algorithm
means Cannot guarantee answer will be found Depends on max-hops
field No guarantees on the number of hops taken to reach answer
Exponential growth in the number of messages sent Need to optimize
message forwarding to counter this explosion Typical message
handling Process query locally. If result not found, forward to
peers Reduces number of messages sent BUT slow, if client expects
multiple responses Issues for Unstructured Overlays
Slide 32
32 Goal: Reduce number of messages sent Replication/Caching
most popular technique Cannot cache query responses Information may
change quickly, policy restrictions, etc. Can cache queries Query
Caching with Probabilistic Forwarding Cache information about which
nodes responded to query If a node responded earlier to same query,
deterministically forward query to that node Forward query to other
nodes with low probability Identify nodes that have been updated
with new information Set up caches along duplicate paths correctly
Similar to other learning-based & probabilistic approaches
Effective for applications that may issue queries repeatedly (e.g.,
Pegasus workflow planner) Optimization: Query Caching with
Probabilistic Forwarding
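The caching scheme above can be sketched as follows (a minimal sketch; the class and parameter names are invented for illustration):

```python
import random

class QueryCache:
    """Query caching with probabilistic forwarding: remember which peers
    answered a query before; later, forward the same query to those peers
    deterministically, and to other peers only with a small probability
    so that newly updated nodes can still be discovered."""

    def __init__(self, p_explore=0.1):
        self.responders = {}       # query -> set of peers that answered it
        self.p_explore = p_explore

    def record_response(self, query, peer):
        self.responders.setdefault(query, set()).add(peer)

    def forward_targets(self, query, peers, rand=random.random):
        known = self.responders.get(query)
        if not known:
            return list(peers)     # unseen query: flood to all peers
        return [p for p in peers
                if p in known or rand() < self.p_explore]
```

For a repeated query (as a workflow planner like Pegasus might issue), forwarding collapses from all peers to the known responders plus a few random probes.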
Slide 33
33 Goal: Improve performance of attribute based queries Process
and forward model may be slow Distinguish between Return one vs.
Return all semantics Return one - explicit queries What is the load
on skynet.isi.edu? Requires a single response Process query locally
before forwarding to reduce messages Return all attribute based
queries What sites have load < 3.0? Likely to be multiple
responses Forward query before processing locally to reduce
response time (early forwarding) Tag QUERY messages with hints
(Return one or Return all) that indicate whether to do early
forwarding Optimization: Early Forwarding
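The hint-driven message handling above can be sketched as (a minimal sketch; the function and hint names are invented for illustration):

```python
def handle_query(query, hint, process_locally, forward):
    """Sketch of early forwarding. 'return_one' queries are processed
    locally first and forwarded only on a miss (fewer messages);
    'return_all' queries are forwarded before local processing, reducing
    response time when multiple responses are expected."""
    if hint == "return_one":
        result = process_locally(query)
        if result is not None:
            return [result]           # answered locally; no forwarding
        return forward(query)
    else:  # "return_all"
        results = forward(query)      # forward first (early forwarding)...
        local = process_locally(query)
        if local is not None:
            results.append(local)     # ...then add any local match
        return results
```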
Slide 34
34 Layered implementation P2P Resource component maintains
overlay and processes messages IndexServiceResource component
processes queries Almost plug-and-play Gnutella-like message
forwarding Updated using standard MDS aggregator framework and not
through the overlay Query Caching and Early Forwarding
optimizations Implementation of P2P Index Service
Slide 35
35 Experimental Set-up Evaluate overhead of applying a P2P
overlay Evaluate wide-area performance Test beds - LAN at ISI,
PlanetLab Mostly comparison tests on small networks Applications
Pegasus (A workflow planning application) Site and transformation
catalogs used by Pegasus Simple random query client Artificial
records Metrics Time taken by Pegasus to generate a plan Query
rates Experiments
Slide 36
36 Pegasus planning a 100 job workflow Query Overhead
reasonably constant as the number of datasets in the index increases
Experiments: Overhead of P2P Layer
Slide 37
37 WAN measurements on the PlanetLab test bed 8 nodes, 2 peers
per node, up to 256 concurrent client threads Query rates in World
WAN slightly higher than in US WAN Higher load on the US PlanetLab
nodes Query processing is compute intensive Experiments: Wide Area
Performance
Slide 38
38 P2P Organization of GT4 Information Service Low overhead
from adding a P2P layer Key design features: Separation of storage
of information from querying for information (Overlay used only to
forward queries) Separation of access to information from access to
resources (Security model: Choose what is exposed at VO level and
apply additional restrictions at resource level) Simple
optimizations help address issues with flooding (results not shown
here) Query caching with probabilistic forwarding Early Forwarding
Future Work Scale to larger sizes P2P version of Replica Location
Service using overlay Experiment with replicating relatively static
information P2P Index Service: Conclusions
Slide 39
39 Outline Resource Discovery in Grids: current approaches
Peer-to-Peer Resource Discovery Applying P2P techniques to Grid
Resource Discovery P2P Information Service Joint work with Shishir
Bharathi P2P Replica Location Service Joint work with Min Cai
Summary and Future Work
Slide 40
40 Implemented a P2P Replica Location Service based on: Globus
Toolkit Version 3.0 RLS Chord structured Peer-to-Peer overlay
network Peer-to-Peer Replica Location Service A Peer-to-Peer
Replica Location Service Based on A Distributed Hash Table, Min
Cai, Ann Chervenak, Martin Frank, Proceedings of SC2004 Conference,
November 2004. Applying Peer-to-Peer Techniques to Grid Replica
Location Services, Ann L. Chervenak, Min Cai, Journal of Grid
Computing, 2006.
Slide 41
41 The existing Globus Replica Location Service (RLS) is a
distributed registry service that records the locations of data
copies and allows discovery of replicas Maintains mappings between
logical identifiers and target names Local Replica Catalogs (LRCs)
contain logical-to-target mappings Replica Location Index Nodes
(RLIs) aggregate information about LRCs Soft state updates sent
from LRCs to RLIs The Globus Replica Location Service
Slide 42
42 Each RLS deployment is statically configured If upper level
RLI fails, the lower level LRCs need to be manually redirected More
automated and flexible membership management is desirable for:
larger deployments dynamic environments where servers frequently
join and leave We use a peer-to-peer approach to provide
distributed RLI index for {logical-name, LRC} mappings Consistent
with our security model: resource discovery at the RLI level,
stricter security at LRC level In P2P RLS, replicate mappings,
unlike in P2P MDS Easier to hash on logical name than on arbitrary
XML content Mappings are much less dynamic than resource
information Motivation for a Peer-to-Peer RLS
Slide 43
43 A P-RLS server consists of: An unchanged Local Replica
Catalog (LRC) to maintain consistent {logical-name, target-name}
mappings A Peer-to-Peer Replica Location Index node (P-RLI) The
P-RLS design uses a Chord overlay network to self-organize P-RLI
servers Chord is a distributed hash table that supports scalable
key insertion and lookup Each node has O(log N) neighbors in a
network of N nodes A key is stored on its successor node (first
node with ID equal to or greater than the key) Key insertion and lookup
in O(log N) hops Stabilization algorithm for overlay construction
and topology repair P2P Replica Location Service (P-RLS)
Design
Slide 44
44 Uses Chord algorithm to store mappings of logical names to
LRC sites Generates Chord key for a logical name by applying SHA1
hash function Stores {logical-name, LRC} mappings on the P-RLI
successor node of the mapping's key When a P-RLI node receives a query for
LRC(s) that store mappings for a logical name: Answers the query if
it contains the logical-to-LRC mapping(s) If not, routes query to
the successor node that contains the mappings Then query LRCs
directly for mappings from logical names to replica locations P-RLS
Design (cont.)
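The key-generation step above can be sketched as follows (a minimal sketch; the identifier-space size m = 16 and the logical name are illustrative, and the function names are invented):

```python
import hashlib

def chord_key(logical_name, m=16):
    """Map a logical file name to an m-bit Chord key by SHA-1 hashing."""
    digest = hashlib.sha1(logical_name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

def responsible_node(key, node_ids):
    """Successor node: first node ID equal to or following the key."""
    nodes = sorted(node_ids)
    return next((n for n in nodes if n >= key), nodes[0])

# The {logical-name, LRC} mapping is stored on the successor of the key:
nodes = [4000, 12000, 30000, 50000]
key = chord_key("lfn://experiment/file001")
print(f"key {key} -> P-RLI node {responsible_node(key, nodes)}")
```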
46 Implemented a prototype of P-RLS Extends RLS implementation
in Globus Toolkit 3.0 Each P-RLS node consists of an unchanged LRC
server and a peer-to-peer P-RLI server The P-RLI server implements
the Chord protocol operations, including join, update, query,
successor, probing & stabilization LRC, RLI & Chord
protocols implemented on top of RLS RPC layer
[Architecture diagram: two P-RLS nodes connected over a Chord network; each node layers the LRC protocol, P-RLS RLI protocol and Chord protocol (successor, join, update, query, probing, stabilization) on the RLS RPC layer, with an LRC server and P-RLI server accessed through the RLS Client API]
P-RLS Implementation
Slide 47
47 P-RLS network runs on a 16-node cluster 1000 updates (add
operations) on each node, updates overwrite existing mappings, and
maximum 1000 mappings in the network Update latencies increase on
log scale with number of nodes P-RLS Performance
Slide 48
48 Query latencies with 100,000 and 1 million mappings Total
number of mappings has little effect on query times Uses hash table
to index mappings on each P-RLI node Query times increase on log
scale with number of nodes P-RLS Measurements (cont.)
Slide 49
49 Need better reliability when P-RLI nodes fail or leave
Replicate mappings so they are not lost Min Cai proposes Adaptive
Successor Replication: Extends previously described scheme
Replicate each mapping on k successor nodes of the root node
Provides reliability despite P-RLI failures No mappings lost unless
all k successors fail simultaneously Distributes mappings more
evenly among P-RLI nodes as replication factor increases Improves
load balancing for popular mappings Successor Replication in
P-RLS
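The successor-replication scheme above can be sketched as follows (a minimal sketch reusing the example ring's node IDs; here k counts the total number of copies including the one on the root node, which is an assumption about the slide's wording):

```python
def replica_nodes(key, node_ids, k):
    """Successor replication sketch: place a mapping on the root node
    (the successor of its key) and the next k-1 successors, so the
    mapping survives unless all k nodes fail simultaneously."""
    nodes = sorted(node_ids)
    # index of the root node: first node with ID equal to or above the key
    i = next((j for j, n in enumerate(nodes) if n >= key), 0)
    return [nodes[(i + j) % len(nodes)] for j in range(k)]

nodes = [4, 8, 20, 24, 40, 48, 54, 60]
print(replica_nodes(18, nodes, 3))  # [20, 24, 40]
```

As k grows, each mapping lands on more consecutive nodes, which also spreads the query load for popular mappings across those replicas.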
Slide 50
50 Implemented a P2P Replica Location Service based on: Globus
Toolkit Version 3.0 RLS Chord structured Peer-to-Peer overlay
network Measured the performance of our P-RLS system with up to 15
nodes Query and update latencies increase at a rate of O(log N) with
size of P-RLS network Simulated the performance of larger P-RLS
networks Replication of mappings results in more even distribution
of mappings among nodes Successor replication scheme provides query
load balancing for popular mappings Summary: P-RLS Work
Slide 51
51 Other approaches to applying P2P to Grid services GLARE
Mumtaz Siddiqui et al., University of Innsbruck Structured P2P
approach to Grid information services P2P Replica Location Service
Matei Ripeanu et al., University of British Columbia Use bloom
filters and an unstructured P2P network for replica location
service Structured Peer-to-Peer Networks Examples: Chord, CAN,
Tapestry, Pastry Unstructured Peer-to-Peer Networks Examples:
Gnutella, KaZaA, Gia Related Work: P2P and Grids
Slide 52
52 Summary Resource discovery services in Grids are well-suited
to P2P approaches Similar in function to internet file sharing
applications P2P approaches are attractive because Scale of Grids
growing larger Organization of hierarchical resource discovery
services is challenging, especially at large scale Need
self-configuration, self-healing Security, performance &
practical issues to be overcome Two systems implemented and
evaluated P2P Information Service: Uses unstructured overlay to
support resource information specified in XML P2P Replica Location
Service: Uses structured overlay to distribute mappings among
indexes, bounds query response
Slide 53
53 Continued research Larger scale Different overlay schemes
Additional optimizations to improve query performance, reduce
message counts Incorporate P2P techniques into real-world Grid
resource discovery services Make GT4-based peer-to-peer overlay
available as open source contribution to Globus Toolkit Release P2P
Index Service component for MDS4 Tested a version of Replica
Location Service using the unstructured GT4 overlay Additional
improvements to services for easy, dynamic deployment to make these
approaches practical Future Work