1 Peer-to-Peer Approaches to Grid Resource Discovery Ann
Chervenak University of Southern California Information Sciences
Institute Joint work with Shishir Bharathi, Min Cai
Slide 2
2 Resource Discovery In Grids Applications running in wide area
distributed environments or Grids need information about resources
and services Resource information has an impact on Scheduling
decisions Replica selection Planning for workflow execution (e.g.,
Pegasus) Resource discovery services need to be Highly available,
fault tolerant, reliable Highly scalable Provide flexible query
interfaces Typical Grid resource discovery services are organized
as hierarchies of index services Applying P2P techniques offers
promise of self-organization, self-healing, improved
scalability
Slide 3
3 Outline Resource Discovery in Grids: current approaches
Peer-to-Peer Resource Discovery Applying P2P techniques to Grid
Resource Discovery Unstructured P2P Information Service Joint work
with Shishir Bharathi Structured P2P Replica Location Service Joint
work with Min Cai Summary and Future Work
Slide 4
4 Currently, most services for resource discovery in Grids are
query-based index services One or more indexes aggregate
information about resources Often distributed using a hierarchical
structure Service provides a front end query interface Database
back end stores resource information Service responds to queries by
identifying resources that match desired properties Scalable Hold
large amounts of resource information Support high query rates
Often specialized for particular types of resource information
and/or queries Specialized query APIs, resource schemas, etc.
Typical Grid Resource Discovery
Slide 5
5 Index-based Discovery Services Globus Monitoring and
Discovery System (MDS) Hierarchical, distributed Index Service
Aggregates information about resources (CPUs, storage systems,
etc.) Answers queries for resources with specified properties
Typically close to resources, whose information may change
frequently (e.g., index co-located with a cluster)
Slide 6
6 Globus Replica Location Service (RLS) Hierarchical,
distributed index Provides mappings from logical names for data to
physical locations of replicas Metadata Catalog Service (MCS)
Centralized database of metadata attributes associated with data
items Answers queries for resources with specified metadata
characteristics Storage Resource Broker MCAT Centralized (or
partitioned) catalog with metadata, replica location and resource
information Index-based Discovery Services (Cont.)
Slide 7
7 Grids are growing larger Increasing number of resources and
resource discovery service instances Organization of resource
discovery services is challenging Creation/maintenance of efficient
hierarchies Avoiding hotspots, eliminating update cycles Much of
the configuration and maintenance of these services is done
manually Few capabilities for self-configuration or self-healing
Limits scalability Makes services complex to deploy and maintain
Goal: Use peer-to-peer techniques to make services more
self-configuring, reliable and scalable Challenges for Resource
Discovery Services
Slide 8
8 Service instances create an overlay network Queries and
responses forwarded/routed in the overlay Structured overlay Chord,
Pastry, etc. Distributed Hash Table (DHT) based Effective when
storing/retrieving {key, value} pairs Strong bounds on performance Unstructured
overlay Gnutella, KaZaA Effective when querying on attributes Not
DHT based Flooding algorithms Hybrid approaches also possible Good
scalability, self-organization, reliability, self-healing
Peer-to-Peer Systems
Slide 9
9 Structured P2P Networks Maintain a structured overlay network
among peers and use message routing Basic functionality: lookup
(key), which returns the identity of the node storing the object
with that key Often based on Distributed Hash Table (DHT) Objects
are associated with a key that can be produced by hashing the
object name Nodes have identifiers that share the same space as
keys Each node is responsible for storing a range of keys and
corresponding objects Nodes maintain an overlay network, with each
node having several other nodes as neighbors When a lookup (key)
request is issued from one node, the lookup message is routed
through the overlay network to the node responsible for the
key
Slide 10
10 Structured P2P Networks (cont.) Different DHT systems
construct a variety of overlay networks and employ different
routing algorithms They can guarantee to finish a lookup operation
in O(log N) or O(d·N^(1/d)) hops Each node maintains only
information about O(log N) or d neighbors for an N-node network (where
d is the dimension of the hypercube organization of the network) So
DHT systems provide good scalability as well as fault tolerance DHT
systems include Pastry, Chord, CAN and Koorde
Slide 11
11 Example: Chord Structured P2P System Chord algorithm
proposed by Stoica, et al. Chord uses a one-dimensional circular
identifier space modulo 2^m for both node identifiers and
object keys Every node in Chord is assigned a unique m-bit
identifier by hashing its IP address and port number All nodes
self-organize into a ring topology based on their node identifiers
in the circular space Each object is also assigned a unique m-bit
identifier called its object key Object keys are assigned to nodes
by using consistent hashing Key k is assigned to the first node
whose identifier is equal to or follows the identifier of k in the
circular space This node stores the object with key k and is called
its successor node
Slide 12
12 An Example of Chord Network N4 N8 N20 N24 N40 N48 N54 N60
Key18 Key31 Key52
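The successor rule illustrated above can be sketched in a few lines of Python (a minimal sketch; the node IDs and keys are the examples from this slide, on a 64-identifier circle):

```python
# Sketch of Chord's consistent-hashing rule: each key is stored on its
# successor -- the first node whose ID is equal to or follows the key
# on the identifier circle (wrapping around at the top).

def successor(key, node_ids):
    """Return the node responsible for `key` on the identifier circle."""
    nodes = sorted(node_ids)
    for n in nodes:
        if n >= key:
            return n
    return nodes[0]  # wrap around the circle

nodes = [4, 8, 20, 24, 40, 48, 54, 60]
print(successor(18, nodes))  # Key18 -> N20
print(successor(31, nodes))  # Key31 -> N40
print(successor(52, nodes))  # Key52 -> N54
```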
Slide 13
13 Each Chord node maintains two sets of neighbors: its
successors and its fingers Successor nodes immediately follow the
node in the identifier space Finger nodes are spaced exponentially
around the identifier space Each node has a constant number of
successors and at most m fingers The i-th finger for the node with
identity n is the first node that succeeds n by at least 2^(i-1) on
the identifier circle, where 1 ≤ i ≤ m
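The finger rule can be sketched as follows (an illustrative sketch reusing the 64-identifier ring and node IDs from the example slide, with m = 6):

```python
def finger_targets(n, m):
    """Identifier targets of node n's fingers: n + 2^(i-1) mod 2^m, 1 <= i <= m."""
    return [(n + 2 ** (i - 1)) % 2 ** m for i in range(1, m + 1)]

def finger_table(n, node_ids, m):
    """The i-th finger is the successor of n + 2^(i-1) on the circle."""
    nodes = sorted(node_ids)
    return [next((x for x in nodes if x >= t), nodes[0])
            for t in finger_targets(n, m)]

nodes = [4, 8, 20, 24, 40, 48, 54, 60]
# Node N8's finger targets are 9, 10, 12, 16, 24, 40; the fingers are
# their successors -- exponentially spaced around the identifier space.
print(finger_table(8, nodes, 6))  # [20, 20, 20, 20, 24, 40]
```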
15 Chord (cont.) When node n wants to lookup the object with
key k, it will route a lookup request to the successor node of key
k If the successor node is far away from n, node n forwards the
request to the finger node whose identifier most immediately
precedes the successor node of key k By repeating this process, the
request gets closer and closer to the successor node Eventually,
the successor node receives the lookup request for the object with
key k, finds the object locally and sends the result back to node n
Each hop from one node to the next node covers at least half the
identifier space (clockwise) between that node and the successor
node of key k Number of routing hops for a lookup is O(log N) for a
Chord network with N nodes Insertion time also O(log N) Each node
maintains pointers to O(log N) neighbors
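The greedy finger-based routing described above can be simulated with global knowledge of the ring (a real Chord node knows only its own fingers and successors; this sketch recomputes finger tables on the fly for brevity, on the 64-identifier example ring):

```python
def successor(key, nodes):
    """First node whose ID is equal to or follows the key on the circle."""
    nodes = sorted(nodes)
    return next((n for n in nodes if n >= key), nodes[0])

def between(x, a, b, size):
    """True if x lies strictly in the circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

def route(start, key, nodes, m):
    """Return the list of hops a lookup for `key` takes from `start`."""
    size = 2 ** m
    dest = successor(key, nodes)
    hops, n = [start], start
    while n != dest:
        # fingers of n: successors of n + 2^i for i = 0 .. m-1
        fingers = {successor((n + 2 ** i) % size, nodes) for i in range(m)}
        preceding = [f for f in fingers if between(f, n, key, size)]
        if preceding:
            # forward to the finger most immediately preceding the key
            n = max(preceding, key=lambda f: (f - n) % size)
        else:
            n = dest  # the successor pointer covers the final hop
        hops.append(n)
    return hops

nodes = [4, 8, 20, 24, 40, 48, 54, 60]
print(route(8, 52, nodes, 6))  # [8, 40, 48, 54] -- each hop halves the distance
```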
17 Unstructured P2P Systems An unstructured P2P network usually
does not impose any constraints on links between nodes in the
system Choice of neighbors to peer with is less restrictive and is
often probabilistic or randomized Unstructured overlays do not
create associations between nodes or links in the system and the
information stored in those nodes Do not require that information
adhere to a particular format or be tied to the structure of the
overlay Unlike DHT systems that store {key, value} pairs and hash on the key
Information is usually stored only at the node where it was
generated or replicated in a probabilistic manner Query-response
pathways are also not well defined Queries are propagated in the
system using flooding based algorithms Responses are routed back on
the same path as the queries
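The flooding behavior above can be sketched with a toy overlay (a minimal sketch; the graph and time-to-live values are illustrative, and duplicate messages are suppressed with a shared `seen` set standing in for message-ID tracking):

```python
def flood(graph, start, ttl, matches, seen=None):
    """Naive TTL-bounded flooding: return the nodes whose content matched.
    `graph` maps node -> neighbor list; `matches` maps node -> bool."""
    if seen is None:
        seen = set()
    seen.add(start)
    hits = [start] if matches[start] else []
    if ttl > 0:
        for nb in graph[start]:
            if nb not in seen:
                hits += flood(graph, nb, ttl - 1, matches, seen)
    return hits

graph = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C']}
matches = {'A': False, 'B': False, 'C': False, 'D': True}
print(flood(graph, 'A', 2, matches))  # ['D'] -- reached within the TTL
print(flood(graph, 'A', 1, matches))  # []    -- TTL too small; no guarantee
```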
Slide 18
18 Unstructured P2P Networks (cont.) Cannot provide guarantees
on query performance Don't know the number of hops taken by a query
message to reach a node that can answer the query Unlike structured
overlays Cannot guarantee that results will be returned if they
exist in the network The time-to-live field in the message dictates
how far the message travels in the network Message may not reach
all nodes Applications must be capable of dealing with these issues
as they do with other failure modes
Slide 19
19 Unstructured P2P Networks (cont.) Examples of unstructured
P2P systems: Napster, Gnutella and Kazaa Successful in internet
file sharing applications Allow peers to host content, discover
content on other peers, and download that content Popular in the
Internet community despite known disadvantages: Vulnerability of
central indexes in Napster High network loads imposed by Gnutella's
flooding algorithms Optimizations of unstructured systems have been
developed based on file and query distributions and on the use of
replication and caching
Slide 20
20 Outline Resource Discovery in Grids: current approaches
Peer-to-Peer Resource Discovery Applying P2P techniques to Grid
Resource Discovery Unstructured P2P Information Service Joint work
with Shishir Bharathi Structured P2P Replica Location Service Joint
work with Min Cai Summary and Future Work
Slide 21
21 P2P technologies successful in internet file sharing
applications Gnutella, Kazaa, Napster, etc. Allow peers to host
content, discover content and download Grid resource discovery
services have similar requirements Would like to use peer-to-peer
technologies for resource discovery in Grids Improved
configuration, scalability, etc. Convergence of P2P and Grid has
been predicted But not yet a reality Applying Peer-to-Peer
Techniques to Grid Resource Discovery
Slide 22
22 Performance May require multiple network hops to resolve a
query Some P2P overlays distribute resource information widely
(e.g., structured overlays) Still want to make use of specialized
Grid indexes Security issues Access to resources and information
about resources may need to be controlled Need a security model
that allows us to use P2P safely Practical Issues Has taken several
years to make Grid discovery services scalable, stable To support
greater scalability, need further improvements in simple and
dynamic deployment Challenges in Applying P2P Technologies to
Grids
Slide 23
23 Explore organizing an existing grid information service (GT4
Index Service ) as a peer-to-peer system Background Grid
Information Services GT4 Index Service P2P Index Service Design
Issues and design choices Optimizations Implementation Experimental
results Design of a Scalable Peer-to-Peer Information System Using
the GT4 Index Service, Shishir Bharathi and Ann Chervenak, CCGrid
2007 Conference, May 2007 P2P Grid Information Service
Slide 24
24 Collect information about resources skynet.isi.edu has 96
nodes GridFTP runs on port 2811 Avg. load on skynet-login.isi.edu
is 2.55 Aggregate Information from multiple types of resources and
services Queried by schedulers, workflow planners, clients that
need information about current resource state Process different
types of queries What port does GridFTP run on? One response
expected What servers have load < 3.0? Expect multiple
responses, information-gathering step Grid Information Services
Slide 25
25 Globus Toolkit Version 4 Index Service Part of the
Monitoring and Discovery System (MDS) Aggregates information about
resources, responds to queries Issues: Designing efficient
hierarchies, avoiding hot spots, avoiding update cycles, scaling
with amount of information Dealing with multiple administrative
domains Organization of GT4 Indexes
Slide 26
26 WS-RF service, part of the Monitoring and Discovery System
(MDS) Information Providers generate resource information in XML
format E.g. Hawkeye, Ganglia, GridFTP Aggregator sources aggregate
information Index Service publishes aggregated information in a
single Resource Property document Processes XPath queries and
returns matching XML content GT4 Index Service
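An XPath query against an aggregated resource-property document might look like the following (an illustrative sketch using Python's limited XPath support; the XML below is invented for the example and is not the actual MDS schema):

```python
import xml.etree.ElementTree as ET

# Hypothetical aggregated resource-property document.
doc = """
<ResourceProperties>
  <Service name="GridFTP" port="2811"/>
  <Host name="skynet-login.isi.edu" load="2.55"/>
</ResourceProperties>
"""
root = ET.fromstring(doc)

# An XPath-style query: which port does GridFTP run on?
port = root.find(".//Service[@name='GridFTP']").get("port")
print(port)  # 2811
```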
Slide 27
27 Modified GT4 Index Services to create P2P Index Service P2P
Indexes organize themselves into an unstructured overlay Each P2P
Index can be accessed both via the overlay and as a standalone GT4
Index Service Design of P2P GT4 Index Service
Slide 28
28 Overlay used only for self-organization and for forwarding
and routing of queries and responses Continue to use specialized
MDS4 Index Services that aggregate resource information Resource
information is not stored or updated via the overlay Each P2P Index
acquires resource information via an out-of-band mechanism
Policies may dictate when and how indexes are updated Resource
information is not replicated via the overlay May change quickly,
so replication is not effective Separate storing of information
from querying for information Design of P2P GT4 Index Service
(cont.)
Slide 29
29 Grid services typically impose a strict security model
Client and server go through mutual authentication and
authorization P2P systems impose a less strict security model
Access to information and access to resources via the same overlay
Separation of access to information from access to resources User
authenticates at any node and queries for information Trusted at VO
level to access information. e.g. Find compute resource that can
execute job User accesses resource directly and not through the
overlay Involves mutual authentication and authorization Trusted at
individual site level to access resources. e.g. Submit job to
resource directly Design Issue: Security Model
Slide 30
30 Our choice: Unstructured Overlays Easy overlay management
Suitable for storing arbitrary content Support VO defined
topologies Previous work mostly in file-sharing applications Why
not Structured Overlays? Well researched in context of information
services. However Not ideal for storing XML resource information
Policy restrictions may prevent nodes from storing/indexing
information generated at other nodes Design Issue: Choice of
Overlay
Slide 31
31 Unstructured overlay + no replication + flooding algorithm
means Cannot guarantee answer will be found Depends on max-hops
field No guarantees on the number of hops taken to reach answer
Exponential growth in the number of messages sent Need to optimize
message forwarding to counter this explosion Typical message
handling Process query locally. If result not found, forward to
peers Reduces number of messages sent BUT slow, if client expects
multiple responses Issues for Unstructured Overlays
Slide 32
32 Goal: Reduce number of messages sent Replication/Caching
most popular technique Cannot cache query responses Information may
change quickly, policy restrictions, etc. Can cache queries Query
Caching with Probabilistic Forwarding Cache information about which
nodes responded to query If a node responded earlier to same query,
deterministically forward query to that node Forward query to other
nodes with low probability Identify nodes that have been updated
with new information Set up caches along duplicate paths correctly
Similar to other learning-based & probabilistic approaches
Effective for applications that may issue queries repeatedly (e.g.,
Pegasus workflow planner) Optimization: Query Caching with
Probabilistic Forwarding
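The caching scheme above can be sketched as follows (a minimal sketch; the class and parameter names are invented for illustration):

```python
import random

class QueryCache:
    """Query caching with probabilistic forwarding: remember which peers
    answered a query before; later, forward the same query to those peers
    deterministically, and to other peers only with a small probability
    so that newly updated nodes can still be discovered."""

    def __init__(self, p_explore=0.1):
        self.responders = {}       # query -> set of peers that answered it
        self.p_explore = p_explore

    def record_response(self, query, peer):
        self.responders.setdefault(query, set()).add(peer)

    def forward_targets(self, query, peers, rand=random.random):
        known = self.responders.get(query)
        if not known:
            return list(peers)     # unseen query: flood to all peers
        return [p for p in peers
                if p in known or rand() < self.p_explore]
```

For a repeated query (as a workflow planner like Pegasus might issue), forwarding collapses from all peers to the known responders plus a few random probes.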
Slide 33
33 Goal: Improve performance of attribute based queries Process
and forward model may be slow Distinguish between Return one vs.
Return all semantics Return one - explicit queries What is the load
on skynet.isi.edu? Requires a single response Process query locally
before forwarding to reduce messages Return all attribute based
queries What sites have load < 3.0? Likely to be multiple
responses Forward query before processing locally to reduce
response time (early forwarding) Tag QUERY messages with hints
(Return one or Return all) that indicate whether to do early
forwarding Optimization: Early Forwarding
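The hint-driven message handling above can be sketched as (a minimal sketch; the function and hint names are invented for illustration):

```python
def handle_query(query, hint, process_locally, forward):
    """Sketch of early forwarding. 'return_one' queries are processed
    locally first and forwarded only on a miss (fewer messages);
    'return_all' queries are forwarded before local processing, reducing
    response time when multiple responses are expected."""
    if hint == "return_one":
        result = process_locally(query)
        if result is not None:
            return [result]           # answered locally; no forwarding
        return forward(query)
    else:  # "return_all"
        results = forward(query)      # forward first (early forwarding)...
        local = process_locally(query)
        if local is not None:
            results.append(local)     # ...then add any local match
        return results
```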
Slide 34
34 Layered implementation P2P Resource component maintains
overlay and processes messages IndexServiceResource component
processes queries Almost plug-and-play Gnutella-like message
forwarding Updated using standard MDS aggregator framework and not
through the overlay Query Caching and Early Forwarding
optimizations Implementation of P2P Index Service
Slide 35
35 Experimental Set-up Evaluate overhead of applying a P2P
overlay Evaluate wide-area performance Test beds - LAN at ISI,
PlanetLab Mostly comparison tests on small networks Applications
Pegasus (A workflow planning application) Site and transformation
catalogs used by Pegasus Simple random query client Artificial
records Metrics Time taken by Pegasus to generate a plan Query
rates Experiments
Slide 36
36 Pegasus planning a 100 job workflow Query Overhead
reasonably constant as the number of datasets in the index increases
Experiments: Overhead of P2P Layer
Slide 37
37 WAN measurements on the PlanetLab test bed 8 nodes, 2 peers
per node, up to 256 concurrent client threads Query rates in World
WAN slightly higher than in US WAN Higher load on the US PlanetLab
nodes Query processing is compute intensive Experiments: Wide Area
Performance
Slide 38
38 P2P Organization of GT4 Information Service Low overhead
from adding a P2P layer Key design features: Separation of storage
of information from querying for information (Overlay used only to
forward queries) Separation of access to information from access to
resources (Security model: Choose what is exposed at VO level and
apply additional restrictions at resource level) Simple
optimizations help address issues with flooding (results not shown
here) Query caching with probabilistic forwarding Early Forwarding
Future Work Scale to larger sizes P2P version of Replica Location
Service using overlay Experiment with replicating relatively static
information P2P Index Service: Conclusions
Slide 39
39 Outline Resource Discovery in Grids: current approaches
Peer-to-Peer Resource Discovery Applying P2P techniques to Grid
Resource Discovery P2P Information Service Joint work with Shishir
Bharathi P2P Replica Location Service Joint work with Min Cai
Summary and Future Work
Slide 40
40 Implemented a P2P Replica Location Service based on: Globus
Toolkit Version 3.0 RLS Chord structured Peer-to-Peer overlay
network Peer-to-Peer Replica Location Service A Peer-to-Peer
Replica Location Service Based on A Distributed Hash Table, Min
Cai, Ann Chervenak, Martin Frank, Proceedings of SC2004 Conference,
November 2004. Applying Peer-to-Peer Techniques to Grid Replica
Location Services, Ann L. Chervenak, Min Cai, Journal of Grid
Computing, 2006.
Slide 41
41 The existing Globus Replica Location Service (RLS) is a
distributed registry service that records the locations of data
copies and allows discovery of replicas Maintains mappings between
logical identifiers and target names Local Replica Catalogs (LRCs)
contain logical-to-target mappings Replica Location Index Nodes
(RLIs) aggregate information about LRCs Soft state updates sent
from LRCs to RLIs The Globus Replica Location Service
Slide 42
42 Each RLS deployment is statically configured If upper level
RLI fails, the lower level LRCs need to be manually redirected More
automated and flexible membership management is desirable for:
larger deployments dynamic environments where servers frequently
join and leave We use a peer-to-peer approach to provide
distributed RLI index for {logical-name, LRC} mappings Consistent
with our security model: resource discovery at the RLI level,
stricter security at LRC level In P2P RLS, replicate mappings,
unlike in P2P MDS Easier to hash on logical name than on arbitrary
XML content Mappings are much less dynamic than resource
information Motivation for a Peer-to-Peer RLS
Slide 43
43 A P-RLS server consists of: An unchanged Local Replica
Catalog (LRC) to maintain consistent {logical-name, target-name}
mappings A Peer-to-Peer Replica Location Index node (P-RLI) The
P-RLS design uses a Chord overlay network to self-organize P-RLI
servers Chord is a distributed hash table that supports scalable
key insertion and lookup Each node has O(log N) neighbors in a
network of N nodes A key is stored on its successor node (first
node with ID equal to or greater than the key) Key insertion and lookup
in O(log N) hops Stabilization algorithm for overlay construction
and topology repair P2P Replica Location Service (P-RLS)
Design
Slide 44
44 Uses Chord algorithm to store mappings of logical names to
LRC sites Generates Chord key for a logical name by applying SHA1
hash function Stores {logical-name, LRC} mappings on the P-RLI
successor node of the mapping's key When a P-RLI node receives a query for
LRC(s) that store mappings for a logical name: Answers the query if
it contains the logical-to-LRC mapping(s) If not, routes query to
the successor node that contains the mappings Then query LRCs
directly for mappings from logical names to replica locations P-RLS
Design (cont.)
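The key-generation step above can be sketched as follows (a minimal sketch; the identifier-space size m = 16 and the logical name are illustrative, and the function names are invented):

```python
import hashlib

def chord_key(logical_name, m=16):
    """Map a logical file name to an m-bit Chord key by SHA-1 hashing."""
    digest = hashlib.sha1(logical_name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

def responsible_node(key, node_ids):
    """Successor node: first node ID equal to or following the key."""
    nodes = sorted(node_ids)
    return next((n for n in nodes if n >= key), nodes[0])

# The {logical-name, LRC} mapping is stored on the successor of the key:
nodes = [4000, 12000, 30000, 50000]
key = chord_key("lfn://experiment/file001")
print(f"key {key} -> P-RLI node {responsible_node(key, nodes)}")
```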
46 Implemented a prototype of P-RLS Extends RLS implementation
in Globus Toolkit 3.0 Each P-RLS node consists of an unchanged LRC
server and a peer-to-peer P-RLI server The P-RLI server implements
the Chord protocol operations, including join, update, query,
successor, probing & stabilization LRC, RLI & Chord
protocols implemented on top of RLS RPC layer
[Architecture diagram: two P-RLS nodes connected over a Chord network; each node layers the LRC protocol, P-RLS RLI protocol and Chord protocol (successor, join, update, query, probing, stabilization) on the RLS RPC layer, with an LRC server and P-RLI server accessed through the RLS Client API]
P-RLS Implementation
Slide 47
47 P-RLS network runs on a 16-node cluster 1000 updates (add
operations) on each node, updates overwrite existing mappings, and
maximum 1000 mappings in the network Update latencies increase on
log scale with number of nodes P-RLS Performance
Slide 48
48 Query latencies with 100,000 and 1 million mappings Total
number of mappings has little effect on query times Uses hash table
to index mappings on each P-RLI node Query times increase on log
scale with number of nodes P-RLS Measurements (cont.)
Slide 49
49 Need better reliability when P-RLI nodes fail or leave
Replicate mappings so they are not lost Min Cai proposes Adaptive
Successor Replication: Extends previously described scheme
Replicate each mapping on k successor nodes of the root node
Provides reliability despite P-RLI failures No mappings lost unless
all k successors fail simultaneously Distributes mappings more
evenly among P-RLI nodes as replication factor increases Improves
load balancing for popular mappings Successor Replication in
P-RLS
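The successor-replication scheme above can be sketched as follows (a minimal sketch reusing the example ring's node IDs; here k counts the total number of copies including the one on the root node, which is an assumption about the slide's wording):

```python
def replica_nodes(key, node_ids, k):
    """Successor replication sketch: place a mapping on the root node
    (the successor of its key) and the next k-1 successors, so the
    mapping survives unless all k nodes fail simultaneously."""
    nodes = sorted(node_ids)
    # index of the root node: first node with ID equal to or above the key
    i = next((j for j, n in enumerate(nodes) if n >= key), 0)
    return [nodes[(i + j) % len(nodes)] for j in range(k)]

nodes = [4, 8, 20, 24, 40, 48, 54, 60]
print(replica_nodes(18, nodes, 3))  # [20, 24, 40]
```

As k grows, each mapping lands on more consecutive nodes, which also spreads the query load for popular mappings across those replicas.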
Slide 50
50 Implemented a P2P Replica Location Service based on: Globus
Toolkit Version 3.0 RLS Chord structured Peer-to-Peer overlay
network Measured the performance of our P-RLS system with up to 15
nodes Query and update latencies increase at a rate of O(log N) with
size of P-RLS network Simulated the performance of larger P-RLS
networks Replication of mappings results in more even distribution
of mappings among nodes Successor replication scheme provides query
load balancing for popular mappings Summary: P-RLS Work
Slide 51
51 Other approaches to applying P2P to Grid services GLARE
Mumtaz Siddiqui et al., University of Innsbruck Structured P2P
approach to Grid information services P2P Replica Location Service
Matei Ripeanu et al., University of British Columbia Use bloom
filters and an unstructured P2P network for replica location
service Structured Peer-to-Peer Networks Examples: Chord, CAN,
Tapestry, Pastry Unstructured Peer-to-Peer Networks Examples:
Gnutella, KaZaA, Gia Related Work: P2P and Grids
Slide 52
52 Summary Resource discovery services in Grids are well-suited
to P2P approaches Similar in function to internet file sharing
applications P2P approaches are attractive because Scale of Grids
growing larger Organization of hierarchical resource discovery
services is challenging, especially at large scale Need
self-configuration, self-healing Security, performance &
practical issues to be overcome Two systems implemented and
evaluated P2P Information Service: Uses unstructured overlay to
support resource information specified in XML P2P Replica Location
Service: Uses structured overlay to distribute mappings among
indexes, bounds query response
Slide 53
53 Continued research Larger scale Different overlay schemes
Additional optimizations to improve query performance, reduce
message counts Incorporate P2P techniques into real-world Grid
resource discovery services Make GT4-based peer-to-peer overlay
available as open source contribution to Globus Toolkit Release P2P
Index Service component for MDS4 Tested a version of Replica
Location Service using the unstructured GT4 overlay Additional
improvements to services for easy, dynamic deployment to make these
approaches practical Future Work