
Int. J. High Performance Computing and Networking, Vol. x, No. x, 200x

Copyright © 200x Inderscience Enterprises Ltd.

C2: a new overlay network based on CAN and Chord

Wenyuan Cai, Shuigeng Zhou*, Weining Qian and Linhao Xu Department of Computer Science and Engineering, Fudan University, 220 Handan Road, 200433 Shanghai, China E-mail: [email protected] E-mail: [email protected] E-mail: [email protected] E-mail: [email protected] *Corresponding author

Kian-Lee Tan Department of Computer Science, School of Computing, National University of Singapore, Singapore E-mail: [email protected]

Aoying Zhou Department of Computer Science and Engineering, Fudan University, 220 Handan Road, 200433 Shanghai, China E-mail: [email protected]

Abstract: In this paper, we present C2, a new overlay network based on CAN and Chord. It is primarily designed for a dynamic environment in which peers join and depart the network frequently. For an n-peer C2 system, each peer maintains information about only O(log n) other peers, and achieves routing within O(log n) hops. For each peer join or departure, C2 can, with high probability, update the routing tables with no more than O(log n) messages. What distinguishes C2 from many other peer-to-peer data sharing systems is its low computation cost and its high routing efficiency in a dynamic network. Even when a considerable number of peers fail simultaneously, i.e., several other peers’ routing tables are out of date, the average number of hops for successful routing remains acceptable.

Keywords: Distributed Computing; Peer-to-Peer Computing; Overlay Network; Chord; CAN.

Reference to this paper should be made as follows: Cai, W., Zhou, S., Qian, W., Xu, L., Tan, K-L. and Zhou, A. (xxxx) ‘C2: a new overlay network based on CAN and Chord’, Int. J. High Performance Computing and Networking, Vol. x, No. x, pp.xxx–xxx.

Biographical notes: Wenyuan Cai is a Master’s student in the Department of Computer Science and Engineering, Fudan University, China. He received his BS Degree in Computer Science and Technology from Shanghai Jiaotong University, China, in 2002. His current research interests include P2P computing and data mining in P2P environments.

Shuigeng Zhou is a Professor in the Department of Computer Science and Engineering, Fudan University, China. He received Bachelor and Master Degrees in Electromagnetic Field Theory and Microwave Technology from Huazhong University of Science and Technology, University of Electronic Science and Technology of China, in 1988 and 1991 respectively, and PhD Degree in Computer Science from Fudan University in 2000. Before he joined Fudan University, he was a Post-doctoral Researcher at the State Key Lab of Software Engineering, Wuhan University, China from 2000 to 2002. His current research interests are in the areas of P2P computing, Data Mining and Information Retrieval.


Weining Qian is an Assistant Professor in the Department of Computer Science and Engineering, Fudan University, China. He received his BS, MS and PhD Degrees in Computer Science all from Fudan University, in 2000 and 2003 respectively.

Linhao Xu is a PhD candidate in the Department of Computer Science and Engineering, Fudan University, China. His current research interests are query processing in P2P networks and information retrieval in P2P environments.

Kian-Lee Tan received his PhD in computer science in 1994. He is currently an Associate Professor in the Department of Computer Science, School of Computing, National University of Singapore. His current research interests include multimedia information retrieval, query processing and optimization in multiprocessor and distributed systems, and database performance, database security and genome databases. He has published numerous papers in conferences such as SIGMOD, VLDB, ICDE and EDBT, and journals such as TODS, TKDE, and VLDBJ. Kian-Lee is a member of ACM and an affiliate member of IEEE.

Aoying Zhou is a Professor in the Department of Computer Science and Engineering, Fudan University, China. He received Bachelor and Master degrees in Computer Science from Sichuan University, in 1985 and 1988 respectively, and Ph.D. degree in Computer Science from Fudan University in 1993. His current research interests are in the areas of Database, P2P computing, Data Mining, Streaming Data Management. Aoying Zhou is a member of ACM and a member of IEEE Computer Science Society.

1 INTRODUCTION

Although they came into being only a few years ago, peer-to-peer (P2P) data sharing systems have become one of the most prevalent Internet applications. In P2P systems, resources are fully independent of central control, and peers can join or leave the system frequently and unpredictably. It is therefore crucial for a P2P system to have a scalable overlay structure and a deterministic routing mechanism for locating the desired resources. Roughly speaking, there are two types of P2P data sharing systems: unstructured and structured. As far as routing efficiency is concerned, structured P2P systems are generally superior to unstructured ones. Current structured P2P systems, such as Chord (Stoica et al., 2001), CAN (Ratnasamy et al., 2001), Tapestry (Zhao et al., 2001), Pastry (Rowstron and Druschel, 2001a), Viceroy (Malkhi et al., 2002), Yappers (Ganesan et al., 2003), Symphony (Manku et al., 2003), Kademlia (Maymounkov and Mazieres, 2002), Skipnet (Harvey et al., 2003) and P-Grid (Aberer, 2001), have been shown to share common features such as deterministic location, load balancing and dynamic membership, while each has its own unique merits. Thus, it is reasonable to design new overlay networks that combine the merits of existing structured systems in one framework to further improve routing efficiency and fault tolerance. This paper is our effort in this direction.

In this paper, we present a new overlay network, C2, which integrates the resource-locating merits of both Chord (Stoica et al., 2001) and CAN (Ratnasamy et al., 2001). It is primarily designed for a dynamic environment in which peers join and depart the network frequently. It functions as a distributed hash table (DHT) that manages resources in a dynamic environment and allows peers to contact each other to locate the desired resources.

Compared with CAN and Chord, C2 offers lower computation cost, higher routing efficiency and better fault tolerance. In the stable state (no peer joins or leaves the network and no peer fails), for an n-peer C2 system, each peer maintains information about O(log n) other peers and achieves routing within O(log n) hops. For each peer join or departure, C2 can, with high probability, update the routing tables with no more than O(log n) messages. To locate a key, C2 needs at most O(log n) hops, and the computation cost of each hop is considerably lower than in other systems. In a dynamic environment where peers join and depart the system unpredictably, C2 can still run smoothly. Even when a considerable number of peers fail simultaneously (i.e., many other peers’ routing tables are out of date), the average number of hops for successful routing does not increase noticeably.

The major contributions of this paper are as follows:

• a new overlay network, i.e., C2, is proposed, which combines the advantages of Chord and CAN

• some properties of C2 are introduced and proved

• extensive experiments are conducted to evaluate the efficiency and effectiveness of C2.

The rest of this paper is structured as follows. We discuss related work in Section 2. For readers who are not familiar with P2P systems, Section 3 gives a brief introduction to Chord and CAN, on which C2 is based. We then introduce the design of the C2 system and explore its properties in Section 4. We present extensive experimental results in Section 5. Finally, we conclude our work and highlight some future research directions in Section 6.


2 RELATED WORK

P2P systems can be categorised into two main groups: unstructured P2P systems and structured P2P systems. In this section, we briefly survey the major existing P2P systems of these two types.

2.1 Unstructured P2P systems

The earliest P2P file sharing systems, such as Napster (http://www.napster.com/) and Gnutella (http://gnutella.wego.com/), are unstructured P2P systems. Later systems of this kind include Freenet (http://freenet.sourceforge.com/) and BestPeer (Ng et al., 2002). These are the most popular P2P application systems because they are simple and easy to deploy.

Napster. It is a centralised system. Each peer locates files by searching the central directory. Two problems are inherent in such a centralised architecture.

• It suffers from ‘single points of failure’. The crash of the central server will lead to the breakdown of the whole system.

• It is unscalable; the workload of the central server is the bottleneck of the system performance.

To overcome the limitations of centralisation, some purely decentralised P2P file sharing systems were developed.

• Gnutella. It is a completely decentralised system. Data placement is independent of the overlay topology. When searching, a peer broadcasts its query to its neighbour peers, and the neighbours propagate the query to their neighbours and so on, until the query’s TTL (Time To Live) reaches zero. The peers that have matching files pass the result set back to the original querying peer. Because Gnutella is fully decentralised, it does not suffer from ‘single points of failure’ and is more scalable than Napster. However, because it adopts a broadcast-based routing mechanism, its search cost is very high.

• Freenet. It is a purely decentralised and loosely structured P2P system with a self organising property. By pooling unused disk spaces at peers, it creates a collaborative virtual file system that provides both security and publisher anonymity. From this point, a main difference from other systems such as Gnutella is that Freenet provides file storage service, rather than file sharing service. Whereas in Gnutella files are only copied to other nodes when these nodes request them, in Freenet, files are pushed to other nodes for storage, replication and persistence. Furthermore, Freenet makes it infeasible to discover the true origin or destination of a file passing through the network, and difficult for a node operator to determine (or be responsible for) the actual physical contents.

• BestPeer. It is a prototype P2P system implemented by the database group at the National University of Singapore. Judging by its network structure, it can be seen as a kind of hybrid unstructured P2P system, which lies between Napster and Gnutella. Compared with the unstructured P2P systems mentioned above, BestPeer has some unique features.

• It combines the power of mobile agents into P2P systems to perform operations at peers’ sites. This facilitates content based searching.

• It is self-configurable, i.e., a node can dynamically optimise the set of peers that it communicates with directly, based on some optimisation criterion. By keeping peers that provide the most information or services in close proximity (i.e., in direct communication), the network bandwidth can be better utilised and system performance can be optimised.

• BestPeer provides a location-independent global names lookup server to identify peers with dynamic (or unpredictable) IP addresses. In this way, peers can always collaborate (or share resources) even if their IP addresses differ from one occasion to another.

2.2 Structured P2P systems

Compared with unstructured P2P systems, structured P2P systems employ distributed hash table (DHT) based mechanisms to locate resources deterministically. Typical structured P2P systems include Chord (Stoica et al., 2001), CAN (Ratnasamy et al., 2001), Pastry (Rowstron and Druschel, 2001a), Tapestry (Zhao et al., 2001), Viceroy (Malkhi et al., 2002), Yappers (Ganesan et al., 2003) and Symphony (Manku et al., 2003). In these systems, the overlay topology is tightly controlled and resources are placed on specified peers, so that messages and/or data can be routed to their destinations efficiently. However, when a considerable number of peers join and leave the system simultaneously, routing performance is severely affected. Since this paper focuses on structured P2P systems, we give a more detailed survey of the major existing structured P2P systems as follows:

• CAN. It has a hypercube-like topology. Each peer is mapped onto a d-dimensional coordinate space and assigned a hypercube region (zone) in this space. Each peer corresponds to one zone and stores the data that the hash function maps to this zone. In the d-dimensional space, two peers are neighbours if their zones overlap along d – 1 dimensions and abut along the remaining dimension. Each peer in CAN maintains information only about its immediate neighbours. To locate a key, a querying peer simply routes the message to the neighbour that makes the most progress towards the destination, until the destination peer is reached. Each peer maintains O(d) neighbours (2d on average) and the routing path is $O(dn^{1/d})$ hops ($(d/4)n^{1/d}$ on average).


• Chord. It is based on a ring-like topology. Each peer is mapped onto a one-dimensional circular coordinate space from 0 to $2^m - 1$. The peer responsible for a given key is the first peer that succeeds the key on the circle. Each peer maintains three sets of neighbour information:

  • a predecessor
  • a successor list
  • a finger table of O(log n) peers spaced exponentially around the key space.

To locate a key, each peer forwards the message to the peer in its routing table that has the largest identifier not overshooting the key, until the key lies between the current peer and its successor, which means that the successor is the destination peer. Each peer has O(log n) neighbours and the routing path is O(log n) hops.

• Pastry. Its underlying overlay network has a tree-like topology. Each peer maintains three sets of neighbour information:

  • a leaf set
  • a neighbourhood set
  • a routing table of O(log n) peers.

Given a routing message, the peer first checks whether the key falls in the range of its leaf set. If so, the message is forwarded directly to the destination peer. If the key is not covered by the leaf set, the peer routes the message to the peer in its routing table whose ID shares the longest prefix with the key. In Pastry, each peer has O(log n) neighbours and the routing path is O(log n) hops.

• Tapestry. Like Pastry, it has a tree-like topology. Each peer maintains a neighbour map composed of (log N)/b levels, and a back pointer list that points to the peers that refer to it as a neighbour. Each level in the neighbour map represents a matching suffix up to a digit position in the ID. Every level of the neighbour map contains b entries. To locate a key, each peer routes the message to the neighbour in its neighbour map whose ID shares the longest suffix with the key. Each peer has O(log n) neighbours and the routing path is $O(\log_b n)$ hops.

• Viceroy. It is the first randomised protocol for DHT routing based on the butterfly network. Each node in the system is randomly assigned two identifiers: an ID and an integer identifying its level. The Viceroy network is composed of an approximate butterfly network, a ring connecting nodes in the order of their IDs, and level rings, in which all nodes of the same level are connected in a ring. Each peer maintains a constant number of neighbours. To locate a key, the routing process is divided into three steps:

  • a climbing step
  • a downward step
  • a vicinity search step.

The routing path is O(log n) hops with high probability in a random network construction.

• Yappers. It is a kind of hybrid system combining the advantages of the Gnutella style and DHT based systems. As its name, YAPPERS (Yet Another Peer to Peer System) implies, it operates on top of an arbitrary overlay network, just as Gnutella does, while providing DHT like search efficiency. Specifically, YAPPERS builds many small DHTs, instead of one overarching DHT, on top of an arbitrary overlay and relies on an intelligent forwarding mechanism, similar to Gnutella style flooding, to traverse all the small DHTs if necessary.

• Symphony. Like Yappers, it is a kind of hybrid system; it extends the Chord network structure with additional direct links. The core idea of Symphony is to place all hosts along a ring and equip each node with a few long-distance links. Symphony is inspired by Kleinberg’s Small World construction. It is shown that with k = O(1) links per node, it is possible to route hash lookups with an average latency of $O((1/k)\log^2 n)$ hops.

The proposed C2 network is also a kind of structured P2P system. It combines the merits of CAN and Chord to form a new structured overlay network. Compared with CAN and Chord, C2 boasts of lower computational cost, higher routing efficiency and better fault tolerance. Each C2 peer maintains about O(log n) other peers’ information, and achieves routing within O(log n) hops. Even in a dynamic environment where peers join and depart the system frequently, C2 can still run smoothly.

3 ABOUT CHORD AND CAN

Our new P2P protocol C2 is based on Chord and CAN. To help readers who are not familiar with Chord and CAN understand the new protocol, this section gives a brief introduction to both.

3.1 Chord

Chord provides only one operation: given a key, it maps the key onto a node. Data location is carried out on top of Chord by associating a key with each data item and storing the key/data item pair at the node to which the key maps.

In Chord, both data items and nodes are associated with unique IDs by means of a variant of consistent hashing. Consistent hashing makes each node receive roughly the same number of keys and involves relatively little movement of keys when nodes join or leave the network, thus balancing the load of the network. As nodes enter the network, they are assigned unique IDs by hashing their IP addresses. Keys (file IDs) are assigned to nodes as follows. Identifiers are ordered on an identifier circle modulo $2^m$ (m is a predefined parameter). Key k is assigned to the first node whose identifier is equal to or follows (the identifier of) k in the identifier space. This node is called the successor node of key k.
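The key assignment rule above can be sketched in a few lines. This is a minimal illustration rather than Chord's actual implementation: the function names, the use of SHA-1 truncated to m bits, and the example node IDs are assumptions made here.

```python
import hashlib
from bisect import bisect_left
from typing import List

M = 6  # number of identifier bits (assumed small for illustration)


def chord_id(name: str, m: int = M) -> int:
    """Hash a string (e.g. an IP address or file name) onto the circle [0, 2^m)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(key: int, node_ids: List[int]) -> int:
    """Return the first node ID that is equal to or follows `key` on the circle."""
    ring = sorted(node_ids)
    idx = bisect_left(ring, key)
    # Past the largest ID, wrap around to the smallest ID on the circle.
    return ring[idx % len(ring)]


# Hypothetical 10-node ring on a circle of size 2^6 = 64.
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
```

With these assumed IDs, key 10 is stored on node 14, key 8 on node 8 itself, and key 57 wraps around to node 1.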


When a node n joins the network, certain keys previously assigned to n’s successor will be assigned to n. When node n leaves the network, all keys assigned to it will be reassigned to its successor. These are the only changes in key assignments that need to take place in order to maintain load balance. The only routing information required is for each node to be aware of its successor node on the circle. Queries for a given identifier are passed around the circle via these successor pointers until they first encounter a node that succeeds the identifier. This is the node the query maps to.

In order to speed up the routing process, Chord maintains additional routing information. This additional information is called a finger table, in which each entry i points to the successor of $(n + 2^i) \bmod 2^m$. For a node n to perform a lookup for key k, the finger table is consulted to identify the highest node n′ whose ID is between n and k. If such a node exists, the lookup is repeated starting from n′. Otherwise, the successor of n is returned. By using the finger table, lookups can be completed in O(log N) hops, where N is the number of nodes. Figure 1 shows an example of a 10-peer Chord network. The left subfigure shows the finger table of a peer, and the right subfigure shows how peers route a message.
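The finger table and the lookup loop described above can be sketched as follows. This is a simplified, single-process simulation in which every node can see the whole ring; the function names and the example ring are assumptions made for illustration and are not taken from Figure 1.

```python
from bisect import bisect_left
from typing import List, Tuple


def successor(ident: int, ring: List[int]) -> int:
    """First node ID equal to or following `ident` on the circle."""
    ring = sorted(ring)
    return ring[bisect_left(ring, ident) % len(ring)]


def in_open_interval(x: int, a: int, b: int, size: int) -> bool:
    """True if x lies strictly between a and b going clockwise from a."""
    dist_x = (x - a) % size or size  # x == a counts as a full turn
    dist_b = (b - a) % size or size  # a == b means the whole circle
    return dist_x < dist_b


def finger_table(n: int, ring: List[int], m: int) -> List[int]:
    """Entry i is the successor of (n + 2^i) mod 2^m."""
    return [successor((n + 2 ** i) % 2 ** m, ring) for i in range(m)]


def lookup(start: int, key: int, ring: List[int], m: int) -> Tuple[int, List[int]]:
    """Return (node responsible for key, list of nodes visited)."""
    size = 2 ** m
    n, path = start, [start]
    while True:
        fingers = finger_table(n, ring, m)
        # Highest finger that lies between n and the key, if any.
        preceding = [f for f in reversed(fingers) if in_open_interval(f, n, key, size)]
        if not preceding:
            return fingers[0], path  # fingers[0] is n's successor
        n = preceding[0]
        path.append(n)


# Hypothetical 10-node ring with m = 6.
RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
```

For this assumed ring, `finger_table(8, RING, 6)` gives `[14, 14, 14, 21, 32, 42]`, and a lookup for key 54 starting at node 8 visits nodes 8, 42 and 51 before returning node 56.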

Figure 1 Chord example

3.2 CAN

CAN is a distributed, Internet-scale hash table that maps file names to their locations in the network.

The basic operations performed by CAN include the insertion, lookup and deletion of (key, value) pairs in the distributed hash table. Each CAN node stores a part (called a zone) of the hash table, as well as information about a small number of adjacent zones in the table. Requests to insert, lookup or delete for a particular key are routed via intermediate nodes to the node that maintains the zone containing the key.

CAN uses a virtual d-dimensional Cartesian coordinate space to store (key K, value V) pairs as follows. First, K is deterministically mapped onto a point P in the coordinate space. The (K, V) pair is then stored at the node that owns the zone within which point P lies. To retrieve the entry corresponding to K, any node can apply the same deterministic function to map K to P and then retrieve the corresponding value V from the node that owns P. If P is not owned by the requesting node, the request must be routed from one node to another until the node whose zone contains P is reached. CAN nodes learn and maintain, in a routing table, the IP addresses of the nodes that hold coordinate zones adjoining their own; this table enables routing between arbitrary points in the space. Intuitively, routing in CAN works by following the straight-line path through the Cartesian space from source to destination coordinates. Figure 2 shows the routing strategy of CAN.

New nodes that join the CAN system are allocated their own portion of the coordinate space by splitting the allocated zone of an existing node in half, as follows:

• The new node identifies a node already existing in CAN, using some bootstrap mechanism.

• Using the CAN routing mechanism, it randomly chooses a point P in the space and sends a JOIN request to the node whose zone contains P. The zone will be split, and half will be assigned to the new node.

• The new node learns the IP addresses of its neighbours, and the neighbours of the split zone are notified so that routing can include the new node.
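The zone split in the join steps above can be sketched as below. This is only an illustration under stated assumptions: zones are half-open boxes over real coordinates, the split dimension is passed in by the caller, and the new node receives the upper half. None of these choices is specified in the text, and the function names are hypothetical.

```python
from typing import Dict, Tuple

Point = Tuple[float, ...]
Zone = Tuple[Point, Point]  # (lower corner, upper corner), half-open box


def contains(zone: Zone, p: Point) -> bool:
    """True if point p lies inside the zone (lower bound inclusive)."""
    lo, hi = zone
    return all(l <= x < h for l, x, h in zip(lo, p, hi))


def split_zone(zone: Zone, dim: int) -> Tuple[Zone, Zone]:
    """Split a zone in half along dimension `dim`."""
    lo, hi = zone
    mid = (lo[dim] + hi[dim]) / 2
    lower = (lo, hi[:dim] + (mid,) + hi[dim + 1:])
    upper = (lo[:dim] + (mid,) + lo[dim + 1:], hi)
    return lower, upper


def join(zone: Zone, store: Dict[Point, str], dim: int = 0):
    """Split the owner's zone; hand the upper half and its entries to the new node."""
    kept_zone, new_zone = split_zone(zone, dim)
    kept = {p: v for p, v in store.items() if contains(kept_zone, p)}
    moved = {p: v for p, v in store.items() if contains(new_zone, p)}
    return (kept_zone, kept), (new_zone, moved)
```

For example, splitting the unit square along dimension 0 leaves the existing node with x in [0, 0.5) and gives the new node x in [0.5, 1), together with every stored pair whose point falls in that half.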

Figure 2 CAN example

When nodes leave CAN, the zones they occupy and the associated hash table entries are explicitly handed over to one of their neighbours.


4 THE C2 NETWORK

In this section, we describe the C2 protocol: how to locate a key, how to join the system, and how to recover from the departure of existing peers. In addition, we briefly discuss the fault-tolerance characteristics of C2.

4.1 Overview

The C2 network is a combination of CAN (Ratnasamy et al., 2001) and Chord (Stoica et al., 2001). Peers’ identifiers and resources are mapped onto a discrete d-dimensional Cartesian coordinate space by a uniform hash function, as in CAN and Chord. Each dimension is a circle from 0 to $2^m - 1$ (m is a predefined system parameter), i.e., there are $2^m$ valid coordinate points in total on each dimension. Thus the entire space can be (but is not necessarily) partitioned into at most $2^{m \times d}$ unit hypercubes. We denote these smallest hypercubes (whose range on each dimension is 1) as unit hypercubes (or simply UHs). Since each of these $2^{m \times d}$ unit hypercubes can represent one peer, the entire space can hold at most $2^{m \times d}$ peers. However, one peer may cover more than one unit hypercube, just as a peer in CAN may cover a large zone. In practice, given the dimension number d of the mapping space, the value of m must be large enough for the space to contain all peers that may join the network. In what follows, we denote the number of peers in the C2 network as $n = \alpha \times 2^{m \times d}$ (α is a constant satisfying $\alpha \in (0, 1)$).

All resources are deterministically mapped onto a d-dimensional key in the discrete coordinate space by a uniform hash function. The entire discrete d-dimensional coordinate space is partitioned amongst all peers in the network, and each peer manages the resources mapped into its hypercube zone. Thus, to locate a certain resource is to find the peer whose hypercube zone covers the key of the resource. Each hypercube zone can be identified by its bottom left point and top right point. We define these two points as the MinP and MaxP of the hypercube zone respectively.

Definition 1: Minimal point (MinP): the minimal point of a hypercube zone is the point whose coordinate value is greater than the minimal coordinate value, by one, in every dimension.

Since the coordinate space is discrete and each point in the coordinate space can only be assigned to one peer, the coordinate value of MinP in every dimension is defined by the value one larger than the minimal one.

Definition 2: Maximal point (MaxP): the maximal point of a hypercube zone is the point whose coordinate value is the maximal one in every dimension.

For the case of a 2-dimensional space, given a zone Z whose bottom-left and top-right coordinates are $(x_1, y_1)$ and $(x_2, y_2)$ respectively, its MinP is $(x_1 + 1, y_1 + 1)$ and its MaxP is $(x_2, y_2)$. Let peer P be the owner of Z. Then any data object on P must have a key whose coordinate value (x, y) satisfies $x_1 < x \le x_2$ and $y_1 < y \le y_2$.

Example 1: Figure 3 shows a 2-dimensional $[0, 2^3 - 1] \times [0, 2^3 - 1]$ coordinate space partitioned among ten C2 peers. Each zone is managed by a peer represented by a natural number. The MaxP of a zone is represented by a circular dot and the MinP by a square dot. The keys mapped into the zone of Peer 10 are (5, 1), (5, 2), (6, 1), and (6, 2).
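Definitions 1 and 2 and the key range of Example 1 can be checked with a small sketch. The corner coordinates (4, 0) and (6, 2) used for Peer 10 are inferred here from the four keys listed in the example, not read from Figure 3, and the helper names are hypothetical. Zones that wrap around the circle are not handled.

```python
from itertools import product
from typing import List, Tuple

Point = Tuple[int, ...]


def min_max_points(lower_left: Point, upper_right: Point) -> Tuple[Point, Point]:
    """MinP is one above the lower corner in every dimension; MaxP is the upper corner."""
    min_p = tuple(c + 1 for c in lower_left)
    max_p = tuple(upper_right)
    return min_p, max_p


def covers(lower_left: Point, upper_right: Point, key: Point) -> bool:
    """True if lower < key <= upper holds in every dimension."""
    return all(lo < k <= hi for lo, k, hi in zip(lower_left, key, upper_right))


def keys_in_zone(lower_left: Point, upper_right: Point) -> List[Point]:
    """Enumerate every discrete key between MinP and MaxP inclusive."""
    min_p, max_p = min_max_points(lower_left, upper_right)
    ranges = [range(a, b + 1) for a, b in zip(min_p, max_p)]
    return list(product(*ranges))
```

With the assumed corners, `min_max_points((4, 0), (6, 2))` yields MinP (5, 1) and MaxP (6, 2), and `keys_in_zone` enumerates exactly the four keys given for Peer 10.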

Figure 3 A 2-dimensional C2 with 10 peers

All peers in a C2 system are self-organised into an overlay network. Each peer maintains information about a set of peers in its routing table, which facilitates routing in the coordinate space. Peers obtain this information by communicating with other peers in the network. For an n-peer C2 network, each peer maintains in its routing table information about O(log n) other peers, and a lookup operation requires only O(log n) hops. In a dynamic environment, when a peer joins or leaves the network, C2 needs, with high probability, only O(log n) messages to update the routing information.

4.2 Routing table

The information in the routing table is used for peers to facilitate message routing in the coordinate space. In C2 each peer maintains a routing table which is organised into d rows with m entries in each row, that is, the size of the routing table is d × m.

Definition 3: Neighbour point (NbP): For each peer P, there is a point Q whose coordinate value succeeds P’s MaxP by $2^j$ ($0 \le j \le m - 1$) in the r-th dimension ($1 \le r \le d$) and shares the same value with P’s MinP in all the other dimensions. Point Q is the j-th neighbour point of P in the r-th dimension.

The jth entry at row r of P’s routing table corresponds to the peer whose zone covers P’s jth NbP in the rth dimension. Each entry in the routing table consists of the IP address, the MaxP and the MinP of the corresponding peer.
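Read literally, Definition 3 gives d × m neighbour points per peer, which can be enumerated as in the sketch below. The function name is hypothetical, dimensions are indexed from 0 here rather than from 1, and the MinP and MaxP values for Peer 10 are the ones inferred from the keys in Example 1. Mapping each point to the peer whose zone covers it requires the zone layout of Figure 3 and is not attempted.

```python
from typing import List, Tuple

Point = Tuple[int, ...]


def neighbour_points(min_p: Point, max_p: Point, m: int) -> List[List[Point]]:
    """Return table[r][j]: the j-th neighbour point in dimension r (0-indexed)."""
    size = 2 ** m
    table = []
    for r in range(len(max_p)):
        row = []
        for j in range(m):
            q = list(min_p)                      # other dimensions keep MinP's value
            q[r] = (max_p[r] + 2 ** j) % size    # succeed MaxP by 2^j on the circle
            row.append(tuple(q))
        table.append(row)
    return table
```

For MinP (5, 1), MaxP (6, 2) and m = 3, the first row is (7, 1), (0, 1), (2, 1) and the second row is (5, 3), (5, 4), (5, 6); note how 6 + 2 wraps around to 0 on the circle of size 8.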


Example 2: Consider the C2 network in Figure 3. Table 1 shows the routing tables of Peers 1, 6, and 10.

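Table 1 Routing tables of Peers 1, 6, and 10 (each column gives the peer and the interval)

| Dimension | Peer 1, 2^0 | Peer 1, 2^1 | Peer 1, 2^2 | Peer 6, 2^0 | Peer 6, 2^1 | Peer 6, 2^2 | Peer 10, 2^0 | Peer 10, 2^1 | Peer 10, 2^2 |
| 1 | 4 | 4 | 5 | 7 | 7 | 8 | 8 | 8 | 9 |
| 2 | 9 | 9 | 6 | 1 | 1 | 1 | 7 | 7 | 5 |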

Theorem 1: For an n-peer C2 system, the number of entries in the routing table of any peer is O(log n).

Proof: As we have seen, the number of entries in every peer’s routing table is d × m. The number of peers in the C2 network is $n = \alpha \times 2^{m \times d}$ ($\alpha \in (0, 1)$). Thus, the number of entries in the routing table of any peer is O(log n).
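The step from d × m entries to O(log n) can be spelled out as follows. This is a worked expansion that relies only on the proof's own assumption that α is a constant:

$$
n = \alpha \times 2^{m \times d}
\;\Longrightarrow\;
\log_2 n = m \times d + \log_2 \alpha
\;\Longrightarrow\;
d \times m = \log_2 n + \log_2 \frac{1}{\alpha} = O(\log n).
$$

For instance, in Example 1 we have d = 2 and m = 3, so each routing table has 2 × 3 = 6 entries, and the ten peers correspond to $\alpha = 10 / 2^{6} \approx 0.16$.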

4.3 Routing in C2

In this subsection, we will first introduce the routing scheme of C2; that is, how a query message is routed to the destination peer. Then, we will analyse the performance of our routing scheme, and prove its efficiency.

After receiving a query message, the peer takes two steps to decide to which peer it should forward the message:

• compute the differences between its MaxP and the key in all dimensions

• forward the message to the peer which is the closest to the destination in the routing table.

To achieve the second step of routing, a naive method is to compute the distance to the destination for every peer in the routing table, and then choose the least one. However, this method needs a lot of computation and is quite time consuming. In C2 we design a method which is much faster than the naive one.

Definition 4: Maximal difference dimension (MaxDD): the maximal difference dimension of a peer P with regard to a key K, is the dimension in which the difference between the peer’s MaxP and the key is the largest in all dimensions, that is,

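$$\mathit{MaxDD} = \arg\max_{i=1}^{d} \mathit{Dif}_i(P, K), \qquad (1)$$

where

$$\mathit{Dif}_i(P, K) = \begin{cases} K_i - \mathit{MaxP}_i, & K_i \ge \mathit{MaxP}_i \\ K_i - \mathit{MaxP}_i + 2^m, & K_i < \mathit{MaxP}_i. \end{cases} \qquad (2)$$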

Definition 5: Minimal difference dimension (MinDD): the minimal difference dimension of a peer P with regard to a key K is the dimension in which the difference between the peer’s MaxP and the key is the smallest among all dimensions, that is,

$$
\mathrm{MinDD} = \arg\min_{1 \le i \le d} \mathrm{Dif}_i(P, K). \tag{3}
$$

In the case that more than one dimension is MaxDD or MinDD, any one of them can be taken as the final MaxDD or MinDD.
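Definitions 4 and 5 can be written down directly. The following Python sketch is illustrative only: the function names are assumptions, dimensions are reported 1-based as in the text, and ties are resolved in favour of the lowest dimension, which is one of the choices the text permits. The sample MaxP values (6, 2) and (4, 2) are inferred from the differences quoted in Example 3, not read from Figure 4.

```python
from typing import List, Sequence


def dif(max_p_i: int, k_i: int, m: int) -> int:
    """Equation (2): forward distance from MaxP_i to K_i, wrapping at 2**m."""
    if k_i >= max_p_i:
        return k_i - max_p_i
    return k_i - max_p_i + (1 << m)


def difs(max_p: Sequence[int], key: Sequence[int], m: int) -> List[int]:
    """Differences between MaxP and the key in every dimension."""
    return [dif(p, k, m) for p, k in zip(max_p, key)]


def max_dd(max_p: Sequence[int], key: Sequence[int], m: int) -> int:
    """Equation (1): 1-based dimension with the largest difference."""
    ds = difs(max_p, key, m)
    return ds.index(max(ds)) + 1


def min_dd(max_p: Sequence[int], key: Sequence[int], m: int) -> int:
    """Equation (3): 1-based dimension with the smallest difference."""
    ds = difs(max_p, key, m)
    return ds.index(min(ds)) + 1


# Values inferred from Example 3 (m = 3, key (4, 7)).
print(difs((6, 2), (4, 7), 3), max_dd((6, 2), (4, 7), 3))
print(difs((4, 2), (4, 7), 3), max_dd((4, 2), (4, 7), 3))
```

Computing MaxDD costs d subtractions and comparisons, independent of the number of entries per row.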

To route a message to a key K in the coordinate space, the peer first calculates its MaxDD with regard to K. Then it chooses the farthest legal entry (see Note 1) in the row of MaxDD. Finally, it forwards the message to the peer corresponding to that entry.

The computation cost of such a routing scheme is much lower than that of the naive method which directly computes the distances between the key and all peers in the routing table. Lemma 1 and Theorem 2 will prove that in the coordinate space, the distance between the key and the farthest legal NbP in the MaxDD is not more than the distance between the key and any other legal NbP in the routing table.

Lemma 1: In logical overlay, given a peer P and a key K, P’s farthest legal NbP in its MaxDD has the maximal hop distance towards K in the routing table.

Proof: Given a key $K(K_1, K_2, \ldots, K_d)$ and a peer P whose MaxP is $(P_1, P_2, \ldots, P_d)$ in d-dimensional coordinate space, let $2^j \le \mathrm{Dif}_{MaxDD}(P, K) < 2^{j+1}$ ($0 \le j < m$).

Then, the farthest legal NbP in MaxDD is $Q(P_1, P_2, \ldots, P_{MaxDD} + 2^j, \ldots, P_d)$, and its hop distance in the logical overlay is $2^j$.

Suppose, for contradiction, that there exist h and s ($h \ne MaxDD$, $1 \le s < m$; without loss of generality, we assume $h > MaxDD$) such that the hop distance of the legal NbP $Q'(P_1, P_2, \ldots, P_{MaxDD}, \ldots, P_h + 2^s, \ldots, P_d)$ is larger than that of Q, i.e., $2^j < 2^s$. Then $2^j \le \mathrm{Dif}_{MaxDD}(P, K) < 2^{j+1} \le 2^s \le \mathrm{Dif}_h(P, K) < 2^{s+1}$ ($0 \le j < s < m$), which contradicts the fact that

$$
\mathrm{MaxDD} = \arg\max_{1 \le i \le d} \mathrm{Dif}_i(P, K).
$$

Thus, $2^j$ is the maximal legal hop distance.

Theorem 2: In logical overlay, given a peer P and a key K, P’s farthest legal NbP in its MaxDD has the minimal hop distance towards K in the routing table.

Proof: Given a key $K(K_1, K_2, \ldots, K_d)$ and a peer $P(P_1, P_2, \ldots, P_d)$ in d-dimensional coordinate space, let $2^j \le \mathrm{Dif}_{MaxDD}(P, K) < 2^{j+1}$ ($0 \le j < m$). $Q(P_1, P_2, \ldots, P_{MaxDD} + 2^j, \ldots, P_d)$ is the farthest legal NbP in P’s MaxDD.

For any legal NbP $Q'(P_1, P_2, \ldots, P_{MaxDD}, \ldots, P_h + 2^s, \ldots, P_d)$ in the hth dimension ($h \ne MaxDD$; without loss of generality, we assume $h > MaxDD$), Lemma 1 gives $s \le j$. Then,

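$$
\begin{aligned}
D^2_{Q',K} - D^2_{Q,K}
&= \sum_{i=1}^{d} \mathrm{Dif}_i(Q', K)^2 - \sum_{i=1}^{d} \mathrm{Dif}_i(Q, K)^2 \\
&= \left[ \mathrm{Dif}_{MaxDD}(P, K)^2 + \left( \mathrm{Dif}_h(P, K) - 2^s \right)^2 \right] \\
&\quad - \left[ \left( \mathrm{Dif}_{MaxDD}(P, K) - 2^j \right)^2 + \mathrm{Dif}_h(P, K)^2 \right] \\
&= 2^j \times \left[ 2 \times \mathrm{Dif}_{MaxDD}(P, K) - 2^j \right] \\
&\quad - 2^s \times \left( 2 \times \mathrm{Dif}_h(P, K) - 2^s \right).
\end{aligned}
$$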

8 W. CAI, S. ZHOU, W. QIAN, L. XU, K-L. TAN AND A. ZHOU

In the case $s = j$,

$$
D^2_{Q',K} - D^2_{Q,K} = 2^{s+1} \times \left[ \mathrm{Dif}_{MaxDD}(P, K) - \mathrm{Dif}_h(P, K) \right] \ge 0.
$$

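In the case $s < j$, that is, $s + 1 \le j$,

$$
\begin{aligned}
D^2_{Q',K} - D^2_{Q,K}
&= 2^s \times \left[ 2^{j-s} \times \left( 2 \times \mathrm{Dif}_{MaxDD}(P, K) - 2^j \right) - \left( 2 \times \mathrm{Dif}_h(P, K) - 2^s \right) \right] \\
&\qquad \left( \mathrm{Dif}_{MaxDD}(P, K) \ge 2^j \ \text{and} \ \mathrm{Dif}_h(P, K) < 2^{s+1} \right) \\
&\ge 2^s \times \left( 2^{2j-s} - 2^{s+2} + 2^s \right) \\
&\ge 2^s \times \left( 2^{s+2} - 2^{s+2} + 2^s \right) \\
&= 2^{2s} > 0.
\end{aligned}
$$

Thus, in both cases $D_{Q',K} \ge D_{Q,K}$. Therefore, P’s farthest legal NbP in its MaxDD has the minimal hop distance towards K in the routing table.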

Algorithm 1 is the pseudocode of the routing strategy in C2. The input of the algorithm is the key, and the output is the identifier of the peer to which the message will be forwarded. When a peer P receives a lookup message, it first checks whether the key lies in its own zone. If its own zone covers the key, the routing succeeds; otherwise, P computes its MaxDD with regard to the key and tries to forward the message to the peer Q covering its farthest legal NbP in the MaxDD. If Q is available, the algorithm returns Q; otherwise, P tries the second MaxDD (the dimension with the second largest difference between P’s MaxP and the key), and so on. Example 3 shows how a peer routes a message to the destination, and Theorem 3 proves that the number of routing hops is logarithmic in the network size.

Algorithm 1: P.Route(key)
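The next-hop choice described for Algorithm 1 can be sketched in Python as follows. This is not the authors’ pseudocode: the function signature, the `is_alive` callback, and the representation of the routing table as `table[r][j]` (the peer covering the NbP at distance $2^j$ in dimension r + 1) are assumptions. The sketch also interprets "legal" as $2^j \le \mathrm{Dif}_r(P, K)$, so the farthest legal entry in a row is the highest set bit of the difference. Peer 10’s row values come from Table 1; its MaxP (6, 2) is inferred from Example 3.

```python
from typing import Callable, List, Optional, Sequence


def dif(max_p_i: int, k_i: int, m: int) -> int:
    """Equation (2): forward distance from MaxP_i to K_i, wrapping at 2**m."""
    return k_i - max_p_i if k_i >= max_p_i else k_i - max_p_i + (1 << m)


def route(self_id: int, covers_key: bool, max_p: Sequence[int],
          key: Sequence[int], m: int, table: List[List[int]],
          is_alive: Callable[[int], bool]) -> Optional[int]:
    """Return the peer to forward to, self_id on success, or None if stuck."""
    if covers_key:
        return self_id  # the key lies in this peer's own zone
    difs = [dif(p, k, m) for p, k in zip(max_p, key)]
    # Try MaxDD first, then the second MaxDD, and so on.
    # sorted() is stable, so ties keep the lower dimension first.
    order = sorted(range(len(difs)), key=lambda r: -difs[r])
    for r in order:
        if difs[r] == 0:
            continue  # no legal entry in this dimension
        j = difs[r].bit_length() - 1  # largest j with 2**j <= difs[r]
        candidate = table[r][j]
        if is_alive(candidate):
            return candidate
    return None  # every dimension is exhausted or blocked


peer10_table = [[8, 8, 9], [7, 7, 5]]  # rows of Peer 10 in Table 1
print(route(10, False, (6, 2), (4, 7), 3, peer10_table, lambda p: True))
print(route(10, False, (6, 2), (4, 7), 3, peer10_table, lambda p: p != 9))
```

With all peers alive the sketch selects Peer 9, and with Peer 9 unavailable it falls back to Peer 5, which agrees with the choices described in Example 3.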

Example 3: Figure 4 shows how Peer 10 locates the key (4, 7) in a C2 network similar to that of Figure 3. Suppose Peer 10 wants to locate a file whose key is (4, 7). Since the difference between Peer 10’s MaxP and (4, 7) is 6 in the first dimension and 5 in the second dimension, the MaxDD of Peer 10 with regard to key (4, 7) is the first dimension. Peer 10 searches the first row in its routing table. Because the farthest legal NbP of Peer 10 with regard to (4, 7) is (2, 1), which is in Peer 9’s zone, Peer 10 forwards the message to Peer 9 (if Peer 9 is not available, Peer 10 will forward the message to Peer 5). The differences between Peer 9’s MaxP and (4, 7) in the two dimensions are 0 and 5 respectively, so the MaxDD of Peer 9 with regard to (4, 7) is the second dimension. Peer 9 then searches the second row in its routing table. Because the entry corresponding to the farthest legal NbP of Peer 9 with regard to (4, 7) is Peer 1, Peer 9 forwards the message to Peer 1. Similarly, Peer 1 sends the message to Peer 4, and Peer 4 forwards the message to Peer 2, whose zone covers the key (4, 7).

Figure 4 Example of routing

Theorem 3: Suppose the number of peers in the C2 network is n, for any given peer P, the number of hops to locate a key K is O(log n).

Proof: Let the key K be managed by peer Q. Recall that if P ≠ Q, then P forwards the message to a peer T, whose zone covers the farthest legal NbP in the MaxDD of P. Suppose T is in the jth entry of the rth row. The distance between P’s MaxP and its NbP, which is covered by T’s zone, is $2^{j-1}$. Since P chooses the jth entry but not the (j + 1)th entry, the distance between P’s MaxP and K is less than $2^j$, so the distance between T’s MaxP and K in dimension r is not greater than the distance between T and P. Therefore, T at least halves the distance between P and the key in the rth dimension. Since the range of every dimension is $2^m$, the distance reduces to 0 within m steps. Since there are d dimensions, the total number of steps is at most d × m, which is O(log n).
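The halving argument can be checked numerically in an idealised model. The sketch below assumes that every hop lands exactly on the chosen NbP (as if every zone were a single cell), which is a simplification not made in the paper; the function names are hypothetical. Under that assumption each hop clears the highest set bit of the largest remaining difference, so the hop count equals the total number of 1 bits in the differences.

```python
from itertools import product
from typing import Sequence


def hops(difs: Sequence[int]) -> int:
    """Count greedy hops until all per-dimension differences reach 0."""
    remaining = list(difs)
    total = 0
    while any(remaining):
        # MaxDD: the dimension with the largest remaining difference.
        r = max(range(len(remaining)), key=lambda i: remaining[i])
        # Farthest legal step: the largest power of two not exceeding it.
        remaining[r] -= 1 << (remaining[r].bit_length() - 1)
        total += 1
    return total


def mean_hops(d: int, m: int) -> float:
    """Average hop count over all difference vectors in a (2**m)**d space."""
    counts = [hops(v) for v in product(range(1 << m), repeat=d)]
    return sum(counts) / len(counts)


print(hops([6, 5]))                 # differences quoted in Example 3
print(mean_hops(2, 3), 2 * 3 / 2)   # idealised mean versus d * m / 2
```

In this model the worst case is d × m hops and the mean is d × m / 2. For the differences (6, 5) of Example 3 the sketch counts four hops, the same length as the path 10, 9, 1, 4, 2 described there, although the real network need not land exactly on each NbP.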

4.4 Join C2

In a dynamic network, peers can join and leave the system at any time. In this subsection, we introduce how the system handles the joining of a new peer. The method for handling the departure of an existing peer is introduced in the next subsection.


When a new peer arrives, it needs to initialise its routing table, and then inform some other peers of its presence. In order to balance the workload, a new peer must be allocated its own zone, which is generated by splitting the zone of an existing peer in half or taking over a zone from an existing peer. The joining process takes six steps:

• find a peer already in the C2 network by a bootstrap mechanism

• generate a point in the coordinate space by using a hashing method

• route to the existing peer whose zone covers the point

• redirect to the peer which is responsible for the biggest zone in the routing table

• split the zone; the new incoming peer manages the smaller half (see Note 2)

• update the routing tables of both peers and inform other peers of its presence.
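The fourth and fifth steps above can be sketched as follows. Everything in this Python fragment is an illustrative assumption: the `Zone` class with an inclusive lower corner and exclusive upper corner, the helper names, the tie-breaking rule, and the choice to give the new peer the lower half (motivated by the remark in the proof of Theorem 4 that the splitting peer’s minimal point changes). The paper does not state which dimension is split, so `dim` is left as a parameter.

```python
from dataclasses import dataclass
from math import prod
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Zone:
    lo: Tuple[int, ...]  # lower corner, inclusive (assumed convention)
    hi: Tuple[int, ...]  # upper corner, exclusive (assumed convention)

    def volume(self) -> int:
        return prod(h - l for l, h in zip(self.lo, self.hi))


def pick_split_target(covering: str, routing_entries: Iterable[str],
                      zones: Dict[str, Zone]) -> str:
    """Step 4: among the covering peer and its routing table entries,
    pick the peer with the biggest zone (ties favour the covering peer)."""
    candidates = [covering, *routing_entries]
    return max(candidates, key=lambda peer: zones[peer].volume())


def split_zone(zone: Zone, dim: int) -> Tuple[Zone, Zone]:
    """Step 5: halve the zone along `dim`.

    Returns (new_peer_zone, existing_peer_zone); the new peer receives
    the lower half, which is never the larger one."""
    mid = (zone.lo[dim] + zone.hi[dim]) // 2
    lower_hi = zone.hi[:dim] + (mid,) + zone.hi[dim + 1:]
    upper_lo = zone.lo[:dim] + (mid,) + zone.lo[dim + 1:]
    return Zone(zone.lo, lower_hi), Zone(upper_lo, zone.hi)


zones = {"A": Zone((0, 0), (2, 2)), "B": Zone((2, 0), (6, 4)),
         "C": Zone((0, 2), (2, 4))}
target = pick_split_target("A", ["B", "C"], zones)
print(target, split_zone(zones[target], dim=0))
```

Redirecting to the largest zone known locally only inspects the routing table, so it adds no messages beyond those already needed for the join.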

In order to simplify the join and departure scheme, each peer P maintains d predecessor pointers and a routing table which is a list of peers satisfying the condition that P is in these peers’ routing tables. When peers join and leave, maintaining predecessor pointers and routing tables can facilitate the update of routing tables.

Theorem 4: With high probability, any peer joining an n peers C2 network will use O(log n) messages to update the routing tables.

Proof: We will classify the update of the routing table into two categories:

• Update the rth row, which corresponds to the dimension in which the existing peer splits its zone. This problem is similar to updating the finger table in Chord, for which the Chord technical report proves that O(log n) messages suffice. In C2, since the diameter of the rth dimension is $2^m$, the cost of updating the rth row of the routing table is $O(\log 2^m) = O(m)$. In most cases, the splitting peer need not update the rth row of its routing table. However, if one or more entries in the rth row point to the splitting peer itself, these entries must be redirected to the new peer, at a cost of O(1).

• Update the other (d – 1) rows of the routing table. The new peer can download the routing table of the splitting peer; none of its entries needs modification except those in the rth row. However, the splitting peer must update its routing table, since its minimal point has changed. The cost of this update is $O(\log^2 2^m) = O(m^2)$ in each dimension. As a practical optimisation, the splitting peer can use its old routing table to help find the correct values for its new routing table, since the two will be similar. This can be shown to reduce the cost of updating the routing table to $O(\log 2^m) = O(m)$.

Thus the total cost of updating the routing table is O(d × m) = O(log n).

In the fourth step of the join method, we redirect the new peer to the peer whose zone is the largest in the routing table. This redirection is very useful for the workload balancing of C2, as the experiments in Section 5.2 show.

Moreover, all the peers which should be informed of the new peer’s presence are in the routing table of the splitting peer. The new peer downloads this routing table and then sends messages to inform all the peers in it directly. Having been informed, these peers update their routing tables by deciding whether to substitute the new peer for the splitting peer, and send messages to the splitting peer and the new peer so that they can update their own routing tables. The cost of informing each peer in the routing table is O(1). Since the average number of entries in the routing tables is O(log n), the cost of this process is also O(log n).

Algorithm 2 is the pseudocode of the join strategy of C2. Example 4 provides an example of the procedure of a new peer’s join.

Algorithm 2: P.Join(Q)

Example 4: Figure 5 shows how Peer 11, which randomly generates the point (1, 5), joins the C2 network via Peer 10. It first routes to Peer 1, whose zone covers (1, 5). Then, it splits the zone of Peer 1 and takes over the smaller half.

Figure 5 Example of join


Finally, it builds up its routing table and notifies the peers in Peer 1’s routing table. Moreover, Peer 1 updates its own routing table. Table 2 shows Peer 11’s routing table and the change of Peer 1’s routing table.

Table 2 Change of routing table in join

             Peer 1 (before join)   Peer 1 (after join)    Peer 11
Dimension    2^0  2^1  2^2          2^0  2^1  2^2          2^0  2^1  2^2
1            4    4    5            2    2    3            4    4    5
2            9    9    6            9    9    6            1    1    9

4.5 Leave C2

When peers leave C2, their zones should be taken over by some existing peers. The routing table of existing peers should also be updated.

In the case of node departure, the peer P that is about to depart the network should find an existing peer to take over its zone. In order to balance the workload, P chooses the most underloaded peer Q in its routing table, and notifies Q to take over the zone. Q then downloads the routing table from P without any modification. Finally, Q notifies the peers whose routing tables contain P, so that they replace P with Q.

In the case of node failure, the least loaded neighbour peer will take over the zone. Similar to the node join method, the routing table can be re-established within O(log n) messages. However, it is very difficult to instantaneously inform the peers whose routing tables contain the failed peer. Thus, there is a high probability that not all the information in the routing tables is correct. One remedy is for every peer to contact the peers in its routing table periodically (this is the same as the stabilise method in Chord; see Stoica et al., 2001). When some of the peers in its routing table are unavailable, a peer updates these entries by relocating the corresponding NbPs using the routing scheme described in Section 4.3.
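The periodic repair described above can be sketched in Python. The function name `stabilise`, the `is_alive` and `lookup` callbacks, and the table layout are assumptions made for illustration; `lookup` stands in for routing a query to the peer whose zone covers a given NbP and is not implemented here.

```python
from typing import Callable, List, Optional, Tuple

Point = Tuple[int, ...]


def stabilise(table: List[List[int]], nbps: List[List[Point]],
              is_alive: Callable[[int], bool],
              lookup: Callable[[Point], Optional[int]]) -> int:
    """Replace unavailable routing entries in place.

    table[r][j] is the peer currently recorded for the NbP nbps[r][j].
    Returns the number of entries that were repaired."""
    repaired = 0
    for r, row in enumerate(table):
        for j, peer in enumerate(row):
            if is_alive(peer):
                continue
            # Relocate the NbP through the normal routing scheme.
            replacement = lookup(nbps[r][j])
            if replacement is not None:
                row[j] = replacement
                repaired += 1
    return repaired


# Hypothetical run: Peer 9 has failed and Peer 12 now covers its zone.
table = [[8, 8, 9], [7, 7, 5]]
nbps = [[(7, 1), (0, 1), (2, 1)], [(5, 3), (5, 4), (5, 6)]]
count = stabilise(table, nbps, lambda p: p != 9, lambda point: 12)
print(count, table)
```

A real deployment would run this on a timer; entries whose lookup fails are simply left for the next round.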

A better strategy to handle the departure of peers is introduced in our technical report. Example 5 shows an example of Peer 8’s departure. In addition, we will discuss the fault tolerance characteristic of C2 in the next subsection.

Example 5: Figure 6 shows an example of Peer 8’s departure. Peer 8 first chooses Peer 10 to take over its zone. Then, Peer 10 downloads the routing table from Peer 8 and notifies the peers whose routing tables contain Peer 8. Finally, Peer 10 manages two zones.

Figure 6 Example of departure

4.6 Fault tolerance

In a dynamic environment, both peer departure and failure are unpredictable. Hence, some routing tables are out of date. Even with some outdated routing tables in the network, the routing scheme of C2 can still reach the destination with little additional overhead.

Thanks to its unique routing scheme, C2 is quite fault tolerant. Suppose peer P wants to route a message to the peer whose zone covers key K. It computes the differences between its MaxP and K in all dimensions, and sorts the dimensions by their differences. Then, it chooses the farthest legal entry in MaxDD. If the peer corresponding to the entry is available, P forwards the message to it; otherwise, the second MaxDD is used, and so on. Therefore, in most cases P can find an alternative path to the destination peer; routing fails at P only if none of the d dimensions offers an available entry. This worst case rarely happens in practice.

Experimental results show that the performance of query routing degrades only modestly as the number of failures grows. Even when a considerable number of peers fail, only a small additional routing cost arises, and the performance (number of hops) of successful query routing remains satisfactory.

5 PERFORMANCE EVALUATION

In this section, we present simulation results to demonstrate the performance of C2. We implemented an experimental C2 simulator in Java, which can deploy up to $2^{18}$ peers. All the following experiments were run on IBM’s JDK 1.3 on a PC with a 2.4 GHz Pentium IV and 512 MB of DDR RAM. On this simulator, a number of C2 properties are evaluated experimentally:

• we will examine the relationship between routing efficiency and network scale, and compare C2 with CAN

• we will take a close look at the distribution of the routing cost

• we will study the workload distribution of the C2 network

• we will investigate how routing performance changes along with the proportion of failed peers in the C2 network.

5.1 Efficiency and scalability

In the first simulation experiment, we vary the number of peers from $2^6$ to $2^{18}$ in networks with d = 4, m = 6 and d = 2, m = 12 respectively. In each configuration of the C2 overlay network, we run $10^4$ trials in which one peer is chosen randomly and one destination point is generated in the coordinate space. A message is routed from the chosen peer to the point, and the average number of hops is recorded.

The experimental results on efficiency and scalability are shown in Figure 7, where the ‘LogN’ and ‘CAN’ curves show the values of log n and the average number of hops in CAN (Ratnasamy et al., 2001), respectively. The results show that the average number of hops in an n-peer C2 network is about (1/2) log n. As the network grows, the number of routing hops in CAN grows faster than in C2, which means that C2 is more efficient and scalable than CAN.

Figure 7 Scalability

In fact, the efficiency and scalability properties of C2 are the same as those of Chord (Stoica et al., 2001).

Figure 8 shows the probability density function of routing efficiency in a $2^{18}$-peer 4-dimensional C2 network. In each of the $10^6$ trials, the number of hops is recorded. We can see that the lengths of most routing paths are about (1/2) log n. In the worst case, the length of the routing path is log n.

Figure 8 Hop distribution

5.2 Workload balance

In our second simulation experiment, the workload balancing characteristic of C2 is evaluated. We run this experiment on a 4-dimensional C2 with m = 6. Initially, only one peer is in the system. Then, peers join the system one by one until there are $2^{18}$ peers in the system. We approximate the workload of a peer by the volume of its zone.

Figure 9 shows the workload distribution of peers in C2.

Figure 9 Workload balancing

Here, V represents the average workload. We can see that, without checking the routing tables to choose the biggest zone, only about 40% of the peers have workload V; the heaviest workload is 8V, while the lightest is (1/16)V. However, by checking the routing table locally, the workload balance improves significantly: more than 90% of the peers have a workload equal to the average, and the heaviest workload is only 2V. These experimental results show that the join strategy of C2 is favourable for workload balance.

5.3 Fault tolerance

In our third simulation experiment, the fault tolerance of C2 is examined. We run this experiment by varying the proportion of failed peers in the network from (1/128) to (1/4) in a C2 network with $2^{16}$ peers. In each of the $10^4$ trials, two online peers are chosen randomly and a message is routed between them. The proportion of failed routings and the average number of hops of successful routings are recorded.

Figure 10 illustrates the fault tolerance of C2. Simulation results show that only a small fraction of routings fail even when a considerable number of peers are offline simultaneously and, consequently, some peers’ routing tables are outdated. The reason, which is discussed in Section 4.6, is that our routing method can bypass failed peers.


Figure 10 Failure routing vs. failure peer

Figure 11 shows that the average number of hops of successful routings, when varying the proportion of failure peers, does not increase significantly; that is, when a considerable number of peers fail, the efficiency of our C2 will not degrade drastically.

Figure 11 Performance in failure

Figure 12 shows that the distribution of routing efficiency is very similar to that without peer failures, that is, the expected number of hops is (1/2) log n, and the worst case hop number is log n.

Figure 12 Performance distribution in failure

6 CONCLUSION AND FUTURE WORK

In this paper, we present and evaluate C2, a new overlay network based on CAN and Chord. C2 is decentralised, scalable, and fault tolerant. We show how a C2 overlay network can be efficiently and reliably constructed to support resource sharing in a dynamic environment.

Each C2 peer maintains a routing table with only O(log n) entries, and a message can be routed to its destination within O(log n) hops. In a dynamic environment, C2 achieves high fault tolerance thanks to two merits:

• the system is resilient to failures because it can update the routing information with no more than O(log n) messages

• even when the routing tables are outdated, the routing scheme can still bypass the failed peers to the target, with moderate overheads.

Hence, even with simultaneous failures of a considerable number of peers in the network, the system is still reliable and the system performance can be guaranteed. Our theoretical analysis and simulation results validate the scalability and fault tolerance of C2.

In the future, we plan to extend the current work towards an intelligent P2P overlay network that is robust, topology-aware, and sensitive to peer heterogeneity:

• Although the current routing scheme can bypass the failed peer, it is vulnerable to sybil attack (Douceur, 2002). Thus, it is important to enhance C2’s robustness.

• Since the routing scheme presented in this paper is based on the logical overlay and is independent of the physical topology, which may lead to high latency, it is necessary to make the routing algorithm adaptive to the underlying network topology.

• Since the capacities of peers throughout the internet differ drastically, efficiency may be low if powerful peers and weak peers carry the same workload. Therefore, it is important to make the peer join algorithm adaptive to the heterogeneity of peers.

ACKNOWLEDGEMENTS

This paper was partially supported by the Doctoral Subject Program of Ministry of Education under grant 20030246023, Natural Science Foundation of China under 60373019, and Infocomm Development Authority of Singapore under the agreement on Fudan-NUS P2P Computing Competence Center. The authors would like to thank Professor B.C. Ooi of National University of Singapore, and the anonymous reviewers for their helpful and insightful comments on the submission version of this paper.


REFERENCES

Aberer, K. (2001) ‘P-grid: a self-organizing access structure for P2P information systems’, Proceedings of the 9th International Conference on Cooperative Information Systems, Springer-Verlag, pp.179–194.

Douceur, J. (2002) ‘The sybil attack’, Proceedings of the 1st International Workshop on Peer-to-Peer Systems (IPTPS’2002), Springer-Verlag, LNCS, 7–8 March, Cambridge, MA, USA, Vol. 2429, pp.251–260.

Ganesan, P., Sun, Q. and Garcia-Molina, H. (2003) ‘Yappers: a peer-to-peer lookup service over arbitrary topology’, [Online] Available: citeseer.ist.psu.edu/ganesan03yappers.html.

Harvey, N., Jones, M., Saroiu, S., Theimer, M. and Wolman, A. (2003) ‘Skipnet: a scalable overlay network with practical locality properties’, [Online] Available: citeseer.ist.psu.edu/harvey03skipnet.html.

Malkhi, D., Naor, M. and Ratajczak, D. (2002) ‘Viceroy: a scalable and dynamic emulation of the butterfly’, Proceedings of the 21st ACM Symposium on Principles of Distributed Computing, ACM, Monterey, California, USA, 21–24 July, pp.183–192.

Manku, G., Bawa, M. and Raghavan, P. (2003) ‘Symphony: distributed hashing in a small world’, [Online] Available: citeseer.ist.psu.edu/manku03symphony.html.

Maymounkov, P. and Mazieres, D. (2002) ‘Kademlia: a peer-to-peer information system based on the xor metric’, [Online] Available: citeseer.ist.psu.edu/article/maymounkov02kademlia.html.

Ng, W.S., Ooi, B.C. and Tan, K-L. (2002) ‘Bestpeer: a self-configurable peer-to-peer system’, Proceedings of IEEE Conference on Data Engineering (ICDE'2001), IEEE Computer Society, 26 February – 1 March 2002, San Jose, CA, p.272.

Ratnasamy, S., Francis, P., Handley, M., Karp, R. and Shenker, S. (2001) ‘A scalable content-addressable network’, Proceedings of ACM SIGCOMM, ACM Press, San Diego, CA, USA, 27–31 August, pp.161–172.

Rowstron, A. and Druschel, P. (2001a) ‘Pastry: scalable, distributed object location and routing for large-scale peer-to-peer systems’, Proceedings of IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), Springer, LNCS, 12–16 November, Heidelberg, Germany, Vol. 2218, pp.329–350.

Stoica, I., Morris, R., Karger, D., Kaashoek, M.F. and Balakrishnan, H. (2001) ‘Chord: a scalable peer-to-peer lookup service for internet applications’, Proceedings of ACM SIGCOMM 2001, ACM Press, 27–31 August, San Diego, CA, USA, pp.149–160.

Zhao, B., Kubiatowicz, J. and Joseph, A. (2001) Tapestry: An Infrastructure for Fault-Tolerant Wide-area Location and Routing, Tech. Rep. UCB/CSD-01-1141, University of California at Berkeley, Computer Science Department, USA.

NOTES

1 ‘Legal’ means that the NbP of the hop should not exceed the key in any dimension.

2 For simplicity, we only consider the case in which every peer manages exactly one zone. In the following subsection, we will see that it is quite possible for a peer to manage more than one zone. In that case, the newly arriving peer takes over one of these zones instead of splitting an existing zone.

WEBSITES

Freenet homepage, http://freenet.sourceforge.com/.

Gnutella development homepage, http://gnutella.wego.com/.

Napster homepage, http://www.napster.com/.

O’Reilly P2P directory, http://www.openp2p.com/pub/q/p2pcategory/.

Peer-to-peer working group, http://www.peer-to-peerwg.org/.

BIBLIOGRAPHY

Azar, Y., Broder, A.Z., Karlin, A.R. and Upfal, E. (2000) ‘Balanced allocations’, SIAM Journal on Computing, Vol. 29, No. 1, pp.180–200, [Online] Available: citeseer.ist.psu.edu/azar94balanced.html.

Datar, M. (2002) ‘Butterflies and peer-to-peer networks’, Proceedings of the 10th Annual European Symposium on Algorithms, Springer-Verlag, LNCS, 17–21 September, Vol. 2461, pp.310–322, Rome, Italy.

Harren, M., Hellerstein, J.M., Huebsch, R., Loo, B.T., Shenker, S. and Stoica, I. (2001) ‘Complex queries in DHT-based peer-to-peer networks’, Proceedings of the 1st International Workshop on Peer-to-Peer Systems (IPTPS’2002), Springer-Verlag, LNCS, 7–8 March, Vol. 2429, Cambridge, MA, USA, pp.242–250.

Hildrum, K., Kubiatowicz, J.D., Rao, S. and Zhao, B.Y. (2002) ‘Distributed object location in a dynamic network’, Proceedings of the Fourteenth ACM Symposium on Parallel Algorithms and Architectures (SPAA), ACM, 11–13 August, Winnipeg, Manitoba, Canada, pp.41–52.

Kaashoek, F. and Rowstron, A. (Eds.) (2002) Electronic Proceedings for the 1st International Workshop on Peer-to-Peer Systems (IPTPS’2002), Available at: http://www.cs.rice.edu/Conferences/IPTPS02/.

Kubiatowicz, J., Bindel, D., Chen, Y., Czerwinski, S., Eaton, P., Geels, D., Gummadi, R., Rhea, S., Weatherspoon, H., Weimer, W., Wells, C. and Zhao, B. (2000) ‘Oceanstore: an architecture for global-scale persistent storage’, Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2000), 12–15 November, Cambridge, MA, USA, pp.190–201.

Liben-Nowell, D., Balakrishnan, H. and Karger, D. (2002) ‘Analysis of the evolution of peer-to-peer systems’, [Online] Available: citeseer.ist.psu.edu/liben-nowell02analysis.html.

Oram, A. et al. (Eds.) (2001) Peer-to-Peer: Harnessing the Power of Disruptive Technologies, O’Reilly & Associates, Inc., March, Sebastopol, CA, USA.

Pandurangan, G. (2001) ‘Building low-diameter p2p networks’, Proceedings of the 42nd IEEE Symposium on Foundations of Computer Science, IEEE Computer Society, p.492.

Plaxton, C., Rajaram, R. and Richa, A.W. (1997) ‘Accessing nearby copies of replicated objects in a distributed environment’, Proceedings of the 9th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA’97), 23–25 June, ACM Press, Newport, RI, USA, pp.311–320.

Ratnasamy, S., Shenker, S. and Stoica, I. (2002) ‘Routing algorithms for dhts: some open questions’, Proceedings of the 1st International Workshop on Peer-to-Peer Systems (IPTPS’2002), 7–8 March, Springer-Verlag, LNCS, Cambridge, MA, USA, Vol. 2429, pp.45–52.


Ripeanu, M. (2001) ‘Peer-to-peer architecture case study: Gnutella network’, IEEE Computer Society, 27–29 August, Linköping, Sweden, pp.99–100, [Online] Available: citeseer.ist.psu.edu/ripeanu01peertopeer.html.

Rowstron, A. and Druschel, P. (2001b) ‘Past: persistent and anonymous storage in a peer-to-peer networking environment’, Proceedings of the 8th IEEE Workshop on Hot Topics in Operating Systems (HotOS 2001), 20–23 May, Elmau, Oberbayern, Germany, pp.65–70.

Saia, J., Fiat, A., Gribble, S., Karlin, A. and Saroiu, S. (2002) ‘Dynamically fault-tolerant content addressable networks’, [Online] Available: citeseer.ist.psu.edu/saia02dynamically.html.

Saroiu, S., Gummadi, K. and Gribble, S. (2002) ‘A measurement study of peer-to-peer file sharing systems’, Proceedings of Multimedia Conferencing and Networking (MMCN'02), January, San Jose, CA.