Peer to peer network schemes and finding algorithms
Uploaded by mohamed-el-sharnoby
Searching in P2P networks
Mohamed Elsharnouby - Istanbul Sehir University
P2P networks
Structured:
- CAN
- Chord
- Tapestry
- Pastry
- Viceroy
Unstructured:
- Freenet
- Gnutella
- BitTorrent
Structured
Pros:
- Can find any resource, even a rare one
- Search is more efficient because it exploits the structure
Cons:
- Not as robust and resilient as unstructured networks
- Overhead of maintaining the structure as peers join and leave
Unstructured
Pros:
- More resilient to failures
- Better handling of joining/leaving peers
- Allow better optimization of routing by changing the overlay structure
Cons:
- Rare resources are harder to find, if they are found at all
- Searching can flood and overload the whole network
Search in Structured Networks
Content Addressable Network (CAN)
Multidimensional Cartesian coordinate space on a d-torus
Each peer has a neighbour list
Routing performance is O(d × N^(1/d)) hops, where d is the number of dimensions and N is the number of peers
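The routing idea can be illustrated with a minimal sketch of greedy forwarding on a torus. It assumes a simplified CAN in which every peer owns exactly one cell of a grid with `side` cells per axis (real CAN zones have uneven sizes). The names `torus_distance`, `neighbours`, and `route` are illustrative, not part of any CAN implementation.

```python
def torus_distance(a, b, side):
    """Manhattan distance between grid points a and b with wrap-around."""
    return sum(min(abs(x - y), side - abs(x - y)) for x, y in zip(a, b))

def neighbours(p, side):
    """The 2*d cells adjacent to p: one step along each axis, both directions."""
    result = []
    for axis in range(len(p)):
        for step in (-1, 1):
            q = list(p)
            q[axis] = (q[axis] + step) % side
            result.append(tuple(q))
    return result

def route(src, dst, side):
    """Forward greedily to the neighbour closest to dst; return the path."""
    path = [src]
    while path[-1] != dst:
        path.append(min(neighbours(path[-1], side),
                        key=lambda q: torus_distance(q, dst, side)))
    return path

path = route((0, 0), (3, 6), side=8)
print(len(path) - 1)  # 5 hops: 3 along the first axis, 2 by wrapping the second
```

With d = 2 and N = 64 peers, each peer keeps only 2 × d = 4 neighbours, and the path length grows with the side length N^(1/d).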
Joining: an existing peer's zone is split in half and the new peer takes over one half
Neighbour list: the new peer derives its list from the old peer's list, and all neighbouring peers update theirs
Leaving: a neighbouring peer takes over the departing peer's zone and the neighbour lists are updated
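The join step can be sketched as follows. This is a minimal illustration that covers only the zone split and the hand-over of stored keys, not the neighbour-list updates. The `Zone` class and `split` function are hypothetical names, and the choice that the old peer keeps the lower half is an assumption.

```python
from dataclasses import dataclass, field

@dataclass
class Zone:
    lo: tuple  # lower corner, inclusive
    hi: tuple  # upper corner, exclusive
    keys: dict = field(default_factory=dict)  # point -> stored value

    def contains(self, point):
        return all(l <= x < h for l, x, h in zip(self.lo, point, self.hi))

def split(zone, axis):
    """Halve `zone` along `axis`.

    The old peer keeps the lower half (an assumption of this sketch); the
    returned zone belongs to the joining peer and receives the keys in it.
    """
    mid = (zone.lo[axis] + zone.hi[axis]) / 2
    new_lo = tuple(mid if i == axis else v for i, v in enumerate(zone.lo))
    new = Zone(new_lo, zone.hi)
    zone.hi = tuple(mid if i == axis else v for i, v in enumerate(zone.hi))
    for point in list(zone.keys):
        if new.contains(point):
            new.keys[point] = zone.keys.pop(point)
    return new

old = Zone((0.0, 0.0), (1.0, 1.0), {(0.2, 0.3): "a", (0.8, 0.1): "b"})
new = split(old, axis=0)
print(old.hi, new.lo, sorted(new.keys.values()))
```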
CAN improvement
Multiple coordinate spaces (realities): each peer gets a different place in every reality, while data keeps the same place
Increasing dimensions: gives better routing than adding realities, but both are needed
Overloading zones (several peers per zone): more data availability, fault tolerance, shorter routing
Topological awareness of the underlying IP network
Using multiple hash functions: increases data availability
Chord
Peers are organized around a circle according to their m-bit IDs, which are assigned by a uniform hash function
Each data item is hashed to an ID on the same circle and stored at its successor peer (the first peer whose ID is equal to or follows the item's ID)
Routing takes O(log N) hops if peer information is up to date
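A minimal sketch of the key-to-peer assignment, assuming SHA-1 as the uniform hash and a small m = 16 for readability. The function names `chord_id` and `successor` are illustrative.

```python
import hashlib
from bisect import bisect_left

M = 16  # bits in an ID (assumed small here for readability)

def chord_id(name, m=M):
    """Hash a name onto the 2**m identifier circle with SHA-1."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

def successor(peer_ids, key_id):
    """First peer ID equal to or following key_id, wrapping around the circle."""
    ring = sorted(peer_ids)
    i = bisect_left(ring, key_id)
    return ring[i % len(ring)]

peers = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
print(successor(peers, 54))  # 56
print(successor(peers, 57))  # 1, because the circle wraps around
```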
Each peer keeps a finger table whose i-th entry is the successor of the peer's own ID plus 2^(i-1), so the distances covered grow by powers of two [ hence the O(log N) routing ]
Resilience is increased by also maintaining a list of the peer's r direct successors
Joining and leaving: successor pointers need to be updated, which is done by a stabilization protocol that runs periodically in the background
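The finger table and the lookup it enables can be sketched as below. This is a simplified illustration: it computes every table from a global list of peers (`PEERS`, an assumed ten-peer ring with m = 6), whereas a real peer only knows its own fingers, and it omits the successor list and stabilization.

```python
M = 6  # ID bits, so the circle has 2**6 = 64 positions
PEERS = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])  # assumed example ring

def successor(ident):
    """First peer whose ID is equal to or follows ident on the circle."""
    ident %= 2 ** M
    return next((p for p in PEERS if p >= ident), PEERS[0])

def finger_table(n):
    """Entry i (1-based) is successor(n + 2**(i-1))."""
    return [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)]

def in_open(x, a, b):
    """True if x lies strictly between a and b going clockwise."""
    return (a < x < b) if a < b else (x > a or x < b)

def lookup(start, key):
    """Return the peers visited while resolving key, starting at start."""
    path, n = [start], start
    while True:
        succ = successor(n + 1)
        if key == succ or in_open(key, n, succ):
            path.append(succ)  # succ is responsible for the key
            return path
        # Otherwise jump to the farthest finger that still precedes the key.
        n = next((f for f in reversed(finger_table(n)) if in_open(f, n, key)), succ)
        path.append(n)

print(finger_table(8))  # [14, 14, 14, 21, 32, 42]
print(lookup(8, 54))    # [8, 42, 51, 56]
```

Each jump at least halves the remaining distance to the key, which is where the O(log N) bound comes from.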
Routing takes O(log N) hops, which is much better than CAN
A peer joining or leaving needs O(log² N) messages, which is worse than CAN, which requires O(2 × d)
Could borrow some of CAN's improvement ideas, such as multiple realities
Cannot take IP topology into account
Tapestry
The nth peer that a message reaches shares a suffix of at least length n with the destination ID
Routing takes O(log_b N) hops, where b is the base of the IDs
Uses multiple roots for each data object to avoid single points of failure
Robustness is increased by keeping two backup peers for each primary entry in the neighbour map
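Suffix-based forwarding can be sketched as follows. This is a minimal illustration that picks the next hop from a global list of peer IDs, while a real Tapestry peer consults only its own neighbour map. The IDs and the names `shared_suffix_len` and `route` are assumptions, and the destination is assumed to be in the peer list.

```python
def shared_suffix_len(a, b):
    """Number of trailing digits that a and b have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def route(src, dst, peers):
    """Resolve at least one more trailing digit of dst per hop; return the path.

    Assumes dst is one of `peers`, so a better next hop always exists.
    """
    path = [src]
    while path[-1] != dst:
        have = shared_suffix_len(path[-1], dst)
        better = [p for p in peers if shared_suffix_len(p, dst) > have]
        # Take the smallest improvement to mimic digit-by-digit routing.
        path.append(min(better, key=lambda p: shared_suffix_len(p, dst)))
    return path

peers = ["0325", "B4F8", "9098", "7598", "4598"]
print(route("0325", "4598", peers))
# ['0325', 'B4F8', '9098', '7598', '4598']
```

The hops match suffixes 8, 98, 598, and 4598 in turn, so the path length is bounded by the number of digits in an ID, that is log_b of the ID space.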
Pastry
Similar to Tapestry
Differs in how it optimizes for the locality of peers
Less efficient replication algorithm
Viceroy
- General Ring: every node is connected to its successor and predecessor
- Level Ring: every node is connected to the other nodes of its level on a ring
- Butterfly: every node at level L has:
  - a down-right edge to a long-range node at level L+1
  - a down-left edge to a close-range node at level L+1
  - an up edge to a close-range node at level L-1
Routing performance is O(log N)
Search in Unstructured Networks
Freenet
It searches using steepest-ascent hill climbing with backtracking
It caches the found file at the peers along the path, which improves later routing
Anonymity is one of the main properties of the network
Least Recently Used (LRU) is the basic cache replacement algorithm
An enhanced cache replacement algorithm can be used instead
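The search and path caching can be sketched as a depth-first search that always tries the neighbour with the closest key first and backtracks on dead ends. This is a minimal illustration: numeric keys, absolute difference as the distance, and the `hops_to_live` budget are assumptions of the sketch, and no anonymity mechanism is modelled.

```python
def search(graph, keys, stores, start, target, hops_to_live=10):
    """Closest-key-first depth-first search with backtracking.

    graph:  peer -> list of neighbour peers
    keys:   peer -> numeric key that peer is associated with
    stores: peer -> set of keys held locally (gains cached copies)
    Returns the path from start to the peer holding target, or None.
    """
    visited = set()
    hops_left = hops_to_live

    def visit(peer):
        nonlocal hops_left
        visited.add(peer)
        if target in stores[peer]:
            return [peer]
        for nxt in sorted(graph[peer], key=lambda p: abs(keys[p] - target)):
            if nxt in visited or hops_left == 0:
                continue
            hops_left -= 1
            found = visit(nxt)
            if found:
                stores[peer].add(target)  # cache the file on the way back
                return [peer] + found
        return None  # dead end: the caller tries its next-best neighbour

    return visit(start)

graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": []}
keys = {"A": 50, "B": 12, "C": 30, "D": 11, "E": 70}
stores = {peer: set() for peer in graph}
stores["E"].add(10)
print(search(graph, keys, stores, "A", 10))  # ['A', 'C', 'E']
```

In this example the search first follows B and D because their keys look closest, hits a dead end, and backtracks to C. Afterwards A and C hold cached copies, so a repeated request stops at A.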
Enhanced-clustering with Random Shortcut
It applies the small-world concept: the replacement candidate is the farthest node in the cache
If the newly added node is closer, it replaces the candidate
If it is farther, it replaces the candidate only with a certain probability
The optimum choice of that probability is still an open question
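One replacement step of this policy might look like the sketch below. It assumes numeric keys and that distance is measured from a per-peer reference key, called `seed` here. Both the name and the absolute-difference distance are assumptions, as is the parameter name `p` for the shortcut probability.

```python
import random

def replace_entry(cache, seed, new_key, p, rng=random):
    """Apply one replacement step to a full cache (a list of numeric keys).

    The eviction candidate is the entry farthest from `seed`.
    A closer new key always replaces it (clustering); a farther one
    replaces it only with probability p (random shortcut).
    Returns True if the new key was stored.
    """
    farthest = max(cache, key=lambda k: abs(k - seed))
    closer = abs(new_key - seed) < abs(farthest - seed)
    if closer or rng.random() < p:
        cache[cache.index(farthest)] = new_key
        return True
    return False

cache = [10, 12, 40]
replace_entry(cache, seed=11, new_key=15, p=0.03)
print(cache)  # [10, 12, 15]: 15 is closer to 11 than 40 was
```

The clustering rule keeps most entries near the peer's own region of the key space, while the occasional far entry acts as the long-range shortcut of a small-world graph.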
Gnutella
Routing through the network is mainly done by flooding (BFS), bounded by a TTL that limits the number of hops
This puts a heavy load on the network when too many nodes join
To join, a client connects to one of the peers and broadcasts its content by flooding as well
Ultra peers with higher bandwidth were introduced to carry out the network routing and search operations for their leaves
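TTL-limited flooding can be sketched as a breadth-first traversal that also counts the query messages sent, which makes the load problem visible. The graph, the peer names, and the function name `flood` are illustrative, and ultra peers are not modelled.

```python
from collections import deque

def flood(graph, has_file, start, ttl):
    """Breadth-first flood from start; each forward uses up one TTL unit.

    Returns (hits, messages): the peers that hold the file, and the number
    of query messages sent, including duplicates dropped by the receiver.
    """
    seen = {start}
    queue = deque([(start, ttl)])
    hits, messages = [], 0
    while queue:
        peer, remaining = queue.popleft()
        if peer in has_file:
            hits.append(peer)
        if remaining == 0:
            continue  # TTL expired: do not forward any further
        for nxt in graph[peer]:
            messages += 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits, messages

graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
         "D": ["B", "C", "E"], "E": ["D"]}
print(flood(graph, {"E"}, "A", ttl=2))  # ([], 6): E is three hops away
print(flood(graph, {"E"}, "A", ttl=3))  # (['E'], 9)
```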
BitTorrent
A centralized P2P system: peers are discovered through a central tracker
It cuts files into fixed-size pieces (for example 256 KB each) and hashes each piece with SHA-1 to confirm the integrity of the data
A client connects to a tracker, which returns a random set of peers that have the needed file
A downloaded piece can be seeded to other peers
The introduction of a DHT enabled trackerless BitTorrent
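The piece hashing and integrity check can be sketched with the standard `hashlib` module. This is a minimal illustration of the idea only: it does not build a torrent file or talk to a tracker, and the function names are hypothetical.

```python
import hashlib

PIECE_SIZE = 256 * 1024  # bytes, the piece size quoted above

def piece_hashes(data, piece_size=PIECE_SIZE):
    """SHA-1 digest of every piece; the last piece may be shorter."""
    return [hashlib.sha1(data[i:i + piece_size]).digest()
            for i in range(0, len(data), piece_size)]

def verify_piece(index, piece, expected):
    """Check a downloaded piece against the published list of hashes."""
    return hashlib.sha1(piece).digest() == expected[index]

data = bytes(600 * 1024)  # a 600 KB file of zero bytes as stand-in content
hashes = piece_hashes(data)
print(len(hashes))                                   # 3 pieces
print(verify_piece(0, data[:PIECE_SIZE], hashes))    # True
print(verify_piece(0, b"x" + data[1:PIECE_SIZE], hashes))  # False
```

Because every piece is verified on its own, a peer can pass a piece on as soon as it has checked it, without waiting for the whole file.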
Questions?
Thank you