“Umbrella”: A novel fixed-size DHT protocol
A.D. Sotiriou
Overview
Novel distributed hash table architecture
Supports key publishing and retrieval on top of an overlay network for content distribution
Efficient algorithms based on a distributed routing table of constant size for each node
Minimize traffic load
Related Work
Plaxton, Rajaraman and Richa
  Algorithm was not developed for P2P systems
  Based on the ground rule of comparing one byte at a time
  Required knowledge of the latencies between all nodes
Tapestry
  Variation of the Plaxton algorithm
  Adjusted for P2P systems
  Routing table of β·log_β N neighbors, search in at most log_β N steps
Chord
  Applied a different approach
  Placed nodes in a circular space
  Maintained information only for a number of successor and predecessor nodes through a finger table
  Finger table of O(log N) size
CAN
  Built on Pastry’s variation and implemented a DHT in a d-dimensional Cartesian space on a d-torus
  Repeatedly divided the space and distributed it among the nodes
  Each node maintained information about its neighbors
  Constant O(d) table, but required O(d·N^(1/d)) steps for lookups
Structure Overview
Creation of an overlay network
All inserting nodes are identified by a unique code
  SHA-1 on the combination of IP and computer name
Main objective of the architecture
  Insert and retain nodes in a simple and well-structured manner
  Allow querying and fetching of content
Efficient
Fault-tolerant
Retain up-to-date information about a limited, constant number of neighboring nodes
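The identifier step can be sketched as follows. This is a minimal illustration, not the authors’ implementation: the slides only say that SHA-1 is applied to a combination of IP and computer name, so the `|` separator, the UTF-8 encoding and the name `make_node_id` are assumptions.

```python
import hashlib


def make_node_id(ip: str, computer_name: str) -> str:
    """Return a 40-digit hexadecimal identifier for a joining node."""
    # Assumption: the two fields are joined with "|" and UTF-8 encoded;
    # the slides do not say how IP and computer name are combined.
    material = f"{ip}|{computer_name}".encode("utf-8")
    return hashlib.sha1(material).hexdigest()


# Hypothetical address and host name, used only for illustration.
example_id = make_node_id("192.0.2.10", "workstation-7")
print(example_id, len(example_id))
```

A SHA-1 digest is 160 bits, so each identifier has 40 hexadecimal digits, which fits a tree that branches on one hexadecimal digit per level.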
Structure Overview
Form of a 16-ary tree
Each node is placed in a hierarchy
  1 parent node
  16 child nodes
  Each node operates autonomously
  Further links for fault tolerance
Each level n holds at most 16^(n+1) nodes
The relation between a parent node at level n and a child node at level n+1:
  The first n+1 digits of the parent’s identifier equal the corresponding digits of the child’s
  The (n+2)-th digit of the child’s identifier determines the child’s position in the parent’s child list
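The parent/child relation can be checked with a few lines of code. The sketch below assumes identifiers are hexadecimal strings and reads the per-level bound as 16^(n+1); the function names are hypothetical.

```python
HEX_BASE = 16  # one hexadecimal digit per tree level


def shares_parent_prefix(parent_id: str, parent_level: int, child_id: str) -> bool:
    """True if the first n+1 digits agree, where n is the parent's level."""
    prefix_len = parent_level + 1
    return parent_id[:prefix_len] == child_id[:prefix_len]


def child_position(parent_level: int, child_id: str) -> int:
    """Slot (0..15) in the parent's child list, taken from the (n+2)-th digit."""
    # The (n+2)-th digit, counted from 1, sits at index n+1 counted from 0.
    return int(child_id[parent_level + 1], HEX_BASE)


def level_capacity(level: int) -> int:
    """Maximum number of nodes at a level, read as 16^(level+1)."""
    return HEX_BASE ** (level + 1)


# Shortened, made-up identifiers for illustration.
print(shares_parent_prefix("a3c0", 1, "a37f"))  # both start with "a3"
print(child_position(1, "a37f"))                # third digit is "7"
print(level_capacity(0), level_capacity(1))
```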
Routing Table
Three sets of neighborhood nodes
  Basic: main table for routing
  Upper: allows routing to nodes of a higher level (when the parent node is unreachable)
  Lower: allows routing to nodes of a lower level (when child nodes fail)
Each node is responsible for modifying or fixing its routing table when nodes
  Enter
  Leave
  Fail to communicate
Maximum steps required: O(log_b N)
Field      Set    Description
Level      Basic  The level at which the node resides
Right      Basic  The non-empty node to the right
Left       Basic  The non-empty node to the left
Up         Basic  The parent node
Right2     Upper  The node residing to the right of the parent node
Left2      Upper  The node residing to the left of the parent node
Up2        Upper  The parent’s parent node
Right3     Lower  One (random) child of the node to the right
Left3      Lower  One (random) child of the node to the left
Umbrella   Basic  All children nodes
Umbrella2  Lower  A (random) child node from each child
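One way to hold these fields in memory is sketched below. The class name, the use of plain strings as node references and the slot-indexed dictionaries are assumptions made for illustration; the slides only list the fields and their sets.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RoutingTable:
    level: int
    # Basic set
    right: Optional[str] = None
    left: Optional[str] = None
    up: Optional[str] = None
    umbrella: Dict[int, str] = field(default_factory=dict)   # slot -> child
    # Upper set
    right2: Optional[str] = None
    left2: Optional[str] = None
    up2: Optional[str] = None
    # Lower set
    right3: Optional[str] = None
    left3: Optional[str] = None
    umbrella2: Dict[int, str] = field(default_factory=dict)  # slot -> one grandchild

    def neighbors(self) -> List[str]:
        """All non-empty entries, for example as targets of periodic pings."""
        singles = [self.right, self.left, self.up,
                   self.right2, self.left2, self.up2,
                   self.right3, self.left3]
        found = [n for n in singles if n is not None]
        found.extend(self.umbrella.values())
        found.extend(self.umbrella2.values())
        return found


# Under this reading: 8 single links, up to 16 children, up to 16 grandchildren.
MAX_NEIGHBORS = 8 + 16 + 16

# Shortened, made-up identifiers for illustration.
table = RoutingTable(level=1, up="a0", left="a2",
                     umbrella={7: "a37"}, umbrella2={7: "a37c"})
print(table.neighbors(), MAX_NEIGHBORS)
```

The bound does not depend on the network size N, which is what makes the table fixed-size.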
Main Algorithms – Insert
A new node contacts an already connected node and issues a request for insertion
The established node checks whether the first n+1 digits of the new identifier match its own, where n is the level at which it resides
  If not, the insertion message is forwarded to the node’s parent
  If yes, the message is forwarded to the child whose (n+2)-th digit matches that of the new node
  If no such child exists, the new node is placed as a child of the current node
The new node is informed of its new neighbors and vice versa
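The decision each established node takes during insertion can be written as a single pure function. This is a sketch under assumptions: identifiers are hexadecimal strings, `child_digits` is the set of digits for which the node already has a child, and the action labels are hypothetical names.

```python
from typing import Iterable, Tuple


def insertion_step(node_id: str, level: int,
                   child_digits: Iterable[str], new_id: str) -> Tuple[str, str]:
    """Return (action, digit) for an insertion request seen at one node."""
    prefix_len = level + 1
    if new_id[:prefix_len] != node_id[:prefix_len]:
        # The new node belongs elsewhere in the tree: send the request upwards.
        return ("forward_to_parent", "")
    next_digit = new_id[prefix_len]  # the (n+2)-th digit of the new identifier
    if next_digit in set(child_digits):
        return ("forward_to_child", next_digit)
    # No child occupies that slot, so the new node is attached here.
    return ("adopt_as_child", next_digit)


# A level-1 node "a3c0" that already has a child in slot "7".
print(insertion_step("a3c0", 1, {"7"}, "b111"))
print(insertion_step("a3c0", 1, {"7"}, "a37f"))
print(insertion_step("a3c0", 1, {"7"}, "a3e2"))
```

What a level-0 node without a parent does with a non-matching identifier is not stated on the slide, so the sketch leaves that case to the caller.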
Main Algorithms – Publish
If the first n+1 digits of the content’s identifier differ from the node’s, the publish message is forwarded to the parent node
If they match, it is forwarded to the child with the matching (n+2)-th digit
If no such child exists, the node publishes the content itself
Main Algorithms – Search
The node first checks for the keyword in its list of published keywords
  If it exists, the search terminates
  If not, it checks whether the first n+1 digits are identical to its own identifier
    If not, the message is forwarded to its parent
    If yes, it is forwarded to the child with the matching (n+2)-th digit
    If no such child exists, the search fails
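The publish and search rules can be exercised on a small in-memory tree. The sketch below is illustrative only: the `Node` class, the shortened 4-digit identifiers and the handling of a level-0 node without a parent (the request simply stops) are assumptions, since the slides do not describe routing between level-0 nodes.

```python
from typing import Dict, Optional


class Node:
    def __init__(self, ident: str, level: int = 0, parent: "Optional[Node]" = None):
        self.ident = ident
        self.level = level
        self.parent = parent
        self.children: Dict[str, "Node"] = {}  # next digit -> child
        self.published: Dict[str, str] = {}    # content id -> content
        if parent is not None:
            parent.children[ident[level]] = self

    def owns_prefix(self, key: str) -> bool:
        n = self.level + 1
        return key[:n] == self.ident[:n]


def publish(node: Node, key: str, content: str) -> Optional[Node]:
    """Route a publish message; return the node that stores the content."""
    while True:
        if not node.owns_prefix(key):
            if node.parent is None:
                return None  # other level-0 subtree: not modelled here
            node = node.parent
            continue
        child = node.children.get(key[node.level + 1])
        if child is None:
            node.published[key] = content
            return node
        node = child


def search(node: Node, key: str) -> Optional[str]:
    """Route a search message; return the content, or None if the search fails."""
    while True:
        if key in node.published:
            return node.published[key]
        if not node.owns_prefix(key):
            if node.parent is None:
                return None
            node = node.parent
            continue
        child = node.children.get(key[node.level + 1])
        if child is None:
            return None
        node = child


root = Node("a000")
mid = Node("a300", 1, root)
leaf = Node("a370", 2, mid)
holder = publish(leaf, "a3f1", "report.pdf")  # climbs to mid, no "f" child there
print(holder.ident, search(root, "a3f1"), search(root, "a9aa"))
```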
Main Algorithms – Departure
If the node has no children, all of its keywords are forwarded to its parent and it informs all its neighbors of its departure
If it has any children, it randomly picks one and copies all of its neighborhood and keyword information to it before departing
The chosen child moves up a level and substitutes the departing node
If the chosen child has any children, the previous step is repeated recursively until a node with no children is reached; the first step is then executed, ending the algorithm
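One reading of the departure steps is sketched below. It is an interpretation, not the authors’ code: each tree position is a `Node` object whose occupant identity changes when a child moves up, neighbor notification is left out, and the class and function names are hypothetical.

```python
import random
from typing import Dict, Iterable, Optional, Set


class Node:
    def __init__(self, ident: str, level: int = 0,
                 parent: "Optional[Node]" = None, keywords: Iterable[str] = ()):
        self.ident = ident
        self.level = level
        self.parent = parent
        self.children: Dict[str, "Node"] = {}  # next digit -> child position
        self.keywords: Set[str] = set(keywords)
        if parent is not None:
            parent.children[ident[level]] = self


def depart(node: Node, rng: random.Random) -> None:
    """Vacate the position held by `node`, promoting random descendants."""
    if not node.children:
        # First step: hand keywords to the parent and unlink the position.
        if node.parent is not None:
            node.parent.keywords |= node.keywords
            del node.parent.children[node.ident[node.level]]
        return
    digit = rng.choice(sorted(node.children))
    successor = node.children[digit]
    # The chosen child takes over this position: it brings its identity and
    # its own keywords, and inherits the departing node's keywords and links.
    node.ident = successor.ident
    node.keywords |= successor.keywords
    successor.keywords = set()
    # The child's old position is now vacant, so repeat one level down.
    depart(successor, rng)


root = Node("a000", keywords={"k0"})
mid = Node("a300", 1, root, {"k1"})
leaf = Node("a370", 2, mid, {"k2"})
depart(root, random.Random(0))
print(root.ident, sorted(root.keywords), root.children["3"].ident)
```

Because a child shares its parent’s prefix, the promoted node always fits the slot it moves into, and no keyword is lost along the chain.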
Enhanced Algorithms
System liable to sudden node departures
  Voluntary departure without calling the appropriate mechanism
  Sudden departures due to client errors
  Network disconnections
Treat all of the above cases in the same manner
  Changes to the algorithms already presented
  Allow the system to bypass node failures
Most changes are based on using the upper and lower sets
  The upper set is used to forward messages to nodes of a higher level
  The lower set is used to forward messages to nodes of a lower level
Enhanced Algorithms – Parent Failure
Forward requests, in turn, to:
  The parent’s parent node (field Up2 in the upper set)
  The node to the right of the parent node (field Right2 in the upper set)
  The node to the left of the parent node (field Left2 in the upper set)
Whichever of the above succeeds first terminates the mechanism
Enhanced Algorithms – Child Failure
Forward requests, in turn, to:
  One of the child’s children (field Umbrella2 in the lower set)
  The node on the right of the child (field Umbrella in the basic set)
  The node on the left of the child (field Umbrella in the basic set)
  A child of the node to the right of the issuing node (field Right3)
  A child of the node to the left of the issuing node (field Left3)
Whichever of the above succeeds first terminates the mechanism
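Both fallback orders follow the same pattern: try the alternatives in sequence and stop at the first one that answers. The sketch below assumes the routing table is a plain dictionary keyed by field name, that `send` returns True when a node accepts the message, and that "right/left of the child" means the nearest occupied Umbrella slot on that side; all names and values are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional


def first_reachable(candidates: Iterable[Optional[str]],
                    send: Callable[[str], bool]) -> Optional[str]:
    """Try each candidate in order; the first successful send ends the search."""
    for candidate in candidates:
        if candidate is not None and send(candidate):
            return candidate
    return None


def parent_fallbacks(table: Dict[str, object]) -> List[Optional[str]]:
    """Order used when the parent (Up) does not respond."""
    return [table.get("Up2"), table.get("Right2"), table.get("Left2")]


def child_fallbacks(table: Dict[str, object], slot: int) -> List[Optional[str]]:
    """Order used when the child in `slot` does not respond."""
    umbrella = table.get("Umbrella", {})
    umbrella2 = table.get("Umbrella2", {})
    right = next((umbrella[s] for s in range(slot + 1, 16) if s in umbrella), None)
    left = next((umbrella[s] for s in range(slot - 1, -1, -1) if s in umbrella), None)
    return [umbrella2.get(slot), right, left,
            table.get("Right3"), table.get("Left3")]


table = {
    "Up2": "gp", "Right2": "uncle_r", "Left2": "uncle_l",
    "Umbrella": {3: "c3", 7: "c7", 9: "c9"}, "Umbrella2": {7: "gc7"},
    "Right3": "nephew_r", "Left3": "nephew_l",
}
alive = {"uncle_l", "c9", "nephew_l"}


def send(node: str) -> bool:
    return node in alive


via_parent = first_reachable(parent_fallbacks(table), send)
via_child = first_reachable(child_fallbacks(table, 7), send)
print(via_parent, via_child)
```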
Repair Mechanism
We have designed a repair mechanism
  Invoked whenever a node failure is detected
The algorithm uses the delete algorithm to repair the failure of a child
  All failures can be transformed into a child failure by contacting nodes in the neighboring table and forwarding a repair message
  Once the appropriate node is reached and informed of the child failure, a variation of the delete algorithm is invoked to repair the failure
    Substituting the failed node with one of its children
    Deleting it if none is available
Each node is responsible for checking its neighborhood table periodically
  Issuing ping messages to all node entries
  Invoking the repair mechanism whenever a failure is detected
This mechanism considerably increases the system’s stability and fault tolerance
Repair Mechanism
Check whether the failed node had children
  If it had none, contact all of its neighbors using the neighborhood table and inform them of the new structure
  If it had, one of them must be in the Umbrella2 entry
    Pick a random entry in the Umbrella2 field and inform all neighbors of the change
    The new child is informed and gathers the appropriate new neighborhood settings from nearby nodes
repair()
  if has_Umbrella2()
    kid = get_appropriate_Umbrella2()
    contact_neighbors_of_change(kid)
    kid.gather_new_neighbors()
  else
    contact_neighbors_of_change()
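A runnable reading of the repair pseudocode, seen from the parent of the failed child, might look like this. It is a sketch under assumptions: `umbrella` and `umbrella2` are slot-indexed dictionaries, `notify_neighbors` is a hypothetical callback standing in for the neighbor messages, and the step in which the promoted node gathers its new neighborhood is not modelled.

```python
from typing import Callable, Dict, Optional


def repair_child(slot: int,
                 umbrella: Dict[int, str],
                 umbrella2: Dict[int, str],
                 notify_neighbors: Callable[[int, Optional[str]], None]) -> Optional[str]:
    """Repair the failed child in `slot`; return its replacement, if any."""
    # Umbrella2 holds one known child of each child, i.e. a grandchild.
    replacement = umbrella2.pop(slot, None)
    if replacement is not None:
        umbrella[slot] = replacement  # the grandchild substitutes the failed child
    else:
        umbrella.pop(slot, None)      # no children known: delete the entry
    notify_neighbors(slot, replacement)
    return replacement


# Shortened, made-up identifiers for illustration.
umbrella = {2: "a32", 7: "a37"}
umbrella2 = {7: "a37c"}
log = []
first = repair_child(7, umbrella, umbrella2, lambda s, r: log.append((s, r)))
second = repair_child(2, umbrella, umbrella2, lambda s, r: log.append((s, r)))
print(first, second, umbrella, log)
```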
Simulation Results
Extended the NeuroGrid simulator
Implemented the Umbrella algorithms
Two sets of results
  Without repair mechanism
  With repair mechanism
Variable network size
Random node failures
No Failures - Hops
Test the integrity of our design under normal conditions
Conducted simulations with node populations varying from 10 up to 6000 nodes
Investigated the number of hops required for a successful insertion and lookup with a varying population of nodes
Number of hops grows logarithmically with the node population in all mechanisms
No Failures - Messages
Investigated the overall traffic generated by our architecture
  Total messages per request
Low number of messages exchanged
  Due to the small number of hops required for each successful request
  Also due to the limited (constant) number of neighbors maintained by each node
Total number increases linearly with the node population
Failures With No Repair
Conducted a second set of simulations to test the system’s tolerance of node failures
Progressively caused node failures from 0 up to 80% of the total node population, in steps of 5%
For up to 22% failing nodes, the success rate stays high (over 80%)
Slowly degrades up to a mid-point of 50%
Beyond that point the system becomes unstable and success rates drop sharply
Failures With Repair – Success Rate
Conducted a third set of simulations with repair
Progressively caused node failures from 0 up to 80% of the total node population, in steps of 5%
3T, 6T and 20T repair periods, where T is a constant representing communication activity
  This ensures that an inactive node will not flood the network with repair messages
Repair mechanism dramatically increases the success rate
  Regardless of the node population
[Figure: two plots of Lookups (y-axis, 0 to 120) against Fail ratio (x-axis, 0 to 80); series: No Repair, Repair 3, Repair 6, Repair 20]
Failures With Repair – Messages
Total amount was expected to increase
  Remains almost constant for failure rates of up to 50%
  Increases linearly from then on
In all cases, the total per-node average is kept reasonably low
[Figure: two plots of Total Messages against Fail ratio (x-axis, 0 to 80), one with a y-axis up to 1,400,000 and one up to 14,000,000; series: No Repair, Repair 3, Repair 6, Repair 20]
Conclusions
Novel protocol
  Based on a distributed hash table
  Supports key publishing and retrieval on top of an overlay network for content distribution
Analysed our system
  Demonstrated its correctness and efficacy through simulation
Its main strengths are
  Fixed-size routing table
  Efficient routing in O(log_b N) steps, even when more than half of the system’s population suddenly fails
Questions?