
“Umbrella”: A novel fixed-size DHT protocol

A.D. Sotiriou

Overview

Novel distributed hash table architecture

Supports key publishing and retrieval on top of an overlay network for content distribution

Efficient algorithms based on a distributed routing table of constant size for each node

Minimize traffic load

Related Work

Plaxton, Rajaraman and Richa: the algorithm was not developed for P2P systems; it is based on the ground rule of comparing one byte at a time; it required knowledge of the latencies between all nodes

Tapestry: a variation of the Plaxton algorithm, adjusted for P2P systems; routing table of $\beta \log_\beta N$ neighbors, search in at most $\log_\beta N$ steps

Chord: applied a different approach, placing nodes in a circular space; each node maintains information only for a number of successor and predecessor nodes through a finger table of $O(\log N)$ size

CAN: built further on Pastry’s variation and applied a DHT to a d-dimensional Cartesian space based on a d-torus; the space is repeatedly divided and distributed amongst the nodes, which maintain information about their neighbors; constant $O(d)$ table, but lookups require $O(dN^{1/d})$ steps

Structure Overview

Creation of an overlay network

All inserting nodes are identified by a unique code: SHA-1 on the combination of IP and computer name

Main objectives of the architecture: insert and retain nodes in a simple and well-structured manner; allow querying and fetching of content

Efficient and fault-tolerant

Retain up-to-date information about a limited, constant number of neighboring nodes
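A minimal sketch of how such an identifier could be derived. The slide only states that SHA-1 is applied to a combination of IP and computer name; the `|` separator, the UTF-8 encoding and the function name `node_id` are assumptions made for illustration.

```python
import hashlib


def node_id(ip: str, computer_name: str) -> str:
    """Return a 40-digit hexadecimal identifier for a node.

    Assumption: IP and computer name are joined with '|' and UTF-8
    encoded before hashing; the slides do not specify the combination.
    """
    material = f"{ip}|{computer_name}".encode("utf-8")
    return hashlib.sha1(material).hexdigest()


if __name__ == "__main__":
    # Example values are placeholders, not taken from the presentation.
    print(node_id("192.0.2.10", "alice-laptop"))
```

A SHA-1 digest is 160 bits, so the hexadecimal form has 40 digits, and each hexadecimal digit takes one of 16 values.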

Structure Overview

Form of a 16-ary tree

Each node is placed in a hierarchy: 1 parent node and up to 16 child nodes; each node operates autonomously; further links for fault tolerance

Each level n holds at most $16^{n+1}$ nodes

The relation between a parent node at level n and a child node (level n+1):

The first n+1 digits of the parent’s identifier are equal to the corresponding digits of the child’s

Digit n+2 of the child’s identifier determines the child’s position in the parent’s child list
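The parent and child relation can be written as two small helpers. This is a sketch that assumes identifiers are hexadecimal strings and that the slide counts digits from 1 (so digit n+2 is string index n+1); the function names are hypothetical.

```python
def shares_prefix(node_ident: str, node_level: int, other_ident: str) -> bool:
    """True if the first n+1 digits of both identifiers are equal,
    where n is the level of the node owning node_ident."""
    return other_ident[: node_level + 1] == node_ident[: node_level + 1]


def child_slot(parent_level: int, child_ident: str) -> int:
    """Position (0 to 15) in the parent's child list, read from digit n+2
    of the child's identifier (string index n+1)."""
    return int(child_ident[parent_level + 1], 16)


if __name__ == "__main__":
    # A level-1 parent "a3f0" and a candidate child "a37c" (made-up identifiers).
    print(shares_prefix("a3f0", 1, "a37c"), child_slot(1, "a37c"))
```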

Routing Table

Three sets of neighborhood nodes

Basic: main table for routing

Upper: allows routing to nodes of a higher level (when the parent node is unreachable)

Lower: allows routing to nodes of a lower level (when child nodes fail)

Each node is responsible for modifying or fixing its routing table when nodes enter, leave, or fail to communicate

Maximum steps required: $O(\log_b N)$
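As a rough check of the logarithmic bound, the following worked estimate assumes b = 16 (the branching factor of the tree), that every level from 0 to h is full with $16^{n+1}$ nodes, and that a request climbs at most h levels and then descends at most h levels. None of these assumptions is stated on the slide.

```latex
N = \sum_{n=0}^{h} 16^{n+1} = \frac{16^{h+2} - 16}{15}
\quad\Longrightarrow\quad
h = \log_{16}(15N + 16) - 2 = O(\log_{16} N),
\qquad
\text{hops} \le 2h = O(\log_{16} N).
```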

Field      Set    Description
Level      Basic  The level at which the node resides
Right      Basic  The non-empty node to the right
Left       Basic  The non-empty node to the left
Up         Basic  The parent node
Right2     Upper  The node residing to the right of the parent node
Left2      Upper  The node residing to the left of the parent node
Up2        Upper  The parent’s parent node
Right3     Lower  One (random) child of the node to the right
Left3      Lower  One (random) child of the node to the left
Umbrella   Basic  All children nodes
Umbrella2  Lower  A (random) child node from each child
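The fields listed above could be held in a structure like the one below. This is only a sketch: the class name `RoutingTable`, the use of identifier strings as neighbor references, and the fixed 16-slot lists are assumptions, chosen to show that the number of entries does not depend on the network size.

```python
from dataclasses import dataclass, field
from typing import List, Optional

NodeRef = str  # assumption: a neighbor is referenced by its hex identifier


def _sixteen_slots() -> List[Optional[NodeRef]]:
    return [None] * 16


@dataclass
class RoutingTable:
    level: int
    # Basic set
    right: Optional[NodeRef] = None
    left: Optional[NodeRef] = None
    up: Optional[NodeRef] = None
    umbrella: List[Optional[NodeRef]] = field(default_factory=_sixteen_slots)
    # Upper set
    right2: Optional[NodeRef] = None
    left2: Optional[NodeRef] = None
    up2: Optional[NodeRef] = None
    # Lower set
    right3: Optional[NodeRef] = None
    left3: Optional[NodeRef] = None
    umbrella2: List[Optional[NodeRef]] = field(default_factory=_sixteen_slots)

    def size(self) -> int:
        """Number of neighbor slots; constant regardless of network size."""
        return 8 + len(self.umbrella) + len(self.umbrella2)


if __name__ == "__main__":
    print(RoutingTable(level=2).size())
```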

Main Algorithms – Insert

The new node contacts an already connected node and issues a request for insertion

The established node checks whether the first n+1 digits of the new identifier match its own, where n is the level at which the node resides

If not, the insertion message is forwarded to the node’s parent

If yes, the message is forwarded to the child whose digit n+2 is the same as that of the new node

If no such child exists, the new node is placed as a child of the current node

The new node is informed of its new neighbors and vice versa
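The insertion rules can be sketched as a recursive method. The `Node` class below is hypothetical and models a single subtree only: the children are kept in a dictionary keyed by digit n+2, the neighbor-table updates are left out, and a request that would have to leave through a top node without a parent raises an error, since the slides do not cover that case.

```python
from typing import Dict, Optional


class Node:
    def __init__(self, ident: str, level: int = 0, parent: Optional["Node"] = None):
        self.ident = ident
        self.level = level
        self.parent = parent
        self.children: Dict[str, "Node"] = {}  # keyed by digit n+2 of the child

    def insert(self, new_ident: str) -> "Node":
        """Route an insertion request and return the newly placed node."""
        n = self.level
        if new_ident[: n + 1] != self.ident[: n + 1]:
            # Prefix mismatch: forward the request to the parent.
            if self.parent is None:
                raise LookupError("identifier lies outside this subtree")
            return self.parent.insert(new_ident)
        digit = new_ident[n + 1]
        child = self.children.get(digit)
        if child is not None:
            # A child already owns this digit: forward the request down.
            return child.insert(new_ident)
        # No such child: the new node becomes a child of the current node.
        new_node = Node(new_ident, n + 1, self)
        self.children[digit] = new_node
        return new_node


if __name__ == "__main__":
    top = Node("a000")
    first = top.insert("a300")
    second = first.insert("a370")
    print(first.level, second.level, second.parent is first)
```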

Main Algorithms – Publish

If the first n+1 digits of the content’s identifier are not the same as the node’s, the publish message is forwarded to the parent node

If they match, it is forwarded to the child with the matching digit n+2

If no such child exists, the node publishes the content itself

Main Algorithms – Search

The node first checks for the keyword in its list of published keywords

If it exists, the search terminates

If not, the node checks whether the first n+1 digits are identical to those of its own identifier

If not, the message is forwarded to its parent

If yes, it is forwarded to the child with the matching digit n+2

If no such child exists, the search fails
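Publishing and searching follow the same routing decision, so both can be sketched on top of one helper. The `Node` class, the `route` helper and the use of hexadecimal strings as content identifiers are assumptions for illustration. The sketch models a single subtree, and a top node without a parent is treated like a missing child, a case the slides do not cover.

```python
from typing import Dict, Optional, Set


class Node:
    def __init__(self, ident: str, level: int = 0, parent: Optional["Node"] = None):
        self.ident = ident
        self.level = level
        self.parent = parent
        self.children: Dict[str, "Node"] = {}  # keyed by digit n+2 of the child
        self.keys: Set[str] = set()            # identifiers published at this node

    def add_child(self, ident: str) -> "Node":
        child = Node(ident, self.level + 1, self)
        self.children[ident[self.level + 1]] = child
        return child

    def route(self, key: str) -> Optional["Node"]:
        """Next hop for key, or None when there is no further node."""
        n = self.level
        if key[: n + 1] != self.ident[: n + 1]:
            return self.parent                 # mismatch: go up
        return self.children.get(key[n + 1])   # match: go down if possible

    def publish(self, key: str) -> "Node":
        """Store key at the node where routing stops; return that node."""
        nxt = self.route(key)
        if nxt is None:
            self.keys.add(key)
            return self
        return nxt.publish(key)

    def search(self, key: str) -> Optional["Node"]:
        """Return the node holding key, or None if the search fails."""
        if key in self.keys:
            return self
        nxt = self.route(key)
        return None if nxt is None else nxt.search(key)


if __name__ == "__main__":
    top = Node("a000")
    mid = top.add_child("a300")
    low = mid.add_child("a370")
    holder = low.publish("a3b1")
    print(holder.ident, top.search("a3b1") is holder)
```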

Main Algorithms – Departure

If the node has no children, all of its keywords are forwarded to its parent and it informs all its neighbors of its departure

If it has any children, it randomly picks one and copies all of its neighborhood and keyword information to it before departing

The chosen child moves up a level and substitutes the departing node

If the chosen child has children of its own, the previous step is repeated recursively until a node with no children is reached; the first step is then executed, ending the algorithm
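One possible reading of the departure rules is sketched below: the position vacated by each promoted child is refilled by one of its own children until a leaf is reached. The `Node` class and the functions `promote` and `depart` are hypothetical, neighbor notifications are omitted, and only the departing node's keywords are handed over, which is an interpretation rather than something the slides state.

```python
import random
from typing import Dict, Optional, Set


class Node:
    def __init__(self, ident: str, level: int = 0, parent: Optional["Node"] = None):
        self.ident = ident
        self.level = level
        self.parent = parent
        self.children: Dict[str, "Node"] = {}  # keyed by digit n+2 of the child
        self.keys: Set[str] = set()

    def add_child(self, ident: str) -> "Node":
        child = Node(ident, self.level + 1, self)
        self.children[ident[self.level + 1]] = child
        return child


def promote(vacant: Node, rng: random.Random) -> Optional[Node]:
    """Fill the position of `vacant` with one of its children, recursively."""
    if not vacant.children:
        return None
    heir = rng.choice(list(vacant.children.values()))
    old_slot = heir.ident[heir.level]      # heir's slot in vacant's child list
    below = promote(heir, rng)             # refill the heir's old position first
    inherited = {d: c for d, c in vacant.children.items() if c is not heir}
    if below is not None:
        inherited[old_slot] = below
    heir.level, heir.parent, heir.children = vacant.level, vacant.parent, inherited
    for child in inherited.values():
        child.parent = heir
    return heir


def depart(node: Node, rng: random.Random) -> Optional[Node]:
    """Remove `node`; return its substitute, or None if it was a leaf."""
    parent, slot = node.parent, node.ident[node.level]
    heir = promote(node, rng)
    if heir is None:
        if parent is not None:
            parent.keys |= node.keys       # leaf: keywords go to the parent
            del parent.children[slot]
        return None
    heir.keys |= node.keys                 # substitute inherits the keywords
    if parent is not None:
        parent.children[slot] = heir
    return heir
```

Because every child shares the first n+1 digits with its departing parent, the promoted child fits the parent's slot without any identifier changing.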

Enhanced Algorithms

The system is liable to sudden node departures: voluntary departure without calling the appropriate mechanism; sudden departures due to client errors; network disconnections

All of the above cases are treated in the same manner: changes to the algorithms already presented allow the system to bypass node failures

Most changes are based on using the upper and lower sets: the upper set is used to forward messages to nodes of a higher level, the lower set to nodes of a lower level

Enhanced Algorithms – Parent Failure

Forward requests, in sequence, to:

The parent’s parent node (field Up2 of the upper set)

The node to the right of the parent node (field Right2 of the upper set)

The node to the left of the parent node (field Left2 of the upper set)

The first of the above to succeed terminates the mechanism

Enhanced Algorithms – Child Failure

Forward requests, in sequence, to:

One of the child’s children (field Umbrella2 of the lower set)

The node to the right of the child (field Umbrella of the basic set)

The node to the left of the child (field Umbrella of the basic set)

A child of the node to the right of the issuing node (field Right3)

A child of the node to the left of the issuing node (field Left3)

The first of the above to succeed terminates the mechanism
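Both fallback orders amount to trying a fixed list of routing-table entries until one responds. A minimal sketch, assuming the table is a flat dictionary from field name to neighbor and that reachability is decided by a caller-supplied `is_alive` function. The keys `umbrella_right` and `umbrella_left` are hypothetical names for the failed child's right and left siblings taken from the Umbrella field.

```python
from typing import Callable, Dict, Optional, Sequence

# Order taken from the parent-failure and child-failure slides.
PARENT_FALLBACK = ("up2", "right2", "left2")
CHILD_FALLBACK = ("umbrella2", "umbrella_right", "umbrella_left", "right3", "left3")


def first_reachable(table: Dict[str, Optional[str]],
                    order: Sequence[str],
                    is_alive: Callable[[str], bool]) -> Optional[str]:
    """Return the first neighbor in `order` that is present and reachable."""
    for field_name in order:
        neighbor = table.get(field_name)
        if neighbor is not None and is_alive(neighbor):
            return neighbor
    return None  # every fallback failed


if __name__ == "__main__":
    table = {"up2": "G", "right2": "R", "left2": "L"}
    alive = {"R", "L"}
    print(first_reachable(table, PARENT_FALLBACK, alive.__contains__))
```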

Repair Mechanism

We have designed a repair mechanism that is invoked whenever a node failure is detected

The algorithm uses the delete algorithm to repair the failure of a child

All failures can be transformed into a child failure by contacting nodes in the neighborhood table and forwarding a repair message

Once the appropriate node is reached and informed of the child failure, a variation of the delete algorithm is invoked to repair the failure: the failed node is substituted with one of its children, or deleted if none is available

Each node is responsible for checking its neighborhood table periodically: it issues ping messages to all node entries and invokes the repair mechanism whenever a failure is detected

This mechanism substantially increases the system’s stability and fault tolerance
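The periodic check can be sketched as a single pass over the table entries. The function name and the `ping` and `repair` callbacks are hypothetical; how often the pass runs and how a ping is transported are not specified here.

```python
from typing import Callable, Iterable, List


def check_neighborhood(entries: Iterable[str],
                       ping: Callable[[str], bool],
                       repair: Callable[[str], None]) -> List[str]:
    """Ping every routing-table entry once and repair each failure found."""
    failed = [node for node in entries if not ping(node)]
    for node in failed:
        repair(node)  # hand the failed entry to the repair mechanism
    return failed


if __name__ == "__main__":
    alive = {"a300", "a500"}
    repaired: List[str] = []
    print(check_neighborhood(["a300", "a400", "a500"], alive.__contains__, repaired.append))
```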

Repair Mechanism

Check whether the failed node had children

If it did not have any, contact all of its neighbors by using the neighborhood table and inform them of the new structure

If it did, one of them must be in the Umbrella2 entry: pick a random entry in the Umbrella2 field and inform all neighbors of the change

The new child is informed and gathers the appropriate new neighborhood settings from nearby nodes

    repair()
        if has_Umbrella2()
            kid = get_appropriate_Umbrella2()
            contact_neighbors_of_change(kid)
            kid.gather_new_neighbors()
        else
            contact_neighbors_of_change()

Simulation Results

Extended neurogrid simulator

Implemented umbrella algorithms

Two sets of results: without the repair mechanism and with the repair mechanism

Variable network size

Random node failures

No Failures - Hops

Prove the integrity of our design under normal conditions

Conducted simulations with node populations varying from 10 nodes up to 6000 nodes

Investigated the number of hops required for a successful insertion and lookup with a varying population of nodes

The number of hops grows logarithmically with the node population in all mechanisms

No Failures - Messages

Investigated the overall traffic generated by our architecture: total messages per request

Low number of messages exchanged, due to the small number of hops required for each successful request and to the limited (constant) number of neighbors maintained by each node

The total number increases linearly with the node population

Failures With No Repair

Conducted a second set of simulations to test the system’s tolerance of node failures

Progressively caused node failures from 0 up to 80% of the total node population, in steps of 5%

For a rate of up to 22% of failing nodes, the success rate is kept high (over 80%)

It slowly degrades up to a mid-point of 50%

From there onwards our system becomes unstable and success rates drop dramatically

Failures With Repair – Success Rate

Conducted a third set of simulations with repair

Progressively caused node failures from 0 up to 80% of the total node population, in steps of 5%

3T, 6T and 20T repair periods, where T is a constant representing communication activity

This ensures that an inactive node will not suffocate the network with repair messages

The repair mechanism dramatically increases the success rate, regardless of the node population

[Figure, two panels: lookups (axis 0 to 120) against fail ratio (0 to 80) for the series No Repair, Repair 3, Repair 6 and Repair 20]

Failures With Repair – Messages

The total amount was expected to increase

It remains almost constant for failure rates of up to 50% and increases linearly from then on

In all cases, the total per-node average is kept reasonably low

[Figure, two panels: total messages against fail ratio (0 to 80) for the series No Repair, Repair 3, Repair 6 and Repair 20; vertical axes run from 0 to 1,400,000 and from 0 to 14,000,000]

Conclusions

Novel protocol based on a distributed hash table: supports key publishing and retrieval on top of an overlay network for content distribution

Analysed our system and proved its correctness and efficacy

Its main strengths are a fixed-size routing table and efficient routing in $O(\log_b N)$ steps, even when more than half of the system’s population suddenly fails

Questions?