Resource Lookup DHT Distributed Hash Tables - …sferrett.web.cs.unibo.it/CAS/SLIDES/03_DHT.pdf ·...
Resource Lookup – DHT (Distributed Hash Tables)

Stefano Ferretti, DISI, University of Bologna
Email: [email protected]
Slide credits: Stefano Ferretti, Laura Ricci (UniPi). Part of these slides comes from (in Italian) http://didawiki.cli.di.unipi.it/doku.php/informatica/p2p/start
Licence

Copyright © 2015, Stefano Ferretti, Università di Bologna, Italy (http://www.cs.unibo.it/sferrett/)
This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 License (CC-BY-SA). To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
Distributed Hash Tables
Distributed Hash Tables (DHT):
● Intro
● Main aspects
● Approaches
● Interfaces
● Conclusions
Lookup
● Given a content c, stored at some set of nodes, find it
● The main challenge:
  – Do it in a reliable, scalable and efficient way
  – Despite the dynamicity and frequent changes
[Figure: a distributed system of Internet nodes (peer-to-peer.info, planet-lab.org, berkeley.edu, ...). Node A asks: “I have content ‘c’. Where can I put it?” Node B asks: “I need content ‘c’. Where can I find it?”]
Lookup
● Unstructured systems
  – Content placement does not depend on the net structure
  – Each peer maintains its own contents
  – Resource lookup can be difficult
● Structured systems
  – A content can be stored in any node in the net
  – The (node, content) association is based on specific criteria
● Let's have a brief look at how lookup was performed in P2P systems ...
Take 1 – Napster (Jan ’99)
● Client–server architecture (not P2P)
● Publish – send the key (file name) to the server
● Lookup – ask the server who has the requested file. The response contains the address of a node/nodes that hold the file
● Retrieval – get the file directly from the holder
● Join – send your file list to the server
● Leave – cancel your file list at the server
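The centralized index behind these operations can be sketched in a few lines. This is a minimal sketch: the class and method names are hypothetical, not the actual Napster protocol messages.

```python
# Minimal sketch of a Napster-style central index. Class and method
# names are hypothetical, not the actual Napster protocol messages.

class CentralIndex:
    """The server: maps file names (keys) to the addresses of holders."""

    def __init__(self):
        self.index = {}                        # key -> set of node addresses

    def publish(self, key, addr):
        """Publish/Join: a node announces that it holds `key`."""
        self.index.setdefault(key, set()).add(addr)

    def lookup(self, key):
        """Lookup: return the addresses of nodes holding `key`."""
        return self.index.get(key, set())

    def leave(self, addr):
        """Leave: cancel the departing node's whole file list."""
        for holders in self.index.values():
            holders.discard(addr)

server = CentralIndex()
server.publish("X", "123.2.21.23")            # insert(X, 123.2.21.23)
print(server.lookup("X"))                      # {'123.2.21.23'}
# retrieval then happens directly between the peers, not via the server
```

Note how the sketch makes the slide's complexity claims visible: one message per lookup (search scope O(1)), but the server's `index` grows with the whole system (O(N) state).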
Take 1 – Napster (continued)
● Advantages
  – Low message overhead
  – Search scope is O(1)
  – Minimal overhead on the clients
  – 100% success rate (if the file is there, it will be found)
● Disadvantages
  – Single point of failure
  – Not scalable (server is too busy)
  – Server maintains O(N) state
[Figure – Napster: Publish. A peer at 123.2.21.23 announces “I have X, Y, and Z!”; the publish message insert(X, 123.2.21.23) updates the central index.]

[Figure – Napster: Search. A peer asks “Where is file A?”; the server's reply is search(A) --> 123.2.0.18, and the peer fetches the file directly from 123.2.0.18.]
Take 2 – Gnutella (March ’00)
● Build a decentralized, unstructured overlay: each node has several neighbors
● Publish – store the file locally
● Lookup – check the local database. If X is known, return; if not, ask your neighbors. TTL limited
● Retrieval – direct communication between 2 nodes
● Join – contact a list of nodes that are likely to be up, or collect such a list from a website. Random, unstructured overlay
● What is the communication pattern? Flooding
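TTL-limited flooding can be sketched as follows. The adjacency map, the file table, and the function name are illustrative assumptions; real Gnutella also deduplicates queries by message ID, which the `seen` set mimics here.

```python
# Sketch of TTL-limited query flooding over an unstructured overlay.
# `neighbors` (adjacency map) and `files` (node -> set of file names)
# are illustrative assumptions.

def flood_lookup(neighbors, files, start, key, ttl):
    """Return (nodes holding `key` within `ttl` hops, messages sent)."""
    seen = {start}                 # nodes that already saw this query
    frontier = [start]
    hits, messages = set(), 0
    for _ in range(ttl):
        nxt = []
        for node in frontier:
            for nb in neighbors[node]:
                messages += 1      # one query message per overlay link used
                if nb in seen:
                    continue       # duplicate query: dropped, not re-flooded
                seen.add(nb)
                if key in files.get(nb, ()):
                    hits.add(nb)   # a QueryHit travels back toward `start`
                nxt.append(nb)
        frontier = nxt             # next ring of nodes, TTL decreased by 1
    return hits, messages

neighbors = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
files = {"d": {"A"}}
print(flood_lookup(neighbors, files, "a", "A", ttl=3))   # ({'d'}, 6)
```

Even in this tiny overlay the message count grows with the number of links touched, which is the communication-overhead problem discussed below.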
Take 2 – Gnutella (continued)
● Advantages
  – Fast lookup
  – Low join and leave overhead
  – Popular files are replicated many times, so a lookup with a small TTL will usually find the file
  – Can choose to retrieve from a number of sources
● Disadvantages
  – Not 100% success rate, since TTL is limited
  – Very high communication overhead: search scope is O(N), search time is O(???)
  – Limits scalability (but people do not care so much about wasting bandwidth)
  – Uneven load distribution
  – Nodes leave often, network unstable
[Figure – Search in query flooding. A node asks “Where is file A?” with TTL = 3; the query floods hop by hop (TTL 3 → 2 → 1 → 0), several nodes answer “I have file A.”, and the replies travel back along the query path.]
Improve scalability by introducing a hierarchy: a 2-tier system
● Super-peers: have more resources, more neighbors, know more keys
● Clients: regular/slow nodes
● A client can decide if it is a super-peer when connecting
● Super-peers accept connections from clients and establish connections to other super-peers
● Search goes through super-peers
Take 3 – FastTrack, KaZaA, eDonkey
“Smart” query flooding:
● Join – on startup, a client contacts a “supernode” ... may at some point become one itself
● Publish – send the list of files to the supernode
● Search – send the query to the supernode; supernodes flood the query amongst themselves
● Fetch – get the file directly from peer(s); can fetch simultaneously from multiple peers
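The two-tier scheme can be sketched roughly as follows. All class and function names are assumptions, and the real FastTrack/KaZaA protocols differ in many details; the point is only that the index lives at the supernodes and queries flood among them alone.

```python
# Two-tier sketch: clients publish their file lists to one supernode;
# queries flood only among supernodes. All names are assumptions.

class Supernode:
    def __init__(self):
        self.index = {}        # file name -> set of client addresses
        self.peers = set()     # neighbouring supernodes

    def publish(self, key, client_addr):
        """A client sends (part of) its file list up to its supernode."""
        self.index.setdefault(key, set()).add(client_addr)

def search(entry, key):
    """Flood `key` among supernodes, starting from the client's own one."""
    seen, frontier, hits = {entry}, [entry], set()
    while frontier:
        nxt = []
        for sn in frontier:
            hits |= sn.index.get(key, set())   # clients known to hold `key`
            for p in sn.peers - seen:
                seen.add(p)
                nxt.append(p)
        frontier = nxt
    return hits

a, b = Supernode(), Supernode()
a.peers.add(b); b.peers.add(a)
b.publish("X", "123.2.21.23")          # a client of b publishes file X
print(search(a, "X"))                   # {'123.2.21.23'}
```

Flooding now touches only supernodes instead of all N peers, which is why the next slide calls the behavior "similar to plain flooding, but better".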
Take 3 – FastTrack, KaZaA, eDonkey (continued)
● Advantages
  – More stable than Gnutella; higher success rate
  – More scalable
● Disadvantages
  – Not “pure” P2P
  – Still no real guarantees on search scope or search time
  – Still high communication overhead
● Similar behavior to plain flooding, but better
[Figure – Supernodes network design: an overlay of “super nodes”, each serving a cluster of ordinary clients.]
[Figure – Supernodes: file insert. A client at 123.2.21.23 tells its supernode “I have X!”; the publish message insert(X, 123.2.21.23) updates the supernode's index.]

[Figure – Supernodes: file search. A client asks “Where is file A?”; the query floods among supernodes, and the replies search(A) --> 123.2.0.18 and search(A) --> 123.2.22.50 come back.]
DISTRIBUTED HASH TABLES: Motivations

[Figure: communication overhead vs. per-node memory. Flooding sits at O(N) communication / O(1) memory; a centralized index at O(1) communication / O(N) memory; O(log N) / O(log N) marks the middle ground.]

● Flooding drawbacks: communication overhead, false negatives
● Centralized drawbacks: memory, CPU, bandwidth usage; fault tolerance?
● A tradeoff between these two options
DISTRIBUTED HASH TABLES: Motivations (2)

[Figure: the same overhead/memory tradeoff, with the Distributed Hash Table placed at O(log N) communication / O(log N) memory, between flooding and the centralized index.]

● Distributed Hash Table:
  – Scalability: O(log N)
  – No false negatives
  – Self-organized management
    ● Joining nodes
    ● Leaving nodes (voluntary/failures)
DHT: Main Aspects
● Goals
  – O(log(N)) hops per lookup: routing needs O(log(N)) hops to reach a node with the searched data
  – O(log(N)) entries in the routing table: each node's routing table size is O(log(N))
Comparison

Approach              Memory     Comm. overhead   False negatives   Robustness                      Complex queries
Central server        O(N)       O(1)             no                low (single point of failure)   yes
Pure P2P (flooding)   O(1)       O(N²)            possible (TTL)    high                            yes
DHT                   O(log N)   O(log N)         no                high                            exact-match only
DHT: Overview
● Abstraction: a distributed “hash table” (DHT) data structure:
  – put(id, item);
  – item = get(id);
● Scalable, decentralized, fault tolerant
● Implementation: nodes in the system form a distributed data structure
  – Can be a ring, tree, hypercube, skip list, butterfly network, ...
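The put/get abstraction can be illustrated with a toy, in-process stand-in for the overlay. All names here are hypothetical; in a real DHT both operations are routed across the network to the node responsible for the id, but the application sees exactly this interface.

```python
import hashlib

# The put/get abstraction over a toy, in-process stand-in for the
# overlay. Names are hypothetical; in a real DHT both operations are
# routed across the network to the node responsible for the id.

def make_id(name):
    """A 160-bit identifier from a name, as in SHA-1 based DHTs."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

dht = {}                                   # stand-in for the whole overlay

def put(key_id, item):
    dht[key_id] = item                     # routed to the responsible node

def get(key_id):
    return dht.get(key_id)                 # routed the same way

file_id = make_id("song.mp3")              # id = hash(file name)
put(file_id, "134.2.11.68")                # value: a pointer to the holder
print(get(file_id))                        # 134.2.11.68
```

The stored value can be the object itself or, as here, just a pointer to its real location, matching the "location of object or actual object" alternative below.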
DHT: Overview (2)
Many DHTs:
Hashing (Recap)
● Data structure supporting the operations:
  – void insert(key, item)
  – item search(key)
● Implementation uses a hash function for mapping keys to array cells
● Expected search time O(1), provided that there are few collisions
● Node IDs and content IDs are mapped into the same keyspace, using a hash function
● Nodes store table entries
● lookup(key) returns the IP of the node responsible for the key; the key is usually numeric (in some range)
● Requirements for an application to be able to use DHTs:
  – data identified with unique keys
  – nodes can (agree to) store keys for each other
  – value stored: location of the object, or the actual object
DHTs
Using the DHT Interface
● How do you publish a file? How do you find a file?
What does a DHT need to do?
● Map keys to nodes
  – needs to be dynamic as nodes join and leave
● Route a request to the appropriate node
  – key-based routing on the overlay
Lookup Example

[Figure: a ring of nodes, each holding part of the (K, V) table. insert(K1, V1) stores the pair (K1, V1) at the responsible node; lookup(K1) is routed to that same node.]
DHT Structure – Step 1:
● Definition of the keyspace. E.g., a logical keyspace arranged as a ring
  – Assumption: keyspace 0, …, 2^m − 1, larger than the amount of data items (e.g., m = 160)
  – Need for a total ordering and a distance notion (in modulus)
● Each node is assigned a single key (its ID)
  – Node-to-key mapping through a hash function
● The overlay network is not correlated to the real (Internet) topology
Distributed Hash Table: overlay

[Figure: the logical ring overlay, with node IDs 611, 709, 1008, 1622, 2011, 2207, 2906, 3485, laid over the real (Internet) topology.]
DHT Structure – Step 2:
● Each node is responsible for a set of data items stored in the DHT
  – This data set is composed of items with keys in a contiguous interval of the keyspace
  – Typical DHT implementation: map each key to the node whose ID is “close” to the key (a distance function is needed)
● Data items are mapped into the same keyspace as the nodes, through the hash function
  – NodeID = hash(node’s IP address)
  – KeyID = hash(key), where the key is e.g. a file name
● E.g., Hash(String): H(“LucidiLezione29-02-08“) → 2313
● The hash can be computed on the file name or on its entire content
  – m-bit identifier
  – Cryptographic hash function (e.g., SHA-1)
● Sometimes a certain redundancy is introduced (overlapping)
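Under a "successor" closeness rule (an assumption here; concrete DHTs use different distance functions), the key-to-node mapping can be sketched with the node IDs from the ring figures, using a small m = 12 for illustration where the slides use m = 160.

```python
import bisect

# Key-to-node mapping under a "successor" rule (an assumption here;
# concrete DHTs use different closeness rules). Node IDs are taken
# from the slides' ring figures; m = 12 instead of the slides' m = 160.

M = 12
RING = 2 ** M
nodes = sorted([611, 709, 1008, 1622, 2011, 2207, 2906, 3485])

def responsible(key_id):
    """First node ID >= key_id, wrapping around the 0..2^m - 1 ring."""
    i = bisect.bisect_left(nodes, key_id % RING)
    return nodes[i] if i < len(nodes) else nodes[0]   # wrap: 3486..611

print(responsible(3107))   # 3485: node 3485 holds the interval 2907-3485
print(responsible(3800))   # 611: the wrap-around interval is 3486-611
```

This reproduces the figure below: H("D") = 3107 falls into the interval 2907–3485, so the node with ID 3485 is responsible for it.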
DHT Structure

[Figure: the keyspace ring 0 … 2^m − 1, partitioned into the intervals 3486–611, 612–709, 710–1008, 1009–1622, 1623–2011, 2012–2207, 2208–2906, 2907–3485. Node X, with H(Node X) = 2906, holds 2208–2906; node Y, with H(Node Y) = 3485, holds 2907–3485. Data item “D”, with H(“D”) = 3107, falls into node Y's interval.]
Distributed Hash Table: overlay

[Figure: the ring overlay (node IDs 611, 709, 1008, 1622, 2011, 2207, 2906, 3485) over the real topology, with nodes x and y marked.]
Load Balancing
● Problem: uniform distribution of load among DHT nodes
● Possible causes of unbalanced load:
  – A node has a bigger portion of the keyspace
  – Uniform partitioning of the keyspace among nodes, but an interval contains more data items than other intervals
  – Uniform partitioning and uniform distribution of the data items, but certain items are requested more than others
● Unbalanced load → lower robustness, lower scalability
● Need for algorithms to balance the load
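One standard remedy for the first cause, not covered in these slides, is virtual nodes: each physical node takes several positions on the ring, so interval sizes average out. The sketch below assumes SHA-1 positions and illustrative names throughout.

```python
import collections
import hashlib

# Virtual nodes: a standard load-balancing technique (not from these
# slides). Each physical node takes several positions on the ring, so
# keyspace intervals average out. All names here are illustrative.

M = 16                                   # small keyspace for the sketch

def h(s):
    return int.from_bytes(hashlib.sha1(s.encode()).digest(), "big") % (2 ** M)

def build_ring(phys_nodes, vnodes_per_node):
    """Map each ring position to its owning physical node."""
    ring = {}
    for n in phys_nodes:
        for v in range(vnodes_per_node):
            ring[h(f"{n}#{v}")] = n      # one ring position per virtual node
    return dict(sorted(ring.items()))

def load_share(ring):
    """Fraction of the keyspace owned by each physical node."""
    pos = list(ring)
    share = collections.Counter()
    for i, p in enumerate(pos):
        prev = pos[i - 1] if i else pos[-1] - 2 ** M   # wrap-around interval
        share[ring[p]] += p - prev
    total = sum(share.values())          # telescopes to exactly 2**M
    return {n: s / total for n, s in share.items()}
```

More virtual nodes per physical node push each share toward 1/N. The other two causes of imbalance (hot intervals, hot items) need different remedies, such as replication or caching of popular items.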
● Problem: given a data item D, find the node that holds key = H(D)

[Figure: lookup on the ring. H(“D”) = 3107; the pair (3107, (ip, port)) is stored at node 3485, which holds keys 2907–3485. The lookup starts from an arbitrary node; the value may be a pointer to the real location of the data.]
ROUTING
● Each node must be able to forward each lookup query to a node closer to the destination
● Routing tables are maintained adaptively
  – each node knows some other nodes
  – must adapt to changes (joins, leaves, failures)
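Greedy key-based routing in this style can be sketched as follows: each hop forwards the lookup to the known node that most closely precedes the key on the ring. The routing tables and successor pointers below are illustrative assumptions built from the node IDs in the ring figures (m = 12); real DHTs fill these tables according to system-specific rules.

```python
# Greedy key-based routing sketch: each hop forwards the lookup to the
# known node that most closely precedes the key on the ring. Routing
# tables and successor pointers are illustrative assumptions.

RING = 2 ** 12

def dist(a, b):
    """Clockwise distance from a to b on the ring."""
    return (b - a) % RING

def route(tables, succ, start, key_id):
    """Forward greedily; return (responsible node, path of hops)."""
    node, path = start, [start]
    while True:
        closer = [n for n in tables[node] if dist(n, key_id) < dist(node, key_id)]
        if not closer:
            # no known node lies closer before the key: the current
            # node's successor is responsible for it
            return succ[node], path
        node = min(closer, key=lambda n: dist(n, key_id))
        path.append(node)

ids = [611, 709, 1008, 1622, 2011, 2207, 2906, 3485]
succ = {a: b for a, b in zip(ids, ids[1:] + ids[:1])}      # ring successors
tables = {n: [succ[n], succ[succ[n]], ids[(i + 4) % 8]]    # a few contacts each
          for i, n in enumerate(ids)}

node, path = route(tables, succ, 611, 3107)   # look up H("D") = 3107
print(node)                                    # 3485 holds keys 2907-3485
```

Each hop strictly decreases the remaining ring distance to the key, so the lookup terminates; with logarithmically spaced table entries (as in real DHTs) it does so in O(log N) hops.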
[Figure: routing the lookup for H(“D”) = 3107 across the ring: starting from an arbitrary node, the query is forwarded node by node until it reaches node 3485, which holds keys 2907–3485 and stores the pair (3107, (ip, port)); the value is a pointer to the real location of the data, 134.2.11.68.]
Nodes joining the DHT
● ID assignment through the hash function
● The node contacts an arbitrary node (bootstrap)
● A portion of the keyspace is assigned to the new node, based on its ID
● Copy of the assigned (key, value) pairs (plus some redundancy)
● Insertion in the DHT (i.e. link creation)

[Figure: a joining node at 134.2.11.68, with ID 3485, contacting the ring overlay.]
Nodes leaving the DHT (departure, failure)
● Voluntary departure
  – The keys are redistributed to neighbour nodes
  – Copy of (key, value) pairs onto the assigned neighbour nodes
  – Removal of the node from the routing tables
● Node failure
  – All keys are lost, unless some redundancy is introduced
  – Most DHTs use replication
    ● Periodic refresh
    ● Use of multiple routing paths
  – Periodic node probing to check that neighbours are online
  – When faults are detected, routing tables are updated
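The replication idea can be sketched as storing each pair on the responsible node and on the next r − 1 successors, so that a single failure loses no keys. The ring layout, the value of r, and all names below are illustrative assumptions.

```python
import bisect

# Redundancy sketch: store each (key, value) pair on the responsible
# node and on the next r-1 successors, so a single failure loses no
# keys. Ring layout, r, and all names are illustrative assumptions.

RING = 2 ** 12
nodes = sorted([611, 2906, 3485])
stores = {n: {} for n in nodes}          # each node's local key/value store

def successors(key_id, r):
    """The r nodes clockwise from key_id (responsible node first)."""
    i = bisect.bisect_left(nodes, key_id % RING)
    return [nodes[(i + k) % len(nodes)] for k in range(r)]

def store(key_id, value, r=2):
    for n in successors(key_id, r):
        stores[n][key_id] = value        # primary copy + r-1 replicas

def lookup(key_id, alive, r=2):
    """Try replicas in order, skipping nodes detected as failed."""
    for n in successors(key_id, r):
        if n in alive:
            return stores[n].get(key_id)
    return None                          # all replicas failed

store(3107, ("ip", "port"))              # primary: 3485, replica: 611
print(lookup(3107, alive={611, 2906}))   # ('ip', 'port') despite 3485 failing
```

The periodic probing mentioned above is what maintains the `alive` view in practice; when a failure is detected, the surviving replicas re-replicate the lost node's keys to restore the replication degree.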
DHT: Applications
● DHTs offer a generic distributed service to store and index information
● The data item can be
  – a file
  – an IP address
  – any kind of information ...
● Examples of applications
  – DNS implementation (key: hostname, value: IP address)
  – P2P storage systems, e.g., Freenet
  – Distributed search engine
  – Distributed caching
  – Distributed file system
DHTs
● Chord – UC Berkeley, MIT
● Pastry – Microsoft Research, Rice University
● Tapestry – UC Berkeley
● CAN – UC Berkeley, ICSI
● P-Grid – EPFL Lausanne
● Kademlia – the KAD network of eMule, ...
● Symphony, Viceroy, …
If you want to start with a survey:
● Eng Keong Lua, J. Crowcroft, M. Pias, R. Sharma, S. Lim, “A survey and comparison of peer-to-peer overlay network schemes,” IEEE Communications Surveys & Tutorials, vol. 7, no. 2, pp. 72–93, Second Quarter 2005. doi: 10.1109/COMST.2005.161054