Resource Lookup DHT Distributed Hash Tables - …sferrett.web.cs.unibo.it/CAS/SLIDES/03_DHT.pdf ·...
Resource Lookup – DHT (Distributed Hash Tables)

Stefano Ferretti, DISI, University of Bologna
Email: [email protected]
Slide credits: Stefano Ferretti, Laura Ricci (UniPi). Part of these slides comes from (in Italian) http://didawiki.cli.di.unipi.it/doku.php/informatica/p2p/start
Licence

Copyright © 2015, Stefano Ferretti, Università di Bologna, Italy (http://www.cs.unibo.it/sferrett/)
This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 License (CC-BY-SA). To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
Distributed Hash Tables
Distributed Hash Tables (DHT):
● Intro
● Main aspects
● Approaches
● Interfaces
● Conclusions
Lookup
● Given a content c, stored at some set of nodes, find it
● The main challenge:
  – Do it in a reliable, scalable and efficient way
  – Despite the dynamicity and frequent changes
[Figure: a distributed system of Internet nodes (peer-to-peer.info, planet-lab.org, berkeley.edu, ...). Node A asks: “I have content ‘c’. Where can I put it?” Node B asks: “I need content ‘c’. Where can I find it?”]
Lookup
● Unstructured systems
  – Content placement does not depend on the net structure
  – Each peer maintains its own contents
  – Resource lookup can be difficult
● Structured systems
  – A content can be stored in any node in the net
  – The (node, content) association is based on specific criteria
● Let's have a brief look at how lookup was performed in P2P systems ...
Take 1 – Napster (Jan ’99)
● Client–server architecture (not P2P)
● Publish – send the key (file name) to the server
● Lookup – ask the server who has the requested file. The response contains the address of a node/nodes that hold the file
● Retrieval – get the file directly from the holder
● Join – send your file list to the server
● Leave – cancel your file list at the server
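The centralized index behind these operations can be sketched in a few lines. This is a minimal sketch: the class and method names are hypothetical, not the actual Napster protocol messages.

```python
# Minimal sketch of a Napster-style central index. Class and method
# names are hypothetical, not the actual Napster protocol messages.

class CentralIndex:
    """The server: maps file names (keys) to the addresses of holders."""

    def __init__(self):
        self.index = {}                        # key -> set of node addresses

    def publish(self, key, addr):
        """Publish/Join: a node announces that it holds `key`."""
        self.index.setdefault(key, set()).add(addr)

    def lookup(self, key):
        """Lookup: return the addresses of nodes holding `key`."""
        return self.index.get(key, set())

    def leave(self, addr):
        """Leave: cancel the departing node's whole file list."""
        for holders in self.index.values():
            holders.discard(addr)

server = CentralIndex()
server.publish("X", "123.2.21.23")            # insert(X, 123.2.21.23)
print(server.lookup("X"))                      # {'123.2.21.23'}
# retrieval then happens directly between the peers, not via the server
```

Note how the sketch makes the slide's complexity claims visible: one message per lookup (search scope O(1)), but the server's `index` grows with the whole system (O(N) state).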
Take 1 – Napster (continued)
● Advantages
  – Low message overhead
  – Search scope is O(1)
  – Minimal overhead on the clients
  – 100% success rate (if the file is there, it will be found)
● Disadvantages
  – Single point of failure
  – Not scalable (server is too busy)
  – Server maintains O(N) state
[Figure – Napster: Publish. A peer at 123.2.21.23 announces “I have X, Y, and Z!”; the publish message insert(X, 123.2.21.23) updates the central index.]

[Figure – Napster: Search. A peer asks “Where is file A?”; the server's reply is search(A) --> 123.2.0.18, and the peer fetches the file directly from 123.2.0.18.]
Take 2 – Gnutella (March ’00)
● Build a decentralized, unstructured overlay: each node has several neighbors
● Publish – store the file locally
● Lookup – check the local database. If X is known, return; if not, ask your neighbors. TTL limited
● Retrieval – direct communication between 2 nodes
● Join – contact a list of nodes that are likely to be up, or collect such a list from a website. Random, unstructured overlay
● What is the communication pattern? Flooding
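TTL-limited flooding can be sketched as follows. The adjacency map, the file table, and the function name are illustrative assumptions; real Gnutella also deduplicates queries by message ID, which the `seen` set mimics here.

```python
# Sketch of TTL-limited query flooding over an unstructured overlay.
# `neighbors` (adjacency map) and `files` (node -> set of file names)
# are illustrative assumptions.

def flood_lookup(neighbors, files, start, key, ttl):
    """Return (nodes holding `key` within `ttl` hops, messages sent)."""
    seen = {start}                 # nodes that already saw this query
    frontier = [start]
    hits, messages = set(), 0
    for _ in range(ttl):
        nxt = []
        for node in frontier:
            for nb in neighbors[node]:
                messages += 1      # one query message per overlay link used
                if nb in seen:
                    continue       # duplicate query: dropped, not re-flooded
                seen.add(nb)
                if key in files.get(nb, ()):
                    hits.add(nb)   # a QueryHit travels back toward `start`
                nxt.append(nb)
        frontier = nxt             # next ring of nodes, TTL decreased by 1
    return hits, messages

neighbors = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
files = {"d": {"A"}}
print(flood_lookup(neighbors, files, "a", "A", ttl=3))   # ({'d'}, 6)
```

Even in this tiny overlay the message count grows with the number of links touched, which is the communication-overhead problem discussed below.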
Take 2 – Gnutella (continued)
● Advantages
  – Fast lookup
  – Low join and leave overhead
  – Popular files are replicated many times, so a lookup with a small TTL will usually find the file
  – Can choose to retrieve from a number of sources
● Disadvantages
  – Not 100% success rate, since TTL is limited
  – Very high communication overhead: search scope is O(N), search time is O(???)
  – Limits scalability (but people do not care so much about wasting bandwidth)
  – Uneven load distribution
  – Nodes leave often, network unstable
[Figure – Search in query flooding. A node asks “Where is file A?” with TTL = 3; the query floods hop by hop (TTL 3 → 2 → 1 → 0), several nodes answer “I have file A.”, and the replies travel back along the query path.]
Improve scalability by introducing a hierarchy: a 2-tier system
● Super-peers: have more resources, more neighbors, know more keys
● Clients: regular/slow nodes
● A client can decide if it is a super-peer when connecting
● Super-peers accept connections from clients and establish connections to other super-peers
● Search goes through super-peers
Take 3 – FastTrack, KaZaA, eDonkey
“Smart” query flooding:
● Join – on startup, a client contacts a “supernode” ... may at some point become one itself
● Publish – send the list of files to the supernode
● Search – send the query to the supernode; supernodes flood the query amongst themselves
● Fetch – get the file directly from peer(s); can fetch simultaneously from multiple peers
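The two-tier scheme can be sketched roughly as follows. All class and function names are assumptions, and the real FastTrack/KaZaA protocols differ in many details; the point is only that the index lives at the supernodes and queries flood among them alone.

```python
# Two-tier sketch: clients publish their file lists to one supernode;
# queries flood only among supernodes. All names are assumptions.

class Supernode:
    def __init__(self):
        self.index = {}        # file name -> set of client addresses
        self.peers = set()     # neighbouring supernodes

    def publish(self, key, client_addr):
        """A client sends (part of) its file list up to its supernode."""
        self.index.setdefault(key, set()).add(client_addr)

def search(entry, key):
    """Flood `key` among supernodes, starting from the client's own one."""
    seen, frontier, hits = {entry}, [entry], set()
    while frontier:
        nxt = []
        for sn in frontier:
            hits |= sn.index.get(key, set())   # clients known to hold `key`
            for p in sn.peers - seen:
                seen.add(p)
                nxt.append(p)
        frontier = nxt
    return hits

a, b = Supernode(), Supernode()
a.peers.add(b); b.peers.add(a)
b.publish("X", "123.2.21.23")          # a client of b publishes file X
print(search(a, "X"))                   # {'123.2.21.23'}
```

Flooding now touches only supernodes instead of all N peers, which is why the next slide calls the behavior "similar to plain flooding, but better".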
Take 3 – FastTrack, KaZaA, eDonkey (continued)
● Advantages
  – More stable than Gnutella; higher success rate
  – More scalable
● Disadvantages
  – Not “pure” P2P
  – Still no real guarantees on search scope or search time
  – Still high communication overhead
● Similar behavior to plain flooding, but better
[Figure – Supernodes network design: an overlay of “super nodes”, each serving a cluster of ordinary clients.]
[Figure – Supernodes: file insert. A client at 123.2.21.23 tells its supernode “I have X!”; the publish message insert(X, 123.2.21.23) updates the supernode's index.]

[Figure – Supernodes: file search. A client asks “Where is file A?”; the query floods among supernodes, and the replies search(A) --> 123.2.0.18 and search(A) --> 123.2.22.50 come back.]
DISTRIBUTED HASH TABLES: Motivations

[Figure: communication overhead vs. per-node memory. Flooding sits at O(N) communication / O(1) memory; a centralized index at O(1) communication / O(N) memory; O(log N) / O(log N) marks the middle ground.]

● Flooding drawbacks: communication overhead, false negatives
● Centralized drawbacks: memory, CPU, bandwidth usage; fault tolerance?
● A tradeoff between these two options
DISTRIBUTED HASH TABLES: Motivations (2)

[Figure: the same overhead/memory tradeoff, with the Distributed Hash Table placed at O(log N) communication / O(log N) memory, between flooding and the centralized index.]

● Distributed Hash Table:
  – Scalability: O(log N)
  – No false negatives
  – Self-organized management
    ● Joining nodes
    ● Leaving nodes (voluntary/failures)
DHT: Main Aspects
● Goals
  – O(log(N)) hops per lookup: routing needs O(log(N)) hops to reach a node with the searched data
  – O(log(N)) entries in the routing table: each node's routing table size is O(log(N))
Comparison

Approach              Memory     Comm. overhead   False negatives   Robustness                      Complex queries
Central server        O(N)       O(1)             no                low (single point of failure)   yes
Pure P2P (flooding)   O(1)       O(N²)            possible (TTL)    high                            yes
DHT                   O(log N)   O(log N)         no                high                            exact-match only
DHT: Overview
● Abstraction: a distributed “hash table” (DHT) data structure:
  – put(id, item);
  – item = get(id);
● Scalable, decentralized, fault tolerant
● Implementation: nodes in the system form a distributed data structure
  – Can be a ring, tree, hypercube, skip list, butterfly network, ...
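The put/get abstraction can be illustrated with a toy, in-process stand-in for the overlay. All names here are hypothetical; in a real DHT both operations are routed across the network to the node responsible for the id, but the application sees exactly this interface.

```python
import hashlib

# The put/get abstraction over a toy, in-process stand-in for the
# overlay. Names are hypothetical; in a real DHT both operations are
# routed across the network to the node responsible for the id.

def make_id(name):
    """A 160-bit identifier from a name, as in SHA-1 based DHTs."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

dht = {}                                   # stand-in for the whole overlay

def put(key_id, item):
    dht[key_id] = item                     # routed to the responsible node

def get(key_id):
    return dht.get(key_id)                 # routed the same way

file_id = make_id("song.mp3")              # id = hash(file name)
put(file_id, "134.2.11.68")                # value: a pointer to the holder
print(get(file_id))                        # 134.2.11.68
```

The stored value can be the object itself or, as here, just a pointer to its real location, matching the "location of object or actual object" alternative below.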
DHT: Overview (2)
Many DHTs:
Hashing (Recap)
● Data structure supporting the operations:
  – void insert(key, item)
  – item search(key)
● Implementation uses a hash function for mapping keys to array cells
● Expected search time O(1), provided that there are few collisions
● Node IDs and content IDs are mapped into the same keyspace, using a hash function
● Nodes store table entries
● lookup(key) returns the IP of the node responsible for the key; the key is usually numeric (in some range)
● Requirements for an application to be able to use DHTs:
  – data identified with unique keys
  – nodes can (agree to) store keys for each other
  – value stored: location of the object, or the actual object
DHTs
Using the DHT Interface
● How do you publish a file? How do you find a file?
What does a DHT need to do?
● Map keys to nodes
  – needs to be dynamic as nodes join and leave
● Route a request to the appropriate node
  – key-based routing on the overlay
Lookup Example

[Figure: a ring of nodes, each holding part of the (K, V) table. insert(K1, V1) stores the pair (K1, V1) at the responsible node; lookup(K1) is routed to that same node.]
DHT Structure – Step 1:
● Definition of the keyspace. E.g., a logical keyspace arranged as a ring
  – Assumption: keyspace 0, …, 2^m − 1, larger than the amount of data items (e.g., m = 160)
  – Need for a total ordering and a distance notion (in modulus)
● Each node is assigned a single key (its ID)
  – Node-to-key mapping through a hash function
● The overlay network is not correlated to the real (Internet) topology
Distributed Hash Table: overlay

[Figure: the logical ring overlay, with node IDs 611, 709, 1008, 1622, 2011, 2207, 2906, 3485, laid over the real (Internet) topology.]
DHT Structure – Step 2:
● Each node is responsible for a set of data items stored in the DHT
  – This data set is composed of items with keys in a contiguous interval of the keyspace
  – Typical DHT implementation: map each key to the node whose ID is “close” to the key (a distance function is needed)
● Data items are mapped into the same keyspace as the nodes, through the hash function
  – NodeID = hash(node’s IP address)
  – KeyID = hash(key), where the key is e.g. a file name
● E.g., Hash(String): H(“LucidiLezione29-02-08“) → 2313
● The hash can be computed on the file name or on its entire content
  – m-bit identifier
  – Cryptographic hash function (e.g., SHA-1)
● Sometimes a certain redundancy is introduced (overlapping)
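Under a "successor" closeness rule (an assumption here; concrete DHTs use different distance functions), the key-to-node mapping can be sketched with the node IDs from the ring figures, using a small m = 12 for illustration where the slides use m = 160.

```python
import bisect

# Key-to-node mapping under a "successor" rule (an assumption here;
# concrete DHTs use different closeness rules). Node IDs are taken
# from the slides' ring figures; m = 12 instead of the slides' m = 160.

M = 12
RING = 2 ** M
nodes = sorted([611, 709, 1008, 1622, 2011, 2207, 2906, 3485])

def responsible(key_id):
    """First node ID >= key_id, wrapping around the 0..2^m - 1 ring."""
    i = bisect.bisect_left(nodes, key_id % RING)
    return nodes[i] if i < len(nodes) else nodes[0]   # wrap: 3486..611

print(responsible(3107))   # 3485: node 3485 holds the interval 2907-3485
print(responsible(3800))   # 611: the wrap-around interval is 3486-611
```

This reproduces the figure below: H("D") = 3107 falls into the interval 2907–3485, so the node with ID 3485 is responsible for it.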
DHT Structure

[Figure: the keyspace ring 0 … 2^m − 1, partitioned into the intervals 3486–611, 612–709, 710–1008, 1009–1622, 1623–2011, 2012–2207, 2208–2906, 2907–3485. Node X, with H(Node X) = 2906, holds 2208–2906; node Y, with H(Node Y) = 3485, holds 2907–3485. Data item “D”, with H(“D”) = 3107, falls into node Y's interval.]
Distributed Hash Table: overlay

[Figure: the ring overlay (node IDs 611, 709, 1008, 1622, 2011, 2207, 2906, 3485) over the real topology, with nodes x and y marked.]
Load Balancing
● Problem: uniform distribution of load among DHT nodes
● Possible causes of unbalanced load:
  – A node has a bigger portion of the keyspace
  – Uniform partitioning of the keyspace among nodes, but an interval contains more data items than other intervals
  – Uniform partitioning and uniform distribution of the data items, but certain items are requested more than others
● Unbalanced load → lower robustness, lower scalability
● Need for algorithms to balance the load
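One standard remedy for the first cause, not covered in these slides, is virtual nodes: each physical node takes several positions on the ring, so interval sizes average out. The sketch below assumes SHA-1 positions and illustrative names throughout.

```python
import collections
import hashlib

# Virtual nodes: a standard load-balancing technique (not from these
# slides). Each physical node takes several positions on the ring, so
# keyspace intervals average out. All names here are illustrative.

M = 16                                   # small keyspace for the sketch

def h(s):
    return int.from_bytes(hashlib.sha1(s.encode()).digest(), "big") % (2 ** M)

def build_ring(phys_nodes, vnodes_per_node):
    """Map each ring position to its owning physical node."""
    ring = {}
    for n in phys_nodes:
        for v in range(vnodes_per_node):
            ring[h(f"{n}#{v}")] = n      # one ring position per virtual node
    return dict(sorted(ring.items()))

def load_share(ring):
    """Fraction of the keyspace owned by each physical node."""
    pos = list(ring)
    share = collections.Counter()
    for i, p in enumerate(pos):
        prev = pos[i - 1] if i else pos[-1] - 2 ** M   # wrap-around interval
        share[ring[p]] += p - prev
    total = sum(share.values())          # telescopes to exactly 2**M
    return {n: s / total for n, s in share.items()}
```

More virtual nodes per physical node push each share toward 1/N. The other two causes of imbalance (hot intervals, hot items) need different remedies, such as replication or caching of popular items.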
● Problem: given a data item D, find the node that holds key = H(D)

[Figure: lookup on the ring. H(“D”) = 3107; the pair (3107, (ip, port)) is stored at node 3485, which holds keys 2907–3485. The lookup starts from an arbitrary node; the value may be a pointer to the real location of the data.]
ROUTING
● Each node must be able to forward each lookup query to a node closer to the destination
● Routing tables are maintained adaptively
  – each node knows some other nodes
  – must adapt to changes (joins, leaves, failures)
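Greedy key-based routing in this style can be sketched as follows: each hop forwards the lookup to the known node that most closely precedes the key on the ring. The routing tables and successor pointers below are illustrative assumptions built from the node IDs in the ring figures (m = 12); real DHTs fill these tables according to system-specific rules.

```python
# Greedy key-based routing sketch: each hop forwards the lookup to the
# known node that most closely precedes the key on the ring. Routing
# tables and successor pointers are illustrative assumptions.

RING = 2 ** 12

def dist(a, b):
    """Clockwise distance from a to b on the ring."""
    return (b - a) % RING

def route(tables, succ, start, key_id):
    """Forward greedily; return (responsible node, path of hops)."""
    node, path = start, [start]
    while True:
        closer = [n for n in tables[node] if dist(n, key_id) < dist(node, key_id)]
        if not closer:
            # no known node lies closer before the key: the current
            # node's successor is responsible for it
            return succ[node], path
        node = min(closer, key=lambda n: dist(n, key_id))
        path.append(node)

ids = [611, 709, 1008, 1622, 2011, 2207, 2906, 3485]
succ = {a: b for a, b in zip(ids, ids[1:] + ids[:1])}      # ring successors
tables = {n: [succ[n], succ[succ[n]], ids[(i + 4) % 8]]    # a few contacts each
          for i, n in enumerate(ids)}

node, path = route(tables, succ, 611, 3107)   # look up H("D") = 3107
print(node)                                    # 3485 holds keys 2907-3485
```

Each hop strictly decreases the remaining ring distance to the key, so the lookup terminates; with logarithmically spaced table entries (as in real DHTs) it does so in O(log N) hops.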
[Figure: routing the lookup for H(“D”) = 3107 across the ring: starting from an arbitrary node, the query is forwarded node by node until it reaches node 3485, which holds keys 2907–3485 and stores the pair (3107, (ip, port)); the value is a pointer to the real location of the data, 134.2.11.68.]
Nodes joining the DHT
● ID assignment through the hash function
● The node contacts an arbitrary node (bootstrap)
● A portion of the keyspace is assigned to the new node, based on its ID
● Copy of the assigned (key, value) pairs (plus some redundancy)
● Insertion in the DHT (i.e. link creation)

[Figure: a joining node at 134.2.11.68, with ID 3485, contacting the ring overlay.]
Nodes leaving the DHT (departure, failure)
● Voluntary departure
  – The keys are redistributed to neighbour nodes
  – Copy of (key, value) pairs onto the assigned neighbour nodes
  – Removal of the node from the routing tables
● Node failure
  – All keys are lost, unless some redundancy is introduced
  – Most DHTs use replication
    ● Periodic refresh
    ● Use of multiple routing paths
  – Periodic node probing to check that neighbours are online
  – When faults are detected, routing tables are updated
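The replication idea can be sketched as storing each pair on the responsible node and on the next r − 1 successors, so that a single failure loses no keys. The ring layout, the value of r, and all names below are illustrative assumptions.

```python
import bisect

# Redundancy sketch: store each (key, value) pair on the responsible
# node and on the next r-1 successors, so a single failure loses no
# keys. Ring layout, r, and all names are illustrative assumptions.

RING = 2 ** 12
nodes = sorted([611, 2906, 3485])
stores = {n: {} for n in nodes}          # each node's local key/value store

def successors(key_id, r):
    """The r nodes clockwise from key_id (responsible node first)."""
    i = bisect.bisect_left(nodes, key_id % RING)
    return [nodes[(i + k) % len(nodes)] for k in range(r)]

def store(key_id, value, r=2):
    for n in successors(key_id, r):
        stores[n][key_id] = value        # primary copy + r-1 replicas

def lookup(key_id, alive, r=2):
    """Try replicas in order, skipping nodes detected as failed."""
    for n in successors(key_id, r):
        if n in alive:
            return stores[n].get(key_id)
    return None                          # all replicas failed

store(3107, ("ip", "port"))              # primary: 3485, replica: 611
print(lookup(3107, alive={611, 2906}))   # ('ip', 'port') despite 3485 failing
```

The periodic probing mentioned above is what maintains the `alive` view in practice; when a failure is detected, the surviving replicas re-replicate the lost node's keys to restore the replication degree.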
DHT: Applications
● DHTs offer a generic distributed service to store and index information
● The data item can be
  – a file
  – an IP address
  – any kind of information ...
● Examples of applications
  – DNS implementation (key: hostname, value: IP address)
  – P2P storage systems, e.g., Freenet
  – Distributed search engine
  – Distributed caching
  – Distributed file system
DHTs
● Chord – UC Berkeley, MIT
● Pastry – Microsoft Research, Rice University
● Tapestry – UC Berkeley
● CAN – UC Berkeley, ICSI
● P-Grid – EPFL Lausanne
● Kademlia – the KAD network of eMule, ...
● Symphony, Viceroy, …
If you want to start with a survey:
● Eng Keong Lua, J. Crowcroft, M. Pias, R. Sharma, S. Lim, “A survey and comparison of peer-to-peer overlay network schemes,” IEEE Communications Surveys & Tutorials, vol. 7, no. 2, pp. 72–93, Second Quarter 2005. doi: 10.1109/COMST.2005.161054