1 Distributed Hash Tables (DHTs) Lars Jørgen Lillehovde Jo Grimstad Bang Distributed Hash Tables...

29
1 Distributed Hash Tables (DHTs) Lars Jørgen Lillehovde Jo Grimstad Bang Distributed Hash Tables (DHTs)

Transcript of 1 Distributed Hash Tables (DHTs) Lars Jørgen Lillehovde Jo Grimstad Bang Distributed Hash Tables...

1

Distributed Hash Tables (DHTs) Lars Jørgen Lillehovde Jo Grimstad Bang

Distributed Hash Tables (DHTs)

2

Background– Distributed Data Structures

– Chord, CAN, Pastery, Tapestry

– File Sharing Networks

– 1st , 2nd , 3rd Generation

Distributed Hash Tables (DHTs)

3

Overlay networks– Network on top of another network

– Provide extra functionality

– E.g. Dial-up Internet, VPN, TOR

– Bootstrapping node

Distributed Hash Tables (DHTs)

4

Hash table– Key-value pair

Distributed Hash Tables (DHTs)

Key Value

Key1 Value1

Key2 Value2

Key3 Value3

5

Node 2Node 1

Hash table– Hash table divided on several nodes

Distributed Hash Tables (DHTs)

Key Value

Key 1-1 Value 1-1

Key 1-2 Value 1-2

Key 1-3 Value 1-3

Key Value

Key 2-1 Value 1-1

Key 2-2 Value 1-2

Key 2-3 Value 1-3

Node 3Key Value

Key 3-1 Value 3-1

Key 3-2 Value 3-2

Key 3-3 Value 3-3

6

Structure– Key space partitioning

– Consistent hashing

– Overlay network structure

Distributed Hash Tables (DHTs)

7

Chord– The chord ring: modulo 2^m

Distributed Hash Tables (DHTs)

8

Chord– Each node has a determined place within the

ring

– A node has a successor and a predecessor

– A node stores the keys between its predecessor and itself

– A routing/finger table is used to keep track of the other nodes

Distributed Hash Tables (DHTs)

9

ChordExample:

Node 8

Key space

Distributed Hash Tables (DHTs)

10

Chord– Each node contains a routing table with at

most m entries

– For j = 1,m -> entry: p+2^(j-1) to closest successor node

– Checks if k is found between n and successor of n. If not forward the request to the closest node preceding n

– Each node knows a lot of nearby nodes, and a little about far away nodes

– The target node will eventually be found after at most (log n) steps

Distributed Hash Tables (DHTs)

11

Node 8Chord

Distributed Hash Tables (DHTs)

j Maps to Real Node

1 x+1 N14

2 x+2 N14

3 x+4 N14

4 x+8 N21

5 x+16 N32

6 x+32 N42

12

Chord

Distributed Hash Tables (DHTs)

13

Functionality– DHT Optimalization

– Degrees

– Hops

– Churn

– Replication

Distributed Hash Tables (DHTs)

14

Overlay networks in practice– Overlay network definition: Network built on top of

another network.

– Overlay network examples:– IntServ (Integrated services)– DiffServ (Differentiated services)– IP Multicast

– + Improved quality of service

– - Require modification

Distributed Hash Tables (DHTs)

15

Overlay networks in practice– DHT

– Peer-to-peer protocols:– Freenet– I2P– Gnutella – A file sharing network

• An open protocol used by several clients• The most popular file sharing network on the

Internet (as of 2007, 40% share)• Example client: LimeWire

Distributed Hash Tables (DHTs)

16

Overlay networks in practiceGnutella

– “Pong” packets– Responses to Ping

packets– Pong-caching

– Floodsearching– Older Gnutella

versions only– Bad scalability– Newer versions: No

query flooding

Distributed Hash Tables (DHTs)

17

Overlay networks in practiceLimeWire

– A Gnutella client (became free and opensource in 2001)

– LimeWire uses the BitTorrent protocol and the Gnutella network

Distributed Hash Tables (DHTs)

18

Example application: DHT in LimeWire

– ...

Distributed Hash Tables (DHTs)

19

Protocols and their implementations– DHTs facilitate complex services and systems:

– Distributed file systems– P2P file sharing– Content distribution services– Cooperative web caching– Multicast and anycast– Domain name services– Instant messaging

Distributed Hash Tables (DHTs)

20

Protocols and their implementationsSome overlay network implementations:

– CAN (Content Addressable Network)

– Chord

– Kademlia (most used DHT, used by BitTorrent and Gnutella)

– Pastry

– P-Grid

– Tapestry

• The figure illustrates Chord network topology:

Bold - The first four implementations (2001)

Distributed Hash Tables (DHTs)

21

DHT example applicationBitTorrent– BitTorrent is the most videly used P2P

protocol that supports DHT

– One user needs to act as the file-provider– Providing a “seed”– .torrent file with metadata

– Peers receives different piece of the file– Initial user is relieved– Peer-to-seed shifts– “Swarm”– BitTorrent tracker assists

– Hash has to be known

Distributed Hash Tables (DHTs)

22

– ...

Distributed Hash Tables (DHTs)

DHT example applicationBitTorrent

23

DHT Benefits and LimitationsSome wanted properties:

• Fault-tolerance

• Locating objects (and quick retrieval)

• Scalability

• Incremental deployment

• Availability and performance (efficiency)

• Load balancing

• Data integrity (and quick data storage)

• Redundancy (replication)

Distributed Hash Tables (DHTs)

24

DHT Benefits and Limitations– DHTs work well when executed as intended

– Security limitations

Distributed Hash Tables (DHTs)

25

DHT Security Considerations– Assumption of trusted nodes!

– Behaviour inconsistencies

– Peer-to-peer systems must operate even with malicious participants

– Anonymity less important in DHTs

– Malicious participants– Nodes with malicious intends

– Several known DHT attacks– Missing DHT security considerations

Distributed Hash Tables (DHTs)

26

DHT Security ConsiderationsThe «Sybil» attack

...

Distributed Hash Tables (DHTs)

27

DHT Security Considerations– «Eclipse» attack

– Isolation from benign nodes– “Eclipsed” node

– Sybil and Eclipse attacks facilitate other attacks:– Routing attacks

• Incorrect lookup routing• Incorrect routing updates• Partitioning - “bootstrapping node”

– Storage and retrieval attacks• Usage of replicas

Distributed Hash Tables (DHTs)

28

DHT Security ConsiderationsOther types of attacks

– Inconsistent behaviour– Malicious nodes behaving like benign ones

– Overload of targeted nodes (DoS)

– Rapid joins and leaves

– Rebalancing requires traffic

– Joins and leaves in file sharing systems

– Unsolicited messages– Use digital signatures or MACs– Use random nonce (preferably)

Distributed Hash Tables (DHTs)

29

Questions?– Distributed

– Hash

– Tables

Distributed Hash Tables (DHTs)