Peer to Peer Technologies. Outline What is P2P? P2P architectures Examples of P2P system (P2P...

Peer to Peer Technologies

Outline

What is P2P?
P2P architectures
Examples of P2P systems (P2P applications)
P2P data management techniques
Conclusions

What is P2P?

P2P introduction:

Put simply, peer-to-peer (P2P) computing is the sharing of computer resources and services by direct exchange between systems. A peer, also called a servent, is a computer that acts as both client and server.

P2P network diagram

A simple picture of P2P App

P2P features(1)

All peers in a P2P network are equal
Data and computation are decentralized
Searching a P2P network can return more relevant results than searching a static index (such as Google or Yahoo)
Peers and their connections are volatile

P2P features(2)

Properties:
– no central coordination
– no central database
– no peer has a global view of the system
– global behavior emerges from local interactions
– all existing data and services are accessible from any peer
– peers are autonomous
– peers and connections are unreliable

Types of P2P (layer view)

Types of P2P System (Apps)

E-commerce systems – eBay, B2B market places…

File sharing systems – Napster, Gnutella, Freenet, …

Distributed Databases – Mariposa [Stonebraker96], …

Networks – Arpanet – Mobile ad-hoc networks

P2P vs. C/S and Web system

P2P architectures

P2P qualities

Modifiability: the system is easy to modify or upgrade with minimum effort
Performance: high demands on performance
Usability: high demands on usability
Scalability: flexible enough to handle a very large number of requests from peers
The principle of remote access

Peer structure

Each peer provides a basic set of core services
Using common protocols (HTTP, FTP, …), peers link together in networks to share information and services
The example below is that of a peer that uses the HTTP protocol

Architectural styles

Call and Return Style
– Object-oriented system (wait until the other component replies)
– Layered architecture (when the task can be divided)

Architectural patterns

Broker Pattern
Pipes and Filters
Layers

Examples of P2P Systems

Existing P2P systems

Napster
Gnutella
Freenet
OceanStore
Farsite
FastTrack
Tornado
Chord
CAN
Gridella

P2P System models (1)

Centralized model
– global index held by a central authority (single point of failure)
– direct contact between requestors and providers
– Example: Napster

P2P System models (2)

Decentralized model
– Examples: Freenet, Gnutella
– no global index, no central coordination, global behavior emerges from local interactions, etc.
– direct contact between requestors and providers (Gnutella) or mediated by a chain of intermediaries (Freenet)

P2P System models (3)

Hierarchical model
– introduction of “super-peers”
– mix of centralized and decentralized model
– Example: FastTrack

Napster: Overview

Central (virtual) database which holds an index of offered MP3/WMA files
Clients(!) connect to this server, identify themselves (account) and send a list of MP3/WMA files they are sharing (C/S)
Other clients can search the index and learn from which clients they can retrieve the file (P2P)
Combination of client/server and P2P approaches
First-time users must register an account
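
The centralized model can be sketched as a small in-memory index. This is a minimal illustration only: the class name `CentralIndex`, its methods, and the addresses are hypothetical and do not reflect the actual Napster protocol.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central index: maps a file name to the peers sharing it."""

    def __init__(self):
        self._peers_by_file = defaultdict(set)

    def register(self, peer_addr, file_names):
        # Client/server step: a peer logs in and uploads its list of shared files.
        for name in file_names:
            self._peers_by_file[name].add(peer_addr)

    def unregister(self, peer_addr):
        # Remove a departing peer from every entry it appears in.
        for peers in self._peers_by_file.values():
            peers.discard(peer_addr)

    def search(self, file_name):
        # Only peer addresses are returned; the transfer itself is peer to peer.
        return sorted(self._peers_by_file.get(file_name, ()))


index = CentralIndex()
index.register("10.0.0.1:6699", ["song_a.mp3", "song_b.mp3"])
index.register("10.0.0.2:6699", ["song_b.mp3"])
providers = index.search("song_b.mp3")
```

The index never stores file contents, which is why the server is a single point of failure for search but not for storage.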

Communication Model

Gnutella: Overview

No central server
– cannot be sued (as Napster was)
Constrained broadcast
– Every peer sends packets it receives to all of its peers (typically 4)
– Lifetime of packets limited by time-to-live (typically set to 7)
– Packets have unique ids to detect loops
Hooking up to the Gnutella system requires that a new peer knows at least one Gnutella host
– gnutellahosts.com:6346
– Outside the Gnutella protocol specification
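
The constrained broadcast can be simulated in a few lines. This is a simplified sketch, not the Gnutella wire protocol: the function `flood_query`, the toy topology, and the use of one shared `seen` set in place of per-peer message-id tables are all assumptions made for illustration.

```python
from collections import deque


def flood_query(neighbors, files, start, wanted, ttl=7):
    """Flood a query from `start`; return (peers holding `wanted`, messages sent)."""
    seen = {start}          # stands in for duplicate detection via unique packet ids
    hits, messages = [], 0
    queue = deque([(start, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if peer != start and wanted in files.get(peer, ()):
            hits.append(peer)
        if remaining == 0:
            continue        # time-to-live exhausted: do not forward further
        for nxt in neighbors[peer]:
            messages += 1   # every forwarded copy costs one message
            if nxt not in seen:   # duplicate copies are dropped by the receiver
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits, messages


# Hypothetical five-peer overlay.
neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
             "D": ["B", "C", "E"], "E": ["D"]}
files = {"C": {"x.mp3"}, "E": {"x.mp3"}}
near = flood_query(neighbors, files, "A", "x.mp3", ttl=2)
far = flood_query(neighbors, files, "A", "x.mp3", ttl=3)
```

With a time-to-live of 2 the copy held by peer E is out of reach; raising it to 3 finds E at the cost of more messages.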

Protocol Message Types

Communication model

Topology of Gnutella

Small-world properties verified (“find everything close by”)
Backbone + outskirts

Summary(1):

Completely decentralized
Hit rates are high
High fault tolerance
Adapts well and dynamically to changing peer populations
No estimates on the duration of queries can be given
No probability for successful queries can be given
Free riding is a problem

Summary(2):

Reputation of peers is not addressed
Simple, robust, and scalable (at the moment)
Protocol causes high network traffic (e.g., 3.5 Mbps). For example:
– C = 4 connections per peer, TTL = 7
– 1 ping packet can cause a large number of further packets
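
To get a feeling for the order of magnitude, the fan-out of one flooded packet can be bounded under a simple model. The model is an assumption of this sketch and is not taken from the slides: the overlay is loop-free, the origin sends to all C neighbors, and every receiver forwards to its C - 1 other neighbors until the TTL runs out. Replies are not counted.

```python
def flood_message_bound(connections, ttl):
    """Upper bound on copies of one flooded packet in a loop-free overlay."""
    total = 0
    at_hop = connections               # hop 1: the origin sends to all its neighbors
    for _ in range(ttl):
        total += at_hop
        at_hop *= connections - 1      # each receiver forwards to all neighbors but the sender
    return total


bound = flood_message_bound(4, 7)      # C = 4, TTL = 7
```

Under these assumptions a single packet already fans out into several thousand copies, and the bound grows exponentially with the TTL.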

Freenet: Overview

Adaptive P2P system which supports publication, replication, and retrieval of data
Anonymity
Requests are routed to the most likely physical location
– no central server as in Napster
– no constrained broadcast as in Gnutella
Files are referred to in a location-independent way
Dynamic replication of data

Freenet: Key types

Keys are represented as Uniform Resource Identifiers (URIs): freenet:keytype@data
Keyword Signed Keys (KSK)
Signature Verification Keys (SVK)
SVK Subspace Keys (SSK)
Content Hash Keys (CHK)
Keys can be used for indirections, e.g., KSK -> CHK
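
The URI layout freenet:keytype@data can be taken apart with plain string operations. The helper `parse_freenet_uri` below is a hypothetical illustration and not part of any Freenet client library.

```python
KEY_TYPES = {"KSK", "SVK", "SSK", "CHK"}


def parse_freenet_uri(uri):
    """Split 'freenet:keytype@data' into (keytype, data)."""
    scheme, sep, rest = uri.partition(":")
    if scheme != "freenet" or not sep:
        raise ValueError("not a freenet URI: %r" % uri)
    key_type, sep, data = rest.partition("@")
    if not sep or key_type not in KEY_TYPES:
        raise ValueError("unknown key type in %r" % uri)
    return key_type, data


parsed = parse_freenet_uri("freenet:KSK@text/books/1984.html")
```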

Keyword Signed Keys (KSK)

User chooses a short descriptive text sdtext for a file, e.g., text/computer-science/esec2001/p2p-tutorial
sdtext is used to deterministically generate a public/private key pair
The public key part is hashed and used as the file key
The private key part is used to sign the file
The file itself is encrypted using sdtext as key
To find the file represented by a KSK, a user must know sdtext, which is published by the provider of the file
Example: freenet:KSK@text/books/1984.html

SVKs and SSKs

Allow people to make a subspace, i.e., to control a set of keys
Based on the same public key system as KSKs, but purely binary, and the key pair is generated randomly
People who trust the owner of a subspace will also trust documents in the subspace, because inserting documents requires knowing the subspace’s private key
For retrieval: sdtext and public key of subspace are published
SSKs are the client-side representation of SVKs with a document name
Examples:
– freenet:SVK@HDOKWIUn10291jqd097euojhd01
– freenet:SSK@1093808jQWIOEh8923kIah10/text/books/1984.html

Content Hash Keys (CHK)

Derived from hashing the contents of the file -> pseudo-unique file key to verify file integrity
File is encrypted with a randomly generated encryption key
For retrieval: CHK and decryption key are published (decryption key is never stored with the file)
Useful to implement updating and splitting, e.g., in conjunction with SVK/SSK:
– to store an updateable file, it is first inserted under its CHK
– then an indirect file that holds the CHK is inserted under an SSK
– others can retrieve the file in two steps given the SSK
– only the owner of the subspace can update the file
Example: freenet:CHK@UHE92hd92hseh912hJHEUh1928he902
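
The two-step retrieval through an indirect file can be sketched as follows. Several things here are assumptions for illustration: the dictionary `store` stands in for the network, SHA-1 is chosen as the hash function, the key strings are made up, and encryption and signature checks are left out entirely.

```python
import hashlib

store = {}  # hypothetical in-memory stand-in for the network's key/value storage


def content_hash_key(data):
    # SHA-1 is an assumption of this sketch; the slides do not name the hash.
    return "CHK@" + hashlib.sha1(data).hexdigest()


def insert_file(data):
    key = content_hash_key(data)
    store[key] = data
    return key


def publish(ssk, data):
    # Step 1: insert the file under its CHK. Step 2: point the SSK at that CHK.
    store[ssk] = insert_file(data).encode()


def retrieve(ssk):
    chk = store[ssk].decode()           # first lookup: the indirect file
    data = store[chk]                   # second lookup: the actual content
    if content_hash_key(data) != chk:   # integrity check against the key itself
        raise ValueError("content does not match its CHK")
    return data


publish("SSK@example-subspace/report.txt", b"version 1")
first = retrieve("SSK@example-subspace/report.txt")
publish("SSK@example-subspace/report.txt", b"version 2")
second = retrieve("SSK@example-subspace/report.txt")
```

Updating the file only rewrites the small indirect entry; the old content stays reachable under its own CHK.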

Summary

Completely decentralized
High fault tolerance
Robust and scalable
Automatic replication of content
Adapts well and dynamically to changing peer populations
Spam content less of a problem (subspaces)
Adaptive routing preserves network bandwidth
No estimates on the duration of queries can be given
No probability for successful queries can be given
Topology is unknown -> algorithms cannot exploit it
Routing “circumvents” free-riders
Reputation of peers is not addressed
Supports anonymity of publishers and readers

P2P data management techniques

Assumptions

Peers have a physical address (called reference in the following)
Data objects are identified by keys k

Searching problem

Peers with address Pd store data items d that are identified by a key k
In order to locate a peer that stores d, we have to search for key k in the lookup table consisting of tuples of the form (k, Pd)
Thus, the database we have to manage consists of the key-value pairs (k, Pd)
We do not further consider the storage of data items d

Data access structures

Every peer maintains a small fragment of the database and a routing table
The peers implement a routing strategy
Replication can be used to increase robustness

Approaches

Existing P2P Systems – Gnutella – Freenet

Research – CHORD – Content-Addressable Networks – Tapestry – P-Grid

Gnutella

Each peer knows a fixed number of other peers, e.g. 4
Other peers are found randomly, e.g. through ping messages
Search requests are forwarded to those peers, with a limited time-to-live, e.g. 7
Peers can answer the request if they store the corresponding file

Gnutella

Search types
– Any possible string comparison
Scalability
– Search very poor with respect to number of messages
– Search time probably O(log n) due to small-world property
– Updates excellent: nothing to do
– Routing information: low cost
Robustness
– High, since many paths are explored
Autonomy
– Storage: no restriction, peers store the keys of their files
– Routing: peers are target of all kinds of requests
Global Knowledge
– None required

Freenet

Each peer knows a fixed number of other peers and a key that those peers store
Search requests are routed to the peer with the most similar key
– If not successful, the next most similar key is used, etc.
– Similarity is based on lexicographic distance (any other measure would be possible as well)
Search requests have a limited lifetime, e.g. 500
Peers can answer requests if they store the requested items
When the answer is passed back, the intermediate peers can use it to update their routing information

Freenet

Freenet: Searching

Peers store keys, data and addresses
As with Gnutella, search requests have
– a limited lifetime, but typically higher, e.g., 500
– message identifiers to avoid cycles

Freenet: Searching

If a search request arrives
– Either the data is in the table
– Or the request is forwarded to the addresses with the most similar keys (lexicographic similarity, edit distance) until an answer is found
If an answer arrives
– The key and address of the answer are inserted into the table
– The least recently used key is evicted
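
The table handling described above can be sketched with a bounded, ordered dictionary. This is an illustration under assumptions, not Freenet's implementation: the class `RoutingTable` and its method names are hypothetical, edit distance is used as the similarity measure, and "recently used" is approximated by the time an entry was last inserted or refreshed.

```python
from collections import OrderedDict


def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


class RoutingTable:
    """Hypothetical per-peer table mapping keys to peer addresses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # oldest entry first

    def next_hops(self, key):
        # Addresses ordered by similarity of their key: try the first, then the next.
        ranked = sorted(self.entries, key=lambda known: edit_distance(key, known))
        return [self.entries[k] for k in ranked]

    def record_answer(self, key, address):
        # A passing answer teaches this peer where `key` can be found.
        self.entries[key] = address
        self.entries.move_to_end(key)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict the entry refreshed longest ago


table = RoutingTable(capacity=2)
table.record_answer("abcd", "peer1")
table.record_answer("abxx", "peer2")
order = table.next_hops("abcf")
table.record_answer("zzzz", "peer3")
```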

Freenet: Discussion

Search types
– Only equality, exact keys need to be known, e.g., published in a directory
– However, if keys were not hashed, semantic similarity might be used for routing
Scalability
– Search good, seems to be O(log n) in number of nodes n
– Update good, like search
– Routing information: a bootstrapping phase is required
Robustness
– Good, since alternative paths are explored
Autonomy
– Storage: no restriction
– Routing: dependency between stored keys and received requests
Global Knowledge
– Key hashing

CHORD

Based on a hashing of search keys and peer addresses onto binary keys of length m
Each peer with hashed identifier p is responsible (i.e., stores the values associated with the key) for all hashed keys k such that k lies in the ring interval (predecessor(p), p]

CHORD

Each peer p stores a “finger table” consisting of the first peer with hashed identifier p_i such that p_i = successor(p + 2^(i-1)), for i = 1, …, m
A search algorithm ensures the reliable location of the data
– Complexity O(log n), n nodes in the network
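
A small simulation shows how finger tables shorten the search. It follows the published Chord design, in which finger i of peer p points at successor(p + 2^(i-1)) and each peer is responsible for the keys in (predecessor(p), p]. The ring size, the peer identifiers, and the global view of all peers are assumptions made to keep the sketch short; a real peer only knows its own fingers.

```python
from bisect import bisect_left

M = 6                 # identifier length in bits (small, for readability)
RING = 2 ** M


def successor(peers, ident):
    """First peer clockwise from `ident` (inclusive); `peers` must be sorted."""
    i = bisect_left(peers, ident % RING)
    return peers[i % len(peers)]


def finger_table(peers, p):
    # Entry i points at successor(p + 2^(i-1)), for i = 1..M.
    return [successor(peers, p + 2 ** (i - 1)) for i in range(1, M + 1)]


def in_interval(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


def lookup(peers, start, key):
    """Route from `start` towards `key`; return (responsible peer, forwarding hops)."""
    node, hops = start, 0
    while True:
        succ = successor(peers, node + 1)
        if in_interval(key, node, succ):
            return succ, hops
        # Forward to the farthest finger that still precedes the key.
        for finger in reversed(finger_table(peers, node)):
            if finger != key and in_interval(finger, node, key):
                node, hops = finger, hops + 1
                break


peers = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # hypothetical peer identifiers
fingers_of_8 = finger_table(peers, 8)
result = lookup(peers, 8, 54)
```

Each forwarding step at least halves the remaining distance on the ring, which is where the O(log n) search cost comes from.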

CHORD

CHORD: Searching

CHORD: Discussion

Search types
– Only equality
Scalability
– Search O(log n)
– Update requires search, thus O(log n)
– Construction: O(log^2 n) if a new node joins
Robustness
– Replication might be used by storing replicas at successor nodes
Autonomy
– Storage and routing: none
– Nodes have by virtue of their IP address a specific role
Global knowledge
– Mapping of IP addresses and data keys to a common key space
– Single origin

CAN

Based on hashing of keys into a d-dimensional space (a torus)
Each peer is responsible for keys of a subvolume of the space (a zone)
Each peer stores the peers responsible for the neighboring zones for routing
Search requests are greedily forwarded to the peers in the closest zones
Assignment of peers to zones depends on a random selection made by the peer
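
Greedy forwarding on a torus can be illustrated with a strongly simplified model. The assumptions are considerable: the space is cut into equal unit zones on a grid with `side` cells per dimension, every zone has exactly one peer, and SHA-256 is used to place keys. Real CAN zones are created by splitting and have unequal sizes.

```python
import hashlib


def key_to_point(key, d, side):
    """Hash a key to integer coordinates on a d-dimensional torus (d <= 32)."""
    digest = hashlib.sha256(key.encode()).digest()
    return tuple(digest[i] % side for i in range(d))


def torus_distance(a, b, side):
    # Per dimension, take the shorter way around the ring.
    return sum(min(abs(x - y), side - abs(x - y)) for x, y in zip(a, b))


def neighbors(zone, side):
    # Zones that differ by one step in exactly one dimension, with wrap-around.
    for dim in range(len(zone)):
        for step in (-1, 1):
            nxt = list(zone)
            nxt[dim] = (nxt[dim] + step) % side
            yield tuple(nxt)


def route(start, target, side):
    """Greedy routing: always hand the request to the neighbor closest to the target."""
    path = [start]
    while path[-1] != target:
        path.append(min(neighbors(path[-1], side),
                        key=lambda z: torus_distance(z, target, side)))
    return path


path = route((0, 0), (3, 7), side=8)
point = key_to_point("some-file-name", d=2, side=8)
```

With n = side^d zones, a route needs at most d * side / 2 hops in this model, which matches the O(d n^(1/d)) search cost quoted for CAN.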

CAN

CAN: Discussion

Search types
– equality only
– however, could be extended using spatial proximity
Scalability
– Search and update: good, O(d n^(1/d)), depends on configuration of d
– Construction: good
Robustness
– Good with replication
Autonomy
– Free choice of coordinate zone
Global Knowledge
– Hashing of keys to coordinates, realities, overloading
– Single origin

Tapestry

Based on building distributed, n-ary search trees
Each peer is assigned to a leaf of the search tree
Each peer stores references for the other branches in the tree for routing
Search requests are either processed locally or forwarded to the peers on the alternative branches
Each peer obtains an ID in the node ID space
Each data object obtains a home peer based on a distributed algorithm applied to its ID

Tapestry

Tapestry: Discussion

Search types
– Equality searches
Scalability
– Search and update O(log n)
– Node join operation is scalable
Robustness
– High when using replication
Autonomy
– Assignment of node IDs not clear
Global Knowledge
– Hashing of object IDs, replication scheme
– Single origin

P-Grid

Similar data organization to Tapestry, however node IDs are of variable length
Data objects are stored at a peer if its node ID is a prefix of the data key
Assignment of peers is performed by repeated mutual splitting of the search space among the peers
– Tapestry-like data organization combined with CAN-like construction
Splitting stops when a termination criterion is fulfilled
– Maximal key length
– Minimal number of known data items
Different P-Grids can merge during splitting (multiple origins possible, unlike CAN)
Replication is obtained when multiple peers reside in the same fragment of the ID space
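
Prefix-based search over such a distributed tree can be sketched as follows. The peer names, the binary paths, and the reference tables are hypothetical and written out by hand; in P-Grid they would emerge from the splitting process. The key is assumed to be at least as long as every peer path.

```python
def common_prefix_len(a, b):
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def search(paths, refs, start, key):
    """Follow references until reaching a peer whose path is a prefix of `key`."""
    peer, hops = start, [start]
    while not key.startswith(paths[peer]):
        level = common_prefix_len(paths[peer], key)
        peer = refs[peer][level]   # reference into the other branch at this level
        hops.append(peer)
    return hops


# Each peer is responsible for the keys that start with its path.
paths = {"p1": "00", "p2": "01", "p3": "10", "p4": "11"}
# refs[p][l]: a peer that shares the first l bits with p and differs at bit l.
refs = {
    "p1": ["p3", "p2"],
    "p2": ["p4", "p1"],
    "p3": ["p1", "p4"],
    "p4": ["p2", "p3"],
}
hops = search(paths, refs, "p1", "1101")
```

Every hop extends the prefix shared with the key by at least one bit, so the number of hops is bounded by the path length.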

P-Grid

Comparisons

Research issues

P2P Research

P2P for reliable E-Commerce
Quality of service (fault tolerance)
– Multiple-source downloading
Richer data model
Multimedia
Message-based applications
Mobility

Appendix

Small World Networks

Downloading big files

Multiple sources
Fault tolerance
Erasure coding
– Tornado coding
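
The idea behind erasure coding can be shown with the simplest possible code, a single XOR parity block. This is deliberately not Tornado coding, which uses a more elaborate sparse construction; the sketch only illustrates that a lost block can be rebuilt from the blocks that did arrive. The function names are hypothetical.

```python
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def encode(blocks):
    """Append one parity block: the XOR of all (equal-length) data blocks."""
    parity = blocks[0]
    for block in blocks[1:]:
        parity = xor_bytes(parity, block)
    return blocks + [parity]


def recover(received):
    """Rebuild the single missing block (marked None) and return the data blocks."""
    missing = received.index(None)
    survivors = [b for b in received if b is not None]
    rebuilt = survivors[0]
    for block in survivors[1:]:
        rebuilt = xor_bytes(rebuilt, block)
    restored = list(received)
    restored[missing] = rebuilt
    return restored[:-1]           # drop the parity block


data = [b"abcd", b"efgh", b"ijkl"]
coded = encode(data)
coded[1] = None                    # one block never arrived
recovered = recover(coded)
```

A downloader fetching blocks from multiple sources can thus tolerate one failed source without asking for a retransmission.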