Large Scale Sharing

Marco F. Duarte

COMP 520: Distributed Systems, September 19, 2004

Introduction

• P2P sharing systems are very popular
• In P2P, all nodes have identical capabilities and responsibilities
• Popular approaches are partially centralized, do not scale well, or do not provide the desired anonymity
• Scalability of these systems is critical
• Need for decentralized, load-balancing architectures

Features desired in a P2P sharing system

• Decentralized architecture: no single point of failure
• Scalability: bandwidth and load balancing
• Fault tolerance: content replication
• Anonymity for users: posters, readers, storers
• Resilience against DoS attacks

Freenet provides anonymity

• No requester or provider information is implicit in communication
• Presence of a file in a node does not imply authorship
• Popular files are replicated to improve locality
• Does not intend to provide permanent storage

Freenet Queries

• Files receive FileIDs (160-bit SHA-1 hash of a “file identifier”)
• Queries have pseudo-unique random identifiers (QueryIDs) and a hops-to-live count
• Routing tables list previously retrieved FileIDs and their locations
• Queries are routed to the location with the closest FileID at each stage; loops are detected using the QueryID

FileID      Node Address
00231311    192.168.3.24
11310231    192.168.52.111
20130102    192.168.122.38
23102312    192.168.213.231
30002312    192.168.58.47
32302132    192.168.33.241
32320303    192.168.194.28
33103123    192.168.12.242

Query: which entry is closest to FileID 31302313?
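The routing step for the example query can be sketched as follows. This is a minimal illustration, not Freenet's actual implementation: the distance metric (absolute difference of the IDs read as base-4 numbers) and the names `key_distance` and `closest_entry` are assumptions made for the example.

```python
# Routing table from the slide: FileID -> node address.
ROUTING_TABLE = {
    "00231311": "192.168.3.24",
    "11310231": "192.168.52.111",
    "20130102": "192.168.122.38",
    "23102312": "192.168.213.231",
    "30002312": "192.168.58.47",
    "32302132": "192.168.33.241",
    "32320303": "192.168.194.28",
    "33103123": "192.168.12.242",
}


def key_distance(a: str, b: str) -> int:
    """Assumed metric: absolute difference of the IDs read as base-4 numbers."""
    return abs(int(a, 4) - int(b, 4))


def closest_entry(table: dict, query_id: str) -> tuple:
    """Return the (FileID, address) pair whose FileID is closest to query_id."""
    file_id = min(table, key=lambda fid: key_distance(fid, query_id))
    return file_id, table[file_id]


print(closest_entry(ROUTING_TABLE, "31302313"))
# -> ('32302132', '192.168.33.241')
```

Under this assumed metric the query for 31302313 would be forwarded to 192.168.33.241; a different notion of closeness could pick a different entry.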

Freenet Queries: Lookups and Stores

• Copies of the file are stored at all nodes along the return path
• A record for the file is added to their routing tables
• Writes perform a lookup, then insert the file along the path if no match is found
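A lookup with a hops-to-live count, QueryID loop detection, and copying along the return path might look like the toy model below. The `Node` class, its fields, and the base-4 distance are hypothetical; real Freenet nodes exchange messages over the network and may rewrite the reported source for anonymity, which this sketch omits.

```python
class Node:
    """Toy Freenet-style node; all names here are illustrative."""

    def __init__(self, name):
        self.name = name
        self.store = {}    # FileID -> file contents held locally
        self.routes = {}   # FileID -> neighbor believed to lead to it
        self.seen = set()  # QueryIDs already handled, for loop detection

    def lookup(self, query_id, file_id, htl):
        # Drop the query if it looped back here or ran out of hops.
        if query_id in self.seen or htl <= 0:
            return None
        self.seen.add(query_id)
        if file_id in self.store:
            return self.store[file_id], self
        # Try neighbors in order of how close their known FileID is.
        by_closeness = sorted(
            self.routes, key=lambda k: abs(int(k, 4) - int(file_id, 4))
        )
        for known_id in by_closeness:
            result = self.routes[known_id].lookup(query_id, file_id, htl - 1)
            if result is not None:
                data, source = result
                self.store[file_id] = data     # copy kept on the return path
                self.routes[file_id] = source  # new routing-table record
                return data, source
        return None


# Three nodes in a chain: a -> b -> c, with the file held only at c.
a, b, c = Node("a"), Node("b"), Node("c")
a.routes["3120"] = b
b.routes["3131"] = c
c.store["3130"] = "contents"
data, source = a.lookup("query-1", "3130", htl=5)
```

After the call, both `a` and `b` hold a copy of the file and a routing record for it, so a later query for the same FileID is answered closer to the requester.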

Freenet Properties

• FileID-based clustering allows for improved routing as usage increases
• LRU-like capacity management: rarely used files are purged from the system
• The random nature of FileIDs allows for diversity of information at nodes
• Attempts to supplant existing files will lead to propagation of the real file
• Anonymity features:
    • File ownership is assumed randomly by other nodes
    • Minimal routing information is necessary at each hop
    • A hops-to-live count of 1 is updated randomly
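The LRU-like capacity management can be illustrated with a small store that evicts the least recently used file once it is full. This is a sketch under simplifying assumptions: capacity is counted in files rather than bytes, and `LRUStore` is a hypothetical name, not part of Freenet.

```python
from collections import OrderedDict


class LRUStore:
    """Keeps at most `capacity` files; the least recently used is purged first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()  # FileID -> contents, oldest use first

    def get(self, file_id):
        if file_id not in self.files:
            return None
        self.files.move_to_end(file_id)  # mark as most recently used
        return self.files[file_id]

    def put(self, file_id, data):
        self.files[file_id] = data
        self.files.move_to_end(file_id)
        while len(self.files) > self.capacity:
            self.files.popitem(last=False)  # drop the least recently used


store = LRUStore(capacity=2)
store.put("0023", "A")
store.put("1131", "B")
store.get("0023")       # touch the first file
store.put("2013", "C")  # evicts "1131", the least recently used
```

This is why Freenet cannot promise permanent storage: a file that nobody requests eventually falls off the end of every node's store.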

Freenet Problems

• Files that are stored in the network may not be found
• Freenet does not provide reliable storage
• No notion of locality in routing
• Simulations do not involve file insertion or node discovery

PAST: Reliable Distributed Storage

• Customizable file persistence
• High availability and load balancing
• Efficient routing and storage allocation
• Uses FileIDs generated from hashes, as in Freenet
• Uses owner credentials to verify the identity of authors
• Interface: Insert, Lookup, Reclaim

PAST Architecture

• FileID computed from a hash of the filename, the owner’s public key, and a random salt
• Each node receives a pseudorandom NodeID, independent of the node's properties
• Owner specifies the number k of replicas of a file to store in the system on insert
• File is stored in the k nodes with NodeIDs closest to the FileID
• Routing provided by Pastry
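The FileID computation and replica placement could be sketched like this. It is illustrative only: the concatenation order fed to SHA-1, the plain absolute-difference distance (no wraparound of the ID space), and the function names `file_id` and `replica_nodes` are assumptions, not the PAST specification.

```python
import hashlib
import os


def file_id(name: str, owner_public_key: bytes, salt: bytes) -> int:
    """Hash filename, owner's public key, and salt into a 160-bit integer ID."""
    digest = hashlib.sha1(name.encode("utf-8") + owner_public_key + salt).digest()
    return int.from_bytes(digest, "big")


def replica_nodes(fid: int, node_ids, k: int):
    """Return the k NodeIDs numerically closest to the FileID (assumed metric)."""
    return sorted(node_ids, key=lambda nid: abs(nid - fid))[:k]


salt = os.urandom(8)  # a fresh random salt per insertion
fid = file_id("report.txt", b"hypothetical-public-key", salt)
print(replica_nodes(50, [10, 40, 48, 55, 90], k=3))
# -> [48, 55, 40]
```

Because the salt is part of the hash, the same file can be mapped to a different region of the ID space by picking a new salt.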

Pastry: Routing for P2P Networks

• Paths with fewer than ⌈log_{2^b} N⌉ hops
• Delivery guaranteed under a bounded number of node failures
• Flexible proximity metric
• Each node contains:
    • Leaf set: the l nodes with closest NodeIDs
    • Routing table: set of neighbors organized by NodeIDs
    • Neighborhood set: the l closest nodes
• Each NodeID is paired with its network address
• Direct routes to neighbors and the l closest NodeIDs
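To get a feel for Pastry's hop bound ⌈log_{2^b} N⌉, the snippet below evaluates it for a couple of network sizes, reading N as the number of nodes and b as the number of bits per NodeID digit (the slide does not define them). The function name `hop_bound` and the sample values are chosen for illustration.

```python
import math


def hop_bound(n_nodes: int, b: int) -> int:
    """Evaluate ceil(log base 2^b of n_nodes), the bound on routing hops."""
    return math.ceil(math.log(n_nodes) / math.log(2 ** b))


print(hop_bound(1_000_000, 4))  # -> 5
print(hop_bound(10_000, 2))     # -> 7
```

A larger b shortens routes at the cost of larger routing tables, since each table row needs one entry per possible digit value.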

Pastry: Example

• Routing table organized by similarity to NodeID
• Neighborhood set used for node addition/recovery
• Queries are forwarded to a numerically closer node (by shared NodeID prefix and NodeID proximity)
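The forwarding rule can be approximated by ranking known nodes first by the length of the NodeID prefix they share with the key, then by numeric closeness. This is a simplified sketch: real Pastry consults the leaf set and routing table separately, and the names `shared_prefix_len` and `next_hop`, the base-4 IDs, and the sample node list are assumptions for the example.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits the two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(current: str, known_nodes, key: str) -> str:
    """Pick the known node that best matches the key, or stay at `current`."""

    def rank(node_id):
        # Longer shared prefix wins; ties go to the numerically closer ID.
        return (-shared_prefix_len(node_id, key),
                abs(int(node_id, 4) - int(key, 4)))

    best = min(known_nodes, key=rank, default=current)
    return best if rank(best) < rank(current) else current


known = ["2031", "0231", "3321", "3013", "3133"]
print(next_hop("1311", known, "3130"))  # -> 3133
```

When `next_hop` returns the current node, no known node is a better match, so the message is delivered locally.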

Pastry Routing Table

[Figure: NodeID ring (0 = 2^M) showing the leaf set and neighborhood set; nodes 1311 and 2031 are labeled]

Pastry Routing Example

[Figure: NodeID ring (0 = 2^M) with nodes 1311, 2031, 0231, 3321, 3013, and 3133; other nodes exist but are not shown]

Pastry Node Insertion Example

[Figure: NodeID ring (0 = 2^M) with nodes 1311, 2031, 0231, and 3321, showing the neighborhood set and the leaf set]

Pastry Node Removal Example

[Figure: NodeID ring (0 = 2^M)]

PAST Insertions

fileID = Insert(name, owner-credentials, k, file)

[Figure, part 1: NodeID ring (0 = 2^M) with nodes 1311, 2031, 0231, and 3321; a file with FileID 3130 is inserted, the file and its certificate are stored under 3130, and the file is inserted k times]

[Figure, part 2: the same ring; k store receipts are returned]

PAST Semantics

• file = Lookup(fileID)
    • Routed to the node with NodeID closest to the FileID
    • The first of the k closest nodes found returns the file and credentials
• Reclaim(fileID, owner-credentials)
    • Same semantics as Insert
    • Owner issues a Reclaim Certificate
    • Storing nodes issue Reclaim Receipts
• Changes in leaf sets will trigger changes in replica locations
• A new node creates “pointers” to files it should contain; migration is gradual
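The three operations can be mimicked with an in-memory toy, shown below as a rough sketch. It is not the PAST API: `ToyPast` is a hypothetical class, it takes a precomputed integer FileID instead of a name, a plain owner string stands in for credentials and certificates, and the returned NodeID lists stand in for store and reclaim receipts.

```python
class ToyPast:
    """In-memory stand-in for a PAST network; every name is illustrative."""

    def __init__(self, node_ids):
        # NodeID -> {FileID: (file contents, owner)}
        self.nodes = {nid: {} for nid in node_ids}

    def _by_closeness(self, fid):
        return sorted(self.nodes, key=lambda nid: abs(nid - fid))

    def insert(self, fid, owner, k, data):
        """Store the file on the k closest nodes; return their NodeIDs."""
        targets = self._by_closeness(fid)[:k]
        for nid in targets:
            self.nodes[nid][fid] = (data, owner)
        return targets

    def lookup(self, fid):
        """Return the file from the closest node that holds it, if any."""
        for nid in self._by_closeness(fid):
            if fid in self.nodes[nid]:
                return self.nodes[nid][fid][0]
        return None

    def reclaim(self, fid, owner):
        """Release the replicas if the owner matches; return releasing NodeIDs."""
        released = []
        for nid, files in self.nodes.items():
            if fid in files and files[fid][1] == owner:
                del files[fid]
                released.append(nid)
        return released


past = ToyPast([10, 40, 48, 55, 90])
receipts = past.insert(50, "alice", k=3, data=b"payload")
```

In this toy, a reclaim by anyone other than the inserting owner releases nothing, which mirrors the role of owner credentials in the real system.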

Load Balancing in PAST: Replica Diversion

[Figure: the 3130 leaf set and the 3201 leaf set]

Load Balancing in PAST: File Diversion

[Figure: the 3130 leaf set and the 3201 leaf set; the FileID is changed by changing the salt]

• Policies govern acceptance of replicas and diverted replicas, and selection of the node for a diverted replica
• tpri, tdiv: maximum ratio of file size to free space for insertion (tpri for primary replicas, tdiv for diverted replicas)
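The acceptance test implied by tpri and tdiv can be written as a single ratio check. This is a sketch: the function name `accepts_replica` and its signature are hypothetical, and the default thresholds are simply values that appear in the performance slides (tpri = 0.1, tdiv = 0.05).

```python
def accepts_replica(file_size: float, free_space: float, diverted: bool,
                    t_pri: float = 0.1, t_div: float = 0.05) -> bool:
    """Accept a replica only if file_size / free_space is within the threshold.

    Primary replicas are checked against t_pri, diverted ones against t_div.
    """
    if free_space <= 0:
        return False  # nothing left to store into
    threshold = t_div if diverted else t_pri
    return file_size / free_space <= threshold


print(accepts_replica(8, 100, diverted=False))  # -> True  (0.08 <= 0.1)
print(accepts_replica(8, 100, diverted=True))   # -> False (0.08 > 0.05)
```

With tdiv below tpri, a node is pickier about replicas diverted to it than about replicas it is primarily responsible for, and as free space shrinks, large files are the first to be rejected.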

Caching in PAST

• Highly popular files might demand more replicas than specified
• Files located “far away” only need to be fetched once locally
• Unused disk space is allocated as cache
• Caching performance degrades gradually with increased utilization
• Cache insertion policy is similar to the diversion policies

PAST Performance: tpri comparison, tdiv = 0.05

[Chart: “Succeed” and “Utilization” series; axis ticks 0.05, 0.1, 0.2, 0.5 and 82% to 100%]

PAST Performance: Ratio of File Diversions

PAST Performance: Ratio of Replica Diversions

PAST Performance: Failed Insertions

PAST Performance: Cache Hits

Conclusions

• Content-based routing improves scalability of distributed storage systems
• Need for user authentication in distributed systems
• Caching is crucial for system performance
• Diversion allows for graceful performance degradation
• Need file mutability, file search, or indexing services
