Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows,...
![Page 1: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/1.jpg)
Chord
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes,
Robert E. Gruber
Google, Inc.
OSDI 2006
![Page 2: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/2.jpg)
Introduction
Dynamo stores objects associated with a key through a simple interface: get(), put()
It should be possible to scale Dynamo incrementally
This requires the ability to partition data over the set of nodes (storage hosts)
Dynamo relies on a concept called consistent hashing
The approach they used is similar to that found in Chord.
![Page 3: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/3.jpg)
Distributed Hash Tables (DHT)
Operationally like standard hash tables
Stores (key, value) pairs
The key is like a filename
The value can be file contents or a pointer to a location
Goal: Efficiently insert/lookup/delete (key, value) pairs
Each peer stores a subset of the (key, value) pairs in the system
![Page 4: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/4.jpg)
DHT
Core operation: Find the node responsible for a key
Map key to node
Efficiently route insert/lookup/delete requests to this node
Allow for frequent node arrivals and departures
![Page 5: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/5.jpg)
DHT
Introduce a hash function to map the object being searched for to a unique global identifier: e.g., h(“NGC’02 Tutorial Notes”) → 8045
Distribute the range of the hash function among all nodes in the network
Each node must “know about” at least one copy of each object that hashes within its range (when one exists)
[Figure: nodes labeled with the hash ranges they cover (0-999, 9500-9999, 1000-1999, 1500-4999, 9000-9500, 4500-6999, 8000-8999, 7000-8500); the example object hashes to 8045.]
![Page 6: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/6.jpg)
DHT:Desirable Properties
Key ID space (search space) is uniformly populated
Mapping of keys to IDs using (consistent) hashing
A node is responsible for indexing all the keys in a certain subspace of the ID space
Nodes have only partial knowledge of other nodes’ responsibilities
Messages should be routed to a node efficiently (small number of hops)
Node arrival/departure should only affect a few nodes.
![Page 7: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/7.jpg)
Consistent Hashing
The main idea: map both keys and nodes (node IPs) to the same (metric) ID space
![Page 8: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/8.jpg)
The ring is just one possibility. Any metric space will do.
![Page 9: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/9.jpg)
Consistent Hashing
With high probability, the hash function balances load (all nodes receive roughly the same number of keys).
With high probability, when a node joins (or leaves) the network, only an O(1/N) fraction of the keys are moved to a different location.
This is clearly the minimum necessary to maintain a balanced load.
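A minimal sketch of the key-movement property, assuming a 32-bit ID space, ten hypothetical node addresses (10.0.0.1 to 10.0.0.10) and 1000 hypothetical keys. The helper names `ident` and `owner` are illustrative and not taken from Chord or Dynamo. When an eleventh node joins, the only keys that change owner are the ones the new node takes over.

```python
import hashlib
from bisect import bisect_left

M = 32  # number of identifier bits (assumed for this sketch)

def ident(name: str) -> int:
    """Hash a string to an M-bit identifier using SHA-1."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)

def owner(key_id: int, node_ids: list) -> int:
    """First node ID at or after key_id, wrapping around the ring.

    node_ids must be sorted in ascending order.
    """
    idx = bisect_left(node_ids, key_id)
    return node_ids[idx % len(node_ids)]

# Hypothetical nodes and keys.
nodes = sorted(ident(f"10.0.0.{i}") for i in range(1, 11))
keys = [ident(f"key-{i}") for i in range(1000)]
before = {k: owner(k, nodes) for k in keys}

# An eleventh node joins.
new_node = ident("10.0.0.11")
nodes_after = sorted(nodes + [new_node])
after = {k: owner(k, nodes_after) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all to the new node")
```

The count printed depends on where the hashes happen to land; the structural point is that no key moves between two pre-existing nodes.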
![Page 10: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/10.jpg)
Consistent Hashing
The consistent hash function assigns each node and key an m-bit identifier using SHA-1 as a base hash function.
A node’s identifier is chosen by hashing the node’s IP address.
A key identifier is produced by hashing the key.
For more info see:
D. R. Karger, E. Lehman, F. Leighton, M. Levine, D. Lewin, and R. Panigrahy, “Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web,” in Proc. 29th ACM Symp. Theory of Computing, El Paso, TX, May 1997, pp. 654–663.
![Page 11: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/11.jpg)
P2P Middleware: Differences
Different P2P middlewares differ in:
The choice of the ID space
The structure of their network of nodes (i.e., how each node chooses its neighbors)
For each object, the node(s) whose range(s) cover that object must be reachable via a “short” path
This is a major research topic
![Page 12: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/12.jpg)
Chord
m-bit identifier space for both keys and nodes
Key identifier = SHA-1(key): Key = “LetItBe” → SHA-1 → ID=50
Node identifier = SHA-1(IP address): IP = “129.100.16.93” → SHA-1 → ID=70
How do we assign keys to nodes?
![Page 13: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/13.jpg)
Chord
Nodes organized in an identifier circle based on node identifiers
Keys are assigned to their successor node in the identifier circle, i.e., the first node whose ID is equal to or follows the key's ID.
![Page 14: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/14.jpg)
Chord
The hash function ensures an even distribution of nodes and keys on the circle
The range covered by a node is from the previous node's ID up to its own ID
Assume an N node network
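The successor rule can be sketched in a few lines. The ring below (m = 6, ten nodes) and the key IDs are hypothetical values chosen for illustration; `successor` is an illustrative helper computed from a global view of the ring, which a real node would not have.

```python
from bisect import bisect_left

def successor(ident: int, node_ids: list) -> int:
    """First node whose ID is equal to or follows ident on the circle."""
    ring = sorted(node_ids)
    idx = bisect_left(ring, ident)
    return ring[idx % len(ring)]  # wrap past the highest ID to the lowest

# Hypothetical 6-bit ring (IDs 0..63).
nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]

for key in (10, 24, 38, 54, 60):
    print(f"K{key} -> N{successor(key, nodes)}")
```

Key 38 lands on node 38 itself, and key 60 wraps around to node 1, the lowest ID on the circle.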
![Page 15: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/15.jpg)
Chord: Search Possibilities
Routing table size vs. search cost
Every peer knows every other peer: O(N) routing table size
Every peer knows only its successor: O(N) search time
The “compromise” is to have each peer know about m other peers, spaced at exponentially increasing distances around the circle.
![Page 16: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/16.jpg)
Finger Table
Let m be the number of bits in the key/node identifiers
Each node, n, maintains a routing table with at most m entries called the finger table.
The ith entry in the table at node n contains the identity of the first node, s, that succeeds n by at least 2^(i-1) on the identifier circle:
s = successor((n + 2^(i-1)) mod 2^m)
s is called the ith finger of node n
![Page 17: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/17.jpg)
Chord:Finger Table
Finger table: finger[i] = successor((n + 2^(i-1)) mod 2^m), where 1 ≤ i ≤ m
O(log N) table size
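As an illustration of the finger formula, the sketch below builds the finger table of one node from a global view of a hypothetical 6-bit ring. The node IDs and the helper names are assumptions for the example; a real node would fill its table through lookups rather than from a full node list.

```python
from bisect import bisect_left

def successor(ident, ring):
    """First node ID at or after ident; ring must be sorted ascending."""
    idx = bisect_left(ring, ident)
    return ring[idx % len(ring)]

def finger_table(n, ring, m):
    """Return [(start, node)] with node = successor((n + 2^(i-1)) mod 2^m), i = 1..m."""
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % (2 ** m)
        table.append((start, successor(start, ring)))
    return table

# Hypothetical ring with m = 6 (IDs 0..63).
ring = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]

for i, (start, node) in enumerate(finger_table(8, ring, 6), start=1):
    print(f"finger[{i}]: start={start:2d} -> N{node}")
```

Several entries can point at the same node (here the first three fingers of N8 all resolve to N14), which is why the number of distinct fingers is O(log N) even though the table has m slots.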
![Page 26: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/26.jpg)
The Chord algorithm: Scalable node localization
![Page 27: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/27.jpg)
Chord: Search
Assume node n is searching for key k. Node n does the following:
Find the ith table entry of node n such that k ∈ [finger[i].start, finger[i+1].start)
If no such entry exists then return the node in the last entry of the finger table
The above two steps are repeated until the condition in the first step is satisfied.
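A sketch of finger-based search, simulated in one process. It uses the closest-preceding-finger formulation from the Chord paper, which is worded a little differently from the steps above but serves the same purpose: each hop jumps to the finger that gets closest to the key without passing it. The ring, `M`, and all function names are assumptions for illustration.

```python
from bisect import bisect_left

M = 6                                            # identifier bits (assumed)
RING = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]    # hypothetical node IDs, sorted

def successor(ident):
    """First node at or after ident (mod 2^M), using a global view of the ring."""
    idx = bisect_left(RING, ident % 2 ** M)
    return RING[idx % len(RING)]

def in_open(x, a, b):
    """True if x lies strictly inside the clockwise interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b        # the interval wraps past zero

def in_half_open(x, a, b):
    """True if x lies in the clockwise interval (a, b]."""
    return x == b or in_open(x, a, b)

def fingers(n):
    """Finger nodes of n: successor(n + 2^(i-1)) for i = 1..M."""
    return [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)]

def lookup(start, key):
    """Return (node responsible for key, list of nodes visited)."""
    n, path = start, [start]
    # Stop once the key falls between n and n's immediate successor.
    while not in_half_open(key, n, fingers(n)[0]):
        nxt = n
        for f in reversed(fingers(n)):           # farthest finger first
            if in_open(f, n, key):
                nxt = f
                break
        if nxt == n:                             # no closer finger known
            break
        n = nxt
        path.append(n)
    return fingers(n)[0], path

node, path = lookup(8, 54)
print(f"key 54 is stored at N{node}; visited {path}")
```

Each hop at least halves the remaining clockwise distance to the key, which is where the O(log N) hop count comes from.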
![Page 28: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/28.jpg)
Chord: Join
Nodes can join (and leave) at any time.
Challenge: Preserving the ability to locate every key in the network
Chord must preserve the following:
Each node’s successor is correctly maintained
For every key k, node successor(k) is responsible for k.
For lookups to be fast, it is desirable for the finger tables to be correct.
![Page 29: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/29.jpg)
Chord: Join Implementation
Each node in Chord maintains a predecessor pointer.
This consists of the Chord ID and IP address of the immediate predecessor of that node.
It can be used to walk counterclockwise around the identifier circle.
The new node to be added learns the identity of an existing Chord node by some external mechanism
![Page 30: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/30.jpg)
Chord: Join Initialization Steps
Assume n is the node to join.
Find any existing node, n’.
Find the successor of n from n’. Label this successor(n).
Ask successor(n) for its predecessor. This is labelled as predecessor(successor(n)).
![Page 31: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/31.jpg)
Chord: Join Example
•Assume N26 wants to join; it finds N8
•N8’s finger table suggests that N26 will be “between” N21 and N32.
![Page 32: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/32.jpg)
Chord: Join (Initialize finger table)
Node n needs to have its finger table initialized
Node n can ask its predecessor for a copy of its finger table to use as a starting point
![Page 33: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/33.jpg)
Chord: Join (Changing Existing Finger Tables)
Node n needs to be entered into the finger tables of some existing nodes.
Node n becomes the ith finger of node p if and only if p precedes n by at least 2^(i-1), and the ith finger of node p succeeds n.
The first node, p, that can satisfy these conditions is the immediate predecessor of n - 2^(i-1)
For a given n, the algorithm starts with the ith finger of node n and then continues to walk in the counter-clockwise direction on the identifier circle until it encounters a node whose ith finger precedes n.
![Page 34: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/34.jpg)
Chord: Join Example (add N26)
N21 (old finger table)
N21+1   N32
N21+2   N32
N21+4   N32
N21+8   N32
N21+16  N38
N21+32  N56

N21 (new finger table)
N21+1   N26
N21+2   N26
N21+4   N26
N21+8   N32
N21+16  N38
N21+32  N56
i=1: Does N21 precede N26 by at least 1 (2^(i-1))? Yes: N21+1 becomes N26;
i=2: Does N21 precede N26 by at least 2? Yes: N21+2 becomes N26;
i=3: Does N21 precede N26 by at least 4? Yes: N21+4 becomes N26;
i=4: Does N21 precede N26 by at least 8? No; evaluate N14;
![Page 35: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/35.jpg)
Chord: Join Example (add N26)
N14 (old finger table)
N14+1   N21
N14+2   N21
N14+4   N21
N14+8   N32
N14+16  N32
N14+32  N48

N14 (new finger table)
N14+1   N21
N14+2   N21
N14+4   N21
N14+8   N26
N14+16  N32
N14+32  N48
i=4: Does N14 precede N26 by at least 8? Yes: N14+8 becomes N26
i=5: Does N14 precede N26 by at least 16? No; evaluate N8
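The tables in this example can be cross-checked with a short sketch. It assumes m = 6 and a ring made up only of the nodes named on these slides (N8, N14, N21, N32, N38, N48, N56), and it recomputes every finger table from a global view before and after N26 joins. This is a verification aid, not the distributed counter-clockwise update walk that Chord performs.

```python
from bisect import bisect_left

M = 6  # identifier bits (assumed), so IDs live in 0..63

def finger_table(n, nodes):
    """Return [(start, successor(start))] for i = 1..M, from a global view of the ring."""
    ring = sorted(nodes)
    table = []
    for i in range(1, M + 1):
        start = (n + 2 ** (i - 1)) % (2 ** M)
        idx = bisect_left(ring, start)
        table.append((start, ring[idx % len(ring)]))
    return table

before = [8, 14, 21, 32, 38, 48, 56]   # nodes named in the slides
after = before + [26]                  # N26 has joined

# Report every finger entry that changes because of the join.
for n in before:
    old, new = finger_table(n, before), finger_table(n, after)
    for i, ((start, o), (_, s)) in enumerate(zip(old, new), start=1):
        if o != s:
            print(f"N{n} finger {i} (start {start}): N{o} -> N{s}")
```

Every changed entry previously pointed at N32, the node that used to cover the interval N26 now owns.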
Etc
![Page 36: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/36.jpg)
Chord: Join (Transferring Keys)
Move responsibility to node n for all the keys for which node n is now the successor.
Typically this involves moving data associated with each key to the new node.
Node n can become the successor only for keys that were previously the responsibility of the node immediately following n.
Node n only needs to contact one node to transfer responsibility for all relevant keys.
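A sketch of the transfer step, assuming each node keeps its keys in a dictionary indexed by key ID. The function names and the stored values are hypothetical. The new node takes every key in the clockwise interval (predecessor, new node] from its successor and leaves the rest.

```python
def in_half_open(x, a, b):
    """True if x lies in the clockwise interval (a, b] on the identifier circle."""
    if a < b:
        return a < x <= b
    return x > a or x <= b       # the interval wraps past zero

def transfer_keys(new_id, pred_id, succ_store):
    """Split the successor's store into (moved to new node, kept by successor)."""
    moved = {k: v for k, v in succ_store.items() if in_half_open(k, pred_id, new_id)}
    kept = {k: v for k, v in succ_store.items() if k not in moved}
    return moved, kept

# N26 joins between N21 and N32; N32 currently holds keys 24 and 30 (hypothetical data).
n32_store = {24: "value-a", 30: "value-b"}
n26_store, n32_store = transfer_keys(26, 21, n32_store)
print("N26 now holds", sorted(n26_store), "and N32 keeps", sorted(n32_store))
```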
![Page 37: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/37.jpg)
Chord: Join
The previous discussion on join focuses on a single node join.
What if there are multiple node joins?
Join requires that each node’s successor is correctly maintained
![Page 38: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/38.jpg)
Chord: Stabilization Protocol
The successor/predecessor links are rebuilt by periodic stabilize notification messages
Sent by each node to its successor to inform it of the (possibly new) identity of the predecessor
The successor pointers are used to verify and correct finger table entries.
![Page 39: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/39.jpg)
Chord: Join/Stabilize Example
![Page 40: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/40.jpg)
Chord: Join/Stabilize Example
• N26 joins the system
• N26 acquires N32 as its successor
• N26 notifies N32
• N32 acquires N26 as its predecessor
![Page 41: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/41.jpg)
Chord: Join/Stabilize Example
• N26 copies keys
• N21 runs stabilize() and asks its successor N32 for its predecessor which is N26.
![Page 42: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/42.jpg)
Chord: Join/Stabilize Example
• N21 acquires N26 as its successor
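The join/stabilize sequence in this example can be replayed with a small in-process sketch. The `Node` class and the `stabilize` and `notify` functions follow the general shape of the Chord paper's pseudocode, but the names and the two-node starting ring (only N21 and N32) are simplifying assumptions; there is no networking or key copying here.

```python
class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

def between(x, a, b):
    """True if x lies strictly inside the clockwise interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b        # the interval wraps past zero

def notify(node, candidate):
    """candidate tells node that it may be node's predecessor."""
    if node.predecessor is None or between(candidate.id, node.predecessor.id, node.id):
        node.predecessor = candidate

def stabilize(node):
    """Ask the successor for its predecessor, adopt it if closer, then notify."""
    x = node.successor.predecessor
    if x is not None and between(x.id, node.id, node.successor.id):
        node.successor = x
    notify(node.successor, node)

# Assumed starting ring: just N21 and N32, already consistent.
n21, n26, n32 = Node(21), Node(26), Node(32)
n21.successor, n32.predecessor = n32, n21
n32.successor, n21.predecessor = n21, n32

n26.successor = n32   # N26 joins and acquires N32 as its successor
stabilize(n26)        # N26 notifies N32; N32 acquires N26 as its predecessor
stabilize(n21)        # N21 asks N32 for its predecessor and learns about N26

print(f"N21.successor = N{n21.successor.id}, N32.predecessor = N{n32.predecessor.id}")
```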
![Page 43: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/43.jpg)
Chord Stabilization
Pointers and finger tables may be in a state of flux
Is it possible that data will not be found? Yes
Recovery: try again
![Page 44: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/44.jpg)
Chord: Node Failure
[Figure: identifier circle with nodes N10, N80, N85, N102, N113, N120 and a Lookup(90) request.]
N80 doesn’t know its correct successor, so the lookup is incorrect
![Page 45: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/45.jpg)
Chord: Node Failure
Solution: Use successor lists
Each node knows r immediate successors
After failure, will know first live successor
Stabilize messages correct finger tables
Replicas of the data associated with a key at the r successor nodes might be used
Application dependent
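A sketch of how a successor list lets a node skip failed neighbors. The list for N80 (r = 3) and the set of failed nodes are hypothetical, and `is_alive` stands in for whatever liveness check (for example a ping with a timeout) an implementation would use.

```python
def first_live_successor(successor_list, is_alive):
    """Return the first entry that is still reachable, or None if all r have failed."""
    for node in successor_list:
        if is_alive(node):
            return node
    return None

# Hypothetical state at N80 with r = 3; N85 and N102 have failed.
n80_successors = [85, 102, 113]
failed = {85, 102}

live = first_live_successor(n80_successors, lambda n: n not in failed)
print(f"N80 forwards Lookup(90) to N{live}")
```

With only a single successor pointer, N80 would have had nowhere to send the lookup once N85 failed.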
![Page 46: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/46.jpg)
Chord Properties
In a system with N nodes and K keys, with high probability…
each node is responsible for at most (1+ε)K/N keys
each node maintains info. about O(log N) other nodes
lookups are resolved with O(log N) hops
Insertions: O(log^2 N)
The developers of Chord validated this through simulation studies.
No consistency among replicas
Hops have poor network locality
![Page 47: Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,](https://reader036.fdocuments.in/reader036/viewer/2022062500/5697bf7c1a28abf838c84178/html5/thumbnails/47.jpg)
Chord: Network Locality
Nodes close on ring can be far in the network.
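[Figure: map of hosts (CA-T1, CCI, Aros, Utah, CMU, vu.nl, Lulea.se, MIT, MA-Cable, Cisco, Cornell, NYU, OR-DSL) showing ring neighbors N20, N40, N41, N80 located far apart in the network.]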
* Figure from http://project-iris.net/talks/dht-toronto-03.ppt