Network
Transcript of Network
Network
Principles of Computer System (2012 Fall)
End-to-end Layer
Review
• System Complexity
• Modularity & Naming
• Enforced Modularity
– C/S
– Virtualization: C/S on one host
• Network
– Layers & Protocols
– P2P & Congestion control
Network is a system too
• Network as a system
– A network consists of many networks, many links, and many switches
– The Internet is a case study of a successful system
Delay (transit time)
• Propagation delay
– Depends on the speed of light in the transmission medium
• Transmission delay
– Depends on the data rate of the link and the length of the frame
– Incurred each time the packet is transmitted over a link
• Processing delay
– E.g., examining the guidance information, verifying the checksum, etc.
– And copying to/from memory
• Queuing delay
– Waiting in a buffer
– Depends on the amount of other traffic
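The four components above add up per link. A minimal sketch; the link rate, distance, and per-packet costs below are illustrative assumptions, not numbers from the slides:

```python
# Adding up the four delay components for one link. All numbers are
# illustrative assumptions, not from the slides.
frame_bits = 12_000            # a 1500-byte frame
link_rate = 100e6              # 100 Mbps link
distance_m = 2_000_000         # 2000 km of fiber
signal_speed = 2e8             # roughly 2/3 the speed of light, in fiber

transmission = frame_bits / link_rate    # depends on rate and frame length
propagation = distance_m / signal_speed  # depends on medium and distance
processing = 50e-6                       # assumed header/checksum/copy cost
queuing = 200e-6                         # depends on competing traffic

total = transmission + propagation + processing + queuing
print(f"one-link delay: {total * 1e3:.2f} ms")
```

With these numbers propagation dominates, which is typical for long links.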
Recursive network composition
• Gnutella is a large decentralized P2P network
• The link layer itself is a network
The Internet “Hour Glass”
Packet routing/forwarding
• Packet switching
– Routing: choosing a particular path (control plane)
– Forwarding: choosing an outgoing link (data plane)
• Usually by table lookup
Best-effort network
• Best-effort network
– If it cannot dispatch a packet, it may discard it
• Guaranteed-delivery network
– Also called a store-and-forward network; it does not discard data
– Works with complete messages rather than packets
– Uses disk for buffering to handle peaks
– Tracks individual messages to make sure none are lost
• In the real world
– There are no absolute guarantees
– Guaranteed delivery sits at a higher layer; best effort at a lower layer
NAT (Network Address Translation)
• Private network
– Public routers don’t accept routes to network 10
• NAT router: bridges the private and public networks
– A router between the private and public network
– Send: modify the source address to a temporary public address
– Receive: modify it back by looking up the mapping table
• Limitations
– Some end-to-end protocols place addresses in payloads
– The translator may become a bottleneck
– What if two private networks merge?
CASE STUDY: MAPPING INTERNET TO ETHERNET
Case study: mapping Internet to Ethernet
• Listen-before-sending rule; collisions
• Ethernet: CSMA/CD
– Carrier Sense Multiple Access with Collision Detection
• Ethernet types
– Experimental Ethernet, 3 Mbps
– Standard Ethernet, 10 Mbps
– Fast Ethernet, 100 Mbps
– Gigabit Ethernet, 1000 Mbps
Overview of Ethernet
• A half-duplex Ethernet
– The max propagation time is less than 576 bit times, the shortest allowable packet
– So that two parties can detect a collision together
• Collision: wait a random time first, with exponential backoff on repeats
• A full-duplex, point-to-point Ethernet
– No collisions; the max length of the link is determined by the physical medium
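The backoff rule can be sketched as binary exponential backoff; the cap of 10 doublings follows classic Ethernet practice and is an assumption here:

```python
import random

def backoff_slots(attempt):
    """Binary exponential backoff: after the n-th successive collision,
    wait a random number of slot times in [0, 2**min(n, 10) - 1].
    The cap of 10 doublings is the classic Ethernet choice (assumed)."""
    return random.randrange(2 ** min(attempt, 10))
```

Each repeated collision doubles the range of possible waits, so competing stations spread out quickly.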
Broadcast aspects of Ethernet
• Broadcast network
– Every frame is delivered to every station
– (Compare with a forwarding network)
• ETHERNET_SEND
– Passes the call along to the link layer
• ETHERNET_HANDLE
– Simple; can even be implemented in hardware
Layer mapping
• The internet network layer
– NETWORK_SEND (data, length, RPC, INTERNET, N)
– NETWORK_SEND (data, length, RPC, ENET, 18)
• L must maintain a table
ARP (Address Resolution Protocol)
• NETWORK_SEND (“where is M?”, 11, ARP, ENET, BROADCAST)
• NETWORK_SEND (“M is at station 15”, 18, ARP, ENET, BROADCAST)
• L asks for E’s Ethernet address; E does not hear the Ethernet broadcast, but the router at station 19 does, and it sends a suitable ARP response instead
• Manage the forwarding table as a cache
ARP & RARP protocol
• Name mapping: IP address <-> MAC address
ARP spoofing
END-TO-END LAYER
E2E Transport
• Reliability: “at-least-once delivery”
– Lock-step
– Sliding window
• Congestion control
– Flow control
– Additive increase, multiplicative decrease
The end-to-end layer
• The network layer is not enough; it gives no guarantees about
– Delay
– Order of arrival
– Certainty of arrival
– Accuracy of content
– The right place to deliver
• End-to-end layer
– No single design is likely to suffice
– A transport protocol for each class of application
Famous transport protocols
• UDP (User Datagram Protocol)
– Used directly by some simple applications
– Also used as a component of other protocols
• TCP (Transmission Control Protocol)
– Keeps order; no missing, no duplicated data
– Provision for flow control
• RTP (Real-time Transport Protocol)
– Built on UDP
– Used for streaming video or voice, etc.
• Other protocols, such as presentation protocols
Assurances of an end-to-end protocol
• Seven assurances
1. Assurance of at-least-once delivery
2. Assurance of at-most-once delivery
3. Assurance of data integrity
4. Assurance of end-to-end performance
5. Assurance of stream order & closing of connections
6. Assurance of jitter control
7. Assurance of authenticity and privacy
Assurance of at-least-once delivery
• RTT (round-trip time)
– to_time + process_time + back_time (ack)
• At-least-once on a best-effort network
– Send the packet with a nonce
– The sender keeps a copy of the packet
– Resend if a timeout occurs before an acknowledgment is received
– The receiver acknowledges a packet with its nonce
• Try a limited number of times before returning an error to the app
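The resend loop above can be sketched as follows. `send` and `recv_ack` are hypothetical transport hooks, not a real API:

```python
import random

def send_at_least_once(send, recv_ack, data, max_tries=5):
    """Sketch of the slide's at-least-once pattern. `send` and `recv_ack`
    are hypothetical transport hooks: recv_ack blocks until a timeout and
    returns the nonce of an acknowledgment, or None on timeout."""
    nonce = random.getrandbits(32)       # tags the packet and its ack
    copy = data                          # sender keeps a copy for resending
    for _ in range(max_tries):           # try a limited number of times
        send(copy, nonce)
        if recv_ack() == nonce:          # receiver echoed our nonce
            return True
    return False                         # report the error to the app
```

Note that a success only means the data arrived at least once; duplicates are still possible.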
Assurance of at-least-once delivery
• Dilemma
– 1. The data was not delivered
– 2. The data was delivered, but no ACK was received
– There is no way to know which situation occurred
• At-least-once delivery
– No absolute assurance of at-least-once delivery
– Ensures that if it is possible to get through, the message will get through eventually
– Ensures that if it’s not possible to confirm delivery, the app will know
– No assurance of no-duplication
Timeout
• Fixed timer: the dilemma of fixed timers
– Too short: unnecessary resends
– Too long: takes a long time to discover lost packets
• Adaptive timer
– E.g., adjust by the currently observed RTT; set the timer to 150%
– Exponential backoff: wait 1, 2, 4, 8, 16, ... seconds
• NAK (Negative Acknowledgment)
– The receiver sends a message that lists missing items
– The receiver can count arriving segments rather than using a timer
– The receiver can have no timer (only once per stream)
Congestion collapse in NFS
• Congestion collapse in NFS
– Using at-least-once with a stateless interface
– Persistent client: repeats resending forever
– Server: FIFO
– Requests time out while queued and are resent
– The server re-executes the resent request and wastes time
• This gets worse as the queue becomes longer
– Lesson: fixed timers are always a source of trouble, sometimes catastrophic trouble
Emergent phase synchronization of periodic protocols
• Periodic polling
– E.g., picking up mail, sending “are-you-there?”
– A workstation sends a broadcast packet every 5 minutes
– All workstations end up trying to broadcast at the same time
• Each workstation
– Sends a broadcast
– Sets a fixed timer
• Lesson: fixed timers have many evils. Don’t assume that unsynchronized periodic activities will stay that way
Wisconsin time server meltdown
• NETGEAR added a feature to a wireless router
– Logging packets -> timestamp -> time server (SNTP) -> name discovery -> 128.105.39.11
– Once per second until a response is received
– Once per minute or per day after that
• Wisconsin Univ.
– On May 14, 2003, at about 8:00 a.m.
– From 20,000 to 60,000 requests per second, filtering port 23457
– After one week, 270,000 requests per second, 150 Mbps
Wisconsin time server meltdown
• Lesson(s)
– Fixed timers again
– A fixed Internet address
– The client implements only part of a protocol
• There is a reason for features such as the “go away” response in SNTP
Timeout
• RTT
– The timeout should depend on the RTT
– The sender measures the time between transmitting a packet and receiving its ack, which gives one sample of the RTT
RTT Could be Highly Variable
Calculating RTT and Timeout (in TCP)
• Exponentially weighted moving average
– Estimate both the average rtt_avg and the deviation rtt_dev
– Procedure calc_rtt(rtt_sample)
• rtt_avg = a*rtt_sample + (1-a)*rtt_avg; /* a = 1/8 */
• dev = absolute(rtt_sample – rtt_avg);
• rtt_dev = b*dev + (1-b)*rtt_dev; /* b = 1/4 */
– Procedure calc_timeout(rtt_avg, rtt_dev)
• timeout = rtt_avg + 4*rtt_dev
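The two procedures above, written as runnable Python. Seeding the average with the first sample is an assumption; the slide does not say how to initialize:

```python
def make_rtt_estimator(a=1/8, b=1/4):
    """The slide's calc_rtt/calc_timeout as one closure: an EWMA of the
    RTT average and of the deviation; timeout = rtt_avg + 4*rtt_dev."""
    state = {"avg": None, "dev": 0.0}

    def observe(rtt_sample):
        if state["avg"] is None:
            state["avg"] = rtt_sample           # first sample seeds the average
        else:
            state["avg"] = a * rtt_sample + (1 - a) * state["avg"]
        dev = abs(rtt_sample - state["avg"])    # deviation of this sample
        state["dev"] = b * dev + (1 - b) * state["dev"]
        return state["avg"] + 4 * state["dev"]  # calc_timeout
    return observe
```

Feeding each measured RTT sample to `observe` yields the timeout to use for the next packet.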
End-to-end performance
• Multi-segment message questions
– Trade-off between complexity and performance
– Lock-step protocol
Overlapping transmissions
• Pipelining technique
Fixed Window
• The receiver tells the sender a window size
• The sender sends a window
• The receiver acks each packet as before
• The window advances when all packets in the previous window are acked
– E.g., packets 4-6 are sent after 1-3 are ack’d
• If a packet times out -> retransmit packets
• Still much idle time
Sliding Window
• The sender advances the window by 1 for each in-sequence ack it receives
– Reduces idle periods
– The pipelining idea
• But what’s the correct value for the window?
– We’ll revisit this question
– First, we need to understand windows
Overlapping transmissions
• Problems
– Packets or ACKs may be lost
• The sender holds a list of segments sent and checks them off as ACKs arrive
• Set a timer (a little more than the RTT) for the last segment
– If the list of missing ACKs is empty, OK
– If the timer expires, resend the packets and set another timer
Handling Packet Loss
Choose the right window size
• Window too small
– Long idle times
– Underutilized network
• Window too large
– Congestion
Sliding window size
window size ≥ round-trip time × bottleneck data rate
• Sliding window with one segment in size
– Data rate is window size / RTT
• Enlarge the window size up to the bottleneck data rate
– Data rate is window size / RTT
• Enlarge the window size further
– Data rate is still the bottleneck rate
– A larger window makes no sense
Self-pacing
• Sliding window size
– Although the sender doesn’t know the bottleneck, it is sending at exactly that rate
– Once the sender fills a sliding window, it cannot send the next data until it receives the ACK of the oldest data in the window
– The receiver cannot generate ACKs faster than the network can deliver data elements
– E.g., receiver 500 KBps, sender 1 MBps, RTT 70 ms, each segment carries 512 bytes: sliding window size = 70 segments (35 KB)
– RTT estimation is still needed
• Needs to err on the side of being too small
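Checking the example's arithmetic with the window-size rule (taking 1 KB as 1024 bytes, which is what makes the slide's numbers come out to exactly 70 segments):

```python
# The slide's example: bottleneck (receiver) 500 KBps, RTT 70 ms,
# 512-byte segments; rule: window size >= RTT x bottleneck data rate.
bottleneck = 500 * 1024          # bytes/second (1 KB taken as 1024 bytes)
rtt = 70 / 1000                  # seconds
segment = 512                    # bytes per segment

window_bytes = bottleneck * rtt           # 35 KB
window_segments = window_bytes / segment  # the slide's 70 segments
print(window_bytes, window_segments)
```

With that window the sender's 1 MBps capability is irrelevant; ACK arrivals pace it at the 500 KBps bottleneck.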
Congestion control
• Requires the cooperation of more than one layer
• With a shared resource and demands from several statistically independent sources, there will be fluctuations in the arrival of load, and thus in the length of the queue and the time spent waiting
Managing shared resources
• Overload is inevitable, but how long does it last?
– A queue handles short bursts by time-averaging with adjacent periods when there is excess capacity
– If overload persists longer than the service time, it may cause maximal delay, which is called congestion
• Congestion
– May be temporary or chronic
– Stability of offered load
• A large number of small sources vs. a small number of large sources
– Congestion collapse
• Competition for a resource sometimes leads to wasting that resource (sales & checkout clerks)
Congestion collapse
Setting Window Size: Congestion
Congestion Control
• Basic idea:
– Increase cwnd slowly
– If no drops -> no congestion yet
– If a drop occurs -> decrease cwnd quickly
• Use the idea in a distributed protocol that achieves
– Efficiency, i.e., uses the bottleneck capacity efficiently
– Fairness, i.e., senders sharing a bottleneck get equal throughput (if they have demands)
• Every RTT:
– No drop: cwnd = cwnd + 1
– A drop: cwnd = cwnd / 2
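A toy simulation of the per-RTT rule. The shared capacity of 20 segments and the synchronized drop signal are simplifying assumptions, but they show the AIMD convergence:

```python
def aimd(cwnd, drop):
    """One RTT of the slide's rule: additive increase by one segment
    when there is no drop, multiplicative decrease (halve) on a drop."""
    return cwnd / 2 if drop else cwnd + 1

# Two senders share an assumed bottleneck of 20 segments; both observe
# a drop whenever their combined windows exceed it.
w1, w2 = 1.0, 10.0
for _ in range(200):
    drop = (w1 + w2) > 20
    w1, w2 = aimd(w1, drop), aimd(w2, drop)
# the two windows converge toward equal shares of the bottleneck
```

The additive step leaves the gap between the windows unchanged while each halving shrinks it, which is why AIMD drives the senders toward fairness.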
Additive Increase
AIMD Leads to Efficiency and Fairness
Retrofitting TCP
• 1. Slow start: one packet at first, then double until
– The sender reaches the window size suggested by the receiver
– All the available data has been dispatched
– The sender detects that a packet it sent has been discarded
• 2. Duplicate ACK
– When the receiver gets an out-of-order packet, it sends back a duplicate of the latest ACK
• 3. Equilibrium
– Additive increase & multiplicative decrease
• 4. Restart, after waiting a short time
Summary of E2E Transport
• Reliability using a sliding window
– Tx rate = W / RTT
• Congestion control
– W = min(receiver_buffer, cwnd)
– cwnd is adapted by the congestion control protocol to ensure efficiency and fairness
– TCP congestion control uses AIMD, which provides fairness and efficiency in a distributed way
P2P NETWORK
Downsides of C/S
• Centralized infrastructure
– Centralized point of failure
– High management costs
• If one org has to host millions of files, etc.
– Not suitable for many scenarios
• E.g., cooperation between you and me
– Lacks the ability to aggregate clients
P2P: Peer-to-peer
• No central servers!
• Questions
– How to track nodes and objects in the system?
– How do you find other nodes in the system?
– How should data be split up between nodes?
– How to prevent data from being lost?
• How to keep it available?
– How to provide consistency?
– How to provide security? Anonymity?
BitTorrent
• Usage model: cooperative
– A user downloads a file from someone using a simple user interface
– While downloading, BitTorrent serves the file to others as well
– BitTorrent keeps running for a little while after the download completes
• 3 roles
– Tracker: knows which peer serves which parts of a file
– Seeder: owns the whole file
– Peer: turns into a seeder once it has 100% of a file
BitTorrent
• Publisher puts a .torrent file on a Web server (e.g., suprnova.org)
– URL of the tracker
– File name, length
– SHA1s of the data blocks (64-512 KB)
• Tracker
– Organizes a swarm of peers (who has what block?)
• Seed posts the URL of the .torrent with the tracker
– The seed must have a complete copy of the file
– Every peer that is online and has a copy of the file becomes a seed
• Peer asks the tracker for a list of peers to download from
– The tracker returns a list with a random selection of peers
• Peers contact peers to learn what parts of the file they have, etc.
– Download from other peers
A torrent file
{
  'announce': 'http://bttracker.debian.org:6969/announce',
  'info': {
    'name': 'debian-503-amd64-CD-1.iso',
    'piece length': 262144,
    'length': 678301696,
    'pieces': '841ae846bc5b6d7bd6e9aa3dd9e551559c82abc1...d14f1631d776008f83772ee170c42411618190a4'
  }
}
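A quick sanity check on the metainfo above: the number of pieces follows from the file length and piece length, and each piece contributes one 20-byte SHA-1 to the 'pieces' field:

```python
import math

# Piece arithmetic for the .torrent above: length and piece length are
# taken from the listing; each piece has a 20-byte SHA-1.
length = 678_301_696
piece_length = 262_144

num_pieces = math.ceil(length / piece_length)
pieces_field_bytes = num_pieces * 20
print(num_pieces, pieces_field_bytes)   # 2588 pieces, 51760 bytes
```

The last piece is shorter than 'piece length', which is why the division rounds up.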
Which piece to download?
• Order of downloading parts
– Strict?
– Rarest first?
– Random?
– Parallel?
• BitTorrent
– Random for the first one
– Rarest first for the rest
– Parallel for the last one
Drawbacks of BitTorrent
• Relies on the tracker
– The tracker is a central component
– It cannot scale to a large number of torrents
Scalable lookup
• Interface
– Provide an abstract interface to store and find data
• Typical DHT interface:
– put(key, value)
– get(key) -> value
– Loose guarantees about keeping data alive
• For BitTorrent trackers:
– Announce a tracker: put(SHA1(URL), my-ip-address)
– Find a tracker: get(SHA1(URL)) -> IP address of the tracker
• Some DHT-based trackers exist
– Many other usages of DHTs
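The interface can be sketched in a few lines. This is a single-process stand-in, not a distributed implementation, and the URL and address in the usage example are made up:

```python
import hashlib

class TinyDHT:
    """Single-process stand-in for the put/get interface on the slide;
    a real DHT partitions this table across many nodes."""
    def __init__(self):
        self.table = {}

    def put(self, key, value):
        self.table[key] = value

    def get(self, key):
        return self.table.get(key)     # loose guarantee: may return None

def sha1(text):
    return hashlib.sha1(text.encode()).hexdigest()

# The tracker use case from the slide; URL and IP are hypothetical.
dht = TinyDHT()
dht.put(sha1("http://example.org/some.torrent"), "10.0.0.7")
print(dht.get(sha1("http://example.org/some.torrent")))   # 10.0.0.7
```

Hashing the URL gives a uniformly distributed key, which is what lets a real DHT spread the table across nodes.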
P2P implementation of a DHT
• Overlay network
– Partition the hash table over n nodes
– Not every node knows about all other n nodes
– Route to find the right hash table entry
• Goals
– log(n) hops
– Guarantees about load balance
A DHT in Operation: put()
A DHT in Operation: get()
Chord Properties
• Efficient: O(log(N)) messages per lookup
– N is the total number of servers
• Scalable: O(log(N)) state per node
• Robust: survives massive failures
Chord IDs
• Key identifier = SHA-1(key)
• Node identifier = SHA-1(IP address)
• Both are uniformly distributed
• Both exist in the same ID space
• How to map key IDs to node IDs?
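Consistent hashing answers the question: a key belongs to the first node whose ID is at or after the key's ID on the ring. A sketch in a toy 16-bit ID space (Chord's real space is 160-bit SHA-1):

```python
import bisect

def successor(node_ids, key_id, space=2**16):
    """Consistent hashing: a key is stored on the first node whose ID is
    greater than or equal to the key ID, wrapping around the ring.
    A toy 16-bit ID space stands in for Chord's 160-bit SHA-1 space."""
    nodes = sorted(node_ids)
    i = bisect.bisect_left(nodes, key_id % space)
    return nodes[i % len(nodes)]   # past the largest node: wrap to smallest

print(successor([10, 200, 5000], 300))    # 5000
print(successor([10, 200, 5000], 60000))  # 10 (wraps around)
```

Because both node and key IDs are uniformly distributed, each node ends up responsible for roughly an equal slice of the ring.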
Consistent Hashing
Basic lookup
Simple lookup algorithm
Lookup(my-id, key-id)
  n = my successor
  if my-id < n < key-id
    call Lookup(key-id) on node n  // next hop
  else
    return my successor  // done
• Correctness depends only on successors
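The comparison in the pseudocode has to be read circularly once IDs wrap around the ring. A sketch with an explicit ring-interval test (toy 16-bit space; nodes are plain dicts with `id` and `succ` fields, an assumption for illustration):

```python
def between(a, b, x, space=2**16):
    """True if x lies in the ring interval (a, b], walking clockwise from
    a. This is the circular reading of the slide's 'my-id < n < key-id'."""
    a, b, x = a % space, b % space, x % space
    if a < b:
        return a < x <= b
    return x > a or x <= b            # the interval crosses zero

def lookup(node, key_id):
    """Successor-only lookup, equivalent to the slide's procedure: if the
    key falls between me and my successor, the successor is responsible;
    otherwise forward the lookup to the successor."""
    succ = node["succ"]
    if between(node["id"], succ["id"], key_id):
        return succ                   # done
    return lookup(succ, key_id)       # next hop
```

This version takes O(N) hops in the worst case; the finger table below is what brings it down to O(log(N)).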
“Finger Table” allows log(N) lookups
Finger i points to the successor of n+2^i
Lookup with fingers
Lookup(my-id, key-id)
  look in the local finger table for the highest node n s.t. my-id < n < key-id
  if n exists
    call Lookup(key-id) on node n  // next hop
  else
    return my successor  // done
Lookups take O(log(N)) hops
Join (1)-(4)
• (Figures: a new node N36 joins a ring containing N25 and N40, where N40 holds keys K30 and K38)
• 1. N36 runs Lookup(36) to find its place in the ring
• 4. Set N25’s successor pointer
• Key K30 now belongs to N36
• Update finger pointers in the background
• Correct successors produce correct lookups
Failures might cause incorrect lookups
• (Figure: a ring with nodes N10, N80, N85, N102, N113, N120)
• Lookup(90): N80 doesn’t know its correct successor, so the lookup is incorrect
Solution: successor lists
• Successor lists
– Each node knows its r immediate successors
– After a failure, it will know the first live successor
– Correct successors guarantee correct lookups
• The guarantee is with some probability
Solution: successor lists
• Successor list length
– Assume 1/2 of the nodes fail
– P(successor list all dead) = (1/2)^r
• I.e., P(this node breaks the Chord ring)
• Depends on independent failures
– P(no broken nodes) = (1 – (1/2)^r)^N
• r = 2*log2(N) makes the probability ≈ 1 – 1/N
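Plugging in a concrete N checks the arithmetic (N = 1024 is an assumed example size):

```python
import math

# The slide's claim: with r = 2*log2(N) successors and half of all nodes
# failing independently, the ring survives with probability ~ 1 - 1/N.
N = 1024
r = 2 * math.log2(N)                   # 20 successors per node
p_list_dead = 0.5 ** r                 # (1/2)^r = 1/N^2
p_ring_ok = (1 - p_list_dead) ** N     # no node's whole list is dead
print(p_ring_ok, 1 - 1 / N)            # both approximately 0.999
```

The approximation works because (1 - 1/N^2)^N ≈ 1 - N·(1/N^2) = 1 - 1/N for large N.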
Lookup with fault tolerance
Lookup(my-id, key-id)
  look in the local finger table and successor list for the highest node n s.t. my-id < n < key-id
  if n exists
    call Lookup(key-id) on node n  // next hop
    if the call failed, remove n from the finger table and return Lookup(my-id, key-id)
  else
    return my successor  // done
Other design issues
• Concurrent joins
• Locality
• Heterogeneous nodes
• Dishonest nodes
• ...
WAR STORIES
War stories: surprises in protocol design
• Fixed Timers Lead to Congestion Collapse in NFS
• Autonet Broadcast Storms
• Emergent Phase Synchronization of Periodic Protocols
• Wisconsin Time Server Meltdown
Autonet broadcast storms
• Designed by DEC to handle broadcast elegantly
– Physical layer: point-to-point coaxial cables
• Network as a tree
– A broadcast packet goes first to the root, then down
– Nodes accept only packets going downward
• No duplicate broadcasts
– Problem: every once in a while, the network collapsed with a storm of repeated broadcast packets
• Lesson: emergent properties often arise from the interaction of apparently unrelated system features operating at different system layers; in this case, link-layer reflections and network-layer broadcasts