CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident...
Transcript of CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident...
![Page 1: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/1.jpg)
1
CompSci 514: Computer NetworksLecture 4: End-to-end Congestion
Control
Xiaowei Yang
![Page 2: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/2.jpg)
2
Outline• TCP congestion collapse
• TCP congestion control algorithm
• Analysis
• Throughput and loss rate modeling
![Page 3: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/3.jpg)
3
A fundamental networking problem
• Packet switching– Statistical multiplexing
• Q: N users, and one network– How fast should each user send?– Or when should a packet be sent?– Why is it a difficult problem?
...
![Page 4: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/4.jpg)
• A à C, B à D• How fast should A and B send?• Don’t know the number of users• Don’t know the bottleneck capacity
4
A
B
C
D1Gbps
1Gbps
10Mbps
1Gbps
1Mbps
![Page 5: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/5.jpg)
The TCP Congestion Collapse Incident
• In October of 1986, the Internet had the first of what became a series of �congestion collapse�: increased load leads to decreased throughput
• The NSFnet phase-I backbone dropped three orders of magnitude from its capacity of 32 kbit/s to 40 bit/s, and continued until end nodes started implementing Van Jacobson's congestion control between 1987 and 1988.
5
![Page 6: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/6.jpg)
What caused the TCP congestion collapse?
• Original TCP (Cerf74)– Static window W– W = # of unacknowledged bytes
6
Send 1, 2, 3, …, W bytes
ACK 1, 2, 3, …, WSend W+1, W+2, … 2W bytes
ACK W+1, W+2, … 2W bytes
![Page 7: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/7.jpg)
Review of TCP’s sliding window algorithm
• A well-known algorithm in networking• Used for
– Reliable transmission– Flow control– Congestion control
7
![Page 8: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/8.jpg)
Definitions
• Round Trip Time (RTT)– The time it takes from sending a packet to
receiving an ACK for that packet– RTTmin = minimum RTT observed during a
connection
• Bandwidth Delay Product (BDP)– Bottleneck Capacity * RTTmin
8
![Page 9: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/9.jpg)
Drawbacks of static window sizes
• One TCP flow– MSS = 512 bytes, Window size = 10 MSS,
RTTmin = 100ms• 10 TCP flows, 100 TCP flows, …
9
![Page 10: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/10.jpg)
What caused the TCP congestion collapse?
• Original TCP (Cerf74)– Static window W– W = # of unacknowledged bytes
• When user increases, queueing delay increases• TCP times out, retransmitting the whole window• TCP retransmits too early, wasting the
network’s bandwidth to retransmit the same packets already in transit and reducing useful throughput (goodput)
10
![Page 11: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/11.jpg)
Retransmitting packets that are not lost à congestion collapse
11
![Page 12: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/12.jpg)
How can we prevent this problem?
12
![Page 13: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/13.jpg)
Design Goals• Congestion avoidance: making the
system operate around the knee to obtain low latency and high throughput
• Congestion control: making the system operate left to the cliff to avoid congestion collapse
• Congestion control: making the system operate left to the cliff to avoid congestion collapse
• Congestion avoidance: making the system operate around the knee to obtain low latency and high throughput
![Page 14: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/14.jpg)
Design goals
• More concretely– Efficiency
• High throughput/utilization– Low latency– Fairness– Fast convergence
14
![Page 15: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/15.jpg)
Key Improvements from the TCP88 paper
• Revised RTO computation– Prevent spurious retransmissions
• ACK self-clocking– Determines when to send a packet
• Dynamic window sizing– Determines how fast a sender can send a
packet
![Page 16: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/16.jpg)
Revised RTO estimates
• Old design: RTTn+1 = a RTT + (1- a ) RTTnRTO = β RTTn+1
• Improved designsrttn+1 = a RTT + (1- a ) srttnrttvarn+1 = b ( | RTT – srttn | ) + (1- b )rttvarn
RTOn+1 = srttn+1 + 4 rttvarn+1
16
![Page 17: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/17.jpg)
Exponential backoff• When a timeout occurs , the RTO value is
doubledRTOn+1 = max (2*RTOn , 64) seconds
• Can save the day in the worst case
17
![Page 18: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/18.jpg)
ACK self-clocking
• When pipe is full, the speed of ACK returns equals to the speed new packets should be injected into the network
![Page 19: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/19.jpg)
Why is self-clocking important
• Determines when a packet should be sent– A flow should send no faster than what its
bottleneck allows– ACK spacing correlates with bottleneck
capacity
27
![Page 20: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/20.jpg)
Does ACK clocking solve it all?• No
• A sender will not send faster, but may under-utilize bandwidth or cause packets being buffered at the bottleneck
28
![Page 21: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/21.jpg)
Dynamic window sizing• Sending speed: W / RTT
• Ideally, W / RTT = available bandwidth at the bottleneck
• à Adjusting W based on available bandwidth
1. Increases W when there is no congestion2. Decreases W when there is congestion
![Page 22: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/22.jpg)
How to detect congestion?
• Packet loss– Time out– Duplicate ACKs
• Latency– Current RTT – RTTmin
• Sending rate / ACKing rate
• ML? 31
![Page 23: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/23.jpg)
TCP Reno: Two Modes of Congestion Control
1. Probing for the available bandwidth– slow start (W < ssthresh)
2. Avoid overloading the network– congestion avoidance (W >= ssthresh)
![Page 24: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/24.jpg)
Slow Start• Initial value: Set W = 1 MSS
• Modern TCP implementation may set initial window size to a much larger value
• When receiving an ACK, W+= 1 MSS
• W doubles every RTT– Exponential increase
![Page 25: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/25.jpg)
Congestion Avoidance
• If W >= ssthresh then each time an ACK is received, increases W as follows:
• W += 1 / W
• W increases by one MSS per RTT
![Page 26: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/26.jpg)
Reaction to Congestion
• Reduce W
• Timeout: severe congestion– W is reset to one MSS:– ssthresh is set to half of the current size of the
congestion window:ssthressh = W / 2
– entering slow-start
![Page 27: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/27.jpg)
Reaction to Congestion• Duplicate ACKs: not so congested (why?)• Fast retransmit
– Three duplicate ACKs indicate a packet loss
– Retransmit without timeout
![Page 28: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/28.jpg)
Reaction to congestion: Fast
Recovery
• Avoiding slow start (changed from TCP88)
– ssthresh = W / 2
– W = W +3MSS
– Increase W by one MSS for each additional duplicate ACK
• When ACK arrives that acknowledges “new data,” set:
W = ssthresh = W/2
enter congestion avoidance, i.e., increase W by 1 MSS per RTT
![Page 29: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/29.jpg)
42
The Sawtooth behavior of TCP
• For every ACK received– W += 1/W
• For every packet lost– W /= 2
• Only true in unit of RTT
RTT
W
![Page 30: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/30.jpg)
43
Why does it work? [Chiu-Jain]
• A feedback control system• The network uses feedback y to adjust users�
load åx_i
![Page 31: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/31.jpg)
44
Goals of Congestion Avoidance
– Efficiency: the closeness of the total load on the resource of its knee: åx_i ~= capacity
– Fairness:
• When all x_is are equal, F(x) = 1• When all x_i�s are zero but x_j = 1, F(x) = 1/n
– Distributedness• A centralized scheme requires complete knowledge of the state of
the system
– Convergence• The system approach the goal state from any starting state
![Page 32: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/32.jpg)
45
Metrics to measure convergence
• Responsiveness• Smoothness
![Page 33: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/33.jpg)
46
Model the system as a linear control system
• Four sample types of controls• AIAD, AIMD, MIAD, MIMD
![Page 34: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/34.jpg)
47
Phase space
X1=W1/RTT
X2=W2/RTT
![Page 35: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/35.jpg)
48
TCP congestion control is AIMD
• Problems:– Each source has to probe for its bandwidth– Congestion occurs first before TCP backs off– Unfair: long RTT flows obtain smaller bandwidth
shares
RTT
W
![Page 36: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/36.jpg)
49
Macroscopic behavior of TCP
pRTTMSS••5.1
• Throughput is inversely proportional to RTT:
• In a steady state, total packets sent in one sawtooth cycle:– S = w + (w+1) + … (w+w) = 3/2 w2
• the maximum window size is determined by the loss rate– 1/S = p– w =
• The length of one cycle: w * RTT• Average throughput: 3/2 w * MSS / RTT
11.5p
![Page 37: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/37.jpg)
Why is congestion control still relevant
• The Mice & Elephant phenomenon of Internet traffic– Many small flows– But a few large flows sent most of the byte
50
![Page 38: CompSci514: Computer Networks Lecture 4: End-to-end ... · The TCP Congestion Collapse Incident •In October of 1986, the Internet had the first of what became a series of congestion](https://reader030.fdocuments.in/reader030/viewer/2022040911/5e859e488321fa63b720c2ae/html5/thumbnails/38.jpg)
52
Conclusion
• Congestion control is one of the fundamental issues in networking
• TCP congestion control algorithm• The AIMD algorithm• TCP macroscopic behavior model