Sizing Router Buffers. Nick McKeown, Guido Appenzeller & Isaac Keslassy. SNRC Review, May 27th, 2004.
1
High Performance Switching and Routing
Telecom Center Workshop: Sept 4, 1997.
EE384Y: Packet Switch Architectures
Part II
Sizing Router Buffers
(Recent work by Guido Appenzeller)
Nick McKeown
Professor of Electrical Engineering and Computer Science, Stanford University
http://www.stanford.edu/~nickm
2
How much Buffer does a Router need?
Universally applied rule-of-thumb: a router needs a buffer of size
$B = 2T \times C$
• 2T is the round-trip propagation time (or just 250ms)
• C is the capacity of the outgoing link
Background
• Mandated in backbone and edge routers.
• Appears in RFPs and IETF architectural guidelines.
• Has major consequences for router design.
• Comes from dynamics of TCP congestion control.
• Villamizar and Song: “High Performance TCP in ANSNET”, CCR, 1994. Based on 2 to 16 TCP flows at speeds of up to 40 Mb/s.
3
Example
10Gb/s linecard or router
• Requires 300Mbytes of buffering.
• Read and write new packet every 32ns.
Memory technologies
• SRAM: requires 80 devices, 1kW, $2000.
• DRAM: requires 4 devices, but too slow.
Problem gets harder at 40Gb/s
• Hence RLDRAM, FCRAM, etc.
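The figures on this slide can be checked with a short calculation. The sketch below assumes the 250 ms round-trip time quoted with the rule-of-thumb and a minimum-size 40-byte packet; the packet size is an assumption, chosen because it matches the 32 ns figure, and the function names are illustrative only.

```python
def rule_of_thumb_buffer_bits(rtt_s: float, capacity_bps: float) -> float:
    """Rule-of-thumb buffer B = 2T * C, in bits."""
    return rtt_s * capacity_bps


def packet_time_ns(packet_bytes: int, capacity_bps: float) -> float:
    """Time one packet occupies the line at full rate, in nanoseconds."""
    return packet_bytes * 8 / capacity_bps * 1e9


if __name__ == "__main__":
    capacity = 10e9   # 10 Gb/s linecard
    rtt = 0.250       # 2T = 250 ms (rule-of-thumb default)
    bits = rule_of_thumb_buffer_bits(rtt, capacity)
    print(f"buffer: {bits / 8 / 1e6:.1f} Mbytes")
    # 40-byte packets are an assumption, not stated on the slide.
    print(f"packet time: {packet_time_ns(40, capacity):.0f} ns")
```

The result is roughly the 300 Mbytes quoted above, and every packet must be both written to and read from memory within one packet time.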
4
TCP
• TCP adapts to congestion
• Sender sends packets, receiver sends ACKs
• Sending rate is controlled by window W
• At any time, only W unacknowledged packets may be outstanding
• W is adjusted for each packet (in congestion avoidance mode):
  – If ACK received: W = W + 1/W (W = W + 1 for each W packets)
  – If packet is lost: W = W/2 (W halved in case of loss)
• The sending rate of TCP is: $R = \frac{W}{RTT}$
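The window rules above can be written as a minimal sketch. It models only congestion avoidance (no slow-start, timeouts or fast recovery), and the function names are hypothetical, not taken from any TCP implementation.

```python
def on_ack(window: float) -> float:
    """Congestion avoidance: each ACK grows the window by 1/W."""
    return window + 1.0 / window


def on_loss(window: float) -> float:
    """A lost packet halves the window."""
    return window / 2.0


def sending_rate(window: float, rtt_s: float) -> float:
    """Sending rate R = W / RTT, in packets per second."""
    return window / rtt_s


if __name__ == "__main__":
    w = 10.0
    for _ in range(10):          # one window's worth of ACKs
        w = on_ack(w)
    print(f"window after 10 ACKs: {w:.2f}")   # close to W + 1
    w = on_loss(w)
    print(f"window after a loss: {w:.2f}")
    print(f"rate at RTT = 100 ms: {sending_rate(w, 0.1):.1f} pkts/s")
```

Because the increment 1/W shrinks as W grows, a full window of ACKs adds slightly less than one packet, which is why the slide describes it as W = W + 1 for each W packets.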
5
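Single TCP Flow
Router with large enough buffers for full link utilization
[Figure: Source sends over a link of capacity C' > C into a router with buffer B, which forwards to Dest over a link of capacity C. Plots of window size, and of buffer size and RTT, over time t; the window is a sawtooth between $W_{max}/2$ and $W_{max}$.]
For every W ACKs received, send W+1 packets
$R = \frac{W}{RTT}$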
6
Over-buffered Link
7
Under-buffered Link
8
Buffer = Rule-of-thumb
Interval magnified on next slide
9
Microscopic TCP Behavior
When sender pauses, buffer drains
[Figure: labels "one RTT" and "Drop".]
10
Origin of rule-of-thumb
Before and after reducing window size, the sending rate of the TCP sender is the same:
$R_{old} = R_{new}$
Inserting the rate equation we get
$\frac{W_{old}}{RTT_{old}} = \frac{W_{new}}{RTT_{new}}$
The RTT is part round-trip propagation delay 2T and part queueing delay B/C. We know that after reducing the window, the queueing delay is zero, and that $W_{new} = W_{old}/2$:
$\frac{W_{old}}{2T + B/C} = \frac{W_{old}/2}{2T}$
$B = 2T \times C$
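The derivation can be checked numerically. The sketch below compares the sending rate just before a drop (buffer full) with the rate just after the window is halved (buffer empty). It assumes the window at the drop fills the pipe plus the buffer, W_old = 2T × C + B, with C in packets per second; that assumption and all names are illustrative.

```python
def rates_around_loss(two_t_s: float, capacity_pps: float, buffer_pkts: float):
    """Return (rate before the drop, rate after halving), in packets/s."""
    # Assumption: at the drop the window fills the pipe plus the buffer.
    w_old = two_t_s * capacity_pps + buffer_pkts
    rtt_old = two_t_s + buffer_pkts / capacity_pps   # full buffer adds B/C
    rtt_new = two_t_s                                # empty buffer, no queueing
    return w_old / rtt_old, (w_old / 2) / rtt_new


if __name__ == "__main__":
    two_t, capacity = 0.1, 1000.0          # 100 ms, 1000 packets/s
    for buffer_pkts in (two_t * capacity, 0.5 * two_t * capacity):
        before, after = rates_around_loss(two_t, capacity, buffer_pkts)
        print(f"B = {buffer_pkts:5.0f} pkts: "
              f"rate before = {before:.0f}, after = {after:.0f}")
```

With B = 2T × C the two rates match and the link stays busy; with half that buffer the rate after the loss falls below C, which is the under-buffered case.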
11
Rule-of-thumb
• Rule-of-thumb makes sense for one flow
• Typical backbone link has > 20,000 flows
• Does the rule-of-thumb still hold?
Answer:
• If flows are perfectly synchronized, then Yes.
• If flows are desynchronized, then No.
12
Buffer size is height of sawtooth
[Figure: buffer occupancy sawtooth over time t, ranging from 0 to B.]
13
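If flows are synchronized
• Aggregate window has same dynamics
• Therefore buffer occupancy has same dynamics
• Rule-of-thumb still holds.
[Figure: individual and aggregate window sawtooths over time t, each oscillating between half its maximum and its maximum ($W_{max}/2$ to $W_{max}$).]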
14
Two TCP Flows
Two TCP flows can synchronize
15
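If flows are not synchronized
• Aggregate window has less variation
• Therefore buffer occupancy has less variation
• The more flows, the smaller the variation
• Rule-of-thumb does not hold.
[Figure: individual window sawtooths between $W_{max}/2$ and $W_{max}$ over time t; the aggregate window varies only between Min(W) and Max(W).]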
16
If flows are not synchronized
[Figure: probability distribution of the aggregate window, with labels $W_{max}$, $W_{max}/2$, buffer size B and 0.]
17
Quantitative Model
• Model congestion window of a flow as a random variable: $W_i(t)$ is modelled as $P[W_i = x] = f_W(x)$, where $E[W_i] = \mu_W$ and $\mathrm{var}[W_i] = \sigma_W^2$
• For many de-synchronized flows:
  – We assume congestion windows are independent
  – All congestion windows have the same probability distribution
• Now the central limit theorem gives us the queue length distribution:
$W(t) = \sum_{i=1}^{n} W_i(t) \to n\mu_W + \sqrt{n}\,\sigma_W\,N(0,1)$
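The central-limit argument can be illustrated with a small simulation: the spread of the aggregate window, relative to its mean, shrinks roughly as 1/√n. The sketch below assumes each window is drawn independently and uniformly between W_max/2 and W_max; that distribution, the default W_max and the function name are illustrative assumptions, not part of the model on the slide.

```python
import random
import statistics


def relative_spread(n_flows: int, samples: int = 2000,
                    w_max: float = 32.0, seed: int = 1) -> float:
    """Standard deviation of the aggregate window divided by its mean."""
    rng = random.Random(seed)
    totals = [
        # Assumption: independent windows, uniform on [W_max/2, W_max].
        sum(rng.uniform(w_max / 2, w_max) for _ in range(n_flows))
        for _ in range(samples)
    ]
    return statistics.pstdev(totals) / statistics.mean(totals)


if __name__ == "__main__":
    for n in (1, 4, 16, 100):
        print(f"n = {n:3d}: relative spread = {relative_spread(n):.4f}")
```

Going from 1 flow to 100 flows should cut the relative spread by about a factor of ten, in line with the √n term in the expression above.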
18
Required buffer size
[Figure: required buffer size in simulation compared with $\frac{2T \times C}{\sqrt{n}}$.]
19
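Required buffer size
[Figure: buffer size required for 98.0%, 99.5% and 99.9% link utilization, compared with $\frac{2T \times C}{\sqrt{n}}$ and $2 \times \frac{2T \times C}{\sqrt{n}}$.]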
20
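Small buffers help short flows
Average flow completion times of 14 packet flows that share a congested bottleneck link with long-lived flows.
[Figure: average flow completion time with buffer size $\frac{2T \times C}{\sqrt{n}}$ compared with $2T \times C$.]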
21
Experiments with backbone router
GSR 12000, OC3 Line Card

TCP     Router Buffer                     Link Utilization
Flows   × 2T×C/√n    Pkts    RAM          Model     Sim       Exp
100     0.5 x        64      1Mb          96.9%     94.7%     94.9%
100     1 x          129     2Mb          99.9%     99.3%     98.1%
100     2 x          258     4Mb          100%      99.9%     99.8%
100     3 x          387     8Mb          100%      99.8%     99.7%
400     0.5 x        32      512kb        99.7%     99.2%     99.5%
400     1 x          64      1Mb          100%      99.8%     100%
400     2 x          128     2Mb          100%      100%      100%
400     3 x          192     4Mb          100%      100%      99.9%

Thanks: Experiments conducted by Paul Barford and Joel Sommers, U of Wisconsin
22
What about Short Flows?
• So far we assumed long flows in congestion avoidance mode.
• What if traffic is mainly short flows in slow-start?
Answer: Behavior is different, but
• In mixes of flows, long flows drive buffer requirements
• Required buffer for short flows is independent of line speed and RTT (same for 1Mbit/s or 40 Gbit/s)
23
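A single, short-lived TCP flow
Flow length 62 packets, RTT ~140 ms
[Figure: packet timeline from syn to fin ack received, with bursts of 2, 4, 8, 16 and 32 packets separated by one RTT; the total span is the Flow Completion Time (FCT).]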
24
Modelling TCP
Flows vs. independent bursts
Inter-burst arrival time is greater than buffer size. Therefore, we assume bursts are independent.
• Poisson arrivals of flows: arrivals of length $L_{flow}$ (the flow length in packets)
  $S_i = L_{flow}$, $\lambda_{flow} = \frac{\rho C}{L_{flow}}$
• Poisson arrivals of bursts: four different Poisson arrival processes of lengths 2, 4, ...
  $S_i \in \{2, 4, 8, 16\}$, $\lambda_{burst} = \frac{\rho C}{E[S_i]}$
25
The M/G/1 Model
TCP traffic is modelled as an M/G/1 arrival process:
• Poisson arrivals of jobs $S_i \in \{2, 4, 8, 16, \ldots\}$ with an arrival rate of $\lambda_{burst}$
• $\rho = \lambda_{burst} E[S_i]$ is the load
Average queue length in jobs is:
$E[N_Q] = \frac{\lambda^2 E[S^2]}{2(1-\rho)} = \frac{\rho^2 E[S^2]}{2(1-\rho)\,E[S]^2}$
This gives us an average queue length in packets of
$E[Q] = E[N_Q]\,E[S] = \frac{\rho^2 E[S^2]}{2(1-\rho)\,E[S]}$
Let's see if this works in practice...
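The queue-length expression is easy to evaluate for a given burst-size distribution. The sketch below computes E[Q] = ρ² E[S²] / (2 (1 − ρ) E[S]) with burst sizes in packets. The example distribution, with bursts of 2, 4, 8 and 16 packets equally likely (as for a 30-packet flow in slow-start), is an assumption for illustration, as are the function and parameter names.

```python
import math


def mg1_mean_queue_packets(load: float, sizes: list[float],
                           probs: list[float]) -> float:
    """Average M/G/1 queue length in packets for the given burst sizes."""
    if not 0.0 <= load < 1.0:
        raise ValueError("load must satisfy 0 <= load < 1")
    if len(sizes) != len(probs) or not math.isclose(sum(probs), 1.0):
        raise ValueError("probs must match sizes and sum to 1")
    e_s = sum(p * s for s, p in zip(sizes, probs))        # E[S]
    e_s2 = sum(p * s * s for s, p in zip(sizes, probs))   # E[S^2]
    return load ** 2 * e_s2 / (2.0 * (1.0 - load) * e_s)


if __name__ == "__main__":
    # Assumed distribution: each burst size equally likely.
    sizes = [2, 4, 8, 16]
    probs = [0.25, 0.25, 0.25, 0.25]
    for load in (0.5, 0.8, 0.9):
        q = mg1_mean_queue_packets(load, sizes, probs)
        print(f"load {load:.1f}: average queue = {q:.1f} packets")
```

Neither the line rate nor the RTT appears in the expression; only the load and the burst-size distribution do, which matches the earlier claim that the buffer needed for short flows is independent of line speed and RTT.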
26
Average Queue length
capacity: $C$ = 40 Mbit/s; load: $\rho$ = 0.8
for length 50 packets: $L_{flow}$ = 400Mbit; Average 100 flows/second; Completion time 400 ms
27
Queue Distribution
• To determine the required buffer, we need the queue distribution.
• Or at least the tail end of the queue distribution
[Figure: queue distribution P(Q = x) against Q; the tail beyond the buffer size B is the packet loss probability $P_{Drop}$.]
● For M/G/1 queues there is no general solution for the queue distribution.
● We did two things (details are in the paper):
– Use M/G/1 processor sharing model (bad)
– Use Frank Kelly's effective bandwidth (good)
28
In Summary
Buffer size is dictated by long TCP flows.
10Gb/s linecard with 200,000 x 56kb/s flows
• Rule-of-thumb: Buffer = 2.5Gbits
  – Requires external, slow DRAM
• Becomes: Buffer = 6Mbits
  – Can use on-chip, fast SRAM
  – Completion time halved for short-flows
40Gb/s linecard with 40,000 x 1Mb/s flows
• Rule-of-thumb: Buffer = 10Gbits
• Becomes: Buffer = 50Mbits
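The "Becomes" figures are consistent with dividing the rule-of-thumb buffer by the square root of the number of flows n. The sketch below reproduces both examples, assuming the 250 ms round-trip time used for the rule-of-thumb; the function names are illustrative.

```python
import math


def rule_of_thumb_bits(rtt_s: float, capacity_bps: float) -> float:
    """Rule-of-thumb buffer: 2T * C, in bits."""
    return rtt_s * capacity_bps


def small_buffer_bits(rtt_s: float, capacity_bps: float, n_flows: int) -> float:
    """Reduced buffer: 2T * C / sqrt(n), in bits."""
    return rule_of_thumb_bits(rtt_s, capacity_bps) / math.sqrt(n_flows)


if __name__ == "__main__":
    rtt = 0.250  # assumed 2T = 250 ms
    for capacity, flows in ((10e9, 200_000), (40e9, 40_000)):
        big = rule_of_thumb_bits(rtt, capacity)
        small = small_buffer_bits(rtt, capacity, flows)
        print(f"{capacity / 1e9:.0f} Gb/s, {flows} flows: "
              f"{big / 1e9:.1f} Gbits -> {small / 1e6:.1f} Mbits")
```

The first case comes out at about 5.6 Mbits, which the slide rounds to 6 Mbits; the second is 50 Mbits.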