
Reducing the Buffer Size in Backbone Routers

Yashar Ganjali

High Performance Networking Group

Stanford University

February 23, 2005

yganjali@stanford.edu

http://www.stanford.edu/~yganjali


Motivation

Problem:
– Internet traffic doubles every year
– Growing disparity between traffic and router growth (space, power, cost)

Possible solution:
– All-optical networking

Consequences:
– Large capacity → large traffic
– Very small buffers


Outline of the Talk

– Buffer sizes in today’s Internet
– From huge to small (Guido’s results)
  – 2–3 orders of magnitude reduction
– From small to tiny
  – Constant buffer sizes?


Backbone Router Buffers

Universally applied rule-of-thumb:
– A router needs a buffer of size B = 2T×C
  – 2T is the two-way propagation delay
  – C is the capacity of the bottleneck link
– Known to the inventors of TCP
– Mandated in backbone routers
– Appears in RFPs and IETF architectural guidelines

[Figure: source → router → destination; bottleneck link of capacity C, two-way propagation delay 2T.]
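As a worked example, a minimal Python sketch of the rule of thumb (the 250 ms RTT is an assumption, chosen to match the 2.5 Gbit linecard figure later in the talk):

```python
# Rule-of-thumb buffer B = 2T x C, where 2T is the two-way
# propagation delay (the RTT) and C the bottleneck capacity.
def rule_of_thumb_bits(rtt_seconds: float, capacity_bps: float) -> float:
    return rtt_seconds * capacity_bps

# 10 Gb/s bottleneck, 250 ms RTT (assumed for illustration):
print(rule_of_thumb_bits(0.25, 10e9) / 1e9, "Gbit")  # -> 2.5 Gbit
```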


Review: TCP Congestion Control

Only W packets may be outstanding

Rule for adjusting W:
– If an ACK is received: W ← W + 1/W
– If a packet is lost: W ← W/2

[Figure: TCP sawtooth between source and destination; the window size oscillates between W_max/2 and W_max over time t.]
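A minimal Python sketch of this additive-increase/multiplicative-decrease rule (the loss pattern below is assumed purely for illustration):

```python
# AIMD window update from the slide:
#   ACK received:  W <- W + 1/W   (window grows ~1 packet per RTT)
#   packet lost:   W <- W / 2
def on_ack(w: float) -> float:
    return w + 1.0 / w

def on_loss(w: float) -> float:
    return w / 2.0

w = 1.0
for i in range(1, 1001):
    # Assumption for the sketch: one loss every 200 ACKs.
    w = on_loss(w) if i % 200 == 0 else on_ack(w)
print(f"window after 1000 updates: {w:.1f}")
```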


Multiplexing Effect in the Core

[Figure: probability distribution over buffer occupancy, from 0 to B, for aggregate window W.]


Backbone Router Buffers

It turns out that:
– The rule of thumb is wrong for core routers today.
– The required buffer is 2T×C/√n instead of 2T×C, where n is the number of long-lived flows.


Required Buffer Size

[Figure: simulation results compared against the required buffer size 2T×C/√n.]


Impact on Router Design

10 Gb/s linecard with 200,000 × 56 kb/s flows
– Rule-of-thumb: buffer = 2.5 Gbits
  – Requires external, slow DRAM
– Becomes: buffer = 6 Mbits
  – Can use on-chip, fast SRAM
  – Completion time halved for short flows

40 Gb/s linecard with 40,000 × 1 Mb/s flows
– Rule-of-thumb: buffer = 10 Gbits
– Becomes: buffer = 50 Mbits
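These numbers follow directly from the 2T×C/√n formula; a quick sanity check in Python (assuming the 250 ms RTT implied by the rule-of-thumb figures):

```python
from math import sqrt

def small_buffer_bits(rtt_s: float, capacity_bps: float, n_flows: int) -> float:
    """Required buffer 2T*C/sqrt(n) for n long-lived flows."""
    return rtt_s * capacity_bps / sqrt(n_flows)

print(small_buffer_bits(0.25, 10e9, 200_000) / 1e6)  # ~5.6, i.e. "6 Mbits"
print(small_buffer_bits(0.25, 40e9, 40_000) / 1e6)   # 50.0, i.e. "50 Mbits"
```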


How small can buffers be?

Imagine you want to build an all-optical router for a backbone network…

…and you can buffer only a few dozen packets in delay lines.

Conventional wisdom: It’s a routing problem (hence deflection routing, burst-switching, etc.)

Our belief: First, think about congestion control.


TCP with ALMOST No Buffers

Utilization of bottleneck link = 75%


Problem Solved?

– 75% utilization with only one unit of buffering
– More flows → less buffer
– Therefore, one unit of buffering is enough


TCP Throughput with Small Buffers

[Figure: TCP throughput vs. number of flows (x-axis 0 to 1000; y-axis throughput 0 to 0.9).]


TCP Reno Performance

Buffer Size = 10; Load = 80%

[Figure: throughput vs. bottleneck capacity (x-axis 0 to 6000 Mb/s; y-axis throughput 0 to 1).]


Two Concurrent TCP Flows


Simplified Model

– Flow 1 sends W packets during each RTT
– Bottleneck capacity = C packets per RTT
– Example: C = 15, W = 5
– Flow 2 sends two consecutive packets during each RTT
– Drop probability increases with W

[Figure: packet arrivals on the bottleneck during one RTT.]


Simplified Model (Cont’d)

– W(t+1) = p(t)·[W(t)/2] + [1 − p(t)]·[W(t) + 1]
– But p grows linearly with W
– E[W] = O(√C)
– Link utilization = W/C
– As C increases, link utilization goes to zero.

Snow model!!!
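A minimal simulation of the recurrence above (with the assumed drop probability p = W/C, capped at 1) makes the trend concrete: utilization falls roughly like 1/√C.

```python
import random

def avg_utilization(C: int, steps: int = 100_000) -> float:
    """Iterate W(t+1): halve w.p. p(t) = W/C (assumed), else add 1."""
    w, total = 1.0, 0.0
    for _ in range(steps):
        p = min(w / C, 1.0)
        w = w / 2 if random.random() < p else w + 1
        total += w
    return (total / steps) / C  # E[W] / C

for C in (100, 1_000, 10_000):
    print(C, round(avg_utilization(C), 3))  # utilization shrinks as C grows
```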


Q&A

Q. What happens if flow 2 never sends any consecutive packets?

A. No packet drops unless utilization = 100%.

Q. How much space do we need between the two packets?

A. At least the size of a packet.

Q. What if we have more than two flows?


Per-flow Queueing

Let us assume that:
– We have a queue for each flow; and
– We serve those queues in a round-robin manner.

Does this solve the problem?
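A minimal sketch of such a per-flow round-robin scheduler (class and method names are illustrative, not from the talk):

```python
from collections import deque

class RoundRobin:
    """One FIFO queue per flow, served one packet per turn in rotation."""
    def __init__(self):
        self.queues = {}       # flow id -> deque of packets
        self.active = deque()  # rotation order of flows with queued packets

    def enqueue(self, flow, pkt):
        q = self.queues.setdefault(flow, deque())
        if not q:
            self.active.append(flow)  # flow becomes eligible for service
        q.append(pkt)

    def dequeue(self):
        if not self.active:
            return None
        flow = self.active.popleft()
        pkt = self.queues[flow].popleft()
        if self.queues[flow]:
            self.active.append(flow)  # keep the flow in rotation
        return pkt
```

A flow that went idle rejoins the rotation as soon as a packet arrives, which is exactly the situation the next slide examines.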


Per-flow Buffering


Per-Flow Buffering

Flow 3 does not have a packet at time t; flows 1 and 2 do.

At time t+RTT we will see a drop.

[Figure: per-flow queues at time t, when flow 3 is temporarily idle, and at time t + RTT, when the drop occurs.]


Ideal Solution

– If packets are spaced out perfectly; and
– The starting times of flows are chosen randomly;
– Then we only need a small buffer for contention resolution.


Randomization

Mimic an M/M/1 queue

Under low load, the queue size is small with high probability.

Loss can be bounded

For an M/M/1 queue with load ρ < 1:
– P(X ≥ k) = ρ^k
– E[X] = ρ / (1 − ρ)

[Figure: M/M/1 queue occupancy X with buffer B; the tail probability P(Q > b) bounds the packet loss.]
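A quick numerical check of the geometric tail (a sketch assuming load ρ = 0.5 and the standard M/M/1 result P(Q > b) = ρ^(b+1)):

```python
import random

def mm1_tail(rho: float, b: int, events: int = 200_000) -> float:
    """Fraction of event epochs at which the M/M/1 queue length exceeds b."""
    q, exceed = 0, 0
    for _ in range(events):
        # Uniformized chain: arrival w.p. lambda/(lambda+mu) = rho/(1+rho),
        # otherwise a departure (a no-op when the queue is empty).
        if random.random() < rho / (1 + rho):
            q += 1
        elif q > 0:
            q -= 1
        exceed += q > b
    return exceed / events

rho, b = 0.5, 5
print(mm1_tail(rho, b), "vs geometric tail", rho ** (b + 1))
```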


TCP Pacing

Current TCP:
– Send packets when an ACK is received.

Paced TCP:
– Send one packet every RTT/W time units (i.e., at rate W/RTT).
– Update W and RTT as in standard TCP.
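A minimal sketch of the pacing schedule (function name illustrative): instead of sending on each ACK, the window's packets are spread evenly across the RTT.

```python
def paced_send_times(cwnd: int, rtt: float, start: float = 0.0) -> list[float]:
    """Spread cwnd packets over one RTT: one packet every rtt/cwnd seconds."""
    gap = rtt / cwnd
    return [start + i * gap for i in range(cwnd)]

# Window of 10 packets, 100 ms RTT -> one packet every 10 ms:
print(paced_send_times(10, 0.100))
```

In a real stack the timer would be rescheduled as W and the RTT estimate change; this only computes the static schedule for one window.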


CWND: Reno vs. Paced TCP


TCP Reno: Throughput vs. Buffer Size


Paced TCP: Throughput vs. Buffer Size


Early Results

– Congested core router with 10 packet buffers.
– Average offered load = 80%
– RTT = 100 ms; each flow limited to 2.5 Mb/s

[Figure: sources with >10 Gb/s links feeding a 10 Gb/s congested core router connected to a server.]


What We Know

– Arbitrary injection process: any rate > 0 needs unbounded buffers (theory).
– Poisson process with load < 1: theory needs a buffer of approx. O(log D + log W), i.e. 20–30 packets, where D = number of hops and W = window size [Goel 2004]; in experiments, TCP pacing gives results as good as or better than for Poisson.
– Complete centralized control: constant-fraction throughput with constant buffers (theory) [Leighton].


Limited Congestion Window

[Figure: congestion window traces for Reno vs. paced TCP, shown with a limited window and with an unlimited window.]


Slow Access Links

[Figure: sources with 5 Mb/s access links feeding a 10 Gb/s congested core router connected to a server.]

– Congested core router with 10 packet buffers.
– RTT = 100 ms; each flow limited to 2.5 Mb/s


Conclusion

We can reduce 1,000,000 packet buffers to 10,000 today.

We can “probably” reduce to 10–20 packet buffers:
– With many small flows, no change needed.
– With some large flows, we need pacing in the access routers or at the edge devices.

Need more work!


Extra Slides


Pathological Example

Flow 1: S1 → D; load = 50%
Flow 2: S2 → D; load = 50%

If S1 sends a packet at time t, S2 cannot send any packets at times t and t+1.

To achieve 100% throughput we need at least one unit of buffering.

[Figure: sources S1 and S2 sharing a path through links 1–4 to destination D.]