An Integrated Congestion Management Architecture for Internet Hosts

An Integrated Congestion Management Architecture

for Internet Hosts

Hari BalakrishnanMIT Lab for Computer Science

http://inat.lcs.mit.edu/~hari

Two Questions

• Q: Why has the Internet not crumbled under its own load in recent years?

• A: Soundness of TCP congestion control + friendly TCP workload

• Q: Can we remain sanguine about the future outlook?

Internet

Shared infrastructure and overload causes congestion

Router

Problem #1: The Web!

r1r1

r-nr-n

r3r3

r2r2

ServerClient

• Multiple reliable streams• Individual objects are small• So what?

Far too inefficient!Far too aggressive!

Internet

Problem #2: Application Heterogeneity

u1u1

u-mu-m

u3u3

u2u2r1r1

r2r2

r3r3

r-nr-n

Server

• New applications (e.g., real-time streams)– The world isn’t only about HTTP or even TCP!

• So what?Applications do not adapt to congestionLong-term Internet stability is threatened

Internet

Client

Congestion Management Challenges

• Heterogeneous traffic mix– Variety of applications and transports

• Multiple concurrent streams• Separation from other tasks like

loss recovery• Stable control algorithms• Enable adaptive applications

“Solution” #1: Persistent Connections

r1r1

r-nr-n

r3r3

r2r2Put everyone on same

ordered byte stream

While this fixes some of the problems of independentconnections, it really is a step in the wrong direction!

1. Far too much coupling between objects2. Far too application-specific3. Does not enable application adaptation

ServerClient

“Solution” #2: Web Accelerators

• Is your Web experience too slow?• Chances are, it’s because of pesky TCP

congestion control and those annoying timeouts

• Web accelerators will greatly speed up your transfers…

• By just “adjusting” TCP’s congestion control!

• Who cares if the Internet is stable or not?

“Solution” #3: Integrated TCP Sessions

r1r1

r-nr-n

r3r3

r2r2

• Independent TCP connections, but shared control parameters [BPS+98, Touch98]• Shared congestion windows, round-trip estimates• But, this approach doesn’t accommodate non-TCP traffic

ServerClient

What is the World Heading Toward?

• The world won’t be just HTTP• The world won’t be just TCP

Logically different streams (objects) should be kept separate, yet efficient congestion

management must be performed.

u1u1

u-mu-m

u3u3

u2u2r1r1

r2r2

r3r3

r-nr-n

Server

Internet

Client

What We Really Need…

An integrated approach to end-to-end congestion management for the Internet using the CM

IP

HTTP Video1

TCP1 TCP2 UDP

Audio Video2

CongestionManager

Congestion Manager Properties

• Ensures proper and stable congestion behavior

• Enables application and transport adaptation to congestion via API

• Trusted intermediary between flows for bandwidth management

• Per-host & per-domain statistics (bandwidth, loss rates, RTT)

• Design motivated by end-to-end argument and Application Level Framing (ALF)

CM Features

• Algorithms and protocols• Adaptation API• Applications & performance

CM Algorithms

• Rate-based AIMD control• Loss-resilient feedback protocol• Exponential aging when feedback is

infrequent• Flow segregation to handle non-

best-effort networks• Scheduler to apportion bandwidth

between flows

CM’s Rate-based Control is TCP-friendly

• Throughput goes as 1/sqrt(p) for AIMD scheme

• nr2 = K { (nr/nd) + 1 }, where nr = # recd and nd = # dropped

is necessary and sufficient for verifying TCP “friendliness”

Receiver Feedback

• Implicit: hints from application• Explicit: CM probes• Probe every half RTT using loss-

resilient protocol– Low overhead

• Tracks loss rate, RTT estimate• But why do we need feedback?

Why Feedback?• Loss rate increases with feedback infrequency

0

0.1

0.2

0.3

0.4

0.5

0.6

0 1 2 3 4 5 6 7

x times per RTT

Loss

pro

bab

ility

Handling Infrequent Feedback

• Exponential aging: reduce rate by half, every silent RTT– Continues transmissions at safe rate without clamping apps

• Which RTT to use? Average? Minimum?

0

10

20

30

40

50

60

70

80

90

0 2 4 6 8 10 12 14 16

Time (s)

Seq

uen

ce n

um

ber

Better than Best-effort Networks

• Future networks will not treat all flows equally

• Per-host aggregation incorrect• Solution: flow segregation based

on loss rates• [Evaluation in-progress]

CM Scheduler

• Currently HRR scheduler for rate allocation

• Uses receiver hints to apportion bandwidth between flows

• Experiments show it working well• Exploring other scheduling algorithms

for delay management as well– Useful for real-time streams

The CM API

• Goal: To enable easy application adaptation

• Four guiding principles:– Put the application in control– Accommodate traffic heterogeneity– Accommodate application

heterogeneity– Learn from the application

Put the application in control

• Application decides what to transmit as late as possible

• CM does not buffer any data– cm_send() style API not useful

• Request/callback/notify API– cm_request(nsend);– app_notify(can_send);– cm_notify(nsent);

• Query function: cm_query(&rate, &srtt);

Accommodate traffic heterogeneity

• TCP bulk transfers• Short transactions• Real-time flows at continuum of

rates• Layered streams• [And other unanticipated apps]• cm_open(minrate);

Accommodate application

heterogeneity• API should not force particular application

style• Asynchronous transmitters (e.g., TCP)

– Triggered by events other than periodic clocks– Request/callback/notify works well

• Synchronous transmitters– Maintain internal timer for transmissions– Need rate change triggers from CM

change_rate(newrate);

Learn from the application

• cm_notify(nsent) upon transmission

• cm_update(nrecd, duration, loss_occurred, rtt)– Hint to CM (implicit feedback)– Called by TCP on ACK reception, RTP

applications on RTCP feedback, etc.

• cm_close() to clean up flow state

Application 1: Web/TCP

Web server uses change_rate() to pick convenient source encoding

CM Web Performance

0

50

100

150

200

250

0 2 4 6 8 10 12 14 16

0

50

100

150

200

250

300

350

400

450

0 2 4 6 8 10 12 14 16

With CM

CM greatly improvesperformance predictabilityand consistency

TCP Newreno

Time (s)

Seq

uen

ce n

um

ber

0

50

100

150

200

250

300

350

400

450

0 5 10 15 20 25

Application 2: Layered Streaming Audio

Time (s)

Seq

uen

ce n

um

ber

Competing TCP

TCP/CM

Audio/CM

Conclusions

• CM ensures proper and stable congestion behavior– CM tells flows their rates

• Simple, yet powerful API to enable application adaptation– Application is in control of what to send

• Improves performance consistency and predictability for individual applications and makes them better network citizens

An Integrated Congestion Management Architecture for Internet Hosts

Documents

Transcript of An Integrated Congestion Management Architecture for Internet Hosts