Microscopic Behavior of Internet Control

Xiaoliang (David) Wei
NetLab, CS&EE
California Institute of Technology

Internet Control

Problem -> solution -> understanding ->

[Timeline figure: 1986, 1989, 1995, 1999, 2003]

1986: First Internet Congestion Collapse

1988~1990: TCP-Tahoe, DEC-bit

1993~1995: Tri-S, DUAL, TCP-Vegas

Outline

Motivation
Overview of microscopic behavior
Stability of delay-based congestion control algorithms
Fairness of loss-based congestion control algorithms
Future work
Summary


Macroscopic View of TCP Control

TCP/AQM: A feedback control system

[Figure: TCP Sender 1 and TCP Sender 2 send at rates x_i(t) through a bottleneck link C with queue q(t) to TCP Receiver 1 and TCP Receiver 2; forward delay τ_F, backward delay τ_B]

TCP: Reno, Vegas, FAST

AQM: DropTail / RED, Delay, ECN

$$\dot{x}_i(t) = F\big(x_i(t),\, q(t - \tau_B)\big)$$

$$\dot{q}(t) = G\Big(q(t),\, \sum_i x_i(t - \tau_F),\, c\Big)$$

Fluid Models

Assumptions:
  TCP algorithms directly control the transmission rates;
  The transmission rates are differentiable (smooth);
  Each TCP packet observes the same congestion price (loss, delay or ECN)

$$\dot{x}_i(t) = F\big(x_i(t),\, q(t - \tau_B)\big)$$

$$\dot{q}(t) = G\Big(q(t),\, \sum_i x_i(t - \tau_F),\, c\Big)$$

Methodology based on Fluid Models

Equilibrium:
  Efficiency?
  Fairness?

Dynamics:
  Stability?
  Responsiveness?

$$\dot{x}_i(t) = F\big(x_i(t),\, q(t - \tau_B)\big)$$

$$\dot{q}(t) = G\Big(q(t),\, \sum_i x_i(t - \tau_F),\, c\Big)$$

Gap 1: Stability of TCP Vegas

Analysis: “TCP Vegas is stable if (and only if) the number of flows is large, and capacity is small, and delay is small.”

Experiment: a single TCP Vegas flow is stable with arbitrary delay and capacity.

Gap 2: Fairness of Scalable TCP

Analysis: “Scalable TCP is fair in homogeneous network” [Kelly’03]

Experiment: in most cases, Scalable TCP is unfair in homogeneous network.

Analysis: [Chiu&Jain’90] → Scalable TCP is unfair.

Gap 3: TCP vs TFRC

Analysis: “We designed TCP Friendly Rate Control (TFRC) algorithm to have the same equilibrium as TCP when they co-exist.”

Experiment: TCP flows do not fairly coexist with TFRC flows.

Gaps

Stability: TCP-Vegas

Fairness: Scalable TCP

Friendliness: TCP vs TFRC

Current analytical models ignore microscopic behavior in TCP congestion control


Microscopic View (Packet level)

Two timescales:
  On each RTT: TCP congestion control algorithm;
  On each packet arrival: ack-clocking:

    p--;
    while (p < w(t)) do
        send a packet;
        p++;

    (p: number of packets in flight)
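The ack-clocking loop above can be sketched in Python as follows. This is a minimal illustration, not the speaker's implementation: the class name `AckClockedSender`, the `send` callback, and the `start` method (which performs the initial window fill, as in the "W: 0 -> 5" animation) are assumptions introduced here.

```python
class AckClockedSender:
    """Minimal sketch of ack-clocking; all names are illustrative."""

    def __init__(self, window, send):
        self.window = window      # callable t -> w(t), the congestion window
        self.send = send          # hypothetical callback that emits one packet
        self.in_flight = 0        # p in the slide: packets in flight

    def start(self, t):
        # Assumed bootstrap: fill the window once before any ack exists.
        self._fill(t)

    def on_ack(self, t):
        self.in_flight -= 1       # p--
        self._fill(t)

    def _fill(self, t):
        # while (p < w(t)): send a packet; p++
        while self.in_flight < self.window(t):
            self.send(t)
            self.in_flight += 1
```

With a constant window of 5, `start` emits a micro burst of 5 packets at once, and each later ack releases exactly one new packet, which is the self-clocking behavior the following slides animate.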

[Animation: one sender and one receiver across a bottleneck C, with a plot of sending rate x(t) against t (time) relative to capacity c. Frame captions:]

W: 0 -> 5

Packets queued in bottleneck

Packets leave bottleneck at rate c

Acknowledgments return at rate c

New packets sent at rate c

No queue in 2nd round trip

No need to control rate x(t)!

Two Flows

[Animation: TCP1/Rcv1 and TCP2/Rcv2 share a bottleneck C; the packets and acknowledgments of each flow travel in bursts, with a plot of x(t) against t (time) over two RTTs relative to capacity c]

On-off pattern for each flow

Sub-RTT Burstiness: NS-2 Measurement

Two levels of Burstiness

Micro Burst
  Pulse function
  Input rate >> c
  Extra queue & loss
  Transient

Sub-RTT burstiness
  On-off function
  Input rate <= c
  No extra queue & loss
  Persistent

[Figure: x(t) against t (time) over two RTTs, relative to capacity c]

Microscopic Effects: known

Micro Burst
  Loss-based TCP: Low throughput with small buffer; pacing improves throughput (Clearly understood)
  Delay-based TCP: Noise to delay signal, should be eliminated (Partially…)

Sub-RTT Burstiness
  Observed in Internet Traffic (“Why do we care?”)

Microscopic Effects: new

Micro Burst
  Loss-based TCP: Low throughput with small buffer; pacing improves throughput (Clearly understood)
  Delay-based TCP: Fast convergence in queuing delay and better stability

Sub-RTT Burstiness
  Loss-based TCP: Low loss synchronization rate with DropTail routers
  Delay-based TCP: No effect

New Understandings

Micro Burst with Delay-based TCP: fast queue convergence
  1. A single TCP-Vegas flow is always stable, regardless of delay and capacity.

Sub-RTT Burstiness and Loss-based TCP: low loss sync rate
  1. Scalable TCP is (usually) unfair;
  2. TCP is unfriendly to TFRC;


A packet level model: basis

Packets can only be sent upon arrival of an acknowledgment;
A micro burst of packets can be sent at a moment;
Window size w(t) can be an arbitrary given process.

Ack-clocking: on each ack arrival

    p--;
    while (p < w(t)) do
        send a packet;
        p++;

    (p: number of packets in flight)

A packet level model: variables

p_j : number of packets in flight when packet j is sent;
s_j : sending time of packet j
b_j : backlog experienced by packet j
a_j : ack arrival time of packet j

[Figure: sender and receiver across bottleneck C with packets and acknowledgments in flight, illustrating p_j, s_j, b_j and a_j in turn]

A packet level model: variables

k : number of acks between s_{j-1} and s_j;
p_j : number of packets in flight when packet j is sent
s_j : sending time of packet j
a_{j-p_j} : ack arrival time of the packet one RTT ago

$$p_j = \max_{0 \le k \le p_{j-1}} \left\{\, p_{j-1} + 1 - k \;:\; p_{j-1} + 1 - k \le w\!\left(a_{j-1-p_{j-1}+k}\right) \right\}$$

$$s_j = a_{j-p_j}$$

A packet level model: variables

k : number of acks between s_{j-1} and s_j. For example, k = 0:

$$p_j = \max_{0 \le k \le p_{j-1}} \left\{\, p_{j-1} + 1 - k \;:\; p_{j-1} + 1 - k \le w\!\left(a_{j-1-p_{j-1}+k}\right) \right\} = p_{j-1} + 1$$

$$s_j = a_{j-p_j} = a_{j-1-p_{j-1}} = s_{j-1}$$

A packet level model: variables

b_j : experienced backlog
c : bottleneck capacity
a_j : ack arrival time
d : propagation delay

$$b_j = \max\left\{\, b_{j-1} + 1 - c\,(s_j - s_{j-1}),\; 0 \right\}$$

$$a_j = s_j + d + \frac{b_j}{c}$$

[Figure: packets j-1 and j queued at bottleneck C (labels p1, p2, p3), with c(s_j - s_{j-1}) marked between them]

A packet level model

$$p_j = \max_{0 \le k \le p_{j-1}} \left\{\, p_{j-1} + 1 - k \;:\; p_{j-1} + 1 - k \le w\!\left(a_{j-1-p_{j-1}+k}\right) \right\}$$

$$s_j = a_{j-p_j}$$

$$b_j = \max\left\{\, b_{j-1} + 1 - c\,(s_j - s_{j-1}),\; 0 \right\}$$

$$a_j = s_j + d + \frac{b_j}{c}$$

p_j : number of packets in flight when packet j is sent;
s_j : sending time of packet j
b_j : backlog experienced by packet j
a_j : ack arrival time of packet j
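The four variables can be iterated packet by packet. The sketch below is one reading of the slide's recursions, not a definitive implementation: packet j is sent after the smallest number k of acks for which p_{j-1} + 1 - k fits in the window, its sending time is the arrival time of the ack that released it, its backlog is the previous backlog plus itself minus what drained at rate c, and its ack returns after the propagation delay d plus the queueing time b_j / c. The function name `simulate`, the virtual packet 0 (whose "ack" at time 0 starts the flow), and the requirement that the window stays at least 1 are assumptions made for illustration.

```python
def simulate(window, c, d, n_packets):
    """Iterate p_j, s_j, b_j, a_j for packets 1..n_packets.

    window: callable t -> w(t); c: capacity (pkt/s); d: propagation delay (s).
    Index 0 is an assumed virtual packet acked at time 0.
    """
    p, s, b, a = [0], [0.0], [0.0], [0.0]
    for j in range(1, n_packets + 1):
        # Smallest k (acks waited for) such that the window admits packet j.
        for k in range(p[j - 1] + 1):
            in_flight = p[j - 1] + 1 - k
            if in_flight <= window(a[j - 1 - p[j - 1] + k]):
                break
        else:
            raise ValueError("window must stay >= 1")
        p.append(in_flight)
        s.append(a[j - in_flight])                    # released by that ack
        b.append(max(b[j - 1] + 1 - c * (s[j] - s[j - 1]), 0.0))
        a.append(s[j] + d + b[j] / c)
    return p, s, b, a
```

For example, `simulate(lambda t: 15, c=10.0, d=1.0, n_packets=200)` sends the first 15 packets as one micro burst at time 0, after which each returning ack releases one packet and the backlog settles at the window minus the bandwidth-delay product.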

Ack-clocking: quick sending process

Theorem: For any time at which a packet j is sent (s_j), there is always a packet j* := j*(j) such that

$$s_j = s_{j^*}, \qquad p_{j^*} = w(s_j)$$

The number of packets in flight at any packet sending time is synchronized with the congestion window.

[Figure: w(t) and p(t) against time (t), with a sending time s marked]

Ack-clocking: fast queue convergence

Theorem: If

$$p_k \ge cd \quad \text{for } k : j - p_j \le k \le j$$

Then:

$$b_j = p_j - cd$$

The queue converges instantly if window size is larger than BDP in the entire previous RTT.

[Figure: w(t) and q(t) against time (t), with a sending time s marked]

Window Control and Ack-clocking

Per RTT Window Control: makes decision once every RTT with the measurement from the latest acknowledgement (a subsequence of sequence numbers k_1, k_2, k_3, …)

[Figure: w(t) and p(t) against time (t), with sending times s_{k_1}, s_{k_2}, s_{k_3} and ack arrival times a_{k_1}, a_{k_2} marked]

Stability of TCP Vegas

Theorem: Given the packet level model, if αd > 1, a single TCP Vegas flow converges to equilibrium with arbitrary capacity c and propagation delay d. That is, there exists a sequence number J such that

$$\forall j \ge J: \quad cd + \alpha d - 1 \le w(s_j) \le cd + \alpha d + 1, \qquad \alpha d - 1 \le b_j \le \alpha d + 1$$

Stability of Vegas: 100-flow simulation

Stability of Vegas: Avg Window Size

Window Oscillation: 1 packet

Stability of Vegas: Queue Size

Queue Oscillation: 100 packets (because 100 flows synchronized)

Gap 1: Stability of TCP Vegas

Analysis: “TCP Vegas is stable if (and only if) the number of flows is large, and capacity is small, and delay is small.”

Experiment: a single TCP Vegas flow is stable with arbitrary delay and capacity.

Reason: micro burst leads to fast queue convergence

FAST: stable and responsive

Designed based on the intuition that queue is directly a function of congestion window size.

A FAST flow does the following every other RTT:

[Equation: FAST window update in terms of w(t), p_j, b_j, c and d; not recoverable from the extracted slide text]

FAST: stability

Theorem: Given the packet level model, homogeneous FAST flows converge to equilibrium regardless of capacity c, propagation delay d and number of flows N.

[Tang, Jacobsson, Andrew, Low’07]: FAST is stable with single bottleneck link regardless of capacity c, propagation delay d and number of flows N. (With an extended fluid model capturing microburst effects)

Micro-burst: Summary

[Figure: x(t) against t (time) over two RTTs, relative to capacity c]

Effects:
  Fast queue convergence
    Stability of homogeneous Vegas for arbitrary delay
  Possibility of very responsive & stable TCP control
    Stability of FAST for arbitrary delay


New Understandings

Micro Burst with Delay-based TCP: fast queue convergence
  1. A single (homogeneous) TCP-Vegas flow is always stable, regardless of delay and capacity.

Sub-RTT Burstiness and Loss-based TCP: low loss sync rate
  1. Scalable TCP is (usually) unfair;
  2. TCP is unfriendly to TFRC;

Loss Synchronization Rate: Definition

Loss Synchronization Rate [Baccelli,Hong’02]:
  The probability that a flow observes a packet loss during a congestion event.

Congestion event (loss event):
  A round-trip time interval in which at least one packet is dropped by the bottleneck router due to congestion (buffer overflow at router)
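Read as an empirical quantity, the definition amounts to the fraction of loss events in which a given flow lost at least one packet. The sketch below assumes a hypothetical trace format (one set of flow ids per loss event); the function name `loss_sync_rates` and that format are illustrative and not taken from the talk.

```python
def loss_sync_rates(loss_events, flows):
    """Estimate each flow's loss synchronization rate from a trace.

    loss_events: list of sets; each set holds the ids of the flows that
                 lost at least one packet in that loss event (assumed format).
    flows: iterable of all flow ids sharing the bottleneck.
    """
    if not loss_events:
        raise ValueError("need at least one loss event")
    n = len(loss_events)
    return {f: sum(f in event for event in loss_events) / n for f in flows}
```

A flow that appears in every event has rate 1 (fully synchronized losses), while a flow that rarely appears sees few of the congestion events that the link experiences.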

Loss Synchronization Rate: Effects

Intuitions:
  Individual flow: the smaller the better (selfishness)
  System design: the higher the better (for fairness and convergence)

Theoretical results:
  Aggregate throughput [Baccelli,Hong’02]
  Instantaneous fairness [Baccelli,Hong’02]
  Fairness convergence [Shorten, Wirth, Leith’06]

Loss Sync. Rate: Existing Model

[Shorten, Wirth, Leith’06] No Model. Measure from NS-2 and feed into a model for computational results

[Baccelli,Hong’02] Assume each packet has the same probability of being dropped in the loss event.

Packet loss is bursty: Internet

~50% losses happen in bursts

[Figure: incoming packets from all flows during the RTT of the loss event; within the burst period of the loss signal, L incoming packets are dropped. Legend: a dropped packet; a packet (from any flow)]

Loss process is bursty: on-off

In each loss event (one RTT), packet loss process is an on-off process.

Data packet process is bursty: on-off

[Figure: incoming packets from all flows during the RTT of the loss event; the w packets of flow i arrive within one burst period. Legend: a packet from flow i; a packet (from any flow). Inset: x(t) against t (time) over two RTTs, relative to capacity c]

In each loss event (one RTT), TCP data packet process is an on-off process.

Loss Sync. Rate: A Sampling Perspective

[Figure: incoming packets from all flows during the RTT of the loss event, showing the burst period of one flow (w packets) and the burst period of the loss signal (L incoming packets dropped). Legend: a dropped packet; a packet from flow i; a packet (from any flow)]

Loss Sync. Rate: The efficiency of a (bursty) TCP data process to sample the loss signal in a (bursty) loss process
  Assumption 1: Within the RTT of loss event, the position of an individual flow’s burst is uniformly distributed.
  Assumption 2: Loss process does not depend on data packet process of individual flows.

Loss Sync. Rate Case 1: TCP+DropTail

[Figure: incoming packets from all flows during the RTT of the loss event, showing the burst period of one flow (w packets) and the burst period of the loss signal (L incoming packets dropped). Legend: a dropped packet; a packet from flow i; a packet (from any flow)]

Loss sync. rate of flow i (written λ_i here):

$$\lambda_i = \frac{L + w_i - 1}{cd + B + L}$$

w_i : window of a TCP flow
L : number of dropped packets
cd+B+L : number of packets going through the bottleneck in the loss event (c : capacity; d : propagation delay; B : buffer size)
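The bursty case can be checked by brute force under the sampling assumptions: place the flow's burst of w consecutive packets at every possible start position among the packets of the loss-event RTT and count how often it overlaps the run of L dropped packets. The sketch assumes positions wrap around the RTT (a circular layout, chosen here so that every start position is equally valid); the function name and that layout are illustrative.

```python
from fractions import Fraction

def burst_overlap_probability(total, loss_len, burst_len):
    """Exact probability that a uniformly placed burst hits the loss run.

    total: packets through the bottleneck in the loss event (cd + B + L)
    loss_len: L consecutive dropped packets, assumed to occupy slots 0..L-1
    burst_len: w_i consecutive packets of the flow, wrapping circularly
    """
    hits = 0
    for start in range(total):
        if any((start + i) % total < loss_len for i in range(burst_len)):
            hits += 1
    return Fraction(hits, total)
```

As long as L + w_i - 1 does not exceed the total, the count of overlapping start positions is L + w_i - 1, so the enumeration reproduces the Case 1 expression (L + w_i - 1) / (cd + B + L).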

Loss Sync. Rate: TCP+DropTail

Loss Sync. Rate Case 2: Pacing+DropTail

[Figure: incoming packets from all flows during the RTT of the loss event; the w packets of flow i are distributed over the entire RTT, while L incoming packets fall in the burst period of the loss signal. Legend: a dropped packet; a packet from flow i; a packet (from any flow)]

Loss sync. rate of flow i (written λ_i here):

$$\lambda_i = 1 - \left(1 - \frac{L}{cd + B + L}\right)^{w_i}$$

w_i : window of a TCP flow
L : number of dropped packets
cd+B+L : number of packets going through the bottleneck in the loss event

Loss Sync. Rate: Pacing + DropTail

Loss Sync. Rate Case 3: TCP+RED

[Figure: incoming packets from all flows during the RTT of the loss event; the burst period of one flow holds w packets, while packet loss is distributed over the entire RTT]

Loss sync. rate of flow i (written λ_i here):

$$\lambda_i = 1 - \left(1 - \frac{w_i}{cd + B + L}\right)^{L}$$

w_i : window of a TCP flow
L : number of dropped packets
cd+B+L : number of packets going through the bottleneck in the loss event
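The three cases can be compared numerically. The functions below transcribe the closed forms as reconstructed from the three case slides (bursty TCP with DropTail, paced TCP with DropTail, bursty TCP with RED); the function names are illustrative, and `total` stands for cd + B + L. The sample parameters reuse cd+B = 1080, w_i = 60 and L = 16 from the MatLab computation slide, which is an assumption about how those numbers map onto these formulas.

```python
def sync_rate_tcp_droptail(w, L, total):
    # Bursty flow, bursty loss: the two runs overlap.
    return min(1.0, (L + w - 1) / total)

def sync_rate_pacing_droptail(w, L, total):
    # Paced flow: each of the w packets independently lands in the loss run.
    return 1.0 - (1.0 - L / total) ** w

def sync_rate_tcp_red(w, L, total):
    # Random drops: each of the L drops independently lands in the flow's burst.
    return 1.0 - (1.0 - w / total) ** L

total = 1080 + 16   # cd + B + L with the slide's sample values
rates = {
    "TCP+DropTail": sync_rate_tcp_droptail(60, 16, total),
    "Pacing+DropTail": sync_rate_pacing_droptail(60, 16, total),
    "TCP+RED": sync_rate_tcp_red(60, 16, total),
}
```

With these values the bursty DropTail rate is far below the other two, which matches the qualitative claim that pacing and RED both raise the loss synchronization rate.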

Model for Loss Sync. Rate: General form

cd+B : number of packets going through the bottleneck in the loss event (c : capacity; d : propagation delay; B : buffer size)
w_i : window of a TCP flow in the loss event
L : number of dropped packets in the loss event
K_i : length of burst period of flow i (in pkt)
M : length of burst period of loss process (in pkt)

[Figure: cd+B incoming packets during the RTT of the loss event; the burst period of flow i spans K incoming packets, and packets are dropped at random from the M incoming packets in the burst period of the loss signal. Legend: a dropped packet; a packet from flow i; a packet (from any flow)]

Loss sync. rate of flow i = ?

Loss Sync. Rate: MatLab Computation

cd+B = 1080; w_i = 60; L = 16; K, M vary

Measurement: TCP + DropTail

Averaged sync. rate

cd+B = 3340; M = L = N/2; K = w = (cd+B)/N

Measurement: Pacing + DropTail

Averaged sync. rate

cd+B = 3340; M = L = N/2; K = w = (cd+B)/N

Measurement: TCP + RED

Averaged sync. rate

cd+B = 3340; M = L = N/2; K = w = (cd+B)/N

Loss Sync. Rate: Qualitative Results

With DropTail and bursty TCP (most widely deployed combination), loss synchronization rate is very low;

TCP Pacing increases loss synchronization rate;

RED increases loss synchronization rate.

Loss Sync. Rate: Asymptotic Result

If number of flows N is large: L >> w_i

TCP:

$$\lambda_i = \frac{L + w_i - 1}{cd + B + L} \approx \frac{L}{cd + B + L}$$

Very weak dependency of Loss Sync Rate on window size: all flows see the same loss

TCP Pacing:

$$\lambda_i = 1 - \left(1 - \frac{L}{cd + B + L}\right)^{w_i} \approx \frac{L\, w_i}{cd + B + L}$$

Loss Sync Rate is proportional to window size: rich guys see more loss.
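The contrast between the two regimes shows up when comparing a "rich" flow with a "poor" one. The sketch below evaluates the ratio of their loss sync rates under the bursty and paced expressions; the helper names and the sample numbers (L = 500 drops among 100000 packets, windows of 10 and 5) are assumptions chosen only so that L >> w_i holds.

```python
def bursty_rate(w, L, total):
    return (L + w - 1) / total

def paced_rate(w, L, total):
    return 1.0 - (1.0 - L / total) ** w

def rich_to_poor_ratio(rate, w_rich, w_poor, L, total):
    """How many times more often the larger flow sees a loss event."""
    return rate(w_rich, L, total) / rate(w_poor, L, total)

ratio_bursty = rich_to_poor_ratio(bursty_rate, 10, 5, 500, 100_000)
ratio_paced = rich_to_poor_ratio(paced_rate, 10, 5, 500, 100_000)
```

Under the bursty expression the ratio stays close to 1 even though one window is twice the other, whereas under the paced expression it is close to the window ratio of 2.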

Asymptotic Result: MatLab Computation

cd+B = 1080; L = N/2; N varies

Fair share window size: (cd+B)/N

Implications

1. Scalable TCP is (usually) unfair with bursty TCP
2. TCP is unfriendly to TFRC;
3. …

Fairness of Scalable TCP

For each RTT without a loss:
  w_i(t+1) = α w_i(t); α = 1.01
For each RTT with a loss (loss event):
  w_i(t+1) = β w_i(t); β = 0.875

[Chiu,Jain’90]: MIMD algorithms cannot converge to fairness with synchronization model

[Kelly’03]: Scalable TCP (MIMD) converges to fairness in theory with fluid model

[Wei, Jin, Low’06][Li,Leith,Shorten’07]: Scalable TCP is unfair in experiments
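The disagreement between the two analyses comes down to which flows back off in a loss event, and a few lines of simulation make that visible. The sketch below applies the slide's MIMD rule (α = 1.01, β = 0.875) to two flows. The loss trigger (total window exceeding a fixed `capacity`), the function names, and in particular the `largest_only` policy are assumptions: the latter is a deliberately extreme stand-in for "loss rate grows with window size", not the fluid model itself.

```python
ALPHA, BETA = 1.01, 0.875   # increase and decrease factors from the slide

def run_mimd(windows, capacity, rtts, who_backs_off):
    """Evolve MIMD windows; a loss event occurs when their sum exceeds capacity."""
    w = list(windows)
    for _ in range(rtts):
        if sum(w) > capacity:                         # loss event this RTT
            hit = who_backs_off(w)
            w = [BETA * x if i in hit else x for i, x in enumerate(w)]
        else:                                         # RTT without a loss
            w = [ALPHA * x for x in w]
    return w

def everyone(w):
    # Synchronized loss: every flow observes every loss event.
    return set(range(len(w)))

def largest_only(w):
    # Illustrative extreme: only the flow with the largest window sees the loss.
    return {max(range(len(w)), key=w.__getitem__)}

synced = run_mimd([100.0, 10.0], 1000.0, 5000, everyone)
skewed = run_mimd([100.0, 10.0], 1000.0, 5000, largest_only)
```

When every flow sees every loss event, both windows are always scaled by the same factor, so the initial 10:1 ratio never changes. When only the largest flow backs off, the windows end up within a factor of 1/β of each other.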

Fairness of Scalable TCP: Chiu vs Kelly

[Kelly’03]: Scalable TCP (MIMD) is fair
  Assumption: loss event rate is proportional to window size (fluid model)
  Sync. Rate Model: true with very few bursty TCP flows or with paced TCP flows

[Chiu,Jain’90]: MIMD is not fair
  Assumption: loss event rate is independent of window size (simplified synchronization model)
  Sync. Rate Model: many bursty TCP flows

Scalable TCP: simulations

Capacity=100Mbps; delay=200ms; buffer size: BDP; MTU=1500; N varies; averaged rate over 600 second runtime

Gap 2: Fairness of Scalable TCP

Analysis: “Scalable TCP is fair in homogeneous network” [Kelly’03]

Experiment: in most cases, Scalable TCP is unfair in homogeneous network.

Analysis: “MIMD in general is unfair.” [Chiu&Jain’90] → Scalable TCP is unfair.

Reason: sub-RTT burstiness leads to similar loss sync. rate for different flows

TFRC vs TCP

[Figure: incoming packets from all flows during the RTT of the loss event, showing the burst period of TCP (w packets, marked 2), the TFRC packets (marked 1), and the burst period of the loss signal (L incoming packets). Legend: a dropped packet; a packet from TCP; a packet from TFRC; a packet (from any flow)]

Loss sync. rate of flow i (written λ_i here):

TCP:

$$\lambda_i = \frac{L + w_i - 1}{cd + B + L}$$

TFRC (same as Pacing):

$$\lambda_i = 1 - \left(1 - \frac{L}{cd + B + L}\right)^{w_i}$$

TFRC vs TCP: simulation

Gap 3: TCP vs TFRC

Analysis: “We designed TCP Friendly Rate Control (TFRC) algorithm to have the same equilibrium as TCP when they co-exist.”

Experiment: TCP flows do not fairly coexist with TFRC flows.

Reason: sub-RTT burstiness leads to different loss sync. rate for TFRC and TCP

Sub-RTT Burstiness: Summary

[Figure: x(t) against t (time) over two RTTs, relative to capacity c]

Effects:
  Low Loss Sync. Rate with DropTail router
    Poor convergence
    MIMD unfairness
    TFRC unfriendly

Possible solutions:
  Eliminate sub-RTT burstiness: Pacing
  Randomize loss signal: RED
  Persistent loss signal: ECN


Future: a research framework on microscopic Internet behavior

Experiment tools: help to observe, analyze and validate microscopic behavior in the Internet: WAN-in-Lab, NS-2 TCP-Linux, …

Theoretical model: more accurate models to capture the dynamics of the Internet at microscopic timescales.

New algorithms: new algorithms that utilize and control the microscopic Internet behavior

NS-2 TCP-Linux

The first tool that can run a congestion algorithm directly from Linux source code with the same simulation speed (sometimes even faster)

700+ local downloads (2400+ tutorial visits worldwide)
5+ Linux kernel fixes
2+ papers
Outreach:
  BIC/Cubic-TCP (NCSU),
  H-TCP (Hamilton),
  TCP Westwood (UCLA/Politecnico di Bari),
  A-Reno (NEC), …

[Figure: NS-2 Simulator; Linux Implementation]

Thank you!

Q&A