February 2005. The Role of PCE in the Evolution of Transport Protocols. Pfldnet 2005, Lyon, France. M. Y. “Medy” Sanadidi. http://www.cs.ucla.edu/~medy http://www.cs.ucla.edu/NRL/hpi/tcpw/


  • The Role of PCE in the Evolution of Transport Protocols

    Pfldnet 2005, Lyon, France

    M. Y. “Medy” Sanadidi

    http://www.cs.ucla.edu/~medy
    http://www.cs.ucla.edu/NRL/hpi/tcpw/

    Proprietary Content

  • Recent Issues in Transport Protocols

    Large pipes utilization: steady state; start-up
    Impact of wireless links: last-hop wireless; multihop contention networks
    Fairness for asymmetric flows
    Protocols co-existence
    New paradigms: voice/video; store-and-forward at the Transport layer (e.g. PEPs, P2P/overlays)

  • Example: Satellite/802.11 Networks


  • Outline

    Path Characteristics Estimation (PCE)
    Prospects for Higher Efficiency
    Future of Friendly Co-Existence
    Addressing the New Paradigms
    Summary

  • Path Characteristics Estimation (PCE)

    Characteristics of interest:
      Link capacities
      Path dynamic range, i.e. buffering capacity
      Cross traffic level, path-persistence, responsiveness
      Random loss
      Multihop wireless connectivity, contention, route diversity
    Participating nodes:
      Sources only
      Sources and destinations
      Forwarding nodes (routers, base stations, multihop wireless nodes)

  • Sharing a Link

    [Figure: two flows (Flow 1 and Flow 2) sharing a bottleneck link; the red flow is non-responsive. Labels: fair share?, bandwidth, residual bandwidth, bottleneck, interface queue, backlog, buffer space, propagation time.]

  • A Hierarchy of Characteristics

    [Figure labels: Achieved rate; Delay/Dynamic range; Packet loss; Intensity; Path persistence; Elasticity; Link capacities; Propagation times; Buffer space; Errors; Cross traffic load; Architecture; Flow behavior.]

  • Path Capacity Estimation

    Path capacity: the capacity of the narrow link
    Pathrate: relies on packet-pair dispersion measurements followed by statistical processing of the results
    CapProbe: uses dispersion measurements; performs online filtering of the results based on end-to-end delay
    TcpProbe: an adaptation of CapProbe into TCP with minimal, sender-side-only changes
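    The delay-based filtering mentioned for CapProbe can be illustrated with a minimal sketch. It assumes the filter keeps the packet pair with the smallest sum of end-to-end delays and converts that pair's dispersion into a capacity; the function name and the sample tuple format are hypothetical, not taken from the slides.

```python
def capprobe_capacity_bps(samples, packet_size_bytes):
    """Estimate narrow-link capacity from packet-pair samples.

    samples: list of (delay_first_s, delay_second_s, dispersion_s) tuples,
    one per packet pair (a hypothetical input format for this sketch).
    """
    if not samples:
        raise ValueError("need at least one packet-pair sample")
    # Assumption: keep the pair with the smallest end-to-end delay sum,
    # since it is the pair least likely to have queued behind cross traffic.
    best = min(samples, key=lambda s: s[0] + s[1])
    dispersion_s = best[2]
    return packet_size_bytes * 8 / dispersion_s


pairs = [
    (0.0100, 0.0112, 0.0012),  # pair that saw little or no queuing
    (0.0120, 0.0150, 0.0030),  # dispersion expanded by cross traffic
    (0.0130, 0.0136, 0.0006),  # dispersion compressed by queuing
]
print(capprobe_capacity_bps(pairs, 1500))
```

    With these made-up samples, the two distorted pairs are discarded because their delay sums are larger, and the estimate comes from the first pair only.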

  • CapProbe and TcpProbe


  • Prospects for Higher Efficiency

    Steady state:
      Congestion avoidance (FAST): stable at high throughput, co-existence ??, and random loss impact ??
      Scaling up congestion recovery (HSTCP, STCP): higher throughput, but fairness and stability ??
      Scaling up congestion recovery (BIC): improves on the above in fairness
      Forwarder based (XCP): superb, when we are done with implementation issues
      PCE reliance (TCP Westwood, TCP Peach): Peach requires forwarder priority support, TCPW requires good estimation at high speeds

  • Using PCE

    Tahoe/Reno/NewReno estimate:
      Packet loss via duplicate ACKs
      RTT average and variance
      A pipe size (or bandwidth-delay product) estimate, maintained in ssthresh
    Vegas/FAST:
      Achieved Rate and its relation to the Expected Rate, or equivalently RTT and RTTmin, or queuing delay
    HSTCP/STCP/BIC:
      Use the current window size (Expected Rate) in addition to all the Reno items above
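    The equivalence noted for Vegas/FAST (Expected vs. Achieved Rate, RTT vs. RTTmin, or queuing delay) can be made concrete with a short sketch. It uses the usual Vegas definitions, Expected = cwnd/RTTmin and Achieved = cwnd/RTT; the function names are illustrative.

```python
def vegas_backlog_pkts(cwnd_pkts, rtt_s, rtt_min_s):
    """Packets this flow keeps queued in the path, per the Vegas estimate."""
    expected_rate = cwnd_pkts / rtt_min_s  # rate if there were no queuing
    achieved_rate = cwnd_pkts / rtt_s      # rate actually obtained
    return (expected_rate - achieved_rate) * rtt_min_s


def queuing_delay_s(rtt_s, rtt_min_s):
    """The same congestion signal expressed as a delay."""
    return rtt_s - rtt_min_s


# Example: window of 100 packets, RTTmin 100 ms, current RTT 125 ms.
print(vegas_backlog_pkts(100, 0.125, 0.100))
print(queuing_delay_s(0.125, 0.100))
```

    The backlog is zero exactly when RTT equals RTTmin, which is why the three formulations carry the same information.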

  • Using PCE (2)

    TCPW estimates:
      Packet loss and type of loss
      Narrow link capacity, or path capacity
      Achieved Rate
      Dynamic range resulting from buffering space: (RTTmax - RTTmin)
    XCP measures at forwarders the actual:
      Link capacities
      Load intensity
      RTT (obtained from sources)

  • Large Pipes Measurements Results


  • Experiments Environment

    (Powerful machines) CPU: Xeon 3.06 GHz; Cache: 512 KB L2 / 1 MB L3; Intel 1000PRO; PCI-X bus 133 MHz

    [Figure: a NewReno sender and an advanced TCP sender connect over Gigabit links to the UCLA Gigabit switch, then across Internet2 to NewReno receivers at Alabama and Caltech.]

    PATHNETS 2004 - San Jose CA

  • Acceptable Long Term Efficiency


  • UCLA-Alabama


  • Some Difference in Completion Times


  • Transfer Completion Times

    On average:

    TCPW and FAST: 0 to 100 MB in 5.8 sec!
    HSTCP: 0 to 100 MB in 7.5 sec!
    NewReno: 0 to 100 MB in 11 sec!

    UCLA-Alabama
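
    As a quick arithmetic check, the completion times can be turned into average throughput over the first 100 MB. The sketch assumes 1 MB = 10^6 bytes, which the slides do not specify.

```python
def average_throughput_mbps(megabytes, seconds):
    """Average throughput in Mbps, assuming 1 MB = 10**6 bytes."""
    return megabytes * 8 / seconds


for name, secs in [("TCPW/FAST", 5.8), ("HSTCP", 7.5), ("NewReno", 11.0)]:
    print(f"{name}: {average_throughput_mbps(100, secs):.1f} Mbps")
```

    Under that assumption the averages are roughly 138, 107, and 73 Mbps, so TCPW and FAST finish the transfer a little under twice as fast as NewReno.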


  • Co-Existence at Gbps Speed


  • Friendliness

    UCLA-CalTech


  • Random Loss Impact


  • Random Loss Emulation

    Induced non-congestion packet loss in the emulator (PER 0.1% up to 0.5%)
    TCPW throughput much higher than all other schemes

    [Figure: advanced TCP sender at UCLA, Nistnet network emulator, NewReno receiver at Alabama (UCLA-Alabama path).]

  • Effect of Random Loss


  • Random Loss Emulation (Results)

    UCLA-Alabama

    Average Throughput (Mbps) versus induced packet error rate (PER):

    Protocol    PER 0.10%    PER 0.25%    PER 0.50%
    TCPW        49.43        22.83        12.92
    FAST        30.25        12.14         8.60
    HSTCP        4.71         3.03         2.26

  • TCPW: Mining ACK Streams for PCE

    Rely on PCE (e.g. capacity, achieved rate, dynamic range) to determine an Eligible Rate Estimate (ERE)
    ERE is used to size the congestion window after a packet loss

    [Figure: sender and receiver across the Internet with a bottleneck; packets flow to the receiver, ACKs return, and the sender measures the ACK stream.]
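    A minimal sketch of how an ERE might size the window after a loss, assuming the Westwood-style rule ssthresh = ERE x RTTmin (expressed in segments) and cwnd capped at ssthresh. The floor of 2 segments and the parameter names are assumptions of this sketch.

```python
def window_after_loss(ere_bps, rtt_min_s, mss_bytes, cwnd_pkts):
    """Return (ssthresh, cwnd) in segments after a loss indication."""
    # Pipe size implied by the eligible rate estimate, in segments.
    pipe_pkts = int(ere_bps * rtt_min_s / (8 * mss_bytes))
    ssthresh = max(2, pipe_pkts)     # assumed floor of 2 segments
    cwnd = min(cwnd_pkts, ssthresh)  # never grow the window on loss
    return ssthresh, cwnd


# Example: ERE of 40 Mbps, RTTmin of 100 ms, 1500-byte segments.
print(window_after_loss(40e6, 0.100, 1500, cwnd_pkts=500))
```

    For comparison, Reno would simply halve the window (500 to 250 segments here) regardless of what the path can actually carry.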

  • TCPW BE (2001)

    With Saverio Mascolo (P. Bari) and Claudio Casetti (P. Torino)
    BE sampling: ~ packet pair; a noisy estimate of achieved rate/capacity
    Provides throughput boost under random loss, overestimates under congestion
    Efficient but not friendly

  • TCPW RE (2002)

    RE sampling: ~ packet train
    Fair estimate under congestion, underestimates under random loss
    Used in TCPW RE and in TCP Westwood+ (S. Mascolo)
    Friendly
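    The difference between packet-pair-like (BE) and packet-train-like (RE) sampling can be sketched as follows. This is an illustration under simplifying assumptions: it omits the low-pass filtering a real sender would apply, and the ACK record format and function names are hypothetical.

```python
def be_samples(acks):
    """BE-style samples: bits acknowledged per inter-ACK interval.

    acks: list of (arrival_time_s, bytes_acked), sorted by time.
    """
    return [8 * b / (t - t_prev)
            for (t_prev, _), (t, b) in zip(acks, acks[1:])]


def re_sample(acks, window_s):
    """RE-style sample: bits acknowledged over the last window_s seconds."""
    t_end = acks[-1][0]
    total = sum(b for t, b in acks if t_end - window_s < t <= t_end)
    return 8 * total / window_s


# Two bursts of three ACKs, 1500 bytes each, separated by a long gap.
acks = [(0.000, 1500), (0.001, 1500), (0.002, 1500),
        (0.050, 1500), (0.051, 1500), (0.052, 1500)]
print(max(be_samples(acks)), re_sample(acks, 0.100))
```

    In this example the back-to-back ACKs inside a burst yield BE samples near the bottleneck rate, while the RE sample averages over the idle gap and reports the much lower rate the flow actually achieved.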

  • Adaptive Estimation in TCPW

    TCPW CRB: ERE = BE if random loss, else ERE = RE

    TCPW ABSE: ERE RE

  • TCPW CRB (2002)

    Combined Rate and Bandwidth
    Binary adaptive
    Congestion measure: Expected Rate/Achieved Rate
    Clarified the Efficiency/Friendliness tradeoff
    When a packet loss is detected:
      Congestion measure under a threshold: ssthresh, cwnd = BE x RTTmin
      Congestion measure over a threshold: ssthresh, cwnd = RE x RTTmin
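    The binary choice in CRB can be sketched in a few lines. The threshold value of 1.4 is an assumption of this sketch, and the function name is illustrative; the slides only say that the congestion measure is compared against a threshold.

```python
def crb_select_ere(be_bps, re_bps, expected_bps, achieved_bps, threshold=1.4):
    """Pick BE or RE as the eligible rate when a loss is detected.

    threshold is a hypothetical tuning value for this sketch.
    """
    congestion_measure = expected_bps / achieved_bps
    if congestion_measure > threshold:
        return re_bps  # loss attributed to congestion: conservative estimate
    return be_bps      # loss attributed to random error: aggressive estimate


# Expected rate twice the achieved rate: treated as congestion.
print(crb_select_ere(10e6, 6e6, expected_bps=12e6, achieved_bps=6e6))
# Expected rate close to the achieved rate: treated as random loss.
print(crb_select_ere(10e6, 6e6, expected_bps=7e6, achieved_bps=6e6))
```

    The selected rate would then be multiplied by RTTmin to set ssthresh and cwnd, as listed on the slide.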

  • TCPW ABSE (2002)

    Adaptive Bandwidth Share Estimation
    Adapt the sample interval Tk according to congestion level
    Congestion measure, similar to Vegas
    Tk ranges from one inter-ACK interval to the current RTT
    Better Efficiency/Friendliness profile than CRB

    [Figure panels: Under Congestion; Under No Congestion]
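    One plausible way to adapt Tk is sketched below. It assumes Tk scales with a Vegas-like congestion fraction, (Expected - Achieved)/Expected, and is clamped between one inter-ACK interval and the current RTT; the exact mapping is not given on the slide, so treat the formula as an assumption.

```python
def abse_sample_interval(rtt_s, inter_ack_s, expected_bps, achieved_bps):
    """Choose the rate-sampling interval Tk from a Vegas-like measure."""
    # Fraction of the expected rate that is not being achieved (0 = no congestion).
    congestion = max(0.0, (expected_bps - achieved_bps) / expected_bps)
    # Assumed mapping: longer interval (more RE-like) as congestion grows.
    tk = rtt_s * congestion
    return min(rtt_s, max(inter_ack_s, tk))


print(abse_sample_interval(0.100, 0.001, 10e6, 10e6))  # no congestion
print(abse_sample_interval(0.100, 0.001, 10e6, 5e6))   # moderate congestion
print(abse_sample_interval(0.100, 0.001, 10e6, 0.0))   # severe congestion
```

    A short interval behaves like BE sampling and a long one like RE sampling, so the estimate moves continuously between the two instead of switching as CRB does.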

  • Helping Short Lived Connections

    Approaches:
      Cached ssthresh
      Larger initial window
      PCE based: Hoe's; TCPW Astart
      Negotiation: Quick-Start
    No problems here for XCP!

  • TCPW Astart (2003)

    Take advantage of ERE:
      Adaptively and repeatedly reset ssthresh from ERE until the sender window reaches the estimated pipe size, or encounters packet loss
      Includes multiple mini exponential increase and mini linear increase phases
      cwnd grows slower as it approaches BDP
    Connection converges faster to its pipe size with less buffer overflow, since it adapts to pipe size and transient loading
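    The alternating mini exponential and mini linear phases can be illustrated with a per-RTT sketch. It assumes ssthresh is raised whenever the ERE-based pipe estimate (ERE x RTTmin, in segments) exceeds it, and that cwnd doubles below ssthresh and grows by one segment otherwise; loss handling is left out, and the names are hypothetical.

```python
def astart_step(cwnd, ssthresh, ere_pipe_pkts):
    """Advance cwnd and ssthresh by one RTT during start-up."""
    # Raise ssthresh when the current pipe estimate is larger.
    if ssthresh < ere_pipe_pkts:
        ssthresh = ere_pipe_pkts
    if cwnd < ssthresh:
        cwnd = min(2 * cwnd, ssthresh)  # mini exponential increase
    else:
        cwnd += 1                       # mini linear increase
    return cwnd, ssthresh


cwnd, ssthresh = 2, 2
trace = []
for ere_pipe in [4, 4, 16, 16, 16]:  # made-up pipe estimates, one per RTT
    cwnd, ssthresh = astart_step(cwnd, ssthresh, ere_pipe)
    trace.append(cwnd)
print(trace)
```

    Each time the estimate rises, a new exponential phase begins; once cwnd catches up with the estimate, growth falls back to linear until the next update.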

  • Astart: First 20 Seconds Throughput

    [Figure panels: RTT = 100 ms, Buffer = BDP; RTT = 100 ms, Bottleneck = 40 Mbps; Bottleneck capacity = 40 Mbps, Buffer = BDP]

    Good scaling with capacity and propagation time
    Robust to buffer size variation

  • TCPW BBE (Work in Progress)

    With H. Shimonishi (NEC, Tokyo)
    Buffer and Bandwidth Estimation
    Estimates capacity using TcpProbe (much more accurate than BE!!)
      Higher efficiency at higher random loss rates (e.g. 5-10%)
    Estimates dynamic range (related to buffer size)
      Improves TCPW control as a function of congestion
    The result is higher efficiency and robust friendliness even at small buffers!

  • TCPW BBE Algorithms (ICC 2005)

    Dynamic range estimate: Dmax = RTT_congloss - RTTmin, where RTT_congloss is the RTT at a congestion loss

    Current delay distance: D = RTT - RTTmin

    Eligible Rate estimate: ERE = u * C + (1-u) * RE

    Note: u=0 if D and Dmax are small
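    The BBE quantities can be written out as a small sketch. The slide does not say how the weight u is derived from D and Dmax, so u is taken here as an input in [0, 1]; the function names are illustrative.

```python
def delay_distance(rtt_s, rtt_min_s):
    """D = RTT - RTTmin: current queuing delay along the path."""
    return rtt_s - rtt_min_s


def dynamic_range(rtt_at_congestion_loss_s, rtt_min_s):
    """Dmax: queuing delay observed when a congestion loss occurred."""
    return rtt_at_congestion_loss_s - rtt_min_s


def bbe_ere(capacity_bps, re_bps, u):
    """ERE = u * C + (1 - u) * RE, with u supplied by the caller."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    return u * capacity_bps + (1.0 - u) * re_bps


# Example: 100 Mbps capacity, 40 Mbps rate estimate, weight u = 0.25.
print(bbe_ere(100e6, 40e6, 0.25))
```

    With u = 0 the estimate reduces to RE, and with u = 1 it equals the path capacity C, so u sets how aggressively the window is restored after a loss.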

  • Opportunistic Friendliness of TCPW-BBE

    If Reno under-performs: use all the opportunity provided without hurting co-existing Reno flows
    If Reno performs: achieve similar to Reno

    [Figure: a TCP-Reno sender/receiver pair and a TCPW-BBE sender/receiver pair sharing a 10 Mbps to 1 Gbps link; 0.001% loss, RTT 40 ms.]

  • The Future of Friendly Co-Existence

    Defining friendliness:
      TCP Friendliness:
        Achieve throughput equal to that of TCP Reno under some conditions (RTT, packet loss rate)
        Problematic if Reno under-performs, e.g. under random losses
      Opportunistic Friendliness:
        If Reno performs, achieve similar to Reno
        If Reno under-performs, use all the opportunity provided without hurting co-existing Reno flows

  • Evaluating a New Proposed Protocol: The Efficiency/Friendliness Profile

    Each point in the graph is obtained as follows:
      N legacy flows => legacy throughput tR1, total utilization U1
      N/2 legacy, N/2 proposed flows => legacy throughput tR2, total utilization U2
      Efficiency improvement: E = U2 / U1
      Friendliness: F = tR2 / tR1

    [Figure: Friendliness (F) plotted against Efficiency (E), with axis marks at 0.0 and 1.0 and the target points highlighted.]
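
    Computing one point of the profile is a two-ratio calculation. The sketch below follows the definitions on this slide; the function name and the numeric inputs are made up for illustration.

```python
def ef_point(t_r1, u1, t_r2, u2):
    """Return (E, F) for one experiment pair.

    t_r1, u1: legacy throughput and total utilization with N legacy flows.
    t_r2, u2: legacy throughput and total utilization with N/2 legacy and
              N/2 proposed flows.
    """
    efficiency = u2 / u1        # E = U2 / U1
    friendliness = t_r2 / t_r1  # F = tR2 / tR1
    return efficiency, friendliness


# Made-up numbers: utilization rises from 60% to 90% while the legacy
# flows keep 90% of their original throughput.
print(ef_point(t_r1=5.0, u1=0.60, t_r2=4.5, u2=0.90))
```

    A protocol is attractive when E is well above 1 while F stays close to 1, meaning the extra utilization does not come at the legacy flows' expense.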

  • E/F Profiles of TCPW BE, CRB and ABSE


  • E/F Profile of Vegas

    [Figure: Vegas vs. NewReno (RED); axes are Utilization Ratio G (Efficiency) and Throughput Ratio L (Friendliness), with curves for N = 2, 4, 8, 16, 24.]

    Vegas uses a fixed targeted queue length => varying friendliness depending on the number of connections!

  • Addressing New Paradigms

    Audio/video streaming: increasing portion of the total traffic, with distinct requirements
    Multihop wireless: difficult fundamental issues
    Store-and-forward at the Transport layer: revisit early problems and new opportunities

  • Continuous Media Transport

    Requirements:
      Minimum bandwidth
      Upper bound on delay
      Lower reliability requirements than in FTP
    Adaptive streaming objectives:
      Delivered quality
      Congestion control
      Support for adaptive coding

  • Addressing Continuous Media Issues

    Issues with the standard protocols:
      UDP: no congestion or error control
      TCP: AIMD behavior undesirable due to fluctuation in rate, and consequently delay, and intolerance to random loss
    DCCP provides an excellent framework, recommends TFRC as one possible protocol, but allows for alternatives
    TFRC is equation based, rate-equivalent to Reno, with smoother delivery suitable for streaming
    SCTP enables multiple streams with different congestion control mechanisms, among other features
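    To show what "equation based" means for TFRC, the sketch below evaluates the TCP throughput equation as given in the TFRC specification (RFC 5348), with the commonly recommended simplifications b = 1 and t_RTO = 4 x RTT. The function name and the example values are illustrative.

```python
import math


def tfrc_rate_bytes_per_s(segment_bytes, rtt_s, loss_event_rate, b=1, t_rto_s=None):
    """Allowed sending rate from the TCP throughput equation used by TFRC."""
    if not 0.0 < loss_event_rate <= 1.0:
        raise ValueError("loss_event_rate must be in (0, 1]")
    if t_rto_s is None:
        t_rto_s = 4 * rtt_s  # simplification recommended in the specification
    p = loss_event_rate
    denom = (rtt_s * math.sqrt(2 * b * p / 3)
             + t_rto_s * 3 * math.sqrt(3 * b * p / 8) * p * (1 + 32 * p ** 2))
    return segment_bytes / denom


for p in (0.001, 0.01, 0.05):
    rate_mbps = 8 * tfrc_rate_bytes_per_s(1500, 0.100, p) / 1e6
    print(f"p = {p}: {rate_mbps:.2f} Mbps")
```

    Because the rate depends only on the measured loss event rate and RTT, random (non-congestion) loss lowers it just as congestion loss would, which is the weakness the wireless streaming variants try to address.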

  • Streaming Over Wireless

    Under random loss, Reno and its rate-equivalent TFRC will both under-perform
    Approaches, some with loss discrimination, have been proposed:
      TFRC Wireless: combination of loss discrimination schemes
      Multi-TFRC: multiple TFRC connections until the link is congested
      VTP: rate estimation and loss discrimination

  • Performance Comparison

    Efficiency in presence of errors: 5% error rate, single connection
    Rate adaptation: 5% error rate, single connection with on/off CBR cross traffic

  • TCP over Multihop Wireless

    Packet losses due to:
      Contention due to hidden terminals
      Varying channel quality
      Route collapse
      Buffer overflow ??
    Solution approaches:
      Neighborhood RED
      Delayed ACK extension
      Sizing the TCP window for contention reduction

  • Store & Forward at the Transport Layer

    Overlays/P2P tunneling through TCP connections
    PEPs breaking the ETE path into concatenated TCP connections, e.g. satellites
    New(?) requirements:
      Buffer management and priority schemes for better ETE application protocol performance
      TCP receiver advertised window role
    Related item: prioritized TCP for QoS at the Transport layer (TCP-LP, TCPW-LP)

  • Summary

    Excellent progress by many approaches for scaling efficiency with pipe size
    Focus on PCE techniques is promising, e.g. TCPW provides:
      Scalable efficiency
      Robustness to random loss
      Tunable opportunistic friendliness
    Streaming, multihop wireless, and forwarding at the Transport layer to receive attention and make good progress

  • Steady State Characteristics (TCPW RE)

    For small loss rate, TCPW has a much larger window than NewReno. More scalable!

    Equilibriums of congestion window and loss probability are: [equations not preserved in this transcript]

  • Fairness (TCPW RE)

    For small loss rate, TCPW is more fair than NewReno

    Two TCP connections with different round trip delays share the same bottleneck; they have the same queuing delay [symbol not preserved], and [symbol not preserved] is usually small

    TCPW: [equation not preserved in this transcript]

    NewReno: [equation not preserved in this transcript]

    Multi-TFRC: opens multiple TFRC connections until the link is congested
    TFRC Wireless: uses loss discrimination; combines methods, chosen based on conditions; lack of smoothness, possibly from the TFRC equation-based rate control

    P is loss probability, d is round trip delay, tau is round trip time,