8/8/2019 UDP High Speed
UDP-based Schemes for High Speed Networks
Presented By: Sumitha Bhandarkar
Presented On: 03.24.04
Agenda
RBUDP: E. He, J. Leigh, O. Yu, T. A. DeFanti, Reliable Blast UDP: Predictable High Performance Bulk Data Transfer, IEEE Cluster Computing 2002, Chicago, Illinois, Sept 2002.
Tsunami: (No technical resources available) http://www.ncne.org/training/techs/2002/0728/presentations/200207-wallace1_files/v3_document.htm
SABUL/UDT:
H. Sivakumar, R. L. Grossman, M. Mazzucco, Y. Pan, Q. Zhang, Simple Available Bandwidth Utilization Library for High-Speed Wide Area Networks, to appear in Journal of Supercomputing, 2004.
Y. Gu and R. Grossman, UDT: An Application Level Transport Protocol for Grid Computing, Second International Workshop on Protocols for Fast Long-Distance Networks, February 2004 (PFLDnet 2004).
Y. Gu and R. Grossman, UDT: A Transport Protocol for Data Intensive Applications, IETF draft. http://bebas.vlsm.org/v08/org/rfc-editor/internet-drafts/draft-gg-udt-00.txt
GTP: R.X. Wu and A.A. Chien, GTP: Group Transport Protocol for Lambda-Grids, 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, April 2004 (CCGrid 2004).
TCP-based Schemes: The Problems
Slow Startup
Slow loss recovery
RTT bias
Burstiness caused by window control
Large amount of control traffic due to per-packet ack
RBUDP
Intended to be aggressive.
Intended for high-bandwidth dedicated or QoS-enabled networks, not for deployment on the broader Internet.
Uses UDP for data traffic and TCP for signaling traffic.
Estimates available bandwidth on the network using Iperf/app_perf (NOTE: this requires user interaction, i.e., it is NOT automated).
Tries to send just below this rate in blasts to avoid losses (payload = RTT * estimated BW).
If losses do occur within a blast, TCP is used to exchange loss reports.
Lost packets are recovered by retransmitting them in smaller blasts.
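The blast-and-repair loop above can be sketched as follows. The packet size, unit conventions, and function names are my own assumptions for illustration, not details from the RBUDP paper:

```python
PKT_BYTES = 1472  # assumed UDP payload per datagram (MTU-sized)

def blast_size_pkts(rtt_s, est_bw_bps):
    """Blast payload = RTT * estimated BW (the slide's formula),
    converted here to a packet count."""
    return max(1, int(rtt_s * est_bw_bps / 8 / PKT_BYTES))

def rbudp_rounds(total_pkts, rtt_s, est_bw_bps, lost_in):
    """Toy model of RBUDP: blast just below the estimated rate, collect a
    loss report over TCP, then re-blast only the lost packets until done."""
    pending, rounds = total_pkts, 0
    while pending > 0:
        rounds += 1
        sent = min(pending, blast_size_pkts(rtt_s, est_bw_bps))
        pending = pending - sent + lost_in(sent)  # losses go back on the queue
    return rounds
```

With a lossless link the transfer needs one blast per pipe-full of data; with losses, each repair blast shrinks to just the packets the receiver reported missing.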
RBUDP
E. He, J. Leigh, O. Yu, T. A. DeFanti, Reliable Blast UDP: Predictable High Performance Bulk Data Transfer, IEEE Cluster Computing 2002, Chicago, Illinois, Sept 2002.
RBUDP: Sample Results (with network bottleneck)
E. He, J. Leigh, O. Yu, T. A. DeFanti, Reliable Blast UDP: Predictable High Performance Bulk Data Transfer, IEEE Cluster Computing 2002, Chicago, Illinois, Sept 2002.
RBUDP: Sample Results (with receiver bottleneck)
E. He, J. Leigh, O. Yu, T. A. DeFanti, Reliable Blast UDP: Predictable High Performance Bulk Data Transfer, IEEE Cluster Computing 2002, Chicago, Illinois, Sept 2002.
RBUDP: Conclusions
Advantages
Keeps the pipe as full as possible
Avoids TCP's per-packet ack interaction
Paper provides an analytical model, so performance is predictable
Disadvantages
Sending rate needs to be adjusted by the user (no means of automatically adjusting the sending rate in response to dynamic network conditions); thus the solution is good ONLY in dedicated/QoS-supported networks.
No flow control: a fast sender can flood a slow receiver. The offered solution is to use app_perf (a modified Iperf developed by the authors to take the receiver bottleneck into account) for bandwidth estimation.
Tsunami
No tech papers. This info is from a presentation at the July 2002 NLANR/Internet2 Techs Workshop, available for download at http://www.indiana.edu/~anml/anmlresearch.html. The latest version is dated 12/09/02.
Very simple and primitive scheme; NOT TCP-friendly.
Application-level protocol: uses UDP for data and TCP for signaling.
Receiver keeps track of lost packets and requests retransmissions.
So how is this different from RBUDP?
SABUL / UDT
SABUL (Simple Available Bandwidth Utilization Library) uses UDP to transfer data and TCP to transfer control information.
UDT (UDP-based Data Transfer Protocol) uses only UDP, for both data and control information.
UDT is the successor to SABUL.
Both are application-level protocols, available as open-source C++ libraries on Linux/BSD/Solaris and as NS-2 simulation modules.
SABUL / UDT
Rate control: handles dynamic congestion; uses a constant rate-control interval (called SYN, set to 0.01 seconds) to avoid RTT bias.
Window-based flow control: used in slow start, to ensure that a fast sender does not swamp a slow receiver, and to limit unacknowledged packets.
Selective positive acknowledgement (one per SYN) and immediate negative acknowledgement.
Uses both packet loss and packet delay for inferring congestion.
TCP-friendly: less aggressive than TCP in low-BDP networks; better than TCP in higher-BDP networks.
PFLDnet 2004 claim: orthogonal design. The UDP-based framework can be used with any congestion control algorithm, and the UDT congestion control algorithm can be ported to any TCP implementation.
SABUL / UDT
Y. Gu and R. Grossman, UDT: An Application Level Transport Protocol for Grid Computing, PFLDnet 2004.
SABUL / UDT: Rate Control (AIMD)
Y. Gu, X. Hong, M. Mazzucco and R. Grossman, SABUL: A High Performance Data Transfer Protocol, submitted for publication.
Y. Gu and R. Grossman, UDT: An Application Level Transport Protocol for Grid Computing, PFLDnet 2004.
Increase
If the loss rate during the last SYN is less than a threshold (0.1%), the sending rate is increased.
Old version (SABUL): (formula in the original slide)
New version (UDT): (formula in the original slide)
Estimated BW is calculated using the packet-pair technique:
Every 16th data packet and its successor are sent back to back to form a packet pair.
The receiver uses a median filter on the interval between the arrival times of each packet pair to estimate the link capacity.
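The median-filter step might look like this; the function name, sample format, and units are my assumptions:

```python
import statistics

def link_capacity_bps(pair_gaps_s, pkt_bits):
    """Estimate link capacity from the inter-arrival gaps of the periodic
    packet pairs: the median filter discards gaps stretched by cross
    traffic or queueing, and the bottleneck spreads a back-to-back pair
    by one packet's transmission time, so capacity = L / gap."""
    return pkt_bits / statistics.median(pair_gaps_s)

# gaps from pairs of 12,000-bit packets; one gap inflated by cross traffic
estimate = link_capacity_bps([120e-6, 120e-6, 500e-6], 12000)
```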
SABUL / UDT: Rate Control (AIMD)
Y. Gu and R. Grossman, UDT: An Application Level Transport Protocol for Grid Computing, PFLDnet 2004.
SABUL / UDT: Rate Control (AIMD)
Decrease
Increase the inter-packet time by 1/8 (or equivalently, decrease the sending rate by 1/9) when one of these conditions holds:
the largest lost sequence number in the NAK is greater than the largest sequence number sent when the last decrease occurred
it is the 2^dec_count-th NAK since the last time the above condition was satisfied. dec_count is reset to 4 each time the first condition is satisfied, and incremented by 1 each time the second condition is satisfied.
a delay warning is received
Loss information carried in NAKs is compressed for losses of consecutive packets.
No data is sent in the next SYN time after a decrease.
The delay warning is generated by the receiver based on the observed RTT trend.
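A sketch of the decrease rule above. Apart from dec_count, the class and variable names are my own; this is an illustration of the described logic, not the actual UDT code:

```python
class DecreaseState:
    """Tracks UDT's NAK-driven rate decrease as described on the slide."""
    def __init__(self, inter_pkt_time_s):
        self.ipt = inter_pkt_time_s   # time between packets; rate = 1/ipt
        self.last_dec_seq = -1        # largest seq sent when condition 1 last held
        self.dec_count = 4
        self.naks_since = 0

    def on_nak(self, largest_lost_seq, largest_sent_seq):
        decrease = False
        if largest_lost_seq > self.last_dec_seq:
            # condition 1: loss beyond anything sent at the last decrease
            decrease = True
            self.dec_count, self.naks_since = 4, 0
            self.last_dec_seq = largest_sent_seq
        else:
            self.naks_since += 1
            if self.naks_since == 2 ** self.dec_count:
                # condition 2: the 2^dec_count-th NAK since condition 1 held
                decrease = True
                self.dec_count += 1
                self.naks_since = 0
        if decrease:
            self.ipt *= 9 / 8  # +1/8 inter-packet time == -1/9 sending rate
        return decrease

    def on_delay_warning(self):
        # condition 3: receiver observed a rising RTT trend
        self.ipt *= 9 / 8
```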
SABUL / UDT: Rate Control (AIMD)
Y. Gu and R. Grossman, UDT: An Application Level Transport Protocol for Grid Computing, PFLDnet 2004.
Flow Control
The receiver calculates the packet arrival rate (AS) using a median filter and sends it back with the ACK.
On the sender side, if the AS value in the ACK is greater than 0, then the window is updated (formula in the original slide).
During congestion, loss reports can be dropped or delayed. If the sender keeps sending new packets, it worsens congestion. Flow control helps prevent this.
Flow control is also used in the slow start phase, which starts with a flow window of 2, similar to TCP, but occurs only at the beginning of a new session.
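The receiver-side AS calculation could be sketched like this; the function name, window size, and units are my assumptions:

```python
import statistics

def packet_arrival_rate(arrival_times_s, window=16):
    """AS = 1 / median inter-arrival interval over the most recent packets;
    the median filter suppresses gaps caused by delayed or dropped packets."""
    recent = arrival_times_s[-(window + 1):]
    gaps = [b - a for a, b in zip(recent, recent[1:])]
    if not gaps:
        return 0.0
    med = statistics.median(gaps)
    return 1.0 / med if med > 0 else 0.0
```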
SABUL / UDT: Timers
SYN timer: triggers rate control events (fixed at 0.01 s).
SND timer: schedules data packet sending (updated by the rate control scheme).
ACK timer: triggers an ACK (same interval as SYN).
NAK timer: triggers a NAK. Its interval is updated to the current RTT value each time the SYN timer expires.
EXP timer: triggers data packet retransmission and maintains connection status. It is somewhat similar to the TCP RTO.
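The timer relationships can be captured in a small sketch; the structure and names are mine, the intervals and update rules follow the slide:

```python
SYN_INTERVAL = 0.01  # fixed rate-control interval, seconds

class UdtTimers:
    """Holds the five UDT timer intervals and the two update rules above."""
    def __init__(self, rtt_s, snd_interval_s, exp_interval_s):
        self.interval = {
            "SYN": SYN_INTERVAL,    # rate control events
            "SND": snd_interval_s,  # packet pacing, owned by rate control
            "ACK": SYN_INTERVAL,    # one ACK per SYN interval
            "NAK": rtt_s,           # loss reports, tracks current RTT
            "EXP": exp_interval_s,  # retransmit/keepalive, like TCP's RTO
        }

    def on_syn_expiry(self, current_rtt_s):
        # The NAK interval is refreshed to the current RTT each SYN expiry.
        self.interval["NAK"] = current_rtt_s

    def on_rate_change(self, new_snd_interval_s):
        # Rate control paces packets by adjusting the SND interval.
        self.interval["SND"] = new_snd_interval_s
```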
SABUL / UDT: Simulation Results (100 Mbps / 1 ms and 1 Gbps / 100 ms links)
Y. Gu and R. Grossman, Using UDP for Reliable Data Transfer over High Bandwidth-Delay Product Networks, submitted for publication.
SABUL / UDT: Simulation Results (7 concurrent flows, 100 Mbps bottleneck link)
Y. Gu and R. Grossman, Using UDP for Reliable Data Transfer over High Bandwidth-Delay Product Networks, submitted for publication.
SABUL / UDT: Simulation Results
Y. Gu and R. Grossman, Using UDP for Reliable Data Transfer over High Bandwidth-Delay Product Networks, submitted for publication.
SABUL / UDT: Simulation Results
Y. Gu and R. Grossman, Using UDP for Reliable Data Transfer over High Bandwidth-Delay Product Networks, submitted for publication.
SABUL / UDT: Real Implementation Results
Y. Gu and R. Grossman, Using UDP for Reliable Data Transfer over High Bandwidth-Delay Product Networks, submitted for publication.
SABUL / UDT: Real Implementation Results (1 Gbps / 40 us link)
Y. Gu and R. Grossman, Using UDP for Reliable Data Transfer over High Bandwidth-Delay Product Networks, submitted for publication.
SABUL / UDT: Real Implementation Results (1 Gbps / 110 ms link)
I-TCP = TCP with concurrent UDT flows
S-TCP = TCP without concurrent UDT flows
Y. Gu and R. Grossman, Using UDP for Reliable Data Transfer over High Bandwidth-Delay Product Networks, submitted for publication.
SABUL / UDT: Real Implementation Results
Y. Gu and R. Grossman, Using UDP for Reliable Data Transfer over High Bandwidth-Delay Product Networks, submitted for publication.
SABUL / UDT: Conclusions
From one of the SLAC talks [1]: looks good, BUT 4x the CPU utilization of TCP.
Reordering robustness is worse than TCP's: all out-of-order packets are treated as losses. The suggested solution is to delay NAK reports by a short interval.
All losses are treated as congestion: bad performance at high link error rates. (Better than TCP, though, since it does not respond to each and every loss event.)
Router queue size stays smaller compared to TCP due to less burstiness.
The increase algorithm relies on bandwidth estimation, so it may not be suitable for links with a large number of concurrent flows.
[1] http://www.slac.stanford.edu/grp/scs/net/talk03/pfld-feb04.ppt
GTP: Group Transport Protocol
Motivated by the following observations about lambda grids:
Very high speed (1 Gig, 10 Gig, etc.) dedicated links connecting a small number of end points (e.g., 10^3, not 10^8), with possibly long delays (e.g., 60 ms between experimental sites).
Communication patterns are not necessarily just point-to-point; multipoint-to-point and multipoint-to-multipoint are very likely.
The aggregate capacity of multiple connections could be far greater than the data handling speed of an end system, so end-point congestion is far more likely than network congestion.
GTP: Overview
Receiver-driven (dumb sender, very smart receiver)
Request-response data transfer model
Rate-based explicit flow control
Receiver-centric max-min fair allocation across multiple flows (irrespective of individual RTTs)
UDP for data, TCP for the control connection
GTP: Framework (cont.)
Single Flow Controller (SFC): manages sending data packet requests, chooses/requests the sending rate, manages receiver buffer requirements.
Single Flow Monitor (SFM): measures flow statistics such as allocated rate, achieved rate, packet loss rate, RTT estimate, etc., which are used by both the SFC and the CE.
Capacity Estimator (CE): estimates the flow capacity of each individual flow based on statistics from the SFM.
Max-min Fairness Scheduler: estimates the max-min fair share for each individual flow.
GTP: Flow Control and Rate Allocation
Single Flow Controller (SFC):
flow rate adjusted per RTT
loss-proportional decrease and proportional increase for rate adaptation
Capacity Estimator (CE): flow rate adjusted per centralized control interval (default 3 * RTTmax)
exponential increase and loss-proportional decrease
R.X. Wu and A.A. Chien, GTP: Group Transport Protocol for Lambda-Grids, 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, April 2004 (CCGrid 2004).
GTP: Flow Control and Rate Allocation (cont.)
The target rate for each flow is given by a formula (shown in the original slide).
The Max-min Fairness Scheduler adjusts the target flow rates to ensure max-min fairness.
R.X. Wu and A.A. Chien, GTP: Group Transport Protocol for Lambda-Grids, 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, April 2004 (CCGrid 2004).
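The paper's exact formula is in the slide image, but the max-min share itself is the textbook water-filling computation; this sketch (names mine, not GTP's exact algorithm) shows the allocation such a scheduler converges to:

```python
def max_min_share(capacity, demands):
    """Water-filling max-min fair allocation: repeatedly split the remaining
    capacity equally among unsatisfied flows, freezing any flow whose demand
    is met and redistributing its leftover share."""
    alloc = {f: 0.0 for f in demands}
    active = set(demands)
    remaining = capacity
    while active and remaining > 1e-12:
        share = remaining / len(active)
        satisfied = {f for f in active if demands[f] <= alloc[f] + share}
        if not satisfied:
            for f in active:       # everyone is bottlenecked: equal split
                alloc[f] += share
            remaining = 0.0
        else:
            for f in satisfied:    # cap satisfied flows at their demand
                remaining -= demands[f] - alloc[f]
                alloc[f] = demands[f]
            active -= satisfied
    return alloc
```

For example, with a 10-unit receiver capacity and demands of 8, 3, and 2, the small flows get their full demand and the large flow absorbs the rest.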
GTP: Other Details
The current implementation expects in-order delivery. It can be augmented in the future to handle out-of-order packets.
TCP-friendliness is tunable by allocating a fixed share of the total bandwidth for TCP in the CE.
Currently, congestion detection is only loss based. Future work will augment the algorithm to include delay-based congestion detection.
Transition management ensures max-min fairness is maintained even when flows join/leave dynamically.
R.X. Wu and A.A. Chien, GTP: Group Transport Protocol for Lambda-Grids, 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, April 2004 (CCGrid 2004).
GTP: Simulation Results
R.X. Wu and A.A. Chien, GTP: Group Transport Protocol for Lambda-Grids, 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, April 2004 (CCGrid 2004).
GTP: Simulation Results (Cont.)
R.X. Wu and A.A. Chien, GTP: Group Transport Protocol for Lambda-Grids, 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, April 2004 (CCGrid 2004).
GTP: Simulation Results (Cont.)
R.X. Wu and A.A. Chien, GTP: Group Transport Protocol for Lambda-Grids, 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, April 2004 (CCGrid 2004).
GTP: Emulation Results
R.X. Wu and A.A. Chien, GTP: Group Transport Protocol for Lambda-Grids, 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, April 2004 (CCGrid 2004).
GTP: Emulation Results (Cont.)
R.X. Wu and A.A. Chien, GTP: Group Transport Protocol for Lambda-Grids, 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, April 2004 (CCGrid 2004).
GTP: Real Implementation Results
R.X. Wu and A.A. Chien, GTP: Group Transport Protocol for Lambda-Grids, 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, April 2004 (CCGrid 2004).
GTP: Real Implementation Results (Cont.)
R.X. Wu and A.A. Chien, GTP: Group Transport Protocol for Lambda-Grids, 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, April 2004 (CCGrid 2004).
GTP: Real Implementation Results (Cont.)
R.X. Wu and A.A. Chien, GTP: Group Transport Protocol for Lambda-Grids, 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, April 2004 (CCGrid 2004).
Questions ???
Extra Slides
Scatter/Gather DMA
An optimization for improving network stack processing.
Under normal circumstances, data is copied between kernel and application memory.
This is required because the network device drivers read/write contiguous memory locations, whereas applications use mapped virtual memory.
When the NIC drivers are capable of scatter/gather DMA, a scatter/gather list is maintained so that the NICs can do direct reads/writes to the final memory location where the data is intended to go. The scatter/gather data structure makes the memory look contiguous to the NIC drivers.
All protocol processing is done by reference. Eliminating the memory copy has been shown to improve performance dramatically.
In practice, the process is a little more complicated. At the send side, copy-on-write should be enforced so that packets sent out but not yet acknowledged are not overwritten. At the receive side, page borders should be enforced.
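A toy illustration of the core idea: a list of scattered buffers presented as one contiguous byte range, so consumers read "by reference" across segment boundaries instead of first coalescing the data. This is entirely my own construction, not kernel or driver code:

```python
class ScatterGatherList:
    """Presents non-contiguous buffers as one logical byte range."""
    def __init__(self, segments):
        # segments: bytes-like objects scattered in memory
        self.segments = [memoryview(s) for s in segments]

    def __len__(self):
        return sum(len(s) for s in self.segments)

    def read(self, offset, length):
        """Read across segment boundaries without coalescing the buffers."""
        out = bytearray()
        for seg in self.segments:
            if offset >= len(seg):
                offset -= len(seg)  # skip whole segments before the offset
                continue
            take = min(length - len(out), len(seg) - offset)
            out += seg[offset:offset + take]
            offset = 0
            if len(out) == length:
                break
        return bytes(out)
```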
Packet Pair BW Estimation
Two packets of the same size (L) are transmitted back to back.
The bottleneck link capacity (C) is, by definition, smaller than the capacity of all the other links.
The packets face transmission delay at the bottleneck link.
As a result, they arrive at the receiver with a larger inter-packet delay than when they were sent.
This delay can be used to compute the bottleneck link capacity.
(Makes lots of assumptions. Also works only with FIFO queuing.)
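The computation the slide describes, with my own function name and units (bits and seconds):

```python
def bottleneck_capacity_bps(pkt_bits, send_gap_s, recv_gap_s):
    """The bottleneck serializes the second packet behind the first, so the
    pair's dispersion grows to L / C; inverting gives C = L / dispersion.
    (FIFO queuing and no cross traffic assumed, as the slide notes.)"""
    dispersion = max(recv_gap_s, send_gap_s)  # spacing can't shrink below the send gap
    return pkt_bits / dispersion
```

For example, 1500-byte (12,000-bit) packets sent back to back and arriving 120 us apart suggest a bottleneck of roughly 100 Mbps.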