Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger...
-
Upload
chloe-wiggins -
Category
Documents
-
view
216 -
download
0
Transcript of Computer Networks Group Universität Paderborn Computer Networks Chapter 8: Transport layer Holger...
Computer Networks GroupUniversität Paderborn
Computer NetworksChapter 8: Transport layer
Holger Karl
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 2
Goal of this chapter
We will discuss the mechanisms in addition to congestion control that are required to build a reliable, connection-oriented, end-to-end transport protocol
With TCP as the evident case study, intermixed with the presentation
We will draw heavily on the material from the link layer, as there many similar problems are solved in similar fashion – e.g., sliding window for error control
A brief discussion of other transport protocols complements the chapter
Namely, UDP and RPC
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 3
Overview
Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 4
Transport protocols – Services to upper layer
Connection-less and connection-oriented transport service Connection management necessary as auxiliary service – setup
and teardown
Reliable or unreliable transport In-order, all packets
Be a good citizen in the network – perform congestion control
Allow several transport endpoints in a single host Demultiplexing
Support different interaction models Byte stream, messages, remote procedure invocation
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 5
Addressing
Provide multiple service access points to multiplex several applications
SAPs can identify connections or data flows
E.g., “port numbers” Dynamically allocated Predefined for “well-
known services” – port 80 for Web server
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 6
Overview
Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 7
Connection establishment
How to establish a joint context, a connection, between sender and receiver?
Note: only relevant in end-system, network layer assumed to be connection-less
Naïve solution: Sender sends a CONNECTION REQUEST Receiver answers by a CONNECTION
ACCEPTED Sender proceeds once that message is
received
CONNECTIONREQUEST
CONN
ACCEPTED
DATA
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 8
Failure of naïve solution
Naïve solution fails in realistic networks
Where packets can be lost, stored/reorder, and duplicated
Example failure scenario All packets are duplicated and
delayed Due to congestion, errors, ... and
re-routing, e.g. In a banking application, e.g. Result: two independent
transactions performed, where only one was intended
Problem are delayed duplicates
Transfer money CONNECTIONREQUEST
CONN
ACCEPTED
TRANSFER
OK
OK
CONN
ACCEPTED
??? DROP
OK
??? DROP
OK
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 9
Adding an additional handshake
First idea: the sender has to re-confirm to the receiver that it actually wants to set up this connection
! Add a third message to the connection setup phase
Three-way handshake This third message can
already carry data
CONNECTIONREQUEST
CONN
ACCEPTED
CONN CONFIRMED
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 10
Is three-way handshake sufficient?
No, it does not protect against delayed duplicates! Problem: If both the connection request and the connection
confirmation are duplicated and delayed, receiver again has no way to ascertain whether this is fresh or an old copy
Solution: Have the sender answer a question that the receiver asks!
Actually: Put sequence numbers into connection request connection acknowledgement, and connection confirmation
Have to be copied by the receiving party to the other side
Connection only established if the correct number is provided
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 11
Three-way handshake + sequence numbers
Two examples for critical cases (which are handled correctly):
Connection request appears as an old duplicate:
Connection request & confirmation appear as old duplicates:
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 12
Connection setup – further issues
Further problems due to crashing nodes, need some memory
Sequence numbers negotiated here are also used for the following data packets
Terminology for TCP Connection setup – SYN (synchronize) packet Connection accepted – SYN/ACK packet (because both the
previous sequence number is acknowledged and a new sequence number from the receiver is proposed)
Connection confirmation – ACK packet (combined with DATA, as a rule)
Problem: SYN attack – flooding with nonsense SYN packets
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 13
Connection identification in TCP
A TCP connection is setup Between a single sender and a single receiver More precisely, between application processes running on these
systems TCP can multiplex several such connections over the network
layer, using the port numbers as Transport SAP identifiers
A TCP connection is thus identified by a four tuple:
(Source Port, Source IP Address, Destination Port, Destination IP Address)
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 14
Overview
Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 15
Connection release
Once connection context between two peers is established, releasing a connection should be easy
Goal: Only release connection when both peers have agreed that they have received all data and have nothing more to say
I.e., both sides must have invoked a “Close”-like service primitive
It fact, it is impossible Problem: How to be sure that the other peer knows that you know
that it knows that you know … that all data have been transmitted and that the connection can now safely be terminated?
Analogy: Two army problem
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 16
Two army problem
Coordinated attack Two armies form up for an attack against each other One army is split into two parts that have to attack together – alone they
will lose Commanders of the parts communicate via messengers who can be
captured
Which rules shall the commanders use to agree on an attack date? Provably unsolvable if the network can loose messages
How to coordinate?
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 17
Connection release in practice
Two army problem equivalent to connection release But: when releasing a connection, bigger risks can be
taken
Usual approach: Three-wayhandshake again
Send disconnect request (DR), set timer, wait for DR from peer, acknowledge it
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 18
Problem cases for connection release with 3WHS
Lost ACK solved by (optimistic) timer in Host 2
Lost second DR solved by retransmission of first DR
Timer solves (optimistically) case when 2nd DR and ACK are lost
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 19
State diagram of a TCP connection
TCP uses three-way handshake for connection setup and release FIN, FIN/ACK for release Complicated by ability to have
asymmetric, have-open connections and data in transit while close is in progress
(Partial) state diagram Expresses both
“server” and “client”aspects
Each has a separate copy
Legend: event/action, segments all caps, local events normal capitalization
CLOSED
LISTEN
SYN_RCVD SYN_SENT
ESTABLISHED
CLOSE_WAIT
LAST_ACKCLOSING
TIME_WAIT
FIN_WAIT_2
FIN_WAIT_1
Passive open Close
Send/SYNSYN/SYN + ACK
SYN + ACK/ACK
SYN/SYN + ACK
ACK
Close/FIN
FIN/ACKClose/FIN
FIN/ACKACK + FIN/ACK Timeout after two segment lifetimes
FIN/ACK
ACK
ACK
ACK
Close/FIN
Close
CLOSED
Active open/SYN
CLOSED
LISTEN
SYN_RCVD SYN_SENT
ESTABLISHED
CLOSE_WAIT
LAST_ACKCLOSING
TIME_WAIT
FIN_WAIT_2
FIN_WAIT_1
Passive open Close
Send/SYNSYN/SYN + ACK
SYN + ACK/ACK
SYN/SYN + ACK
ACK
Close/FIN
FIN/ACKClose/FIN
FIN/ACKACK + FIN/ACK Timeout after two segment lifetimes
FIN/ACK
ACK
ACK
ACK
Close/FIN
Close
CLOSED
Active open/SYN
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 20
Overview
Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 21
Flow control
Recall: Flow control = prevent a fast sender from overrunning a slow receiver
Similar issue in link and transport layer
In transport layer more complicated Many connections, need to adapt the amount of buffer per
connection dynamically (instead of just simply allocating a fixed amount of buffer space per outgoing link)
Transport layer PDUs can differ widely in size, unlike link layer frames
Network’s packet buffering capability clouds the picture
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 22
Flow control – buffer allocation
In order to support outstanding packets, the sender either Has to rely on the receiver to process packets as they come in
(packets must not be reordered) – unrealistic, or Has to assume that the receiver has sufficient buffer space
available
The more buffer, the more outstanding packets Necessary to obtain highly efficient transmission, recall bandwidth-
delay product!
How does sender have buffer assurance? Sender can request buffer space Receiver can tell sender: “I have X buffers available at the
moment” For sliding window protocols: Influence size of sender’s send window
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 23
Example of flow control with ACK/permit separation
Arrows show direction of transmission, “…” indicates lost packet Potential deadlock in step 16 when control PDU is lost and not
retransmitted
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 24
Flow control – permits and acknowledgements
Distinguish Permits (“Receiver has buffer space, go ahead and send more
data”) Acknowledgements (“Receiver has received certain packets”)
Should be separated in real-world protocols!
Can be combined with dynamically changing buffer space at the received
Due to, e.g., different speed with which the application actually retrieves received data from the transport layer
Example: TCP
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 25
Send and receive buffers in TCP
TCP maintains buffer at Sender, to assist in error control Receiver, to store packets not yet retrieved by application or
received out of order Old TCP implementations used GoBack-N, discarded out-of-order
packets
Sending application
LastByteWrittenTCP
LastByteSentLastByteAcked
Receiving application
LastByteReadTCP
LastByteRcvdNextByteExpected
Sender Receiver
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 26
TCP flow control: Advertised window
In TCP, receiver can advertise size of its receiving buffer Buffer space occupied:
(NextByteExpected-1) – LastByteRead Maximum buffer space available: MaxRcvdBuffer Advertised buffer space (Advertised window):
MaxRcvdBuffer – ((NextByteExpected-1) – LastByteRead)
Recall: Advertised window limits the amount of data that a sender will inject into the network
TCP sender ensures that LastByteSent – LastByteAcked · AdvertisedWindow
Equvialently:
EffectiveWindow = AdvertisedWindow – (LastByteSent – LastByteAcked)
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 27
Nagle’s algorithm – self-clocking and windows
Recall TCP self-clocking: Arrival of an ACK is an indication that new data can be injected into the network
What happens when an ACK for only small amount of data (e.g., 1 byte arrives)?
Send immediately? Network will be burdened by small packets (“silly window syndrome”)
Nagle’s algorithm describes how much data TCP is allowed to send
When application produces data to sendif both available data and advertised window · MSS send a full segmentelse if there is unacked data in flight, buffer new data until
MSS is full else send all the new data now
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 28
Overview
Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 29
Timeout computations
Timeouts necessary to protect against lost packets Timeouts should reflect round-trip time between sender
and receiver Problem: RTTs can be highly variable, range over several orders
of magnitude Has to be adapted dynamically
Simple approach: Keep a running average of RTTs Computed by an autoregressive model:
EstimatedRTTn = EstimatedRTTn-1 + (1-) RTTSamplen
Timeout = 2 EstimatedRTTas a conservative choice
Parameter smoothes estimation ( = 0.8 … 0.9, typically)
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 30
Problem: ACKs do not refer to a transmission!
Simple algorithm described above cannot obtain correct RTT samples if packets have been retransmitted
ACKs refer to data/sequence numbers, not to individual packets!
Two examples: Sender Receiver
Original transmission
ACK
Sam
pleR
TT
Retransmission
Sender Receiver
Original transmission
ACK
Sam
pleR
TT
Retransmission
Solution: Do not take RTT samples for retransmitted packets (Karn/Partridge algorithm)
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 31
Problem: Variance not considered
Previous estimation does not consider variance of RTTs If low, estimate is good; no need to set timeout = 2 * RTT If high, be more careful when computing timeout from RTTs;
estimate is likely to be of low quality
Jacobsen/Karels algorithm for new estimation: Differencen = SampleRTTn – EstimatedRTTn-1
EstimatedRTTn = EstimatedRTTn-1 + ( Differencen)
Deviationn = Deviationn-1 + (|Difference|n – Deviationn-1)
Timeoutn = EstimatedRTTn + Deviationn
2 (0,1) a constant, typically = 1, = 4 (in TCP) Deviation is an upper bound of standard deviation, but easier to
compute
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 32
Timer and packet loss
What happens after a packet loss? Recall Ethernet behavior: Become more and more careful, back off Same idea here: After packet loss detected by timeout, use
successively larger timeout values Exponential backoff: Double timeout value for each additional
retransmission
The estimation of RTT and “basic” timeout value is performed as described on previous slide
The multiplicative factor for exponential backoff is reset upon ACK arrival
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 33
Timeouts & timer granularity
Timeout calculation can suffer from coarse-grained timer resolution in the operating system
Example: some Unixes only have 500 ms resolution
Also: firing a timer heavily depends on relatively precise timer interrupts to be available
Sometimes, timeouts can happen only seconds after a packet has been transmitted – far too sluggish
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 34
Overview
Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 35
TCP – Summary
TCP consists of Reliable byte stream – error control via GoBack-N
or Selective Repeat (depending on version) Congestion control – window-based, AIMD, slow start, congestion
threshold Flow control – advertised receiver window Connection management – three-way handshake for setup and
teardown
TCP is perhaps the single most complicated and subtle protocol in the internet
Many little details and extensions are not discussed here Interaction of TCP with other layers is more complicated than it
looks (e.g., wireless) because of hidden, implicit assumptions
pRTT
MSSBW
*
*93.0
• max. TCP BandWidth• Max. Segment Size• Round Trip Time• loss probability
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 36
TCP fairness & TCP friendliness
TCP attempts to Adjust dynamically to the available bandwidth Fairly share this bandwidth among all connections I.e.: If n connections share a given bottleneck link, each
connection obtains 1/n of its bandwidth
Interaction with other protocols Bottleneck bandwidth depends on load of other protocols as well,
e.g., UDP – which is NOT congestion-controlled I.e., UDP traffic can “push aside” TCP traffic
Consequence: Transport protocols should be TCP friendly They should not consume more bandwidth than a TCP connection
in a comparable situation
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 37
TCP header
Options (variable)
Data
Checksum
SrcPort DstPort
HdrLen 0 Flags
UrgPtr
AdvertisedWindow
SequenceNum
Acknowledgment
0 4 10 16 31
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 38
Overview
Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 39
A simple demultiplexer transport protocol – UDP
An example for an unreliable, datagram protocol: User Datagram Protocol (UDP)
Only real function: (de)multiplex several data flows onto the IP layer and back to the applications
In addition: ensures packet’s correctness Checksum out of UDP header + data + “pseudoheader” (pivotal IP
header fields)
SrcPort DstPort
Checksum Length
Data
0 16 31
UDP header
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 40
Remote Procedure Call (RPC)
UDP and TCP have essentially simple semantics: Transport data from A to B
A bit like “goto” in sequential languages: no structure in data flow
More complex interactions can be directly captured in communication primitives
Example: Invocation of a procedure as a primitive, as opposed to two separate gotos
! Remote procedure call Remote object invocation is really the same thing from a
communication perspective
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 41
Remote procedure call – Structure
Important goal: transparency to caller/callee Achieved (partially) by stubs
Client Server
Request
Reply
Computing
Blocked
Blocked
Blocked
Caller(client)
Clientstub
RPCprotocol
Returnvalue
Arguments
ReplyRequest
Callee(server)
Serverstub
RPCprotocol
Returnvalue
Arguments
ReplyRequest
Network
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 42
Overview
Services & addressing Connection setup Connection release Flow control Timer and timeouts TCP summary UDP & RPC Performance issues
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 43
A few words on performance
Performance is still important … Measuring performance
Measure relevant metrics Make sure that the sample size is big enough Make sure that samples are representative Be careful when using a coarse-grained clock Be sure that nothing unexpected is going on during your tests Caching can wreak havoc with measurements Understand what your are measuring and what is going on Be careful about extrapolating results Use a proper experimental design to finish in one human lifetime
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 44
System design for better performance
CPU speed is more important than network speed Or interfaces like NIC $ CPU
Reduce packet count to reduce software overhead Minimize context switches Minimize copying You can buy more bandwidth but not lower delay Avoiding congestion is better than recovering from it Avoid timeouts Optimize for the common case Depending on network: design for speed, not utilization
Think about the right performance metric, it can be, e.g., energy efficiency rather than throughput or delay
WS 05/06, v 1.1 Computer Networks - Ch. 8 - Transport layer 45
Conclusion
Transport protocols can be anything from trivial to highly complex, depending on the purpose they serve
They determine to a large degree the dynamics of a network and – in particular – its stability
It is trivial to build “faster” than TCP protocols, but they are unstable
The interdependencies of various mechanisms in a transport protocols can be very subtle with big consequences
E.g., fairness, coexistence of different TCP versions