The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first...
![Page 1: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/1.jpg)
1
The aim of this unit is to review the main concepts related to the TCP and UDP transport
protocols, as well as application protocols. These concepts are important requirements for
developing programs that communicate through an IP network. They are also important for
understanding the operation of Proxy and NAT, as well as the operation of packet filters,
firewalls and other security mechanisms that will be covered later in this course.
![Page 2: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/2.jpg)
2
The TCP/IP architecture consists of three layers: Application, Transport and Network. The
protocols used in the TCP/IP architecture are standardized and published by an entity
called the IETF (Internet Engineering Task Force). Documents generated by the IETF are
called RFCs (Requests for Comments) and describe in detail the operation of the protocols.
All RFCs are freely accessible at www.ietf.org. Lower layers (Data Link and
Physical) are not considered part of the TCP/IP architecture as they are defined by other
entities (usually the IEEE - Institute of Electrical and Electronics Engineers). The TCP/IP
architecture defines two transport protocols: TCP (Transmission Control Protocol) and
UDP (User Datagram Protocol).
Because they belong to the same layer, TCP and UDP are alternatives: a given communication
uses one or the other, not both at the same time. Both protocols are implemented by the
operating system. This greatly simplifies the development of applications running on the
network, because the details of each protocol can be hidden from the application.
It is up to the application to decide which transport protocol will be used. This is done
through a standard interface with the operating system called "sockets". This interface
defines a set of APIs (standard function calls) for mapping applications to port numbers
and for sending and receiving packets. The choice of the transport protocol depends heavily
on the goals of the application, as will be discussed later in this unit.
![Page 3: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/3.jpg)
3
Before starting the discussion regarding the differences between the TCP and UDP protocols,
let's discuss their similarities. The goal of both protocols is to provide a mechanism to
address processes in an operating system. As we have seen earlier in this course, this is
done by using 16-bit addresses, called port numbers.
The way ports are mapped to processes is defined by the socket interface. A process
(user-level application) may or may not choose a port number when it starts, using an API
call named BIND. If the process starts communicating without the BIND call, a random port
(usually between 1024 and 65535) is assigned by the operating system. The operating system
ensures that a port is assigned to only one of the processes that communicate through the
same interface (IP address).
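The difference between an explicit bind and an automatic port assignment can be observed with a few lines of Python. This is only a minimal sketch over the loopback address: the use of UDP, the address 127.0.0.1 and the variable names are assumptions made for the example. The server binds to port 0 (which asks the O.S. for any free port) only to avoid colliding with a port already in use; a real server would pass its fixed port number instead.

```python
import socket

# Server side: performs a bind. Port 0 is an assumption for this demo; it
# asks the O.S. for any free port so the example never collides with a
# port already in use. A real server would bind to its fixed port.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
server.settimeout(2.0)
server_port = server.getsockname()[1]

# Client side: no bind call. The O.S. assigns a port when the socket is
# first used to send data.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"hello", ("127.0.0.1", server_port))
client_port = client.getsockname()[1]

data, sender = server.recvfrom(1024)
print("server port:", server_port)
print("client port chosen by the O.S.:", client_port)
print("source port seen by the server:", sender[1])

client.close()
server.close()
```

The source port seen by the server is the one the O.S. picked for the client, which is how the server knows where to send its reply.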
Usually, a client process does not perform a bind operation. Server processes, on the
other hand, always do a bind because the port number cannot be random (since clients have
to address them). The port used by a server process depends on the type of application
it represents. For standard Internet applications (such as web, email, and others), it
belongs to the range of well-known ports (0-1023), which on Unix-like systems can only be
bound by processes with “root” privileges. Otherwise, it belongs to the range of registered
ports (1024-49151), used by applications that do not require root privileges and are often
proprietary to specific vendors (such as Database Management Systems).
![Page 4: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/4.jpg)
4
The TCP and UDP protocols are very different. UDP is very simple, and provides virtually
nothing beyond the service of addressing processes through port numbers. On the other
hand, TCP is a very sophisticated protocol that performs various operations for an
application, such as automatic confirmation of reception and automatic retransmission of
lost packets.
The first difference between TCP and UDP refers to the presence or absence of a connection.
A connection is established by an exchange of control packets between a client and the
server, and occurs before the first data packet is transmitted. TCP uses control packets
to create, monitor, and terminate connections. A connection is a fundamental requirement
for many of the services offered by TCP. UDP transmits data packets only.
The second difference refers to the way data is divided into packets. In TCP, an
application does not need to control how much data can fit in one packet. It can simply
transmit the data as a continuous flow (stream) of bytes, because the operating system
(O.S.) decides when there are enough bytes to create packets. The O.S. on the receiver
reassembles packets transparently to the application, which receives the data as a stream
of bytes. In the case of UDP, it is up to the application to provide to the O.S. an amount
of data that fits in a packet.
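The contrast between the two models can be sketched with loopback sockets in Python. The message contents, buffer sizes and variable names are assumptions made for the illustration. With UDP, every send produces one datagram and every receive returns one datagram. With TCP, the receiver only sees a sequence of bytes, and how many `recv` calls are needed to read it is decided by the O.S., not by the sender.

```python
import socket

# UDP: each sendto() creates one datagram; each recvfrom() returns one.
udp_rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_rx.bind(("127.0.0.1", 0))
udp_rx.settimeout(2.0)
udp_tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for message in (b"first", b"second"):
    udp_tx.sendto(message, udp_rx.getsockname())
udp_messages = [udp_rx.recvfrom(2048)[0] for _ in range(2)]
udp_tx.close()
udp_rx.close()

# TCP: two sendall() calls feed a single stream of bytes.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
tcp_tx = socket.create_connection(listener.getsockname())
tcp_rx, _ = listener.accept()
tcp_rx.settimeout(2.0)
tcp_tx.sendall(b"first")
tcp_tx.sendall(b"second")
tcp_tx.close()  # closing lets the reader detect the end of the stream

chunks = []
while True:
    chunk = tcp_rx.recv(2048)
    if not chunk:  # empty result: the peer closed the connection
        break
    chunks.append(chunk)
tcp_stream = b"".join(chunks)
tcp_rx.close()
listener.close()

print("UDP messages:", udp_messages)
print("TCP stream:", tcp_stream, "read in", len(chunks), "recv call(s)")
```

The number of `recv` calls on the TCP side may vary between runs, which is precisely the point: message boundaries are not preserved in a stream.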
The third difference refers to the control of reception and retransmission of lost packets. In
the case of TCP, packets are confirmed by the receiver. If they are not confirmed, they are
automatically retransmitted by the O.S., without intervention of the application. In the case
of UDP, the detection and retransmission of lost packets, when required, must be
performed by the application.
![Page 5: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/5.jpg)
5
TCP implements two sophisticated mechanisms not implemented by UDP: flow control and congestion control.
Flow control is an automatic packet rate adjustment performed by the transmitter, which reduces its transmission rate to prevent packet loss at the receiver due to buffer overflow. This is necessary when the receiver is not able to read the packets at the rate sent by the transmitter. The packets are first received by the O.S. (operating system) and stored in a buffer. If the application does not read the bytes from the buffer fast enough, the result may be buffer overflow.
Congestion control is also a packet rate adjustment, but caused by packet loss in the network. When TCP detects packet loss, it assumes that routers had to drop packets because the network is congested. This mechanism was implemented in the early days of the Internet when it was realized that the automatic retransmission of packets without congestion control could lead to a fast collapse of the network.
The additional features offered by TCP over UDP have a cost: TCP only supports unicast transmissions. That is, you cannot use broadcast or multicast addresses on TCP connections. This happens, among other reasons, because TCP packets need to be confirmed by the recipient. Confirmation is not possible when generic destination addresses are used, because the sender does not know how many recipients it must wait for. All applications that need to transmit broadcast or multicast packets need to use UDP. TCP is also more costly in terms of CPU usage by the O.S. and in terms of the volume of data transmitted over the network. It is not recommended for applications that transmit only a few packets, or messages that cannot be delayed.
![Page 6: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/6.jpg)
6
The TCP PDU (protocol data unit) is called a segment. The TCP header is shown in the
figure. In addition to the source and destination port numbers, the remaining fields of
the TCP header are related to the functions provided by the protocol.
The Sequence Number and Acknowledgement Number fields are related to the mechanism
of reliable transmission.
The Receive Window field is related to the mechanism of flow control. The Flags field
contains a set of control bits used to control the TCP connection and also the reliable
transmission process.
The Urgent Pointer field is rarely used in practice. It allows you to tell the receiver that
some data should be processed with higher priority, passing in front of other data that is
already buffered waiting for processing. The Options field is not required and is often
omitted from the TCP header. The TCP header has a variable size because the Options field
is optional. Therefore, the HLEN field defines the size of the header in 4-byte words.
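To make the layout of the fixed part of the header more concrete, the sketch below decodes its 20 bytes with Python's `struct` module. The function name `parse_tcp_header` and the sample values are hypothetical, chosen only for this illustration; the field order and the bit positions of the FIN, SYN and ACK flags follow the standard TCP header layout. Options, when present, are not decoded.

```python
import struct

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the 20 fixed bytes of a TCP header (options are not decoded)."""
    (src_port, dst_port, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    hlen_words = offset_flags >> 12   # HLEN: header length in 4-byte words
    flags = offset_flags & 0x01FF     # the low bits hold the control flags
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_bytes": hlen_words * 4,
        "FIN": bool(flags & 0x01),
        "SYN": bool(flags & 0x02),
        "ACK": bool(flags & 0x10),
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }

# Hypothetical SYN segment built by hand: HLEN = 5 words (no options).
sample = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(sample)
print(header)
```

An HLEN of 5 corresponds to the minimum header of 20 bytes; any value above 5 indicates that options are present.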
![Page 7: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/7.jpg)
7
When using TCP, the decision of when a segment is created and transmitted is made by the
protocol and not by the application. This strategy is called flow (streaming)
transmission. Using the Sockets API, the application sends a continuous stream of bytes to
the operating system (O.S.). Each “send” API call does not necessarily generate a packet.
TCP may wait for a reasonable number of bytes in the transmission buffer, to avoid
generating too many small packets.
The packet size is a compromise between minimizing the number of packets transmitted
and not causing excessive delay when the volume of data to be transmitted is small.
Theoretically, the maximum size of an IP packet is 64 Kbytes, so the largest TCP payload
would be this value less the sizes of the IP and TCP headers. In principle, this would be
about the amount of data that TCP should wait for before generating a packet.
In modern operating systems, however, the number of bytes that TCP accumulates is defined
according to the MTU of the network adapter (1500 bytes for Ethernet). This is done to
prevent packets from being fragmented by the IP layer. The maximum size of a segment is
called MSS (Maximum Segment Size). It corresponds to 1460 bytes (1500 bytes - 20 bytes of
IP header - 20 bytes of TCP header) on Ethernet.
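The arithmetic above can be written as a small Python helper. The function names are hypothetical, and the header sizes assume IPv4 and TCP headers without options, as in the text. The second function estimates how many segments are needed for a given amount of application data.

```python
IP_HEADER_BYTES = 20    # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options

def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IP packet."""
    return mtu - IP_HEADER_BYTES - TCP_HEADER_BYTES

def segments_needed(data_bytes: int, mss: int) -> int:
    """Number of segments required to carry data_bytes (ceiling division)."""
    return -(-data_bytes // mss)

ethernet_mss = mss_for_mtu(1500)
print("MSS on Ethernet:", ethernet_mss)
print("Segments for 3000 bytes:", segments_needed(3000, ethernet_mss))
```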
![Page 8: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/8.jpg)
8
In order to make the process of segmentation and reassembly transparent to applications,
the TCP header includes the information necessary for the O.S. to reassemble the data at
the receiver in the same order it was sent by the transmitter. A TCP connection is
identified by four addresses: Source IP, Source Port Number, Destination IP, and
Destination Port Number.
As illustrated in the figure, after establishing a TCP connection, all segments are numbered
using the "Sequence Number" field. The sequence number identifies the first byte in a
segment, but the first segment in a connection does not start at zero. Instead, an initial
sequence number (ISN) is chosen randomly. A different ISN is chosen for each connection.
If the same pair of computers (A, B) terminates a connection and starts another
immediately, another ISN is used. The ISN is also unidirectional, i.e., one ISN is
used for the packet flow from A to B and another for the flow from B to A.
The justification for using a random ISN is to avoid erroneous packet reassembly when a
connection terminates and is immediately re-established using the same port numbers.
Without a random ISN, due to network delay, packets of the previous connection could
be confused with the packets of the new connection.
![Page 9: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/9.jpg)
9
TCP implements a reliable communication process called "retransmission in absence of
confirmation". As the figure shows, TCP uses the sequence number (SEQ) and confirmation
number (CONF) fields to implement this strategy. The SEQ field always indicates the
first byte of the segment being transmitted. The CONF field indicates the next byte that the
sender expects to receive from its peer. The CONF field has the implied meaning of
confirming the receipt of all bytes preceding the CONF number. That is, if a peer sends a
segment with CONF=2000, it acknowledges the receipt of all bytes up to 1999.
TCP does not need to send control packets only to confirm the receipt of data. It uses
a strategy in which the same segment used to transmit new data also confirms the bytes
already received. To illustrate this concept, assume that a client has to transmit two
segments: segment 1 (bytes 1000-1499) and segment 2 (bytes 1500-1799). The server also
has to transmit two segments: segment A (bytes 2000-2099) and segment B (bytes
2100-2989). The client transmits the first segment (500 bytes) with SEQ=1000 and
CONF=2000. This means it is confirming to the server the receipt of all bytes up to 1999.
The server responds with SEQ=2000 and CONF=1500. The SEQ field exactly matches the
next byte expected by the client and the CONF field confirms the receipt of the bytes
corresponding to segment 1. The process continues as indicated in the figure.
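The exchange can be reproduced with a short Python sketch. It assumes that client and server strictly alternate (one segment each) and that no segment is lost; the order shown in the figure may differ. The function name `exchange` and the tuple layout are hypothetical.

```python
def exchange(client_segments, server_segments):
    """Return (sender, SEQ, CONF) for each segment of an alternating exchange.

    Each segment is given as (first_byte, last_byte). Assumes no loss and
    that client and server take turns sending one segment each.
    """
    log = []
    # Next byte each side expects to receive from its peer.
    client_expects = server_segments[0][0]
    server_expects = client_segments[0][0]
    for c_seg, s_seg in zip(client_segments, server_segments):
        log.append(("client", c_seg[0], client_expects))
        server_expects = c_seg[1] + 1   # server received up to c_seg[1]
        log.append(("server", s_seg[0], server_expects))
        client_expects = s_seg[1] + 1   # client received up to s_seg[1]
    return log

log = exchange(
    client_segments=[(1000, 1499), (1500, 1799)],
    server_segments=[(2000, 2099), (2100, 2989)],
)
for sender, seq, conf in log:
    print(f"{sender}: SEQ={seq} CONF={conf}")
```

Under these assumptions, the last segment sent by the server carries CONF=1800, confirming that both client segments were received.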
![Page 10: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/10.jpg)
10
The retransmission technique used by TCP is based on positive acknowledgment with
timeout. TCP has no error messages. If an acknowledgment does not arrive at the
transmitter within a given time, the segment is retransmitted. The receiver can send packets
without data, only with confirmation, when it has nothing to transmit. The maximum time
to wait for an acknowledgment is estimated based on the average Round-Trip Time (RTT)
to send and confirm a segment. The transmitter can adopt several techniques to estimate
the RTT. A common strategy is as follows:
EstimatedRTT = 0.875 * EstimatedRTT + 0.125 * SampleRTT
Deviation = 0.875 * Deviation + 0.125 * |SampleRTT - EstimatedRTT|
Timer = EstimatedRTT + 4 * Deviation
where:
SampleRTT: last measure of RTT
Timer: maximum time to wait for a confirmation
Deviation: a measure of the fluctuation of the RTT
The receiver does not confirm segments received out of order. Instead, for each segment
received out of order, the receiver repeats the confirmation number of the last segment
received in the correct sequence. If the transmitter receives three duplicate confirmations
with the same acknowledgment number, it retransmits the segment that starts at that number
without waiting for the retransmission timer to expire. This technique is
called fast retransmission (retransmission before timeout of the retransmission timer). The
figure presents a summary of key recommendations on the operation of TCP, described in
RFCs 1122 and 2581.
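The behavior of the receiver and the trigger for fast retransmission can be simulated in a few lines of Python. This is a simplified sketch with hypothetical function names: it assumes that the receiver keeps out-of-order segments in a buffer until the gap is filled, and that the transmitter reacts to the third duplicate of a confirmation number.

```python
def receiver_confirmations(arrivals, first_expected):
    """Confirmation number sent after each arriving (first_byte, length)."""
    expected = first_expected
    buffered = {}  # out-of-order segments, kept until the gap is filled
    confirmations = []
    for first_byte, length in arrivals:
        if first_byte >= expected:
            buffered[first_byte] = length
        while expected in buffered:      # deliver everything now in sequence
            expected += buffered.pop(expected)
        confirmations.append(expected)
    return confirmations

def fast_retransmit_index(confirmations, duplicates_needed=3):
    """Index of the confirmation that triggers fast retransmission, or None."""
    duplicates = 0
    for i in range(1, len(confirmations)):
        if confirmations[i] == confirmations[i - 1]:
            duplicates += 1
        else:
            duplicates = 0
        if duplicates == duplicates_needed:
            return i
    return None

# Segments of 500 bytes; the one starting at 1500 is lost and arrives last,
# after being retransmitted.
arrivals = [(1000, 500), (2000, 500), (2500, 500), (3000, 500), (1500, 500)]
confirmations = receiver_confirmations(arrivals, first_expected=1000)
trigger = fast_retransmit_index(confirmations)
print("confirmations:", confirmations)
print("fast retransmission triggered at confirmation index:", trigger)
```

Once the missing segment arrives, the confirmation number jumps directly to the end of the buffered data, so the segments that arrived out of order do not need to be sent again in this model.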
![Page 11: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/11.jpg)
11
ACK, SYN and FIN bits (defined in the FLAGS field of the TCP header) are used to
control the opening and closing of TCP connections. The ACK flag is the confirmation of
receipt. It is always ZERO in the first segment sent by the client (because there is nothing
to confirm) and 1 everywhere else. The SYN flag controls the ISN synchronization. It is
ONE in the first two segments exchanged between the client and the server, and zero
everywhere else. The FIN flag is the termination flag. It is ONE to indicate that a
connection must be terminated.
The beginning of a TCP connection defines the ISN (Initial Sequence Numbers) used by
the client and the server. This involves the exchange of three segments:
1) The client sends a request to open a connection (SYN segment). This segment defines
the initial value of the sequence number of the client (C_ISN), and is identified by the
flags SYN=1 and ACK=0.
2) The server confirms the connection (SYNACK segment). This segment reports the ISN
of the server (S_ISN), and is identified by the flags SYN=1 and ACK=1.
3) The client sends the confirmation of receipt of SYNACK segment. After this stage,
data can be exchanged indefinitely between client and server. Note that during the
exchange of data SYN=0 and ACK=1.
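The three segments can be listed with a small Python sketch. The function name and the dictionary layout are hypothetical. One detail not stated above, but part of standard TCP behavior, is that the SYN flag consumes one sequence number, so each side confirms the ISN of its peer plus one.

```python
import random

SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit values and wrap around

def three_way_handshake(c_isn=None, s_isn=None):
    """Return the three segments that open a connection (no data carried)."""
    c_isn = random.getrandbits(32) if c_isn is None else c_isn
    s_isn = random.getrandbits(32) if s_isn is None else s_isn
    return [
        # 1) SYN: the client announces C_ISN; nothing to confirm yet.
        {"from": "client", "SYN": 1, "ACK": 0, "seq": c_isn, "conf": None},
        # 2) SYNACK: the server announces S_ISN and confirms C_ISN.
        {"from": "server", "SYN": 1, "ACK": 1, "seq": s_isn,
         "conf": (c_isn + 1) % SEQ_SPACE},
        # 3) The client confirms S_ISN; data exchange can start.
        {"from": "client", "SYN": 0, "ACK": 1, "seq": (c_isn + 1) % SEQ_SPACE,
         "conf": (s_isn + 1) % SEQ_SPACE},
    ]

# Fixed ISNs are used here only to make the output reproducible.
segments = three_way_handshake(c_isn=100, s_isn=300)
for segment in segments:
    print(segment)
```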
A connection may be closed by the initiative of the client or the server. A TCP connection
is bidirectional, so closing a connection requires both client and server to send termination
requests. Closing a connection needs four segments. In the example, the client initiates the
procedure to close a connection. The client sends a segment with FIN=1. The server
![Page 12: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/12.jpg)
12
The reception process is transparent to the application, since TCP is implemented by the
O.S. When a TCP segment is transmitted, it is first received by the operating system (O.S.)
and stored in a buffer. However, this buffer has limited capacity. The receiver application
must be able to remove the data from the buffer at a rate compatible with the rate of the
transmitter. If the receiver is very slow (or the application is poorly written), the receiver
buffer may overflow. When the buffer is full, the O.S. discards all further segments received.
In this condition, the automatic retransmission of lost packets will probably worsen the
situation for both the receiver and the network.
Flow control is a TCP mechanism that prevents this from happening. According to this
mechanism, the receiver reports, along with every segment confirmation, the amount of
buffer space that it still has available, using the Receive Window (RcvWindow) field of
the TCP header.
To illustrate how flow control works, consider the scenario of the figure, where computer A
is the transmitter and computer B is the receiver. The algorithm for calculating the Receive
Window uses three parameters:
RcvBuffer = size of the reception buffer of B
LastByteRead = last byte read by the B application
LastByteRcvd = last byte received by the B O.S.
The receive window sent from B to A is defined as:
RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
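As a quick numerical check of this expression, the Python sketch below computes the window for two hypothetical situations (the buffer size and byte counters are made-up values): a buffer that is partially occupied, and a buffer that is completely full.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Free space (in bytes) that B advertises to A."""
    bytes_waiting = last_byte_rcvd - last_byte_read  # received but not yet read
    return rcv_buffer - bytes_waiting

# Hypothetical 4096-byte buffer.
partially_full = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
completely_full = rcv_window(rcv_buffer=4096, last_byte_rcvd=5096, last_byte_read=1000)
print("window with 2000 bytes waiting:", partially_full)
print("window with a full buffer:", completely_full)
```

A window of zero tells A to stop sending new data until the B application reads from the buffer.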
![Page 13: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/13.jpg)
13
In practice, TCP requires another window which limits the transmission rate. This window
is called Congestion Window (CongWin). Unlike the Receive Window, the CongWin does
not have a corresponding field in the TCP header. It is calculated internally by the O.S. of
the transmitter based on the success or failure of the segment transmissions. Every time a
segment is lost, TCP assumes that the network is congested and tries to reduce the rate of
the transmitter. When segments are transmitted successfully, the CongWin is
increased. The CongWin is calculated in multiples of MSS (Maximum Segment Size =
1460 bytes).
The figure illustrates how the CongWin evolves over time. Initially, the window is set to 1
MSS and it grows by 1 MSS for every segment successfully confirmed, which doubles the
window at every RTT. This process is called “exponential growth” and continues until a
certain Threshold is reached. From that point, the CongWin enters a “congestion
avoidance” phase, where the growth is slower (just 1 MSS per RTT). In case of failure,
the CongWin and Threshold are reduced by half.
The algorithm to compute the CongWin can be summarized as follows:
a) Initialization:
CongWin = 1 MSS (Maximum Segment Size = 1460 bytes)
Threshold = 65 Kbytes
b) Exponential Growth Phase:
At each successful segment acknowledgment:
if CongWin < Threshold: CongWin = CongWin + MSS
i.e., CongWin = CongWin * 2 per RTT
Otherwise go to congestion avoidance: CongWin = CongWin + MSS * (MSS/CongWin)
i.e., CongWin = CongWin + 1 MSS per RTT
c) When a segment is received out of order:
![Page 14: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/14.jpg)
14
At a given instant, the maximum transmission rate is given by the smallest window defined
by flow control and congestion control. This maximum transmission rate is calculated as
follows:
max-rate = [ min(CongWin, RcvWindow) - (LastByteSent - LastByteAcked) ] / RTT   (bytes/s)
where :
LastByteSent : last byte sent by the transmitter
LastByteAcked : last byte confirmed by receiver
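The expression can be evaluated with the Python sketch below. The function name and all numeric values are hypothetical. Clamping the result at zero is an assumption of this sketch: it represents the case where the transmitter already has more unconfirmed bytes in flight than the smallest window allows, so it cannot send anything new.

```python
def max_rate(cong_win, rcv_window, last_byte_sent, last_byte_acked, rtt_seconds):
    """Upper bound (bytes/s) on what the transmitter may still send."""
    in_flight = last_byte_sent - last_byte_acked   # sent but not yet confirmed
    usable = min(cong_win, rcv_window) - in_flight
    return max(0, usable) / rtt_seconds            # assumption: never negative

# Hypothetical values: the receive window is the limiting factor here.
rate = max_rate(cong_win=14600, rcv_window=8760,
                last_byte_sent=5000, last_byte_acked=2080, rtt_seconds=0.5)
print("maximum rate:", rate, "bytes/s")
```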
TCP considers two types of failures: “segments lost” (i.e., the receiver does not send any
acknowledgment) and “segments out of order” (i.e., the receiver sends duplicate
acknowledgments). The first event is considered more severe than the latter, because
out-of-order segments mean that some segments are still being received.
There are some variations in the implementation of TCP that differ in the way the
congestion control mechanism reacts to these failure events.
The Tahoe version is the oldest, and returns to slow start (CongWin = 1 MSS) for any type
of failure event.
The Reno version is more recent, and performs a fast recovery (CongWin = CongWin/2) in
the case of out-of-order segments, and a slow start (CongWin = 1 MSS) in the case of lost
segments.
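The difference between the two versions can be visualized with a simplified simulation in Python. It is only a sketch under several assumptions: the window is tracked once per RTT in units of MSS, the exponential growth is capped at the Threshold, the Threshold is set to half of the CongWin at every failure, and the event names ("ok", "dup", "timeout") and the initial Threshold of 8 MSS are hypothetical choices for the example.

```python
def simulate(events, variant, threshold=8.0):
    """Evolution of CongWin (in MSS) with one event per RTT.

    events: "ok" (all segments confirmed), "dup" (out-of-order segments,
    i.e. duplicate confirmations) or "timeout" (segments lost).
    variant: "tahoe" or "reno".
    """
    cong_win = 1.0
    history = []
    for event in events:
        if event == "ok":
            if cong_win < threshold:
                cong_win = min(cong_win * 2, threshold)  # exponential growth
            else:
                cong_win += 1                            # congestion avoidance
        else:
            threshold = max(cong_win / 2, 1.0)
            if event == "dup" and variant == "reno":
                cong_win = threshold                     # fast recovery
            else:
                cong_win = 1.0                           # back to slow start
        history.append(cong_win)
    return history

events = ["ok", "ok", "ok", "ok", "dup", "ok", "ok"]
tahoe = simulate(events, "tahoe")
reno = simulate(events, "reno")
print("Tahoe:", tahoe)
print("Reno: ", reno)
```

After the out-of-order event, Reno resumes from half of the previous window, while Tahoe has to grow again from 1 MSS. For a "timeout" event both versions behave the same way in this model.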
![Page 15: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/15.jpg)
15
The UDP (User Datagram Protocol) header is much simpler than the TCP header because UDP
does not offer any functionality beyond port number addressing. The PDU of UDP is called a
datagram, which is also a frequent synonym for packet. The "User" in its name indicates
that packets are created at the user (application) level.
Despite having a field for error checking (CheckSum), UDP offers no confirmation service
to the transmitter, nor retransmission of lost packets. It does not use connections, and is
unable to segment and reassemble data in a way that is transparent to the application
level. That does not mean it is not possible to develop applications over UDP that are
reliable or capable of transmitting large volumes of data. It simply means that these
additional features must be embedded in the application level, because they are not offered
by the operating system. For example, NFS (Network File System), which allows Unix systems
to share directories over the network, was originally built on UDP.
In many cases, developers choose not to use the features of TCP due to performance issues.
This is particularly true for delay and jitter sensitive applications (i.e., real-time
applications). For example, in VoIP (Voice over IP) it is not worth retransmitting lost
packets, since the VoIP terminal has limited ability to wait for and re-order late packets.
UDP is also essential for applications that need to transmit messages to multicast or
broadcast addresses, because TCP only supports unicast mode.
![Page 16: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/16.jpg)
16
Code sharing at the application layer is much harder than in transport layer. While TCP,
UDP and IP protocols offer the possibility of reusing the same code among different
applications, application protocols are too specialized to be shared. Thus, it is not worth
implementing application protocols at the operating system level. They are embedded with
the client and server applications.
The purpose of application protocols is to allow communication between programs
developed by different vendors. Application protocols related to IP networks are
standardized in the form of RFCs by the IETF. Many protocols used on the Internet handle
only text messages. In these protocols, the separation between fields is often made by a
line break (\r\n, as in the example below, or just \n). For example, an HTTP message from
a client requesting the “http://espec.ppgia.pucpr.br/~jamhour/welcome.html” page may have
the following format:
GET /~jamhour/welcome.html HTTP/1.1\r\n
Host: espec.ppgia.pucpr.br\r\n
Cache-Control: no-cache\r\n
\r\n
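The message above can be assembled in Python as a sequence of text lines joined by \r\n. This is a minimal sketch: the function name `build_get_request` is hypothetical, and the code only builds the bytes that would be written to a TCP connection, without opening one.

```python
def build_get_request(host: str, path: str) -> bytes:
    """Build the bytes of a simple HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Cache-Control: no-cache",
        "",   # empty line: marks the end of the header fields
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

request = build_get_request("espec.ppgia.pucpr.br", "/~jamhour/welcome.html")
print(request)

# The receiver can recover the fields by splitting on the same separator.
fields = request.decode("ascii").split("\r\n")
print(fields[:3])
```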
In text-based protocols, content that includes non-printable characters (such as pictures
or videos) may need to be encoded using algorithms such as base64, as is done for e-mail
attachments. These algorithms are able to encode any binary information into text
characters, so it can be transmitted without breaking the protocol.
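The encoding can be tried with Python's standard `base64` module. The sample bytes are arbitrary values chosen for the illustration; they include characters such as 0, 10 (\n) and 13 (\r), which could be confused with field separators in a text protocol.

```python
import base64

# Arbitrary binary content, including bytes that are not printable text.
binary = bytes([0, 255, 16, 128, 10, 13])

encoded_text = base64.b64encode(binary).decode("ascii")  # safe to send as text
decoded = base64.b64decode(encoded_text)                 # original bytes again

print("text form:", encoded_text)
print("round trip ok:", decoded == binary)
```

The price of this transparency is size: base64 produces 4 text characters for every 3 bytes of input.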
![Page 17: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/17.jpg)
17
As discussed earlier, IANA (Internet Assigned Numbers Authority) defines a standard port
number assignment for TCP and UDP applications. These standard ports are called "Well
Known Ports". The figure illustrates the port numbers associated with some well-known
application protocols. Well-known ports are mostly used on the server side.
On the client side, the port number is dynamic, and it is chosen by the operating system
when the client application requests a connection to the server. There are some peer-to-peer
protocols, such as SMB (Server Message Block), where the client-server paradigm does not
hold, and the fixed port number is used by all peers. For example, wget is a command line
HTTP client used in Linux. When you type the following command, the wget client will
receive a dynamic port that will be associated with it during the transfer of the
“vlan.tar.gz” file:
wget http://espec.ppgia.pucpr.br/~jamhour/vlan.tar.gz
The port number assigned to wget is released after the end of the file download,
when the TCP connection is terminated. When you type the wget command, it is not
necessary to specify which port the HTTP server is listening on. This happens
because the wget client assumes by default that the server is listening on port number
80. However, it is possible to make server applications listen on alternate port numbers.
If the HTTP server “espec” is listening on port number 8080 (not the default), you must
inform the wget client of the server port number as follows:
wget http://espec.ppgia.pucpr.br:8080/~jamhour/vlan.tar.gz
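The rule that a client applies to decide which server port to contact can be sketched with Python's `urllib.parse` module. The function name `server_endpoint` and the `DEFAULT_PORTS` table are hypothetical, and only the default for HTTP mentioned above is included.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80}  # only the default discussed in the text

def server_endpoint(url: str):
    """Return the (host, port) a client would connect to for this URL."""
    parts = urlsplit(url)
    # parts.port is None when the URL does not carry an explicit port.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port

default_case = server_endpoint("http://espec.ppgia.pucpr.br/~jamhour/vlan.tar.gz")
explicit_case = server_endpoint("http://espec.ppgia.pucpr.br:8080/~jamhour/vlan.tar.gz")
print(default_case)
print(explicit_case)
```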
Most server applications in Linux have a name ending with "d". This happens because
applications that run in the background (with no visible user interface) are called daemons
in Linux.
![Page 18: The aim of this unit is to review the main concepts ...jamhour/Download/pub... · The first difference between TCP and UDP refers to the presence or absence of connection. A connection](https://reader033.fdocuments.in/reader033/viewer/2022052017/60304665cc343633951db6e7/html5/thumbnails/18.jpg)
18
In this unit, we reviewed the main concepts related to the transport and application layers
in the TCP/IP architecture. Deeper knowledge about the operation of TCP and UDP is
needed, for example, in network security. Many “attacks” against applications,
such as “stealing a TCP connection”, are performed at the transport level.
Also, information about the TCP flags is used in firewall rules to protect against “port
spoofing” attacks. Knowledge about how port numbers are assigned is also important
for understanding the operation of the Proxy and NAT mechanisms, discussed later in this
course. The concept of “application protocol” is also necessary to understand the
operation of Proxies.