www.ischool.drexel.edu
INFO 330 Computer Networking Technology I
Chapter 3
The Transport Layer
Dr. Jennifer Booker
1INFO 330 Chapter 3
Transport Layer
• The Transport Layer handles logical communication between processes
– It’s the lowest layer that isn’t used for routing between hosts, so it’s the last thing a client process and the first thing a server process sees of a packet
– By logical communication, we mean that the means used to get between processes, and the distance covered, are irrelevant
Transport vs Network
• Notice we didn’t say ‘hosts’ in the previous slide; that’s because
– The network layer provides logical communication between hosts
• Mail analogy
– Let’s assume cousins (processes) want to send letters to each other between their houses (hosts)
– They use their parents (transport layer) to mail the letters, and sort the mail when it arrives
Transport vs Network
– The letters travel through the postal system (network layer) to get from house to house
• The transport layer doesn’t participate in the network layer activities (e.g. most parents don’t work in the mail distribution centers)
– The transport layer protocols are localized in the hosts
– Routing isn’t affected by anything the transport layer added to the messages
Transport vs Network
• Following the analogy, different people might have to pick up and sort the mail; that’s like using different transport layer protocols
• And the transport layer protocols (parents) are often at the mercy of what services the network layer (postal system) provides
– Some services can be provided at the transport layer even if the network layer doesn’t provide them (e.g. reliable data transfer or encryption)
Two Choices
• Here we choose between TCP and UDP
– In the transport layer, a packet is a segment
– In the network layer, a packet is a datagram
• The network layer is home to the Internet Protocol (IP)
– IP provides logical communication between hosts
– IP makes a “best effort” to get segments where they belong: no guarantees of delivery, or delivery sequence, or delivery integrity
IP
• Each host has an IP address
• The common purpose of UDP and TCP is to extend IP’s host-to-host delivery to the host’s processes
– This is called transport-layer multiplexing and demultiplexing
– Both UDP and TCP also provide error checking
• That’s it for UDP: data delivery and error checking!
TCP
• TCP also provides reliable data transfer (not just data delivery)
– Uses flow control, sequence numbers, acknowledgements, and timers to ensure data is delivered correctly and in order
• TCP also provides congestion control
– TCP applications share the available bandwidth (they watched Sesame Street!)
– UDP takes whatever it can get (greedy little protocol)
Multiplexing & Demultiplexing
• At the destination host, the transport layer gets segments from the network layer
• It needs to deliver these segments to the correct process on that host
– It does so via sockets, which connect processes to the network
– Each socket has a unique identifier, whose format differs between UDP and TCP
Multiplexing & Demultiplexing
• Demultiplexing is getting the transport layer segment into the correct socket
• Hence Multiplexing is taking data from various sockets, applying header info, breaking it into segments, and delivering it to the network layer
• Multiplexing and demultiplexing are used in any kind of network; not just in the Internet protocols
Multiplexing & Demultiplexing
[Figure: host 1, host 2, and host 3, each with a five-layer stack (application, transport, network, link, physical); processes P1 to P4 attach to the transport layer through sockets]
• Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)
• Demultiplexing at rcv host: delivering received segments to correct socket
Mail Analogy
• Multiplexing is when a parent collects letters from the cousins, and puts them into the mail
• Demultiplexing is getting the mail, and handing the correct mail to each cousin
• Here we need unique socket identifiers, and some place in the header for the socket identifier information
Segment Header
• Hence the segment header starts with the source and destination port numbers
• Each port number is a 16-bit (2 byte) value (0 to 65,535)
– Well known port numbers are from 0 to 1023 (2^10 - 1)
• After the port numbers are other headers, specific to TCP or UDP, then the message
UDP Multiplexing
• UDP assigns a port number from 1024 to 65,535 to each socket, unless the developer specifies otherwise
– UDP identifies a socket only by destination IP address and destination port number
• The port numbers for source and destination are switched (inverted) when a reply is sent
– So a segment from port 19157 to port 46428 generates a reply from port 46428 to 19157
TCP Multiplexing
• TCP is messier, of course
• TCP identifies a socket by four values:
– Source IP address, source port number, destination IP address, and destination port number
• Hence if UDP gets two segments with the same destination IP and port number, they’ll both go to the same process
– TCP tells the segments apart via source IP/port
TCP Multiplexing
• So if you have two HTTP sessions going to the same web server and page, how can TCP tell them apart?
– Even though the destination IP and port (80) are the same, and the two sessions (processes) have the same source IP address, they have different source port numbers
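The difference between UDP and TCP socket identifiers can be sketched in a few lines of Python. This is only an illustration, not how any real stack is written: the function names, the IP addresses, and the tuple layout of a "segment" are all assumptions made up for the example.

```python
# Hypothetical demultiplexing sketch; names and addresses are illustrative only.

def udp_key(src_ip, src_port, dst_ip, dst_port):
    # UDP identifies a socket by destination IP and destination port only
    return (dst_ip, dst_port)

def tcp_key(src_ip, src_port, dst_ip, dst_port):
    # TCP identifies a socket by all four values
    return (src_ip, src_port, dst_ip, dst_port)

def demultiplex(segments, key_fn):
    """Group segment payloads by the socket key that key_fn computes."""
    sockets = {}
    for seg in segments:
        sockets.setdefault(key_fn(*seg[:4]), []).append(seg[4])
    return sockets

# Two sessions from the same client host to the same web server, port 80
segments = [
    ("10.0.0.5", 19157, "10.0.0.9", 80, "request A"),
    ("10.0.0.5", 19158, "10.0.0.9", 80, "request B"),
]

udp_sockets = demultiplex(segments, udp_key)   # both land in one socket
tcp_sockets = demultiplex(segments, tcp_key)   # source port keeps them apart
print(len(udp_sockets), len(tcp_sockets))
```

With the two sample segments, the UDP-style key puts both payloads in one socket, while the TCP-style key separates them by source port.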
Port scanning
• Apps called port scanners (e.g. nmap) can scan the ports on a computer and see which are open
– This tells us what apps are running on that host
– Attackers can then target those apps
• Leaving ports open that you aren’t using is a big security vulnerability
– They could accept hostile TCP connections
Web Servers & TCP
• Each new client connection often uses a new process and socket to send HTTP requests and get responses
– But a thread (lightweight process) can be used instead, so one process can have multiple sockets, one for each thread
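[Figure: two designs for a server host]
– Each connection is a new thread off one process: process P1 holds sockets S1, S2, and S3
– OR each connection is a new process: processes P1, P2, and P3 hold sockets S1, S2, and S3, respectively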
UDP
• The most minimal transport layer has to do multiplexing and demultiplexing
• UDP does this and a little error checking and, well, um, that’s about it!
– UDP was defined in RFC 768
– An app that uses UDP almost talks directly to IP
– Adds only two small data fields to the header, after the requisite source/destination port numbers
– There’s no handshaking; UDP is connectionless
UDP for DNS
• DNS uses UDP
• A DNS query is packaged into a segment, and is passed to the network layer
– The DNS app waits for a response; if it doesn’t get one soon enough (times out), it tries another server or reports no reply
• Hence the app must allow for the unreliability of UDP, by planning what to do if no response comes back
UDP Advantages
• Still, UDP is good when:
– You want the app to have detailed control over what is sent across the network; UDP changes it little
– You want no connection establishment delay
– You want no connection state data in the end hosts; hence a server can support more UDP clients than TCP clients
– You want small packet header overhead per segment
• TCP uses 20 bytes of header data, UDP only 8 bytes
UDP Apps
• Other than DNS, UDP is also used for
– Network management (SNMP)
– Routing (RIP)
– Multimedia & telephony (proprietary protocols)
– Remote file server (NFS)
• The lack of congestion control in UDP can be a problem when lots of large UDP messages are being sent; they can crowd out TCP apps
UDP Header
• The UDP header has four two-byte fields in two lines (8 B total), namely:
– Source port number; Destination port number
– Length; Checksum
• Length is the total length of the segment, including headers, in bytes
• The checksum is used by the receiver to see if errors occurred
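The four-field header layout can be sketched with Python’s standard struct module. The helper names build_udp_segment and parse_udp_segment are hypothetical, the port numbers are borrowed from the earlier reply example, and the checksum is simply left at 0 here rather than computed.

```python
import struct

# Four 16-bit fields in network (big-endian) byte order: 8 bytes total
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_segment(src_port, dst_port, payload, checksum=0):
    """Prefix the payload with a UDP-style header (checksum not computed here)."""
    length = UDP_HEADER.size + len(payload)   # length covers header + data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

def parse_udp_segment(segment):
    """Split a segment back into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack(
        segment[:UDP_HEADER.size])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": segment[UDP_HEADER.size:length],
    }

segment = build_udp_segment(19157, 46428, b"hello")
fields = parse_udp_segment(segment)
print(fields)
```

A real sender would fill in the checksum field; it is omitted here to keep the focus on the field layout.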
Checksum
• Noise in the transmission lines can alter bits of data in transit
• Checksums are a common method to detect errors (RFC 1071)
• To create a checksum:
– Find the sum of the 16-bit words of the message, wrapping any overflow back into the low-order bit
– The checksum is the 1s (ones) complement of the sum
– If the message is uncorrupted, the sum of message plus checksum is all ones: 1111111111111111
1s Complement?
• The 1s complement is the bitwise inverse of a binary number: change all the zeros to ones, and ones to zeros
– So the 1s complement of 00101110101 is 11010001010
• UDP does error checking because not all lower layer protocols do error checking
– This provides end-to-end error checking, which is more efficient than checking at every step along the way
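The checksum recipe (sum, then 1s complement) can be tried out with a short Python sketch in the style of RFC 1071. It assumes the sum is taken over 16-bit words with the carry wrapped around; the function names and the three sample words are assumptions chosen for illustration, and a real UDP checksum also covers a pseudo-header that is left out here.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit words, wrapping any carry back into the sum."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return total

def internet_checksum(data: bytes) -> int:
    """The checksum is the 1s complement (bitwise inverse) of the sum."""
    return ~ones_complement_sum16(data) & 0xFFFF

# Three sample 16-bit words (illustrative values)
message = bytes([0x66, 0x60, 0x55, 0x55, 0x8F, 0x0C])
checksum = internet_checksum(message)

# Receiver side: message plus checksum should sum to all ones
verified = ones_complement_sum16(message + checksum.to_bytes(2, "big"))
print(f"{checksum:016b} {verified:016b}")
```

If any single bit of the message is flipped in transit, the receiver’s sum no longer comes out as all ones, which is how the error is detected.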
UDP
• That’s it for UDP!
• The port addresses, the message length, and a checksum to see if it got there intact
• Now see what happens when we want reliable data transfer
Reliable Data Transfer
• Distinguish between the service model, and how it’s really implemented
– Service model: From the app perspective, it just wants a reliable transport layer to connect sending and receiving processes
– Service implementation: In reality, the transport layer has to use an unreliable network layer (IP), so transport has to make up for the unreliability below it
Reliable Data Transfer
• The sending process will give the transport layer a message via rdt_send (rdt = reliable data transfer)
– The transport protocol will convert it to udt_send (udt = unreliable data transfer; Fig 3.8 has typo) and give it to the network layer
• At the receiving end, the protocol gets rdt_rcv from the network layer
– The protocol will convert it to deliver_data and give it to the receiving application process
Reliable Data Transfer
[Figure: App sees this “service model”, but our transport protocol has to do this over the network layer]
Reliable Data Transfer
• Here we’ll refer to the data as packets, rather than distinguish segments, etc.
• We’ll also pretend we only have to send data in one direction (unidirectional data transfer)
– Bidirectional data transfer is what really occurs, but the sending and receiving sides just get switched
• Time to build a reliable data transfer protocol, one piece at a time
Reliable Data Transfer v1.0
• For the simplest case, called rdt1.0, assume the network is completely reliable
• Finite state machines (FSMs) for the sender and receiver each have one state: waiting for a call
– The sending side (rdt_send) makes a packet (make_pkt) and sends it (udt_send)
– The receiving side (rdt_rcv) extracts data from the packet (extract), and delivers it to the receiving app (deliver_data)
Reliable Data Transfer v1.0
[Figure: rdt1.0 FSMs]
• sender, state ‘Wait for call from above’
– On rdt_send(data): packet = make_pkt(data); udt_send(packet)
• receiver, state ‘Wait for call from below’
– On rdt_rcv(packet): extract(packet, data); deliver_data(data)
• Here a packet is the only unit of data
• No feedback to sender is needed to confirm receipt of data, and no control over transmission rate is needed
Reliable Data Transfer v2.0
• Now allow bit errors in transmission
– But all packets are received, in the correct order
• Need acknowledgements to know when a packet was correct (OK, 10-4) versus when it wasn’t (please repeat); called positive and negative acknowledgements, respectively
– These types of messages are typical for any Automatic Repeat reQuest (ARQ) protocol
Reliable Data Transfer v2.0
• So allowing for bit errors requires three capabilities
– Error detection to know if a bit error occurred
– Receiver feedback, both positive (ACK) and negative (NAK) acknowledgements
– Retransmission of incorrect packets
Reliable Data Transfer v2.0
[Figure: rdt2.0 FSMs]
• sender, state ‘Wait for call from above’
– On rdt_send(data): sndpkt = make_pkt(data, checksum); udt_send(sndpkt); go to ‘Wait for ACK or NAK’
• sender, state ‘Wait for ACK or NAK’
– On rdt_rcv(rcvpkt) && isNAK(rcvpkt): udt_send(sndpkt); stay in this state
– On rdt_rcv(rcvpkt) && isACK(rcvpkt): no action; go to ‘Wait for call from above’
• receiver, state ‘Wait for call from below’
– On rdt_rcv(rcvpkt) && corrupt(rcvpkt): udt_send(NAK)
– On rdt_rcv(rcvpkt) && notcorrupt(rcvpkt): extract(rcvpkt, data); deliver_data(data); udt_send(ACK)
Reliable Data Transfer v2.0
• Sending FSM (cont.)
– The left state waits for a packet from the sending app, makes a packet with a checksum (make_pkt)
– Then the left state sends the packet (udt_send)
– It moves to the other state (waiting for ACK/NAK)
• If it gets a NAK response (errors detected), then it resends the packet (udt_send) until it gets it right
• If it gets an ACK response (no errors), then it goes back to the first state to wait for the next packet from the app
Reliable Data Transfer v2.0
• Notice this model does nothing until it gets the NAK/ACK, so it’s a stop-and-wait protocol
• Receiving FSM
– The receiving side uses the checksum to see if the packet was corrupted
• If it was (&& corrupt), send a NAK response
• If it wasn’t (&& notcorrupt), extract and deliver the data, and send an ACK response
• But what if the NAK/ACK is corrupted?
Reliable Data Transfer v2.0
• Three possible ways to handle NAK/ACK errors
– Add another type of response to have the NAK/ACK repeated; but what if that response got corrupted? Leads to long string of messages…
– Add checksum data to the NAK/ACK, and data to recover from the error
– Resend the packet if the NAK/ACK is garbled; but this introduces possible duplicate packets
Reliable Data Transfer v2.1
• TCP and most reliable protocols add a sequence number to the data from the sender
– Since we can’t lose packets yet, a one-bit number is adequate to tell if this is a new packet or a repeat of the previous one
• This gives our new model, rdt version 2.1
Reliable Data Transfer v2.1
[Figure: rdt2.1 sender FSM]
• State ‘Wait for call 0 from above’
– On rdt_send(data): sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to ‘Wait for ACK or NAK 0’
• State ‘Wait for ACK or NAK 0’
– On rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)): udt_send(sndpkt); stay in this state
– On rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt): no action; go to ‘Wait for call 1 from above’
• State ‘Wait for call 1 from above’
– On rdt_send(data): sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); go to ‘Wait for ACK or NAK 1’
• State ‘Wait for ACK or NAK 1’
– On rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)): udt_send(sndpkt); stay in this state
– On rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt): no action; go to ‘Wait for call 0 from above’
Reliable Data Transfer v2.1
• Now the number of states is doubled, since we have sequence numbers 0 or 1
– So in make_pkt(1, data, checksum) the 1 is the sequence number
• Sequence number alternates 010101 if everything works; if a packet is corrupted, the same sequence number is expected two or more times
• Start at ‘Wait for call 0’ state; when a packet arrives from the app, send it to the network with sequence 0
– Then wait for ACK or NAK with sequence 0
Reliable Data Transfer v2.1
– If the response was corrupt, or was a NAK, resend that packet (upper right loop)
• Otherwise wait for call with sequence 1 from app
– When call 1 is received, make and send the packet with sequence 1 (desired outcome)
• Then wait for a NAK/ACK with sequence 1
– If the response is corrupt or a NAK, resend (lower left loop)
• Otherwise go to waiting for a sequence 0 call from the app
– Repeat cycle
Reliable Data Transfer v2.1
[Figure: rdt2.1 receiver FSM]
• State ‘Wait for 0 from below’
– On rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt): extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to ‘Wait for 1 from below’
– On rdt_rcv(rcvpkt) && corrupt(rcvpkt): sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt); stay in this state
– On rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); stay in this state
• State ‘Wait for 1 from below’
– On rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to ‘Wait for 0 from below’
– On rdt_rcv(rcvpkt) && corrupt(rcvpkt): sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt); stay in this state
– On rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt): sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); stay in this state
Reliable Data Transfer v2.1
• The receiver side doubles in # of states
• When in the ‘wait for seq 0’ state
– If the packet has sequence 0 and isn’t corrupt, extract and deliver the data, and send an ACK; go to ‘wait for seq 1’ state
– If the packet was corrupt, reply with a NAK
– If the packet has sequence 1 and was not corrupt (it’s out of order, i.e. a repeat of the previous packet), send an ACK and keep waiting for a seq 0 packet
• Mirror the above for starting from ‘wait for seq 1’ state
Reliable Data Transfer v2.2
• Could achieve the same effect without a NAK (for corrupt packet) if we only ACK the last correctly received packet
• Two ACKs for the same packet (duplicate ACKs) mean the packet after the one being ACKed wasn’t received correctly
• The NAK-free protocol is called rdt2.2
Reliable Data Transfer v2.2
[Figure: rdt2.2 sender and receiver FSM fragments]
• sender FSM fragment, state ‘Wait for call 0 from above’
– On rdt_send(data): sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to ‘Wait for ACK 0’
• sender FSM fragment, state ‘Wait for ACK 0’
– On rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,1)): udt_send(sndpkt); stay in this state
– On rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0): no action; leave this state
• receiver FSM fragment, state ‘Wait for 0 from below’
– On rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)): udt_send(sndpkt), which resends the previous ACK; stay in this state
– The transition shown entering this state is rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)
Reliable Data Transfer v2.2
• Again, the send and receive FSMs are symmetric for sequence 0 and 1
– The sender must now check the sequence number of the packet being ACK’d (see isACK message)
– The receiver must include the sequence number in the make_pkt message
• FSM on page 211 also has oncethru variable to help avoid duplicate ACKs
Reliable Data Transfer v3.0
• Now account for the possibility of lost packets
• Need to detect packet loss, and decide what to do about it
– The latter is easy with the tools we have (ACK, checksum, sequence #, and retransmission), but we need a new detection mechanism
• Many possible loss detection approaches
– Focus on making the sender responsible for it
Reliable Data Transfer v3.0
• Sender thinks a packet is lost when the packet doesn’t get to the receiver, or the ACK gets lost
• Can’t wait for worst case transmission time, so pick a reasonable time before error recovery is started
– Could result in duplicate packets if the original was still on the way; but rdt2.2 can handle that
• For the sender, retransmission is the ultimate solution, whether the packet or the ACK was lost
Reliable Data Transfer v3.0
• Knowing when to retransmit needs a countdown timer
– Count time from sending a packet to still not getting an ACK
• If time is exceeded, retransmit that packet
• Works the same if packet is lost or ACK is lost
• Since packet sequence numbers alternate 0-1-0-1-etc., this is called an alternating-bit protocol
Reliable Data Transfer v3.0
[Figure: rdt3.0 sender FSM]
• State ‘Wait for call 0 from above’
– On rdt_send(data): sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer; go to ‘Wait for ACK0’
– On rdt_rcv(rcvpkt): no action
• State ‘Wait for ACK0’
– On rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,1)): no action
– On timeout: udt_send(sndpkt); start_timer
– On rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0): stop_timer; go to ‘Wait for call 1 from above’
• State ‘Wait for call 1 from above’
– On rdt_send(data): sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer; go to ‘Wait for ACK1’
– On rdt_rcv(rcvpkt): no action
• State ‘Wait for ACK1’
– On rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isACK(rcvpkt,0)): no action
– On timeout: udt_send(sndpkt); start_timer
– On rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1): stop_timer; go to ‘Wait for call 0 from above’
Reliable Data Transfer v3.0
• How does the receiver FSM differ from rdt2.2? It doesn’t.
– The sender is responsible for loss detection
• Notice that, even allowing for lost packets, we still assume only one packet is sent completely and correctly at a time
• But rdt3.0 still stops to wait for the ACK (or a timeout) of each packet; fix with pipelining
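To see the alternating-bit behavior end to end, here is a minimal Python simulation of rdt3.0 under simplifying assumptions: the channel can lose packets and ACKs but never corrupts them, and a "timeout" is modeled simply as trying again. The function name run_alternating_bit and the drop_data / drop_acks parameters are hypothetical knobs invented so the run is repeatable; they are not part of the protocol.

```python
def run_alternating_bit(messages, drop_data=(), drop_acks=()):
    """Simulate rdt3.0 over a channel that loses (but never corrupts) packets.

    drop_data / drop_acks: transmission attempts (numbered from 0) on which
    the channel loses the data packet or the returning ACK.
    Returns (data delivered to the receiving app, total transmissions).
    """
    delivered = []
    expected_seq = 0      # receiver: sequence bit it is waiting for
    seq = 0               # sender: sequence bit of the current packet
    attempt = 0           # counts every udt_send of a data packet

    for data in messages:
        acked = False
        while not acked:
            this_attempt = attempt
            attempt += 1                      # udt_send(sndpkt); start_timer
            if this_attempt in drop_data:
                continue                      # packet lost -> timeout -> resend
            if seq == expected_seq:           # new packet: deliver it once
                delivered.append(data)
                expected_seq ^= 1
            # a duplicate is not delivered again, but it is still ACKed
            if this_attempt in drop_acks:
                continue                      # ACK lost -> timeout -> resend
            acked = True                      # ACK arrived; stop_timer
        seq ^= 1                              # alternate 0, 1, 0, 1, ...
    return delivered, attempt

delivered, transmissions = run_alternating_bit(
    ["a", "b", "c"], drop_data={1}, drop_acks={3})
print(delivered, transmissions)
```

In this run one data packet and one ACK are lost, so three messages cost five transmissions, yet the receiving app still sees each message exactly once and in order.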
Pipelined RDT
• Suppose we implemented rdt3.0 between NYC and LA
– Distance of 3000 miles gives RTT of about 30 ms
– Suppose the transmission rate is 1 Gbps, and packets are 1 kB (8 kb)
• Transmission time is therefore only 8 kb / 1E9 b/s = 8 microseconds (µs)
– Even if ACK messages are very small (transmission time about zero), the time for one packet to be sent and ACKed is 30.008 ms
Pipelined RDT
• Hence we’re transmitting 0.008 ms out of every 30.008 ms, which equals about 0.03% utilization
– How a protocol is implemented drastically affects its usefulness!
• It makes sense to send multiple packets and keep track of the ACKs for each
– Methods to do so are Go-Back-N (GBN) and Selective Repeat (SR)
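The utilization arithmetic is easy to check in Python. The numbers below are the ones from the NYC to LA example (30 ms RTT, 1 Gbps, 8000-bit packets); the utilization function and its window_packets parameter are assumptions added to show, roughly, what sending several packets per round trip would buy.

```python
rtt_s = 0.030            # round-trip time: about 30 ms
rate_bps = 1e9           # transmission rate: 1 Gbps
packet_bits = 8000       # packet size: 1 kB = 8 kb

transmit_s = packet_bits / rate_bps          # time to push one packet onto the link

def utilization(window_packets):
    """Fraction of time the sender is busy if it sends this many packets per RTT."""
    return min(1.0, window_packets * transmit_s / (rtt_s + transmit_s))

stop_and_wait = utilization(1)
print(f"transmission time: {transmit_s * 1e6:.0f} microseconds")
print(f"stop-and-wait utilization: {stop_and_wait:.4%}")
print(f"3 packets in flight: {utilization(3):.4%}")
```

Under these assumptions, sending three packets before waiting triples the utilization, which is the motivation for pipelining.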
Go-Back-N
• In this protocol, sender can send up to N packets without getting an ACK*
• N is also called a window size, and the protocol is a.k.a. a sliding-window protocol
– Let base be the number of the first (oldest un-ACKed) packet in the window
– The window size, N, is already defined
– Then all packets from 0 to base-1 have already been sent and ACKed
* Why a limit at all? Need for flow and congestion control later.
Go-Back-N
– The window currently covers packets number base to base+N-1; these packets can be sent before their ACK is received
• Packet sequence numbers need to have a maximum value; if ‘k’ bits are in the sequence number, the range of sequence numbers is 0 to 2^k - 1
– The sequence numbers are used in a circle, so after 2^k - 1 you use 0 again, then 1, etc.
Go-Back-N
– rdt3.0 only had sequence numbers 0 and 1
– TCP has a 32-bit sequence number range for the bytes in a byte stream
• In the FSMs for Go-Back-N (GBN)
– Sender must respond to:
• Call from above (i.e. the app)
• Receipt of an ACK for any of the outstanding packets, which provides cumulative acknowledgement
• Timeout, which causes all un-ACKed packets to be re-sent
Go-Back-N
• The GBN receiver does:
– If a packet is correct and in order, send an ACK
• Sender moves window up with each correct and in order packet ACKed; this minimizes resending later
– In all other cases, throw away the packet, and resend ACK for the most recent correct packet
• Hence we throw away correct but out-of-order packets; this makes receiver buffering easier
Go-Back-N
• GBN can be implemented in event-based programming; events here are
– App invokes rdt_send
– Protocol receives a packet via rdt_rcv
– Timer interrupts
• In contrast, consider the selective repeat (SR) approach for pipelining
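The three GBN sender events can be sketched as methods on a small Python class. This is a simplified illustration, not the textbook FSM: the class and attribute names are hypothetical, sequence numbers are left unbounded (no wraparound at 2^k - 1), and "sending" just appends to a log.

```python
class GoBackNSender:
    """Minimal Go-Back-N sender sketch with a window of N packets."""

    def __init__(self, window_size):
        self.N = window_size
        self.base = 0          # oldest un-ACKed sequence number
        self.next_seq = 0      # next sequence number to use
        self.sent = []         # log of transmitted sequence numbers

    def rdt_send(self):
        """Call from above: send if the window has room, else refuse."""
        if self.next_seq >= self.base + self.N:
            return False                       # window full
        self.sent.append(self.next_seq)        # stand-in for udt_send
        self.next_seq += 1
        return True

    def on_ack(self, ack_num):
        """Cumulative ACK: everything up to and including ack_num is done."""
        if ack_num >= self.base:
            self.base = ack_num + 1            # slide the window forward

    def on_timeout(self):
        """Timeout: resend every packet that is sent but not yet ACKed."""
        resend = list(range(self.base, self.next_seq))
        self.sent.extend(resend)
        return resend

sender = GoBackNSender(window_size=4)
accepted = [sender.rdt_send() for _ in range(5)]   # fifth call hits the limit
sender.on_ack(1)                                   # packets 0 and 1 ACKed
resent = sender.on_timeout()                       # go back and resend 2, 3
print(accepted, sender.base, resent)
```

A real implementation would also keep a single timer for the oldest un-ACKed packet and restart it as the window slides.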
Selective Repeat
• A large window size and bandwidth-delay product can put a lot of packets in the pipeline under GBN, which can cause a lot of retransmission when a packet is lost
• Selective repeat only retransmits packets believed to be in error – so retransmission is on a more individual basis
• To do this, buffer out-of-order packets until the missing packets are filled in
Selective Repeat
• SR still uses a window of size N packets
• SR sender responds to:
– Data from the app above it; finds next sequence number available, and sends as soon as possible
– Timeout, which is kept for each packet
– ACK received from the receiver; then sender marks off that packet, and moves the window forward if it was the oldest outstanding packet; can transmit packets inside the new window
Selective Repeat
• The SR receiver responds to
– Packet within the current window; then send an ACK; deliver packets at the bottom of the window, but buffer higher number packets (out of order)
– Packets that were previously ACKed are ACKed again
– Otherwise ignore the packet
• Notice the sender and receiver windows are generally not the same!!
Selective Repeat
• It’s possible that the sequence number range and window size could be too close, producing confusing signals
– To prevent this, need window size < half of sequence number range
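The SR receiver’s three cases (in window, previously ACKed, otherwise ignore) can be sketched in Python as follows. As with the other sketches, the class and method names are hypothetical and sequence numbers are left unbounded, so the wraparound confusion that motivates the window-size limit does not arise here.

```python
class SelectiveRepeatReceiver:
    """Minimal Selective Repeat receiver sketch with a window of N packets."""

    def __init__(self, window_size):
        self.N = window_size
        self.base = 0          # lowest sequence number not yet delivered
        self.buffer = {}       # out-of-order packets waiting for a gap to fill
        self.delivered = []    # data handed to the app, in order

    def on_packet(self, seq, data):
        """Return the ACK number to send back, or None to ignore the packet."""
        if self.base <= seq < self.base + self.N:
            self.buffer[seq] = data
            # deliver everything contiguous from the bottom of the window
            while self.base in self.buffer:
                self.delivered.append(self.buffer.pop(self.base))
                self.base += 1
            return seq
        if self.base - self.N <= seq < self.base:
            return seq         # previously ACKed packet: ACK it again
        return None            # otherwise ignore

receiver = SelectiveRepeatReceiver(window_size=4)
ack_late = receiver.on_packet(1, "b")     # out of order: buffered, still ACKed
ack_first = receiver.on_packet(0, "a")    # fills the gap: both get delivered
ack_again = receiver.on_packet(0, "a")    # duplicate: re-ACKed, not redelivered
ignored = receiver.on_packet(9, "x")      # outside both ranges
print(receiver.delivered, receiver.base)
```

Re-ACKing old packets matters because the sender’s window only moves once it hears the ACK; if that ACK was lost, silence from the receiver would stall the sender.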
Packet Reordering
• Our last assumption was that packets arrive in order, if at all
– What if they arrive out of order?
• Out of order packets could have sequence numbers outside of either window (snd or rcv)
• Handle by not allowing packets older than some max time
– TCP typically uses 3 minutes
Reliable Data Transfer Mechanisms
– Checksum, to detect bit errors in a packet
– Timer, to know when a packet or its ACK was lost
– Sequence number, to detect lost or duplicate packets
– Acknowledgement, to know packet got to receiver correctly
– Negative acknowledgement, to tell packet was corrupted but received
– Window, to pipeline many packets at once before an ACK was received for any of them
TCP Intro
• Now see how all this applies to TCP
– First in RFC 793, now RFC 2581
– Invented circa 1974 by Vint Cerf and Robert Kahn
• TCP starts with a handshake protocol, which defines many connection variables
– Connection only at hosts, not in between
– Routers are oblivious to whether TCP is used!
• TCP is a full duplex service (data can flow both directions at once) and is connection-oriented
TCP Intro
• TCP is point-to-point: between a single sender and a single receiver
– In contrast with multipoint technologies
• TCP is client/server based
• Client needs to establish a socket to the server’s hostname and port
– Recall default port numbers are app-specific
– Special segments are sent by client, server, and client to make the three-way handshake
TCP Intro
• Once connection exists, processes can send data back and forth
• Sending process sends data through socket to the TCP send buffer
– TCP sends data from the send buffer when it feels like it
– Max Segment Size (MSS) is based on the max frame size, or Max Transmission Unit (MTU)
– Want 1 TCP segment to eventually fit in the MTU
TCP Intro
– Typical MSS values are 512 to 1460 bytes
• MSS is the max app data that can fit in a segment, not the total segment size (which includes headers)
• TCP adds headers to the data, creating TCP segments
– Segments are passed to the network layer to become IP datagrams, and so on into the network
TCP Intro
• At the server side, the segment is placed in the receive buffer
• So a TCP connection consists of two buffers (send and receive), some variables, and two socket connections (send and receive) on the corresponding processes
TCP Segment Structure
• A TCP segment consists of header fields and a data field
– The data field size is limited by the MSS
• Typical header size is 20 bytes
– The header is 32 bits wide (4 bytes), so it has five lines at a minimum
TCP Header Structure
• The header lines are
– Source and destination port numbers (16 bit ea.)
– Sequence number (32 bit)
– ACK number (32 bit)
– A bunch of little stuff (header length, URG, ACK, PSH, RST, SYN, and FIN bits), then the receive window (16 bit)
– Internet checksum, urgent data pointer (16 bit ea.)
– And possibly several options
TCP Segment Structure
• We’ve seen the port numbers (16 bits each), sequence and ACK numbers (32 bits each)
• The ‘bunch of little stuff’ includes
  – Header length (4 bits)
  – A flag field that includes six one-bit fields: ACK, RST, SYN, FIN, PSH, and URG
    • The URG bit marks that the segment carries urgent data, which the urgent data pointer locates
• The receive window is used for flow control
TCP Segment Structure
• The checksum is used for bit error detection, as with UDP
  – The urgent data pointer tells where the urgent data is located
• The options include negotiating the MSS, scaling the window size, or time stamping
TCP Sequence Numbers
• The sequence numbers are important for TCP’s reliability
• TCP views data as an unstructured but ordered stream of bytes
• Hence the sequence number for a segment is the byte-stream number of the first byte in the segment
  – Yes, each byte is counted!
TCP Sequence Numbers
• So if the MSS is 1000 bytes, the first segment will be number 0, and cover bytes 0 to 999
  – The second segment is number 1000, and covers bytes 1000-1999
  – Third is number 2000, and covers 2000-2999, etc.
• Typically start sequences at random numbers on both sides, to avoid accidental overlap with previously used numbers
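The numbering rule above can be sketched in a few lines. This is only an illustration, not how any real TCP stack is written: the helper name segment_ranges and its isn parameter (the initial sequence number) are made up for this example, and real TCP picks the ISN at random rather than taking it as an argument.

```python
def segment_ranges(total_bytes, mss, isn=0):
    """Return (sequence_number, last_byte_number) for each segment of a byte stream."""
    ranges = []
    offset = 0
    while offset < total_bytes:
        size = min(mss, total_bytes - offset)       # last segment may be short
        seq = (isn + offset) % 2**32                # sequence numbers are 32 bits and wrap
        last = (isn + offset + size - 1) % 2**32
        ranges.append((seq, last))
        offset += size
    return ranges


print(segment_ranges(3000, 1000))   # [(0, 999), (1000, 1999), (2000, 2999)]
```

With a non-zero starting number the pattern is the same, just shifted: every segment is labeled with the byte number of its first byte.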
TCP Acknowledgement No.
• TCP acknowledgement numbers are weird
• The number used is the next byte number expected from the sender
  – So if host B sends to A (!) bytes 0-535 of data, host A expects byte 536 to be the start of the next segment, so 536 is the Ack number
• This is a cumulative acknowledgement, since it only goes up to the first missing byte in the byte-stream
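A small sketch of how a receiver could work out its cumulative Ack number from the byte ranges it is holding. The function name cumulative_ack and the list-of-ranges representation are assumptions for illustration only; sequence number wraparound is ignored.

```python
def cumulative_ack(next_expected, received):
    """received: iterable of (first_byte, last_byte) ranges held by the receiver."""
    ack = next_expected
    for first, last in sorted(received):
        if first > ack:          # gap found: stop at the first missing byte
            break
        ack = max(ack, last + 1)
    return ack
```

For example, holding bytes 0-535 gives an Ack number of 536, and also holding a stray range 900-1000 still gives 536, because byte 536 is the first one missing.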
TCP Out-of-Order Segments
• What does it do when segments arrive out of order?
  – That’s up to the TCP implementer
• TCP can either discard out of order segments, or keep the strays in buffer and wait for the pieces to get filled in
  – The former is easier to implement, the latter is more efficient and commonly used
Telnet Example
• Telnet (RFC 854) is an old app for remote login via TCP
• Telnet interactively echoes whatever was typed to show it got to the other side
• Host A is the client, starts a session with Host B, the server
  – Suppose client starts with sequence number 42, and server with 79
Telnet Example
[Figure: simple telnet scenario between Host A and Host B, time running downward]
Host A → Host B: Seq=42, ACK=79, data = ‘C’ (user types ‘C’)
Host B → Host A: Seq=79, ACK=43, data = ‘C’ (host ACKs receipt of ‘C’, echoes back ‘C’)
Host A → Host B: Seq=43, ACK=80 (host ACKs receipt of echoed ‘C’)
• User types a single letter, ‘C’
• Notice how each side’s Ack number mirrors the other side’s sequence number, and how the server’s ACK is “piggybacked” on the segment that carries the echoed data
Timeout Calculation
• TCP needs a timeout interval, as discussed in the rdt example, but how long?
  – Longer than RTT, but how much? A week?
• Measure sample RTT for segments here and there (not every one)
  – This SampleRTT value will fluctuate, with an average value called EstimatedRTT which is a moving average updated with each measurement
Timeout Calculation
  – Naturally, EstimatedRTT is a smoother curve than each SampleRTT
• EstimatedRTT = 0.875*EstimatedRTT + 0.125*SampleRTT
• The variability of RTT is measured by DevRTT, which is the moving average magnitude difference between SampleRTT and EstimatedRTT
  – Let DevRTT = 0.75*DevRTT + 0.25*|SampleRTT - EstimatedRTT|
Timeout Calculation
• We want the timeout interval larger than EstimatedRTT, but not huge; use
  – TimeoutInterval = EstimatedRTT + 4*DevRTT
• This is analogous to control charts, where a measurement is expected to fall outside (mean ± 3*the standard deviation) only about ¼% of the time
  – DevRTT isn’t a standard deviation, but the idea is similar
Timeout Calculation
• Notice this means that the timeout interval is constantly being calculated, and to do so requires frequent measurement of SampleRTT to find current values for:
  – EstimatedRTT
  – DevRTT
  – TimeoutInterval
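The three formulas from these slides fit in a small class. This is a minimal sketch, not a real TCP implementation: the class name RttEstimator is made up, and two things are assumptions rather than anything the slides state, namely seeding EstimatedRTT with the first sample and DevRTT with zero, and updating EstimatedRTT before DevRTT.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT; times are in seconds."""

    def __init__(self, first_sample):
        # Assumption: seed the averages from the first measurement.
        self.estimated_rtt = first_sample
        self.dev_rtt = 0.0

    def update(self, sample_rtt):
        # EstimatedRTT = 0.875*EstimatedRTT + 0.125*SampleRTT
        self.estimated_rtt = 0.875 * self.estimated_rtt + 0.125 * sample_rtt
        # DevRTT = 0.75*DevRTT + 0.25*|SampleRTT - EstimatedRTT|
        self.dev_rtt = 0.75 * self.dev_rtt + 0.25 * abs(sample_rtt - self.estimated_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        # TimeoutInterval = EstimatedRTT + 4*DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(0.100)
print(est.update(0.200))   # one slow sample pushes the timeout up to about 0.2 s
```

Feeding it a run of steady samples afterwards lets DevRTT decay, so the timeout drifts back down toward EstimatedRTT.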
Reliable Data Transfer
• IP is not a reliable datagram service
  – It doesn’t guarantee delivery, in-order delivery, or intact delivery
• In theory we saw that separate timers for each segment would be nice; in reality TCP uses one retransmission timer for several segments (RFC 2988)
• For the next example, assume Host A is sending a big file to Host B
Simplified TCP
• Here the sender responds to three events:
  – Receive data from application
    • Then it makes segments of the data, each with a sequence number, and passes them to the IP layer
    • Starts timer if it isn’t already running
  – Timer times out
    • Then it re-sends the segment that timed out, and restarts the timer
  – ACK was received
    • Compares the received ACK value with SendBase, the byte number of the oldest byte not yet ACKed
    • If the ACK value is larger, updates SendBase, and restarts the timer if any un-ACKed segments are left
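The three events can be sketched as three methods on a toy sender. This is a hedged illustration only: the class SimplifiedSender and its attribute names are invented here, the "IP layer" is just a Python list, the timer is a boolean flag rather than a real clock, and sequence number wraparound is ignored.

```python
class SimplifiedSender:
    def __init__(self, isn=0, mss=1000):
        self.mss = mss
        self.send_base = isn        # oldest byte number not yet ACKed
        self.next_seq_num = isn     # byte number of the next byte to send
        self.unacked = {}           # sequence number -> segment data
        self.timer_running = False
        self.sent = []              # stands in for the IP layer

    def on_app_data(self, data):
        # Event 1: cut the data into segments, number them, pass them down.
        for i in range(0, len(data), self.mss):
            chunk = data[i:i + self.mss]
            self.unacked[self.next_seq_num] = chunk
            self.sent.append((self.next_seq_num, chunk))
            self.next_seq_num += len(chunk)
        self.timer_running = True   # one timer covers all outstanding segments

    def on_timeout(self):
        # Event 2: re-send only the oldest un-ACKed segment, restart the timer.
        if self.unacked:
            seq = min(self.unacked)
            self.sent.append((seq, self.unacked[seq]))
            self.timer_running = True

    def on_ack(self, ack):
        # Event 3: a cumulative ACK above SendBase covers everything below it.
        if ack > self.send_base:
            self.send_base = ack
            for seq in [s for s in self.unacked if s + len(self.unacked[s]) <= ack]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)
```

Note how a timeout re-sends one segment, not everything after it, which is the behavior the next slide describes.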
Simplified TCP
• Even this version of TCP can successfully handle lost ACKs by ignoring duplicate segments (Fig 3.34, p. 256)
• If a segment times out, later segments don’t get re-sent (Fig 3.35, p. 257)
• A lost ACK doesn’t always cause a retransmission: a later cumulative ACK that arrives before the timeout shows the earlier segment got there too (Fig 3.36, p. 258)
Doubling Timeout
• After a timeout event, many TCP implementations double the timeout interval
• This helps with congestion control, since timeout is often due to congestion, and retransmitting often just makes it worse!
Fast Retransmit
• Waiting for the timeout can be too slow
• Might know to retransmit sooner if we get duplicate ACKs
  – A duplicate ACK for a given byte number means the receiver noted a gap in the segment sequence (since TCP has no negative acknowledgements, or NAKs)
• Getting three duplicate ACKs typically forces a fast retransmit of the segment starting at that byte number
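Counting duplicate ACKs takes very little state. The sketch below is an illustration under simplifying assumptions: the class name DupAckCounter is made up, and it only reports which byte number to retransmit from rather than sending anything.

```python
class DupAckCounter:
    def __init__(self, send_base=0):
        self.send_base = send_base   # oldest byte number not yet ACKed
        self.dup_acks = 0

    def on_ack(self, ack):
        """Return the byte number to fast-retransmit from, or None."""
        if ack > self.send_base:     # new data was ACKed, so reset the count
            self.send_base = ack
            self.dup_acks = 0
            return None
        self.dup_acks += 1           # same ACK value again: a duplicate
        if self.dup_acks == 3:
            return ack               # third duplicate triggers fast retransmit
        return None
```

So the ACK sequence 1000, 1000, 1000, 1000 triggers a retransmit starting at byte 1000 on the fourth arrival (the original ACK plus three duplicates).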
Go-Back-N vs. Selective Repeat?
• TCP partly looks like Go-Back-N (GBN)
  – Tracks the smallest sequence number transmitted but not ACKed (SendBase) and the sequence number of the next byte to send (NextSeqNum)
• TCP partly looks like Selective Repeat (SR)
  – Often buffers out-of-order segments to limit the range of segments retransmitted
  – TCP can use selective acknowledgment (RFC 2018) to specify which segments are out of order
Flow Control
• TCP connection hosts maintain a receive buffer, for bytes received correctly and in order
  – Apps might not read from the buffer for a while, so it can overflow
• Flow control focuses on preventing overflow of the receive buffer
  – So it also depends on how fast the receiving app is reading the data!
Flow Control
• Hence the sender in TCP maintains a receive window (RcvWindow) variable – how much room is left in the receive buffer
  – The receive buffer has size RcvBuffer
  – The last byte number read by the receiving app is LastByteRead
  – The last byte put in the receive buffer is LastByteRcvd
  – RcvWindow = RcvBuffer – (LastByteRcvd – LastByteRead) = rwnd
Flow Control
• So the amount of room in RcvWindow varies with time, and is returned to the sender in the receive window field of every segment (see the TCP Segment Structure slides)
  – The sender also keeps track of LastByteSent and LastByteAcked; the difference between them is the amount of un-ACKed data in flight between sender and receiver
• Keep that difference no larger than the RcvWindow to make sure the receive buffer isn’t overflowed
• LastByteSent – LastByteAcked <= RcvWindow
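Both flow control formulas are one-liners. The sketch below uses the slide's variable names; the function names rcv_window and can_send are made up for illustration, and byte numbers are treated as plain integers with no wraparound.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    # RcvWindow = RcvBuffer - (LastByteRcvd - LastByteRead)
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def can_send(n_bytes, last_byte_sent, last_byte_acked, rwnd):
    # Sending n_bytes more must keep LastByteSent - LastByteAcked <= RcvWindow.
    return (last_byte_sent + n_bytes) - last_byte_acked <= rwnd


rwnd = rcv_window(4096, last_byte_rcvd=3000, last_byte_read=1000)
print(rwnd)                               # 2096
print(can_send(1000, 2000, 1000, rwnd))   # True
print(can_send(1000, 2500, 1000, rwnd))   # False
```

The receiver evaluates the first function and advertises the result; the sender evaluates the second before each transmission.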
Flow Control
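• If the RcvWindow goes to zero, the sender can’t send more data, and it would never learn when space frees up, since the receiver only sends a segment when it has data or an ACK to send
• To prevent this, TCP makes the sender keep transmitting one-byte segments when RcvWindow is zero, so that the receiver’s ACKs can indicate when the buffer is no longer full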
UDP Flow Control
• There ain’t none (sic!)
• UDP adds newly arrived segments to a buffer in front of the receiving socket
  – If the buffer gets full, segments are dropped
  – Bye-bye data!
TCP Connection Management
• Now look at the TCP handshake in detail
  – Important since many security threats exploit it
• Recall the client process wants to establish a connection with a server process
  – Step 1 – client sends segment with code SYN=1 and an initial sequence number (client_isn) to the server
    • Choosing a random client_isn is key for security
TCP Connection Management
  – Step 2 – Server allocates variables needed for the connection, and sends a connection-granted segment, SYNACK, to the client
    • This SYNACK segment has SYN=1, the ack field is set to client_isn+1, and the server chooses its initial sequence number (server_isn)
  – Step 3 – Client gets SYNACK segment, and allocates its buffers and variables
    • Client sends segment with ack value server_isn+1, and SYN=0
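The three segments of the handshake can be written out as plain dictionaries to see how the numbers line up. This is an illustration only: the function three_way_handshake is made up, no packets are sent, and Python's random module stands in for whatever unpredictable ISN generator a real stack uses. The ACK flag and the sequence number of the third segment (client_isn+1) are standard TCP behavior that the slides don't spell out.

```python
import random


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the SYN, SYNACK and ACK segments as dicts of header fields."""
    if client_isn is None:
        client_isn = random.getrandbits(32)   # illustrative only, not secure
    if server_isn is None:
        server_isn = random.getrandbits(32)
    syn = {"SYN": 1, "seq": client_isn}
    synack = {"SYN": 1, "ACK": 1, "seq": server_isn,
              "ack": (client_isn + 1) % 2**32}
    ack = {"SYN": 0, "ACK": 1, "seq": (client_isn + 1) % 2**32,
           "ack": (server_isn + 1) % 2**32}
    return [syn, synack, ack]


for segment in three_way_handshake(client_isn=42, server_isn=79):
    print(segment)
```

With the Telnet example's starting numbers (42 and 79), the SYNACK carries ack 43 and the final segment carries ack 80.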
TCP Connection Management
• The SYN bit stays 0 while the connection is open
  – Why is a three-way handshake used?
  – Why isn’t two-way enough?
• Now look at closing the connection
  – Either client or server can close the connection
TCP Connection Management
• One host, let’s say the client, sends a segment with the FIN bit set to 1
• The server acknowledges this with a return segment, then sends a separate shutdown segment (also with FIN=1)
• Client acknowledges the shutdown from the server, and resources in both hosts are deallocated
TCP State Cycle
• Another way to view the history of a TCP connection is through its state changes (Fig 3.41, 3.42)
  – The connection starts Closed
  – After the handshake is completed it’s Established
    • Then the processes communicate
  – Sending or receiving a FIN=1 starts the closing process, until both sides get back to Closed
    • Whoever sent the first FIN waits some period (30-120 s) after ACKing the other host’s FIN before closing their connection
Stray Segments
• Receiving a segment with SYN trying to open an unknown or closed port results in:
  – Server sends a reset message; RST=1, meaning “go away, that port isn’t open”
• Similarly, a UDP packet with unknown socket results in sending a special ICMP datagram (see next chapter)
Stray Segments
• So mapping ports on a system could yield three responses
  – Get a TCP SYNACK, implying the port is open and some app is using it
  – Get a TCP RST segment, meaning the port is closed
  – No response, implying the port could be blocked by a firewall
SYN Flood Attacks
• The TCP handshake is the basis for an attack called the SYN flood
  – Have one or more computers send lots of SYN messages to a server – but spoof the return IP address so the connection is never finished
  – Makes the server waste resources waiting for you; can crash it if done fast enough
  – A defense against this is the SYN cookie
SYN cookie
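• When a SYN segment is received, the server creates an initial sequence number (the cookie) that is a hash function of the source and destination IP addresses and port numbers, plus a secret number known only to the server
  – It sets up nothing else!
  – When it receives the ACK response, it recomputes the hash and checks that the ACK value is the cookie + 1; only then does it set up the connection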
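The idea can be sketched with Python's standard hmac and hashlib modules. Treat this as a simplified illustration, not the real scheme: production SYN cookies also fold in a timestamp and an encoded MSS, and the secret value, the function names, and the choice of HMAC-SHA256 truncated to 32 bits are all assumptions made for this example.

```python
import hashlib
import hmac

SECRET = b"server-only secret"   # hypothetical; a real server keeps this private


def syn_cookie(src_ip, src_port, dst_ip, dst_port, secret=SECRET):
    """Derive a 32-bit initial sequence number from the connection's addresses."""
    msg = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def ack_is_valid(ack, src_ip, src_port, dst_ip, dst_port, secret=SECRET):
    """A legitimate client ACKs cookie + 1; the server recomputes instead of storing."""
    expected = (syn_cookie(src_ip, src_port, dst_ip, dst_port, secret) + 1) % 2**32
    return ack == expected
```

Because the cookie can be recomputed from the ACK segment's own addresses, the server holds no per-connection state until the handshake finishes, which is what blunts the flood.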
Congestion Control
• Now address congestion control issues
  – Congestion is a traffic jam in the middle of the network somewhere
  – Most common cause is too many sources sending data too fast into the network
Congestion Control
• Key lessons from cases b and c are:
  – A congested network forces retransmissions for packets lost due to buffer overflow, which adds to the congestion
  – A congested network can waste its bandwidth by sending duplicate packets which weren’t lost in the first place
Congestion Control
• (skipping the big messy example)
• The lesson is: dropping a packet wastes the transmission capacity of every upstream link that packet saw
• So what are our approaches for dealing with congestion?
Congestion Control Approaches
• Either the network provides explicit support for congestion control, or it doesn’t
  – End-to-end congestion control is when the network doesn’t provide explicit support
    • Presence of congestion is inferred from packet loss, delays, etc.
    • Since TCP uses IP, this is our only option right now
Congestion Control Approaches
  – Network-assisted congestion control is when network components (e.g. routers) provide congestion feedback explicitly
    • IBM SNA, DECnet, and ATM use this, and proposals for improving TCP/IP have been made
    • Network equipment may provide various levels of feedback
      – Send a choke packet to tell sender they’re full
      – Flag existing packets to indicate congestion
      – Tell what transmission rate the router can support at the moment
ATM ABR Congestion Control
• ATM Available Bit-Rate (ABR) is one method of network-assisted congestion control
  – It uses a combination of virtual circuits (VC) and resource management (RM) cells (packets) to convey congestion information along the VC
  – Data cells (packets) contain a congestion bit to prompt sending a RM cell back to the sender
  – Other bits convey whether the congestion is mild (don’t increase traffic) or severe (back off) or tell the max rate supported along the circuit
TCP Congestion Control
• As noted, TCP uses end-to-end congestion control, since IP provides no congestion feedback to the end systems
  – In TCP, each sender limits its send rate based on its perceived amount of congestion
• Each side of a TCP connection has a send buffer, receive buffer, and several variables
• Each side also has a congestion window variable, CongWin (or cwnd)
TCP Congestion Control
• The amount of un-ACKed data a sender may have in flight is limited by the minimum of CongWin and the RcvWindow
  – LastByteSent – LastByteAcked <= min(CongWin, RcvWindow)
• Assume for the moment that the RcvWindow is large, so we can focus on CongWin
  – If loss and transmission delay are small, CongWin bytes of data can be sent every RTT, for a send rate of CongWin/RTT
TCP Congestion Control
• Now address how to detect congestion
• Call a “loss event” when a timeout occurs or three duplicate ACKs are received
  – Congestion causes loss events in the network
• If there’s no congestion, lots of happy ACKs tell TCP to increase CongWin quickly, and hence transmission rate
  – Conversely, slow ACK receipt slows CongWin increase
TCP Congestion Control
• TCP is self-clocking, since it measures its own feedback (ACK receipt) to determine changes in CongWin
• Now look at how TCP defines its congestion control algorithm in three parts
  – Additive-increase, multiplicative-decrease
  – Slow start
  – Reaction to timeout events
Additive-increase, Multiplicative-decrease
• When a loss event occurs, CongWin is halved, but is never allowed to drop below 1 MSS; this process is called multiplicative-decrease
• When there’s no perceived congestion, TCP increases CongWin slowly, adding 1 MSS each RTT – this is additive-increase
• Collectively they are the AIMD algorithm
Recall MSS = maximum segment size
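AIMD is easy to play with in a few lines. This is a rough sketch under stated assumptions: windows are measured in MSS units, one update happens per RTT, and the caller decides which rounds contain a loss event. The names aimd and trace are made up for the example.

```python
def aimd(cong_win, loss_event, mss=1):
    """One RTT of AIMD; cong_win and mss are in the same units."""
    if loss_event:
        return max(cong_win / 2, mss)   # multiplicative decrease, floor of 1 MSS
    return cong_win + mss               # additive increase


def trace(start, loss_at, rounds, mss=1):
    """Window size at the start of each round, with losses in the rounds in loss_at."""
    window, history = start, []
    for r in range(rounds):
        history.append(window)
        window = aimd(window, r in loss_at, mss)
    return history


print(trace(8, {4, 9}, 12))
```

The printed list climbs by 1 MSS per round and drops by half after rounds 4 and 9, which is the sawtooth the next slide describes.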
AIMD Algorithm
• Over a long TCP connection, when there’s little congestion, AIMD will result in slow rises in CongWin, followed by a cut in half when a loss event occurs; repeated, that produces a grumpy sawtooth wave
[Figure: congestion window (axis marked 8, 16, and 24 Kbytes) versus time, showing the AIMD sawtooth]
Slow Start
• The initial send rate is typically 1 MSS/RTT, which is really slow
• To avoid a really long ramp up to a fast rate, an exponential increase in CongWin is used until the first loss event occurs
  – CongWin doubles every RTT during slow start
• Then the AIMD algorithm takes over
Reaction to Timeout
• Timeouts are not handled the same as triple duplicate ACKs
  – Triple duplicate ACKs are followed by: halve CongWin, then use AIMD approach
  – But true timeout events are handled differently
• The TCP sender returns to slow start (CongWin = 1 MSS), and if no problems occur, ramps up exponentially to half of the value CongWin had before the timeout occurred
  – A variable Threshold stores the 0.5*CongWin value when a loss event occurs
Reaction to Timeout
  – Once CongWin gets back to the Threshold value, it is allowed to increase linearly per AIMD
• So after a triple duplicate ACK, CongWin recovers faster (called a fast recovery, oddly enough) than after a timeout
  – Why do this? Because the triple duplicate ACK proves that several other packets got there successfully, even if one was lost
  – A timeout is a more severe congestion indicator, hence the slower recovery of CongWin
TCP Tahoe & Reno
• TCP Tahoe follows the timeout recovery pattern after any loss event
  – Go back to CongWin = 1 MSS, ramp up exponentially until reach Threshold, then follow AIMD
• TCP Reno introduced the fast recovery from triple duplicate ACK (use this)
  – After a triple duplicate ACK, cut CongWin in half, and resume linear increase until next loss event; repeat
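The difference between Tahoe and Reno comes down to one branch. The sketch below is a simplification with assumptions called out: windows are whole numbers of MSS, one step is one RTT, slow start is clamped so it never overshoots Threshold, and the function name next_window and the event labels are invented for this example.

```python
def next_window(variant, cong_win, threshold, event=None):
    """One RTT step. variant is 'tahoe' or 'reno'; event is None, 'timeout' or '3dupack'.

    Returns the new (cong_win, threshold), both in MSS units.
    """
    if event is not None:
        threshold = max(cong_win // 2, 1)        # Threshold = half the window at the loss
        if event == "timeout" or variant == "tahoe":
            return 1, threshold                  # back to slow start from 1 MSS
        return threshold, threshold              # Reno fast recovery: just halve
    if cong_win < threshold:
        return min(cong_win * 2, threshold), threshold   # slow start: double per RTT
    return cong_win + 1, threshold               # AIMD: add 1 MSS per RTT


print(next_window("tahoe", 12, 8, "3dupack"))    # (1, 6)
print(next_window("reno", 12, 8, "3dupack"))     # (6, 6)
```

Both variants treat a timeout the same way; only the response to a triple duplicate ACK differs.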
TCP Tahoe & Reno
[Figure: CongWin versus transmission round for Tahoe and Reno] Assumes a loss event in transmission round 8, when CongWin is 12 MSS; shows how Tahoe and Reno respond differently.
The new Threshold is 12/2 = 6 MSS
TCP Throughput
• Other variations exist, e.g. TCP Vegas
• If the sawtooth pattern continues, with a loss event occurring at the same congestion window size consistently, then the window ramps linearly between W/2 and W, and the average throughput (rate) is
  – Average throughput = 0.75*W/RTT
where W is the CongWin size when the loss event occurs
TCP Future
• TCP will keep changing to meet the needs of the Internet
• Obviously, many critical Internet apps depend on TCP, so there are always changes being proposed
  – See RFC Index for current ideas
• For example, many want to support very high data rates (e.g. 10+ Gbps)
TCP Future
• In order to support that rate with 1500-byte segments and a 100 ms RTT, the congestion window would have to be 83,333 segments
  – And not lose any of them!
• If we have the loss rate (L) and MSS, we can derive
  – Average throughput = 1.22*MSS/(RTT*sqrt(L))
• For 10 Gbps throughput, we need L about 2x10^-10, or lose one segment in five billion!
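The two numbers on this slide can be reproduced with a little arithmetic. The slide doesn't state the segment size or RTT, so the 1500-byte segments and 100 ms RTT used below are assumptions (they are the values that give 83,333 segments); the function names are made up for the example.

```python
def window_segments(rate_bps, rtt_s, mss_bytes):
    """Segments that must be in flight to sustain rate_bps over one RTT."""
    return rate_bps * rtt_s / (8 * mss_bytes)


def required_loss_rate(rate_bps, rtt_s, mss_bytes):
    """Solve rate = 1.22*MSS/(RTT*sqrt(L)) for L, with MSS converted to bits."""
    return (1.22 * 8 * mss_bytes / (rtt_s * rate_bps)) ** 2


w = window_segments(10e9, 0.1, 1500)
loss = required_loss_rate(10e9, 0.1, 1500)
print(round(w))        # 83333
print(loss)            # roughly 2.1e-10
print(round(1 / loss)) # roughly one segment in 4.7 billion
```

Halving the RTT in these formulas halves the window needed and quadruples the tolerable loss rate, which is one reason short paths are so much kinder to TCP.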
Fairness
• If a router has multiple connections competing for bandwidth, is it fair in sharing?
• If two TCP connections of equal MSS and RTT are sharing a router, and both are primarily in AIMD mode, the throughput for each connection will tend to balance fairly, with cyclical changes in throughput due to changes in CongWin after packet drops
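A toy simulation shows why two AIMD connections drift toward an equal share. Everything here is an idealization assumed for the example: both connections have the same RTT, both see a loss event in exactly the rounds where their combined window exceeds the link capacity, and windows are in MSS units. The function name share_link is made up.

```python
def share_link(w1, w2, capacity, rounds):
    """Two AIMD flows on one bottleneck; returns their windows after `rounds` RTTs."""
    for _ in range(rounds):
        if w1 + w2 > capacity:          # link overloaded: both flows see a loss
            w1, w2 = w1 / 2, w2 / 2     # multiplicative decrease halves the gap too
        else:
            w1, w2 = w1 + 1, w2 + 1     # additive increase leaves the gap unchanged
    return w1, w2


print(share_link(1, 80, 100, 1000))
```

The gap between the two windows is untouched by additive increase and halved by every loss event, so after enough cycles the two values printed are essentially equal, even though one flow started 80 times larger.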
Fairness
• More realistically, unequal connections are less fair
  – Lower RTT gets more bandwidth (CongWin increases faster)
  – UDP traffic can force out the more polite TCP traffic
  – Multiple TCP connections from a single host (e.g. from downloading many parts of a Web page at once) get more bandwidth
Are We Done Yet?
• So we’ve covered transport layer protocols from the terribly simple UDP to a seemingly exhaustive study of TCP
  – Key features along the way include multiplexing/demultiplexing, error detection, acknowledgements, timers, retransmissions, sequence numbers, connection management, flow control, end-to-end congestion control
  – So much for the “edge” of the Internet; next is the network layer, to start looking at the core