Transport Layer ECE544: Communication Networks-II, Spring 2013

Post on 03-Jan-2016


Transport Layer, ECE544: Communication Networks-II, Spring 2013. Tam Vu, WINLAB, Dept. of Computer Science, Rutgers University. Includes teaching materials from L. Peterson, Sumathi Gopal and Sumit Rangwala, D. Raychaudhuri, Mike Freedman.


IP Protocol Stack: Key Abstractions

Problem: Network Layer (IP) provides only best-effort communication services
Layer: abstraction it provides
  Application: Applications
  Transport: Reliable streams, Messages
  Network: Best-effort global packet delivery
  Link: Best-effort local packet delivery

Application requirements vs. IP layer limitations
Guarantee message delivery
  Network may drop messages.
Deliver messages in the same order they are sent
  Messages may be reordered in the network and incur a long delay.
Deliver at most one copy of each message
  Messages may be duplicated in the network.
Support arbitrarily large messages
  Network may limit message size.
Support synchronization between sender and receiver
Allow the receiver to apply flow control to the sender
Support multiple application processes on each host
  Network only supports communication between hosts
Many more

IP Protocol Stack: Key Abstractions

Transport layer:
  Provides applications with good abstractions
  Without support or feedback from the network
  Is the lowest layer in the network stack that is an end-to-end protocol


Transport Protocols

Logical communication between processes
  Sender divides a message into segments
  Receiver reassembles segments into message
Transport services
  (De)multiplexing packets
  Detecting corrupted data
  Optionally: reliable delivery, flow control, …

Two Basic Transport Features
Demultiplexing: port numbers
  [Figure: a Client on the client host sends a service request for 128.2.194.242:80 (i.e., the Web server); on server host 128.2.194.242 the OS hands it to the Web server (port 80) rather than the Echo server (port 7)]
Error detection: checksums
  [Figure: checksum over the IP payload to detect corruption]

Most Popular Transport Protocols
User Datagram Protocol (UDP)
  Supports multiple application processes on each host
  Option to check messages for correctness with a checksum
Transmission Control Protocol (TCP)
  Ensures reliable delivery of packets between source and destination processes
  Ensures in-order delivery of packets to destination process
  Other services
Real-time Transport Protocol (RTP)
  Serves real-time multimedia applications
  Moves decision making to the applications
  Runs over UDP

User Datagram Protocol (UDP)

Service: Support for multiple processes on each host to communicate
Issue: IP only provides communication between hosts (IP addresses)
Solution
  Add port number and associate a process with a port number
  4-Tuple Unique Connection Identifier: [SrcPort, SrcIPAddr, DestPort, DestIPAddr]
Lightweight communication between processes
  Send and receive messages
  Avoid overhead of ordered, reliable delivery
  No connection setup delay and no in-kernel connection state
Used by popular apps
  Query/response for DNS
  Real-time data in VoIP
UDP header (bit positions 0, 16, 31)
  SrcPort (bits 0-15) | DstPort (bits 16-31)
  Length (bits 0-15) | Checksum (bits 16-31)
  Payload
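The header layout and port-based demultiplexing described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `build_datagram`, `demultiplex`, and `port_table` are hypothetical names, the checksum field is left at 0, and the "processes" are plain functions keyed by destination port.

```python
import struct

# Four 16-bit fields in network byte order: SrcPort, DstPort, Length, Checksum.
UDP_HEADER = struct.Struct("!HHHH")

def build_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend an 8-byte UDP header; the checksum is left as 0 in this sketch."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload

def demultiplex(datagram: bytes, port_table: dict) -> str:
    """Hand the payload to whichever handler is bound to the destination port."""
    _src, dst_port, length, _csum = UDP_HEADER.unpack_from(datagram)
    payload = datagram[UDP_HEADER.size:length]
    handler = port_table.get(dst_port)
    if handler is None:
        return "no process bound to port %d" % dst_port
    return handler(payload)

# Hypothetical bindings, mirroring the echo (7) and web (80) servers in the deck.
port_table = {
    7: lambda data: "echo: " + data.decode(),
    80: lambda data: "web: " + data.decode(),
}
```

For example, `demultiplex(build_datagram(5000, 80, b"GET /"), port_table)` reaches the handler registered for port 80.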

User Datagram Protocol (UDP): Error Detection

Service: Ensure message correctness
Issue: Packet corruption in transit
Solution
  Use checksum. Includes UDP header, payload, pseudo header
  Pseudo header: protocol number, source IP address, destination IP address, and UDP length
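A minimal sketch of the checksum computation, assuming IPv4 and the standard 16-bit one's-complement Internet checksum. The function names are hypothetical; the pseudo header is built from the fields the slide lists (source and destination IP address, protocol number 17 for UDP, and UDP length).

```python
import socket
import struct

def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum, padding odd-length input with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total

def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    # Source IP, destination IP, zero byte, protocol number (17 = UDP), UDP length.
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_length))

def udp_segment(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> bytes:
    """Build header + payload with the checksum field filled in."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, length) + header + payload)
    checksum = (~total & 0xFFFF) or 0xFFFF  # a computed 0 is transmitted as 0xFFFF
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload

def checksum_ok(src_ip, dst_ip, segment: bytes) -> bool:
    """Receiver check: summing everything, checksum included, must give 0xFFFF."""
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    return total == 0xFFFF
```

Because the pseudo header is part of the sum, a segment delivered to the wrong IP address fails the check even if its own bytes are intact.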

Transmitting a stream of bytes?
Stream-of-bytes service
  Sends and receives a stream of bytes
Reliable, in-order delivery
  Corruption: checksums
  Detect loss/reordering: sequence numbers
  Reliable delivery: acknowledgments and retransmissions
Connection oriented
  Explicit set-up and tear-down of TCP connection
Flow control
  Prevent overflow of the receiver’s buffer space
Congestion control
  Adapt to network congestion for the greater good

Transmission Control Protocol (TCP)
First proposed by Vinton Cerf and Robert Kahn, 1974
TCP/IP enabled computers of all sizes, from different vendors and with different OSs, to communicate with each other.
Accounts for 80% of all traffic on the Internet
Reliable, in-order delivery, connection-oriented, byte-stream service

Starting and Ending a Connection: TCP Handshakes

Establishing a TCP Connection

Three-way handshake to establish connection
  Host A sends a SYN (open) to the host B
  Host B returns a SYN acknowledgment (SYN ACK)
  Host A sends an ACK to acknowledge the SYN ACK
Each host tells its Initial Sequence Number (ISN) to the other host.
[Figure: timeline between A and B showing SYN, SYN ACK, ACK, then Data in both directions]

TCP Header

  Source port | Destination port
  Sequence number
  Acknowledgment
  HdrLen | 0 | Flags | Advertised window
  Checksum | Urgent pointer
  Options (variable)
  Data
Flags: SYN, FIN, RST, PSH, URG, ACK

Step 1: A’s Initial SYN Packet

  A’s port | B’s port
  A’s Initial Sequence Number
  Acknowledgment
  HdrLen 20 | 0 | Flags | Advertised window
  Checksum | Urgent pointer
  Options (variable)
Flags: SYN, FIN, RST, PSH, URG, ACK
A tells B it wants to open a connection…

Step 2: B’s SYN-ACK Packet

  B’s port | A’s port
  B’s Initial Sequence Number
  Acknowledgment: A’s ISN plus 1
  HdrLen 20 | 0 | Flags | Advertised window
  Checksum | Urgent pointer
  Options (variable)
Flags: SYN, FIN, RST, PSH, URG, ACK
B tells A it accepts, and is ready to hear the next byte…
… upon receiving this packet, A can start sending data

Step 3: A’s ACK of the SYN-ACK

  A’s port | B’s port
  Sequence number
  Acknowledgment: B’s ISN plus 1
  HdrLen 20 | 0 | Flags | Advertised window
  Checksum | Urgent pointer
  Options (variable)
Flags: SYN, FIN, RST, PSH, URG, ACK
A tells B it is okay to start sending
… upon receiving this packet, B can start sending data
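The three handshake steps can be traced with a small sketch that only tracks the sequence and acknowledgment fields. The function name and the dict layout are assumptions made for illustration; real segments carry the full TCP header.

```python
def three_way_handshake(isn_a: int, isn_b: int) -> list:
    """Return the SYN, SYN ACK, and ACK segments as dicts.

    Sequence numbers live in a 32-bit space, so "ISN plus 1" wraps modulo 2**32.
    """
    mod = 2 ** 32
    syn = {"from": "A", "flags": {"SYN"}, "seq": isn_a, "ack": None}
    syn_ack = {"from": "B", "flags": {"SYN", "ACK"},
               "seq": isn_b, "ack": (isn_a + 1) % mod}
    ack = {"from": "A", "flags": {"ACK"},
           "seq": (isn_a + 1) % mod, "ack": (isn_b + 1) % mod}
    return [syn, syn_ack, ack]
```

With `isn_a = 1000` and `isn_b = 5000`, B acknowledges 1001 in step 2 and A acknowledges 5001 in step 3.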

SYN Loss and Web Downloads

Upon sending SYN, sender sets a timer
  If SYN lost, timer expires before SYN-ACK received
  Sender retransmits SYN
How should the TCP sender set the timer?
  No idea how far away the receiver is
  Some TCPs use default of 3 or 6 seconds
Implications for web download
  User gets impatient and hits reload
  … User aborts connection, initiates new socket
  Essentially, forces a fast send of a new SYN!

Tearing Down the Connection

Closing (each end of) the connection
  Finish (FIN) to close and receive remaining bytes
  And other host sends a FIN ACK to acknowledge
  Reset (RST) to close and not receive remaining bytes
[Figure: timeline between A and B over time showing SYN, SYN ACK, ACK, Data, then FIN and ACK in each direction]

Sending/Receiving the FIN Packet

Sending a FIN: close()
  Process is done sending data via socket
  Process invokes “close()”
  Once TCP has sent all the outstanding bytes…
  … then TCP sends a FIN
Receiving a FIN: EOF
  Process is reading data from socket
  Eventually, read call returns an EOF
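The close()/EOF behavior can be observed with Python's standard `socket` module over the loopback interface. This is a sketch, assuming loopback TCP is available on the machine; `fin_demo` is a hypothetical helper name, and in Python the EOF shows up as `recv` returning an empty bytes object.

```python
import socket

def fin_demo(payload: bytes = b"hello") -> list:
    """Send payload over a loopback TCP connection, close the sender, and
    return every chunk the receiver reads, including the final empty one."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
    listener.listen(1)
    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender.connect(listener.getsockname())
    receiver, _addr = listener.accept()
    receiver.settimeout(5)                # avoid hanging if something goes wrong
    sender.sendall(payload)
    sender.close()                        # outstanding bytes are sent, then a FIN
    chunks = []
    while True:
        data = receiver.recv(1024)
        chunks.append(data)
        if data == b"":                   # empty read: the peer's FIN was received
            break
    receiver.close()
    listener.close()
    return chunks
```

The receiver still gets all bytes written before `close()`; only afterwards does the read call report end of stream.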

Data transmission

TCP: Byte-stream
Service: Byte-stream
  Application reads or writes a stream of bytes to the transport
Issue: IP is packet-oriented
Solution: TCP maintains a local buffer
  Chop the stream into packets and transmit (sender)
  Coalesce data from packets to form a stream (receiver)

TCP “Stream of Bytes” Service

[Figure: Host A writes Byte 0, Byte 1, Byte 2, Byte 3, …, Byte 80 into the stream; Host B reads the same bytes in the same order]

…Emulated Using TCP “Segments”

[Figure: the byte stream from Host A (Byte 0, Byte 1, Byte 2, Byte 3, …, Byte 80) is carried to Host B inside TCP Data segments]
Segment sent when:
  1. Segment full (Max Segment Size),
  2. Not full, but times out, or
  3. “Pushed” by application

TCP Segment
IP packet
  No bigger than Maximum Transmission Unit (MTU)
  E.g., up to 1500 bytes on an Ethernet link
TCP packet
  IP packet with a TCP header and data inside
  TCP header is typically 20 bytes long
TCP segment
  No more than Maximum Segment Size (MSS) bytes
  E.g., up to 1460 consecutive bytes from the stream
[Figure: IP packet = IP Hdr + IP Data; the IP Data holds the TCP Hdr + TCP Data (segment)]
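The 1500-byte and 1460-byte figures are related by the header sizes, and chopping a stream into segments follows directly. A minimal sketch, assuming 20-byte IP and TCP headers with no options; `max_segment_size` and `segment_stream` are hypothetical names.

```python
IP_HEADER = 20   # bytes, assuming no IP options
TCP_HEADER = 20  # bytes, assuming no TCP options

def max_segment_size(mtu: int) -> int:
    """Largest TCP payload that fits in one IP packet of size mtu."""
    return mtu - IP_HEADER - TCP_HEADER

def segment_stream(stream: bytes, first_seq: int, mss: int) -> list:
    """Chop a byte stream into (sequence_number, data) pairs of at most mss bytes.

    Each segment's sequence number is the number of its first byte.
    """
    return [(first_seq + offset, stream[offset:offset + mss])
            for offset in range(0, len(stream), mss)]
```

For a 1500-byte Ethernet MTU this gives an MSS of 1460, so a 3000-byte write becomes segments of 1460, 1460, and 80 bytes.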

Sequence Number
[Figure: Host A sends TCP Data segments to Host B; numbering starts from the ISN (initial sequence number), and each segment's sequence number = number of its 1st byte, e.g. Byte 81]

Reliable Delivery on a Lossy Channel With Bit Errors

Challenges of Reliable Data Transfer

Over a perfectly reliable channel: Done
Over a channel with bit errors
  Receiver detects errors and requests retransmission
Over a lossy channel with bit errors
  Some data missing, others corrupted
  Receiver cannot easily detect loss
Over a channel that may reorder packets
  Receiver cannot easily distinguish loss vs. out-of-order

An Analogy
Alice and Bob are talking
What if Bob couldn’t understand Alice?
  Bob asks Alice to repeat what she said
What if Bob hasn’t heard Alice for a while?
  Is Alice just being quiet? Has she lost reception?
  How long should Bob just keep on talking?
  Maybe Alice should periodically say “uh huh”
  … or Bob should ask “Can you hear me now?”

Take-Aways from the Example
Acknowledgments from receiver
  Positive: “okay” or “uh huh” or “ACK”
  Negative: “please repeat that” or “NACK”
Retransmission by the sender
  After not receiving an “ACK”
  After receiving a “NACK”
Timeout by the sender (“stop and wait”)
  Don’t wait forever without some acknowledgment

TCP Support for Reliable Delivery
Detect bit errors: checksum
  Used to detect corrupted data at the receiver
  …leading the receiver to drop the packet
Detect missing data: sequence number
  Used to detect a gap in the stream of bytes
  ... and for putting the data back in order
Recover from lost data: retransmission
  Sender retransmits lost or corrupted data
  Two main ways to detect lost packets

TCP Acknowledgments
[Figure: Host A sends TCP Data segments to Host B, numbered from the ISN (initial sequence number)]
Sequence number = 1st byte
ACK sequence number = next expected byte

Automatic Repeat reQuest (ARQ)

ACK and timeouts
  Receiver sends ACK when it receives packet
  Sender waits for ACK and times out
Simplest ARQ protocol
  Stop and wait
  Send a packet, stop and wait until ACK arrives
[Figure: timeline between Sender and Receiver over Time: the Packet is sent and its ACK returns before the Timeout expires]

Quick TCP Math
• Initial Seq No = 501. Sender sends 4500 bytes, successfully acknowledged. Next sequence number to send is:
  (A) 4501 (B) 5000 (C) 5001 (D) 5002
• Next 1000 byte TCP segment received. Receiver acknowledges with ACK number:
  (A) 5001 (B) 6000 (C) 6001
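The two quiz questions follow from the rules stated earlier in the deck: a segment's sequence number is the number of its first byte, and the ACK number is the next expected byte. A small sketch with hypothetical helper names makes the arithmetic explicit.

```python
def next_seq(first_seq: int, bytes_sent: int) -> int:
    """Sequence number of the next byte to send after bytes_sent bytes."""
    return first_seq + bytes_sent

def ack_for(segment_seq: int, segment_len: int) -> int:
    """ACK number after receiving an in-order segment: the next expected byte."""
    return segment_seq + segment_len

seq = next_seq(501, 4500)   # bytes 501..5000 were sent, so the next byte is 5001
ack = ack_for(seq, 1000)    # bytes 5001..6000 arrive, so the receiver expects 6001
```

So the answers are (C) 5001 for the first question and (C) 6001 for the second.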

Flow Control: TCP Sliding Window

Sliding Window: Motivation
Stop-and-wait is inefficient
  Only one TCP segment is “in flight” at a time
Consider: 1.5 Mbps link with 50 ms round-trip-time (RTT)
  Assume segment size of 1 KB (8 Kbits)
  8 Kbits/segment at 50 msec/segment = 160 Kbps
  That’s 11% of the capacity of 1.5 Mbps link
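The 160 Kbps and 11% figures can be reproduced directly. The sketch below assumes, as the slide does, that 1 KB is rounded to 8000 bits and that transmission time is negligible next to the RTT; the function names are hypothetical.

```python
def stop_and_wait_throughput(segment_bits: float, rtt_seconds: float) -> float:
    """One segment per round trip, in bits per second."""
    return segment_bits / rtt_seconds

def utilization(throughput_bps: float, link_bps: float) -> float:
    """Fraction of the link capacity actually used."""
    return throughput_bps / link_bps

def segments_to_fill_link(link_bps: float, rtt_seconds: float, segment_bits: float) -> float:
    """Bandwidth-delay product expressed in segments."""
    return link_bps * rtt_seconds / segment_bits

throughput = stop_and_wait_throughput(8000, 0.050)   # about 160,000 bits/s
share = utilization(throughput, 1.5e6)               # about 0.107, i.e. roughly 11%
window = segments_to_fill_link(1.5e6, 0.050, 8000)   # about 9.4 segments in flight
```

The last value shows why a window of several segments is needed before the link is kept busy.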

Sliding Window
Allow a larger amount of data “in flight”
  Allow sender to get ahead of the receiver
  … though not too far ahead
[Figure: the sending process writes into the sender's TCP buffer (Last byte ACKed, Last byte sent, Last byte written); the receiving process reads from the receiver's TCP buffer (Last byte read, Next byte expected, Last byte received)]

Receiver Buffering
Receive window size
  Amount that can be sent without acknowledgment
  Receiver must be able to store this amount of data
Receiver tells the sender the window
  Tells the sender the amount of free space left
[Figure: the sender's byte stream split into Data ACK’d, Outstanding un-ack’d data, Data OK to send, and Data not OK to send yet, with the Window Size marked]

TCP: Flow Control
Flow Control
  “Prevent sender from overrunning the capacity (buffer) of the receiver”
Solution: Use adaptive receiver window size
  Goal is to keep (C) - (A) <= MaxRcvBuffer
  Every packet carries ACK and AdvertisedWindow
[Figure: the Sending Appl writes into the sender's TCP buffer with pointers (J) LastByteAcked, (K) LastByteSent, (I) LastByteWritten; the Receiving Appl reads from the receiver's TCP buffer with pointers (A) LastByteRead, (B) NextByteExpected, (C) LastByteRcvd]
Receiver: AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead)
Sender: LastByteSent (K) - LastByteAcked (J) <= AdvertisedWindow
Sender: EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
Sender: LastByteWritten - LastByteAcked <= MaxSendBuffer
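The window formulas on this slide translate directly into code. This is a sketch using the slide's variable names as parameters; clamping the effective window at zero is an added assumption, so that a shrinking advertisement never yields a negative amount to send.

```python
def advertised_window(max_rcv_buffer: int, next_byte_expected: int,
                      last_byte_read: int) -> int:
    """Receiver side: free buffer space that the receiver reports to the sender."""
    return max_rcv_buffer - ((next_byte_expected - 1) - last_byte_read)

def effective_window(advertised: int, last_byte_sent: int,
                     last_byte_acked: int) -> int:
    """Sender side: how many more bytes may be sent right now.

    Clamped at 0 (an assumption of this sketch, not stated on the slide).
    """
    return max(0, advertised - (last_byte_sent - last_byte_acked))
```

For example, with a 4096-byte buffer, 1000 bytes received in order and none read yet, the receiver advertises 3096 bytes; a sender with 2000 unacknowledged bytes may then send 1096 more.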

Optimizing Retransmissions


Reasons for Retransmission

[Figure: sender/receiver timelines, each showing Packet, ACK, and Timeout, for three cases]
  ACK lost: the retransmission arrives as a DUPLICATE PACKET
  Packet lost: the sender retransmits after the Timeout
  Early timeout: the Timeout fires before the ACK arrives, producing DUPLICATE PACKETS

How Long Should Sender Wait?
Sender sets a timeout to wait for an ACK
  Too short: wasted retransmissions
  Too long: excessive delays when packet lost
TCP sets timeout as a function of the RTT
  Expect ACK to arrive after a “round-trip time”
  … plus a fudge factor to account for queuing
But, how does the sender know the RTT?
  Running average of delay to receive an ACK

TCP Timeout
Issue: RTT in a wide area network varies substantially
Solution: Adaptive Timeout
Original Algorithm:
  EstimatedRTT = α x EstimatedRTT + (1 - α) x SampleRTT
  Timeout = β x EstimatedRTT (β = 2)
Problem
  Does not distinguish whether the ACK is for original transmission or retransmission
  Constant β is not good.
    Assumes constant variance
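The original algorithm is an exponentially weighted moving average. The sketch below uses the slide's β = 2; the value α = 0.875 is an assumption chosen for illustration (the slide gives no value for the smoothing weight), and the function names are hypothetical.

```python
def estimate_rtt(estimated_rtt: float, sample_rtt: float,
                 alpha: float = 0.875) -> float:
    """Blend the old estimate with the new sample; alpha weights the history."""
    return alpha * estimated_rtt + (1 - alpha) * sample_rtt

def timeout(estimated_rtt: float, beta: float = 2.0) -> float:
    """Retransmission timeout as a fixed multiple of the estimate."""
    return beta * estimated_rtt

est = estimate_rtt(100.0, 200.0)   # one 200 ms sample pulls a 100 ms estimate to 112.5
rto = timeout(est)                 # 225.0 ms
```

A single outlier moves the estimate only slightly, which is the smoothing the running average is meant to provide.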

TCP Timeout
Karn/Partridge Algorithm
  Whenever TCP retransmits a segment, it stops taking samples of the RTT
    Only measure SampleRTT for segments that have been sent only once
  Each time TCP retransmits, set the next timeout to be twice the last timeout
    Relieves congestion
Jacobson/Karels Algorithm: Adaptive variance (uses mean deviation)
  Difference = SampleRTT - EstimatedRTT
  EstimatedRTT = EstimatedRTT + (δ x Difference) → (same as in original)
  Deviation = Deviation + δ x (|Difference| - Deviation)
  Timeout = μ x EstimatedRTT + φ x Deviation
  (default: set μ = 1 and φ = 4)
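One update step of the Jacobson/Karels estimator, together with the Karn/Partridge doubling rule, can be sketched as follows. The defaults μ = 1 and φ = 4 come from the slide; δ = 0.125 is an assumed gain used only for illustration, and the function names are hypothetical.

```python
def jacobson_karels(estimated_rtt: float, deviation: float, sample_rtt: float,
                    delta: float = 0.125, mu: float = 1.0, phi: float = 4.0):
    """Return the updated (EstimatedRTT, Deviation, Timeout) after one sample."""
    difference = sample_rtt - estimated_rtt
    estimated_rtt = estimated_rtt + delta * difference
    deviation = deviation + delta * (abs(difference) - deviation)
    timeout = mu * estimated_rtt + phi * deviation
    return estimated_rtt, deviation, timeout

def backoff(timeout: float) -> float:
    """Karn/Partridge: after a retransmission, double the next timeout."""
    return 2 * timeout
```

Starting from an estimate of 100 ms with a deviation of 10 ms, a 180 ms sample gives an estimate of 110 ms, a deviation of 18.75 ms, and a timeout of 185 ms, so the timeout widens when samples become more variable.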

TCP Deadlock
How the deadlock arises
  The receiver advertises a window size of 0, so the sender stops sending data
  The window size update from the receiver is lost
To solve it:
  The sender starts the persist timer when AdvertisedWindow = 0
  When the persist timer expires, the sender sends a small packet to probe the window

Triggering Transmission
When to transmit a segment:
  Small segments are subject to large overhead
  Reach max segment size (MSS): the size of the largest segment TCP can send without causing the local IP to fragment
    MSS = local MTU - IP & TCP header
  The sending process explicitly asks TCP to transmit, “push”

Congestion

When the network cannot support the sender’s rate
  Queues at the network elements overflow
Even with flow control, packets might not reach the destination
[Figure: Source1, Source2, and Source3 send through the network toward Dest1 and Dest2]

Congestion Control vs. Flow Control
Congestion Control
  Mechanism to prevent sender from overrunning the capacity of the network
  When network is the bottleneck
Flow Control
  Mechanism to prevent sender from overrunning the capacity of the receiver
  When receiver is the bottleneck

Congestion Control: Design Approach
Maintain another window at the sender called CongestionWindow (cwnd)
  CongestionWindow is the max number of packets allowed in the network
  Number of unACKed packets at the sender.
Key: How to calculate congestion window (cwnd)
  Various approaches possible
  TCP estimates it based on observed packet losses
    Assumes packet loss as indication of congestion
Since we don’t know whether the network or the receiver is the bottleneck
  MaxWindow = MIN(CongestionWindow, AdvertisedWindow)
  EffectiveWin = MaxWindow - (LastByteSent - LastByteAcked)
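Combining the two limits is a one-line calculation. In this sketch `sendable_bytes` is a hypothetical name, all quantities are in bytes, and the clamp at zero is an added assumption.

```python
def sendable_bytes(congestion_window: int, advertised_window: int,
                   last_byte_sent: int, last_byte_acked: int) -> int:
    """EffectiveWin: the tighter of the two windows minus the data in flight."""
    max_window = min(congestion_window, advertised_window)
    in_flight = last_byte_sent - last_byte_acked
    return max(0, max_window - in_flight)   # clamp: an assumption of this sketch
```

With 3000 bytes in flight, a cwnd of 8000 and an advertised window of 16000 leave 5000 bytes to send (network-limited), while a cwnd of 20000 and an advertised window of 4000 leave only 1000 (receiver-limited).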

Congestion Avoidance: (AIMD)
If no congestion in the network (increase conservatively)
  Increase the congestion window additively every RTT
    Every RTT: w = w + 1 (w = cwnd in segments)
    Every ACK reception: w = w + 1/w (w = cwnd in segments)
    Every ACK reception: cwnd = cwnd + MSS*(MSS/cwnd) (cwnd in bytes)
If congestion in the network (decrease aggressively)
  Decrease the congestion window multiplicatively, immediately
    cwnd = cwnd/2 (cwnd in bytes)
How is congestion detected?
  Estimated (more later)
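The byte-based update rules can be written as two small functions. This is a sketch with hypothetical names; it shows that applying the per-ACK rule once for each segment in a window adds roughly one MSS per RTT.

```python
def on_ack(cwnd: float, mss: int) -> float:
    """Additive increase: cwnd = cwnd + MSS * (MSS / cwnd), cwnd in bytes."""
    return cwnd + mss * mss / cwnd

def on_congestion(cwnd: float) -> float:
    """Multiplicative decrease: halve the window."""
    return cwnd / 2

def one_rtt_of_acks(cwnd: float, mss: int) -> float:
    """Apply on_ack once per segment that was in flight at the start of the RTT."""
    for _ in range(int(cwnd // mss)):
        cwnd = on_ack(cwnd, mss)
    return cwnd
```

Starting from ten segments of 1460 bytes, one round of ACKs raises cwnd by a little less than one MSS, because each increment is computed against an already larger window.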

Congestion Avoidance: (AIMD)
TCP’s sawtooth pattern
Issues with additive increase
  Takes too long to ramp up a connection from the beginning
  The entire advertised window may be reopened when a lost packet is retransmitted and a single cumulative ACK is received by the sender
[Figure: Congestion Window Size vs. Time, showing the sawtooth and the startup time]

TCP “Slow Start”: To start quickly!
Maintain another variable, slow start threshold (ssthresh)
  Last known stable rate
  If (cwnd > ssthresh)
    State = congestion avoidance
  Else
    State = slow start
In slow start
  Increase the congestion window exponentially every RTT
    Every ACK reception: w = w + 1 (w = cwnd in segments)
    Every ACK reception: cwnd = cwnd + MSS (cwnd in bytes)
Key: How is ssthresh calculated?
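A per-RTT trace shows how the two states combine. The sketch below is a simplification under stated assumptions: it works in whole segments, applies the slide's test (cwnd > ssthresh) once per RTT rather than per ACK, and ignores losses. `window_trace` is a hypothetical name.

```python
def window_trace(ssthresh: int, rtts: int, start: int = 1) -> list:
    """Return cwnd (in segments) at the start and after each of rtts round trips."""
    w = start
    trace = [w]
    for _ in range(rtts):
        if w > ssthresh:
            w += 1      # congestion avoidance: one more segment per RTT
        else:
            w *= 2      # slow start: +1 per ACK doubles the window each RTT
        trace.append(w)
    return trace
```

With `ssthresh = 8`, the window doubles while it is at or below the threshold and then grows by one segment per RTT.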

TCP: Congestion Detection and Retransmit
Loss of packet indicates congestion
Timer Timeouts (No ACK)
  Set according to Jacobson/Karels algorithm
  On timer timeout
    ssthresh = max(2*MSS, effwin/2); cwnd = MSS
    Notice this will cause TCP to go into slow start
Issue: takes a long time to detect a packet loss
  Affects throughput
Any other quicker way of detecting a packet loss?
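The timeout reaction is a pair of assignments. A sketch in bytes, with `on_timeout` as a hypothetical name and integer division standing in for effwin/2:

```python
def on_timeout(effwin: int, mss: int):
    """Return the new (ssthresh, cwnd) after a retransmission timeout."""
    ssthresh = max(2 * mss, effwin // 2)   # remember half of the window in use
    cwnd = mss                             # restart from one segment
    return ssthresh, cwnd
```

Because cwnd drops to one MSS, which is below ssthresh, the connection re-enters slow start, as the slide notes.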

Fast Retransmit
Observation: A series of duplicate ACKs might mean a packet loss
Solution
  Every time the receiver receives an out-of-order packet, it sends a duplicate ACK
  Sender retransmits the missing packet after it receives some number of duplicate ACKs (e.g. 3 duplicate ACKs)
Fast Retransmit does not replace timeouts
Issue: Reduces latency (early retransmit) but still incurs loss in throughput (slow start after packet loss)
[Figure: the sender transmits PKT 1 to PKT 6 and PKT 3 is lost; the receiver returns ACK 1, ACK 2, then a duplicate ACK 2 for each of PKT 4, PKT 5, PKT 6; the sender retransmits PKT 3 and the receiver returns ACK 6]
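The ACK pattern in the figure can be reproduced with a small simulation. This is a sketch under simplifying assumptions: packets are numbered 1, 2, 3, … rather than by byte, ACK n means "everything up to packet n has arrived", and both function names are hypothetical.

```python
def cumulative_acks(arrivals: list) -> list:
    """Receiver: for each arriving packet, ACK the highest in-order packet so far."""
    received = set()
    highest_in_order = 0
    acks = []
    for pkt in arrivals:
        received.add(pkt)
        while highest_in_order + 1 in received:
            highest_in_order += 1
        acks.append(highest_in_order)
    return acks

def fast_retransmits(acks: list, threshold: int = 3) -> list:
    """Sender: return the packets retransmitted after `threshold` duplicate ACKs."""
    last_ack, duplicates, retransmitted = None, 0, []
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:
                retransmitted.append(ack + 1)   # the first packet not yet ACKed
        else:
            last_ack, duplicates = ack, 0
    return retransmitted
```

If packet 3 is lost and arrives only after its retransmission, the arrival order 1, 2, 4, 5, 6, 3 produces ACKs 1, 2, 2, 2, 2, 6, and the third duplicate ACK 2 triggers the retransmission of packet 3.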

Fast Recovery
Transmit a packet for every ACK received till the retransmitted packet is ACK’d
  ssthresh = max(2*MSS, cwnd/2); cwnd = ssthresh + 3 (segments, i.e. 3*MSS in bytes)
On every ACK until the ACK of the retransmitted packet
  cwnd = cwnd + 1 (segment, i.e. MSS in bytes)
On reception of ACK of retransmitted packet
  Start congestion avoidance instead of slow start
  cwnd = ssthresh
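The three fast-recovery rules can be sketched as functions over byte counts. The names are hypothetical, and the "+3" and "+1" on the slide are interpreted here as 3*MSS and 1*MSS so that all quantities share the same unit.

```python
def enter_fast_recovery(cwnd: int, mss: int):
    """On the third duplicate ACK: return the new (ssthresh, cwnd)."""
    ssthresh = max(2 * mss, cwnd // 2)
    return ssthresh, ssthresh + 3 * mss   # inflate by the three duplicate ACKs

def on_duplicate_ack(cwnd: int, mss: int) -> int:
    """Each further duplicate ACK means one more packet has left the network."""
    return cwnd + mss

def on_recovery_ack(ssthresh: int) -> int:
    """ACK of the retransmitted packet: deflate and continue in congestion avoidance."""
    return ssthresh
```

Starting from a 14600-byte window with a 1460-byte MSS, ssthresh becomes 7300, cwnd is inflated to 11680, and after recovery cwnd settles at 7300 instead of falling back to a single segment.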

Homework
  5.13 (3rd ed and 4th ed)
  5.16
  5.28
  5.34
  5.39

Due 4/5