Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s...

15
1 Star%ng and Ending a Connec%on: TCP Handshakes 1 Establishing a TCP Connec%on Threeway handshake to establish connec%on Host A sends a SYNchronize (open) to the host B Host B returns a SYN ACKnowledgment (SYN ACK) Host A sends an ACK to acknowledge the SYN ACK 2 SYN SYN ACK ACK Data A B Data Each host tells its ISN to the other host. TCP Header 3 Source port Destination port Sequence number Acknowledgment Advertised window HdrLen Flags 0 Checksum Urgent pointer Options (variable) Data Flags: SYN FIN RST PSH URG ACK Step 1: A’s Ini%al SYN Packet 4 A’s port B’s port A’s Initial Sequence Number Acknowledgment Advertised window 20 Flags 0 Checksum Urgent pointer Options (variable) Flags: SYN FIN RST PSH URG ACK A tells B it wants to open a connec4on…

Transcript of Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s...

Page 1: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

1

Star%ng  and  Ending  a  Connec%on:  TCP  Handshakes  

1

Establishing  a  TCP  Connec%on  

•  Three-­‐way  handshake  to  establish  connec%on  – Host  A  sends  a  SYNchronize  (open)  to  the  host  B  – Host  B  returns  a  SYN  ACKnowledgment  (SYN  ACK)  – Host  A  sends  an  ACK  to  acknowledge  the  SYN  ACK  

2

SYN

SYN ACK

ACK Data

A B

Data

Each host tells its ISN to the other host.!

TCP  Header  

3

Source port Destination port

Sequence number

Acknowledgment

Advertised window HdrLen Flags 0

Checksum Urgent pointer

Options (variable)

Data

Flags: SYN FIN RST PSH URG ACK

Step  1:  A’s  Ini%al  SYN  Packet  

4

A’s port B’s port

A’s Initial Sequence Number

Acknowledgment

Advertised window 20 Flags 0

Checksum Urgent pointer

Options (variable)

Flags: SYN FIN RST PSH URG ACK

A  tells  B  it  wants  to  open  a  connec4on…  

Page 2: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

2

Step  2:  B’s  SYN-­‐ACK  Packet  

5

B’s port A’s port

B’s Initial Sequence Number

A’s ISN plus 1

Advertised window 20 Flags 0

Checksum Urgent pointer

Options (variable)

Flags: SYN FIN RST PSH URG ACK

B  tells  A  it  accepts,  and  is  ready  to  hear  the  next  byte…  

…  upon  receiving  this  packet,  A  can  start  sending  data  

Step  3:  A’s  ACK  of  the  SYN-­‐ACK  

6

A’s port B’s port

B’s ISN plus 1

Advertised window 20 Flags 0

Checksum Urgent pointer

Options (variable)

Flags: SYN FIN RST PSH URG ACK

A tells B it is okay to start sending…!

Sequence number

… upon receiving this packet, B can start sending data!

What  if  the  SYN  Packet  Gets  Lost?  

•  Suppose  the  SYN  packet  gets  lost  – Packet  is  lost  inside  the  network,  or  – Server  rejects  the  packet  (e.g.,  listen  queue  is  full)  

•  Eventually,  no  SYN-­‐ACK  arrives  – Sender  sets  a  %mer  and  wait  for  the  SYN-­‐ACK  – …  and  retransmits  the  SYN  if  needed  

•  How  should  the  TCP  sender  set  the  %mer?  – Sender  has  no  idea  how  far  away  the  receiver  is  – Hard  to  guess  a  reasonable  length  of  %me  to  wait  – Some  TCPs  use  a  default  of  3  or  6  seconds  

7

SYN  Loss  and  Web  Downloads  

•  User  clicks  on  a  hypertext  link  –  Browser  creates  a  socket  and  does  a  “connect”  –  The  “connect”  triggers  the  OS  to  transmit  a  SYN  

•  If  the  SYN  is  lost…  –  The  3-­‐6  seconds  of  delay  may  be  very  long  –  The  user  may  get  impa%ent  – …  and  click  the  hyperlink  again,  or  click  “reload”  

•  User  triggers  an  “abort”  of  the  “connect”  –  Browser  creates  a  new  socket  and  does    a  “connect”  –  Essen%ally,  forces  a  faster  send  of  a  new  SYN  packet!  –  Some%mes  very  effec%ve,  and  the  page  comes  fast  

8

Page 3: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

3

Tearing  Down  the  Connec%on  

•  Closing  (each  end  of)  the  connec%on  – Finish  (FIN)  to  close  and  receive  remaining  bytes  – And  other  host  sends  a  FIN  ACK  to  acknowledge  – Reset  (RST)  to  close  and  not  receive  remaining  bytes  

9

SYN

SYN

ACK

ACK

D

ata

FIN

ACK

ACK

time A

B

FIN A

CK

Sending/Receiving  the  FIN  Packet  

•  Sending  a  FIN:  close()  –  Process  is  done  sending  data  via  the  socket  

–  Process  invokes  “close()”  to  close  the  socket  

– Once  TCP  has  sent  all  of  the  outstanding  bytes…  

– …  then  TCP  sends  a  FIN  

•  Receiving  a  FIN:  EOF  –  Process  is  reading  data  from  the  socket  

–  Eventually,  the  acempt  to  read  returns  an  EOF  

10

11

Flow  Control:  TCP  Sliding  Window  

12

Page 4: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

4

Mo%va%on  for  Sliding  Window  •  Stop-­‐and-­‐wait  is  inefficient  

– Only  one  TCP  segment  is  “in  flight”  at  a  %me  – Esp.  bad  when  delay-­‐bandwidth  product  is  high  

•  Numerical  example  – 1.5  Mbps  link  with  a  45  msec  round-­‐trip  %me  (RTT)  

•  Delay-­‐bandwidth  product  is  67.5  Kbits  (or  8  KBytes)  – But,  sender  can  send  at  most  one  packet  per  RTT  

•  Assuming  a  segment  size  of  1  KB  (8  Kbits)  •  …  leads  to  8  Kbits/seg  /  45  Msec/seg        182  Kbps  •  Just  one-­‐eighth  of  the  1.5  Mbps  link  capacity  

13

Sliding  Window  •  Allow  a  larger  amount  of  data  “in  flight”  

– Allow  sender  to  get  ahead  of  the  receiver  – …  though  not  too  far  ahead  

14

Sending process! Receiving process!

Last byte ACKed!

Last byte sent!

TCP! TCP!

Next byte expected!

Last byte written! Last byte read!

Last byte received!

Receiver  Buffering  •  Window  size  

– Amount  that  can  be  sent  without  acknowledgment  –  Receiver  needs  to  be  able  to  store  this  amount  of  data  

•  Receiver  adver%ses  the  window  to  the  sender  –  Tells  the  sender  the  amount  of  free  space  lel  – …  and  the  sender  agrees  not  to  exceed  this  amount  

15

Window Size

Outstanding Un-ack’d data

Data OK to send

Data not OK to send yet

Data ACK’d

TCP  Header  for  Receiver  Buffering  

16

Source port Destination port

Sequence number

Acknowledgment

Advertised window HdrLen Flags 0

Checksum Urgent pointer

Options (variable)

Data

Flags: SYN FIN RST PSH URG ACK

Page 5: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

5

Conclusions  

•  Transport  protocols  – Mul%plexing  and  demul%plexing  – Checksum-­‐based  error  detec%on  – Sequence  numbers  – Retransmission  – Window-­‐based  flow  control  

•  Next  lecture  – Conges%on  control  

17

Conges%on  Control  Reading:  Chapter  3  

18

Goals  of  Today’s  Lecture  

•  Conges%on  in  IP  networks  – Unavoidable  due  to  best-­‐effort  service  model  –  IP  philosophy:  decentralized  control  at  end  hosts  

•  Conges%on  control  by  the  TCP  senders  –  Infers  conges%on  is  occurring  (e.g.,  from  packet  losses)  –  Slows  down  to  alleviate  conges%on,  for  the  greater  good  

•  TCP  conges%on-­‐control  algorithm  – Addi%ve-­‐increase,  mul%plica%ve-­‐decrease  –  Slow  start  and  slow-­‐start  restart  

•  Ac%ve  Queue  Management  (AQM)  –  Random  Early  Detec%on  (RED)  –  Explicit  Conges%on  No%fica%on  (ECN)  

19

No  Problem  Under  Circuit  Switching  

•  Source  establishes  connec%on  to  des%na%on  – Nodes  reserve  resources  for  the  connec%on  – Circuit  rejected  if  the  resources  aren’t  available  – Cannot  have  more  than  the  network  can  handle  

20

Page 6: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

6

IP  Best-­‐Effort  Design  Philosophy  

•  Best-­‐effort  delivery  – Let  everybody  send  – Network  tries  to  deliver  what  it  can  – …  and  just  drop  the  rest  

21

source destination

IP network

Conges%on  is  Unavoidable  •  Two  packets  arrive  at  the  same  %me  

– The  node  can  only  transmit  one  – …  and  either  buffer  or  drop  the  other  

•  If  many  packets  arrive  in  short  period  of  %me  – The  node  cannot  keep  up  with  the  arriving  traffic  – …  and  the  buffer  may  eventually  overflow  

22

The  Problem  of  Conges%on  

•  What  is  conges%on?  – Load  is  higher  than  capacity  

•  What  do  IP  routers  do?  – Drop  the  excess  packets  

•  Why  is  this  bad?  – Wasted  bandwidth  for  retransmissions  

23 Load

Goodput “congestion! collapse”!

Increase in load that results in a decrease in

useful work done.

Ways  to  Deal  With  Conges%on  •  Ignore  the  problem  

– Many  dropped  (and  retransmiced)  packets  –  Can  cause  conges%on  collapse  

•  Reserva%ons,  like  in  circuit  switching  –  Pre-­‐arrange  bandwidth  alloca%ons  –  Requires  nego%a%on  before  sending  packets  

•  Pricing  – Don’t  drop  packets  for  the  high-­‐bidders  –  Requires  a  payment  model  

•  Dynamic  adjustment  (TCP)  –  Every  sender  infers  the  level  of  conges%on  –  Each  adapts  its  sending  rate  “for  the  greater  good”  

24

Page 7: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

7

Many  Important  Ques%ons  

•  How  does  the  sender  know  there  is  conges%on?  –  Explicit  feedback  from  the  network?  –  Inference  based  on  network  performance?  

•  How  should  the  sender  adapt?  –  Explicit  sending  rate  computed  by  the  network?  –  End  host  coordinates  with  other  hosts?  –  End  host  thinks  globally  but  acts  locally?  

•  What  is  the  performance  objec%ve?  – Maximizing  goodput,  even  if  some  users  suffer  more?  –  Fairness?    (Whatever  the  heck  that  means!)  

•  How  fast  should  new  TCP  senders  send?  25

Inferring  From  Implicit  Feedback  

•  What  does  the  end  host  see?  

•  What  can  the  end  host  change?  26

?

Where  Conges%on  Happens:  Links  •  Simple  resource  alloca%on:  FIFO  queue  &  drop-­‐tail  

•  Access  to  the  bandwidth:  first-­‐in  first-­‐out  queue  – Packets  transmiced  in  the  order  they  arrive  

•  Access  to  the  buffer  space:  drop-­‐tail  queuing  –  If  the  queue  is  full,  drop  the  incoming  packet  

27

How  it  Looks  to  the  End  Host  

•  Packet  delay  – Packet  experiences  high  delay  

•  Packet  loss  – Packet  gets  dropped  along  the  way  

•  How  does  TCP  sender  learn  this?  – Delay  

•  Round-­‐trip  %me  es%mate  

– Loss  •  Timeout    •  Duplicate  acknowledgments  

28

Page 8: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

8

What  Can  the  End  Host  Do?  

•  Upon  detec%ng  conges%on  (well,  packet  loss)  – Decrease  the  sending  rate  –  End  host  does  its  part  to  alleviate  the  conges%on  

•  But,  what  if  condi%ons  change?  –  Suppose  there  is  more  bandwidth  available  – Would  be  a  shame  to  stay  at  a  low  sending  rate  

•  Upon  not  detec%ng  conges%on  –  Increase  the  sending  rate,  a  licle  at  a  %me  – And  see  if  the  packets  are  successfully  delivered  

29

TCP  Conges%on  Window  

•  Each  TCP  sender  maintains  a  conges%on  window  – Maximum  number  of  bytes  to  have  in  transit  –  I.e.,  number  of  bytes  s%ll  awai%ng  acknowledgments  

•  Adap%ng  the  conges%on  window  – Decrease  upon  losing  a  packet:  backing  off  –  Increase  upon  success:  op%mis%cally  exploring  – Always  struggling  to  find  the  right  transfer  rate  

•  Both  good  and  bad  –  Pro:  avoids  having  explicit  feedback  from  network  –  Con:  under-­‐shoo%ng  and  over-­‐shoo%ng  the  rate  

30

Addi%ve  Increase,  Mul%plica%ve  Decrease    (AIMD)  

•  How  much  to  increase  and  decrease?  –  Increase  linearly,  decrease  mul%plica%vely  – A  necessary  condi%on  for  stability  of  TCP  – Consequences  of  over-­‐sized  window  are  much  worse  than  having  an  under-­‐sized  window  •  Over-­‐sized  window:  packets  dropped  and  retransmiced  •  Under-­‐sized  window:  somewhat  lower  throughput  

•  Mul%plica%ve  decrease  – On  loss  of  packet,  divide  conges%on  window  in  half  

•  Addi%ve  increase  – On  success  for  last  window  of  data,  increase  linearly  

31

Leads  to  the  TCP  “Sawtooth”  

32

t

Window

halved

Loss

Page 9: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

9

Prac%cal  Details  

•  Conges%on  window  – Represented  in  bytes,  not  in  packets  (Why?)  

– Packets  have  MSS  (Maximum  Segment  Size)  bytes  

•  Increasing  the  conges%on  window  –  Increase  by  MSS  on  success  for  last  window  of  data  

•  Decreasing  the  conges%on  window  – Never  drop  conges%on  window  below  1  MSS  

33

Receiver  Window  vs.  Conges%on  Window  

•  Flow  control  –  Keep  a  fast  sender  from  overwhelming  a  slow  receiver  

•  Conges%on  control  –  Keep  a  set  of  senders  from  overloading  the  network  

•  Different  concepts,  but  similar  mechanisms  –  TCP  flow  control:  receiver  window  –  TCP  conges%on  control:  conges%on  window  –  TCP  window:  min  {  conges%on  window,  receiver  window  }  

34

How  Should  a  New  Flow  Start  

35

t

Window

But, could take a long time to get started!

Need to start with a small CWND to avoid overloading the network.

“Slow  Start”  Phase  

•  Start  with  a  small  conges%on  window  –  Ini%ally,  CWND  is  1  Max  Segment  Size  (MSS)  – So,  ini%al  sending  rate  is  MSS/RTT  

•  That  could  be  precy  wasteful  – Might  be  much  less  than  the  actual  bandwidth  – Linear  increase  takes  a  long  %me  to  accelerate  

•  Slow-­‐start  phase  (really  “fast  start”)  – Sender  starts  at  a  slow  rate  (hence  the  name)  – …  but  increases  the  rate  exponen%ally  – …  un%l  the  first  loss  event  

36

Page 10: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

10

Slow  Start  in  Ac%on  

37

Double CWND per round-trip time

D A D D A A D D

A A

D

A

Src

Dest

D

A

1 2 4 8

Slow  Start  and  the  TCP  Sawtooth  

38

Loss

Exponential “slow start”

t

Window

Why is it called slow-start? Because TCP originally had no congestion control mechanism. The source would just start by sending a whole receiver window’s worth of data.

Two  Kinds  of  Loss  in  TCP  

•  Timeout  –  Packet  n  is  lost  and  detected  via  a  %meout  –  E.g.,  because  all  packets  in  flight  were  lost  – Aler  the  %meout,  blas%ng  away  for  the  en%re  CWND  – …  would  trigger  a  very  large  burst  in  traffic  –  So,  becer  to  start  over  with  a  low  CWND  

•  Triple  duplicate  ACK  –  Packet  n  is  lost,  but  packets  n+1,  n+2,  etc.  arrive  –  Receiver  sends  duplicate  acknowledgments  – …  and  the  sender  retransmits  packet  n  quickly  – Do  a  mul%plica%ve  decrease  and  keep  going  

39

Repea%ng  Slow  Start  Aler  Timeout  

40

t

Window

Slow-start restart: Go back to CWND of 1, but take advantage of knowing the previous value of CWND.

Slow start in operation until it reaches half of

previous cwnd.

timeout

Page 11: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

11

Repea%ng  Slow  Start  Aler  Idle  Period  

•  Suppose  a  TCP  connec%on  goes  idle  for  a  while  – E.g.,  Telnet  session  where  you  don’t  type  for  an  hour  

•  Eventually,  the  network  condi%ons  change  – Maybe  many  more  flows  are  traversing  the  link  – E.g.,  maybe  everybody  has  come  back  from  lunch!  

•  Dangerous  to  start  transmisng  at  the  old  rate  – Previously-­‐idle  TCP  sender  might  blast  the  network  – …  causing  excessive  conges%on  and  packet  loss  

•  So,  some  TCP  implementa%ons  repeat  slow  start  – Slow-­‐start  restart  aler  an  idle  period  

41

TCP  Achieves  Some  No%on  of  Fairness  

•  Effec%ve  u%liza%on  is  not  the  only  goal  – We  also  want  to  be  fair  to  the  various  flows  – …  but  what  the  heck  does  that  mean?  

•  Simple  defini%on:  equal  shares  of  the  bandwidth  – N  flows  that  each  get  1/N  of  the  bandwidth?  –  But,  what  if  the  flows  traverse  different  paths?  –  E.g.,  bandwidth  shared  in  propor%on  to  the  RTT  

42

What  About  Chea%ng?  

•  Some  folks  are  more  fair  than  others  –  Running  mul%ple  TCP  connec%ons  in  parallel  – Modifying  the  TCP  implementa%on  in  the  OS  – Use  the  User  Datagram  Protocol  

•  What  is  the  impact  – Good  guys  slow  down  to  make  room  for  you  –  You  get  an  unfair  share  of  the  bandwidth  

•  Possible  solu%ons?  –  Routers  detect  chea%ng  and  drop  excess  packets?  –  Peer  pressure?  –  ???  

43

Queuing  Mechanisms  

Random  Early  Detec%on  (RED)  Explicit  Conges%on  No%fica%on  (ECN)  

44

Page 12: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

12

Bursty  Loss  From  Drop-­‐Tail  Queuing  

•  TCP  depends  on  packet  loss  –  Packet  loss  is  the  indica%on  of  conges%on  –  In  fact,  TCP  drives  the  network  into  packet  loss  – …  by  con%nuing  to  increase  the  sending  rate  

•  Drop-­‐tail  queuing  leads  to  bursty  loss  – When  a  link  becomes  congested…  – …  many  arriving  packets  encounter  a  full  queue  – And,  as  a  result,  many  flows  divide  sending  rate  in  half  – …  and,  many  individual  flows  lose  mul%ple  packets  

45

Slow  Feedback  from  Drop  Tail  •  Feedback  comes  when  buffer  is  completely  full  

– …  even  though  the  buffer  has  been  filling  for  a  while  •  Plus,  the  filling  buffer  is  increasing  RTT  

– …  and  the  variance  in  the  RTT  •  Might  be  becer  to  give  early  feedback  

– Get  1-­‐2  connec%ons  to  slow  down,  not  all  of  them  

– Get  these  connec%ons  to  slow  down  before  it  is  too  late  

46

Random  Early  Detec%on  (RED)  •  Basic  idea  of  RED  

–  Router  no%ces  that  the  queue  is  gesng  backlogged  – …  and  randomly  drops  packets  to  signal  conges%on  

•  Packet  drop  probability  – Drop  probability  increases  as  queue  length  increases  –  If  buffer  is  below  some  level,  don’t  drop  anything  – …  otherwise,  set  drop  probability  as  func%on  of  queue  

47

Average Queue Length

Prob

abili

ty

Proper%es  of  RED  

•  Drops  packets  before  queue  is  full  –  In  the  hope  of  reducing  the  rates  of  some  flows  

•  Drops  packet  in  propor%on  to  each  flow’s  rate  – High-­‐rate  flows  have  more  packets  – …  and,  hence,  a  higher  chance  of  being  selected  

•  Drops  are  spaced  out  in  %me  – Which  should  help  desynchronize  the  TCP  senders  

•  Tolerant  of  burs%ness  in  the  traffic  – By  basing  the  decisions  on  average  queue  length  

48

Page 13: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

13

Problems  With  RED  

•  Hard  to  get  the  tunable  parameters  just  right  – How  early  to  start  dropping  packets?  – What  slope  for  the  increase  in  drop  probability?  – What  %me  scale  for  averaging  the  queue  length?  

•  Some%mes  RED  helps  but  some%mes  not  –  If  the  parameters  aren’t  set  right,  RED  doesn’t  help  – And  it  is  hard  to  know  how  to  set  the  parameters  

•  RED  is  implemented  in  prac%ce  –  But,  olen  not  used  due  to  the  challenges  of  tuning  right  

•  Many  varia%ons  in  the  research  community  – With  cute  names  like  “Blue”  and  “FRED”…    

49

Explicit  Conges%on  No%fica%on  •  Early  dropping  of  packets  

– Good:  gives  early  feedback  –  Bad:  has  to  drop  the  packet  to  give  the  feedback  

•  Explicit  Conges%on  No%fica%on  –  Router  marks  the  packet  with  an  ECN  bit  – …  and  sending  host  interprets  as  a  sign  of  conges%on  

•  Surmoun%ng  the  challenges  – Must  be  supported  by  the  end  hosts  and  the  routers  –  Requires  2  bits  in  the  IP  header  for  detec%on  (forward  dir)  

•  One  for  ECN  mark;  one  to  indicate  ECN  capability  •  Solu%on:  borrow  2  of  Type-­‐Of-­‐Service  bits  in  IPv4  header  

– Also  2  bits  in  TCP  header  for  signaling  sender  (reverse  dir)  50

Other  TCP  Mechanisms  

Nagle’s  Algorithm  and  Delayed  ACK  

51

Mo%va%on  for  Nagle’s  Algorithm  

•  Interac%ve  applica%ons  –  Telnet  and  rlogin  – Generate  many  small  packets  (e.g.,  keystrokes)  

•  Small  packets  are  wasteful  – Mostly  header  (e.g.,  40  bytes  of  header,  1  of  data)  

•  Appealing  to  reduce  the  number  of  packets  –  Could  force  every  packet  to  have  some  minimum  size  – …  but,  what  if  the  person  doesn’t  type  more  characters?  

•  Need  to  balance  compe%ng  trade-­‐offs  –  Send  larger  packets  – …  but  don’t  introduce  much  delay  by  wai%ng  

52

Page 14: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

14

Nagle’s  Algorithm  •  Wait  if  the  amount  of  data  is  small  

– Smaller  than  Maximum  Segment  Size  (MSS)  •  And  some  other  packet  is  already  in  flight  

–  I.e.,  s%ll  awai%ng  the  ACKs  for  previous  packets  •  That  is,  send  at  most  one  small  packet  per  RTT  

– …  by  wai%ng  un%l  all  outstanding  ACKs  have  arrived  

•  Influence  on  performance  –  Interac%ve  applica%ons:  enables  batching  of  bytes  – Bulk  transfer:  transmits  in  MSS-­‐sized  packets  anyway  

53

vs.

ACK

Nagle’s  Algorithm  •  Wait  if  the  amount  of  data  is  small  

– Smaller  than  Maximum  Segment  Size  (MSS)  •  And  some  other  packet  is  already  in  flight  

–  I.e.,  s%ll  awai%ng  the  ACKs  for  previous  packets  •  That  is,  send  at  most  one  small  packet  per  RTT  

– …  by  wai%ng  un%l  all  outstanding  ACKs  have  arrived  

•  Influence  on  performance  –  Interac%ve  applica%ons:  enables  batching  of  bytes  – Bulk  transfer:  transmits  in  MSS-­‐sized  packets  anyway  

54

vs.

ACK

Turning  Nagle  Off!void !

tcp_nodelay (int s) !{ !

!int n = 1; !!if (setsockopt (s, IPPROTO_TCP, TCP_NODELAY, !! (char *) &n, sizeof (n)) < 0) !

warn ("TCP_NODELAY: %m\n"); !} !

Mo%va%on  for  Delayed  ACK  

•  TCP  traffic  is  olen  bidirec%onal  – Data  traveling  in  both  direc%ons  – ACKs  traveling  in  both  direc%ons  

•  ACK  packets  have  high  overhead  – 40  bytes  for  the  IP  header  and  TCP  header  – …  and  zero  data  traffic  

•  Piggybacking  is  appealing  – Host  B  can  send  an  ACK  to  host  A  – …  as  part  of  a  data  packet  from  B  to  A  

55

TCP  Header  Allows  Piggybacking  

56

Source port Destination port

Sequence number

Acknowledgment

Advertised window HdrLen Flags 0

Checksum Urgent pointer

Options (variable)

Data

Flags: SYN FIN RST PSH URG ACK

Page 15: Step(1:(A’s(Ini%al(SYN(Packet · 2 Step(2:(B’s(SYN9ACK(Packet 5 B’s port A’s port B’s Initial Sequence Number A’s ISN plus 1 20 0 Flags Advertised window Checksum Urgent

15

Example  of  Piggybacking  

57

Data

Data+ACK

Data

A B

ACK

Data

Data + ACK

B has data to send

A has data to send

B doesn’t have data to send

Increasing  Likelihood  of  Piggybacking  

•  Example:  rlogin  or  telnet  – Host  A  types  characters  at  prompt  – Host  B  receives  the  character  and  executes  a  command  

– …  and  then  data  are  generated  – Would  be  nice  if  B  could  send  the  ACK  with  the  new  data  

•  Increase  piggybacking  –  TCP  allows  the  receiver  to  wait  to  send  the  ACK  

– …  in  the  hope  that  the  host  will  have  data  to  send  

58

Data

Data+ACK

Data

A B

ACK

Data

Data + ACK

Delayed  ACK  

•  Delay  sending  an  ACK  – Upon  receiving  a  packet,  the  host  B  sets  a  %mer  

•  Typically,  200  msec  or  500  msec  

–  If  B’s  applica%on  generates  data,  go  ahead  and  send  •  And  piggyback  the  ACK  bit  

–  If  the  %mer  expires,  send  a  (non-­‐piggybacked)  ACK  

•  Limi%ng  the  wait  – Timer  of  200  msec  or  500  msec  – ACK  every  other  full-­‐sized  packet  

59

Conclusions  •  Conges%on  is  inevitable  

–  Internet  does  not  reserve  resources  in  advance  –  TCP  ac%vely  tries  to  push  the  envelope  

•  Conges%on  can  be  handled  –  Addi%ve  increase,  mul%plica%ve  decrease  –  Slow  start,  and  slow-­‐start  restart  

•  Ac%ve  Queue  Management  can  help  –  Random  Early  Detec%on  (RED)  –  Explicit  Conges%on  No%fica%on  (ECN)  

•  Fundamental  tensions  –  Feedback  from  the  network?  –  Enforcement  of  “TCP  friendly”  behavior?  

60