End-to-End Routing Behavior in the Internet
Vern Paxson
Presented by Sankalp Kohli and Patrick Wong

Post on 06-Jan-2018




Idea
Previous studies of routing protocols explain routing behavior only qualitatively.
Use end-to-end measurement to determine:
    Route pathologies
    Route stability
    Route symmetry

Definitions
Virtual path: network-level abstraction of a “direct link” between two hosts. At the network layer, it is realized by a single route.
Autonomous system (AS): collection of routers and hosts controlled by a single administrative entity.

Routing Protocols
Interior Gateway Protocol (IGP): routing protocol for entities within the same AS.
Border Gateway Protocol (BGP): for inter-AS routing.
    Each AS keeps a routing table with reachable hosts and corresponding costs.
    When a change is detected, only the affected part of the routing table is shared.

Methodology
Run a Network Probe Daemon (NPD) on a number of Internet sites (37).

Methodology
Key property (N^2 scaling):
    Use N sites to measure on the order of N^2 Internet paths.
    Each NPD site periodically measures the route to another NPD site using traceroute.
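The N^2 scaling can be made concrete with a small sketch. It assumes every ordered pair of distinct sites counts as one directed path; the site names are hypothetical placeholders, and the number of paths actually measured in D1 and D2 is not stated here.

```python
from itertools import permutations


def directed_paths(sites):
    """Return every ordered (source, destination) pair of distinct sites."""
    return list(permutations(sites, 2))


# 37 placeholder site names (hypothetical, for illustration only).
sites = [f"npd{i}" for i in range(37)]
paths = directed_paths(sites)

# N sites give N * (N - 1) directed paths: 37 * 36 = 1332.
print(len(paths))
```

Adding one more site to such a mesh adds 2N new directed paths, which is why a modest number of sites can cover many routes.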

Methodology
Two sets of experiments:
D1: measure each virtual path between two NPDs with a mean interval of 1-2 days, Nov-Dec 1994. This keeps the average load at one measurement per two hours.
D2: measure each virtual path with a bimodal distribution of inter-measurement intervals, Nov-Dec 1995.
    60% with a mean of 2 hours (bursts)
    40% with a mean of 2.75 days (to measure longer-term behavior)
    Measurements in D2 were paired: measure A=>B and then B=>A.
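A minimal sketch of how a D2-style measurement schedule could be drawn. It assumes each of the two modes is exponentially distributed with the stated mean; the function names and the use of hours as the time unit are illustrative choices, not taken from the study's software.

```python
import random

BURST_MEAN_H = 2.0        # mean of the short mode, in hours
LONG_MEAN_H = 2.75 * 24   # mean of the long mode: 2.75 days, in hours
BURST_FRACTION = 0.6      # 60% of intervals come from the short mode


def next_interval_hours(rng):
    """Draw one inter-measurement interval from the two-mode mixture."""
    mean = BURST_MEAN_H if rng.random() < BURST_FRACTION else LONG_MEAN_H
    # expovariate takes the rate (1 / mean), not the mean.
    return rng.expovariate(1.0 / mean)


def schedule(n, rng):
    """Return n measurement times (hours since start) for one virtual path."""
    t, times = 0.0, []
    for _ in range(n):
        t += next_interval_hours(rng)
        times.append(t)
    return times


rng = random.Random(1995)
print([round(t, 1) for t in schedule(5, rng)])
```

Under these assumptions the overall mean interval is 0.6 * 2 + 0.4 * 66 = 27.6 hours, so most measurements arrive in bursts while the long mode spreads the bursts over weeks.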

Methodology
Links traversed during D1 and D2

Methodology
Exponential sampling: intervals between measurements are exponentially distributed.
    Unbiased sampling: measures the instantaneous signal with equal probability.
    PASTA principle: Poisson Arrivals See Time Averages.
Only 37 sites?!
    The argument is that the sampled ASes lie on about half of the Internet's routes.
    If we weight each AS by its likelihood of occurring in an AS path, then the ASes sampled by the measured routes comprise about half of the Internet's ASes by weight.
Confidence intervals for the probability that an event occurs
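As an illustration of putting a confidence interval on an event probability, the sketch below uses the Wilson score interval for a binomial proportion. This is a swapped-in standard technique, not necessarily the method used in the study, and the counts in the usage line are hypothetical.

```python
import math


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for k events in n independent observations.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = k / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts: 10 events seen in 5000 measurements.
low, high = wilson_interval(10, 5000)
print(f"{low:.4%} to {high:.4%}")
```

The Wilson interval behaves sensibly when the event is rare, which is the typical situation for routing pathologies.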

Limitations
Just a small subset of Internet paths
Just two points at a time
With only end-to-end measurements, it is difficult to say why something happened
Possible fixes: something more robust than traceroute, or multiple measurement requests
5%-8% of the time, the NPDs could not be contacted
    Introduces a bias toward underestimation
    D2 pairing helps correct the underestimation

Routing Pathologies
Persistent routing loops
Temporary routing loops
Erroneous routing
Connectivity altered mid-stream
Temporary outages (> 30 sec)

Routing Loops & Erroneous Routing
Routing loops:
    Forwarding loop
    Information loop
    Traceroute loop
Persistent routing loops (10 in D1 and 50 in D2)
    Several hours long (e.g., > 10 hours)
    Not confined to a single router
Erroneous routing (one in D1)
    A route UK=>USA goes through Israel
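A rough sketch of how a loop might be flagged in a single traceroute. It assumes the hop list holds one router identifier per TTL, with None for unanswered probes, and it treats a router that reappears after at least one other hop as a loop. This is an illustrative heuristic with hypothetical router names, not the classification used in the study, and one traceroute cannot tell whether a loop is persistent or temporary.

```python
def find_loop(hops):
    """Return (earlier_index, repeat_index) of the first revisited router.

    hops: router identifiers ordered by TTL; None marks an unanswered probe.
    Returns None when no router is revisited.
    """
    last_seen = {}
    for i, hop in enumerate(hops):
        if hop is None:
            continue
        # Immediate repeats (same router answering consecutive TTLs)
        # are not counted as loops in this sketch.
        if hop in last_seen and i - last_seen[hop] > 1:
            return last_seen[hop], i
        last_seen[hop] = i
    return None


# Hypothetical traceroute that bounces between two routers.
print(find_loop(["gw", "r1", "r2", "r1", "r2", "r1"]))  # (1, 3)
```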

Route Changes
Connectivity change in mid-stream (10 in D1 and 155 in D2)
    The route changes during a measurement
    Recovery time is bimodal: (1) hundreds of milliseconds to seconds; (2) on the order of minutes
Route fluttering
    Rapid route oscillation
    Very little fluttering was seen, and it occurred only within an AS

Example of Route Fluttering

wustl (St. Louis) to umann (Mannheim, Germany). Solid: 17 hops; dotted: 29 hops.

Problems with Fluttering
Asymmetry
Path properties difficult to predict
    This confuses RTT estimation in TCP and may trigger false retransmission timeouts
Packet reordering
    The TCP receiver generates duplicate ACKs (DUPACKs), which may trigger spurious fast retransmits
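The reordering problem can be illustrated with a toy model of a cumulative-ACK receiver. The sketch numbers whole segments instead of bytes and uses the standard threshold of three duplicate ACKs for fast retransmit; the function names are hypothetical and the model ignores delayed ACKs and SACK.

```python
def acks_for_arrivals(arrival_order):
    """Return the cumulative ACK sent after each arriving segment.

    Segments are numbered 0, 1, 2, ...; each ACK names the next segment
    the receiver still expects.
    """
    received = set()
    next_expected = 0
    acks = []
    for seg in arrival_order:
        received.add(seg)
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def triggers_fast_retransmit(acks, threshold=3):
    """True if some ACK value is repeated `threshold` times in a row."""
    dup = 0
    for prev, cur in zip(acks, acks[1:]):
        dup = dup + 1 if cur == prev else 0
        if dup >= threshold:
            return True
    return False


# Segment 1 takes the longer route and arrives after segments 2, 3 and 4.
acks = acks_for_arrivals([0, 2, 3, 4, 1])
print(acks, triggers_fast_retransmit(acks))  # [1, 1, 1, 1, 5] True
```

Nothing was lost in this example, yet the sender would retransmit segment 1 and reduce its congestion window, which is why the retransmit is called spurious.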

Infrastructure Failures
“Host unreachable” returned by a router well inside the network
    0.21% of measurements in D1, giving an estimated availability of 99.8%
    This dropped to 99.5% in D2
NPDs unreachable due to too many hops (6 in D2)
    Unreachable beyond 30 hops
Path length is not necessarily correlated with distance
    1500 km end-to-end route of 3 hops
    3 km (MIT – Harvard) end-to-end route of 11 hops

Temporary Outages

Sequence of traceroute packets lost due to temporary loss of connectivity or heavy congestion.

In D1 (D2), 55% (43%) of traceroutes had 0 losses, 44% (55%) had 1 to 5 losses, and 0.96% (2.2%) had 6 or more.

Distribution of Long Outages (>30 sec )

Time-of-Day patterns

Temporary outages: minimum (0.4%) during 1:00-2:00, maximum (8.0%) during 15:00-16:00.

Infrastructure failures: minimum (1.2%) during 9:00-10:00, peak during 15:00-16:00.

Pathology Summary

Routing Stability
Two definitions of stability:
Prevalence: the likelihood of observing a particular route
    Steady-state probability that a virtual path at an arbitrary point in time uses a particular route
Persistence: how long a route remains unchanged
    Affects the utility of storing state in routers

Routing Stability
Routing Prevalence
Let $\pi_r$ be the steady-state probability that a virtual path (VP) uses route $r$ at an arbitrary time, let $n$ be the total number of observations of the VP, and let $k_r$ be the number of those observations in which route $r$ was seen.
Due to PASTA, an unbiased estimator of $\pi_r$ can be computed as

    $\hat{\pi}_r = \frac{k_r}{n}$

The prevalence of the dominant route is analyzed.
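The prevalence estimate is the fraction of a virtual path's observations in which a given route was seen. A minimal sketch for the dominant route, assuming each observation is recorded as a tuple of hop names (the sample data is hypothetical):

```python
from collections import Counter


def dominant_route_prevalence(observed_routes):
    """Return (dominant_route, estimated_prevalence) for one virtual path.

    observed_routes: non-empty list of routes, each a hashable tuple of hops.
    The estimate is k / n: observations of the route over all observations.
    """
    if not observed_routes:
        raise ValueError("need at least one observation")
    counts = Counter(observed_routes)
    route, k = counts.most_common(1)[0]
    return route, k / len(observed_routes)


# Hypothetical observations of one virtual path.
observations = [
    ("a", "b", "d"),
    ("a", "b", "d"),
    ("a", "c", "d"),
    ("a", "b", "d"),
]
print(dominant_route_prevalence(observations))  # (('a', 'b', 'd'), 0.75)
```

The estimate is only meaningful when the observations are spread over time as in the exponential sampling described earlier; a burst of back-to-back traceroutes would overweight whatever route was in use at that moment.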

Routing Prevalence

In general, Internet paths are strongly dominated by a single route, especially when routes are compared at a coarser granularity (city or AS rather than host).

Routing Persistence
The notion of persistence depends on what is deemed persistent.
A series of measurements is undertaken to classify routes according to their alternation frequency.
Conclusion: routing changes occur over a wide range of time scales, from minutes to days.

Routing Symmetry

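Sources of Routing Asymmetry
Link cost metrics
“Hot potato” routing: a problem due to competing providers, where each provider hands traffic off to the other as early as possible
“Cold potato” routing: the provider carries the traffic as far as possible before handing it off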

Routing Symmetry

Analysis of Routing Symmetry
Measurements were paired to ensure that an asymmetry is actually being captured.
Asymmetry is quite common (49% at city granularity, 30% at AS granularity).
Size of asymmetries
    The majority are confined to one hop (one city or AS)
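A sketch of how a paired measurement could be checked for asymmetry at city granularity. It assumes each direction has already been reduced to an ordered list of city names (hypothetical here), and it measures the size of an asymmetry as the number of cities visited in only one direction. That size metric is an illustrative choice and may differ from the one used in the study.

```python
def is_symmetric(forward, reverse):
    """True if the reverse route visits the same locations in reverse order."""
    return list(forward) == list(reversed(reverse))


def asymmetry_size(forward, reverse):
    """Number of locations visited in one direction but not the other."""
    return len(set(forward) ^ set(reverse))


# Hypothetical paired measurement A=>B and B=>A at city granularity.
a_to_b = ["Berkeley", "Chicago", "Boston"]
b_to_a = ["Boston", "Denver", "Chicago", "Berkeley"]

print(is_symmetric(a_to_b, b_to_a))    # False
print(asymmetry_size(a_to_b, b_to_a))  # 1 (Denver appears only on the return path)
```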

Summary
Pathologies doubled between D1 (end of 1994) and D2 (end of 1995)
Asymmetry is quite common
Paths are heavily dominated by a single route
Over 2/3 of Internet paths are reasonably stable (> days); the other 1/3 vary over many time scales