On Understanding of Transient Interdomain Routing Failures
Feng Wang, Lixin Gao, Jia Wang, and Jian Qiu
Department of Electrical and Computer Engineering
University of Massachusetts, Amherst
MA 01002
AT&T Labs-research
180 Park Ave, Florham Park, NJ 07869
Outline
• What are transient routing failures?
• When can transient routing failures occur?
• How long can transient routing failures last?
• Measurement results
Internet Routing
• Autonomous systems (ASes)
– Internet Service Providers (ISPs)
– Companies
– Universities
• Intradomain Routing Protocols
– Static Routing, OSPF, IS-IS
• Interdomain Routing Protocol
– Border Gateway Protocol (BGP)
Long Convergence Delay
• Long convergence delay (Labovitz et al., TON 2001)
– Bringing a route back (Tup): < shortest path length × MRAI
– Disconnecting a route (Tdown): < longest path length × MRAI
• Fail-over: rerouting from Path A to Path B
– While Path B is being discovered, routers might experience transient routing failures, i.e., no route is available
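The two convergence bounds on this slide are simple products, so a small sketch may help make the magnitudes concrete. It assumes the bounds read as "path length times MRAI", with path length counted in AS hops; the function name and the example path lengths are hypothetical, and the 30-second value is the eBGP default quoted on the MRAI Timers slide.

```python
MRAI_EBGP_SECONDS = 30  # eBGP default quoted in the talk


def convergence_bounds(shortest_path_len, longest_path_len, mrai=MRAI_EBGP_SECONDS):
    """Return (t_up_bound, t_down_bound) in seconds.

    Assumes Tup < shortest path length * MRAI and
    Tdown < longest path length * MRAI, with lengths in AS hops.
    """
    if shortest_path_len < 0 or longest_path_len < shortest_path_len:
        raise ValueError("need 0 <= shortest_path_len <= longest_path_len")
    t_up_bound = shortest_path_len * mrai
    t_down_bound = longest_path_len * mrai
    return t_up_bound, t_down_bound


# Hypothetical topology: shortest path of 3 AS hops, longest of 7.
print(convergence_bounds(3, 7))  # (90, 210)
```

Even with short paths the bounds reach minutes, which is why a fail-over that leaves a router without any route for part of that time matters.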
An Example of Transient Routing Failure
[Figure: topology of AS0, AS1, AS2, AS3 and destination d. Legend: traffic on data plane; BGP update (W, A); BGP routing table; losing reachability.]
Our Contributions
• Identify transient routing failures
– Sufficient conditions
• Bound transient routing failure duration
Outline
• What are transient routing failures?
• When can transient routing failures occur?
• How long can transient routing failures last?
• Measurement results
• Two sufficient conditions under which a node must experience a transient routing failure (a guaranteed transient routing failure).
• One sufficient condition under which a node may experience a transient routing failure (a potential transient routing failure).
When Can Transient Routing Failures Occur?
[Figure: example topology with nodes 0, 2, and 3, per-node path labels, and messages labeled w.]
When Can Transient Routing Failures Occur? (contd.)
[Figure: example topology with nodes 0, 2, and 3, per-node path labels, and messages labeled w and A.]
Outline
• What are transient routing failures?
• When can transient routing failures occur?
• How long can transient routing failures last?
• Measurement results
How Long Can Transient Routing Failures Last?
[Figure: destination d and nodes 0 and 2, with BGP updates labeled "W: 2 0" and "A: 1 0" and the MRAI timers on the sessions that carry them.]
MRAI Timers
• Minimum Route Advertisement Interval (MRAI) timer
– Minimum amount of time that must elapse between routing updates
– Applied to BGP announcements or withdrawals
• Default MRAI value
– eBGP session: 30 seconds
– iBGP session: 5 seconds
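To illustrate how an MRAI timer delays updates, here is a minimal sketch of a per-session rate limiter. It is a toy model rather than the behavior of any particular router implementation: the class name `MraiSession`, its methods, and the update labels are all hypothetical, and real implementations differ in details such as jitter and per-prefix versus per-peer timers.

```python
class MraiSession:
    """Toy per-neighbor rate limiter: at most one update per `mrai` seconds."""

    def __init__(self, mrai):
        self.mrai = mrai
        self.last_sent = None  # time of the last update actually sent
        self.pending = None    # most recent update held back by the timer

    def _timer_expired(self, now):
        return self.last_sent is None or now - self.last_sent >= self.mrai

    def submit(self, now, update):
        """Offer an update at time `now`; return the (time, update) pairs sent."""
        if self._timer_expired(now):
            self.last_sent = now
            self.pending = None  # the new update supersedes anything held
            return [(now, update)]
        self.pending = update    # a newer update replaces an older held one
        return []

    def tick(self, now):
        """Release the held update, if any, once the timer has expired."""
        if self.pending is not None and self._timer_expired(now):
            update, self.pending = self.pending, None
            self.last_sent = now
            return [(now, update)]
        return []


session = MraiSession(mrai=30)    # eBGP default from the slide
sent = []
sent += session.submit(0, "update-1")  # sent immediately
sent += session.submit(5, "update-2")  # timer running: held
sent += session.tick(20)               # still held
sent += session.tick(30)               # timer expired: released
print(sent)  # [(0, 'update-1'), (30, 'update-2')]
```

In this model the second update waits 25 seconds before leaving the router. If that update carries the alternate route, the neighbor has no route for that whole interval.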
Upper Bound for Transient Routing Failure Duration
$\text{Transient routing failure duration} \le \min_{u}\,\bigl(d_{u}^{0} + d_{u}^{v}\bigr) \times \text{MRAI}$
[Figure: nodes 0, u, and v, with the distances $d_{u}^{0}$ and $d_{u}^{v}$ marked.]
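The bound on this slide can be evaluated mechanically once the distances are known. The sketch below assumes the formula reads as the minimum, over candidate nodes u, of the sum of two hop distances associated with u, multiplied by MRAI. The function name, the dictionary layout, and the example distances are hypothetical, and the exact definition of the two distances should be taken from the paper.

```python
def failure_duration_bound(candidates, mrai=30):
    """Upper bound in seconds on a transient routing failure.

    `candidates` maps each candidate node u to a pair of hop distances
    (first_distance, second_distance). This layout is an assumption made
    for illustration, not the paper's notation.
    """
    if not candidates:
        raise ValueError("at least one candidate node is required")
    best_sum = min(first + second for first, second in candidates.values())
    return best_sum * mrai


# Hypothetical distances for two candidate nodes.
example = {"u1": (1, 2), "u2": (2, 3)}
print(failure_duration_bound(example))  # 90
```

With the 30-second eBGP default, a best-case distance sum of only three hops already gives a bound of a minute and a half.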
Occurrence of Transient Failures in a Typical BGP System
• In a typical BGP system, transient failures are prevalent.
– Tier-1 ASes can experience transient routing failures, where alternate routes come from their edge routers.
– Non-tier-1 ASes can experience transient routing failures, where alternate routes are obtained from other ASes.
Outline
• What are transient routing failures?
• When can transient routing failures occur?
• How long can transient routing failures last?
• Measurement results
Measuring Transient Failures within a Tier-1 AS
Percentage of transient failures among all routing failures that last less than 30 seconds
Cumulative distribution of transient Failure Duration
BGP updates, BGP tables, and router configuration files were collected during July 2004
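One simple way to get failure durations out of a BGP update log is to pair each withdrawal of a prefix with the next announcement of the same prefix. The sketch below does only that. It is not the authors' measurement methodology (which also draws on BGP tables and router configurations), the tuple layout and function names are hypothetical, and the sample records are made up for illustration.

```python
def failure_durations(updates):
    """Pair each withdrawal with the next announcement of the same prefix.

    `updates` is an iterable of (timestamp_seconds, prefix, kind) sorted by
    time, where kind is "W" (withdrawal) or "A" (announcement).
    Returns a list of (prefix, duration_seconds).
    """
    down_since = {}  # prefix -> time of the first unanswered withdrawal
    durations = []
    for timestamp, prefix, kind in updates:
        if kind == "W":
            down_since.setdefault(prefix, timestamp)
        elif kind == "A" and prefix in down_since:
            durations.append((prefix, timestamp - down_since.pop(prefix)))
    return durations


def fraction_shorter_than(durations, threshold):
    """Fraction of failures whose duration is below `threshold` seconds."""
    if not durations:
        return 0.0
    short = sum(1 for _, duration in durations if duration < threshold)
    return short / len(durations)


# Made-up update records, not measurement data.
log = [
    (0, "p1", "W"), (12, "p1", "A"),
    (20, "p2", "W"), (100, "p2", "A"),
    (110, "p3", "A"),  # announcement with no prior withdrawal: ignored
]
gaps = failure_durations(log)
print(gaps)                             # [('p1', 12), ('p2', 80)]
print(fraction_shorter_than(gaps, 30))  # 0.5
```

Sorting the resulting durations gives the input for a cumulative distribution plot like the one on this slide.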
Measuring Transient Failures (contd.)
• Transient failures in tier-2 ASes, measured using BGP updates from Oregon RouteViews (July 2004)
Popularity of Prefixes Experiencing Transient Failures
• We aggregate the NetFlow data collected in the tier-1 AS during the week of 1/2/2005 to 1/8/2005
• Transient routing failures can affect both popular and unpopular prefixes
[Figure: y-axis: fraction of transient routing failures.]
Conclusions
• Transient routing failures are prevalent in the Internet, and can last for a significant period of time.
• The majority of transient failures occur under commonly applied routing policy settings.
• Popular and unpopular prefixes can experience transient failures.
Thanks