Beyond BGP
Dan Massey, Colorado State University
24 October 04 [email protected]
Challenges Facing Internet Routing
• The Internet has grown dramatically
– Large number of routing entries
– High volumes of updates
– Frequent topological changes
• The fault model has changed dramatically
– More malfunctioning components
– Intentional attacks
• Do we need a fundamentally new routing architecture?

Toward a New Architecture
• One claim: BGP is nearing the end of its useful lifetime
– The Internet will soon collapse unless we act!!
• Other claim: BGP is the best engineering solution we are likely to produce
– We need incremental patches to new problems
• Who is right?
• Beyond BGP uses
– Measurements to assess where we are
– Identification of (new?) routing requirements
– Development of changes (incremental or new system) to address the above

How Did We Get To BGP
• Simple distance vector routing algorithms
– Used in early Internet routing designs
– Convey only limited information
– Prone to long-lasting loops
• Expensive link state routing algorithms
– Learn the full network topology
– Signal every change in every link
• Path vector routing (BGP)
– Middle ground that signals some path data
– But does not signal the full topology

RIP and DBF
• RIP and DBF: exchange distance info.
• RIP: keep shortest path only
– B’s route to D: Nexthop=A, Dist=4
• Distributed Bellman-Ford (DBF): keep distance info from all neighbors
– B’s route to D: Nexthop=A, Dist=4; Alternate Nexthop=C, Dist=4
• 30-second refresh interval
• Damping timer to space out two triggered updates: 1~5 seconds
• Poison reverse: B sends infinity distance to A
[Figure: example topology with nodes A, B, C, D, E, F; advertised distances to D are labeled D:1, D:2, D:3, and D:infinity]

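The difference between the two tables can be sketched in a few lines of Python. This is a minimal illustration, not the simulator behind the slides: the class names, the link cost of 1, and the advertised distance of 3 from both A and C are assumptions chosen so that B ends up with the Dist=4 entries shown above.

```python
INFINITY = 16  # RIP's conventional "unreachable" metric


class RipRoute:
    """RIP style: remember only the current best next hop and distance."""

    def __init__(self):
        self.next_hop, self.dist = None, INFINITY

    def update(self, neighbor, advertised, link_cost=1):
        dist = min(advertised + link_cost, INFINITY)
        # Accept news from the current next hop, or a strictly better offer.
        if neighbor == self.next_hop or dist < self.dist:
            self.next_hop, self.dist = neighbor, dist
        if self.dist >= INFINITY:
            self.next_hop = None

    def neighbor_down(self, neighbor):
        if neighbor == self.next_hop:
            self.next_hop, self.dist = None, INFINITY

    def best(self):
        return self.next_hop, self.dist


class DbfRoute:
    """DBF style: remember the distance offered by every neighbor."""

    def __init__(self):
        self.dist_via = {}  # neighbor -> total distance through that neighbor

    def update(self, neighbor, advertised, link_cost=1):
        self.dist_via[neighbor] = min(advertised + link_cost, INFINITY)

    def neighbor_down(self, neighbor):
        self.dist_via.pop(neighbor, None)

    def best(self):
        usable = {n: d for n, d in self.dist_via.items() if d < INFINITY}
        if not usable:
            return None, INFINITY
        hop = min(usable, key=lambda n: (usable[n], n))
        return hop, usable[hop]


# B hears distance 3 to D from both A and C (assumed values).
rip_b, dbf_b = RipRoute(), DbfRoute()
for table in (rip_b, dbf_b):
    table.update("A", 3)
    table.update("C", 3)

# The link to A fails: RIP has nothing left, DBF switches to C at once.
rip_b.neighbor_down("A")
dbf_b.neighbor_down("A")
print(rip_b.best())  # (None, 16)
print(dbf_b.best())  # ('C', 4)
```

Whether the alternate via C is actually valid is a separate question; the table only records that C offered it.
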
BGP Background
• Internet: composed of thousands of Autonomous Systems (ASes).
• BGP (Border Gateway Protocol): the de facto inter-AS routing protocol
[Figure: ASes A, B, C, and E containing routers R1 to R6, with the border routers labeled as BGP routers]

How BGP Works
• Uses a path vector protocol
– Similar to a distance vector protocol
– A route includes the entire path (sequence of nodes)
• Consider an AS as a node
• B’s route to D:
– Route via A = <A>
– Route via C = <C E A>
• What if no path is available?
[Figure: nodes A, B, C, D, E; advertised routes are labeled D:<A>, D:<A>, D:<E A>, D:<C E A>]

Path Vector Routing Changes
• Worms triggered edge instability
– Routers crashed due to ARP cache overflow.
– Links were congested by worm traffic.
• BGP path exploration exacerbates dynamics
• B’s route to D:
– Route via A = <A>
– Route via C = <C E A>
• The obsolete backup path <C E A> is used and convergence is delayed
[Figure: nodes A, B, C, D, E with three "withdraw" messages]

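Path exploration is easy to reproduce in a toy model. The sketch below assumes each node stores the last path heard from every neighbor and prefers the shortest one; real BGP selection uses many more attributes, and the class and method names are hypothetical.

```python
class PathVectorNode:
    """Toy path vector node: one stored path per neighbor, shortest wins."""

    def __init__(self, name):
        self.name = name
        self.paths = {}  # neighbor -> path last advertised by that neighbor

    def advertise(self, neighbor, path):
        if self.name in path:
            # Our own name is already on the path: accepting it would loop.
            self.paths.pop(neighbor, None)
        else:
            self.paths[neighbor] = tuple(path)

    def withdraw(self, neighbor):
        self.paths.pop(neighbor, None)

    def best(self):
        if not self.paths:
            return None
        return min(self.paths.values(), key=lambda p: (len(p), p))


b = PathVectorNode("B")
b.advertise("A", ["A"])
b.advertise("C", ["C", "E", "A"])
before = b.best()

# A's withdraw reaches B first. C has not yet withdrawn <C E A>, so B falls
# back to a path that still depends on A and will be withdrawn later.
b.withdraw("A")
after = b.best()
print(before, after)  # ('A',) ('C', 'E', 'A')
```

B only gives up on D once C's own withdraw arrives, which is the delay the slide describes.
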
Policies and Policy Withdrawal
• But A could stop advertising to B because of a policy change; in that case path <C E A> is still valid!
• Attach a Failure Withdrawal community attribute
– Only apply the approach to failure withdrawals
• B’s route to D:
– Route via A = <A>
– Route via C = <C E A>
[Figure: nodes A, B, C, D, E with a "policy withdraw" message]

BGP Traffic Engineering
• We assumed an AS could be modeled as a node with a single best path to the destination
• But a single AS may advertise more than one path.
– BGP traffic engineering: R4 chooses path <C B A>, R5 chooses path <C E A>
• Divide one AS into logical ASes such that
– All routers within a logical AS have the same best path
– Each logical AS can be modeled as a node.

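One way to read the logical-AS idea is as a grouping of routers by their best path. The sketch below is only that reading: the function name is hypothetical, and router R7 is invented to show two routers sharing a logical AS.

```python
from collections import defaultdict


def logical_ases(best_paths):
    """Group routers of one AS by best path; each group is one logical AS."""
    groups = defaultdict(list)
    for router, path in best_paths.items():
        groups[tuple(path)].append(router)
    return {path: sorted(routers) for path, routers in groups.items()}


# R4 and R5 come from the slide; R7 is a hypothetical extra router.
groups = logical_ases({
    "R4": ("C", "B", "A"),
    "R5": ("C", "E", "A"),
    "R7": ("C", "B", "A"),
})
for path, routers in groups.items():
    print(path, routers)
```

Each resulting group has a single best path, so it can be treated as a node in the path vector model.
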
Number of Updates
[Figure: number of updates versus number of ASes in the network, for original BGP and enhanced BGP]
• Substantial reduction is achieved.
– E.g. 3766 to 1419 in the 60-AS topology (about 62% fewer updates)
• MinRouteAdver timer: within 30 seconds, only one advertisement is allowed.
– It “packs” consecutive changes into one update.

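The packing effect of the MinRouteAdver timer can be sketched as follows. This is a simplified model with one timer for one destination and one peer; the class and method names are hypothetical and the event times are made up for illustration.

```python
class MraiSender:
    """At most one advertisement per interval; later changes overwrite earlier ones."""

    def __init__(self, interval=30.0):
        self.interval = interval
        self.last_sent = None  # time of the last advertisement
        self.pending = None    # latest change still waiting for the timer
        self.sent = []         # (time, route) pairs actually advertised

    def route_changed(self, now, route):
        if self.last_sent is None or now - self.last_sent >= self.interval:
            self._send(now, route)
        else:
            self.pending = route  # packs consecutive changes together

    def timer_expired(self, now):
        if self.pending is not None and now - self.last_sent >= self.interval:
            self._send(now, self.pending)

    def _send(self, now, route):
        self.sent.append((now, route))
        self.last_sent = now
        self.pending = None


sender = MraiSender()
sender.route_changed(0, "<A>")
sender.route_changed(5, "<C E A>")     # held back by the timer
sender.route_changed(12, "<C F E A>")  # replaces the held change
sender.route_changed(20, "<G A>")      # replaces it again
sender.timer_expired(30)
print(sender.sent)  # [(0, '<A>'), (30, '<G A>')]
```

Four route changes produce two advertisements, at the cost of up to 30 seconds of delay for the last one.
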
Convergence Time
[Figure: convergence time (seconds) versus number of ASes in the network, for original BGP and enhanced BGP]
• Enhanced BGP reduces the convergence time substantially.
– E.g. 337.0 seconds to 19.5 seconds in the 60-AS topology (roughly a 17-fold reduction)
• Elimination of one advertisement can cut convergence time by 30 seconds

Improving Path Vector Convergence
• Infocom 02 [4] uses consistency to detect invalid paths.
– Reject path <x1, x2, …, xn, r1, r2, …, rm> if r1 is a direct neighbor and r1’s path is not <r1, r2, …, rm>
– Adjusted to account for policy and implemented in BGP
• Infocom 03 [Afek, et al.] quickly flushes invalid paths.
– BGP requires updates to be separated by a minimum interval
– Send a withdraw (to flush the route) if blocked by the interval
• Our recent work [5] attaches a new attribute: Root Cause Notification (RCN)
– Identifies the failed link and includes a sequence number.
– Allows any route relying on the failed link to be rejected.

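The consistency rule can be written as a short check. The sketch below applies the rule to every direct neighbor that appears on a candidate path and ignores the policy adjustments mentioned above; the function name and the data layout are assumptions.

```python
def is_consistent(path, neighbor_paths):
    """Check a candidate path against what direct neighbors currently advertise.

    path: tuple of nodes, for example ("C", "E", "A").
    neighbor_paths: direct neighbor -> path it currently advertises to us
        (starting with the neighbor itself), or None if it has withdrawn.
    """
    for i, node in enumerate(path):
        if node in neighbor_paths:
            # The suffix starting at a direct neighbor must match that
            # neighbor's own current advertisement.
            if neighbor_paths[node] != tuple(path[i:]):
                return False
    return True


# B has heard A withdraw its route, while C still advertises <C E A>.
known = {"A": None, "C": ("C", "E", "A")}
print(is_consistent(("C", "E", "A"), known))  # False: A no longer offers <A>

# Before the withdraw, the same backup path is consistent.
earlier = {"A": ("A",), "C": ("C", "E", "A")}
print(is_consistent(("C", "E", "A"), earlier))  # True
```

With this check B rejects the obsolete backup path immediately instead of exploring it.
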
Analyzing Path Vector Convergence
• Route fail-over has two stages.
• First, nodes inside the blue triangle lose routes and explore backup paths.
– All short invalid paths are explored
• Second, an edge (a0) eventually selects the valid backup path via Sk.
– Valid routes begin to propagate through the blue triangle.

Generic Convergence Results

Algorithm     Fail-Over Convergence Bounds
SPVP (BGP)    (N-1)(M + ld) + 3 Pmax (|E| - degree(G,0))
SPVP-AS       (N - degree(G,0))(M + ld) + 3 Pmax (|E| - |E^| + Degree(G^))
SPVP-GF       (N-1) ld + 3 Pmax (|E| - degree(G,0))
SPVP-RCN      Distance(G,0) (ld) + (Pmax) Distance(G,0)

Pmax = node processing delay, ld = link delay, M = minimum advertisement interval

What About Security?
• Convergence discussion neglects security
– What if routers send intentionally bad information?
• What is the simplest possible attack?
– Announce someone else’s routes
• Example: suppose Univ. of Colorado announces it is the origin for 129.82.0.0/16
– In other words, CU announces CSU’s IP address space
• Can this happen and/or what would prevent it?

Multiple Origin AS (MOAS) Cases
• Prefixes originate from Multiple Origin ASes (MOAS)
– Lower curve likely due to valid operational needs
• Spikes are errors that disrupt routing to the prefix
– Includes loss of routes to top-level DNS servers

Infrastructure Faults and Attacks
• BGP and DNS provide no authentication
– Faults and attacks can misdirect traffic.
– One (of many) examples observed from BGP logs.
– The server could have replied with false DNS data.
• ISPs announced the new path for 20 minutes to 3 hours
[Figure: Internet cloud with c.gtld-servers.net (192.26.92.30), a BGP monitor, and the label "originates route to 192.26.92/24"]

BGP-based Solution Example
Example configuration:
    router bgp 59
     neighbor 1.2.3.4 remote-as 52
     neighbor 1.2.3.4 send-community
     neighbor 1.2.3.4 route-map setcommunity out
    route-map setcommunity
     match ip address 18.0.0.0/8
     set community 59:MOAS 58:MOAS additive
[Figure: AS58, AS59, and AS52 around prefix 18.0.0.0/8, with these announcements:
    18/8, PATH<58>, MOAS{58,59}
    18/8, PATH<59>, MOAS{58,59}
    18/8, PATH<52>, MOAS{52,58}
    18/8, PATH<4>, MOAS{4,58,59}]

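A receiver-side check on these announcements might look like the sketch below. It assumes the detection rule is that every announcement for a prefix must carry the same MOAS list and that the origin AS (the last AS on the path) must appear in its own list; the function name and data layout are hypothetical.

```python
from collections import defaultdict


def find_moas_conflicts(announcements):
    """Return the prefixes whose announcements disagree about valid origins.

    announcements: iterable of (prefix, as_path, moas_list) tuples.
    """
    lists_seen = defaultdict(set)
    conflicts = set()
    for prefix, as_path, moas in announcements:
        moas = frozenset(moas)
        if as_path[-1] not in moas:
            conflicts.add(prefix)  # origin is missing from its own list
        lists_seen[prefix].add(moas)
    for prefix, lists in lists_seen.items():
        if len(lists) > 1:
            conflicts.add(prefix)  # two different MOAS lists for one prefix
    return conflicts


valid = [
    ("18.0.0.0/8", (58,), {58, 59}),
    ("18.0.0.0/8", (59,), {58, 59}),
]
print(find_moas_conflicts(valid))  # set()

# AS52 announces the same prefix with its own, different list.
with_false_origin = valid + [("18.0.0.0/8", (52,), {52, 58})]
print(find_moas_conflicts(with_false_origin))  # {'18.0.0.0/8'}
```

A conflict only signals that something is inconsistent; deciding which list is the correct one needs other information.
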
BGP False Origin Detection: Simulation Results
[Figure: panels (a) One Origin AS and (b) Two Origin ASes]

A Simple Filter
• Current BGP provides dynamic routes
• Explore the opposite extreme...
– Select a single static route to each server.
– Apply AS path filters to block all other announcements.
– Also filter against more specifics.
• Route changes on a frequency of months, if at all.
– Change in IP address, origin AS, or transit policy.
– Adjust route only after off-line verification

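The static filter can be sketched with Python's standard `ipaddress` module. The class name, the documentation prefix 192.0.2.0/24, and AS 64500 are placeholders, not values from the study; the trusted tail reuses AS 19836 only as an illustration.

```python
import ipaddress


class StaticRouteFilter:
    """Accept one trusted AS path tail for a protected prefix, nothing else."""

    def __init__(self, prefix, trusted_tail):
        self.prefix = ipaddress.ip_network(prefix)
        self.trusted_tail = tuple(trusted_tail)

    def accepts(self, prefix, as_path):
        announced = ipaddress.ip_network(prefix)
        if announced == self.prefix:
            tail = tuple(as_path[-len(self.trusted_tail):])
            return tail == self.trusted_tail
        if announced.subnet_of(self.prefix):
            return False  # more specific of the protected prefix
        return True       # unrelated prefix: this filter has no opinion


route_filter = StaticRouteFilter("192.0.2.0/24", (19836, 19836, 19836, 19836))
trusted = (64500, 19836, 19836, 19836, 19836)

print(route_filter.accepts("192.0.2.0/24", trusted))                # True
print(route_filter.accepts("192.0.2.0/24", (1239, 10913, 19836)))   # False
print(route_filter.accepts("192.0.2.128/25", trusted))              # False
```

Because the filter never adapts, any legitimate change in origin AS or transit policy requires a manual update after off-line verification.
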
Why This Works: Theory
• Scale is limited to a small number of routes.
– No exponential growth in top-level DNS servers.
• Loss of a server is tolerable; an invalid server is not.
– Resolvers detect and time out unreachable servers.
– Provided surviving servers handle the load, the cost is some delay.
• Expect predictable properties and stable routes.
– Servers don’t change without non-trivial effort.
– Servers are located in highly available locations.

Why This Works: Data
• Analysis based on BGP updates from RIPE.
– Archive of BGP updates sent by each peer.
– 9 ISPs from the US, Europe, and Japan.
– February 2001 - April 2002
• Some data collection notes
– Used only peers that exchange full routing tables; otherwise some route changes are hidden by policies
– Adjusted data to discount the multi-hop effect; multi-hop peering session resets don’t reflect ISP operations.

How Static Are The Routes?
• 3 changes in route to “A” over 14 months.
• 2 (valid) changes in the origin AS
– 5/19/01 origin AS changed from 6245 to 11840
– 6/4/01 origin AS changed from 11840 to 19836
• 1 change in transit AS routing policy
– 11/8/01 (*, 10913, 10913, 10913, *) -> (*, 10913, *)
– Could have built filter to allow this...

What Routes Are Lost?
• Results from 3/1/01 until the 5/19/01 AS change.
• Reduced reachability to “A” from 99.997% to 99.904%
• 18 events when the trusted route was withdrawn
– 2 resulted in no route available (28 secs, 103 secs)
– 8 instances of a back-up route lasting over 3 minutes
– Longest-lasting back-up advertised for 15 minutes
• Similar results for other time periods and servers.

Example of Filtered Routes

Time      Tail of AS Path
12:35:30  * 19836 19836 19836 19836
16:06:32  * 10913 10913 10913 10913 10913 10913 10913 19836
16:06:59  * 1239 10913 19836
16:07:30  * 701 10913 10913 19836
16:08:30  withdrawal
16:15:55  * 19836 19836 19836 19836

• With filter, no route at 16:06:32
• No route at 16:08:30
[Figure: AS graph with nodes 19836, 10913, 1239, 701, and "*", plus the server]

Worst Case In Study: ISP 3 (Europe)
• ISP 3 used one main route and a small number of consistent back-up routes.

Toward a More Balanced Approach
• Required infrequent updates to the filter.
– Especially useful to automate infrequent tasks.
– Natural tendency to forget the task or forget how to do the task
• More paths improve robustness
– The simple filter allowed only 1 path.
– ISP 3’s reachability can be improved if the filter allows two routes…
• Strike a balance between allowing dynamic changes and restricting to trusted paths.

BGP Adaptive Filters
• Slow down the route dynamics and add validation.
• Apply hysteresis before accepting new paths
• Add options for validating new paths:
– Believe route based purely on hysteresis
– Probabilistic query/response testing against known data.
– Trigger off-line checking (did origin AS really change?)

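The first validation option, believing a route purely on hysteresis, can be sketched as follows. The one-hour hold time, the class name, and the event times are assumptions for illustration; the slides do not give a concrete threshold.

```python
class HysteresisFilter:
    """Accept a new path only after it has persisted for hold_seconds."""

    def __init__(self, trusted_path, hold_seconds):
        self.trusted = tuple(trusted_path)
        self.hold = hold_seconds
        self.candidate = None        # new path currently on probation
        self.candidate_since = None  # time the candidate was first seen

    def observe(self, now, path):
        """Record the path seen at time `now`; return the path still trusted."""
        path = tuple(path)
        if path == self.trusted:
            self.candidate = self.candidate_since = None
        elif path != self.candidate:
            self.candidate, self.candidate_since = path, now
        elif now - self.candidate_since >= self.hold:
            self.trusted = path
            self.candidate = self.candidate_since = None
        return self.trusted


f = HysteresisFilter(trusted_path=(19836,), hold_seconds=3600)

# A back-up path appears for 15 minutes and goes away: never accepted.
f.observe(0, (10913, 19836))
short_lived = f.observe(900, (19836,))

# A different path persists for more than an hour: eventually accepted.
f.observe(1000, (11840,))
persistent = f.observe(5000, (11840,))
print(short_lived, persistent)  # (19836,) (11840,)
```

The other two options would replace the purely time-based acceptance with an active test or an off-line check before the candidate becomes trusted.
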
Convergence And Authentication
• BGP suffers from both convergence problems and authentication problems
– Convergence fixes are good, if there are no attacks.
– Authentication fixes work for redundant sites
• Can you improve both convergence and authentication in a realistic environment?
• Do you need to replace BGP?
– If yes, with what? Would you pick BGP for your new network?
– If no, what would you do instead?
• Wide variety of other routing challenges
– Check out CS 580 and the BBGP Project if interested

BGP Measurement and Artifacts
• BGP peers establish a TCP session and send the full route table (120K+ routes)
– Updates are sent only if routes change.
• Our results show frequent session resets between ISP routers and the monitoring point.
– Monitoring point sessions cross multiple systems in the Internet.
– Each reset adds 120K updates.
– But very few ISP-ISP session resets.
• Our work in [1] presents rules to remove session reset artifacts.
[Figure: timeline showing an initial table (120K+ routes), route changes, then another initial table (120K+ routes)]

BGP Updates During Nimda Worm
[Figure: BGP updates during the Nimda worm; labels: Measurement Artifacts, Routing Changes, Total Attack]

What Our Analysis Shows: BGP Advertisements on 9/18/2001
[Figure: pie charts of BGP advertisements by type (BGP Table Exchange, Duplicate Advertisements, New Announcements, Withdraws, Implicit Withdraws); slice labels in the transcript: 42%, 37%, 8%, 8%, 5% and 40.2%, 37.6%, 8.8%, 8.3%]
• A substantial percentage of the BGP messages during the worm attack were not about route changes

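A rough version of this kind of classification can be written against a per-peer routing table. The category definitions below are assumptions based on common usage (a duplicate repeats the stored path, an implicit withdraw replaces it), not the exact rules of the analysis, and detecting table exchanges after session resets is left out.

```python
from collections import Counter


def classify_updates(updates):
    """Count update types for one peer.

    updates: iterable of (kind, prefix, as_path) with kind "A" (announce)
    or "W" (withdraw); as_path is ignored for withdraws.
    """
    table = {}  # prefix -> AS path currently held from this peer
    counts = Counter()
    for kind, prefix, as_path in updates:
        if kind == "W":
            table.pop(prefix, None)
            counts["withdraw"] += 1
        elif prefix not in table:
            table[prefix] = tuple(as_path)
            counts["new announcement"] += 1
        elif table[prefix] == tuple(as_path):
            counts["duplicate advertisement"] += 1
        else:
            table[prefix] = tuple(as_path)
            counts["implicit withdraw"] += 1
    return counts


counts = classify_updates([
    ("A", "p1", (1, 2)),   # new announcement
    ("A", "p1", (1, 2)),   # duplicate advertisement
    ("A", "p1", (1, 3)),   # implicit withdraw
    ("W", "p1", None),     # withdraw
    ("A", "p2", (4,)),     # new announcement
])
print(dict(counts))
```

Only the implicit withdraws, withdraws, and new announcements can reflect route changes; duplicates carry no new routing information.
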
FRTR: Improving Peer Communication
• BGP updates are not (topology) event driven
– Session resets trigger high-volume surges
– Govindan shows cascade failures can result.
• Lifetime of invalid routes is unbounded
– Never recover (until reset) if an update is somehow lost.
– Despite TCP, we found cases of “lost” withdrawals.
– An attacker can poison a route with one update.
• Soft state (periodic re-announce) is too costly…
• FRTR uses periodic Bloom filter digests
– Digests quickly confirm state after a session reset.
– Periodic digests bound the lifetime of faults (with high probability).
• Co-author Keyur Patel (Cisco) is exploring Cisco development.

FRTR Performance
• For each route at the receiver, check against the digest.
– A Bloom filter results in no false negatives.
– Compare total digests for missing route detection.
• False positives are possible with a known rate.
– Add salts to reduce the chance of repeated false positives.
• Overhead is a function of digest size and frequency.
– Work with Cisco suggests a 1.3% overhead increase.
• Complete details to appear in [2] (DSN 2004)

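The receiver-side check can be illustrated with a small salted Bloom filter. This is a generic sketch, not the FRTR wire format: the digest size, the number of hash functions, the use of SHA-256, and the route strings are all assumptions.

```python
import hashlib


class BloomDigest:
    """Salted Bloom filter over route strings."""

    def __init__(self, num_bits=1024, num_hashes=4, salt=b""):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.salt = salt
        self.bits = 0  # bit vector stored in one Python integer

    def _positions(self, item):
        for i in range(self.num_hashes):
            data = self.salt + bytes([i]) + item.encode()
            digest = hashlib.sha256(data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def stale_routes(receiver_routes, digest):
    """Routes the receiver holds that the sender's digest does not confirm."""
    return [r for r in receiver_routes if r not in digest]


# The sender withdrew 10.3.0.0/16, but the withdrawal was lost.
sender_routes = ["10.1.0.0/16 via <A>", "10.2.0.0/16 via <C E A>"]
receiver_routes = sender_routes + ["10.3.0.0/16 via <A>"]

digest = BloomDigest(salt=b"round-1")
for route in sender_routes:
    digest.add(route)

print(stale_routes(receiver_routes, digest))
```

Routes the sender really holds are always confirmed (no false negatives). A stale route can slip through as a false positive, which is why changing the salt between rounds helps: the same route is unlikely to collide again.
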
Packet Delivery during Routing Convergence
• Failures do occur in the Internet
– 20% of intra-ISP links have an MTTF < 1 day [Diot:IMW02]
– 40% of inter-ISP routes have an MTT-Change < 1 day [Labovitz:FTCS-29]
• Routing convergence after a failure takes time
– IS-IS (intra-ISP protocol): 5+ seconds [Diot:IMW02]
– BGP (inter-ISP protocol): 3+ minutes [Labovitz:Sigcomm00]
• Packets can be delivered during convergence
[Figure: topology with nodes A, B, C, D, E, F, G]

What Is the Goal of Routing
• How to maximize packet delivery during routing convergence?
– Topological connectivity’s impact?
– Studying: RIP, Distributed Bellman-Ford (DBF), BGP
– Previous work focused on: preventing loops, minimizing convergence time and routing overhead
• This problem becomes more important with
– Larger Internet topology [Huston01] --> higher frequency of component failures
– Richer connectivity [Huston01] --> potentially helps with more alternate paths
– Higher bandwidth --> more packets sent during convergence

Simulation Conducted
• 7 by 7 mesh topologies similar to those in [Baran64]
• 20 pkts/second
• Measure packet loss, loops, path convergence time, throughput, and e2e delay.
• Simulated node degree range [3 ~ 16]

Packet Losses (I): Observation
[Figure: packet loss versus node degree for RIP, DBF, BGP’, and BGP]
• Packet losses of DBF, BGP’ and BGP decrease to zero at degree 6.
• Richer connectivity helps RIP little.

Packet Loss (II): Lessons Learned
• Keeping alternate paths
• Connectivity matters
– No immediately available alternative due to poor connectivity and poison reverse
– An alternative is more likely with richer connectivity
[Figure: two copies of a topology with nodes A, B, C, D, E, F, labeled "RIP:" and "DBF, BGP:"]

Is an Alternate Path Valid?
• Valid alternate paths: not using the failed link
• Poison reverse and BGP’s path information are not enough! [Pei:Infocom2002]
• Richer connectivity --> reduces a single link’s impact; better availability of a valid (but possibly suboptimal) path
[Figure: topology with nodes A, B, C, D, E, F, U, V, W, X and label C2; path labels of the form "D: < >"]

Transient Loops (I): Observation
[Figure: losses due to loops versus node degree for DBF, BGP’, and BGP]
• BGP has the most loops!
• RIP has no loops
• Richer connectivity reduces the chance of looping.

Transient Loops (II): Msg Propagation
• Damping timer slows the message propagation, causing looping
• Richer connectivity can reduce the chance of looping
• More details in: “A Study of Transient Loops in BGP”
[Figure: topology with nodes A, B, C, D, E, F, U, V, W, X, Y; path labels D:<C A E F>, D:<B A E F>, D:<B C A E F>, D:<C B A E F>; annotation "30 seconds!"]

Instantaneous Throughput
[Figure: throughput (pkts/second) versus time for RIP, DBF, BGP’, and BGP]

Forwarding Path Convergence Time
• BGP: no loss at degree 6 or higher
• Shall we still tune the MRAI timer to minimize convergence time (with the risk of increasing overhead)?
[Figure: plotted against node degree; labels BGP:70 and BGP’:10 for the time until there are no routing messages, BGP:13 and BGP’:2 for the time until the forwarding path from S to D stabilizes]