Internet Routing (COS 598A) Today: Non-Convergence: Policy Conflicts
Jennifer Rexford
http://www.cs.princeton.edu/~jrex/teaching/spring2005
Tuesdays/Thursdays 11:00am-12:20pm
Outline
• Stable Paths Problem
  – The problem BGP is solving
  – Abstract model for BGP
  – Translating reality into SPP
• Conflicting routing policies
  – Examples of policy conflicts
  – Difficulty of identifying conflicts
• Guaranteeing convergence
  – Guidelines based on business relationships
  – Provable convergence without global control
• Recent work and a project idea
What Problem Does a Routing Protocol Solve?
• Most do shortest-path routing
  – Shortest hop count
    • Distance vector routing (e.g., RIP)
  – Shortest path as sum of link weights
    • Link-state routing (e.g., OSPF and IS-IS)
• Policy makes BGP more complicated
  – An AS might not tell a neighbor about a path
    • E.g., Sprint can’t reach UUNET through AT&T
  – An AS might prefer one path over a shorter one
    • E.g., ISP prefers to send traffic through a customer
What is a good model for BGP?
Could Use A Simulation Model
• Simulate the message passing
  – Advertisements and withdrawals
  – Message format
  – Timers
• Simulate the routing policy on each session
  – Filter certain route advertisements
  – Manipulate the attributes of others
• Simulate the decision process
  – Each router applying all the steps per prefix
Feasible, but tedious and ill-suited for formal arguments
Stable Paths Problem (SPP) Instance
• Node
  – BGP-speaking router
  – Node 0 is destination
• Edge
  – BGP adjacency
• Permitted paths
  – Set of routes to 0 at each node
  – Ranking of the paths
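[Figure: example SPP instance with nodes 1-5 and destination 0. Permitted paths at each node, listed from most preferred to least preferred:
  node 1: (1 3 0), (1 0)
  node 2: (2 1 0), (2 0)
  node 3: (3 0)
  node 4: (4 2 0), (4 3 0)
  node 5: (5 2 1 0)]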
A Solution to a Stable Paths Problem
• Solution
  – Path assignment per node
  – Can be the “null” path
• If node u has path uwP
  – {u,w} is an edge in the graph
  – Node w is assigned path wP
• Each node is assigned
  – The highest ranked path consistent with the assignment of its neighbors
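[Figure: nodes 0-4 of the example instance. Permitted paths, most preferred first: node 1: (1 3 0), (1 0); node 2: (2 1 0), (2 0); node 3: (3 0); node 4: (4 2 0), (4 3 0).]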
A solution need not represent a shortest path tree, or a spanning tree.
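The definition of a solution can be checked mechanically. The following is a minimal sketch, assuming the permitted paths are as read off the example figure (the instance itself and the names `PERMITTED`, `best_choice`, and `is_solution` are illustrative, not from the slides). A node's best choice is its highest-ranked path uwP whose next hop w is currently assigned wP.

```python
# Permitted paths per node, most preferred first (assumed from the example figure).
PERMITTED = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 0)],
    4: [(4, 2, 0), (4, 3, 0)],
    5: [(5, 2, 1, 0)],
}
NULL = ()  # the "null" path


def best_choice(node, permitted, assignment):
    """Highest-ranked permitted path uwP such that neighbor w is assigned wP."""
    for path in permitted[node]:
        next_hop = path[1]
        if assignment.get(next_hop, NULL) == path[1:]:
            return path
    return NULL


def is_solution(permitted, assignment):
    """True if every node holds its best choice given its neighbors' paths."""
    full = dict(assignment)
    full[0] = (0,)  # the destination always has the trivial path
    return all(full.get(u, NULL) == best_choice(u, permitted, full)
               for u in permitted)


candidate = {1: (1, 3, 0), 2: (2, 0), 3: (3, 0), 4: (4, 2, 0), 5: NULL}
print(is_solution(PERMITTED, candidate))  # True
```

In this candidate, node 1 uses a two-hop path although a one-hop path exists, and node 5 ends up with the null path, which illustrates why a solution need not be a shortest path tree or a spanning tree.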
Translating a Real Configuration into SPP
• Permitted paths at a node
  – Composition of export policies at other nodes
• Ranking of paths at a node
  – Import policies at the node
  – Rank in terms of BGP decision process (i.e., local preference, AS path length, origin type, MED, …)
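[Figure: node 5 has the single permitted path (5 2 1 0); node 2 has permitted paths (2 1 0), (2 0); destination 0. Export annotations:]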
Node 0 exports route to node 2
Node 2 exports “2 1 0” but not “2 0”
Node 1 exports “1 0” to node 2
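One way to picture the ranking step is as a sort key built from the first few steps of the BGP decision process. This is a simplified sketch: the `Route` class, the attribute values, and the restriction to local preference, AS path length, and origin type are assumptions made for illustration (MED and later tie-breaks are left out because MED is only comparable between routes from the same neighboring AS).

```python
from dataclasses import dataclass

# Origin types in decision-process order: lower rank is preferred.
ORIGIN_RANK = {"IGP": 0, "EGP": 1, "INCOMPLETE": 2}


@dataclass(frozen=True)
class Route:
    as_path: tuple          # SPP-style path, own node first, destination last
    local_pref: int         # set by the import policy at this node
    origin: str = "IGP"


def decision_key(route):
    """Higher local preference, then shorter path, then better origin type."""
    return (-route.local_pref, len(route.as_path), ORIGIN_RANK[route.origin])


def rank_permitted_paths(routes):
    """Return the SPP ranking (most preferred first) implied by the attributes."""
    return [r.as_path for r in sorted(routes, key=decision_key)]


# Hypothetical import policy at node 2: prefer the route through node 1.
routes_at_2 = [Route((2, 0), local_pref=100), Route((2, 1, 0), local_pref=200)]
print(rank_permitted_paths(routes_at_2))  # [(2, 1, 0), (2, 0)]
```

The example shows how an import policy that raises local preference puts a longer path ahead of a shorter one in the SPP ranking.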
An SPP May Have Multiple Solutions
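[Figure, shown three times (the instance, the first solution, and the second solution): nodes 1 and 2 are adjacent to each other and to destination 0. Permitted paths, most preferred first: node 1: (1 2 0), (1 0); node 2: (2 1 0), (2 0). In one solution node 1 takes (1 2 0) and node 2 takes (2 0); in the other node 1 takes (1 0) and node 2 takes (2 1 0).]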
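For an instance this small, the solutions can be enumerated by brute force. The sketch below tries every combination of permitted (or null) paths and keeps the stable ones; the function names and the `TWO_NODE` instance encoding are illustrative assumptions.

```python
from itertools import product


def best_choice(node, permitted, assignment):
    """Highest-ranked permitted path whose tail matches the next hop's path."""
    for path in permitted[node]:
        if assignment.get(path[1], ()) == path[1:]:
            return path
    return ()


def all_solutions(permitted):
    """Enumerate every path assignment and keep the stable ones."""
    nodes = sorted(permitted)
    options = [permitted[u] + [()] for u in nodes]  # () is the null path
    solutions = []
    for combo in product(*options):
        assignment = dict(zip(nodes, combo))
        assignment[0] = (0,)
        if all(assignment[u] == best_choice(u, permitted, assignment)
               for u in nodes):
            solutions.append({u: assignment[u] for u in nodes})
    return solutions


# Each node prefers the path through the other node over its direct path.
TWO_NODE = {1: [(1, 2, 0), (1, 0)], 2: [(2, 1, 0), (2, 0)]}
for solution in all_solutions(TWO_NODE):
    print(solution)
```

The enumeration is exponential in the number of nodes, so it is only useful for slide-sized examples.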
An SPP May Have No Solution
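[Figure: nodes 1, 2, and 3 around destination 0. Permitted paths, most preferred first: node 1: (1 3 0), (1 0); node 2: (2 1 0), (2 0); node 3: (3 2 0), (3 0).]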
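The lack of a solution shows up as oscillation when the nodes repeatedly pick their best available path. The sketch below assumes a fixed round-robin activation order (real BGP timing is far less regular) and reports whether the system converges or revisits an earlier state; `run_round_robin` and the instance names are illustrative.

```python
def best_choice(node, permitted, assignment):
    """Highest-ranked permitted path whose tail matches the next hop's path."""
    for path in permitted[node]:
        if assignment.get(path[1], ()) == path[1:]:
            return path
    return ()


def run_round_robin(permitted, max_rounds=50):
    """Activate nodes in a fixed order until nothing changes or a state repeats."""
    nodes = sorted(permitted)
    assignment = {u: () for u in nodes}
    assignment[0] = (0,)
    seen = set()
    for _ in range(max_rounds):
        snapshot = tuple(assignment[u] for u in nodes)
        if snapshot in seen:
            return "oscillates", assignment
        seen.add(snapshot)
        changed = False
        for u in nodes:
            new_path = best_choice(u, permitted, assignment)
            if new_path != assignment[u]:
                assignment[u] = new_path
                changed = True
        if not changed:
            return "converged", assignment
    return "undecided", assignment


# Three nodes, each preferring the path through its neighbor over the direct path.
NO_SOLUTION = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 2, 0), (3, 0)],
}
print(run_round_robin(NO_SOLUTION)[0])  # oscillates
```

A repeated state under one schedule does not by itself prove that no solution exists; here the cyclic preferences rule out any stable assignment, so no activation order can converge.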
Stable System Unstable After Failure
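[Figure: nodes 1-4 and destination 0. Permitted paths, most preferred first: node 1: (1 3 0), (1 0); node 2: (2 1 0), (2 0); node 3: (3 4 2 0), (3 0); node 4: (4 0), (4 2 0), (4 3 0).]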
Becomes a BAD GADGET if link (4, 0) goes down.
BGP is not robust: it is not guaranteed to recover from network failures.
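The effect of the failure can be reproduced by deleting every permitted path that crosses the failed link and counting solutions before and after. This is a brute-force sketch; the helper names (`fail_link`, `count_solutions`) and the dictionary encoding of the instance are assumptions for illustration.

```python
from itertools import product


def best_choice(node, permitted, assignment):
    """Highest-ranked permitted path whose tail matches the next hop's path."""
    for path in permitted[node]:
        if assignment.get(path[1], ()) == path[1:]:
            return path
    return ()


def count_solutions(permitted):
    """Count stable path assignments by exhaustive enumeration."""
    nodes = sorted(permitted)
    options = [permitted[u] + [()] for u in nodes]
    count = 0
    for combo in product(*options):
        assignment = dict(zip(nodes, combo))
        assignment[0] = (0,)
        if all(assignment[u] == best_choice(u, permitted, assignment)
               for u in nodes):
            count += 1
    return count


def fail_link(permitted, a, b):
    """Remove every permitted path that traverses the link {a, b}."""
    def uses_link(path):
        return any({x, y} == {a, b} for x, y in zip(path, path[1:]))
    return {u: [p for p in paths if not uses_link(p)]
            for u, paths in permitted.items()}


BEFORE = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 4, 2, 0), (3, 0)],
    4: [(4, 0), (4, 2, 0), (4, 3, 0)],
}
print(count_solutions(BEFORE))                   # 1
print(count_solutions(fail_link(BEFORE, 4, 0)))  # 0
```

While link (4, 0) is up, node 4 always takes (4 0), which denies node 3 its preferred path (3 4 2 0) and breaks the cycle of preferences.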
Strawman Solution Doesn’t Work
• Create a global Internet routing registry
  – Store the AS-level graph and all routing policies
  – But, ASes may be unwilling to divulge them
• Check for conflicting policies
  – Analyze the global system and identify conflicts
  – Contact the affected ASes to resolve them
  – But, checking is an NP-complete problem
  – … and, a safe system may be unsafe after failure
Goal: sufficient condition for convergence with local control
Guaranteeing Convergence
Think Globally, Act Locally
• Key features of a good solution
  – Flexibility: allow diverse local policies for each AS
  – Privacy: do not force ASes to divulge their policies
  – Backwards-compatibility: no changes to BGP
  – Guarantees: convergence even if system changes
• Restrictions based on AS relationships
  – Path selection rules: which route you prefer
  – Export policies: who you tell about your route
  – AS graph structure: who is connected to whom
Customer-Provider Relationship
• Customer pays provider for Internet access
  – Provider exports customer’s routes to everybody
  – Customer exports provider’s routes only to its own downstream customers
[Figure: two panels, “Traffic to the customer” and “Traffic from the customer”, each showing a provider, a customer, and destination d, with arrows for advertisements and traffic.]
Peer-Peer Relationship
• Peers exchange traffic between customers
  – AS exports only customer routes to a peer
  – AS exports a peer’s routes only to its customers
[Figure: “Traffic to/from the peer and its customers”, showing two peers and destination d, with arrows for advertisements and traffic.]
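The export rules for customer-provider and peer-peer relationships reduce to a small predicate on where a route was learned and where it would be sent. The sketch below is one way to write it; the function name and string labels are illustrative, and an AS's own routes are treated like customer routes.

```python
CUSTOMER, PEER, PROVIDER = "customer", "peer", "provider"


def should_export(learned_from, send_to):
    """Decide whether to advertise a route to a neighbor.

    learned_from: relationship of the neighbor the route was learned from
    send_to:      relationship of the neighbor that would receive it
    """
    if learned_from == CUSTOMER:
        return True                # customer routes go to everybody
    return send_to == CUSTOMER     # peer and provider routes go only to customers


for learned_from in (CUSTOMER, PEER, PROVIDER):
    for send_to in (CUSTOMER, PEER, PROVIDER):
        print(f"{learned_from:>8} -> {send_to:<8} {should_export(learned_from, send_to)}")
```

The table it prints makes the economic logic visible: an AS carries traffic between two neighbors only when at least one of them is a paying customer.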
Hierarchical AS Relationships
• Provider-customer graph is directed & acyclic
  – If u is a customer of v and v is a customer of w
  – … then w is not a customer of u
[Figure: ASes u, v, and w, where u is a customer of v and v is a customer of w.]
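Whether a set of customer-provider relationships is hierarchical can be tested with an ordinary cycle check on the directed graph. The following is a minimal sketch using depth-first search; the `providers_of` mapping and the function name are assumptions for illustration.

```python
def has_customer_provider_cycle(providers_of):
    """Return True if following customer -> provider edges can loop back.

    providers_of maps each AS to the set of ASes it buys transit from.
    """
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current DFS path, finished
    color = {}

    def visit(u):
        color[u] = GRAY
        for v in providers_of.get(u, ()):
            state = color.get(v, WHITE)
            if state == GRAY:              # reached an AS already on the path
                return True
            if state == WHITE and visit(v):
                return True
        color[u] = BLACK
        return False

    return any(color.get(u, WHITE) == WHITE and visit(u)
               for u in list(providers_of))


hierarchy = {"u": {"v"}, "v": {"w"}, "w": set()}
print(has_customer_provider_cycle(hierarchy))          # False
violating = {"u": {"v"}, "v": {"w"}, "w": {"u"}}
print(has_customer_provider_cycle(violating))          # True
```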
Local Path Selection Rules
• Classify routes based on next-hop AS
  – Customer routes, peer routes, and provider routes
• Rank routes based on classification
  – Prefer customer routes over peer/provider routes
• Allow any ranking of routes within a class
  – E.g., rank one customer route higher than another
  – Gives network operators the flexibility they need
• Consistent with traffic engineering practices
  – Customers pay for service, and providers are paid
  – Peer relationship based on balanced traffic load
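The selection rule can be sketched as a two-part sort key: the class of the next-hop AS first, then whatever preference the operator likes within a class. The AS numbers, the `operator_pref` callback, and the function name below are hypothetical.

```python
def rank_routes(routes, relationship, operator_pref):
    """Rank AS paths: customer routes first, then the operator's own preference.

    routes:        AS paths as tuples, own AS first
    relationship:  maps a next-hop AS to "customer", "peer", or "provider"
    operator_pref: any function from path to a number (higher is better)
    """
    def key(path):
        next_hop_class = relationship[path[1]]
        return (0 if next_hop_class == "customer" else 1, -operator_pref(path))
    return sorted(routes, key=key)


# Hypothetical AS 10 with a customer (20), a peer (30), and a provider (40).
relationship = {20: "customer", 30: "peer", 40: "provider"}
routes = [(10, 30, 99), (10, 20, 21, 99), (10, 40, 99)]
ranked = rank_routes(routes, relationship, operator_pref=lambda p: -len(p))
print(ranked[0])  # (10, 20, 21, 99)
```

Even with an operator preference for short paths, the longer customer route wins, because the class comparison comes before the within-class ranking.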
Two Interpretations
• System is stable because ASes act like this
  – High-level argument
    • Export and topology assumptions are reasonable
    • Path selection rule matches with financial incentives
  – Empirical results
    • BGP routes for popular destinations stable for ~10 days
    • Most instability from a few flapping destinations
• ASes should follow rules for system stability
  – Encourage operators to obey these guidelines
  – … and provide ways to verify the configuration
  – Need to consider more complex relationships
Playing One Condition Off Against Another
• All three conditions are important
  – Path ranking, export policy, and graph structure
• Allowing more flexibility in ranking routes
  – Allow same preference for peer and customer routes
  – Never choose a peer route over a shorter customer route
• … at the expense of stricter AS graph assumptions
  – Hierarchical provider-customer relationship (as before)
  – No private peering with (direct or indirect) providers
Extension to Backup Relationships
• Backups: liberal export and ranking policies
  – The motivation is increased reliability
  – … but ironically it may cause routing instability!
[Figure: two panels. “Backup Provider”: a primary provider, a backup provider, a failure, and the resulting backup path. “Peer-Peer Backup [RFC 1998]”: a peer, a provider, a failure, and the resulting backup path.]
Backup Path Needs Global Significance
[Figure: nodes 2, 3, and 4 in the top row; nodes 1 and 0 in the bottom row.]
• Peer-backup relationship between 0 and 1
  – Adds backup paths (2,1,0), (3,1,0), …
• When link {2,0} fails…
  – Node 2 prefers (2,3,1,0) through a peer over the backup path (2,1,0)
  – Leads to the “bad gadget” example
Backup Paths: Keeping Count of Backup Edges
• Solution
  – Prefer routes with fewest backup links
  – Then, break ties by preferring customer routes
• Mechanism
  – Tag BGP route advertisement with a counter
  – Increment the count as you cross a backup edge
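[Figure: nodes 2, 3, and 4 in the top row; nodes 1 and 0 in the bottom row. Paths at node 2: (2 0), (2 1 0), (2 3 1 0), (2 4 1 0). Labels: “No backup”; “One backup, customer”; “One backup, peer”.]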
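The counter mechanism can be sketched as an advertisement that carries a backup count, incremented whenever it crosses a backup edge, with selection ordered by fewest backup links and then by customer routes. The `Advert` class, the helper names, and the assumption that node 1 is a customer of node 2 while node 3 is a peer are illustrative readings of the figure labels, not confirmed details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advert:
    path: tuple            # receiving node first, destination last
    backup_count: int = 0  # number of backup edges crossed so far


def propagate(advert, receiver, edge_is_backup):
    """Extend the path by one hop and bump the counter on a backup edge."""
    extra = 1 if edge_is_backup else 0
    return Advert((receiver,) + advert.path, advert.backup_count + extra)


def select(adverts, relationship):
    """Fewest backup links first; break ties in favor of customer routes."""
    def key(a):
        is_customer = relationship.get(a.path[1]) == "customer"
        return (a.backup_count, 0 if is_customer else 1)
    return min(adverts, key=key)


origin = Advert((0,))
at_1 = propagate(origin, 1, edge_is_backup=True)       # {1,0} is the backup edge
direct = propagate(origin, 2, edge_is_backup=False)    # (2, 0)
via_1 = propagate(at_1, 2, edge_is_backup=False)       # (2, 1, 0)
via_3 = propagate(propagate(at_1, 3, False), 2, False) # (2, 3, 1, 0)

relationship_at_2 = {1: "customer", 3: "peer", 4: "peer"}  # assumed
print(select([via_3, via_1, direct], relationship_at_2).path)  # (2, 0)
print(select([via_3, via_1], relationship_at_2).path)          # (2, 1, 0)
```

With the counter in place, node 2 falls back to (2 1 0) rather than (2 3 1 0) when link {2,0} fails, because both carry one backup edge and the tie goes to the customer route.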
Recent Work
Recent Work: Relaxing Export Rules
• Goal: no restrictions on export and topology
  – Allow an AS to decide whether to export
  – Do not require hierarchical relationships
• Question
  – How much do you have to restrict path ranking to have a guarantee that the system is safe?
• Answer
  – Limited to shortest-path routing
• Implications
  – Trade-off in safety, autonomy, & expressiveness
Recent work by Nick Feamster and Ramesh Johari
Recent Work: MED Oscillation (RFC 3345)
• MED comparison when next-hop AS is same
• No total ordering at the leftmost router
  – B > A: preferring smaller router-id
  – C > B: preferring smaller MED attribute
  – A > C: preferring eBGP-learned over iBGP
[Figure: routes A (Id=2), B (Id=1, MED=20), and C (MED=10), an iBGP session, and neighboring ASes AS 1 and AS 2.]
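The missing total order can be reproduced with a pairwise comparison that applies MED only between routes from the same neighboring AS. This is a sketch under assumptions: the slide gives only the router-ids of A and B and the MEDs of B and C, so the assignment of routes to AS 1 and AS 2, A's MED, C's router-id, and which routes are eBGP-learned are filled in here purely so that the three stated preferences come out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    neighbor_as: int   # AS the route was learned from (assumed)
    med: int
    ebgp: bool         # True if learned over eBGP, False if over iBGP
    router_id: int


def prefer(x, y):
    """Pick between two routes: MED (same neighbor AS only), eBGP, router-id."""
    if x.neighbor_as == y.neighbor_as and x.med != y.med:
        return x if x.med < y.med else y
    if x.ebgp != y.ebgp:
        return x if x.ebgp else y
    return x if x.router_id < y.router_id else y


A = Route("A", neighbor_as=1, med=0, ebgp=True, router_id=2)    # MED assumed
B = Route("B", neighbor_as=2, med=20, ebgp=True, router_id=1)
C = Route("C", neighbor_as=2, med=10, ebgp=False, router_id=3)  # Id assumed

print(prefer(A, B).name)  # B  (smaller router-id)
print(prefer(B, C).name)  # C  (smaller MED, same neighbor AS)
print(prefer(A, C).name)  # A  (eBGP-learned over iBGP)
```

Each comparison is sensible on its own, but together they form a cycle, so which route wins depends on which alternatives happen to be visible at the time.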
Project Idea: Stable Paths Problem and Root-Cause Analysis
Project Idea: Root-Cause Analysis
• Root-cause analysis
  – Identify location and cause of routing changes
  – Inference from BGP protocol messages
• Active area of research
  – Several proposed algorithms
  – Limited accuracy in making inferences
• Research question
  – Is the problem just very hard?
  – Does the data not reveal enough information?
• Project idea: study using SPP
Project Idea, Continued
• Model root-cause analysis
  – Start with an SPP instance
  – Fail a link (or a node)
  – See what path changes would occur
• What events might cause these changes?
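[Figure: nodes 1, 2, and 3 in the top row; nodes 4 and 0 in the bottom row. Permitted paths, most preferred first: node 1: (1 2 0), (1 0); node 2: (2 3 4 0), (2 0); node 3: (3 4 0), (3 2 0); node 4: (4 0).]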
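The modeling steps can be tried out directly on the instance in the figure: compute the stable assignment, fail a link, recompute, and list the nodes whose paths changed. This is a sketch with assumed helper names (`stable_assignment`, `fail_link`, `path_changes`), and it relies on this particular instance converging under round-robin activation.

```python
def best_choice(node, permitted, assignment):
    """Highest-ranked permitted path whose tail matches the next hop's path."""
    for path in permitted[node]:
        if assignment.get(path[1], ()) == path[1:]:
            return path
    return ()


def stable_assignment(permitted, max_rounds=100):
    """Iterate best choices in a fixed order until nothing changes."""
    assignment = {u: () for u in permitted}
    assignment[0] = (0,)
    for _ in range(max_rounds):
        changed = False
        for u in sorted(permitted):
            new_path = best_choice(u, permitted, assignment)
            if new_path != assignment[u]:
                assignment[u] = new_path
                changed = True
        if not changed:
            return {u: assignment[u] for u in permitted}
    raise RuntimeError("did not converge within max_rounds")


def fail_link(permitted, a, b):
    """Remove every permitted path that traverses the link {a, b}."""
    def uses_link(path):
        return any({x, y} == {a, b} for x, y in zip(path, path[1:]))
    return {u: [p for p in paths if not uses_link(p)]
            for u, paths in permitted.items()}


def path_changes(permitted, a, b):
    """Map each node whose path changes to its (before, after) paths."""
    before = stable_assignment(permitted)
    after = stable_assignment(fail_link(permitted, a, b))
    return {u: (before[u], after[u]) for u in permitted if before[u] != after[u]}


INSTANCE = {
    1: [(1, 2, 0), (1, 0)],
    2: [(2, 3, 4, 0), (2, 0)],
    3: [(3, 4, 0), (3, 2, 0)],
    4: [(4, 0)],
}
for node, (old, new) in sorted(path_changes(INSTANCE, 3, 4).items()):
    print(node, old, "->", new)
```

In this instance node 1 changes its path even though none of its permitted paths uses link {3,4}, and failing link {4,0} instead produces the same changes at nodes 1, 2, and 3, which hints at why inferring the location of a failure from observed changes is hard.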
Questions
• Can you infer cause and location?
  – If you observe routing changes at all nodes
  – If you observe only some of the nodes
• What if you make some assumptions?
  – E.g., policies based on business relationships
• Where would you place monitors?
  – Best locations to place n monitors
  – Minimum number of monitors you need
• What changes would you make to the routing protocol to make diagnosis easier?
Next Time: Hot-Potato Routing
• Two papers
  – “Dynamics of Hot-Potato Routing in IP Networks”
  – “TIE Breaking: Tunable Interdomain Egress Selection”
• NANOG video
  – Covering material in the first paper
• In honor of spring break
  – No written reviews
• Talk with me about your course project
  – ... by Thursday March 24
  – Final written report due Tuesday May 10