Advanced Topics in Routing

64
Advanced Topics in Routing EE 122: Intro to Communication Networks Fall 2010 (MW 4-5:30 in 101 Barker) Scott Shenker TAs: Sameer Agarwal, Sara Alspaugh, Igor Ganichev, Prayag Narula http://inst.eecs.berkeley.edu/~ee122/ Materials with thanks to Jennifer Rexford, Ion Stoica, Vern Paxson and other colleagues at Princeton and UC Berkeley

description

Advanced Topics in Routing. EE 122: Intro to Communication Networks Fall 2010 (MW 4-5:30 in 101 Barker) Scott Shenker TAs: Sameer Agarwal, Sara Alspaugh, Igor Ganichev, Prayag Narula http://inst.eecs.berkeley.edu/~ee122/ - PowerPoint PPT Presentation

Transcript of Advanced Topics in Routing

Page 1: Advanced Topics in Routing

Advanced Topics in Routing

EE 122: Intro to Communication Networks

Fall 2010 (MW 4-5:30 in 101 Barker)

Scott Shenker

TAs: Sameer Agarwal, Sara Alspaugh, Igor Ganichev, Prayag Narula

http://inst.eecs.berkeley.edu/~ee122/

Materials with thanks to Jennifer Rexford, Ion Stoica, Vern Paxsonand other colleagues at Princeton and UC Berkeley

Page 2: Advanced Topics in Routing

Routing Lectures

• Link-layer (L2)– Self-learning

• Intradomain (L3)– Link-state– Distance vector

• Interdomain– Path-vector

2

Page 3: Advanced Topics in Routing

But there is more to the story….

• Normally these three approaches (LS, DV, PV) are presented as the holy trinity of routing

• But we know how to do better

• That is what we will talk about today…..– Augmenting LS with Failure-Carrying Packets– Augmenting DV with Routing Along DAGs– Augmenting PV with Policy Dispute Resolution

3

Page 4: Advanced Topics in Routing

Self-Learning

• Requirements:– Plug-and-play (no management needed)– Flat address space

• Forwarding rules:– Watch which port MAC addresses come from– When in doubt, flood to all other ports

• Spanning tree needed for flooding– Does not use shortest data paths– Network unusable while spanning tree recomputed

4

Page 5: Advanced Topics in Routing

Intradomain (L3)

• Link-state:– Global state, local computation– Flood local topology information– Each router computes shortest paths on graph– Requires routers to have globally consistent state

• Distance-vector:– Local state, global computation– Routers participate in distributed computation– Exchange current views on “shortest paths”– Hard to prevent loops while responding to failures

o E.g., count-to-infinity5

Page 6: Advanced Topics in Routing

Interdomain: Path-vector

• Path-vector enables:– General policy autonomy for import and export– Easy detection of loops

• Disadvantages of path-vector– High churn rate (many updates) [why higher than DV?]– Convergence after failure can be slow– Policy oscillations possible

o Even limited degrees of autonomy can result in policy oscillations

6

Page 7: Advanced Topics in Routing

Three Routing Challenges

• Resilience– Slow convergence after failures, worse as networks grow– Network not reliable during convergence– Most important barrier to a more reliable Internet

o Goal is 99.999% availability, now 99.9% at besto Gaming, media, VoIP, finance (trading) make this more important

• Traffic engineering (what is this?)– Current traffic engineering mechanisms cumbersome– Must find more adaptive methods

• Policy oscillations in interdomain routing7

Page 8: Advanced Topics in Routing

Outline of Lecture

• Multipath Routing (one slide)

• Failure-Carrying Packets

• Routing Along DAGs

• Policy Dispute Resolution

8

Page 9: Advanced Topics in Routing

Multipath Routing

• Multipath: – Providing more than one path for each S-D pair– Allow endpoints to choose among them

• Helps to solve:– Resilience: if one path goes down, can use another– Traffic engineering: let network or endpoints spread

traffic over multiple paths

• Challenges:– Scalability (various approaches, none ideal)– Delay for endpoints to detect failure and switch paths

9

Page 10: Advanced Topics in Routing

Failure-Carrying Packets(FCP)

10

Page 11: Advanced Topics in Routing

Dealing with Link Failures

• Traditional link-state routing requires global reconvergence after failures– Flood new state, then recompute

o In the meantime, looping is possible due to inconsistencies

– Can speed up by tuning timers, etc.

• Can precompute some number of backup paths– Need backup paths for every failure scenario

• Question: can we completely eliminate the need to “reconverge” after link failures?

11

Page 12: Advanced Topics in Routing

• Ensure all routers have consistent view of network– But this view can be out-of-date– Consistency is easy if timeliness not required

• Use reliable flooding– Each map has sequence number

• Routers write this number in packet headers, so packets are routing according to the same “map”– Routers can decrement this counter, not increment it– Eventually all routers use the same graph to route packet

Our Approach: Step 1

12

Page 13: Advanced Topics in Routing

Our Approach: Step 2

• Carry failure information in the packets!– Use this information to “fix” the local maps

• When a packet arrives and the next-hop link for the path computed with the consistent state is down, insert failure information into packet header– Then compute new paths assuming that link is down

• If failure persists, it will be included in next consistent picture of network

13

Page 14: Advanced Topics in Routing

Example: FCP routing

B D

C E

A FIP packetsource destination

14

Page 15: Advanced Topics in Routing

F

Example: FCP routing

B D

C E

Asource destination

IP packet

(C,E) IP packet

(D,F) (C,E) IP packet

15

Page 16: Advanced Topics in Routing

• Eliminates the convergence process

• Guarantees packet delivery– As long as a path exists during failure process

• Major conceptual change– Don’t rely solely on protocols to keep state consistent– Information carried in packets ensures eventual

consistency of route computation

Properties of FCP

16

Page 17: Advanced Topics in Routing

17

Results: OSPF vs. FCP

• Unlike FCP, OSPF cannot simultaneously provide low churn and high availability

0

50

100

150

200

250

300

350

400

0.01 1 100 10000OSPF hello interarrival time [sec]

Ove

rhe

ad

[

0.00.10.20.30.40.50.60.70.80.91.0

Lo

ss r

ate

Ove

rhea

d [

ms

gs/

sec

pe

r lin

k]

OSPF-overhead

FCP-lossrate

OSPF-lossrate

Page 18: Advanced Topics in Routing

18

Results: Backup-paths vs. FCP

• Unlike FCP, Backup-paths cannot simultaneously provide low state and lossrate

0

2000

4000

6000

8000

10000

12000

14000

1 10 100Number of backup paths

Sta

te [e

ntr

ies

pe

r ro

ute

r]

0

0.005

0.01

0.015

0.02

0.025

0.03

0.035

0.04

0.045

Lo

ss r

ate

Bkup-state

FCP-state FCP-lossrate

Bkup-lossrate

Page 19: Advanced Topics in Routing

Problems with FCP

• Requires changes to packet header

• Does not address traffic engineering

19

Page 20: Advanced Topics in Routing

Routing Along DAGs(RAD)

20

Page 21: Advanced Topics in Routing

Avoiding Recomputation: Take II

• Recover from failures without global recomputation

• Support locally adaptive traffic engineering

• Do so in ways that work at both L2 and L3

• Without any change in packet headers, etc.

21

Page 22: Advanced Topics in Routing

Move from path to DAG (Directed Acyclic Graph)Routing compute paths from source to destination

If a link fails, all affected paths must be recomputed

Path

Our Approach: Shift the Paradigm

DAG

X

22

X

Packets can be sent on any of the DAG’s outgoing linksNo need for global recomputation after each failure

Page 23: Advanced Topics in Routing

DAG Properties

• Guaranteed loop-free

• Local decision for failure recovery

• Adaptive load balancing

X

X 0.7

0.3

23

Page 24: Advanced Topics in Routing

Load Balancing

• Use local decisions:– Choose which outgoing links to use– Decide how to spead the load across these links– Push back when all outgoing links are congested

o Send congestion signal on incoming links to upstream nodes

• Theorem:– When all traffic goes to a single destination, local load

balancing leads to optimal throughput

• Simulations:– In general settings, local load balancing close to optimal

24

Page 25: Advanced Topics in Routing

Computing DAG

• DAG iff link directions follow global order

• Computing a DAG for destination v is simple:– Essentially a shortest-path computation– With consistent method of breaking ties

25

Page 26: Advanced Topics in Routing

What about Connectivity?

• Multiple outgoing links improve connectivity– But can RAD give “perfect” connectivity?

• If all outbound links fail that node is disconnected– Even if underlying graph is still connected

• How can we fix this?

26

Page 27: Advanced Topics in Routing

Link Reversal

• If all outgoing links fail, reverse incoming links to outgoing

XX

XX

27

Page 28: Advanced Topics in Routing

Link Reversal Properties

• Always loop-free

• Local reaction, not global recomputation

• The scope of link reversal is as local as possible

• Connectivity guaranteed!– If graph is connected, link reversal process will restore

connectivity in DAG

• This has been known in wireless literature– Now being applied to wired networks

28

Page 29: Advanced Topics in Routing

Summary of RAD

• Local responses lead to:– Guaranteed connectivity– Close-to-optimal load balancing

• Can be used for L2 and/or L3– No change in packet headers

29

Page 30: Advanced Topics in Routing

30

5 Minute Break

Questions Before We Proceed?

Page 31: Advanced Topics in Routing

Announcements

• HW3a was due today

• HW3b will be posted tonight, due in two weeks

• Midterm regrading requests due today– Check our addition!

31

Page 32: Advanced Topics in Routing

Policy Dispute Resolution

Page 33: Advanced Topics in Routing

Problem: Policy Oscillations

• Policy Oscillations are:– Persistent– Hard to determine in advance

• Involve delicate interplay of domain policies– Not issue of correctness, but of conflicting preferences– Therefore, can’t just define them away

• Need new approach

33

Page 34: Advanced Topics in Routing

Objectives

• Do not reveal any ISP policies

• Distributed, online dispute detection and resolution

• Routers get to select most preferred route if no dispute resulting in oscillations exist

• Account for transient oscillations, don’t permanently blacklist routes

34

Page 35: Advanced Topics in Routing

35

Example of Policy Oscillation

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

“1” prefers “1 3 0” over “1 0” to reach “0”

Page 36: Advanced Topics in Routing

36

Step-by-Step of Policy Oscillation

Initially: nodes 1, 2, 3 know only shortest path to 0

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

Page 37: Advanced Topics in Routing

37

1 advertises its path 1 0 to 2

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0adve

rtis

e: 1

0

Step-by-Step of Policy Oscillation

Page 38: Advanced Topics in Routing

38

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

Step-by-Step of Policy Oscillation

Page 39: Advanced Topics in Routing

39

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

advertise: 3 0

3 advertises its path 3 0 to 1

Step-by-Step of Policy Oscillation

Page 40: Advanced Topics in Routing

40

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

Step-by-Step of Policy Oscillation

Page 41: Advanced Topics in Routing

41

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0with

draw

: 1 0

1 withdraws its path 1 0 from 2

Step-by-Step of Policy Oscillation

Page 42: Advanced Topics in Routing

42

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

Step-by-Step of Policy Oscillation

Page 43: Advanced Topics in Routing

43

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

advertise: 2 0

2 advertises its path 2 0 to 3

Step-by-Step of Policy Oscillation

Page 44: Advanced Topics in Routing

44

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

Step-by-Step of Policy Oscillation

Page 45: Advanced Topics in Routing

45

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

withdraw: 3 0

3 withdraws its path 3 0 from 1

Step-by-Step of Policy Oscillation

Page 46: Advanced Topics in Routing

46

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

Step-by-Step of Policy Oscillation

Page 47: Advanced Topics in Routing

47

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

1 advertises its path 1 0 to 2

Step-by-Step of Policy Oscillation

adve

rtis

e: 1

0

Page 48: Advanced Topics in Routing

48

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

Step-by-Step of Policy Oscillation

Page 49: Advanced Topics in Routing

49

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

withdraw: 2 0

2 withdraws its path 2 0 from 3

Step-by-Step of Policy Oscillation

Page 50: Advanced Topics in Routing

50

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

We are back to where we started!

Step-by-Step of Policy Oscillation

Page 51: Advanced Topics in Routing

Nodes See Signs of Trouble

• Route choices oscillation– Node 1:

o 1 0 , 1 3 0 , 1 0 , 1 3 0 , …..

– Node 2:o 2 0 , 2 1 0 , 2 0 , 2 1 0 , …..

– Node 3:o 3 0 , 3 2 0 , 3 0 , 3 2 0 , …..

• Choices alternate between more preferred and less preferred routes

51

Page 52: Advanced Topics in Routing

Basic Idea

• If node notices that it selects routes that are more / less preferred than previous route– Node thinks it may be involved in oscillation

• Computes local “precedence” figure– Higher precedence value for less preferred routes– In example, 1 0 gets higher value than 1 3 0

• Route advertisements carry this precedence– Two precedence values:

o Incoming (carried by packet)o Local (determined by own past history)

52

Page 53: Advanced Topics in Routing

Precedence Calculation

• Routes are first ranked by “incoming precedence”– Pick most preferred route among those with lowest

incoming precedence value

• Outgoing precedence is sum of incoming and local precedence

53

Page 54: Advanced Topics in Routing

Use of Precedence Values

• Maintain history of routes encountered during oscillations

• In table below, prefer P0 to P1 to P2

• Pick path P2, mark with precedence 1

54

Page 55: Advanced Topics in Routing

55

Example of Policy Oscillation

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

Page 56: Advanced Topics in Routing

56

1 advertises its path 1 0 to 2

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0adve

rtis

e: 1

0

Prec

eden

ce 0

Step-by-Step of Policy Oscillation

Page 57: Advanced Topics in Routing

57

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

Step-by-Step of Policy Oscillation

Page 58: Advanced Topics in Routing

58

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

advertise: 3 0

Precedence 0

3 advertises its path 3 0 to 1

Step-by-Step of Policy Oscillation

Page 59: Advanced Topics in Routing

59

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

Step-by-Step of Policy Oscillation

Page 60: Advanced Topics in Routing

60

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0with

draw

: 1 0

1 withdraws its path 1 0 from 2

Step-by-Step of Policy Oscillation

Page 61: Advanced Topics in Routing

61

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

Step-by-Step of Policy Oscillation

Page 62: Advanced Topics in Routing

Routes stabilize at this point3 cannot choose 2 0 route from 2, because of higher precedence value

62

1

2 3

1 3 0 1 0

3 2 0 3 0

2 1 0 2 0

0

advertise: 2 0Precedence 1

2 advertises its path 2 0 to 3

Step-by-Step of Policy Oscillation

Page 63: Advanced Topics in Routing

Properties of Solution

• If no policy oscillation exists, get usual routes

• If policy oscillation would have existed, approach short-circuits oscillation

• If, after convergence, non-zero global precedence values exist, dispute(s) exist

• Only precedence values advertised, no other routes or policies revealed

63

Page 64: Advanced Topics in Routing

Review

• Major Routing Challenges:– Resilience– Traffic Engineering– Policy Oscillations

• We have solutions for all of them!– FCP, RAD, and Policy Dispute Resolution

• Are they deployed? No…..– Will they be deployed? Maybe…..

• Next lecture: Multicast and QoS– The lost decade…..

64