
1

Computer Networks

Network layer (Part 4)

2

Network layer (so far)

• Network layer functions
• Network layer implementation (IP)
• Today

– Network layer devices (routers)
  • Network processors
  • Input/output port functions
  • Forwarding functions
  • Switching fabric

– Advanced network layer topics
  • Routing problems
  • Routing metric selection
  • Overlay networks

3

NL: Router Architecture Overview

Key router functions:

– Run routing algorithms/protocols (RIP, OSPF, BGP) and construct the routing table
– Switch/forward datagrams from incoming to outgoing link based on the route

4

NL: Routing vs. Forwarding

• Routing: the process by which the forwarding table is built and maintained
  – One or more routing protocols
  – Procedures (algorithms) to convert routing info to a forwarding table

• Forwarding: the process of moving packets from input to output, using
  – The forwarding table
  – Information in the packet

5

NL: What Does a Router Look Like?

• Network processor/controller
  – Handles routing protocols, error conditions

• Line cards
  – Network interface cards

• Forwarding engine
  – Fast path routing (hardware vs. software)

• Backplane
  – Switch or bus interconnect

6

NL: Network Processor

• Runs routing protocol and downloads forwarding table to forwarding engines
  – Use two forwarding tables per engine to allow easy switchover (double buffering)

• Typically performs “slow” path processing
  – ICMP error messages

– IP option processing

– IP fragmentation

– IP multicast packets

7

NL: Fast-path router processing

• Packet arrives at inbound line card
• Header transferred to forwarding engine
• Forwarding engine determines output interface
• Forwarding engine signals result to line card
• Packet copied to outbound line card

8

NL: Input Port Functions

Decentralized switching:
• given datagram destination, look up output port using routing table in input port memory

• goal: complete input port processing at ‘line speed’

• queuing: if datagrams arrive faster than forwarding rate into switch fabric

[Figure: input port pipeline — physical layer (bit-level reception), data link layer (e.g., Ethernet; see Chapter 5), then decentralized lookup/queuing]

9

NL: Input Port Queuing

• Fabric slower than input ports combined => queuing may occur at input queues

• Head-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward

• queueing delay and loss due to input buffer overflow!

10

NL: Input Port Queuing

• Possible solution
  – Virtual output buffering
    • Maintain per-output buffer at input
    • Solves head-of-line blocking problem
    • Each of the M x N input buffers places a bid for its output
  – Crossbar connect
  – Challenge: mapping the bids to a crossbar schedule (a scheduling sketch follows this list)
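A minimal sketch of virtual output queueing, assuming a single greedy bid-granting pass per cycle (real crossbar schedulers such as iSLIP iterate request/grant/accept phases); the class and method names here are mine, not from the slides:

```python
from collections import deque

class VOQSwitch:
    """Virtual output queueing: each input keeps one queue per output,
    so a packet blocked on a busy output cannot stall packets behind
    it that are bound for idle outputs (no head-of-line blocking)."""

    def __init__(self, n_inputs, n_outputs):
        # voq[i][j] holds packets at input i destined for output j
        self.voq = [[deque() for _ in range(n_outputs)]
                    for _ in range(n_inputs)]

    def enqueue(self, i, j, packet):
        self.voq[i][j].append(packet)

    def schedule_cycle(self):
        """One crossbar cycle: each input bids for the outputs it has
        traffic for; this greedy pass grants each output to at most one
        input and lets each input send at most one packet."""
        taken = set()
        transfers = []
        for i, queues in enumerate(self.voq):
            for j, q in enumerate(queues):
                if q and j not in taken:
                    transfers.append((i, j, q.popleft()))
                    taken.add(j)
                    break  # input i is done for this cycle
        return transfers
```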

11

NL: Forwarding Engine

• General-purpose processor + software
• Packet trains help the route cache hit rate
  – Packet train = sequence of packets for same/similar flows
  – Similar to the idea behind IP switching (ATM/MPLS), where long-lived flows map into a single label

• Example
  – Partridge et al., “A 50-Gb/s IP Router”, IEEE/ACM Trans. on Networking, Vol. 6, No. 3, June 1998
  – 8KB L1 Icache
    • Holds full forwarding code
  – 96KB L2 cache
    • Forwarding table cache
  – 16MB L3 cache
    • Full forwarding table x 2 – double buffered for updates

12

NL: Binary trie

Route prefixes:
  A 0*      B 01000*   C 011*
  D 1*      E 100*     F 1100*
  G 1101*   H 1110*    I 1111*

[Figure: binary trie storing the prefixes above; each edge is labeled 0 or 1, and nodes A–I mark where each prefix’s route is stored]
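A minimal sketch of the trie above in Python (node layout and names are mine): insert each prefix bit by bit, and on lookup remember the last stored route passed, which is by construction the longest matching prefix:

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # '0' or '1' -> TrieNode
        self.route = None    # route stored at this prefix, if any

def insert(root, prefix, route):
    node = root
    for bit in prefix:
        node = node.children.setdefault(bit, TrieNode())
    node.route = route

def longest_prefix_match(root, addr_bits):
    node, best = root, root.route   # root would hold a default route
    for bit in addr_bits:
        node = node.children.get(bit)
        if node is None:
            break
        if node.route is not None:
            best = node.route       # remember deepest route seen
    return best

root = TrieNode()
for route, prefix in [("A", "0"), ("B", "01000"), ("C", "011"),
                      ("D", "1"), ("E", "100"), ("F", "1100"),
                      ("G", "1101"), ("H", "1110"), ("I", "1111")]:
    insert(root, prefix, route)

assert longest_prefix_match(root, "0100010") == "B"
assert longest_prefix_match(root, "1010000") == "D"  # falls back to 1*
```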

13

NL: Path-compressed binary trie

• Eliminate single branch point nodes

• Variants include PATRICIA and BSD tries

Route prefixes: as on the previous slide (A 0*, B 01000*, C 011*, D 1*, E 100*, F 1100*, G 1101*, H 1110*, I 1111*)

[Figure: path-compressed binary trie for the same prefixes; single-branch chains are collapsed, and each internal node records which bit to test next (Bit=1 through Bit=4)]

14

NL: Patricia tries and variable prefix match

[Figure: Patricia trie over the prefixes default 0/0, 128.2/16, 128.32/16, 128.32.130/24, and 128.32.150/24; each internal node stores the bit number to test — 0 = left child, 1 = right child]

• Patricia tree
• Arrange route entries into a series of bit tests
• Worst case = 32 bit tests
• Problem: memory speed is a bottleneck
• Used in older BSD Unix routing implementations

15

NL: Multi-bit tries

• Compare multiple bits at a time
  – Reduces memory accesses
  – Forces table expansion for prefixes falling in between strides
  – Variable-length multi-bit tries
  – Fixed-length multi-bit tries

• Most route entries are Class C

• Cut prefix tree at 16-bit depth
  – Many prefixes are 8, 16, or 24 bits in length

– 64K bit mask

– Bit = 1 if tree continues below cut (root head)

– Bit = 1 if leaf at depth 16 or less (genuine head)

– Bit = 0 if part of range covered by leaf

16

NL: Variable stride multi-bit trie

• Single level has variable stride lengths

Route prefixes: as before (A 0*, B 01000*, C 011*, D 1*, E 100*, F 1100*, G 1101*, H 1110*, I 1111*)

[Figure: variable-stride multi-bit trie; the root uses a 1-bit stride and each subtree picks its own stride, with 2-bit tables holding the expanded A/B/C/D/E and F/G/H/I entries]

17

NL: Fixed stride multi-bit trie

• Single level has equal strides

Route prefixes: as before (A 0*, B 01000*, C 011*, D 1*, E 100*, F 1100*, G 1101*, H 1110*, I 1111*)

[Figure: fixed-stride multi-bit trie with a 3-bit first stride (entries 000–111 mapping to A, A, subtree, C, E, D, subtree, subtree) and 2-bit second-stride tables resolving B vs. A, F vs. G, and H vs. I after prefix expansion]
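A minimal sketch of the fixed-stride lookup, with the slide’s prefixes expanded by hand to a 3-bit first stride and a 2-bit second stride (the table literals below are my expansion, not copied from the figure):

```python
STRIDE1, STRIDE2 = 3, 2

# Level 1: indexed by the first 3 address bits. An entry is either a
# route (leaf) or a second-level table indexed by the next 2 bits.
level1 = {
    "000": "A", "001": "A",
    "010": {"00": "B", "01": "A", "10": "A", "11": "A"},  # B = 01000*
    "011": "C", "100": "E", "101": "D",
    "110": {"00": "F", "01": "F", "10": "G", "11": "G"},  # F/G = 1100*/1101*
    "111": {"00": "H", "01": "H", "10": "I", "11": "I"},  # H/I = 1110*/1111*
}

def lookup(addr_bits):
    # At most two memory accesses instead of one per bit.
    entry = level1[addr_bits[:STRIDE1]]
    if isinstance(entry, dict):  # descend one more stride
        entry = entry[addr_bits[STRIDE1:STRIDE1 + STRIDE2]]
    return entry

assert lookup("01000") == "B"
assert lookup("11010") == "G"
```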

18

NL: Other data structures

• Ruiz-Sanchez, Biersack, Dabbous, “Survey and Taxonomy of IP Address Lookup Algorithms”, IEEE Network, Vol. 15, No. 2, March 2001
  – LC trie
  – Lulea trie
  – Full expansion/compression
  – Binary search on prefix lengths
  – Binary range search
  – Multiway range search
  – Multiway range trees
  – Binary search on hash tables (Waldvogel, SIGCOMM 97)

19

NL: Prefix Match issues

• Scaling – IPv6

• Stride choice
  – Tuning stride to route table

– Bit shuffling

20

NL: Speeding up Prefix Match - Alternatives

• Route caches
  – Temporal locality
  – Many packets to same destination

• Protocol acceleration
  – Add clue (5 bits) to IP header
  – Indicates where IP lookup ended on the previous node (Bremler-Barr, SIGCOMM 99)

• Content addressable memory (CAM)
  – Hardware-based route lookup
  – Input = tag, output = value associated with tag
  – Requires exact match with tag
    • Multiple cycles (1 per prefix length searched) with a single CAM
    • Multiple CAMs (1 per prefix length) searched in parallel
  – Ternary CAM (a software sketch follows this list)
    • 0, 1, don’t-care values in tag match
    • Priority (i.e., longest prefix) by order of entries in CAM
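A software sketch of ternary-CAM matching (the entry layout and ordering convention are mine): each entry is a value plus a “care” mask, entries are kept longest-prefix-first, and the first match wins, mimicking the TCAM’s priority encoder:

```python
def tcam_lookup(entries, addr):
    """entries: list of (value, mask, result), longest prefix first.
    addr, value, mask are ints; a 1 bit in mask means 'care'."""
    for value, mask, result in entries:
        if (addr & mask) == (value & mask):
            return result   # first (highest-priority) matching row
    return None

# 8-bit toy table built from some of the slide's prefixes:
entries = [
    (0b11010000, 0b11110000, "G"),  # 1101*
    (0b11000000, 0b11110000, "F"),  # 1100*
    (0b10000000, 0b11100000, "E"),  # 100*
    (0b10000000, 0b10000000, "D"),  # 1*  (shortest, so listed last)
]
assert tcam_lookup(entries, 0b11011010) == "G"
assert tcam_lookup(entries, 0b10100000) == "D"
```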

21

NL: Types of network switching fabrics

• Memory
• Bus
• Crossbar interconnection
• Multistage interconnection

22

NL: Types of network switching fabrics

• Issues
  – Switch contention
    • Packets arrive faster than the switching fabric can switch
    • Speed of switching fabric versus line card speed determines input queuing vs. output queuing

23

NL: Switching Via Memory

First generation routers:
• packet copied by system’s (single) CPU
• 2 bus crossings per datagram
• speed limited by memory bandwidth

Modern routers:
• input port processor performs lookup, copies into memory
• e.g., Cisco Catalyst 8500

[Figure: input port and output port connected to memory over the system bus]

24

NL: Switching Via Bus

• Datagram moves from input port memory to output port memory via a shared bus
• Bus contention: switching speed limited by bus bandwidth
• 1 Gbps bus (Cisco 1900): sufficient speed for access and enterprise routers (not regional or backbone)

25

NL: Switching Via An Interconnection Network

• Overcome bus bandwidth limitations
• Crossbar networks
  – Fully connected (n^2 elements)
  – All one-to-one, invertible permutations supported

26

NL: Switching Via An Interconnection Network

• Crossbar with N^2 elements hard to scale

• Multi-stage interconnection networks (Banyan)
  – Initially developed to connect processors in a multiprocessor
  – Typically O(n log n) elements
  – Datagram fragmented into fixed-length cells
  – Cells switched through the fabric
  – Cisco 12000: Gbps through an interconnection network
  – Blocking possible (not all one-to-one, invertible permutations supported)

[Figure: multi-stage interconnection network with inputs A, B, C, D and outputs W, X, Y, Z]

27

NL: Output Ports

• Output contention
  – Datagrams arrive from the fabric faster than the output port’s transmission rate
  – Buffering required
  – Scheduling discipline chooses among queued datagrams for transmission

28

NL: Output port queueing

• buffering when arrival rate via switch exceeds output line speed

• queueing (delay) and loss due to output port buffer overflow!

29

NL: Advanced topics

• Routing synchronization
• Routing instability
• Routing metrics
• Overlay networks

30

NL: Routing Update Synchronization

• Another interesting robustness issue to consider...
• Even apparently independent processes can eventually synchronize
  – The intuitive assumption that independent streams will not synchronize is not always valid
  – Periodic routing protocol messages from different routers
  – Abrupt transition from unsynchronized to synchronized system states

31

NL: Examples/Sources of Synchronization

• TCP congestion windows
  – Cyclical behavior shared by flows through a gateway
• Periodic transmission by audio/video applications
• Periodic downloads
• Synchronized client restart
  – After a catastrophic failure
• Periodic routing messages
  – Manifests itself as periodic packet loss on pings
• Pendulum clocks on the same wall
• Automobile traffic patterns

32

NL: How Synchronization Occurs

[Figure: two routers with period-T update timers; weak coupling arises when A’s behavior is triggered off of B’s message arrival, and weak coupling can result in eventual synchronization]

33

NL: Routing Source of Synchronization

• Router resets timer after processing its own and incoming updates

• Creates weak coupling among routers

• Solutions (a timer sketch follows this list)
  – Set timer based on a clock event that is not a function of processing other routers’ updates, or
  – Add randomization, or reset timer before processing update
    • With increasing randomization, abrupt transition from predominantly synchronized to predominantly unsynchronized

• Most protocols now incorporate some form of randomization
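A minimal sketch of the randomization fix (the period and jitter constants are illustrative, not from any protocol spec): the next firing time is drawn independently of when other routers’ updates were processed, which breaks the weak coupling:

```python
import random

PERIOD = 30.0   # nominal update interval in seconds (illustrative)
JITTER = 0.25   # +/- 25% randomization (illustrative)

def next_update_delay():
    # Uniform jitter around the nominal period; because the draw does
    # not depend on incoming updates, routers cannot drift into lockstep.
    return PERIOD * random.uniform(1 - JITTER, 1 + JITTER)
```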

34

NL: Routing Instability

• References
  – C. Labovitz, R. Malan, F. Jahanian, “Internet Routing Stability”, SIGCOMM 1997

• Recorded BGP messages at major exchanges
• Discovered orders of magnitude more updates than expected
  – Bulk were duplicate withdrawals
    • Stateless implementation of BGP – did not keep track of information passed to peers
    • Impact of a few implementations
  – Strong frequency (30/60 sec) components
    • Interaction with other local routing/links, etc.

35

NL: Route Flap Storm

• Overloaded routers fail to send Keep_Alive messages and are marked as down
• BGP peers find alternate paths
• Overloaded router re-establishes peering session
• Must send large updates
• Increased load causes more routers to fail!

36

NL: Route Flap Dampening

• Routers now give higher priority to BGP Keep_Alive messages to avoid this problem

• Associate a penalty with each route (sketched below)
  – Increase it when the route flaps
  – Exponentially decay the penalty with time

• When the penalty reaches a threshold, suppress the route
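A minimal sketch of the dampening scheme above; the penalty, suppress, reuse, and half-life constants are illustrative placeholders, not values from any particular router vendor:

```python
FLAP_PENALTY = 1000.0   # added per flap (illustrative)
SUPPRESS_AT  = 2000.0   # suppress route above this penalty
REUSE_AT     = 750.0    # re-advertise once penalty decays below this
HALF_LIFE    = 900.0    # penalty half-life in seconds

class DampenedRoute:
    def __init__(self, now=0.0):
        self.penalty = 0.0
        self.updated = now
        self.suppressed = False

    def _decay(self, now):
        # Exponentially decay the penalty since the last update.
        self.penalty *= 0.5 ** ((now - self.updated) / HALF_LIFE)
        self.updated = now

    def flap(self, now):
        self._decay(now)
        self.penalty += FLAP_PENALTY
        if self.penalty >= SUPPRESS_AT:
            self.suppressed = True    # stop advertising the route

    def usable(self, now):
        self._decay(now)
        if self.suppressed and self.penalty < REUSE_AT:
            self.suppressed = False   # penalty decayed: reuse route
        return not self.suppressed
```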

37

NL: BGP Oscillations

• Can potentially explore every possible path through the network: (n-1)! combinations

• Limit between update messages (MinRouteAdver) reduces exploration
  – Forces router to process all outstanding messages

• Typical Internet failover times
  – New/shorter link: 60 seconds
    • Results in simple replacement at nodes
  – Down link: 180 seconds
    • Results in search of possible options
  – Longer link: 120 seconds
    • Results in replacement or search based on length

38

NL: Routing Metrics

• Choice of link cost defines traffic load
  – Low cost = high probability the link belongs to the SPT and will attract traffic, which increases its cost

• Main problem: convergence
  – Avoid oscillations
  – Achieve good network utilization

39

NL: Metric Choices

• Static metrics (e.g., hop count)
  – Good only if links are homogeneous
  – Definitely not the case in the Internet

• Static metrics do not take into account
  – Link delay
  – Link capacity
  – Link load (hard to measure)

40

NL: Original ARPANET Metric

• Cost proportional to queue size
  – Instantaneous queue length as delay estimator

• Problems
  – Did not take into account link speed
  – Poor indicator of expected delay due to rapid fluctuations
  – Delay may be long even if queue size is small, due to contention for other resources

41

NL: Metric 2 - Delay Shortest Path Tree

• Delay = (depart time - arrival time) + transmission time + link propagation delay
  – (Depart time - arrival time) captures queueing
  – Transmission time captures link capacity
  – Link propagation delay captures the physical length of the link

• Measurements averaged over 10 seconds
  – Update sent if difference > threshold, or every 50 seconds
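The per-packet measurement, written out as a small sketch (function and variable names are mine):

```python
def packet_delay(arrival_time, depart_time, pkt_bits, link_bps, prop_delay):
    queueing     = depart_time - arrival_time  # time waiting in the router
    transmission = pkt_bits / link_bps         # captures link capacity
    return queueing + transmission + prop_delay  # + physical link length

# Example: a 1000-byte packet that waited 40 ms on a 56 kbps link with
# 10 ms propagation delay: 0.040 + 8000/56000 + 0.010 ~= 0.193 seconds.
d = packet_delay(0.0, 0.040, 8000, 56_000, 0.010)
```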

42

NL: Performance of Metric 2

• Works well for light to moderate load
  – Static values dominate

• Oscillates under heavy load
  – Queueing dominates

43

NL: Specific Problems

• Range is too wide
  – A highly loaded 9.6 kbps link can appear 127 times costlier than a lightly loaded 56 kbps link
  – Can make a 127-hop path look better than a 1-hop path

• No limit on change between reports

• All nodes calculate routes simultaneously
  – Triggered by link update

44

NL: Example

[Figure: example topology — networks Net X and Net Y connected through routers A and B]

45

NL: Example

[Figure: same topology — Net X and Net Y via routers A and B — after everyone re-calculates routes, the traffic swings to the other path: oscillations!]

46

NL: Consequences

• Low network utilization (50% in the example)
• Congestion can spread elsewhere
• Routes can oscillate between short and long paths
• Large swings lead to frequent route updates
  – More messages
  – Frequent SPT re-calculation

47

NL: Revised Link Metric

• Better metric: packet delay = f(queueing, transmission, propagation)

• When lightly loaded, transmission and propagation are good predictors

• When heavily loaded, queueing delay dominates, so transmission and propagation are bad predictors

48

NL: Normalized Metric

• If a loaded link looks very bad, then everyone will move off of it

• Want some traffic to stay on it, to load balance and avoid oscillations
  – It is still an OK path for some

• Hop-normalized metric diverts routes that have an alternate that is not too much longer

• Also limits the relative values and range of values advertised, so costs change gradually

49

NL: Revised Metric

• Limits on relative change (a smoothing sketch follows this list)
  – Measured link delay is taken over a 10-second period
  – Link utilization is computed as 0.5 * current sample + 0.5 * last average
  – Max change limited to slightly more than 1/2 hop
  – Min change limited to slightly less than 1/2 hop
  – Bounds oscillations

• Normalized according to link type
  – Satellite should look good when queueing on other links increases
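A minimal sketch of the smoothing and clamping steps; the exact clamp values stand in for “slightly more/less than 1/2 hop” and are my placeholders:

```python
MAX_UP   = 0.6   # max cost increase per update (~ slightly more than 1/2 hop)
MAX_DOWN = 0.4   # max cost decrease per update (~ slightly less than 1/2 hop)

def smoothed_utilization(sample, last_average):
    # Equal weight on the new 10-second sample and the running average.
    return 0.5 * sample + 0.5 * last_average

def bounded_cost(new_cost, last_cost):
    # Clamp the per-update change so the advertised cost moves
    # gradually, which bounds the routing oscillations.
    return max(last_cost - MAX_DOWN, min(last_cost + MAX_UP, new_cost))
```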

50

NL: Routing Metric vs. Link Utilization

[Figure: new metric (in routing units, roughly 0–225) as a function of link utilization (25%–100%) for four link types — 9.6 kbps satellite, 9.6 kbps terrestrial, 56 kbps terrestrial, and 56 kbps satellite]

51

NL: Observations

• Utilization effects
  – High load never increases cost to more than 3x the cost of an idle link
  – Cost = f(link utilization) only at moderate to high loads

• Link types
  – Most expensive link is 7x the least expensive link
  – High-speed satellite link is more attractive than low-speed terrestrial link

• Allows routes to be gradually shed from a link

52

NL: Idealized Network Response Maps

[Figure: idealized network response map — mean load on an “average” link (0.0–1.0) as a function of that link’s cost (0.5–4.0), created empirically; the curves shift with increasing applied network load]

53

NL: Equilibrium Calculation

[Figure: equilibrium calculation — combine the utilization-to-cost and cost-to-utilization maps; equilibrium points lie at the intersections (HN-SPF and D-SPF curves shown for increasing applied network load)]

54

NL: Routing Dynamics

[Figure: metric map vs. network response — utilization (0–1.0) against link reported cost (0.5–4.0); limiting the maximum metric change bounds the oscillation]

55

NL: Routing Dynamics

[Figure: the same metric map and network response curves, illustrating easing in a new link]

56

NL: Overlay Routing

• Basic idea:
  – Treat multiple hops through the IP network as one hop in an overlay network
  – Run routing protocol on overlay nodes

• Why?
  – For performance – can run a more clever protocol on the overlay
  – For efficiency – can make core routers very simple
  – For functionality – can provide new features such as multicast, active processing, IPv6

57

NL: Overlay for Performance

• References
  – Savage et al., “The End-to-End Effects of Internet Path Selection”, SIGCOMM 99
  – Anderson et al., “Resilient Overlay Networks”, SOSP 2001

• Why would IP routing not give good performance?
  – Policy routing – limits selection/advertisement of routes
  – Early-exit/hot-potato routing – local, not global, incentives
  – Lack of performance-based metrics – AS hop count is the wide-area metric

• How bad is it really?
  – Look at the performance gain an overlay provides

58

NL: Quantifying Performance Loss

• Measure round-trip time (RTT) and loss rate between pairs of hosts
  – ICMP rate limiting

• Alternate path characteristics
  – 30-55% of hosts had lower latency
  – 10% of alternate routes have 50% lower latency
  – 75-85% have lower loss rates

59

NL: Bandwidth Estimation

• RTT & loss for multi-hop path– RTT by addition

– Loss either worst or combine of hops – why?• Large number of flows combination of probabilities

• Small number of flows worst hop

• Bandwidth calculation– TCP bandwidth is based primarily on loss and RTT

• 70-80% paths have better bandwidth• 10-20% of paths have 3x improvement
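A minimal sketch of the composition rules above, using the standard Mathis et al. approximation (rate ~ MSS/RTT * sqrt(3/2)/sqrt(p)) as a stand-in for the “bandwidth from loss and RTT” step; the slides do not name a specific formula:

```python
import math

def path_rtt(hop_rtts):
    return sum(hop_rtts)            # RTT composes by addition

def path_loss_many_flows(hop_loss):
    # Many flows: treat per-hop drops as independent and combine them.
    delivered = 1.0
    for p in hop_loss:
        delivered *= 1.0 - p
    return 1.0 - delivered

def path_loss_few_flows(hop_loss):
    return max(hop_loss)            # few flows: the worst hop dominates

def tcp_bandwidth(mss_bytes, rtt_s, loss):
    # Mathis et al. steady-state TCP throughput approximation.
    return (mss_bytes / rtt_s) * math.sqrt(1.5) / math.sqrt(loss)
```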

60

NL: Overlay for Efficiency

• Multi-path routing
  – More efficient use of links, or QOS
  – Need to be able to direct packets based on more than just the destination address, which can be computationally expensive
  – What granularity? Per source? Per connection? Per packet?
    • Per packet: re-ordering
    • Per source, per flow: coarse grain vs. fine grain
  – Take advantage of the relative duration of flows
    • Most bytes are on long flows

61

NL: Overlay for Features

• How do we add new features to the network?
  – Does every router need to support the new feature?
  – Choices
    • Reprogram all routers: active networks
    • Support the new feature within an overlay
  – Basic technique: tunnel packets

• Tunnels (an encapsulation sketch follows this list)
  – IP-in-IP encapsulation
  – Poor interaction with firewalls, multi-path routers, etc.
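A minimal sketch of IP-in-IP encapsulation (IP protocol number 4): the tunnel entry point prepends a fresh outer IPv4 header addressed to the far endpoint, which strips it and forwards the inner datagram. Header construction here is schematic (checksum left zero), not a full RFC 2003 encoder:

```python
import struct, socket

def encapsulate(inner_datagram: bytes, src_ip: str, dst_ip: str) -> bytes:
    total_len = 20 + len(inner_datagram)
    outer = struct.pack(
        "!BBHHHBBH4s4s",
        (4 << 4) | 5,          # version 4, IHL 5 (20-byte header)
        0,                     # TOS
        total_len,             # outer total length
        0, 0,                  # identification, flags/fragment offset
        64,                    # TTL
        4,                     # protocol 4 = IP-in-IP
        0,                     # header checksum (omitted in this sketch)
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
    )
    return outer + inner_datagram
```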

62

NL: Examples

• IP V6 & IP Multicast– Tunnels between routers supporting feature

• Mobile IP– Home agent tunnels packets to mobile host’s location

– http://www.rfc-editor.org/rfc/rfc2002.txt

• QOS– Needs some support from intermediate routers

63

NL: Overlay Challenges

• How do you build an efficient overlay?
  – Probably don’t want all N^2 links – which links to create?
  – Without direct knowledge of the underlying topology, how do you know what’s nearby and what is efficient?

64

NL: Future of Overlay

• Application-specific overlays
  – Why should overlay nodes only do routing?

• Caching
  – Intercept requests and create responses

• Transcoding
  – Changing content of packets to match available bandwidth

• Peer-to-peer applications

65

NL: Network layer summary

• Network layer functions
• Specific network layers (IPv4, IPv6)
• Specific network layer devices (routers)
• Advanced network layer topics

66

NL: Network trace

• http://www.cse.ogi.edu/class/cse524/trace.txt

67

NL: End of material for midterm

• Midterm next Monday 10/29/2001, covering...
  – Technical material in lectures
  – Chapters 1, 4, and 5
    • Chapter 1
    • Chapter 4.1-4.7
    • Chapter 5
  – Review questions at end of chapters