What about the Network?

What about the Network? CS 525 Spring 2009 Advanced Distributed Systems


Transcript of What about the Network?

Page 1: What about the Network?

What about the Network?

CS 525 Spring 2009

Advanced Distributed Systems

Page 2: What about the Network?

End-to-End Arguments in System Design

J.H. Saltzer, D.P. Reed and D.D. Clark
M.I.T. Laboratory for Computer Science

Presented by: Abdullah Al-Nayeem

Page 3: What about the Network?

Where to Place Functionalities?

• Example: Reliable file transfer
• Should reliability be implemented per-hop by the communication subsystem?
• Or end-to-end by host applications?

4/14/2009 Department of Computer Science, UIUC

Page 4: What about the Network?

Where to Place Functionalities?
• Possible failures in file transfer:
  – Disk access failure (hardware)
  – Packet drop or duplicated packet (communication)
  – File system error (software)
• The communication subsystem cannot guarantee reliability by itself.
  – Implementing it there also increases network complexity
  – It adds overhead for applications that do not require reliability
• The application layer can provide full reliability, even without any support from the lower layers of the network.
  – End-to-end checksum and retry
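The "end-to-end checksum and retry" idea can be illustrated with a minimal Python sketch. Everything here is hypothetical: `unreliable_send` stands in for a path that may corrupt data, and the sketch assumes the sender's digest reaches the receiver intact, which a real protocol would also have to ensure.

```python
import hashlib
import random

def unreliable_send(data: bytes, corrupt_prob: float, rng: random.Random) -> bytes:
    """Hypothetical lossy path: occasionally flips one bit of the payload."""
    if data and rng.random() < corrupt_prob:
        i = rng.randrange(len(data))
        corrupted = bytearray(data)
        corrupted[i] ^= 0x01
        return bytes(corrupted)
    return data

def transfer_with_e2e_check(data: bytes, max_retries: int = 10,
                            corrupt_prob: float = 0.3, seed: int = 0):
    """Send data, verify an end-to-end SHA-256 digest, and retry on mismatch.

    Returns (received_bytes, attempts_used); raises IOError if every attempt fails.
    """
    rng = random.Random(seed)
    expected = hashlib.sha256(data).hexdigest()  # computed by the sending application
    for attempt in range(1, max_retries + 1):
        received = unreliable_send(data, corrupt_prob, rng)
        # The receiving application recomputes the digest over what it actually got.
        if hashlib.sha256(received).hexdigest() == expected:
            return received, attempt
    raise IOError("transfer failed after %d attempts" % max_retries)
```

The check sits entirely in the end hosts, so it catches corruption wherever it occurs along the path, including places a per-hop mechanism never sees.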


Page 5: What about the Network?

End-to-End Argument (E2EA)
• The lower layers of the network are not the right place to implement application-specific functions
  – Move functions “up and out”

• “The function in question can completely and correctly be implemented only with the knowledge and help of the application standing at the end points of the communication system. Therefore, providing that questioned function as a feature of the communication system itself is not possible.”


Page 6: What about the Network?

Typical Examples

• Bit error recovery
• Security using encryption
• Duplicate message suppression
• Recovery from system crashes
• Delivery acknowledgement


Page 7: What about the Network?

Benefits of E2EA

• Core network can be simpler and faster
• Fewer assumptions are required about the network
• More flexibility in developing new network technologies and applications
  – Helped the proliferation of the Internet

• Dumb networks, intelligent hosts


Page 8: What about the Network?

Extension of E2EA
• Lower layers may implement partial application-specific functions, but only for performance improvements.
  – Reducing retries in data transmissions
• Should the level of reliability at the network be higher than the expected application reliability?
• What are the possible tradeoffs?
  – Short-term performance vs. long-term flexibility
  – Performance vs. cost


Page 9: What about the Network?

Identifying the Ends

• VoIP: the human user is the end-point
• File transfer: the application is the end-point
• Only the end-points know how to guarantee the required reliability

[Figure labels: Voice over IP (voice); File Transfer (files)]

Page 10: What about the Network?

Moving Away from E2EA
• Hosts are not always trustworthy
  – Security attacks, e.g. denial of service
• E2EA does not guarantee congestion control
  – Unfriendly host
• Communications are not always between two end-points
  – Multicast, broadcast
• How does the network handle these circumstances?


Page 11: What about the Network?

Other Issues

• ISP control, filtering, network monitoring
• Government interventions
• More subtle end points
  – Anonymous users using third-party services
  – Cloud computing entities (SaaS user, SaaS provider, Cloud provider)
• Do these factors imply the end of E2EA?


Page 12: What about the Network?

Summary

• End-to-End argument is not an absolute, but a design tool

• End-to-End argument can help in organizing “layered” communication systems.


Page 13: What about the Network?

Consensus Routing: The Internet as a Distributed System

John P. John¹, Ethan Katz-Bassett¹, Arvind Krishnamurthy¹, Thomas Anderson¹, Arun Venkataramani²

¹Dept. of Computer Science, Univ. of Washington, Seattle
²University of Massachusetts Amherst

5th USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2008

Presented by: Ahmed Khurshid

Page 14: What about the Network?

Motivation

• Internet routing protocols (both intra- and inter-domain) usually favor responsiveness over consistency
  – A new route is incorporated into the forwarding table before it is propagated to neighbors
• This results in routing loops and blackholes
• Usually there is no extra effort to ensure consensus
  – Solutions have been proposed for intra-domain routing


Page 15: What about the Network?

Motivation – Routing loop

[Figure: two BGP routing-loop scenarios]
• Link failure causing BGP loops at 2 and 3
• Policy change causing BGP loops at 2 and 3 when 4 withdraws a prefix from 2 and 3 but not 6
• Figure annotations: “2 prefers the path through 3”; “2 and 3 each prefer the other over 6”; Minimum Route Advertisement Interval (MRAI) timer; AS paths to destination 5 such as 1-5, 4-5, 3-4-5 and 2-4-5

Page 16: What about the Network?

Motivation – Blackhole

[Figure: iBGP link recovery causing blackholes. Annotations: “AP is preferred over CD”; link marked “Recovered”.]

Page 17: What about the Network?

Consensus Routing

• A consistency-first approach that cleanly separates safety and liveness of routing
  – Safety: all routers use a consistent route towards a destination (i.e. no loops)
  – Liveness: quick reaction to failures and policy changes
• Uses two simple ideas to ensure both consistent behavior and quick reaction
  1. Runs a distributed coordination algorithm to ensure a globally consistent view of routing state
  2. Forwards packets using one of two logically distinct modes (stable or transient)


Page 18: What about the Network?

Stable Mode
• Unlike BGP, consensus routing does not immediately incorporate a newly learned route into the forwarding table
• Periodically, all routers engage in a distributed coordination algorithm that determines the most recent set of complete updates
• The coordination is based on classical distributed snapshot and consensus algorithms
  – Chandy-Lamport snapshot algorithm
  – Paxos
• The output of the coordination is used to compute a set of stable forwarding tables (SFTs) that are guaranteed to be consistent
  – SFTs replace traditional FIBs (Forwarding Information Bases)


Page 19: What about the Network?

Stable Mode – Update Log

[Figure: AS topology with nodes A to K arranged in Tier-1, Tier-2 and Tier-3 (stub) levels, with users attached to the stubs; legend: route advertisement/withdrawal]
• Store updates into the update log without modifying the SFT

Page 20: What about the Network?

Stable Mode – Distributed Snapshot

[Figure: AS topology with nodes A to K arranged in Tier-1, Tier-2 and Tier-3 (stub) levels, with users attached to the stubs; legend: marker message]
• Updates in the snapshot may be complete or incomplete

Page 21: What about the Network?

Stable Mode – Aggregation

[Figure: AS topology with nodes A to K arranged in Tier-1, Tier-2 and Tier-3 (stub) levels, with users attached to the stubs; snapshots are sent to the consolidators]
• Tier-1 ASes are good candidates for being consolidators. Why?
  – Better reachability
  – Longevity
  – Full mesh topology among the ASes

Page 22: What about the Network?

Stable Mode – Consensus

[Figure: AS topology with nodes A to K arranged in Tier-1, Tier-2 and Tier-3 (stub) levels, with users attached to the stubs; legend: Paxos message]
• Consolidators run Paxos to agree upon a global view by extracting incomplete updates from the reported snapshots

Page 23: What about the Network?

Stable Mode – Flood

[Figure: AS topology with nodes A to K arranged in Tier-1, Tier-2 and Tier-3 (stub) levels, with users attached to the stubs; legend: flooding message]
• The flooded message contains the set of incomplete updates (I) and the set of ASes (S) that successfully responded to the snapshot

Page 24: What about the Network?

Stable Mode

• SFT Computation
  – The SFT is computed using the global set of incomplete updates (I) and local logs
  – Routes involving ASes not present in S are not placed in the SFT
• What happens to those ASes?
• How does this strategy achieve consensus in an asynchronous system?
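The two SFT computation rules on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Update` record, the shape of the local log (chronological, oldest first) and the "newest usable update wins" policy are hypothetical, and route withdrawals are not modeled.

```python
from typing import NamedTuple

class Update(NamedTuple):
    trigger: tuple   # (AS number, trigger number)
    prefix: str
    as_path: tuple   # ASes the route traverses
    next_hop: str

def compute_sft(local_log, incomplete, responded):
    """Build prefix -> next_hop from the newest usable update per prefix.

    incomplete is the global set I; responded is the set S of ASes
    that answered the snapshot.
    """
    sft = {}
    for upd in local_log:
        if upd.trigger in incomplete:
            continue  # update has not finished propagating; not stable yet
        if not set(upd.as_path) <= responded:
            continue  # route involves an AS that is not in S
        sft[upd.prefix] = upd.next_hop  # newer entries overwrite older ones
    return sft
```

Because every router filters with the same I and S, all of them discard the same in-flight updates, which is what makes the resulting tables mutually consistent.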

Page 25: What about the Network?

Router State

• Routing Information Base (RIB)
  – Stores, for each prefix, the most recent
    • Route update received from each neighbor
    • Locally selected best route
    • Route advertised to each neighbor
• History
  – Stores, for each prefix, a chronological list of received and selected routes in the RIB
• Stable Forwarding Table (SFT)
  – Stores next-hop interfaces corresponding to stable routes


Page 26: What about the Network?

Triggers

• Each update carries a trigger
• A trigger is a globally unique identifier for a set of causally related events propagating through the network
  – It is a two-tuple: (AS number, trigger number)
• Triggers ease the tracking of updates and reduce control overhead in consensus routing
• A router ‘A’ stores all the received triggers in its local History
• Triggers under processing are temporarily stored in a local set I_A
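A minimal Python sketch of this bookkeeping, for illustration only. The class and method names are hypothetical, and reading the AS number as the AS that started the chain of events is an assumption rather than something the slide states.

```python
from typing import NamedTuple

class Trigger(NamedTuple):
    as_number: int       # assumed: the AS where the causal chain started
    trigger_number: int  # assumed: a counter local to that AS

class TriggerState:
    """Per-router bookkeeping: History of seen triggers plus the in-progress set."""
    def __init__(self):
        self.history = []          # triggers in the order they were received
        self.in_progress = set()   # plays the role of the local set I_A

    def receive(self, trigger):
        self.history.append(trigger)
        self.in_progress.add(trigger)

    def finish(self, trigger):
        self.in_progress.discard(trigger)
```

Tracking one small tuple per causal chain, rather than every individual update, is what keeps the control overhead low.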


Page 27: What about the Network?

Distributed Coordination

• During a snapshot, router ‘A’ saves the sequence of triggers in its local History as H_A
• It prepares a set of incomplete triggers (I_A) that contains
  – All the triggers present in I_A
  – Triggers waiting in the outgoing queues
  – Logged triggers received over incoming channels (after the start of the current snapshot round)
• H_A and I_A are sent to the consolidators
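The three sources of incomplete triggers listed above can be combined as in the following Python sketch. The function name, argument names and data shapes (per-neighbor outgoing queues, a list of logged incoming triggers) are assumptions made for illustration.

```python
def snapshot_report(history, in_progress, outgoing_queues, logged_incoming):
    """Return (H_A, I_A) for one snapshot round at router A."""
    h_a = list(history)            # frozen copy of the trigger sequence
    i_a = set(in_progress)         # triggers already in I_A
    for queue in outgoing_queues.values():
        i_a.update(queue)          # triggers not yet sent to a neighbor
    i_a.update(logged_incoming)    # triggers logged after the round started
    return h_a, i_a
```

The returned pair is what the router would report to the consolidators.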


Page 28: What about the Network?

View Change

[Figure: source X sends a packet to destination Y through routers A, B, C, D and E. Annotations: “Use (k+1)th SFT”; “Hasn’t finished computing (k+1)th SFT yet”; “Use kth SFT”; “Send packet to Y”.]

Routes for prefix Y at each router:

Router        A         B       C    D    E
kth SFT       B->C->D   C->D    D    Y
(k+1)th SFT   B->C->E   C->E    E    Y    Y

Page 29: What about the Network?

Transient Mode

• Consensus routing switches to this mode when
  – The next-hop router along a stable route is unreachable
  – A stable route is not available
• Uses several known schemes
  – Routing deflection
  – Detour routing
  – Backup routes


Page 30: What about the Network?

Route Deflection

• After encountering a failed link, deflect the packet to a neighboring AS after consulting the RIB
• If no neighbor can be chosen, then deflect the packet back to the sending AS (backtracking)
  – However, backtracking alone is not sufficient to guarantee reachability (see figure)

[Figure: Limitations of backtracking. Node labels show routes to destination D, e.g. 5-D and 1-5-D, 2-5-D, 3-5-D.]
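The deflect-or-backtrack decision can be sketched in Python as below. The selection rule is an assumption made for illustration (take any neighbor whose advertised path avoids the unreachable AS); the slide only says the RIB is consulted, and all names are hypothetical.

```python
def choose_deflection(rib_routes, failed_neighbor, previous_hop):
    """Pick where to send a packet whose stable next hop is unreachable.

    rib_routes maps neighbor -> AS path that neighbor advertised for the prefix.
    Returns a neighbor to deflect to, or previous_hop to backtrack.
    """
    for neighbor, as_path in sorted(rib_routes.items()):
        if neighbor in (failed_neighbor, previous_hop):
            continue
        if failed_neighbor in as_path:
            continue  # this alternative would still cross the unreachable AS
        return neighbor
    return previous_hop  # no usable neighbor: backtrack to the sending AS
```

As the slide notes, falling back to `previous_hop` is not enough on its own to guarantee that the packet eventually reaches its destination.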

Page 31: What about the Network?

Other Transient Schemes

• Detour Routing
  – After encountering a failed link, select a neighboring AS (arbitrarily) and tunnel transient packets to it
  – Tier-1 ASes are good choices in this selection
• Backup Routes
  – Use pre-computed backup routes to forward packets during failure


Page 32: What about the Network?

Evaluation

• Simulation Methodology
  – CAIDA AS-level graphs gathered from RouteViews BGP tables
    • Includes 23,390 ASes and 46,095 links annotated with inferred business relationships of the linked ASes
• Using a XORP prototype to measure implementation overhead
• Using PlanetLab nodes to measure the cost of consensus


Page 33: What about the Network?

Link Failure
• One of the links of a multi-homed stub AS is failed during each experiment
• Result: consensus routing provides significantly higher levels of connectivity than BGP

Page 34: What about the Network?

Effect of Traffic Engineering
• Withdraw a subprefix from all but one of the providers (3 or more) of a multi-homed AS
• Result: consensus routing does not affect routing in case of policy changes

Page 35: What about the Network?

Overhead

• In terms of bandwidth and time, consensus routing incurs little overhead

[Figures: control traffic required by consensus routing; delay incurred by consensus routing]

Page 36: What about the Network?

Discussion Points

• Selection of consolidators
  – Will Tier-1 ASes (or other ASes) agree to perform this additional duty?
• Slow ASes may face periods of disconnectivity
  – How to handle this situation?
• What can we say about the completeness and accuracy of this strategy?
• Will ASes readily cooperate to handle transient packets?


Page 37: What about the Network?

CAIDA Tools

Presented by: Abdullah Al-Nayeem

Page 38: What about the Network?

CAIDA

• The Cooperative Association for Internet Data Analysis (CAIDA)
  – San Diego Supercomputer Center (SDSC), UCSD
• CAIDA provides data, tools and analyses on Internet traffic for better understanding of
  – current and future network topology, routing, security, performance and economic issues


Page 39: What about the Network?

CAIDA Tools

• Measurement
  – Tools for active or passive measurement of Internet traffic and flow patterns
• Utilities
  – Utilities to aid analysis of Internet traffic and flow patterns
• Visualization
  – Tools to visualize Internet data


Page 40: What about the Network?

Internet Measurement Infrastructure

• Archipelago (Ark): CAIDA’s next-generation active measurement infrastructure
  – An evolution of the skitter infrastructure
• 33 active monitors in different countries


Page 41: What about the Network?

Scamper
• Measurement tool used at Ark monitors
• Teams of Scamper probers probe all routed /24's in a short period of time:
  – A random address in each /24 prefix is probed approximately every 48 hours (one probing cycle)
  – Supports ICMP-Paris, TCP and UDP traceroute
• Features:
  – Measures forward IP paths
  – Measures round-trip time
  – Discovers maximum transmission unit (MTU) length
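The "random address in each /24" selection can be illustrated with Python's standard `ipaddress` module. This sketch is not Scamper's code and sends no probes; the function name and the example prefixes (documentation ranges) are hypothetical.

```python
import ipaddress
import random

def pick_probe_targets(prefixes, rng=None):
    """Return one randomly chosen address inside each /24 prefix."""
    rng = rng or random.Random()
    targets = {}
    for prefix in prefixes:
        net = ipaddress.ip_network(prefix)
        if net.prefixlen != 24:
            raise ValueError(f"{prefix} is not a /24")
        offset = rng.randrange(net.num_addresses)  # 0..255 for an IPv4 /24
        targets[prefix] = net.network_address + offset
    return targets

# Example: one target per prefix for a single probing cycle.
cycle_targets = pick_probe_targets(["192.0.2.0/24", "198.51.100.0/24"])
```

Drawing a fresh random address in each cycle spreads probes across a prefix over time instead of always hitting the same host.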


Page 42: What about the Network?

Scamper Datasets

• IPv4 Routed /24 Topology Dataset
  – Useful for understanding the topology of the Internet
• IPv4 Routed /24 AS Links Dataset
  – Contains Autonomous System (AS) links derived from the IP paths of the Topology Dataset
  – RouteViews BGP data is used to determine the AS for each IP address


Page 43: What about the Network?

Visualization of IPv4 Internet Topology

• 1-17 Jan, 2008
• 4,853,991 IPv4 addresses
• 5,682,419 IP links
• 17,791 ASes
• The outdegree of an AS is the number of next-hop ASes that were observed accepting traffic from this AS
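The outdegree definition above amounts to counting distinct next-hop ASes per AS. A small Python sketch, assuming the AS links are available as hypothetical `(from_as, to_as)` pairs (the real dataset's file format is not shown in the slides):

```python
from collections import defaultdict

def as_outdegrees(as_links):
    """Count distinct next-hop ASes observed for each AS.

    as_links is an iterable of (from_as, to_as) pairs; duplicates and
    self-links are ignored.
    """
    next_hops = defaultdict(set)
    for src, dst in as_links:
        if src != dst:
            next_hops[src].add(dst)
    return {asn: len(hops) for asn, hops in next_hops.items()}
```

ASes that never appear as the source of a link are simply absent from the result.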


Page 44: What about the Network?

RRDTool
• Round Robin Database tool
  – A system to store and display time-series data
  – Network bandwidth, machine-room temperature, server load average, etc.
• Features:
  – Archives of fixed size for unlimited data
  – Overwrites the oldest entries when full
• Limitations:
  – Can’t add data for past events
  – Can’t add data twice at the same timestamp
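The round-robin idea (fixed size, oldest entries overwritten, strictly increasing timestamps) can be mimicked in a few lines of Python. This is a toy model for illustration, not RRDTool's implementation or API; the class and method names are hypothetical.

```python
from collections import deque

class RoundRobinArchive:
    """Fixed-size time-series store that overwrites its oldest entry when full."""
    def __init__(self, slots):
        self._rows = deque(maxlen=slots)  # deque drops the oldest item automatically
        self._last_ts = None

    def update(self, timestamp, value):
        # Mirror the listed limitations: no past events, no repeated timestamps.
        if self._last_ts is not None and timestamp <= self._last_ts:
            raise ValueError("timestamp must be newer than the last update")
        self._rows.append((timestamp, value))
        self._last_ts = timestamp

    def rows(self):
        return list(self._rows)
```

The storage cost is fixed when the archive is created, no matter how long data keeps arriving.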


Page 45: What about the Network?

RRDTool (2)

• Example: Statistics for network interfaces


Page 46: What about the Network?

Beluga

• Provides a real-time graph of RTTs and packet loss to an end host

[Figure: Stanford to m-root-server (Tokyo)]


Page 47: What about the Network?

Walrus

• Directed-graph visualization tool in 3D space
• A meaningful spanning tree is required for better visualization


Page 48: What about the Network?

Thanks

Questions and Comments?