
The SAHARA Project: Composition and Cooperation in the New Internet

Randy H. Katz, Anthony Joseph, Ion Stoica
Computer Science Division
Electrical Engineering and Computer Science Department
University of California, Berkeley
Berkeley, CA 94720-1776

Presentation Outline

• Service Architecture Opportunity
• SAHARA Project and Architecture
• Routing as Service Composition
• Summary and Conclusions


The New Opportunity

• New things you can do inside the network
• Connecting end-points to “services” with processing embedded in the network fabric
• Not protocols but “agents,” executing in places in the network
• Location-aware, data format aware
• Controlled violation of layering necessary!
• Distributed architecture aware of network topology
• No single technical architecture likely to dominate: think overlays, system of systems

Services in Converged Networks


Presentation Outline

• Service Architecture Opportunity
• SAHARA Project and Architecture
• Routing as Service Composition
• Summary and Conclusions

The SAHARA Project

• Service
• Architecture for
• Heterogeneous
• Access,
• Resources, and
• Applications

Composition Scenario: Universal In-box

– Message type (phone, email, fax)
– Access network (data, telephone, pager)
– Terminal device (computer, phone, pager, fax)
– User preferences & rules
– Message translation & storage

Separate end device and network from end-to-end communications service: indirection via composition of translators with access
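
One way to picture "composition of translators" is as a path search: each translator converts one message format into another, and a delivery path is a chain of translators from the incoming message type to a format the terminal device accepts. The sketch below is only an illustration under that assumption. The translator names and formats in `TRANSLATORS` are hypothetical, and breadth-first search is a stand-in for whatever composition logic SAHARA actually uses.

```python
from collections import deque

# Hypothetical translator registry: name -> (input format, output format).
TRANSLATORS = {
    "fax_to_text": ("fax", "text"),
    "email_to_text": ("email", "text"),
    "text_to_audio": ("text", "audio"),
    "audio_to_text": ("audio", "text"),
}

def compose(translators, source_format, target_format):
    """Return the shortest chain of translator names that converts
    source_format into target_format, or None if no chain exists."""
    # Index translators by the format they accept.
    graph = {}
    for name, (src, dst) in translators.items():
        graph.setdefault(src, []).append((name, dst))

    queue = deque([(source_format, [])])
    seen = {source_format}
    while queue:
        fmt, chain = queue.popleft()
        if fmt == target_format:
            return chain
        for name, next_fmt in graph.get(fmt, []):
            if next_fmt not in seen:
                seen.add(next_fmt)
                queue.append((next_fmt, chain + [name]))
    return None

# An email delivered to a phone needs two translators in sequence.
print(compose(TRANSLATORS, "email", "audio"))
```

User preferences and rules would enter such a search as constraints on which translators or target formats are admissible; the sketch leaves them out.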

SAHARA Focus

• New mechanisms, techniques for end-to-end services w/ desirable, predictable, enforceable properties spanning potentially distrusting service providers
  – Tech architecture for service composition & inter-operation across separate admin domains, supporting peering & brokering, and diverse business, value-exchange, access-control models
  – Functional elements
    • Service discovery
    • Service-level agreements
    • Service composition under constraints
    • Redirection to a service instance
    • Performance measurement infrastructure
    • Constraints based on performance, access control, accounting/billing/settlements
    • Service modeling and verification

“The Network Effect”

• Creation and deployment of new services
  – Achieving desirable end-to-end properties, e.g., by controlling the end-to-end path
  – Deploying computation and storage INSIDE the network
• BUT new networks are expensive; evolving existing networks virtually impossible
  – E.g., Cost of 3G licenses and networks
  – “Even if I had $1 billion and set up 1000s of locations, I could never in my network have a completely ubiquitous footprint.” (Sky Dayton, founder of Boingo)
  – QoS: IntServ, DiffServ; New Function: Multicast, …
• Approaches:
  – Composition, Overlays, Peering
  – Cooperation, Brokering

Internet Connectivity and Processing

[Figure: access networks and core networks. Labels: Transit Net (×3), Private Peering, Public Peering, NAP, Internet Datacenter, PSTN, Regional Wireline, Voice, Cell, Cable Modem, DSLAM, Analog, RAS, H.323, Data, LAN, WLAN, Premises-based, Operator-based]

Interconnected World: Agile or Fragile?

• Baltimore Tunnel Fire, 18 July 2001
  – “… The fire also damaged fiber optic cables, slowing Internet service across the country, …”
  – “… Keynote Systems … says the July 19 Internet slowdown was not caused by the spreading of Code Red. Rather, a train wreck in a Baltimore tunnel that knocked out a major UUNet cable caused it.”
  – “PSINet, Verizon, WorldCom and AboveNet were some of the bigger communications companies reporting service problems related to ‘peering,’ methods used by Internet service providers to hand traffic off to others in the Web's infrastructure. Traffic slowdowns were also seen in Seattle, Los Angeles and Atlanta, possibly resulting from re-routing around the affected backbones.”
  – “The fire severed two OC-192 links between Vienna, VA and New York, NY as well as an OC-48 link from D.C. to Chicago. … Metromedia routed traffic around the fiber break, relying heavily on switching centers in Chicago, Dallas, and D.C.”

Internet Routing Realities

• Provider-customer vs. peer-to-peer
• Relationships established by BGP protocol
• Charging based on traffic volumes

[Figure: ISP A and ISP B connected at two peering points, illustrating hot potato routing]

Mobile Virtual Network Operator: Composition and Cooperation

[Figure labels: one2one, InterCall, 1-to-1 Relationship, M-to-N Relationships, Competition, Peering]

Policy-Based Routing

• Multi-homing
  – Reliability of network connectivity
  – Traffic discrimination

[Figure labels: End Network, Primary Transit Network, Alternative Transit Network, Peer Networks, Berkeley Campus, CalREN, Research Traffic, Dorm Traffic, Fail-over, New Primary Transit]

Overlays: Creating New Interdomain Services

• Deploy new services above the routing layer
  – E.g., interdomain multicast management and peering
  – E.g., alternative connectivity for performance, resilience

[Figure labels: Administrative domain (×5), Isolated intra-cloud service, Traditional unicast peering]

Steve McCanne

Wireless ISP Composition

[Figure labels: Cooperative Networking, Single Location Network Operator (SLN), Full Service Network Operator, Premises-based Access, SLN Aggregator, WISP Aggregator, Revenue Sharing, Single Sign-on, Unified Billing, Billing, ECommerce, Authentication, Inter-site Mobility, Private Brand Net Operator (MVNO), VPN Operator, Client-Software]

Layered Reference Model for Service Composition

• Connectivity Plane
  – End-to-end network with desirable properties composed on top of commodity IP network
  – Enhanced Links & Paths: QoS and protocol verification within and between connectivity service providers
• Applications Plane
  – Services strategically placed and actively managed within the network topology
  – Applications and Middleware Services: end-client oriented vs. infrastructure oriented

Layered Reference Model for Service Composition

[Figure: layer stack. Connectivity Plane: IP Network, Enhanced Links, Enhanced Paths, End-to-End Network With Desirable Properties. Application Plane: Middleware Services, Applications Services, End-User Applications. Side label: Service Composition]

Presentation Outline

• Service Architecture Opportunity
• SAHARA Project and Architecture
• Routing as Service Composition
• Summary and Conclusions

Routing as a Composed Service

• Routing as a Reachability “Service”
  – Implementing paths between composed service instances, e.g., “links” within an overlay network
  – Multi-provider environment, no centralized control
• Desirable Properties
  – Trust: verify believability of routing advertisements
  – Agility: converge quickly in response to global routing changes to retain good reachability “performance” (e.g., latency)?
  – Reliability: detect service composition path failures quickly to enable fast recomposition to maintain reachability
  – Scalability and Interoperability: Adapt protocols via processing at “impedance” matching points between administrative domains

Characterizing the Internet Hierarchy from Multiple Vantage Points

• Customer-Provider Relationships
  – Customer pays provider for Internet access
  – AS exports customer’s routes to all neighbors
  – AS exports provider’s routes only to its customers
• Peer-to-Peer Relationships
  – Peers exchange traffic between their customers
  – Free of charge (assumption of even traffic load)
  – AS exports a peer’s routes only to its customers

Sharad Agarwal, Lakshmi Subramanian, Jennifer Rexford
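
The three export rules above fit in a single predicate. The sketch below is a minimal illustration, not any router's actual policy code. The function name and the string labels are assumptions, and each relationship is given from the exporting AS's point of view (for example, `"customer"` means "this neighbor is my customer").

```python
CUSTOMER, PEER, PROVIDER = "customer", "peer", "provider"

def should_export(learned_from, neighbor):
    """Decide whether an AS re-advertises a route to a neighbor.

    learned_from: relationship of the neighbor the route was learned from.
    neighbor:     relationship of the neighbor the route would be sent to.
    """
    if learned_from == CUSTOMER:
        # Customer routes are exported to all neighbors.
        return True
    # Provider and peer routes are exported only to customers.
    return neighbor == CUSTOMER

# A route learned from a peer is passed to customers but not to providers.
print(should_export(PEER, CUSTOMER), should_export(PEER, PROVIDER))
```

The asymmetry follows the money: an AS carries traffic for the customers that pay it, and gives no free transit between its peers and providers.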

Knowing These Relationships Matters!

• Useful for:
  – Placement of servers for content distribution
  – Selection of new peers or providers for an AS
  – Analyzing convergence properties of BGP
  – Installing route filters to protect against misconfiguration
  – Understanding basic structure of the Internet
• Knowing the AS graph is Not Enough
  – Interdomain routing is not shortest-path routing
  – Some paths not allowed (e.g., transit through a peer)
  – Local preference of paths (e.g., prefer customer path)
  – Node degree does not define the Internet hierarchy
• Need to Know Relationship between AS Pairs
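
To see why the AS graph alone is not enough, consider checking whether an AS path is allowed once each edge is labeled with its relationship. Under the export rules of the previous slide, an allowed path climbs zero or more customer-to-provider edges, crosses at most one peer edge, and then descends provider-to-customer edges (often called the valley-free property). The sketch below assumes that reading. The edge labels `"c2p"`, `"p2p"`, `"p2c"` and the AS names are hypothetical.

```python
def is_valid_path(path, rel):
    """Check an AS path against customer/provider/peer export rules.

    path: list of AS names in order.
    rel:  dict mapping (a, b) to "c2p" (a is a customer of b),
          "p2c" (a is a provider of b), or "p2p" (a and b are peers).
    """
    descending = False  # True once the path has crossed a peer edge or gone downhill
    for a, b in zip(path, path[1:]):
        edge = rel[(a, b)]
        if edge == "c2p":
            if descending:
                return False  # climbing again after descending: a "valley"
        elif edge == "p2p":
            if descending:
                return False  # second peer edge, or a peer edge after descending
            descending = True
        elif edge == "p2c":
            descending = True
        else:
            raise ValueError(f"unknown relationship {edge!r}")
    return True

# Hypothetical topology: A buys transit from P1; P1, P2, P3 peer in a chain; B buys from P2.
REL = {
    ("A", "P1"): "c2p",
    ("P1", "P2"): "p2p",
    ("P2", "P3"): "p2p",
    ("P2", "B"): "p2c",
}

print(is_valid_path(["A", "P1", "P2", "B"], REL))   # allowed
print(is_valid_path(["A", "P1", "P2", "P3"], REL))  # transit through peer P2
```

Both paths exist in the graph, but only the first is usable, which is why shortest-path reasoning on the unlabeled graph gives misleading answers.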

Revealed Structure

• Peer-peer relationships hard to infer
  – Mislabeling peer-peer edge as provider-customer does not change valid path into invalid
  – Heuristics to detect peer-peer edges
• Some AS pairs unusually related
  – Siblings providing mutual transit
  – Backup relationship for connectivity under failure
  – Misconfiguration of conventional relationship
  – Detect such cases by analyzing “invalid” paths
• Access to large path set is hard
  – Exploit BGP routing tables from multiple vantage points (10 public BGP tables)

[Figure: inferred AS hierarchy, April 2001 (11K ASs, 24K edges), with levels of 8898, 971, 897, 129, and 20 ASs]

Policy Management for BGP

• Integrate BGP with a new Policy Agent control plane
  – Improved BGP convergence through explicit fail over policies
  – Constrained routing for performance or trust reasons
  – Traffic discrimination, low quality vs. high quality connectivity or fair use issues
  – Load balancing outbound and inbound flows for multi-homed ASs
  – Sharad Agarwal’s Ph.D. thesis, currently interning at Sprint ATL

Agility in Response to Route Changes: Internet Converges Slowly

• Convergence Times [Labovitz et al.]
  – Theory: O(n!) (n: number of ASes)
  – Practice: linear with the longest backup path length
  – Measurement: up to 15 minutes
• Why so slow?
  – BGP protocol effects: path exploration
  – Route flap damping!?
    • Delay convergence of relatively stable routes
    • Unexpected interaction between flap damping and convergence

Morley Mao, Ramesh Govindan, George Varghese

How Does Flap Damping Work?

RFC2439:
• For each peer, per destination, keep penalty value, increase it for each flap
• Flap is a route change
• Penalty decays exponentially
• Parameters:
  – Fixed: Penalty increment
  – Configurable: half-life, suppress-, reuse-threshold, max suppressed time

$P(t') = P(t)\, e^{-\lambda (t' - t)}$

[Figure: penalty vs. time, decaying exponentially between flaps, with the suppress threshold and the reuse threshold marked]
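
The penalty bookkeeping described above can be sketched in a few lines. This is an illustration, not an implementation of any router's damping code. The numeric defaults are assumed values chosen for the example, the class and method names are hypothetical, and the maximum suppressed time is left out. The decay constant follows from the half-life H as λ = ln 2 / H.

```python
import math

class FlapDamper:
    """Per-peer, per-destination penalty with exponential decay."""

    def __init__(self, half_life=900.0, increment=1000.0,
                 suppress=2000.0, reuse=750.0):
        # All four defaults are illustrative assumptions (seconds / penalty units).
        self.decay = math.log(2) / half_life
        self.increment = increment
        self.suppress = suppress
        self.reuse = reuse
        self.penalty = 0.0
        self.last = 0.0
        self.suppressed = False

    def _decay_to(self, now):
        # P(t') = P(t) * exp(-lambda * (t' - t))
        self.penalty *= math.exp(-self.decay * (now - self.last))
        self.last = now

    def flap(self, now):
        """Record a route change at time `now` (seconds)."""
        self._decay_to(now)
        self.penalty += self.increment
        if self.penalty > self.suppress:
            self.suppressed = True

    def is_suppressed(self, now):
        """Route is reused once the decayed penalty falls below the reuse threshold."""
        self._decay_to(now)
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False
        return self.suppressed

damper = FlapDamper()
for t in (0, 10, 20):        # three flaps in quick succession
    damper.flap(t)
print(damper.is_suppressed(20))
```

With these assumed values, two quick flaps stay under the suppress threshold while a third crosses it, and the route then stays suppressed for roughly two half-lives. This is how a short burst of changes can delay convergence.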

A Better Way: Selective Route Flap Damping

• Flaps happen because of certain topologies among routers, causing triggered announcements and withdrawals; these are not toy scenarios
• Approach: ignore flap sequences indicating path exploration; these are likely to trigger more changes in near future
• In essence, we redefine what constitutes a flap:
  – From “any route change is considered a flap” to “must alter direction of route preference value change, relative to flaps”
  – Flaps due to withdrawal: increasing ASPath lengths, route value keeps decreasing
• Morley Mao Ph.D. dissertation, currently interning at AT&T Labs

• Stability achieved through flap damping [RFC2439]
• BUT unexpected: flap damping delays convergence!

Topology: clique of routers

Selective flap damping
  – Duplicate suppression: ignore flaps caused by transient convergence instability
  – Eliminates undesired interaction without sacrificing stability
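
The redefinition of a flap can be illustrated by counting flaps two ways over the same sequence of route updates. The sketch is only one reading of the slide, not the dissertation's algorithm. It assumes each update is reduced to a single numeric preference (here the negative AS-path length, so longer paths are less preferred). The selective rule counts a change only when the preference moves in the opposite direction from the previous change.

```python
def classic_flaps(prefs):
    """Classic rule: every route change is a flap."""
    return sum(1 for a, b in zip(prefs, prefs[1:]) if a != b)

def selective_flaps(prefs):
    """Selective rule (assumed reading): count a change only when the
    direction of the preference change reverses."""
    flaps = 0
    last_direction = 0  # +1 improving, -1 worsening, 0 no change seen yet
    for a, b in zip(prefs, prefs[1:]):
        if a == b:
            continue
        direction = 1 if b > a else -1
        if last_direction != 0 and direction != last_direction:
            flaps += 1
        last_direction = direction
    return flaps

# Path exploration after a withdrawal: AS paths of length 2, 3, 4, 5 in turn.
exploration = [-2, -3, -4, -5]
# A genuinely unstable route oscillating between a short and a long path.
oscillation = [-2, -5, -2, -5, -2]

print(classic_flaps(exploration), selective_flaps(exploration))
print(classic_flaps(oscillation), selective_flaps(oscillation))
```

Path exploration, where the preference only keeps decreasing, accrues no penalty under the selective rule, while true oscillation is still counted.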

Trusting the Routing Infrastructure: BGP Route Verification

• BGP protocol vulnerable
  – Single misconfigured router can cause long outages
  – Malicious routers can cause larger damage
    • Pretend to be a genuine end-host!!!
    • Misroute or sniff on traffic
    • Potential collusion with other malicious nodes?
• Verify BGP routes without PKI-based authentication?
  – Secure-BGP, tier-1 ISP proposal, yet to be deployed
    • Assumed an Internet wide PKI with ICANN as root!

Approach: Detection and Containment

• Misconfiguration affects reachability
  – Roughly 6% of misconfigurations cause reachability problems [Mahajan02]
  – “Passive” TCP-probing: modified nodes watch TCP traffic to detect reachability problems
    • No modifications to BGP, incrementally deployable, but ineffective for detecting malicious hosts
• Contain malicious nodes
  – Without authentication, can’t distinguish between genuine and malicious hosts
    • Two BGP enhancements: hash chains, loop-testing
    • Avoid routes through nodes (misconfigured/malicious) affecting routes to multiple destinations
• Lakshmi Subramanian Ph.D. Dissertation

Overlay Approach for Achieving Desirable Performance: OverQoS

• Embed QoS functionality in Internet via overlays
  – Overlay nodes implement QoS functions
  – No support needed from IP routers
• Virtual Links
  – Underlying path between two OverQoS routers
  – Characterized by three time-varying parameters
    • Available bandwidth, b(t), using fairness criterion (e.g., N TCP flows) or by explicit SLA with ISP
    • Loss rate, p(t)
    • Delay, d(t)
• Challenges
  – Nodes not connected to congested points, have no control on cross-traffic, cannot avoid losses (reducing sending rate doesn’t help!)

Lakshmi Subramanian, Hari Balakrishnan, Ion Stoica

Architecture

[Figure: OverQoS routers located in different ASs, connected by virtual links that run over IP]

Controlled-Loss Virtual Link (CLVL)

• Control losses if you can’t avoid them
  – Aggregate a set of flows along a virtual link in a bundle
  – Protect the bundle’s traffic against losses
  – Redistribute b/w and loss across flows in a bundle at entry node
• Two parameters:
  – Statistical bound on loss rate, q (<= p; typically << p)
  – Capacity, c(t), possibly time-varying
• Can prove: if offered load < c(t), then loss rate < q
• c provided in two ways:
  – Implicit: b is bundle’s bandwidth; c is some part of b
  – Explicit: via provisioning in underlying Internet path

[Figure labels: OverQoS node, control plane, Flow 1 … Flow n, Buffer mgmt & Scheduling & Traffic regulator, Coder, CLVL with c(t), q, Decoder, underlying path with b(t), p(t)]
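
A rough calculation shows why, in the implicit case, c can only be "some part of b": protecting a bundle against loss costs bandwidth. The slide does not say how the coder works, so the sketch rests on stated assumptions: an ideal (n, k) erasure code (any k of n packets recover the block), independent packet losses at rate p, and q read as a bound on the probability that a block cannot be decoded. The function names are hypothetical.

```python
from math import comb

def block_failure_prob(n, k, p):
    """Probability that more than n - k of n packets are lost,
    assuming independent losses at rate p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(n - k + 1, n + 1))

def min_block_size(k, p, q, n_max=1000):
    """Smallest n such that an (n, k) code keeps block failure at or below q."""
    for n in range(k, n_max + 1):
        if block_failure_prob(n, k, p) <= q:
            return n
    raise ValueError("no block size up to n_max meets the bound")

def clvl_capacity(b, k, n):
    """Usable capacity c when a fraction k/n of bandwidth b carries data."""
    return b * k / n

# Illustrative numbers only: 2% raw loss, target bound 0.1%, 20 data packets per block.
p, q, k, b = 0.02, 0.001, 20, 10.0   # b in Mbit/s
n = min_block_size(k, p, q)
print(n, clvl_capacity(b, k, n))
```

Under these assumptions, a tighter q or a larger p pushes n up and c down, which matches the trade-off between the loss bound and the capacity that the two CLVL parameters express.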

Reliability in Wide-Area Service Composition

• Detect & recover from failures via service replicas
• Aggressive heartbeat msgs:
  – Quick detection (~2 s)
  – Scalable messaging for recovery (1000s of clients)
• Load balancing + slack service provisioning to handle fast path fail-over
• Wide-area/multi-provider composition
• Fast recovery improves service availability

• > 15 s outage
• BGP recovery much worse! [Labovitz’00]
• End-to-end recovery in 3.6 s: 2 s detect, ~600 ms signaling, ~1 s state restoration

Wide-area Experiment: UCB, Berk. (Cable), SF (DSL), Stan., CMU, UCSD, UNSW (Aus), TU-Berlin (Germany)

[Figure labels: Text Source (×2), Text to audio (×2)]

Bhasker Raman
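
The detection step can be sketched as a timeout-based heartbeat monitor, with the recovery budget from the slide added up alongside it. This is a minimal sketch under assumptions, not the project's code. The heartbeat interval, the number of missed heartbeats tolerated, and the replica names are hypothetical, chosen so that the detection timeout comes to about 2 s.

```python
class HeartbeatMonitor:
    """Declare a peer failed when no heartbeat arrives within the timeout."""

    def __init__(self, interval, misses_allowed):
        # Timeout = heartbeat interval times the number of misses tolerated.
        self.timeout = interval * misses_allowed
        self.last_seen = {}

    def heartbeat(self, peer, now):
        """Record a heartbeat from `peer` at time `now` (seconds)."""
        self.last_seen[peer] = now

    def failed(self, now):
        """Return the peers whose last heartbeat is older than the timeout."""
        return sorted(peer for peer, seen in self.last_seen.items()
                      if now - seen > self.timeout)

def recovery_time(detect, signaling, restore):
    """End-to-end recovery is the sum of the three phases (seconds)."""
    return detect + signaling + restore

# Assumed settings: heartbeats every 0.5 s, failure after 4 misses (2 s).
monitor = HeartbeatMonitor(interval=0.5, misses_allowed=4)
monitor.heartbeat("replica-a", now=0.0)
monitor.heartbeat("replica-b", now=0.0)
monitor.heartbeat("replica-b", now=1.5)   # replica-a has gone silent
print(monitor.failed(now=2.1))

# The slide's budget: 2 s detect + ~0.6 s signaling + ~1 s state restoration.
print(round(recovery_time(2.0, 0.6, 1.0), 1))
```

A shorter interval detects failures sooner at the cost of more heartbeat traffic and more false alarms on a lossy wide-area path, which is the tension behind "aggressive" heartbeats.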

Scalability and Interoperability: Multicast Broadcast Federation

• Compose non-interoperable m/c domains to provide e2e m/c service
  – IP and App-layer protocols
• Overlays of Broadcast Gateways (BGs)
  – Peering between domains
  – Internal m/c inside domain
  – Clustered gateways for scalability across domains
  – Independent data flows and control flow
• Implementation:
  – Linux/C++ event-driven program
  – Customizable interface to local multicast (~700 lines)
  – 1 Gbps BG thruput with 6 nodes
  – 2500 sessions with 6 nodes

[Figure labels: Source, Clients, BG, Broadcast Domains, Peering, Data, CDN, IP Mul, SSM]

Mukund Seshadri, Yatin Chawathe

Presentation Outline

• Service Architecture Opportunity
• SAHARA Project and Architecture
• Routing as Service Composition
• Summary and Conclusions

SAHARA Project

• Evolve Internet architecture to better support multi-network/multi-service provider model
  – Dynamic environment, large numbers of service providers & service instances
  – Achieve desirable properties across multiple, potentially distrusting (Internet) service providers
  – Exploit PlanetLab infrastructure to construct wide-area prototype
• Routing as a composed service
  – Trust: BGP Verification/Detection + Containment
  – Agility: Fast Convergence
  – Reliability: Keep-Alive Messaging
  – Scalability: Clustered Gateways
  – Interoperability: M/C Protocol Transformation
  – New Policy/Control Planes

New Service Architecture: Integrated Communications and Processing

• Increasing diversity of interconnected devices
• Increasing importance of “services” to mitigate diversity/provide new functionality and customization
• Enabled by processing embedded in the network interconnect, locally and globally
  – “Active networking” is real
• Global services via managed composition
  – Role of multiple service providers and administrative domains
  – Separation of services from connectivity via overlays
  – No single operator deploys the global service

The SAHARA Project: Composition and Cooperation in the New Internet

Randy H. Katz

Thank You!